Python Dictionary Merging and Deduplication: Part 13, A Deep Dive into Python Dictionaries and Sets

Published: 2024/1/1

「@Author ：Runsen」

Dictionaries and Sets

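A dictionary is a collection of elements whose size can change: elements can be added, removed, and modified freely. Note that each element is a pair made of a key and a value. Dictionaries were unordered before Python 3.7; since 3.7 they preserve insertion order.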

Compared with lists and tuples, dictionaries perform better, especially for lookup, insertion, and deletion, all of which take constant time on average.

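A set is essentially the same as a dictionary. The only difference is that a set has no key-value pairing: it is an unordered collection of unique elements.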

d1 = {'name': 'jason', 'age': 20, 'gender': 'male'}

d2 = dict({'name': 'jason', 'age': 20, 'gender': 'male'})

d3 = dict([('name', 'jason'), ('age', 20), ('gender', 'male')])

d4 = dict(name='jason', age=20, gender='male')

d1 == d2 == d3 == d4

True

s1 = {1, 2, 3}

s2 = set([1, 2, 3])

s1 == s2

True

Sets do not support indexing, because a set is essentially a hash table, which is not the same thing as a list.

s = {1, 2, 3}

s[0]

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

TypeError: 'set' object does not support indexing
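Because a set has no positions, reaching a particular element means iterating over the set or converting it first. A minimal sketch, using an illustrative set that is not from the original text:

```python
s = {3, 1, 2}

# Iterate directly; the iteration order of a set is not something to rely on.
total = sum(x for x in s)

# Convert to a sorted list when a stable, indexable sequence is needed.
ordered = sorted(s)
smallest = ordered[0]

# Indexing the set itself raises TypeError.
try:
    s[0]
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(total, ordered, smallest, error_name)
```

Sorting costs O(n log n), so this conversion is worth doing only when an ordered view is really needed.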

To test whether an element is in a dictionary or a set, use value in dict/set. For a dictionary, this checks the keys.

s = {1, 2, 3}

1 in s

True

10 in s

False

d = {'name': 'Runsen', 'age': 20}

'name' in d

True

'location' in d

False
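One detail worth spelling out: for a dictionary, the in operator looks only at keys, never at values. The sketch below shows the difference, along with get(), which returns a default for a missing key; the default string 'unknown' is an illustrative choice.

```python
d = {'name': 'Runsen', 'age': 20}

key_found = 'name' in d                # True: 'name' is a key
value_as_key = 'Runsen' in d           # False: 'Runsen' is a value, not a key
value_found = 'Runsen' in d.values()   # True, but this is a linear scan

# get() returns a default instead of raising KeyError for a missing key.
location = d.get('location', 'unknown')

print(key_found, value_as_key, value_found, location)
```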

「Adding, removing, and updating elements」

In [1]: d = {'name': 'Runsen', 'age': 20}

In [2]: d['gender'] = 'male'

In [3]: d['birthday'] = '1999-10-01'

In [4]: d

Out[4]: {'name': 'Runsen', 'age': 20, 'gender': 'male', 'birthday': '1999-10-01'}

In [5]: d['birthday'] = '1999/10/01'

In [6]: d.pop('birthday')

Out[6]: '1999/10/01'

In [8]: d

Out[8]: {'name': 'Runsen', 'age': 20, 'gender': 'male'}

In [9]: s = {1, 2, 3}

In [10]: s.add(4)

In [11]: s

Out[11]: {1, 2, 3, 4}

In [12]: s.remove(4)

In [13]: s

Out[13]: {1, 2, 3}
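The session above removes only elements that exist. Removing a missing key or element raises KeyError, which the following sketch avoids with pop() and a default, and with discard(). The data mirrors the session; the variable names are illustrative.

```python
d = {'name': 'Runsen', 'age': 20}
s = {1, 2, 3}

# dict.pop with a default returns the default instead of raising KeyError.
missing = d.pop('birthday', None)

# set.remove raises KeyError for a missing element...
try:
    s.remove(4)
    removed = True
except KeyError:
    removed = False

# ...while set.discard silently does nothing.
s.discard(4)

print(missing, removed, s)
```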

「Sorting a dictionary in ascending or descending order」

d = {'b': 1, 'a': 2, 'c': 10}

d_sorted_by_key = sorted(d.items(), key=lambda x: x[0])  # sort by key, ascending

d_sorted_by_value = sorted(d.items(), key=lambda x: x[1])  # sort by value, ascending

d_sorted_by_key

[('a', 2), ('b', 1), ('c', 10)]

d_sorted_by_value

[('b', 1), ('a', 2), ('c', 10)]
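The code above covers only the ascending case. For descending order, sorted() accepts reverse=True, as sketched below with the same dictionary; rebuilding a dictionary from the sorted pairs keeps their order on Python 3.7 and later.

```python
d = {'b': 1, 'a': 2, 'c': 10}

# reverse=True turns the ascending sort into a descending one.
by_key_desc = sorted(d.items(), key=lambda item: item[0], reverse=True)
by_value_desc = sorted(d.items(), key=lambda item: item[1], reverse=True)

# On Python 3.7+, dict() keeps the order of the sorted pairs.
d_by_value_desc = dict(by_value_desc)

print(by_key_desc)      # [('c', 10), ('b', 1), ('a', 2)]
print(by_value_desc)    # [('c', 10), ('a', 2), ('b', 1)]
print(d_by_value_desc)  # {'c': 10, 'a': 2, 'b': 1}
```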

Insertion, Deletion, and Lookup

Dictionaries and sets are heavily performance-optimized data structures, especially for lookup, insertion, and deletion.

「The list approach」

# list version

def find_unique_price_using_list(products):
    unique_price_list = []
    for _, price in products:  # A: visit every product
        if price not in unique_price_list:  # B: linear scan of the list
            unique_price_list.append(price)
    return len(unique_price_list)

# each product is a (product id, price) tuple

products = [

(143121312, 100),

(432314553, 30),

(32421912367, 150),

(937153201, 30)

]

print('number of unique price is: {}'.format(find_unique_price_using_list(products)))

# Output

number of unique price is: 3

「The set approach」

# set version

def find_unique_price_using_set(products):
    unique_price_set = set()
    for _, price in products:
        unique_price_set.add(price)  # adding a duplicate price has no effect
    return len(unique_price_set)

products = [

(143121312, 100),

(432314553, 30),

(32421912367, 150),

(937153201, 30)

]

print('number of unique price is: {}'.format(find_unique_price_using_set(products)))

# Output

number of unique price is: 3
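The explicit loop in the set version can also be collapsed into a set comprehension. This is a minimal sketch of the same idea; the function name count_unique_prices is hypothetical and does not appear in the original.

```python
def count_unique_prices(products):
    """Count distinct prices in a list of (product_id, price) tuples."""
    return len({price for _, price in products})


products = [
    (143121312, 100),
    (432314553, 30),
    (32421912367, 150),
    (937153201, 30),
]

print('number of unique price is: {}'.format(count_unique_prices(products)))
```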

Now compare the running time of the two versions, that is, their performance.

import time

ids = list(range(0, 100000))  # renamed from id to avoid shadowing the built-in

prices = list(range(200000, 300000))

products = list(zip(ids, prices))

# Time the list version

start_using_list = time.perf_counter()

find_unique_price_using_list(products)

end_using_list = time.perf_counter()

print("time elapse using list: {}".format(end_using_list - start_using_list))

# Output

time elapse using list: 41.61519479751587

# Time the set version

start_using_set = time.perf_counter()

find_unique_price_using_set(products)

end_using_set = time.perf_counter()

print("time elapse using set: {}".format(end_using_set - start_using_set))

# Output

time elapse using set: 0.008238077163696289

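In terms of performance, the set vastly outperforms the list. The list version scans the whole list for every product, so it takes O(n^2) time overall, while adding to a set takes constant time on average, so the set version takes O(n).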
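To reproduce the comparison quickly, here is a sketch that uses the standard timeit module on a much smaller input (2,000 products instead of 100,000). The shortened function names are illustrative, and the absolute timings depend on the machine.

```python
import timeit


def unique_with_list(products):
    seen = []
    for _, price in products:
        if price not in seen:      # linear scan of the list on every iteration
            seen.append(price)
    return len(seen)


def unique_with_set(products):
    seen = set()
    for _, price in products:
        seen.add(price)            # hash lookup, constant time on average
    return len(seen)


# Smaller than the input in the text, so the list version finishes quickly.
n = 2000
products = list(zip(range(n), range(200000, 200000 + n)))

list_time = timeit.timeit(lambda: unique_with_list(products), number=3)
set_time = timeit.timeit(lambda: unique_with_set(products), number=3)

print('list: {:.4f}s  set: {:.4f}s'.format(list_time, set_time))
```

Doubling n should roughly quadruple the list timing while only doubling the set timing, which is the quadratic versus linear behavior at work.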

For a dictionary, each entry in the hash table stores three items: the hash value, the key, and the value.
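To make those three stored items concrete, here is a toy hash table that resolves collisions with per-bucket lists. The class name ToyDict and its methods are hypothetical, and this is not how CPython lays out a dictionary (CPython uses open addressing); it only illustrates why storing the hash next to the key and value is useful.

```python
class ToyDict:
    """A toy hash table; each entry stores (hash value, key, value)."""

    def __init__(self, size=8):
        # Fixed number of buckets; a real table grows as it fills up.
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        h = hash(key)
        return h, self.buckets[h % len(self.buckets)]

    def set(self, key, value):
        h, bucket = self._bucket(key)
        for i, (entry_hash, entry_key, _) in enumerate(bucket):
            # Compare hashes first (cheap), then keys (possibly expensive).
            if entry_hash == h and entry_key == key:
                bucket[i] = (h, key, value)   # update an existing key
                return
        bucket.append((h, key, value))        # insert a new key

    def get(self, key, default=None):
        h, bucket = self._bucket(key)
        for entry_hash, entry_key, entry_value in bucket:
            if entry_hash == h and entry_key == key:
                return entry_value
        return default


t = ToyDict()
t.set('name', 'Runsen')
t.set('age', 20)
t.set('age', 21)            # overwrites the earlier value
print(t.get('name'), t.get('age'), t.get('location'))
```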

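Sets are unordered, and dictionaries guarantee only insertion order (since Python 3.7). In both, the internal hash table keeps lookup, insertion, and deletion efficient. That is why dictionaries and sets are typically used for element lookup and deduplication.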
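Deduplication and merging, the two operations named in the title, can be sketched as follows. The sample data is illustrative. Note that set() makes no promise about order, while dict.fromkeys() keeps the first occurrence of each item in order on Python 3.7 and later.

```python
prices = [100, 30, 150, 30, 100]

# set() removes duplicates but gives no guarantee about order.
unique_unordered = set(prices)

# dict.fromkeys() keeps the first occurrence of each item, in order (Python 3.7+).
unique_ordered = list(dict.fromkeys(prices))

# Merging two dictionaries: duplicate keys collapse, and the right-hand value wins.
d1 = {'name': 'Runsen', 'age': 20}
d2 = {'age': 21, 'gender': 'male'}
merged = {**d1, **d2}

print(unique_ordered)  # [100, 30, 150]
print(merged)          # {'name': 'Runsen', 'age': 21, 'gender': 'male'}
```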

There are two ways to initialize a dictionary. Let's compare which one is more efficient.

In [20]: timeit a ={'name':"runsen",'age':20}

127 ns ± 0.8 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [21]: timeit b =dict({'name':"runsen",'age':20})

438 ns ± 3.41 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

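The first is faster, because a literal does not need to call a function. The second builds the same literal first and then passes it to dict(), which copies it.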
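Outside IPython, the same measurement can be approximated with the standard timeit module. This is a sketch only; the repetition count of 100,000 is an arbitrary choice and the timings vary by machine and Python version.

```python
import timeit

literal = "a = {'name': 'runsen', 'age': 20}"
constructor = "b = dict({'name': 'runsen', 'age': 20})"

# timeit.timeit runs each statement `number` times and returns total seconds.
literal_time = timeit.timeit(literal, number=100000)
constructor_time = timeit.timeit(constructor, number=100000)

print('literal:     {:.4f}s'.format(literal_time))
print('constructor: {:.4f}s'.format(constructor_time))

# Both spellings produce equal dictionaries; only the cost differs.
same = {'name': 'runsen', 'age': 20} == dict({'name': 'runsen', 'age': 20})
```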

Can a dictionary key be a list? Is the dictionary initialization in the following code valid?

In [22]: d = {'name': 'Runsen', ['education']: [' primary school', 'junior middle school']}

---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

----> 1 d = {'name': 'Runsen', ['education']: [' primary school', 'junior middle school']}

TypeError: unhashable type: 'list'

In [23]: d = {'name': 'Runsen', ('education',): [' primary school', 'junior middle school']}

In [24]: d

Out[24]: {'name': 'Runsen', ('education',): [' primary school', 'junior middle school']}

Using a list as a key is not allowed here: a list is a mutable data structure, and dictionary keys must be hashable, which in practice means immutable. The reason is easy to understand.

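Keys must be unique. If a key could change, then as it changed it might come to duplicate another key, which would contradict the definition of a dictionary. Replacing the list with a tuple, which we covered earlier, works, because a tuple is immutable. Note that a one-element tuple needs a trailing comma, as in ('education',).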
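The rule can be checked directly. The sketch below uses illustrative data to show which objects are accepted as keys and which are rejected, including the trailing-comma detail for one-element tuples.

```python
# Immutable, hashable objects work as keys.
d = {
    ('education',): ['primary school', 'junior middle school'],  # 1-tuple
    frozenset({'a', 'b'}): 'frozenset key',
}

# A list is unhashable, so it is rejected as a key.
try:
    d[['education']] = 'list key'
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple is only hashable if everything inside it is hashable too.
try:
    hash(('education', ['primary school']))
    nested_ok = True
except TypeError:
    nested_ok = False

# Parentheses alone do not make a tuple; the trailing comma does.
plain_type = type(('education')).__name__
tuple_type = type(('education',)).__name__

print(plain_type, tuple_type)   # str tuple
print(list_key_ok, nested_ok)   # False False
```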
