
Redis Source Code Analysis: Basic Dictionary Operations

Published: 2023/11/27


Building on the earlier article "Redis Source Code Analysis: Dictionary Structure", we can now walk through the implementation of dict. (If you repost this, please credit breaksoftware's CSDN blog.)

Creating a dictionary

A dictionary is normally empty when it is created, but its type has to be fixed up front. Creating a Redis dictionary therefore mainly requires a dictType object that defines the operations on its data:

static void _dictReset(dictht *ht)
{
    ht->table = NULL;
    ht->size = 0;
    ht->sizemask = 0;
    ht->used = 0;
}

/* Create a new hash table */
dict *dictCreate(dictType *type,
        void *privDataPtr)
{
    dict *d = zmalloc(sizeof(*d));

    _dictInit(d,type,privDataPtr);
    return d;
}

/* Initialize the hash table */
int _dictInit(dict *d, dictType *type,
        void *privDataPtr)
{
    _dictReset(&d->ht[0]);
    _dictReset(&d->ht[1]);
    d->type = type;
    d->privdata = privDataPtr;
    d->rehashidx = -1;
    d->iterators = 0;
    return DICT_OK;
}

The privDataPtr argument of dictCreate is usually NULL. The field exists for a reason, though: when the framework calls a user-supplied function, it can pass back data that the user cares about. That data need not be a plain value; it can also be a function pointer, in which case the pointer supplied through privDataPtr is handed back and can be invoked inside the user's own callback. For example, here is the call to the user-supplied key comparison function:

#define dictCompareKeys(d, key1, key2) \
    (((d)->type->keyCompare) ? \
        (d)->type->keyCompare((d)->privdata, key1, key2) : \
        (key1) == (key2))
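To make the pass-through concrete, here is a minimal sketch. The names mini_type, mini_dict, compare_options and compare_strings are hypothetical and are not part of Redis; only the calling convention (privdata as the first callback argument, pointer equality as the fallback) follows the macro above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical caller-owned settings, handed to the dict as privdata. */
typedef struct {
    int ignore_case;
} compare_options;

/* Cut-down stand-ins for dictType and dict (illustration only). */
typedef struct {
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} mini_type;

typedef struct {
    mini_type *type;
    void *privdata;
} mini_dict;

/* User callback: the framework passes privdata back as the first argument. */
static int compare_strings(void *privdata, const void *key1, const void *key2)
{
    const compare_options *opts = privdata;
    const unsigned char *a = key1, *b = key2;

    if (opts == NULL || !opts->ignore_case)
        return strcmp((const char *)a, (const char *)b) == 0;
    while (*a && *b && tolower(*a) == tolower(*b)) {
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Same shape as dictCompareKeys: fall back to pointer equality. */
static int mini_compare_keys(mini_dict *d, const void *key1, const void *key2)
{
    if (d->type->keyCompare)
        return d->type->keyCompare(d->privdata, key1, key2);
    return key1 == key2;
}
```

The callback never touches a global: everything it needs to know arrives through privdata, so two dictionaries can share the same callback while behaving differently.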

Also note rehashidx. A freshly created dictionary has nothing to rehash, so rehashidx is -1.

Releasing a dictionary

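Releasing a dictionary is also simple. Most of the work is on the two dictht objects. Each holds an array of dictEntry pointers, and every array element is the head of a chain, so the release has to free both the linked lists and the dynamically allocated array.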

int _dictClear(dict *d, dictht *ht, void(callback)(void *)) {
    unsigned long i;

    /* Free all the elements */
    for (i = 0; i < ht->size && ht->used > 0; i++) {
        dictEntry *he, *nextHe;

        if (callback && (i & 65535) == 0) callback(d->privdata);

        if ((he = ht->table[i]) == NULL) continue;
        while(he) {
            nextHe = he->next;
            dictFreeKey(d, he);
            dictFreeVal(d, he);
            zfree(he);
            ht->used--;
            he = nextHe;
        }
    }
    /* Free the table and the allocated cache structure */
    zfree(ht->table);
    /* Re-initialize the table */
    _dictReset(ht);
    return DICT_OK; /* never fails */
}

/* Clear & Release the hash table */
void dictRelease(dict *d)
{
    _dictClear(d,&d->ht[0],NULL);
    _dictClear(d,&d->ht[1],NULL);
    zfree(d);
}

The dictFreeKey and dictFreeVal macros used above call the destructor functions supplied through dictType:

#define dictFreeVal(d, entry) \
    if ((d)->type->valDestructor) \
        (d)->type->valDestructor((d)->privdata, (entry)->v.val)

#define dictFreeKey(d, entry) \
    if ((d)->type->keyDestructor) \
        (d)->type->keyDestructor((d)->privdata, (entry)->key)
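The release pattern (free every chain, call an optional destructor, then free the array) can be sketched on its own. This is a simplified illustration, not Redis code: node, clear_buckets and push_int are hypothetical names, the standard free stands in for zfree, and only keys are stored.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    void *key;
    struct node *next;
} node;

/* Frees every chain in a bucket array, then the array itself.
 * key_destructor may be NULL, mirroring the check in dictFreeKey.
 * Returns the number of entries released. */
static unsigned long clear_buckets(node **table, unsigned long size,
                                   void (*key_destructor)(void *key))
{
    unsigned long freed = 0;

    for (unsigned long i = 0; i < size; i++) {
        node *he = table[i];
        while (he) {
            node *next = he->next;   /* save before freeing the node */
            if (key_destructor)
                key_destructor(he->key);
            free(he);
            freed++;
            he = next;
        }
    }
    free(table);
    return freed;
}

/* Hypothetical helper: head-inserts a node with a heap-allocated int key. */
static void push_int(node **table, unsigned long index, int value)
{
    node *n = malloc(sizeof(*n));
    int *key = malloc(sizeof(*key));

    assert(n != NULL && key != NULL);
    *key = value;
    n->key = key;
    n->next = table[index];
    table[index] = n;
}
```

Saving he->next before freeing the node is the detail that matters: once the node is freed, its next pointer can no longer be read.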

Expanding and shrinking a dictionary

We know that a Redis dictionary combines an array with linked lists. In theory, with the array length fixed, letting the chains grow and shrink is enough to add and remove entries. So why design expansion and shrinking at all? First, note that both concepts apply to the table of a dictht, that is, to the array. From "Redis Source Code Analysis: Dictionary Structure" we know the array length is adjusted for two reasons. When chains grow long enough to hurt lookups, a longer array shortens them and improves performance. When the data is too sparse, a shorter array leaves fewer unused pointers and saves memory.

Let's look at the expansion check first:

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}

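The core of the check is the ratio between the number of elements in ht[0] and the length of the array that holds the chain heads. This ratio is the average chain length (a NULL slot counts as a chain of length 0). If resizing is allowed (dict_can_resize is set), the table is expanded as soon as the ratio reaches 1. If resizing is disallowed, expansion is still forced once the ratio exceeds dict_force_resize_ratio, which is 5. In both cases the work is done by dictExpand: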

int dictExpand(dict *d, unsigned long size)
{
    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);

    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
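The decision made by _dictExpandIfNeeded can be restated as a pure function, which makes the two thresholds easier to see. This is only a sketch: expand_request is a hypothetical name, and the constants 4 and 5 are assumed values for DICT_HT_INITIAL_SIZE and dict_force_resize_ratio.

```c
#include <assert.h>

/* Assumed values, taken from the Redis version discussed here. */
#define INITIAL_SIZE 4UL
#define FORCE_RESIZE_RATIO 5UL

/* Returns the size to request from an expand routine, or 0 for "leave as is".
 * used and size describe ht[0]; can_resize mirrors the dict_can_resize flag. */
static unsigned long expand_request(unsigned long used, unsigned long size,
                                    int can_resize)
{
    if (size == 0)
        return INITIAL_SIZE;              /* empty table: first allocation */
    if (used >= size &&
        (can_resize || used / size > FORCE_RESIZE_RATIO))
        return used * 2;                  /* ask for double the element count */
    return 0;
}
```

With resizing allowed, 4 elements in 4 buckets already trigger a request for 8. With resizing disallowed, 20 elements in 4 buckets (ratio exactly 5) are still tolerated, and only a ratio above 5 forces the expansion.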

The size of the new table is determined by the following function:

static unsigned long _dictNextPower(unsigned long size) {
    unsigned long i = DICT_HT_INITIAL_SIZE;

    if (size >= LONG_MAX) return LONG_MAX;
    while(1) {
        if (i >= size)
            return i;
        i *= 2;
    }
}

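As you can see, _dictNextPower returns the smallest power of two that is greater than or equal to size (and never less than DICT_HT_INITIAL_SIZE). Because _dictExpandIfNeeded requests used*2, the new array has at least twice and fewer than four times as many buckets as there are elements. Once the rehash completes, the average chain length therefore falls to somewhere above 1/4 and at most 1/2.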
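A short worked example of the sizing, under the assumption that DICT_HT_INITIAL_SIZE is 4. The functions next_power and grown_size are hypothetical restatements written for this illustration.

```c
#include <assert.h>
#include <limits.h>

#define INITIAL_SIZE 4UL   /* assumed value of DICT_HT_INITIAL_SIZE */

/* Smallest power of two that is >= size, starting from INITIAL_SIZE. */
static unsigned long next_power(unsigned long size)
{
    unsigned long i = INITIAL_SIZE;

    if (size >= (unsigned long)LONG_MAX) return (unsigned long)LONG_MAX;
    while (i < size)
        i *= 2;
    return i;
}

/* Bucket count after growing a table that holds `used` elements. */
static unsigned long grown_size(unsigned long used)
{
    return next_power(used * 2);
}
```

Take 20 elements in 4 buckets, an average chain length of 5. The request is 40, which rounds up to 64 buckets, so the average chain length after the rehash is 20/64, about 0.31. With 16 elements the request is 32, already a power of two, giving exactly 1/2.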

Look again at dictExpand: at the end it assigns the newly allocated table to ht[1]. Reaching that point means a rehash is about to begin, because ht[1] exists to hold the rehash results temporarily.

Next, the shrink calculation:

/* Resize the table to the minimal size that contains all the elements,
 * but with the invariant of a USED/BUCKETS ratio near to <= 1 */
int dictResize(dict *d)
{
    int minimal;

    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;
    return dictExpand(d, minimal);
}

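The function comment states the goal: resize the table to the smallest size that still holds every element, keeping the used/buckets ratio close to, but not above, 1. However, the dict library does not check whether to shrink when entries are removed, and it does not even provide a function for that check. Instead it exposes dictResize and leaves the decision to the user. My guess is that a sparse table does not hurt the dictionary's speed, and memory usage is something the user is better placed to judge.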
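Since the decision is left to the caller, a caller-side policy might look like the sketch below. The 10% threshold and the name should_shrink are hypothetical choices made for this example; the dict library itself prescribes neither.

```c
#include <assert.h>

#define INITIAL_SIZE 4UL       /* assumed value of DICT_HT_INITIAL_SIZE */
#define MIN_FILL_PERCENT 10UL  /* hypothetical threshold chosen by the caller */

/* Returns 1 when the caller should invoke a shrink such as dictResize:
 * the table is larger than the initial size and less than 10% full. */
static int should_shrink(unsigned long used, unsigned long size)
{
    return size > INITIAL_SIZE && used * 100 / size < MIN_FILL_PERCENT;
}
```

A caller could run such a check periodically or after a batch of deletions, and call dictResize(d) only when it returns 1.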

The rehash operation

Rehashing is a key algorithm of the dict library. Fortunately its logic was already covered in "Redis Source Code Analysis: Dictionary Structure", so here we go straight to the implementation:

int dictRehash(dict *d, int n) {

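The function takes the dictionary pointer d and a step count n, and returns 0 or 1. The step count deserves an explanation. Redis rehashes a dictionary incrementally, in steps, so each call has to be told how far to advance. The dictEntry array of a dictht may contain runs of NULL pointers; these slots have no chain, need no rehashing, and are simply skipped. The step count therefore only counts non-empty slots. As an example, take the table shown in the original post's figure, in which the first two elements of ht[0].table are NULL and the third points to a chain.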

Suppose the step count is 1. When this table is rehashed, the first two elements of ht[0].table are skipped and the chain that the third element points to is rehashed. The step count is 1 and the third slot's chain has been migrated, so this step is finished.

    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }

However, too many NULL pointers along the way also affect rehash efficiency. So empty_visits is set to 10 times the step count, and once a single call has visited that many empty slots in total, it returns 1 early. A reader might ask why skipping NULL pointers needs a limit at all. Each skip is cheap, but a sparse table can contain arbitrarily long runs of empty slots, so without a cap one call could take an unbounded amount of time. A mostly empty array also suggests that the data is concentrated on fewer, longer chains, so each non-empty slot takes longer to migrate. In essence, the cap bounds how much the running time of a single call can vary.
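The bounded skip can be isolated into a small helper to show how the budget works. This is a sketch with hypothetical names (skip_empty, budget); in dictRehash the same logic is inlined and the budget is empty_visits.

```c
#include <assert.h>
#include <stddef.h>

/* Advances *idx past NULL slots, visiting at most *budget empty slots.
 * Returns 1 if a non-empty slot was found at *idx, 0 if the budget ran out
 * or the end of the table was reached. */
static int skip_empty(void **table, size_t size, size_t *idx, size_t *budget)
{
    assert(*budget > 0);   /* a zero budget would wrap around below */
    while (*idx < size && table[*idx] == NULL) {
        (*idx)++;
        if (--(*budget) == 0)
            return 0;
    }
    return *idx < size;
}
```

The index is kept even when the budget runs out, so the next call resumes where this one stopped. That is the same role rehashidx plays between calls to dictRehash.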

The while condition shows that a call stops once the step count is used up or all data in ht[0] has been moved to ht[1]. Now the migration itself:

        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            unsigned int h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

This loop walks the ht[0].table array. Whenever a slot is non-empty, it traverses the chain and rehashes every entry on it into ht[1]. Once everything in ht[0] has been moved, its used field is 0. At that point the old ht[0] table is freed, ht[0] takes over the contents of ht[1], and ht[1] is reset, which completes the rehash.

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

To summarize dictRehash: given a step count from the caller, and depending on how the data is actually distributed, it moves data from ht[0] to ht[1]. If all of ht[0] has been migrated, the rehash is complete and the function returns 0; otherwise it returns 1.
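The whole procedure can be reproduced in a toy version that compiles on its own. This is a simplified sketch and not the Redis implementation: keys are unsigned integers, the identity function stands in for the hash, plain malloc and free replace zmalloc and zfree, and the names toy_dict, htab, htab_init, htab_push and toy_rehash are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    unsigned long key;
    struct entry *next;
} entry;

typedef struct {
    entry **table;
    unsigned long size, sizemask, used;
} htab;

typedef struct {
    htab ht[2];
    long rehashidx;              /* -1 when no rehash is in progress */
} toy_dict;

static void htab_init(htab *h, unsigned long size)
{
    h->table = calloc(size, sizeof(entry *));
    assert(h->table != NULL);
    h->size = size;
    h->sizemask = size - 1;
    h->used = 0;
}

/* Head-inserts key into h; the key itself is used as the hash value. */
static void htab_push(htab *h, unsigned long key)
{
    entry *e = malloc(sizeof(*e));

    assert(e != NULL);
    e->key = key;
    e->next = h->table[key & h->sizemask];
    h->table[key & h->sizemask] = e;
    h->used++;
}

/* Migrates up to n non-empty buckets; returns 1 if work remains, else 0. */
static int toy_rehash(toy_dict *d, int n)
{
    int empty_visits = n * 10;

    if (d->rehashidx == -1) return 0;
    while (n-- && d->ht[0].used != 0) {
        while (d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        entry *e = d->ht[0].table[d->rehashidx];
        while (e) {
            entry *next = e->next;
            unsigned long h = e->key & d->ht[1].sizemask;

            e->next = d->ht[1].table[h];   /* head-insert into new table */
            d->ht[1].table[h] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].table[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) {              /* finished: swap tables */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1].table = NULL;
        d->ht[1].size = d->ht[1].sizemask = d->ht[1].used = 0;
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

With keys 2, 6 and 3 in a 4-bucket table and an 8-bucket target, the first call with n = 1 skips buckets 0 and 1, migrates the chain in bucket 2 and returns 1. The second call migrates bucket 3, swaps the tables and returns 0.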

When rehashing happens

We have covered why rehashing is needed; now let's look at when the incremental steps are taken.

When a Redis dictionary needs rehashing, it does not do the work all at once but incrementally. So when does a dictionary in this intermediate state make further progress? The dict library offers two opportunities. One is whenever the dictionary is updated or searched. The other is an interface that lets the user decide when to rehash.

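Lookups and updates already take time of their own, so the rehash that piggybacks on them should not add much. The step count is therefore 1, and the step is skipped entirely while any iterator is registered (d->iterators is non-zero).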

static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d,1);
}

The other opportunity is triggered by the user. The author still wants to keep each call from running too long, so the library provides the following function:

/* Rehash for an amount of time between ms milliseconds and ms+1 milliseconds */
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    while(dictRehash(d,100)) {
        rehashes += 100;
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}

Here the step count is 100. Compared with a step count of 1 this is a batch operation, which saves function call and return overhead. In exchange, the caller has to supply a time limit; how long it should be is a trade-off left to the caller.
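The "batch until the time budget is spent" loop is a general pattern. The sketch below separates it from the dictionary so it can be exercised deterministically; run_for_ms, fake_now_ms and fake_step are hypothetical names, and the clock is injected as a function pointer purely for illustration.

```c
#include <assert.h>

/* Runs step(ctx, batch) repeatedly until it reports no work left (returns 0)
 * or more than budget_ms milliseconds have elapsed according to now_ms().
 * Returns batch times the number of calls that reported remaining work. */
static int run_for_ms(int (*step)(void *ctx, int batch), void *ctx,
                      long long (*now_ms)(void), int batch, int budget_ms)
{
    long long start = now_ms();
    int done = 0;

    while (step(ctx, batch)) {
        done += batch;
        if (now_ms() - start > budget_ms) break;
    }
    return done;
}

/* Hypothetical stand-ins used to exercise the loop deterministically. */
static long long fake_time;
static long long fake_now_ms(void) { return fake_time++; }  /* 1 ms per call */

/* Consumes up to `batch` units of work; returns 1 while work remains. */
static int fake_step(void *ctx, int batch)
{
    int *remaining = ctx;

    *remaining = *remaining > batch ? *remaining - batch : 0;
    return *remaining > 0;
}
```

The clock is read only once per batch, not once per bucket, so a larger batch means fewer clock reads but a coarser stop point: the loop can overshoot the budget by up to one batch.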

Adding an element

New elements are added through the following function:

int dictAdd(dict *d, void *key, void *val)
{
    dictEntry *entry = dictAddRaw(d,key);

    if (!entry) return DICT_ERR;
    dictSetVal(d, entry, val);
    return DICT_OK;
}

dictAddRaw returns a pointer to a new dictEntry. dictSetVal then stores the value in that entry, copying it through the user-supplied valDup function pointer if one is set:

#define dictSetVal(d, entry, _val_) do { \
    if ((d)->type->valDup) \
        entry->v.val = (d)->type->valDup((d)->privdata, _val_); \
    else \
        entry->v.val = (_val_); \
} while(0)

The implementation of dictAddRaw deserves attention. First it checks whether the dictionary is being rehashed and, if so, performs one rehash step:

dictEntry *dictAddRaw(dict *d, void *key)
{
    int index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d);

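Next it checks whether the key already exists in the dictionary. If it does, nothing can be added; otherwise the function gets the index of the key's slot in the pointer array (dictHashKey(ht, key) & ht->sizemask).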

    /* Get the index of the new element, or -1 if
     * the element already exists. */
    if ((index = _dictKeyIndex(d, key)) == -1)
        return NULL;

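Because the dictionary may be in the middle of a rehash, part of the data is in ht[0] and part in ht[1], so the function has to decide which dictht receives the new dictEntry. During a rehash the entry goes into ht[1]: if it were added to ht[0], rehashidx might already have passed the slot for the new key, so the entry would never be migrated and would be lost. Outside a rehash the entry goes into ht[0].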

    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Set the hash entry fields. */
    dictSetKey(d, entry, key);
    return entry;
}

Deleting an element

Deleting an element means finding and removing it in ht[0] or ht[1], so the loop covers both tables (the second one only while a rehash is in progress):

static int dictGenericDelete(dict *d, const void *key, int nofree)
{
    unsigned int h, idx;
    dictEntry *he, *prevHe;
    int table;

    if (d->ht[0].size == 0) return DICT_ERR; /* d->ht[0].table is NULL */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);

    for (table = 0; table <= 1; table++) {

Then the index of the key's slot in the pointer array is computed:

        idx = h & d->ht[table].sizemask;

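sizemask is the array length minus 1. The AND above is equivalent to taking the hash value modulo the array length, which works because the length is always a power of two. For example, with a hash value of 5 (binary 101) and an array length of 4 (binary 100), sizemask is 3 (binary 011). 5 AND 3 gives binary 001, that is 1, the same as 5 % 4.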
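The equivalence between the mask and the modulo can be checked directly. bucket_index is a hypothetical helper written for this illustration; the assertion inside it documents the power-of-two requirement.

```c
#include <assert.h>

/* Bucket index for a table whose size is a power of two. */
static unsigned long bucket_index(unsigned long hash, unsigned long size)
{
    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two only */
    return hash & (size - 1);
}
```

For a size that is not a power of two the trick breaks down: with size 6 the mask would be 5 (binary 101), and a hash of 2 would map to 0 instead of 2. This is why _dictNextPower always rounds the table size up to a power of two.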

With the index found, the chain at that slot is traversed and the matching element is unlinked from the list. Whether the key and value are also destroyed through the user-supplied destructors depends on the nofree argument.

        he = d->ht[table].table[idx];
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                if (!nofree) {
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                }
                zfree(he);
                d->ht[table].used--;
                return DICT_OK;
            }
            prevHe = he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return DICT_ERR; /* not found */
}

The dict library wraps this function in the following two:

int dictDelete(dict *ht, const void *key) {
    return dictGenericDelete(ht,key,0);
}

int dictDeleteNoFree(dict *ht, const void *key) {
    return dictGenericDelete(ht,key,1);
}
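The unlink step with a trailing prev pointer is the part of the deletion that is easiest to get wrong, so here it is in isolation. This is a sketch with hypothetical names (node, chain_unlink, chain_push) and integer keys. It hands the detached node back to the caller, which is loosely analogous to the nofree variant in that the decision about destroying the contents is left to the caller.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *next;
} node;

/* Removes the first node with the given key from the chain at *head.
 * Returns the detached node (the caller frees it), or NULL if absent. */
static node *chain_unlink(node **head, int key)
{
    node *prev = NULL;

    for (node *he = *head; he != NULL; prev = he, he = he->next) {
        if (he->key != key) continue;
        if (prev)
            prev->next = he->next;   /* middle or tail: bypass the node */
        else
            *head = he->next;        /* head: the bucket slot itself changes */
        he->next = NULL;
        return he;
    }
    return NULL;
}

/* Head-inserts a new node, as dictAddRaw does for a bucket. */
static void chain_push(node **head, int key)
{
    node *n = malloc(sizeof(*n));

    assert(n != NULL);
    n->key = key;
    n->next = *head;
    *head = n;
}
```

The two branches correspond to the prevHe check in dictGenericDelete: removing the first node has to update the bucket slot, while removing any other node only updates its predecessor.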

Finding an element

A lookup also has to consider whether the dictionary is being rehashed. Depending on the state, it searches only ht[0], or ht[1] as well:

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    unsigned int h, idx, table;

    if (d->ht[0].used + d->ht[1].used == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

void *dictFetchValue(dict *d, const void *key) {
    dictEntry *he;

    he = dictFind(d,key);
    return he ? dictGetVal(he) : NULL;
}

Replacing an element (adding it if absent)

The dict library first tries to add the key. If the key already exists, it looks up the entry, sets its value to the replacement, and finally releases the old value:

int dictReplace(dict *d, void *key, void *val)
{
    dictEntry *entry, auxentry;

    /* Try to add the element. If the key
     * does not exist dictAdd will succeed. */
    if (dictAdd(d, key, val) == DICT_OK)
        return 1;
    /* It already exists, get the entry */
    entry = dictFind(d, key);
    /* Set the new value and free the old one. Note that it is important
     * to do that in this order, as the value may just be exactly the same
     * as the previous one. In this context, think to reference counting,
     * you want to increment (set), and then decrement (free), and not the
     * reverse. */
    auxentry = *entry;
    dictSetVal(d, entry, val);
    dictFreeVal(d, &auxentry);
    return 0;
}
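The ordering described in the source comment (set first, then free) can be demonstrated with a small reference-counted value. Everything here is hypothetical illustration: rc_value, rc_retain and rc_release play the roles that a valDup and valDestructor pair might play for a reference-counted type, and slot_replace mirrors the last three statements of dictReplace.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted value. */
typedef struct {
    int refcount;
    int payload;
} rc_value;

static int live_values = 0;   /* counts allocations not yet freed */

static rc_value *rc_new(int payload)
{
    rc_value *v = malloc(sizeof(*v));

    assert(v != NULL);
    v->refcount = 1;
    v->payload = payload;
    live_values++;
    return v;
}

static rc_value *rc_retain(rc_value *v) { v->refcount++; return v; }

static void rc_release(rc_value *v)
{
    if (--v->refcount == 0) {
        free(v);
        live_values--;
    }
}

/* Stores new_val in *slot: retain the new value first, then release the old
 * one, so that storing the value already held never frees it in between. */
static void slot_replace(rc_value **slot, rc_value *new_val)
{
    rc_value *old = *slot;

    *slot = rc_retain(new_val);
    rc_release(old);
}
```

If the two statements in slot_replace were swapped, replacing a value with itself while the slot held the only reference would drop the count to zero, free the object, and then retain freed memory. Saving the old entry in auxentry before calling dictSetVal avoids the same problem in dictReplace.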
