
STL Source Code Analysis: hashtable

Published: 2023/12/13


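A binary search tree offers logarithmic average time, but only under the assumption that the input data is sufficiently random.
A hashtable (hash table) offers constant average time for insertion, deletion, and lookup. This performance rests on statistical grounds and does not depend on the randomness of the input elements.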

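Overview of hashtable

A hashtable provides access to and removal of any named item.
Because the objects it operates on are named items, a hashtable can serve as a dictionary structure.
The function that maps an element to an "index of acceptable size" is called the hash function.
Because there may be more elements than the array has slots, different elements may be mapped to the same position; this is called a collision.
There are several ways to resolve collisions: linear probing, quadratic probing, and separate chaining.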

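Linear probing

Load factor: the number of elements divided by the table size. The load factor always lies between 0 and 1, unless separate chaining is used.
With linear probing, if the position computed by the hash function is already occupied, the search moves forward one slot at a time. On reaching the end of the array it wraps around to the beginning and continues until a free slot is found.
Lookup works the same way.
Deletion is lazy: the slot is only marked as deleted, and the element is actually removed later, when the table is reorganized.

The analysis rests on two assumptions: (1) the table is large enough; (2) every element is independent. When the second assumption fails, for example when all elements hash to the same position, the average insertion cost grows much faster than the load factor.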
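The probing, wrap-around, and lazy deletion described above can be sketched as follows. This is a minimal illustration, not SGI code: the class name LinearProbeTable, its members, and the choice of `key % capacity` as the hash function are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical open-addressing table of ints (illustration only, not SGI code).
class LinearProbeTable {
public:
    explicit LinearProbeTable(std::size_t capacity)
        : slots_(capacity), states_(capacity, EMPTY) {
        assert(capacity > 0);
    }

    // Returns false when the key is already present or the table is full.
    bool insert(int key) {
        if (contains(key)) return false;
        std::size_t pos = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (states_[pos] != FULL) {  // EMPTY and DELETED slots can be reused
                slots_[pos] = key;
                states_[pos] = FULL;
                return true;
            }
            pos = (pos + 1) % slots_.size();  // wrap around to the head
        }
        return false;
    }

    bool contains(int key) const { return locate(key) != slots_.size(); }

    // Lazy deletion: the slot is only marked, nothing is moved.
    bool erase(int key) {
        std::size_t pos = locate(key);
        if (pos == slots_.size()) return false;
        states_[pos] = DELETED;
        return true;
    }

private:
    enum State { EMPTY, FULL, DELETED };

    // Assumed hash function: the key modulo the table size.
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    // Index of key, or slots_.size() when the key is absent.
    std::size_t locate(int key) const {
        std::size_t pos = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (states_[pos] == EMPTY) break;  // a never-used slot ends the search
            if (states_[pos] == FULL && slots_[pos] == key) return pos;
            pos = (pos + 1) % slots_.size();
        }
        return slots_.size();
    }

    std::vector<int> slots_;
    std::vector<State> states_;
};
```

The DELETED marker is what makes lazy deletion safe here: a search must step over deleted slots instead of stopping, otherwise keys stored further along the same probe run would become unreachable.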

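Quadratic probing

The probe function is F(i) = i^2. If the hash function places a new element at position H and that position is already occupied, the positions H + 1^2, H + 2^2, H + 3^2, and so on are tried in turn, rather than H + 1, H + 2, H + 3.

If the table size is a prime number and the load factor is kept below 0.5, an insertion is guaranteed to find a free slot, and the book states that each insertion then needs no more than 2 probes.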
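To make the probe order concrete, the sketch below lists the positions quadratic probing would visit. The function name quadratic_probe_sequence and its parameters are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Positions tried by quadratic probing: (home + i*i) % table_size
// for i = 0, 1, 2, ... (hypothetical helper, for illustration only).
std::vector<std::size_t> quadratic_probe_sequence(std::size_t home,
                                                  std::size_t table_size,
                                                  std::size_t attempts) {
    assert(table_size > 0);
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < attempts; ++i) {
        positions.push_back((home + i * i) % table_size);
    }
    return positions;
}
```

For a home position of 3 in a table of size 11, the first six probes are 3, 4, 7, 1, 8, 6. They are all distinct, which is the property that a prime table size and a load factor below 0.5 rely on.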

Separate chaining

Each table slot maintains a list, and elements are inserted into and removed from that list.
The SGI hashtable uses separate chaining.

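Buckets and nodes of hashtable

Each element of the hashtable's table is called a bucket. The name suggests that a slot may hold not just one node but a whole "bucketful" of nodes.

template <class Value>
struct __hashtable_node {
    __hashtable_node* next;
    Value val;
};

The linked list in each bucket is not the STL list or slist; the hashtable maintains the hash table nodes shown above by itself.
The collection of buckets is a vector, which gives it the ability to grow.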
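The combination of a vector of buckets and hand-maintained singly linked nodes can be sketched as below. This is a simplified illustration under stated assumptions, not the SGI implementation: IntNode and ChainedIntSet are hypothetical names, the element type is fixed to int, the hash is `value % bucket count`, and the table never grows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type, modelled on __hashtable_node but fixed to int.
struct IntNode {
    IntNode* next;
    int val;
};

// Hypothetical separate-chaining set of ints with a fixed bucket count.
class ChainedIntSet {
public:
    explicit ChainedIntSet(std::size_t n_buckets)
        : buckets_(n_buckets, static_cast<IntNode*>(nullptr)), size_(0) {
        assert(n_buckets > 0);
    }

    ~ChainedIntSet() {
        for (std::size_t i = 0; i < buckets_.size(); ++i) {
            IntNode* cur = buckets_[i];
            while (cur) {  // free every node of this bucket's chain
                IntNode* next = cur->next;
                delete cur;
                cur = next;
            }
        }
    }

    ChainedIntSet(const ChainedIntSet&) = delete;
    ChainedIntSet& operator=(const ChainedIntSet&) = delete;

    // Inserts v at the head of its bucket's chain; false if already present.
    bool insert(int v) {
        if (contains(v)) return false;
        const std::size_t b = bucket_of(v);
        IntNode* n = new IntNode;
        n->val = v;
        n->next = buckets_[b];
        buckets_[b] = n;
        ++size_;
        return true;
    }

    bool contains(int v) const {
        for (const IntNode* cur = buckets_[bucket_of(v)]; cur; cur = cur->next) {
            if (cur->val == v) return true;
        }
        return false;
    }

    // Number of nodes currently chained in bucket b.
    std::size_t chain_length(std::size_t b) const {
        assert(b < buckets_.size());
        std::size_t n = 0;
        for (const IntNode* cur = buckets_[b]; cur; cur = cur->next) ++n;
        return n;
    }

    double load_factor() const {
        return static_cast<double>(size_) / static_cast<double>(buckets_.size());
    }

private:
    std::size_t bucket_of(int v) const {
        return static_cast<std::size_t>(v) % buckets_.size();
    }

    std::vector<IntNode*> buckets_;
    std::size_t size_;
};
```

With 5 buckets, the values 1, 6, and 11 all land in bucket 1 and form one chain of three nodes. Because chains can grow without bound, the load factor may exceed 1 with this scheme.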

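hashtable iterator

A hashtable iterator keeps a link to the entire buckets vector and records the node it currently points to.
Advancing moves one position forward from the current node. Because the nodes live in a list, the next pointer is followed.
If the current node is at the tail of its list, the iterator jumps to the next bucket, that is, to the head of the next list.
The hashtable iterator has no decrement operation, and hashtable defines no reverse iterator.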
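The advance step can be sketched as a free function. The names Node and next_node are hypothetical, the element type is fixed to int, and the bucket of a node is assumed to be `val % bucket count`; the real iterator recomputes the bucket through the table's bkt_num() instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type for this sketch.
struct Node {
    Node* next;
    int val;
};

// Returns the node that follows cur in iteration order, or nullptr at the end.
const Node* next_node(const std::vector<Node*>& buckets, const Node* cur) {
    assert(cur != nullptr && !buckets.empty());
    if (cur->next) return cur->next;  // still inside the same bucket's list

    // Tail of the list: recompute the current bucket, then scan forward
    // for the first non-empty bucket.
    std::size_t bucket = static_cast<std::size_t>(cur->val) % buckets.size();
    while (++bucket < buckets.size()) {
        if (buckets[bucket]) return buckets[bucket];
    }
    return nullptr;  // no later bucket holds a node
}
```

Because a node only stores a forward pointer and the bucket scan only moves toward higher indices, there is no cheap way to step backwards, which is consistent with the iterator being forward-only.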

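Data structure of hashtable

The buckets collection is a vector so that it can grow dynamically.
<stl_hash_fun.h> defines several ready-made hash functions, all of them function objects (functors). The hash function determines the position of an element, that is, the bucket it belongs to. The function actually called is bkt_num(), which calls the hash function to obtain a value on which the modulus operation can be performed.
The vector size is always a prime number: 28 primes are prepared in advance, along with a function that looks up the prime in that list closest to, and not less than, a given number.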
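The prime lookup can be sketched on its own as below. The prime values are the ones that appear in the full listing of this article; the names kNumPrimes, kPrimeList, and next_prime are stand-ins chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// The 28 primes from the listing, each roughly twice the previous one.
static const int kNumPrimes = 28;
static const unsigned long kPrimeList[kNumPrimes] = {
    53ul,         97ul,         193ul,        389ul,        769ul,
    1543ul,       3079ul,       6151ul,       12289ul,      24593ul,
    49157ul,      98317ul,      196613ul,     393241ul,     786433ul,
    1572869ul,    3145739ul,    6291469ul,    12582917ul,   25165843ul,
    50331653ul,   100663319ul,  201326611ul,  402653189ul,  805306457ul,
    1610612741ul, 3221225473ul, 4294967291ul
};

// Smallest listed prime that is not less than n; the largest prime
// when n exceeds every entry. The list is sorted, as lower_bound requires.
inline unsigned long next_prime(unsigned long n) {
    const unsigned long* first = kPrimeList;
    const unsigned long* last = kPrimeList + kNumPrimes;
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}
```

Note that std::lower_bound returns the first entry that is not less than n, so asking for 53 yields 53 itself, while asking for 54 yields 97.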

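Construction and memory management of hashtable

Which bucket does an element fall into? Deciding this is the job of the hash function, but SGI STL wraps it: the request first goes to bkt_num(), which in turn calls the hash function to obtain a value on which the modulus operation can be performed.
The reason is that some element types cannot be used with the modulus operator directly, for example strings.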

// Version 1: takes a value and the number of buckets
size_type bkt_num(const value_type& obj, size_t n) const {
    return bkt_num_key(get_key(obj), n);  // calls version 4
}

// Version 2: takes only a value
size_type bkt_num(const value_type& obj) const {
    return bkt_num_key(get_key(obj));  // calls version 3
}

// Version 3: takes only a key
size_type bkt_num_key(const key_type& key) const {
    return bkt_num_key(key, buckets.size());  // calls version 4
}

// Version 4: takes a key and the number of buckets
size_type bkt_num_key(const key_type& key, size_t n) const {
    return hash(key) % n;  // SGI's built-in hash() functions are covered in the hash functions section
}

Copying and clearing

A hash table is built from a vector and linked lists, so copying it and clearing it both require care about releasing memory.

void clear() {
    // For every bucket
    for (size_type i = 0; i < buckets.size(); ++i) {
        node* cur = buckets[i];
        // Delete every node in the bucket's list
        while (cur != 0) {
            node* next = cur->next;
            delete_node(cur);
            cur = next;
        }
        buckets[i] = 0;  // the bucket now holds a null pointer
    }
    num_elements = 0;  // the total number of nodes is now 0
    // Note: the buckets vector itself is not released; it keeps its previous size
}

void copy_from(const hashtable& ht) {
    // First empty our own buckets vector; vector::clear() removes all of its elements
    buckets.clear();
    // Reserve space so that our buckets vector matches the other table's.
    // If we already have at least that much capacity nothing changes; otherwise it grows
    buckets.reserve(ht.buckets.size());
    // Insert n null pointers at the end of our buckets vector.
    // The vector is empty at this point, so the end is also the beginning
    buckets.insert(buckets.end(), ht.buckets.size(), (node*)0);
    __STL_TRY {
        // For every bucket of the source table
        for (size_type i = 0; i < ht.buckets.size(); ++i) {
            // Copy each vector element (a pointer to a hashtable node)
            if (const node* cur = ht.buckets[i]) {
                node* copy = new_node(cur->val);
                buckets[i] = copy;
                // Copy every remaining node of the same bucket list
                for (node* next = cur->next; next; cur = next, next = cur->next) {
                    copy->next = new_node(next->val);
                    copy = copy->next;
                }
            }
        }
        // Record the number of nodes (the size of the hashtable)
        num_elements = ht.num_elements;
    }
    __STL_UNWIND(clear());
}

Complete code

#include <algorithm>  // std::lower_bound
#include <climits>    // UINT_MAX
#include <cstddef>    // std::size_t, std::ptrdiff_t
#include <cstdlib>    // std::exit
#include <iostream>
#include <new>        // placement new, std::set_new_handler
#include <utility>    // std::pair
#include <vector>

#ifdef __STL_USE_EXCEPTIONS
#define __STL_TRY try
#define __STL_UNWIND(action) catch(...) { action; throw; }
#else
#define __STL_TRY
#define __STL_UNWIND(action)
#endif

// Thin wrapper over an SGI-style allocator: Alloc must provide static
// allocate(bytes) and deallocate(pointer, bytes).
template<class T, class Alloc>
class simple_alloc {
public:
    static T* allocate(std::size_t n) {
        return 0 == n ? 0 : (T*)Alloc::allocate(n * sizeof(T));
    }
    static T* allocate(void) {
        return (T*)Alloc::allocate(sizeof(T));
    }
    static void deallocate(T* p, std::size_t n) {
        if (n != 0) {
            Alloc::deallocate(p, n * sizeof(T));
        }
    }
    static void deallocate(T* p) {
        Alloc::deallocate(p, sizeof(T));
    }
};

namespace Chy {
    template <class T>
    inline T* _allocate(std::ptrdiff_t size, T*) {
        std::set_new_handler(0);
        T* tmp = (T*)(::operator new((std::size_t)(size * sizeof(T))));
        if (tmp == 0) {
            std::cerr << "out of memory" << std::endl;
            std::exit(1);
        }
        return tmp;
    }

    template <class T>
    inline void _deallocate(T* buffer) {
        ::operator delete(buffer);
    }

    template <class T1, class T2>
    inline void _construct(T1* p, const T2& value) {
        new (p) T1(value);  // placement new: construct a T1 from value at address p
    }

    template <class T>
    inline void _destroy(T* ptr) {
        ptr->~T();
    }

    template <class T>
    class allocator {
    public:
        typedef T value_type;
        typedef T* pointer;
        typedef const T* const_pointer;
        typedef T& reference;
        typedef const T& const_reference;
        typedef std::size_t size_type;
        typedef std::ptrdiff_t difference_type;

        template <class U>
        struct rebind {
            typedef allocator<U> other;
        };

        pointer allocate(size_type n, const void* hint = 0) {
            return _allocate((difference_type)n, (pointer)0);
        }
        void deallocate(pointer p, size_type n) {
            _deallocate(p);
        }
        void construct(pointer p, const T& value) {
            _construct(p, value);
        }
        void destroy(pointer p) {
            _destroy(p);
        }
        pointer address(reference x) {
            return (pointer)&x;
        }
        const_pointer const_address(const_reference x) {
            return (const_pointer)&x;
        }
        size_type max_size() const {
            return size_type(UINT_MAX / sizeof(T));
        }
    };
}

template <class Value>
struct __hashtable_node {
    __hashtable_node* next;
    Value val;
};

// Forward declaration: hashtable names its iterator before the iterator is defined
template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
struct __hashtable_iterator;

// Assumes long has at least 32 bits
static const int __stl_num_primes = 28;
static const unsigned long __stl_prime_list[__stl_num_primes] = {
    53ul,         97ul,         193ul,        389ul,        769ul,
    1543ul,       3079ul,       6151ul,       12289ul,      24593ul,
    49157ul,      98317ul,      196613ul,     393241ul,     786433ul,
    1572869ul,    3145739ul,    6291469ul,    12582917ul,   25165843ul,
    50331653ul,   100663319ul,  201326611ul,  402653189ul,  805306457ul,
    1610612741ul, 3221225473ul, 4294967291ul
};

// Among the 28 primes above, find the one closest to and not less than n
inline unsigned long __stl_next_prime(unsigned long n) {
    const unsigned long* first = __stl_prime_list;
    const unsigned long* last = __stl_prime_list + __stl_num_primes;
    // lower_bound() requires a sorted range; the list is already sorted
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}

/*
 * Value:      the value type stored in a node
 * Key:        the key type of a node
 * HashFcn:    the type of the hash function
 * ExtractKey: how to extract the key from a node (function or functor)
 * EqualKey:   how to decide whether two keys are equal (function or functor)
 * Alloc:      the allocator; SGI defaults to std::alloc
 */
template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
class hashtable {
public:
    typedef Key key_type;
    typedef Value value_type;
    typedef HashFcn hasher;       // new name for a template type parameter
    typedef EqualKey key_equal;   // new name for a template type parameter
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    typedef __hashtable_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> iterator;

private:
    // The following three are all function objects.
    // <stl_hash_fun.h> defines hashers for several standard types (int, C-style strings, ...)
    hasher hash;         // hash function
    key_equal equals;    // decides whether two keys are equal
    ExtractKey get_key;  // extracts the key from a node

    typedef __hashtable_node<Value> node;
    // Dedicated node allocator
    typedef simple_alloc<node, Alloc> node_allocator;

    // Allocates and constructs a node
    node* new_node(const value_type& obj) {
        node* n = node_allocator::allocate();
        n->next = 0;
        __STL_TRY {
            Chy::_construct(&n->val, obj);
            return n;
        }
        __STL_UNWIND(node_allocator::deallocate(n));
    }

    // Destroys and releases a node
    void delete_node(node* n) {
        Chy::_destroy(&n->val);
        node_allocator::deallocate(n);
    }

public:
    // The collection of buckets is a vector whose elements are node*.
    // std::vector's default allocator is used here because Alloc is an SGI-style allocator
    std::vector<node*> buckets;
    size_type num_elements;  // number of nodes

public:
    // The number of buckets is the size of the buckets vector
    size_type bucket_count() const { return buckets.size(); }

    // Maximum number of buckets; its value is 4294967291
    size_type max_bucket_count() const {
        return __stl_prime_list[__stl_num_primes - 1];
    }

    // Constructor
    hashtable(size_type n, const HashFcn& hf, const EqualKey& eql)
        : hash(hf), equals(eql), get_key(ExtractKey()), num_elements(0) {
        initialize_buckets(n);
    }

    // Initialization
    void initialize_buckets(size_type n) {
        // Example: passing 50 yields 53;
        // space for 53 elements is reserved and all of them are set to 0
        const size_type n_buckets = next_size(n);
        buckets.reserve(n_buckets);
        // Every bucket starts out as a null node*
        buckets.insert(buckets.begin(), n_buckets, (node*)0);
    }

public:
    // Version 1: takes a value and the number of buckets
    size_type bkt_num(const value_type& obj, std::size_t n) const {
        return bkt_num_key(get_key(obj), n);  // calls version 4
    }
    // Version 2: takes only a value
    size_type bkt_num(const value_type& obj) const {
        return bkt_num_key(get_key(obj));  // calls version 3
    }
    // Version 3: takes only a key
    size_type bkt_num_key(const key_type& key) const {
        return bkt_num_key(key, buckets.size());  // calls version 4
    }
    // Version 4: takes a key and the number of buckets
    size_type bkt_num_key(const key_type& key, std::size_t n) const {
        return hash(key) % n;  // SGI's built-in hash() functions are covered in the hash functions section
    }

public:
    // next_size() returns the listed prime closest to and not less than n
    size_type next_size(size_type n) const { return __stl_next_prime(n); }

    // Insertion and table rebuilding
    // Insert an element; duplicates are not allowed
    std::pair<iterator, bool> insert_unique(const value_type& obj) {
        // Check whether the table has to be rebuilt, and grow it if so
        resize(num_elements + 1);
        return insert_unique_noresize(obj);
    }

    // Decides whether the table has to be rebuilt; returns at once if not
    void resize(size_type num_elements_hint) {
        // Rule: compare the element count (including the new element) with the size
        // of the buckets vector; if the former is larger, rebuild the table.
        // Hence the maximum capacity of each bucket (list) equals the size of the buckets vector
        const size_type old_n = buckets.size();
        if (old_n < num_elements_hint) {  // reallocation needed
            // Find the next prime
            const size_type n = next_size(num_elements_hint);
            if (n > old_n) {
                std::vector<node*> tmp(n, (node*)0);
                // Process every old bucket
                for (size_type bucket = 0; bucket < old_n; bucket++) {
                    // Point at the first node of the bucket's list
                    node* first = buckets[bucket];
                    // Process every node of the old bucket's list
                    while (first) {
                        // Find the new bucket this node falls into
                        size_type new_bucket = bkt_num(first->val, n);
                        // The next four steps are rather subtle
                        // (1) make the old bucket point at the next node of its list
                        buckets[bucket] = first->next;
                        // (2)(3) insert the current node at the head of the new bucket's list
                        first->next = tmp[new_bucket];
                        tmp[new_bucket] = first;
                        // (4) go back to the old bucket's list and handle the next node
                        first = buckets[bucket];
                    }
                }
                // Swap the old and new buckets; tmp's memory is released on scope exit
                buckets.swap(tmp);
            }
        }
    }

    // Insert a new node without rebuilding the table; duplicate keys are not allowed
    std::pair<iterator, bool> insert_unique_noresize(const value_type& obj) {
        const size_type n = bkt_num(obj);  // obj belongs in bucket n
        node* first = buckets[n];          // first points at the head of that bucket's list
        // If buckets[n] is occupied, first is non-null and the loop walks the whole list
        for (node* cur = first; cur; cur = cur->next) {
            if (equals(get_key(cur->val), get_key(obj))) {
                // An equal key already exists: do not insert, return at once
                return std::pair<iterator, bool>(iterator(cur, this), false);
            }
        }
        // After the loop (or if it never ran) first still points at the head of the list
        node* tmp = new_node(obj);  // create the new node
        tmp->next = first;
        buckets[n] = tmp;           // the new node becomes the first node of the list
        ++num_elements;             // one more node
        return std::pair<iterator, bool>(iterator(tmp, this), true);
    }

    // The other insertion behaviour available to clients: insert_equal instead of insert_unique.
    // Insert an element; duplicates are allowed
    iterator insert_equal(const value_type& obj) {
        // Check whether the table has to be rebuilt, and grow it if so
        resize(num_elements + 1);
        return insert_equal_noresize(obj);
    }

    // Insert a new node without rebuilding the table; duplicate keys are allowed
    iterator insert_equal_noresize(const value_type& obj) {
        const size_type n = bkt_num(obj);  // obj belongs in bucket n
        node* first = buckets[n];          // first points at the head of that bucket's list
        // If buckets[n] is occupied, first is non-null and the loop walks the whole list
        for (node* cur = first; cur; cur = cur->next) {
            if (equals(get_key(cur->val), get_key(obj))) {
                // An equal key exists: insert right after it and return
                node* tmp = new_node(obj);
                tmp->next = cur->next;
                cur->next = tmp;
                ++num_elements;
                return iterator(tmp, this);  // iterator to the new node
            }
        }
        // Reaching this point means no equal key was found
        node* tmp = new_node(obj);
        tmp->next = first;
        buckets[n] = tmp;
        ++num_elements;
        return iterator(tmp, this);
    }

    void clear() {
        // For every bucket
        for (size_type i = 0; i < buckets.size(); ++i) {
            node* cur = buckets[i];
            // Delete every node in the bucket's list
            while (cur != 0) {
                node* next = cur->next;
                delete_node(cur);
                cur = next;
            }
            buckets[i] = 0;  // the bucket now holds a null pointer
        }
        num_elements = 0;  // the total number of nodes is now 0
        // Note: the buckets vector itself is not released; it keeps its previous size
    }

    void copy_from(const hashtable& ht) {
        // First empty our own buckets vector; vector::clear() removes all of its elements
        buckets.clear();
        // Reserve space to match the other table; nothing changes if we already have enough
        buckets.reserve(ht.buckets.size());
        // Insert n null pointers at the end; the vector is empty, so the end is the beginning
        buckets.insert(buckets.end(), ht.buckets.size(), (node*)0);
        __STL_TRY {
            for (size_type i = 0; i < ht.buckets.size(); ++i) {
                // Copy each vector element (a pointer to a hashtable node)
                if (const node* cur = ht.buckets[i]) {
                    node* copy = new_node(cur->val);
                    buckets[i] = copy;
                    // Copy every remaining node of the same bucket list
                    for (node* next = cur->next; next; cur = next, next = cur->next) {
                        copy->next = new_node(next->val);
                        copy = copy->next;
                    }
                }
            }
            // Record the number of nodes (the size of the hashtable)
            num_elements = ht.num_elements;
        }
        __STL_UNWIND(clear());
    }
};

template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
struct __hashtable_iterator {
    typedef hashtable<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> hashtable_type;
    typedef __hashtable_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> iterator;
    // (the const iterator typedef is omitted)
    typedef __hashtable_node<Value> node;
    typedef std::forward_iterator_tag iterator_category;
    typedef Value value_type;
    typedef std::ptrdiff_t difference_type;
    typedef std::size_t size_type;
    typedef Value& reference;
    typedef Value* pointer;

    node* cur;           // the node the iterator currently points to
    hashtable_type* ht;  // link to the container (needed to jump from bucket to bucket)

    __hashtable_iterator(node* n, hashtable_type* tab) : cur(n), ht(tab) {}
    __hashtable_iterator() {}
    reference operator*() const { return cur->val; }
    pointer operator->() const { return &(operator*()); }
    iterator& operator++();
    iterator operator++(int);
    bool operator==(const iterator& it) const { return cur == it.cur; }
    bool operator!=(const iterator& it) const { return cur != it.cur; }
};

template <class V, class K, class HF, class ExK, class EqK, class A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>&
__hashtable_iterator<V, K, HF, ExK, EqK, A>::operator++() {
    const node* old = cur;
    cur = cur->next;  // if a next node exists, that is the answer; otherwise enter the if below
    if (!cur) {
        // Locate the bucket of the old element, then move on to the head of the next non-empty bucket
        size_type bucket = ht->bkt_num(old->val);
        while (!cur && ++bucket < ht->buckets.size()) {
            cur = ht->buckets[bucket];
        }
    }
    return *this;
}

template <class V, class K, class HF, class ExK, class EqK, class A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>::operator++(int) {
    iterator tmp = *this;
    ++*this;  // calls the prefix operator++
    return tmp;
}

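Notes

hashtable is an internal implementation class and is not meant to be used directly by client code.
Clients use <hash_set.h> and <hash_map.h> instead.

When the number of elements exceeds the size of the buckets vector, the table is rebuilt.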
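The rebuild step can be sketched in isolation: every existing node is relinked into a larger bucket vector, and no node is copied or reallocated. The names Node and rehash are hypothetical, the element type is fixed to int, and the bucket is assumed to be `val % bucket count`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type for this sketch.
struct Node {
    Node* next;
    int val;
};

// Moves every node of old_buckets into a fresh vector of new_size buckets
// and returns it. The old buckets are left holding only null pointers.
std::vector<Node*> rehash(std::vector<Node*>& old_buckets, std::size_t new_size) {
    assert(new_size > 0);
    std::vector<Node*> fresh(new_size, static_cast<Node*>(nullptr));
    for (std::size_t b = 0; b < old_buckets.size(); ++b) {
        while (Node* first = old_buckets[b]) {
            // Bucket of this node in the new, larger table.
            const std::size_t target = static_cast<std::size_t>(first->val) % new_size;
            old_buckets[b] = first->next;  // unlink from the old chain
            first->next = fresh[target];   // push onto the head of the new chain
            fresh[target] = first;
        }
    }
    return fresh;
}
```

Two values that collided in a table of 3 buckets, such as 1 and 4, end up in separate buckets once the table has 5 buckets. In the full listing the caller then swaps the new vector with the old one.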

// Element lookup
iterator find(const key_type& key) {
    size_type n = bkt_num_key(key);  // first find the bucket the key falls into
    node* first;
    // Starting at the head of the bucket's list, compare each element's key; stop on a match
    for (first = buckets[n];
         first && !equals(get_key(first->val), key);
         first = first->next) {
    }
    return iterator(first, this);
}

// Element count
size_type count(const key_type& key) const {
    const size_type n = bkt_num_key(key);  // first find the bucket the key falls into
    size_type result = 0;
    // Walk the bucket's list from the head and add 1 for every matching key
    for (const node* cur = buckets[n]; cur; cur = cur->next) {
        if (equals(get_key(cur->val), key)) {
            ++result;
        }
    }
    return result;
}

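hash functions

The hash functions are function objects (functors).
bkt_num() calls the hash function defined here to obtain a value that can then be reduced modulo the size of the hashtable.
For integer types such as char, int, and long, the hash function does nothing and returns the value unchanged; for string types such as const char*, a conversion function has to be designed.

The built-in hash functions cover only the types above, so the SGI hashtable cannot handle other element types such as string, double, or float. To use such types, you have to define a hash function yourself.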
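One way to supply such a user-defined hash function for std::string is sketched below. It reuses the `h = 5*h + c` recurrence of __stl_hash_string from this article's listing; the names hash_c_string, StringHash, and bucket_for are hypothetical, and the cast to unsigned char is an added assumption so that the result does not depend on whether char is signed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same recurrence as __stl_hash_string: h = 5*h + character.
inline std::size_t hash_c_string(const char* s) {
    unsigned long h = 0;
    for (; *s; ++s) {
        h = 5 * h + static_cast<unsigned char>(*s);
    }
    return static_cast<std::size_t>(h);
}

// Hypothetical functor that could be passed as the HashFcn of a table
// keyed by std::string.
struct StringHash {
    std::size_t operator()(const std::string& s) const {
        return hash_c_string(s.c_str());
    }
};

// What bkt_num_key() does with the hash value: reduce it modulo the bucket count.
inline std::size_t bucket_for(const std::string& key, std::size_t n_buckets) {
    assert(n_buckets > 0);
    return StringHash()(key) % n_buckets;
}
```

For the key "Hello" the recurrence gives 60976, which maps to bucket 26 in a table of 53 buckets.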

// The hashtable code is the same as in the full listing earlier in this article;
// only the hash-function part is shown here.
#include <cstddef>
#include <iostream>

template <class Key>
struct hash {};

inline std::size_t __stl_hash_string(const char* s) {
    unsigned long h = 0;
    for (; *s; ++s) {
        h = 5 * h + *s;
    }
    return std::size_t(h);
}

// Every __STL_TEMPLATE_NULL that follows is defined as template<> in <stl_config.h>

int main() {
    const char* input_string("Hello");
    std::cout << input_string << std::endl;
    std::cout << __stl_hash_string(input_string) << std::endl;
}
