Hash Functions and Hash Tables

(ハッシュ関数とハッシュ表)

Data Structures and Algorithms

10th lecture, November 19, 2015

http://www.sw.it.aoyama.ac.jp/2015/DA/lecture10.html

Martin J. Dürst

AGU

© 2009-15 Martin J. Dürst 青山学院大学

Today's Schedule

 

Time Complexity for Dictionary Implementations up to Here

Implementation               Search     Insertion  Deletion
Sorted array                 O(log n)   O(n)       O(n)
Unordered array/linked list  O(n)       O(1)       O(n)
Balanced tree                O(log n)   O(log n)   O(log n)

 

Direct Addressing

Problem: Array size, non-numeric keys

Solution: key transformation
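
Direct addressing uses the key itself as the array index. The following is a minimal sketch in C, assuming small non-negative integer keys; the names `DirectTable`, `da_insert`, `da_search`, `da_delete` and the bound `MAX_KEY` are hypothetical and not taken from the lecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEY 1000   /* assumed bound: the array needs one slot per possible key */

typedef struct {
    bool used[MAX_KEY];   /* used[k] is true if key k is stored */
    int  value[MAX_KEY];  /* value[k] is the data stored for key k */
} DirectTable;

/* Every operation indexes the array directly with the key, so each is O(1). */
void da_insert(DirectTable *t, int key, int value)
{
    assert(key >= 0 && key < MAX_KEY);
    t->used[key] = true;
    t->value[key] = value;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
const int *da_search(const DirectTable *t, int key)
{
    assert(key >= 0 && key < MAX_KEY);
    return t->used[key] ? &t->value[key] : NULL;
}

void da_delete(DirectTable *t, int key)
{
    assert(key >= 0 && key < MAX_KEY);
    t->used[key] = false;
}
```

The sketch makes both problems visible: the array grows with the range of possible keys rather than with the number of stored items, and a string cannot be used as an index at all. Transforming the key into a small integer first addresses both.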

 

Overview of Hashing

(also called scatter storage technique)

Problem 1: Design of hash function

Problem 2: Resolution of conflicts

 

Overview of Hash Function

Step 2 is easy, so we concentrate on step 1.
(Often, step 1 alone is called the 'hash function'.)
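
Assuming the usual two-step split (step 1 converts the key into a large integer, step 2 reduces that integer to a table index with a modulo operation), the overview might be sketched in C as follows. The function names and the multiplier 31 are hypothetical placeholders, not the lecture's own choices.

```c
#include <assert.h>
#include <stddef.h>

/* Step 1 (assumed): turn the key into a large integer.
   A simple multiply-and-add loop stands in for a real hash function. */
unsigned int key_to_integer(const char *key)
{
    unsigned int hash = 0;
    while (*key)
        hash = hash * 31u + (unsigned char)*key++;
    return hash;
}

/* Step 2 (assumed): reduce the large integer to an index 0 .. table_size-1. */
size_t table_index(const char *key, size_t table_size)
{
    assert(table_size > 0);
    return key_to_integer(key) % table_size;
}
```

Step 2 is a single modulo operation, which is why the quality of a hash table depends almost entirely on how well step 1 mixes the bits of the key.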

 

Hash Function Example 1

(SDBM Hash Function)

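unsigned int sdbm_hash(const char key[])
{
    unsigned int hash = 0;  /* unsigned: overflow wraps around instead of being undefined */
    while (*key) {
        /* the shifts need parentheses because + and - bind tighter than << */
        hash = (unsigned char)*key++ + (hash << 6) + (hash << 16) - hash;
    }
    return hash;
}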

 

Hash Function Example 2

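(simplified from MurmurHash3; for 32-bit machines)

#include <stdint.h>

#define ROTL32(n,by) (((n)<<(by)) | ((n)>>(32-(by))))
#define C1 0xcc9e2d51u  // multiplier used by the 32-bit variant of MurmurHash3
#define R1 15           // rotation used by the 32-bit variant of MurmurHash3

uint32_t too_simple_hash(const uint32_t key[], int length)
{
    uint32_t h = 0;  // unsigned, so that shifts and overflow are well defined
    for (int i=0; i<length; i++) {
        uint32_t k = key[i] * C1;  // C1 is a constant
        h ^= ROTL32(k, R1);        // R1 is a constant
    }
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    return h;
}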

 

Evaluation of Hash Functions

 

Precautions for Hash Functions

 

Special Hash Functions

 

Cryptographic Hash Function

 

Conflict

 

Terms and Variables for Conflict Resolution

Chaining

 

Implementation of Chaining
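
Chaining keeps a linked list per bin, and all keys that hash to the same bin share that list. The following is a minimal sketch in C, not the lecture's implementation; `ChainTable`, `chain_search`, `chain_insert`, `chain_delete`, `bin_of`, and the fixed size `NUM_BINS` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BINS 8  /* assumed small, fixed table size for illustration */

typedef struct Node {
    char *key;
    int value;
    struct Node *next;  /* next item in the same bin */
} Node;

typedef struct {
    Node *bins[NUM_BINS];  /* each bin is the head of a linked list */
} ChainTable;

/* Placeholder hash: multiply-and-add, then reduce to a bin number. */
static size_t bin_of(const char *key)
{
    unsigned int h = 0;
    while (*key)
        h = h * 31u + (unsigned char)*key++;
    return h % NUM_BINS;
}

/* Returns a pointer to the value stored for key, or NULL if absent. */
int *chain_search(ChainTable *t, const char *key)
{
    for (Node *n = t->bins[bin_of(key)]; n != NULL; n = n->next)
        if (strcmp(n->key, key) == 0)
            return &n->value;
    return NULL;
}

/* Inserts a new key at the head of its bin, or updates an existing key. */
void chain_insert(ChainTable *t, const char *key, int value)
{
    int *found = chain_search(t, key);
    if (found != NULL) { *found = value; return; }
    Node *n = malloc(sizeof *n);
    assert(n != NULL);               /* sketch only: abort if out of memory */
    n->key = malloc(strlen(key) + 1);
    assert(n->key != NULL);
    strcpy(n->key, key);
    n->value = value;
    size_t b = bin_of(key);
    n->next = t->bins[b];
    t->bins[b] = n;
}

/* Unlinks and frees the node for key; returns 1 if removed, 0 if absent. */
int chain_delete(ChainTable *t, const char *key)
{
    Node **link = &t->bins[bin_of(key)];
    while (*link != NULL) {
        Node *n = *link;
        if (strcmp(n->key, key) == 0) {
            *link = n->next;
            free(n->key);
            free(n);
            return 1;
        }
        link = &n->next;
    }
    return 0;
}
```

A zero-initialized `ChainTable` is an empty table. Each operation walks only the list of a single bin, so its cost depends on the length of that list rather than on the total number of items.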

 

Open Addressing
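
Open addressing stores every item in the table itself and, on a conflict, probes other slots until a free one is found. The sketch below uses linear probing (try the next slot, wrapping around) with integer keys. It is an illustration under assumptions, not the lecture's code: `OpenTable`, `oa_find`, `oa_insert`, `oa_delete`, `home_slot`, the size `TABLE_SIZE`, and the use of a `DELETED` marker for removed items are all choices made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8  /* assumed small, fixed table size for illustration */

typedef enum { EMPTY, OCCUPIED, DELETED } SlotState;

typedef struct {
    SlotState state[TABLE_SIZE];
    int key[TABLE_SIZE];
    int value[TABLE_SIZE];
} OpenTable;

/* Placeholder hash: the key modulo the table size. */
static size_t home_slot(int key)
{
    return (unsigned int)key % TABLE_SIZE;
}

/* Returns the index of the slot holding key, or -1 if the key is absent. */
int oa_find(const OpenTable *t, int key)
{
    assert(t != NULL);
    size_t i = home_slot(key);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        if (t->state[i] == EMPTY)
            return -1;                 /* an empty slot ends the probe sequence */
        if (t->state[i] == OCCUPIED && t->key[i] == key)
            return (int)i;
        i = (i + 1) % TABLE_SIZE;      /* linear probing: try the next slot */
    }
    return -1;
}

/* Inserts or updates key; returns false only if the table is full. */
bool oa_insert(OpenTable *t, int key, int value)
{
    int found = oa_find(t, key);
    if (found >= 0) { t->value[found] = value; return true; }
    size_t i = home_slot(key);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        if (t->state[i] != OCCUPIED) { /* EMPTY and DELETED slots can be reused */
            t->state[i] = OCCUPIED;
            t->key[i] = key;
            t->value[i] = value;
            return true;
        }
        i = (i + 1) % TABLE_SIZE;
    }
    return false;
}

/* Marks the slot as DELETED so that later probe sequences are not cut short. */
bool oa_delete(OpenTable *t, int key)
{
    int found = oa_find(t, key);
    if (found < 0)
        return false;
    t->state[found] = DELETED;
    return true;
}
```

A zero-initialized `OpenTable` is empty because `EMPTY` is the first enumerator. Deletion cannot simply reset a slot to `EMPTY`: a key stored further along the same probe sequence would then become unreachable.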

Time Complexity of Hashing

(average, for chaining)
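
As a worked formula for the chaining case: write $n$ for the number of stored items, $m$ for the number of bins, and $\alpha$ for the load factor (these symbols are assumptions made here and may differ from the lecture's notation). If the hash function spreads keys evenly over the bins, the expected length of a chain is $\alpha$, so a search computes the hash and then walks one chain:

```latex
\alpha = \frac{n}{m}, \qquad
T_{\text{search}} = O(1 + \alpha)
```

As long as the table is resized so that $\alpha$ stays below a fixed constant, this is $O(1)$ on average.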

 

Expansion and Shrinking of Hash Table

 

Analysis of the Time Complexity of Expansion

(simple example of amortized analysis)
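
A worked version of the argument, under assumptions made here rather than taken from the lecture: the table starts with size 1 and doubles whenever it is full, each insertion costs 1, and each expansion costs one copy per item already stored. For $n$ insertions, expansions happen at sizes $1, 2, 4, \ldots, 2^{k}$ with $2^{k} < n$, so the total cost is

```latex
\underbrace{n}_{\text{insertions}}
+ \underbrace{\sum_{i=0}^{k} 2^{i}}_{\text{copies}}
= n + 2^{k+1} - 1
< n + 2n = 3n
```

The total is $O(n)$, so the amortized cost of a single insertion is $O(1)$ even though an individual insertion that triggers an expansion costs $O(n)$.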

 

Evaluation of Hashing

Advantages:

Problems:

 

Comparison of Dictionary Implementations

Implementation               Search     Insertion  Deletion   Sorting
Sorted array                 O(log n)   O(n)       O(n)       O(n)
Unordered array/linked list  O(n)       O(1)       O(n)       O(n log n)
Balanced tree                O(log n)   O(log n)   O(log n)   O(n)
Hash table                   O(1)       O(1)       O(1)       O(n log n)

 

The Ruby Hash Class

(Perl: hash; Java: HashMap; Python: dict)

 

Implementation of Hashing in Ruby

 

Summary

 

Glossary

direct addressing
直接アドレス表
hashing, scatter storage technique
ハッシュ法、挽き混ぜ法
hash function
ハッシュ関数
hash table
ハッシュ表
joseki
定石 (囲碁)
universal hashing
万能ハッシュ法
denial of service attack
DOS 攻撃、サービス拒否攻撃
perfect hash function
完全ハッシュ関数
cryptographic hash function
暗号技術的ハッシュ関数
electronic signatures
電子署名
conflict
激突
Poisson distribution
ポアソン分布
chaining
チェイン法、連鎖法
open addressing
開番地法、オープン法
load factor
占有率
linear probing
線形探査法
quadratic probing
二次関数探査法
divisor
(割り算の) 法
amortized analysis
償却分析
proximity search
近接探索
similarity search
類似探索