1
曹天杰
Tianjie Cao
tjcao@cumt.edu.cn
College of Computer Science and
Technology,China University of Mining and Technology,
Xuzhou,China
中国矿业大学计算机科学与技术学院
2003.5.26
Hash Functions
2
Hash Functions,Introduction
? Cryptographic hash functions
– Input – any length
– Output – fixed length
– H(x) – easy
– H(x) – one way
?,hard to invert”
– H(x) collision free
3
Purposes for hash functions
? Data Integrity
– Ex,Tripwire
– Message digest
? y = h(x),y is called the message digest,
? 160 bits in size –,birthday attack”
? Message Source
? Digital Signatures
? Message Authentication Codes (MAC)
4
Digital Signatures and Message
Authentication Code (MAC) overview
? Suppose Alice and Bob share a secret key k
which determines hash function hk
? Alice sends (x,y) to Bob where y = hk(x)
? Bob receives (x,y) and verifies with y =
hk(x),If condition holds,neither x nor y
was modified in transit,
5
Keyed hash functions
(X,Y,K,H) is a keyed hash family,where
? X is a set of possible messages
? Y is the set of possible message digests,or
authentication tags
? K is the keyspace
? H is a set of hash functions
? For each key K there is a hash function hK,X ? Y
in H
? Assume |X| >= |Y| (even better,2|X| >= |Y|)
6
Unkeyed hash functions
An unkeyed hash function is a mapping h,X ? Y,
where
? X is a set of possible messages
? Y is the set of possible message digests
?Unkeyed hash function
–|K| = 1
–Ex,SHA-1 (successor of MD4)
7
Conditions of a secure hash function
? Preimage
– Find x such that h(x) = y,given y and the function f(),
– one-way(preimage resistant)
? Second Preimage
– Find x’ != x,such that h(x) = h(x’),given x and the
function h(),
– weak collision resistance(Second preimage resistant)
? Collision
– Find h(x) = h(x’) such that x != x’,given function h()
– strong collision resistance(Collision resistant)
8
The random oracle model
This is an idealized model in which captures the
concept of a hash function in an ideal fashion,
? A hash function h,X ? Y is chosen at random
from the set FX,Y of all functions from X to Y,
Given x,its hash y can only be obtained from an
oracle
9
The random Oracle model
Alice
Oracle
x
h(x)
10
Ideal hash functions
Theorem
Let FX,Y be an,ideal” family of hash functions,
If h is a random function in FX,Y and X0 is the subset
of queried messages then
Pr[ h(x)=y ] = 1/M
for all x in X \X0 and all y in Y,where M is the
cardinality of Y,
11
Ideal hash functions
? it is a simple matter to compute the probability
that k random elements z1,.,,,zk ? Z are
distinct,
? The first choice z1 is arbitrary; the probability that
z2 != z1 is 1 - l/n; the probability that z3 is distinct
from z1 and z2 is 1 - 2/n,etc,
? we estimate the probability of no collisions to be
12
Ideal hash functions
? If x is a small real number,then 1 - x ? e-x,
? So we estimate the probability of at least
one collision to be
13
Ideal hash functions
? If we denote this probability by ?,then we
can solve for k as a function of n and ?
? If we ignore the term -k,then we estimate
14
Ideal hash functions
? If we take ? =,5,then our estimate is
? Taking n = 365 in our estimate,we get k ?
22.3,
? It is usually suggested that the minimum
acceptable size of a message digest is 128
bits (the birthday attack will require over
264 hashes in this case),
15
Iterated hash function
Suppose that compress,{0,1}m+t ? {0,1}m is a
compression function,
Preprocessing
Given input x,|x| >= m+t+1,construct y using a
public algorithm such that |y| is a multiple of t,
For example,y = x || pad(x)
Let y = y1 || y2 || … || yr,
1<=i<=r,|yi|=t
16
17
Iterated hash function
Processing
Let IV be an initial value of length m,
Compute the following,
– z 0 ? IV
– z 1 ? compress(z0|| y1)
– …
– …
– z r ? compress(zr-1|| yr)
18
Iterated hash function
Output
Let g,{0,1}m ? {0,1}l be a public function,
The iterated hush function is,h(x)= g(zr)
So the hush function is h,Ui = m+t+1 {0,1} i ?
{0,1}l
19
Merkle - Damgard hash function
Transforms a collision resistant compression
function,
– compress,{0,1}m+t ? {0,1}m
to a collision resistant hash function,
– Ui = m+t+1 {0,1} i ? {0,1}m
20
?The need for collision-resistant hashing has
been a long-time goal of modern cryptography,
?The problem,find a hash function that has,
?An input of variable length;
?An output of fixed length,
Why is Merkle-Damgard needed?
21
?Merkle & Damgard’s solution,
?Start with a compression function with,
?An input of fixed length
?An output of fixed length
?The compression function should be
collision resistant
?Proved that starting with a collision
resistant compression function,a collision
resistant hash function can be constructed,
Merkle & Damguard’s Solution
22
f
Start with a collision resistant compression
function f,
f,{0,1}m+t ? {0,1}m where t ≥ 2
x,
Message
of length
m+t
Message
of length
m
So f will always compress a block by at
least two bits,
The Compression Function
23
External ?
n ? | x |
k ? ? n / (t -1) ?
d ? n – (k -1)(t - 1)
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
yk+1 ? binary rep,of d
z1 ? 0m+1 || y1
g1 ? ?(z1)
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
h(x) ? gk+1
return (h(x))
The Algorithm
24
External ?
n ? | x |
k ? ? n / (t -1) ?
d ? n – (k -1)(t - 1)
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
yk+1 ? binary rep,of d
z1 ? 0m+1 || y1
g1 ? ?(z1)
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
h(x) ? gk+1
return (h(x))
n ? | x |
k ? ? n / (t -1) ?
d ? n –(k-1)(t-1)
Initialization of Constants
25
Comment f,{0,1}m+t ? {0,1}m where t ≥ 2
?Let t ≥2
?Let k be the number of compressions
?Let d be the number of zeroes to be padded to the last xk
n ? | x |
k ? ? n / (t -1) ?
d ? n –(k-1)(t-1)
Partition x 1,Parameters n,k,d
26
x1 x2 … xk-1 xk
x
x = x1 || x2 || …|| x k
|x1| = |x2| = … |x k-1| = t-1
|xk| = t-1-d
Partition x 2,Substrings
27
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
External ?
n ? | x |
k ? ? n / (t -1) ?
d ? n – (k -1)(t - 1)
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
yk+1 ? binary rep,of d
z1 ? 0m+1 || y1
g1 ? ?(z1)
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
h(x) ? gk+1
return (h(x))
Creating y from x
28
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
y1 y2 … yk-1 xk d 0’s
y1 y2 … yk-1 yk
x1 x2 … xk-1 xk
Steps in Creating y
29
z1 ? 0m+1 || y1
g1 ? ?(z1)
External ?
n ? | x |
k ? ? n / (t -1) ?
d ? n – (k -1)(t - 1)
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
yk+1 ? binary rep,of d
z1 ? 0m+1 || y1
g1 ? ?(z1)
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
h(x) ? gk+1
return (h(x))
Start the Compression Sequence
30
z1 ? 0m+1 || y1
g1 ? ?(z1)
y1 y2 … yk-1 yk
f
z1
IV
g1
First Compression
31
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
External ?
n ? | x |
k ? ? n / (t -1) ?
d ? n – (k -1)(t - 1)
for i ? 1 to k - 1
do yi ? xi
yk ? xk || 0d
yk+1 ? binary rep,of d
z1 ? 0m+1 || y1
g1 ? ?(z1)
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
h(x) ? gk+1
return (h(x))
Continue to Compress
32
y1 y2 … yk-1 yk
z1
IV
f g1
z2
f g2
zk
f gk+1
1 1
gk …
for i ? 1 to k
zi+1 ? gi || 1 || yi+1
gi+1 ? ?(zi+1)
Compression Loop
33
? Merkle and Damgard showed that given a
collision resistant compression function,a
collision resistant hash function could be
constructed,
? Merkle-Damgard Construction is used by
many of the popular hash functions,
– Examples
? MD4,MD5,SHA-1,and RIPEMD-160
Importance of the Merkle-Damgard
Construction
34
? Finding a collision resistant function can be
hard to find,
– Collisions have been found in some of the
popular hash functions
? Example MD5
Pitfall of the Merkle-Damgard
Construction
35
MD5 Box
Initial
128-bit vector
512-bit message chunks (16 words)
128-bit result
F,(x?y)?(~x ? z)
G:(x ? z) ?(y ?~ z)
H:x?y? z
I,y?(x ? ~z)
+,binary sum
x?y,x left rotate y bits
36
MD5,Padding
input Message
Output 128 bits Digest
Padding 512 bit block
Initial Value
1 2 3 4
Final Output
MD5 Transformation block by block
37
Padding Twist
? Given original message M,add padding bits
“10*” such that resulting length is 64 bits
less than a multiple of 512 bits,
? Append (original length in bits mod 264),
represented in 64 bits to the padded
message
? Final message is chopped 512 bits a block
38
Padding Twist
? Append padding bits,
– The message M is padded as M’ so that the length is
512t+448,
– Furthermore,padding is necessary even if M is of
length 512t+448 (512 bits are padded),
? Append length,
– A 64-bit representation of the length in bits of the
original message (before the padding) is appended to
the result of step 1 (least significant bytes first),
– The 64-bit string R is then padded to the padded
message M’ to form a string Y=M’||R of multiple 512-
bit string,
39
MD5 Process
? As many stages as the number of 512-bit
blocks in the final padded message
? Digest,4 32-bit words,MD=A|B|C|D
? Every message block contains 16 32-bit
words,m0|m1|m2…|m 15
– Digest MD0 initialized to,
A=01234567,B=89abcdef,C=fedcba98,
D=76543210
– Every stage consists of 4 passes over the
message block,each modifying MD
40
MD5 Blocks
MD5
MD5
MD5
MD5
512,B1
512,B2
512,B3
512,B4
Result
41
Processing of Block mi - 4 Passes
ABCD=fF(ABCD,mi,T[1..16])
ABCD=fG(ABCD,mi,T[17..32])
ABCD=fH(ABCD,mi,T[33..48])
ABCD=fI(ABCD,mi,T[49..64])
Mi (512)
+ + + +
A B C D
MDi (128)
MD i+1 (128)
42
Different Passes..,
? Different functions and constants are used
? Different set of mi is used
? Different set of shift amount is used
43
Functions and Random Numbers
? F(x,y,z) == (x?y)?(~x ? z)
– selection function
? G(x,y,z) == (x ? z) ?(y ?~ z)
? H(x,y,z) == x?y? z
? I(x,y,z) == y?(x ? ~z)
? Ti = int(232 * abs(sin(i))),0<i<65,
44
Compression function
– How to choose k,s,and i?
– /* Process each 16-word (512-bit) block,*/
– For q = 0 to N/16) - 1 do
? /* Copy block q into X,*/
? For J = 0 to 15 do
? Set X[j] to M[q*16 + j],
? end /* of loop on j */
? /* Save A as AA,B as BB,C as CC,and D as DD,*/
? AA = A
? BB = B
? CC = C
? DD = D
45
– /* Round 1,*/
– /* Let [abcd k s i] denote the operation */
/* a = b + ((a + F(b,c,d) + X[k] + T[i] <<<s) */
– /* do the following 16 operations,*/
? [ABCD 0 7 1]
? [DABC 1 12 2]
? [CDAB 2 17 3]
? [BCDA 3 22 4]
? [ABCD 4 7 5]
? [DABC 5 12 6]
? [CDAB 6 17 7]
? [BCDA 7 22 8]
46
? [ABCD 8 7 9]
? [DABC 9 12 10]
? [CDAB 10 17 11]
? [BCDA 11 22 12]
? [ABCD 12 7 13]
? [DABC 13 12 14]
? [CDBA 14 17 15]
? [BCDA 15 22 16]
47
– /* Round 2,*/
– /* Let [abcd k s i] denote the operation */
/* a = b + ((a + G(b,c,d) + X[k] + T[i] <<<s) */
– /* do the following 16 operations,*/
? [ABCD 1 5 17]
? [DABC 6 9 18]
? [CDAB 11 14 19]
? [BCDA 0 20 20]
? [ABCD 5 5 21]
? [DABC 10 9 22]
? [CDAB 15 14 23]
? [BCDA 4 20 24]
48
? [ABCD 9 5 25]
? [DABC 14 9 26]
? [CDAB 3 14 27]
? [BCDA 8 20 28]
? [ABCD 13 5 29]
? [DABC 2 9 30]
? [CDBA 7 14 31]
? [BCDA 12 20 32]
49
– /* Round 3,*/
– /* Let [abcd k s i] denote the operation
– a = b + ((a + H(b,c,d) + X[k] + T[i] <<<s),
– Do the following 16 operations,*/
? [ABCD 5 4 33]
? [DABC 8 11 34]
? [CDAB 11 16 35]
? [BCDA 14 23 36]
? [ABCD 1 4 37]
? [DABC 4 11 38]
? [CDAB 7 16 39]
50
? [BCDA 10 23 40]
? [ABCD 13 4 41]
? [DABC 0 11 42]
? [CDAB 3 16 43]
? [BCDA 6 23 44]
? [ABCD 9 4 45]
? [DABC 12 11 46]
? [CDBA 15 16 47]
? [BCDA 2 23 48]
51
– /* Round 4,*/
– /* Let [abcd k s i] denote the operation
– a = b + ((a + I(b,c,d) + X[k] + T[i] <<<s),
– Do the following 16 operations,*/
? [ABCD 0 6 49]
? [DABC 7 10 50]
? [CDAB 14 15 51]
? [BCDA 5 21 52]
? [ABCD 12 6 53]
? [DABC 3 10 54]
? [CDAB 10 15 55]
52
? [BCDA 1 21 56]
? [ABCD 8 6 57]
? [DABC 15 10 58]
? [CDAB 6 15 59]
? [BCDA 13 21 60]
? [ABCD 4 6 61]
? [DABC 11 10 62]
? [CDBA 2 15 63]
? [BCDA 9 21 64]
53
– /* Then increment each of the four registers by
the
– value it had before this block was started,*/
? A = A + AA
? B = B + BB
? C = C + CC
? D = D + DD
– end /* of loop on q */
54
Secure Hash Algorithm
? Developed by NIST,specified in the Secure
Hash Standard (SHS,FIPS Pub 180),1993
? SHA is specified as the hash algorithm in
the Digital Signature Standard (DSS),NIST
55
General Logic
? Input message must be < 264 bits
– not really a problem
? Message is processed in 512-bit blocks
sequentially
? Message digest is 160 bits
? SHA design is similar to MD5,but a lot
stronger
56
Basic Steps
Step1,Padding
Step2,Appending length as 64 bit unsigned
Step3,Initialize MD buffer 5 32-bit words
A|B|C|D|E
A = 67452301
B = efcdab89
C = 98badcfe
D = 10325476
E = c3d2e1f0
57
Basic Steps..,
Step 4,the 80-step processing of 512-bit
blocks – 4 rounds,20 steps each,
Each step t (0 <= t <= 79),
– Input,
? Wt – a 32-bit word from the message
? Kt – a constant,
? ABCDE,current MD,
– Output,
? ABCDE,new MD,
58
Basic Steps..,
? Only 4 per-round distinctive additive
constants
0 <=t<= 19 Kt = 5A827999
20<=t<=39 Kt = 6ED9EBA1
40<=t<=59 Kt = 8F1BBCDC
60<=t<=79 Kt = CA62C1D6
59
Basic Steps - The Heart Of The
Matter
A E B C D
A E B C D
+
+
+
+
ft
CLS30
CLS5
Wt
Kt
60
Basic Logic Functions
? Only 3 different functions
Round Function ft(B,C,D)
0 <=t<= 19 (B?C)?(~B ?D)
20<=t<=39 B?C?D
40<=t<=59 (B?C)?(B?D)?(C?D)
60<=t<=79 B?C?D
61
Twist With Wt’s
? Additional mixing used with input message
512-bit block
W0|W1|…|W 15 = m0|m1|m2…|m 15
For 15 < t <80,
Wt = Wt-16 ?Wt-14 ?Wt-8 ?Wt-3
? XOR is a very efficient operation,but with
multilevel shifting,it should produce very
extensive and random mixing!
62
SHA Versus MD5
? SHA is a stronger algorithm,
– Brute-force birthday attacks requires on the
order of 280 operations vs,264 for MD5
? SHA’s 80 steps (vs,64) and 160 bits hash
(vs,128) requires a little more computation
63
SHA-256,SHA-384,SHA-512
? Current popular hashes produce hash values of length n=128 (MD4
and MD5)
and n=160( SHA-1),and therefore can provide no more than 64 or 80
bits of security,respectively,against collision attacks,
Algorithm Message
Size (bits)
Block Size
(bits)
Word Size
(bits)
Digest
Size
(bits)
Security
(bits)
SHA-1 2^64 512 32 160 80
SHA-256 2^64 512 32 256 128
SHA-384 2^128 1024 64 384 192
SHA-512 2^128 1024 64 512 256
64
Message Authentication codes (MACs)
Source (K) Destination (K)
Message m
y = hk(m)
Check
y = hk(m)
m,tag y
65
Message Authentication Codes
? Oscar’s (adversary) goal,
– produce a pair (x,y) that is valid,but the key k
is not known
? Oscar knows
– valid pairs
Pairs = {(x1,y1),(x2,y2),...,(xq,yq)}
? forgery
– Oscar outputs an (x,y) where x is not in Pairs
66
Ways of creating a MAC
? Base MAC on block cipher
– block cipher already implemented,so part of
implementation is done
? MAC from an unkeyed hash
– just add a key to output of unkeyed hash
– requires careful analysis
? Create a customized MAC
67
Nested MACs
? Motivation,
– Cryptographic hash functions,such as MD5 and SHA-1,
are generally faster than secret-key cryptosystems,such
as DES,
– Library code for cryptographic functions is widely
available,
– There are no export restrictions,whereas symmetric
block ciphers,even when used for MAC,are restricted,
? To use cryptographic hash functions for MAC,we
has to embed the secret key into the un-keyed hash
functions,
68
Nested MACs
? The design objectives of HMAC is
– To use,without modification,available functions,
– To allow for easy replacement of underlying hash
functions in case faster and more secure hash functions
are found or required,
– To preserve the original performance of hash functions
without significant increase of executing time,
– To use and handle keys in simple ways,
– To have a well-understood cryptographic analysis of
the strength of the authentication mechanism based on
reasonable assumptions on the embedded hash
functions,
69
Nested MACs
? Nested MAC
– composition of 2 keyed hash families
? G o H = {g o h, g is in G,h is in H} where (g o
h)(k,l)(x) = hl(gk(x))
– Secure if the following holds (given unknown
key),
? G is collision-resistant
? H is secure as a MAC
70
Types of attacks on nested MACs
? forger for nested MAC
? forger for the little MAC
– attack on component MAC H
? unknown-key collision attack
71
Attack 1,Forger on nested MAC
? pair of keys (k,l) are kept secret
? Oscar,
– chooses an x
– oracle –,magic box”
– given x,oracle computes z = hl(gk(x))
– tries to find (x’,z) where x’ was not any x
given to oracle
72
Attack 2,Forger on smaller MAC
component of nested MAC (H family)
? key l is chosen and kept secret (l is in
keyspace of H family of hashes)
? Oscar,
– chooses y
– given y,oracle computes z = hl(y)
– tries to output (y’,z) where y’ was not in one of
its previous queries to oracle
73
Attack 3,Collision Finder for a hash
family
? key k in K is kept secret
? Oscar,
– chooses an x
– given x,oracle computes gk(x)
– tries to find x’ and x’’ where x’ != x’’ and gk(x’)
= gk(x’’)
74
HMAC
? nested MAC algorithm (proposed standard)
– based on SHA-1
– uses 512-bit key k
– 2 512-bit constants,ipad and opad
? 160-bit MAC
– HMACk(x) = SHA-1((k ? opad) || SHA-1((K
? ipad) || x))
? ipad component resistant against unknown-key
collision attack
75
76
CBC MAC
x1
+ + + +
IV
x2 x3 x4
y4 y3 y2 y1
DES DES DES DES
77
CBC MAC
? use block cipher in CBC mode with fixed
IV
? best general attack is birthday attack
78
References
The MD5 Message-Digest Algorithm
http://www.ietf.org/rfc/rfc1321.txt
HMAC,Keyed-Hashing for Message Authentication
http://www.ietf.org/rfc/rfc2104.txt
The Keyed-Hash Message Authentication Code (HMAC)
http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf
March 2002
Secure Hash Standard (SHS)
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
August 2002