Principles of Information Science
Chapter 5
Principles of Information Transferring:
Communication Theory
1. Model of Communication System
[Block diagram: Source → T → Channel → T⁻¹ → Sink, with noise entering the channel; X denotes the channel input and Y the channel output]
T: Transformation, to match the source with the channel.
Source, sink, and channel are given beforehand.
The Functions of Transformation
- Modulation: Spectrum Matching, Better Performance/Cost Trade-off
- Amplification: Signal/Noise Ratio Improvement
- Equalization: Channel Characteristic Adjustment
- Source Coding: Transmission Efficiency Improvement
- Channel Coding: Noise Immunity
- Cryptographic Coding: Security Protection
2. Model Analysis
Ignoring the content and utility factors, the communication model exhibits purely statistical properties.
Source Entropy: H(X); Joint Entropy: H(X, Y)
Mutual Information: I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)
Channel Capacity (definition): C = Max_{p(x)} I(X; Y)
A fundamental feature of communication: recovery of the sent waveform at the receiving end, with a certain fidelity, in the presence of noise.
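As a small numerical illustration of these definitions (an added sketch, not part of the original slides; the channel and source are assumed), the listing below evaluates H(X), H(Y) and I(X; Y) for a binary symmetric channel with crossover probability 0.1 driven by a uniform binary source.

# Minimal sketch: mutual information of an assumed binary symmetric channel.
import math

def H(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

eps = 0.1                      # assumed crossover probability of the BSC
p_x = [0.5, 0.5]               # uniform binary source

# Joint distribution p(x, y) for the BSC
p_xy = {(x, y): p_x[x] * (1 - eps if x == y else eps) for x in (0, 1) for y in (0, 1)}
p_y = [sum(p_xy[(x, y)] for x in (0, 1)) for y in (0, 1)]

H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy.values())
I_XY = H_X + H_Y - H_XY        # equals H(X) - H(X|Y) = H(Y) - H(Y|X)

print(f"H(X) = {H_X:.3f} bit, I(X;Y) = {I_XY:.3f} bit")   # about 0.531 bit for eps = 0.1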
Example (AWGN Channel):
p(y|x) = (2πσ²)^(-1/2) exp[-(y - x)²/(2σ²)]
I(X; Y) = H(Y) - H(Y|X),  where H(Y|X) = -∫∫ p(x) p(y|x) log p(y|x) dy dx  (integrals from -∞ to ∞)
        = H(Y) - (1/2) log(2πeσ²)
The only way to maximize I(X; Y) is to maximize H(Y). This requires Y to be a normal variable with zero mean; since Y = X + N, it requires X to be a normal variable with zero mean.
Let P_Y = P + σ². Then
C = (1/2) log(2πeP_Y) - (1/2) log(2πeσ²)
  = (1/2) log(P_Y/σ²)
  = (1/2) log₂(1 + P/N)  (bit/symbol),  where N = σ².
If X is a white Gaussian signal with bandwidth F and duration T, then 2FT symbols are transmitted within the duration T. Therefore we have
C = FT log₂(1 + P/N) bits,
the famous capacity formula.
[Figure: the signal volume pictured as a box with dimensions F (bandwidth), T (duration), and P (power)]
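The short sketch below (an added illustration with assumed numbers) evaluates the capacity formula and previews the parameter exchange discussed next: halving the bandwidth F gives the same C if the signal-to-noise ratio P/N is raised accordingly.

# Minimal sketch (assumed numbers): evaluating C = F*T*log2(1 + P/N) and the F <-> P/N exchange.
import math

def capacity_bits(F_hz, T_s, snr):
    """Total capacity in bits over duration T for bandwidth F and SNR = P/N."""
    return F_hz * T_s * math.log2(1.0 + snr)

F, T, snr = 3000.0, 1.0, 1000.0          # assumed: 3 kHz, 1 s, about 30 dB SNR
C = capacity_bits(F, T, snr)
print(f"C = {C:.0f} bits over {T} s  (~{C/T:.0f} bit/s)")

# Exchangeability: halving F keeps the same C if the SNR is raised to (1 + snr)**2 - 1.
snr2 = (1.0 + snr) ** 2 - 1.0
print(capacity_bits(F / 2, T, snr2))      # same C as above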
The signal volume to be transmitted through a channel must be smaller than the channel capacity provided.
From C = FT log₂(1 + P/N):
A) F, T, P are the basic parameters of a channel.
B) For a given C, the parameters are exchangeable. This provides great flexibility in communication system design.
C) Signal Division:
   Frequency Division -- FDMA
   Time Division -- TDMA
   Time-Frequency Division -- Frequency Hopping
D) It is the origin of CDMA.
For a given channel with additive white Gaussian noise, the maximum capacity is achieved when the signal itself is also white Gaussian in nature -- a noise-like signal.
This led to noise communication and then to pseudo-noise-coded communication, or code division multiple access -- CDMA.
In CDMA, each user is assigned a different, mutually orthogonal, code as its address.
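A toy sketch of this idea (added illustration; the Walsh codes and data bits are assumed, not taken from the slides): each user spreads its bit with a mutually orthogonal code, the chips add up on the shared channel, and each receiver recovers its own bit by correlating with its own code.

# Minimal CDMA sketch with 4-chip Walsh (orthogonal) codes -- assumed toy example.
codes = {
    "A": [+1, +1, +1, +1],
    "B": [+1, -1, +1, -1],
    "C": [+1, +1, -1, -1],
}
bits = {"A": +1, "B": -1, "C": +1}        # each user's data bit (+1 / -1)

# Transmitted chips: every user spreads its bit with its code; the channel adds them up.
channel = [sum(bits[u] * codes[u][i] for u in codes) for i in range(4)]

# Each receiver correlates the received chips with its own (orthogonal) code.
for u in codes:
    corr = sum(channel[i] * codes[u][i] for i in range(4)) / 4
    print(u, "recovered bit:", +1 if corr > 0 else -1)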
3. H(X) Analysis: Source Coding
To convey the full amount of information of the source, H(X), an encoder (Source Encoder: X → Y) with average code length l̄ must satisfy the relation below; otherwise distortion will unavoidably be introduced:
H(X) = -Σ_{n=1..N} p(x_n) log p(x_n) ≤ l̄ = Σ_{n=1..N} p(x_n) l_n
Thus the optimal coding is the one with l_n = -log p(x_n).
In the general case, -log_M p(x_n) ≤ l_n < -log_M p(x_n) + 1.
From the inequalities above we have
H_M(X) ≤ l̄ < H_M(X) + 1.
For the K-th extension of a memoryless source,
H_M(X^K) ≤ l̄_K < H_M(X^K) + 1,
hence H_M(X) ≤ l̄_K / K < H_M(X) + 1/K, and lim_{K→∞} l̄_K / K = H_M(X).
The first theorem of coding:
For a given discrete, stationary and memoryless source with entropy H_M(X), there must exist a scheme of source coding such that the average code length l̄ becomes the smallest possible with no distortion, provided the scheme is sufficiently complex and l̄ ≥ H_M(X).
Otherwise, when the condition l̄ ≥ H_M(X) does not hold, distortion is unavoidable.
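A quick numerical check of the bound H_M(X) ≤ l̄ < H_M(X) + 1 (an added sketch; the non-dyadic source probabilities are assumed, with M = 2): assigning each symbol the length l_n = ceil(-log₂ p(x_n)) always satisfies the bound.

# Minimal sketch: code lengths l_n = ceil(-log2 p_n) satisfy H(X) <= average length < H(X) + 1.
import math

p = [0.4, 0.3, 0.2, 0.1]                       # assumed non-dyadic source distribution
lengths = [math.ceil(-math.log2(pi)) for pi in p]

H_X = sum(-pi * math.log2(pi) for pi in p)     # source entropy in bits
l_avg = sum(pi * li for pi, li in zip(p, lengths))

print("lengths:", lengths)                     # [2, 2, 3, 4]
print(f"H(X) = {H_X:.3f}  <=  l_avg = {l_avg:.3f}  <  H(X)+1 = {H_X + 1:.3f}")
assert H_X <= l_avg < H_X + 1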
Example: A binary-coded 7-element source is given below.
X:        x1    x2    x3    x4     x5     x6     x7
P:        1/2   1/4   1/8   1/16   1/32   1/64   1/64
(A) l_n:  3     3     3     3      3      3      3
(B) l_n:  1     2     3     4      5      6      6
Using equal-length coding (A), 3 code elements are needed for each code word to be free of distortion.
Using the optimal coding (B), each length is determined by the corresponding probability, as seen above.
[Figure: the coding tree for scheme B. The resulting code words are
x1 → 1, x2 → 01, x3 → 001, x4 → 0001, x5 → 00001, x6 → 000001, x7 → 000000,
while scheme A uses the seven equal-length words 000, 001, 010, 011, 100, 101, 110.]
l_n = -log p_n : the optimal coding.
Comparison
For Scheme A: l̄ = 3
H(Y) = H(X)/3 = 21/32 bits/code element
p(y1 = 0) = 115/(115 + 77) ≈ 0.6
p(y2 = 1) = 77/(115 + 77) ≈ 0.4
For Scheme B: l̄ = 63/32 code elements/symbol
H(Y) = H(X)/l̄ = 1 bit/code element
p(y1 = 0) = p(y2 = 1) = 0.5
where H(X) = -Σ_n p_n log p_n = 63/32 bits/symbol.
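The small script below (an added check; the scheme B code words are the ones reconstructed above) recomputes the quantities used in this comparison: H(X), the average lengths of schemes A and B, and the fraction of 1s produced by scheme B.

# Minimal sketch re-deriving the comparison numbers for the 7-element source.
import math
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 8), F(1, 16), F(1, 32), F(1, 64), F(1, 64)]
code_B = ["1", "01", "001", "0001", "00001", "000001", "000000"]  # reconstructed scheme B
len_A = [3] * 7                                                    # equal-length scheme A

H_X = sum(-pi * math.log2(pi) for pi in p)              # 63/32 bits/symbol
l_A = sum(pi * li for pi, li in zip(p, len_A))          # 3
l_B = sum(pi * len(c) for pi, c in zip(p, code_B))      # 63/32 code elements/symbol

ones_B = sum(pi * c.count("1") for pi, c in zip(p, code_B))
print("H(X) =", H_X, "bits/symbol")                     # 1.96875 = 63/32
print("average length A =", l_A, " B =", l_B)
print("p(1) under scheme B =", ones_B / l_B)            # 1/2, so H(Y) = 1 bit/element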
4. I(X; Y) Analysis: Channel Coding
It has been proved that I(X; Y) is a concave (∩-convex) function of {p(x)} for given {p(y|x)}, and thus the maximum value of I(X; Y) over all possible sources, C, can be obtained:
C = Max_{p(x)} I(X; Y)   (b/s)
R -- the practical transmission rate over the channel
N -- Noise
P_e -- Error Probability
There had long been a traditional misunderstanding about the relationship between transmission efficiency and reliability over a noisy channel: that one can only be improved at the expense of the other.
From the analysis of I(X; Y) it is found that
P_e ≤ exp[-K E(R)]
[Figure: the error exponent E(R) as a function of R; E(R) > 0 for R < C and E(R) = 0 at R = C]
If R < C, then E(R) > 0, and it is possible to make P_e approach zero by making K, the length of the code, sufficiently large.
The second theorem of coding:
For any given discrete, stationary and memoryless noisy channel with capacity C, as long as R < C is kept, there must exist a scheme of channel coding such that R can be made arbitrarily close to C while P_e approaches 0, provided the complexity of the coding scheme is sufficiently high.
This becomes completely impossible if R > C.
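As a rough numerical illustration of the P_e ≤ exp[-K E(R)] behaviour (an added sketch, not from the slides, and using a simple repetition code rather than a capacity-approaching one): the error probability of a length-K repetition code over a binary symmetric channel falls exponentially in K, although its rate 1/K shrinks; good channel codes keep R near C instead.

# Minimal sketch: majority-decoded repetition code over a BSC with assumed crossover eps = 0.1.
# Illustrates only that P_e can be driven down exponentially with block length K.
from math import comb

def p_error_repetition(K, eps):
    """Probability that more than half of K repeated bits are flipped (K odd)."""
    return sum(comb(K, k) * eps**k * (1 - eps)**(K - k) for k in range((K + 1) // 2, K + 1))

eps = 0.1
for K in (1, 3, 7, 15, 31):
    print(f"K = {K:2d}  rate = {1/K:.3f}  P_e = {p_error_repetition(K, eps):.2e}")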
It has also been proved that I(X; Y) is a convex (∪-convex) function of p(y|x) when {p(x)} is given. Thus a minimum value of I(X; Y) can be obtained:
R(D) = Min over {p(y|x)} of I(X; Y), subject to the average distortion not exceeding D,
which is defined as the rate-distortion function, where the average distortion is
Σ_i Σ_j p(x_i, y_j) d(x_i, y_j),  with d(x_i, y_j) = 0 if x_i = y_j and 1 otherwise.
The third theorem of coding:
For any given discrete, stationary and memoryless source and an allowed decoding distortion D, as long as the compression rate satisfies R > R(D), there must exist a scheme of source coding such that the compression ratio reaches the highest possible while the average decoding distortion never exceeds D.
However, if R > R(D) no longer holds, i.e. R < R(D), the average decoding distortion will certainly exceed D.
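For a concrete instance of R(D) (an added sketch; the binary case below is an assumption, not taken from the slides): for a Bernoulli(p) source with Hamming distortion, the rate-distortion function has the standard closed form R(D) = H_b(p) - H_b(D) for 0 ≤ D ≤ min(p, 1-p), and 0 beyond that.

# Minimal sketch: R(D) of a Bernoulli(p) source under Hamming distortion (standard closed form).
import math

def h_b(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def rate_distortion_binary(p, D):
    """R(D) = H_b(p) - H_b(D) for D < min(p, 1-p), else 0."""
    return max(0.0, h_b(p) - h_b(D)) if D < min(p, 1 - p) else 0.0

p = 0.5                                   # assumed source: fair binary source
for D in (0.0, 0.05, 0.1, 0.25, 0.5):
    print(f"D = {D:.2f}  R(D) = {rate_distortion_binary(p, D):.3f} bit/symbol")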
Security Coding
[Block diagram: M → T_K → E → T_K⁻¹ → M, with the key K shared by the encrypting and decrypting transformations]
Encryption: E = f(M, K) = T_K(M)
Decryption: M = T_K⁻¹(E)
Requirement (perfect secrecy): P(M|E) = P(M), i.e. I(M; E) = 0.
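A minimal sketch of this requirement (added illustration, using the one-time pad as the example transformation T_K): with E = M XOR K and a uniformly random key K as long as the message and used only once, every ciphertext is equally likely for every message, so observing E gives no information about M, i.e. I(M; E) = 0.

# Minimal one-time-pad sketch: T_K is bitwise XOR with a uniform random key (assumed example).
import secrets

def encrypt(message: bytes, key: bytes) -> bytes:
    """E = T_K(M): XOR each message byte with the corresponding key byte."""
    return bytes(m ^ k for m, k in zip(message, key))

def decrypt(cipher: bytes, key: bytes) -> bytes:
    """M = T_K^{-1}(E): XOR again with the same key."""
    return bytes(c ^ k for c, k in zip(cipher, key))

M = b"ATTACK AT DAWN"
K = secrets.token_bytes(len(M))           # uniform key, as long as the message, used once
E = encrypt(M, K)
assert decrypt(E, K) == M                 # the legitimate receiver recovers M
# Without K, every 14-byte message is an equally plausible preimage of E: P(M|E) = P(M).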
5. Comprehensive Information and Communication Theory
5-1 Compression Coding: Higher Efficiency
5-2 Error-Correcting Coding: Stronger Immunity
5-3 Cryptography: Better Security