16.322 Stochastic Estimation and Control, Fall 2004
Prof. Vander Velde
Lecture 4
Last time: Left off with characteristic function.
4. Prove $\phi_S(t) = \prod_i \phi_{X_i}(t)$ where $S = X_1 + X_2 + \dots + X_n$ and the $X_i$ are independent.

Let $S = X_1 + X_2 + \dots + X_n$ where the $X_i$ are independent. Then

$$\phi_S(t) = E\left[e^{jtS}\right] = E\left[e^{jt(X_1 + X_2 + \dots + X_n)}\right] = E\left[e^{jtX_1}\right] E\left[e^{jtX_2}\right] \cdots E\left[e^{jtX_n}\right] = \prod_i \phi_{X_i}(t)$$

where the expectation of the product factors into the product of expectations because the $X_i$ are independent.
This is the main reason why use of the characteristic function is convenient. This would also follow from the more devious reasoning that the density function for the sum of $n$ independent random variables is the $n$th-order convolution of the individual density functions, together with the knowledge that convolution in the direct variable domain becomes multiplication in the transform domain.
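As a quick numerical illustration (my addition, not part of the original notes), the sketch below estimates the characteristic function of $S = X_1 + X_2$ for two independent random variables by Monte Carlo and compares it with the product of the individual characteristic functions; the particular distributions, sample size, and test points are arbitrary choices.

```python
import numpy as np

# Sketch: check phi_S(t) = phi_X1(t) * phi_X2(t) for independent X1, X2.
# Distributions and sample size are arbitrary choices for illustration.
rng = np.random.default_rng(0)
n = 200_000
x1 = rng.uniform(-1.0, 1.0, n)   # X1 ~ Uniform(-1, 1)
x2 = rng.exponential(1.0, n)     # X2 ~ Exponential(1), independent of X1

def emp_cf(samples, t):
    """Empirical characteristic function: sample mean of e^{jtX}."""
    return np.mean(np.exp(1j * t * samples))

for t in (0.5, 1.0, 2.0):
    lhs = emp_cf(x1 + x2, t)                # phi_S(t), estimated directly
    rhs = emp_cf(x1, t) * emp_cf(x2, t)     # product of the individual CFs
    print(f"t={t}: phi_S(t)={lhs:.4f}, product={rhs:.4f}")
```

The two columns agree to within Monte Carlo error, as the product property requires.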
5. Maclaurin series expansion of $\phi(t)$
Because $f(x)$ is non-negative and $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (or, even better, $\int_{-\infty}^{\infty} |f(x)|\,dx = 1$), it follows that $\int_{-\infty}^{\infty} |f(x)|\,dx$ converges, so that $f(x)$ is Fourier transformable. Thus the characteristic function $\phi(t)$ exists for all distributions, and the inverse relation $\phi(t) \to f(x)$ holds for all distributions. This implies that $\phi(t)$ is analytic for all real values of $t$.
Then it can be expanded in a power series, which converges for all finite values
of t.
$$\phi(t) = \phi(0) + \phi^{(1)}(0)\,t + \frac{1}{2!}\,\phi^{(2)}(0)\,t^2 + \dots + \frac{1}{n!}\,\phi^{(n)}(0)\,t^n + \dots$$

$$\phi(t) = \int_{-\infty}^{\infty} f(x)\,e^{jtx}\,dx, \qquad \phi(0) = 1$$
$$\frac{d^n \phi(t)}{dt^n} = \int_{-\infty}^{\infty} f(x)\,(jx)^n\,e^{jtx}\,dx$$

$$\phi^{(n)}(0) = j^n \int_{-\infty}^{\infty} x^n f(x)\,dx = j^n\,\overline{X^n}$$

$$\phi(t) = 1 + j\,\overline{X}\,t + \frac{j^2\,\overline{X^2}}{2!}\,t^2 + \dots + \frac{j^n\,\overline{X^n}}{n!}\,t^n + \dots$$
The coefficients of the expansion are given by the moments of the distribution.
Thus the characteristic function can be determined from the moments.
Similarly, the moments can be determined from the characteristic function
directly by
$$\overline{X^n} = \frac{1}{j^n}\,\frac{d^n \phi(t)}{dt^n}\bigg|_{t=0}$$

or by expanding $\phi(t)$ into its power series in some other way and identifying the coefficients of the various powers of $t$.
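As a concrete sketch of this moment formula (my addition, not from the notes), the SymPy snippet below recovers the first two moments of an exponential distribution with rate $\lambda$ from its characteristic function $\phi(t) = \lambda/(\lambda - jt)$:

```python
import sympy as sp

# Moments from the characteristic function:
# X^n_bar = (1/j^n) * d^n(phi)/dt^n evaluated at t = 0.
# Example: exponential distribution with rate lam, phi(t) = lam / (lam - j*t).
t = sp.symbols('t', real=True)
lam = sp.symbols('lam', positive=True)
phi = lam / (lam - sp.I * t)

def moment(n):
    """n-th moment via the n-th derivative of phi at t = 0."""
    return sp.simplify(sp.diff(phi, t, n).subs(t, 0) / sp.I**n)

print(moment(1))  # 1/lam
print(moment(2))  # 2/lam**2
```

These are the familiar exponential moments $\overline{X} = 1/\lambda$ and $\overline{X^2} = 2/\lambda^2$.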
The Generating Function
The generating function has its most useful application to random variables
which take integer values only. Examples of such would be the number of
telephone calls into a switchboard in a certain time interval, the number of cars
entering a toll station in a certain time interval, the number of times a 7 is thrown
in n tosses of 2 dice, etc.
For integer-valued random variables, the Generating Function yields the same
advantages as the Characteristic Function and is of simpler form.
Consider a random variable which takes the integer values $k$:

$$P(X = k) = p_k \qquad (k = 0, 1, 2, \dots)$$

For a discrete distribution you can sum in lieu of integration. The Characteristic Function for this random variable is

$$\phi(t) = E\left[e^{jtX}\right] = \sum_{k=0}^{\infty} e^{jtk} p_k = \sum_{k=0}^{\infty} p_k \left(e^{jt}\right)^k$$

If we define a new variable $s = e^{jt}$, we have

$$G(s) = \sum_{k=0}^{\infty} p_k s^k$$
which is called the Generating Function. It has all the interesting properties of
the characteristic function. Note that t → 0 corresponds to s → 1.
Let’s establish the connection between moments of a distribution and the
generating function:
$$\frac{dG}{ds} = \sum_{k=0}^{\infty} k\,p_k\,s^{k-1}$$

$$\frac{d^2G}{ds^2} = \sum_{k=0}^{\infty} k(k-1)\,p_k\,s^{k-2} = \sum_{k=0}^{\infty} k^2 p_k\,s^{k-2} - \sum_{k=0}^{\infty} k\,p_k\,s^{k-2}$$

Just calculate $\frac{dG}{ds}\big|_{s=1}$ and $\frac{d^2G}{ds^2}\big|_{s=1}$ and reorganize them in terms of $\overline{X}$ and $\overline{X^2}$:

$$\frac{dG}{ds}\bigg|_{s=1} = \sum_{k=0}^{\infty} k\,p_k = \overline{X} \qquad \leftarrow \text{1st moment expression}$$

$$\frac{d^2G}{ds^2}\bigg|_{s=1} = \sum_{k=0}^{\infty} k^2 p_k - \sum_{k=0}^{\infty} k\,p_k$$

$$\overline{X^2} = \frac{d^2G}{ds^2}\bigg|_{s=1} + \frac{dG}{ds}\bigg|_{s=1} \qquad \leftarrow \text{2nd moment expression}$$
Each moment is a linear combination of the derivative of its own order and the lower-order derivatives. The generating function for the sum of independent integer-valued variables is the product of their generating functions. This is harder to prove than the same property of the characteristic function, but it does, in fact, hold true.
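To show these moment expressions in use (an illustrative addition; the Poisson example is my choice, not from the notes), the snippet below evaluates $G'(1)$ and $G''(1)$ for the Poisson generating function $G(s) = e^{\lambda(s-1)}$:

```python
import sympy as sp

# Moments from the generating function G(s) = sum_k p_k s^k, evaluated at s = 1.
# Example (my choice): Poisson(lam), whose generating function is exp(lam*(s-1)).
s = sp.symbols('s')
lam = sp.symbols('lam', positive=True)
G = sp.exp(lam * (s - 1))

first = sp.diff(G, s).subs(s, 1)                # X_bar = G'(1)
second = sp.diff(G, s, 2).subs(s, 1) + first    # X^2_bar = G''(1) + G'(1)
print(sp.simplify(first))   # lam
print(sp.simplify(second))  # lam**2 + lam
```

This reproduces the familiar Poisson results $\overline{X} = \lambda$ and $\overline{X^2} = \lambda^2 + \lambda$.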
Multiple Random Variables
Characterizing a joint set of random variables, define a probability distribution function

$$F(x) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n)$$

This is called the joint probability distribution function.
Properties:
If any of the arguments $x_i$ goes to $-\infty$, then $F(x) \to 0$.

$$\lim_{\text{any } x_i \to -\infty} F(x) = 0$$
If all of the $x_i$ go to $\infty$, then $F(x) \to 1$.

$$\lim_{\text{all } x_i \to \infty} F(x) = 1$$

$F(x)$ is monotonically non-decreasing in each $x_i$.
Define joint density function by differentiation:
$$f(x) = \frac{\partial^n F}{\partial x_1\,\partial x_2 \cdots \partial x_n}, \qquad f(x) \ge 0 \ \ \forall x$$

$$F_{x_1 \dots x_n}(x_1, \dots, x_n) = \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_n} du_n\; f_{x_1 \dots x_n}(u_1, \dots, u_n)$$

Setting each $x_i \to \infty$,

$$\int_{-\infty}^{\infty} du_1 \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1,\dots,x_n}(u_1, \dots, u_n) = 1$$
$$\begin{aligned}
F_{x_1,\dots,x_k}(x_1, \dots, x_k) &= P(X_1 \le x_1, \dots, X_k \le x_k) \\
&= P(X_1 \le x_1, \dots, X_k \le x_k, X_{k+1} \le \infty, \dots, X_n \le \infty) \\
&= F_{x_1,\dots,x_n}(x_1, \dots, x_k, \infty, \dots, \infty)
\end{aligned}$$
For the density function:
$$\begin{aligned}
f_{x_1,\dots,x_k}(x_1, \dots, x_k) &= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\, F_{x_1,\dots,x_k}(x_1, \dots, x_k) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\, F_{x_1,\dots,x_n}(x_1, \dots, x_k, \infty, \dots, \infty) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k} \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_k} du_k \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1,\dots,x_n}(u_1, \dots, u_n) \\
&= \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1,\dots,x_n}(x_1, \dots, x_k, u_{k+1}, \dots, u_n)
\end{aligned}$$
Marginal density
If you integrate the joint density above over all variables but one, the result is referred to as the marginal density.
$$f_{x_i}(x_i) = \underbrace{\int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n}_{n-1 \text{ integrals: all except } x_i}\; f_{x_1,\dots,x_n}(x_1, \dots, x_n)$$
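A small symbolic sketch of marginalization (my addition; the joint density is an arbitrary choice for illustration):

```python
import sympy as sp

# Marginal density by integrating the joint density over the other variable.
# Illustrative joint density (my choice): f(x, y) = x + y on the unit square.
x, y = sp.symbols('x y', nonnegative=True)
f_joint = x + y

f_x = sp.integrate(f_joint, (y, 0, 1))  # marginal of X: x + 1/2
print(f_x)
# Sanity check: the marginal density integrates to 1 over [0, 1].
print(sp.integrate(f_x, (x, 0, 1)))    # 1
```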
Mutually independent sets of random variables
Definition of independence:
$$P(X_1 \in s_1, X_2 \in s_2, \dots) = P(X_1 \in s_1)\,P(X_2 \in s_2) \cdots$$

for any sets $s_1, s_2, \dots$
The product rule holds for joint probability distribution and density functions for
independent random variables.
$$F_{x_1, x_2, x_3, \dots}(x_1, x_2, x_3, \dots) = F_{x_1}(x_1)\,F_{x_2}(x_2)\,F_{x_3}(x_3) \cdots$$

$$f_{x_1, x_2, x_3, \dots}(x_1, x_2, x_3, \dots) = f_{x_1}(x_1)\,f_{x_2}(x_2)\,f_{x_3}(x_3) \cdots$$
Expectations
$$E\left[g(x)\right] = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; g(x)\,f(x)$$
For the sum of multiple random variables:
$$\begin{aligned}
E[X_1 + X_2 + \dots + X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; (x_1 + x_2 + \dots + x_n)\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_1\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) + \dots + \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_n\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 + \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 + \dots + \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1] + E[X_2] + \dots + E[X_n]
\end{aligned}$$
This relation is true whether or not the $X_i$ are independent.
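A Monte Carlo sketch of this point (my addition; the dependent pair is deliberately constructed):

```python
import numpy as np

# Illustration that E[X1 + X2] = E[X1] + E[X2] even when X1, X2 are
# dependent. Here X2 = X1**2, a strongly dependent pair (my choice).
rng = np.random.default_rng(1)
x1 = rng.normal(1.0, 2.0, 500_000)
x2 = x1**2  # dependent on x1 by construction

print(np.mean(x1 + x2))            # approximately E[X1] + E[X2] = 1 + 5 = 6
print(np.mean(x1) + np.mean(x2))   # the same value, computed term by term
```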
For the product of multiple independent random variables:
$$\begin{aligned}
E[X_1 X_2 \cdots X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_1 x_2 \cdots x_n\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_1 x_2 \cdots x_n\, f_{x_1}(x_1)\,f_{x_2}(x_2) \cdots f_{x_n}(x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 \cdots \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1]\,E[X_2] \cdots E[X_n]
\end{aligned}$$
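As a final illustrative sketch (my addition), a Monte Carlo check that the product rule holds for an independent pair and fails for a dependent one:

```python
import numpy as np

# Check E[X1*X2] = E[X1]*E[X2] for independent X1, X2, and show a
# counterexample when they are dependent (both pairs are my choices).
rng = np.random.default_rng(2)
n = 500_000
x1 = rng.normal(2.0, 1.0, n)
x2 = rng.uniform(0.0, 1.0, n)          # independent of x1

print(np.mean(x1 * x2), np.mean(x1) * np.mean(x2))  # both approximately 1.0

x3 = x1 + rng.normal(0.0, 0.1, n)      # nearly equal to x1: strongly dependent
print(np.mean(x1 * x3), np.mean(x1) * np.mean(x3))  # ~5.0 vs ~4.0: rule fails
```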