16.322 Stochastic Estimation and Control, Fall 2004
Prof. Vander Velde

Lecture 4

Last time: Left off with characteristic function.

4. Prove $\phi_S(t) = \prod_{i=1}^{n} \phi_{X_i}(t)$ where $S = X_1 + X_2 + \dots + X_n$ ($X_i$ independent)

Let $S = X_1 + X_2 + \dots + X_n$, where the $X_i$ are independent. Because functions of independent random variables are themselves independent, the expectation of the product factors:

$$
\begin{aligned}
\phi_S(t) &= E\left[e^{jtS}\right] = E\left[e^{jt(X_1 + X_2 + \dots + X_n)}\right] \\
&= E\left[e^{jtX_1}\right] E\left[e^{jtX_2}\right] \cdots E\left[e^{jtX_n}\right] \\
&= \prod_{i=1}^{n} \phi_{X_i}(t)
\end{aligned}
$$

This is the main reason why use of the characteristic function is convenient. The same result follows from more roundabout reasoning: the density function for the sum of n independent random variables is the n-th order convolution of the individual density functions, and convolution in the direct variable domain becomes multiplication in the transform domain.

5. MacLaurin series expansion of $\phi(t)$

Because $f(x)$ is non-negative and $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (or, even better, $\int_{-\infty}^{\infty} |f(x)|\,dx = 1$), it follows that $\int_{-\infty}^{\infty} |f(x)|\,dx$ converges, so that $f(x)$ is Fourier transformable. Thus the characteristic function $\phi(t)$ exists for all distributions, and the inverse relation $\phi(t) \to f(x)$ holds for all distributions. When the moments of the distribution exist and do not grow too rapidly with $n$, $\phi(t)$ can also be expanded in a power series about $t = 0$:

$$
\phi(t) = \phi(0) + \phi^{(1)}(0)\,t + \frac{1}{2!}\,\phi^{(2)}(0)\,t^2 + \dots + \frac{1}{n!}\,\phi^{(n)}(0)\,t^n + \dots
$$

$$
\phi(t) = \int_{-\infty}^{\infty} f(x)\,e^{jtx}\,dx, \qquad \phi(0) = 1
$$

$$
\frac{d^n \phi(t)}{dt^n} = \int_{-\infty}^{\infty} f(x)\,(jx)^n e^{jtx}\,dx
$$

$$
\phi^{(n)}(0) = j^n \int_{-\infty}^{\infty} x^n f(x)\,dx = j^n\,\overline{X^n}
$$

$$
\phi(t) = 1 + j\,\overline{X}\,t + \frac{1}{2!}\,j^2\,\overline{X^2}\,t^2 + \dots + \frac{1}{n!}\,j^n\,\overline{X^n}\,t^n + \dots
$$

The coefficients of the expansion are given by the moments of the distribution. Thus the characteristic function can be determined from the moments.
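As a numerical illustration of the expansion in item 5, the sketch below rebuilds a characteristic function from moments alone and compares it with the known closed form. It assumes a standard normal variable, whose moments are 0 for odd n and (n-1)!! for even n, and whose characteristic function is exp(-t^2/2). The distribution, the helper names (`normal_moment`, `phi_from_moments`, `phi_exact`), and the truncation at 40 terms are choices made for this example and are not part of the lecture.

```python
import math


def normal_moment(n: int) -> float:
    """n-th moment of a standard normal: 0 for odd n, (n-1)!! for even n."""
    if n % 2 == 1:
        return 0.0
    value = 1.0
    for k in range(n - 1, 0, -2):  # (n-1)(n-3)...3*1
        value *= k
    return value


def phi_from_moments(t: float, terms: int = 40) -> complex:
    """Truncated series: sum over n of j^n * E[X^n] * t^n / n!."""
    total = 0j
    for n in range(terms):
        total += (1j ** n) * normal_moment(n) * t ** n / math.factorial(n)
    return total


def phi_exact(t: float) -> float:
    """Known characteristic function of a standard normal."""
    return math.exp(-t * t / 2.0)


t = 1.5
series_value = phi_from_moments(t)
exact_value = phi_exact(t)
# The difference should be at the level of floating-point round-off.
print(abs(series_value - exact_value))
```

The odd moments vanish here, so the series is purely real, in agreement with the real-valued closed form.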
Similarly, the moments can be determined from the characteristic function directly by

$$
\overline{X^n} = \frac{1}{j^n} \left. \frac{d^n \phi(t)}{dt^n} \right|_{t=0}
$$

or by expanding $\phi(t)$ into its power series in some other way and identifying the coefficients of the various powers of t.

The Generating Function

The generating function has its most useful application to random variables which take integer values only. Examples of such would be the number of telephone calls into a switchboard in a certain time interval, the number of cars entering a toll station in a certain time interval, the number of times a 7 is thrown in n tosses of 2 dice, etc.

For integer-valued random variables, the Generating Function yields the same advantages as the Characteristic Function and is of simpler form.

Consider a random variable which takes the integer values k:

$$
P(X = k) = p_k \qquad (k = 0, 1, 2, \dots)
$$

For a discrete distribution you can sum in lieu of integration. The Characteristic Function for this random variable is

$$
\phi(t) = E\left[e^{jtX}\right] = \sum_{k=0}^{\infty} p_k\,e^{jtk} = \sum_{k=0}^{\infty} p_k \left(e^{jt}\right)^k
$$

If we define a new variable $s = e^{jt}$, we have

$$
G(s) = \sum_{k=0}^{\infty} p_k\,s^k
$$

which is called the Generating Function. It has all the interesting properties of the characteristic function. Note that $t \to 0$ corresponds to $s \to 1$.

Let's establish the connection between moments of a distribution and the generating function:

$$
\frac{dG}{ds} = \sum_{k=0}^{\infty} k\,p_k\,s^{k-1}
$$

$$
\frac{d^2G}{ds^2} = \sum_{k=0}^{\infty} k(k-1)\,p_k\,s^{k-2} = \sum_{k=0}^{\infty} k^2 p_k\,s^{k-2} - \sum_{k=0}^{\infty} k\,p_k\,s^{k-2}
$$

Just calculate $\left.\dfrac{dG}{ds}\right|_{s=1}$ and $\left.\dfrac{d^2G}{ds^2}\right|_{s=1}$ and reorganize them in terms of $\overline{X}$ and $\overline{X^2}$:

$$
\left.\frac{dG}{ds}\right|_{s=1} = \sum_{k=0}^{\infty} k\,p_k = \overline{X} \qquad \text{(1st moment expression)}
$$

$$
\left.\frac{d^2G}{ds^2}\right|_{s=1} = \sum_{k=0}^{\infty} k^2 p_k - \sum_{k=0}^{\infty} k\,p_k = \overline{X^2} - \overline{X}
$$

$$
\overline{X^2} = \left.\frac{d^2G}{ds^2}\right|_{s=1} + \left.\frac{dG}{ds}\right|_{s=1} \qquad \text{(2nd moment expression)}
$$

Each moment is a linear combination of the derivative of its own order and the lower-order derivatives.

The generating function for the sum of independent integer-valued variables is the product of their generating functions.
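The moment expressions and the product property can be checked on a small case. The sketch below is an illustration, not part of the lecture: it stores a distribution as the list of coefficients p_k, uses a fair die as the example, and relies on exact rational arithmetic so that the comparisons are equalities. The function names and the test point s = 1/3 are assumptions made for the example.

```python
from fractions import Fraction


def g(p, s):
    """Generating function G(s) = sum_k p_k s^k for the coefficient list p."""
    return sum(pk * s ** k for k, pk in enumerate(p))


def g_prime_at_1(p):
    """dG/ds at s = 1, which is sum_k k p_k."""
    return sum(k * pk for k, pk in enumerate(p))


def g_double_prime_at_1(p):
    """d^2G/ds^2 at s = 1, which is sum_k k (k - 1) p_k."""
    return sum(k * (k - 1) * pk for k, pk in enumerate(p))


def pmf_of_sum(p, q):
    """Distribution of X + Y for independent X, Y: discrete convolution."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# Fair die: p_0 = 0 and p_1 = ... = p_6 = 1/6.
die = [Fraction(0)] + [Fraction(1, 6)] * 6

mean = g_prime_at_1(die)                                      # 1st moment
second_moment = g_double_prime_at_1(die) + g_prime_at_1(die)  # 2nd moment

# Product property for the sum of two independent dice, at one value of s.
two_dice = pmf_of_sum(die, die)
s = Fraction(1, 3)
product_holds = g(two_dice, s) == g(die, s) ** 2

print(mean, second_moment, product_holds)
```

For the fair die this gives a mean of 7/2 and a second moment of 91/6, matching the direct sums of k/6 and k^2/6 over k = 1 to 6.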
This product property is harder to prove than the same property of the characteristic function, but it does, in fact, hold true.

Multiple Random Variables

To characterize a joint set of random variables, define a probability distribution function

$$
F(\mathbf{x}) = P(X_1 \le x_1,\ X_2 \le x_2,\ \dots,\ X_n \le x_n)
$$

This is called the joint probability distribution function.

Properties:

If any of the arguments $x_i$ goes to $-\infty$, then $F(\mathbf{x}) \to 0$.

$$
\lim_{\text{any } x_i \to -\infty} F(\mathbf{x}) = 0
$$

If all of the $x_i$ go to $\infty$, then $F(\mathbf{x}) \to 1$.

$$
\lim_{\text{all } x_i \to \infty} F(\mathbf{x}) = 1
$$

$F(\mathbf{x})$ is monotonically non-decreasing in each $x_i$.

Define joint density function by differentiation:

$$
f(\mathbf{x}) = \frac{\partial^n F(\mathbf{x})}{\partial x_1\,\partial x_2 \cdots \partial x_n}
$$

$$
f(\mathbf{x}) \ge 0, \quad \forall\,\mathbf{x}
$$

$$
F_{x_1 \dots x_n}(x_1, \dots, x_n) = \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_n} du_n\, f_{x_1 \dots x_n}(u_1, \dots, u_n)
$$

Setting each $x_i \to \infty$,

$$
\int_{-\infty}^{\infty} du_1 \cdots \int_{-\infty}^{\infty} du_n\, f_{x_1, \dots, x_n}(u_1, \dots, u_n) = 1
$$

The distribution function of a subset of k of the variables follows from the joint distribution function:

$$
\begin{aligned}
F_{x_1, \dots, x_k}(x_1, \dots, x_k) &= P(X_1 \le x_1, \dots, X_k \le x_k) \\
&= P(X_1 \le x_1, \dots, X_k \le x_k,\ X_{k+1} \le \infty, \dots, X_n \le \infty) \\
&= F_{x_1, \dots, x_n}(x_1, \dots, x_k, \infty, \dots, \infty)
\end{aligned}
$$

For the density function:

$$
\begin{aligned}
f_{x_1, \dots, x_k}(x_1, \dots, x_k) &= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\, F_{x_1, \dots, x_k}(x_1, \dots, x_k) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\, F_{x_1, \dots, x_n}(x_1, \dots, x_k, \infty, \dots, \infty) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k} \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_k} du_k \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\, f_{x_1, \dots, x_n}(u_1, \dots, u_n) \\
&= \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\, f_{x_1, \dots, x_n}(x_1, \dots, x_k, u_{k+1}, \dots, u_n) \\
&= \int_{-\infty}^{\infty} dx_{k+1} \cdots \int_{-\infty}^{\infty} dx_n\, f_{x_1, \dots, x_n}(x_1, \dots, x_n)
\end{aligned}
$$

Marginal density

If you integrate the joint density over all variables but one, the result is referred to as the marginal density.

$$
f_{x_i}(x_i) = \underbrace{\int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n}_{n-1 \text{ integrals: all except } x_i} f_{x_1, \dots, x_n}(x_1, \dots, x_n)
$$

Mutually independent sets of random variables

Definition of independence:

$$
P[X_1 \in s_1,\ X_2 \in s_2,\ \dots] = P[X_1 \in s_1]\,P[X_2 \in s_2] \cdots
$$

for any sets $s_1, s_2, \dots$

The product rule holds for joint probability distribution and density functions for independent random variables.

$$
F_{x_1, x_2, x_3, \dots}(x_1, x_2, x_3, \dots) = F_{x_1}(x_1)\,F_{x_2}(x_2)\,F_{x_3}(x_3) \cdots
$$

$$
f_{x_1, x_2, x_3, \dots}(x_1, x_2, x_3, \dots) = f_{x_1}(x_1)\,f_{x_2}(x_2)\,f_{x_3}(x_3) \cdots
$$

Expectations

$$
E[g(\mathbf{x})] = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\, g(\mathbf{x})\,f(\mathbf{x})
$$

For the sum of multiple random variables:

$$
\begin{aligned}
E[X_1 + X_2 + \dots + X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\,(x_1 + x_2 + \dots + x_n)\, f_{x_1, \dots, x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\, x_1\, f_{x_1, \dots, x_n}(x_1, \dots, x_n) + \dots + \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\, x_n\, f_{x_1, \dots, x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 + \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 + \dots + \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1] + E[X_2] + \dots + E[X_n]
\end{aligned}
$$

In the third step, integrating each term over every variable except its own leaves the corresponding marginal density. This relation is true whether or not the $X_i$ are independent.

For the product of multiple independent random variables:

$$
\begin{aligned}
E[X_1 X_2 \cdots X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\,(x_1 x_2 \cdots x_n)\, f_{x_1, \dots, x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\,(x_1 x_2 \cdots x_n)\, f_{x_1}(x_1)\,f_{x_2}(x_2) \cdots f_{x_n}(x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 \cdots \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1]\,E[X_2] \cdots E[X_n]
\end{aligned}
$$
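The two expectation results can be illustrated with a discrete analogue, in which sums over a joint probability table replace the integrals over the joint density. That substitution, the two-variable coin example, and the helper names are assumptions made here to keep the arithmetic exact. They are not taken from the lecture. The sketch compares an independent pair of fair 0/1 variables with a fully dependent pair that has the same marginals.

```python
from fractions import Fraction


def expectation(joint, g):
    """E[g(X1, X2)] as a sum over the joint table {(x1, x2): probability}."""
    return sum(p * g(x1, x2) for (x1, x2), p in joint.items())


def marginal(joint, index):
    """Marginal table of one variable: sum the joint over the other one."""
    out = {}
    for point, p in joint.items():
        out[point[index]] = out.get(point[index], Fraction(0)) + p
    return out


half = Fraction(1, 2)
quarter = Fraction(1, 4)

# Independent pair: the joint table is the product of the marginals.
independent = {(a, b): quarter for a in (0, 1) for b in (0, 1)}
# Dependent pair with the same marginals: X2 always equals X1.
dependent = {(0, 0): half, (1, 1): half}

results = {}
for name, joint in (("independent", independent), ("dependent", dependent)):
    e1 = expectation(joint, lambda x1, x2: x1)
    e2 = expectation(joint, lambda x1, x2: x2)
    e_sum = expectation(joint, lambda x1, x2: x1 + x2)
    e_prod = expectation(joint, lambda x1, x2: x1 * x2)
    results[name] = (e_sum == e1 + e2, e_prod == e1 * e2)

# Each entry is (sum rule holds, product rule holds).
print(results)
```

The sum rule holds for both tables, while the product rule holds only for the independent one: in the dependent case E[X1 X2] is 1/2 but E[X1] E[X2] is 1/4.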