03/17/03 12.540 Lec 11 1
12.540 Principles of the Global
Positioning System
Lecture 11

Prof. Thomas Herring

Statistical approach to estimation
• Summary
– Look at estimation from a statistical point of view
– Propagation of covariance matrices
– Sequential estimation

Statistical approach to estimation
• Examine the multivariate Gaussian distribution:
$$ f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n |V|}} \exp\left[ -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T V^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right] $$
• By minimizing the argument of the exponential in the probability density function, we maximize the likelihood of the estimates (MLE). Minimizing
$$ (\mathbf{x}-\boldsymbol{\mu})^T V^{-1} (\mathbf{x}-\boldsymbol{\mu}) $$
gives the largest probability density.
• This is just weighted least squares where the weight matrix is chosen to be the inverse of the covariance matrix of the data noise.
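
The link between the quadratic form and weighted least squares can be checked numerically. The following is a minimal sketch, not the lecture's own code: the design matrix `A`, the three measurements, and their noise levels are invented for illustration, and the model (estimating a single constant) is an assumption.

```python
import numpy as np

def quadratic_form(residual, V):
    """Return r^T V^-1 r, the argument of the Gaussian exponential without the -1/2."""
    return float(residual @ np.linalg.solve(V, residual))

def weighted_least_squares(A, y, V):
    """Estimate x in y = A x + noise by minimizing (y - A x)^T V^-1 (y - A x)."""
    Vinv_A = np.linalg.solve(V, A)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(A.T @ Vinv_A, A.T @ Vinv_y)

# Hypothetical example: one constant measured three times with unequal noise.
A = np.ones((3, 1))
y = np.array([10.2, 9.9, 10.6])
V = np.diag([0.1**2, 0.1**2, 0.5**2])

x_hat = weighted_least_squares(A, y, V)
q_min = quadratic_form(y - A @ x_hat, V)

# Any other value of x gives a larger quadratic form, i.e. a lower probability density.
q_shifted = [quadratic_form(y - A @ (x_hat + d), V) for d in (-0.1, 0.1)]
```

For a diagonal `V` the estimate reduces to the familiar inverse-variance weighted mean, so the noisy third measurement contributes very little.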

Data covariance matrix
• If we use the inverse of the covariance matrix of the noise in the data, we obtain an MLE if the data noise has a Gaussian distribution.
• How do you obtain the data covariance matrix?
• Difficult question to answer completely
• Issues to be considered:
– Thermal noise in the receiver gives one component
– Multipath could be treated as a noise-like quantity
– Signal-to-noise ratio of measurements allows an estimate of the noise (discussed later in course).
– Incomplete mathematical model of observables can sometimes be treated as noise-like.
– Gain of GPS antenna will generate lower SNR at low elevation angles

Data covariance matrix
• In practice in GPS (as well as many other fields), the data covariance matrix is somewhat arbitrarily chosen.
• Largest problem is temporal correlations in the measurements. Typical GPS data set size for 24 hours of data at 30-second sampling is 8 x 2880 ≈ 23000 phase measurements. Since the inverse of the covariance matrix is required, fully accounting for correlations requires the inverse of a 23000 x 23000 matrix.
• To store the matrix would require about 4 Gbytes of memory
• Even if the original covariance matrix is banded (i.e., correlations over a time short compared to 24 hours), the inverse of a banded matrix is usually a full matrix
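
Both numbers on this slide can be reproduced with a few lines. This is a sketch under stated assumptions: 8-byte double-precision storage for the memory figure, and a small invented tridiagonal covariance (8 x 8, neighbour correlation 0.4) standing in for a banded matrix.

```python
import numpy as np

# Size of the data covariance matrix from the slide: 8 x 2880 phase measurements.
n_obs = 8 * 2880
bytes_full = n_obs**2 * 8            # assumes 8-byte double-precision elements
gbytes_full = bytes_full / 1e9

# Small banded covariance: each value is correlated only with its neighbours.
n = 8
V = np.eye(n) + 0.4 * (np.eye(n, k=1) + np.eye(n, k=-1))
V_inv = np.linalg.inv(V)

# Count the elements that are not (numerically) zero in each matrix.
band_nonzeros = np.count_nonzero(np.abs(V) > 1e-12)
inverse_nonzeros = np.count_nonzero(np.abs(V_inv) > 1e-12)
```

In this small case the tridiagonal matrix has 3n - 2 nonzero elements while its inverse has all n x n elements nonzero, which is why banded storage does not help once the inverse is needed.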

Data covariance matrix
• Methods of handling temporal correlations:
– If measurements are correlated over say a 5-minute period, then use samples every 5 minutes (JPL method)
– Use full rate data, but artificially inflate the noise on each measurement so that it is equivalent to say 5-minute sampling (i.e., sqrt(10) higher noise on the 30-second sampled values) (GAMIT method)
– When looking at GPS results, always check the data noise assumptions (discussed more near end of course).
• Assuming a valid data noise model can be developed, what can we say about noise in parameter estimates?
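
The sqrt(10) factor can be checked with simple arithmetic. The sketch below assumes the only estimated parameter is a constant and that the measurements are treated as uncorrelated once decimated or inflated; `sigma` and the variable names are illustrative, not taken from either software package.

```python
import numpy as np

sigma = 1.0          # hypothetical noise of one measurement (arbitrary units)
n_full = 2880        # 30-second samples in 24 hours
decimation = 10      # 5 minutes / 30 seconds

# Decimated data with the original noise: sigma of the mean of n_full/10 values.
sigma_mean_decimated = sigma / np.sqrt(n_full // decimation)

# Full-rate data with noise inflated by sqrt(10), treated as uncorrelated.
sigma_mean_inflated = sigma * np.sqrt(decimation) / np.sqrt(n_full)
```

The two values agree, so for this simple case the two approaches give the same formal uncertainty.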

Propagation of covariances
• Given a data noise covariance matrix, the characteristics of expected values can be used to determine the covariance matrix of any linear combination of the measurements.
Given the linear operation $y = Ax$, with $V_{xx}$ as the covariance matrix of $x$:
$$ V_{yy} = \langle y y^T \rangle = \langle A x x^T A^T \rangle = A \langle x x^T \rangle A^T $$
$$ V_{yy} = A V_{xx} A^T $$
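
The propagation rule can be compared against a sample covariance. This is a minimal sketch: the 2 x 2 covariance `V_xx`, the sum-and-difference operator `A`, and the sample size are arbitrary choices made for the illustration.

```python
import numpy as np

def propagate_covariance(A, V_xx):
    """Covariance of y = A x given the covariance V_xx of x."""
    return A @ V_xx @ A.T

# Hypothetical 2-vector x with correlated components.
V_xx = np.array([[4.0, 1.0],
                 [1.0, 2.0]])
# Linear operator forming the sum and the difference of the two components.
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])

V_yy = propagate_covariance(A, V_xx)

# Monte Carlo check: sample x, apply A, and form the sample covariance of y.
rng = np.random.default_rng(0)
x_samples = rng.multivariate_normal(np.zeros(2), V_xx, size=200_000)
y_samples = x_samples @ A.T
V_yy_sampled = np.cov(y_samples, rowvar=False)
```

The sampled covariance matches the propagated one to within the sampling noise; no Gaussian assumption is needed for the rule itself, only the linearity of the operation.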

Propagation of covariance
• Propagation of covariance can be used for any linear operator applied to random variables whose covariance matrix is already known.
• Specific examples:
– Covariance matrix of parameter estimates from least squares
– Covariance matrix for post-fit residuals from least squares
– Covariance matrix of derived quantities such as latitude, longitude and height from XYZ coordinate estimates.

Covariance matrix of parameter estimates
• Propagation of covariance can be applied to the weighted least squares problem:
$$ \hat{x} = (A^T V_{yy}^{-1} A)^{-1} A^T V_{yy}^{-1} y $$
$$ \langle \hat{x} \hat{x}^T \rangle = (A^T V_{yy}^{-1} A)^{-1} A^T V_{yy}^{-1} \langle y y^T \rangle V_{yy}^{-1} A (A^T V_{yy}^{-1} A)^{-1} $$
$$ V_{\hat{x}\hat{x}} = (A^T V_{yy}^{-1} A)^{-1} $$
• Notice that the covariance matrix of parameter estimates is a natural output of the estimator if $A^T V^{-1} A$ is inverted (it does not need to be).
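
A short numerical check that explicit propagation collapses to the inverse of the normal matrix. The sketch assumes a hypothetical offset-plus-rate model observed at five epochs with equal, uncorrelated noise; the function name is invented for the example.

```python
import numpy as np

def wls_with_covariance(A, y, V_yy):
    """Return x_hat and its covariance (A^T V_yy^-1 A)^-1."""
    Vinv_A = np.linalg.solve(V_yy, A)        # V_yy^-1 A without forming V_yy^-1
    normal = A.T @ Vinv_A                    # A^T V_yy^-1 A
    V_xhat = np.linalg.inv(normal)
    x_hat = V_xhat @ (Vinv_A.T @ y)          # relies on V_yy being symmetric
    return x_hat, V_xhat

# Hypothetical problem: offset and rate observed at 5 epochs (noise-free data).
t = np.arange(5.0)
A = np.column_stack([np.ones_like(t), t])
V_yy = 0.01 * np.eye(5)
y = 1.0 + 0.5 * t

x_hat, V_xhat = wls_with_covariance(A, y, V_yy)

# Explicit propagation through the estimator x_hat = G y.
G = V_xhat @ A.T @ np.linalg.inv(V_yy)
V_propagated = G @ V_yy @ G.T
```

If the normal equations are solved by factorization instead of inversion, the estimate is still available but the covariance has to be formed separately, which is the sense of "does not need to be" above.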

Covariance matrix of estimated parameters
• Notice that for the rigorous estimation, the inverse of the data covariance is needed (time consuming if non-diagonal)
• To compute the parameter estimate covariance, only the covariance matrix of the data is needed (not the inverse)
• In some cases, a non-rigorous inverse can be done with say a diagonal covariance matrix, but the parameter covariance matrix is rigorously computed using the full covariance matrix. This is a non-MLE estimate, but the covariance matrix of the parameters should be correct (just not the best estimates that can be found).
• This technique could be used if storage of the full covariance matrix is possible, but inversion of the matrix is not because it would take too long or the inverse cannot be performed in place.
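
The non-rigorous approach can be sketched as an estimator with an arbitrary weight matrix `W`, whose covariance is then propagated with the full `V_yy`. The exponential correlation model, the offset-plus-rate design, and all names below are assumptions made for the illustration.

```python
import numpy as np

def estimate_with_weight(A, y, W, V_yy):
    """Estimate with weight matrix W, then propagate the full data covariance V_yy."""
    G = np.linalg.solve(A.T @ W @ A, A.T @ W)     # x_hat = G y
    return G @ y, G @ V_yy @ G.T

t = np.arange(10.0)
A = np.column_stack([np.ones_like(t), t])
# Hypothetical temporally correlated noise (exponential correlation, 3-epoch scale).
V_yy = 0.01 * np.exp(-np.abs(t[:, None] - t[None, :]) / 3.0)
y = A @ np.array([1.0, 0.5])                      # noise-free data for clarity

# Non-rigorous: diagonal weights only, so V_yy itself is never inverted.
W_diag = np.diag(1.0 / np.diag(V_yy))
x_approx, V_approx = estimate_with_weight(A, y, W_diag, V_yy)

# Rigorous MLE for comparison: W is the full inverse of V_yy.
x_mle, V_mle = estimate_with_weight(A, y, np.linalg.inv(V_yy), V_yy)

# Covariance that would be reported if the diagonal weights were taken at face value.
V_naive = np.linalg.inv(A.T @ W_diag @ A)
```

`V_approx` is the correct covariance of the non-rigorous estimate and is never smaller than `V_mle`; comparing it with `V_naive` shows how misleading the face-value covariance can be when correlations are ignored.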

Covariance matrix of post-fit residuals
• Post-fit residuals are the differences between the observations and the values computed from the estimated parameters
• Because some of the noise in the data is absorbed into the parameter estimates, in general, the post-fit residuals are not the same as the errors in the data.
• In some cases, they can be considerably smaller.
• The covariance matrix of the post-fit residuals can be computed using propagation of covariances.

Covariance matrix of post-fit residuals
• This can be computed using propagation of covariances: $e$ is the vector of true errors, and $v$ is the vector of residuals.
$$ y = A x + e $$
$$ \hat{x} = (A^T V_{yy}^{-1} A)^{-1} A^T V_{yy}^{-1} y $$
$$ v = y - A\hat{x} = \Big[ I - \underbrace{A (A^T V_{yy}^{-1} A)^{-1} A^T V_{yy}^{-1}}_{\text{Amount error reduced}} \Big] e \qquad \text{(Eqn 1)} $$
$$ V_{vv} = \langle v v^T \rangle = V_{yy} - A (A^T V_{yy}^{-1} A)^{-1} A^T $$
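
The compact form of the residual covariance can be checked against direct propagation of Eqn 1. This is a sketch with an invented offset-plus-rate design and unequal diagonal data variances; `residual_operator` is a hypothetical helper name.

```python
import numpy as np

def residual_operator(A, V_yy):
    """Return B = I - A (A^T V_yy^-1 A)^-1 A^T V_yy^-1 and (A^T V_yy^-1 A)^-1."""
    Vinv_A = np.linalg.solve(V_yy, A)
    N_inv = np.linalg.inv(A.T @ Vinv_A)
    B = np.eye(A.shape[0]) - A @ N_inv @ Vinv_A.T   # V_yy is symmetric
    return B, N_inv

# Hypothetical offset + rate model at 6 epochs with unequal data variances.
t = np.arange(6.0)
A = np.column_stack([np.ones_like(t), t])
V_yy = np.diag(np.linspace(0.01, 0.04, 6))

B, N_inv = residual_operator(A, V_yy)

# Propagation of covariance applied to v = B e ...
V_vv_propagated = B @ V_yy @ B.T
# ... agrees with the compact expression for V_vv.
V_vv_formula = V_yy - A @ N_inv @ A.T
```

The diagonal of `V_vv_formula` is never larger than that of `V_yy`, which quantifies how much of the data noise the parameter estimates absorb.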

Post-fit residuals
• Notice that we can compute the covariance matrix of the post-fit residuals (a large matrix in general)
• Eqn 1 on the previous slide gives an equation of the form $v = Be$; why can we not compute the actual errors with $e = B^{-1} v$?
• $B$ is a singular matrix which has no unique inverse (there is in fact one inverse which would generate the true errors)
• Note: In this case, singularity does not mean that there is no inverse, it means there are an infinite number of inverses.
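
The singularity of B can be seen numerically. In this sketch the design (offset plus rate at six epochs), the error vectors, and the use of the Moore-Penrose pseudo-inverse as one particular generalized inverse are all choices made for the illustration.

```python
import numpy as np

t = np.arange(6.0)
A = np.column_stack([np.ones_like(t), t])      # 6 data, 2 parameters
V_yy = 0.01 * np.eye(6)

N_inv = np.linalg.inv(A.T @ np.linalg.solve(V_yy, A))
B = np.eye(6) - A @ N_inv @ A.T @ np.linalg.inv(V_yy)

# B has rank (number of data - number of parameters), so it is singular.
rank_B = np.linalg.matrix_rank(B)

# Two different error vectors that produce identical residuals:
# anything of the form A * dx is absorbed completely by the parameters.
e1 = np.array([0.1, -0.2, 0.05, 0.3, 0.15, -0.1])
e2 = e1 + A @ np.array([0.3, -0.07])
v1, v2 = B @ e1, B @ e2

# One of the infinitely many generalized inverses: consistent with v1,
# but it does not recover the true errors e1.
e_from_pinv = np.linalg.pinv(B) @ v1
```

Here `rank_B` is 4, the residuals `v1` and `v2` coincide, and the pseudo-inverse returns an error vector that reproduces the residuals without matching `e1`.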

Example
• Consider the case shown below: When a rate of change is estimated, the slope estimate will absorb error in the last data point, particularly as $\Delta t$ increases. (Try this case yourself)
[Figure: Example of fitting slope to non-uniform data distribution. Axes: Data versus Time, with the interval $\Delta t$ marked. Annotations: "Postfit error bar somewhat reduced"; "Postfit error bar very small; slope will always pass close to this data point" (the last data point).]
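
One way to try this case is to compute the post-fit residual sigmas from $V_{vv}$ for a cluster of points plus one isolated point. The sample times, the unit data sigma, and the offset-plus-rate model below are assumptions; the original figure's values are not recoverable.

```python
import numpy as np

def postfit_sigmas(times, sigma=1.0):
    """Post-fit residual sigmas for an offset + rate fit to equally weighted data."""
    A = np.column_stack([np.ones_like(times), times])
    V_yy = sigma**2 * np.eye(len(times))
    V_vv = V_yy - A @ np.linalg.inv(A.T @ np.linalg.solve(V_yy, A)) @ A.T
    return np.sqrt(np.clip(np.diag(V_vv), 0.0, None))

# Hypothetical cluster of early points, then one last point after a gap dt.
cluster = np.arange(0.0, 10.0, 2.0)            # times 0, 2, 4, 6, 8
gaps = [2.0, 10.0, 40.0]

last_point_sigma = []
for dt in gaps:
    times = np.append(cluster, cluster[-1] + dt)
    last_point_sigma.append(postfit_sigmas(times)[-1])
```

As the gap grows, the post-fit sigma of the last point shrinks toward zero even though its data sigma is unchanged, because the rate estimate is determined almost entirely by that point.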

Covariance of derived quantities
• Propagation of covariances can be used to determine the covariance of derived quantities. Example: latitude, longitude and radius. $\theta$ is co-latitude, $\lambda$ is longitude, $R$ is radius. $\Delta N$, $\Delta E$ and $\Delta U$ are north, east and radial changes (all in distance units).
Geocentric case:
$$ \begin{bmatrix} \Delta N \\ \Delta E \\ \Delta U \end{bmatrix} = \underbrace{\begin{bmatrix} -\cos\theta\cos\lambda & -\cos\theta\sin\lambda & \sin\theta \\ -\sin\lambda & \cos\lambda & 0 \\ X/R & Y/R & Z/R \end{bmatrix}}_{A \text{ matrix for use in propagation from } V_{xx}} \begin{bmatrix} \Delta X \\ \Delta Y \\ \Delta Z \end{bmatrix} $$
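
The geocentric rotation and the propagation to north, east, up can be sketched as follows. The site coordinates and the XYZ sigmas are invented values, and `neu_rotation` is a hypothetical helper name.

```python
import numpy as np

def neu_rotation(x, y, z):
    """Geocentric rotation matrix A taking (dX, dY, dZ) to (dN, dE, dU)."""
    r = np.sqrt(x * x + y * y + z * z)
    colat = np.arccos(z / r)                   # co-latitude (theta)
    lon = np.arctan2(y, x)                     # longitude (lambda)
    return np.array([
        [-np.cos(colat) * np.cos(lon), -np.cos(colat) * np.sin(lon), np.sin(colat)],
        [-np.sin(lon),                  np.cos(lon),                 0.0],
        [x / r,                         y / r,                       z / r],
    ])

# Hypothetical site coordinates (metres) and XYZ covariance (metres squared).
xyz = np.array([1492404.0, -4457266.0, 4296881.0])
V_xyz = np.diag([0.004**2, 0.006**2, 0.005**2])

A = neu_rotation(*xyz)
V_neu = A @ V_xyz @ A.T                        # propagation of covariance

# Sigmas of the north, east and up components.
sigma_neu = np.sqrt(np.diag(V_neu))
```

Because A is a rotation, the total variance (the trace) is preserved; only its distribution among the components and their correlations change.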

Estimation in parts/Sequential estimation
• A very powerful method for handling large data sets, takes advantage of the structure of the data covariance matrix if parts of it are uncorrelated (or assumed to be uncorrelated).
$$ \begin{bmatrix} V_1 & 0 & 0 \\ 0 & V_2 & 0 \\ 0 & 0 & V_3 \end{bmatrix}^{-1} = \begin{bmatrix} V_1^{-1} & 0 & 0 \\ 0 & V_2^{-1} & 0 \\ 0 & 0 & V_3^{-1} \end{bmatrix} $$

Sequential estimation
• Since the blocks of the data covariance matrix can be separately inverted, the blocks of the estimation $(A^T V^{-1} A)$ can be formed separately and combined later.
• Also, since the parameters to be estimated can often be divided into those that affect all data (such as station coordinates) and those that affect data at one time or over a limited period of time (clocks and atmospheric delays), it is possible to separate these estimations (shown next page).
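
Forming the normal equations block by block can be sketched as below and compared with a single solution that uses the full block-diagonal covariance. The three data blocks, their sizes, and their noise levels are randomly generated assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x_true = np.array([2.0, -1.0])

# Three hypothetical data blocks, mutually uncorrelated, with different noise levels.
blocks = []
for n_obs, sigma in [(4, 0.1), (5, 0.2), (3, 0.05)]:
    A_k = rng.normal(size=(n_obs, 2))
    V_k = sigma**2 * np.eye(n_obs)
    y_k = A_k @ x_true + sigma * rng.normal(size=n_obs)
    blocks.append((A_k, V_k, y_k))

# Accumulate A^T V^-1 A and A^T V^-1 y one block at a time.
N = np.zeros((2, 2))
b = np.zeros(2)
for A_k, V_k, y_k in blocks:
    N += A_k.T @ np.linalg.solve(V_k, A_k)
    b += A_k.T @ np.linalg.solve(V_k, y_k)
x_blockwise = np.linalg.solve(N, b)

# Reference: everything at once with the full block-diagonal covariance matrix.
A_all = np.vstack([blk[0] for blk in blocks])
y_all = np.concatenate([blk[2] for blk in blocks])
V_all = np.zeros((A_all.shape[0], A_all.shape[0]))
row = 0
for _, V_k, _ in blocks:
    n_k = V_k.shape[0]
    V_all[row:row + n_k, row:row + n_k] = V_k
    row += n_k
x_batch = np.linalg.solve(A_all.T @ np.linalg.solve(V_all, A_all),
                          A_all.T @ np.linalg.solve(V_all, y_all))
```

The block-wise accumulation only ever inverts matrices the size of one block, yet gives the same estimate as the single large solution.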

Sequential estimation
• Sequential estimation with division of global and local parameters. $V$ is the covariance matrix of the new data (uncorrelated with the prior parameter estimates), $V_{xg}$ is the covariance matrix of the prior parameter estimates with estimates $x_g$, $x_l$ are the local parameter estimates, and $x_g^+$ are the new global parameter estimates.
$$ \begin{bmatrix} y \\ x_g \end{bmatrix} = \begin{bmatrix} A_g & A_l \\ I & 0 \end{bmatrix} \begin{bmatrix} x_g \\ x_l \end{bmatrix} $$
$$ \begin{bmatrix} x_g^+ \\ x_l \end{bmatrix} = \begin{bmatrix} \left( A_g^T V^{-1} A_g + V_{xg}^{-1} \right) & A_g^T V^{-1} A_l \\ A_l^T V^{-1} A_g & A_l^T V^{-1} A_l \end{bmatrix}^{-1} \begin{bmatrix} A_g^T V^{-1} y + V_{xg}^{-1} x_g \\ A_l^T V^{-1} y \end{bmatrix} $$

Sequential estimation
• As each block of data is processed, the local parameters, $x_l$, can be dropped and the covariance matrix of the global parameters $x_g$ passed to the next estimation stage.
• Total size of adjustment is at maximum the number of global parameters plus local parameters needed for the data being processed at the moment, rather than all of the local parameters.
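
The staged procedure can be sketched as a loop in which each block is solved for global plus local parameters and only the global estimate and its covariance are carried forward. The data (two global parameters, one clock-like local offset per block), the loose prior, and the function name are assumptions for illustration; a batch solution with all local parameters is formed only as a check.

```python
import numpy as np

def sequential_update(x_g, V_xg, A_g, A_l, V, y):
    """One stage: combine prior global estimates with a new data block,
    then keep only the global estimate and its covariance."""
    n_g = A_g.shape[1]
    Vinv_Ag = np.linalg.solve(V, A_g)
    Vinv_Al = np.linalg.solve(V, A_l)
    Vinv_y = np.linalg.solve(V, y)
    V_xg_inv = np.linalg.inv(V_xg)
    normal = np.block([
        [A_g.T @ Vinv_Ag + V_xg_inv, A_g.T @ Vinv_Al],
        [A_l.T @ Vinv_Ag,            A_l.T @ Vinv_Al],
    ])
    rhs = np.concatenate([A_g.T @ Vinv_y + V_xg_inv @ x_g, A_l.T @ Vinv_y])
    cov = np.linalg.inv(normal)
    est = cov @ rhs
    return est[:n_g], cov[:n_g, :n_g]          # local parameters are dropped here

rng = np.random.default_rng(2)
sigma = 0.01
x_g_true = np.array([1.0, -2.0])
blocks = []
for k in range(3):
    A_g = rng.normal(size=(5, 2))
    A_l = np.ones((5, 1))                      # one local offset per block
    y = A_g @ x_g_true + A_l @ rng.normal(size=1) + sigma * rng.normal(size=5)
    blocks.append((A_g, A_l, sigma**2 * np.eye(5), y))

# Loose prior on the global parameters (hypothetical values).
x_prior, V_prior = np.zeros(2), 1.0e4 * np.eye(2)
x_g, V_xg = x_prior.copy(), V_prior.copy()
for A_g, A_l, V, y in blocks:
    x_g, V_xg = sequential_update(x_g, V_xg, A_g, A_l, V, y)

# Check: batch solution with 2 global + 3 local parameters and the same prior.
A_batch = np.zeros((15, 5))
for k, (A_g, A_l, V, y) in enumerate(blocks):
    A_batch[5 * k:5 * k + 5, :2] = A_g
    A_batch[5 * k:5 * k + 5, 2 + k:3 + k] = A_l
y_batch = np.concatenate([blk[3] for blk in blocks])
N_batch = A_batch.T @ A_batch / sigma**2
N_batch[:2, :2] += np.linalg.inv(V_prior)
rhs_batch = A_batch.T @ y_batch / sigma**2
rhs_batch[:2] += np.linalg.inv(V_prior) @ x_prior
x_batch = np.linalg.solve(N_batch, rhs_batch)
V_batch_global = np.linalg.inv(N_batch)[:2, :2]
```

Each stage inverts a 3 x 3 matrix (two global plus one local parameter), while the batch check needs all five parameters at once; in a real data set the number of local parameters is what makes this difference large.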

Summary
• We examined the way covariance matrices can be manipulated
• Estimation from a statistical point of view
• Sequential estimation.
• Next class: continue with sequential estimation in terms of Kalman filtering.
• Reminder: Paper topic and outline due Wednesday.