Topic #22
16.31 Feedback Control

Deterministic LQR
• Optimal control and the Riccati equation
• Lagrange multipliers
• The Hamiltonian matrix and the symmetric root locus

Factoids: for symmetric $R$,
$$\frac{\partial}{\partial u}\left( u^T R u \right) = 2u^T R, \qquad \frac{\partial}{\partial u}\left( Ru \right) = R$$
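These identities can be spot-checked numerically; the sketch below (my addition, not part of the notes) compares the analytic gradient $2u^T R$ against central differences for a randomly generated symmetric $R$.

```python
import numpy as np

# Spot-check the factoid d/du (u^T R u) = 2 u^T R for a symmetric R.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
R = M + M.T                       # force R symmetric
u = rng.standard_normal(n)

analytic = 2.0 * u @ R            # the claimed gradient, 2 u^T R

# Central-difference approximation of the gradient, one coordinate at a time.
eps = 1e-6
numeric = np.array([
    ((u + eps * e) @ R @ (u + eps * e) - (u - eps * e) @ R @ (u - eps * e))
    / (2.0 * eps)
    for e in np.eye(n)
])

print(np.max(np.abs(analytic - numeric)))   # tiny (floating-point noise)
```

Since the cost is quadratic, the central difference is exact up to rounding, so the two gradients agree to machine-level precision.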
Copyright 2001 by Jonathan How.
Fall 2001 16.31 22—1
Linear Quadratic Regulator (LQR)

• We have seen the solutions to the LQR problem using the symmetric root locus, which defines the locations of the closed-loop poles.
— Linear full-state feedback control.
— Would like to demonstrate from first principles that this is the optimal form of the control.
• Deterministic Linear Quadratic Regulator

Plant:
$$\dot{x}(t) = A(t)x(t) + B_u(t)u(t), \qquad x(t_0) = x_0$$
$$z(t) = C_z(t)x(t)$$
Cost:
$$J_{LQR} = \int_{t_0}^{t_f} \left[ z^T(t)R_{zz}(t)z(t) + u^T(t)R_{uu}(t)u(t) \right] dt + x^T(t_f)\,P_{t_f}\,x(t_f)$$
— Where $P_{t_f} \geq 0$, $R_{zz}(t) > 0$, and $R_{uu}(t) > 0$
— Define $R_{xx} = C_z^T R_{zz} C_z \geq 0$
— $A(t)$ is a continuous function of time.
— $B_u(t)$, $C_z(t)$, $R_{zz}(t)$, $R_{uu}(t)$ are piecewise continuous functions of time, and all are bounded.
• Problem Statement: Find the input $u(t)\ \forall t \in [t_0, t_f]$ to minimize $J_{LQR}$.
• Note that this is the most general form of the LQR problem — we rarely need this level of generality, and we often suppress the time dependence of the matrices.
— An example that does need it: the aircraft landing problem.
• To optimize the cost, we follow the procedure of augmenting the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:
$$H = \frac{1}{2}\left[ x^T(t)R_{xx}x(t) + u^T(t)R_{uu}u(t) \right] + \lambda^T(t)\left( Ax(t) + B_u u(t) \right)$$
— $\lambda(t) \in \mathbb{R}^{n \times 1}$ is called the adjoint variable or costate
— It is the Lagrange multiplier in the problem.
• From Stengel (p. 427), the necessary and sufficient conditions for optimality are that:

1. $\dot{\lambda}(t) = -\dfrac{\partial H}{\partial x} = -R_{xx}x(t) - A^T\lambda(t)$

2. $\lambda(t_f) = P_{t_f}\,x(t_f)$

3. $\dfrac{\partial H}{\partial u} = 0 \ \Rightarrow\ R_{uu}u + B_u^T\lambda(t) = 0$, so $u_{opt} = -R_{uu}^{-1}B_u^T\lambda(t)$

4. $\dfrac{\partial^2 H}{\partial u^2} \geq 0$ (need to check that $R_{uu} \geq 0$)
• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.
• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier
— Results in an unconstrained optimization.
• Example: $\min f(x,y) = x^2 + y^2$ subject to the constraint that $c(x,y) = x + y + 2 = 0$
[Figure 1: Optimization results — plotted in the $(x, y)$ plane, with both axes running from $-2$ to $2$.]
• Clearly the unconstrained minimum is at $x = y = 0$
• To find the constrained minimum, form the augmented cost function
$$L \triangleq f(x,y) + \lambda c(x,y) = x^2 + y^2 + \lambda(x + y + 2)$$
— Where $\lambda$ is the Lagrange multiplier
• Note that if the constraint is satisfied, then $L \equiv f$
• The solution approach without constraints is to find the stationary point of $f(x,y)$ ($\partial f/\partial x = \partial f/\partial y = 0$)
— With constraints, we find the stationary points of $L$:
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial \lambda} = 0$$
which gives
$$\frac{\partial L}{\partial x} = 2x + \lambda = 0, \qquad \frac{\partial L}{\partial y} = 2y + \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = x + y + 2 = 0$$
• This gives 3 equations in 3 unknowns; solve to find $x^\star = y^\star = -1$
• The key point here is that, due to the constraint, the selections of $x$ and $y$ during the minimization are not independent
— The Lagrange multiplier captures this dependency.
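Because $L$ is quadratic, the three stationarity conditions are linear and can be solved directly; a small numerical check (my addition, not from the notes):

```python
import numpy as np

# Stationarity conditions of L = x^2 + y^2 + lam*(x + y + 2):
#   dL/dx   = 2x + lam  = 0
#   dL/dy   = 2y + lam  = 0
#   dL/dlam = x + y + 2 = 0
# Written as a linear system M @ [x, y, lam] = rhs.
M = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, -2.0])

x, y, lam = np.linalg.solve(M, rhs)
print(x, y, lam)   # -1.0 -1.0 2.0
```

The solution $x^\star = y^\star = -1$ matches the hand calculation, with multiplier $\lambda = 2$.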
• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.
• Note that we now have:
$$\dot{x}(t) = Ax(t) + B_u u_{opt}(t) = Ax(t) - B_u R_{uu}^{-1} B_u^T \lambda(t)$$
with $x(t_0) = x_0$
• So combine this with the equation for the adjoint variable
$$\dot{\lambda}(t) = -R_{xx}x(t) - A^T\lambda(t) = -C_z^T R_{zz} C_z x(t) - A^T\lambda(t)$$
to get:
$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$
which of course is the Hamiltonian matrix again.
• Note that the dynamics of $x(t)$ and $\lambda(t)$ are coupled, but $x(t)$ is known initially and $\lambda(t)$ is known at the terminal time, since $\lambda(t_f) = P_{t_f}x(t_f)$
— This is a two-point boundary value problem that is very hard to solve in general.
• However, in this case, we can introduce a new matrix variable $P(t)$ and show that:
1. $\lambda(t) = P(t)x(t)$
2. It is relatively easy to find $P(t)$.
• How to proceed?

1. For the $2n$ system
$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$
define a transition matrix
$$F(t_1, t_0) = \begin{bmatrix} F_{11}(t_1,t_0) & F_{12}(t_1,t_0) \\ F_{21}(t_1,t_0) & F_{22}(t_1,t_0) \end{bmatrix}$$
and use this to relate $x(t)$ to $x(t_f)$ and $\lambda(t_f)$:
$$\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \begin{bmatrix} F_{11}(t,t_f) & F_{12}(t,t_f) \\ F_{21}(t,t_f) & F_{22}(t,t_f) \end{bmatrix} \begin{bmatrix} x(t_f) \\ \lambda(t_f) \end{bmatrix}$$
so
$$x(t) = F_{11}(t,t_f)\,x(t_f) + F_{12}(t,t_f)\,\lambda(t_f) = \left[ F_{11}(t,t_f) + F_{12}(t,t_f)P_{t_f} \right] x(t_f)$$

2. Now find $\lambda(t)$ in terms of $x(t_f)$:
$$\lambda(t) = \left[ F_{21}(t,t_f) + F_{22}(t,t_f)P_{t_f} \right] x(t_f)$$

3. Eliminate $x(t_f)$ to get:
$$\lambda(t) = \left[ F_{21}(t,t_f) + F_{22}(t,t_f)P_{t_f} \right]\left[ F_{11}(t,t_f) + F_{12}(t,t_f)P_{t_f} \right]^{-1} x(t) \triangleq P(t)\,x(t)$$
4. Now, since $\lambda(t) = P(t)x(t)$, then
$$\dot{\lambda}(t) = \dot{P}(t)x(t) + P(t)\dot{x}(t)$$
$$\Rightarrow\ -C_z^T R_{zz} C_z x(t) - A^T \lambda(t) = \dot{P}(t)x(t) + P(t)\dot{x}(t)$$
$$\Rightarrow\ -\dot{P}(t)x(t) = C_z^T R_{zz} C_z x(t) + A^T \lambda(t) + P(t)\dot{x}(t)$$
$$= C_z^T R_{zz} C_z x(t) + A^T \lambda(t) + P(t)\left( Ax(t) - B_u R_{uu}^{-1} B_u^T \lambda(t) \right)$$
$$= \left( C_z^T R_{zz} C_z + P(t)A \right) x(t) + \left( A^T - P(t)B_u R_{uu}^{-1} B_u^T \right) \lambda(t)$$
$$= \left( C_z^T R_{zz} C_z + P(t)A \right) x(t) + \left( A^T - P(t)B_u R_{uu}^{-1} B_u^T \right) P(t)\, x(t)$$
$$= \left[ A^T P(t) + P(t)A + C_z^T R_{zz} C_z - P(t)B_u R_{uu}^{-1} B_u^T P(t) \right] x(t)$$

• This must be true for arbitrary $x(t)$, so $P(t)$ must satisfy
$$-\dot{P}(t) = A^T P(t) + P(t)A + C_z^T R_{zz} C_z - P(t)B_u R_{uu}^{-1} B_u^T P(t)$$
— Which is a matrix differential Riccati equation.
• The optimal value of $P(t)$ is found by solving this equation backwards in time from $t_f$, with $P(t_f) = P_{t_f}$
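The backwards sweep is easy to carry out with a standard ODE solver. The sketch below (an illustration, not part of the notes) integrates the differential Riccati equation for the two-state example that appears later in these notes, with an assumed zero terminal penalty; `solve_ivp` handles the backwards direction when `t_span` runs from $t_f$ down to $t_0$.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Matrices for the two-state example used later in the notes (q = r = 1);
# the zero terminal penalty P_tf = 0 is an assumption for this sketch.
A = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
Cz = np.array([[1.0, 0.0]])
Rzz = np.array([[1.0]])
Ruu_inv = np.array([[1.0]])
Rxx = Cz.T @ Rzz @ Cz
P_tf = np.zeros((2, 2))
t0, tf = 0.0, 10.0

def riccati_rhs(t, p_flat):
    # -Pdot = A^T P + P A + Rxx - P Bu Ruu^{-1} Bu^T P  (flattened for the solver)
    P = p_flat.reshape(2, 2)
    Pdot = -(A.T @ P + P @ A + Rxx - P @ Bu @ Ruu_inv @ Bu.T @ P)
    return Pdot.ravel()

sol = solve_ivp(riccati_rhs, (tf, t0), P_tf.ravel(), rtol=1e-9, atol=1e-12)
P0 = sol.y[:, -1].reshape(2, 2)
print(P0)
```

Because the horizon is long relative to the closed-loop time constants, $P(t_0)$ has essentially converged to the steady-state ARE solution, which for this system works out to $[[\sqrt{3},\,1],[1,\,\sqrt{3}-1]]$.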
• The control gains are then
$$u_{opt} = -R_{uu}^{-1}B_u^T\lambda(t) = -R_{uu}^{-1}B_u^T P(t)x(t) = -K(t)x(t)$$
— Where $K(t) \triangleq R_{uu}^{-1}B_u^T P(t)$
• Note that $x(t)$ and $\lambda(t)$ together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate $\lambda(t)$ from the solution by introducing $P(t)$, which solves a Riccati equation.
• The optimal control inputs are in fact a linear full-state feedback control
• Note that normally we are interested in problems with $t_0 = 0$ and $t_f = \infty$, in which case we can just use the steady-state value of $P$ that solves (assuming that $(A, B_u)$ is stabilizable)
$$A^T P + PA + C_z^T R_{zz} C_z - PB_u R_{uu}^{-1} B_u^T P = 0$$
which is the Algebraic Riccati Equation.
— If we use the steady-state value of $P$, then $K$ is constant.
• Example: simple system with $t_0 = 0$ and $t_f = 10$ sec.
$$\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u$$
$$J = x^T(10) \begin{bmatrix} h & 0 \\ 0 & 0 \end{bmatrix} x(10) + \int_0^{10} \left[ x^T(t) \begin{bmatrix} q & 0 \\ 0 & 0 \end{bmatrix} x(t) + r\,u^2(t) \right] dt$$
• Compute gains using both the time-varying $P(t)$ and the steady-state value.
• Find the state solution with $x(0) = [1\ \ 1]^T$ using both sets of gains.
[Figure 2: Set $q = 1$, $r = 1$, $h = 10$, $K_{lqr} = [1\ \ 0.73]$. Left panels: gains — time-varying $K_1(t)$, $K_2(t)$ and static $K_1$, $K_2$ vs. time (sec); right panels: states $x_1$, $x_2$ vs. time (sec) under the dynamic and static gains.]
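The steady-state gain quoted in Figure 2 can be reproduced with SciPy (a sketch, assuming the weights enter as $Q = \mathrm{diag}(q, 0)$ and $R = r$ as written above):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Steady-state LQR gain for xdot = [[0,1],[0,-1]] x + [[0],[1]] u
# with Q = diag(q, 0), R = r, and q = r = 1.
A = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.0])
R = np.array([[1.0]])

P = solve_continuous_are(A, Bu, Q, R)   # algebraic Riccati equation
K = np.linalg.solve(R, Bu.T @ P)        # K = Ruu^{-1} Bu^T P
print(K)   # [[1.0, 0.7321]] -- matches the K_lqr = [1 0.73] quoted above
```

The exact values are $K = [1,\ \sqrt{3} - 1]$, which follows from matching the closed-loop characteristic polynomial $s^2 + (1 + k_2)s + k_1$ to the SRL result $s^2 + \sqrt{3}s + 1$.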
• As noted, the closed-loop dynamics couple $x(t)$ and $\lambda(t)$ and are given by
$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$
with the appropriate boundary conditions.
• OK, so where are the closed-loop poles of the system?
— They must be the eigenvalues of
$$H \triangleq \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}$$
• When we analyzed this before for a SISO system, we found that the closed-loop poles could be related to an SRL for the transfer function
$$G_{zu}(s) = C_z(sI - A)^{-1}B_u = \frac{b(s)}{a(s)}$$
and, in fact, the closed-loop poles were given by the LHP roots of
$$a(s)\,a(-s) + \frac{R_{zz}}{R_{uu}}\,b(s)\,b(-s) = 0$$
where we previously had $R_{zz}/R_{uu} \equiv 1/r$
• We now know enough to show that this is true.
Derivation of the SRL

• The closed-loop poles are given by the eigenvalues of
$$H \triangleq \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}$$
so solve $\det(sI - H) = 0$
• If the upper-left block is invertible, then
$$\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A)\det(D - CA^{-1}B)$$
$$\Rightarrow\ \det(sI - H) = \det(sI - A)\,\det\left[ (sI + A^T) - C_z^T R_{zz} C_z (sI - A)^{-1} B_u R_{uu}^{-1} B_u^T \right]$$
$$= \det(sI - A)\,\det(sI + A^T)\,\det\left[ I - C_z^T R_{zz} C_z (sI - A)^{-1} B_u R_{uu}^{-1} B_u^T (sI + A^T)^{-1} \right]$$
• Note that $\det(I + ABC) = \det(I + CAB)$, and if $a(s) = \det(sI - A)$, then $a(-s) = \det(-sI - A^T) = (-1)^n \det(sI + A^T)$, so
$$\det(sI - H) = (-1)^n a(s)\,a(-s)\,\det\left[ I + R_{uu}^{-1} B_u^T (-sI - A^T)^{-1} C_z^T R_{zz} C_z (sI - A)^{-1} B_u \right]$$
• If $G_{zu}(s) = C_z(sI - A)^{-1}B_u$, then $G_{zu}^T(-s) = B_u^T(-sI - A^T)^{-1}C_z^T$, so for SISO systems
$$\det(sI - H) = (-1)^n a(s)\,a(-s)\,\det\left[ I + R_{uu}^{-1} G_{zu}^T(-s) R_{zz} G_{zu}(s) \right] = (-1)^n a(s)\,a(-s)\left[ 1 + \frac{R_{zz}}{R_{uu}} G_{zu}(-s)G_{zu}(s) \right]$$
$$= (-1)^n \left[ a(s)\,a(-s) + \frac{R_{zz}}{R_{uu}}\,b(s)\,b(-s) \right] = 0$$
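For the earlier two-state example this identity can be checked numerically: the eigenvalues of $H$ should coincide with the roots of $a(s)a(-s) + (R_{zz}/R_{uu})b(s)b(-s)$. Here $G_{zu}(s) = 1/(s(s+1))$, so $a(s)a(-s) = s^4 - s^2$ and $b(s)b(-s) = 1$ (this check is my addition, not from the notes).

```python
import numpy as np

# Hamiltonian matrix for A = [[0,1],[0,-1]], Bu = [[0],[1]], Cz = [1, 0],
# Rzz = Ruu = 1 (the example system used earlier).
A = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
Cz = np.array([[1.0, 0.0]])
Rzz, Ruu = 1.0, 1.0

H = np.block([[A, -Bu @ Bu.T / Ruu],
              [-Rzz * (Cz.T @ Cz), -A.T]])

# G_zu(s) = 1/(s(s+1)): a(s) = s^2 + s, b(s) = 1, so the SRL polynomial
# a(s)a(-s) + (Rzz/Ruu) b(s)b(-s) is s^4 - s^2 + 1.
srl_roots = np.roots([1.0, 0.0, -1.0, 0.0, 1.0])
eigs = np.linalg.eigvals(H)

print(np.sort_complex(eigs))
print(np.sort_complex(srl_roots))   # same four values, symmetric about the jw-axis
```

Both sets are $\{\pm 0.866 \pm 0.5j\}$; the two LHP roots are the closed-loop poles, and the RHP roots are their mirror images, as the SRL predicts.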
• Simple example from before: a scalar system with
$$\dot{x} = ax + bu$$
with cost ($R_{xx} > 0$ and $R_{uu} > 0$)
$$J = \int_0^\infty \left( R_{xx}x^2(t) + R_{uu}u^2(t) \right) dt$$
• Then the steady-state $P$ solves
$$2aP + R_{xx} - P^2 b^2 / R_{uu} = 0$$
which gives
$$P = \frac{a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}}}{R_{uu}^{-1} b^2} > 0$$
• Then $u(t) = -Kx(t)$, where
$$K = R_{uu}^{-1} b P = \frac{a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}}}{b}$$
• The closed-loop dynamics are
$$\dot{x} = (a - bK)x = \left( a - \frac{b}{b}\left( a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}} \right) \right) x = -\sqrt{a^2 + b^2 R_{xx}/R_{uu}}\; x = A_{cl}\,x(t)$$
• Note that as $R_{xx}/R_{uu} \to \infty$, $A_{cl} \approx -|b|\sqrt{R_{xx}/R_{uu}}$
• And as $R_{xx}/R_{uu} \to 0$, $K \approx (a + |a|)/b$
— If $a < 0$ (open-loop stable), $K \approx 0$ and $A_{cl} = a - bK \approx a$
— If $a > 0$ (open-loop unstable), $K \approx 2a/b$ and $A_{cl} = a - bK \approx -a$
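A quick numerical sanity check of these closed-form expressions (the parameter values are arbitrary choices for illustration):

```python
import numpy as np

# Scalar LQR: xdot = a x + b u with cost integrand Rxx x^2 + Ruu u^2.
a, b, Rxx, Ruu = 1.0, 2.0, 3.0, 0.5   # arbitrary illustrative values

P = Ruu * (a + np.sqrt(a**2 + b**2 * Rxx / Ruu)) / b**2
K = b * P / Ruu                        # K = Ruu^{-1} b P
A_cl = a - b * K

# P should satisfy the scalar ARE, and A_cl the closed-form pole location.
print(2 * a * P + Rxx - P**2 * b**2 / Ruu)        # ~0
print(A_cl + np.sqrt(a**2 + b**2 * Rxx / Ruu))    # ~0
```

For these values $\sqrt{a^2 + b^2 R_{xx}/R_{uu}} = 5$, so $P = 0.75$, $K = 3$, and $A_{cl} = -5$: the closed-loop pole sits exactly at the predicted location.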
Summary

• Can find the optimal feedback gains $u = -Kx$ using the Matlab command
$$K = \mathrm{lqr}(A, B, R_{xx}, R_{uu})$$
• A similar derivation holds for the optimal estimation problem (Linear Quadratic Estimator)
— A full treatment requires detailed coverage of advanced topics (e.g., stochastic processes and Ito calculus) — better left to a second course.
— But, by duality, we can compute the optimal Kalman filter gains from
$$K_e = \mathrm{lqr}(A^T, C_y^T, B_w R_w B_w^T, R_v), \qquad L = K_e^T$$
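In Python, a rough SciPy-based stand-in for Matlab's `lqr` (a helper I am defining here, not a library function) makes the duality concrete; the measurement matrix `Cy` and noise weights `Bw`, `Rw`, `Rv` below are assumptions chosen for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr(A, B, Rxx, Ruu):
    """Stand-in for Matlab's K = lqr(A, B, Rxx, Ruu), via the ARE."""
    P = solve_continuous_are(A, B, Rxx, Ruu)
    return np.linalg.solve(Ruu, B.T @ P)

A = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])

# Regulator gain for the example system (q = r = 1).
K = lqr(A, Bu, np.diag([1.0, 0.0]), np.array([[1.0]]))

# Estimator gain by duality; Cy, Bw, Rw, Rv are assumed for illustration.
Cy = np.array([[1.0, 0.0]])   # assumed: measure x_1 only
Bw = np.array([[0.0], [1.0]]) # assumed process-noise input
Rw = np.array([[1.0]])        # assumed process-noise intensity
Rv = np.array([[1.0]])        # assumed sensor-noise intensity
Ke = lqr(A.T, Cy.T, Bw @ Rw @ Bw.T, Rv)
L = Ke.T
print(K, L.ravel())
```

The regulator call reproduces $K = [1,\ 0.73]$ from the earlier example, and the dual call returns a stabilizing estimator gain $L$ for the assumed noise model.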