Topic #22
16.31 Feedback Control
Deterministic LQR

• Optimal control and the Riccati equation
• Lagrange multipliers
• The Hamiltonian matrix and the symmetric root locus

Factoids: for symmetric R,

  ∂(u^T R u)/∂u = 2 u^T R        ∂(R u)/∂u = R

Copyright 2001 by Jonathan How.

Fall 2001 16.31 22—1

Linear Quadratic Regulator (LQR)

• We have seen the solutions to the LQR problem using the symmetric root locus, which defines the locations of the closed-loop poles.
  — Linear full-state feedback control.
  — We would like to demonstrate from first principles that this is the optimal form of the control.

• Deterministic Linear Quadratic Regulator

  Plant:  ẋ(t) = A(t) x(t) + B_u(t) u(t),   x(t_0) = x_0
          z(t) = C_z(t) x(t)

  Cost:
          J_LQR = ∫_{t_0}^{t_f} [ z^T(t) R_zz(t) z(t) + u^T(t) R_uu(t) u(t) ] dt + x^T(t_f) P_tf x(t_f)

  — where P_tf ≥ 0, R_zz(t) > 0, and R_uu(t) > 0
  — define R_xx = C_z^T R_zz C_z ≥ 0
  — A(t) is a continuous function of time
  — B_u(t), C_z(t), R_zz(t), R_uu(t) are piecewise continuous functions of time, and all are bounded.

• Problem statement: find the input u(t) ∀ t ∈ [t_0, t_f] to minimize J_LQR.

• Note that this is the most general form of the LQR problem — we rarely need this level of generality, and we often suppress the time dependence of the matrices.
  — One case that does need it: the aircraft landing problem.

• To optimize the cost, we follow the standard procedure of adjoining the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:

  H = (1/2) [ x^T(t) R_xx x(t) + u^T(t) R_uu u(t) ] + λ^T(t) [ A x(t) + B_u u(t) ]

  — λ(t) ∈ R^{n×1} is called the adjoint variable or costate
  — it is the Lagrange multiplier in the problem.

• From Stengel (p. 427), the necessary and sufficient conditions for optimality are that:

  1. λ̇(t) = -(∂H/∂x)^T = -R_xx x(t) - A^T λ(t)
  2. λ(t_f) = P_tf x(t_f)
  3. ∂H/∂u = 0  ⇒  R_uu u + B_u^T λ(t) = 0,  so  u_opt = -R_uu^-1 B_u^T λ(t)
  4. ∂²H/∂u² ≥ 0  (need to check that R_uu ≥ 0)
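The factoids above (and the stationarity condition ∂H/∂u = 0) rest on the gradient identity for quadratic forms: for symmetric R, the gradient of u^T R u with respect to u is 2 R u (the row-vector form 2 u^T R in the notes). A quick finite-difference verification sketch, assuming only numpy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
S = rng.standard_normal((n, n))
R = S + S.T                              # a symmetric R
u = rng.standard_normal(n)

f = lambda v: v @ R @ v                  # f(u) = u^T R u

# central finite differences for df/du
eps = 1e-6
I = np.eye(n)
grad_fd = np.array([(f(u + eps * I[i]) - f(u - eps * I[i])) / (2 * eps)
                    for i in range(n)])

grad_exact = 2 * R @ u                   # the factoid, written as a column vector
print(np.allclose(grad_fd, grad_exact, atol=1e-4))
```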
• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.

• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier
  — this results in an unconstrained optimization.

• Example: min f(x, y) = x² + y², subject to the constraint c(x, y) = x + y + 2 = 0

  [Figure 1: Optimization results, plotted over x, y ∈ [-2, 2].]

• Clearly the unconstrained minimum is at x = y = 0.

• To find the constrained minimum, form the augmented cost function

  L ≜ f(x, y) + λ c(x, y) = x² + y² + λ(x + y + 2)

  — where λ is the Lagrange multiplier.

• Note that if the constraint is satisfied, then L ≡ f.

• The solution approach without constraints is to find the stationary point of f(x, y)  (∂f/∂x = ∂f/∂y = 0)
  — with constraints, we find the stationary points of L:

    ∂L/∂x = ∂L/∂y = ∂L/∂λ = 0

  which gives

    ∂L/∂x = 2x + λ = 0
    ∂L/∂y = 2y + λ = 0
    ∂L/∂λ = x + y + 2 = 0

• This gives 3 equations in 3 unknowns; solving gives x* = y* = -1.

• The key point here is that, due to the constraint, the selections of x and y during the minimization are not independent
  — the Lagrange multiplier captures this dependency.

• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.

• Note that we now have:

  ẋ(t) = A x(t) + B_u u_opt(t) = A x(t) - B_u R_uu^-1 B_u^T λ(t),  with x(t_0) = x_0

• So combine this with the equation for the adjoint variable

  λ̇(t) = -R_xx x(t) - A^T λ(t) = -C_z^T R_zz C_z x(t) - A^T λ(t)

  to get:

  [ ẋ(t) ]   [ A                 -B_u R_uu^-1 B_u^T ] [ x(t) ]
  [ λ̇(t) ] = [ -C_z^T R_zz C_z   -A^T               ] [ λ(t) ]

  which of course is the Hamiltonian matrix again.

• Note that the dynamics of x(t) and λ(t) are coupled, but x(t) is known initially and λ(t) is known at the terminal time, since λ(t_f) = P_tf x(t_f)
  — this is a two-point boundary value problem, which is very hard to solve in general.
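The Lagrange-multiplier example above reduces to three equations that are linear in (x, y, λ), so the constrained minimum comes from a single linear solve. A minimal check, assuming numpy:

```python
import numpy as np

# Stationarity of L = x^2 + y^2 + lam*(x + y + 2):
#   dL/dx  : 2x + lam  = 0
#   dL/dy  : 2y + lam  = 0
#   dL/dlam: x + y + 2 = 0
M = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, -2.0])
x, y, lam = np.linalg.solve(M, rhs)
print(round(x, 6), round(y, 6), round(lam, 6))   # -1.0 -1.0 2.0
```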
• However, in this case we can introduce a new matrix variable P(t) and show that:
  1. λ(t) = P(t) x(t)
  2. it is relatively easy to find P(t).

• How to proceed?

1. For the 2n-state system

   [ ẋ(t) ]   [ A                 -B_u R_uu^-1 B_u^T ] [ x(t) ]
   [ λ̇(t) ] = [ -C_z^T R_zz C_z   -A^T               ] [ λ(t) ]

   define a transition matrix

   F(t_1, t_0) = [ F_11(t_1, t_0)  F_12(t_1, t_0) ]
                 [ F_21(t_1, t_0)  F_22(t_1, t_0) ]

   and use it to relate x(t) and λ(t) to x(t_f) and λ(t_f):

   [ x(t) ]   [ F_11(t, t_f)  F_12(t, t_f) ] [ x(t_f) ]
   [ λ(t) ] = [ F_21(t, t_f)  F_22(t, t_f) ] [ λ(t_f) ]

   so, using λ(t_f) = P_tf x(t_f),

   x(t) = F_11(t, t_f) x(t_f) + F_12(t, t_f) λ(t_f)
        = [ F_11(t, t_f) + F_12(t, t_f) P_tf ] x(t_f)

2. Now find λ(t) in terms of x(t_f):

   λ(t) = [ F_21(t, t_f) + F_22(t, t_f) P_tf ] x(t_f)

3. Eliminate x(t_f) to get:

   λ(t) = [ F_21(t, t_f) + F_22(t, t_f) P_tf ] [ F_11(t, t_f) + F_12(t, t_f) P_tf ]^-1 x(t) ≜ P(t) x(t)

4. Now, since λ(t) = P(t) x(t), we have λ̇(t) = Ṗ(t) x(t) + P(t) ẋ(t), so

   -C_z^T R_zz C_z x(t) - A^T λ(t) = Ṗ(t) x(t) + P(t) ẋ(t)

   ⇒ -Ṗ(t) x(t) = C_z^T R_zz C_z x(t) + A^T λ(t) + P(t) ẋ(t)
                = C_z^T R_zz C_z x(t) + A^T λ(t) + P(t) [ A x(t) - B_u R_uu^-1 B_u^T λ(t) ]
                = [ C_z^T R_zz C_z + P(t) A ] x(t) + [ A^T - P(t) B_u R_uu^-1 B_u^T ] λ(t)
                = [ C_z^T R_zz C_z + P(t) A ] x(t) + [ A^T - P(t) B_u R_uu^-1 B_u^T ] P(t) x(t)
                = [ A^T P(t) + P(t) A + C_z^T R_zz C_z - P(t) B_u R_uu^-1 B_u^T P(t) ] x(t)

• This must be true for arbitrary x(t), so P(t) must satisfy

  -Ṗ(t) = A^T P(t) + P(t) A + C_z^T R_zz C_z - P(t) B_u R_uu^-1 B_u^T P(t)

  — which is a matrix differential Riccati equation.

• The optimal value of P(t) is found by solving this equation backwards in time from t_f, with P(t_f) = P_tf.

• The control gains are then

  u_opt = -R_uu^-1 B_u^T λ(t) = -R_uu^-1 B_u^T P(t) x(t) = -K(t) x(t)

  — where K(t) ≜ R_uu^-1 B_u^T P(t).

• Note that x(t) and λ(t) together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate λ(t) from the solution by introducing P(t), which solves a Riccati equation.
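The backward Riccati sweep described above is easy to sketch numerically. Below, the differential Riccati equation is integrated backwards from t_f with a deliberately crude explicit-Euler step; the two-state plant, the weights q = r = 1, and the terminal condition P_tf = 0 are choices made here for illustration:

```python
import numpy as np

# Two-state plant: xdot = [[0,1],[0,-1]] x + [0;1] u, with q = r = 1;
# P_tf = 0 is an assumption made here for simplicity.
A = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
Rxx = np.array([[1.0, 0.0], [0.0, 0.0]])   # Cz' Rzz Cz with Cz = [1 0], q = 1
Ruu_inv = np.array([[1.0]])                # Ruu^-1 with r = 1

def neg_Pdot(P):
    # -Pdot = A'P + PA + Rxx - P Bu Ruu^-1 Bu' P
    return A.T @ P + P @ A + Rxx - P @ Bu @ Ruu_inv @ Bu.T @ P

# sweep backwards from t_f = 10 to t = 0: P(t - dt) = P(t) + dt * (-Pdot)
dt, tf = 1e-3, 10.0
P = np.zeros((2, 2))                       # P(t_f) = P_tf
for _ in range(int(tf / dt)):
    P = P + dt * neg_Pdot(P)

K = Ruu_inv @ Bu.T @ P                     # K(0) = Ruu^-1 Bu' P(0)
print(np.round(K, 3))                      # close to the steady-state gain [1, 0.732]
```

By t = 0 the gain has converged to the steady-state value K = [1, sqrt(3) - 1], which is what makes the constant-gain approximation in the next section reasonable for long horizons.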
• The optimal control input is in fact a linear full-state feedback control.

• Note that normally we are interested in problems with t_0 = 0 and t_f = ∞, in which case we can just use the steady-state value of P that solves (assuming (A, B_u) is stabilizable)

  A^T P + P A + C_z^T R_zz C_z - P B_u R_uu^-1 B_u^T P = 0

  which is the algebraic Riccati equation.
  — If we use the steady-state value of P, then K is constant.

• Example: simple system with t_0 = 0 and t_f = 10 sec:

  ẋ = [ 0  1 ] x + [ 0 ] u
      [ 0 -1 ]     [ 1 ]

  J = x^T(10) [ h 0 ] x(10) + ∫_0^10 ( x^T(t) [ q 0 ] x(t) + r u²(t) ) dt
              [ 0 0 ]                         [ 0 0 ]

• Compute the gains using both the time-varying P(t) and the steady-state value.

• Find the state solution for x(0) = [1 1]^T using both sets of gains.

  [Figure 2: Set q = 1, r = 1, h = 10; K_lqr = [1 0.73]. Top: time-varying (dynamic) gains K_1(t), K_2(t) vs. the static gains K_1, K_2 over 0–10 sec. Bottom: the resulting state responses x_1, x_2 for each set of gains.]

• As noted, the closed-loop dynamics couple x(t) and λ(t) and are given by

  [ ẋ(t) ]   [ A                 -B_u R_uu^-1 B_u^T ] [ x(t) ]
  [ λ̇(t) ] = [ -C_z^T R_zz C_z   -A^T               ] [ λ(t) ]

  with the appropriate boundary conditions.

• OK, so where are the closed-loop poles of the system?
  — They must be among the eigenvalues of

  H ≜ [ A                 -B_u R_uu^-1 B_u^T ]
      [ -C_z^T R_zz C_z   -A^T               ]

• When we analyzed this before for a SISO system, we found that the closed-loop poles could be related to an SRL for the transfer function

  G_zu(s) = C_z (sI - A)^-1 B_u = b(s)/a(s)

  and, in fact, the closed-loop poles were given by the LHP roots of

  a(s) a(-s) + (R_zz/R_uu) b(s) b(-s) = 0

  where we previously had R_zz/R_uu ≡ 1/r.

• We now know enough to show that this is true.

Derivation of the SRL
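Before working through the algebra, the claim can be checked numerically for the example system above, taking C_z = [1 0] so that G_zu(s) = 1/(s(s+1)), i.e. a(s) = s² + s and b(s) = 1. The eigenvalues of H should coincide with the roots of a(s)a(-s) + (R_zz/R_uu) b(s)b(-s) = 0. A verification sketch, assuming numpy, with a(s) and b(s) entered by hand:

```python
import numpy as np

# Example system: Gzu(s) = 1/(s(s+1)), so a(s) = s^2 + s, b(s) = 1
A  = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
Cz = np.array([[1.0, 0.0]])
Rzz, Ruu = 1.0, 1.0

# Hamiltonian matrix H
H = np.block([[A,                  -Bu @ Bu.T / Ruu],
              [-Rzz * (Cz.T @ Cz), -A.T            ]])
eig_H = np.linalg.eigvals(H)

# SRL polynomial a(s)a(-s) + (Rzz/Ruu) b(s)b(-s), with b(s)b(-s) = 1
a     = np.array([1.0, 1.0, 0.0])     # a(s)  = s^2 + s
a_neg = np.array([1.0, -1.0, 0.0])    # a(-s) = s^2 - s
srl = np.polymul(a, a_neg)
srl[-1] += Rzz / Ruu                  # add (Rzz/Ruu) * b(s)b(-s)
srl_roots = np.roots(srl)

# every SRL root should be an eigenvalue of H (and vice versa)
matched = all(np.min(np.abs(r - eig_H)) < 1e-8 for r in srl_roots)
print(matched)
```

The four eigenvalues come out as ±0.866 ± 0.5j; the two in the LHP are the closed-loop poles, as claimed.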
• The closed-loop poles are given by the eigenvalues of

  H ≜ [ A                 -B_u R_uu^-1 B_u^T ]
      [ -C_z^T R_zz C_z   -A^T               ]

  so solve det(sI - H) = 0.

• Use the block-determinant identity: if A is invertible,

  det [ A  B ] = det(A) det(D - C A^-1 B)
      [ C  D ]

• Applying this to sI - H:

  det(sI - H) = det(sI - A) det[ (sI + A^T) - C_z^T R_zz C_z (sI - A)^-1 B_u R_uu^-1 B_u^T ]
              = det(sI - A) det(sI + A^T) det[ I - C_z^T R_zz C_z (sI - A)^-1 B_u R_uu^-1 B_u^T (sI + A^T)^-1 ]

• Note that det(I + ABC) = det(I + CAB), and if a(s) = det(sI - A), then a(-s) = det(-sI - A^T) = (-1)^n det(sI + A^T), so

  det(sI - H) = (-1)^n a(s) a(-s) det[ I + R_uu^-1 B_u^T (-sI - A^T)^-1 C_z^T R_zz C_z (sI - A)^-1 B_u ]

• If G_zu(s) = C_z (sI - A)^-1 B_u, then G_zu^T(-s) = B_u^T (-sI - A^T)^-1 C_z^T, so for SISO systems

  det(sI - H) = (-1)^n a(s) a(-s) det[ I + R_uu^-1 G_zu^T(-s) R_zz G_zu(s) ]
              = (-1)^n a(s) a(-s) [ 1 + (R_zz/R_uu) G_zu(-s) G_zu(s) ]
              = (-1)^n [ a(s) a(-s) + (R_zz/R_uu) b(s) b(-s) ]

  so det(sI - H) = 0 gives the SRL equation

  a(s) a(-s) + (R_zz/R_uu) b(s) b(-s) = 0

• Simple example from before: a scalar system ẋ = a x + b u with cost (R_xx > 0 and R_uu > 0)

  J = ∫_0^∞ ( R_xx x²(t) + R_uu u²(t) ) dt

• Then the steady-state P solves

  2 a P + R_xx - P² b² / R_uu = 0

  which gives

  P = R_uu ( a + sqrt(a² + b² R_xx/R_uu) ) / b² > 0

• Then u(t) = -K x(t), where

  K = R_uu^-1 b P = ( a + sqrt(a² + b² R_xx/R_uu) ) / b

• The closed-loop dynamics are

  ẋ = (a - bK) x = [ a - (a + sqrt(a² + b² R_xx/R_uu)) ] x = -sqrt(a² + b² R_xx/R_uu) x ≜ A_cl x(t)

• Note that as R_xx/R_uu → ∞,  A_cl ≈ -|b| sqrt(R_xx/R_uu)

• And as R_xx/R_uu → 0,  K ≈ (a + |a|)/b
  — if a < 0 (open-loop stable), K ≈ 0 and A_cl = a - bK ≈ a
  — if a > 0 (open-loop unstable), K ≈ 2a/b and A_cl = a - bK ≈ -a

Summary

• We can find the optimal feedback gains u = -Kx using the Matlab command K = lqr(A, B, R_xx, R_uu).

• A similar derivation holds for the optimal estimation problem (the Linear Quadratic Estimator)
  — a full treatment requires detailed coverage of advanced topics (e.g., stochastic processes and Ito calculus) — better left to a second course.
  — But, by duality, we can compute the optimal Kalman filter gains from

    K_e = lqr(A^T, C_y^T, B_w R_w B_w^T, R_v),   L = K_e^T
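A Python analogue of this duality can be sketched with scipy's continuous-time ARE solver standing in for Matlab's lqr. The plant, measurement, and noise matrices below are assumed for illustration only:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data (assumed): a two-state plant with a position
# measurement y = Cy x + v and process noise entering through Bw.
A  = np.array([[0.0, 1.0], [0.0, -1.0]])
Cy = np.array([[1.0, 0.0]])
Bw = np.array([[0.0], [1.0]])
Rw = np.array([[1.0]])                   # process noise intensity
Rv = np.array([[1.0]])                   # sensor noise intensity

# Dual LQR: solve the ARE for (A', Cy', Bw Rw Bw', Rv) ...
Pe = solve_continuous_are(A.T, Cy.T, Bw @ Rw @ Bw.T, Rv)
Ke = np.linalg.solve(Rv, Cy @ Pe)        # dual regulator gain
L  = Ke.T                                # ... and transpose to get the filter gain

# sanity checks: L equals the usual Pe Cy' Rv^-1, and A - L Cy is stable
print(np.allclose(L, Pe @ Cy.T @ np.linalg.inv(Rv)))
print(np.all(np.linalg.eigvals(A - L @ Cy).real < 0))
```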