courtesy of JPL Fault Aware Systems: Model-based Programming and Diagnosis Brian C. Williams 16.412J/6.834J March 8 th , 2004 Brian C. Williams, copyright 2000 Four launches in 7 months Mars Climate Orbiter: 12/11/98 Mars Polar Lander: 1/3/99 Stardust: 2/7/99 QuickSCAT: 6/19/98 courtesy of JPL Outline ? Fault Aware Systems and Model-based Programming ? Model-based Diagnosis ? Multiple-fault Diagnosis based on Conflicts ? Mode Estimation Why Model-based Programming? Create Embedded Languages That Reason on the Fly from Commonsense Models Leading Diagnosis: ?Legs deployed during descent. ? Noise spike on leg sensors latched by monitors. ? Laser altimeter registers 50ft. ? Begins polling leg monitors to determine touch down. ? Latched noise spike read as touchdown. ? Engine shutdown at ~50ft. Mars 98: ? Climate Orbiter ? Mars Polar Lander Image courtesy of JPL. sense P(s) WORLD observations actions AGENT Diagnostic Agent: ? Monitors & Diagnoses ? Repairs & Avoids ? Probes and Tests Plant act Symptom-based Consistency-based Model-based Programs Interact Directly with State Embedded programs interact with plant sensors/actuators: ? Read sensors ? Set actuators Model-based programs interact with plant state: ? Read state ?Write state Embedded Program S Plant Obs Cntrl Model-based Embedded Program S Plant Programmer must map between state and sensors/actuators. Model-based executive maps between sensors, actuators to states. Control Sequencer Deductive Controller System Model CommandsObservations Control Program Plant Titan Model-based ExecutiveRMPL Model-based Program State goalsState estimates Generates target goal states conditioned on state estimates Mode Estimation Mode Reconfiguration Tracks likely plant states Tracks least cost goal states z Executes concurrently z Preempts z Queries (hidden) states z Asserts (hidden) state Closed Valve Open Stuck open Stuck closed Open Close 0. 01 0. 01 0.01 0.01 inflow = outflow = 0 Orbital Insertion Example EngineA EngineB Science Camera Turn camera off and engine on EngineA EngineB Science Camera Model-based Program Control program specifies state trajectories: ? fires one of two engines ? sets both engines to ‘standby’ ? prior to firing engine, camera must be turned off to avoid plume contamination ? in case of primary engine failure, fire backup engine instead OrbitInsert():: (do-watching ((EngineA = Thrusting) OR (EngineB = Thrusting)) (parallel (EngineA = Standby) (EngineB = Standby) (Camera = Off) (do-watching (EngineA = Failed) (when-donext ( (EngineA = Standby) AND (Camera = Off) ) (EngineA = Thrusting))) (when-donext ( (EngineA = Failed) AND (EngineB = Standby) AND (Camera = Off) ) (EngineB = Thrusting)))) Plant Model describes behavior of each component: – Nominal and Off nominal – qualitative constraints – likelihoods and costs Example: The model-based program sets engine = thrusting, and the deductive controller . . . . Determines valves on backup engine that will achieve thrust, and plans needed actions. Deduces that a valve failed - stuck closed Selects valve configuration; plans actions to open six valves Fuel tankOxidizer tank Deduces that thrust is off, and the engine is healthy Mode Estimation Mode Reconfiguration Mode Reconfiguration Mode Estimation Given observations… and command history… Mode estimation infers “hidden state” Executive Manipulates Hidden State ? States not DIRECTLY observable or controllable… (thrust = zero) AND (power_in = nominal) last command issued = “standby-cmd” ? (EngineA = Standby) Given state goals … and estimated state … Mode reconfiguration infers “commands” ? [Turn on DriverA]; [Open ValveA] ? Thinking in terms of “hidden states” abstracts away complexity of robustly observing and controlling state. ? Model-based executive raises assurance of software by correctly inferring and controlling states. (ValveA = Open) (DriverA = off) AND (ValveA = closed) Control Sequencer Deductive Controller System Model Commands Observations Control Program Plant Titan Model-based ExecutiveRMPL Model-based Program State goalsState estimates Control Sequencer: Generates goal states conditioned on state estimates Mode Estimation: Tracks likely States Mode Reconfiguration: Tracks least-cost state goals z Executes concurrently z Preempts z Asserts and queries states z Chooses based on reward OrbitInsert():: (do-watching ((EngineA = Firing) OR (EngineB = Firing)) (parallel (EngineA = Standby) (EngineB = Standby) (Camera = Off) (do-watching (EngineA = Failed) (when-donext ( (EngineA = Standby) AND (Camera = Off) ) (EngineA = Firing))) (when-donext ( (EngineA = Failed) AND (EngineB = Standby) AND (Camera = Off) ) (EngineB = Firing)))) MAINTAIN (EAR OR EBR) EBS CO LEGEND: EAS (EngineA = Standby) EAF (EngineA = Failed) EAR (EngineA = Firing) EBS (EngineB = Standby) EBF (EngineB = Failed) EBR (EngineB = Firing) CO (Camera = Off) MAINTAIN (EAF) EAS (EAS AND CO) EAR EAS AND CO (EAF AND EBS AND CO) EBR EAF AND EBS AND CO hierarchical constraint automata on state s Control Sequencer Deductive Controller System Model Commands Observations Control Program Plant Titan Model-based ExecutiveRMPL Model-based Program State goalsState estimates Control Sequencer: Generates goal states conditioned on state estimates Mode Estimation: Tracks likely States Mode Reconfiguration: Tracks least-cost state goals z Executes concurrently z Preempts z Asserts and queries states z Chooses based on reward Fire backup engine Valve fails stuck closed S T X0 X1 XN-1 XN S T X0 X1 XN-1 XN least cost reachable goal state First ActionCurrent Belief State Modeling Complex Behaviors through Probabilistic Constraint Automata ? Complex, discrete behaviors ? modeled through concurrency, hierarchy and timed transitions. ? Anomalies and uncertainty ? modeled by probabilistic transitions ? Physical interactions ? modeled by discrete and continuous constraints Standby Engine Model Off Failed off- cmd standby- cmd 0.01 (thrust = full) AND (power_in = nominal) Firing 0.01 standby- cmd fire- cmd (thrust = zero) AND (power_in = zero) (thrust = zero) AND (power_in = nominal) On Camera Model Off turnoff- cmd turnon- cmd (power_in = zero) AND (shutter = closed) (power_in = nominal) AND (shutter = open) 0 v 2 kv 2 kv 0 v 0 v 20 v 0.01 0.01 0 v Possible Behaviors Visualized by a Trellis Diagram S T X 0 X 1 X N-1 X N ?Assigns a value to each variable (e.g.,3,000 vars). ?Consistent with all state constraints (e.g., 12,000). ?A set of concurrent transitions, one per automata (e.g., 80). ?Previous & Next states consistent with source & target of transitions The Plant’s Behavior Deductive Controller Commands Observations Plant State goalsState estimates Mode Estimation: Tracks likely States Mode Reconfiguration: Tracks least-cost state goals Fire backup engine Valve fails stuck closed S T X0 X1 XN-1 XN S T X0 X1 XN-1 XN least cost reachable goal state First ActionCurrent Belief State Optimal CSP: arg min f(x) s.t. C(x) is satisfiable D(x) is unsatisfiable arg max P T (m’) s.t. M(m’) ^ O(m’) is satisfiable arg min R T* (m’) s.t. M(m’) entails G(m’) s.t. M(m’) is satisfiable Outline ? Fault Aware Systems and Model-based Programming ? Model-based Diagnosis ? Multiple-fault Diagnosis based on Conflicts ? Mode Estimation Issue 1: Handling Hidden Failures Requires Reasoning from a Model: STS-93 Symptoms: ? Engine temp sensor high ? LOX level low ? GN&C detects low thrust ? H2 level possibly low Problem: Liquid hydrogen leak Effect: ? LH2 used to cool engine ? Engine runs hot ? Consumes more LOX Image courtesy of NASA. ? Mars Observer ? Mars Climate Orbiter ? Mars Polar Lander ? Deep Space 2 courtesy of JPL Issue 2: Failures are Often Novel Helium tank Fuel tankOxidizer tank Main Engines Flow 1 = zero Pressure 1 = nominal Pressure 2 = nominal Acceleration = zero Issue 3: Multiple Faults Occur courtesy of NASA ? three shorts, tank-line and pressure jacket burst, panel flies off. APOLLO 13 When you have eliminated the impossible, whatever remains, however improbable, must be the truth. - Sherlock Holmes. The Sign of the Four. Model-based Diagnosis as Conflict-directed Best First Search 1. Test Hypothesis 2. If inconsistent, learn reason for inconsistency (a Conflict). 3. Use conflicts to leap over similarly infeasible options to next best hypothesis. Compare Most Likely Hypothesis to Observations Helium tank Fuel tankOxidizer tank Main Engines Flow 1 = zero Pressure 1 = nominal Pressure 2 = nominal Acceleration = zero It is most likely that all components are okay. Isolate Conflicting Information Helium tank Fuel tankOxidizer tank Main Engines Flow 1 = zero The red component modes conflict with the model and observations. Helium tank Fuel tankOxidizer tank Main Engines Flow 1 = zero Leap to the Next Most Likely Hypothesis that Resolves the Conflict The next hypothesis must remove the conflict New Hypothesis Exposes Additional Conflicts Pressure 1 = nominal Pressure 2 = nominal Acceleration = zero Helium tank Fuel tankOxidizer tank Main Engines Another conflict, try removing both Final Hypothesis Resolves all Conflicts Helium tank Fuel tankOxidizer tank Main Engines Pressure 1 = nominal Flow 1 = zero Pressure 2 = nominal Flow 2 = positive Acceleration = zero Implementation: Conflict-directed A* search. Soln 1: Model-based Diagnosis Given a system with symptomatic behavior and a model of the system, find diagnoses that account for symptoms. 1 Symptom1 1 Or1 Or2 Or3 And1 And2 A B C D E F G X Y Z 1 1 1 1 0 0 1 1 1 Given a system with symptomatic behavior and a model of the system, find diagnoses that account for symptoms. 1 Symptom1 1 Or1 Or2 Or3 And1 And2 A B C D E F G X Y Z 1 1 1 1 0 0 1 1 1 Soln 1: Model-based Diagnosis Diagnosis as Hypothesis Testing 1. Generate candidates, given symptoms. 2. Test if candidates account for all symptoms. ? Set of diagnoses should be complete. ? Set of diagnoses should consider all available information. Issue 2: How Should Diagnoses Account for Novel Symptoms? Consistency-based Diagnosis: Given symptoms, find diagnoses that are consistent with symptoms. Suspending Constraints: Make no presumptions about faulty component behavior. 1 1 1 Symptom Or1 Or2 Or3 And1 And2 A B C D E 1 1 1 1 0 F G X Y Z 0 1 Consistency-based Diagnosis: Given symptoms, find diagnoses that are consistent with symptoms. Suspending Constraints: Make no presumptions about faulty component behavior. 1 1 1 Symptom Or1 Or2 Or3 And1 And2 A B C D E 1 1 1 1 0 F G X Y Z 0 1 Issue 2: How Should Diagnoses Account for Novel Symptoms? Consistency-based Diagnosis: Given symptoms, find diagnoses that are consistent with symptoms. Suspending Constraints: Make no presumptions about faulty component behavior. 1 1 Or1 Or2 Or3 And1 And2 A B C D E 1 1 1 1 0 F G X Y Z 0 1 Issue 2: How Should Diagnoses Account for Novel Symptoms? Issue 3: Multiple Faults Occur courtesy of NASA ? three shorts, tank-line and pressure jacket burst, panel flies off. ? Divide & Conquer ? Diagnose each symptom. ? Summarize (conflicts) ? Combine APOLLO 13 ? Candidate: Assignment to all component modes. 3 2 2 3 3 M1 M2 M3 A1 A2 A B C D E F G X Y Z 10 Adder(i): ? G(i): Out(i) = In1(i)+In2(i) ? U(i): 12 Diagnosis identifies consistent modes Candidate = {A1=G, A2=G, M1=G, M2=G, M3=G} Diagnosis identifies All sets of consistent modes Adder(i): ? G(i): Out(i) = In1(i)+In2(i) ? U(i): ? Diagnosis D: Candidate consistent with model Phi and observables OBS. ? As more constraints are relaxed, candidates are more easily satisfied. ?Typically an exponential number of candidates. 3 2 2 3 3 M1 M3 A1 A B C D E F G X Y Z 10 12 Diagnosis = {A1=G, A2=U, M1=G, M2=U, M3=G} Testing Consistency →Propositional Logic ?DPLL ? Unit propagation (incomplete) ?Finite Domain Constraints ? Backtrack w forward checking, … ? Waltz constraint propagation (incomplete) ? Algebraic Constraints ? Sussman/Steele Constraint Propagation: Propagate newly assigned values through equations mentioning variables. ? To propagate, use assigned values of constraint to deduce unknown value(s) of constraint. X ∈{1,0} X=1 ∨ X=0 ?X=1∨?X=0 Encoding Models In Propositional Logic ?(i=G) ∨?(In1(i)=1) ∨ Out(i)=1 ?(i=G) ∨?(In2(i)=1) ∨ Out(i)=1 ?(i=G) ∨?(In1(i)=0) ∨?(In2(i)=0) ∨ Out(i)=0 And(i): ? G(i): Out(i) = In1(i) AND In2(i) ? U(i): Or(i): ? G(i): Out(i) = In1(i) OR In2(i) ? U(i): ?(i=G) ∨?(In1(i)=0) ∨ Out(i)=0 ?(i=G) ∨?(In2(i)=0) ∨ Out(i)=0 ?(i=G) ∨?(In1(i)=1) ∨?(In2(i)=1) ∨ Out(i)=1 Recap: Consistency-based Diagnosis And(i): ? G(i): Out(i) = In1(i) AND In2(i) ? U(i): ? Obs: Assignment to O ? Candidate C i : Assignment of modes to X ? Diagnosis D i : A candidate such that D i ∧ Obs ∧ C(X,Y) is satisfiable. 1 1 1 1 0 Or1 Or3 And1 A B C D E F G X Y Z 0 1 Diagnosis = {A1=G, A2=U O1=G, O2=U, O3=G} ALL components have “unknown Mode” U, Whose assignment is never mentioned in C