courtesy of JPL
Fault Aware Systems:
Model-based Programming
and Diagnosis
Brian C. Williams
16.412J/6.834J
March 8
th
, 2004
Brian C. Williams, copyright 2000
Four launches in 7 months
Mars Climate Orbiter: 12/11/98
Mars Polar Lander: 1/3/99
Stardust: 2/7/99
QuickSCAT: 6/19/98
courtesy of JPL
Outline
? Fault Aware Systems and
Model-based Programming
? Model-based Diagnosis
? Multiple-fault Diagnosis based on Conflicts
? Mode Estimation
Why Model-based Programming?
Create Embedded Languages
That Reason on the Fly
from Commonsense Models
Leading Diagnosis:
?Legs deployed during descent.
? Noise spike on leg sensors
latched by monitors.
? Laser altimeter registers 50ft.
? Begins polling leg monitors to
determine touch down.
? Latched noise spike read as
touchdown.
? Engine shutdown at ~50ft.
Mars 98:
? Climate Orbiter
? Mars Polar Lander
Image courtesy of JPL.
sense
P(s)
WORLD
observations
actions
AGENT
Diagnostic Agent:
? Monitors & Diagnoses
? Repairs & Avoids
? Probes and Tests
Plant
act
Symptom-based
Consistency-based
Model-based Programs
Interact Directly with State
Embedded programs interact with
plant sensors/actuators:
? Read sensors
? Set actuators
Model-based programs
interact with plant state:
? Read state
?Write state
Embedded Program
S
Plant
Obs
Cntrl
Model-based
Embedded Program
S
Plant
Programmer must map between
state and sensors/actuators.
Model-based executive maps
between sensors, actuators to states.
Control Sequencer
Deductive Controller
System Model
CommandsObservations
Control Program
Plant
Titan Model-based ExecutiveRMPL Model-based Program
State goalsState estimates
Generates target goal states
conditioned on state estimates
Mode
Estimation
Mode
Reconfiguration
Tracks
likely
plant states
Tracks least
cost goal states
z Executes concurrently
z Preempts
z Queries (hidden) states
z Asserts (hidden) state
Closed
Valve
Open
Stuck
open
Stuck
closed
Open Close
0. 01
0. 01
0.01
0.01
inflow = outflow = 0
Orbital Insertion Example
EngineA EngineB
Science Camera
Turn camera off and engine on
EngineA EngineB
Science Camera
Model-based Program
Control program specifies
state trajectories:
? fires one of two engines
? sets both engines to ‘standby’
? prior to firing engine, camera must be
turned off to avoid plume contamination
? in case of primary engine failure, fire
backup engine instead
OrbitInsert()::
(do-watching ((EngineA = Thrusting) OR
(EngineB = Thrusting))
(parallel
(EngineA = Standby)
(EngineB = Standby)
(Camera = Off)
(do-watching (EngineA = Failed)
(when-donext ( (EngineA = Standby) AND
(Camera = Off) )
(EngineA = Thrusting)))
(when-donext ( (EngineA = Failed) AND
(EngineB = Standby) AND
(Camera = Off) )
(EngineB = Thrusting))))
Plant Model describes
behavior of each component:
– Nominal and Off nominal
– qualitative constraints
– likelihoods and costs
Example: The model-based program sets engine = thrusting, and the
deductive controller . . . .
Determines valves
on backup engine that
will achieve thrust, and
plans needed actions.
Deduces that a valve
failed - stuck closed
Selects valve
configuration;
plans actions
to open
six valves
Fuel tankOxidizer tank
Deduces that
thrust is off, and
the engine is healthy
Mode Estimation
Mode Reconfiguration
Mode Reconfiguration Mode Estimation
Given observations…
and command history…
Mode estimation infers
“hidden state”
Executive Manipulates Hidden State
? States not DIRECTLY observable or controllable…
(thrust = zero) AND (power_in = nominal)
last command issued = “standby-cmd”
? (EngineA = Standby)
Given state goals …
and estimated state …
Mode reconfiguration
infers “commands”
? [Turn on DriverA]; [Open ValveA]
? Thinking in terms of “hidden states” abstracts away
complexity of robustly observing and controlling state.
? Model-based executive raises assurance of software by
correctly inferring and controlling states.
(ValveA = Open)
(DriverA = off) AND (ValveA = closed)
Control Sequencer
Deductive Controller
System Model
Commands
Observations
Control Program
Plant
Titan Model-based ExecutiveRMPL Model-based Program
State goalsState estimates
Control Sequencer:
Generates goal states
conditioned on state estimates
Mode
Estimation:
Tracks likely
States
Mode
Reconfiguration:
Tracks least-cost
state goals
z Executes concurrently
z Preempts
z Asserts and queries states
z Chooses based on reward
OrbitInsert()::
(do-watching ((EngineA = Firing) OR
(EngineB = Firing))
(parallel
(EngineA = Standby)
(EngineB = Standby)
(Camera = Off)
(do-watching (EngineA = Failed)
(when-donext ( (EngineA = Standby) AND
(Camera = Off) )
(EngineA = Firing)))
(when-donext ( (EngineA = Failed) AND
(EngineB = Standby) AND
(Camera = Off) )
(EngineB = Firing))))
MAINTAIN (EAR OR EBR)
EBS
CO
LEGEND:
EAS (EngineA = Standby)
EAF (EngineA = Failed)
EAR (EngineA = Firing)
EBS (EngineB = Standby)
EBF (EngineB = Failed)
EBR (EngineB = Firing)
CO (Camera = Off)
MAINTAIN (EAF)
EAS
(EAS AND CO)
EAR
EAS AND CO
(EAF AND EBS AND CO)
EBR
EAF AND EBS
AND CO
hierarchical constraint
automata on state s
Control Sequencer
Deductive Controller
System Model
Commands
Observations
Control Program
Plant
Titan Model-based ExecutiveRMPL Model-based Program
State goalsState estimates
Control Sequencer:
Generates goal states
conditioned on state estimates
Mode
Estimation:
Tracks likely
States
Mode
Reconfiguration:
Tracks least-cost
state goals
z Executes concurrently
z Preempts
z Asserts and queries states
z Chooses based on reward
Fire backup
engine
Valve fails
stuck closed
S T
X0 X1 XN-1 XN
S T
X0 X1 XN-1 XN
least cost reachable
goal state
First ActionCurrent Belief State
Modeling Complex Behaviors through
Probabilistic Constraint Automata
? Complex, discrete behaviors
? modeled through concurrency, hierarchy and timed transitions.
? Anomalies and uncertainty
? modeled by probabilistic transitions
? Physical interactions
? modeled by discrete and continuous constraints
Standby
Engine Model
Off
Failed
off-
cmd
standby-
cmd
0.01
(thrust = full) AND
(power_in = nominal)
Firing
0.01
standby-
cmd
fire-
cmd
(thrust = zero) AND
(power_in = zero)
(thrust = zero) AND
(power_in = nominal)
On
Camera Model
Off
turnoff-
cmd
turnon-
cmd
(power_in = zero)
AND
(shutter = closed)
(power_in = nominal)
AND
(shutter = open)
0 v
2 kv
2 kv
0 v
0 v
20 v
0.01
0.01
0 v
Possible Behaviors
Visualized by a Trellis Diagram
S T
X
0
X
1
X
N-1
X
N
?Assigns a value to each
variable (e.g.,3,000 vars).
?Consistent with all state
constraints (e.g., 12,000).
?A set of concurrent transitions,
one per automata (e.g., 80).
?Previous & Next states
consistent with source & target
of transitions
The Plant’s Behavior
Deductive Controller
Commands
Observations
Plant
State goalsState estimates
Mode
Estimation:
Tracks likely
States
Mode
Reconfiguration:
Tracks least-cost
state goals
Fire backup
engine
Valve fails
stuck closed
S T
X0 X1 XN-1 XN
S T
X0 X1 XN-1 XN
least cost reachable
goal state
First ActionCurrent Belief State
Optimal CSP:
arg min f(x)
s.t. C(x) is satisfiable
D(x) is unsatisfiable
arg max P
T
(m’)
s.t. M(m’) ^ O(m’) is satisfiable
arg min R
T*
(m’)
s.t. M(m’) entails G(m’)
s.t. M(m’) is satisfiable
Outline
? Fault Aware Systems and
Model-based Programming
? Model-based Diagnosis
? Multiple-fault Diagnosis based on Conflicts
? Mode Estimation
Issue 1: Handling Hidden Failures Requires
Reasoning from a Model: STS-93
Symptoms:
? Engine temp sensor high
? LOX level low
? GN&C detects low thrust
? H2 level possibly low
Problem: Liquid hydrogen leak
Effect:
? LH2 used to cool engine
? Engine runs hot
? Consumes more LOX
Image courtesy of NASA.
? Mars Observer
? Mars Climate Orbiter
? Mars Polar Lander
? Deep Space 2
courtesy of JPL
Issue 2: Failures are Often Novel
Helium tank
Fuel tankOxidizer tank
Main
Engines
Flow
1
= zero
Pressure
1
= nominal
Pressure
2
= nominal
Acceleration = zero
Issue 3: Multiple Faults Occur
courtesy of NASA
? three shorts, tank-line
and pressure jacket
burst, panel flies off.
APOLLO 13
When you have eliminated the impossible,
whatever remains, however improbable,
must be the truth.
- Sherlock Holmes. The Sign of the Four.
Model-based Diagnosis as
Conflict-directed Best First Search
1. Test Hypothesis
2. If inconsistent, learn reason for inconsistency
(a Conflict).
3. Use conflicts to leap over similarly infeasible options
to next best hypothesis.
Compare Most Likely Hypothesis
to Observations
Helium tank
Fuel tankOxidizer tank
Main
Engines
Flow
1
= zero
Pressure
1
= nominal
Pressure
2
= nominal
Acceleration = zero
It is most likely that all components are okay.
Isolate Conflicting Information
Helium tank
Fuel tankOxidizer tank
Main
Engines
Flow
1
= zero
The red component modes conflict with the model and
observations.
Helium tank
Fuel tankOxidizer tank
Main
Engines
Flow
1
= zero
Leap to the Next Most Likely Hypothesis
that Resolves the Conflict
The next hypothesis must remove the conflict
New Hypothesis Exposes Additional Conflicts
Pressure
1
= nominal Pressure
2
= nominal
Acceleration = zero
Helium tank
Fuel tankOxidizer tank
Main
Engines
Another conflict, try removing both
Final Hypothesis Resolves all Conflicts
Helium tank
Fuel tankOxidizer tank
Main
Engines
Pressure
1
= nominal
Flow
1
= zero
Pressure
2
= nominal
Flow
2
= positive
Acceleration = zero
Implementation: Conflict-directed A* search.
Soln 1: Model-based Diagnosis
Given a system with symptomatic behavior and a
model of the system, find diagnoses that account
for symptoms.
1
Symptom1
1
Or1
Or2
Or3
And1
And2
A
B
C
D
E
F
G
X
Y
Z
1
1
1
1
0
0
1
1
1
Given a system with symptomatic behavior and a
model of the system, find diagnoses that account
for symptoms.
1
Symptom1
1
Or1
Or2
Or3
And1
And2
A
B
C
D
E
F
G
X
Y
Z
1
1
1
1
0
0
1
1
1
Soln 1: Model-based Diagnosis
Diagnosis as
Hypothesis Testing
1. Generate candidates, given symptoms.
2. Test if candidates account for all symptoms.
? Set of diagnoses should be complete.
? Set of diagnoses should consider all
available information.
Issue 2: How Should Diagnoses
Account for Novel Symptoms?
Consistency-based Diagnosis: Given symptoms,
find diagnoses that are consistent with symptoms.
Suspending Constraints: Make no presumptions
about faulty component behavior.
1
1
1
Symptom
Or1
Or2
Or3
And1
And2
A
B
C
D
E
1
1
1
1
0
F
G
X
Y
Z
0
1
Consistency-based Diagnosis: Given symptoms,
find diagnoses that are consistent with symptoms.
Suspending Constraints: Make no presumptions
about faulty component behavior.
1
1
1
Symptom
Or1
Or2
Or3
And1
And2
A
B
C
D
E
1
1
1
1
0
F
G
X
Y
Z
0
1
Issue 2: How Should Diagnoses
Account for Novel Symptoms?
Consistency-based Diagnosis: Given symptoms,
find diagnoses that are consistent with symptoms.
Suspending Constraints: Make no presumptions
about faulty component behavior.
1
1
Or1
Or2
Or3
And1
And2
A
B
C
D
E
1
1
1
1
0
F
G
X
Y
Z
0
1
Issue 2: How Should Diagnoses
Account for Novel Symptoms?
Issue 3: Multiple Faults Occur
courtesy of NASA
? three shorts, tank-line
and pressure jacket
burst, panel flies off.
? Divide & Conquer
? Diagnose each
symptom.
? Summarize (conflicts)
? Combine
APOLLO 13
? Candidate: Assignment to all component modes.
3
2
2
3
3
M1
M2
M3
A1
A2
A
B
C
D
E
F
G
X
Y
Z
10
Adder(i):
? G(i):
Out(i) = In1(i)+In2(i)
? U(i):
12
Diagnosis identifies
consistent modes
Candidate = {A1=G, A2=G, M1=G, M2=G, M3=G}
Diagnosis identifies All
sets of consistent modes
Adder(i):
? G(i):
Out(i) = In1(i)+In2(i)
? U(i):
? Diagnosis D: Candidate consistent with model Phi and
observables OBS.
? As more constraints are relaxed, candidates are more
easily satisfied.
?Typically an exponential number of candidates.
3
2
2
3
3
M1
M3
A1
A
B
C
D
E
F
G
X
Y
Z
10
12
Diagnosis = {A1=G, A2=U, M1=G, M2=U, M3=G}
Testing Consistency
→Propositional Logic
?DPLL
? Unit propagation (incomplete)
?Finite Domain Constraints
? Backtrack w forward checking, …
? Waltz constraint propagation (incomplete)
? Algebraic Constraints
? Sussman/Steele Constraint Propagation:
Propagate newly assigned values through equations
mentioning variables.
? To propagate, use assigned values of constraint to
deduce unknown value(s) of constraint.
X ∈{1,0} X=1 ∨ X=0
?X=1∨?X=0
Encoding Models In Propositional Logic
?(i=G) ∨?(In1(i)=1) ∨ Out(i)=1
?(i=G) ∨?(In2(i)=1) ∨ Out(i)=1
?(i=G) ∨?(In1(i)=0) ∨?(In2(i)=0) ∨ Out(i)=0
And(i):
? G(i):
Out(i) = In1(i) AND In2(i)
? U(i):
Or(i):
? G(i):
Out(i) = In1(i) OR In2(i)
? U(i):
?(i=G) ∨?(In1(i)=0) ∨ Out(i)=0
?(i=G) ∨?(In2(i)=0) ∨ Out(i)=0
?(i=G) ∨?(In1(i)=1) ∨?(In2(i)=1) ∨ Out(i)=1
Recap: Consistency-based Diagnosis
And(i):
? G(i):
Out(i) = In1(i) AND In2(i)
? U(i):
? Obs: Assignment to O
? Candidate C
i
: Assignment of modes to X
? Diagnosis D
i
: A candidate such that
D
i
∧ Obs ∧ C(X,Y) is satisfiable.
1
1
1
1
0
Or1
Or3
And1
A
B
C
D
E
F
G
X
Y
Z
0
1
Diagnosis = {A1=G, A2=U O1=G, O2=U, O3=G}
ALL components have
“unknown Mode” U,
Whose assignment is
never mentioned in C