The sequence of actions to execute can be chosen; the state changes as a
consequence. Depending on them, the goal is reached or not.
In a satisfiability problem, the values of the variables can be chosen.
Depending on them, the formula may be satisfied or not.
correspondence
planning problem ⇒ sat
planning
sat
actions to execute,
resulting states
value of variables
correct translation:
each sequence of actions/states
correspond to
a propositional interpretation
and vice versa
if the sequence leads to the goal,
the propositional interpretation satisfies the formula
and vice versa
variables
planning problem ⇒ sat
planning
sat
actions: a, b, c
state variables: x, y
propositional variables
goal reached
formula satisfied
propositional variables represent:
state and action at each step
plan length
plans can be exponentially long
(exponential in the size of the domain)
fix a maximal length n
only n states and n-1 actions
time
planning
sat
actions: a, b, c
state variables: x, y
propositional variables for:
action at time 0 and 1
state at time 0, 1 and 2
in this example: state variables x and y are Boolean
If a state variable has more than two possible values, more propositional
variables are needed to represent it.
The encoding of actions is redundant, since two propositional variables
suffice to represent which of the three actions is executed a each time step.
It however allow for a simpler propositional formula representing the planning
problem.
all variables
if no formulae constrain the variables:
every state is possible, at every time point
every set of actions executable
instead:
initial state fixed
actions executable only if preconditions met
no two actions at the same time
next state from previous and action
negated variables
[variables-01.fig]
redundant (variable is negative if not positive)
simplifies connections
later generalized (mutex)
The negated variables are not strictly necessary in this diagram, as a variable
is negated exactly when not positive. However, this allows for a simple way of
depicting negative preconditions and effects of an action.
Also, pairs of inconsistent literals are later generalized by the concept of
mutex.
preconditions
[variables-02.fig]
example: x precondition to a
formula x0 → a0?
WRONG!
The diagram show how NOT to translate a precondition. Why it is wrong is
explained next.
permissions and obligations
x0 means that x is true at time 0 a0 means that a is executed at time 0
x precondition to a
means: if x is true, acan be executed
x0 → a0
means: if x is true at time 0,
then amust be executed at time 0
evaluate the whole history
if a executed at time 0
then x must have been true at time 0
formula a0→x0
Contrary to what the diagram may suggest, the plan will not be found by
starting from the initial state and following some precondition-action links.
It will be found by some method for propositional satisfiability, whose
algorithm is not of interest at this point.
In order for this to work, the translation must be correct: an evaluation of
the propositional variables satisfies the formula if and only if it represents
a sequence of actions and the states that result from executing them.
A model represents a whole sequence of states and actions were known. The
formula is what discriminates a random sequence of states and actions from one
that is coherent with the particular domain under considerations.
An inefficient but correct solution for the satisfiability problem is to check
every possible propositional interpretation in turn. The formula can be seen as
something that takes one such interpretation and decides whether it is a plan.
In terms of planning, it is as if the sequence of states and actions were
known, and the only decision to take is whether it is an actual plan or not.
correctness of states-actions links
[variables-04.fig]
x is precondition to a:
if a is executed at time 0
then x was true at time 0
as a propositional formula: a0→x0
effects
[variables-06.fig]
a has effect y
if a executed at time 0 y true at time 1
formula a0→y1
This time, the formula is intutive:
action implies effects.
It is however not sufficient.
inertia
[variables-06.fig]
what if a is not executed at time 0?
y1 is true if y0 was true
unless the action executed at time 0 has effect ¬y
Positive effects are easy to encode as a formula: if an action is executed,
all its effects become true.
However, the formula is correct only if it also encode inertia: variables
remain true if not changed by an action.
This in turns require considering all actions that change the value of the
variable.
inertia and blocking actions
[variables-08.fig]
example: action b makes y false
not only b0 → ¬y1
if y was true at time 0
then it is true at time 1
unless b is executed at time 0
formula:
(y0∧¬b0) → y1
not enough…
The action b0 has the usual effect of achieving its effect
¬y1 but not only. It also blocks inertia on its
negation, so that y0 no longer produces
y1.
The formula
(y0∧¬b0) → y1
is however not sufficient to correctly encode inertia.
missing bit
[variables-08.fig]
y1 is true if:
a0 is true, or
y0 is true and b is false
otherwise, y1 is false
The condition about a0, b0 and
y0 tell when y1 is true. The missing
part is that whenever this condition is false then y1 is
false as well.
effect
[variables-08.fig]
y1 is true if and only if:
a0 is true, or
y0 is true and b is false
formula:
y1 ≡ a0 ∨ y0∧¬b0
The formula is still simple also because the variable is only affected by two
actions.
The arrows in the figure represent the formula
y1← a0 ∨
y0∧¬b0, which is incomplete. The correct
encoding has ≡ instead of the implication.
multiple actions affecting a variable
example: variable y
made true by actions a, d, e
made false by action b, h
formula:
y1 ≡
a0 ∨ d0 ∨ e0 ∨
y0∧¬b0∧¬h0
effect formula
for every variable x and time t
xt+1
≡
∨at
∨
xt∧∧dt
where:
a are the actions making x true
d are the actions making x false
translated so far…
states and actions into variables
preconditions and effects of actions
missing: initial state, goal, single action
initial state
example: x=⊤, y=⊥
translated:
x0∧¬y0
goal
example: y=⊤
translated:
y0∨y1∨y2
single action
an action for each time point
no more than one
encoded, for each time t:
at∨bt∨ct
¬(at∧bt) ∧
¬(at∧ct) ∧
¬(bt∧ct)
parallel plans
the assumption of single action may not be necessary:
remove the single action assumption
obtain a plan
if the plan contains a and b both executed at time 0:
linearize the plan as either a,b or b,a
not always possible
actions not runnable in parallel
initially, x true and y false
a has precondition x and effect ¬y b has precondition ¬y and effect ¬x
both actions executable in the initial state
plan with both a0=⊤ and b0=⊤
linearizations:
sequence b,a
invalid
sequence a,b
valid
allowed parallel actions
two possibilities:
allow actions a and b at the same time only if
eithera,b or b,a executable in sequence
allow actions a and b at the same time only if
botha,b or b,a executable in sequence
previous example: a,b valid, b,a invalid
first condition met, second not
check the existence of linearization
encode as a propositional formula:
some linearization of the parallel actions is executable
requires: produce every linearization
check each
check all linearization simpler?
check validity of all linearizations
equivalent condition:
no action falsifies the precondition of another
example:
parallel actions a, b at time t
imples: all preconditions met at time t
sufficient condition holds
implies: all preconditions remains valid after executing any sequence of the
parallel actions
missing sequences
in the example: a and b conflict
because b falsifies a precondition of a
condition excludes a and b at the same time t
but a,b was a valid plan
still obtained:
a at time t and b at time t+1
The requirement on the actions exclude a and b at the same
time, while in fact these two actions have a valid linearization a,b.
This is not a problem, as this plan will still be obtained with a only
at the current time and b only at the next.
While no plan is lost this way, the requirement is much simpler to check than
the existence of a valid linearization. It is also simple to encode as a
propositional formula, as it is shown next.
encoding as a formula
no action falsifies the precondition of another
only for the same time t
formula: if b falsifies the precondition of a
add formula bt→¬at
equivalent to ¬bt∨¬at
for every time t
the formula
P ∧ E ∧ I ∧ G ∧ L
where:
P
preconditions of actions
(a0 → x0…)
E
effects, negative effects and inertia
(y1 ≡ a0 ∨
y0∧¬b0…)
I
initial state
(x0 ∧ ¬y0)
G
goal
(y0 ∧ y1 ∧ y2)
L
linearizability of paralel actions
(¬(a0∧b0)…)
plansat
given: planning problem
translate it into a formula P ∧ E ∧ I ∧ G ∧ L
find a propositional model
(how? next part of the course)
variables a0, b0, …, cn
tell the actions
why?
formula built so that:
a propositional interpretation satisfies the formula
if and only if
the values of the variables represent a plan (actions and states)
improvements
method works, but:
many variables
already 10 just for two Boolean variables and three actions over two steps
some not really useful
example: c not executable at time 0,
variable c0 useless
variable false, made true by c:
useless at time 0 and 1
impossible actions
[progressive-01.fig]
z initially false
d cannot be executed
variable d0 useless
also…
consequences of impossible actions
[progressive-02.fig]
w initially false
w only made true by dd0 removed ⇒ w1 removed
may fire removal of other actions, etc.
This is the first condition that allows the removal of variables: if an action
cannot be executed at the initial time, its variable at time 0 can be
removed. This implies the removal of all variables that are effects of actions
that cannot be executed. In turn, this allows for the removal of actions at
time 1 that have them as preconditions.
Other conditions will be shown that allow simplifying the formula.
mutex
[mutex.fig]
y cannot be both true and false at the same time
⇒ a and c cannot be executed at the same time
so?
This is different than linearizability.
Indeed, a and c could be executed in parallel,
since a falsifies no precondition of c
and c falsifies no precondition of a.
The reason why a and c cannot be executed at the
same time is different: their preconditions cannot be achieved at the same time.
executable actions
[mutex.fig]
variables a0 and c0 necessary
because action a alone can be executed at time 0
and action c alone can be executed at time 0
but: they cannot be both executed at the same time
consequence: ¬x1 and z1 cannot both be true
b not executable at time a
remove b1
and its consequences, if not otherwise achievable,
etc.
what to remove at time t
remove a literal
only if all actions making it true removed from time t-1
remove an action
one of its precondition at time t removed, or
two of its preconditions cannot be both true at time t
mutex
both for literals and actions:
variables
two literals that cannot both be true at a certain time
actions
two actions that cannot both be executed at a certain time
related: if two literals are only made true by two actions in mutex
they are in mutex as well
same for actions
actions in mutex: preconditions
[preconditions-02.fig]
actions have conflicting preconditions
[preconditions-a.fig]
they cannot be executed at the same time
more generally…
actions in mutex: preconditions, more generally
[preconditions-b.fig]
even if the actions have other preconditions
[preconditions-c.fig]
still they cannot be executed at the same time
even more generally…
actions in mutex: preconditions, generally
[preconditions-04.fig]
actions may have preconditions in mutex
[preconditions-05.fig]
regardless of other preconditions,
actions are in mutex
actions in mutex: effects
[effects-01.fig]
actions have conflicting effects
e.g., they are xt+1 and ¬xt+1
[effects-02.fig]
actions are in mutex
In theory, a mutex on actions could derive from a mutex on effects. This is
however not done because it requires going back and forth in the graph of
variables, from a level of variables to the previous level of actions. Only
opposite literals are checked, not the more general condition of literals in
mutex.
actions in mutex: linearizability
[nolinear-01.fig]
one action makes false the precondition of the other
[nolinear-02.fig]
actions are in mutex
This condition cannot be generalized to
“xt and yt+1 are in
mutex”, not even theoretically.
A mutex is only possible between literals (or actions) at the same time point.
A literal at time t cannot be in mutex with one at time t+1,
like in this case. This is why actions are in mutex only if “one generate
the opposite of the precondition of the other” and not “the
effect of one and the precondition of the other are in mutex.”.
literals in mutex: base case
[negated-01.fig]
opposite literals at the same time point
[negated-02.fig]
literals are in mutex
literals in mutex: conflicting actions
[conflict-01.fig]
literal obtained by some actions
literal obtained by some other
each in conflict with some
[conflict-02.fig]
literals are in mutex
The first literal is the effect of executing one of a set of actions. The same
for the second. But each action making the first true is in mutex with one
making the second true.
inertia
x1 and y1 in mutex if
(example):
x1 effect only of
a0 and b0
y1 effect only of
c0
c0 in mutex with a0
c0 in mutex with b0
every action achieving x1
in mutex with every action achieving y1
but x1 also achievable from
x0 by inertia
x0 true, execute c0 only
result: x1 and y1 both true
not a mutex
This example shows that neglecting intertia may generate a mutex like the one
on x1 and y1 while the two variables
can in fact be true at the same time.
The solution is to include inertia in the computation of mutexes.
Since intertia may make some variables true, it also affect the computation of
useless variables.
inertia as an action
action with
xt as precondition and
xt+1 as effect
action with
¬xt as precondition and
¬xt+1 as effect
too simple?
The propositional formula used for encoding inertia was more complicated than
this. It involved all actions making a variable true and all making it false.
That was however done towards a different aim. The difference is explained
next.
intertia-action vs. intertia-formula
when encoding the problem as a propositional formula:
complex formula
xt+1
≡
∨at
∨
xt∧∧dt
when checking mutexes and useless variables:
action with
xt as precondition and
xt+1 as effect
action with
¬xt as precondition and
¬xt+1 as effect
not enough in a formula
missing: mandatory action if no action falsifing
its effect is executed
enough for checking mutexes and useless variables
only needs to know that variables could be true by intertia
Such actions would not constraint enough the propositional interpretation, as
they could just not be executed at some time point even if the action at that
time point do not change x.
However, when checking mutexes and useless variables, the procedure of
expansion only needs to know which variables could be true. It works like
assuming that everything that could be made true might be, and excluding
everything else.
As a result, intertia can be encoded as an action that can be executed if
xt is true and makes xt+1 true. Another
action encodes intertia on the negated variable.
mandatory = obbligatorio
overall algorithm
start from variables at time 0
remove false literals
add mutex between opposite literals
check actions at time 0
remove unexecutable actions
add mutexes between actions
check variables at time 1
remove unachiavable literals
add mutexes between variables
…
The algorithm alternates between variables at time t, actions at time
t and variables at time t+1 etc. Each time, variables can be
removed and mutexes added based on what obtained from the previous step.
example, with inertia
[inertia-01.fig]
x3 and z3 in mutex
action mutex
[inertia-02.fig]
x3 and z3 in mutex
x3 precondition to a z3 precondition to inertia action
therefore:
mutex between a3 and inertia on z3
literal mutex
[inertia-03.fig]
only way to achieve y4 is by a
only way to achieve z4 is by inertia
therefore:
mutex between y4 and z4
goal
goal x and y:
both variables at time t
not in mutex
necessary condition
not checked: triple mutexes, for example
(actions a, b and c not all executable at the same
time, but each two of them are)
The algorithm for finding mutexes and useless variables is implcitely based on
over-optimistic assumptions: if an action could be executed, its effects are
achievable, and if actions are not in mutex they can all be executed. It only
checks these conditions locally, neglecting the other variables and actions.
time steps
if goals not at level t, or in mutex
expand another level
otherwise: translate into a formula and check satisfiability
mutexes as formulae
at and
bt in mutex
add formula ¬at∨¬bt
same for literals
formula increase?
This addition may look counterproductive: the algorithm looks for mutexes and
useless variables in order to simplify the formula resulting from the
conversion, but not new parts are added to it.
formula increase
¬at∨¬bt ¬xt∨¬yt
formulae encoding mutexes
enlarge the formula obtained by translation
but: make it simpler to check for satisfiability
For example, if at is true then the first formula allows
immediately concluding that bt is false. This depdends on
the specific satisfiability algorithm, but short subformulae like this may
prove useful for directing the search for a satisfying interpretation.
summary
planning ⇒ satisfiability of a formula
planning graph makes the formula simpler
historically:
graphplan in 1995,
the planning graph then used in plansat (1996)
graphplan
build the graph
go back from the goal to the initial state
graphplan, necessary conditions
build the graph with n time points
if not all goal literals in the graph:
n++ and goto 1
some goals in mutex:
n++ and goto 1
search for a plan (to be continued)
graphplan, search for a plan
goals not in mutex: maybe a plan exists
necessary but not sufficient condition
…
to achieve a set of goals:
choose a set of non-mutex actions that achieve all goal
try to achieve its preconditions
same way: choose a set of non-mutex actions, etc.
otherwise, n++ and expand the graph
first point already hard
use heuristics
innovativity of graphplan
before: search for a path in the graph of states
graphplan: introduced the graph of variables-actions-variables-…
then: used in plansat
then: heuristics based on the graph of variables-actions-variables-…
e.g., FF heuristics, etc.