planning by translation

translate:
planning problem ⇒ other problem
then solve the other problem

useful?


useful?

planning problem ⇒ other problem
viable solution if:

which other problem?

planning problem ⇒ other problem
popular targets of translation: this course: propositional satisfability

plansat

planning problem ⇒ sat
sat = find a model of a propositional formula

how to translate

planning problem ⇒ sat
planning sat
actions to execute,
resulting states
value of variables
The sequence of actions to execute can be chosen; the state changes as a consequence. Depending on them, the goal is reached or not.

In a satisfiability problem, the values of the variables can be chosen. Depending on them, the formula may be satisfied or not.


correspondence

planning problem ⇒ sat
planning sat
actions to execute,
resulting states
value of variables
correct translation:

variables

planning problem ⇒ sat
planning sat
actions: a, b, c
state variables: x, y
propositional variables
goal reached formula satisfied
propositional variables represent:
state and action at each step

plan length

plans can be exponentially long
(exponential in the size of the domain)

fix a maximal length n
only n states and n-1 actions


time

planning sat
actions: a, b, c
state variables: x, y
propositional variables for:
action at time 0 and 1
state at time 0, 1 and 2
goal reached formula satisfied
example: time 0, 1, 2

all variables

planning sat
actions: a, b, c
state variables: x, y
propositional variables:
a0 b0 c0
a1 b1 c1
x0 y0
x1 y1
x2 y2
goal reached formula satisfied
a propositional variable for: in this example: state variables x and y are Boolean
If a state variable has more than two possible values, more propositional variables are needed to represent it.

The encoding of actions is redundant, since two propositional variables suffice to represent which of the three actions is executed a each time step. It however allow for a simpler propositional formula representing the planning problem.


all variables

[positive.fig]

if no formulae constrain the variables: instead:

negated variables

[variables-01.fig]

redundant (variable is negative if not positive)

simplifies connections
later generalized (mutex)

The negated variables are not strictly necessary in this diagram, as a variable is negated exactly when not positive. However, this allows for a simple way of depicting negative preconditions and effects of an action.

Also, pairs of inconsistent literals are later generalized by the concept of mutex.


preconditions

[variables-02.fig]

example: x precondition to a

formula x0 → a0?

WRONG!

The diagram show how NOT to translate a precondition. Why it is wrong is explained next.

permissions and obligations

x0 means that x is true at time 0
a0 means that a is executed at time 0
x precondition to a
means: if x is true, a can be executed
x0 → a0
means: if x is true at time 0, then a must be executed at time 0

evaluate the whole history

if a executed at time 0
then x must have been true at time 0

formula a0→x0

Contrary to what the diagram may suggest, the plan will not be found by starting from the initial state and following some precondition-action links. It will be found by some method for propositional satisfiability, whose algorithm is not of interest at this point.

In order for this to work, the translation must be correct: an evaluation of the propositional variables satisfies the formula if and only if it represents a sequence of actions and the states that result from executing them.

A model represents a whole sequence of states and actions were known. The formula is what discriminates a random sequence of states and actions from one that is coherent with the particular domain under considerations.

An inefficient but correct solution for the satisfiability problem is to check every possible propositional interpretation in turn. The formula can be seen as something that takes one such interpretation and decides whether it is a plan. In terms of planning, it is as if the sequence of states and actions were known, and the only decision to take is whether it is an actual plan or not.


correctness of states-actions links

[variables-04.fig]

x is precondition to a:

if a is executed at time 0
then x was true at time 0

as a propositional formula: a0→x0


effects

[variables-06.fig]

a has effect y

if a executed at time 0
y true at time 1

formula a0→y1

This time, the formula is intutive: action implies effects.

It is however not sufficient.


inertia

[variables-06.fig]

what if a is not executed at time 0?

y1 is true if y0 was true

unless the action executed at time 0 has effect ¬y

Positive effects are easy to encode as a formula: if an action is executed, all its effects become true.

However, the formula is correct only if it also encode inertia: variables remain true if not changed by an action.

This in turns require considering all actions that change the value of the variable.


inertia and blocking actions

[variables-08.fig]

example: action b makes y false

not only b0 → ¬y1

if y was true at time 0
then it is true at time 1
unless b is executed at time 0

formula: (y0∧¬b0) → y1

not enough…

The action b0 has the usual effect of achieving its effect ¬y1 but not only. It also blocks inertia on its negation, so that y0 no longer produces y1.

The formula (y0∧¬b0) → y1 is however not sufficient to correctly encode inertia.


missing bit

[variables-08.fig]

y1 is true if:
a0 is true, or
y0 is true and b is false

otherwise, y1 is false

The condition about a0, b0 and y0 tell when y1 is true. The missing part is that whenever this condition is false then y1 is false as well.

effect

[variables-08.fig]

y1 is true if and only if:
a0 is true, or
y0 is true and b is false

formula:

y1 ≡ a0 ∨ y0∧¬b0
The formula is still simple also because the variable is only affected by two actions.

The arrows in the figure represent the formula y1 a0 ∨ y0∧¬b0, which is incomplete. The correct encoding has instead of the implication.


multiple actions affecting a variable

example: variable y formula:
y1 ≡ a0 ∨ d0 ∨ e0 ∨ y0∧¬b0∧¬h0

effect formula

for every variable x and time t
xt+1at ∨ xtdt
where:

translated so far…

missing: initial state, goal, single action

initial state

example: x=⊤, y=⊥

translated: x0∧¬y0


goal

example: y=⊤

translated: y0∨y1∨y2


single action

an action for each time point
no more than one

encoded, for each time t:


parallel plans

the assumption of single action may not be necessary: not always possible

actions not runnable in parallel

initially, x true and y false

a has precondition x and effect ¬y
b has precondition ¬y and effect ¬x

both actions executable in the initial state
plan with both a0=⊤ and b0=⊤

linearizations:

sequence b,a
invalid
sequence a,b
valid

allowed parallel actions

two possibilities: previous example: a,b valid, b,a invalid
first condition met, second not

check the existence of linearization

encode as a propositional formula:
some linearization of the parallel actions is executable

requires: produce every linearization
check each

check all linearization simpler?


check validity of all linearizations

equivalent condition:
no action falsifies the precondition of another
example:

missing sequences

in the example: a and b conflict
because b falsifies a precondition of a

condition excludes a and b at the same time t

but a,b was a valid plan

still obtained:
a at time t and b at time t+1

The requirement on the actions exclude a and b at the same time, while in fact these two actions have a valid linearization a,b. This is not a problem, as this plan will still be obtained with a only at the current time and b only at the next.

While no plan is lost this way, the requirement is much simpler to check than the existence of a valid linearization. It is also simple to encode as a propositional formula, as it is shown next.


encoding as a formula

no action falsifies the precondition of another
only for the same time t

formula: if b falsifies the precondition of a
add formula bt→¬at
equivalent to ¬bt∨¬at

for every time t


the formula

P ∧ E ∧ I ∧ G ∧ L

where:

P
preconditions of actions
(a0 → x0)
E
effects, negative effects and inertia
(y1 ≡ a0 ∨ y0∧¬b0…)
I
initial state
(x0 ∧ ¬y0)
G
goal
(y0 ∧ y1 ∧ y2)
L
linearizability of paralel actions
(¬(a0∧b0)…)

plansat

given: planning problem why?

formula built so that:

a propositional interpretation satisfies the formula
if and only if
the values of the variables represent a plan (actions and states)

improvements

method works, but:

impossible actions

[progressive-01.fig]

z initially false

d cannot be executed

variable d0 useless

also…


consequences of impossible actions

[progressive-02.fig]

w initially false
w only made true by d

d0 removed ⇒ w1 removed

may fire removal of other actions, etc.

This is the first condition that allows the removal of variables: if an action cannot be executed at the initial time, its variable at time 0 can be removed. This implies the removal of all variables that are effects of actions that cannot be executed. In turn, this allows for the removal of actions at time 1 that have them as preconditions.

Other conditions will be shown that allow simplifying the formula.


mutex

[mutex.fig]

y cannot be both true and false at the same time
a and c cannot be executed at the same time

so?

This is different than linearizability. Indeed, a and c could be executed in parallel, since a falsifies no precondition of c and c falsifies no precondition of a.

The reason why a and c cannot be executed at the same time is different: their preconditions cannot be achieved at the same time.


executable actions

[mutex.fig]

variables a0 and c0 necessary
because action a alone can be executed at time 0
and action c alone can be executed at time 0

but: they cannot be both executed at the same time

consequence: ¬x1 and z1 cannot both be true

b not executable at time a

remove b1
and its consequences, if not otherwise achievable,
etc.


what to remove at time t

remove a literal
only if all actions making it true removed from time t-1

remove an action

mutex

both for literals and actions:
variables
two literals that cannot both be true at a certain time
actions
two actions that cannot both be executed at a certain time
related: if two literals are only made true by two actions in mutex
they are in mutex as well

same for actions


actions in mutex: preconditions

[preconditions-02.fig]

actions have conflicting preconditions

[preconditions-a.fig]

they cannot be executed at the same time

more generally…


actions in mutex: preconditions, more generally

[preconditions-b.fig]

even if the actions have other preconditions

[preconditions-c.fig]

still they cannot be executed at the same time

even more generally…


actions in mutex: preconditions, generally

[preconditions-04.fig]

actions may have preconditions in mutex

[preconditions-05.fig]

regardless of other preconditions, actions are in mutex

actions in mutex: effects

[effects-01.fig]

actions have conflicting effects
e.g., they are xt+1 and ¬xt+1

[effects-02.fig]

actions are in mutex
In theory, a mutex on actions could derive from a mutex on effects. This is however not done because it requires going back and forth in the graph of variables, from a level of variables to the previous level of actions. Only opposite literals are checked, not the more general condition of literals in mutex.

actions in mutex: linearizability

[nolinear-01.fig]

one action makes false the precondition of the other

[nolinear-02.fig]

actions are in mutex
This condition cannot be generalized to “xt and yt+1 are in mutex”, not even theoretically.

A mutex is only possible between literals (or actions) at the same time point. A literal at time t cannot be in mutex with one at time t+1, like in this case. This is why actions are in mutex only if “one generate the opposite of the precondition of the other” and not “the effect of one and the precondition of the other are in mutex.”.


literals in mutex: base case

[negated-01.fig]

opposite literals at the same time point

[negated-02.fig]

literals are in mutex

literals in mutex: conflicting actions

[conflict-01.fig]

literal obtained by some actions
literal obtained by some other
each in conflict with some

[conflict-02.fig]

literals are in mutex
The first literal is the effect of executing one of a set of actions. The same for the second. But each action making the first true is in mutex with one making the second true.

inertia

x1 and y1 in mutex if (example): every action achieving x1 in mutex with every action achieving y1

but x1 also achievable from x0 by inertia

x0 true, execute c0 only
result: x1 and y1 both true
not a mutex

This example shows that neglecting intertia may generate a mutex like the one on x1 and y1 while the two variables can in fact be true at the same time.

The solution is to include inertia in the computation of mutexes.

Since intertia may make some variables true, it also affect the computation of useless variables.


inertia as an action

action with xt as precondition and xt+1 as effect
action with ¬xt as precondition and ¬xt+1 as effect

too simple?

The propositional formula used for encoding inertia was more complicated than this. It involved all actions making a variable true and all making it false. That was however done towards a different aim. The difference is explained next.

intertia-action vs. intertia-formula

when encoding the problem as a propositional formula:
complex formula xt+1at ∨ xtdt

when checking mutexes and useless variables:
action with xt as precondition and xt+1 as effect
action with ¬xt as precondition and ¬xt+1 as effect

not enough in a formula
missing: mandatory action if no action falsifing its effect is executed

enough for checking mutexes and useless variables
only needs to know that variables could be true by intertia

Such actions would not constraint enough the propositional interpretation, as they could just not be executed at some time point even if the action at that time point do not change x.

However, when checking mutexes and useless variables, the procedure of expansion only needs to know which variables could be true. It works like assuming that everything that could be made true might be, and excluding everything else.

As a result, intertia can be encoded as an action that can be executed if xt is true and makes xt+1 true. Another action encodes intertia on the negated variable.

mandatory = obbligatorio

overall algorithm

The algorithm alternates between variables at time t, actions at time t and variables at time t+1 etc. Each time, variables can be removed and mutexes added based on what obtained from the previous step.

example, with inertia

[inertia-01.fig]

x3 and z3 in mutex

action mutex

[inertia-02.fig]

x3 and z3 in mutex

x3 precondition to a
z3 precondition to inertia action

therefore: mutex between a3 and inertia on z3


literal mutex

[inertia-03.fig]

only way to achieve y4 is by a
only way to achieve z4 is by inertia

therefore: mutex between y4 and z4


goal

goal x and y:
both variables at time t
not in mutex

necessary condition
not checked: triple mutexes, for example
(actions a, b and c not all executable at the same time, but each two of them are)

The algorithm for finding mutexes and useless variables is implcitely based on over-optimistic assumptions: if an action could be executed, its effects are achievable, and if actions are not in mutex they can all be executed. It only checks these conditions locally, neglecting the other variables and actions.

time steps

if goals not at level t, or in mutex
expand another level

otherwise: translate into a formula and check satisfiability


mutexes as formulae

at and bt in mutex
add formula ¬at∨¬bt

same for literals

formula increase?

This addition may look counterproductive: the algorithm looks for mutexes and useless variables in order to simplify the formula resulting from the conversion, but not new parts are added to it.

formula increase

¬at∨¬bt
¬xt∨¬yt

formulae encoding mutexes

enlarge the formula obtained by translation
but: make it simpler to check for satisfiability

For example, if at is true then the first formula allows immediately concluding that bt is false. This depdends on the specific satisfiability algorithm, but short subformulae like this may prove useful for directing the search for a satisfying interpretation.

summary

  1. planning ⇒ satisfiability of a formula
  2. planning graph makes the formula simpler
historically:
graphplan in 1995,
the planning graph then used in plansat (1996)

graphplan

build the graph

go back from the goal to the initial state


graphplan, necessary conditions


graphplan, search for a plan

goals not in mutex: maybe a plan exists
necessary but not sufficient condition first point already hard
use heuristics

innovativity of graphplan

before: search for a path in the graph of states

graphplan: introduced the graph of variables-actions-variables-…

then: used in plansat
then: heuristics based on the graph of variables-actions-variables-…

e.g., FF heuristics, etc.