backtracking + unit propagation + pure literal rule
in general, to find a set of values satisfying some conditions:
set a variable to each possible value in turn
for each value, recursively repeat
find values of x1, ..., xn satisfying the formula F
algorithm:
(more details later)
the two recursive calls are: "check satisfiability of F + (xi=value)"
in general: check satisfiability when some variables already have a value
partial interpretation = assigns true/false to some variables
algorithm (some parts missing):
boolean sat(formula F, partial_interpretation I)
satisfiability of F = satisfiability of F with I=∅
missing: base case of recursion, choice of xi
recursion adds a xi=value to I
at some point, all variables are assigned
we can now check whether F is true or false
but:
sometimes, we can check whether F is true or false even if some variables are still unassigned
in the formula F:
result could be:
in the first two cases, the formula has a value that does not depend on the unassigned variables
I={x=true, z=false}
F = { x ∨ y, ¬x ∨ ¬y ∨ z }
replace variables with values:
F | = | { x ∨ y, ¬x ∨ ¬y ∨ z } = |
{ true ∨ y, ¬true ∨ ¬y ∨ false } = | ||
{ true, ¬y } = | ||
{ ¬y } |
formula is not true nor false
value depends on the value of variable y
I={x=true, z=false}
F = { ¬x ∨ y, ¬x ∨ z }
replace variables with values:
F | = | { ¬x ∨ y, ¬x ∨ z } = |
{ ¬true ∨ y, ¬true ∨ false } | ||
{ false ∨ y, false ∨ false } = | ||
{ y, false ∨ false } = | ||
{ y, false } |
formula is false
all clauses have to be satisfied
even a single false clause implies that the formula is false
(even if the first clause were true instead of z, formula would have been false
I={x=true, z=false}
F = {x ∨ y ∨ z, ¬y ∨ ¬z}
F | = | {x ∨ y ∨ z, ¬y ∨ ¬z} = |
{true ∨ y ∨ false, ¬y ∨ ¬false} = | ||
{true ∨ y ∨ false, ¬y ∨ true} = | ||
{true, true} |
all clauses are true
formula is true
given a partial interpretation, a formula could be:
in backtracking:
if the formula is true or false (first two cases) according to the partial interpretation, there is no need to perform the recursive calls
boolean sat(formula F, partial_interpretation I)
implicit in most imperative programming language: if the first argument of an or is true, do not evaluate the others
for clarity, backtracking is as follows:
boolean sat(formula F, partial_interpretation I)
{ ¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3 }
start with empty assignment {}
choose a variable, for example x1
do two recursive calls with assignments {x1=true} and {x1=false}
two recursive calls with assignments {x1=true} and {x1=false}
first recursive call is with assignment {x1=true}
{ ¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3 }
no clause of {¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3} is falsified by {x1=true}
no contradiction: choose an unassigned variable
branching variable x2 (for example)
do two recursive calls adding the two possible evaluations of x2 to the original one
partial interpretations in the recursive calls are then {x1=true, x2=true} and {x1=true, x2=false}
first recursive call with assignment {x1=true, x2=true}:
in {¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3}, the clause ¬x1 ∨ ¬x2 is falsified
contradiction, close branch of the tree
go back to node labeled x2
x2=true already tried
now try x2=false
assignment {x1=true, x2=false}
formula {¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3}, is not falsified
choose variable: only left unassigned is x3
two recursive calls: x3=true, x3=false
first recursive call has assignment {x1=true, x2=false, x3=true}
in formula {¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3}, the clause ¬x1 ∨ ¬x3 is falsified
clause is falsified=formula is falsified
close branch
backtrack to node labeled x3
second recursive call for x3
value x3=false
assignment is {x1=true, x2=false, x3=false}
all clauses in {¬x1 ∨ ¬x2, x1 ∨ ¬x2, ¬x1 ∨ ¬x3}, are satisfied!
no other recursive calls
if a subcall returns true, the call returns true as well
this means: in this case, we go back to original call and return true
model found, no need to go ahead
formula is satisfiable
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
start with empty assignment
formula is not false under this interpretation
choose a variable
as an example, we choose x1
branch on x1
first recursive calls with x1=true
formula { ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 } not made false by this assignment
choose an unassigned variable
as an example, we choose x2
two other recursive calls, with assignments {x1=true, x2=true} and {x1=true, x2=false}
first recursive (sub)call:
assignment {x1=true, x2=true}
formula was { ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
clause ¬x1 ∨ ¬ x2 false
call returns false
no need to proceed any further, even if x3 is still unassigned
recursion goes back to node marked x2
partial assignment were {x1=true} there
do second recursive (sub)call adding x2=false to x1=true
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
clause ¬x1 ∨ x2 false
close branch
branch closed, go back to x2
both recursive subcalls returned false, call returns false
go back to the first call, where x1=false is left to try
partial assignment is {x1=false}
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
formula is not false
choose a variable
as an example, we choose x3
recursive call with partial assignment {x1=false, x3=true}
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
formula is not false in this assignment
choose another variable and set it to true and false
only unassigned variable left is x2
assignment {x1=false, x3=true, x2=true}
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
clause x1 ∨ ¬x2 is falsified
backtrack to x2
assignment {x1=false, x3=true, x2=false}
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
clause x2 ∨ ¬x3 is falsified
backtrack to x2
both calls from node x2 returned false
go back to node x3
assignment {x1=false, x3=false}
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
clause x1 ∨ x3 falsified
calls from x3 both returned false
go back to x1
we already tried x1=true and x1=false
return false
formula is unsatisfiable
{ x2 ∨ x1, ¬x1, ¬x2 ∨ ¬x3, x3 ∨ x1 }
{ x2 ∨ x1, ¬x1, ¬x2 ∨ ¬x3, x3 ∨ x1 }
observation: set contains the unit clause x1
in DPLL can be used for:
second point is especially useful:
variables get a value by:
each recursive call generates a subtree of recursive calls
one instead of two means half recursive calls (on average)
boolean sat(formula F, partial_interpretation I)
extra advantage: UP may discover inconsistency
in the last of examples above, the set contains a unit clause:
{ x2 ∨ x1, ¬x1, ¬x2 ∨ ¬x3, x3 ∨ x1 }
up says x1 is false
remove from clauses where occurs positive:
as a result, both x2 and x3 are true
clause ¬x2 ∨ ¬x3 is contradicted
in the example, a unit clause was in the original set
may also show up with a partial assignment
second of the examples above:
{ ¬x1 ∨ ¬ x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
no unit clause in the original set
two recursive calls
first recursive call with x1=true
x1=true is like an additional unit clause {x1}
apply unit propagation
{ ¬x1 ∨ ¬x2, ¬x1 ∨ x2, x1 ∨ ¬x2, x2 ∨ ¬x3, x1 ∨ x3 }
recursive call with x1=true
remove x1 where negative:
contradiction is reached
recall that backtracking does a recursive call instead:
in this case, only two recursive calls are saved
more generally, the subtree rooted in the node could have been exponentially large
what about a in the following formula?
{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬d}
{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c}
some occurrences of a
no occurrence of ¬a
if a variable is always positive or always negative in a formula, we say it is pure
in general (a not pure):
{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c, ¬ a ∨ b}
if a is pure:
{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬ c}
setting a=true has some advantage and no disadvantage
if a variable only occurs positively in a formula, set it to true
if a variable only occurs negated in a formula, set it to false
remove clauses containing the literal (as usual)
may create new pure literals
in the example { a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c}:
a only positive, set to true
remove clauses containing a
remains {b ∨ ¬c}
both b and c pure
(first positive, second negative)
keep count of how many clauses contain a and ¬a
if a clause is removed by UP, decrease
when a counter reach zero, variable is pure
complete algorithm:
boolean sat(formula F, partial_interpretation I)
both I and F change
but F changes only because of the removal of some clauses
we remove clauses that are satisfied:
if we remove them all, formula is satisfied
up(F,i) may create new pure literals
example: b is not pure here:
{a, a ∨ ¬b, ¬a ∨ ¬b ∨ ¬c, ¬b ∨ c}
performing up, we get:
{¬b ∨ ¬c, ¬b ∨ c}
b is now pure (all occurrences are negative)
pure(F,I) cannot create new unit clauses
reason: it only removes some clauses
no non-unary clause becomes unary by pure(F,i), since this procedure does not modify individual clauses
up might create new pure literals
pure cannot create new unit clauses
{
¬x1 ∨ x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3 ∨ ¬x1,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
x1 ∨ x5,
x1 ∨ x6,
¬x6 ∨ x3 ∨ ¬x5,
x1 ∨ ¬x3 ∨ ¬x5
}
choose branching variable x1 (for example)
try x1=true first
apply up and pure
with {x1=true} the clauses become:
{
¬x1 ∨ x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3 ∨ ¬x1,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
x1 ∨ x5,
x1 ∨ x6,
¬x6 ∨ x3 ∨ ¬x5,
x1 ∨ ¬x3 ∨ ¬x5
}
=
{
x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
¬x6 ∨ x3 ∨ ¬x5
}
x5 only occurs negated
can be set to false, removing clause
{
x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4
}
choose variable x2, value true first
with {x1=true, x2=true} the clauses become:
{
x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4
}
=
{
x3 ∨ x4,
x6 ∨ x4,
¬x6 ∨ ¬x3,
¬x4
}
from ¬x4 we derive x3 and x6
they falsify the clause ¬x6 ∨ ¬x3
contradiction, no need to apply pure
contradiction reached, backtrack
with {x1=true, x2=false}, clauses become:
{
x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4
}
=
{
x3 ∨ x4,
¬x3,
x6 ∨ x3,
¬x6 ∨ ¬x4
}
from ¬x3 we derive x4 and x6
they contradict clause ¬x6 ∨ ¬x4
contradiction, no need to apply pure
backtrack to first node, try other branch
with {x1=false} clauses become:
{
¬x1 ∨ x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3 ∨ ¬x1,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
x1 ∨ x5,
x1 ∨ x6,
¬x6 ∨ x3 ∨ ¬x5,
x1 ∨ ¬x3 ∨ ¬x5
}
=
{
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
x5,
x6,
¬x6 ∨ x3 ∨ ¬x5,
¬x3 ∨ ¬x5
}
from x5 we derive ¬x3
since x6 is true, clause ¬x6 ∨ x3 ∨ ¬x5 is falsified
contradiction reached on last node
set is unsatisfiable
which is the best, among the unassigned ones?
does it make any difference?
same set as previous example
{
¬x1 ∨ x3 ∨ x4,
¬x2 ∨ x6 ∨ x4,
¬x2 ∨ ¬x6 ∨ ¬x3,
¬x4 ∨ ¬x2,
x2 ∨ ¬x3 ∨ ¬x1,
x2 ∨ x6 ∨ x3,
x2 ∨ ¬x6 ∨ ¬x4,
x1 ∨ x5,
x1 ∨ x6,
¬x6 ∨ x3 ∨ ¬x5,
x1 ∨ ¬x3 ∨ ¬x5
}
choose x3 first
then other variables
larger tree → longer running time
in general: difference may be exponential
try to reduce the number of the subsequent recursive calls in sat(F, I ∪ {xi=true}) and sat(F, I ∪ {xi=false})
many binary clauses containing ¬xi = many assignments obtained by unit propagation in sat(F, I ∪ {xi=true})
same for xi and sat(F, I ∪ {xi=false})
choose xi that is contained in many binary clauses
x3 positive in 10 binary clauses and negative in none
x8 positive in 4 binary clauses and negative in 4
how large the two subtrees are?
assume that no further propagation is done after first step
cost is exponential in the number of variables
assume 15 total
better savings obtained by variables where positive and negative occurrences in binary clauses are balanced
old method based on an heuristics
for each variable xi
choose variable xi that maximizes 1024p1ni+pi+ni
idea: variables that have some positive and negative occurrences are preferred over some having many positive but few negative (or vice versa)
what if the formula is satisfiable?
different way of choosing the branching variable?
a first (wrong) principle: concentrate on choosing between xi and ¬xi
(wrong) principle: try to guess the sign of xi in a model
all choices correct: model found in linear time
impossible to make all choices right
what if one is wrong?
wrong choice=no model in the subtree
formula unsatisfiable with partial model
unsatisfiability: search tree may be exponential
therefore: most of the time spent on the unsatisfiable subformula
even if formula is satisfiable, the hard part of the problem is still dealing with unsatisfiable formulae