DPLL algorithm

backtracking + unit propagation + pure literal rule

The backtracking algorithm

in general, to find a set of values satisfying some conditions:

set a variable to each possible value in turn

for each value, recursively repeat

Backtracking for satisfiability

find values of x₁, ..., x_n satisfying the formula F

algorithm:

choose a variable x_i
check satisfiability of F + (x_i=true)
check satisfiability of F + (x_i=false)

(more details later)

Satisfiability: recursive calls

the two recursive calls are: "check satisfiability of F + (x_i=value)"

in general: check satisfiability when some variables already have a value

partial interpretation = assigns true/false to some variables

Backtracking with partial interpretation

algorithm (some parts missing):

boolean sat(formula F, partial_interpretation I)

... (see below)
choose x_i that I does not assign
return sat(F, I ∪ { x_i=true }) or sat(F, I ∪ { x_i=false })

satisfiability of F = satisfiability of F with I=∅

missing: base case of recursion, choice of x_i

Base case

each recursive call adds an assignment x_i=value to I

at some point, all variables are assigned

we can now check whether F is true or false

but:

sometimes, we can check whether F is true or false even if some variables are still unassigned

Value of formulae under partial interpretations

in the formula F:

replace each x_i that is assigned in I with its truth value
(e.g. if I contains x_i=true replace each occurrence of x_i with true)
simplify using rules:
- something ∧ true = something
- something ∧ false = false
- something ∨ true = true
- something ∨ false = something

result could be:

true
false
some formula containing only unassigned variables

in the first two cases, the formula has a value that does not depend on the unassigned variables
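
a minimal Python sketch of this evaluation, for formulae in clause form. The encoding is an assumption made for illustration: a clause is a list of nonzero integers (3 stands for x₃, -3 for ¬x₃) and a partial interpretation is a dict from variable number to bool. The name simplify is hypothetical.

```python
def simplify(clauses, interp):
    """Evaluate a set of clauses under a partial interpretation.

    Returns True, False, or the list of simplified clauses
    (containing only unassigned variables).
    """
    remaining = []
    for clause in clauses:
        reduced = []
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var not in interp:
                reduced.append(lit)      # unassigned: keep the literal
            elif interp[var] == (lit > 0):
                satisfied = True         # something ∨ true = true
                break
            # otherwise the literal is false: something ∨ false = something
        if satisfied:
            continue                     # something ∧ true = something
        if not reduced:
            return False                 # something ∧ false = false
        remaining.append(reduced)
    return True if not remaining else remaining


# with x=1, y=2, z=3 and I={x=true, z=false}:
I = {1: True, 3: False}
print(simplify([[1, 2], [-1, -2, 3]], I))   # [[-2]]  (depends on y)
print(simplify([[-1, 2], [-1, 3]], I))      # False
print(simplify([[1, 2, 3], [-2, -3]], I))   # True
```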

Partial interpretation, example 1

I={x=true, z=false}

F = { x ∨ y, ¬x ∨ ¬y ∨ z }

replace variables with values:

F	=	{ x ∨ y, ¬x ∨ ¬y ∨ z } =
		{ true ∨ y, ¬true ∨ ¬y ∨ false } =
		{ true, ¬y } =
		{ ¬y }

formula is neither true nor false

value depends on the value of variable y

Partial interpretation, example 2

I={x=true, z=false}

F = { ¬x ∨ y, ¬x ∨ z }

replace variables with values:

F	=	{ ¬x ∨ y, ¬x ∨ z } =
		{ ¬true ∨ y, ¬true ∨ false } =
		{ false ∨ y, false ∨ false } =
		{ y, false ∨ false } =
		{ y, false }

formula is false

all clauses have to be satisfied

even a single false clause implies that the formula is false

(even if the first clause were true instead of y, the formula would still be false)

Partial interpretation, example 3

I={x=true, z=false}

F = {x ∨ y ∨ z, ¬y ∨ ¬z}

F	=	{x ∨ y ∨ z, ¬y ∨ ¬z} =
		{true ∨ y ∨ false, ¬y ∨ ¬false} =
		{true ∨ y ∨ false, ¬y ∨ true} =
		{true, true}

all clauses are true

formula is true

Partial interpretation and formula

given a partial interpretation, a formula could be:

true (denoted I ⇒ F)
false (denoted I ⇒ ¬F)
neither true nor false
(its value depends on the unassigned variables)

in backtracking:

if the formula is true or false (first two cases) according to the partial interpretation, there is no need to perform the recursive calls

Backtracking with check of partial interpretation

boolean sat(formula F, partial_interpretation I)

if ( I ⇒ F ) return true
if ( I ⇒ ¬F ) return false
choose x_i that I does not assign
return sat(F, I ∪ { x_i=true }) or sat(F, I ∪ { x_i=false })

Avoid second recursive calls

implicit in most imperative programming languages: if the first argument of an "or" is true, the others are not evaluated

for clarity, backtracking is as follows:

boolean sat(formula F, partial_interpretation I)

if ( I ⇒ F ) return true
if ( I ⇒ ¬F ) return false
choose x_i that I does not assign
if sat(F, I ∪ { x_i=true }) return true
if sat(F, I ∪ { x_i=false }) return true
return false
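
the same procedure as a runnable Python sketch. Assumptions not in the pseudocode: clauses are lists of nonzero integers (2 for x₂, -2 for ¬x₂), I is a dict from variable number to bool, and the branching variable is simply the lowest-numbered unassigned one. The helper names satisfied and falsified are hypothetical.

```python
def satisfied(clause, interp):
    """True if some literal of the clause is true under interp."""
    return any(abs(l) in interp and interp[abs(l)] == (l > 0) for l in clause)


def falsified(clause, interp):
    """True if every literal of the clause is false under interp."""
    return all(abs(l) in interp and interp[abs(l)] != (l > 0) for l in clause)


def sat(clauses, interp=None):
    interp = {} if interp is None else interp
    if all(satisfied(c, interp) for c in clauses):
        return True                      # I ⇒ F
    if any(falsified(c, interp) for c in clauses):
        return False                     # I ⇒ ¬F
    # choose x_i that I does not assign (here: the lowest-numbered one)
    var = min(abs(l) for c in clauses for l in c if abs(l) not in interp)
    if sat(clauses, {**interp, var: True}):
        return True
    if sat(clauses, {**interp, var: False}):
        return True
    return False


print(sat([[-1, -2], [1, -2], [-1, -3]]))                    # True
print(sat([[-1, -2], [-1, 2], [1, -2], [2, -3], [1, 3]]))    # False
```

an unassigned variable always exists at the branching step: if all variables were assigned, every clause would be either satisfied or falsified, and one of the two checks would already have returned.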

Backtracking, first example

{ ¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃ }

Backtracking, first example (1)

start with empty assignment {}

choose a variable, for example x₁

do two recursive calls with assignments {x₁=true} and {x₁=false}

Backtracking, first example (2)

two recursive calls with assignments {x₁=true} and {x₁=false}

Backtracking, first example (3)

first recursive call is with assignment {x₁=true}

{ ¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃ }

no clause of {¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃} is falsified by {x₁=true}

no contradiction: choose an unassigned variable

Backtracking, first example (4)

branching variable x₂ (for example)

do two recursive calls adding the two possible evaluations of x₂ to the original one

partial interpretations in the recursive calls are then {x₁=true, x₂=true} and {x₁=true, x₂=false}

Backtracking, first example (5)

first recursive call with assignment {x₁=true, x₂=true}:

in {¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃}, the clause ¬x₁ ∨ ¬x₂ is falsified

Backtracking, first example (6)

contradiction, close branch of the tree

Backtracking, first example (7)

go back to node labeled x₂

x₂=true already tried

now try x₂=false

Backtracking, first example (8)

assignment {x₁=true, x₂=false}

formula {¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃} is not falsified

choose a variable: the only one left unassigned is x₃

Backtracking, first example (9)

two recursive calls: x₃=true, x₃=false

Backtracking, first example (10)

first recursive call has assignment {x₁=true, x₂=false, x₃=true}

in formula {¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃}, the clause ¬x₁ ∨ ¬x₃ is falsified

Backtracking, first example (11)

clause is falsified=formula is falsified

close branch

Backtracking, first example (12)

backtrack to node labeled x₃

Backtracking, first example (13)

second recursive call for x₃

value x₃=false

assignment is {x₁=true, x₂=false, x₃=false}

all clauses in {¬x₁ ∨ ¬x₂, x₁ ∨ ¬x₂, ¬x₁ ∨ ¬x₃} are satisfied!

¬x₁ ∨ ¬x₂: because x₂=false
x₁ ∨ ¬x₂: because x₁=true
¬x₁ ∨ ¬x₃: because x₃=false

Backtracking, first example (14)

no other recursive calls

if a subcall returns true, the call returns true as well

this means: in this case, we go back to original call and return true

model found, no need to go ahead

formula is satisfiable

Backtracking, first example (15)

SVG animation of the tree

Backtracking, second example

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

start with empty assignment

formula is not false under this interpretation

choose a variable

as an example, we choose x₁

Backtracking, second example (1)

branch on x₁

Backtracking, second example (2)

first recursive call with x₁=true

formula { ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ } not made false by this assignment

choose an unassigned variable

Backtracking, second example (3)

as an example, we choose x₂

two other recursive calls, with assignments {x₁=true, x₂=true} and {x₁=true, x₂=false}

Backtracking, second example (4)

first recursive (sub)call:
assignment {x₁=true, x₂=true}

formula was { ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

clause ¬x₁ ∨ ¬ x₂ false

call returns false

no need to proceed any further, even if x₃ is still unassigned

Backtracking, second example (5)

recursion goes back to node marked x₂

partial assignment was {x₁=true} there

Backtracking, second example (6)

do second recursive (sub)call adding x₂=false to x₁=true

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

clause ¬x₁ ∨ x₂ false

close branch

Backtracking, second example (7)

branch closed, go back to x₂

Backtracking, second example (8)

both recursive subcalls returned false, call returns false

go back to the first call, where x₁=false is left to try

Backtracking, second example (9)

partial assignment is {x₁=false}

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

formula is not false

choose a variable

Backtracking, second example (10)

as an example, we choose x₃

Backtracking, second example (11)

recursive call with partial assignment {x₁=false, x₃=true}

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

formula is not false in this assignment

choose another variable and set it to true and false

Backtracking, second example (12)

only unassigned variable left is x₂

Backtracking, second example (13)

assignment {x₁=false, x₃=true, x₂=true}

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

clause x₁ ∨ ¬x₂ is falsified

Backtracking, second example (14)

backtrack to x₂

Backtracking, second example (15)

assignment {x₁=false, x₃=true, x₂=false}

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

clause x₂ ∨ ¬x₃ is falsified

Backtracking, second example (16)

backtrack to x₂

Backtracking, second example (17)

both calls from node x₂ returned false

go back to node x₃

Backtracking, second example (18)

assignment {x₁=false, x₃=false}

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

clause x₁ ∨ x₃ falsified

Backtracking, second example (19)

calls from x₃ both returned false

Backtracking, second example (20)

go back to x₁

we already tried x₁=true and x₁=false

Backtracking, second example (21)

return false

formula is unsatisfiable

Backtracking, second example (22)

SVG animation of the tree

Backtracking, third example

{ x₂ ∨ x₁, ¬x₁, ¬x₂ ∨ ¬x₃, x₃ ∨ x₁ }

observation: set contains the unit clause ¬x₁

Unit propagation

in DPLL can be used for:

simplify F (using unit clauses and values in I)
obtain new assignments to add to I

second point is especially useful:

base case of recursion: when I ⇒ F or I ⇒ ¬F
both are more likely with more variables evaluated in I
better to have as many evaluated variables as possible

variables get a value by:

performing the two recursive calls sat(F, I ∪ {x_i=value})
by unit propagation, in the same call

each recursive call generates a subtree of recursive calls

one instead of two means half as many recursive calls (on average)

DPLL with UP

boolean sat(formula F, partial_interpretation I)

if ( I ⇒ F ) return true
if ( I ⇒ ¬F ) return false
F,I = up(F,I)
if I is inconsistent return false
choose x_i that I does not assign
if sat(F, I ∪ { x_i=true }) return true
if sat(F, I ∪ { x_i=false }) return true
return false

extra advantage: UP may discover inconsistency

Unit propagation: example

in the last of the examples above, the set contains a unit clause:

{ x₂ ∨ x₁, ¬x₁, ¬x₂ ∨ ¬x₃, x₃ ∨ x₁ }

up says x₁ is false

remove x₁ from the clauses where it occurs positive:

x₂ ∨ x₁: becomes x₂ ∨ ~~x₁~~, which is x₂
x₃ ∨ x₁: becomes x₃ ∨ ~~x₁~~, which is x₃

as a result, both x₂ and x₃ are true

clause ¬x₂ ∨ ¬x₃ is contradicted
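
a possible Python sketch of up(F,I), under the same assumed encoding (clauses as lists of nonzero integers, I as a dict from variable number to bool). Returning None in place of F to signal a contradiction is a convention chosen here, not something the slides prescribe.

```python
def up(clauses, interp):
    """Unit propagation.

    Returns (simplified_clauses, extended_interp), or (None, extended_interp)
    if some clause becomes empty (contradiction).
    """
    interp = dict(interp)
    while True:
        simplified = []
        unit = None
        for clause in clauses:
            # clause already satisfied by interp: drop it
            if any(abs(l) in interp and interp[abs(l)] == (l > 0) for l in clause):
                continue
            # remove literals made false by interp
            reduced = [l for l in clause if abs(l) not in interp]
            if not reduced:
                return None, interp          # empty clause: contradiction
            if len(reduced) == 1 and unit is None:
                unit = reduced[0]            # remember one unit clause
            simplified.append(reduced)
        clauses = simplified
        if unit is None:
            return clauses, interp           # nothing left to propagate
        interp[abs(unit)] = unit > 0         # make the unit literal true


# third example: { x₂ ∨ x₁, ¬x₁, ¬x₂ ∨ ¬x₃, x₃ ∨ x₁ }
F, I = up([[2, 1], [-1], [-2, -3], [3, 1]], {})
print(F is None, I[1])    # True False
```

depending on the order in which unit clauses are propagated, the contradiction may show up on a different clause than the one named in the text; the outcome (contradiction) is the same.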

Unit clauses, in general

in the example, a unit clause was in the original set

may also show up with a partial assignment

Unit clauses from partial assignment

second of the examples above:

{ ¬x₁ ∨ ¬ x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

no unit clause in the original set

two recursive calls

first recursive call with x₁=true

x₁=true is like an additional unit clause {x₁}

apply unit propagation

Unit propagation in a recursive call

{ ¬x₁ ∨ ¬x₂, ¬x₁ ∨ x₂, x₁ ∨ ¬x₂, x₂ ∨ ¬x₃, x₁ ∨ x₃ }

recursive call with x₁=true

remove x₁ where negative:

¬x₁ ∨ ¬x₂: ~~¬x₁~~ ∨ ¬x₂ becomes ¬x₂
¬x₁ ∨ x₂: ~~¬x₁~~ ∨ x₂ becomes x₂

contradiction is reached

recall that plain backtracking instead needs two more recursive calls (x₂=true and x₂=false) to detect it

Unit propagation: savings

in this case, only two recursive calls are saved

more generally, the subtree rooted in the node could have been exponentially large

Pure literal rule

what about a in the following formula?

{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c}

Constraining a single value

{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c}

some occurrences of a

no occurrence of ¬a

if a variable is always positive or always negative in a formula, we say it is pure

Choice of value of pure literals

in general (a not pure):

{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c, ¬ a ∨ b}

a=true → a literal is made true in the first two clauses and false in the last
a=false → a literal is made true in the last clause and false in the first two ones

if a is pure:

{ a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬ c}

a=true → a literal is made true in the first two clauses
a=false → a literal is made false in the first two clauses

setting a=true has some advantage and no disadvantage

Pure literal rule

if a variable only occurs positively in a formula, set it to true
if a variable only occurs negated in a formula, set it to false

remove clauses containing the literal (as usual)

may create new pure literals

New pure literals

in the example { a ∨ ¬ b ∨ ¬ c, a ∨ c, b ∨ ¬c}:

a only positive, set to true

remove clauses containing a

remains {b ∨ ¬c}

both b and c pure
(first positive, second negative)
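
a small Python sketch of the rule, repeated until no pure literal is left. It assumes the integer encoding of literals (1 for a, -1 for ¬a, and so on) and that the clauses only contain unassigned variables; the function name pure mirrors the pseudocode, the rest is illustrative.

```python
def pure(clauses, interp):
    """Apply the pure literal rule until no pure literal remains."""
    interp = dict(interp)
    while True:
        literals = {l for c in clauses for l in c}
        # a literal is pure if its negation never occurs
        pures = {l for l in literals if -l not in literals}
        if not pures:
            return clauses, interp
        for l in pures:
            interp[abs(l)] = l > 0       # make the pure literal true
        # clauses containing a pure literal are satisfied: remove them
        clauses = [c for c in clauses if not any(l in pures for l in c)]


# { a ∨ ¬b ∨ ¬c, a ∨ c, b ∨ ¬c } with a=1, b=2, c=3
print(pure([[1, -2, -3], [1, 3], [2, -3]], {}))
# ([], {1: True, 2: True, 3: False})
```

the first round sets a=true and leaves {b ∨ ¬c}; the second round finds b and ¬c pure and removes the last clause.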

Pure literal rule, in practice

keep count of how many clauses contain a and ¬a

if a clause is removed by UP, decrease the counters of its literals

when a counter reaches zero, the variable is pure
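
one way to keep such counters in Python, sketched with collections.Counter. The function names are hypothetical, and literals are assumed to be encoded as nonzero integers (2 for b, -2 for ¬b).

```python
from collections import Counter


def occurrence_counts(clauses):
    """Number of clauses containing each literal."""
    return Counter(l for c in clauses for l in set(c))


def remove_clause(counts, clause):
    """Update the counters when a clause is removed.

    Returns the literals that became pure because of the removal.
    """
    newly_pure = []
    for l in set(clause):
        counts[l] -= 1
        # l no longer occurs, but its negation still does: -l is pure
        if counts[l] == 0 and counts[-l] > 0:
            newly_pure.append(-l)
    return newly_pure


counts = occurrence_counts([[1, 2], [-2, 3], [-2, -3]])
print(remove_clause(counts, [1, 2]))    # [-2]
```

when both counters of a variable are zero the variable no longer occurs at all, so it is not reported.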

DPLL

complete algorithm:

boolean sat(formula F, partial_interpretation I)

if ( I ⇒ F ) return true
if ( I ⇒ ¬F ) return false
F,I = up(F,I)
if I is inconsistent return false
F,I = pure(F,I)
if F = ∅ return true
choose x_i that I does not assign
if sat(F, I ∪ { x_i=true }) return true
if sat(F, I ∪ { x_i=false }) return true
return false
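
a compact Python sketch of the complete procedure. It deviates from the pseudocode in a few labeled ways, all of them assumptions made for illustration: F is simplified as assignments are made, so "I ⇒ F" becomes "no clause left" and "I ⇒ ¬F" becomes "an empty clause appears"; the function returns the interpretation found (or None) instead of a boolean; the branching variable is just the first one in the first clause. Literals are nonzero integers (4 for x₄, -4 for ¬x₄).

```python
def assign(clauses, lit):
    """Make lit true: drop satisfied clauses, delete the opposite literal."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]


def up(clauses, interp):
    """Unit propagation; returns (None, interp) on contradiction."""
    while True:
        if any(len(c) == 0 for c in clauses):
            return None, interp
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, interp
        interp = {**interp, abs(unit): unit > 0}
        clauses = assign(clauses, unit)


def pure(clauses, interp):
    """Pure literal rule, repeated until no pure literal is left."""
    while True:
        lits = {l for c in clauses for l in c}
        pures = {l for l in lits if -l not in lits}
        if not pures:
            return clauses, interp
        interp = {**interp, **{abs(l): l > 0 for l in pures}}
        clauses = [c for c in clauses if not pures & set(c)]


def dpll(clauses, interp=None):
    """Return a satisfying (possibly partial) interpretation, or None."""
    interp = {} if interp is None else interp
    clauses, interp = up(clauses, interp)
    if clauses is None:
        return None                      # contradiction found by UP
    clauses, interp = pure(clauses, interp)
    if not clauses:
        return interp                    # F = ∅: all clauses satisfied
    var = abs(clauses[0][0])             # placeholder for a real heuristic
    for value in (True, False):
        lit = var if value else -var
        model = dpll(assign(clauses, lit), {**interp, var: value})
        if model is not None:
            return model
    return None


print(dpll([[-1, -2], [1, -2], [-1, -3]]))    # {2: False, 3: False}
```

in the usage line, x₁ is left unassigned: the pure literal rule alone satisfies every clause, so either value of x₁ gives a model.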

Some observations about pure(F,I)

if a is pure, it sets a=value (changes I)
if a is for example positive, setting a=true means that all clauses containing a can be removed (already satisfied)
same for a negative

both I and F change

but F changes only because of the removal of some clauses

we remove clauses that are satisfied:
if we remove them all, formula is satisfied

New pure literals

up(F,I) may create new pure literals

example: b is not pure here:

{a, a ∨ b, ¬a ∨ ¬b ∨ ¬c, ¬b ∨ c}

performing up, we get:

{¬b ∨ ¬c, ¬b ∨ c}

b is now pure (all occurrences are negative)

New unit clauses

pure(F,I) cannot create new unit clauses

reason: it only removes some clauses

no non-unary clause becomes unary by pure(F,I), since this procedure does not modify individual clauses

Why first up then pure

up might create new pure literals

pure cannot create new unit clauses

DPLL, complete example

{ ¬x₁ ∨ x₃ ∨ x₄, ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, x₂ ∨ ¬x₃ ∨ ¬x₁, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄, x₁ ∨ x₅, x₁ ∨ x₆,
¬x₆ ∨ x₃ ∨ ¬x₅, x₁ ∨ ¬x₃ ∨ ¬x₅ }

DPLL, complete example (2)

choose branching variable x₁ (for example)

try x₁=true first

apply up and pure

DPLL, complete example (3)

with {x₁=true} the clauses become:

{ ~~¬x₁~~ ∨ x₃ ∨ x₄, ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, x₂ ∨ ¬x₃ ∨ ~~¬x₁~~, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄, ~~x₁ ∨ x₅~~, ~~x₁ ∨ x₆~~,
¬x₆ ∨ x₃ ∨ ¬x₅, ~~x₁ ∨ ¬x₃ ∨ ¬x₅~~ }
=
{ x₃ ∨ x₄, ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, x₂ ∨ ¬x₃, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄,
¬x₆ ∨ x₃ ∨ ¬x₅ }

x₅ only occurs negated

can be set to false, removing clause

{ x₃ ∨ x₄, ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, x₂ ∨ ¬x₃, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄ }

DPLL, complete example (4)

choose variable x₂, value true first

DPLL, complete example (5)

with {x₁=true, x₂=true} the clauses become:

{ x₃ ∨ x₄, ~~¬x₂~~ ∨ x₆ ∨ x₄, ~~¬x₂~~ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ~~¬x₂~~, ~~x₂ ∨ ¬x₃~~, ~~x₂ ∨ x₆ ∨ x₃~~,
~~x₂ ∨ ¬x₆ ∨ ¬x₄~~ }
=
{ x₃ ∨ x₄, x₆ ∨ x₄, ¬x₆ ∨ ¬x₃,
¬x₄ }

from ¬x₄ we derive x₃ and x₆

they falsify the clause ¬x₆ ∨ ¬x₃

contradiction, no need to apply pure

DPLL, complete example (5)

contradiction reached, backtrack

DPLL, complete example (5)

with {x₁=true, x₂=false}, clauses become:

{ x₃ ∨ x₄, ~~¬x₂ ∨ x₆ ∨ x₄~~, ~~¬x₂ ∨ ¬x₆ ∨ ¬x₃~~,
~~¬x₄ ∨ ¬x₂~~, ~~x₂~~ ∨ ¬x₃, ~~x₂~~ ∨ x₆ ∨ x₃,
~~x₂~~ ∨ ¬x₆ ∨ ¬x₄ }
=
{ x₃ ∨ x₄,
¬x₃, x₆ ∨ x₃,
¬x₆ ∨ ¬x₄ }

from ¬x₃ we derive x₄ and x₆

they contradict clause ¬x₆ ∨ ¬x₄

contradiction, no need to apply pure

DPLL, complete example (6)

backtrack to first node, try other branch

DPLL, complete example (6)

with {x₁=false} clauses become:

{ ~~¬x₁ ∨ x₃ ∨ x₄~~, ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, ~~x₂ ∨ ¬x₃ ∨ ¬x₁~~, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄, ~~x₁~~ ∨ x₅, ~~x₁~~ ∨ x₆,
¬x₆ ∨ x₃ ∨ ¬x₅, ~~x₁~~ ∨ ¬x₃ ∨ ¬x₅ }
=
{ ¬x₂ ∨ x₆ ∨ x₄, ¬x₂ ∨ ¬x₆ ∨ ¬x₃,
¬x₄ ∨ ¬x₂, x₂ ∨ x₆ ∨ x₃,
x₂ ∨ ¬x₆ ∨ ¬x₄, x₅, x₆,
¬x₆ ∨ x₃ ∨ ¬x₅, ¬x₃ ∨ ¬x₅ }

from x₅ we derive ¬x₃

since x₆ is true, clause ¬x₆ ∨ x₃ ∨ ¬x₅ is falsified

DPLL, complete example (7)

contradiction reached on last node

set is unsatisfiable

Choice of branching variable

which is the best, among the unassigned ones?

does it make any difference?

Same example, different choices

same set as previous example

choose x₃ first
then other variables

Same example, different choices

(execution details)

larger tree → longer running time

in general: difference may be exponential

Choice of branching variable: principle

try to reduce the number of the subsequent recursive calls in sat(F, I ∪ {x_i=true}) and sat(F, I ∪ {x_i=false})

Heuristics based on binary clauses

many binary clauses containing ¬x_i = many assignments obtained by unit propagation in sat(F, I ∪ {x_i=true})

same for x_i and sat(F, I ∪ {x_i=false})

choose x_i that is contained in many binary clauses

heuristic based only on the first step of unit propagation, but…
many unit propagations are likely to lead to many more

Sign of variable

x₃ positive in 10 binary clauses and negative in none

x₈ positive in 4 binary clauses and negative in 4

how large are the two subtrees?

Evaluation of trees

assume that no further propagation is done after first step

evaluation is qualitative
(impossible to foresee the actual size of subtrees without specifying the whole formula)
likely that many propagations in first step lead to many in further steps

Assignments in first step of unit propagation

x₃ (positive in 10, negative in none):
- x₃=true: zero
- x₃=false: 10

x₈ (positive in 4, negative in 4):
- x₈=true: 4
- x₈=false: 4

cost is exponential in the number of variables
assume 15 total

x₃: cost = 2^(15-1-10) + 2^(15-1) = 2⁴ + 2¹⁴ = 16 + 16384 = 16400
x₈: cost = 2^(15-1-4) + 2^(15-1-4) = 2¹⁰ + 2¹⁰ = 1024 + 1024 = 2048

better savings obtained by variables where positive and negative occurrences in binary clauses are balanced
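
the estimate can be written as a short Python function, under the same simplifying assumption that only the first step of unit propagation is counted. The function name and parameters are hypothetical.

```python
def estimated_cost(total_vars, positive, negative):
    """Rough size of the two subtrees when branching on a variable.

    positive / negative: number of binary clauses containing x / ¬x.
    x=true turns the binary clauses containing ¬x into unit clauses,
    x=false does the same for the binary clauses containing x.
    """
    remaining = total_vars - 1                    # the branching variable is assigned
    cost_true = 2 ** (remaining - negative)       # subtree for x=true
    cost_false = 2 ** (remaining - positive)      # subtree for x=false
    return cost_true + cost_false


# balanced variable (4 and 4) versus unbalanced one (10 and 0), 15 variables
print(estimated_cost(15, 4, 4) < estimated_cost(15, 10, 0))    # True
```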

A possible choice

an old method, based on a heuristic

for each variable x_i

p_i is the number of binary clauses containing x_i
n_i is the number of binary clauses containing ¬x_i

choose variable x_i that maximizes 1024·p_i·n_i + p_i + n_i

idea: variables that have some positive and negative occurrences are preferred over some having many positive but few negative (or vice versa)
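
a sketch of this choice in Python, assuming clauses are lists of nonzero integers that only mention unassigned variables (for instance, after simplification). The function name is hypothetical; ties are broken in favor of the lowest-numbered variable, which is an arbitrary choice.

```python
from collections import Counter


def choose_branching_variable(clauses):
    """Pick the variable maximizing 1024*p*n + p + n over binary clauses."""
    # count literal occurrences in binary clauses only
    counts = Counter(l for c in clauses if len(c) == 2 for l in c)
    variables = {abs(l) for c in clauses for l in c}

    def score(var):
        p, n = counts[var], counts[-var]     # positive / negative occurrences
        return 1024 * p * n + p + n

    return max(sorted(variables), key=score)


clauses = [[1, 2], [1, 3], [1, 4], [-2, 5], [2, 6], [1, 2, 3]]
print(choose_branching_variable(clauses))    # 2
```

in the usage example x₁ occurs in three binary clauses but only positively (score 3), while x₂ occurs twice positively and once negatively (score 1024·2·1 + 3 = 2051), so x₂ is preferred.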

Finding the model

what if the formula is satisfiable?

different way of choosing the branching variable?

a first (wrong) principle: concentrate on choosing between x_i and ¬x_i

(wrong) principle: try to guess the sign of x_i in a model

Satisfiability and partial unsatisfiability

all choices correct: model found in linear time

impossible to make all choices right

what if one is wrong?

One wrong choice

wrong choice=no model in the subtree

formula unsatisfiable with partial model

unsatisfiability: search tree may be exponential

therefore: most of the time spent on the unsatisfiable subformula

even if formula is satisfiable, the hard part of the problem is still dealing with unsatisfiable formulae