Two incomplete methods: GSAT and Unit propagation

Two methods for finding a model of a formula

Incomplete=they may fail

Two different kinds of incompleteness (more later)

Both work on CNFs

GSAT

Works on CNFs: {C₁, ..., C_m}

Find a model = find an interpretation satisfying all clauses

Principle:

if I satisfies 3 clauses and I' satisfies 5, the latter is "closer" to being a model than the former

works by picking an interpretation and iteratively trying to increase the number of satisfied clauses

Increasing satisfied clauses

Given an interpretation I, try to change the value of a single variable
(if true make it false and vice versa)

Does this change increase the number of satisfied clauses?

Try this for all variables

Best change

given I, change the value of a variable

determine the number of satisfied clauses

do this for all variables

pick the best, in terms of number of satisfied clauses
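The "best change" step can be sketched in Python as follows. The representation is an assumption made for illustration, not part of the slides: a clause is a list of nonzero integers (i stands for variable xᵢ, -i for ¬xᵢ) and an interpretation is a dict from variable numbers to booleans. The names `num_satisfied` and `best_flip` are hypothetical.

```python
def num_satisfied(clauses, interp):
    """Count the clauses with at least one literal true under interp."""
    return sum(
        any(interp[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def best_flip(clauses, interp):
    """Try flipping each variable in turn; return (variable, score) of the best flip."""
    best_var, best_score = None, -1
    for var in sorted(interp):
        flipped = dict(interp)
        flipped[var] = not flipped[var]      # change the value of a single variable
        score = num_satisfied(clauses, flipped)
        if score > best_score:               # ties: keep the first variable found
            best_var, best_score = var, score
    return best_var, best_score


# {x1 ∨ x2, ¬x1 ∨ x2, ¬x2 ∨ x3} with x1=true, x2=false, x3=true
clauses = [[1, 2], [-1, 2], [-2, 3]]
interp = {1: True, 2: False, 3: True}
print(num_satisfied(clauses, interp))  # 2
print(best_flip(clauses, interp))      # (2, 3): flipping x2 satisfies all 3 clauses
```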

GSAT: the algorithm

1. pick a random interpretation I
2. for every variable x:
   - I'=same as I but for the value of x
   - compute the number of clauses satisfied by I'
3. I=interpretation I' maximizing this number
4. if all clauses are satisfied by I, stop
5. go to 2

(one point missing)

GSAT: incompleteness

limitation: we change one variable at a time

possible situation: I → I' → I'' → I

none of I, I', I'' are models
all interpretations that differ from them by one variable satisfy fewer clauses than them
a model exists, but differs from them by more than one variable

we keep cycling over them because the other close interpretations satisfy fewer clauses

we are stuck in a local maximum
(w.r.t. the surrounding interpretations, we have reached a maximum, but this is not the overall maximum)

but a model exists

GSAT: restarts

after a fixed number of iterations (say, 10000), GSAT stops and picks another random interpretation

try this for a fixed number of times (say, 10)

still, we may end up in a local maximum every time

GSAT is incomplete
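A minimal sketch of GSAT with restarts, under assumptions that are not in the slides: clauses are lists of nonzero integers (i for xᵢ, -i for ¬xᵢ), ties between equally good flips go to the first variable, and the parameter names `max_flips` and `max_tries` are illustrative. Returning `None` only means that no model was found; it says nothing about unsatisfiability.

```python
import random


def num_satisfied(clauses, interp):
    """Count the clauses with at least one literal true under interp."""
    return sum(any(interp[abs(l)] == (l > 0) for l in c) for c in clauses)


def gsat(clauses, variables, max_flips=10000, max_tries=10, seed=None):
    """Return a model as a dict, or None if none was found within the limits."""
    rng = random.Random(seed)
    for _ in range(max_tries):                       # restarts
        interp = {v: rng.choice([True, False]) for v in variables}
        for _ in range(max_flips):
            if num_satisfied(clauses, interp) == len(clauses):
                return interp                        # all clauses satisfied

            def score_after_flip(v):
                flipped = {**interp, v: not interp[v]}
                return num_satisfied(clauses, flipped)

            # flip the best variable, even if the score does not improve
            best = max(variables, key=score_after_flip)
            interp[best] = not interp[best]
    return None


print(gsat([[1], [2], [3]], [1, 2, 3], max_flips=10, max_tries=1))
```

On `[[1], [-1]]` the function always returns `None`, but so it may on a satisfiable set if every try ends in a local maximum.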

GSAT: incompleteness

how is it incomplete?
- if GSAT finds a model, the formula is satisfiable
- if GSAT does not find a model, the formula may be satisfiable or not
when is it incomplete?
in general, we cannot tell in advance whether GSAT will work on a particular formula
(only way is: try it and see if it finds a model)

Unit Propagation

Still works on CNFs: {C₁, ..., C_m}

Principle:

if a clause contains a single literal, every model makes that literal true

replace every occurrence of that variable with its value

obvious on an example (next)

Unit Propagation: example

{x, ¬x ∨ y, x ∨ ¬z, y ∨ z, y ∨ ¬z}

every model of this set has x=true
(otherwise, the clause x would be false)

model=interpretation that satisfies all clauses

(cont.)

Unit Propagation: example

{x, ¬x ∨ y, x ∨ ¬z, y ∨ z, y ∨ ¬z}

every model of this set has x=true

under this assumption:

¬x ∨ y = false ∨ y = y
x ∨ ¬z = true

Unit + Propagation

from unit clauses to truth values

propagate truth values to clauses

repeat, if needed

Unit propagation: full example

{x, ¬x ∨ ¬y, x ∨ ¬z, y ∨ z, y ∨ ¬z}

x=true

replace x=true in the set and simplify:

{x, ¬x ∨ ¬y, x ∨ ¬z, y ∨ z, y ∨ ¬z} =
{true, ¬true ∨ ¬y, true ∨ ¬z, y ∨ z, y ∨ ¬z} =
{true, false ∨ ¬y, true ∨ ¬z, y ∨ z, y ∨ ¬z} =
{¬y, y ∨ z, y ∨ ¬z}

simplifications based on:

false ∨ something = something
(remove false literals from clauses)
true ∧ something = something
(remove true clauses from the set)
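The two simplification rules can be sketched as a single Python function. The encoding is an assumption for illustration: x, y, z are numbered 1, 2, 3, a negative number is a negated variable, and `assign` is a hypothetical name.

```python
def assign(clauses, lit):
    """Simplify a set of clauses under the assumption that literal lit is true."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                                       # true clause: remove it from the set
        result.append([l for l in clause if l != -lit])    # remove false literals
    return result


# {x, ¬x ∨ ¬y, x ∨ ¬z, y ∨ z, y ∨ ¬z} with x=1, y=2, z=3
cnf = [[1], [-1, -2], [1, -3], [2, 3], [2, -3]]
step1 = assign(cnf, 1)
print(step1)               # [[-2], [2, 3], [2, -3]], that is {¬y, y ∨ z, y ∨ ¬z}
print(assign(step1, -2))   # [[3], [-3]], that is {z, ¬z}
```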

(cont.)

Unit propagation: full example

so far: x=true and the set simplifies to:

{¬y, y ∨ z, y ∨ ¬z}

all models have y=false

set y=false and simplify:

{¬y, y ∨ z, y ∨ ¬z} =
{true, false ∨ z, false ∨ ¬z} =
{z, ¬z}

contradiction

the set is unsatisfiable

next: formal definition of algorithm

Partial interpretations

accumulate truth values in a partial interpretation I

partial interpretation=truth evaluation of some variables

initially, no variable is evaluated

(full) interpretations are a particular case (all variables evaluated)

Unit propagation: algorithm

1. I=empty partial interpretation
2. for every unit clause v or ¬v:
   - add v=true or v=false, respectively, to I
   - replace v with its value in the set of clauses and simplify
3. if the set contains some other unit clause, go to 2

stop on contradiction in the partial interpretation

Unit propagation, in practice

try to make all propagations at the same time

1. I=empty partial interpretation
2. I'=empty partial interpretation
3. for every unit clause v or ¬v:
   - add v=true or v=false, respectively, to I'
4. if I' = ∅ stop
5. replace each v in I' with its value in the set of clauses and simplify
6. I=I ∪ I'
7. go to 2

stop on contradiction in I or I'
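The batched version can be sketched in Python as below. The integer encoding of literals (i for a variable, -i for its negation) and the convention of returning `None` in place of the clause set on contradiction are assumptions of this sketch, not part of the slides.

```python
def unit_propagate(clauses):
    """Return (partial interpretation, remaining clauses).

    The second component is None if a contradiction is reached.
    """
    interp = {}
    while True:
        if [] in clauses:                       # an empty clause cannot be satisfied
            return interp, None
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:                           # no unit clause: stop
            return interp, clauses
        for lit in units:
            var, value = abs(lit), lit > 0
            if interp.get(var, value) != value:
                return interp, None             # both v and ¬v are unit clauses
            interp[var] = value
        # replace the assigned variables with their values and simplify
        clauses = [
            [l for l in c if abs(l) not in interp]
            for c in clauses
            if not any(interp.get(abs(l)) == (l > 0) for l in c)
        ]


# {x, ¬x ∨ ¬y, x ∨ ¬z, y ∨ z, y ∨ ¬z} with x=1, y=2, z=3
print(unit_propagate([[1], [-1, -2], [1, -3], [2, 3], [2, -3]])[1])  # None
```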

Unit propagation on a consistent formula

{x ∨ y ∨ ¬ z, y, ¬x ∨ w, ¬y ∨ w}

y=true → I

formula becomes:

{~~x ∨ y ∨ ¬ z~~, y, ¬x ∨ w, ¬y ∨ w} = {¬x ∨ w, w}

w=true → I

replace in formula and simplify:

{~~¬x ∨ w~~, w} = ∅

all clauses are satisfied by I={y=true, w=true}

formula is satisfiable

model=any extension of I to all variables
(e.g., {x=false, y=true, z=true, w=true})

Unit propagation on an inconsistent formula

{x ∨ y, x ∨ ¬y, ¬z ∨ ¬x ∨ y, ¬x ∨ ¬y, z}

add z=true to I

replace z with its truth value and simplify:

{x ∨ y, x ∨ ¬y, ¬z ∨ ¬x ∨ y, ¬x ∨ ¬y, z} =

{x ∨ y, x ∨ ¬y, ¬x ∨ y, ¬x ∨ ¬y}

no unit clause

algorithm stops

(cont.)

Incompleteness of Unit propagation

formula {x ∨ y, x ∨ ¬y, ¬x ∨ y, ¬x ∨ ¬y} is inconsistent

yet, unit propagation did not reach contradiction

happened also in the previous example, with a consistent formula

if unit propagation does not reach contradiction, we cannot say whether formula is consistent or not

unit propagation is incomplete

Two kinds of incompleteness

GSAT and Unit Propagation are incomplete in two different ways:

GSAT

if it finds a model, formula is surely satisfiable
otherwise, it may be unsatisfiable or not
no general way to predict if it works on a formula

Unit propagation

if it reaches contradiction, formula is surely unsatisfiable
otherwise, it may be satisfiable or not
complete on a specific class of formulae

last point on next page

Horn formulae

Horn clause: a clause containing at most one positive literal

= contains zero or one positive literal

examples:

x ∨ ¬y
¬ y ∨ z ∨ ¬ w
y
¬ x
¬ x ∨ ¬ y ∨ ¬ z

(last two examples: zero positive literals, still Horn)

Horn formula: set of Horn clauses
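The definition translates directly into a check. This is an illustrative sketch: literals are encoded as nonzero integers (positive for a variable, negative for its negation), and the function names are hypothetical.

```python
def is_horn_clause(clause):
    """A clause is Horn if it contains at most one positive literal."""
    return sum(1 for lit in clause if lit > 0) <= 1


def is_horn(clauses):
    """A formula is Horn if all of its clauses are Horn."""
    return all(is_horn_clause(c) for c in clauses)


# the five examples above, with x=1, y=2, z=3, w=4
examples = [[1, -2], [-2, 3, -4], [2], [-1], [-1, -2, -3]]
print(is_horn(examples))        # True
print(is_horn_clause([1, 2]))   # False: x ∨ y has two positive literals
```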

Unit propagation on Horn formulae

unit propagation is complete on Horn formulae

in general it is not complete, because it might not reach contradiction even if the formula is unsatisfiable

never the case on Horn formulae

unit propagation does not reach contradiction on a Horn formula ⇒ formula is satisfiable

Unit propagation complete on Horn formulae

assume unit propagation does not reach contradiction

let I be the partial interpretation and F the set of remaining clauses

F does not contain any variable in I
(otherwise, F would have been simplified)
F does not contain any unit clause
(otherwise, the algorithm would not have stopped)

all clauses are made of at least two literals, none of them evaluated in I

each clause has at least two literals and at most one of them is positive, so it has at least one negative literal

setting all variables not in I to false satisfies every clause of F

conclusion:

if unit propagation does not reach contradiction, the set is satisfiable
(model is I plus all other variables set to false)
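The argument gives a satisfiability procedure for Horn formulae: propagate, then set every unassigned variable to false. Below is a minimal sketch, assuming the integer encoding of literals (i for xᵢ, -i for ¬xᵢ) and assuming the input really is Horn; `horn_model` is a hypothetical name and the function does not check the Horn property.

```python
def unit_propagate(clauses):
    """Return (partial interpretation, remaining clauses or None on contradiction)."""
    interp = {}
    while True:
        if [] in clauses:
            return interp, None
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            return interp, clauses
        for lit in units:
            if interp.get(abs(lit), lit > 0) != (lit > 0):
                return interp, None
            interp[abs(lit)] = lit > 0
        clauses = [
            [l for l in c if abs(l) not in interp]
            for c in clauses
            if not any(interp.get(abs(l)) == (l > 0) for l in c)
        ]


def horn_model(clauses, variables):
    """Return a model of a Horn formula, or None if it is unsatisfiable."""
    interp, remaining = unit_propagate(clauses)
    if remaining is None:
        return None
    # I plus all other variables set to false
    return {v: interp.get(v, False) for v in variables}


cnf = [[1, -2, -3], [-1, 2, -3], [4, -5], [5], [-1, 3, -4], [-2, -4, -5]]
print(horn_model(cnf, [1, 2, 3, 4, 5]))
# {1: False, 2: False, 3: False, 4: True, 5: True}
```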

UP on Horn CNFs: example

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄ ∨ ¬x₅, x₅, ¬x₁ ∨ x₃ ∨ ¬x₄, ¬x₂ ∨ ¬x₄ ∨ ¬x₅ }

(cont.)

UP on Horn CNFs: example (2)

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄ ∨ ¬x₅, x₅, ¬x₁ ∨ x₃ ∨ ¬x₄, ¬x₂ ∨ ¬x₄ ∨ ¬x₅ }

x₅ is a unit clause

set x₅=true

UP on Horn CNFs: example (3)

replace x₅ with true

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄ ∨ ~~¬x₅~~, x₅, ¬x₁ ∨ x₃ ∨ ¬x₄, ¬x₂ ∨ ¬x₄ ∨ ~~¬x₅~~ }

simplify:

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄, ¬x₁ ∨ x₃ ∨ ¬x₄, ¬x₂ ∨ ¬x₄ }

(cont.)

UP on Horn CNFs: example (4)

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄, ¬x₁ ∨ x₃ ∨ ¬x₄, ¬x₂ ∨ ¬x₄ }

set x₄=true

replace:

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, x₄, ¬x₁ ∨ x₃ ∨ ~~¬x₄~~, ¬x₂ ∨ ~~¬x₄~~ }

simplify:

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, ¬x₁ ∨ x₃, ¬x₂ }

(cont.)

UP on Horn CNFs: example (5)

{ x₁ ∨ ¬x₂ ∨ ¬x₃, ¬x₁ ∨ x₂ ∨ ¬x₃, ¬x₁ ∨ x₃, ¬x₂ }

clause ¬x₂ is unary

set x₂=false

replace:

{ ~~x₁ ∨ ¬x₂ ∨ ¬x₃~~, ¬x₁ ∨ ~~x₂~~ ∨ ¬x₃, ¬x₁ ∨ x₃, ~~¬x₂~~ }

simplify:

{ ¬x₁ ∨ ¬x₃, ¬x₁ ∨ x₃ }

UP on Horn CNFs: example (6)

{ ¬x₁ ∨ ¬x₃, ¬x₁ ∨ x₃ }

no unit clause

propagation stops

set remaining variables to false:
all clauses are binary and Horn,
so they contain at least a negative literal each
setting all variables to false makes them true

result is assignment { x₁=false, x₂=false, x₃=false, x₄=true, x₅=true }

this is a model of the original CNF

Comparison GSAT-UP, in practice

GSAT: good in practice, no way to predict if it works on a specific formula
Unit propagation: complete on a specific kind of formulae (Horn formulae)

Running time

GSAT: can find a model at the first step, or after 100000
if it does not, running time is the maximum number of flips times the maximum number of tries
Unit propagation: runs in linear time
(proof: see next)

Unit propagation: dumb version

1. I=∅
2. scan formula for unit clauses, collecting values in I'
3. scan formula, replacing variables in I' and simplifying
4. I=I ∪ I'
5. if I' ≠ ∅ go to 2

worst case?

Worst case for UP, dumb version

{ x₁∨¬x₂, x₂∨¬x₃, ... x₉₈∨¬x₉₉, x₉₉∨¬x₁₀₀, x₁₀₀ }

first scan produces I'={x₁₀₀=true}

scan for replacing removes x₁₀₀ and simplifies x₉₉∨¬x₁₀₀ to x₉₉

second scan produces I'={x₉₉=true}

scan for replacing removes x₉₉ and simplifies x₉₈∨¬x₉₉ to x₉₈

etc.

Worst case, running time

two complete scans of the set for x₁₀₀
two scans up to the second-last clause for x₉₉
...

100+99+98...=?

like a triangle, area is base*height/2

cost is in the order of n², where n is the number of variables
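The quadratic growth can be checked by counting clause visits. The sketch below is an illustration only: it encodes xᵢ as i and ¬xᵢ as -i, counts one visit per clause per scan, and leaves out contradiction detection since the chain never produces one. `chain` and `naive_up_cost` are hypothetical names.

```python
def chain(n):
    """{x1 ∨ ¬x2, x2 ∨ ¬x3, ..., x(n-1) ∨ ¬xn, xn}."""
    return [[i, -(i + 1)] for i in range(1, n)] + [[n]]


def naive_up_cost(clauses):
    """Run the two-scan version; return the total number of clauses visited."""
    visited = 0
    while True:
        visited += len(clauses)                 # scan 1: look for unit clauses
        new_vals = {abs(c[0]): c[0] > 0 for c in clauses if len(c) == 1}
        if not new_vals:
            return visited
        visited += len(clauses)                 # scan 2: replace and simplify
        clauses = [
            [l for l in c if abs(l) not in new_vals]
            for c in clauses
            if not any(new_vals.get(abs(l)) == (l > 0) for l in c)
        ]


print(naive_up_cost(chain(100)))   # 10100 = 2 * (100 + 99 + ... + 1)
print(naive_up_cost(chain(200)))   # 40200: doubling n roughly quadruples the cost
```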

Better algorithm?

the problem with the previous set was:

we need to scan the whole set just to find out that x₁₀₀ is only contained in the last two clauses

the solution:

create a data structure for finding the clauses that contain a given variable

Data structure

array of n lists, where n is the number of variables

the list of a variable contains the clauses that contain that variable

first, assign numbers to the clauses:

{	x₁∨¬x₂,	x₂∨¬x₃,	...	x₉₈∨¬x₉₉,	x₉₉∨¬x₁₀₀,	x₁₀₀	}
	C₁	C₂	...	C₉₈	C₉₉	C₁₀₀

second, scan the set and add each clause to the lists of the variables it contains

x₁	:	C₁
x₂	:	C₁	C₂
...
x₉₉	:	C₉₈	C₉₉
x₁₀₀	:	C₉₉	C₁₀₀
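Building the array of lists takes one pass over the clauses. A small sketch, assuming literals are encoded as integers (i for xᵢ, -i for ¬xᵢ) and clauses are numbered from 1 as above; `occurrence_lists` is a hypothetical name.

```python
def occurrence_lists(clauses, n):
    """Return lists[i] = numbers of the clauses containing variable x_i (index 0 unused)."""
    lists = [[] for _ in range(n + 1)]
    for j, clause in enumerate(clauses, start=1):   # clause C_j
        for lit in clause:
            lists[abs(lit)].append(j)
    return lists


# {x1 ∨ ¬x2, x2 ∨ ¬x3, ..., x99 ∨ ¬x100, x100}
cnf = [[i, -(i + 1)] for i in range(1, 100)] + [[100]]
lists = occurrence_lists(cnf, 100)
print(lists[1], lists[2], lists[99], lists[100])
# [1] [1, 2] [98, 99] [99, 100]
```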

Data structure, on the example

{	x₁∨¬x₂,	x₂∨¬x₃,	...	x₉₈∨¬x₉₉,	x₉₉∨¬x₁₀₀,	x₁₀₀	}
	C₁	C₂	...	C₉₈	C₉₉	C₁₀₀

x₁	:	C₁
x₂	:	C₁	C₂
...
x₉₉	:	C₉₈	C₉₉
x₁₀₀	:	C₉₉	C₁₀₀

scan the set, get x₁₀₀=true

the list of x₁₀₀ contains C₉₉ and C₁₀₀

simplify these clauses

C₉₉ becomes unary, set x₉₉=true

repeat

the data structure eliminates the need for scanning the set of clauses for simplifying

Unit propagation, with the data structure

1. build the array of lists
2. I=∅
3. scan formula for unit clauses, collecting values in I'
4. if I'=∅ end
5. I''=∅
6. for each x_i=value in I':
   - for each clause C_j in the list of x_i:
     - simplify the clause
     - if it becomes unary, set its variable in I''
7. I=I ∪ I'
8. I'=I''
9. go to 4

Cost of the new algorithm

building the data structure is linear:
scan the set once, adding each clause to the lists of all variables it contains

every time we get a new value, we can immediately find the clauses that need simplification

linear for 3CNF (in general, for clauses of fixed length)

can be made linear in general

(use two lists, one for the clauses containing x_i and one for the clauses containing ¬x_i; use a counter for the number of literals left in clauses)
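The variant with two lists per variable and a counter per clause could look like the sketch below. Several choices are assumptions of this sketch and not taken from the slides: literals are integers (i for xᵢ, -i for ¬xᵢ), the two lists are stored in one dict keyed by literal, a queue replaces I' and I'', and input clauses are assumed nonempty and without repeated literals.

```python
from collections import defaultdict, deque


def unit_propagate_indexed(clauses):
    """Return the partial interpretation, or None on contradiction."""
    occurs = defaultdict(list)              # literal -> indices of clauses containing it
    for j, clause in enumerate(clauses):
        for lit in clause:
            occurs[lit].append(j)
    remaining = [len(c) for c in clauses]   # literals not yet falsified, per clause
    satisfied = [False] * len(clauses)
    interp = {}
    queue = deque(c[0] for c in clauses if len(c) == 1)
    while queue:
        lit = queue.popleft()
        var, value = abs(lit), lit > 0
        if var in interp:
            if interp[var] != value:
                return None                 # v and ¬v both derived
            continue
        interp[var] = value
        for j in occurs[lit]:               # clauses made true by lit
            satisfied[j] = True
        for j in occurs[-lit]:              # clauses that lose a literal
            if satisfied[j]:
                continue
            remaining[j] -= 1
            if remaining[j] == 0:
                return None                 # every literal of C_j is false
            if remaining[j] == 1:           # C_j became unary
                queue.append(next(l for l in clauses[j] if abs(l) not in interp))
    return interp


cnf = [[i, -(i + 1)] for i in range(1, 100)] + [[100]]
print(len(unit_propagate_indexed(cnf)))     # 100: every variable gets a value
```

Each occurrence of a literal is visited once when its variable is assigned, and the search for the last literal of a clause happens once per clause, so the total work is proportional to the size of the formula.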