Constraint Propagation

Constraint propagation is one of the key mechanisms that allows constructive search to efficiently explore huge search spaces. It does so by applying constraints whenever the search state changes, removing parts of the search space that cannot contain solutions.

In this post I’ll cover two constraint propagation techniques: forward checking and arc consistency. Since both techniques operate on variables’ live domains, I’ll start by defining that concept, and then dive into constraint propagation itself.

Variables in CSPs are associated with domains. For example, in the N-queens problem each variable represents a queen in a particular column, and its domain represents the N possible rows that the queen may be assigned to. To find a solution to the CSP, we must assign each variable exactly one value from its domain such that no constraints are violated. Before we discuss constraint propagation, it helps to define one more concept: the live domain.

The live domain of a variable V is the subset of values from V’s domain that may still be part of a solution. Unlike the domain of a variable, which is static and defined once as part of the CSP, the live domain changes throughout the search:

  • Decisions made during search can directly affect a variable’s live domain. For example, choosing to assign a queen to a particular row, say row k, shrinks its live domain to just that value; i.e., {k}.
  • Similarly, some decisions may prune values from the live domain: stating that a queen cannot be on row m leads us to remove m from the live domain of that queen’s variable.
  • Constraint propagation is the other key mechanism that prunes values from variables’ live domains, by removing values that are guaranteed not to be part of a solution given the current search state.
  • All of the cases above remove values from live domains as we traverse deeper into the search tree. When we backtrack up the tree, it is important to restore the variables’ live domains.
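The prune-and-restore bookkeeping above can be sketched with a small live-domain type that records every removal on a trail of undo closures; the LiveDomain and trail names here are my own, illustrative choices, not part of the search code shown later:

```go
package main

import "fmt"

// LiveDomain tracks which values of a variable's (static) domain are still live.
type LiveDomain struct {
	live map[int]bool
}

func NewLiveDomain(values ...int) *LiveDomain {
	d := &LiveDomain{live: map[int]bool{}}
	for _, v := range values {
		d.live[v] = true
	}
	return d
}

// Prune removes a value and records an undo closure on the trail,
// so backtracking can restore the live domain later.
func (d *LiveDomain) Prune(v int, trail *[]func()) {
	if d.live[v] {
		delete(d.live, v)
		*trail = append(*trail, func() { d.live[v] = true })
	}
}

func main() {
	q := NewLiveDomain(1, 2, 3, 4)
	var trail []func()
	q.Prune(3, &trail)       // a decision deeper in the tree removes row 3
	fmt.Println(len(q.live)) // 3 values remain live
	// On backtrack, replay the trail in reverse to restore live domains.
	for i := len(trail) - 1; i >= 0; i-- {
		trail[i]()
	}
	fmt.Println(len(q.live)) // back to 4
}
```

A real solver would batch trail entries per search node, but the idea is the same: every pruning is paired with its undo.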

As search traverses deeper into a particular subtree a few things happen:

  • Decisions made on the path from the current node to the root reduce the search space, with the goal of finding a solution, or proving none exists in this subtree.
  • As those decisions are made, the live domains of variables are reduced, mirroring the shrinking search space as the search descends the tree.

Constraint propagation is the idea that whenever the live domains of variables change, we can examine the constraints of the CSP and use them to further reduce live domains by removing (aka pruning) values that are now guaranteed to violate constraints. For example, in N-queens, whenever we assign some queen to row k, we know the other queens can no longer be on row k without being attacked, so we can remove k from their live domains. Let’s look at the two constraint propagation algorithms, and then at how one might code them up.

Forward checking is a relatively simple and effective constraint propagation technique that can be summed up as follows:

  1. When a variable V’s live domain changes, collect all constraints that involve V.
  2. For each such constraint C, consider all the other variables that are part of C.
  3. For each such variable U ≠ V, examine U’s live domain and determine which of its values can still be part of a solution without violating C. Any values that fail this test are removed from U’s live domain.

This last part is a bit tricky. Let’s look at the simplest case first, where we have binary constraints, like the diagonal constraint from N-queens, and just assigned one of the variables (i.e., placed one of the queens):

  • In this case, as we apply forward checking, we want to adjust the live domain of the other queen to remove values (row numbers) that would now be diagonally under attack.
  • We’d apply this to all the constraints that the assigned queen is involved in, thus updating live domains of possibly many other queens.
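Here’s a minimal sketch of that binary case, assuming queens are identified by column index and live domains are plain sets of rows; pruneDiagonals is a hypothetical helper, not part of any real solver API:

```go
package main

import "fmt"

// pruneDiagonals applies forward checking for the binary diagonal constraint:
// the queen in column `from` was just assigned to `row`, so the two diagonally
// attacked rows are removed from the live domain of the queen in column `to`.
func pruneDiagonals(live map[int]bool, from, to, row int) {
	dist := to - from
	if dist < 0 {
		dist = -dist
	}
	delete(live, row+dist) // one diagonal
	delete(live, row-dist) // the other diagonal
}

func main() {
	// 4-queens: assign the queen in column 1 to row 1, then forward-check
	// the queen in column 2, whose live domain is initially {1, 2, 3, 4}.
	q2 := map[int]bool{1: true, 2: true, 3: true, 4: true}
	pruneDiagonals(q2, 1, 2, 1)
	fmt.Println(len(q2)) // row 2 is gone; a separate row constraint
	// would also remove row 1
}
```

In a full solver this helper would run for every constraint involving the assigned queen, updating the live domains of possibly many other queens.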

The more general case can involve constraints over many variables, like the all-different constraint, and partial live-domain changes rather than a reduction to a single assigned value. Here we need to rely on the supports of a constraint. A support is a set of assignments of values to all variables involved in a constraint that does not violate the constraint. Coming back to constraint propagation, the last step of forward checking means that as we examine the live domains of variables, we should remove any values that can no longer be supported given the current state of the search.

Let’s explain this last bit with an example, this time using an all-different constraint defined on 4 queens, which requires all 4 queens to be assigned unique rows. Now, suppose that:

  • Initially Q1 (using the same notation as in the older post) has live domain of {1, 2, 3}, Q2 has live domain of {1, 2}, Q3 has live domain of {3, 4} and Q4 has live domain of {3}.
  • If we assign Q1 to 3, we will end up removing 3 from the live domains of Q3 and Q4.

An interesting and important consequence in this simple example is that Q4 now has an empty live domain. This is referred to as a wipeout, and it means we won’t be able to find a solution in this part of the search space and must backtrack. This is one of the key benefits of constraint propagation: by doing a bit of work up front, not only can we reduce work further down the search tree, but we can sometimes detect infeasibility (i.e., the absence of a solution) early and backtrack right away.
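A simplified forward-checking pass for all-different makes this concrete: any variable whose live domain has shrunk to a single value has that value pruned from every other live domain, and an emptied live domain signals a wipeout. This is a sketch with hypothetical names, assuming live domains are plain sets, not the propagate() code shown below:

```go
package main

import "fmt"

// forwardCheckAllDiff prunes, for every variable whose live domain is a
// single value, that value from all other live domains. It returns false
// on a wipeout (some live domain becomes empty).
func forwardCheckAllDiff(domains []map[int]bool) bool {
	for i, d := range domains {
		if len(d) != 1 {
			continue
		}
		var fixed int
		for v := range d {
			fixed = v
		}
		for j, other := range domains {
			if j == i {
				continue
			}
			delete(other, fixed)
			if len(other) == 0 {
				return false // wipeout: no solution in this subtree
			}
		}
	}
	return true
}

func set(vs ...int) map[int]bool {
	m := map[int]bool{}
	for _, v := range vs {
		m[v] = true
	}
	return m
}

func main() {
	// Q1 already assigned to row 3; Q2..Q4 as in the example above.
	domains := []map[int]bool{set(3), set(1, 2), set(3, 4), set(3)}
	fmt.Println(forwardCheckAllDiff(domains)) // false: pruning 3 wipes out Q4
}
```

Full all-different propagation is stronger than this singleton-based pruning, but even this weak form catches the wipeout in the example.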

Arc consistency is another constraint propagation technique, more powerful (and more expensive to evaluate) than forward checking. Tersely put, it:

  1. Keeps track of variables whose live domains have changed. It starts by seeding a set S of variables with the variable that was affected by the latest decision.
  2. Removes a variable from S and applies forward checking as described above. As this is done, any variables with live domain reductions are added to S.
  3. Keeps going until S becomes empty, meaning no further live domain changes happen.

Arc consistency generally leads to fewer backtracks than forward checking, since infeasibility is detected sooner; however, it requires more constraint evaluations and can be significantly slower per invocation. Whether you should use forward checking or arc consistency therefore tends to depend on the problem and on how the constraints are implemented. Benchmarking is generally the best way to choose the right propagation approach.

 1// propagate returns true when no live domains are wiped out.
 2func (s *Search) propagate(arcConsistency bool) bool {
 3  changedVars := map[*model.Variable]bool{}
 4  changedVars[s.current.decision.variable] = true
 5  for len(changedVars) != 0 {
 6    nextV := pop(changedVars)
 7    for _, c := range s.problem.ConstraintsFor(nextV) {
 8      for _, cv := range c.Variables() {
 9        liveDomainChanged := false
10        cv.LiveDomain().ForEachValue(func(value int) bool {
11          if !c.Supported(cv, value) {
 12            s.prune(cv, value)
13            liveDomainChanged = true
14          }
15          return true
16        })
17        if !liveDomainChanged {
18          continue
19        }
20        if cv.WipedOut() {
21          return false
22        }
23        if arcConsistency {
24          changedVars[cv] = true
25        }
26      }
27    }
28  }
29  return true
30}

The block above applies constraint propagation. The only difference between forward checking and arc consistency is whether changed variables are added to changedVars on lines 23-25, gated by the bool passed to propagate(): when arcConsistency is false, plain forward checking is applied instead. The key parts are:

  • Lines 3-5 initialize changedVars and keep going so long as this “set” isn’t empty.
  • Line 6 grabs the next variable nextV to process.
  • Line 7 goes through all the constraints c of nextV.
  • Line 8 goes through all variables cv of c.
  • Lines 9-22 apply the core of the forward checking algorithm described above: every value in cv’s live domain is checked for support under constraint c, unsupported values are pruned, and if cv’s entire live domain is wiped out (made empty), propagate() returns false, indicating that no solution exists in this subtree.

Hopefully by now the header image makes sense - it’s an example of applying forward checking to 4-queens after placing the first queen, Q1, on row #1. The cells marked with “X” show all the cells that Q1 attacks; red “X”s indicate the result of forward checking, while black “X”s are eliminated simply because we assigned Q1 to row #1.

Looking at how propagate() would behave for both algorithms:

  • propagate(false) applies forward checking and would proactively prune live domains of the remaining queens, as per above. It would still allow search to continue as it wouldn’t detect any wipeouts.
  • propagate(true) applies arc consistency and would go quite a bit further. It would detect that placing Q2 on row #3 is not viable, because the diagonal constraint between Q2 and Q3 is not supported by Q3’s live domain, and would thus remove row #3 from Q2’s live domain. It would then notice that Q3 cannot be on row #4, because row #4 must now be taken by Q2 (i.e., the all-different constraint isn’t supported), leaving Q3 with only row #2. Finally, it would note that Q4 cannot go on row #2, because Q3 must be on row #2, and the diagonal attack prevents Q4 from going on row #3 - thus wiping out Q4’s live domain and causing propagate() to prevent us from stepping into this subtree at all.

The interesting bit is that even in such a simple example, arc consistency can eliminate non-viable assignments earlier than forward checking, at a higher computational cost per call to propagate().