2023.8: Finite Traces: Problems With Traces

CSCI 1710

Good job on Intro to Modeling! Your next homework will be released today. It's about two things:

  • exploring the design space of physical keys; and
  • pooling knowledge to hack them, with a little help from Forge.

After that homework, your curiosity modeling assignment will begin! Think of this like a miniature final project: you'll use Forge to model something you're interested in, on a relatively small scale. We like this to be a partner project if possible, so it's worth finding partners and thinking about ideas now. If you don't know anyone in the class, we'll provide some help.

Today, we'll focus on:

  • testing with examples and assertions;
  • some of the limitations of full-trace verification; and
  • trying another approach to verification that doesn't require generating full traces.

Reminder about Examples

Where an assert or run is about checking satisfiability or unsatisfiability of some set of constraints, an example is about whether a specific instance satisfies a given predicate. This style of test can be extremely useful for checking that (e.g.) small helper predicates do what you expect.

Why use example at all? A couple of reasons:

  • It is often much more convenient (once you get past the odd syntax) than adding one sigs or some quantification for every object in the instance, provided you're trying to describe an instance rather than a property that defines a set of them---which becomes a better option as models become more complex.
  • Because of how it's compiled, an example can sometimes run faster than a constraint-based equivalent.

You may be wondering whether there's a way to leverage that same speedup in a run command. Yes, there is! But for now, let's get used to the syntax just for writing examples. Here are some, well, examples:

pred someXTurn {some s: State | XTurn[s]}
example emptyBoardXturn is {someXTurn} for {
  State = `State0
  no `State0.board
}

Here, we've said that there is one state in the instance, and its board field has no entries. We could have also just written no board, and it would have worked the same.

-- You need to define all the sigs that you'll use values from
pred someOTurn {some b: Board | OTurn[b]}
example xMiddleOturn is {someOTurn} for {
  Board = `Board0
  Player = `X0 + `O0
  X = `X0
  O = `O0
  `Board0.board =  (1, 1) -> `X0
}

What about assertions, though? You can think of assertions as generalizing examples. I could have written something like this:

pred someXTurn {some b:Board | xturn[b]}
pred emptySingleBoard {
  one b: Board | true
  all b: Board, r,c: Int | no b.board[r][c]
}
assert emptySingleBoard is sufficient for someXTurn  

That's pretty coarse-grained, though. So let's write it in a better way:

pred emptyBoard[b: Board] { all r, c: Int | no b.board[r][c] }
assert all b: Board | emptyBoard[b] is sufficient for xturn[b]

Notice how, by adding variables to the assertion, we're able to write less-verbose assertions and re-use our predicates better.

But if examples are faster, why use assertions?

First, examples aren't always faster. There are also some models we'll write later where example isn't supported. And, of course, as the model becomes more complex, the example becomes longer and longer as you try to define the value of all fields.

But there's a more important reason: assertions can express properties. Because they can state arbitrary constraints, there's an analogy to property-based testing: where examples are like traditional unit tests, assertions are like the checker predicates you wrote in Hypothesis.

So there's a role for both of them.

Traces: Good and Bad

When we stopped last time, we'd finished our model of tic-tac-toe. We could generate a full game of up to 10 board states, and reason about what was possible in any game.

This works great for tic-tac-toe, and also in many other real verification settings. But there's a huge problem ahead. Think about verifying properties about a more complex system---one that didn't always stop after at most 9 steps. If we want to confirm that some bad condition can never be reached, how long a trace do we need to check?

Think, then click!

What's the longest (simple--i.e., no cycles) path in the transition system? That's the trace length we'd need.

That's potentially a lot of states in a trace. Hundreds, thousands, billions, ... So is this entire approach doomed from the start?

Note two things:

  • Often there are "shallow" bugs that can be encountered in only a few steps. In something like a protocol or algorithm, scaling to traces of length 10 or 20 can still find real bugs and increase confidence in correctness.
  • There's more than one way to verify. This wasn't the only technique we used to check properties of tic-tac-toe; let's look deeper at something we saw awhile back.

Proving Preservation Inductively

Let's turn to a programming problem. Suppose that we've just been asked to write the add method for a linked list class in Java. The code involves a start reference to the first node in the list, and every node has a next reference (which may be null).

Here's what we hope is a property of linked lists: the last node of a non-empty list always has null as its value for next.

How can we prove that our add method preserves this property, without generating traces of ever-increasing length? There's no limit to how long the list might get, and so the length of the longest path in the transition system is infinite: 0 nodes, 1 node, 2 nodes, 3 nodes,...

This might not be immediately obvious. After all, it's not as simple as asking Forge to run all s: State | last.next = none. (Why not?)

Think, then click!

Because that would just be asking Forge to find us instances full of good states. Really, we want a sort of higher-level all, something that says: "for all runs of the system, it's impossible for the run to contain a bad linked-list state.

This simple example illustrates a central challenge in software and hardware verification. Given a discrete-event model of a system, how can we check whether all reachable states satisfy some property? In your other courses, you might have heard properties like this called invariants.

One way to solve the problem without the limitation of bounded-length traces goes something like this:

  • Step 1: Ask Forge whether any starting states are bad states. If not, then at least we know that executions with no moves obey our invariant. (It's not much, but it's a start---and it's easy for Forge to check.)
  • Step 2: Ask Forge whether it's possible, in any good state, to transition to a bad state.

Consider what it means if both checks pass. We'd know that runs of length cannot involve a bad state. And since we know that good states can't transition to bad states, runs of length can't involve bad states either. And for the same reason, runs of length can't involve bad states, nor games of length , and so on.

How do we write this in Forge?

Not just Forge, but any other solver-based tool, including those used in industry!

Modeling linked lists in Forge is very doable, but more complicated than I'd like to try to do in 10 minutes of class. So let's do this with the tic-tac-toe model for today. A balanced state is a good state.

This assertion checks for counterexamples to the first component: are there any bad states that are also starting states?

assert all b: Board | initial[b] is sufficient for balanced[b]
  for 1 Board, 3 Int

Notice that we didn't need to use the next is linear annotation, because we're not asking for traces at all. We've also limited our scope to exactly 1 Board. We also don't need 4 integer bits; 3 suffices. This should be quite efficient. It should also pass, because the empty board isn't unbalanced.

Now we can ask: are there any transitions from a good state to a bad state? Again, we only need 2 boards for this to make sense.

pred moveFromBalanced[pre: Board, row, col: Int, p: Player, post: board] {
  balanced[pre]
  move[pre, row, col, p, post]
}
assert all pre, post: Board, row, col: Int, p: Player | 
  moveFromBalanced[pre, row, col, p, post] is sufficient for balanced[post]
    for 2 Board, 3 Int

If both of these pass, we've just shown that bad states are impossible to reach via valid moves of the system. Does this technique always work? We'll find out next time.

Aside: Performance

That second step is still pretty slow on my laptop: around 10 or 11 seconds to yield UNSAT. Can we give the solver any help? Hint: is the set of possible values for pre bigger than it really needs to be?

Think, then click!

If we assume the pre board is well-formed, we'll exclude transitions involving invalid boards. There are still a lot of these, even at 3 Int, since row and column indexes will range from -4 to 3 (inclusive).

But is it really safe to assert wellformed[pre]? Good question! Is there a way we could check?