P-Gen: Precondition Generator
P-Gen is a precondition generator tool. Given a procedure containing a specified assertion, P-Gen allows to infer a precondition for the procedure that implies the validity of the specified assertion. While most of the existing techniques permit to compute a sufficient precondition, P-Gen computes the exact (necessary and sufficient) precondition for the assertion.
The core idea behind P-gen is a counterexample guided abstraction refinement algorithm where both the set of safe and unsafe sates are abstracted. Both sets are refined at each iteration until they become disjoint. The final set of safe states is then exact: It includes all safe terminating runs and excludes all unsafe runs.
In addition to its ability to generate exact preconditions, P-Gen has other original features such as finding intrinsic counterexamples. These are procedure traces that do not depend on procedure parameters or global variables. P-Gen can also reason about collections of array elements as it is able to infer existentially and universally quantified preconditions.
Here you can download a Linux 64 bit binary version of P-Gen and the tutorial below in PDF format. .
P-Gen is written by Mohamed Nassim Seghir.
1. Generating Simple Preconditions
Let us consider the routine memcpy from the standard string library, a possible implementation for this routine is illustrated in Figure 1.
Figure 1. Program that copies a memory region to another memory region.
The routine above copies n bytes from the object pointed by s2 into the object pointed to by s1. We want to verify that the access to the destination memory region does not go beyond the buffer reserved for the object pointed by s1. As we do not know the upper bound for the memory region reserved for the destination object (pointed by s1), we express it using the variable s1_UB which is just used for the verification purpose. This variable can have an arbitrary value as it is not initialized. The access violation to the buffer pointed by s1 is expressed by the conditional statement which redirects the control ot the error location ERROR_1. To compute the precondition under which the procedure memcpy is safe, i.e., ERROR_1 is not reachable, we call P-Gen via the command below
./p-gen --reach --mainproc memcpy --file ../examples/memcpy.c --beyondwp --precond --noinline --tprover z3
The option reach indicates that we are performing a reachability analysis. Options mainproc and file specify the procedure and file name respectively. The most important options here are precond and beyondwp. The first one tells P-Gen to generate the precondition and the second one is for using the enhanced refinement mechanism. Finally, option noinline is to avoid renaming local variables of the procedure and tprover is for specifying the theorem prover to use (z3 in our case). P-Gen generates the precondition below
The first disjunct in the precondition represents the case where we skip the loop. The second disjunct represents the constraint that dst_UB must fulfil when entering the loop.
2. Generating Quantified Preconditions
Figure 2. Program that computes the length of a 0 terminal string.
Let us now consider the procedure strlen, illustrated in Figure 2. This procedure returns the length of a 0 terminal string. The safe access for each element of the string pointed by s is modelled by the conditional statement which leads to the error label ERROR_1. As in the previous example, here also we use s_UB to represent the upper bound for the memory occupied by the object which is pointed by s. In this case, the bound for the buffer is implicitly given as part of its content (‘\0’). We call P-Gen with the command below
./p-gen --reach --mainproc strlen --file ../examples/strlen.c --beyondwp --precond --noinline --tprover z3 --stringabs --dontabsarray --expheap
We have three new options: stringabs, dontabsarray and expheap. The option stringabs tells P-Gen to abstract string and character constants using freshly generated identifiers. This is required as we do not have a dedicated decision procedure. The option expheap is used to explicitly represent the heap as an array. Finally, dontabsarray is used to avoid abstracting arrays in conditional statements. As the heap is modelled as an array and it is relevant to the property we want to verify, it makes sense not to abstract arrays. After calling P-Gen, we obtain the following precondition
(-ABS_STR_1+ ACS_HEAP[ s] == 0 || !FORALL( -ABS_STR_1+ ACS_HEAP[ UNI_1] != 0 , UNI_1>= s+ 1 , UNI_1<= s_UB ) ) .
The identifier ABS_STR_1 is used to model the character '\0'. The heap is explicitly modeled as an array with the name ACS_HEAP, hence ACS HEAP[s] models *s. The computed precondition says that either s points to the character '\0' or the character '\0' must be pointed to by an element from the interval [s + 1. .s_UB].
3. Displaying Internal Information
One can display some additional internal information stored by P-Gen. The option printcmd allows to display the internal representation of programs as transition constraints. We can also display the abstraction map (location to predicate set) using option abstmap.