"Numerical abstractions, in our context, are finite representations of (possibly infinite) sets of points in some vector space.
Many applications in the area of analysis and verification require information about the numerical values of a number of parameters. By identifying each parameter with a unique dimension of the vector space, an assignment of numerical values to these parameters will specify a point in this space. Therefore, by defining sets of points in this space, a numerical abstraction can provide conservative approximations of the values of the parameters and some of the relations that must hold between these values. Although these parameters do not necessarily have to be program variables, for simplicity, here we refer to these as variables.
For instance, consider two variables x and y. In the following graph, the axes are labeled by x and y and the actual values this pair of variables can take are indicated by points in the xy-plane.
Numerical abstractions can come in many flavors; for example, a simple sign abstraction approximates the possible values of each variable as positive or negative, a bounding n-dimensional convex polyhedron defines a region within which the values of the vectors must lie, or a grid (i.e., lattice) of discrete regularly spaced points approximates their distribution.
The PPL currently supports, or is interested in providing in the immediate future, a range of numerical abstractions, all of which can be represented by means of some form of linear expression.
To give an overview of these abstractions, we subdivide them into four categories:
- polyhedral, which abstracts a set of values using the outer limits or bounds within which the values must lie;
- grid, which abstracts the pattern of distribution of these values;
- grid-polyhedral, which abstracts a set of values by providing limits on their distribution; and
- polyhedral grid, which abstracts a set of values as a distribution of copies of a bounded region of space.
In each case, the abstraction often comes in both non-relational and relational form; that is for instance, in relational form, the limits on the values of one variable may depend on the values of one or more other variables.
Abstractions can be refined. There are several different constructions that can be applied to pre-existing abstractions so that the enhanced abstraction can represent additional dependencies between the values [Bag97, Bag98]. One such construction currently supported by the PPL is the powerset construction [Bag98, BHZ03];this upgrades an abstraction to one that can represent (finite) disjunctive information. By using the previous example representing the actual values that variables <I>x</I> and <I>y</I> can take, we now illustrate some interesting and important instances of each of these categories and, in some cases, the powerset construction on them.
A very rough approximation of the points may be given by their signs [CC79], where each coordinate is approximated as being positive or negative. In our example, for all the points, both the x and y coordinates are positive so that the set of points can be safely approximated by the top right-hand quadrant.
The box abstraction [CC76], consists of a set of intervals defining a box. In this example, we approximate the given points by their smallest bounding box.
Powerset of Boxes
The box abstraction can be upgraded to a more precise abstraction by using the powerset construction; here we have a set of three boxes that together contain all the points.
The box abstraction shown above is non-relational. Several numerical abstractions have been defined that can represent relations between, at most, two variables. One of the simplest of these uses bounded differences which are constraints on two variables, say x and y, that provide an upper bound for their difference x - y. In a bounded difference abstraction, interval constraints as well as the actual bounded differences are used to describe an abstraction of a set of points. Bounded differences have been widely used, since the '80s, in the field of artificial intelligence: for references see, e.g., [Bag97] or E. Davis, Constraint propagation with interval labels, Artificial Intelligence, 32(3):281-331, 1987. Below we use our running example to illustrate this abstraction.
In the octagon abstraction, for any pair of variables, x and y, constraints that provide an upper bound for both the sum x + y and the difference x - y are allowed [BK89, Min01].Octagons were called simple sections in [BK89]. Below we illustrate how the points in our example may be approximated by an octagon.
The convex polyhedra approximation [CH78] uses a finite set of linear inequalities (with no restriction on their coefficients) all of which are satisfied by the concrete values. In general, if the set of concrete values is infinite, then there may not be a best approximating polyhedron (this is the case, for instance, when the set of concrete values is a circle in the plane).
Powerset of Convex Polyhedra
The simplest discrete distribution is that described by the integer grid; the set of points in the vector space with integer coordinates. (Although traditionally the terminology “lattice” instead “grid” has been used in this context, to avoid any confusion with the use of “lattice” in its set-theoretic context, we prefer the terminology “grid” for sets of vectors that can represent discrete and repetitive information.) The integer grid is easily generalised to integral grids that consist of all integral affine transformations of the integer grid [NW88, Sch99] and can be used when the allowed values are distributed at regular intervals.
In [Gra91, Gra97], and more recently, in [BDH+07], it is shown how these grids, together with some simple generalizations (where the points may be rational rather than integral, where congruence relations may be equalities and where copies of a subspace may replace the rational points) can be represented very naturally by means of a conjunction of a set of (rational) linear congruence relations of the form <a,x> = c mod d.
In the next two illustrations we first provide a non-relational grid and secondly, a relational grid that approximate the given set of points.
(Non-relational) Integral Grid
(Relational) Integral Grid
We use the terminology grid-polyhedron for the intersection of a polyhedron (or, possibly, a finite set of polyhedra) with a grid. As grid-polyhedra can represent information about the distribution of the discrete values as well the outer limits of the range in which they must lie, it can be used in analyzers for automatic parallelization and parallel VLSI synthesis and so forth. The Z-polyhedra introduced in [Anc91] provide a simple form of this abstraction where the elements are defined as the intersection of a polyhedron (or a set of polyhedra) with an integral grid (see also [NR00, QRR96]).
In the next two illustrations, we combine the previous example of a non-relational grid and then that of the relational grid with the polyhedron.
(Non-Relational) Integral Grid-Polyhedron
(Relational) Integral Grid-Polyhedron
Polyhedral Grid Abstractions
Another way in which the limit and distribution abstractions may be combined is by attaching, at each point of the grid, copies of a polyhedron (or any other appropriate abstraction). In particular, [Mas93] introduced trapezoidal grids; which are grids whose points are replaced by copies of an n-dimensional parallelogram. The advantage of this particular combined abstraction is that the elements can be represented very concisely by a conjunction of congruences where the constant terms in the congruence relations defining the grids are replaced by intervals (i.e., <a,x> = [cl, cu] mod d ).