Introduction

Act is a high level specification language for evm programs. The core aim is to allow for easy refinement. We want to make it as easy as possible for development teams to define a high level specification, which can then either be used "upwards" to prove higher level properties or "downwards" to demonstrate that an implementation in EVM bytecode conforms to the spec.

Act currently integrates with the following tools:

  • Hevm: automated refinement proof between the act spec and a given evm bytecode object
  • SMT: automated proof of invariants and postconditions
  • Coq: manual proof of high level properties against a model derived from the Act spec

Overview

At a very high level Act is a kind of mathy english over the EVM, where contracts are defined as a set of pure functions taking a given EVM state (i.e. storage & blockchain context) and some calldata and producing a new EVM state ((EVM, Calldata) -> EVM).

A specification of a contract written in Act consists of a constructor and a set of behaviours:

The constructor specification defines the structure of the contract’s state, the initial value of the state, and a list of invariants that the contract should satisfy.

Each behaviour specification determines how a contract method updates the state, the method’s return value (if any), and any conditions that must be satisfied in order for the state update to be applied.

Alternatively, they can be thought of as an initial state and a set of state transitions, determining an inductively defined state transition system.

Types

The types of Act consist of three basic primitives: Integers, Booleans and Bytestrings. Integers are unbounded, with true integer operations. However, as our integer expressions will often represent words in the EVM, we allow ourselves a slight abuse of notation and denote by uintN/intN integers together with the constraint that the value fits into a uintN/intN. If any operation could over- or underflow this bound, the constraint will be unfulfilled and the specification will fail to prove.

Using conventional ABI types for typing also allows us to specify function signatures in a concise way. As an example, consider this specification of a trivial contract that adds two numbers and stores the result on chain:

constructor of Add
interface constructor()

creates

  uint result := 0

behaviour add of Add
interface add(uint x, uint y)

iff in range uint

   x + y

storage

  result => x + y

returns x + y

Expressed in English and math, this specification would read:

The contract Add has a single state variable, named result, which is an integer such that 0 <= result < 2^256.

Given any pair of integers x and y, s.t. 0 <= x < 2^256 and 0 <= y < 2^256, an ABI encoded call to the contract Add with the signature add(uint256,uint256), with its respective arguments named x and y, will:

  • store x + y in result and return x + y if 0 <= x + y < 2^256
  • revert otherwise

Syntax of the Act language

A specification of a contract written in act consists of a constructor and a set of behaviours.

The constructor specification defines the structure of the contract state, the initial value for the state, and a list of invariants that the contract should satisfy.

Each behaviour specification determines how a contract method updates the state, and its return value, and the conditions that must be satisfied in order for it to be applied.

Alternatively, they can be thought of an initial state and a set of state transitions, determining an inductively defined state transition system.

The specifications can be exported to different backends in order to prove that the claims that are being made hold with respect to an implementation.

This document is a high level description of the syntax of act specifications. For a more formal treatment of the syntax, study the definitions of RefinedAst.hs.

Types

The types of Act consist of three basic primitives: Integers, Booleans and ByteStrings. Integers are unbounded, with true integer operations. However, as our integer expressions will often represent words in the EVM, we allow ourselves a slight abuse of notation and denote by uintN/intN integers together with the constraint that the value fits into a uintN/intN.

Using conventional ABI types for typing also allows us to specify function signatures in a concise way.

As an example consider the specification of overflow safe addition:

behaviour add of SafeMath
interface add(uint x, uint y)

iff in range uint

   x + y

returns x + y

In more verbose terms, this specification would read:

Given any pair of integers x and y, s.t. 0 <= x < 2^256 and 0 <= y < 2^256, an ABI encoded call to the contract SafeMath with the signature add(uint256,uint256), and x and y, will:

  • return x + y if 0 <= x + y < 2^256
  • revert otherwise

Contructor

A constructor specification is indicated by the constructor of <contractName> header, as in:

constructor of A

A constructor section consists of the following fields:

interface

Specifying the arguments the constructor takes. Example:

interface constructor(address _owner)

iff (optional)

The conditions that must be satisfied in order for the contract to be created. These must be necessary and sufficient conditions, in the sense that if any condition isn't met, the contract cannot be created.

Example:

iff

CALLER == _owner

creates

Defines the storage layout of the contract.

Example:


creates

  uint256 totalSupply := _totalSupply
  mapping(address => uint) balanceOf :=  [CALLER := _totalSupply]

External storage changes (optional)

If the contract creation updates the state of other contracts, they should be specified as:

contract B
   x => x + 1

contract C
   y => y - 1

Ensures (optional)

A list of predicates that should hold over the poststate. All references to storage variables in ensures sections need to specify whether they talk about the variable's value in the pre- or the poststate, by using pre(x) or post(x). Example:


ensures

   (post(x) == _x) or (pre(x) == _x)

Invariants (optional)

A list of predicates over the state of the contract that should hold before and after every contract invocation:

invariants

  totalSupply = initialSupply
  _k = x * y

Behaviour

A behaviour specification is indicated by the behaviour of <contractName> header, as in:

behaviour of A

and consists of the following fields:

interface

Specifying which method the behaviour refers to. Example:

interface transfer(address to, uint value)

iff (optional)

The conditions that must be satisfied in order for the transition to apply. These must be necessary and sufficient conditions, in the sense that if any condition isn't met, a call to the given method must revert.

Example:

iff

  balanceOf[CALLER] >= value
  CALLVALUE == 0

cases and state updates

Each behaviour must specify the state changes that happens a result of a valid call to the method, and its return argument, if any. All references to storage variables in returns arguments need to specify whether they talk about the variable's value in the pre- or the poststate, by using pre(x) or post(x).

This can be specified in two ways. Either directly, as in:


storage

  balanceOf[CALLER] => balanceOf[CALLER] - value
  balanceOf[to] => balanceOf[to] + value

returns post(balanceOf[CALLER])

or split between a number of cases, as in:

case to == CALLER:

   returns -1

case to =/= CALLER:

   storage
     balanceOf[CALLER] => balanceOf[CALLER] - value
     balanceOf[to] => balanceOf[to] + value

   returns post(balanceOf[CALLER])

Note that currently, either a storage or returns section, or both is required in every spec.

External storage changes (optional)

If the contract method updates the state of other contracts, they should be specified as:

contract B
   x => x + 1

contract C
   y => y - 1

Ensures (optional)

A list of predicates that should hold over the poststate. Example:


ensures

   x == _x

Referencing Storage Variables

Storage locations that are read and used in other expressions must be declared in a storage block.

Some examples:

storage
  x => y
  y
storage
  x
  y

returns x + y
``

Backends

Act currently implements 3 backends natively:

  • SMT: automated proof of postcondition and inductive invariants against the spec
  • Coq: manual proof of arbitrary properties against the spec
  • Hevm: automated equivalence proof between the spec and an implementation in evm

SMT Backend

The SMT backend for Act is currently able to automatically prove that:

  • post conditions are implied by pre conditions
  • contract level invariants hold

To perform these proofs, you can simply run act prove --file <PATH_TO_SPEC> against your spec.

act prove also accepts the following configuration flags:

  • --solver: you can choose to use cvc4 or z3 as the solver backend. The default is z3. Sometimes cvc4 may be able to prove things that z3 cannot (and vice versa). You can also prove the same properties with multiple solvers to gain confidence that the proofs are not affected by a bug in the solver itself.
  • --smttimeout: the timeout given for each smt query. This is set to 20s by default.
  • --debug: this prints the raw query dispatched to the SMT solver to stdout. Note that these queries are machine generated by the sbv library and so are fairly challenging to inspect manually.

Examples

Automatic Market Maker

As an example consider the following specification of a simplified uniswap style automatic market maker. For the purposes of this example we focus only on the core logic, and ignore e.g. interactions with the underlying tokens.

The Amm has two state variables representing its underlying token balances, reserve0 and reserve1. These are initialized to 1000 units each upon creation.

We define a single invariant, that the product of the two reserves should never decrease. This is an important safety property, if it is violated, an attacker will be able to execute a sequence of trades that can drain all funds from the exchange.

behaviour init of Amm
interface constructor()

creates

    uint256 reserve0 := 1000
    uint256 reserve1 := 1000

invariants

    1000 * 1000 <= reserve0 * reserve1

Next, we define the two swap methods, that allow a user to exchange tokens. These methods take an amount (amt) of input reserves, and the Amm will adjust the size of the other reserve according to the x * y == k constant product formula.

behaviour swap0 of Amm
interface swap0(uint256 amt)

iff in range uint256

    reserve0 + amt

storage

    reserve0 => reserve0 + amt
    reserve1 => (reserve0 * reserve1) / (reserve0 + amt)
behaviour swap1 of Amm
interface swap1(uint256 amt)

iff in range uint256

    reserve1 + amt

storage

    reserve0 => (reserve0 * reserve1) / (reserve1 + amt)
    reserve1 => reserve1 + amt

If we run act prove against the above spec, we find that there is in fact an error. If the Amm starts with reserve0 == 1 and reserve1 == 1,000,000, and swap0 is called with amt == 2, the final state of the Amm will be reserve0 == 3 and reserve1 == 333,333. In this case 3 * 333,333 is 999,999, and in fact the product of the reserves has decreased slightly due to imprecision introduced by the EVM's flooring division.

A safe specification for the swap methods is as follows. Notice the extra +1 added to the output reserve in both cases. With this implementation, the rounding error is now in favor of the pool instead of the trader, and the contract is now safe against this particular attack.

If we again run act prove against the fixed specification, we see that the invariant holds for all possible executions of the contract.

behaviour swap0 of Amm
interface swap0(uint256 amt)

iff in range uint256

    reserve0 + amt

storage

    reserve0 => reserve0 + amt
    reserve1 => (reserve0 * reserve1) / (reserve0 + amt) + 1
behaviour swap1 of Amm
interface swap1(uint256 amt)

iff in range uint256

    reserve1 + amt

storage

    reserve0 => (reserve0 * reserve1) / (reserve1 + amt) + 1
    reserve1 => reserve1 + amt

Underpowered Invariant

Due to the inductive nature of the proof, there are some true invariants that the SMT backend is unable to prove. For example, consider the following state machine:

behaviour init of C
interface constructor()

creates
    uint x := 0

invariants
    x < 9

behaviour f of C
interface f()

case x == 0:

    storage
        x => 1

behaviour g of C
interface g()

case x == 1:

    storage
        x => 2

behaviour j of C
interface j()

case x == 7:

    storage
        x => 100

The contract C can never be in a state were x == 7 and so the write of 100 to x in j() can never occur, however if we run act prove against this specification, this exact case (x == 7) is found as a counterexample.

This is due to the inductive nature of the proof: act checks that the invariant holds after running the constructor, and then for each method assumes that the invariant holds over the pre state and checks that the invariant holds over the post state.

In the case above, the invariant states that x < 9, and if this is assumed as a precondition, then the x == 7 branch in j() is still reachable.

We can fix this by strengthening the invariant to make the x == 7 branch unreachable, in the case of the spec above, an invariant of x < 7 is sufficient, although x < 3 is the strongest invariant of this form that holds over the full transition system.

Limitations

Although SMT solvers are powerful tools there are some important limitations to be aware of:

  • The invariants proven can only be inductive in nature, and can only be expressed in terms of state variables and constructor arguments.
  • SMT solvers do not support the exponentiation operation.
  • The solver may be unable to prove properties about behaviours making extensive use of non linear arithmetic (i.e. multiplication or division by a symbolic value). This is an inherent limitation of all SMT solvers.

Implementation Strategy

For each invariant claim, Act builds an individual SMT query for each behaviour in the transtion system. If there are no invariant claims defined, then act will insert an implicit invariant of True, meaning that the postconditions are still checked.

If the behaviour is a constructor, the query asks the solver to find instances where:

  • the preconditions hold
  • the storage values in the poststate match those specified by the creates block of the constructor
  • the invariant does not hold over the post state or the postconditions do not hold over the poststate

If the behaviour is a method, the query asks the solver to find instances where:

  • the invariant holds over the pre state
  • the preconditions hold
  • all storage variables in the prestate are within the range specified by their type
  • a predicate relating the pre and post state according to the specification in the storage block holds
  • the invariant does not hold over the post state or the postconditions do not hold over the poststate

If all of the queries for an invariant claim return unsat, then two properties have been proved about the transition system:

  1. The invariant holds over the post state
  2. The postconditions hold for every method level behaviour

Coq

While the automated proof backend is quite capable, there are still many properties that are too challenging for automated tools. For this reason Act allows exporting the transition system to the Coq proof assistant, where manual proofs of arbitrary complexity can be carried out.

A proof assistant provides tools that help to construct proofs. Coq, in particular, is highly interactive. The user typically builds proofs step by step, with the software giving feedback as the proof progresses.

The requirements on proofs in a system like Coq, Isabelle, or Lean are quite strict. These tools only accept proofs that are algorithmically verifiable to be valid series of applications of the system’s inference rules. This is generally stricter than what is typically expected of pen and paper proofs, which often omit tedious details in the interest of clarity and concision.

The verification of these proofs is performed in a minimal and well-audited kernel. Although occasionally bugs have been found in Coq’s and other systems’ kernels, a proof in these systems is generally quite strong evidence of correctness.

A Brief Introduction to Proof in Coq

Coq is a complex system with a steep learning curve, and while a full tutorial on programming in Coq is out of the scope of this blog post, we can give a little taste of how things work. For a more thorough introduction, the books Software Foundations and Certified Programming With Dependent Types are both excellent. Software Foundations in particular is a great introduction for users with little experience in the fields of formal logic and proof theory.

The Coq system is composed of three languages: a minimal functional programming language (Gallina), a tactics language for proof construction (Ltac), and a “vernacular” for interaction with the kernel. Let’s start with the very basics: defining the natural numbers and proving something about addition.

We start by defining the type of natural numbers. There are infinitely many natural numbers, so of course they must be defined inductively. In fact, all type definitions are done with the Inductive vernacular command, even if they are not in fact inductive. Coq’s Inductive is analogous to Haskell’s data and OCaml’s type (with the added power of dependent types).

We define two constructors: O, representing 0, and S, which when applied to the natural number n produces the representation of the number n + 1 (S as in "successor"). To give a concrete example, 3 would be represented in this encoding as S (S (S 0))) i.e 1 + (1 + (1 + 0)).

Inductive nat : Type :=
  | O
  | S (n : nat).

This is an example of a unary number representation. It can often be helpful to represent numbers this way, since the inductive nature of the definition lends itself to inductive proof techniques.

Let’s continue by defining addition over our nat type:

Fixpoint plus (n : nat) (m : nat) : nat :=
  match n with
  | O ⇒ m
  | S n' ⇒ S (plus n' m)
  end.

Here we define a recursive function (a Fixpoint) that takes two numbers n and m and returns the sum of these two numbers. The implementation is defined recursively with pattern matching. You might think of this definition as “unwrapping” each application of S from the first argument until we reach its O. Then we start wrapping the second argument in the same number of Ss.

Now we’re ready to prove something! Let’s prove that 0 + n == n:

Theorem plus_O_n :
  forall n : nat, plus O n = n.
Proof.
  intros n. simpl. reflexivity.
Qed.

We first define our theorem and give it a name (plus_O_n). Then we define the proof goal, in the form of a dependent type. We claim that for all n, where n is an instance of our nat type, 0 + n is equal to n. Finally, we construct a proof, in the form of a series of tactics. Tactics may implement either backwards inference (transforming the goal) or forwards inference (transforming evidence).

The best way to understand the system is to run the software yourself, and play around with the various tactics. In this case the goal is simple enough; once we’ve specified that the proof will be on n, Coq is able to simplify plus O n into n, leaving us the goal n = n. This turns out to be true by the definition of equality, and we invoke definitional equality by reflexivity.

More complicated proofs do not typically require proving basic facts about arithmetic, because Coq ships a substantial standard library of useful definitions and theorems. The above example hopefully serves to illustrate the formal nature of proof in these systems. In many cases it can be surprisingly hard to convince the kernel of the correctness of a statement that seems “obviously” true.

Act Export

Let’s take a look at using Coq to prove properties about a specification that is too difficult for the SMT backend. The following defines a contract that implements exponentiation via repeated multiplication. The contract is initialized with a base (b) and an exponent (e). exp() can then be repeatedly called until e is 1, and the result can then be read from the storage variable r. While obviously artificial, this example does highlight a key shortcoming of the SMT based analysis: exponentiation with a symbolic exponent is simply inexpressible in the smt-lib language used by all major SMT solvers, and so any contract making use of exponentiation where the exponent is a variable of some kind (e.g. calldata, storage) will be impossible to verify using SMT. Coq has no such restrictions, and we can export the spec below and prove correctness there.

constructor of Exponent
interface constructor(uint _b, uint _e)

iff

    _e > 0

creates

    uint b := _b
    uint e := _e
    uint r := _b
behaviour exp of Exponent
interface exp()

iff

    e > 1

iff in range uint

    r * b
    e - 1

storage

    r => r * b
    e => e - 1
    b

You can export the spec into Coq by running act coq --file Exponent.act. This will create a file called Exponent.v which contains a model of the above Act specification in Coq:

(* --- GENERATED BY ACT --- *)

Require Import Coq.ZArith.ZArith.
Require Import ActLib.ActLib.
Require Coq.Strings.String.

Module Str := Coq.Strings.String.
Open Scope Z_scope.

Record State : Set := state
{ b : Z
; e : Z
; r : Z
}.

Definition exp0 (STATE : State)  :=
state (b STATE) (((e STATE) - 1)) (((r STATE) * (b STATE))).

Definition Exponent0 (_b : Z) (_e : Z) :=
state (_b) (_e) (_b).

Inductive reachable  : State -> State -> Prop :=
| Exponent0_base : forall (_b : Z) (_e : Z),
     (_e > 0)
  -> ((0 <= _b) /\ (_b <= (UINT_MAX 256)))
  -> ((0 <= _e) /\ (_e <= (UINT_MAX 256)))
  -> reachable (Exponent0 _b _e) (Exponent0 _b _e)

| exp0_step : forall (BASE STATE : State),
     reachable BASE STATE
  -> ((e STATE) > 1)
  -> ((0 <= ((r STATE) * (b STATE))) /\ (((r STATE) * (b STATE)) <= (UINT_MAX 256)))
  -> ((0 <= ((e STATE) - 1)) /\ (((e STATE) - 1) <= (UINT_MAX 256)))
  -> ((0 <= (r STATE)) /\ ((r STATE) <= (UINT_MAX 256)))
  -> ((0 <= (e STATE)) /\ ((e STATE) <= (UINT_MAX 256)))
  -> ((0 <= (b STATE)) /\ ((b STATE) <= (UINT_MAX 256)))
  -> reachable BASE (exp0 STATE )
.

Let’s break this down a bit. We have a definition of contract storage State, which consists of three variables b, e and r, all of type Z. Z is an integer type using a binary encoding from the ZArith library bundled with Coq.

Next we have exp0, which defines how the state is updated by the exp behaviour, and Exponent0 which defines how the state variables are initialized by the constructor arguments.

Finally we have an Inductive Proposition reachable that defines the conditions under which a certain state is reachable from another. There are two parts to this definition:

  • Exponent0_base: states that given two integers _b and _e, the initial state is reachable
  • from the initial state if _e and _b are in the range of a uint256 and _e is greater than
  • 0. exp0_step: states that for a pair of states BASE and STATE, exp0 STATE (i.e. the
  • result of calling exp() against an arbitrary contract state) is reachable from BASE if STATE
  • is reachable from BASE, all the state variables in STATE (e, b, r) are within the range of a
  • uint256, the result of the calculations r * b and e - 1 are within the range of a uint256,
  • and e is greater than 1.

This gives us a pair of inference rules that we can use to prove facts about the set of reachable states defined by the specification for the Exponent contract.

The core fact that we wish to prove is that when e is 1, r is equal to b ^ e. This can be expressed in Coq as:

forall (base, s : State), reachable base s -> e s = 1 -> r s = (b base) ^ (e base). Expressed more verbosely: for all states base and s, if s is reachable from base, and the value of e in s is 1, then the result variable r in s is equal to b from base raised to the power of e from base.

The full proof is reproduced below. While an explanation of each step is out of scope for this post (and is anyway best made with the proof loaded into an interactive instance of the Coq prover like proof general or CoqIde), we can give a broad strokes overview.

We must first define a helper fact pow_pred which simply states that given two integers a and e, if e is greater than 0 then a * a ^ (e - 1) is equal to a ^ e. This fact is needed in the later steps of the proof. The next step is to define a loop invariant for exp() (i.e. a fact that is true before and after each loop iteration). This is the Lemma invariant, which states that for every state s reachable from base, r * b ^ (e - 1) over s is equal to b ^ e over base. Intuitively, this states that the partial result calculated so far (r), multiplied by the remaining portion of the input calculation b ^ (e - 1) is equal to the final expected result. Finally, given these two intermediate facts, we can discharge a proof for the correctness of Exponent as defined above.

Require Import Exponent.Exponent.
Require Import ActLib.ActLib.
Require Import Coq.ZArith.ZArith.
Open Scope Z_scope.

Lemma pow_pred : forall a e, 0 < e -> a * a ^ (Z.pred e) = a ^ e.
Proof.
  intros.
  apply eq_sym.
  replace (a ^ e) with (a ^ (Z.succ (Z.pred e))).
  - apply Z.pow_succ_r.
    apply Zlt_0_le_0_pred.
    assumption.
  - rewrite (Z.succ_pred e).
    reflexivity.
Qed.

Lemma invariant : forall base s,
  reachable base s -> (r s) * (b s) ^ ((e s) - 1) = (b base) ^ (e base).
Proof.
  intros base s H. induction H.
  - simpl.
    rewrite Z.sub_1_r.
    apply pow_pred.
    apply Z.gt_lt.
    assumption.
  - simpl.
    rewrite <- IHreachable.
    rewrite Z.sub_1_r.
    rewrite <- (pow_pred (b STATE) (e STATE - 1)).
    + rewrite Z.mul_assoc. reflexivity.
    + apply Z.gt_lt in H0.
      apply (proj1 (Z.sub_lt_mono_r 1 (e STATE) 1)).
      assumption.
Qed.

Theorem exp_correct : forall base s,
  reachable base s -> e s = 1 -> r s = (b base) ^ (e base).
Proof.
  intros base s H He.
  apply invariant in H.
  rewrite He in H. simpl in H.
  rewrite (Z.mul_1_r (r s)) in H.
  assumption.
Qed. Check exp_correct.

While this may seem like quite a lot of work to prove what looks like a pretty simple and obvious fact it is worth noting two things:

  • A proof of this property is beyond the reach of any automated tool available today.
  • Our mind is full of hidden assumptions, and facts that may seem obvious are not always so. This is not the case for the Coq proof kernel, and once we have convinced it that something is true, we can be very sure that it really is.

Hevm

Act leverages the symbolic execution engine in hevm to provide a backend that can prove equivalence between a contract specification and an implementation of that specification in EVM.

Two claims are generated for each behaviour, Pass and Fail. The Pass claim states that if all preconditions in the iff block are true, then all executions will succeed, storage will be updated according to the storage block, and the specified return value will, in fact, be returned. The Fail claim states that should any of the preconditions be false, all executions will revert.

In both cases we begin the proof by constraining calldata to be of the form specified in the behaviours’ interface blocks, as well as making the relevant assumptions depending on whether the claim is Pass or Fail, and then symbolically executing the bytecode object with storage held to be completely abstract.

This produces a tree of potential executions where each node represents a potential branching point, and each leaf represents a possible final state of the contract after the execution of a single call.

In the case of a Fail claim, we can then check that each leaf represents a state in which execution has reverted, while for a Pass claim we can check that storage has been updated as expected, and that the contents of the return buffer matches what was specified in the behaviour’s returns block.

As an example, consider the following contract:

contract Simple {
    uint val;

    function set(uint x) external payable returns (uint) {
        require(x > 100);
        val = x;
        return x;
    }
}

We can represent this in Act as:

constructor of Simple
interface constructor()

creates

  uint val := 0
behaviour set of Simple
interface set(uint x)

iff

  x > 100

storage

  val => x

returns x

Act needs to have access to the storage layout metadata output by solc to compute the index in storage for each variable mentioned in the spec, so we need to pass a solc output json when trying to prove equivalence.

> act hevm --spec src/simple.act --soljson out/dapp.sol.json
checking postcondition...
Q.E.D.
Successfully proved set(Pass), 2 cases.
checking postcondition...
Q.E.D.
Successfully proved set(Fail), 2 cases.
==== SUCCESS ====
All behaviours implemented as specified ∎.

If we try to prove equivalence of the spec and a faulty implementation like the one below:

contract Simple {
    uint val;

    function set(uint x) external payable returns (uint) {
        require(x > 100);
        if (x == 2000) {
          val = x + 1;
        } else {
          val = x;
        }
        return x;
    }
}

Then Act will give us a counterexample showing a case where the implementation differs from the specification:

> act hevm --spec src/simple.act --soljson out/dapp.sol.json
checking postcondition...
Calldata:
0x60fe47b100000000000000000000000000000000000000000000000000000000000007d0
Caller:
0x0000000000000000000000000000000000000000
Callvalue:
0
Failed to prove set(Pass)
checking postcondition...
Q.E.D.
Successfully proved set(Fail), 2 cases.
==== FAILURE ====
1 out of 2 claims unproven:

1	Failed to prove set(Pass)

Internals

The Act AST Type

All act specs are transformed into typed AST instances. The definition of this AST type can be found in Syntax/TimeAgnostic.hs.

The top level datatype here is a Claim, and the output of the frontend states is a list of (Timed) claims. A Claim is a sum type (enum) with four potential values:

  • C: constructor
  • B: behaviour
  • I: invariant
  • S: typing information for storage variables

Pass and Fail Claims

Act specs are intended to be fully exhaustive, for this reason we generate two claims from each constructor / behaviour in the spec:

  1. Pass: The Pass claim states that if all preconditions in the iff block are true, then all executions will succeed, storage will be updated according to the storage block, and the specified return value will, in fact, be returned.
  2. Fail: The Fail claim states that should any of the preconditions be false, all executions will revert.

Taken together a succesfull proof of these claims ensures that a given bytecode object implements only the behaviour specified in act and nothing else.

Timed vs Untimed Instances

AST instances can be either Timed (explicitly timed storage references) or Untimed (implicitly timed storage references), and this distinction is reflected at the type level (the t parameter in most of the data type definitions).

In some parts of an act spec (returns and ensures blocks), storage references must be qualified with either the pre or post operator to disambiguate references to storage in the pre or post state. In other blocks storage references are either implicitly refering to both pre and post states (invariants), or to the pre state only (e.g. storage). In order to simplify implementation in the various backends, there is a frontend stage (annotate) that makes all implicit timings explicit. The type parameter allows us to enforce the invariant that all backend stages always operate on Timed AST instances.

Compilation Pipeline

The act compilation pipeline is as follows:

flowchart LR;
    Source --> Lexer
    Lexer --> Parser;
    Parser --> Type[Type Checker];
    Type --> Annotate;
    Annotate --> Enrich;
    Enrich --> SMT;
    Enrich --> Coq;
    Enrich --> Hevm;
    Enrich --> K;

Lexer

Converts the source code into a sequence of tokens that can be consumed by the parser. We use alex to generate a lexer from the specification in Lex.x.

Parser

Parses a sequence of tokens into an untyped AST. We use happy to generate a parser from the specification in Parse.y. The untyped AST is defined in Syntax/Untyped.hs.

Type Checker

The type checker takes an untyped AST, checks that none of the typing rules are violated, and produces a typed AST. The typechecker is defined in Type.hs and the core typed AST definition is in Syntax/TimeAgnostic.hs. The meaning of the timing parameter to the AST is described below in Timed vs Untimed AST instances.

Annotation

The annotate stage makes any implicit timings explicit. This stage is implemented in Syntax/Annotated.hs.

Enrichment

The enrich stage adds preconditions to all behaviour / constructor / invariant instances based on the types of all variable references. This staged is implemented in Enrich.hs.

Error Handling

Stages that need to present errors to the user should return an instance of the Error type defined in Error.hs. This is a modified instance of Data.Validation specialised to the act context. This type allows us to accumulate multiple error messsages (useful if e.g. multiple syntax errors are present), and we also have unified routines to pretty print these messages.