An In-Depth Guide to Testing Ethereum Smart Contracts

Part Seven: Stateful Testing

Jul 23, 2020

This article is part of a series. If you haven’t yet, check out the previous articles:

Part One: Why we Test
Part Two: Core Concepts of Testing
Part Three: Writing Basic Tests
Part Four: Running Your Tests
Part Five: Tools and Techniques for Effective Testing
Part Six: Parametrization and Property-Based Testing
Part Seven: Stateful Testing

Stateful testing is a more advanced method of property-based testing used to test complex systems. In a stateful test you define a number of actions that can be combined together in different ways, and Hypothesis attempts to find a sequence of those actions that results in a failure. This is useful for testing complex contracts or contract-to-contract interactions where there are many possible states.

State Machines

At the core of each stateful test is a class referred to as a state machine. This class defines the initial test state, a number of actions that shape how the test executes, and invariants that should not be violated during execution.

State machines are composed of the following components:

Rules

At the core of every state machine are one or more rules. Rules are methods of the state machine class that are very similar to @given based tests; they receive values drawn from strategies and pass them to a user-defined test function. The key difference is that where @given based tests run independently, rules can be chained together: a single stateful test run may involve multiple rule invocations, which may interact in various ways.

Any state machine method named rule or beginning with rule_ is treated as a rule:

class StateMachine:

    def rule_one(self):
        # performs a test action
        pass

    def rule_two(self):
        # performs another, different test action
        pass
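To make the naming convention concrete, here is a minimal sketch of how a runner could pick out rules by name. The `find_rules` helper and the `ruler` method are hypothetical names invented for this illustration; this is not how Brownie or Hypothesis actually collect rules.

```python
class StateMachine:
    def rule(self):
        pass

    def rule_one(self):
        pass

    def ruler(self):
        # not a rule: the name neither equals "rule" nor starts with "rule_"
        pass

    def invariant(self):
        pass


def find_rules(machine_cls):
    """Return the names of methods that follow the rule naming convention."""
    return sorted(
        name
        for name in dir(machine_cls)
        if (name == "rule" or name.startswith("rule_"))
        and callable(getattr(machine_cls, name))
    )


print(find_rules(StateMachine))
```

Only `rule` and `rule_one` are selected; `ruler` and `invariant` are left out because they do not match the pattern.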

Initializers

There is also a special type of rule known as an initializer. These are rules that are guaranteed to be executed at most one time at the beginning of a run (i.e. before any normal rule is called). They may be called in any order, or not at all, and the order will vary from run to run.

Any state machine method named initialize or beginning with initialize_ is treated as an initializer.

class StateMachine:

    def initialize(self):
        # this method may or may not be called prior to rule_one
        pass

    def rule_one(self):
        # once this method is called, initialize will not be
        # called during the test run
        pass
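The ordering guarantee can be pictured with a small simulation. This is only a sketch under assumptions: `plan_run`, `initialize_owner` and the 50% chance of calling an initializer are made up for illustration, and the real choice of calls is made by Hypothesis.

```python
import random


def plan_run(initializers, rules, rng, max_rules=5):
    """Build one hypothetical call sequence for a single test run."""
    # Each initializer is either skipped or called exactly once.
    chosen = [name for name in initializers if rng.random() < 0.5]
    rng.shuffle(chosen)  # the order may differ from run to run
    # Once the first rule is scheduled, no further initializers follow.
    rule_count = rng.randint(1, max_rules)
    calls = [rng.choice(rules) for _ in range(rule_count)]
    return chosen + calls


rng = random.Random(0)
for _ in range(3):
    print(plan_run(["initialize", "initialize_owner"], ["rule_one", "rule_two"], rng))
```

Every printed sequence starts with zero, one or two initializers, each appearing at most once, followed only by rules.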

Strategies

A state machine should contain one or more strategies, in order to provide data to its rules.

Strategies must be defined at the class level, typically before the first function. They can be given any name.

Similar to how fixtures work within pytest tests, state machine rules receive strategies by referencing them within their arguments. This is shown in the following example:

from brownie.test import strategy

class StateMachine:

    st_uint = strategy('uint256')
    st_bytes32 = strategy('bytes32')

    def initialize(self, st_uint):
        # this method draws from the uint256 strategy
        pass

    def rule(self, st_uint, st_bytes32):
        # this method draws from both strategies
        pass

    def rule_two(self, value="st_uint", othervalue="st_uint"):
        # this method draws from the same strategy twice
        pass
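If the lookup feels a bit magical, the sketch below shows one plausible way argument names and string defaults could be matched to class-level strategies. The `resolve_strategies` helper is hypothetical, and plain strings stand in for real strategy objects so the example runs without Brownie installed.

```python
import inspect


class StateMachine:
    # Stand-ins for strategy objects; a real test would use brownie.test.strategy.
    st_uint = "uint256 strategy"
    st_bytes32 = "bytes32 strategy"

    def rule(self, st_uint, st_bytes32):
        pass

    def rule_two(self, value="st_uint", othervalue="st_uint"):
        pass


def resolve_strategies(machine_cls, method_name):
    """Map each argument of a rule to the class-level strategy it refers to."""
    method = getattr(machine_cls, method_name)
    resolved = {}
    for name, param in inspect.signature(method).parameters.items():
        if name == "self":
            continue
        # A string default names the strategy; otherwise the argument name does.
        key = param.default if isinstance(param.default, str) else name
        resolved[name] = getattr(machine_cls, key)
    return resolved


print(resolve_strategies(StateMachine, "rule"))
print(resolve_strategies(StateMachine, "rule_two"))
```

In the second call both `value` and `othervalue` resolve to the same `st_uint` entry, which is how a rule can draw from one strategy twice.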

Invariants

Along with rules, a state machine often defines invariants. These are properties that should remain unchanged, regardless of any actions performed by the rules. After each rule is executed, every invariant method is always called to ensure that the test has not failed.

Any state machine method named invariant or beginning with invariant_ is treated as an invariant. Invariants are meant for verifying correctness of state; they cannot receive strategies.

class StateMachine:

    def rule_one(self):
        pass

    def rule_two(self):
        pass

    def invariant(self):
        # assertions in this method should always pass regardless
        # of actions in both rule_one and rule_two
        pass
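Here is a toy, chain-free sketch of invariants being checked after every rule. `CounterMachine` and `run_sequence` are hypothetical names for this illustration, and the sequence of rules is fixed by hand rather than generated by Hypothesis.

```python
class CounterMachine:
    """Hypothetical state machine tracking a counter that must never go negative."""

    def __init__(self):
        self.count = 0

    def rule_increment(self):
        self.count += 1

    def rule_decrement(self):
        # deliberately buggy: nothing stops the counter from going below zero
        self.count -= 1

    def invariant_non_negative(self):
        assert self.count >= 0, f"count went negative: {self.count}"


def run_sequence(machine, rule_names):
    """Call each rule in turn, checking every invariant after each call."""
    invariants = [
        getattr(machine, name)
        for name in dir(machine)
        if name == "invariant" or name.startswith("invariant_")
    ]
    for step, rule_name in enumerate(rule_names, start=1):
        getattr(machine, rule_name)()
        for check in invariants:
            try:
                check()
            except AssertionError as exc:
                # report which step broke the invariant, and why
                return step, str(exc)
    return None


print(run_sequence(CounterMachine(), ["rule_increment", "rule_decrement", "rule_decrement"]))
```

Because the invariant runs after each rule, the failure is pinned to the exact step where the state first became invalid, not discovered at the end of the run.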

Setup and Teardown Methods

A state machine may optionally include setup and teardown procedures. Similar to pytest fixtures, setup and teardown methods are available to execute logic on a per-test and per-run basis.

__init__: This method is called once, prior to the chain snapshot taken before the first test run. It is run as a class method, so changes made to the state machine will persist through every run of the test.
setup: This method is called at the beginning of each test run, immediately after the chain is reverted to the snapshot. Changes applied during setup will only have an effect for the upcoming run.
teardown: This method is called at the end of each successful test run, prior to the chain revert. teardown is not called if the run fails.
teardown_final: This method is called after the final test run has completed and the chain has been reverted. teardown_final is called regardless of whether the test passed or failed.

Test Execution Sequence

A stateful test executes in the following sequence:

1. The setup phase of all pytest fixtures is executed in the regular order.
2. If present, the StateMachine.__init__ method is called.
3. A snapshot of the current chain state is taken.
4. If present, the StateMachine.setup method is called.
5. Zero or more StateMachine initialize methods are called, in no particular order.
6. One or more StateMachine rule methods are called, in no particular order.
7. After each initialize and rule, every StateMachine invariant method is called.
8. If present, the StateMachine.teardown method is called.
9. The chain is reverted to the snapshot taken in step 3.
10. Steps 4–9 are repeated 50 times, or until the test fails.
11. If present, the StateMachine.teardown_final method is called.
12. The teardown phase of all pytest fixtures is executed in the normal order.
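The sequence above can be sketched as a small driver that logs each call. This is a simplified model under stated assumptions, not Brownie's implementation: `LoggingMachine` and `run_stateful` are hypothetical, pytest fixtures are left out, and the chain snapshot and revert appear only as comments.

```python
import random


class LoggingMachine:
    """Hypothetical state machine that records each lifecycle call."""

    log = []  # class-level, so it persists across runs

    def __init__(cls):
        cls.log.append("__init__")

    def setup(self):
        self.log.append("setup")

    def initialize(self):
        self.log.append("initialize")

    def rule_one(self):
        self.log.append("rule_one")

    def invariant(self):
        self.log.append("invariant")

    def teardown(self):
        self.log.append("teardown")

    def teardown_final(cls):
        cls.log.append("teardown_final")


def run_stateful(machine_cls, runs=2, seed=0):
    """Drive the machine through steps 2-11; fixtures and the chain are omitted."""
    rng = random.Random(seed)
    machine_cls.__init__(machine_cls)          # step 2: once, on the class itself
    # step 3: a real runner would snapshot the chain here
    for _ in range(runs):                      # step 10: repeat steps 4-9
        machine = object.__new__(machine_cls)  # new instance, __init__ is not re-run
        machine.setup()                        # step 4
        if rng.random() < 0.5:                 # step 5: initializers are optional
            machine.initialize()
            machine.invariant()                # step 7
        for _ in range(rng.randint(1, 3)):     # step 6: one or more rules
            machine.rule_one()
            machine.invariant()                # step 7
        machine.teardown()                     # step 8
        # step 9: a real runner would revert to the snapshot here
    machine_cls.teardown_final(machine_cls)    # step 11
    return list(machine_cls.log)


print(run_stateful(LoggingMachine))
```

The printed log starts with a single `__init__`, ends with a single `teardown_final`, and contains one `setup`/`teardown` pair per run with an `invariant` entry after every initializer and rule.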

Writing Stateful Tests

There are three steps to writing a stateful test:

1. Create a state machine class. It should include at least one rule and invariant.
2. Create a regular pytest-style test that includes the state_machine fixture.
3. Within the test, call state_machine with the state machine as the first argument.

As an example, let’s build a state machine to test the following Vyper Depositer contract:

This is a very simple contract with two functions and a public mapping. Anyone can deposit ether for another account using the deposit_for method, or withdraw deposited ether using withdraw_from.

If you look closely you may have noticed an issue in the contract code. If not, don’t worry! We’re going to find it using our test.

Here is the state machine and test function we will use to test our contract:

When this test is executed, it will call rule_deposit and rule_withdraw using random data from the given strategies until it encounters a state which violates one of the assertions. If this happens, it repeats the test in an attempt to find the shortest path and smallest data set possible that reproduces the error. Finally it saves the failing conditions to be used in future tests, and then delivers the following output:

From this output we can see where in the test an invariant failed: self.contract.deposited(address) is zero, when we expected it to be one. We also know the sequence of calls leading to the error. From this information we can infer that the contract is incorrectly adjusting balances within the withdraw_from function. Let’s take a look at that function again:

On line 4, rather than subtracting _value, the balance is being set to _value. We found the bug!
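To see the bug in isolation, here is a pure-Python stand-in for the contract's bookkeeping. It is a sketch under assumptions, not the Vyper source: `DepositerModel`, the `fixed` flag, the `replay` helper, the "alice" account and the revert message are all invented for illustration. The replayed sequence (deposit one, withdraw zero) is one sequence consistent with the failure described above.

```python
class DepositerModel:
    """Pure-Python stand-in for the contract; `fixed` toggles the corrected logic."""

    def __init__(self, fixed=False):
        self.deposited = {}
        self.fixed = fixed

    def deposit_for(self, receiver, value):
        self.deposited[receiver] = self.deposited.get(receiver, 0) + value

    def withdraw_from(self, sender, value):
        balance = self.deposited.get(sender, 0)
        assert balance >= value, "Insufficient balance"  # assumed revert message
        if self.fixed:
            self.deposited[sender] = balance - value  # subtract the withdrawn amount
        else:
            self.deposited[sender] = value  # bug: overwrites the balance


def replay(contract):
    """Deposit one wei, withdraw zero, and return the recorded balance."""
    contract.deposit_for("alice", 1)
    contract.withdraw_from("alice", 0)
    return contract.deposited["alice"]


print("buggy:", replay(DepositerModel()))
print("fixed:", replay(DepositerModel(fixed=True)))
```

With the buggy logic the recorded balance drops to zero even though nothing was withdrawn, matching the failed invariant; subtracting the value keeps it at one.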

Running Stateful Tests

By default, stateful tests are included when you run your test suite. There is no special action required to invoke them. You can choose to exclude stateful tests, or to only run stateful tests, with the --stateful flag.

To only run stateful tests:

brownie test --stateful true

To skip stateful tests:

brownie test --stateful false

While a stateful test is active, the console shows a spinner that rotates each time a run of the test has finished. If the color changes from yellow to red, it means the test has failed and Hypothesis is now searching for the shortest path to the failure.
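The search for a shorter failing path can be pictured with a naive greedy shrinker. To be clear, this is not Hypothesis's actual shrinking algorithm; `fails`, `shrink` and `find_failure` are hypothetical helpers, and the single-account balance model (where a withdrawal overwrites the balance instead of reducing it) is an assumption used to give the search something to find.

```python
import random


def fails(steps):
    """Replay (action, value) steps against a buggy single-account balance model."""
    actual = expected = 0
    for action, value in steps:
        if action == "deposit":
            actual += value
            expected += value
        elif value <= actual:  # withdrawals above the balance are rejected
            actual = value     # bug: should be `actual -= value`
            expected -= value
        if actual != expected:
            return True
    return False


def shrink(steps):
    """Greedily drop steps and lower values while the sequence still fails."""
    steps = list(steps)
    changed = True
    while changed:
        changed = False
        # First try removing one step at a time.
        for i in range(len(steps)):
            candidate = steps[:i] + steps[i + 1:]
            if fails(candidate):
                steps, changed = candidate, True
                break
        if changed:
            continue
        # Then try replacing a single value with a smaller one.
        for i, (action, value) in enumerate(steps):
            for smaller in (0, value // 2, value - 1):
                if 0 <= smaller < value:
                    candidate = steps[:i] + [(action, smaller)] + steps[i + 1:]
                    if fails(candidate):
                        steps, changed = candidate, True
                        break
            if changed:
                break
    return steps


def find_failure(rng, attempts=1000, max_steps=10, max_value=100):
    """Generate random sequences until one fails, or give up and return None."""
    for _ in range(attempts):
        steps = [
            (rng.choice(["deposit", "withdraw"]), rng.randint(0, max_value))
            for _ in range(rng.randint(1, max_steps))
        ]
        if fails(steps):
            return steps
    return None


found = find_failure(random.Random(0))
if found is not None:
    print("found: ", found)
    print("shrunk:", shrink(found))
```

The first failing sequence is usually long and noisy. Shrinking reduces it to a short sequence that still fails, which is far easier to reason about when hunting for the root cause.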

Real-world Examples

To learn more about stateful testing, it might help to explore a few repositories making use of this technique:

iamdefinitelyahuman/NFToken: A non-fungible implementation of the ERC20 standard
apguerrera/DreamFrames: Buy and sell frames in movies
curvefi/curve-dao-contracts: Vyper contracts used by Curve DAO

…and that’s it!

If you’ve read this entire series start to finish — thanks for staying with me! I hope that you learned something new along the way, and that you’re feeling inspired to go and write some tests.

If you’re looking for more, check out “Ethereum Mainnet Testing with Python and Brownie”. It almost fits into this series, and I nearly made it part eight, but the writing style is a bit too different, so ultimately I decided to just mention it here as a bonus.

If you enjoyed this, please follow the Brownie Twitter and like and share our content! Help get the word out about Brownie and show others what’s possible when testing their contracts.

You can also follow me on Medium and check out other articles I’ve written, and join the Brownie Gitter to chat with and learn from other like-minded developers.