Property-based testing of Plutus contracts¶
Plutus comes with a library for testing contracts using QuickCheck. Tests generated by this library perform a sequence of calls to contract endpoints, checking that tokens end up in the correct wallets at the end of each test. These sequences can be generated at random, or in a more directed way to check that desirable states always remain reachable. This tutorial introduces the testing library by walking through a simple example: a contract that implements a guessing game.
An overview of the guessing game¶
The source code of the guessing game contract is provided as an example here, and the final test code is here.
The game is played as follows:
The first player locks a sum of Ada in the contract, which is donated as a prize. The prize is protected by a secret password.
Any player can now try to guess the password, by submitting a ‘guess’ transaction that attempts to withdraw some or all of the prize, along with a guess at the password, and a new password to replace the old one. If the guess is correct, and the contract contains enough Ada, then the guesser receives the withdrawal and the remainder of the prize is now protected by the new password. If the guess is wrong, the transaction is not accepted, and nothing changes.
As an extra wrinkle, when the first player locks the prize, a new token is also minted. Only the player currently holding the token is allowed to make a guess–which gives us an opportunity to illustrate minting and passing around tokens.
The generated tests will exercise the contract by locking the prize, then moving the game token and making guesses at random, checking that the game token and Ada move as they should.
Emulated wallets¶
To test contracts, we need emulated wallets. These and many other useful definitions for testing can be imported via
import Plutus.Contract.Test (Wallet, minLogLevel, mockWalletAddress, w1, w2, w3)
Now we can create a number of wallets: in this tutorial, we’ll settle for three:
wallets :: [Wallet]
wallets = [w1, w2, w3]
Values and tokens¶
Wallets contain ‘values’, which are mixtures of different quantities
of one or more types of token. The most common token is, of course, the Ada;
we can import functions manipulating Ada, and the Value
type
itself, as follows:
import Ledger.Address qualified as Address
import Plutus.Script.Utils.Ada qualified as Ada
import Plutus.Script.Utils.Value qualified as Value
With these imports, we can construct values in the Ada currency:
> Ada.lovelaceValueOf 1
Value (Map [(,Map [(,1)])])
We will also need a game token. After importing the Scripts
module
import Ledger.Typed.Scripts qualified as Scripts
we can define it as follows, applying a minting policy defined in the code under test (imported as module G
):
import Plutus.Contracts.GameStateMachine qualified as G
import Cardano.Node.Emulator.Internal.Node.TimeSlot qualified as TimeSlot
import Data.Default (Default (def))
gameParam :: G.GameParam
gameParam = G.GameParam (Address.toPlutusAddress $ mockWalletAddress w1) (TimeSlot.scSlotZeroTime def)
guessTokenVal :: Value.Value
guessTokenVal =
let sym = Scripts.forwardingMintingPolicyHash $ G.typedValidator gameParam
in G.token sym "guess"
The value of the token is (with long hash values abbreviated):
> guessTokenVal
Value (Map [(f687...,Map [(guess,1)])])
We can even construct a Value
containing an Ada and a game token:
> Ada.lovelaceValueOf 1 <> guessTokenVal
Value (Map [(,Map [(,1)]),(f687...,Map [(guess,1)])])
If you inspect the output closely, you will see that a Value
contains maps nested within another Map
. The outer Map
is
indexed by hashes of minting policy scripts, so each inner Map
contains a bag of tokens managed by the same policy. Token names can
be chosen freely, and each policy can manage any number of its own
token types. In this case the game token is called a “guess”, and the
script managing game tokens has the hash f687… A little confusingly,
the Ada token name is displayed as an empty string, as is the hash of
the corresponding minting policy.
Introducing contract models¶
We test contracts using a model of the system, which includes the state(s) of all the agents involved–the on-chain state, the off-chain state on your computer, the off-chain state on anyone else’s computer–everything relevant to the contract(s) under test. The first job to be done is thus defining that model. To do so, we import the contract modelling library
import Plutus.Contract.Test.ContractModel qualified as CM
import Test.QuickCheck.ContractModel qualified as CM (HasSymbolicRep, Symbolic)
and define the model type:
data GameModel = GameModel
This definition is incomplete: we shall fill in further details as we proceed.
The GameModel
type must be an instance of the Plutus.Contract.Test.ContractModel.Interface.ContractModel
class, which has an associated datatype defining the kinds of
actions that will be performed in generated tests.
instance CM.ContractModel GameModel where
data Action GameModel = Lock Wallet String Integer
| Guess Wallet String String Integer
| GiveToken Wallet
deriving (Eq, Show, Generic)
In this case we define three actions:
a
Lock
action to be performed by the first player when starting the game, containing the player’s wallet (from which the Ada will be taken), the secret password, and the prize amount.a
Guess
action to be performed by the other players, containing the player’s wallet (to receive the prize), the player’s guess, a new password, and the amount to be claimed if the guess is right.a
GiveToken
action, to give the game token to a player so they can make a guess.
A generated test is called Plutus.Contract.Test.ContractModel.Interface.Actions
, and is, as the name suggests, essentially a
sequence of Plutus.Contract.Test.ContractModel.Interface.Action
values. We can
run tests by using Plutus.Contract.Test.ContractModel.Interface.propRunActions_
:
prop_Game :: CM.Actions GameModel -> Property
prop_Game actions = CM.propRunActions_ actions
When we test this property, quickCheck
will generate random
action sequences to be tested, checking at the end of each test that tokens are
transferred correctly, and contracts didn’t crash.
Note
There is also a more general function
Plutus.Contract.Test.ContractModel.Interface.propRunActions
that allows the check at the end of each test to be customized.
But how does quickCheck
know what code to run when you check prop_Game
?
Plutus.Contract.Test.ContractModel.Interface.propRunActions_
needs to create a handle
for each contract instance, which is used to invoke their
endpoints from the test. Different contracts have different endpoints,
of different types–and thus different schemas. When we invoke an
endpoint, we need to know the schema of the contract we are invoking,
and the type of errors it can return, so that the type-checker can
ensure that the call is valid. We thus need to know the type of
contract that each handle refers to.
To achieve this, every contract instance in a test is named by a
Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
, another
associated datatype of the
Plutus.Contract.Test.ContractModel.Interface.ContractModel
class; we talk to a
contract instance by referring to its
Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
. The
Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
type is
parameterised both on the type of the contract model, and on the observable
state, schema, and error type of the contract it refers to. Since the
same test may refer to contracts of several different types,
Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
is defined as a
GADT.
In this particular case, there is only one type of contract under
test, and so it suffices to define a Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
type with a single
constructor. There is one contract instance running in each emulated wallet, so
we simply distinguish contract instance keys by the wallet they are running in:
data ContractInstanceKey GameModel w schema err param where
WalletKey :: Wallet -> CM.ContractInstanceKey GameModel () G.GameStateMachineSchema G.GameError ()
Once this type is defined, we can tell QuickCheck what code to run for a given
contract by filling in the
Plutus.Contract.Test.ContractModel.Interface.initialInstances
,
Plutus.Contract.Test.ContractModel.Interface.instanceWallet
, and
Plutus.Contract.Test.ContractModel.Interface.instanceContract
fields of the
ContractModel
class:
initialInstances = (`CM.StartContract` ()) . WalletKey <$> wallets
instanceContract _ WalletKey{} _ = G.contract
instanceWallet (WalletKey w) = w
This specifies (reading top to bottom) that we should create one
contract instance per wallet w
, that will run G.contract
, in wallet w
.
Now we can run tests, although of course they will not yet succeed:
> quickCheck prop_Game
*** Failed! (after 1 test and 1 shrink):
Exception:
GSM.hs:65:10-32: No instance nor default method for class operation arbitraryAction
The contract modelling library cannot generate test cases, unless we
specify how to generate an Plutus.Contract.Test.ContractModel.Interface.Action
, which we will do next.
Generating actions¶
To generate actions, we need to be able to generate wallets, guesses, and suitable values of Ada, since these appear as action parameters.
genWallet :: Gen Wallet
genWallet = elements wallets
genGuess :: Gen String
genGuess = elements ["hello", "secret", "hunter2", "*******"]
genValue :: Gen Integer
genValue = getNonNegative <$> arbitrary
We choose wallets from the three available, and we choose passwords from a small set, so that random guesses will often be correct. We choose Ada amounts to be non-negative integers, because negative amounts would be error cases that we choose not to test.
Now we can define a generator for Plutus.Contract.Test.ContractModel.Interface.Action
, as a method of the
Plutus.Contract.Test.ContractModel.Interface.ContractModel
class:
arbitraryAction s = oneof $
[ Lock <$> genWallet <*> genGuess <*> genValue ] ++
[ Guess <$> genWallet <*> genGuess <*> genGuess <*> genValue ] ++
[ GiveToken <$> genWallet ]
With this method defined, we can start to generate test cases. Using
sample
we can see what action sequences look like:
> sample (arbitrary :: Gen (Actions GameModel))
Actions
[Lock (Wallet 2) "hunter2" 5,
Guess (Wallet 3) "*******" "hello" 6,
Guess (Wallet 1) "secret" "*******" 10,
Guess (Wallet 3) "*******" "*******" 6,
GiveToken (Wallet 3),
Guess (Wallet 2) "hunter2" "hunter2" 15]
.
.
We can even run ‘tests’ now, although they don’t do much yet:
> quickCheck prop_Game
+++ OK, passed 100 tests:
Actions (2263 in total):
33.94% Lock
33.89% Guess
32.17% GiveToken
The output tells us the distribution of generated actions, aggregated
across all the tests. We can see that each action was generated around
one third of the time, which is to be expected since our generator
does not weight them at all. Keep an eye on this table as we extend
our generation; if any Plutus.Contract.Test.ContractModel.Interface.Action
disappears altogether, or is generated
very rarely, then this indicates a problem in our tests.
Modelling expectations¶
The ultimate purpose of our tests is to check that funds are
transferred correctly by each operation–for example, that after a
guess, the guesser receives the requested Ada only if the guess was
correct. An important part of a Plutus.Contract.Test.ContractModel.Interface.ContractModel
defines how funds
are expected to move. However, it’s clear that in order to define how
we expect funds to move after a Guess
, we need to know more than
just where all the Ada are. We need to know:
what the current secret password is, so we can decide whether or not the guess is correct.
whether or not the guesser currently holds the game token, and so is entitled to make a guess.
how much Ada is currently locked in the contract, so we can determine whether the guesser is requesting funds that actually exist.
These all depend on the previous steps in the test case. To keep track
of such information, we store it in a contract state, which is the
type parameter of the Plutus.Contract.Test.ContractModel.Interface.ContractModel
class. (Note that this contract
state is a part of the model, it may be quite different from the
contract state in the implementation). In this case the contract state
is the GameModel
type, so let’s complete its definition:
data GameModel = GameModel
{ _gameValue :: Integer
, _hasToken :: Maybe Wallet
, _currentSecret :: String }
deriving (Show, Generic)
makeLenses 'GameModel
Initially the game token does not exist, so we record its current
owner as a Maybe Wallet
, so that we can represent the initial
situation before its creation. The locked funds are always in Ada, so
in the model it suffices to store an integer.
Now we can define the initial state of the model at the start of each
test case, Plutus.Contract.Test.ContractModel.Interface.initialState
, and a Plutus.Contract.Test.ContractModel.Interface.nextState
function to model the way
we expect each operation to change the state. These are both methods
in the Plutus.Contract.Test.ContractModel.Interface.ContractModel
class.
The initial state just records that the game token does not exist yet, and assigns default values to the other fields.
initialState = GameModel
{ _gameValue = 0
, _hasToken = Nothing
, _currentSecret = ""
}
The Plutus.Contract.Test.ContractModel.Interface.nextState
function is defined in the Plutus.Contract.Test.ContractModel.Interface.Spec
monad
nextState :: CM.Action state -> CM.Spec state ()
and defines the expected effect of each operation.
The Lock
operation creates the contract, initializing the model
contract state (using %=
and generated Lens
operations),
mints the game token (using Plutus.Contract.Test.ContractModel.Interface.mint
), deposits it in the creator’s
wallet, and withdraws the Ada locked in the contract (using Plutus.Contract.Test.ContractModel.Interface.deposit
and Plutus.Contract.Test.ContractModel.Interface.withdraw
):
nextState (Lock w secret val) = do
hasToken .= Just w
currentSecret .= secret
gameValue .= val
CM.mint guessTokenVal
CM.deposit w guessTokenVal
CM.withdraw w $ Ada.lovelaceValueOf val
A Plutus.Contract.Test.ContractModel.Interface.ContractModel
actually tracks not only the contract model state (in
our case the GameModel
type), but also the quantities of tokens
expected to be in each wallet, which are checked at the end of each
test. It is these expectations that are manipulated by Plutus.Contract.Test.ContractModel.Interface.mint
,
Plutus.Contract.Test.ContractModel.Interface.deposit
, etc… don’t confuse them with operations that actually
mint or move tokens in the implementation. The Plutus.Contract.Test.ContractModel.Interface.ModelState
type
contains all of this information.
When making a guess, we need to check parts of the contract state
(which we read using Plutus.Contract.Test.ContractModel.Interface.viewContractState
), and then we update the
stored password, game value, and wallet contents appropriately.
nextState (Guess w old new val) = do
correctGuess <- (old ==) <$> CM.viewContractState currentSecret
holdsToken <- (Just w ==) <$> CM.viewContractState hasToken
enoughAda <- (val <=) <$> CM.viewContractState gameValue
when (correctGuess && holdsToken && enoughAda) $ do
currentSecret .= new
gameValue %= subtract val
CM.deposit w $ Ada.lovelaceValueOf val
GiveToken
just transfers the game token from one wallet to another using Plutus.Contract.Test.ContractModel.Interface.transfer
.
nextState (GiveToken w) = do
w0 <- fromJust <$> CM.viewContractState hasToken
CM.transfer w0 w guessTokenVal
hasToken .= Just w
At the end of each test, the Plutus.Contract.Test.ContractModel.Interface.ContractModel
framework checks that
every wallet contains the tokens that the model says it should.
We can exercise the Plutus.Contract.Test.ContractModel.Interface.nextState
function already by generating and
‘running’ tests, even though we have not yet connected these tests to
the real contract. Doing so immediately reveals a problem:
> quickCheck prop_Game
*** Failed! (after 3 tests and 3 shrinks):
Exception:
Maybe.fromJust: Nothing
CallStack (from HasCallStack):
error, called at libraries/base/Data/Maybe.hs:148:21 in base:Data.Maybe
fromJust, called at GSM0.hs:122:15 in main:GSM0
Actions
[GiveToken (Wallet 1)]
Looking at the last two lines, we see the generated test sequence, and
the problem is evident: we generated a test that only gives the game
:term:`token` to wallet 1, but this makes no sense because the game token has
not yet been minted–so the fromJust
in the Plutus.Contract.Test.ContractModel.Interface.nextState
function
fails. We will see how to prevent this in the next section.
Restricting test cases with preconditions¶
As we just saw, not every sequence of actions makes sense as a test case; we need a way to restrict test cases to be ‘sensible’. Note this is not the same as restricting tests to ‘the happy path’: we want to test unexpected sequences of actions, and indeed, this is part of the strength of property-based testing. But there are some actions–like trying to give the game token to a wallet before it has been minted–that are not even interesting to test. These are the cases that we rule out by defining preconditions for actions; the effect is to prevent such test cases ever being generated.
To introduce preconditions, we add a definition of the
Plutus.Contract.Test.ContractModel.Interface.precondition
method to our
Plutus.Contract.Test.ContractModel.Interface.ContractModel
instance.
precondition :: CM.ModelState state -> CM.Action state -> Bool
The Plutus.Contract.Test.ContractModel.Interface.precondition
is parameterised on the entire model state, which
includes the contents of wallets as well as our contract state, so we
will need to extract this state as well as the fields we need from
it. For now, we just restrict GiveToken
actions to states in which
the token exists:
precondition s (GiveToken _) = isJust tok
where
tok = s ^. CM.contractState . hasToken
precondition s _ = True
Now if we try to run tests, something more interesting happens:
> quickCheck prop_Game
*** Failed! Assertion failed (after 2 tests):
Actions
[Lock (Wallet 1) "hello" 0]
Expected funds of W1 to change by
Value (Map [(f687...,Map [(guess,1)])])
but they did not change
Test failed.
Emulator log:
[INFO] Slot 1: TxnValidate 4feb...
[INFO] Slot 1: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
Contract instance started
[INFO] Slot 1: 00000000-0000-4000-8000-000000000001 {Contract instance for wallet 2}:
Contract instance started
[INFO] Slot 1: 00000000-0000-4000-8000-000000000002 {Contract instance for wallet 3}:
Contract instance started
The test has failed, of course. The generated (and simplified) test case only performs one action:
Actions
[Lock (Wallet 1) "hello" 0]
Wallet 1 attempts to create a game contract guarding zero Ada. Inspecting the error message, we can see that wallet 1 ended up with the wrong contents:
Expected funds of W1 to change by
Value (Map [(f687...,Map [(guess,1)])])
but they did not change
Our model predicted that wallet 1 would end up containing the game token, but in fact its contents were unchanged.
In this test, we have actually performed actions in the emulator, as
the log shows us: one transaction has been validated, and we have
started three contract instances (one for each wallet in the
test). But we have not created a game token for wallet 1, because
thus far we have not defined how actions in a test should be
performed–so the Lock
action in the test case behaves as a no-op,
which of course does not deposit a game token in wallet 1. It is time
to link actions in a test to the emulator.
Performing actions¶
So far we are generating actions, but we have not yet linked them to the contract they are supposed to test–so ‘running’ the tests, as we did above, did not invoke the contract at all. To do so, we must import the emulator
import Plutus.Trace.Emulator qualified as Trace
Then we define the Plutus.Contract.Test.ContractModel.Interface.perform
method of the Plutus.Contract.Test.ContractModel.Interface.ContractModel
class:
perform
:: CM.HandleFun state
-> (forall t. CM.HasSymbolicRep t => CM.Symbolic t -> t)
-> CM.ModelState state
-> CM.Action state
-> CM.SpecificationEmulatorTrace ()
The job of the Plutus.Contract.Test.ContractModel.Interface.perform
method in this case is just to invoke the
contract end-points, using the API defined in the code under test, and
transfer the game token from one wallet to another as specified by
GiveToken
actions.
gameParam :: G.GameParam
gameParam = G.GameParam (Address.toPlutusAddress $ mockWalletAddress w1) (TimeSlot.scSlotZeroTime def)
perform handle _ s cmd = case cmd of
Lock w new val -> do
Trace.callEndpoint @"lock" (handle $ WalletKey w)
G.LockArgs
{ G.lockArgsGameParam = gameParam
, G.lockArgsSecret = secretArg new
, G.lockArgsValue = Ada.lovelaceValueOf val
}
Guess w old new val -> do
Trace.callEndpoint @"guess" (handle $ WalletKey w)
G.GuessArgs
{ G.guessArgsGameParam = gameParam
, G.guessTokenTarget = Address.toPlutusAddress $ mockWalletAddress w
, G.guessArgsOldSecret = old
, G.guessArgsNewSecret = secretArg new
, G.guessArgsValueTakenOut = Ada.lovelaceValueOf val
}
GiveToken w' -> do
let w = fromJust (s ^. CM.contractState . hasToken)
Trace.payToWallet w w' guessTokenVal
return ()
Every call to an end-point must be associated with one of the contract
instances defined in our initialInstances
; the handle
argument to
Plutus.Contract.Test.ContractModel.Interface.perform
lets us find the contract handle associated with each
Plutus.Contract.Test.ContractModel.Interface.ContractInstanceKey
.
For the most part, it is good practice to keep the Plutus.Contract.Test.ContractModel.Interface.perform
function
simple: a direct relationship between actions in a test case and calls
to contract endpoints makes interpreting test failures much easier.
Note
Helping shrinking work better by choosing test case actions well
In the definition of
Plutus.Contract.Test.ContractModel.Interface.perform
above, theGiveToken
action is a little surprising: when we call the emulator, we have to specify not only the wallet to give the token to, but also the wallet to take the token from. So why did we choose to define aGiveToken w
action to include in test cases, rather than an actionPassToken w w'
, which would correspond more directly to the code inPlutus.Contract.Test.ContractModel.Interface.perform
?The answer is that using
GiveToken
actions instead helps QuickCheck to shrink failing tests more effectively. QuickCheck shrinks test cases by attempting to remove actions from them–essentially replacing an action by a no-op. But consider a sequence such asPassToken w1 w2 PassToken w2 w3which transfers the game token in two steps from wallet 1 to wallet 3. Deleting either one of these steps means the game token will end up in the wrong place, probably causing the next steps in the test to behave very differently (and thus, preventing this shrinking step). But given the sequence
GiveToken w2 GiveToken w3the first
GiveToken
can be deleted without affecting the behaviour of the second at all. Thus, by making token-passing steps independent of each other, we make it easier for QuickCheck to shrink a failing test without drastic changes to its behaviour.
Shrinking Actions¶
Before starting to run tests seriously, it is useful to make sure that
any failing tests will shrink well to small examples. By default, the
contract modelling library tries to shrink tests by removing actions,
but it cannot know how to shrink the actions themselves. We can
specify this shrinking by defining the Plutus.Contract.Test.ContractModel.Interface.shrinkAction
operation in the
Plutus.Contract.Test.ContractModel.Interface.ContractModel
class:
shrinkAction :: CM.ModelState state -> CM.Action state -> [CM.Action state]
This function returns a list of ‘simpler’ actions that should be tried
as replacements for the given Plutus.Contract.Test.ContractModel.Interface.Action
, when QuickCheck is simplifying
a failed test. In this case we define a shrinking function for wallets:
shrinkWallet :: Wallet -> [Wallet]
shrinkWallet w = [w' | w' <- wallets, w' < w]
and shrink actions by shrinking the wallet and Ada parameters.
shrinkAction _s (Lock w secret val) =
[Lock w' secret val | w' <- shrinkWallet w] ++
[Lock w secret val' | val' <- shrink val]
shrinkAction _s (GiveToken w) =
[GiveToken w' | w' <- shrinkWallet w]
shrinkAction _s (Guess w old new val) =
[Guess w' old new val | w' <- shrinkWallet w] ++
[Guess w old new val' | val' <- shrink val]
We choose not to shrink password/guess parameters, because they are not really significant–one password is as good as another in a failed test.
Debugging the model¶
At this point, the contract model is complete, and tests are
runnable. However, they do not pass, and so we need to adapt either
the tests or the contract to resolve the inconsistencies revealed. Testing prop_Game
now results in:
> quickCheck prop_Game
*** Failed! Falsified (after 6 tests and 3 shrinks):
Actions
[Lock (Wallet 1) "hunter2" 0]
Expected funds of W1 to change by
Value (Map [(f687...,Map [(guess,1)])])
but they did not change
Test failed.
Emulator log:
... 49 lines of emulator log messages ...
In this test, wallet 1 attempts to lock zero Ada, and our model predicts that wallet 1 should receive a game token–but this did not happen. To understand why, we need to study the emulator log. Here are the relevant parts:
...
[INFO] Slot 1: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
Receive endpoint call: Object (fromList [("tag",String "lock"),...
[INFO] Slot 1: W1: Balancing an unbalanced transaction:
Tx:
Tx 2542...:
{inputs:
outputs:
- Value (Map []) addressed to
ScriptAddress: d1e1...
...
[INFO] Slot 1: W1: TxSubmit: 2542...
[INFO] Slot 2: TxnValidate 2542...
[INFO] Slot 2: W1: Balancing an unbalanced transaction:
Tx:
Tx 1eba...:
{inputs:
- 2542...!0
Redeemer: <>
outputs:
- Value (Map []) addressed to
ScriptAddress: d1e1...
mint: Value (Map [(f687...,Map [(guess,1)])])
...
[INFO] Slot 2: W1: TxSubmit: 2d66...
Here we see the endpoint call to lock
being received during slot
1, resulting in a transaction with ID 2542...
, which pays zero Ada
to the contract script. The transaction is balanced (which has no
effect in this case), submitted, and validated by the emulator at
slot 2. Then another transaction, 1eba...
, is created, which
mints the game token. This transaction is in turn balanced (resulting
in a new hash, 2d66...
), and submitted without error–but although
no errors are reported, this transaction is not validated.
Since the transaction is submitted in slot 2, we would expect it to be validated in slot 3. In fact, the problem here is just that the test stopped too early, before the blockchain had validated this second transaction. The solution is just to delay long enough for the blockchain to validate all the transactions we have submitted.
Adding delays to test cases¶
To give the blockchain time to validate the transactions generated by
a Lock
call, we need to delay by two slots. Why two? Because the
Lock
contract endpoint submits two transactions to the
blockchain. Likewise, we delay one slot after each of the other
actions. (If the delays we insert are too short, we will discover this
later via failed tests).
We add a call to delay
in each branch of Plutus.Contract.Test.ContractModel.Interface.perform
:
gameParam :: G.GameParam
gameParam = G.GameParam (Address.toPlutusAddress $ mockWalletAddress w1) (TimeSlot.scSlotZeroTime def)
perform handle _ s cmd = case cmd of
Lock w new val -> do
Trace.callEndpoint @"lock" (handle $ WalletKey w)
G.LockArgs
{ G.lockArgsGameParam = gameParam
, G.lockArgsSecret = secretArg new
, G.lockArgsValue = Ada.lovelaceValueOf val
}
CM.delay 2
Guess w old new val -> do
Trace.callEndpoint @"guess" (handle $ WalletKey w)
G.GuessArgs
{ G.guessArgsGameParam = gameParam
, G.guessTokenTarget = Address.toPlutusAddress $ mockWalletAddress w
, G.guessArgsOldSecret = old
, G.guessArgsNewSecret = secretArg new
, G.guessArgsValueTakenOut = Ada.lovelaceValueOf val
}
CM.delay 1
GiveToken w' -> do
let w = fromJust (s ^. CM.contractState . hasToken)
Trace.payToWallet w w' guessTokenVal
CM.delay 1
This makes the emulator delay one or two slots, but we also need to
delay in our model, to keep the model state in sync with the
emulator. We do this using corresponding calls to Plutus.Contract.Test.ContractModel.Interface.wait
in the
definition of Plutus.Contract.Test.ContractModel.Interface.nextState
:
nextState (Lock w secret val) = do
hasToken .= Just w
currentSecret .= secret
gameValue .= val
CM.mint guessTokenVal
CM.deposit w guessTokenVal
CM.withdraw w $ Ada.lovelaceValueOf val
CM.wait 2
and similarly in the other cases.
Does this change fix the problem? To find out, we should rerun the same test case, after updating the code.
Rerunning a failed test¶
The best way to save and rerun a QuickCheck test case is to
copy-and-paste it from the QuickCheck output into your code. Since
prop_Game
is just a function that takes the generated test as an
argument, then we can rerun a test by passing it to the
property. In this case let us define
testLock :: Property
testLock = flip CM.forAllDL prop_Game $ CM.action $ Lock w1 "hunter2" 0
testLock
is itself a Property
, so we can test it using
quickCheck
. Testing it before adding the delays in the last section
generates the same output as before. Testing it after the delays
are added results in
> quickCheck testLock
+++ OK, passed 100 tests.
Actions (100 in total):
100% Lock
The test passes, and the problem is fixed.
Note
Since there is no random generation in this test, there is no real need to test it 100 times. This can be avoided by adding
withMaxSuccess
to the definition:testLock :: Property testLock = withMaxSuccess 1 . flip CM.forAllDL prop_Game $ CM.action $ Lock w1 "hunter2" 0Note
We save the failing test case, not the random seed used to generate it. This is the only way to be sure that we repeat the same test that just failed. Usually, a failed test that QuickCheck reports is the result of both random generation and shrinking, not random generation alone. Reusing the same random seed would usually regenerate a much larger test, which might well fail for a different reason, leading QuickCheck to report a different shrunk failing test. It is then impossible to know for sure whether or not the change just made to the code fixed the problem it was intended to fix–it might just have changed the way failed tests shrink. By rerunning exactly the same test case we can be sure that our change did fix that problem, at least.
Controlling the log-level¶
When we rerun random tests, they fail for a different reason:
> quickCheck prop_Game
*** Failed! Assertion failed (after 5 tests and 7 shrinks):
Actions
[Lock (Wallet 1) "hunter2" 0,
Lock (Wallet 1) "hello" 0]
Outcome of Contract instance for wallet 1:
False
Failed 'Contract instance stopped with error'
Test failed.
Emulator log:
... 73 lines of emulator log messages ...
Looking at the failing test case,
Actions
[Lock (Wallet 1) "hunter2" 0,
Lock (Wallet 1) "hello" 0]
we can see that it does something unexpected: wallet 1 tries to lock twice. Our model allows this, but the error message tells us that the contract instance crashed.
The emulator log output can be rather overwhelming, but we can eliminate the
INFO
messages by running the test sequence with appropriate options. If we
define
import Control.Monad.Freer.Extras.Log (LogLevel)
propGame' :: LogLevel -> CM.Actions GameModel -> Property
propGame' l s = CM.propRunActionsWithOptions
(set minLogLevel l CM.defaultCheckOptionsContractModel)
CM.defaultCoverageOptions
(\ _ -> pure True)
s
then we can re-run the test and see more succinct output:
> quickCheck $ propGame' Warning
*** Failed! Assertion failed (after 7 tests and 4 shrinks):
Actions
[Lock (Wallet 1) "hello" 0,
Lock (Wallet 1) "*******" 0]
Outcome of Contract instance for wallet 1:
False
Failed 'Contract instance stopped with error'
Test failed.
Emulator log:
[WARNING] Slot 4: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
Contract instance stopped with error: GameSMError (ChooserError "Found 2 outputs, expected 1")
Now we see the problem: an error in the game implementation that
stopped the second contract call, because two unspent transaction
outputs had been created. These two outputs are the Ada amounts
addressed to the contract script that are created by the first
transaction of each call to the Lock
endpoint. The off-chain
contract is not designed to cope with more than one such UTXO; it is
now in a broken state. In fact, the Ada now locked in these UTXOs
cannot be recovered by the present off-chain code–the only way to
recover the money is to revise the contract so that it can accept
multiple UTXOs. Arguably, this is a bug in the contract: if any wallet
tries to start the game for a second time, the Ada will be lost (until
the bug is fixed).
Refining preconditions¶
We just learned that a second Lock
call puts the contract into a
broken state. But this is not how the game was intended to be used, so
the developer might reasonably respond “you shouldn’t do that”. There
could also be other problems in the code that we cannot presently
find, because they are masked by the double-lock bug. Since a test
case with two Lock
calls is easy to generate, then QuickCheck is
likely to report this particular problem in almost every subsequent
run–unless we explicitly prevent it from doing so.
We can easily avoid this by strengthening the precondition of
Lock
, so that it can only be performed once per test case. We
do so by checking whether any wallet holds the game token:
precondition s cmd = case cmd of
Lock {} -> isNothing tok
Guess {} -> True
GiveToken _ -> isJust tok
where
tok = s ^. CM.contractState . hasToken
Now the double-lock test case can no longer be generated. If we save the test case
testDoubleLock :: Property
testDoubleLock = flip CM.forAllDL prop_Game $ do
CM.action $ Lock w1 "*******" 0
CM.action $ Lock w1 "secret" 0
and try to rerun it, then QuickCheck will not do so:
> quickCheck testDoubleLock
*** Gave up! Passed only 0 tests; 1000 discarded tests.
When a precondition cannot be satisfied, then QuickCheck ‘gives up’ as we see here–the faulty test case was discarded (1000 times).
Rerunning random tests finds another ‘bug’:
> quickCheck $ propGame' Warning
*** Failed! Assertion failed (after 10 tests and 6 shrinks):
Actions
[Lock (Wallet 2) "hello" 0,
Guess (Wallet 1) "hello" "secret" 0]
Outcome of Contract instance for wallet 1:
False
Failed 'Contract instance stopped with error'
Test failed.
Emulator log:
[WARNING] Slot 3: W1: handleTx failed: InsufficientFunds "Total: Value (Map [(,Map [(,100000000)])]) expected: Value (Map [(f687...,Map [(guess,1)])])"
[WARNING] Slot 3: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}:
Contract instance stopped with error: GameSMError (SMCContractError (WalletError (InsufficientFunds "Total: Value (Map [(,Map [(,100000000)])]) expected: Value (Map [(f687...,Map [(guess,1)])])")))
In this case, the contract instance in wallet 1 crashes, because the wallet contains ‘insufficient funds’. Reading the last line closely, we see that although the wallet contained 100 million Ada, it lacked the game token, and so making a guess was not allowed.
Arguably, the off-chain code should not have tried to submit the guess transaction without holding the game token, and the contract instance should not have crashed. Or we might take the view that no harm is done, since the transaction is rejected anyway. But the crashing contract does cause tests to fail, which–as before–is likely to prevent us discovering other problems.
We can strengthen the precondition of Guess
to prevent this from
happening.
precondition s cmd = case cmd of
Lock {} -> isNothing tok
Guess w _ _ _ -> tok == Just w
GiveToken _ -> isJust tok
where
tok = s ^. CM.contractState . hasToken
With this change, the tests still fail, and we must study the entire log output to understand why:
> quickCheck $ prop_Game
*** Failed! Assertion failed (after 36 tests and 35 shrinks):
Actions
[Lock (Wallet 1) "*******" 1,
GiveToken (Wallet 2),
Guess (Wallet 2) "*******" "hello" 2,
Guess (Wallet 2) "*******" "hunter2" 1]
Expected funds of W2 to change by
Value (Map [(,Map [(,1)]),(f687...,Map [(guess,1)])])
but they changed by
Value (Map [(f687...,Map [(guess,1)])])
Test failed.
Emulator log:
... 52 lines of log output ...
[INFO] Slot 4: 00000000-0000-4000-8000-000000000001 {Contract instance for wallet 2}:
Receive endpoint call: Object (fromList [("tag",String "guess"),...Number 2.0...
... 25 lines of log output ...
[INFO] Slot 5: TxnValidationFail ab0d...: NegativeValue ...
[INFO] Slot 5: 00000000-0000-4000-8000-000000000001 {Contract instance for wallet 2}:
Receive endpoint call: Object (fromList [("tag",String "guess"),...Number 1.0...
In this case, we lock one Ada, and then wallet 2 makes two guesses, both with the correct password. The first guess tries to withdraw more Ada than are available, which our model predicts should be a no-op. Recall we defined:
nextState (Guess w old new val) = do
correctGuess <- (old ==) <$> CM.viewContractState currentSecret
holdsToken <- (Just w ==) <$> CM.viewContractState hasToken
enoughAda <- (val <=) <$> CM.viewContractState gameValue
when (correctGuess && holdsToken && enoughAda) $ do
currentSecret .= new
gameValue %= subtract val
CM.deposit w $ Ada.lovelaceValueOf val
CM.wait 1
Our model predicts that the second guess, with the correct password
and a withdrawal of only one Ada, ought to succeed. That is why we
expected wallet 2 to end up with the game token, and one Ada. However,
wallet 2 did not receive the Ada, only the game token. Reading the
emulator log reveals why: in slot 4 we called the guess
endpoint
to withdraw two Ada, which would leave -1 Ada locked by the contract,
but the transaction submitted to the blockchain was not validated, and
we see the error message NegativeValue
. We made the second
endpoint call, for the second guess, but nothing more happened. This
is because the validation failure did not crash the off-chain
contract instance (which would have provoked a test failure after the
first guess), it just left it waiting for a result from the
blockchain. As a result, the contract instance is hanging, and ignores
the second guess.
We can avoid this problem too, by strengthening the precondition further:
precondition s cmd = case cmd of
Lock _ _ v -> isNothing tok
Guess w _ _ v -> tok == Just w && v <= val
GiveToken w -> isJust tok
where
tok = s ^. CM.contractState . hasToken
val = s ^. CM.contractState . gameValue
Now the tests pass:
> quickCheck . withMaxSuccess 10000 $ prop_Game
+++ OK, passed 10000 tests.
Actions (241234 in total):
87.1324% GiveToken
9.0854% Guess
3.7822% Lock
It is good practice to run far more than 100 tests, once tests are passing.
In this section we discovered ways to crash the off-line contract instances, or leave them hanging. We debugged the problems by strengthening preconditions–but of course, the problems are still there. We have just avoided provoking them with our tests, which enabled us to continue testing and find more problems. But unless these problems are corrected, enabling our preconditions to be weakened again, then all we know from our tests is that the contract behaves correctly provided callers obey the preconditions.
Measuring and tuning distributions¶
Running successful tests displays statistics over the test cases
generated. By default, testing a Plutus.Contract.Test.ContractModel.Interface.ContractModel
just displays the
distribution of types of action. Looking at the output above, we can
see that the vast majority of actions were GiveToken
actions; only
9% were guesses, and fewer than 4% were Lock
actions.
It is not a surprise that there were relatively few Lock
actions:
our precondition guarantees that there can be at most one Lock
per
test case, and this is intentional, so of course the other actions are
much more common. However, we almost certainly don’t want to test
GiveToken
almost ten times as often as Guess
. What is going
on?
The problem is this: after a Lock
as the first action of a test
case, every attempt to generate a GiveToken
action will succeed;
that is, the precondition of the generated action will be
True
. But for Guess
actions, many randomly generated actions
will not satisfy the precondition we ended up with, either because the
wallet does not contain the game token, or because the amount to be
withdrawn is greater than the amount available.
To achieve a better distribution of tests, we need to redefine the
action generator so that Guess
actions more often satisfy their
precondition. The action generator is itself parameterized on the
contract state, so we could guarantee that generated guesses satisfy
their preconditions by redefining it as follows:
arbitraryAction s = oneof $
[ Lock <$> genWallet <*> genGuess <*> genValue ] ++
[ Guess w <$> genGuess <*> genGuess <*> choose (0, val)
| Just w <- [tok] ] ++
[ GiveToken <$> genWallet ]
where
tok = s ^. CM.contractState . hasToken
val = s ^. CM.contractState . gameValue
With this change, Guess
and GiveToken
actions become equally
frequent:
> quickCheck . withMaxSuccess 1000 $ prop_Game
+++ OK, passed 1000 tests.
Actions (23917 in total):
48.271% GiveToken
47.845% Guess
3.884% Lock
Custom generators vs preconditions¶
It may seem like wasted effort to encode the form of valid Guess
actions twice, once in the precondition, and then again in the
generator. Would it not be sufficient to write the generator to target
successful guesses in the first place, and omit the precondition?
The answer is no: it would not. By writing the generator
carefully, we can ensure that the generated Guess
actions are
valid, but as soon as a test fails, and QuickCheck begins to shrink
it, then the precondition becomes essential. Without it,
QuickCheck might remove a GiveToken
action that makes a subsequent
Guess
valid, and then report that the resulting test (not
surprisingly) failed. It is only preconditions that ensure that
shrunk test cases make sense.
Thus, the action generator cannot ensure that actions in test cases are valid; it can only skew the distribution of actions towards valid ones. This means there is no need for the action generator to guarantee that the actions it generates are valid; they will in any case have to pass the precondition before they are included in a test case. In fact, it is a little dangerous to define a generator so that only actions satisfying the precondition are generated, because we might later choose to weaken the precondition. If we do so, and forget to change the generator too, then we might end up with less thorough testing than we expect. So rather than generate guesses as we did above, it would be better to define
arbitraryAction s = oneof $
[ Lock <$> genWallet <*> genGuess <*> genValue ] ++
[ frequency $
[ (10, Guess w <$> genGuess <*> genGuess <*> choose (0, val))
| Just w <- [tok] ] ++
[ (1, Guess <$> genWallet <*> genGuess <*> genGuess <*> genValue) ] ] ++
[ GiveToken <$> genWallet ]
where
tok = s ^. CM.contractState . hasToken
val = s ^. CM.contractState . gameValue
which generates valid guesses most of the time, with the occasional possibly-invalid one. This approach results in test cases with a reasonable balance between guessing and passing the game token, while ensuring that if the preconditions are later changed, then we can still generate every test case we could before.
Instrumenting contract models to gather statistics¶
It is possible to gather further statistics about the tests we are
generating. For example, we might wonder what proportion of Guess
actions are correct guesses. We can find out by defining the
Plutus.Contract.Test.ContractModel.Interface.monitoring
method in the Plutus.Contract.Test.ContractModel.Interface.ContractModel
class:
monitoring :: (CM.ModelState state, CM.ModelState state) -> CM.Action state -> Property -> Property
This function is called for every Plutus.Contract.Test.ContractModel.Interface.Action
in a test case,
and given the Plutus.Contract.Test.ContractModel.Interface.ModelState
before and after the
Plutus.Contract.Test.ContractModel.Interface.Action
. Its result is a function that is applied to the
property being tested, so it can use any of the QuickCheck functions for analysing test case
distribution or adding output to counterexamples.
To create a table showing the proportion of guesses which were right or wrong, we can define
Plutus.Contract.Test.ContractModel.Interface.monitoring
as
monitoring (s, _) (Guess w guess new v) =
tabulate "Guesses" [if guess == secret then "Right" else "Wrong"]
where secret = s ^. CM.contractState . currentSecret
monitoring _ _ = id
This generates output such as this:
> quickCheck . withMaxSuccess 1000 $ prop_Game
+++ OK, passed 1000 tests.
Actions (23917 in total):
48.271% GiveToken
47.845% Guess
3.884% Lock
Guesses (11443 in total):
75.417% Wrong
24.583% Right
Around 25% of guesses were correct in this test run, which is not
surprising since we chose guesses uniformly from a list of four
possibilities (and the Plutus.Contract.Test.ContractModel.Interface.precondition
for guesses does not depend on
the choice). Since correct guesses are probably at least as
interesting to test as incorrect ones, a sensible next step would be
to modify the guess generator to guess correctly more often–perhaps
half the time. We leave this as an exercise for the reader.
It is always good practice to make measurements of the distribution of test cases like this, and then improve test case generation to that the distribution looks reasonable. Otherwise there is a risk of developing a false sense of security, engendered by running many thousands of trivial tests.
Goal-directed testing with dynamic logic¶
The tests we have developed so far test that ‘nothing bad ever happens’-the funds in a test always end up where the model says that they should. To put it another way, funds are never stolen. But this does not really cover everything we want to test: we also want to know that ‘something good eventually happens’, or at least, ‘something good is always possible’. Concretely, this will often mean testing that the funds in a contract can always be recovered–they cannot end up locked in a contract for ever. And indeed, in the case of the game contract, we would like to check that no matter what has happened previously, the Ada locked by the contract can always be recovered by a player who knows the password.
Here we are really identifying desirable ‘goal states’, namely those in which all the Ada have been recovered from the contract, and aiming to test that a goal state is always reachable. Obviously, random tests are quite unlikely to end in a goal state, so no particular conclusion can be drawn from one that does not. It is also hard to see how QuickCheck might determine automatically whether a goal state is reachable or not. So we test this kind of property by allowing the tester to specify a strategy for reaching a goal state; QuickCheck then tests that this strategy always works.
Introducing the dynamic logic monad¶
We write this kind of test using ‘dynamic logic’ wrapped in a monad, which just means that we write
test case generators that can mix random actions, specified actions, and assertions. These
generators are little programs in the Plutus.Contract.Test.ContractModel.Interface.DL
monad, such as
this one:
unitTest :: CM.DL GameModel ()
unitTest = do
CM.action $ Lock w1 "hello" 10
CM.action $ GiveToken w2
CM.action $ Guess w2 "hello" "new secret" 3
This Plutus.Contract.Test.ContractModel.Interface.DL
fragment simply specifies a unit test in terms
of the underlying Plutus.Contract.Test.ContractModel.Interface.ContractModel
we have already seen,
using Plutus.Contract.Test.ContractModel.Interface.action
to include a specific
Plutus.Contract.Test.ContractModel.Interface.Action
in the test. To run such a test, we must specify
a QuickCheck property such as
propDL :: CM.DL GameModel () -> Property
propDL dl = CM.forAllDL dl prop_Game
which uses Plutus.Contract.Test.ContractModel.Interface.forAllDL
to generate a test sequence from the
dl
provided, and runs it using the same underlying property as before. The execution is checked
against the model, so we do not need to add any further assertions to this unit test. This gives
us a very convenient way to define unit tests for a contract specified by a
Plutus.Contract.Test.ContractModel.Interface.ContractModel
.
We can run this test as follows:
> quickCheck . withMaxSuccess 1 $ propDL unitTest
+++ OK, passed 1 test.
Actions (3 in total):
33% GiveToken
33% Guess
33% Lock
Quantifiers in dynamic logic¶
As well as writing unit tests in the Plutus.Contract.Test.ContractModel.Interface.DL
monad, we can add random
generation. For example, if we wanted to generalize the unit test
above a little to lock a random amount of Ada in the contract, then
we could instead write:
unitTest :: CM.DL GameModel ()
unitTest = do
val <- CM.forAllQ $ CM.chooseQ (1, 20)
CM.action $ Lock w1 "hello" val
CM.action $ GiveToken w2
CM.action $ Guess w2 "hello" "new secret" 3
Here Plutus.Contract.Test.ContractModel.Interface.forAllQ
lets us generate a random value using
Test.QuickCheck.DynamicLogic.Quantify.chooseQ from quickcheck-dynamic:
chooseQ :: (Arbitrary a, Random a, Ord a) => (a, a) -> Quantification a
Plutus.Contract.Test.ContractModel.Interface.forAllQ
takes a
Test.QuickCheck.DynamicLogic.Quantify.Quantification, which resembles a QuickCheck
generator, but with a more limited API to support its use in dynamic
logic.
When this is tested, random values in the range 1-20 are locked… and a test fails:
> quickCheck $ propDL unitTest
*** Failed! Falsified (after 3 tests):
BadPrecondition
[Witness (1 :: Integer),
Do $ Lock (Wallet 1) "hello" 1,
Do $ GiveToken (Wallet 2)]
[Action (Guess (Wallet 2) "hello" "new secret" 3)]
(GameModel {_gameValue = 1, _hasToken = Just (Wallet 2), _currentSecret = "hello"})
Dynamic logic test cases are a little more complex than the simple
action sequences we have seen so far, and they give us a little more
information. Every such test contains a list of Plutus.Contract.Test.ContractModel.Interface.Action
, tagged
Do
, and witnesses, tagged Witness
. The witnesses record the
results of random choices made by Plutus.Contract.Test.ContractModel.Interface.forAllQ
: in this case, the Ada
value to be locked was chosen to be 1. The test proceeds by
locking the Ada and giving the game token to wallet 2, but the third
action we specified–making the guess–cannot be run, because its
precondition is False
. This is what the
BadPrecondition
tells
us, and the action that could not be performed appears as
[Action (Guess (Wallet 2) "hello" "new secret" 3)]
The last component is the model state at that point: we can see that
the gameValue
is only 1 Ada, so of course we cannot withdraw 3.
Note
We saw earlier that when tests are generated from a
Plutus.Contract.Test.ContractModel.Interface.ContractModel
, then QuickCheck only generates actions whosePlutus.Contract.Test.ContractModel.Interface.precondition
is satisfied. On the other hand, when we use dynamic logic to specify an action explicitly like this, then there is no guarantee that its precondition will hold, and so a ‘bad precondition’ error becomes a possibility. The problem here is really that this generalized unit test is inconsistent with our model.
Repeating a dynamic logic test¶
Once again, we can copy-and-paste the failed testcase into our source code:
badUnitTest :: CM.DL GameModel ()
badUnitTest = do
CM.action $ Lock w1 "hello" 1
CM.action $ GiveToken w2
CM.action $ Guess w2 "hello" "new secret" 3
We can rerun the test by supplying the precise action sequence that was generated:
> quickCheck $ forAllDL badUnitTest prop_Game
*** Failed! Falsified (after 1 test):
If we now correct unitTest
and do freshly generated random tests
we see that the issue is resolved:
> quickCheck $ forAllDL unitTest prop_Game
+++ OK, passed 100 tests.
Actions (300 in total):
33.3% GiveToken
33.3% Guess
33.3% Lock
In this case the saved test ‘passes’ because it no longer matches the
modified Plutus.Contract.Test.ContractModel.Interface.DL
test, so it is not a counterexample to the property we
are testing.
Something good is always possible¶
We saw above how to generate random parameters to actions in dynamic logic tests; what gives them their real power is that we can also include random actions.
Suppose we want to test that no Ada remain locked in the game contract
for ever. We could try to specify this with a Plutus.Contract.Test.ContractModel.Interface.DL
test that requires that no
Ada remain locked in the contract after any sequence of actions. We
can include a random sequence of actions in a Plutus.Contract.Test.ContractModel.Interface.DL
test using
Plutus.Contract.Test.ContractModel.Interface.anyActions_
, and we can make assertions about the
Plutus.Contract.Test.ContractModel.Interface.ModelState
using
Plutus.Contract.Test.ContractModel.Interface.assertModel
. Thus we can define
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
CM.anyActions_
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
to assert that, after any sequence of actions, no funds should remain
locked (Plutus.Contract.Test.ContractModel.Interface.lockedValue
extracts the total value locked in contracts from
the Plutus.Contract.Test.ContractModel.Interface.ModelState
).
Of course, this test fails:
> quickCheck $ forAllDL noLockedFunds prop_Game
*** Failed! Falsified (after 1 test and 2 shrinks):
BadPrecondition
[Do $ Lock (Wallet 1) "*******" 1]
[Assert "Locked funds should be zero"]
(GameModel {_gameValue = 1, _hasToken = Just (Wallet 1), _currentSecret = "*******"})
If all we do is lock one Ada, then obviously the locked funds are not
zero. The failed assertion is reported as a
BadPrecondition
(for the assertion).
The property we wrote above is wrong: what we really intended to say was that after a correct guess that requests all the funds, then no locked funds remain. Let us write a property that says that any wallet can recover the funds by making such a guess. To program our strategy, we will need to read the secret password, and the value remaining in the contract, from the contract model:
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
CM.anyActions_
w <- CM.forAllQ $ CM.elementsQ wallets
secret <- CM.viewContractState currentSecret
val <- CM.viewContractState gameValue
CM.action $ Guess w "" secret val
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
After a random sequence of actions, we choose a random wallet and construct a correct guess that recovers all the locked Ada to this wallet. But this property also fails!
> quickCheck $ forAllDL noLockedFunds prop_Game
*** Failed! Falsified (after 1 test and 2 shrinks):
BadPrecondition
[Witness (Wallet 1 :: Wallet)]
[Action (Guess (Wallet 1) "" "" 0)]
(GameModel {_gameValue = 0, _hasToken = Nothing, _currentSecret = ""})
Here QuickCheck has chosen the arbitrary sequence of actions to be
empty, so the contract has not even been locked–and of course, in
that case, a Guess
is not possible. To pass the test, our strategy
must work in every situation. However, if the contract has not been
locked, then there are no locked funds, so the assertion in this
property would pass without our doing anything at all. Perhaps we
should only make a Guess
if there are actually funds to be
recovered:
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
CM.anyActions_
w <- CM.forAllQ $ CM.elementsQ wallets
secret <- CM.viewContractState currentSecret
val <- CM.viewContractState gameValue
when (val > 0) $ do
CM.action $ Guess w "" secret val
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
This is better, but testing the property still fails:
> quickCheck $ forAllDL noLockedFunds prop_Game
*** Failed! Falsified (after 1 test and 1 shrink):
BadPrecondition
[Do $ Lock (Wallet 1) "*******" 1,
Witness (Wallet 2 :: Wallet)]
[Action (Guess (Wallet 2) "" "*******" 1)]
(GameModel {_gameValue = 1, _hasToken = Just (Wallet 1), _currentSecret = "*******"})
In this case we locked 1 Ada in the contract, chose wallet 2 to
recover the funds, and then tried to make a correct guess–but the
precondition for Guess
still failed. And this is no surprise: the
wallet does not hold the game token. This test case shows that, as
part of our strategy for recovering the funds, we also need to give
the game token to the wallet that will make the guess.
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
CM.anyActions_
w <- CM.forAllQ $ CM.elementsQ wallets
secret <- CM.viewContractState currentSecret
val <- CM.viewContractState gameValue
when (val > 0) $ do
CM.action $ GiveToken w
CM.action $ Guess w "" secret val
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
Now we expect the tests to pass:
> quickCheck $ forAllDL noLockedFunds prop_Game
*** Failed! Falsified (after 1 test):
BadPrecondition
[Do $ Lock (Wallet 1) "hello" 5,
Witness (Wallet 3 :: Wallet),
Do $ GiveToken (Wallet 3),
Do $ Guess (Wallet 3) "" "hello" 5]
[Assert "Locked funds should be zero"]
(GameModel {_gameValue = 5, _hasToken = Just (Wallet 3), _currentSecret = "hello"})
They do not! We can see from the last line that, in the final state,
our model indeed says that there are still 5 Ada locked in the
contract. This is the effect of the Plutus.Contract.Test.ContractModel.Interface.nextState
function in our model,
so let us inspect the relevant part of its code:
nextState (Guess w old new val) = do
correctGuess <- (old ==) <$> CM.viewContractState currentSecret
-- ...
Comparing carefully with the failed test, we see that our strategy is
supplying the empty string as the old password (the guess), and the
correct password as the new one–so the guess is wrong, and the Ada
was not recovered. Swapping the two password arguments to Guess
does, at last, make the tests pass.
For this simple contract, recovering the locked funds is easy–but as we have seen, writing a property that says that it is always possible forces us to be precise about our strategy, and reveals anything we might have overlooked.
Monitoring and tuning dynamic logic tests¶
The dynamic logic test we have developed only uses our recovery
strategy if there are locked funds remaining after a random sequence of
actions. How often does that happen? Given that tests contain many
more guesses than Lock
actions, there is a risk that the contract is
usually holding no funds before we even consider using our
strategy. To find out, we can Plutus.Contract.Test.ContractModel.Interface.monitor
the contract model during our
tests. As in the Plutus.Contract.Test.ContractModel.Interface.monitoring
method of the
Plutus.Contract.Test.ContractModel.Interface.ContractModel
class, we
can use any of the QuickCheck operations for analyzing test cases, but
instead of applying the Plutus.Contract.Test.ContractModel.Interface.monitoring
at every action in a test case, we
can Plutus.Contract.Test.ContractModel.Interface.monitor
at selected points.
In this case, we choose to label test cases that actually invoke our fund recovery strategy:
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
CM.anyActions_
w <- CM.forAllQ $ CM.elementsQ wallets
secret <- CM.viewContractState currentSecret
val <- CM.viewContractState gameValue
when (val > 0) $ do
CM.monitor $ label "Unlocking funds"
CM.action $ GiveToken w
CM.action $ Guess w secret "" val
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
With the addition of the Plutus.Contract.Test.ContractModel.Interface.monitor
line, QuickCheck tells us what
proportion of our tests actually leave funds to recover:
> quickCheck $ forAllDL noLockedFunds prop_Game
+++ OK, passed 100 tests (31% Unlocking funds).
Actions (5112 in total):
49.24% GiveToken
48.81% Guess
1.96% Lock
We can see that around 30% of generated tests leave some Ada in the
contract for our strategy to recover. This is a bit low–it means that
two thirds of our tests do not actually test the strategy. But it is
easy to address: we can simply use the dynamic logic to specify the
initial Lock
action explicitly, and generate larger amounts for
the initial funds locked in the game (lines 3-5 below):
noLockedFunds :: CM.DL GameModel ()
noLockedFunds = do
(w0, funds, pass) <- CM.forAllQ (CM.elementsQ wallets, CM.chooseQ (1, 10000), CM.elementsQ guesses)
CM.action $ Lock w0 pass funds
CM.anyActions_
w <- CM.forAllQ $ CM.elementsQ wallets
secret <- CM.viewContractState currentSecret
val <- CM.viewContractState gameValue
when (val > 0) $ do
CM.monitor $ label "Unlocking funds"
CM.action $ GiveToken w
CM.action $ Guess w secret "" val
CM.assertModel "Locked funds should be zero" $ CM.symIsZero . CM.lockedValue
With this addition, a much higher proportion of tests actually exercise our recovery strategy:
> quickCheck $ forAllDL noLockedFunds prop_Game
+++ OK, passed 100 tests (74% Unlocking funds).
Actions (5198 in total):
49.75% GiveToken
48.33% Guess
1.92% Lock
More dynamic logic¶
Dynamic logic tests are much more expressive than we have seen
hitherto. The Plutus.Contract.Test.ContractModel.Interface.DL
monad is an instance of Alternative
, so we can
write tests with random control flow, weight choices suitably, and so
on. For example, Plutus.Contract.Test.ContractModel.Interface.anyActions
, which generates a random sequence of
actions of expected length n
, is defined by
anyActions :: Int -> CM.DL s ()
anyActions n = CM.stopping
<|> CM.weight (1 / fromIntegral n)
<|> (CM.anyAction >> anyActions n)
This code makes a random choice between three alternatives, expressed
using (<|>)
. The first two alternatives terminate (and return
()
), while the last alternative performs a random action followed
by another random sequence of actions. The second alternative is
weighted by 1/n
, so the third is chosen n
times as often,
resulting in an expected length of n
actions. The first
alternative is guarded by Plutus.Contract.Test.ContractModel.Interface.stopping
, which means it will be chosen
only if the test case is ‘getting too long’; in this case
Plutus.Contract.Test.ContractModel.Interface.anyActions
will generate an empty sequence. We can exercise fine
control over the way test cases are generated, including specifying
strategies for bringing a long test case to a close. See the
documentation for more details.
Limitations¶
The tests we have developed here suffer from three main limitations:
We test only via the off-chain contract endpoints
We test under favourable assumptions on timing
We don’t test for information leaks
What are the implications, and how can we address them?
Testing only via contract endpoints¶
We tested that the guessing game contract behaves as it should, provided transactions are submitted using the off-chain contract code. But what if a malicious actor writes their own off-chain code, submitting transactions that the contract’s own off-chain code cannot generate? Is it possible, for example, to steal the Ada locked by the game, without submitting a guess? Our tests do not cover this.
One way to mitigate this problem is to add additional ‘attack’ endpoints to the contract under test, that carry out a variety of conceivable attacks. Our contract model would then model these attack actions as no-ops, to represent the fact that the attacks fail; our tests would then check that the attacks fail in all circumstances, and with all parameters.
If we think of the guessing game, not as a game, but instead as a contract that protects funds with a password, then we might consider ‘guessing’ with the correct password as a correct withdrawal, and guessing with an incorrect password as an attack. Our model does test that these attacks always fail; this approach can be used in general to test that contracts are robust against anticipated attacks.
Test assumptions on timing¶
Our tests wait for the transactions generated by an
Plutus.Contract.Test.ContractModel.Interface.Action
to complete before performing the next
Plutus.Contract.Test.ContractModel.Interface.Action
(because we inserted calls to delay
into
Plutus.Contract.Test.ContractModel.Interface.perform
, and
Plutus.Contract.Test.ContractModel.Interface.wait
into
Plutus.Contract.Test.ContractModel.Interface.nextState
). In reality, different wallets may perform
actions simultaneously. This usually results in two or more transactions that try to spend the same
UTXO, leading all but the first to fail.
Endpoint calls that submit several transactions are
even more problematic, because such a call may fail part way through,
leaving the blockchain in an intermediate, and possibly invalid
state. Arguably, contracts should be written to undo the effects of
earlier transactions if later ones fail–although, as we saw, the
Lock
endpoint of the game contract (which is implemented
as two transactions) does not do this: a second call to Lock
fails
after the first transaction, and leaves the blockchain in a state that
the contract cannot handle.
It is possible to test this kind of behaviour in our framework, by not delaying before the next
action, so that several actions can be started in the same slot. Delays must then be included as an
explicit Plutus.Contract.Test.ContractModel.Interface.Action
in test cases instead. However,
modelling becomes considerably harder, because the model must predict which transaction of several
simultaneous ones succeeds, and must take into account how many transactions each end-point call
results in, and which slot each is expected to be commited in. It isn’t clear that the extra
modelling effort is really worthwhile.
Moreover, in reality transactions might be delayed, which means that endpoint calls that generate several transactions might end up being interleaved in unexpected ways. The emulator doesn’t currently simulate this, and so these kinds of tests cannot yet be run.
Testing for information leaks¶
We have tested that only a guess containing the correct secret can withdraw Ada from the game contract. So to steal the money, an adversary must discover the secret. But recall that an adversary can access everything on the blockchain, and also the contents of transactions.
There are least two serious bugs that the game contract could contain, that would permit an adversary to steal all the money:
The secret could be stored in plain text in the contract state, instead of encrypted. The contract state is stored on the blockchain. The adversary could simply read the secret from the blockchain, and then make a correct guess to steal the money.
The secret might be stored in encrypted form on the blockchain, but
Guess
transactions might contain an encrypted guess, rather than a plain text guess (as they do in this implementation). Then the adversary could simply read the encrypted secret from the blockchain, and submit it in a guess transaction to steal the money.
Neither of these bugs would be detected by our tests, nor is it clear how they could be.
This test framework is a powerful tool for testing that contracts behave correctly, when used as intended–but users should be aware of the limitations in this section, and be careful to avoid the pitfalls they expose.