How To Write Tests

The Anatomy of a Test Function

Tests are Cairo functions that verify that the non-test code is functioning in the expected manner. The bodies of test functions typically perform these three actions:

  • Set up any needed data or state.
  • Run the code you want to test.
  • Assert the results are what you expect.

Let’s look at the features Cairo provides specifically for writing tests that take these actions, which include the test attribute, the assert function, and the should_panic attribute.

The Anatomy of a Test Function

At its simplest, a test in Cairo is a function that’s annotated with the test attribute. Attributes are metadata about pieces of Cairo code; one example is the derive attribute we used with structs in Chapter 5. To change a function into a test function, add #[test] on the line before fn. When you run your tests with the scarb cairo-test command, Scarb runs Cairo's test runner binary that runs the annotated functions and reports on whether each test function passes or fails.

Let's create a new project called adder that will add two numbers using Scarb with the command scarb new adder:

adder
├── Scarb.toml
└── src
    └── lib.cairo

In lib.cairo, let's remove the existing content and add a first test, as shown in Listing 9-1.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
    #[test]
    fn it_works() {
        let result = 2 + 2;
        assert(result == 4, 'result is not 4');
    }
}

Listing 9-1: A test module and function

For now, let’s ignore the top two lines and focus on the function. Note the #[test] annotation: this attribute indicates this is a test function, so the test runner knows to treat this function as a test. We might also have non-test functions in the tests module to help set up common scenarios or perform common operations, so we always need to indicate which functions are tests.

The example function body uses the assert function, which contains the result of adding 2 and 2, equals 4. This assertion serves as an example of the format for a typical test. Let’s run it to see that this test passes.

The scarb cairo-test command runs all tests founds in our project, as shown in Listing 9-2.

$ scarb cairo-test
testing adder...
running 1 tests
test adder::lib::tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 filtered out;

Listing 9-2: The output from running a test

scarb cairo-test compiled and ran the test. We see the line running 1 tests. The next line shows the name of the test function, called it_works, and that the result of running that test is ok. The overall summary test result: ok. means that all the tests passed, and the portion that reads 1 passed; 0 failed totals the number of tests that passed or failed.

It’s possible to mark a test as ignored so it doesn’t run in a particular instance; we’ll cover that in the Ignoring Some Tests Unless Specifically Requested section later in this chapter. Because we haven’t done that here, the summary shows 0 ignored. We can also pass an argument to the scarb cairo-test command to run only a test whose name matches a string; this is called filtering and we’ll cover that in the Running Single Tests section. We also haven’t filtered the tests being run, so the end of the summary shows 0 filtered out.

Let’s start to customize the test to our own needs. First change the name of the it_works function to a different name, such as exploration, like so:

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
    #[test]
    fn exploration() {
        let result = 2 + 2;
        assert(result == 4, 'result is not 4');
    }
}

Then run scarb cairo-test again. The output now shows exploration instead of it_works:

$ scarb cairo-test
running 1 tests
test adder::lib::tests::exploration ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 filtered out;

Now we’ll add another test, but this time we’ll make a test that fails! Tests fail when something in the test function panics. Each test is run in a new thread, and when the main thread sees that a test thread has died, the test is marked as failed. Enter the new test as a function named another, so your src/lib.cairo file looks like Listing 9-3.

#![allow(unused)]
fn main() {
    #[test]
    fn another() {
        let result = 2 + 2;
        assert(result == 6, 'Make this test fail');
    }

}

Listing 9-3: Adding a second test that will fail

$ scarb cairo-test
running 2 tests
test adder::lib::tests::exploration ... ok
test adder::lib::tests::another ... fail
failures:
    adder::lib::tests::another - panicked with [1725643816656041371866211894343434536761780588 ('Make this test fail'), ].
Error: test result: FAILED. 1 passed; 1 failed; 0 ignored

Listing 9-4: Test results when one test passes and one test fails

Instead of ok, the line adder::lib::tests::another shows fail. A new section appears between the individual results and the summary. It displays the detailed reason for each test failure. In this case, we get the details that another failed because it panicked with [1725643816656041371866211894343434536761780588 ('Make this test fail'), ] in the src/lib.cairo file.

The summary line displays at the end: overall, our test result is FAILED. We had one test pass and one test fail.

Now that you’ve seen what the test results look like in different scenarios, let’s look at some functions that are useful in tests.

Checking Results with the assert function

The assert function, provided by Cairo, is useful when you want to ensure that some condition in a test evaluates to true. We give the assert function a first argument that evaluates to a Boolean. If the value is true, nothing happens and the test passes. If the value is false, the assert function calls panic() to cause the test to fail with a message we defined as the second argument of the assert function. Using the assert function helps us check that our code is functioning in the way we intend.

In Chapter 5, Listing 5-15, we used a Rectangle struct and a can_hold method, which are repeated here in Listing 9-5. Let’s put this code in the src/lib.cairo file, then write some tests for it using the assert function.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
trait RectangleTrait {
    fn area(self: @Rectangle) -> u64;
    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool;
}

impl RectangleImpl of RectangleTrait {
    fn area(self: @Rectangle) -> u64 {
        *self.width * *self.height
    }

    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool {
        *self.width > *other.width && *self.height > *other.height
    }
}
}

Listing 9-5: Using the Rectangle struct and its can_hold method from Chapter 5

The can_hold method returns a bool, which means it’s a perfect use case for the assert function. In Listing 9-6, we write a test that exercises the can_hold method by creating a Rectangle instance that has a width of 8 and a height of 7 and asserting that it can hold another Rectangle instance that has a width of 5 and a height of 1.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
use debug::PrintTrait;
#[derive(Copy, Drop)]
struct Rectangle {
    width: u64,
    height: u64,
}

trait RectangleTrait {
    fn area(self: @Rectangle) -> u64;
    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool;
}

impl RectangleImpl of RectangleTrait {
    fn area(self: @Rectangle) -> u64 {
        *self.width * *self.height
    }

    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool {
        *self.width > *other.width && *self.height > *other.height
    }
}

#[cfg(test)]
mod tests {
    use super::Rectangle;
    use super::RectangleTrait;


    #[test]
    fn larger_can_hold_smaller() {
        let larger = Rectangle { height: 7, width: 8, };
        let smaller = Rectangle { height: 1, width: 5, };

        assert(larger.can_hold(@smaller), 'rectangle cannot hold');
    }

    #[test]
    fn smaller_cannot_hold_larger() {
        let larger = Rectangle { height: 7, width: 8, };
        let smaller = Rectangle { height: 1, width: 5, };

        assert(!smaller.can_hold(@larger), 'rectangle cannot hold');
    }
}


}

Listing 9-6: A test for can_hold that checks whether a larger rectangle can indeed hold a smaller rectangle

Note that we’ve added two new lines inside the tests module: use super::Rectangle; and use super::RectangleTrait;. The tests module is a regular module that follows the usual visibility rules. Because the tests module is an inner module, we need to bring the code under test in the outer module into the scope of the inner module.

We’ve named our test larger_can_hold_smaller, and we’ve created the two Rectangle instances that we need. Then we called the assert function and passed it the result of calling larger.can_hold(@smaller). This expression is supposed to return true, so our test should pass. Let’s find out!

$ scarb cairo-test
running 1 tests
test adder::lib::tests::larger_can_hold_smaller ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 filtered out;

It does pass! Let’s add another test, this time asserting that a smaller rectangle cannot hold a larger rectangle:

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
use debug::PrintTrait;
#[derive(Copy, Drop)]
struct Rectangle {
    width: u64,
    height: u64,
}

trait RectangleTrait {
    fn area(self: @Rectangle) -> u64;
    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool;
}

impl RectangleImpl of RectangleTrait {
    fn area(self: @Rectangle) -> u64 {
        *self.width * *self.height
    }

    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool {
        *self.width > *other.width && *self.height > *other.height
    }
}

#[cfg(test)]
mod tests {
    use super::Rectangle;
    use super::RectangleTrait;


    #[test]
    fn larger_can_hold_smaller() {
        let larger = Rectangle { height: 7, width: 8, };
        let smaller = Rectangle { height: 1, width: 5, };

        assert(larger.can_hold(@smaller), 'rectangle cannot hold');
    }

    #[test]
    fn smaller_cannot_hold_larger() {
        let larger = Rectangle { height: 7, width: 8, };
        let smaller = Rectangle { height: 1, width: 5, };

        assert(!smaller.can_hold(@larger), 'rectangle cannot hold');
    }
}


}

Because the correct result of the can_hold function in this case is false, we need to negate that result before we pass it to the assert function. As a result, our test will pass if can_hold returns false:

$ scarb cairo-test
    running 2 tests
    test adder::lib::tests::smaller_cannot_hold_larger ... ok
    test adder::lib::tests::larger_can_hold_smaller ... ok
    test result: ok. 2 passed; 0 failed; 0 ignored; 0 filtered out;

Two tests that pass! Now let’s see what happens to our test results when we introduce a bug in our code. We’ll change the implementation of the can_hold method by replacing the greater-than sign with a less-than sign when it compares the widths:

#![allow(unused)]
fn main() {
impl RectangleImpl of RectangleTrait {
    fn area(self: @Rectangle) -> u64 {
        *self.width * *self.height
    }

    fn can_hold(self: @Rectangle, other: @Rectangle) -> bool {
        *self.width < *other.width && *self.height > *other.height
    }
}
}

Running the tests now produces the following:

$ scarb cairo-test
running 2 tests
test adder::lib::tests::smaller_cannot_hold_larger ... ok
test adder::lib::tests::larger_can_hold_smaller ... fail
failures:
   adder::lib::tests::larger_can_hold_smaller - panicked with [167190012635530104759003347567405866263038433127524 ('rectangle cannot hold'), ].

Error: test result: FAILED. 1 passed; 1 failed; 0 ignored

Our tests caught the bug! Because larger.width is 8 and smaller.width is 5, the comparison of the widths in can_hold now returns false: 8 is not less than 5.

Checking for panics with should_panic

In addition to checking return values, it’s important to check that our code handles error conditions as we expect. For example, consider the Guess type in Listing 9-8. Other code that uses Guess depends on the guarantee that Guess instances will contain only values between 1 and 100. We can write a test that ensures that attempting to create a Guess instance with a value outside that range panics.

We do this by adding the attribute should_panic to our test function. The test passes if the code inside the function panics; the test fails if the code inside the function doesn’t panic.

Listing 9-8 shows a test that checks that the error conditions of GuessTrait::new happen when we expect them to.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
#[derive(Copy, Drop)]
struct Guess {
    value: u64,
}

trait GuessTrait {
    fn new(value: u64) -> Guess;
}

impl GuessImpl of GuessTrait {
    fn new(value: u64) -> Guess {
        if value < 1 || value > 100 {
            let mut data = ArrayTrait::new();
            data.append('Guess must be >= 1 and <= 100');
            panic(data);
        }
        Guess { value }
    }
}

#[cfg(test)]
mod tests {
    use super::Guess;
    use super::GuessTrait;

    #[test]
    #[should_panic]
    fn greater_than_100() {
        GuessTrait::new(200);
    }
}
}

Listing 9-8: Testing that a condition will cause a panic

We place the #[should_panic] attribute after the #[test] attribute and before the test function it applies to. Let’s look at the result when this test passes:

$ scarb cairo-test
running 1 tests
test adder::lib::tests::greater_than_100 ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 filtered out;

Looks good! Now let’s introduce a bug in our code by removing the condition that the new function will panic if the value is greater than 100:

#![allow(unused)]
fn main() {
#[derive(Copy, Drop)]
struct Guess {
    value: u64,
}

trait GuessTrait {
    fn new(value: u64) -> Guess;
}

impl GuessImpl of GuessTrait {
    fn new(value: u64) -> Guess {
        if value < 1 {
            let mut data = ArrayTrait::new();
            data.append('Guess must be >= 1 and <= 100');
            panic(data);
        }

        Guess { value, }
    }
}


}

When we run the test in Listing 9-8, it will fail:

$ scarb cairo-test
running 1 tests
test adder::lib::tests::greater_than_100 ... fail
failures:
   adder::lib::tests::greater_than_100 - expected panic but finished successfully.
Error: test result: FAILED. 0 passed; 1 failed; 0 ignored

We don’t get a very helpful message in this case, but when we look at the test function, we see that it’s annotated with #[should_panic]. The failure we got means that the code in the test function did not cause a panic.

Tests that use should_panic can be imprecise. A should_panic test would pass even if the test panics for a different reason from the one we were expecting. To make should_panic tests more precise, we can add an optional expected parameter to the should_panic attribute. The test harness will make sure that the failure message contains the provided text. For example, consider the modified code for Guess in Listing 9-9 where the new function panics with different messages depending on whether the value is too small or too large.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
#[derive(Copy, Drop)]
struct Guess {
    value: u64,
}

trait GuessTrait {
    fn new(value: u64) -> Guess;
}

impl GuessImpl of GuessTrait {
    fn new(value: u64) -> Guess {
        if value < 1 {
            panic_with_felt252('Guess must be >= 1');
        } else if value > 100 {
            panic_with_felt252('Guess must be <= 100');
        }

        Guess { value, }
    }
}

#[cfg(test)]
mod tests {
    use super::Guess;
    use super::GuessTrait;

    #[test]
    #[should_panic(expected: ('Guess must be <= 100',))]
    fn greater_than_100() {
        GuessTrait::new(200);
    }
}


}

Listing 9-9: Testing for a panic with a panic message containing the error message string

This test will pass because the value we put in the should_panic attribute’s expected parameter is the array of string of the message that the Guess::new function panics with. We need to specify the entire panic message that we expect.

To see what happens when a should_panic test with an expected message fails, let’s again introduce a bug into our code by swapping the bodies of the if value < 1 and the else if value > 100 blocks:

#![allow(unused)]
fn main() {
impl GuessImpl of GuessTrait {
    fn new(value: u64) -> Guess {
        if value < 1 {
            let mut data = ArrayTrait::new();
            data.append('Guess must be >= 1');
            panic(data);
        } else if value > 100 {
            let mut data = ArrayTrait::new();
            data.append('Guess must be <= 100');
            panic(data);
        }

        Guess { value, }
    }
}

#[cfg(test)]
mod tests {
    use super::Guess;
    use super::GuessTrait;

    #[test]
    #[should_panic(expected: ('Guess must be <= 100',))]
    fn greater_than_100() {
        GuessTrait::new(200);
    }
}
}

This time when we run the should_panic test, it will fail:

$ scarb cairo-test
running 1 tests
test adder::lib::tests::greater_than_100 ... fail
failures:
   adder::lib::tests::greater_than_100 - panicked with [6224920189561486601619856539731839409791025 ('Guess must be >= 1'), ].

Error: test result: FAILED. 0 passed; 1 failed; 0 ignored

The failure message indicates that this test did indeed panic as we expected, but the panic message did not include the expected string. The panic message that we did get in this case was Guess must be >= 1. Now we can start figuring out where our bug is!

Running Single Tests

Sometimes, running a full test suite can take a long time. If you’re working on code in a particular area, you might want to run only the tests pertaining to that code. You can choose which tests to run by passing scarb cairo-test an option -f (for "filter"), followed by the name of the test you want to run as an argument.

To demonstrate how to run a single test, we’ll first create two tests functions, as shown in Listing 9-10, and choose which ones to run.

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    #[test]
    fn add_two_and_two() {
        let result = 2 + 2;
        assert(result == 4, 'result is not 4');
    }

    #[test]
    fn add_three_and_two() {
        let result = 3 + 2;
        assert(result == 5, 'result is not 5');
    }
}
}

Listing 9-10: Two tests with two different names

We can pass the name of any test function to cairo-test to run only that test using the -f flag:

$ scarb cairo-test -f add_two_and_two
running 1 tests
test adder::lib::tests::add_two_and_two ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 1 filtered out;

Only the test with the name add_two_and_two ran; the other test didn’t match that name. The test output lets us know we had one more test that didn’t run by displaying 1 filtered out at the end.

We can also specify part of a test name, and any test whose name contains that value will be run.

Ignoring Some Tests Unless Specifically Requested

Sometimes a few specific tests can be very time-consuming to execute, so you might want to exclude them during most runs of scarb cairo-test. Rather than listing as arguments all tests you do want to run, you can instead annotate the time-consuming tests using the ignore attribute to exclude them, as shown here:

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        let result = 2 + 2;
        assert(result == 4, 'result is not 4');
    }

    #[test]
    #[ignore]
    fn expensive_test() { // code that takes an hour to run
    }
}
}

After #[test] we add the #[ignore] line to the test we want to exclude. Now when we run our tests, it_works runs, but expensive_test doesn’t:

$ scarb cairo-test
running 2 tests
test adder::lib::tests::expensive_test ... ignored
test adder::lib::tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 1 ignored; 0 filtered out;

The expensive_test function is listed as ignored.

When you’re at a point where it makes sense to check the results of the ignored tests and you have time to wait for the results, you can run scarb cairo-test --include-ignored to run all tests whether they’re ignored or not.

Testing recursive functions or loops

When testing recursive functions or loops, you must provide the test with a maximum amount of gas that it can consume. This prevents running infinite loops or consuming too much gas, and can help you benchmark the efficiency of your implementations. To do so, you must add the #[available_gas(<Number>)] attribute on the test function. The following example shows how to use it:

Filename: src/lib.cairo

#![allow(unused)]
fn main() {
fn sum_n(n: usize) -> usize {
    let mut i = 0;
    let mut sum = 0;
    loop {
        if i == n {
            sum += i;
            break;
        };
        sum += i;
        i += 1;
    };
    sum
}

#[cfg(test)]
mod test {
    use super::sum_n;
    #[test]
    #[available_gas(2000000)]
    fn test_sum_n() {
        let result = sum_n(10);
        assert(result == 55, 'result is not 55');
    }
}
}

Benchmarking the gas usage of a specific operation

When you want to benchmark the gas usage of a specific operation, you can use the following pattern in your test function.

#![allow(unused)]
fn main() {
let initial = testing::get_available_gas();
gas::withdraw_gas().unwrap();
    /// code we want to bench.
(testing::get_available_gas() - x).print();
}

The following example shows how to use it to test the gas function of the sum_n function above.

#![allow(unused)]
fn main() {
fn sum_n(n: usize) -> usize {
    let mut i = 0;
    let mut sum = 0;
    loop {
        if i == n {
            sum += i;
            break;
        };
        sum += i;
        i += 1;
    };
    sum
}

#[cfg(test)]
mod test {
    use super::sum_n;
    use debug::PrintTrait;
    #[test]
    #[available_gas(2000000)]
    fn benchmark_sum_n_gas() {
        let initial = testing::get_available_gas();
        gas::withdraw_gas().unwrap();
        /// code we want to bench.
        let result = sum_n(10);
        (initial - testing::get_available_gas()).print();
    }
}
}

The value printed when running scarb cairo-test is the amount of gas that was consumed by the operation benchmarked.

$ scarb cairo-test
testing no_listing_09_benchmark_gas ...
running 1 tests
[DEBUG]	                               	(raw: 0x179f8

test no_listing_09_benchmark_gas::benchmark_sum_n_gas ... ok (gas usage est.: 98030)
test result: ok. 1 passed; 0 failed; 0 ignored; 0 filtered out;

Here, the gas usage of the sum_n function is 96760 (decimal representation of the hex number). The total amount consumed by the test is slightly higher at 98030, due to some extra steps required to run the entire test function.

Last change: 2023-09-22, commit: 724b90c