Introduction

Stwo is a state-of-the-art framework for creating ZK proofs that brings many compelling features to the ZK ecosystem, including:

  • Provides both the frontend and backend for creating ZK proofs
  • Frontend is designed to be flexible to allow you to express your own proof system
  • Backend leverages Circle STARKs over the Mersenne31 prime field for fast prover performance
  • Seamlessly integrated with Cairo

This book will explain why you might want to create a proof system using Stwo, guide you through the process of creating your own proof system, and provide in-depth explanations of the inner workings of Stwo.

Why Use a Proof System?

At its core, a proof system can prove the validity of a statement \(C(x)=y\), where \(C\) is a representation of some logic, \(x\) is an input, and \(y\) is the output of that logic. (Assuming that we are only dealing with logic that can be expressed as a computation, we will henceforth refer to this logic as a computation.) This means that we can verify the validity of a statement either by directly running the computation, or by verifying the validity of the proof produced with the proof system. The verifier benefits from the second option in terms of time and space if verifying the proof is faster than running the computation, or if the proof is smaller than the input to the statement.

This property of a proof system is often referred to as succinctness, and it is exactly why proof systems have seen wide adoption in the blockchain space, where on-chain computation is much more expensive than off-chain computation. Using a proof system, it becomes possible to replace a large batch of on-chain computation with a proof that the same computation was executed, and to verify that proof on-chain. This way, proof generation can be handled off-chain on large machines, and only the proof verification needs to be done on-chain.

But there are applications of proof systems beyond blockchains. Since a proof is verifiable as well as succinct, it can also be used as auxiliary data to verify that the computation of an untrusted party was done correctly. For example, when we delegate computation to an untrusted server, we can ask it to provide, along with the computation result, a proof that the result indeed came from running a specific computation. Another example is asking a server running an ML model to provide proof that it ran inference on the correct model. The size of the accompanying proof and the time to verify it will be negligible compared to the cost of running the computation, but we gain the guarantee that the computation was done correctly.

Another optional feature of proof systems is zero-knowledge, which means that the proof reveals nothing about the computation other than its validity. In general, the output \(y\) of the computation \(C(x)=y\) will be public (i.e. revealed to the verifier), but the input \(x\) may be split into public and private parts, so that the verifier does not learn anything about the private part. With this feature, the intermediate values computed by the prover while computing \(C(x)\) will also be hidden from the verifier.

Why Stwo?

Before we dive into why we should choose Stwo, let's define some terminology. When we talked about proof systems in the previous section, we were only referring to the part that takes a statement and outputs a proof. In reality, however, we first need to structure the statement in a way that it can be proven by the proof system. This structuring part is often referred to as the frontend, and the proof system is commonly called the backend.

With that out of the way, let's dive into some of the advantages of using Stwo.

First, Stwo is a standalone framework that provides both the frontend and backend and therefore handles the entire proving process. There are other frameworks that provide only the frontend or only the backend; this modular structure has the advantage of letting users pick and choose a frontend or backend to their liking. However, having a single integrated frontend and backend reduces the complexity of the system and is also easier to maintain.

In addition, Stwo's frontend structures statements in the Algebraic Intermediate Representation (AIR), which is a representation that is especially useful for proving statements that are repetitive (e.g. the CPU in a VM, which essentially repeats the same fetch-decode-execute over and over again).

Stwo's backend is also optimized for prover performance. This is largely due to three factors.

  1. It implements STARKs, or hash-based SNARKs, which boast faster provers compared to elliptic curve-based SNARKs like Groth16 or PLONK. This improvement comes mainly from running the majority of the computation in a small (32-bit) prime field; elliptic curve-based SNARKs, on the other hand, need to use large prime fields (e.g. 254-bit prime fields), which incur significant overhead since most computations do not require that many bits.

  2. Even amongst multiple STARK backends, however, Stwo provides state-of-the-art prover performance by using the Mersenne-31 prime field (modulo \(2^{31} - 1\)), which is faster than other popular 32-bit prime fields like BabyBear (modulo \(2^{31} - 2^{27} + 1\)). We suggest going through this post for a breakdown of why this is the case.

  3. Finally, Stwo offers various CPU and GPU optimizations that improve prover performance, as shown in Figure 1 below. It can also be compiled to WASM, allowing for fast proving in web environments.

Figure 1: Prover performance optimizations in Stwo

One of the drawbacks of STARKs is that they have a larger proof size compared to elliptic curve-based SNARKs. One way to mitigate this drawback is by batching multiple proofs together to form a single proof.

Note

On zero-knowledge:

As of the time of this writing, Stwo does not provide the "zero-knowledge" feature. "Zero-knowledge" here refers to the fact that the proof should not reveal any additional information other than the validity of the statement, which is not true for Stwo as it reveals to the verifier commitments to its witness values without hiding them by e.g. adding randomness. This reveals some information about the witness values, which may be used in conjunction with other information to infer the witness values.

AIR Development

This section is intended for developers who want to create custom proofs using Stwo (proofs of custom VMs, ML inference, etc.). It assumes that the reader is familiar with Rust and has some background knowledge of cryptography (e.g. finite fields). It also assumes that the reader is familiar with the concept of zero-knowledge proofs and knows what they want to create a zero-knowledge proof for, but it does not assume any firsthand experience with zero-knowledge proof systems.

Note

All the code that appears throughout this section is available here.

Writing a Simple AIR

Figure 1: Proving lifecycle in Stwo

Welcome to the guide for writing AIRs in Stwo!

In this "Writing a Simple AIR" section, we will go through the process of writing a simple AIR from scratch. This requires some understanding of the proving lifecycle in Stwo, so we added a diagram showing a high-level overview of the whole process. As we go through each step, please note that the diagram may contain more steps than the code. This is because there are steps that are abstracted away by the Stwo implementation, but is necessary for understanding the code that we need to write when creating an AIR.

Hello (ZK) World

Let's first set up a Rust project with Stwo.

$ cargo new stwo-example

We need to specify the nightly Rust compiler to use Stwo.

$ echo -e "[toolchain]\nchannel = \"nightly-2025-01-02\"" > rust-toolchain.toml

Now let's edit the Cargo.toml file as follows:

[package]
name = "stwo-examples"
version = "0.1.0"
edition = "2021"
license = "MIT"

[dependencies]
stwo-prover = { git = "https://github.com/starkware-libs/stwo.git", rev = "92984c060b49d0db05e021883755fac0a71a2fa7" }
num-traits = "0.2.17"
itertools = "0.12.0"
rand = "0.8.5"

We are all set!

Writing a Spreadsheet

Figure 1: Prover workflow: Create a table

In order to create a proof, we first need to create a table of rows and columns. This is no different from writing integers into an Excel spreadsheet, as we can see in Figure 1.

Figure 2: From spreadsheet to table

But there is a slight caveat to consider when creating the table. Stwo implements SIMD operations to speed up the prover on the CPU, but this requires providing the table cells in chunks of 16 rows. Simply put, this is because Stwo uses 16 lanes of 32-bit integers, which means the same instruction can be run simultaneously on 16 different data values.

Thus, for our table, we will need to add 14 dummy rows to bring the total number of rows to 16, as shown in Figure 3. For the sake of simplicity, however, we will omit the dummy rows in the diagrams of the following sections.

Figure 3: Creating a table with 16 rows

Given all that, let's create this table using Stwo.

use stwo_prover::core::{
    backend::{
        simd::{column::BaseColumn, m31::N_LANES},
        Column,
    },
    fields::m31::M31,
};

fn main() {
    let num_rows = N_LANES;

    let mut col_1 = BaseColumn::zeros(num_rows as usize);
    col_1.set(0, M31::from(1));
    col_1.set(1, M31::from(7));

    let mut col_2 = BaseColumn::zeros(num_rows as usize);
    col_2.set(0, M31::from(5));
    col_2.set(1, M31::from(11));
}

As mentioned above, we instantiate the num_rows of our table as N_LANES=16 to accommodate SIMD operations. Then we create a BaseColumn of N_LANES=16 rows for each column and populate the first two rows with our values and the rest with dummy values.

Note that the values in the BaseColumn need to be of type M31, which refers to the Mersenne-31 prime field that Stwo uses. This means that the integers in the table must be in the range \([0, 2^{31} - 2]\), i.e. reduced modulo the prime \(2^{31} - 1\).
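
To build some intuition for this reduction, here is a small, library-independent sketch we added for illustration (the actual M31 implementation uses more optimized reduction routines):

const P: u64 = (1 << 31) - 1; // the Mersenne-31 prime

// Reduce an ordinary integer into the field, mirroring what M31 does conceptually.
fn reduce(x: u64) -> u64 {
    x % P
}

fn main() {
    assert_eq!(reduce(P), 0); // 2^31 - 1 wraps around to 0
    assert_eq!(reduce(5 + 11), 16); // small values behave like ordinary integers
}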

Now that we have our table, let's move on!

From Spreadsheet to Trace Polynomials

Figure 1: Prover workflow: Trace polynomials

In the previous section, we created a table (aka spreadsheet). In this section, we will convert the table into something called trace polynomials.

Figure 2: From spreadsheet to trace polynomials

As we can see in Figure 2, the cells in each column of the table can be seen as evaluations of a Lagrange polynomial. A characteristic of Lagrange interpolation is that interpolating \(n\) distinct points results in a unique polynomial of degree at most \(n-1\). So if we consider each row value \(f(x_i)\) as the evaluation at a distinct point \(x_i\), we get a unique polynomial of degree at most \(num\_rows-1\) for each column, which is also known as a trace polynomial.
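
As a concrete illustration (standard Lagrange interpolation, added here for reference), the unique degree-1 polynomial passing through the two points \((x_1, f(x_1))\) and \((x_2, f(x_2))\) of Figure 2 is

\(f(x) = f(x_1)\dfrac{x - x_2}{x_1 - x_2} + f(x_2)\dfrac{x - x_1}{x_2 - x_1}.\)

Evaluating it at \(x_1\) or \(x_2\) recovers exactly the corresponding cell value, and the same construction generalizes to all 16 rows of a column.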

We will explain why using a polynomial representation is useful in the next section, but for now, let's see how we can create trace polynomials for our code. Note that we are building upon the code from the previous section, so there's not much new code here.

fn main() {
    // --snip--

    // Convert table to trace polynomials
    let domain = CanonicCoset::new(log_num_rows).circle_domain();
    let _trace: ColumnVec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>> =
        vec![col_1, col_2]
            .into_iter()
            .map(|col| CircleEvaluation::new(domain, col))
            .collect();
}

Here, domain refers to the \(x_i\) values used to interpolate the trace polynomials. For example, the \(x_1, x_2\) values in Figure 2 are domain values for our example (in reality, the domain needs to have size 16 since we have 16 rows). We can ignore terms like CanonicCoset and .circle_domain() for now, but note that the log_num_rows in CanonicCoset::new(log_num_rows).circle_domain() needs to be equal to the log of the actual number of rows used in the table.

Now that we have created 2 trace polynomials for our 2 columns, let's move on to the next section where we commit to those polynomials!

Committing to the Trace Polynomials

Figure 1: Prover workflow: Commitment

Now that we have created the trace polynomials, we need to commit to them.

As we can see in Figure 1, Stwo commits to the trace polynomials by first expanding the trace polynomials (i.e. adding more evaluations) and then committing to the expanded evaluations using a merkle tree.

The rate of expansion (commonly referred to as the blowup factor) is a parameter of FRI and readers who are interested in learning more about how to set this parameter can refer to the Circle-STARKs paper (We are also working on adding an explanation in the book).

For the purposes of this tutorial, we will use the default value for the blowup factor.

const CONSTRAINT_EVAL_BLOWUP_FACTOR: u32 = 1;

fn main() {
    // --snip--

    // Config for FRI and PoW
    let config = PcsConfig::default();

    // Precompute twiddles for evaluating and interpolating the trace
    let twiddles = SimdBackend::precompute_twiddles(
        CanonicCoset::new(
            log_num_rows + CONSTRAINT_EVAL_BLOWUP_FACTOR + config.fri_config.log_blowup_factor,
        )
        .circle_domain()
        .half_coset,
    );

    // Create the channel and commitment scheme
    let channel = &mut Blake2sChannel::default();
    let mut commitment_scheme =
        CommitmentSchemeProver::<SimdBackend, Blake2sMerkleChannel>::new(config, &twiddles);

    // Commit to the preprocessed trace
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(vec![]);
    tree_builder.commit(channel);

    // Commit to the size of the trace
    channel.mix_u64(log_num_rows as u64);

    // Commit to the original trace
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(trace);
    tree_builder.commit(channel);
}

We begin with some setup. First, we create a default PcsConfig instance, which sets the values for the FRI and PoW operations. Setting non-default values is related to the security of the proof, which is out of the scope for this tutorial.

Next, we precompute twiddles, which are helper values used when evaluating and interpolating polynomials over a given domain. The log size of this domain is set to log_num_rows + CONSTRAINT_EVAL_BLOWUP_FACTOR + config.fri_config.log_blowup_factor, which is the maximum log size of the domain needed throughout the proving process. Why we need to add CONSTRAINT_EVAL_BLOWUP_FACTOR will be explained in the next section.

The final setup is creating a commitment scheme and a channel. The commitment scheme will be used to commit to the trace polynomials as merkle trees, while the channel will be used to keep a running hash of all data in the proving process (i.e. transcript of the proof). This is part of the Fiat-Shamir transformation where randomness can be generated safely even in a non-interactive setting. Here, we use the Blake2sChannel and Blake2sMerkleChannel for the channel and commitment scheme, respectively, but we can also use the Poseidon252Channel and Poseidon252MerkleChannel pair.
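
To make the role of the channel more concrete, here is a toy sketch of the Fiat-Shamir pattern it implements. This is not the actual Blake2sChannel API (which hashes with Blake2s and draws field elements); it only illustrates the idea of mixing data into a running hash and deriving challenges from it:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// A hypothetical toy channel: every piece of prover data is absorbed into a
// running hash, and challenges are derived from that hash, so neither side can
// choose them freely.
struct ToyChannel {
    state: u64,
}

impl ToyChannel {
    fn mix_u64(&mut self, value: u64) {
        let mut hasher = DefaultHasher::new();
        (self.state, value).hash(&mut hasher);
        self.state = hasher.finish();
    }

    fn draw_challenge(&mut self) -> u64 {
        // Drawing a challenge also updates the state, so later challenges
        // depend on everything mixed in so far.
        self.mix_u64(0);
        self.state
    }
}

fn main() {
    let mut channel = ToyChannel { state: 0 };
    channel.mix_u64(42); // e.g. a commitment
    let challenge = channel.draw_challenge();
    println!("challenge: {challenge}");
}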

Now that we have our setup, we can commit to the trace polynomials. But before we do so, we need to first commit to an empty vector called a preprocessed trace, which doesn't do anything but is required by Stwo. Then, we need to commit to the size of the trace, which is a vital part of the proof system that the prover should not be able to cheat on. Once we have done these, we can finally commit to the original trace polynomials.

Now that we have committed to the trace polynomials, we can move on to how we can create constraints over the trace polynomials!

Evaluating Constraints Over Trace Polynomials

Figure 1: Prover workflow: Constraints

Proving Spreadsheet Functions

When we want to perform computations over the cells in a spreadsheet, we don't want to manually fill in the computed values. Instead, we leverage spreadsheet functions to autofill cells based on a given computation.

We can do the same thing with our table, except in addition to autofilling cells, we can also create a constraint that the result was computed correctly. Remember that the purpose of using a proof system is that the verifier can verify a computation was executed correctly without having to execute it themselves? Well, that's exactly why we need to create a constraint.

Now let's say we want to add a new column C to our spreadsheet that computes the product of the previous columns plus the first column. We can set C1 as A1 * B1 + A1 as in Figure 2.

In the same vein, we can create a new column in our table that computes the product of the two previous columns plus the first column. And we can constrain the value of the third column by creating an equation that must equal 0: col1_row1 * col2_row1 + col1_row1 - col3_row1 = 0.

Figure 2: Proving spreadsheet functions as constraints

Identical Constraints Every Row

Obviously, as can be seen in Figure 2, our new constraint is satisfied for every row in the table. This means that we can substitute creating a constraint for each row with a single constraint over the columns, i.e. the trace polynomials.

Thus, col1_row1 * col2_row1 + col1_row1 - col3_row1 = 0 becomes \(f_1(x) \cdot f_2(x) + f_1(x) - f_3(x) = 0\).

Note

The idea that all rows must have the same constraint may seem restrictive, compared to say a spreadsheet where we can define different functions for different rows. However, we will show in later sections how to handle such use-cases.

(Spoiler alert: it involves selectors and components)

Composition Polynomial

We will now give a name to the polynomial that expresses the constraint: a composition polynomial.

\(C(x) = f_1(x) \cdot f_2(x) + f_1(x) - f_3(x)\)

Basically, in order to prove that the constraints are satisfied, we need to show that the composition polynomial evaluates to 0 over the original domain (i.e. the domain whose size equals the number of rows in the table).

But first, as can be seen in Figure 1, we need to expand the evaluations of the trace polynomials by a factor of 2. This is because the composition polynomial has twice the degree of the trace polynomials, and thus we need more evaluations to uniquely determine it as a Lagrange polynomial.
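
As a worked example of the numbers involved: with 16 rows, each trace polynomial has degree at most 15, so the composition polynomial \(f_1(x) \cdot f_2(x) + f_1(x) - f_3(x)\) has degree at most 30. The original 16 evaluations cannot determine a degree-30 polynomial uniquely, but 32 evaluations (an expansion by a factor of 2) can.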

Once we have the expanded evaluations, we can evaluate the composition polynomial. Checking that the composition polynomial evaluates to 0 over the original domain is done in FRI, so once again we need to expand the composition polynomial evaluations by a factor of 2 and commit to them.

We'll see in the code below how this is implemented.

Code

struct TestEval {
    log_size: u32,
}

impl FrameworkEval for TestEval {
    fn log_size(&self) -> u32 {
        self.log_size
    }

    fn max_constraint_log_degree_bound(&self) -> u32 {
        self.log_size + CONSTRAINT_EVAL_BLOWUP_FACTOR
    }

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let col_1 = eval.next_trace_mask();
        let col_2 = eval.next_trace_mask();
        let col_3 = eval.next_trace_mask();
        eval.add_constraint(col_1.clone() * col_2.clone() + col_1.clone() - col_3.clone());
        eval
    }
}

fn main() {
    // --snip--

    let mut col_3 = BaseColumn::zeros(num_rows);
    col_3.set(0, col_1.at(0) * col_2.at(0) + col_1.at(0));
    col_3.set(1, col_1.at(1) * col_2.at(1) + col_1.at(1));

    // Convert table to trace polynomials
    let domain = CanonicCoset::new(log_num_rows).circle_domain();
    let trace: ColumnVec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>> =
        vec![col_1, col_2, col_3]
            .into_iter()
            .map(|col| CircleEvaluation::new(domain, col))
            .collect();

    // --snip--

    // Create a component
    let _component = FrameworkComponent::<TestEval>::new(
        &mut TraceLocationAllocator::default(),
        TestEval {
            log_size: log_num_rows,
        },
        QM31::zero(),
    );
}

First, we add a new column col_3 that contains the result of the computation: col_1 * col_2 + col_1.

Then, to create a constraint over the trace polynomials, we first create a TestEval struct that implements the FrameworkEval trait. Then, we add our constraint logic in the FrameworkEval::evaluate function. Note that this function is called for every row in the table, so we only need to define the constraint once.

Inside FrameworkEval::evaluate, we call eval.next_trace_mask() consecutively three times, retrieving the cell values of all three columns (see Figure 3 below for a visual representation). Once we retrieve all three column values, we add a constraint of the form col_1 * col_2 + col_1 - col_3, which should equal 0.

Figure 3: Evaluate function

We also need to implement FrameworkEval::max_constraint_log_degree_bound(&self) for FrameworkEval. As mentioned in the Composition Polynomial section, we need to expand the trace polynomial evaluations because the degree of our composition polynomial is higher than the trace polynomial. Expanding it by the lowest value CONSTRAINT_EVAL_BLOWUP_FACTOR=1 is sufficient for our example as the degree of our composition polynomial is not very high, so we can return self.log_size + CONSTRAINT_EVAL_BLOWUP_FACTOR. For those who are interested in how to set this value in general, we leave a detailed note below.

Note

What value to set for max_constraint_log_degree_bound(&self)?

self.log_size + max(1, ceil(log2(max_degree - 1))), where max_degree is the maximum degree of all defined constraint polynomials.

e.g.

  • degree 1 - 3: self.log_size + 1
  • degree 4 - 5: self.log_size + 2
  • degree 6 - 9: self.log_size + 3
  • degree 10 - 17: self.log_size + 4
  • ...
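
For convenience, the rule above can also be written as a small helper (our own sketch, not part of the Stwo API):

fn constraint_log_degree_bound(log_size: u32, max_degree: u32) -> u32 {
    // next_power_of_two().trailing_zeros() computes ceil(log2(x)) for x >= 1,
    // and we clamp the expansion below by 1.
    let expansion = (max_degree.saturating_sub(1))
        .next_power_of_two()
        .trailing_zeros()
        .max(1);
    log_size + expansion
}

fn main() {
    assert_eq!(constraint_log_degree_bound(4, 3), 5); // degree 1-3: log_size + 1
    assert_eq!(constraint_log_degree_bound(4, 5), 6); // degree 4-5: log_size + 2
    assert_eq!(constraint_log_degree_bound(4, 9), 7); // degree 6-9: log_size + 3
}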

Note

Now that we know the degree of the composition polynomial, we can also explain the following code:

    // Precompute twiddles for evaluating and interpolating the trace
    let twiddles = SimdBackend::precompute_twiddles(
        CanonicCoset::new(
            log_num_rows + CONSTRAINT_EVAL_BLOWUP_FACTOR + config.fri_config.log_blowup_factor,
        )
        .circle_domain()
        .half_coset,
    );

Why is the log_size of the domain set to log_num_rows + CONSTRAINT_EVAL_BLOWUP_FACTOR + config.fri_config.log_blowup_factor here? As we can see in Figure 1, once we have the composition polynomial, we need to expand it once more before committing to it for the FRI step. Thus, the maximum domain size needed in the entire proving process is the domain of the composition polynomial expanded further by the FRI blowup factor.

Using the new TestEval struct, we can create a new FrameworkComponent::<TestEval> component, which the prover will use to evaluate the constraint. For now, we can ignore the other parameters of the FrameworkComponent::<TestEval> constructor.

We now move on to the final section where we finally create and verify a proof.

Definition

Finally, we can break down what an Algebraic Intermediate Representation (AIR) means.

Algebraic means that we are using polynomials to represent the constraints.

Intermediate Representation means that this is a modified representation of our statement so that it can be proven by a proof system.

So AIR is just another way of saying that we are representing statements to be proven as constraints over polynomials.

Proving and Verifying an AIR

Figure 1: Prover workflow: perform FRI and PoW

We're finally ready to take the final step: prove and verify an AIR!

Since the code is relatively short, let us present it first and then go over the details.

fn main() {
    // --snip--

    // Prove
    let proof = prove(&[&component], channel, commitment_scheme).unwrap();

    // Verify
    let channel = &mut Blake2sChannel::default();
    let commitment_scheme = &mut CommitmentSchemeVerifier::<Blake2sMerkleChannel>::new(config);
    let sizes = component.trace_log_degree_bounds();

    commitment_scheme.commit(proof.commitments[0], &sizes[0], channel);
    channel.mix_u64(log_num_rows as u64);
    commitment_scheme.commit(proof.commitments[1], &sizes[1], channel);

    verify(&[&component], channel, commitment_scheme, proof).unwrap();
}

Prove

As you can see, only a single line of code is needed to create the proof. The prove function performs the FRI and PoW operations under the hood. Technically, the constraint-related steps in Figure 1 were also not performed in the previous section; they are only performed once prove is called.

Verify

In order to verify our proof, we need to check that the constraints are satisfied using the commitments from the proof. To do that, we set up a Blake2sChannel and a CommitmentSchemeVerifier<Blake2sMerkleChannel>, along with the same PcsConfig that we used when creating the proof. Then, we recreate the running hash channel by passing the merkle tree commitments and the log_num_rows to the CommitmentSchemeVerifier instance by calling commit (remember, the order is important!). Finally, we can verify the proof using the verify function.

Exercise

Try setting the dummy values in the table to 1 instead of 0. Does it fail? If so, can you see why?

Congratulations! We have come full circle. We now know how to create a table, convert it to trace polynomials, commit to them, create constraints over the trace polynomials, and prove and verify the constraints (i.e. an AIR). In the following sections, we will go over some more complicated AIRs to explain Stwo's other features.

Preprocessed Trace

This section and the following sections are intended for developers who have completed the Writing a Simple AIR section or are already familiar with the workflow of creating an AIR. If you have not gone through the Writing a Simple AIR section, we recommend doing so first, as the following sections gloss over a lot of boilerplate code.

For those of you who have completed the Writing a Simple AIR tutorial, you should now be familiar with the concept of a trace as a table of integers that are filled in by the prover (we will now refer to this as the original trace).

In addition to the original trace, Stwo also has a concept of a preprocessed trace, which is a table whose values are fixed and therefore cannot be arbitrarily chosen by the prover. In other words, these are columns whose values are known in advance of creating a proof and essentially agreed upon by both the prover and the verifier.

One of the use cases of the preprocessed trace is as a selector for different constraints. Remember that in an AIR, the same constraints are applied to every row of the trace? If we go back to the spreadsheet analogy, this means that we can't create a spreadsheet that runs different computations for different rows. We can get around this issue by composing multiple constraints using a selector column as part of the preprocessed trace. For example, let's say we want to create a constraint that runs different computations for the first 2 rows and the next 2 rows. We can do this by using a preprocessed trace that has value 1 for the first 2 rows and 0 for the next 2 rows, essentially acting as a selector for the first 2 rows. The resulting single constraint composes the two different constraints by adding them together: \((1 - \text{selector}) \cdot \text{constraint}_1 + \text{selector} \cdot \text{constraint}_2 = 0\), where \(\text{selector}\) denotes the preprocessed trace column.

Figure 1: Preprocessed trace as a selector

Another use case is to use the preprocessed trace for expressing constant values used in the constraints. For example, when creating a hash function in an AIR, we often need to use round constants, which the verifier needs to be able to verify or the resulting hash may be invalid. We can also "look up" the constant values as an optimization technique, which we will discuss in more detail in the next section.
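
As a hypothetical sketch of this constants use case (the function name and shape are ours, not from the example repository; it reuses the types already imported in the full example and assumes constants.len() fits in the column), such a preprocessed column can be generated in the same spirit as the IsFirstColumn shown later in this section:

fn gen_round_constants_column(
    log_size: u32,
    constants: &[M31],
) -> CircleEvaluation<SimdBackend, M31, BitReversedOrder> {
    let mut col = BaseColumn::zeros(1 << log_size);
    // Place one fixed, publicly known constant per row (remaining rows stay 0).
    for (row, constant) in constants.iter().enumerate() {
        col.set(row, *constant);
    }
    CircleEvaluation::new(CanonicCoset::new(log_size).circle_domain(), col)
}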

In this section, we will explore how to implement a preprocessed trace as a selector, and we will implement the simplest form: a single isFirst column, where the value is 1 for the first row and 0 for all other rows.

Note

Boilerplate code is omitted for brevity. Please refer to the full example code for the full implementation.

struct IsFirstColumn {
    pub log_size: u32,
}

#[allow(dead_code)]
impl IsFirstColumn {
    pub fn new(log_size: u32) -> Self {
        Self { log_size }
    }

    pub fn gen_column(&self) -> CircleEvaluation<SimdBackend, M31, BitReversedOrder> {
        let mut col = BaseColumn::zeros(1 << self.log_size);
        col.set(0, M31::from(1));
        CircleEvaluation::new(CanonicCoset::new(self.log_size).circle_domain(), col)
    }

    pub fn id(&self) -> PreProcessedColumnId {
        PreProcessedColumnId {
            id: format!("is_first_{}", self.log_size),
        }
    }
}

First, we need to define an IsFirstColumn struct that will be used as a preprocessed trace. We will use the gen_column() function to generate a CircleEvaluation struct that is 1 for the first row and 0 for all other rows. The id() function is needed to identify this column when evaluating the constraints.

fn main() {
    ...
    // Create and commit to the preprocessed trace
    let is_first_column = IsFirstColumn::new(log_size);
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(vec![is_first_column.gen_column()]);
    tree_builder.commit(channel);

    // Commit to the size of the trace
    channel.mix_u64(log_size as u64);

    // Create and commit to the original trace
    let trace = gen_trace(log_size);
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(trace);
    tree_builder.commit(channel);
    ...
}

Then, in our main function, we will create and commit to the preprocessed and original traces. For those of you who are curious about why we need to commit to the trace, please refer to the Committing to the Trace Polynomials section.

struct TestEval {
    is_first_id: PreProcessedColumnId,
    log_size: u32,
}

impl FrameworkEval for TestEval {
    fn log_size(&self) -> u32 {
        self.log_size
    }

    fn max_constraint_log_degree_bound(&self) -> u32 {
        self.log_size + CONSTRAINT_EVAL_BLOWUP_FACTOR
    }

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let is_first = eval.get_preprocessed_column(self.is_first_id.clone());

        let col_1 = eval.next_trace_mask();
        let col_2 = eval.next_trace_mask();
        let col_3 = eval.next_trace_mask();

        // If is_first is 1, then the constraint is col_1 * col_2 - col_3 = 0
        // If is_first is 0, then the constraint is col_1 * col_2 + col_1 - col_3 = 0
        eval.add_constraint(
            (col_1.clone() * col_2.clone() - col_3.clone()) * is_first.clone()
                + (col_1.clone() * col_2.clone() + col_1.clone() - col_3.clone())
                    * (E::F::from(M31::from(1)) - is_first.clone()),
        );

        eval
    }
}

Now that we have the traces, we need to create a struct that contains the logic for evaluating the constraints. As mentioned before, we need to use the is_first_id field to retrieve the row value of the IsFirstColumn struct. Then, we compose two constraints using the IsFirstColumn row value as a selector and adding them together.

If you're unfamiliar with how max_constraint_log_degree_bound(&self) should be implemented, please refer to this note.

fn main() {
    ...
    // Create a component
    let component = FrameworkComponent::<TestEval>::new(
        &mut TraceLocationAllocator::default(),
        TestEval {
            is_first_id: is_first_column.id(),
            log_size,
        },
        QM31::zero(),
    );

    // Prove
    let proof = prove(&[&component], channel, commitment_scheme).unwrap();

    // Verify
    let channel = &mut Blake2sChannel::default();
    let commitment_scheme = &mut CommitmentSchemeVerifier::<Blake2sMerkleChannel>::new(config);
    let sizes = component.trace_log_degree_bounds();

    commitment_scheme.commit(proof.commitments[0], &sizes[0], channel);
    channel.mix_u64(log_size as u64);
    commitment_scheme.commit(proof.commitments[1], &sizes[1], channel);

    verify(&[&component], channel, commitment_scheme, proof).unwrap();
}

Finally, we can create a FrameworkComponent using the TestEval struct and then prove and verify the component.

Static Lookups

In the previous section, we showed how to create a preprocessed trace. In this section, we will introduce the concept of static lookups, where we will create columns that look up values from a preprocessed trace.

Note

Readers who are unfamiliar with the concept of lookups can refer to the Lookups section for a quick introduction.

Specifically, we will implement a range-check. A range-check is a technique used to check that a certain value is within a given range. This proves especially useful in proof systems like Stwo that use finite fields, because it allows checking for underflow and overflow.
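
For instance, in M31 the subtraction \(0 - 1\) wraps around to \(2^{31} - 2\), so a value that should be a small negative number shows up as a very large field element. Range-checking a result to, say, \([0, 2^8)\) rules out such wrap-arounds.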

A range-check checks that all values in a column are within a certain range. For example, as in Figure 1, we can check that all values in the lookup columns are between 0 and 3. We do this by creating a multiplicity column that counts the number of times each value in the preprocessed trace appears in the lookup columns.

Then, we create two LogUp columns. The first column contains in each row a fraction whose numerator is the multiplicity and whose denominator is the random linear combination of the value in the range-check column. For example, for row 1, the fraction should be \(\dfrac{2}{X-0}\). Note that in Figure 1, the numerator is actually \(-2\), i.e. we negate the multiplicity, so that the fractions in the first column cancel against those in the second column.

The second column contains fractions whose numerator is always 1 and whose denominator is the random linear combination of the value in the lookup column. Note that a single row batches the two lookup columns by adding their two fractions together.

Figure 1: Range-check lookup

If you stare at the LogUp columns hard enough, you'll notice that if we add all the fractions in the two columns together, we get 0. This is no coincidence! The prover will provide the sum of the LogUp columns, and the verifier checks in the open that this value is indeed 0.
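
As a worked example (with made-up counts): suppose the range is \([0, 3]\) and the lookup columns contain the value \(0\) twice and the value \(2\) once. The first LogUp column then sums to \(-\frac{2}{X-0} - \frac{1}{X-2}\), while the second sums to \(\frac{1}{X-0} + \frac{1}{X-0} + \frac{1}{X-2}\), and the two cancel exactly. If a lookup column contained a value outside the range, say \(5\), its fraction \(\frac{1}{X-5}\) would have no matching term to cancel against and the sum could not be \(0\).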

Now let's move on to the implementation.

struct RangeCheckColumn {
    pub log_size: u32,
}

#[allow(dead_code)]
impl RangeCheckColumn {
    pub fn new(log_size: u32) -> Self {
        Self { log_size }
    }

    pub fn gen_column(&self) -> CircleEvaluation<SimdBackend, M31, BitReversedOrder> {
        let col = BaseColumn::from_iter((0..(1 << self.log_size)).map(|i| M31::from(i)));
        CircleEvaluation::new(CanonicCoset::new(self.log_size).circle_domain(), col)
    }

    pub fn id(&self) -> PreProcessedColumnId {
        PreProcessedColumnId {
            id: format!("range_check_{}_bits", self.log_size),
        }
    }
}

First, we need to create the range-check column as a preprocessed column. This should look familiar to the code from the previous section.

fn gen_trace(log_size: u32) -> Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>> {
    // Create a table with random values
    let mut rng = rand::thread_rng();
    let mut lookup_col_1 =
        BaseColumn::from_iter((0..(1 << log_size)).map(|_| M31::from(rng.gen_range(0..16))));
    let mut lookup_col_2 =
        BaseColumn::from_iter((0..(1 << log_size)).map(|_| M31::from(rng.gen_range(0..16))));

    let mut multiplicity_col = BaseColumn::zeros(1 << log_size);
    lookup_col_1
        .as_mut_slice()
        .iter()
        .chain(lookup_col_2.as_mut_slice().iter())
        .for_each(|value| {
            let index = value.0 as usize;
            multiplicity_col.set(index, multiplicity_col.at(index) + M31::from(1));
        });

    // Convert table to trace polynomials
    let domain = CanonicCoset::new(log_size).circle_domain();
    vec![
        lookup_col_1.clone(),
        lookup_col_2.clone(),
        multiplicity_col.clone(),
    ]
    .into_iter()
    .map(|col| CircleEvaluation::new(domain, col))
    .collect()
}

Next, we create the trace columns. The first two columns contain random values in the range \([0, 15]\), and the third column contains, for each value in the range-check column, the number of times it appears in the lookup columns.

relation!(LookupElements, 1);

fn gen_logup_trace(
    log_size: u32,
    range_check_col: &BaseColumn,
    lookup_col_1: &BaseColumn,
    lookup_col_2: &BaseColumn,
    multiplicity_col: &BaseColumn,
    lookup_elements: &LookupElements,
) -> (
    Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>>,
    SecureField,
) {
    let mut logup_gen = LogupTraceGenerator::new(log_size);

    let mut col_gen = logup_gen.new_col();
    for simd_row in 0..(1 << (log_size - LOG_N_LANES)) {
        let numerator: PackedSecureField = PackedSecureField::from(multiplicity_col.data[simd_row]);
        let denom: PackedSecureField = lookup_elements.combine(&[range_check_col.data[simd_row]]);
        col_gen.write_frac(simd_row, -numerator, denom);
    }
    col_gen.finalize_col();

    let mut col_gen = logup_gen.new_col();
    for simd_row in 0..(1 << (log_size - LOG_N_LANES)) {
        let lookup_col_1_val: PackedSecureField =
            lookup_elements.combine(&[lookup_col_1.data[simd_row]]);
        let lookup_col_2_val: PackedSecureField =
            lookup_elements.combine(&[lookup_col_2.data[simd_row]]);
        // 1 / denom1 + 1 / denom2 = (denom1 + denom2) / (denom1 * denom2)
        let numerator = lookup_col_1_val + lookup_col_2_val;
        let denom = lookup_col_1_val * lookup_col_2_val;
        col_gen.write_frac(simd_row, numerator, denom);
    }
    col_gen.finalize_col();

    logup_gen.finalize_last()
}

fn main() {
    ...
    // Draw random elements to use when creating the random linear combination of lookup values in the LogUp columns
    let lookup_elements = LookupElements::draw(channel);

    // Create and commit to the LogUp columns
    let (logup_cols, claimed_sum) = gen_logup_trace(
        log_size,
        &range_check_col,
        &trace[0],
        &trace[1],
        &trace[2],
        &lookup_elements,
    );
    ...
}

Now we need to create the LogUp columns.

First, note that we are creating a LookupElements instance using the macro relation!. This macro creates an API for performing random linear combinations. Under the hood, it creates two random values \(z, \alpha\) that can create a random linear combination of an arbitrary number of elements. In our case, we only need to combine one value (value in \([0,15]\)), which is why we pass in 1 to the macro.
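
Concretely, our understanding of the combine operation (based on the two random values \(z, \alpha\) mentioned above; the exact sign convention is an implementation detail) is that it maps a tuple \((a_0, a_1, \dots, a_{k-1})\) to a single field element of the form \(a_0 + a_1 \alpha + \dots + a_{k-1} \alpha^{k-1} - z\). With a single element, as in our case, this is simply \(a_0 - z\), so each distinct looked-up value maps to a distinct random denominator.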

Inside gen_logup_trace, we create a LogupTraceGenerator instance. This is a helper class that allows us to create LogUp columns. Every time we create a new column, we need to call new_col() on the LogupTraceGenerator instance.

You may notice that we are iterating over BaseColumn in chunks of 16, or 1 << LOG_N_LANES values. This is because we are using the SimdBackend, which runs 16 lanes simultaneously, so we need to preserve this structure. The Packed in PackedSecureField means that it packs 16 values into a single value.

You may also notice that we are using a SecureField instead of the base field. This is because the random values drawn for LookupElements lie in the degree-4 extension field \(\mathbb{F}_{p^4}\). Interested readers can refer to the Mersenne Primes section for more details.

Once we set the fractions for each simd_row, we need to call finalize_col() to finalize the column. This process modifies the LogUp columns from individual fractions to cumulative sums of the fractions as shown in Figure 2.

Figure 2: Finalizing each LogUp column

Finally, we need to call finalize_last() on the LogupTraceGenerator instance to finalize the LogUp columns, which will return the LogUp columns as well as the sum of the fractions in the LogUp columns.

struct TestEval {
    range_check_id: PreProcessedColumnId,
    log_size: u32,
    lookup_elements: LookupElements,
}

impl FrameworkEval for TestEval {
    fn log_size(&self) -> u32 {
        self.log_size
    }

    fn max_constraint_log_degree_bound(&self) -> u32 {
        self.log_size + CONSTRAINT_EVAL_BLOWUP_FACTOR
    }

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let range_check_col = eval.get_preprocessed_column(self.range_check_id.clone());

        let lookup_col_1 = eval.next_trace_mask();
        let lookup_col_2 = eval.next_trace_mask();
        let multiplicity_col = eval.next_trace_mask();

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            -E::EF::from(multiplicity_col),
            &[range_check_col],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[lookup_col_1],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[lookup_col_2],
        ));

        eval.finalize_logup_batched(&vec![0, 1, 1]);

        eval
    }
}

The last piece of the puzzle is to create the constraints. We use the same TestEval struct as in the previous sections, but the evaluate function will look slightly different. Instead of calling add_constraint on the EvalAtRow instance, we will call add_to_relation, which recreates the fractions that we added in the LogUp columns using values in the range-check, lookup, and multiplicity columns.

Once we add the fractions as constraints, we call the finalize_logup_batched function, which indicates how we want to batch the fractions. In our case, we added 3 fractions but want to create batches where the last two fractions are batched together, so we pass in &vec![0, 1, 1].

    // Verify
    assert_eq!(claimed_sum, SecureField::zero());

    let channel = &mut Blake2sChannel::default();
    let commitment_scheme = &mut CommitmentSchemeVerifier::<Blake2sMerkleChannel>::new(config);
    let sizes = component.trace_log_degree_bounds();

    commitment_scheme.commit(proof.commitments[0], &sizes[0], channel);
    channel.mix_u64(log_size as u64);
    commitment_scheme.commit(proof.commitments[1], &sizes[1], channel);
    commitment_scheme.commit(proof.commitments[2], &sizes[2], channel);

    verify(&[&component], channel, commitment_scheme, proof).unwrap();

When we verify the proof, as promised, we check that the claimed_sum, which is the sum of the fractions in the LogUp columns, is 0.

And that's it! We have successfully created a static lookup for a range-check.

Dynamic Lookups

In the last section, we implemented a static lookup. A dynamic lookup is the same as a static lookup except that the values that are being looked up are not known before the proving process (i.e. they are not preprocessed columns but trace columns).

In this section, we will implement one of the simplest dynamic lookups: a permutation check.

A permutation check simply checks that two sets of values have the same elements, but not necessarily in the same order. For example, the values [1, 2, 3] and [3, 1, 2] are a permutation of each other, but [1, 2, 3] and [1, 2] are not.

If you went through the previous section, you should have a good intuition for how to implement this. First, create two LogUp columns: the first contains the values in the original set with multiplicity \(1\), and the second contains the values in the second set with multiplicity \(-1\). Then, check that the claimed_sum, i.e. the sum of the fractions in the two LogUp columns, is \(0\).

We can optimize further by batching the two columns into a single LogUp column so that a LogUp column row looks something like \(\frac{1}{col_1} - \frac{1}{col_2}\).
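
As a tiny worked example (illustrative values, ignoring the random linear combination): for the sets [1, 2, 3] and [3, 1, 2], the batched LogUp column sums to \(\left(\frac{1}{X-1} + \frac{1}{X-2} + \frac{1}{X-3}\right) - \left(\frac{1}{X-3} + \frac{1}{X-1} + \frac{1}{X-2}\right) = 0\), while any mismatch between the two sets would leave a fraction with nothing to cancel against.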

fn gen_trace(log_size: u32) -> Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>> {
    // Create a table with values [0, 1 << log_size)
    let mut rng = rand::thread_rng();
    let values = (0..(1 << log_size)).map(|i| i).collect::<Vec<_>>();
    let original_col = BaseColumn::from_iter(values.iter().map(|v| M31::from(*v)));

    // Create a random permutation of the values
    let mut permutation = values.clone();
    permutation.shuffle(&mut rng);
    let permuted_col = BaseColumn::from_iter(permutation.iter().map(|v| M31::from(*v)));

    // Convert table to trace polynomials
    let domain = CanonicCoset::new(log_size).circle_domain();
    vec![original_col, permuted_col]
        .into_iter()
        .map(|col| CircleEvaluation::new(domain, col))
        .collect()
}

fn gen_logup_trace(
    log_size: u32,
    original_col: &BaseColumn,
    permuted_col: &BaseColumn,
    lookup_elements: &LookupElements,
) -> (
    Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>>,
    SecureField,
) {
    let mut logup_gen = LogupTraceGenerator::new(log_size);

    let mut col_gen = logup_gen.new_col();
    for row in 0..(1 << (log_size - LOG_N_LANES)) {
        // 1 / original - 1 / permuted = (permuted - original) / (original * permuted)
        let original_val: PackedSecureField = lookup_elements.combine(&[original_col.data[row]]);
        let permuted_val: PackedSecureField = lookup_elements.combine(&[permuted_col.data[row]]);
        col_gen.write_frac(
            row,
            permuted_val - original_val,
            original_val * permuted_val,
        );
    }
    col_gen.finalize_col();

    logup_gen.finalize_last()
}

fn main() {
    ...
    // Create and commit to the trace columns
    let trace = gen_trace(log_size);
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(trace.clone());
    tree_builder.commit(channel);

    // Draw random elements to use when creating the random linear combination of lookup values in the LogUp columns
    let lookup_elements = LookupElements::draw(channel);

    // Create and commit to the LogUp columns
    let (logup_cols, claimed_sum) =
        gen_logup_trace(log_size, &trace[0], &trace[1], &lookup_elements);
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals(logup_cols);
    tree_builder.commit(channel);
    ...
}

Looking at the code above, we can see that it looks very similar to the implementation in the previous section. Instead of creating a preprocessed column, we create a trace column that contains the values [0, 1 << log_size) in order. Then, we create a random permutation of the trace column values and set it as the second trace column. Note that this is equivalent to "looking up" all values in the first trace column once. And since all the values are looked up only once, we do not need a separate multiplicity column.

Then, we create a LogUp column that contains the values \(\frac{1}{original} - \frac{1}{permuted}\).

struct TestEval {
    log_size: u32,
    lookup_elements: LookupElements,
}

impl FrameworkEval for TestEval {
    fn log_size(&self) -> u32 {
        self.log_size
    }

    fn max_constraint_log_degree_bound(&self) -> u32 {
        self.log_size + CONSTRAINT_EVAL_BLOWUP_FACTOR
    }

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let original_col = eval.next_trace_mask();
        let permuted_col = eval.next_trace_mask();

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[original_col],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            -E::EF::one(),
            &[permuted_col],
        ));

        eval.finalize_logup_in_pairs();

        eval
    }
}

The TestEval struct is also very similar to the one in the previous section. The only difference is that we call add_to_relation twice and add them together by calling finalize_logup_in_pairs() on the TestEval instance. This is equivalent to calling the finalize_logup_batched function with &vec![0, 0].

Local Row Constraints

Info

  1. Change the order of elements in BaseColumn in-place via bit_reverse_coset_to_circle_domain_order before creating a CircleEvaluation instance.
  2. The "previous row" of the first row wraps around to the last row, so you may need to disable the constraint for the first row.

Until now, we have only considered constraints that apply over values in a single row. But what if we want to express constraints over multiple rows? For example, we may want to ensure that the difference between the values in two adjacent rows is always the same.

Turns out we can implement this as an AIR constraint, as long as the same constraints are applied to all rows. However, actually implementing this requires some background knowledge about how Stwo orders the values in a trace column.

In this section, we will see how to implement this.

We will build upon the example in the previous section, where we created two columns and proved that they are permutations of each other by asserting that the second column looks up all values in the first column exactly once.

In this section, we will create two columns and prove not only that they are permutations of each other, but also that the second column is a sorted version of the first.

More specifically, the sorted column will contain, in order, the values \([0, \text{num\_rows})\), which means that the difference between each current row and the previous row should be \(1\).

We will go through three iterations, fixing an issue in each iteration.

First Try

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let unsorted_col = eval.next_trace_mask();
        let [sorted_col_prev_row, sorted_col_curr_row] =
            eval.next_interaction_mask(ORIGINAL_TRACE_IDX, [-1, 0]);

        // New constraint
        eval.add_constraint(
            E::F::one() - (sorted_col_curr_row.clone() - sorted_col_prev_row.clone()),
        );

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[unsorted_col],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            -E::EF::one(),
            &[sorted_col_curr_row],
        ));

        eval.finalize_logup_in_pairs();

        eval
    }

The logic for creating the trace and LogUp columns is essentially the same as in the previous section, so we omit it for brevity.

What does change is the evaluate function, where we call next_interaction_mask on the EvalAtRow instance. This function can retrieve values from arbitrary row offsets, which means we can access the previous row value using -1. Since we call this function with offsets [-1, 0], we will retrieve the values for the previous and current rows.

Once we have these values, we can now assert that the difference between the current and previous row is always 1 with the constraint: E::F::one() - (sorted_col_curr_row.clone() - sorted_col_prev_row.clone()).

But this will fail with a ConstraintsNotSatisfied error, can you see why? (You can try running it yourself here)

Second Try

The issue was that when calling evaluate on the first row of our trace, the previous row value wraps around to the last row because there are no negative indices.

This means that in our example, the constraint for the first row effectively asserts that \(0 - 15 = 1\), which is clearly not true.

To fix this, we can use the IsFirstColumn preprocessed column that we created in the Preprocessed Trace section. So we will copy over the same code for creating the preprocessed column and modify our new constraint as follows:

        let is_first_col = eval.get_preprocessed_column(self.is_first_id.clone());

        eval.add_constraint(
            (E::F::one() - is_first_col.clone())
                * (E::F::one() - (sorted_col_curr_row.clone() - sorted_col_prev_row.clone())),
        );

Now, we have a constraint that is disabled for the first row, which is exactly what we want.

Still, however, this will fail with the same ConstraintsNotSatisfied error. (You can run it here)

Third Try

When we created CircleEvaluation instances from our BaseColumn instances, the order of the elements we supplied was not the order Stwo understands them to be in. Instead, Stwo assumes that the values are in bit-reversed, circle domain order. It's not important to understand exactly what this order is, but it does mean that when Stwo resolves the -1 offset in evaluate, it finds the previous value assuming the values are in that different order. This means that when we create a CircleEvaluation instance, we need to make sure the values are in the order that Stwo expects.

Thus, every time we create a CircleEvaluation instance, we need to convert the order of the values in the BaseColumn beforehand.

impl IsFirstColumn {
    ...
    pub fn gen_column(&self) -> CircleEvaluation<SimdBackend, M31, BitReversedOrder> {
        let mut col = BaseColumn::zeros(1 << self.log_size);
        col.set(0, M31::from(1));

        //////////////////////////////////////////////////////////////
        // Convert the columns to bit-reversed circle domain order
        bit_reverse_coset_to_circle_domain_order(col.as_mut_slice());
        //////////////////////////////////////////////////////////////

        CircleEvaluation::new(CanonicCoset::new(self.log_size).circle_domain(), col)
    }
    ...
}

fn gen_trace(log_size: u32) -> Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>> {
    // Create a table with random values
    let mut rng = rand::thread_rng();
    let sorted_values = (0..(1 << log_size)).map(|i| i).collect::<Vec<_>>();
    let mut unsorted_values = sorted_values.clone();
    unsorted_values.shuffle(&mut rng);

    let mut unsorted_col = BaseColumn::from_iter(unsorted_values.iter().map(|v| M31::from(*v)));
    let mut sorted_col = BaseColumn::from_iter(sorted_values.iter().map(|v| M31::from(*v)));

    // Convert table to trace polynomials
    let domain = CanonicCoset::new(log_size).circle_domain();

    ////////////////////////////////////////////////////////////////////
    // Convert the columns to bit-reversed circle domain order
    bit_reverse_coset_to_circle_domain_order(unsorted_col.as_mut_slice());
    bit_reverse_coset_to_circle_domain_order(sorted_col.as_mut_slice());
    ////////////////////////////////////////////////////////////////////

    vec![unsorted_col, sorted_col]
        .into_iter()
        .map(|col| CircleEvaluation::new(domain, col))
        .collect()
}

And voilà, we have successfully implemented the constraint. You can run it here.

Components

So now that we know how to create a self-contained AIR, the inevitable question arises: How do we make this modular?

Fortunately, Stwo provides an abstraction called a component that allows us to create independent AIRs and compose them together. In other proving frontends, this is also commonly referred to as a chip, but the idea is the same.

One of the most common use cases of components is to separate frequently used functions (e.g. a hash function) from the main component into a separate component and reuse it, avoiding trace column bloat. Even if the function is not frequently used, it could be useful to separate it into a component to avoid the degree of the constraints becoming too high. This second point is possible because when we create a new component and connect it to the old component, we do it by using lookups, which means that the constraints of the new component are not added to the degree of the old component.

Example

To illustrate how to use components, we will create two components where the main component calls a hash function component. For simplicity, instead of an actual hash function, the second component will compute \(x^5 + 1\) from an input \(x\). This component will have in total three columns: [input, intermediate, output], which will correspond to the values \([x, x^3, x^5 + 1]\). Our main component, on the other hand, will have two columns, [input, output], which corresponds to the values \([x, x^5 + 1]\).
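
To make the shape of the computing component's trace concrete, here is a simplified sketch of how its three columns could be filled (a hypothetical illustration with stand-in inputs; it is not the gen_computing_trace from the example repository, which instead takes the scheduling component's columns as input):

fn computing_columns_sketch(log_size: u32) -> (BaseColumn, BaseColumn, BaseColumn) {
    // Stand-in inputs x = 0, 1, 2, ...; the real trace takes its inputs from the
    // scheduling component.
    let input = BaseColumn::from_iter((0..(1 << log_size)).map(|x| M31::from(x)));
    let intermediate = BaseColumn::from_iter((0..(1 << log_size)).map(|x| {
        let x = M31::from(x);
        x * x * x // x^3
    }));
    let output = BaseColumn::from_iter((0..(1 << log_size)).map(|x| {
        let x = M31::from(x);
        x * x * x * x * x + M31::from(1) // x^5 + 1
    }));
    (input, intermediate, output)
}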

We'll now refer to the main component as the scheduling component and the hash function component as the computing component, since the main component essentially schedules the hash function component to run its function with a given input, and the hash function component performs the computation on the provided input. As can be seen in Figure 1, the inputs and outputs of each component are connected by lookups.

Figure 1: Scheduling and Computing components

Design

Figure 2: Traces of each component

When we implement this in Stwo, the traces of each component will look like Figure 2 above. Each component has its own original and LogUp traces, and the inputs and outputs of each component are connected by lookups. Since the scheduling component adds the inputs as positive values and the outputs as negative values, while the computing component adds the inputs as negative values and the outputs as positive values, the verifier can simply check that the sum of the two LogUp columns is zero.
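
Writing \(\langle v \rangle\) for the random linear combination of a value \(v\), each row therefore contributes \(\frac{1}{\langle x \rangle} - \frac{1}{\langle x^5+1 \rangle}\) from the scheduling component and \(-\frac{1}{\langle x \rangle} + \frac{1}{\langle x^5+1 \rangle}\) from the computing component, so the two claimed sums add up to zero exactly when every scheduled input/output pair is matched by a computed one.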

Code

fn main() {
    // --snip--

    // Create trace columns
    let scheduling_trace = gen_scheduling_trace(log_size);
    let computing_trace = gen_computing_trace(log_size, &scheduling_trace[0], &scheduling_trace[1]);

    // Statement 0
    let statement0 = ComponentsStatement0 { log_size };
    statement0.mix_into(channel);

    // Commit to the trace columns
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals([scheduling_trace.clone(), computing_trace.clone()].concat());
    tree_builder.commit(channel);

    // Draw random elements to use when creating the random linear combination of lookup values in the LogUp columns
    let lookup_elements = ComputationLookupElements::draw(channel);

    // Create LogUp columns
    let (scheduling_logup_cols, scheduling_claimed_sum) = gen_scheduling_logup_trace(
        log_size,
        &scheduling_trace[0],
        &scheduling_trace[1],
        &lookup_elements,
    );
    let (computing_logup_cols, computing_claimed_sum) = gen_computing_logup_trace(
        log_size,
        &computing_trace[0],
        &computing_trace[2],
        &lookup_elements,
    );

    // Statement 1
    let statement1 = ComponentsStatement1 {
        scheduling_claimed_sum,
        computing_claimed_sum,
    };
    statement1.mix_into(channel);

    // Commit to the LogUp columns
    let mut tree_builder = commitment_scheme.tree_builder();
    tree_builder.extend_evals([scheduling_logup_cols, computing_logup_cols].concat());
    tree_builder.commit(channel);

    let components = Components::new(&statement0, &lookup_elements, &statement1);

    let stark_proof = prove(&components.component_provers(), channel, commitment_scheme).unwrap();

    let proof = ComponentsProof {
        statement0,
        statement1,
        stark_proof,
    };

    // --snip--
}

The code above for proving the components should look pretty familiar by now. Since we now need to do everything twice (once per component), we create structs like ComponentsStatement0, ComponentsStatement1, Components, and ComponentsProof, but the main logic is the same.

Let's take a closer look at how the LogUp columns are generated.

fn gen_scheduling_logup_trace(
    log_size: u32,
    scheduling_col_1: &CircleEvaluation<SimdBackend, M31, BitReversedOrder>,
    scheduling_col_2: &CircleEvaluation<SimdBackend, M31, BitReversedOrder>,
    lookup_elements: &ComputationLookupElements,
) -> (
    Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>>,
    SecureField,
) {
        // --snip--

        let scheduling_input: PackedSecureField =
            lookup_elements.combine(&[scheduling_col_1.data[row]]);
        let scheduling_output: PackedSecureField =
            lookup_elements.combine(&[scheduling_col_2.data[row]]);
        col_gen.write_frac(
            row,
            scheduling_output - scheduling_input,
            scheduling_input * scheduling_output,
        );

        // --snip--
}

fn gen_computing_logup_trace(
    log_size: u32,
    computing_col_1: &CircleEvaluation<SimdBackend, M31, BitReversedOrder>,
    computing_col_3: &CircleEvaluation<SimdBackend, M31, BitReversedOrder>,
    lookup_elements: &ComputationLookupElements,
) -> (
    Vec<CircleEvaluation<SimdBackend, M31, BitReversedOrder>>,
    SecureField,
) {
        // --snip--

        let computing_input: PackedSecureField =
            lookup_elements.combine(&[computing_col_1.data[row]]);
        let computing_output: PackedSecureField =
            lookup_elements.combine(&[computing_col_3.data[row]]);
        col_gen.write_frac(
            row,
            computing_input - computing_output,
            computing_input * computing_output,
        );

        // --snip--
}

As you can see, the LogUp values of the input and output columns of both the scheduling and computing components are batched together, but in the scheduling component, the output LogUp value is subtracted from the input LogUp value, while in the computing component, the input LogUp value is subtracted from the output LogUp value. This means that when the LogUp sums from both components are added together, they should cancel out and equal zero.
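
Concretely, batching the two fractions of a scheduling row into a single one gives (where input and output denote the already-combined lookup values)

\[ \frac{1}{\text{input}} - \frac{1}{\text{output}} = \frac{\text{output} - \text{input}}{\text{input} \cdot \text{output}} \]

which is exactly the numerator and denominator passed to write_frac above; the computing component passes the negated numerator.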

Next, let's check how the constraints are created.

impl FrameworkEval for SchedulingEval {
    // --snip--

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let input_col = eval.next_trace_mask();
        let output_col = eval.next_trace_mask();

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[input_col],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            -E::EF::one(),
            &[output_col],
        ));

        eval.finalize_logup_in_pairs();

        eval
    }
}

impl FrameworkEval for ComputingEval {
    // --snip--

    fn evaluate<E: EvalAtRow>(&self, mut eval: E) -> E {
        let input_col = eval.next_trace_mask();
        let intermediate_col = eval.next_trace_mask();
        let output_col = eval.next_trace_mask();

        eval.add_constraint(
            intermediate_col.clone() - input_col.clone() * input_col.clone() * input_col.clone(),
        );
        eval.add_constraint(
            output_col.clone()
                - intermediate_col.clone() * input_col.clone() * input_col.clone()
                - E::F::one(),
        );

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            -E::EF::one(),
            &[input_col],
        ));

        eval.add_to_relation(RelationEntry::new(
            &self.lookup_elements,
            E::EF::one(),
            &[output_col],
        ));

        eval.finalize_logup_in_pairs();

        eval
    }
}

As you can see, we define the LogUp constraints for each component, and we also add two constraints that make sure the computations \(x^3\) and \(x^5 + 1\) are correct.

fn main() {
    // --snip--

    // Verify claimed sums
    assert_eq!(
        scheduling_claimed_sum + computing_claimed_sum,
        SecureField::zero()
    );

    // Unpack proof
    let statement0 = proof.statement0;
    let statement1 = proof.statement1;
    let stark_proof = proof.stark_proof;

    // Create channel and commitment scheme
    let channel = &mut Blake2sChannel::default();
    let commitment_scheme = &mut CommitmentSchemeVerifier::<Blake2sMerkleChannel>::new(config);
    let log_sizes = statement0.log_sizes();

    // Preprocessed columns.
    commitment_scheme.commit(stark_proof.commitments[0], &log_sizes[0], channel);

    // Commit to statement 0
    statement0.mix_into(channel);

    // Trace columns.
    commitment_scheme.commit(stark_proof.commitments[1], &log_sizes[1], channel);

    // Draw lookup element.
    let lookup_elements = ComputationLookupElements::draw(channel);

    // Commit to statement 1
    statement1.mix_into(channel);

    // Interaction columns.
    commitment_scheme.commit(stark_proof.commitments[2], &log_sizes[2], channel);

    // Create components
    let components = Components::new(&statement0, &lookup_elements, &statement1);

    verify(
        &components.components(),
        channel,
        commitment_scheme,
        stark_proof,
    )
    .unwrap();
}

Finally, we verify the components!

Additional Examples

| Example | Preprocessed Columns | Trace Columns | LogUp Columns |
| --- | --- | --- | --- |
| Permutation argument check | | unordered list, ordered list | 1 / unordered list - 1 / ordered list |
| Range check (0 <= a < 2^bits) | [0, 2^bits) rows | lookup columns, multiplicities column | 1 / lookup, multiplicity / preprocessed |
| Comparator check (a > b) | [0, 2^bits) rows | a, b, multiplicities column | 1 / (a - b), multiplicity / preprocessed |
| IsZero check (a == 0) | | a, a_inv | |
| XOR operations | valid XOR operations | lookup columns, multiplicities column | |
| Selectors | columns of 0s and 1s | | |
| Checking whether current row is first row or not | single column (first row = 1, other rows = 0) | | |
| Connecting multiple components (output of Component A is input of Component B) | | | 1 / output, 1 / input * (-1) |
| Public Input/Output | | | 1 / input + 1 / output |

Above is a list of additional examples that you can implement as an AIR using Stwo, some of which we have already implemented in the previous sections.
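
As one illustration of how such a row reads, the range check example can be expressed with the LogUp technique described later in this book. A sketch of the final check, with \( c(\cdot) \) denoting the randomized combination and \( m_v \) the multiplicity of the preprocessed value \( v \) among the looked-up values, is

\[ \sum_{\text{rows}} \frac{1}{c(a)} \; - \; \sum_{v=0}^{2^{\text{bits}}-1} \frac{m_v}{c(v)} \; = \; 0 \]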

Cairo as a Stwo AIR

🚧

How Does It Work?

This section is for those who want an in-depth explanation of various components of Stwo.

Mersenne Primes

Proof systems typically rely on finite field operations, where efficient field arithmetic is crucial for optimizing proof generation. In STARK protocols, there is no direct dependency between the security level of the proof system and the field size. This allows the use of small fields with highly efficient arithmetic, such as Mersenne prime fields.

A Mersenne prime is defined as a prime number that is one less than a power of two, expressed as \( p = 2^k -1 \).

Consider the Mersenne prime field \( \mathbb{F}_p \) where \( p = 2^{31} - 1 \). Our objective is to perform field multiplication \( a \cdot b \), where \( a, b \in \mathbb{F}_p \). This operation involves a 31-bit integer multiplication, producing a 62-bit intermediate result, which is then reduced modulo \( p \).

Let \( x = a \cdot b \), where \( a, b \) are 31-bit values, resulting in a 62-bit product \( x \). We can decompose \( x \) into two 31-bit values, a high part \( b \) and a low part \( s \) (this \( b \) is unrelated to the operand \( b \) above), such that \( x = 2^{31} \cdot b + s \), as shown in the following figure.

Mersenne Prime Multiplication

To perform modular reduction, we start with: \[ x \equiv (2^{31} \cdot b + s) \pmod{2^{31} - 1} \] Substituting \( 2^{31} \equiv 1 \pmod{2^{31} - 1} \) gives: \[ x \equiv (b + s) \pmod{2^{31} - 1} \]

Since \( b \) and \( s \) are both 31-bit values, they can be directly represented as field elements. Consequently, modular reduction is performed with a single field addition. This makes arithmetic over Mersenne primes exceptionally fast, making them an ideal choice for our STARK protocol.
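
As a minimal, self-contained sketch (not Stwo's actual implementation, and assuming the inputs are already reduced 31-bit field elements), the whole multiply-and-reduce step looks like this:

const P: u64 = (1 << 31) - 1;

fn mul_m31(a: u32, b: u32) -> u32 {
    // 31-bit x 31-bit multiplication produces a 62-bit intermediate product.
    let x = (a as u64) * (b as u64);
    // Decompose x = 2^31 * hi + lo (the "b" and "s" of the text, renamed here
    // to avoid clashing with the operand b).
    let hi = x >> 31;
    let lo = x & P;
    // Since 2^31 ≡ 1 (mod p), x ≡ hi + lo (mod p): a single addition followed
    // by at most one conditional subtraction.
    let mut r = hi + lo;
    if r >= P {
        r -= P;
    }
    r as u32
}

fn main() {
    assert_eq!(mul_m31(3, 5), 15);
    // (p - 1)^2 ≡ (-1)^2 ≡ 1 (mod p).
    assert_eq!(mul_m31((P - 1) as u32, (P - 1) as u32), 1);
}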

However, STARK protocols need to be instantiated over an FFT-friendly field, i.e. a field that contains a multiplicative subgroup whose order is a large power of two (commonly referred to as a smooth subgroup).

\[ |\mathbb{F}_p^*| = p-1 = 2^k-2\]

As shown above, Mersenne prime fields lack a smooth subgroup of size that is a large power of two because there is no large power of two that divides \( |\mathbb{F}_{p}^*| \). In other words, there does not exist a sufficiently large \( n \) such that \( 2^n \, | \, p - 1 \).
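
Concretely, for \( p = 2^{31} - 1 \) we have \[ p - 1 = 2^{31} - 2 = 2 \cdot (2^{30} - 1) \] where \( 2^{30} - 1 \) is odd, so the largest power-of-two subgroup of \( \mathbb{F}_p^* \) has order only 2.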

Extensions of Mersenne Prime Field

To make Mersenne prime fields compatible with STARKs, we use a degree-2 extension of \( \mathbb{F}_p \), defined as follows:

\[ \mathbb{F}_{p^2} = \mathbb{F}_p[X]/(X^2 + 1) \]

This extension forms a field of size \( p^2 \), where elements can be represented as \( (a, b) \) or \[ a + i \cdot b \] where \( a, b \in \mathbb{F}_p \) and \( i \) is a root of the polynomial \( X^2 + 1 \), i.e. \( i^2 + 1 = 0 \).
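
Concretely, multiplication in this extension works just like multiplication of complex numbers, using \( i^2 = -1 \): \[ (a + i \cdot b)(c + i \cdot d) = (ac - bd) + i \cdot (ad + bc) \] where \( c, d \in \mathbb{F}_p \) are the components of the second element and all operations on the right-hand side are over \( \mathbb{F}_p \).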

The order of the multiplicative group of this extended field is calculated as follows:

\[ |\mathbb{F}_{p^2}^*| = p^2 - 1 = (p-1) \cdot (p+1)\]

For Mersenne primes of the form \( p = 2^k - 1 \), this becomes:

\[ |\mathbb{F}_{p^2}^*| = (2^k-2) \cdot (2^k)\]

As shown above, \( 2^k \, | \, |\mathbb{F}_{p^2}^*| \) i.e. \( \mathbb{F}_{p^2}^* \) contains a subgroup of size that is a large power of two. This makes it suitable for instantiating STARKs. This subgroup is what we refer to as the Circle group (explored further in the next section).
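
For \( p = 2^{31} - 1 \), this gives \[ |\mathbb{F}_{p^2}^*| = (2^{31} - 2) \cdot 2^{31} \] so \( \mathbb{F}_{p^2}^* \) contains a subgroup of order \( 2^{31} \).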

Secure Field

For the soundness of the protocol, it is crucial that the verifier samples random challenges from a sufficiently large field to ensure that an adversary cannot guess or brute-force the challenges and generate a proof that passes verification without knowledge of the witness.

If we use \( p = 2^{31} - 1 \), then 31-bit random challenges are not sufficient to maintain the security of the protocol. To address this, the verifier draws random challenges from a degree-4 extension of \( \mathbb{F}_{p} \), which is equivalent to a degree-2 extension of \( \mathbb{F}_{p^2} \), denoted as \[ \mathbb{F}_{p^4} = \mathbb{F}_{p^2}[X]/(X^2 - 2 - i) \]

The elements of \( \mathbb{F}_{p^4} \) can be represented as \( (r, s) \) or \[ r + u \cdot s \] where \( r, s \in \mathbb{F}_{p^2} \) and \( u \) is a root of the polynomial \( X^2 - 2 - i \), i.e. \( u^2 - 2 - i = 0 \).

Alternatively, the elements of \( \mathbb{F}_{p^4} \) can also be represented as four elements of \( \mathbb{F}_{p} \) i.e. \( ((a, b), (c, d)) \) or \[ (a + i \cdot b) + (c + i \cdot d) \cdot u \]

where \( a, b, c, d \in \mathbb{F}_p \). With four elements from \( \mathbb{F}_{p} \), the challenge space contains roughly \( 2^{124} \) elements, which is sufficiently large for sampling random challenges.
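
For completeness, multiplication of two such elements follows from \( u^2 = 2 + i \): \[ (r + u \cdot s)(t + u \cdot v) = \big(rt + (2 + i) \cdot sv\big) + u \cdot (rv + st) \] where \( t, v \in \mathbb{F}_{p^2} \) are the components of the second element and all operations on the right-hand side are in \( \mathbb{F}_{p^2} \).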

Lookups

Lookups are simply a way to connect one part of the table to another. When we "look up" a value, we are doing nothing more than creating a constraint that allows us to use that value in another part of the table without breaking soundness.

Design

We will walk through four steps to incrementally build up the design of lookups.

Step 1: Suppose we want to have two columns with the same values.

We can do this by creating two columns with the exact same values and adding a constraint over them: col_1 - col_2 = 0.

Step 2: We want to check that the two columns have the same values but in a different order.

We can use the idea that two sets of values will have the same cumulative product if they are indeed permutations of each other. So we add new columns, col_1_cumprod for col_1 and col_2_cumprod for col_2, which contain the running cumulative product of col_1 and col_2, respectively. The new constraints will check that each of these new columns does indeed contain the cumulative product values and that their last values are the same. We can optimize this by creating just one new column that keeps a running cumulative product of the fraction col_1 / col_2.
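
Under this simplified description (before any random challenge is introduced), the single running column, here called acc, would satisfy constraints of the form

$$ \mathrm{acc}_0 = \frac{\mathrm{col}_1[0]}{\mathrm{col}_2[0]}, \qquad \mathrm{acc}_i = \mathrm{acc}_{i-1} \cdot \frac{\mathrm{col}_1[i]}{\mathrm{col}_2[i]}, \qquad \mathrm{acc}_{n-1} = 1 $$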

Step 3: We want to check that all values in col_2 are in col_1, but each value appears an arbitrary number of times.

(Note that this is a generalization of the second step: in the second step, all values in col_2 appear exactly once in col_1.)

Supporting this third step is actually pretty simple: when creating the running cumulative product, we need to raise each value in col_1 to its multiplicity, or the number of times it appears in col_2. The rest of the constraints do not need to be changed.
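
In other words, writing \( m_i \) for the number of times \( \mathrm{col}_1[i] \) appears in \( \mathrm{col}_2 \), the product check becomes

$$ \prod_{i} \mathrm{col}_1[i]^{\,m_i} = \prod_{j} \mathrm{col}_2[j] $$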

Step 4: We want to check that all values in [col_2, col_3, ...] are in col_1 with arbitrary multiplicities

Finally, we want to allow many more columns [col_2, col_3, ...] that contain values from col_1. Fortunately, we can use the same idea as the third step: when creating the running cumulative product, we raise each value in col_1 to the power of the number of times it appears across [col_2, col_3, ...].

Note

In summary, lookups support the following use-cases:

  1. Prove equality: we want to prove that the values of the first column are equal to the values of the second column.
  2. Prove permutation: we want to prove that the values of the first column are a permutation of the values of the second column.
  3. Prove permutation with multiplicities: we want to prove that each value of the first column appears a certain number of times over multiple columns.

Technique: LogUp

LogUp is a technique used to constrain lookups. It's a successor to Plookup, and is especially useful for proving permutation with multiplicities. Here, we'll briefly explain why this is the case.

Plookup and its variants use a technique called the Grand Product Check to prove permutation.

$$ \prod_{i=0}^{n-1} (X - a_i) = \prod_{i=0}^{n-1} (X - b_i) $$

In the equation above, we can check that the set $\{a_0, \ldots, a_{n-1}\}$ is a permutation of the set $\{b_0, \ldots, b_{n-1}\}$ by setting $X$ to a random value provided by the verifier.

However, this becomes inefficient when we have multiplicities, since each multiplicity must be encoded as the exponent of the corresponding lookup term, so the degree of the polynomial grows linearly with the multiplicities.

$$ \prod_{i=0}^{n-1} (X - a_i) = \prod_{i=0}^{n-1} (X - b_i)^{m_i} $$

On the other hand, LogUp uses the logarithmic derivative of the Grand Product Check:

$$ \sum_{i=0}^{n-1} \frac{1}{X - a_i} = \sum_{i=0}^{n-1} \frac{m_i}{X - b_i} $$

In this approach, each looked-up value is represented as a rational term with its multiplicity as the numerator. This transformation is significant because the degree of the polynomial remains constant regardless of the multiplicities, making LogUp more efficient for handling repeated lookups of the same value.
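
The name comes from the logarithmic derivative: differentiating the logarithm of the grand product turns the product into a sum of reciprocals, and the multiplicities become numerators instead of exponents:

$$ \frac{d}{dX} \log \prod_{i=0}^{n-1} (X - b_i)^{m_i} = \sum_{i=0}^{n-1} \frac{m_i}{X - b_i} $$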

Implementation

The following figures show how lookups are implemented in Stwo: values are looked up from a preprocessed trace and constrained using the LogUp technique.

Figure 1: Create trace columns that look up values from a preprocessed trace
Figure 2: Add a multiplicity column
Figure 3: Create LogUp columns

Users of Stwo

This is a non-exhaustive list of projects and organizations that are currently using or planning to use Stwo. If you are using Stwo and would like to be added to this list, please open an issue or submit a PR.

Projects