Merkle Prover
This section explains the prover of the Merkle commitment scheme, focusing on how columns are committed to compute a Merkle root and how the Merkle tree layers are constructed, as well as how to generate Merkle proofs (decommitments).
Stwo represents the computation trace using multiple columns of different sizes. Rather than committing to each column with a separate Merkle tree, Stwo uses a modified Merkle tree structure in which all columns, regardless of size, are committed to using a single Merkle tree. We describe the structure of this tree in this section.
MerkleProver Structure
The `MerkleProver` struct represents a prover for a Merkle commitment scheme. It stores all layers of the Merkle tree, where each layer contains the hash values at that level.
```rust
pub struct MerkleProver<B: MerkleOps<H>, H: MerkleHasher> {
    /// Layers of the Merkle tree.
    /// The first layer is the root layer.
    /// The last layer is the largest layer.
    /// See [MerkleOps::commit_on_layer] for more details.
    pub layers: Vec<Col<B, H::Hash>>,
}
```
The Commit Process
The core of the Merkle prover is the `commit` function, which takes a set of columns as input and outputs a `MerkleProver` containing all layers of the Merkle tree. Each column must have a length that is a power of 2.

Below is the complete implementation of the `commit` function:
```rust
pub fn commit(columns: Vec<&Col<B, BaseField>>) -> Self {
    let _span = span!(Level::TRACE, "Merkle", class = "MerkleCommitment").entered();
    if columns.is_empty() {
        return Self {
            layers: vec![B::commit_on_layer(0, None, &[])],
        };
    }
    let columns = &mut columns
        .into_iter()
        .sorted_by_key(|c| Reverse(c.len()))
        .peekable();
    let mut layers: Vec<Col<B, H::Hash>> = Vec::new();

    let max_log_size = columns.peek().unwrap().len().ilog2();
    for log_size in (0..=max_log_size).rev() {
        // Take columns of the current log_size.
        let layer_columns = columns
            .peek_take_while(|column| column.len().ilog2() == log_size)
            .collect_vec();

        layers.push(B::commit_on_layer(log_size, layers.last(), &layer_columns));
    }
    layers.reverse();
    Self { layers }
}
```
Let's walk through the function step by step:

- **Sort Columns by Length:** Columns are sorted in descending order of length, so the columns with the most elements come first. This ensures that the largest columns are committed first.
- **Layer-by-Layer Commitment:** For each layer (from largest to smallest), the function collects all columns of the current size and computes the hashes for that layer using the `commit_on_layer` function.
  - For the largest layer, there are no previous layer hashes, so each hash is computed directly from the column values.
  - For subsequent layers, each hash is computed from the previous layer's hashes and the current layer's column values.
- **Reverse Layers:** After all layers are computed, the list of layers is reversed so that the root layer is at the beginning.
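As a usage sketch, committing to a few columns with the CPU backend and the Blake2s hasher looks roughly as follows. The module paths and the `root` accessor are taken from recent versions of `stwo-prover` and may differ slightly between versions:

```rust
use stwo_prover::core::backend::cpu::CpuBackend;
use stwo_prover::core::fields::m31::BaseField;
use stwo_prover::core::vcs::blake2_merkle::Blake2sMerkleHasher;
use stwo_prover::core::vcs::prover::MerkleProver;

fn main() {
    // Two columns of length 4 and one of length 2 (lengths must be powers of 2).
    let col1: Vec<BaseField> = (0..4).map(BaseField::from_u32_unchecked).collect();
    let col2: Vec<BaseField> = (4..8).map(BaseField::from_u32_unchecked).collect();
    let col3: Vec<BaseField> = (8..10).map(BaseField::from_u32_unchecked).collect();

    // Commit to all columns in a single Merkle tree.
    let prover =
        MerkleProver::<CpuBackend, Blake2sMerkleHasher>::commit(vec![&col1, &col2, &col3]);

    // layers[0] is the root layer (a single hash); the last layer is the largest.
    assert_eq!(prover.layers.len(), 3);
    assert_eq!(prover.layers[0].len(), 1);
    println!("root = {:?}", prover.root());
}
```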
Example: Commit Process
Suppose the input column data consists of three columns, as shown in Figure 1: two columns of length 4, which we denote $col_1$ and $col_2$, and one column of length 2, denoted $col_3$.

For this example, the columns are already in the sorted order $col_1, col_2, col_3$ (from longest to shortest). We will now compute the hashes stored at each layer.
- **First Layer (Leaves):** The hashes are computed directly from the values of the length-4 columns: $h_{2,i} = H(col_1[i], col_2[i])$ for $i = 0, 1, 2, 3$.
- **Second Layer:** The next layer uses the previous hashes and the values from $col_3$: $h_{1,j} = H(h_{2,2j}, h_{2,2j+1}, col_3[j])$ for $j = 0, 1$.
- **Root:** The root is computed as $h_{0,0} = H(h_{1,0}, h_{1,1})$.
The resulting Merkle tree therefore has three layers: the four leaf hashes $h_{2,0}, \dots, h_{2,3}$, the two intermediate hashes $h_{1,0}, h_{1,1}$, and the root.
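To see how the mixed-size layers fit together, here is a small self-contained sketch that mirrors the three hashing steps above. It uses the standard library's `DefaultHasher` as a stand-in for the real Merkle hasher (Stwo uses e.g. Blake2s), so the hash values are illustrative only:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in hash: combines an optional pair of child hashes with column values.
fn hash_node(children: Option<(u64, u64)>, values: &[u32]) -> u64 {
    let mut h = DefaultHasher::new();
    children.hash(&mut h);
    values.hash(&mut h);
    h.finish()
}

fn main() {
    // Two length-4 columns and one length-2 column, as in the example.
    let col1 = [1u32, 2, 3, 4];
    let col2 = [5u32, 6, 7, 8];
    let col3 = [9u32, 10];

    // First layer (log size 2): hash the length-4 columns element-wise.
    let layer2: Vec<u64> =
        (0..4).map(|i| hash_node(None, &[col1[i], col2[i]])).collect();

    // Second layer (log size 1): combine child hashes with col3's values.
    let layer1: Vec<u64> = (0..2)
        .map(|j| hash_node(Some((layer2[2 * j], layer2[2 * j + 1])), &[col3[j]]))
        .collect();

    // Root (log size 0): hash the two remaining children; no columns at this size.
    let root = hash_node(Some((layer1[0], layer1[1])), &[]);
    println!("root = {root}");
}
```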
The Decommit Process
The decommitment process enables the prover to generate a Merkle proof for a set of queried indices, allowing the verifier to check that specific elements are included in the committed Merkle tree.
The output is a `MerkleDecommitment` struct, which contains the hash and column values required for the verifier to reconstruct the Merkle root at the queried positions. Its definition is shown below:
```rust
pub struct MerkleDecommitment<H: MerkleHasher> {
    /// Hash values that the verifier needs but cannot deduce from previous computations, in the
    /// order they are needed.
    pub hash_witness: Vec<H::Hash>,
    /// Column values that the verifier needs but cannot deduce from previous computations, in the
    /// order they are needed.
    /// This complements the column values that were queried. These must be supplied directly to
    /// the verifier.
    pub column_witness: Vec<BaseField>,
}
```
The `decommit` function implemented for the `MerkleProver` takes as input:

- `queries_per_log_size`: A map from log size to a vector of query indices for columns of that size.
- `columns`: The column data that was committed to in the Merkle tree.

It returns:

- A vector of queried values, ordered as they are opened (from largest to smallest layer).
- A `MerkleDecommitment` containing the hash and column witnesses needed for verification.
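For example, to query index 0 of the columns of log size 2 and index 1 of the columns of log size 1 (the query pattern used in the example below), the map can be built as follows:

```rust
use std::collections::BTreeMap;

let queries_per_log_size: BTreeMap<u32, Vec<usize>> =
    BTreeMap::from([(2, vec![0]), (1, vec![1])]);
```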
Below is the complete implementation of the `decommit` function:
```rust
pub fn decommit(
    &self,
    queries_per_log_size: &BTreeMap<u32, Vec<usize>>,
    columns: Vec<&Col<B, BaseField>>,
) -> (Vec<BaseField>, MerkleDecommitment<H>) {
    // Prepare output buffers.
    let mut queried_values = vec![];
    let mut decommitment = MerkleDecommitment::empty();

    // Sort columns by layer.
    let mut columns_by_layer = columns
        .iter()
        .sorted_by_key(|c| Reverse(c.len()))
        .peekable();

    let mut last_layer_queries = vec![];
    for layer_log_size in (0..self.layers.len() as u32).rev() {
        // Prepare write buffer for queries to the current layer. This will propagate to the
        // next layer.
        let mut layer_total_queries = vec![];

        // Each layer node is a hash of column values as well as previous layer hashes.
        // Prepare the relevant columns and previous layer hashes to read from.
        let layer_columns = columns_by_layer
            .peek_take_while(|column| column.len().ilog2() == layer_log_size)
            .collect_vec();
        let previous_layer_hashes = self.layers.get(layer_log_size as usize + 1);

        // Queries to this layer come from queried nodes in the previous layer and queried
        // columns in this one.
        let mut prev_layer_queries = last_layer_queries.into_iter().peekable();
        let mut layer_column_queries =
            option_flatten_peekable(queries_per_log_size.get(&layer_log_size));

        // Merge previous layer queries and column queries.
        while let Some(node_index) =
            next_decommitment_node(&mut prev_layer_queries, &mut layer_column_queries)
        {
            if let Some(previous_layer_hashes) = previous_layer_hashes {
                // If the left child was not computed, add it to the witness.
                if prev_layer_queries.next_if_eq(&(2 * node_index)).is_none() {
                    decommitment
                        .hash_witness
                        .push(previous_layer_hashes.at(2 * node_index));
                }
                // If the right child was not computed, add it to the witness.
                if prev_layer_queries
                    .next_if_eq(&(2 * node_index + 1))
                    .is_none()
                {
                    decommitment
                        .hash_witness
                        .push(previous_layer_hashes.at(2 * node_index + 1));
                }
            }

            // If the column values were queried, return them.
            let node_values = layer_columns.iter().map(|c| c.at(node_index));
            if layer_column_queries.next_if_eq(&node_index).is_some() {
                queried_values.extend(node_values);
            } else {
                // Otherwise, add them to the witness.
                decommitment.column_witness.extend(node_values);
            }

            layer_total_queries.push(node_index);
        }

        // Propagate queries to the next layer.
        last_layer_queries = layer_total_queries;
    }

    (queried_values, decommitment)
}
```
Let's break down the function step by step:

- **Sort Columns by Length:** As in the `commit` function, columns are sorted in descending order of length.
- **Layer-by-Layer Decommitment:** For each layer (from largest to smallest):
  - Collect all columns of the current size (`layer_columns`) and the previous layer's hashes (`previous_layer_hashes`).
  - Retrieve the queries for the current layer (`layer_column_queries`) and the previous layer's queries (`prev_layer_queries`). These two sorted query streams are merged node by node (see the sketch after this list).
  - For each node index to be decommitted in this layer:
    - Check whether the verifier can compute the child hashes of the current node; if not, add the missing child hashes to the `hash_witness` of the decommitment.
    - If the node index is queried, fetch the corresponding column values and append them to `queried_values`.
    - If it is not queried, add the column values to the `column_witness` of the decommitment.
  - The set of node indices decommitted in this layer is propagated as queries to the next layer.
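The merge of the two query streams is performed by the `next_decommitment_node` helper. A minimal sketch of such a merge is shown below, assuming both iterators yield indices in increasing order: the next node to visit is the smallest index among the parents of pending previous-layer queries and the pending column queries of the current layer.

```rust
use std::iter::Peekable;

/// Returns the next node index to decommit, or `None` when both streams are
/// exhausted. `prev_queries` yields queried node indices of the previous
/// (larger) layer, whose parents live in the current layer; `layer_queries`
/// yields column query indices in the current layer. Both must be sorted.
fn next_decommitment_node(
    prev_queries: &mut Peekable<impl Iterator<Item = usize>>,
    layer_queries: &mut Peekable<impl Iterator<Item = usize>>,
) -> Option<usize> {
    prev_queries
        .peek()
        .map(|q| *q / 2) // Parent of a queried previous-layer node.
        .into_iter()
        .chain(layer_queries.peek().copied())
        .min()
}
```

A previous-layer query at index $q$ forces its parent $\lfloor q/2 \rfloor$ to be recomputed, which is why the previous-layer indices are halved before the merge.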
Example: Decommit Process
For the column data in Figure 1, consider the query map $\{2: [0],\ 1: [1]\}$, which maps a log size to a vector of query indices for columns of that size. This corresponds to querying the following elements:

- The 0th element of the columns of size 4: $col_1[0]$ and $col_2[0]$.
- The 1st element of the column of size 2: $col_3[1]$.

Because columns of equal length are committed together, the same indices are opened together in the decommitment. For example, for the query at index 0, both $col_1[0]$ and $col_2[0]$ are opened together.
Below is a walkthrough of the main loop in the `decommit` function, showing the state of the key variables for each `layer_log_size`:
- **`layer_log_size = 2` (columns of size 4):**
  - `layer_columns`: $[col_1, col_2]$
  - `previous_layer_hashes`: `None` (first layer)
  - `prev_layer_queries`: empty
  - `layer_column_queries`: $[0]$
  - For `node_index = 0`:
    - `node_values`: $(col_1[0], col_2[0])$
    - Queried, so append to `queried_values`: $[col_1[0], col_2[0]]$
    - Add `node_index = 0` to `layer_total_queries` (to propagate to the next layer)
  - At this stage: `queried_values` $= [col_1[0], col_2[0]]$, `decommitment` is empty.

- **`layer_log_size = 1` (columns of size 2):**
  - `layer_columns`: $[col_3]$
  - `previous_layer_hashes`: $[h_{2,0}, h_{2,1}, h_{2,2}, h_{2,3}]$
  - `prev_layer_queries`: $[0]$
  - `layer_column_queries`: $[1]$
  - For `node_index = 0`:
    - The left child $h_{2,0}$ was already computed, but the right child was not, so add $h_{2,1}$ to `hash_witness` in the decommitment.
    - `node_values`: $(col_3[0])$
    - Not queried, so append to `column_witness` in the decommitment.
    - Add `node_index = 0` to `layer_total_queries`.
  - At this stage: `queried_values` $= [col_1[0], col_2[0]]$, `hash_witness` $= [h_{2,1}]$, `column_witness` $= [col_3[0]]$, `layer_total_queries` $= [0]$.
  - For `node_index = 1`:
    - Neither child was computed, so add $h_{2,2}$ and $h_{2,3}$ to `hash_witness` in the decommitment.
    - `node_values`: $(col_3[1])$
    - Queried, so append to `queried_values`.
    - Add `node_index = 1` to `layer_total_queries`.
  - At this stage: `queried_values` $= [col_1[0], col_2[0], col_3[1]]$, `hash_witness` $= [h_{2,1}, h_{2,2}, h_{2,3}]$, `column_witness` $= [col_3[0]]$, `layer_total_queries` $= [0, 1]$ (to propagate to the next layer).

- **`layer_log_size = 0` (root):**
  - `layer_columns`: empty
  - `previous_layer_hashes`: $[h_{1,0}, h_{1,1}]$
  - `prev_layer_queries`: $[0, 1]$
  - `layer_column_queries`: empty
  - For `node_index = 0`, both children $h_{1,0}$ and $h_{1,1}$ were already computed, so no values are added to `queried_values`, `hash_witness`, or `column_witness` in this layer.

Final output:

- `queried_values`: $[col_1[0], col_2[0], col_3[1]]$
- `decommitment`:
  - `hash_witness`: $[h_{2,1}, h_{2,2}, h_{2,3}]$
  - `column_witness`: $[col_3[0]]$
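Putting it all together, the walkthrough above corresponds to the following end-to-end sketch (under the same module-path assumptions as the commit example earlier; the concrete column values are arbitrary):

```rust
use std::collections::BTreeMap;

use stwo_prover::core::backend::cpu::CpuBackend;
use stwo_prover::core::fields::m31::BaseField;
use stwo_prover::core::vcs::blake2_merkle::Blake2sMerkleHasher;
use stwo_prover::core::vcs::prover::MerkleProver;

fn main() {
    let col1: Vec<BaseField> = (0..4).map(BaseField::from_u32_unchecked).collect();
    let col2: Vec<BaseField> = (4..8).map(BaseField::from_u32_unchecked).collect();
    let col3: Vec<BaseField> = (8..10).map(BaseField::from_u32_unchecked).collect();
    let columns = vec![&col1, &col2, &col3];

    let prover = MerkleProver::<CpuBackend, Blake2sMerkleHasher>::commit(columns.clone());

    // Query index 0 at log size 2 and index 1 at log size 1, as in the walkthrough.
    let queries = BTreeMap::from([(2, vec![0]), (1, vec![1])]);
    let (queried_values, decommitment) = prover.decommit(&queries, columns);

    // col1[0], col2[0], col3[1], in opening order.
    assert_eq!(queried_values.len(), 3);
    // h_{2,1}, h_{2,2}, h_{2,3}.
    assert_eq!(decommitment.hash_witness.len(), 3);
    // col3[0].
    assert_eq!(decommitment.column_witness.len(), 1);
}
```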
In the next section, we describe how the verifier uses the Merkle decommitment to check the inclusion of the queried values.