Twiddles
This section provides a detailed look at how twiddles are computed and stored in Stwo.
As discussed earlier, Stwo has two main backend implementations: CpuBackend
and SimdBackend
. Each backend implements the PolyOps
trait, which provides core polynomial operations such as interpolation, evaluation and twiddle precomputation. Here and in the following sections on Circle FFT, we will explore how these functions are implemented for the CpuBackend
.
Twiddle Tree
The twiddles are stored using the TwiddleTree
struct implemented as follows:
pub struct TwiddleTree<B: PolyOps> {
pub root_coset: Coset,
// TODO(shahars): Represent a slice, and grabbing, in a generic way
pub twiddles: B::Twiddles,
pub itwiddles: B::Twiddles,
}
For CpuBackend
, B::Twiddles
is a vector of BaseField
elements. Here, root_coset
is the half coset of our circle domain . As we have seen in the earlier section, for interpolation we divide by twiddles or multiply by inverse twiddles. In the evaluation algorithm, we multiply by twiddles. Thus we store both twiddles
and their inverses itwiddles
.
Now we will understand how the twiddle tree is computed using an example. Consider the following half coset of a circle domain .
The TwiddleTree
is constructed by the precompute_twiddles
function, which takes the half coset as input and produces the twiddles needed to perform FFT over the corresponding circle domain. It is implemented as follows:
fn precompute_twiddles(coset: Coset) -> TwiddleTree<Self> {
const CHUNK_LOG_SIZE: usize = 12;
const CHUNK_SIZE: usize = 1 << CHUNK_LOG_SIZE;
let root_coset = coset;
let twiddles = slow_precompute_twiddles(coset);
// Inverse twiddles.
// Fallback to the non-chunked version if the domain is not big enough.
if CHUNK_SIZE > root_coset.size() {
let itwiddles = twiddles.iter().map(|&t| t.inverse()).collect();
return TwiddleTree {
root_coset,
twiddles,
itwiddles,
};
}
let mut itwiddles = vec![BaseField::zero(); twiddles.len()];
twiddles
.array_chunks::<CHUNK_SIZE>()
.zip(itwiddles.array_chunks_mut::<CHUNK_SIZE>())
.for_each(|(src, dst)| {
batch_inverse_in_place(src, dst);
});
TwiddleTree {
root_coset,
twiddles,
itwiddles,
}
}
As shown above, it first computes twiddles
using the function slow_precompute_twiddles
then computes their inverses itwiddles
using batch inversion and finally stores them in the TwiddleTree
along with the root_coset
which is the input half coset.
Now let us look into the function slow_precompute_twiddles
.
pub fn slow_precompute_twiddles(mut coset: Coset) -> Vec<BaseField> {
let mut twiddles = Vec::with_capacity(coset.size());
for _ in 0..coset.log_size() {
let i0 = twiddles.len();
twiddles.extend(
coset
.iter()
.take(coset.size() / 2)
.map(|p| p.x)
.collect::<Vec<_>>(),
);
bit_reverse(&mut twiddles[i0..]);
coset = coset.double();
}
// Pad with an arbitrary value to make the length a power of 2.
twiddles.push(1.into());
twiddles
}
The above function computes the twiddles required to compute FFT over the line (i.e. the recursive layer FFT, after projecting to the -axis). For the example in the figure with the half coset points , this function will output .
Thus, for the half coset in the figure, the precompute_twiddles
will output TwiddleTree
with twiddles
as and itwiddles
as . The twiddles
will be used for the evaluation algorithm and itwiddles
will be used for the interpolation algorithm.
In the next section, we will bring together everything we've covered so far on Circle FFT to examine the implementation of the interpolation algorithm.