NoName

This is the book on NoName, a high-level language to write circuits using the kimchi zero-knowledge proof system.

Basics

Noname is a language that closely resembles Rust.

For example, in the following program you can see a main function:

fn main(pub public_input: Field, private_input: Field) {
    let x = private_input + public_input;
    assert_eq(x, 2);
}

The only differences with Rust are:

  • The pub keyword is used to mark public inputs. By default all arguments are private.
  • assert_eq is not a macro, there are no macros in noname.
  • a Field type is used as main types everywhere. It is defined in field.rs to be the pasta Fp field (the base field of the Pallas curve). If these words mean nothing to you, just see Field as a large number. Ideally programs should be written without this type, but for now custom types do not exist.

To run such a file, and assuming you have Rust installed, you can type in the terminal:

$ cargo run -- --path path/to/file.no --private-inputs '{"private_input": ["1"]}' --public-inputs '{"public_input": ["1"]}'

As you can see, inputs are passed with a JSON format, and the values are expected to be encoded in decimal numbers.

Builtins and use statements

Some builtin functions are available by default:

  • assert_eq to check that two field elements are equal
  • assert to check that a condition is true.

Like in Rust, you can also import other libraries via the use keyword. If you do this, you must know that you can only import a library, but not its functions (and types, and constants) directly.

For example, to use the poseidon function from the crypto library (or module), you must import std::crypto and then qualify your use of crypto::poseidon:

use std::crypto;

fn main(pub public_input: Field, private_input: [Field; 2]) {
    let digest = crypto::poseidon(private_input);
    assert_eq(digest[0], public_input);
}

Note that currently, only built-in libraries (written in Rust) are working. In the future we’d like for other libraries to be written in the noname language.

Field

The Field type is the primitive type upon which all other types are built. It is good to know about it as it is used in many places, and is error prone: it does not match the size of commonly-found types like u32 and u64 and can have unexpected behaviors as it can overflow or underflow without emitting an error.

Ideally, you should never use the Field type, but currently the library is quite limited and the ideal world is far away.

Note that you can define Field elements directly in the code by writing a decimal number directly. For example:

#![allow(unused)]
fn main() {
let x = 2;
assert(y, 4);
}

Arrays

While there are no dynamic arrays (or vectors), you can use fixed-size arrays like in Rust.

For the moment, I believe that arrays can only be declared in a function argument as the following declaration hasn’t been implemented yet:

#![allow(unused)]
fn main() {
let x = [1, 2, y];
}

Boolean

Booleans are similar to Rust’s boolean. They are currently the only built-in type besides Field and arrays.

#![allow(unused)]
fn main() {
let x = true;
let y = false;
assert(!(x & y));
}

Mutability

Variables are by default not mutable. To make a variable mutable, you must use the mut keyword:

#![allow(unused)]
fn main() {
let mut x = 1;
x = 2; // GOOD

let y = 1;
y = x + y; // BAD
}

For loops

fn main(pub public_input: Field, private_input: [Field; 3]) {
    let mut sum = 0;

    for i in 0..3 {
        sum = sum + private_input[i];
    }

    assert_eq(sum, public_input);
}

Constants

Like variables and function names, constants must be lowercase.

At the moment they can only represent field elements. Perhaps it would be nice to be able to represent different types in the future.

const player_one = 1;
const player_two = 2;

fn main(pub player: Field) -> Field {
    assert_eq(player_one, player);
    let next_player = player + 1;
    assert_eq(player_two, next_player);
    return next_player;
}

If Else statements

Currently, if/else statements are not supported. Only the ternary operator is:

fn main(pub xx: Field) {
    let plus = xx + 1;
    let cond = xx == 1;
    let yy = cond ? plus : xx ;
    assert_eq(yy, 2);
}

The two branches of the ternary operator must be variables (as in the xample), or array accesses (e.g. thing[0]), or field accesses (e.g. thing.field).

Functions

fn add(x: Field, y: Field) -> Field {
    return x + y;
}

fn double(x: Field) -> Field {
    return x + x;
}

fn main(pub one: Field) {
    let four = add(one, 3);
    assert_eq(four, 4);

    let eight = double(4);
    assert_eq(eight, double(four));
}

Custom types

struct Thing {
    x: Field,
    y: Field,
}

fn main(pub x: Field, pub y: Field) {
    let thing = Thing {
        x: 1,
        y: 2,
    };
    
    assert_eq(thing.x, x);
    assert_eq(thing.y, y);
}

Methods on custom types

struct Thing {
    x: Field,
    y: Field,
}

fn Thing.new(x: Field, y: Field) -> Thing {
    return Thing {
        x: x,
        y: y,
    };
}

fn Thing.verify(self, v: Field) {
    assert_eq(self.x, v);
    assert_eq(self.y, v + 1);
}

fn Thing.update_and_verify(self) {
    let new_thing = Thing {
        x: self.x + 1,
        y: self.y + 1,
    };

    new_thing.verify(2);
}

fn main(pub x: Field) {
    let thing = Thing.new(x, x + x);
    thing.update_and_verify();
}

technically you can even call static methods from a variable of that type:

#![allow(unused)]
fn main() {
let y = thing.new(3, 4);
}

Note

It’s not necessarily pleasant to read, and we could prevent it by storing some meta information (static_method: bool) in the type checker, but it’s not a big deal.

Early returns

TODO

Hints

TODO

Shadowing

We forbid variable shadowing as much as we can.

For example, this should not work:

#![allow(unused)]
fn main() {
let x = 2;
let x = 3; // this won't compile

let y = 4;
for i in 0..4 {
    let y = i; // this won't compile either
}
}

Compilation

The compilation of noname programs goes through the following flow:

  1. Lexer. A lexer (lexer.rs) is used to parse the source code into a list of tokens. This is pretty primitive, but will detect some minor syntax issues.
  2. Parser. A parser (parser.rs) is used to parse meaning from the code. It will convert the tokens output by the lexer into an abstract syntax tree (AST) using strong types like Statement and Expression (TODO: link to rust doc). It will also error if some code does not make sense according to the grammar (see the Grammar chapter).
  3. Type checking. A type checker (type_checker.rs) takes the AST produced by the parser and does import resolution and type checking:
    • Built-in functions. Functions like assert_eq are injected into the environment.
    • Custom imports. Modules imported via the use keyword are resolved and added to the environment. For now, these can only be built-in functions, and noname functions or libraries are not supported (of course it is essential to support them in the future).
    • Type checking. The type checker verifies that the types of each variables and expressions in the AST make sense. It is a very simple type checker that can do some simple type inference. Temporary type information (type of an expression) is not stored, and is thrown away as soon as the type checker can afford it. a TAST for typed AST is returned, but it mostly contains resolved imports and most type information has been thrown away.
  4. Gate construction. The TAST produced by the type checker is passed to the circuit writer (circuit_writer.rs), also called the constraint writer, which goes through it one more time and converts it into:
    • compiled circuit: a series of gates and wires
    • prover instructions: instructions on how to run the function for the witness generation (used by the prover)

A simple ASM language is also used, and the circuit can be encoded in this language. See the ASM chapter.

Terminology

A note on topology:

  • functions: noname functions each contain their scope and can be interacted with their interface (arguments and return value)
  • module/program: a noname module is a single file (this is a nice current limitation of noname) containing functions, constants, and structures.
  • library: a noname library is a module/program without a main() function, as well as dependencies (other libraries)
  • executable: a noname executable is like a library, except that its module/program has a main() function.

Grammar

The syntax of the noname language is described through its grammar.

We use a notation similar to the Backus-Naur Form (BNF) to describe the grammar:

land := city "|"
 ^        ^   ^
 |        |  terminal: a token
 |        |
 |      another non-terminal
 |
 non-terminal: definition of a piece of code
city := [ sign ] "," { house }
        ^            ^
        optional     |
                    0r or more houses
sign := /a-zA-Z_/
        ^
        regex-style definition

There are some comments in the parser code (parser.rs) that attempt to define this grammar.

Essentially, it is made to look like Rust, but with some differences of philosophies:

  • expressions cannot be statements, unless they return no value (act using side effects).

Spans

To be able to efficiently track errors, we have a span type:

#![allow(unused)]
fn main() {
pub struct Span(pub usize, pub usize);
}

which represents a location in the original source code:

  • the first number is the offset in the source code file
  • the second numebr if the length of the span (e.g. 1 character)

We start tracking spans in the lexer, and then pass them around to the parser, and then to the compiler. Even gates and wirings have spans associated with them so that we can easily debug those.

Filename

The filename is currently missing from the Span, it is annoying to add it as a String because then we can’t easily copy the span around (String is not Copy but Clone).

One way to solve this, is to add the filenames in a Hashmap<usize, String>, and have the usize be in the Span.

Paths

Paths are structures identifying snippets of code that look like this:

#![allow(unused)]
fn main() {
some_module::some_ident.stuff.z
}

The identifier some_module, appearing before the ::, is an optional module, pointing to code that exists in another library. It is always lowercase.

The identifier some_ident is mandatory. It can represent a type (if it starts with a capital letter), a function name, a variable name, a constant name, etc.

More identifiers can be concatenated together to form a longer path (using .). A path is represent like this internally:

#![allow(unused)]
fn main() {
/// A path represents a path to a type, a function, a method, or a constant.
/// It follows the syntax: `module::X` where `X` can be a type, a function, a method, or a constant.
/// In the case it is a method `a` on some type `A` then it would read:
/// `module::A.a`.
#[derive(Debug, Clone)]
pub struct Path {
    /// A module, if this is an foreign import.
    pub module: Option<Ident>,

    /// The name of the type, function, method, or constant.
    /// It's a vector because it can also be a struct access.
    pub name: Vec<Ident>,

    /// Its span.
    pub span: Span,
}
}

Expressions using Path

A path does not represent an expression by itself. The following expressions make use of path:

#![allow(unused)]
fn main() {
pub enum ExprKind {
    /// `module::some_fn()`
    FnCall {
        name: Path,
        args: Vec<Expr>,
    },

    /// `module::SomeType.some_method()`
    MethodCall {
        self_name: Path,
        method_name: Ident,
        args: Vec<Expr>,
    },

    /// `module::some_var` or
    /// `module::SomeType.some_field.some_other_field`
    Variable(Path),

    /// `module::SomeType.some_field[some_expr]` or
    /// `module::some_const[some_expr]`
    ArrayAccess {
        name: Path,
        idx: Box<Expr>,
    },

    /// `module::SomeType { field1: expr1; field2: expr2 }`
    CustomTypeDeclaration(Path, Vec<(Ident, Expr)>),
}
}

Type Checker

Noname uses a simple type checker to ensure that all types are consistent in the program.

For example, in code like:

#![allow(unused)]
fn main() {
let x = y + z;
}

the type checker will ensure that y and z are both field elements (because the operation + is used).

And in code like:

#![allow(unused)]
fn main() {
assert_eq(a, b);
}

the type checker will ensure that a and b are of the same types, since they are being compared.

Type inference

The type checker can do some simple type inference. For example, in the following code:

#![allow(unused)]
fn main() {
let x = y + z;
}

the type of x is inferred to be the same as the type of y and z.

Inference is willingly kept naive, as more type inference would lead to less readable code.

Scope

The type checker must be aware of scopes, as it keeps track of the type of variables and functions that are local to each scope.

For example:

#![allow(unused)]
fn main() {
let x = 2;
for i in 0..2 {
    let y = x + 1; // x exists inside of this for loop
}
let z = 1 + y; // BAD: y doesn't exist outside of the for loop
}

To do this, each function is passed an Environment which contains a list of all variables along with their type information.

#![allow(unused)]
fn main() {
pub struct Environment {
    /// created by the type checker, gives a type to every external variable
    pub var_types: HashMap<String, TypeInfo>,

    // ...

    /// the functions present in the scope
    /// contains at least the set of builtin functions (like assert_eq)
    pub functions: HashMap<String, FuncInScope>,

    /// stores the imported modules
    pub modules: HashMap<String, ImportedModule>,
}
}

So that shadowing is disallowed, even in different scopes, there is only one variable that is stored, but the scope is stored in TypeInfo and matched against the current scope to see if the current scope is the same or a direct child.

An environment is unique to a function, as it is important that different functions can use the same variable names.

Some notes:

  • Currently the notion of module is quite shaky. It is used mostly for crypto::poseidon at the moment.
  • functions is mostly used for builtins like assert_eq
  • modules is mostly used for the functions in std::crypto, which only contains crypto::poseidon atm.

more:

  • I think Environment mixes things
  • we should be able to create a new Environment whenever we parse a new function, so the functions/modules should be part of another (AvailableToAllScopes)
  • variables is something that has nothing to do with the “Type” Environment and should be moved elsewhere no? GateCreationEnv?
  • there needs to be a namespace in the typeinfo

noname ASM

The circuit that noname compiles to can be serialized as a simple ASM language. For example the following noname program:

fn main(pub public_input: Field, private_input: Field) {
    let x = private_input + public_input;
    assert_eq(x, 2);
}

will be compiled to the following noname asm:

@ noname.0.5.0

DoubleGeneric<1>
DoubleGeneric<1,1,-1>
DoubleGeneric<1,0,0,0,-2>
DoubleGeneric<1,-1>
(0,0) -> (1,1)
(1,2) -> (3,1)
(2,0) -> (3,0)

which includes:

  • the version of noname used to compile this circuit. This is important as the prover needs to know what version of noname to use to prove executions of this circuit.
  • a list of gates and how they are tweaked (the values in the brackets).
  • a list of wires which is canonically ordered so that every compilation gives the same resulting noname asm.

Structs

User can define custom structs like so:

#![allow(unused)]
fn main() {
struct Thing {
    x: Field,
    y: Field,
}
}

and can declare and access such structs like so:

#![allow(unused)]
fn main() {
let thing = Thing { x: 1, y: 2 };
let z = thing.x + thing.y;
}

Internally, a struct is represented within the Var type.

Methods

A method on what?

There’s one problem when handling methods in the circuit-writer: how do you know where the code of that method is? For example:

#![allow(unused)]
fn main() {
let thing = Thing { x: 1, y: 2 };
let z = thing.valid(3);
}

at this point the circuit-writer knows that Thing has a method called valid, but will still wonder what the type of thing is.

due to this, the circuit-writer needs to store the type of local variables in scope. And this is why FnEnv also keeps track of the type of local variables:

#![allow(unused)]
fn main() {
/// Is used to store functions' scoped variables.
/// This include inputs and output of the function,  as well as local variables.
/// You can think of it as a function call stack.
pub struct FnEnv {
    /// The current nesting level.
    /// Starting at 0 (top level), and increasing as we go into a block.
    current_scope: usize,

    /// Used by the private and public inputs,
    /// and any other external variables created in the circuit
    /// This needs to be garbage collected when we exit a scope.
    /// Note: The `usize` is the scope in which the variable was created.
    vars: HashMap<String, (usize, VarInfo)>,
}

/// Information about a variable.
#[derive(Debug, Clone)]
pub struct VarInfo {
    /// The variable.
    pub var: Var,

    /// We keep track of the type of variables, eventhough we're not in the typechecker anymore,
    /// because we need to know the type for method calls.
    pub typ: TyKind,
}
}

This still doesn’t fix our problem. In the line:

#![allow(unused)]
fn main() {
let thing = Thing { x: 1, y: 2 };
}

the local variable thing is stored, but the right hand side is computed via the compute_expr() function which will go through the AST and potentially create different anonymous variables until it can compute a value.

There’s three ways to solve this:

  1. Either the type checker stores type information about each expression it parses. This is what the Rust compiler does I believe: each Expr AST node has a unique node identifier that can be used to search type information in a map.
  2. Or, more simply, the circuit-writer’s compute_expr() function that returns an Option<Var> could be modified to return Option<VarInfo>. This is a bit annoying as we’re recomputing things we’ve done in the type checker.
  3. A variant of the previous option is to change Var so that it also contain a type (might as well).

So we implement option 1: the type checker now stores the type information of each Expr node in the AST under a hashmap that is later passed to the circuit-writer:

#![allow(unused)]
fn main() {
/// The environment we use to type check a noname program.
pub struct TypedGlobalEnv {
    // ...

    /// Mapping from node id to TyKind.
    /// This can be used by the circuit-writer when it needs type information.
    node_types: HashMap<usize, TyKind>,
}
}

Literals and the const keyword

We want to be able to support a few things:

  • writing numbers that will act as field elements directly in the code
  • writing numbers that will act as relatively small numbers (for indexing into arrays for example) in the code
  • writing functions that will accept constant arguments. For example to index into an array.

The first two points allow us to write things like that:

#![allow(unused)]
fn main() {
let x = 3;
assert_eq(y[5], 4);
}

The third point allows us to write things like that:

#![allow(unused)]
fn main() {
fn House.get(self, room: Field) -> Field {
    return House.rooms[room];
}
}

Since House takes a Field here, it can in theory be used with anything during type checking.

This is bad. There are two solutions:

  1. During type checking, when we realize that room is used to index into an array, we enforce that it must be a constant value during compilation.
  2. We create a distinct type for constants and literals.

Approach 1. is not elegant, because it means that it is not clear from the signature of the function alone that the room argument must be a constant. The user of the function will only get warned when trying to compile the program.

Approach 2. is interesting, because we already have such a type internally to track literals: a BigInt. The name is a bit misleading in the case of array accesses, because we additionaly enforce that it is NOT a big integer, but rather a 32-bit integer (u32 in Rust).

Implementing a literal type

Approach 2 can be implemented in two ways:

a. Use a brand new type, like BigInt for literals. b. Use a const attribute to indicate that it is a constant.

Approach a. is a bit clumsy in my opinion because the developer need to remember about a new type name, and understand the distinction with that and Field.

On the other hand, approach b. uses the const keyword which is already well-known in many compiled programming languages.

What about this:

#![allow(unused)]
fn main() {
fn House.get(self, const room: Field) -> Field {
// versus
fn House.get(self, room: const Field) -> Field {
}

To contrast, the two other existing attributes (pub and mut) are placed in front of the variable names, not the type names.

One could argue that the type is the same, but the variable being passed is a constant, and so it makes more sense to implement the first version. This is what we do in noname.

At the time of this writing, the const keyword only seem to make sense in a function argument, and so is implemented in the same way as the pub attribute:

#![allow(unused)]
fn main() {
pub enum AttributeKind {
    Pub,
    Const,
}

pub struct Attribute {
    pub kind: AttributeKind,
    pub span: Span,
}

pub struct FnArg {
    pub name: Ident,
    pub typ: Ty,
    pub attribute: Option<Attribute>,
    pub span: Span,
}
}

When a function is parsed by the type checker, a const Field is transformed into a BigInt. And as such, the type checker will be happy with that variable being used to index into an array, or being used by other functions expecting a constant.

What about other types being const? I don’t think it makes sense for now, as I can only think of array access requiring this. So we don’t implement it.

Note

If we do want to support that one day, we will have to track more than TyKind in the typechecker… This can be achieved by adding a const field in the TypeInfo structure that tracks type-related data on noname variables present in the scope.

Expressions

Field accesses

A field access is an access to a field of a structure, by writing struct.field. It is represented as an expression in noname:

#![allow(unused)]
fn main() {
pub enum ExprKind {
    // ...
    FieldAccess { lhs: Box<Expr>, rhs: Ident }
}

The reason why the left-hand side is also an expression, as opposed to just a variable pointing to a struct, is that we need to support code like this:

#![allow(unused)]
fn main() {
let a = b.c.d;
let e = f[3].g;
}

Note that there are other usecases that are not allowed at the moment for readability reasons. For example we could have allowed the following to be valid noname:

#![allow(unused)]
fn main() {
let res = some_function().some_field;
}

but instead we require developers to write their logic in the following way:

#![allow(unused)]
fn main() {
let temp = some_function();
let res = temp.some_field;
}

In the example

#![allow(unused)]
fn main() {
let a = b.c.d;
}

the expression node representing the right hand side could be seen as:

#![allow(unused)]
fn main() {
ExprKind::FieldAccess {
    lhs: Expr { kind: ExprKind::FieldAccess { // x.y
        lhs: Expr { kind: ExprKind::Variable { name: "x" } },
        rhs: Ident { value: "y" },
    },
    rhs: Ident { value: "z" }, ///  [x.y].z
}
}

Assignments

Imagine that we want to mutate a variable. For example:

#![allow(unused)]
fn main() {
x.y.z = 42;
x[4] = 25;
}

At some point the circuit-writer would have to go through an expression node looking like this:

#![allow(unused)]
fn main() {
ExprKind::Assignment {
    lhs: /* the left hand side as an Expr */,
    rhs: Expr { kind: ExprKind::BigInt { value: 42 } },
}
}

At this point, the problem is that to go through each expression node, we use the following API, which only gets us a Var:

#![allow(unused)]
fn main() {
fn compute_expr(
    &mut self,
    global_env: &GlobalEnv,
    fn_env: &mut FnEnv,
    expr: &Expr,
) -> Result<Option<Var>> {
}

So parsing the x.y node would return a variable that either represents x or represents y. The parent call would then use the result to produce x.y.z with a similar outcome. Then, we would either have x or z (depending on the strategy we chose) when we reach the assignment expression node. Not leaving us enough information to modify the variables of x in our local function environment.

What we really need when we reach the assignment node is the following:

  • the name of the variable being modified (in both cases x)
  • if the variable is mutable or not (it was defined with the mut keyword)
  • the range of circuit variables in the Var.cvars of x, that the x.y.z field access, or the x[42] array access, represents.

Circuit writer

To implement this in the circuit writer, we follow a common practice of tracking references:

#![allow(unused)]
fn main() {
/// Represents a variable in the circuit, or a reference to one.
/// Note that mutable variables are always passed as references,
/// as one needs to have access to the variable name to be able to reassign it in the environment.
pub enum VarOrRef {
    /// A [Var].
    Var(Var),

    /// A reference to a noname variable in the environment.
    /// Potentially narrowing it down to a range of cells in that variable.
    /// For example, `x[2]` would be represented with "x" and the range `(2, 1)`,
    /// if `x` is an array of `Field` elements.
    Ref {
        var_name: String,
        start: usize,
        len: usize,
    },
}
}

and we modify the circuit-writer to always return a VarOrRef when processing an expression node in the AST.

Note

While the type checker already checks if the lhs variable is mutable when it encounters an assignment expression, the circuit-writer should do its best to pass references only when a variable is known to be mutable. This way, if there is a bug in the type checker, this will turn unsafe code into a runtime error.

An array access, or a field access in a struct, is processed as a narrowing of the range we’re referencing in the original variable:

#![allow(unused)]
fn main() {
impl VarOrRef {
    fn narrow(&self, start: usize, len: usize) -> Self {
        match self {
            VarOrRef::Var(var) => {
                let cvars = var.range(start, len).to_vec();
                VarOrRef::Var(Var::new(cvars, var.span))
            }

            //      old_start
            //      |
            //      v
            // |----[-----------]-----| <-- var.cvars
            //       <--------->
            //         old_len
            //
            //
            //          start
            //          |
            //          v
            //      |---[-----]-|
            //           <--->
            //            len
            //
            VarOrRef::Ref {
                var_name,
                start: old_start,
                len: old_len,
            } => {
                // ensure that the new range is contained in the older range
                assert!(start < *old_len); // lower bound
                assert!(start + len < *old_len); // upper bound
                assert!(len > 0); // empty range not allowed

                Self::Ref {
                    var_name: var_name.clone(),
                    start: old_start + start,
                    len,
                }
            }
        }
    }
}

Type checker

While the type checker does not care about the range within a variable, it also needs to figure out if a variable is mutable or not.

That information is in two places:

  1. it is stored under the variable’s name in the local environment
  2. it is also known when we look up a variable, and we can thus bubble it up to the parent expression nodes

Implementing solution 1. means bubbling up the variable name, in addition to the type, associated to an expression node.

Implementing solution 2. means bubbling up the mutability instead.

As it is possible that we might want to retrieve additional information in the future, we chose to implement solution 1. and carry the variable name in addition to type information when parsing the AST in the type checker.

Modules

In noname, the concept of a module is basically a file. A project either is a binary (main.no) or a library (lib.no). That’s it.

A binary or a library can use other libraries by importing them. To do that, a binary or library’s manifest file Noname.toml must contain a dependencies key listing all the other libraries as Github handles like user/repo (e.g. mimoo/sudoku). Libraries will then be retrieved from Github.

Note

Currently there is no versioning. Not because it’s not important, but because I haven’t had the time to implement it.

Each library can be imported in code with the following command:

use module::lib;

For example, currently you automatically have access to the std module:

use std::crypto;

fn main(pub digest: [Field; 2]) {
    let expected_digest = crypto::poseidon([1, 2]);
    assert_eq(expected_digest, digest);
}

Each library is seen as a module, and different modules might have the same name:

use a::some_lib;
use b::some_lib;

There is currently no solution to this problem.

Note

This is a problem that does not exist in Rust, as there’s a single namespace that everyone shares, but that exists in Golang. The current proposed solution is to introduce an as keyword, like in Rust, to be able to alias imports (e.g. use a::some_lib as a_some_lib;).

Dependency graph and type checking

During building, a dependency graph of all dependencies is formed (and dependencies are retrieved from Github at the same time). This must be done to detect dependency cyles.

Once this is done, a list of dependencies from leaves to roots is computed, and each dependency is analyzed in this order. Dependencies are not compiled! As the circuit-writer is not ran. Things stop at the type checker. For every new dependency analyzed, all TAST (typed AST) previously computed on previous dependencies are passed as argument. This way, if a dependency A uses a dependency B, it has access to the TAST of B to perform type checking correctly.

As such, it is important that a::some_lib and b::some_lib are seen as two independent modules. For this reason, we store imported modules as their fully qualified path, in the set of TASTs that we pass to the type checker. But in the current module, we store them as their alias, so that we can use them in the code.

TASTs: HashMap<a::some_lib, TAST>
TAST: contains <some_lib -> a::some_lib>

Compilation and circuit generation

Once type checking is done, the circuit writer is given access to all of the dependencies’ TAST (which also contain their AST). This way, it can jump from AST to AST to generate an unrolled circuit.

Another solution

This is a bit annoying. We need a context switcher in both the constraint writer and the type checker, and it’s almost the same code.

Type Checker

Constraint Writer

#![allow(unused)]
fn main() {
pub struct CircuitWriter {
    /// The type checker state for the main module.
    // Important: this field must not be used directly.
    // This is because, depending on the value of [current_module],
    // the type checker state might be this one, or one of the ones in [dependencies].
    typed: TypeChecker,

    /// The type checker state and source for the dependencies.
    // TODO: perhaps merge {source, typed} in this type?
    dependencies: Dependencies,

    /// The current module. If not set, the main module.
    // Note: this can be an alias that came from a 3rd party library.
    // For example, a 3rd party library might have written `use a::b as c;`.
    // For this reason we must store this as a fully-qualified module.
    pub(crate) current_module: Option<UserRepo>,
}

and then access to the TAST is gated so we can switch context on demand, or figure out what’s the current context:

#![allow(unused)]
fn main() {
impl CircuitWriter {
    /// Retrieves the type checker associated to the current module being parsed.
    /// It is possible, when we jump to third-party libraries' code,
    /// that we need access to their type checker state instead of the main module one.
    pub fn current_type_checker(&self) -> &TypeChecker {
        if let Some(current_module) = &self.current_module {
            self.dependencies
                .get_type_checker(current_module)
                .expect(&format!(
                    "bug in the compiler: couldn't find current module: {:?}",
                    current_module
                ))
        } else {
            &self.typed
        }
    }

    pub fn expr_type(&self, expr: &Expr) -> Option<&TyKind> {
        let curr_type_checker = self.current_type_checker();
        curr_type_checker.node_types.get(&expr.node_id)
    }

    pub fn node_type(&self, node_id: usize) -> Option<&TyKind> {
        let curr_type_checker = self.current_type_checker();
        curr_type_checker.node_types.get(&node_id)
    }

    pub fn struct_info(&self, name: &str) -> Option<&StructInfo> {
        let curr_type_checker = self.current_type_checker();
        curr_type_checker.struct_info(name)
    }

    pub fn fn_info(&self, name: &str) -> Option<&FnInfo> {
        let curr_type_checker = self.current_type_checker();
        curr_type_checker.functions.get(name)
    }

    pub fn size_of(&self, typ: &TyKind) -> Result<usize> {
        let curr_type_checker = self.current_type_checker();
        curr_type_checker.size_of(&self.dependencies, typ)
    }

    pub fn resolve_module(&self, module: &Ident) -> Result<&UsePath> {
        let curr_type_checker = self.current_type_checker();

        let res = curr_type_checker.modules.get(&module.value).ok_or_else(|| {
            self.error(
                ErrorKind::UndefinedModule(module.value.clone()),
                module.span,
            )
        });

        res
    }

    pub fn do_in_submodule<T, F>(&mut self, module: &Option<Ident>, mut closure: F) -> Result<T>
    where
        F: FnMut(&mut CircuitWriter) -> Result<T>,
    {
        if let Some(module) = module {
            let prev_current_module = self.current_module.clone();
            let submodule = self.resolve_module(module)?;
            self.current_module = Some(submodule.into());
            let res = closure(self);
            self.current_module = prev_current_module;
            res
        } else {
            closure(self)
        }
    }

    pub fn get_fn(&self, module: &Option<Ident>, fn_name: &Ident) -> Result<FnInfo> {
        if let Some(module) = module {
            // we may be parsing a function from a 3rd-party library
            // which might also come from another 3rd-party library
            let module = self.resolve_module(module)?;
            self.dependencies.get_fn(module, fn_name) // TODO: add source
        } else {
            let curr_type_checker = self.current_type_checker();
            let fn_info = curr_type_checker
                .functions
                .get(&fn_name.value)
                .cloned()
                .ok_or_else(|| {
                    self.error(
                        ErrorKind::UndefinedFunction(fn_name.value.clone()),
                        fn_name.span,
                    )
                })?;
            Ok(fn_info)
        }
    }

    pub fn get_struct(&self, module: &Option<Ident>, struct_name: &Ident) -> Result<StructInfo> {
        if let Some(module) = module {
            // we may be parsing a struct from a 3rd-party library
            // which might also come from another 3rd-party library
            let module = self.resolve_module(module)?;
            self.dependencies.get_struct(module, struct_name) // TODO: add source
        } else {
            let curr_type_checker = self.current_type_checker();
            let struct_info = curr_type_checker
                .struct_info(&struct_name.value)
                .ok_or(self.error(
                    ErrorKind::UndefinedStruct(struct_name.value.clone()),
                    struct_name.span,
                ))?
                .clone();
            Ok(struct_info)
        }
    }

    pub fn get_source(&self, module: &Option<UserRepo>) -> &str {
        if let Some(module) = module {
            &self
                .dependencies
                .get_type_checker(module)
                .expect(&format!(
                    "type checker bug: can't find current module's (`{module:?}`) file"
                ))
                .src
        } else {
            &self.typed.src
        }
    }

    pub fn get_file(&self, module: &Option<UserRepo>) -> &str {
        if let Some(module) = module {
            &self.dependencies.get_file(module).expect(&format!(
                "type checker bug: can't find current module's (`{module:?}`) file"
            ))
        } else {
            &self.typed.filename
        }
    }

    pub fn get_current_source(&self) -> &str {
        self.get_source(&self.current_module)
    }

    pub fn get_current_file(&self) -> &str {
        self.get_file(&self.current_module)
    }

    pub fn add_local_var(&self, fn_env: &mut FnEnv, var_name: String, var_info: VarInfo) {
        // check for consts first
        let type_checker = self.current_type_checker();
        if let Some(_cst_info) = type_checker.constants.get(&var_name) {
            panic!(
                "type checker bug: we already have a constant with the same name (`{var_name}`)!"
            );
        }

        //
        fn_env.add_local_var(var_name, var_info)
    }

    pub fn get_local_var(&self, fn_env: &FnEnv, var_name: &str) -> VarInfo {
        // check for consts first
        let type_checker = self.current_type_checker();
        if let Some(cst_info) = type_checker.constants.get(var_name) {
            let var = Var::new_constant(cst_info.value, cst_info.typ.span);
            return VarInfo::new(var, false, Some(TyKind::Field));
        }

        // then check for local variables
        fn_env.get_local_var(var_name)
    }
}

we basically have to implement the same in the type checker… It always sort of looks the same. A handy function is either called with get_fn or expr_type or node_type etc. or we call a block of code with do_in_submodule.

all of these basically start by figuring out the curr_type_checker:

  • what’s the current module (self.current_module)?
    • if there is none, use the main TAST (self.typed)
    • otherwise find that TAST (in self.dependencies)
    • btw all of this logic is implemented in self.current_type_checker()
    • the returned TAST is called curr_type_checker

then, if we’re handling something that has a module:

  • do name resolution (implemented in resolve_module()):
    • use curr_type_checker to resolve the fully-qualified module name

or if we’re executing a block within a module:

  • save the current module (self.current_module)
  • replace it with the module we’re using (we have used resolve_module() at this point)
  • execute in the closure where self is passed
  • when we return, reset the current module to its previous saved state

note that when we return an error, we always try to figure out which file it came from, which can be resolved via self.current_module.

Name resolution approach

If we have a name resolution phase, we could do this:

  • fully qualify all things that need to be fully qualified: structs, functions, methods (which are defined as function currently, should we not do that?), consts. And that’s it?
  • create a Hashmap<usize, String> to store all the filenames
  • add the usize in all Span

CellVar

A CellVar type is a type that represents an internal variable. Importantly, it is named after the fact that it relates to a specific cell, or even multiple cells if they will have the same value (using some wiring), in the execution trace.

A CellVar looks like this:

#![allow(unused)]
fn main() {
pub struct CellVar {
    index: usize,
    span: Span,
}
}

It is tracked using a usize, which is just a counter that the compiler increments every time a new CellVar is created.

A CellVar is created via the new_internal_var function which does two things: increments the variable counter, and stores some information on how to compute it (which will be useful during witness generation)

#![allow(unused)]
fn main() {
pub fn new_internal_var(&mut self, val: Value, span: Span) -> CellVar {
    // create new var
    let var = CellVar::new(self.next_variable, span);
    self.next_variable += 1;

    // store it in the compiler
    self.witness_vars.insert(var, val);

    var
}
}

a Value tells us how to compute the CellVar during witness generation:

#![allow(unused)]
fn main() {
pub enum Value {
    /// Either it's a hint and can be computed from the outside.
    Hint(Box<dyn Fn(&Compiler, &mut WitnessEnv) -> Result<Field>>),

    /// Or it's a constant (for example, I wrote `2` in the code).
    Constant(Field),

    /// Or it's a linear combination of internal circuit variables (+ a constant).
    LinearCombination(Vec<(Field, CellVar)>, Field),

    /// A public or private input to the function
    /// There's an index associated to a variable name, as the variable could be composed of several field elements.
    External(String, usize),

    /// A public output.
    /// This is tracked separately as public inputs as it needs to be computed later.
    PublicOutput(Option<CellVar>),
}
}

Note: a CellVar is potentially not directly added to the rows of the execution trace. For example, a private input is converted directly to a (number of) CellVar(s), but only added to the rows when it appears in a constraint for the first time.

As the final step of the compilation, we double check that all CellVars have appeared in the rows of the execution trace at some point. If they haven’t, it can mean two things:

  • A private or public input was never used in the circuit. In this case we return an error to the user.
  • There is a bug in the compiler. In this case we panic.

TODO: explain the LinearCombination. I think we only need an Add((Field, Var), (Field, Var), Field)

Vars

We already have CellVars, why have Vars? The distinction is a matter of abstraction:

  • CellVars are low-level: they track actual cells of the execution trace. When a single CellVar is assigned to multiple cells the wiring must make sure the cells are wired (so that they can only have the same value).
  • Vars are a higher-level concept: they track variables that are created in the noname language either directly (e.g. let x = 3) or indirectly (e.g. in x + (y + z) the term y + z is stored under an anonymous Var)

While a CellVar represents a single field element, a Var can potentially represent several field elements (and as such several cells in the execution trace). Here are some examples of Vars:

#![allow(unused)]
fn main() {
// a constant
let x = 5;

// a field element that will be computed at runtime
let y = private_input + 1;

// a builtin type, like an array, or a bool
let z = [y, x, 6];

// or a custom type
let s = Thing { x, y };
}

Internally, a Var is represented as such:

#![allow(unused)]
fn main() {
/// A constant value created in a noname program
pub struct Constant {
    /// The actual value.
    pub value: Field,

    /// The span that created the constant.
    pub span: Span,
}

/// Represents a cell in the execution trace.
pub enum ConstOrCell {
    /// A constant value.
    Const(Constant),

    /// A cell in the execution trace.
    Cell(CellVar),
}

/// A variable in a program can have different shapes.
pub enum VarKind {
    /// We pack [Const] and [CellVar] in the same enum because we often branch on these.
    ConstOrCell(ConstOrCell),

    /// A struct is represented as a mapping between field names and other [VarKind]s.
    Struct(HashMap<String, VarKind>),

    /// An array or a tuple is represetend as a list of other [VarKind]s.
    ArrayOrTuple(Vec<VarKind>),
}

/// Represents a variable in the noname language, or an anonymous variable during computation of expressions.
pub struct Var {
    /// The type of variable.
    pub kind: VarKind,

    /// The span that created the variable.
    pub span: Span,
}
}

Note

Note: see the Constant chapter to see why constants are treated differently.

Anonymous variable

Here’s a short note on anonymous variable.

When circuit writer parses the ast, it will convert each expression into a Var (unless the expression does not compute to an actual value).

In our example:

#![allow(unused)]
fn main() {
let x = t + (z + y);
}

the z + y is parsed as an expression (a binary operation involving z and y) and stored under a var var1. Then t + ... is also parsed as another binary operation expression and stored under another var var2. Finally the let x = ... is parsed as an assignment statement, and x is stored as a local variable associated to the right handside var var2.

Note

See the Scope chapter for more information on local variables.

Constants

Developers write constants in their code all the time. For example, the following code has two constants 2 and 4:

#![allow(unused)]
fn main() {
let x = 2 + y;
assert_eq(x, 4);
}

It is important that constants are tracked differently than CellVars for several reasons:

  • It is sometimes useless to constrain them directly. For example, in let x = 3 + 7; you can see that we should not constrain 3 and 7 separately, but rather the result 10.
  • It is sometimes useless to constrain them at all. For example, boolean constants are never constrained because you never need to.
  • They can be cached to avoid creating several constraints for the same constant.

Currently a constant appears in the circuit only when CircuitWriter::add_constant is called. It uses the generic gate to constrain the value, and is not cached (so calling it several times with the same constant will create multiple constraints):

#![allow(unused)]
fn main() {
pub fn add_constant(&mut self, value: Field, span: Span) -> CellVar {
    let var = self.new_internal_var(Value::Constant(value), span);

    let zero = Field::zero();
    self.add_gate(
        GateKind::DoubleGeneric,
        vec![Some(var)],
        vec![Field::one(), zero, zero, zero, value.neg()],
        span,
    );

    var
}
}

Note that the Value keep track of the constant as well.

Warning: gadgets must all handle constants gracefuly. That is, they must constrain constants themselves (by calling CircuitWriter::add_constant).

Global environment

In this chapter we will talk about functions.

Local functions

Third-party libraries can have function names that collide with your own function names. Due to this, they are tracked in a different data structure that we will see later.

Local functions include:

  • automatically imported built-ins. Think functions like assert and assert_eq. See here for a full list.
  • main, this is the main function that your program runs. Of course if you’re writing a library this function is not present.
  • normal functions, these are functions that you define in your program. They can be recursive.
  • methods, these are functions that are defined on a type. They can be recursive as well.

Built-ins are different from all other functions listed because they are not written in noname, but written in Rust within the compiler.

For this reason we track functions according to this enum:

#![allow(unused)]
fn main() {
pub enum FnKind {
    /// signature of the function
    BuiltIn(FnHandle),

    /// Any function declared in the noname program (including main)
    LocalFn(AST)

    /// path, and signature of the function
    Library(Vec<String>),
}

/// An actual handle to the internal function to call to resolve a built-in function call.
pub type FnHandle = fn(&mut CircuitWriter, &[Var], Span) -> Result<Option<Var>>;

pub struct FnInfo {
    pub name: Ident,
    pub sig: FnSig,
    pub kind: FnKind,
    pub span: Span,
}
}

Note that the signature of a FnHandle is designed to:

  • &mut CircuitWriter: take a mutable reference to the circuit writer, this is because built-ins need to be able to register new variables and add gates to the circuit
  • &[Var]: take an unbounded list of variables, this is because built-ins can take any number of arguments, and different built-ins might take different types of arguments
  • Span: take a span to return user-friendly errors
  • -> Result<Option<Var>>: return a Result with an Option of a Var. This is because built-ins can return a variable, or they can return nothing. If they return nothing, then the Option will be None. If they return a variable, then the Option will be Some(Var).

We track all of these functions in the following structure:

#![allow(unused)]
fn main() {
pub struct GlobalEnv {
    /// the functions present in the scope
    /// contains at least the set of builtin functions (like assert_eq)
    functions: HashMap<String, FnInfo>,

    // ...
}
}

Handling builtins

Builtins are handled in a special way. They are not written in noname, but in Rust.

Handling local functions

The parser:

  • saves the AST of each function it encounters. Specifically, the function’s AST is stored under the GlobalEnv (TODO: where exactly?). This is necessary as the circuit writer will have to switch to a function’s AST when a function is called (and then return to its own AST).

The first step of the type checker resolves imports by doing the following:

  • store all built-ins in the functions map of the GlobalEnv
  • resolve all imports (e.g. use std::crypto)
  • type check each function individually, and save their signature in the GlobalEnv using the FnSig type
  • type check function calls with the signatures they just saved

(TODO: this means that function declaration must be ordered. I think it is a GOOD thing)

When a function is called, we do the following:

  • if the function is qualified (e.g. crypto::poseidon), then lookup imported modules (see next section)
  • otherwise, check if the function exist in the GlobalEnv, if it doesn’t then return an error
  • if the function exist, then create a new FnEnv and register the arguments as local variables there
  • switch to the function’s AST and pass the new FnEnv as argument
  • TODO: how to handle the return value? it should be saved in the FnEnv

Third-party libraries

TODO: write this part

#![allow(unused)]
fn main() {
/// This seems to be used by both the type checker and the AST
// TODO: right now there's only one scope, but if we want to deal with multiple scopes then we'll need to make sure child scopes have access to parent scope, shadowing, etc.
#[derive(Default, Debug)]
pub struct GlobalEnv {
    /// the functions present in the scope
    /// contains at least the set of builtin functions (like assert_eq)
    pub functions: HashMap<String, FuncInScope>,

    /// stores the imported modules
    pub modules: HashMap<String, ImportedModule>,

    /// the arguments expected by main
    pub main_args: (HashMap<String, FuncArg>, Span),
}

pub type FnHandle = fn(&mut CircuitWriter, &[Var], Span) -> Result<Option<Var>>;

pub enum FuncInScope {
    /// signature of the function
    BuiltIn(FnSig, FnHandle),

    /// path, and signature of the function
    Library(Vec<String>, FnSig),
}
}

Note

Not all modules are third-party libraries, some are also built-ins (e.g. std::crypto).

As part of resolving imports, the type checker looks at third-party libraries differently…

TODO: implement this

TODO: how to handle diamond dependency graph or cycles? We must form a dependency graph first, and resolve dependency according to this graph

Scope

Like most languages, noname has a notion of scope within a function. Unlike a lot of languages noname forbids shadowing at all scope level. This means that eventhough different functions can use local variable with colliding names, the local variable of one function must all have different names.

For example, the following code does not compile:

#![allow(unused)]
fn main() {
let x = 2;
let x = 3; // this won't compile

let y = 4;
for i in 0..4 {
    let y = i; // this won't compile either
}
}

Scopes are only used for:

  • for loops
  • in the future: if/else statements

Scope mechanisms

Both the type checker and the circuit writer need to keep track of local variable. For the type checker (type_checker.rs), a TypeEnv structure keeps track of the association between all local variables names and their type information. For the circuit writer (circuit_writer.rs), a FnEnv structure keeps track of the association between all local variable names and their circuit variable.

Both structure also keep track of how nested the current block is (the top level starting at level 0). For this reason, it is important to remember to increase the current scope when entering a new block (for loop, if statement, etc.) and to decrease it when exiting the block. In addition, all variables from a scope must be disabled (but not deleted, in order to detect shadowing) when exiting that scope.

For example, the type checker’s TypeEnv structure implements the following logic:

#![allow(unused)]
fn main() {
impl TypeEnv {
    // ...


    /// Enters a scoped block.
    pub fn nest(&mut self) {
        self.current_scope += 1;
    }

    /// Exits a scoped block.
    pub fn pop(&mut self) {
        self.current_scope.checked_sub(1).expect("scope bug");

        // disable variables as we exit the scope
        for (name, (scope, type_info)) in self.vars.iter_mut() {
            if *scope > self.current_scope {
                type_info.disabled = true;
            }
        }
    }

    /// Returns true if a scope is a prefix of our scope.
    pub fn is_in_scope(&self, prefix_scope: usize) -> bool {
        self.current_scope >= prefix_scope
    }

    /// Stores type information about a local variable.
    /// Note that we forbid shadowing at all scopes.
    pub fn store_type(&mut self, ident: String, type_info: TypeInfo) -> Result<()> {
        match self
            .vars
            .insert(ident.clone(), (self.current_scope, type_info.clone()))
        {
            Some(_) => Err(Error::new(
                 ErrorKind::DuplicateDefinition(ident),
                 type_info.span,
            )),
            None => Ok(()),
        }
    }

    /// Retrieves type information on a variable, given a name.
    /// If the variable is not in scope, return false.
    pub fn get_type_info(&self, ident: &str) -> Option<TypeInfo> {
        if let Some((scope, type_info)) = self.vars.get(ident) {
            if self.is_in_scope(*scope) && !type_info.disabled {
                Some(type_info.clone())
            } else {
                None
            }
        } else {
            None
        }
    }
}
}

Public Outputs

Public outputs are usually part of the public inputs in Plonk.

In noname, public ouputs are treated differently than the public inputs for one reason: unlike (real) public inputs they cannot be computed directly during witness generation (proving).

This is because public inputs are listed first in the circuit. During witness generation, we go through each rows and evaluate the values of the cells to construct the execution trace. When we reach the public output part of the public input, we do not yet have enough information to construct the values. Thus, we ignore them, and fill them later on.

During the compilation, we create CellVars to keep track of the public output:

#![allow(unused)]
fn main() {
pub struct Compiler {
    // ...

    /// If a public output is set, this will be used to store its [CircuitVar] (cvar).
    /// The public output generation works as follows:
    /// 1. This cvar is created and inserted in the circuit (gates) during compilation of the public input
    ///    (as the public output is the end of the public input)
    /// 2. When the `return` statement of the circuit is parsed,
    ///    it will set this `public_output` variable again to the correct vars.
    /// 3. During witness generation, the public output computation
    ///    is delayed until the very end.
    pub public_output: Option<CellVars>,
}
}

During witness generation (see the Witness Generation chapter), we indeed defer computation the first time we go through the public output rows:

#![allow(unused)]
fn main() {
let val = if let Some(var) = var {
    // if it's a public output, defer it's computation
    if matches!(self.witness_vars[&var], Value::PublicOutput(_)) {
        public_outputs_vars.push((row, *var));
        Field::zero()
    } else {
        self.compute_var(&mut env, *var)?
    }
} else {
    Field::zero()
};
witness_row[col] = val;
}

and at the end we go back to them:

#![allow(unused)]
fn main() {
// compute public output at last
let mut public_output = vec![];

for (row, var) in public_outputs_vars {
    let val = self.compute_var(&mut env, var)?;
    witness[row][0] = val;
    public_output.push(val);
}
}

and finally we return the public output to the prover so that they can send it to the verifier, as well as the “full public input” which is the concatenation of the public input and the public output (needed to finalized the proof):

#![allow(unused)]
fn main() {
// return the public output separately as well
Ok((Witness(witness), full_public_inputs, public_output))
}

Witness Generation

Witness generation is the process of creating the execution trace table during proving. The execution trace table is then passed to the kimchi proof system which will create the final proof.

The code creates a series of instructions during compilation for the witness generation to follow. These instructions are stored as two different fields:

#![allow(unused)]
fn main() {
pub struct Compiler {
    // ...

    /// This is how you compute the value of each variable, for witness generation.
    pub witness_vars: HashMap<CellVar, Value>,

    // ...

    /// This is used to compute the witness row by row.
    pub rows_of_vars: Vec<Vec<Option<CellVar>>>,

    // ...
}
}

rows_of_vars can essentially be seen as the execution trace table, containing variables instead of values.

The witness generation goes as follows:

  1. Each rows in rows_of_vars is looked at one by one
  2. For each CellVar in the row:
    1. If it is set, it is evaluated using the Value stored in witness_vars.
    2. If it set to None, it is simply evaluated as 0.
  3. Once the row is created, it is checked for correctness by checking what gate was used in the row. Note that this is only true for the generic gate, as we trust built-in gadgets to produce correct values. For example, assert(x, 2) will be checked because it is using the generic gate, but let y = poseidon(x) won’t be because we trust the poseidon gate to be correct (and if there is a bug there, kimchi will still catch it).