Zero-Knowledge Proof - Arkworks Usage Guide
This blog showcases how to use the Arkworks ecosystem to implement zk-SNARK logic in a Rust application.
By Vitali Bestolkau — Explainers
Introduction
Arkworks is not the most famous, but it is definitely one of the most intriguing projects in the Zero-Knowledge field. This article provides an in-depth explanation of the versatile and robust Arkworks ecosystem. It covers its general usage and provides a guide to ZKP programming with Groth16, supported by code samples to make the implementation aspect easier to understand.
0 Prerequisites
Zero-Knowledge by itself is quite a difficult subject, but understanding how to program it can be even harder. That is why the reader is advised to meet the following criteria:
- Basic knowledge of crypto primitives such as Collision-Resistant Hash (CRH), Merkle Trees, Elliptic Curves, and Pedersen Commitments.
- Basic understanding of how zk-SNARKs work, including (public) inputs and witnesses, R1CS, and Proving/Verifying key generation.
- Basic Rust coding skills.
0.1 Helpful links:
- Collision resistance
- Merkle Trees
- Elliptic Curves and Pedersen Commitments
- General purpose zk-SNARKs
- Rust Basics
1 What is Arkworks?
Arkworks is an ecosystem, built in Rust, for implementing zk-SNARKs in applications. Besides the Zero-Knowledge Proof (ZKP) implementation, Arkworks offers crates for Algebra, Crypto Primitives, Curves and many others, which makes Arkworks far more versatile compared to other existing Zero-Knowledge tools. Although this makes Arkworks very flexible, it also adds another layer of complexity, as developers need to understand how the underlying cryptographic primitives and algebra work.
But why should we use the hard-to-learn Arkworks ecosystem for ZKP implementation when there are already numerous tools that do it in a more simplified way, such as Circom and bellman? It is better to ask this question to companies such as Mina Protocol, Polygon, zkSync, ZCash and many others, because according to Arkworks' latest presentation (15-04-2022) all these projects utilize the Arkworks ecosystem. They don't use Arkworks directly, but they take some of its code as a backbone and adapt it to meet their project needs. Not to mention that Arkworks is recognized by IT giants and is funded by Google and Ethereum (source).
So, the point of this article is to explain how to make use of Arkworks, as unfortunately, they don't have good documentation yet. However, be wary: at the moment of writing this article the Arkworks repositories are not production ready.
2 Initial code
Throughout the article there are small code samples, which are taken from our Proof of Concept.
You can check it for even more context, but keep in mind that the code in the article is a cleaner and easier-to-understand version of the code from the repository, so they are not identical.
If something is unclear at first, don't worry, that is normal. This article will explain, step by step, all the techniques used in the initial code.
So, let's begin the deep dive into Arkworks.
3 General
Although Arkworks has a lot of repositories for different purposes, there are some traits and patterns that are very common. Knowing these patterns, you can already understand half of the Arkworks code, and the whole system becomes less confusing.
3.1 <type> and <type>Var
Almost every Arkworks repository at some point introduces certain types and then, later in the code, introduces the same types again, where the only visible difference between the two is the suffix "Var" appended to the name of the previously mentioned type. But what is the logic behind it?
The first type is the general type you use while programming in Rust. Let's call this type <type>.
The second type is the one that can be accepted by R1CS. It holds the same value as <type>, but it can be operated on within R1CS. This type is usually called <type>Var. So, if the name of the <type> is Name, then the name of the <type>Var will be NameVar.
💡 Quick reminder: R1CS stands for "Rank 1 Constraint System", a low-level representation of a computation that zk-SNARKs operate on. That's why, before being used in the SNARK system, all the inputs have to be converted into their R1CS representation first. Learn more from Vitalik Buterin's article.
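To make this concrete, here is a minimal sketch (not taken from the Proof of Concept) of a single hand-written R1CS constraint a * b = c, using the low-level ark-relations API; the values are arbitrary, and the Var types described below generate such constraints for you automatically:

use ark_bls12_381::Fr;
use ark_relations::{lc, r1cs::ConstraintSystem};

// a and b are secret witnesses, c is a public input; the only constraint is a * b = c.
let cs = ConstraintSystem::<Fr>::new_ref();
let a = cs.new_witness_variable(|| Ok(Fr::from(3u64)))?;
let b = cs.new_witness_variable(|| Ok(Fr::from(5u64)))?;
let c = cs.new_input_variable(|| Ok(Fr::from(15u64)))?;
cs.enforce_constraint(lc!() + a, lc!() + b, lc!() + c)?;

// The assignment 3 * 5 = 15 satisfies the constraint system.
assert!(cs.is_satisfied()?);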
R1CS compatibility can be achieved using the "native" types specified in the r1cs-std repository. If you want to use your own struct or type in R1CS, that struct has to implement the AllocVar trait. To be able to generate values for the constraint system (R1CS), e.g., witnesses, inputs, constants, etc., you need to implement the method new_variable, which converts the value of the type <type> into a Field type or, simply put, an R1CS-readable type. This is how it usually looks:
use ark_bls12_381::Fr;
use ark_r1cs_std::bits::uint64::UInt64;
use ark_r1cs_std::prelude::*;
use std::borrow::Borrow;
use ark_relations::r1cs::{Namespace, SynthesisError};

pub struct Number(pub u64);

pub struct NumberVar(pub UInt64<Fr>);

impl AllocVar<Number, Fr> for NumberVar {
    #[tracing::instrument(target = "r1cs", skip(cs, f, mode))]
    fn new_variable<T: Borrow<Number>>(
        cs: impl Into<Namespace<Fr>>,
        f: impl FnOnce() -> Result<T, SynthesisError>,
        mode: AllocationMode,
    ) -> Result<Self, SynthesisError> {
        UInt64::new_variable(cs.into(), || f().map(|u| u.borrow().0), mode).map(Self)
    }
}
💡 To better understand the new implementations, check the following links:
- SynthesisError → Error handling section
- Fr → a particular Finite Field implementation. More about fields in Arkworks here.
- UInt64 → an R1CS equivalent of u64.
- Namespace → a named scope around the constraint system (typically created with the ark_relations::ns! macro); it mainly helps with tracing and debugging, and you need to pass it as shown in the example when allocating a variable in R1CS.
- Tracing annotation → can be seen as Backtrace for R1CS as mentioned here.
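Building on the Number/NumberVar snippet above, here is a hedged usage sketch of how such a variable could be allocated inside a constraint system (the chosen value and the inspection calls are only illustrative):

use ark_bls12_381::Fr;
use ark_r1cs_std::alloc::AllocVar;
use ark_relations::r1cs::ConstraintSystem;

// Create a fresh constraint system and allocate the same value
// once as a private witness and once as a public input.
let cs = ConstraintSystem::<Fr>::new_ref();
let witness_var = NumberVar::new_witness(cs.clone(), || Ok(Number(42)))?;
let input_var = NumberVar::new_input(cs.clone(), || Ok(Number(42)))?;

// Both values now live inside the constraint system.
println!("witness variables: {}", cs.num_witness_variables());
println!("instance variables: {}", cs.num_instance_variables());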
3.2 <trait> and <trait>Gadget
Sometimes, you may encounter traits whose names differ only by the suffix "Gadget". This is very similar to the situation with the types.
The trait <trait> is used for general computation, and <trait>Gadget is used for computation within R1CS. Moreover, in Arkworks, the structs that implement the <trait> trait make use of <type> variables, while <trait>Gadget structs use <type>Var variables.
For example, the traits CRHScheme and TwoToOneCRHScheme use variables of types Input, Output, and Parameters. At the same time, the traits CRHSchemeGadget and TwoToOneCRHSchemeGadget use variables of types InputVar, OutputVar, and ParametersVar. Later in the article, a more elaborate and clear example will be presented.
3.3 Error Handling
When writing functions, you need to know what type of error is returned in order to put it in the return type Result<T, E>. Luckily, right now all Arkworks error handling comes down to a single error type: SynthesisError. This is an enum that also implements the Display trait, which allows printing a more elaborate error explanation to the console and helps during debugging. These messages are not as helpful as they could be, but the names of all the SynthesisError variants are self-explanatory.
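As a small illustration of this convention, here is a hypothetical helper (the name require is ours) that returns a Result with SynthesisError and maps a missing value to SynthesisError::AssignmentMissing, the same variant used later in the article:

use ark_relations::r1cs::SynthesisError;

// Fetch an optional value or fail with the standard "assignment missing" error.
fn require<T>(value: Option<T>) -> Result<T, SynthesisError> {
    value.ok_or(SynthesisError::AssignmentMissing)
}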
3.4 Algebra
For certain cases it is better to use the Algebra crate, which provides Finite Fields, Elliptic Curves, and Pairings.
In our Proof of Concept, for example, it is used to work with big numbers: multiplying two big numbers with native integer types overflows, and the BigInt crate was not accurate enough for our use-case, but Finite Fields worked perfectly.
Although this crate is very useful and easy to use, it has some limitations. One of them is that you cannot divide the variables. Most likely, this was done to avoid inaccuracies, as division would often lead to numbers with infinitely many decimal digits, which would be impossible to store and would lead to wrong results.
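As a rough sketch of the big-number use-case described above (the concrete numbers are made up, and Fr from ark-bls12-381 mirrors the ScalarField used later in the article):

use ark_bls12_381::Fr;
use ark_ff::PrimeField;

// Interpret byte strings as field elements (reduced modulo the field order)
// and multiply them; unlike native integer types, this never overflows.
let a = Fr::from_be_bytes_mod_order(&1234567890123456789u128.to_be_bytes());
let b = Fr::from_be_bytes_mod_order(&987654321987654321u128.to_be_bytes());
let product = a * b;

// Field elements can also be built directly from small integers.
let shifted = product + Fr::from(42u64);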
To learn more about the Arkworks Algebra crate and how to use it check their README file in this repository.
4 Crypto Primitives
Crypto Primitives is an Arkworks repository that provides implementations of different cryptographic concepts, from Merkle Trees to Signatures. All these primitives have their own purpose and can be used according to your needs. In a SNARK implementation, they generally serve as public inputs and witnesses and get manipulated in the constraint system to form the circuit logic, as will be shown later in the article.
As mentioned, the Crypto Primitives repository implements many different concepts, and it would be very difficult to cover them all in one article. So here we will focus only on the Collision-Resistant Hash (CRH) implementation.
4.1 Collision-Resistant Hash (CRH)
CRHs are mostly used when you want to be sure that a value maps to a single, practically unique hash: it is computationally infeasible to find a different value that results in the same hash, and the same value will always result in the same hash. A password management tool is a prime use-case for that.
In Arkworks, CRH has several traits that users can implement with their own custom logic. At the same time, there are already Arkworks implementations that provide CRH logic, such as Pedersen Commitments, SHA256, and others.
CRH has four traits that can be implemented:
- CRHScheme
- TwoToOneCRHScheme
- CRHSchemeGadget
- TwoToOneCRHSchemeGadget
We will go through all of them.
CRHScheme
The CRHScheme trait is used to create a common CRH, which is used in all cases except one: generating parent nodes from their child nodes in Merkle Trees. That is the responsibility of the TwoToOneCRHScheme trait.
The meaning behind its values and functions will be explained below. The TwoToOneCRHScheme explanation will follow the same structure.
- Firstly, CRHScheme has three internal types:
  - Input. Represents the input from which the hash is generated.
  - Output. Represents the output of the hash generation function.
  - Parameters. Parameters required for the hash generation.
- Also, CRHScheme has two methods to implement:
a. Setup
fn setup<R: Rng>(r: &mut R) -> Result<Self::Parameters, Error>;
This method requires a source of randomness. In their tests, Arkworks usually use their own implementation, which is not secure, as they mention themselves. Instead, an Rng variable from the rand crate can be used like this:
let mut rng = rand::thread_rng();
As an output, the setup method returns a Result, which is then unwrapped to get the Parameters value. It looks like this:
let mut rng = rand::thread_rng();
let params = MyCRH::setup(&mut rng).unwrap();
💡 Note: from here on, the Result part will be omitted. Meaning that whenever a variable of the type Result<T, E> is returned, it will be said that a variable of the type T is returned instead. It should be assumed that the Result value was handled appropriately.
b. Evaluate
fn evaluate<T: Borrow<Self::Input>>(
    parameters: &Self::Parameters,
    input: T,
) -> Result<Self::Output, Error>;
There are two parameters for this function: one of the type Parameters and the other of the type Input. The first value is obtained from the setup() method mentioned earlier. Input is usually an array of bytes. Most types already implement methods similar to to_bytes(), so you can use such a function to get the needed Input value.
💡 Note: in Arkworks, as well as in some other Rust crates, there are two different kinds of bytes: Big-Endian (bytes_be) and Little-Endian (bytes_le). (For detailed information about the difference, check the source.) Usually, Arkworks specifies which kind of bytes is needed as a parameter for a function, but sometimes a function simply requires bytes. It most likely means that the byte order doesn't matter, but we are not sure. In the cases where the byte order is unspecified, we would suggest using Big-Endian bytes.
The Output value received from the evaluate() function can be considered the actual hash.
TwoToOneCRHScheme
As mentioned, the only purpose of this trait is to generate parent nodes from their child nodes in Merkle Trees. The variables and functions of this trait and their purposes are described below.
- Same as CRHScheme, TwoToOneCRHScheme has three internal types:
  - Input. Represents the input from which the hash is generated.
  - Output. Represents the output of the hash generation function.
  - Parameters. Parameters required for the hash generation.
- Similar to CRHScheme, TwoToOneCRHScheme has setup() and evaluate() functions, but it also has an additional compress() function.
a. Setup
The setup function acts precisely the same as in the CRHScheme trait.
b. Evaluate and Compress
Both functions calculate a parent node from two leaves/child nodes. This is how the code in the interface looks:
fn evaluate<T: Borrow<Self::Input>>(
    parameters: &Self::Parameters,
    left_input: T,
    right_input: T,
) -> Result<Self::Output, Error>;

fn compress<T: Borrow<Self::Output>>(
    parameters: &Self::Parameters,
    left_input: T,
    right_input: T,
) -> Result<Self::Output, Error>;
In both methods, three parameters are needed: the well-known Parameters variable, a left_input variable, which represents the left child, and a right_input variable, which refers to the right child.
The only difference between the functions is the type of input that should be provided. Both functions accept the same Parameters, but in the evaluate() function, both left_input and right_input are of the Input type, while in the compress() function they are of the Output type.
This may mean that the evaluate() method calculates a node from the bytes of the leaves, while compress() calculates a node hash from the given node hashes. Possibly, that is why Arkworks, in their Merkle Tree membership verification implementation, use the evaluate() function only for the first two leaves and then use only compress() until the Root is calculated.
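To make the evaluate()/compress() split concrete, here is a hedged, generic sketch of folding one level of node hashes into their parents with compress(); the helper name next_level and the assumption of an even number of nodes are ours, not Arkworks':

use ark_crypto_primitives::crh::TwoToOneCRHScheme;
use ark_crypto_primitives::Error;

// Compute the parents of one level of node hashes, pairing nodes left to right.
fn next_level<H: TwoToOneCRHScheme>(
    params: &H::Parameters,
    nodes: &[H::Output],
) -> Result<Vec<H::Output>, Error>
where
    H::Output: Clone,
{
    assert!(nodes.len() % 2 == 0, "this sketch assumes an even number of nodes");
    nodes
        .chunks(2)
        .map(|pair| H::compress(params, pair[0].clone(), pair[1].clone()))
        .collect()
}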
CRHSchemeGadget and TwoToOneCRHSchemeGadget
These traits implement the same functions, which work the same way as in CRHScheme and TwoToOneCRHScheme, respectively. The only difference is that the Gadget traits use <type>Var types and are used within R1CS.
4.2 CRH Variants
In Arkworks, CRH is just an interface. However, Arkworks already has implementations of all CRH traits for Pedersen commitments, SHA256, Poseidon, and others. These CRH variants are ready to use both for general purposes and for R1CS-compatible usage.
When using a CRH variant, there is a lot to define: a specific curve, a window, etc. To make life easier, it is suggested to initialize the CRHs the way Arkworks does in their tests. For example, for Pedersen commitments it looks like this:
use ark_crypto_primitives::crh::pedersen;
use ark_ed_on_bls12_381::{constraints::EdwardsVar, EdwardsProjective as JubJub};

#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Window;

impl pedersen::Window for Window {
    const WINDOW_SIZE: usize = 128;
    const NUM_WINDOWS: usize = 8;
}

pub type MyCRH = pedersen::CRH<JubJub, Window>;
pub type MyCRHGadget = pedersen::constraints::CRHGadget<JubJub, EdwardsVar, Window>;
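With these type aliases in place, hashing some bytes outside of R1CS could look roughly like this (the input string is arbitrary, and we assume the Pedersen Input type is a byte slice; the input has to fit into WINDOW_SIZE * NUM_WINDOWS bits, i.e., 128 bytes here):

use ark_crypto_primitives::crh::CRHScheme;

// Generate Pedersen parameters and hash an arbitrary byte string.
let mut rng = rand::thread_rng();
let params = MyCRH::setup(&mut rng).unwrap();

let input: Vec<u8> = b"some data to hash".to_vec();
let hash = MyCRH::evaluate(&params, input.as_slice()).unwrap();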
5 ZKP Implementation
After all the “basics” are covered, it’s time to get serious. In this section we will explain all the phases of the full SNARK implementation process: from building a circuit to verifying a proof. Every explanation is also followed by the corresponding code to make the implementation part even more clear.
5.1 ConstraintSynthesizer
Before any zk-SNARK computation, a constraint system is required together with a circuit, where a constraint system defines constraints (e.g. witness, (public) input, constant) and a circuit defines the logic. Also, all the checks are handled in the circuit.
The ConstraintSynthesizer trait is responsible for both. It has a single function, generate_constraints(), where all the constraints should be initialized and the circuit logic defined. This means that in this function you should make use of <type>Var::new_variable() and similar functions, like new_input(), new_witness(), and new_constant(), and that you should describe what kinds of calculations and checks should be executed for the witness to be verified.
Usually, the ConstraintSynthesizer trait is implemented by a struct that stores all the variables that will be used in the constraint system and the circuit. The implementation looks like this:
pub struct HashDataVar {
    data: HashData,
    params: DataParams,
    public_hash_commitment: Option<Commitment>,
}

impl ConstraintSynthesizer<ScalarField> for HashDataVar {
    #[tracing::instrument(target = "r1cs", skip(self, cs))]
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ScalarField>,
    ) -> Result<(), SynthesisError> {
        let wallet_address_var_scalar = ScalarField::from_be_bytes_mod_order(&self.data.wallet_address.to_be_bytes());
        let first_half_var_scalar = ScalarField::from_be_bytes_mod_order(&self.data.first_pass_half.to_be_bytes());
        let second_half_var_scalar = ScalarField::from_be_bytes_mod_order(&self.data.second_pass_half.to_be_bytes());

        let final_scalar = wallet_address_var_scalar * first_half_var_scalar - second_half_var_scalar;

        let final_commitment = HashDataVar::scalar_to_commitment(final_scalar, &self.params);

        let pub_hash_commitment_var = CommitmentVar::new_input(
            ark_relations::ns!(cs, "The commitment of the public hash"),
            || { Ok(self.public_hash_commitment.unwrap()) },
        )?;

        let final_commitment_var = CommitmentVar::new_witness(
            ark_relations::ns!(cs, "The commitment of wallet address, first half of the password and the second half of the password"),
            || { Ok(final_commitment) },
        )?;

        pub_hash_commitment_var.enforce_equal(&final_commitment_var)?;

        Ok(())
    }
}
💡 Explanation of new structs
To begin with, all the new structs, except ScalarField, are custom structs that are not part of the Arkworks ecosystem. So, the explanation:
- HashData → the struct that holds the additional data: wallet_address, first_pass_half and second_pass_half. They are used at the beginning of the generate_constraints() function.
- Commitment → actually the Output type from the Pedersen Commitment implementation of the CRHScheme trait.
- CommitmentVar → the OutputVar type from the Pedersen Commitment implementation of the CRHSchemeGadget trait.
- DataParams → the Parameters type from the Pedersen Commitment implementation of the CRHScheme trait.
- ScalarField → actually the Fr type mentioned in the types section; the Arkworks struct used here to compute big numbers.
Explanation of new functions
There are two new methods. Although they are self-explanatory, here is a brief description:
- HashDataVar::scalar_to_commitment(ScalarField, &DataParams) → converts a ScalarField value into a Commitment. The DataParams value is used here to ensure that all the Commitments generated in this constraint system have the same parameters, so that if scalar_field_1 == scalar_field_2, then the generated commitments are also equal. Because if scalar_field_1 == scalar_field_2 && params_1 != params_2, then commitment_1 != commitment_2.
- enforce_equal() → checks if the values are equal. If not, a SynthesisError is returned.
If you want to check the whole code and understand what it is trying to do, check this repository. The Arkworks-related code is in this file only.
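Before moving on to Groth16, it can be useful to sanity-check such a circuit on its own. Here is a hedged sketch, assuming a fully populated HashDataVar value named data; the constraint-system calls come from ark-relations:

use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystem};

// Run the circuit against a fresh constraint system and check that
// all constraints are satisfied before doing any proving.
let cs = ConstraintSystem::<ScalarField>::new_ref();
data.generate_constraints(cs.clone())?;

println!("number of constraints: {}", cs.num_constraints());
assert!(cs.is_satisfied()?);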
5.2 Groth16
Finally, the Zero-Knowledge Proof itself. Luckily, the Groth16 implementation is quite easy to use. All the actual SNARK logic is done under the hood. Mainly, we just need to make sure that we provide the functions with proper parameters.
It is worth mentioning that there is also a Marlin implementation, which uses a universal setup. We didn’t use it because the Universal Setup doesn’t change anything if you have only one circuit, but it is worth checking if you want an extensive application with numerous circuits.
Arkworks Groth16 implementation has three phases:
- Key generation phase
- Proof generation phase
- Proof verification phase.
We will go through each of them and show the proper usage techniques.
Key Generation
The key generation phase generates a proving key pk and a verifying key vk, which are also connected to the provided circuit:
let circuit_defining_cs = HashDataVar::new(HashDataVar::default());

let mut rng = rand::thread_rng();
let (pk, vk) =
    Groth16::<Bls12_381>::circuit_specific_setup(circuit_defining_cs, &mut rng)?;
The function HashDataVar::default() provides hardcoded dummy data. This is done because, in this phase, the provided data is irrelevant: the pk and vk are only bound to the structure of the circuit, and the data is not considered during the binding process. That said, the constraints (witness, input, etc.) must be of the same type, and there should be the same number of constraints (though not necessarily the same values) as in the actual circuit. The only requirement for the data (values) is to be correct, meaning that it should pass the checks in the circuit; otherwise, an Error will be thrown from the circuit itself.
Proof Generation
After creating the keys, we generate a proof of the fact that the witness is known:
let proof = Groth16::prove(&pk, data, &mut rng)?;
Here, data is the circuit that should be verified. To be more specific, data is a variable of the type HashDataVar that contains the actual data rather than the dummy data used in the Key Generation step.
Arkworks also checks whether the provided data is correct while generating the proof, i.e., whether the data passes all the checks in the circuit. If the data is incorrect, Arkworks will throw a SynthesisError at this step and will not move on to the Proof Verification phase.
Proof Verification
When the proof is generated, all that is left is to verify it. But before that, the public inputs should be defined, because the verifier doesn't learn the public inputs or the witness from the proof.
let public_input = [
    data.public_hash_commitment.unwrap().x,
    data.public_hash_commitment.unwrap().y
];

let valid_proof = Groth16::verify(&vk, &public_input, &proof)?;
In our case, the public input is a Pedersen Commitment. Arkworks doesn’t accept the plain Commitment value as public input. But it accepts the coordinates of the Pedersen Commitment as public inputs (a Pedersen Commitment can also be represented as a point on an Elliptic Curve, that’s why we can get coordinates x and y for it).
During the verification, Arkworks also checks whether the vk is paired with this proof. If not, a SynthesisError is thrown. The Error is also thrown if the provided public inputs are incorrect. If all the checks pass, the verify function returns true.
All Together
This is how everything looks together, with some comments to make certain parts clear:
pub fn prove_with_zkp(data: HashDataVar) -> Result<bool, SynthesisError> {
    if let None = data.public_hash_commitment {
        return Err(SynthesisError::AssignmentMissing);
    }

    // Use a dummy circuit just to generate the keys.
    // This circuit tells the SNARK the setup of the circuit that we are going to verify.
    // Thus the SNARK generates the proving key (pk) and verifying key (vk) and "connects" them
    // (meaning that only this vk can verify this pk and only this pk can be verified by this vk).
    let circuit_defining_cs = HashDataVar::new(HashDataVar::default());

    let mut rng = rand::thread_rng();
    let (pk, vk) =
        Groth16::<Bls12_381>::circuit_specific_setup(circuit_defining_cs, &mut rng)?;

    let public_input = [
        data.public_hash_commitment.unwrap().x,
        data.public_hash_commitment.unwrap().y
    ];

    let proof = Groth16::prove(&pk, data, &mut rng)?;
    let valid_proof = Groth16::verify(&vk, &public_input, &proof)?;

    Ok(valid_proof)
}
The only new thing here is the if let statement at the beginning. It checks whether the provided data already contains the commitment that serves as the public input. If not, a SynthesisError is returned.
6 Final Tips
Here are some more tips you might find helpful while working with Arkworks and ZKP.
- Check Arkworks’ rollup tutorial for more examples of ZKP usage with Arkworks. It doesn't explain the used types, traits, and their implementations in much detail, but with the information from this article it will be easier to understand how everything works. However, be careful while using the code from the tutorial because, apparently, it needs to be updated to match the latest versions of the dependencies.
- It would be best for SNARK verification to have multiple witnesses, two public inputs (that represent the initial and the final state of the same value), and a complex verification process in the circuit. This is also how the SNARK part works in the mentioned rollup tutorial.
- Multiple witnesses will make it more difficult to brute-force them.
- Two public inputs are the same value but in different states: initial and final. It is a more common usage for SNARKs, as in the Merkle Tree, where the public inputs are the initial Root of the tree and the final Root. Of course, there is always room for creativity. Still, for beginners, we would suggest this workflow with initial and final value states as public inputs, and the witness that alters the initial value to get the final one.
- Complex circuit logic is also advised to decrease the likelihood of brute forcing.
- To better understand how to implement any Arkworks “elements” (traits, types, implementations, etc.), always check the corresponding tests. The tests usually show the basic usage of the needed code, so when you start using one of these elements, try the code from the tests first.
Conclusion
This article provides a detailed insight into both Arkworks’ general usage and the usage of their ZKP implementation. After clarifying the crucial difference between <type> and <type>Var and between <trait> and <trait>Gadget, the article explains the CRH structure and how to work with CRH traits. In the end, the blog shows the whole ZKP workflow: from the constraint system and circuit generation with the ConstraintSynthesizer to the Proof verification. It summarizes everything with helpful tips for Arkworks development.
Resources
Prerequisites
- Collision resistance
- Merkle Trees
- Elliptic Curves and Pedersen Commitments
- General purpose zk-SNARKs
- Rust Basics
Why Arkworks
Arkworks repositories mentioned
- Arkworks
- R1CS-Tutorials
- Algebra
- Groth16
- Crypto Primitives
- R1CS-std
- SNARK
- Ed on BLS12 381
- Relations
- Marlin