
Authored by @grandchildrice.
Special thanks to Fenbushi Capital for supporting this research, ICME Labs — the operator of Novanet — for covering our ZKProof7 travel expenses, and ClankPan and Yuki Aoki for their substantial code contributions.
- Code: Benchmark repository
Abstract
Zero‑Knowledge Virtual Machines (zkVMs) are emerging as fundamental building blocks for Ethereum scaling. However, most existing benchmarks — such as the SP1 launch post, the Vac report, and zkbenchmarks.com — employ incomparable methodologies, thereby obscuring crucial performance trade‑offs.
We introduce a standardized test‑bed and evaluate eight zkVMs — SP1, RISC Zero, OpenVM, Pico, ZKM, Jolt, Nexus, and Novanet — across four computational tasks and three performance metrics (prover time, proof size, and peak RAM utilization). Our findings elucidate which zkVM architectures are optimally suited for specific use‑cases within the Ethereum ecosystem.
1. Introduction
1.1 The Concept of zkVM
A Zero-Knowledge Virtual Machine (zkVM) represents a technological framework designed to cryptographically verify the correctness of program execution without disclosing inputs or intermediate computational states. The operational workflow typically encompasses three principal stages:
- Execute: The program undergoes execution, generating a comprehensive record (trace) that captures instructions, CPU states, and memory values at each computational step.
- Prove: A cryptographic attestation (typically utilizing SNARKs — Succinct Non-interactive Arguments of Knowledge) is constructed based on the execution trace.
- Verify: The generated proof undergoes validation to confirm computational correctness without necessitating re-execution of the original program.
A zkVM leverages SNARKs to validate the VM’s state transitions at each step by synthesizing three fundamental cryptographic assurances:
- Read-Write Memory Consistency Proof: Ensures the integrity and proper manipulation of memory throughout the computation.
- Instruction Encoding Proof: Validates that executed instructions correspond precisely to their defined encodings.
- Instruction Proof: Confirms that each instruction produced the correct result according to the VM’s formal specification.
It is imperative to recognize that zkVMs primarily serve to facilitate the efficient verification of computational integrity rather than privacy preservation.
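The three checks listed above can be illustrated without any cryptography. The following is a minimal sketch, assuming a hypothetical three-register toy instruction set invented for this example (it is not the ISA of any benchmarked zkVM). It records an execution trace and then checks, in the clear, the conditions a real proof would enforce succinctly.

```python
from dataclasses import dataclass

# Hypothetical toy ISA, invented for illustration only.
PROGRAM = [
    ("load", 0, 5),    # r0 <- 5
    ("load", 1, 7),    # r1 <- 7
    ("add", 2, 0, 1),  # r2 <- r0 + r1
]


@dataclass(frozen=True)
class Step:
    pc: int
    instr: tuple
    regs_before: tuple
    regs_after: tuple


def apply_instr(instr, regs):
    """Toy ISA semantics: return the register file after one instruction."""
    regs = list(regs)
    if instr[0] == "load":
        _, dst, imm = instr
        regs[dst] = imm
    elif instr[0] == "add":
        _, dst, a, b = instr
        regs[dst] = regs[a] + regs[b]
    else:
        raise ValueError(f"unknown opcode {instr[0]!r}")
    return tuple(regs)


def execute(program, num_regs=3):
    """Execute stage: run the program and record one trace row per step."""
    regs = (0,) * num_regs
    trace = []
    for pc, instr in enumerate(program):
        after = apply_instr(instr, regs)
        trace.append(Step(pc, instr, regs, after))
        regs = after
    return trace


def check_trace(program, trace, num_regs=3):
    """Check in the clear what a proof would attest to for every step."""
    if len(trace) != len(program):
        return False
    prev = (0,) * num_regs
    for index, step in enumerate(trace):
        # Instruction encoding: the row refers to the real program instruction.
        if step.pc != index or step.instr != program[index]:
            return False
        # State consistency: each step starts where the previous one ended.
        if step.regs_before != prev:
            return False
        # Instruction correctness: the result follows the ISA semantics.
        if step.regs_after != apply_instr(step.instr, step.regs_before):
            return False
        prev = step.regs_after
    return True
```

In a real zkVM the verifier never sees the trace; the prover convinces it that a trace passing checks of this kind exists.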

1.2 Why Ethereum needs zkVMs
Currently, every Ethereum validator re‑executes all transactions, inherently coupling gas costs to raw computational complexity.
With zkVMs, an off-chain prover executes the computationally intensive operations, generates a cryptographic proof, and submits only the resulting state root and the proof on-chain. Validators merely verify the proof, at a cost that is nearly independent of the size of the computation (commonly summarized as O(1), although the exact cost depends on the proof system). Consequently, transaction fees can remain relatively constant even for demanding operations such as machine learning inference or other computationally intensive smart contract executions.

1.3 Goal of this work
Numerous zkVM projects assert superior performance metrics, frequently citing benchmarks conducted under heterogeneous conditions, thereby complicating objective comparative analysis. Systems marketed as “fastest” may harbor significant limitations, such as excessive memory requirements or restricted functionality applicable exclusively to specific program categories.
This research introduces a rigorously standardized benchmarking framework designed for comprehensive comparative analysis of multiple zkVM implementations. We evaluate performance across diverse metrics and representative computational tasks. This methodical approach enables the identification of current performance bottlenecks in zkVM technology and facilitates the formulation of best practices for their effective deployment and utilization.
2. Classification and Techniques of zkVM Implementations
2.1 Classification of zkVMs
Several proof system architectures predominate in the zkVM landscape:
- FRI-STARK-based: These systems combine Fast Reed-Solomon Interactive Oracle Proofs of Proximity (FRI) with Scalable Transparent Arguments of Knowledge (STARKs). They typically work over a small base field of roughly 32 bits, together with an extension field, to speed up proof generation. Projects implementing this approach include SP1, RISC Zero, OpenVM, Pico, ZKM, and Valida.
- Nova-based: These implementations use a folding scheme, a technique that incrementally combines the instances produced by many execution steps into a single one, so that only the final folded instance needs to be proven. Novanet and Nexus 1.0 exemplify this methodology.
- Lasso Lookup: This technique aims to mitigate computational overhead by retrieving execution data from precomputed lookup tables. Jolt employs this innovative approach.
- GKR: Based on the Goldwasser-Kalai-Rothblum protocol, GKR demonstrates circuit satisfiability layer by layer, typically proceeding from output to input, utilizing a sum-check protocol. Ceno represents a notable implementation of this methodology.
zkVM architectures can be further categorized into two predominant paradigms:
- vRAM Style: This architecture segregates the program execution trace according to instruction types (e.g., CPU instructions, memory operations). Circuit proofs are subsequently generated in a data-parallel manner for each group before final aggregation. SP1, RISC Zero, and ZKM exemplify this architectural approach.
- Modular Style: This methodology defines dedicated circuits or lookup tables for individual opcodes. These are sequentially executed as required, with the resultant proofs aggregated. OpenVM and Jolt adopt this modular architectural design.

2.2 Optimization Techniques
Common optimization methodologies employed across diverse zkVM implementations include:
- Precompiles: Utilizing specially optimized circuits to efficiently prove specific, frequently invoked operations such as large integer arithmetic, floating-point calculations, or cryptographic hash functions (e.g., Keccak).
- Continuation: Segmenting the program execution trace into multiple discrete sections so that proofs for the segments can be generated concurrently. This can reduce overall proving time significantly, particularly for long-running computations.
- À la carte Proving: Architecting the system such that proof generation costs are incurred exclusively for instructions actually executed during specific CPU cycles, rather than for all potential instructions.
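The continuation technique from the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the "program" is a hypothetical running sum over 32-bit words, and `prove_segment` is a placeholder that only hashes the segment, standing in for a real segment prover. What the sketch does show is the structure: a cheap sequential pass fixes each segment's starting state, the expensive per-segment work runs concurrently, and the aggregator checks that segment boundaries chain together.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(steps, segment_len):
    """Cut the execution into fixed-size segments (the last may be shorter)."""
    return [steps[i:i + segment_len] for i in range(0, len(steps), segment_len)]


def run_segment(start_state, segment):
    """Toy execution: the state is a running sum modulo 2**32."""
    state = start_state
    for delta in segment:
        state = (state + delta) % 2**32
    return state


def prove_segment(args):
    """Placeholder for a real prover: returns (start, end, digest)."""
    start_state, segment = args
    end_state = run_segment(start_state, segment)
    payload = repr((start_state, end_state, segment)).encode()
    return start_state, end_state, hashlib.sha256(payload).hexdigest()


def prove_with_continuation(steps, segment_len, initial_state=0):
    segments = split_into_segments(steps, segment_len)
    # Sequential pass: determine the state at every segment boundary.
    starts, state = [], initial_state
    for segment in segments:
        starts.append(state)
        state = run_segment(state, segment)
    # Per-segment proving is independent, so it can run concurrently.
    with ThreadPoolExecutor() as pool:
        proofs = list(pool.map(prove_segment, zip(starts, segments)))
    return proofs, state


def check_chain(proofs, initial_state, final_state):
    """Aggregation check: each segment must start where the previous ended."""
    expected = initial_state
    for start, end, _digest in proofs:
        if start != expected:
            return False
        expected = end
    return expected == final_state
```

For example, `prove_with_continuation(list(range(1, 11)), 4)` yields three segment proofs whose boundaries chain from 0 to the final sum.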
3. Benchmark Results and Analysis
Benchmarks were conducted on a Linux system equipped with Ubuntu 24.04, 8 virtual CPUs, 192GB of RAM, and an NVIDIA RTX 5090 GPU with 32GB of VRAM.
The four test programs utilized for evaluation comprised:
- Calculation of the 100,000th Fibonacci number.
- SHA2-2048 hash computation.
- ECDSA signature verification using the secp256k1 curve.
- Simulation of 100 Ethereum Transfer transactions (ETHTransfer).
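For orientation, the first two workloads can be written as plain reference programs. This is only a sketch: it is not the guest code from the benchmark repository, and two details are assumptions rather than facts from this article, namely the optional reduction modulus for Fibonacci and the reading of "SHA2-2048" as SHA-256 over a 2048-byte input. The ECDSA and ETHTransfer workloads are not sketched here.

```python
import hashlib


def fibonacci(n, modulus=None):
    """Iterative Fibonacci with F(0) = 0 and F(1) = 1.

    `modulus` is a hypothetical knob: guest programs often use fixed-width
    integers, which corresponds to e.g. modulus = 2**32.
    """
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
        if modulus is not None:
            a %= modulus
            b %= modulus
    return a


def sha2_of_zero_bytes(num_bytes=2048):
    """SHA-256 over `num_bytes` zero bytes (the input size is an assumption)."""
    return hashlib.sha256(bytes(num_bytes)).hexdigest()
```

The Fibonacci loop touches only two values per iteration, which matches the description of that workload as simple iteration with minimal memory access.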
3.1 Execution Time Efficiency (Prover Time)
The four bar charts presented below illustrate the proof generation times for each benchmark (ordered from top to bottom: Fibonacci, SHA2-2048, ECDSA k256, 100 ETHTransfer).
Overall, RISC Zero (GPU), SP1 (GPU), and OpenVM (CPU) demonstrated exceptional time efficiency. Conversely, Pico and Jolt exhibited considerable performance variability contingent upon the specific program being executed.
Comparison with results obtained on an EC2 g5.16xlarge instance underscores how strongly SP1’s GPU performance depends on the underlying hardware.
OpenVM’s remarkable performance in CPU execution merits particular attention, suggesting potential advantages inherent to its modular architectural design.
Fibonacci (100k-th):
Proof generation times were:
- SP1 (GPU): 3.4s
- RISC Zero (GPU): 3.6s
- OpenVM (CPU): 7.5s
- Pico (CPU): 20s
As this program involves straightforward iterative calculations with minimal memory access, the performance advantages conferred by GPU acceleration are particularly pronounced.

SHA2-2048:
Proof generation times were:
- RISC Zero (GPU): 0.54s
- OpenVM (CPU): 0.99s
- Jolt (CPU): 2.7s
- SP1 (GPU): 12s
- ZKM (CPU): 22s
For cryptographic operations such as SHA2, precompile-based acceleration is a common optimization strategy. RISC Zero, OpenVM, SP1, and ZKM all implement relevant precompiles, which helps explain the sub-second times of RISC Zero and OpenVM, although SP1 and ZKM were still slower than Jolt in this run. Jolt, which currently lacks SHA2 precompiles, instead relies on lookup tables to reduce the cost of large bit operations.

ECDSA Verification (secp256k1):
Proof generation times were:
- RISC Zero (GPU): 1.0s
- SP1 (GPU): 12s
- OpenVM (CPU): 14s
- Jolt (GPU): 83s
Analogous to the SHA2 results, RISC Zero, SP1, and OpenVM exhibited superior performance attributable to their implementation of precompiles for requisite cryptographic operations. Jolt’s comparative inefficiency relative to its SHA2 performance stems from its current absence of lookup table support for the finite field arithmetic necessary for ECDSA.

100 ETHTransfer Transactions:
Proof generation times were:
- RISC Zero (GPU): 7.3s
- OpenVM (CPU): 7.6s
- SP1 (GPU): 13s
- Jolt (GPU): 82s
Once again, RISC Zero, OpenVM, and SP1 demonstrated superior performance, benefiting from precompiles for essential EVM operations such as Keccak hashing. The capability to generate proofs for simulated EVM execution in approximately 7 seconds, even on desktop-class hardware, represents a significant technological milestone. For comparative analyses involving continuous EVM block execution, refer to the ETHProofs website.
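The prover times listed in this section are easier to compare as ratios. The sketch below uses only the figures quoted above. The helper names are illustrative, and the geometric mean is one possible summary chosen for this example, not a metric used in the benchmark itself. Note that it mixes GPU and CPU runs exactly as the lists do.

```python
import math

# Prover times in seconds, copied from the lists in this section.
PROVER_TIMES_S = {
    "Fibonacci": {"SP1 (GPU)": 3.4, "RISC Zero (GPU)": 3.6,
                  "OpenVM (CPU)": 7.5, "Pico (CPU)": 20},
    "SHA2-2048": {"RISC Zero (GPU)": 0.54, "OpenVM (CPU)": 0.99,
                  "Jolt (CPU)": 2.7, "SP1 (GPU)": 12, "ZKM (CPU)": 22},
    "ECDSA": {"RISC Zero (GPU)": 1.0, "SP1 (GPU)": 12,
              "OpenVM (CPU)": 14, "Jolt (GPU)": 83},
    "ETHTransfer": {"RISC Zero (GPU)": 7.3, "OpenVM (CPU)": 7.6,
                    "SP1 (GPU)": 13, "Jolt (GPU)": 82},
}


def slowdown_vs_fastest(times):
    """Express each time as a multiple of the fastest entry for that task."""
    best = min(times.values())
    return {name: t / best for name, t in times.items()}


def geomean_slowdown(all_times):
    """Geometric-mean slowdown for entries that appear in every task."""
    common = set.intersection(*(set(t) for t in all_times.values()))
    result = {}
    for name in common:
        ratios = [slowdown_vs_fastest(t)[name] for t in all_times.values()]
        result[name] = math.prod(ratios) ** (1 / len(ratios))
    return result
```

Only RISC Zero (GPU), OpenVM (CPU), and SP1 (GPU) appear in all four lists, so they are the only entries `geomean_slowdown` reports.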

Scalability with Input Size (Fibonacci):
When increasing the Fibonacci input from the 10th term to the 100,000th term, most implementations exhibited proof generation time increases that were approximately linear (or slightly super-linear) with input size.
- SP1 (GPU) demonstrated a remarkably gradual increase, showing minimal performance degradation.
- RISC Zero and OpenVM exhibited moderate performance scaling.
- Pico, Jolt, and ZKM manifested the most significant increases in proving time.
The super-linear scaling observed in certain zkVMs might be attributed to overhead associated with proof aggregation, particularly in implementations that lack continuation mechanisms or have less optimized ones.
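One way to quantify how proving time scales with input size is to fit the slope of log(time) against log(input size): a slope near 1 indicates linear scaling, and a slope above 1 indicates super-linear scaling. The following is a minimal sketch of that fit using ordinary least squares. The function name is illustrative, and any measurements passed to it would have to come from actual runs, since this article does not list the per-size timings.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size).

    If time is roughly c * size**k, the returned value estimates k.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

A fixed startup cost that dominates at small inputs pulls the fitted slope below 1, so the fit is most informative when restricted to the larger input sizes.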

3.2 Memory Efficiency (Peak Memory Usage)
The graph below summarizes the comparative peak memory utilization (CPU execution only). For each project, four bars represent memory consumption for the Fibonacci, SHA2, ECDSA, and ETHTransfer benchmarks, respectively.
Peak Memory Usage (Selected Examples):
- 100k-Fibonacci:
  - RISC Zero (GPU): 0.63GB (GPU Mem)
  - SP1 (GPU): 1.3GB (GPU Mem)
  - Nexus/Novanet (CPU): ~4GB
  - OpenVM (CPU): 8.2GB
  - RISC Zero (CPU): 9.2GB
  - Pico (CPU): 21GB
  - Jolt (CPU): 28GB
  - ZKM (CPU): 82GB
- ECDSA Verification (k256):
  - RISC Zero (GPU): 0.49GB (GPU Mem)
  - SP1 (GPU): 1.3GB (GPU Mem)
  - Nexus/Novanet (CPU): ~4GB
  - RISC Zero (CPU): 4.5GB
  - OpenVM (CPU): 11GB
  - Pico (CPU): 20GB
  - SP1 (CPU): 21GB
  - Jolt (CPU): 58GB
  - ZKM (CPU): 84GB

SP1 (GPU), RISC Zero (GPU), Nexus, and Novanet demonstrated relatively constant memory consumption irrespective of the test program. However, other projects exhibited memory utilization varying significantly depending on program characteristics, likely influenced by data structures and the frequency of memory access operations.
GPU implementations generally exhibited reduced host (CPU) memory utilization but consumed substantial GPU memory; the benchmarked GPU-accelerated projects necessitated a minimum of 24GB of VRAM.
Nexus and Novanet maintained consistent memory utilization even as input size increased, though this consistency incurred significantly extended execution times.
Memory efficiency represents an active area of development, with improvements frequently resulting from the adoption of more efficient memory checking arguments (e.g., polynomial IOPs), utilization of smaller cryptographic fields, and implementation of techniques such as continuation.
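Readers who want to reproduce peak host-memory figures of this kind can measure them with standard operating system accounting. The sketch below is one possible approach on Linux, not necessarily the method used for this benchmark. It runs a prover command as a child process and reads the peak resident set size afterwards. The command line is whatever the reader supplies, and GPU memory is not captured.

```python
import resource
import subprocess
import time


def measure(cmd):
    """Run `cmd` and return (wall_seconds, peak_rss_gib).

    Assumes Linux, where ru_maxrss is reported in KiB. RUSAGE_CHILDREN
    reports the largest peak among all child processes waited for so far,
    so run one command per Python process for a clean reading.
    """
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_kib / (1024 * 1024)
```

A call such as `measure(["cargo", "run", "--release"])` would be a typical use, with the actual command depending on the zkVM being measured.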
3.3 Proof Size
The graph below presents the generated proof sizes in kilobytes (kB).
Proof Sizes (Selected Examples — 100k Fibonacci):
- RISC Zero: 222 kB
- Jolt: 232 kB
- ZKM: 415 kB
- OpenVM: 838 kB
- SP1: 1.8 MB
- Novanet, Nexus: Tens of MB

RISC Zero and Jolt consistently produced among the most compact proof sizes across the evaluated benchmarks.
Conversely, SP1 generated notably large proofs (exceeding 6 MB for tasks other than Fibonacci). Proof sizes above 1 MB, as observed with SP1 and ZKM in certain cases, may indicate that their proof aggregation algorithms warrant further optimization.
3.4 Performance Summary
The following table summarizes the comprehensive performance results.
Figures representing optimal performance in each category are highlighted in green.
Peak memory utilization compares CPU executions exclusively, as GPU memory allocation is distinct and not directly comparable.
Overall, RISC Zero’s performance is exceptionally consistent, while SP1, OpenVM, Pico, and Jolt each achieve the best result in several individual categories.

3.5 Bottleneck Analysis
A detailed performance analysis reveals the following primary bottlenecks for each zkVM implementation:
- Jolt: CPU limitations include Lasso’s Sumcheck protocol and Spartan proof generation. Memory constraints stem from constructing the multilinear extension (MLE) for lookup tables.
- RISC Zero, SP1, ZKM, OpenVM: CPU limitations primarily relate to proof recursion and polynomial commitment schemes. Memory constraints frequently arise from Merkle Tree construction.
- Ceno: The GKR Sumcheck protocol constitutes a bottleneck for both CPU and memory resources.
- Nexus, Novanet: The folding scheme employed for proof aggregation represents the principal limitation affecting both CPU and memory performance.
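To make the Merkle tree memory cost mentioned in the list above concrete, the sketch below builds a binary Merkle tree and keeps every layer, as a prover must if it later needs to open authentication paths. SHA-256 is a stand-in chosen for this illustration; production provers often use other hash functions, and their trees commit to much larger data.

```python
import hashlib


def merkle_layers(leaves):
    """Build all layers of a binary Merkle tree, from leaf hashes to root.

    Requires a power-of-two number of leaves to keep the sketch short.
    """
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("number of leaves must be a power of two")
    layer = [hashlib.sha256(leaf).digest() for leaf in leaves]
    layers = [layer]
    while len(layer) > 1:
        # Hash each adjacent pair of nodes to form the parent layer.
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
        layers.append(layer)
    return layers


def stored_hashes(layers):
    """Number of digests held in memory when every layer is retained."""
    return sum(len(layer) for layer in layers)
```

With n leaves the retained tree holds 2n - 1 digests, so memory grows linearly with the committed data, on top of the data itself.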

4. Best Practices for Selecting zkVMs
Based on our comprehensive benchmark results, the following guidelines can facilitate the selection and configuration of appropriate zkVM technologies:
4.1 Selection Criteria Based on Performance
- For general-purpose applications requiring a balanced combination of speed, memory efficiency, and proof size, RISC Zero, SP1 (particularly with GPU acceleration), and OpenVM emerge as compelling candidates. RISC Zero, specifically, demonstrated consistently exceptional performance across all evaluated computational tasks.
- For applications with highly specialized or computationally intensive requirements not addressed by standard precompiles, the implementation of custom precompiles within a selected zkVM framework is strongly recommended.
4.2 GPU Specifications
- Utilizing the GPU-accelerated variants of SP1 and RISC Zero necessitates graphics processing units with a minimum of 24GB of VRAM for the benchmarked computational tasks.
4.3 Memory Specifications
- A minimum of 32GB of system RAM is advisable for effective operation of most zkVMs in general-purpose applications.
- For memory-intensive zkVMs such as Jolt or ZKM, particularly when processing complex programs, 128GB of RAM or greater might be necessary to prevent performance degradation or memory exhaustion errors.
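The memory guidance above can be checked against the measured peaks. The sketch below uses the CPU peak-memory figures reported for the 100k-Fibonacci benchmark in Section 3.2. The `headroom` factor is a hypothetical safety margin introduced for this example, not a value derived from the benchmark.

```python
# Peak host memory in GB for 100k-Fibonacci (CPU runs), from Section 3.2.
# The Nexus/Novanet figure is approximate in the source (~4GB).
FIB_PEAK_GB = {
    "Nexus/Novanet": 4,
    "OpenVM": 8.2,
    "RISC Zero": 9.2,
    "Pico": 21,
    "Jolt": 28,
    "ZKM": 82,
}


def provers_that_fit(ram_gb, headroom=1.25, peaks=FIB_PEAK_GB):
    """List provers whose peak memory times `headroom` fits in `ram_gb`.

    `headroom` is an assumed margin for the OS and other processes.
    Results are ordered from lowest to highest peak memory.
    """
    fitting = [name for name, peak in peaks.items()
               if peak * headroom <= ram_gb]
    return sorted(fitting, key=peaks.get)
```

Peaks for other programs differ (Jolt reached 58GB on ECDSA, for instance), so a check of this kind should use figures for a workload close to the intended one.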
5. Discussion
5.1 Trade-offs Between Security and Performance
A legitimate concern about the fairness of these benchmark results is that the evaluated zkVM projects offer varying levels of security assurance. Optimizing exclusively for performance can compromise security guarantees.
Currently, numerous zkVM projects remain in development phases and may lack comprehensive security validation. For example:
- The latest version of SP1 has not undergone a complete audit.
- Soundness bugs were previously reported in SP1 (since remediated).
- Jolt remains in an early development stage (v0.1).
Performance comparisons may prove misleading if security levels differ substantially. Future zkVM evaluations should ideally incorporate assessments of security maturity, including factors such as formal verification efforts, completed third-party audits, and the existence of rigorous safety proofs, to provide a more comprehensive comparative analysis.
5.2 Technical Differences in GPU Implementations
Both SP1 and RISC Zero offer GPU acceleration, yet their performance characteristics diverge significantly, suggesting fundamental architectural variations.
A comparative analysis based on their respective GitHub implementations reveals distinct design philosophies:

It should be noted that Jolt’s GPU implementation currently remains partial and subject to substantial architectural modifications.
Comparing the more mature GPU implementations, SP1’s approach, while potentially offering superior maintainability through abstraction of CUDA execution via the Moongate server, may incur additional overhead resulting from the separation of certain processes from the GPU kernel. Conversely, RISC Zero’s direct GPU access through its Hardware Abstraction Layer (HAL), despite potentially entailing higher initial implementation complexity when incorporating new CUDA-accelerated functions, aims to achieve performance closer to theoretical limits by minimizing abstraction layers.
6. Conclusion
This comprehensive study presents a standardized benchmark comparing eight zkVM implementations, evaluating their performance characteristics and potential as scaling solutions specifically relevant to the Ethereum ecosystem.
Our findings indicate that RISC Zero, OpenVM, and SP1 demonstrate particularly robust performance, especially in executing EVM-related computational tasks. This positions them as promising candidates for integration into Ethereum scaling solutions, such as Layer 2 rollups or specialized co-processors, thereby contributing directly to enhancing the network’s throughput and operational efficiency.
RISC Zero, notably, exhibited exceptional efficiency across key metrics relevant to blockchain applications. It completed proof generation for 100 simulated ETHTransfer executions in merely 7.3 seconds (GPU), maintained peak CPU memory utilization below 10GB (with GPU memory consumption under 1GB), and produced remarkably compact proofs (approximately 222kB for Fibonacci computation). These results highlight its significant potential for applications demanding high throughput, minimal resource consumption, and low on-chain verification costs — critical factors for enhancing Ethereum’s scalability and improving overall user experience.
The comparative benchmark results provide invaluable data for developers, researchers, and decision-makers within the Ethereum community. By offering a clearer understanding of current capabilities and performance trade-offs among diverse zkVM technologies, this research aims to inform the selection and development of efficient and robust scaling solutions, ultimately contributing to the continued evolution and expansion of the Ethereum network.
References
Additional sources are cited inline via hyperlinks.