EIP-7885 - Precompile for NTT operations

Created 2025-02-12
Status Draft
Category Core
Type Standards Track
Authors

Abstract

This proposal creates a precompiled contract that performs NTT and Inverse NTT transformations. This provides a way to efficiently perform fast polynomial multiplication for post-quantum and STARK cryptography.

Motivation

With the recent advances in quantum computing, there are increased concerns for the quantum threat against Ethereum. Today ECDSA is the EOA signature algorithms, which is vulnerable to attacks by quantum computers. Efficient replacement algorithms use polynomial multiplication as the core operation. Once NTT and Inverse NTT are available, the remaining of the verification algorithm is trivial. Choosing to integrate NTT and InvNTT instead of a specific algorithm provides agility, as DILITHIUM or FALCON or any equivalent can be implemented with a modest cost from those operators. NTT is also of interest to speed-up STARK verifiers. This single operator would thus benefit to both the Ethereum scaling and post-quantum threat mitigation.

Specification

Constants

Name Value Comment
NTT_FW TBD precompile address
NTT_INV TBD precompile address
NTT_VECMULMOD TBD precompile address
NTT_VECADDMOD TBD precompile address

We introduce four separate precompiles to perform the following operations:

Field parameters

The NTT_FW and NTT_INV are fully defined by the following set of parameters: Let $R$ be a cyclotomic ring of the form $R=\mathbb F_q[X]/(X^n+1)$. In these notations,

Any element $a \in R$ is a polynomial of degree at most $n-1$ with integer coefficients, written as $a=\sum_{i=0}^{n-1} a_iX^i$

NTT_FW

The NTT transformation is described by the following algorithm.

Input: A vector $a = (a[0], a[1], \dots, a[n-1]) \in \mathbb F_q^n$ in standard order, where $q$ is a prime such that $q \equiv 1 \mod 2n$ and $n$ is a power of two, and a precomputed table $\Psi_\text{rev} \in \mathbb{Z}_q^n$ storing powers of $\psi$ in bit-reversed order.

Output: $a \leftarrow \text{NTT_FW}(a)$ in bit-reversed order.

t  n
for m = 1 to n-1 by 2m do
    t  t / 2
    for i = 0 to m-1 do
        j1  2  i  t
        j2  j1 + t - 1
        S  Ψrev[m + i]
        for j = j1 to j2 do
            U  a[j]
            V  a[j + t]  S
            a[j]  (U + V) mod q
            a[j + t]  (U - V) mod q
        end for
    end for
end for
return a

NTT_INV

The Inverse NTT is described by the following algorithm.

Input: A vector $a = (a[0], a[1], \dots, a[n-1]) \in \mathbb F_q^n$ in bit-reversed order, where $q$ is a prime such that $q \equiv 1 \mod 2n$ and $n$ is a power of two, and a precomputed table $\Psi^{-1}_\text{rev} \in \mathbb F_q^n$ storing powers of $\psi^{-1}$ in bit-reversed order.

Output: $a \leftarrow \text{NTT_INV}(a)$ in standard order.

t  1
for m = n to 1 by m/2 do
    j1  0
    h  m / 2
    for i = 0 to h-1 do
        j2  j1 + t - 1
        S  Ψ¹rev[h + i]
        for j = j1 to j2 do
            U  a[j]
            V  a[j + t]
            a[j]  (U + V) mod q
            a[j + t]  (U - V)  S mod q
        end for
        j1  j1 + 2t
    end for
    t  2t
end for
for j = 0 to n-1 do
    a[j]  a[j]  n¹ mod q
end for
return a

NTT_VECMULMOD

The NTT_VECMULMOD is functions similarly to SIMD, but operates with larger input and output sizes.

Input: Two vectors $a = (a[0], a[1], \dots, a[n-1]), b=(b[0], b[1], \dots, b[n-1]) \in \mathbb F_q^n$ where $n$ and $q$ are defined above.

Output: The element-wise product $(a[0]\cdot b[0] \mod q, a[1]\cdot b[1]\mod q, \dots, a[n-1]\cdot b[n-1] \mod q)$.

Gas cost: Denoting $k$ to be the smallest power of $2$ larger than $\log_2(q)$, the gas cost of this operation is $k\log_2(n) / 8$.

NTT_VECADDMOD

The NTT_VECMULMOD is similar to SIMD in the functioning, but operates with larger sizes in input and output.

Input: Two vectors $a = (a[0], a[1], \dots, a[n-1]), b=(b[0], b[1], \dots, b[n-1]) \in \mathbb F_q^n$ where $n$ and $q$ are defined above.

Output: The element-wise addition $(a[0]+ b[0] \mod q, a[1]+ b[1]\mod q, \dots, a[n-1]+ b[n-1] \mod q)$.

Gas cost: Denoting $k$ to be the smallest power of $2$ larger than $\log_2(q)$, the gas cost of this operation is $k\log_2(n) /32$.

Rationale

If $f$ and $g$ are two polynomials of $R$, then $f\times g= \text{NTT_INV}(\text{NTT_VECMULMOD}( \text{NTT_FW}(a), \text{NTT_FW}(b)))$ is equal to the product of $f$ and $g$ in $R$. The algorithm has a complexity of $n \log_2n$ rather than $n^2$ with the classical schoolbook multiplication algorithm.

Gas Cost Analysis

The gas cost model for EIP-7885 precompiles is designed to target approximately 50-55 mgas/s performance, consistent with existing precompiled contracts like ECRECOVER. Two implementations with different gas costs are available:

NTT Precompiles (0x12, 0x13) - Scheme-Specific Cost Model

Gas Costs by Implementation:

Scheme Ring Degree nocgo NTT_FW nocgo NTT_INV cgo NTT_FW cgo NTT_INV
Falcon-512 512 790 gas 790 gas 500 gas 500 gas
Falcon-1024 1024 1,750 gas 1,750 gas 1,080 gas 1,080 gas
ML-DSA (Dilithium) 256 220 gas 270 gas 256 gas 340 gas

Pure Go (nocgo): Zero-dependency implementation.

liboqs (cgo): Offers 36-38% lower gas costs for NTT operations.

Rationale: Gas costs are calibrated per cryptographic scheme based on actual execution performance. NTT computation complexity varies by ring degree and modulus size, with costs optimized to maintain consistent throughput across different post-quantum schemes.

Vector Operations (0x14, 0x15) - Dynamic Cost Model

VECMULMOD Gas Cost: ceil(0.32 × N)

VECADDMOD Gas Cost: ceil(0.3 × N)

Cost Components:

Benchmark Validation: The dynamic cost model achieves target performance:

The gas cost formulas accurately reflect actual execution costs while maintaining ~50-55 mgas/s performance target across all tested cryptographic standards.

Fields of interest

The implementation applies for many fields of interest for cryptography. In particular, the design applies for:

Benchmarks

Pure solidity

To illustrate the interest of the precompile, the assets provide the measured gas const for a single NTT and extrapolates the minimal gas cost taking into account the required number of NTT_FW and NTT_INV. The provided assets use pure Yul optimizations, with memory access hacks. It is unlikely that more than one order of magnitude could be spared on such a minimal code.

Use case Parameters single NTT gas cost Required NTT(FW/INV) Estimated NTT/Full cost
Falcon $q=12289, n=512$ 1.8 M 1 NTTFW+1 NTTINV 3.6 M
Dilithium $q=2^{23}-2^{13}+1, n=256$ 460K 4 NTTFW +1 NTTINV 2.3M

Falcon cost has been measured over a full implementation and is compliant to the estimation. Dilithium cost is evaluated assuming

This demonstrates that using pure solidity enables L2s with low gas fees to experiment with FALCON in the short term, whereas it is too expensive to do so on L1. Adopting this EIP, the signature verification of Falcon can be reduced to 1500 gas, and a similar result is expected for Dilithium. Adopting the hash function as a separate EIP would enable a gas verification cost of 2000 gas. This is in line with the ratio looking at SUPERCOP implementations.

Native Client Implementation

Two implementations have been developed for op-geth with four precompiled contracts at addresses 0x12, 0x13, 0x14, and 0x15:

Pure Go Implementation (nocgo) - Default:

liboqs Implementation (cgo) - High Performance:

Both implementations support:

Benchmark results on Intel(R) Xeon(R) CPU @ 2.20GHz (liboqs implementation):

BenchmarkPrecompiledNTT_FW/NTT_FW-Falcon-512-Gas=500-4
    377011      9411 ns/op      500 gas/op        53.12 mgas/s
BenchmarkPrecompiledNTT_FW/NTT_FW-Falcon-1024-Gas=1080-4
    176936     20419 ns/op     1080 gas/op        52.89 mgas/s
BenchmarkPrecompiledNTT_FW/NTT_FW-ML-DSA-256-Gas=256-4
    740838      4785 ns/op      256 gas/op        53.49 mgas/s

BenchmarkPrecompiledNTT_INV/NTT_INV-Falcon-512-Gas=500-4
    372236      9374 ns/op      500 gas/op        53.33 mgas/s
BenchmarkPrecompiledNTT_INV/NTT_INV-Falcon-1024-Gas=1080-4
    176300     20316 ns/op     1080 gas/op        53.15 mgas/s
BenchmarkPrecompiledNTT_INV/NTT_INV-ML-DSA-256-Gas=340-4
    558224      6396 ns/op      340 gas/op        53.15 mgas/s

BenchmarkPrecompiledNTT_VECMULMOD/VECMULMOD-Falcon-512-Gas=164-4
   1237585      2919 ns/op      164 gas/op        56.17 mgas/s
BenchmarkPrecompiledNTT_VECMULMOD/VECMULMOD-Falcon-1024-Gas=328-4
    592540      5999 ns/op      328 gas/op        54.67 mgas/s
BenchmarkPrecompiledNTT_VECMULMOD/VECMULMOD-ML-DSA-256-Gas=82-4
   1726333      1957 ns/op       82 gas/op        41.89 mgas/s

BenchmarkPrecompiledNTT_VECADDMOD/VECADDMOD-Falcon-512-Gas=154-4
   1273167      2828 ns/op      154 gas/op        54.45 mgas/s
BenchmarkPrecompiledNTT_VECADDMOD/VECADDMOD-Falcon-1024-Gas=308-4
    606345      5829 ns/op      308 gas/op        52.84 mgas/s
BenchmarkPrecompiledNTT_VECADDMOD/VECADDMOD-ML-DSA-256-Gas=77-4
   2178388      1645 ns/op       77 gas/op        46.81 mgas/s

BenchmarkPrecompiledEcrecover/-Gas=3000-4
   65388      57182 ns/op     3000 gas/op        52.46 mgas/s

The benchmarks demonstrate that all NTT precompiles maintain competitive or better throughput compared to existing precompiled contracts like ECRECOVER, processing approximately 52-56 million gas per second. Both implementations (nocgo and cgo) provide efficient support for multiple post-quantum cryptographic schemes including Falcon-512/1024 and ML-DSA/Dilithium, with the liboqs variant offering ~36-38% lower gas costs for NTT operations.

End-to-End Signature Verification with Precompiles

Integration testing demonstrates the practical impact of NTT precompiles on full signature verification. Tests were conducted using modified ZKNOX implementations (ETHFALCON and ETHDILITHIUM) against an op-geth client with NTT precompile support.

Test Environment: OP-Stack testnet with NTT precompiles enabled.

Algorithm Verification Method Pure Solidity With Precompile Gas Saved
ETHFALCON (Falcon-512) Direct 1,800,000 479,341 1,320,659 (73.4%)
ETHDILITHIUM (ML-DSA) PKContract-based 9,746,410 7,618,412 2,127,998
ETHDILITHIUM (ML-DSA) Direct (struct) 7,874,354 5,732,354 2,142,000

Key Findings:

These results validate that NTT precompiles significantly reduce gas costs for post-quantum signature verification, making lattice-based cryptography more practical for Ethereum applications.

Backwards Compatibility

There are no backward compatibility questions.

Test Cases

The reference implementation includes comprehensive test coverage across multiple layers:

NTT Precompile Tests (0x12)

Malformed Input Tests - 8 test cases covering:

Functional Tests:

Crypto Standards Benchmarks:

Vector Operations Tests (0x13, 0x14)

Unified Malformed Input Tests - 7 test cases covering:

Functional Tests:

Crypto Standards Benchmarks - Performance validation with:

Benchmark data are available in the EIP assets:

Pure Go Implementation (nocgo):

liboqs Implementation (cgo):

Reference Implementation

All implementations have been validated over a large base of reference vectors, and implementing both FALCON and DILITHIUM algorithms as demonstration of the usefulness of the precompile.

Security Considerations

Needs discussion.

Copyright

Copyright and related rights waived via CC0.