Vitalik Buterin has published a research proposal that sidesteps the question everyone keeps asking: can you run AI models on a blockchain?
Instead, the proposal positions Ethereum as a privacy-preserving payments layer for pay-as-you-go AI and API usage. The post, co-authored with Davide Crapis on Ethereum Research, argues that the real opportunity is not to put LLMs on-chain.
The real opportunity lies in building an infrastructure that allows agents and users to pay for thousands of API calls without compromising their identities or creating a surveillance trail with billing data.
The timing matters because agentic AI is moving from demos to enterprise roadmaps. Gartner predicts that 40% of enterprise applications will include task-specific AI agents by the end of 2026, up from less than 5% in 2025.
This shift hints at a world where software autonomously generates large numbers of API calls and billing rails become strategic infrastructure rather than back-office plumbing.
Current metering systems force a choice between two flawed options. Web2 identity billing relies on API keys and credit cards and exposes profiling data. On-chain pay-per-call models are too slow and expensive, and they link activity through transparent transaction graphs.
The proposal introduces ZK API Usage Credits, a payment and abuse-prevention primitive built on Rate-Limit Nullifiers (RLN).
RLN is a zero-knowledge gadget designed to prevent spam in anonymous systems, and the proposal repurposes it for metered access to services.
The flow proceeds as follows. A user deposits funds once into a smart contract, and their commitment is added to an on-chain Merkle tree.
Each API request includes a zero-knowledge proof showing that the user is a valid depositor with sufficient credit for the requested ticket index.
If a user reuses a ticket index in an attempt to double-spend their allowance, RLN allows the system to recover the user's secret and slash the stake as a financial penalty.
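To see why reusing a ticket index is self-defeating, here is a minimal Python sketch of the secret-sharing trick that RLN-style schemes generally rely on. Each request reveals one point on a line whose intercept is the user's secret, and two points for the same index pin down the whole line. The field modulus, the SHA-256 stand-in hash, and the function names `share` and `recover_secret` are illustrative assumptions rather than details from the proposal; a real deployment would enforce this inside a ZK circuit.

```python
import hashlib

# Illustrative prime modulus (assumption). Production RLN circuits use a
# SNARK-friendly field and hash; SHA-256 only stands in for that here.
P = 2**61 - 1


def h(*parts: int) -> int:
    """Hash integers into the field."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def share(secret: int, ticket_index: int, message: int) -> tuple:
    """Return one point (x, y) on the line y = secret + a1 * x."""
    a1 = h(secret, ticket_index)  # slope is fixed per (secret, ticket index)
    x = h(message)                # x depends on the request being made
    y = (secret + a1 * x) % P
    return x, y


def recover_secret(p1: tuple, p2: tuple) -> int:
    """Given two distinct points on the same line, solve for the intercept."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("need two different requests to recover the secret")
    slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    return (y1 - slope * x1) % P


if __name__ == "__main__":
    secret = 123456789
    first = share(secret, ticket_index=7, message=1001)
    second = share(secret, ticket_index=7, message=1002)  # same ticket, reused
    print(recover_secret(first, second) == secret)
```

A single point reveals nothing useful about the intercept, which is why an honest user who spends each ticket index once stays anonymous. Only the second use of the same index hands the server enough information to slash.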
The post gives concrete examples. One user deposits 100 USDC and runs 500 hosted LLM queries. Another deposits 10 USDC for 10,000 Ethereum RPC calls.
The architecture is explicitly designed for “many calls per deposit,” and on-chain activity scales with the number of accounts and settlement frequency, rather than raw inference volume.
Support for variable costs adds flexibility. Users pay the maximum fee per call upfront, the server returns a signed refund ticket for the unused amount, and users accumulate refunds privately to unlock more calls without additional deposits.
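The refund mechanic is easiest to follow as bookkeeping. The sketch below is an assumption-laden model, not the proposal's circuit: it tracks a deposit, a maximum fee per call, and accumulated refunds, and it assumes a solvency rule in which the worst-case spend through the next ticket must be covered by the deposit plus refunds. The 0.20 USDC maximum fee is chosen only so that a 100 USDC deposit yields the 500 queries from the example above; `CreditState` and `calls_until_empty` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CreditState:
    deposit: int         # on-chain deposit, in micro-USDC
    max_fee: int         # maximum fee paid upfront per call, in micro-USDC
    refunds: int = 0     # total value of refund tickets collected so far
    next_index: int = 0  # next unused ticket index

    def can_call(self) -> bool:
        # Assumed rule: worst-case spend through the next ticket must be
        # covered by the deposit plus accumulated refunds.
        worst_case = (self.next_index + 1) * self.max_fee
        return worst_case <= self.deposit + self.refunds

    def call(self, actual_cost: int) -> int:
        """Spend one ticket and return the refund for the unused amount."""
        if not self.can_call():
            raise RuntimeError("insufficient credit for another call")
        if not 0 <= actual_cost <= self.max_fee:
            raise ValueError("actual cost must be between 0 and max_fee")
        refund = self.max_fee - actual_cost
        self.refunds += refund
        self.next_index += 1
        return refund


def calls_until_empty(state: CreditState, actual_cost: int) -> int:
    """Count how many calls fit if every call costs actual_cost."""
    count = 0
    while state.can_call():
        state.call(actual_cost)
        count += 1
    return count


if __name__ == "__main__":
    full_price = CreditState(deposit=100_000_000, max_fee=200_000)
    half_price = CreditState(deposit=100_000_000, max_fee=200_000)
    print(calls_until_empty(full_price, 200_000))  # every call costs the max
    print(calls_until_empty(half_price, 100_000))  # every call costs half
```

Under these assumptions, calls that always cost the maximum exhaust the deposit after 500 requests, while calls that cost half the maximum stretch the same deposit to 999 requests because each refund feeds back into available credit.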
The infrastructure already exists
The proposal arrives at a time when much of the payment infrastructure needed for usage credits already exists.
According to DefiLlama, stablecoins have a combined circulating market capitalization of approximately $307.6 billion, indicating that the on-chain dollar layer has sufficient liquidity to support deposit-based billing for high-frequency services.
Ethereum’s scaling stack has matured to the point where rollups handle far more activity than Layer 1. L2Beat shows a scaling factor of around 100x, with rollups processing thousands of operations per second, well above what Ethereum mainnet handles on its own.
Ethereum’s average transaction fee was around $0.21 on February 7, suggesting that occasional on-chain settlement and payment flows are feasible without prohibitive costs.
This design explicitly avoids putting LLMs on-chain. Ethereum competes not on TPU cycles or inference speed, but on neutral payments, programmable escrow, and verifiable execution.
This architecture treats inference as an off-chain service and blockchain as a reliable layer for payments, measurement, and dispute resolution. There is no need for users to trust or reveal their identity to individual providers.
Ethereum becomes the enforcement layer for AI commerce when AI service providers rely on Ethereum or layer 2 smart contracts to accept deposits, slash, refund, and adjudicate disputes.
This model is similar to how Ethereum became a stablecoin and DeFi payments layer by providing a neutral foundation on which economic agreements are enforced programmatically, rather than hosting the complete application stack on-chain.
A scenario without the hype
On-chain footprint is limited by settlement cadence, not raw call volume.
Assume that 250,000 power users or agents adopt usage credits in a crypto-native wedge scenario targeting RPC and infrastructure APIs.
If each performs two on-chain actions per month (a deposit or top-up and a withdrawal), the rail generates approximately 500,000 transactions per month.
In an AI provider deployment scenario, imagine 1 million users leveraging privacy-preserving credits across hosted LLM services, but only performing 1 to 3 on-chain actions per month.
That implies 1 million to 3 million transactions per month tied to AI commerce rails, likely concentrated on Layer 2, where execution is cheaper.
In enterprise agent scenarios, deposits grow larger, the stakes of reliable execution rise, and the slashing mechanism becomes more important.
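The first two scenarios reduce to simple multiplication, which the short Python sketch below makes explicit. The user counts and actions per month come from the scenarios above. The figure of 10,000 off-chain calls per user per month is a hypothetical borrowed from the RPC deposit example, added only to show how far call volume and on-chain footprint can diverge.

```python
def monthly_onchain_txs(users: int, actions_per_user: int) -> int:
    """On-chain transactions per month attributable to the credit rail."""
    return users * actions_per_user


# Scenario figures taken from the text.
wedge_txs = monthly_onchain_txs(250_000, 2)
ai_low_txs = monthly_onchain_txs(1_000_000, 1)
ai_high_txs = monthly_onchain_txs(1_000_000, 3)

# Hypothetical: each wedge user burns through one 10,000-call deposit a month.
ASSUMED_CALLS_PER_USER = 10_000
offchain_calls = 250_000 * ASSUMED_CALLS_PER_USER
calls_per_tx = offchain_calls // wedge_txs

print(wedge_txs, ai_low_txs, ai_high_txs)  # 500000 1000000 3000000
print(calls_per_tx)                        # 5000
```

Under that assumption, each on-chain transaction in the wedge scenario would stand behind roughly 5,000 off-chain API calls.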
Metadata issues
The proposal seeks to make payments unlinkable, but the research thread itself highlights potential weaknesses.
Commenters argue that even if nullifiers cannot be cryptographically linked, servers can still correlate users through inference-side metadata such as timing patterns, token counts, and cache hits.
The critique suggests fixed, bucketed pricing for input and output classes to reduce leakage. The tension between cryptographic privacy and behavioral metadata is central to whether the design actually achieves its goal of anonymity.
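Bucketed pricing can be pictured as rounding every request up to a coarse size class before charging it. The Python sketch below is illustrative only: the bucket ceilings, the prices in micro-USDC, and the function names are hypothetical and do not come from the research thread.

```python
import bisect

# Hypothetical bucket ceilings (tokens) and flat prices (micro-USDC).
BUCKET_LIMITS = [256, 1024, 4096, 16384]
BUCKET_PRICES = [500, 2_000, 8_000, 32_000]


def bucket_for(tokens: int) -> int:
    """Return the index of the smallest bucket that fits the token count."""
    if tokens < 0:
        raise ValueError("token count cannot be negative")
    idx = bisect.bisect_left(BUCKET_LIMITS, tokens)
    if idx == len(BUCKET_LIMITS):
        raise ValueError("request exceeds the largest bucket")
    return idx


def price_for(tokens: int) -> int:
    """Charge the flat bucket price, so the fee reveals only the bucket."""
    return BUCKET_PRICES[bucket_for(tokens)]


if __name__ == "__main__":
    # Requests of 10 and 256 tokens are billed identically.
    print(price_for(10), price_for(256), price_for(257))
```

With flat prices per bucket, the amount charged no longer reveals the exact token count, only the size class. Timing patterns and cache hits would still need separate mitigation.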
Implementation is another hurdle. Although the proposal uses RLN as a primitive, the Privacy and Scaling Explorations project page lists RLN as inactive or deprecated.
Productizing ZK API Usage Credits may therefore require maintaining a fork or building a new implementation rather than relying on existing tools.
The RLNJS benchmark reports around 800ms for proof generation and around 130ms for verification on M2 Macs, providing an early performance sanity check, but leaving open questions about mobile constraints and large production-grade circuits.
The proposal also envisions providers integrating deposit and proof flows, accepting stablecoin payments, and adopting Ethereum or Layer 2 contracts for dispute resolution.
That makes it a coordination problem as well as a technical one. Web2 API providers already have billing infrastructure and regulatory clarity around identity-linked transactions.
Convincing them to adopt a ZK-based alternative will require demonstrating either a compelling cost advantage or a differentiated market segment in which privacy-preserving billing captures revenue they would not otherwise have.
| Model | How billing works | What leaks or breaks | Best fit |
|---|---|---|---|
| Web2 ID billing (API key + card) | Account-based billing tied to your ID (API key + payment method). Provider centrally measures requests and invoices | leak: Identity linkage and profiling trail across requests. break: Pseudonym/self-custody norms. risk: Centralized management (suspension/censorship, single provider trust) | Mainstream SaaS/API providers. Companies that prioritize compliance, simplicity, and existing billing rails |
| On-chain pay-per-call | Each request (or batch) pays on-chain per call via a transaction/smart contract | break: High call cost/delay. leak: On-chain linkability (transaction graph ties usage together). friction: UX overhead for repeated TX | Crypto-native services that are called infrequently. When transparency/auditability is more important than privacy/throughput |
| ZK API usage credits (one deposit, many calls) | Users deposit once. Each request includes a ZK proof of membership and remaining credit. Slashing for double-spends. Optional refund tickets for variable costs | risk: Metadata correlation (timing/token patterns can relink requests). burden: Provider integration + coordination. maturity: ZK tooling/operational complexity, circuit maintenance | High-frequency APIs (LLM, RPC, data) where privacy is a selling point. Agent toolchains. Users who need metering without identity-based monitoring |
What this means for Ethereum
If this design gains traction, Ethereum’s value proposition will further shift toward serving as a neutral enforcement layer for digital commerce rather than a general-purpose computing platform.
The proposal treats blockchain not as a place where applications run, but as a payment infrastructure that ensures economic rules are enforced.
Stablecoin velocity would increase as deposits flow into usage-credit contracts, potentially creating a new category of on-chain economic activity distinct from DeFi speculation and NFT trading.
Layer 2 utilization is likely to increase as providers and users resolve disputes, process refunds, and handle slashing events on throughput-optimized chains.
The question is whether a parallel ecosystem will emerge in which privacy-preserving billing becomes a prerequisite for certain user segments.
Businesses concerned about data leakage through billing logs, developers building agent toolchains that need auditable metering without surveillance, and power users who value anonymous access to high-frequency services are all potential early adopters.
Ethereum’s opportunity is to serve as a layer on which the AI services market can settle without participants having to trust individual platforms or sacrifice the privacy of their billing infrastructure.
The proposal claims that Ethereum can enforce payment agreements, adjudicate disputes, and enable pay-as-you-go access without identity linkage in ways that are structurally impossible with traditional systems.
Whether that argument succeeds depends on solving the metadata correlation problem, maintaining robust ZK implementations, and convincing providers that the market this unlocks justifies the integration costs.