
The crypto world has been obsessed with price charts and Bitcoin ETFs while the real crisis goes unnoticed. An infrastructural crisis lurks beneath the surface of Ethereum, and it has to do with the blockchain’s “state”. The “state” refers to all of the account balances, smart contracts, and chunks of data that exist on a given blockchain, and it has quietly ballooned out of control. Approximately 80% of it is untouched data that has not been accessed in well over a year, yet all nodes are forced to maintain it indefinitely. This ultimately means that a system once touted as extremely decentralized is quickly coming to depend on a small, elite group of infrastructure owners.
The problem was flagged on December 16th, when the Stateless Consensus team of the Ethereum Foundation issued a grim warning: if left unchecked, state growth will eventually undermine the censorship-resistant nature of Ethereum and push it toward a more centralized system.
Ethereum’s own scaling efforts are also contributing to the problem. Increased gas limits, recent scaling improvements, and growing network use may allow for more transactions, but each one permanently adds to the state, which never expires. Researchers at the Ethereum Foundation have now proposed three ways forward: state expiry, which temporarily removes idle data from the active set; state archive, which decouples hot storage from cold storage; and partial statelessness, which spreads the responsibility for storage among specialized nodes and even user wallets.
There may be unknowns in the Ethereum roadmap, but the implication is clear: the network must solve its hidden storage problem or risk becoming the kind of centralized system it set out to replace.
Each new account, each storage write, and each bytecode deployment adds data to the state that the network must maintain for good. Higher gas limits have accelerated state growth because they allow more write operations per block. A change intended to broaden access to Ethereum is now making it harder for ordinary users to run full nodes, handing more power to major infrastructure operators.
The impact goes beyond the cost of storage. Validators and full nodes have to store more data, and the database does extra work that becomes less efficient as the state grows. RPC (Remote Procedure Call) servers have to hold the entire state so they can query any account or storage slot at any time, and, according to the Ethereum Foundation, synchronization becomes slower and more error-prone as the chain keeps growing.
Ethereum's long-term development plan includes a feature called "statelessness," which would allow validators to verify blocks without holding the full state. On paper, this may sound like the perfect solution, as validators could operate with far less hardware, theoretically increasing decentralization. However, the Foundation's researchers warn of an unintended consequence.
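To make the idea of verifying without holding the state more concrete, here is a minimal sketch in Python. It assumes a plain binary Merkle tree over SHA-256 as a stand-in; Ethereum's actual state tree and hash function differ, and the helper names (`build_tree`, `make_witness`, `verify`) are hypothetical. The point is only that a verifier who knows a single root hash can check one account's data using a short list of sibling hashes supplied by someone else.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an illustrative choice, not Ethereum's)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return every level of the tree, from hashed leaves up to the root.

    Assumes the number of leaves is a power of two.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_witness(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])  # sibling sits next to the node
        index //= 2
    return witness


def verify(root, leaf, index, witness):
    """Recompute the root from one leaf plus its witness; no full state needed."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# A party holding the full state builds the tree and hands out witnesses.
accounts = [b"alice:10", b"bob:5", b"carol:7", b"dave:0"]
levels = build_tree(accounts)
root = levels[-1][0]
witness = make_witness(levels, 2)

# The "stateless" verifier keeps only `root`.
print(verify(root, b"carol:7", 2, witness))    # True
print(verify(root, b"carol:700", 2, witness))  # False
```

Note that someone still has to build the tree and produce the witness, which is exactly where the concern about who ends up holding the state comes from.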
In a stateless chain, the role of state storage will be separated from the validators and will become much more specialized. Most of the state data will be retained only by block builders, RPC providers, and other specialized nodes such as MEV searchers (bots that profit from reordering transactions) and block explorers (services that display blockchain data). Essentially, the state of Ethereum will become much more centralized at the very point when Ethereum most needs decentralization.
This is what the Ethereum Foundation describes as a “resilience and capture risk.” With only a limited number of players holding and serving the full state, network failures or outside pressure on them could rapidly cut off a broad swath of the Ethereum network.
This risk extends to Layer 2 rollups (networks built on top of Ethereum that process transactions faster and cheaper), which rely on the assumption that a user can always access the state of Ethereum in order to force a transaction through in an emergency. If access to the state is precarious and controlled by a few players, the very safeguards that make rollups secure will fall apart.
The Stateless Consensus team from the Ethereum Foundation has proposed three different methodologies that target the swelling of the state on different fronts.
State Expiry is the most aggressive approach. The basic design involves temporarily removing inactive state from the “active set” while providing a mechanism to revive it with a proof when it is needed again. Two versions of state expiry are being investigated: “mark-expire-revive,” which deals with entries one by one, and “multi-era expiry,” which groups state into periods of time.
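The one-by-one flavor of expiry can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the `EXPIRY_PERIOD` value, the class and method names, and the use of a bare SHA-256 hash as the "proof" are not taken from the Foundation's proposal. The sketch records when each entry was last touched, moves stale entries out of the active set while keeping only a small commitment, and lets an entry back in when the caller presents data matching that commitment.

```python
import hashlib
from dataclasses import dataclass, field

EXPIRY_PERIOD = 100  # hypothetical: blocks of inactivity before expiry


def commitment(key: str, value: int) -> str:
    """Small fingerprint kept after the full entry is dropped."""
    return hashlib.sha256(f"{key}={value}".encode()).hexdigest()


@dataclass
class Entry:
    value: int
    last_access: int  # the "mark": block number of the latest touch


@dataclass
class ExpiringState:
    active: dict = field(default_factory=dict)   # key -> Entry
    expired: dict = field(default_factory=dict)  # key -> commitment only

    def write(self, key, value, block):
        if key in self.expired:
            raise KeyError(f"{key} is expired; revive it first")
        self.active[key] = Entry(value, block)

    def read(self, key, block):
        entry = self.active[key]  # raises KeyError if expired or unknown
        entry.last_access = block
        return entry.value

    def expire(self, block):
        """Move entries untouched for EXPIRY_PERIOD blocks out of the active set."""
        stale = [k for k, e in self.active.items()
                 if block - e.last_access >= EXPIRY_PERIOD]
        for key in stale:
            entry = self.active.pop(key)
            self.expired[key] = commitment(key, entry.value)
        return stale

    def revive(self, key, claimed_value, block):
        """Re-admit an entry if the claimed value matches its commitment."""
        if self.expired.get(key) != commitment(key, claimed_value):
            raise ValueError("proof does not match the stored commitment")
        del self.expired[key]
        self.active[key] = Entry(claimed_value, block)


state = ExpiringState()
state.write("alice", 10, block=0)
state.write("bob", 5, block=0)
state.read("alice", block=90)       # alice stays fresh
print(state.expire(block=120))      # ['bob']
state.revive("bob", 5, block=121)   # bob is active again
print(sorted(state.active))         # ['alice', 'bob']
```

In this model the active set shrinks, but whoever wants to revive `bob` must still know the old value, so the data has to live somewhere outside the active set.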
Nevertheless, state expiry introduces serious usability issues. "Resurrection hopping" (the problem of repeatedly having to provide proofs for expired accounts) would require robust wallet support to handle the frustrating situation of discovering, mid-transaction, that several pieces of state have expired and then retrieving the proofs needed to revive them.
In fact, a recent post on the Ethereum research forum described the scenario: a sender who wants to send ETH and tokens to an expired account first discovers that the account is expired and, after reviving it, finds that the token contract's storage slots are expired too, which requires additional proofs and poses an obvious usability problem.
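The chain of discoveries in that scenario can be simulated with a short Python sketch. It is a toy under stated assumptions: the `Expired` exception, the `fetch_proof` callback, and the key names are hypothetical, and real proof retrieval is replaced by a callback that simply records what was asked for. The sketch shows how each expired entry surfaces only after the previous one has been revived, so a single transfer can take several attempts.

```python
class Expired(Exception):
    """Raised when execution touches an expired entry (hypothetical)."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def execute_transfer(expired, touched):
    """Stop at the first expired entry, as a failing transaction would."""
    for key in touched:
        if key in expired:
            raise Expired(key)


def send_with_revival(expired, touched, fetch_proof):
    """Retry the transfer, reviving one expired entry per failed attempt."""
    attempts, revived = 0, []
    while True:
        attempts += 1
        try:
            execute_transfer(expired, touched)
            return attempts, revived
        except Expired as err:
            fetch_proof(err.key)      # stand-in for retrieving a real proof
            expired.discard(err.key)  # entry is treated as revived
            revived.append(err.key)


expired = {"bob", "token:balance[bob]"}
touched = ["bob", "token:balance[bob]"]  # account first, then the token slot
requested = []

attempts, revived = send_with_revival(expired, touched, requested.append)
print(attempts)  # 3
print(revived)   # ['bob', 'token:balance[bob]']
```

Two expired entries turn one transfer into three attempts here, which is the kind of friction wallets would need to hide from users.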
“State Archive” takes a less disruptive approach. It distinguishes between the “hot state,” which the network accesses often, and the “cold state,” which matters for history and verification but is accessed infrequently. The “hot state” that needs to be accessed quickly remains bounded in size even as the overall state keeps growing. This way, execution performance does not decline over time as the chain ages.
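The hot and cold split can be illustrated with a small two-tier store in Python. This is a sketch, not the Foundation's design: the article does not say how entries would be classified, so least-recently-used eviction is an assumption made here, and the class name `TieredState` and its `hot_capacity` parameter are hypothetical. The hot tier never exceeds its capacity, while the cold tier absorbs everything that falls out of it.

```python
from collections import OrderedDict


class TieredState:
    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # recently used entries, oldest first
        self.cold = {}            # archive tier, allowed to grow

    def put(self, key, value):
        self.cold.pop(key, None)  # a fresh write supersedes any archived copy
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # mark as recently used
            return self.hot[key]
        value = self.cold.pop(key)     # raises KeyError if the key is unknown
        self.hot[key] = value          # promote back to the hot tier
        self._evict()
        return value

    def _evict(self):
        """Demote the least recently used entries once the hot tier is full."""
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value


state = TieredState(hot_capacity=2)
for name, balance in [("alice", 10), ("bob", 5), ("carol", 7)]:
    state.put(name, balance)

print(list(state.hot), list(state.cold))  # ['bob', 'carol'] ['alice']
print(state.get("alice"))                 # 10, promoted from the cold tier
print(list(state.hot), list(state.cold))  # ['carol', 'alice'] ['bob']
```

However many entries are written, lookups on the hot path only ever deal with `hot_capacity` items, which mirrors the claim that execution performance need not degrade as the total state grows.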
The Ethereum Foundation has said that Partial Statelessness would allow for an entirely different perspective on who holds which data. Nodes would no longer be required to store all state information; instead, they would store and serve the state related to a certain set of users or applications on the network, and the actual storage and caching of state pieces would be handled by wallets and light clients themselves.
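A rough picture of that division of labor, sketched in Python: each node keeps only the keys it has signed up for, and a lookup is routed to whichever node is responsible. The article does not describe how state would actually be partitioned, so the key-prefix scheme, the `PartialNode` class, and the `lookup` helper are all hypothetical.

```python
class PartialNode:
    """A node that stores and serves only the state it is responsible for."""

    def __init__(self, name, prefixes):
        self.name = name
        self.prefixes = tuple(prefixes)  # illustrative partitioning rule
        self.store = {}

    def is_responsible(self, key):
        return key.startswith(self.prefixes)

    def put(self, key, value):
        """Keep the entry only if it falls in this node's slice of the state."""
        if self.is_responsible(key):
            self.store[key] = value
            return True
        return False

    def get(self, key):
        return self.store.get(key)


def lookup(nodes, key):
    """Return (node name, value) from the first responsible node that has the key."""
    for node in nodes:
        if node.is_responsible(key):
            value = node.get(key)
            if value is not None:
                return node.name, value
    return None


dex_node = PartialNode("dex-node", ["dex:"])
wallet = PartialNode("alice-wallet", ["alice:"])
nodes = [dex_node, wallet]

updates = {"dex:pool-1": 500, "alice:balance": 10, "bob:balance": 5}
for key, value in updates.items():
    for node in nodes:
        node.put(key, value)

print(lookup(nodes, "alice:balance"))  # ('alice-wallet', 10)
print(lookup(nodes, "bob:balance"))    # None
```

The last lookup returns `None` because no node in this toy network has taken responsibility for `bob:balance`, which is the open question the researchers raise about who serves state that nobody is motivated to keep.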
The Foundation appears to recognize that many questions remain unanswered: How large can the state grow before it becomes a barrier to participation? Who will end up holding the state once validators can validate without it? And how, and under what incentives, will they serve it to their clients?
It is clear that Ethereum faces an inherent dilemma: it needs to keep scaling in order to stay competitive, but that same process is leading it toward a storage crisis that compromises its ability to remain decentralized. How this conundrum is resolved could shape not only Ethereum's technological trajectory but also its original claim to relevance as a neutral, censorship-resistant network.