Solana zK Compression

Notes on zK Compression


Background

  • Trustless -> security assumptions of a full node. Anything a full node can't do alone involves additional trust assumptions

  • Solana state is stored on disks of full nodes in the AccountsDB

  • The unit for storage is an Account, which have addresses of 32 bytes each.

  • The maximum amount of data an account can store is 10MB

  • Storing 10MB costs 70 whgich can be 1x10MB or 1kx10KB-> Full storage rent would cost 70SOL or 1k accounts are 70SOL

  • Solana txs include a 32-byte "recent blockhash" (that must land within the most recent 150 blocks or be considered invalid and need to be re-signed and re-submitted)

Lifecycle of a solana tx

  • Age check (only recent are valid), deduplication, structural verification, fees and signature checks
  • Program bytecode loaded and SVM instantiated
  • All accounts specified are loaded (read) and passed to the SVM
  • Program is executed
  • Modified accounts synced back into storage in their updated state.

Here ZK-Compression motivations are:

  • On-chain is expensive.
  • More accounts means larger snapshots
  • Not all accounts are frequently accessed
  • In Solana, compute is cheaper than storage

Compression

  • Instead of readsing them when needed (step 3 in lifecycle of a tx), a tx could pass the account data as part of the payload, saving on-chain storage costs --> how can we enforce that users do not lie about the state?
  • An account with an off-chain value of 1200SOL can be passed in the payload but how would the chain know that is actually true? --> Merkle proofs
  • Merkle proofs are a way to prove that a certain value is part of a tree of values with a small on-chain storage footprint, thus small cost. This footprint is synced among all the nodes and can be used to verify the state of the chain and account specified on the tx --> Problem? Merkle proofs can be really large. So for really large amount of data (or accounts). ZK-Compression won't work. but there are nuances around this as parts of the account state can also be compressed and comitted, needing only to provide proof for partial data.
  • This data can be stored anywhere, as long as all the elements of the proog are stored somewhere, the proofs can be calculated.

ZK-Compression

  • Basic idea: It is a way to prove that a certain computation was carried out "correctly"

Based on Helius docs

A simple example is that you want to prove you multiply two numbers to get a 3rd number. i.e., 4*3 = 12

The “ZK way” to prove this is to have the following function:

f(x,y) = x*y

If you generate a circuit for the above code, the prover generates a proof of correct computation. The circuit itself is the commitment, so everyone knows what the "computation" you're running is. But the beautiful thing is that they don't need to know the inputs. once you run f(3,4), it returns 12 and the "proof." Now anyone can take 12, "proof," and verify that you multiplied SOME two numbers to get 12. They don't know if you did 4,3 or 6,2, or even 12,1. The fact that you hide those numbers and someone can still verify is where you get the “zero knowledge” part.
  • Once someone has the result and the proof, they can verify whether you did it correctly or not without have to actually run the computation. Being this true for any arbitray computation. \
  • The example above was for a multiplication of two numbers but could also be used for the verification of 10 signatures. -> You don't need to hide anything bc we turn a computation of 1k steps into a proof to know that the computation has been done correctly
  • Caveat: Proof generation takes time
  • ZK compression has a circuit that can take the account data and a proof and verify it is part of the commitment on the chain
  • Proof generation can be done off-chain, but the verification can only be done on-chain
  • From a developer point of view, they treat ZK-compressed accounts as any other account, so inside a program, it can be treated as any other account with the same properties

ZK rollups

  • Zk compression differentiates from zk rollups in the sense taht they do not store the result of the state after having applied 100 transactions, but rather each one of the 100 transactions containing one proof that simply tells you that the account data is correct but the state transitions are executed on chain as part of the SVM itself and once the proof is validated, it is treated as a regular account.

Nits

  • Zk-Compressed accounts cannot be concurrently modified. If two users try to update the same accoutn at the same time, one of them will fail. Also, for the next tx that zk-compressed account would have an invalid state, so it would need to use the new compressed state
  • Compression uses a really high CU which limits the max concurrency per tree

Trust assumptions

  • This approach convers the Byzantine fault tolerance as just one honest node is needed to provide data since the proofs are self-verifieable and there are not safety issue.

Links