ethereum basics

April 2022

As I’m going deeper on web3, I wrote these notes to codify my understanding of Ethereum.

They’re focused on the basics. Naturally, there’s lots of oversimplification and skimming over nuance because I think the broad concepts carry most of the freight. They assume some knowledge of Ethereum already.

I want these notes to be as evergreen as possible, so they’re light on examples.

Hope they’re helpful and DMs are open :)

High level

What we call Ethereum comprises three big pieces: a blockchain, a native currency asset, and blockspace.

  • Ethereum is a blockchain network

  • Ether is Ethereum’s native currency (an asset) for pricing computing power and incentivising users to keep the blockchain secure

  • Blockspace is the product that the blockchain “sells” - aka finite space for transactions

***

Blockchain

A blockchain is a ledger (database) of transactions stored on a decentralised network of computers. In other words, it’s a computer in the cloud running on a distributed network of physical computers. Transactions are stored in discrete blocks that are linked together in a chain.

Blockchains serve different purposes. Bitcoin was designed as a peer-to-peer payments network for moving around bitcoin (the asset). Ethereum is a general-purpose blockchain for executing different transactions (not just payment transfers) in virtually any asset (not just ether). Every blockchain has a set of rules called a protocol that governs it.

Blockchain performance is measured on two vectors: throughput (the number of transactions it can process per second) and latency (the time it takes for transactions to process).

Nodes are the decentralised computers that run an up to date copy of the Ethereum blockchain and agree (reach consensus) on transactions. Anyone can run a node permissionlessly without needing approval.

There are three different types of nodes:

  • Full nodes store the full blockchain data, act as validators, and provide proofs that previous transactions are valid

  • Light nodes only store header chains and need to request everything else

  • Archive nodes store everything that the full nodes have as well as an archive of historical states of the blockchain, so they can look back in time

The more nodes Ethereum has, the more decentralised it is. Decentralisation is defined in two ways:

  1. Low trust - nodes can independently compute the latest state of the blockchain and verify that all the blocks have followed the rules of the blockchain without needing to trust another party

  2. Low cost - there are low barriers to entry to becoming a node. If there are prohibitive barriers like expensive hardware or other resource requirements, users will rely on third parties, and the blockchain will trend towards centralisation. Cost is the product of a few things: bandwidth (downloading and broadcasting blockchain data), compute (running computations in scripts or smart contracts), and storage (storing transaction data and storing “state” to keep processing new blocks)

Clients are the software that nodes need to read the blockchain. There are multiple clients based on different programming languages. Ethereum is stronger when nodes are equally distributed across different clients (called client diversity) so the blockchain isn’t overly exposed to an attack or bug in any one client.

Blocks are batches of transactions that form the blockchain. Each block has a hard limit of 30 million gas and a target of 15 million gas because if blocks become too large, some nodes wouldn't be able to handle them. Each block contains a unique identifying cryptographic hash and the parent hash of the previous block to link them together. Changing any block modifies its hash, which can be detected in every subsequent block through the parent hash reference, preventing fraud.
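
To make the parent hash idea concrete, here's a minimal sketch of a hash-linked chain. It's a toy under stated assumptions: SHA-256 stands in for Ethereum's Keccak-256, blocks are plain dictionaries, and the function names (block_hash, build_chain, first_broken_link) are made up for illustration.

```python
import hashlib
import json
from typing import Optional


def block_hash(block: dict) -> str:
    """Hash a block's contents (SHA-256 is a stand-in for Keccak-256)."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches: list) -> list:
    """Link batches of transactions into blocks via parent hashes."""
    chain, parent = [], "0" * 64  # hypothetical all-zero parent for the first block
    for number, txs in enumerate(batches):
        block = {"number": number, "parent_hash": parent, "transactions": txs}
        chain.append(block)
        parent = block_hash(block)
    return chain


def first_broken_link(chain: list) -> Optional[int]:
    """Return the number of the first block whose parent hash does not match."""
    for prev, block in zip(chain, chain[1:]):
        if block["parent_hash"] != block_hash(prev):
            return block["number"]
    return None


chain = build_chain([["a->b: 1"], ["b->c: 2"], ["c->a: 3"]])
print(first_broken_link(chain))  # None: every link checks out

chain[0]["transactions"][0] = "a->b: 100"  # tamper with an old block
print(first_broken_link(chain))  # 1: the next block no longer links back
```

Editing block 0 changes its hash, so block 1's stored parent hash stops matching, which is exactly how tampering gets detected.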

State is the latest status of accounts and their balances on the blockchain. As transactions are processed, this gets constantly updated.
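
As a rough sketch, state can be pictured as a mapping from accounts to balances, with each transaction producing the next state. The apply_transfer function and the account names are hypothetical, and real Ethereum state holds far more than balances.

```python
def apply_transfer(state: dict, sender: str, recipient: str, amount: int) -> dict:
    """Return a new state after moving `amount` from sender to recipient."""
    if amount < 0 or state.get(sender, 0) < amount:
        raise ValueError("invalid transfer")
    new_state = dict(state)  # leave the previous state untouched
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


state = {"alice": 10, "bob": 0}
state = apply_transfer(state, "alice", "bob", 4)
print(state)  # {'alice': 6, 'bob': 4}
```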

The Ethereum Virtual Machine (EVM) is the computer in the cloud above the decentralised network of physical computers. The EVM holds the state of the blockchain, updating it after every transaction and disseminating it to the nodes.

***

How transactions get done

Accounts are units that complete transactions on Ethereum - receiving, holding, and sending ether (and other assets) and engaging smart contracts. Accounts can be externally owned (controlled by users with private keys) or contract accounts (smart contracts controlled by code). Externally owned accounts can initiate transactions, but contract accounts can only make transactions in response to transactions they receive.

Accounts have four fields:

  • A nonce value that shows the number of transactions sent for externally owned accounts or the number of contracts created for contract accounts

  • The balance of ether owned by the account

  • A codeHash of the EVM code for that account. The EVM code is executed when the account gets a message call. For externally owned accounts, the codeHash is the hash of an empty string

  • The storageRoot hash of the root node of a Merkle Patricia tree (structures for hashing chunks of data together by splitting them into buckets, taking the hash of each bucket and repeating until the total number of hashes becomes one single Merkle root hash) that encodes the storage contents of the account
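
The "hash the buckets, then hash the hashes" idea can be sketched with a plain binary Merkle tree. This is a simplification: Ethereum's Merkle Patricia tree is a more involved structure, SHA-256 stands in for Keccak-256, and duplicating the last hash on odd levels is just one common convention that I'm assuming here.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in for Keccak-256


def merkle_root(chunks: list) -> bytes:
    """Hash the chunks, then hash pairs of hashes until one root is left."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"slot0", b"slot1", b"slot2"])
changed = merkle_root([b"slot0", b"slot1", b"slot2!"])
print(root != changed)  # True: any change in the data changes the root
```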

Accounts are backed by public and private key cryptography. The public key (and the address derived from it) is outward-facing so that other people can find you and send you assets. If someone knows your address, they can identify your activity on the blockchain. The private key functions like a safe code to unlock the account. The public key is derived from the private key, never the other way round, so sharing it doesn’t expose the private key.

Accounts don’t hold assets themselves because the assets always live on the blockchain. They can only be accessed with the account’s private key.

Externally owned accounts carry user identity in web3 as they can be used to sign in to applications.

Wallets are wrappers around accounts that provide a user interface. Accounts are independent of wallets, so you can take your account and migrate it to another wallet any time you want. There are three types of wallets: (1) hot (online) hosted wallets managed by an exchange where they own the account keys, (2) hot non-custodial wallets where you own the keys, and (3) cold (offline) hardware wallets which aren’t connected to the internet and are the most secure.

Transactions are actions that update the state of Ethereum by changing the ledger. Transactions must be signed by the sender’s private key to prove their authenticity. There are two types of transactions: (1) regular transactions between wallets resulting in message calls (transfers), and (2) smart contract transactions (computations).

Gas is the fee for all transactions on Ethereum. Completing a transaction requires computing power, priced in ether. Gas is determined by the amount of computing power a transaction requires and the demand for blockspace. It’s a mechanism for allocating blockspace (users who pay it will have their transactions included in the blockchain) and incentivises miners/validators to validate transactions.

Gas is denominated in gwei. Each gwei equates to 0.000000001 ether or 1,000,000,000 wei (wei is the smallest unit of ether). Every block has a base fee reserve price, which is the minimum fee for adding a transaction to that block - so wallets can automatically calculate how much to spend on gas, compared to the old auction system where users had to submit gas bids.
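
Here's a small worked example of the unit arithmetic. The base fee and tip are hypothetical numbers I picked for illustration. The 21,000 gas figure is the cost of a plain ether transfer.

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def transaction_fee_wei(gas_used: int, base_fee_gwei: int, tip_gwei: int) -> int:
    """Total fee in wei: gas used times the per-gas price (base fee plus tip)."""
    return gas_used * (base_fee_gwei + tip_gwei) * WEI_PER_GWEI


# Hypothetical prices: 30 gwei base fee and a 2 gwei tip.
fee = transaction_fee_wei(gas_used=21_000, base_fee_gwei=30, tip_gwei=2)
print(fee)                  # 672000000000000 (wei)
print(fee / WEI_PER_ETHER)  # 0.000672 (ether)
```

Keeping amounts as integers in wei avoids rounding errors, with conversion to ether only for display.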

Consensus mechanisms enable decentralised nodes to agree trustlessly on the state of the blockchain. Consensus means at least 51% of nodes agree on the next state. Consensus mechanisms are a solution to the Byzantine Generals Problem: establishing an agreement in a distributed system with imperfect information flows and potential bad actors.

Consensus mechanisms include a Sybil resistance mechanism - an economic deterrent against a Sybil attack (where users forge identities to influence a network). In a blockchain Sybil attack, bad actors would pretend to be other users to reach the 51% threshold and control the network. Sybil resistance mechanisms reduce this risk through a forced commitment of resources.

Ethereum (like Bitcoin) currently uses Nakamoto Consensus, which is a combination of a proof of work Sybil resistance mechanism and the longest chain rule. Proof of work is based on committing computing (hashing) power. Bad actors wanting to attack the network would need to control 51% of the blockchain’s computing power which is prohibitively expensive.

Ethereum is transitioning to a proof of stake Sybil resistance mechanism in 2022. Proof of stake is based on committing capital. Ether holders will lock up their ether (minimum 32 individually or as part of a pool) to become validators and earn ether “dividends” in return for verifying transactions.

Bad actors would need to put up 51% of the total staked ether to control the network. Proof of stake is better than proof of work on multiple fronts: less energy-intensive, lower entry barriers because it doesn’t have the same computer hardware requirements, and more decentralising.

The process for executing transactions starts the same in proof of work and proof of stake:

  • A user initiates a transaction and signs it with their private key

  • The user broadcasts the transaction to the Ethereum network from a node

  • All nodes add the transaction request to their local mempool (a queue of transaction requests that haven’t been added to the blockchain yet)

When the nodes need to agree on updating the blockchain, the process diverges.

In proof of work:

  • Miners (computers that add and verify transactions on the blockchain) compete to solve a cryptographic puzzle to find the nonce for the next block

  • The first miner to solve the puzzle gets to verify the next batch of transactions, execute the transaction requests, post them in a block to the blockchain, and update their local copy of the EVM. Then, they produce a “certificate of legitimacy” for the block and broadcast the completed block (including the certificate and a checksum of their proposed new EVM state)

  • The other nodes verify the certificate, execute the transactions in the block, and validate that the checksum of their new EVM state is the same as the checksum the miner reported. Then, they add the block to their version of the blockchain and accept the updated EVM as the new canonical state. Finally, they remove the transactions in the block from their mempool

  • The first miner earns a block reward (freshly minted ether) and any tip portion of the gas fee

Proof of work uses a “longest chain” selection rule for determining which chain is “correct.” This says the longest blockchain is the right one because it’s received the most computational work, therefore other nodes will follow it. Proof of work combined with the longest chain rule is called Nakamoto Consensus.
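
The puzzle can be sketched as a brute-force search for a nonce. This is a toy, not Ethereum's actual algorithm: SHA-256 and a "leading zero hex digits" target are stand-ins I'm assuming for illustration, and the block data string is made up.

```python
import hashlib


def mine(block_data: str, difficulty: int) -> tuple:
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


def is_valid(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a solution takes a single hash, so other nodes verify cheaply."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = mine("parent=abc;txs=[a->b: 1]", difficulty=4)
print(nonce, digest)
print(is_valid("parent=abc;txs=[a->b: 1]", nonce, difficulty=4))  # True
```

Finding the nonce takes many attempts while checking it takes one, and that asymmetry is what makes the work provable.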

In December 2020, Ethereum deployed a separate Beacon Chain based on proof of stake. Since then, the Beacon Chain has been running in parallel to the main Ethereum chain (Mainnet). In 2022, the Beacon Chain will merge with Mainnet into a single chain (“The Merge”). Mainnet will do transaction execution, the Beacon Chain will run consensus, and proof of work will be switched off.

In proof of stake:

  • Every epoch has 32 fixed 12 second slots. Each epoch, the whole validator set gets evenly distributed into committees for every slot - so there are 32 equal sized committees, one for each slot. Each slot, one validator proposes a block and the other committee members vote on it. The winning validator is randomly chosen based on how much ether they have staked

  • Once the block is completely attested, a “crosslink” is created to confirm that the block is included in the Beacon Chain

  • The validator who proposed the block and the validators who attested it receive any tip portion of the gas fees

  • If validators propose or attest bad blocks, or go offline and don’t validate, they have their staked ether slashed (taken away)
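
The committee split can be sketched like this. It's an illustration only: Python's seeded shuffle is a stand-in for the protocol's own randomness, and the validator names and count are hypothetical.

```python
import random

SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12


def assign_committees(validators: list, seed: int) -> list:
    """Shuffle validators and deal them into one committee per slot."""
    shuffled = validators[:]
    random.Random(seed).shuffle(shuffled)  # stand-in for the protocol's randomness
    return [shuffled[slot::SLOTS_PER_EPOCH] for slot in range(SLOTS_PER_EPOCH)]


validators = [f"validator-{i}" for i in range(640)]
committees = assign_committees(validators, seed=1)
print(len(committees), len(committees[0]))  # 32 20
print(SLOTS_PER_EPOCH * SECONDS_PER_SLOT)   # 384 (seconds per epoch, or 6.4 minutes)
```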

Proof of stake uses LMD-GHOST as a fork choice rule, whereby nodes follow the “heaviest” chain with the most accumulated votes.

Finality is when transactions are part of a block that can't change. On Ethereum, transactions should reach finality after six blocks (or around one minute). Nakamoto consensus blockchains can only achieve probabilistic finality and eventual consensus. As more blocks get built on top of a block, it’s more probable that it’s final, and eventually all nodes agree on that. Proof of stake enables provable finality. The Beacon Chain uses the Casper FFG protocol (Casper the Friendly Finality Gadget) to expedite finality by getting validators to agree on the state of the block at set intervals, and if ⅔ of validators agree, the block is finalised. Validators lose their stake if they try changing a finalised block later.
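
The ⅔ threshold is simple enough to sketch. I'm assuming here that votes are weighted by staked ether, and the is_finalised name is made up for illustration.

```python
def is_finalised(votes_for: int, total_staked: int) -> bool:
    """True when at least two thirds of the stake backs the checkpoint."""
    # Integer arithmetic avoids floating-point rounding right at the threshold.
    return 3 * votes_for >= 2 * total_staked


print(is_finalised(66, 100))  # False
print(is_finalised(67, 100))  # True
```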

Maximal extractable value (MEV) is the maximum profit a miner or validator can make (above the standard block reward and tip portion of the gas fee) from including, excluding, or reordering transactions in blocks they create, essentially through arbitraging. Some MEV transactions are harmless, but others are at the expense of Ethereum users. Many “searchers” run algorithms on the blockchain to identify profitable MEV opportunities and have bots automatically submit those transactions to the network.

***

On the blockchain

Tokens are assets on the Ethereum blockchain. There are two kinds of tokens, which map to the categories of real-world assets: fungible and non-fungible.

Fungible tokens (technically, ERC20 tokens) represent interchangeable assets. Mostly, these are financial assets like stocks, bonds, and currencies because these assets are homogenous for efficiency. Ethereum is inseparable from its native ether token because it needs it to incentivise users to act in the interests of the network and punish bad actors.

One of Ethereum’s big innovations was enabling the creation of nonfungible tokens (“NFTs,” or ERC721 tokens) which represent unique assets like collectables, art and music.

Tokens enable digital property rights and make assets fully tradeable. Users can actually “own” assets in their accounts. The downstream effect of this is that assets can have digital scarcity for the first time. A finite number of an asset can be created and tracked on the blockchain.

As tokens sit on the blockchain, scarce assets can now be owned and exchanged on a trustless settlement layer.

Smart contracts are programmes that run automatically on the Ethereum blockchain. They’re often referred to as “protocols.” Since they’re coded, smart contracts are based on logic like “if this, then that” statements. Nick Szabo conceptualised smart contracts as a digital vending machine: certain inputs produce a predetermined output, which removes the need for intermediaries.
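
The vending machine analogy translates almost directly into code. This is a Python toy rather than a real contract (those are written in languages like Solidity), and the class and its fields are hypothetical.

```python
class VendingMachine:
    """Toy stand-in for a smart contract: fixed rules, no operator."""

    def __init__(self, price: int, stock: int) -> None:
        self.price = price
        self.stock = stock
        self.takings = 0

    def buy(self, payment: int) -> int:
        """If payment covers the price and stock remains, dispense one item and return the change."""
        if self.stock == 0:
            raise RuntimeError("sold out")
        if payment < self.price:
            raise ValueError("payment too low")
        self.stock -= 1
        self.takings += self.price
        return payment - self.price


machine = VendingMachine(price=3, stock=2)
print(machine.buy(5))  # 2 (the change)
print(machine.stock)   # 1
```

The same input always produces the same output, which is what lets every node run the logic and agree on the result.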

Smart contracts are open-sourced and composable. Users and decentralised applications (dapps) can permissionlessly leverage smart contracts for their use (called “forking” the code) or build on top of them.

The open-source smart contract libraries include reusable behaviours that can be added to contracts and implementations of Ethereum standards (called ERCs).

Anyone can write a smart contract and deploy it to Ethereum. It just needs to be written in a smart contract language (Solidity and Vyper are the most popular) and gas paid to deploy it. After the smart contract is written, it must be compiled to bytecode (low-level machine instructions called opcodes) so that the EVM can interpret and store it. Compiling the smart contract also produces the Application Binary Interface (ABI) which is needed for dapps to understand the contract and call its functions. Once a smart contract is on the blockchain, it can't be taken down.

Oracles enable smart contracts to incorporate real-world data. Smart contracts can't get information about real-world events because inputting external data could lead to nodes receiving different information (due to changes in the data or varying sources) and not reaching consensus.

Oracles plug off-chain, real-world, data (like event results or trading prices) into smart contracts and vice versa (they enable smart contracts to send data off-chain). We can think of them as APIs between Ethereum and the real world.

Oracles execute a transaction to update the data in the smart contract. This ensures the blockchain can keep consensus because any node replaying the transaction could use the same data and get the same output. But smart contracts can't verify the offline data that the oracle imports, so it’s a trusted system.

The “oracle problem” is that importing off-chain data through centralised oracles undermines decentralisation and creates single points of failure. Decentralised oracles can mitigate this by pulling data from multiple different sources to hedge the risk of any single oracle.
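
One way to picture the hedging is taking the median of several reports, so a single bad source can't drag the result far. This is a sketch under my own assumptions: the source names, prices, and min_sources parameter are hypothetical, and real oracle networks have their own aggregation rules.

```python
from statistics import median


def aggregate_price(reports: dict, min_sources: int = 3) -> float:
    """Combine reports from several sources by taking the median."""
    if len(reports) < min_sources:
        raise ValueError("not enough independent sources")
    return median(reports.values())


reports = {"source_a": 2001.0, "source_b": 1999.0, "source_c": 2000.0, "source_d": 9999.0}
print(aggregate_price(reports))  # 2000.5: the outlier barely moves the result
```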

Dapps are decentralised applications that run on the Ethereum network. 

All parts of web2 applications (backend database, backend code logic, and frontend) run on centralised servers.

With dapps, the backend is the Ethereum network. The Ethereum blockchain stores the dapp data, smart contracts define the dapp logic, and the EVM implements the smart contract logic and processes state changes. Dapps can run their frontend on centralised web servers (such as AWS) like web2 or decentralised storage systems. Users engage with dapps through web browsers, as in web2.

Dapps are open code and composable. Developers can borrow each other’s code as building blocks for their dapps. Users can freely port their data and content between dapps, unlike web2 applications which are typically walled gardens that prevent transfers between them - for example, users can't export their Twitter followers or YouTube videos.

Dapps must connect to an Ethereum node so that they can read blockchain data and broadcast transactions to the network. They can either run the nodes themselves or use client APIs. Client APIs make interacting directly with nodes easier and save engineering work.

***

Infrastructure

Networks are different environments for development, testing, and production on Ethereum. There are public networks and private networks.

Mainnet and Testnet are public networks. Mainnet is the production blockchain for transactions. Testnet is a testing environment for blockchain upgrades and smart contracts before release.

Private networks aren’t connected to Mainnet or Testnet. There are local private development networks for testing individual dapps and consortium networks where consensus is controlled by a pre-agreed group of trusted nodes.

Block explorers provide real-time access to on-chain data (for example, about blocks, transactions, and accounts). They provide an interface for developers to easily filter and summarise blockchain data. Etherscan is a block explorer for transaction data on Ethereum, and The Graph is a decentralised indexing protocol for querying Ethereum data with open APIs.

Decentralised storage systems are peer-to-peer networks of data storage based on breaking data into fragments and distributing it across multiple users’ drives. Interestingly, you don’t need all the fragments to reconstruct the data in its original form. Ethereum is a decentralised storage system but at a certain scale, nodes won’t be able to store all the data and deploying it to Mainnet will be too expensive, so it may need to outsource storage to another system.
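
To show how data can survive a missing fragment, here's a minimal sketch using XOR parity: two data fragments plus one parity fragment, where any two are enough. This is the simplest redundancy scheme I know of, used here as an assumption for illustration. Real storage networks use more sophisticated schemes.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes) -> dict:
    """Split data into two halves plus an XOR parity fragment."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length (a real scheme would record the true length)
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {"first": first, "second": second, "parity": xor_bytes(first, second)}


def recover(fragments: dict) -> bytes:
    """Rebuild the data from any two of the three fragments."""
    first = fragments.get("first")
    second = fragments.get("second")
    parity = fragments.get("parity")
    if first is None:
        first = xor_bytes(second, parity)
    if second is None:
        second = xor_bytes(first, parity)
    return first + second


fragments = split_with_parity(b"hello world!")
del fragments["first"]     # one storage provider goes offline
print(recover(fragments))  # b'hello world!'
```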

Filecoin is an example of a decentralised storage system. Storage providers compete for storage contracts from clients to earn revenue from their spare storage capacity. Storage providers have to prove they’re storing the data securely to earn Filecoin, and the Filecoin network verifies it through cryptographic proofs. When clients want their data back, they search for storage providers who may have it, choose the quickest or cheapest from that group, request to retrieve the data, and then pay the storage provider. If more clients request particular data, more storage providers will store it (supply will follow demand) - so data will get stored closer to users and quicker to retrieve.

***

Governance

Ethereum has an informal, off-chain governance process: decisions and changes to the network are coordinated off the blockchain, rather than being proposed through code updates and voted on on-chain.

Ethereum development standards are areas where the Ethereum community can decide to standardise parts of the network, like processes or features. Standardisation helps with interoperability and composability between different parts of the ecosystem. Ethereum Improvement Proposals (EIPs) are the mechanism for this. EIPs set out the technical specification for changes and are referred to as the source of truth by the community. Clients need to incorporate EIPs to stay in consensus with other clients on Mainnet.

EIPs are usually proposed by the core group of Ethereum developers (although in theory, anyone can propose them) and then they’ll mobilise consensus between other stakeholders (nodes, miners/validators, developers, and Ethereum users). There isn’t a specific support threshold to reach, just “consensus” in the community.

If consensus is reached, then the code is updated. If not, the stakeholders can hard fork the protocol, splitting it into two chains that exist in parallel: one with the proposed change and one without. In 2016, after the DAO Hack, Ethereum hard forked to return the stolen funds, and the chain that rejected the change carried on as Ethereum Classic.

Some EIPs relate to application-level standards and get introduced as Ethereum Requests for Comment (ERCs). For example, there are multiple ERCs for tokens: the ERC-20 standard for fungible tokens, the ERC-721 standard for NFTs, the ERC-777 standard for building extra functionality on top of tokens, and the ERC-1155 standard for single contracts that represent multiple tokens.

***

Scaling

The blockchain trilemma is that blockchains can have two of decentralisation, security, and scalability, but having all three is very difficult.

  • Decentralisation is the degree of concentration of blockchain nodes and the absence of trusted intermediaries - this is a spectrum, not binary

  • Security is how well the blockchain stands up to external attack, which usually means how difficult it would be for someone to control 51% of the network. Greater decentralisation makes blockchains more secure because there’s a larger surface area attackers need to penetrate

  • Scalability is a blockchain’s ability to handle a high volume of transactions and process them quickly

Since its creation, Ethereum has prioritised decentralisation and security, but this has come at the cost of scalability, leaving transactions slow and expensive.

The goal of scaling solutions is to increase transaction speed (quicker finality) and throughput (more transactions per second).

Scaling solutions can be on-chain (layer 1) or off-chain. Ethereum wants to use both. 

On-chain scaling means upgrading the Ethereum blockchain itself. 

Sharding is Ethereum’s leading on-chain scaling solution. Sharding means breaking up the Ethereum blockchain into new chains called shards (64 are planned).

At first, the shards will only store data and not execute code. It’s still being debated whether shards should execute code and process transactions with their own smart contracts and account balances. If it goes that way, each shard would provide transaction proofs to the Ethereum main chain, and shards would share data and transactions using cross-shard communication.

Sharding benefits decentralisation. Validators will only need to run the shard they’re validating, so potential validators won’t be excluded by not having extensive hardware to run the full main chain, which would be the case if Ethereum scaled only by expanding the main chain. The hardware requirements for running validator nodes will come down over time, and it should eventually be possible to run them from personal laptops and phones.

If shards execute transactions, this will take pressure off the main chain. Multiple blocks could be created simultaneously and validators would only have to process transactions on their shard.

Sharding relies on proof of stake so it will come after The Merge.

Off-chain scaling leaves the Ethereum blockchain untouched and bolts on scaling solutions around it.

There are two types of off-chain scaling solutions: layer 2s and new chains.

Layer 2s rely on Ethereum for their security.

Rollups are separate blockchains where transactions are executed, then compressed together into batches and submitted to the main Ethereum chain for consensus. In ZK rollups, each batch comes with a SNARK cryptographic proof (Succinct Non-Interactive Argument of Knowledge) showing the transactions were processed in an Ethereum-compliant way, so Ethereum only has to check the proof, not each transaction, to confirm their validity. Ethereum also only has to store the compressed batch transaction data.

Two kinds of rollups take different approaches to prove the validity of transactions to the main Ethereum chain:

  • Optimistic rollups assume the batched transactions are valid by default and only run fraud proofs if they’re challenged by a verifier node. After the party submitting the rollup (called the sequencer) posts the batch, there’s a dispute period where verifiers can publish a fraud-proof. When a verifier publishes a fraud-proof, the transaction gets rerun on the Ethereum main chain and compared to see if it gives the same state root as the one the sequencer had. Sequencers put up a bond, and if fraud is proven, it gets slashed and distributed to the verifiers. Once the dispute period expires, the transactions are final. Optimistic rollups run an EVM compatible virtual machine, so they can do anything that Ethereum can

  • Zero-knowledge (ZK) rollups include a zero-knowledge SNARK proof with every batch of transactions which shows the transactions are valid without needing to include all the transaction data, so even less information is added to Ethereum. Some ZK rollups aren’t EVM compatible, so it’s harder to use them for general purpose applications
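
The optimistic rollup dispute can be sketched as "re-run the batch and compare state roots." This is a toy under stated assumptions: the state root is just a hash of the balances (real rollups use Merkle roots), transactions are simple transfers, and all function names are hypothetical.

```python
import hashlib
import json


def state_root(state: dict) -> str:
    """Toy state root: a hash of the sorted balances."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def execute_batch(state: dict, batch: list) -> dict:
    """Apply (sender, recipient, amount) transfers in order, skipping invalid ones."""
    result = dict(state)
    for sender, recipient, amount in batch:
        if 0 < amount <= result.get(sender, 0):
            result[sender] -= amount
            result[recipient] = result.get(recipient, 0) + amount
    return result


def fraud_proven(pre_state: dict, batch: list, claimed_root: str) -> bool:
    """Re-run the batch and compare against the root the sequencer posted."""
    return state_root(execute_batch(pre_state, batch)) != claimed_root


pre = {"alice": 10, "bob": 0}
batch = [("alice", "bob", 4)]
honest = state_root({"alice": 6, "bob": 4})
dishonest = state_root({"alice": 6, "bob": 400})
print(fraud_proven(pre, batch, honest))     # False
print(fraud_proven(pre, batch, dishonest))  # True
```

In the optimistic case nobody runs this check unless a verifier raises a challenge, which is where the savings come from.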

Rollups are Ethereum’s favourite off-chain scaling solution because they offer the best mix of decentralisation, security, and scalability.

State channels are mechanisms whereby parties record transactions with each other under a multi-signature smart contract off-chain but settle on the main Ethereum chain once they’ve finished. The channel is opened through a deposit, and then transactions are recorded through signed messages, but they’re not settled (so no assets are sent or received) until they’ve finished all their transactions. When the parties are ready, they close the state channel and the total assets in the transactions are settled. Only the opening deposit and closing transaction are public, so while multiple transactions are done in the channel, only two are settled on the Ethereum main chain.
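
Here's a rough sketch of that lifecycle. Signatures and the multi-signature contract are left out, and the class, method, and party names are hypothetical.

```python
class StateChannel:
    """Toy two-party channel: many off-chain updates, two on-chain transactions."""

    def __init__(self, deposits: dict) -> None:
        self.balances = dict(deposits)  # on-chain transaction 1: the opening deposit
        self.offchain_updates = 0

    def pay(self, sender: str, recipient: str, amount: int) -> None:
        """Record a payment off-chain (signed messages omitted in this sketch)."""
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid payment")
        self.balances[sender] -= amount
        self.balances[recipient] += amount
        self.offchain_updates += 1

    def close(self) -> dict:
        """On-chain transaction 2: settle the final balances."""
        return dict(self.balances)


channel = StateChannel({"alice": 10, "bob": 10})
channel.pay("alice", "bob", 3)
channel.pay("bob", "alice", 1)
channel.pay("alice", "bob", 2)
print(channel.close())           # {'alice': 6, 'bob': 14}
print(channel.offchain_updates)  # 3
```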

New chains create their security separate from layer 1.

Sidechains are independent, EVM compatible blockchains linked to Ethereum through a two-way bridge, so assets can be moved between them. Transactions are executed on the sidechain and then posted in batches to the Ethereum main chain. Because they’re EVM compatible, sidechains can do everything Ethereum does.

Plasma chains are separate “child” blockchains which are smaller replications of the “parent” Ethereum main chain. They’re built through smart contracts and Merkle trees. The plasma chain provides a Merkle root hash of the transactions it executes to the main Ethereum chain and can submit fraud proofs to the main chain to challenge transactions.

Validium chains are similar to ZK rollups as they use zero-knowledge proofs, except data is kept off the main chain.

***

Ether

Ether is digital money and the native currency of the Ethereum network.

Money has three characteristics: (1) a store of value so people hold it for future use, (2) a medium of exchange in transactions, and (3) a unit of account. Ether embodies these because it has scarcity value from limited supply, it’s exchangeable for other assets (tokens) in the Ethereum ecosystem, and it’s payment for gas.

Unlike traditional fiat money, ether is decentralised as it operates independently of any coordinating institution (like a central bank), anyone can make transactions without an intermediary, and it’s highly fractionalised (divisible up to 18 decimal places).

Ether is the world’s first “triple point” asset. There are three asset superclasses: capital assets that produce a yield through cash flows, store of value assets that hold their value over time (money), and consumable assets that can be converted into another asset or have a one-time use. Ether is a capital asset because holders can earn a yield through staking, a store of value from supply scarcity and use as collateral in defi dapps, and a consumable asset used to pay gas for blockspace.

Monetary policy

Ethereum follows a minimum necessary issuance monetary policy, issuing the minimum amount of new ether needed to keep the network secure. Unlike Bitcoin, Ethereum doesn’t have a fixed supply because this would create a fixed security budget for rewarding miners/validators and threaten the network’s sustainability.

Minimum necessary issuance enables Ethereum to ensure a scarce supply of ether while always having enough to pay for security.

Overall, the net supply is driven by new and burned ether.

  • New issuance comes from block rewards paid to miners/validators for validating transactions. Block rewards have declined over time and are currently at 2 ether per block for successful blocks and up to 1.75 for uncle blocks (blocks created when two blocks are mined and broadcasted simultaneously, usually due to latency). Proof of stake will dramatically reduce new ether issuance compared to proof of work. As total staked ether increases, ether dividends (from new issuance) will decline, although the volume will remain positive

  • Burned ether was introduced by EIP-1559 in August 2021. Gas is split into a base fee driven by supply/demand for blockspace and an optional tip that users can pay to speed up their transaction (usually in periods of high demand). The base fee is burned (removed from circulation), while the tip goes to miners/validators (along with the block reward) as an incentive for processing transactions. Burning ether creates deflationary pressure and a virtuous cycle whereby more economic activity on Ethereum (therefore more burned ether) leads to greater ether scarcity and more value to ether holders

Other downward pressures on ether supply are staked ether (rather than selling their ether, stakers are incentivised to continue staking to earn more), ether locked in defi applications, and people losing access to their ether. If the volume of burned ether exceeds new issuance, ether supply will decrease.
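
The supply arithmetic is easy to sketch. All figures below are hypothetical numbers chosen to show the mechanics, not real network data, and the function names are made up.

```python
def split_fee(gas_used: int, base_fee_gwei: int, tip_gwei: int) -> tuple:
    """Return (burned, paid to the block producer) in gwei for one transaction."""
    return gas_used * base_fee_gwei, gas_used * tip_gwei


def net_supply_change(issuance: int, burned: int) -> int:
    """Positive means supply grew, negative means it shrank (in ether)."""
    return issuance - burned


print(split_fee(21_000, 30, 2))           # (630000, 42000)
print(net_supply_change(13_000, 15_000))  # -2000: more burned than issued
```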

As ether supply decreases over time, it could become “ultra-sound money.” “Sound” means money is stable and not susceptible to sudden appreciation or depreciation. Bitcoin’s fixed supply of 21 million makes it sound. The amount of new bitcoin issued to miners gets cut roughly every four years (through a process called the “halvening”), but more supply is being added until the final bitcoins are created around 2140. Ether’s supply, by contrast, could be declining, not just static, making it “ultra-sound money.”

This could be challenged, however, if transactions continue moving to off-chain scaling solutions (so fewer ether get burned) or stakers end up selling most of their earned ether (rather than reinvesting it in staking).

***

That’s a wrap! Thanks for reading.

For web3 newbies, there’s so much information out there. The challenge is separating the wheat from the chaff. Good curation can expedite the learning process.

In that vein, I want to suggest just a handful of other resources I’ve found super helpful for developing my understanding of Ethereum and that informed these notes.

Wagmi!

(Obviously, none of this is investment advice. DYODD.)