Intro to Web3 Security: Part 1

0xBlue

part 1 of 4 in a series on blockchain security

Intro

In this one, I’m going to cover:

the basics of the (ethereum) blockchain
the “tech stack” for interacting with ethereum stuff

^*Note: a lot of this information isn’t needed for ctfing, so feel free to skim over it.

The Basics

Let’s say you want to buy some BlueCoin, a very safe, very real, very hot new cryptocurrency. Here’s how it would look:

At a really high level:

1. You create your transaction(which can call code, buy nfts, transfer ether, etc)
2. That transaction is “propagated”(spread) throughout the ethereum network
3. Pseudo-randomly chosen “validators” on that network:
   a. Bundle transactions together in a “block”
   b. Stake eth(which they will lose if the block is malicious)
   c. And receive the fees paid for transactions in the block
4. And finally, the “block” is added to the “chain”

The 5th “step”(confirmation that your transaction went through) isn’t usually done by the blockchain, but by the website/extension/library you used.

The Specific Basics

side note: the Ethereum Foundation has a lot of great information at https://ethereum.org/en/developers/docs/intro-to-ethereum/

Long story short, people use their accounts to send transactions through nodes to interact with smart contracts^*. As a really poor analogy:

Your account is like your Gmail account: it has a “username”(account address) and a “password”(private key)
The transaction is like the email: it’s just another word for the data you want to send
The node is like the receiving Google server(s): it receives the transaction and broadcasts it with any related data to other nodes
The smart contract is like the backend logic on the server: it processes the data in your transaction

^*Not all transactions involve smart contracts - you can send ether(Ethereum’s cryptocurrency) without interacting with a contract.

Accounts

There are two types of accounts in Ethereum:

“Externally Owned Accounts”(EOAs): basically a “normal” user account managed with a keypair
Contract accounts: “smart contracts” that are managed by code, not keys

Both account types have a nonce(the number of transactions sent from an EOA, or the number of separate contracts made by a contract account) and a balance.

^*Note: I’m 99% sure the rest of this section is useless for a ctf, but I just included it because it’s interesting. Feel free to skip

EOA Accounts

The address for an EOA(which has a keypair) is not its public key, like I originally thought. The private key for an EOA is 32 randomly generated bytes(you can generate them any way you like). From that private key, the public key is derived using the secp256k1 elliptic curve, the same curve Ethereum uses for its ECDSA signatures. I have no idea how the math works, but it creates a public key from a private key. Then, the public key is hashed using the keccak256 hash. Finally, the actual address of the EOA is the last 20 bytes of that keccak256 hash.

In other words:

const secp = require("ethereum-cryptography/secp256k1");
const { keccak256 } = require("ethereum-cryptography/keccak");

(async () => {
    // PLAN: address = last_20_bytes(keccak256(ecdsa(privkey)))
    // NOTE: written against ethereum-cryptography v1.x, where getPublicKey() returns
    // the uncompressed key by default; as far as I know, v2 exports this differently
    
    // privkey from https://www.freecodecamp.org/news/how-to-create-an-ethereum-wallet-address-from-a-private-key-ae72b0eee27b/
    // ecdsa(privkey)
    const privateKey = "60cf347dbc59d31c1358c8e5cf5e45b822ab85b79cb32a9f3d98184779a9efc2";
    const publicKey = secp.getPublicKey(privateKey);

    // keccak(ecdsa(privkey))
    const hash = keccak256(publicKey.slice(1)); // ignore the first byte
    // first byte says whether the pubkey is uncompressed(0x04) or compressed(0x02 or 0x03)
    // https://www.ietf.org/rfc/rfc5480.html

    // last_20_bytes(keccak(ecdsa(privkey)))
    const last_20_bytes = hash.slice(-20);
  
    // display byte values in hex form
    // each byte must be zero-padded to 2 characters, or bytes below 0x10 lose a digit
    let hash_string = "";
    for (let i = 0; i < 20; i++) {
        hash_string += last_20_bytes[i].toString(16).padStart(2, "0");
    }

    console.log("0x" + hash_string);
})();
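
To see what that first byte is actually doing, here is a small sketch of just the private key to public key step. It uses Node's built-in crypto module instead of ethereum-cryptography. That swap is my own choice for illustration, and it stops before the hashing step because Node has no built-in keccak256.

```javascript
const crypto = require("crypto");

// same example private key as in the snippet above
const privateKeyHex = "60cf347dbc59d31c1358c8e5cf5e45b822ab85b79cb32a9f3d98184779a9efc2";

// createECDH is meant for key exchange, but it also derives the public key for us
const ecdh = crypto.createECDH("secp256k1");
ecdh.setPrivateKey(Buffer.from(privateKeyHex, "hex"));

const uncompressed = ecdh.getPublicKey("hex", "uncompressed"); // 0x04 || X || Y
const compressed = ecdh.getPublicKey("hex", "compressed");     // (0x02 or 0x03) || X

console.log(uncompressed.length / 2, "bytes, prefix", uncompressed.slice(0, 2));
console.log(compressed.length / 2, "bytes, prefix", compressed.slice(0, 2));

// both encodings carry the same X coordinate right after the prefix byte
const sameX = uncompressed.slice(2, 66) === compressed.slice(2, 66);
console.log("same X coordinate:", sameX);
```

The address is computed from the 64 bytes of X and Y only, which is why the prefix byte gets sliced off before hashing.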

Contract Accounts

The address generation for smart contracts is a bit easier, since smart contracts don’t have keys. Instead, a contract’s address is derived from the address of the account that created it and that account’s nonce at the time. The address and nonce are packed together using Ethereum’s recursive-length-prefix(RLP) encoding. Just like EOA addresses, the address for a smart contract is the last 20 bytes of the keccak256 hash of that data.

last_20_bytes(keccak(RLP(address, nonce)))

const { getContractAddress } = require("@ethersproject/address");
const { keccak256 } = require("@ethersproject/keccak256");
const { arrayify, stripZeros } = require("@ethersproject/bytes");
const RLP = require("@ethersproject/rlp");
const { BigNumber } = require("@ethersproject/bignumber");

(async () => {
    // PLAN: address = last_20_bytes(keccak256(RLP(address, nonce)))
    
    // example address and nonce from cryptokitties txn - https://etherscan.io/tx/0xc6edb503f920816d81ff0de096ec4ec5f4564d17375bd92435a4ee768ca56dff
    const from = "0x09191d18729da57a83a9afc8ace0c8d7d104e118";

    // ethersjs has a prebuilt function to do this, getContractAddress()
    // (it returns the address in mixed-case checksum form)
    const output = getContractAddress({from: from, nonce: "18747"});
    console.log(output);

    // to do it manually:
    // the nonce is encoded as big-endian bytes with any leading zeros stripped
    const nonce = stripZeros(arrayify(BigNumber.from(18747).toHexString()));
    const hash = keccak256(RLP.encode([from, nonce])); // "0x"-prefixed hex string
    const output2 = hash.slice(-40); // 20 bytes = 40 hex characters("ff" = "f" + "f")
    console.log("0x" + output2); // same address as above, but all lowercase
})();
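
If you're curious what the RLP step produces, here is a minimal hand-rolled sketch that needs no libraries. The helper names(rlpEncodeBytes, rlpEncodeList, intToMinimalBytes) are made up for this example, and it only handles items and lists shorter than 56 bytes, which is enough for an address and a nonce. It is not a full RLP implementation.

```javascript
// RLP rule for short byte strings: a single byte below 0x80 encodes as itself,
// anything else under 56 bytes gets a (0x80 + length) prefix
function rlpEncodeBytes(bytes) {
    if (bytes.length === 1 && bytes[0] < 0x80) return Buffer.from(bytes);
    if (bytes.length >= 56) throw new Error("long strings are not handled in this sketch");
    return Buffer.concat([Buffer.from([0x80 + bytes.length]), Buffer.from(bytes)]);
}

// RLP rule for short lists: (0xc0 + payload length), then the encoded items
function rlpEncodeList(items) {
    const payload = Buffer.concat(items.map(rlpEncodeBytes));
    if (payload.length >= 56) throw new Error("long lists are not handled in this sketch");
    return Buffer.concat([Buffer.from([0xc0 + payload.length]), payload]);
}

// integer -> big-endian bytes with no leading zeros(0 becomes the empty byte string)
function intToMinimalBytes(n) {
    let hex = BigInt(n).toString(16);
    if (hex === "0") return Buffer.alloc(0);
    if (hex.length % 2) hex = "0" + hex;
    return Buffer.from(hex, "hex");
}

// same creator address and nonce as the cryptokitties example above
const from = Buffer.from("09191d18729da57a83a9afc8ace0c8d7d104e118", "hex");
const nonce = intToMinimalBytes(18747); // 0x493b

const encoded = rlpEncodeList([from, nonce]).toString("hex");
console.log(encoded);
// d8 = list with a 24-byte payload, 94 = 20-byte string(the address), 82 = 2-byte string(the nonce)
```

Hashing those bytes with keccak256 and keeping the last 20 bytes of the result gives the contract address.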

Transactions

Transactions in Ethereum are like packets in the networking world. Transactions store data about the change made to the blockchain state, and also prove who they came from. In addition, since transactions change the blockchain, validators are paid an amount of ether tied to the computing power needed to make that change(measured in gas).
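
As a rough sketch of the fee arithmetic: the fee is the gas used multiplied by the price paid per unit of gas. The helper names and the 20 gwei price below are made up for illustration. The 21000 figure is the gas cost of a plain ether transfer.

```javascript
// BigInt keeps the math exact, since amounts in wei quickly outgrow regular JS numbers
const WEI_PER_GWEI = 10n ** 9n;
const WEI_PER_ETHER = 10n ** 18n;

// hypothetical helper: fee in wei = gas used * price per unit of gas
function feeInWei(gasUsed, gasPriceWei) {
    return BigInt(gasUsed) * BigInt(gasPriceWei);
}

// hypothetical helper: format a wei amount as a decimal ether string
function formatEther(wei) {
    const whole = wei / WEI_PER_ETHER;
    const fraction = (wei % WEI_PER_ETHER).toString().padStart(18, "0").replace(/0+$/, "");
    return fraction ? `${whole}.${fraction}` : `${whole}`;
}

// a plain ether transfer uses 21000 gas; 20 gwei is an assumed example price
const fee = feeInWei(21000, 20n * WEI_PER_GWEI);
console.log(fee.toString(), "wei");     // 420000000000000 wei
console.log(formatEther(fee), "ether"); // 0.00042 ether
```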

There’s a lot of writing already on the different parts of a transaction, so I’ll just recommend this one at the official Ethereum docs. The main parts of a transaction, though, are:

the to address: the account or contract the transaction is sent to; for contract creation, this is left empty
the nonce: the number of transactions sent by the account before this one
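the value: the amount of ether sent, denominated in wei(not including any gas-related fees)
the gasPrice: the amount the sender will pay per unit of gas, measured in wei(the smallest denomination of ether); newer EIP-1559 transactions replace it with maxFeePerGas and maxPriorityFeePerGas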
the gasLimit: the max gas that can be used for the transaction
the data: a collection of any data you want
- for contract creation, this contains code that will generate the contract code to be deployed(example: if the contract code is x, the data field would be something like return x)
- for calling smart contracts, it will contain the function selector(the first 4 bytes of the keccak256 hash of the function signature) followed by the encoded arguments to the function
the v, r, and s values: numbers related to ECDSA cryptography that prove the transaction came from a specific address
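
To make the data field less abstract, here is a minimal sketch of the data for a call to an ERC-20 style transfer(address,uint256) function. The selector a9059cbb is the well-known first 4 bytes of keccak256("transfer(address,uint256)"). The helper names, recipient, and amount are made up for this example, and the sketch only covers fixed-size argument types, where each argument is left-padded to 32 bytes.

```javascript
// well-known selector: first 4 bytes of keccak256("transfer(address,uint256)")
const TRANSFER_SELECTOR = "a9059cbb";

// hypothetical helper: left-pad a hex value to one 32-byte word
function toWord(hex) {
    const clean = hex.replace(/^0x/, "").toLowerCase();
    if (clean.length > 64) throw new Error("value does not fit in 32 bytes");
    return clean.padStart(64, "0");
}

// hypothetical helper: selector, then one padded word per argument
function encodeTransfer(to, amount) {
    return "0x" + TRANSFER_SELECTOR + toWord(to) + toWord(BigInt(amount).toString(16));
}

// made-up call: send 1000 token units to the example address used earlier
const data = encodeTransfer("0x09191d18729da57a83a9afc8ace0c8d7d104e118", 1000n);
console.log(data);
console.log((data.length - 2) / 2, "bytes"); // 4 + 32 + 32 = 68 bytes
```

In practice a library like ethers.js builds this for you from the ABI, so this is mostly useful for reading raw transaction data in a block explorer.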

Eventually transactions are grouped together into blocks and added to the chain. To protect against bad actors, those blocks have to be checked for malicious/incorrect data. The way blocks are checked/made, and the names for the computers that check/make them, changed with The Merge:

Before The Merge:

Ethereum used proof-of-work: the chance of being selected to add a block to the chain(and receiving the reward for doing it) was tied to your processing power
Computers that check and make these blocks were called “miners”

Now(after The Merge):

Ethereum uses proof-of-stake: the chance of being selected to add a block to the chain is tied to the amount of eth you have deposited(staked)
Computers that check and make these blocks are called “validators”(and blocks are now “proposed,” not mined)

Nodes

The Ethereum network is made up of nodes(basically computers) that have direct access to the Ethereum network and run Ethereum-specific software(called clients). After The Merge, there are 2 types of clients:

Execution clients, which run transactions and store related data using their own copies of the Ethereum Virtual Machine
- The most popular is Geth(go-ethereum)
Consensus clients, which use proof-of-stake “consensus mechanisms”(I don’t know what these are either) to agree on the state of the blockchain
- The most popular is Lighthouse

Because running your own node is usually time-/resource-intensive, there are node providers like Infura(which is used by MetaMask) and Alchemy. If you ever need to interact with an Ethereum network, use a provider.
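
Under the hood, talking to a node(your own or a provider’s) means sending JSON-RPC requests over HTTP. Here is a minimal sketch of what that looks like without any library. The helper names are made up, rpcUrl is a placeholder for whatever endpoint your provider gives you, and the global fetch assumes Node 18 or newer. eth_blockNumber and eth_getBalance are standard Ethereum JSON-RPC methods.

```javascript
// hypothetical helper: build a JSON-RPC 2.0 request body
function buildRpcRequest(method, params = [], id = 1) {
    return { jsonrpc: "2.0", id, method, params };
}

// numbers come back from the node as "0x"-prefixed hex strings
function parseQuantity(hex) {
    return BigInt(hex);
}

// rpcUrl is a placeholder: use the endpoint from your provider or local node
async function getBlockNumber(rpcUrl) {
    const response = await fetch(rpcUrl, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(buildRpcRequest("eth_blockNumber")),
    });
    const { result } = await response.json();
    return parseQuantity(result);
}

// what a balance lookup looks like on the wire(nothing is sent here)
const balanceRequest = buildRpcRequest("eth_getBalance", [
    "0x09191d18729da57a83a9afc8ace0c8d7d104e118",
    "latest",
]);
console.log(JSON.stringify(balanceRequest));
console.log(parseQuantity("0x10")); // 16n
```

Libraries like ethers.js wrap exactly this kind of request, so you rarely write it by hand, but it helps to know what is actually being sent.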

Smart Contracts

A smart contract is just code that lives on the blockchain. 99% of the time, they’re written in Solidity(although they can also be written in Vyper). Solidity reads a lot like a combination of C and JavaScript.

When it comes to smart contracts, there’s basically 3 things to do:

Write code in Solidity(or Vyper)
Compile that code to EVM(Ethereum Virtual Machine) bytecode, and deploy it to the blockchain
Interact with that contract using its ABI(Application Binary Interface), which is mostly a collection of its functions and the argument types they take
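
To give a feel for what an ABI looks like, here is a sketch of a single ABI entry in the JSON format the Solidity compiler outputs, plus a made-up helper that turns it into the function signature string. The transfer function is just an example, and the helper ignores struct(tuple) arguments.

```javascript
// one entry from a contract's ABI(a full ABI is an array of these)
const transferAbi = {
    type: "function",
    name: "transfer",
    stateMutability: "nonpayable",
    inputs: [
        { name: "to", type: "address" },
        { name: "amount", type: "uint256" },
    ],
    outputs: [{ name: "", type: "bool" }],
};

// hypothetical helper: the signature is the function name plus its argument types,
// comma-separated, with no spaces and no argument names
function functionSignature(entry) {
    return `${entry.name}(${entry.inputs.map((input) => input.type).join(",")})`;
}

console.log(functionSignature(transferAbi)); // transfer(address,uint256)
```

The first 4 bytes of the keccak256 hash of that signature string are what identify the function when you call the contract.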

Most chall writeups will probably revolve around steps 1 & 3(old Solidity bugs/vulnerabilities & insecurely written functions), but some challs involve low-level aspects of Ethereum. One really interesting example is the zero-day discovered by samczsun.

For more info on Solidity, check out the Ethereum docs at https://ethereum.org/en/developers/docs/smart-contracts/languages/#important-links.

The Tech Stack

Luckily, the tech stack for working with Ethereum challs is pretty simple. You just need:

A way to compile the Solidity(or Vyper) code you wrote
A network to deploy that compiled code to
A way to interact with the deployed code

I personally use:

solc-static - a static commandline binary
ganache-ui - a local blockchain simulator from trufflesuite
ethers.js - a JS library
- There’s also web3.py and web3.js, I just use ethers because I started with it.

I’m going to cover this in the future, but if you want to start now, check out my barebones setup I used for ctf: https://github.com/0xBlue2/js-setup

^ This was before I learned about solc, so I used files from etherscan

Wrapup

That’s it! This was my first “real” blog post, so lmk about any technical details I missed/fudged, and any writing tips you have. I’m on discord at 0xBlue#8985. GLHF! 🖖