Exploring blocks, gas and transactions

A focus on the recent high gas prices, towards understanding high congestion regimes for EIP 1559.

Barnabé Monnot (https://twitter.com/barnabemonnot), Robust Incentives Group, Ethereum Foundation (https://github.com/ethereum/rig)
2020-06-03

While real-world gallons of oil went negative, Ethereum gas prices have stayed high since the beginning of May. I wanted to dig a bit deeper, with a view to understanding the fundamentals of the demand. Some of the charts below retrace steps that are well known to a lot of us; these are mere restatements and updates. The data includes all blocks produced between May 4th, 2020, 13:22:16 UTC and May 19th, 2020, 19:57:17 UTC.

Onur Solmaz from Casper Labs wrote a very nice post arguing that since we observe daily cycles, there must be something more than one-off ICOs and Ponzis at play.

We will see these cycles here too, along with a few more questions I thought were interesting (or at least, ones I kinda knew the answer to but had never derived or played with myself). This is an excuse to play with my new Dappnode full node, using the wonderful ethereum-etl package from Evgeny Medvedev to extract transaction and block details. This data will also be useful to calibrate good simulations for EIP 1559 (more on this soon!).

Block properties

Gas used by a block

Miners have some control over the gas limit of a block, but how much gas do blocks generally use?

There are a few peaks, notably at 0 (the amount of gas used by an empty block) and towards the maximum gas limit set at 10,000,000. Let’s zoom in on blocks that use more than 9,800,000 gas.

We can also look at the proportion of gas used, i.e., the amount of gas used by the block divided by the total gas available in that block. Taking a moving average over the last 500 blocks, we obtain the following plot.
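As a rough sketch of this computation (in Python, standard library only, on made-up block data): the field names `gas_used` and `gas_limit` mirror the ethereum-etl block columns, while the helper names `proportion_used` and `moving_average` are hypothetical and not taken from the original analysis.

```python
from collections import deque


def proportion_used(gas_used, gas_limit):
    """Fraction of the block's available gas that was actually consumed."""
    return gas_used / gas_limit


def moving_average(values, window):
    """Trailing moving average: mean of the last `window` values seen so far."""
    buf = deque(maxlen=window)
    running_sum = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            running_sum -= buf[0]  # oldest value, about to be evicted
        buf.append(v)
        running_sum += v
        out.append(running_sum / len(buf))
    return out


# Toy blocks (made-up numbers), each with a 10,000,000 gas limit.
blocks = [
    {"gas_used": 9_990_000, "gas_limit": 10_000_000},
    {"gas_used": 0, "gas_limit": 10_000_000},  # empty block
    {"gas_used": 5_000_000, "gas_limit": 10_000_000},
    {"gas_used": 10_000_000, "gas_limit": 10_000_000},
]
props = [proportion_used(b["gas_used"], b["gas_limit"]) for b in blocks]
smoothed = moving_average(props, window=2)  # the post uses window=500
```

The first `window - 1` outputs average over fewer blocks than the full window; whether the original plot drops or keeps that warm-up stretch is not stated, so this is only one possible choice.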

Where does the dip on May 15th come from? Empty blocks?

It doesn’t seem so.

Relationship between block size and gas used

Does the block weight (in gas) roughly correlate with the block size (in bytes)?


        Pearson's product-moment correlation

    data:  blocks$gas_used and blocks$size
    t = 210.3522, df = 98398, p-value < 0.00000000000000022204
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.5526261983 0.5612463179
    sample estimates:
             cor 
    0.5569512568 

It does, moderately (r ≈ 0.56)! But since most blocks have very high gas_used anyway, it pays to look a bit more closely.
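To see where the numbers in the test output come from, here is a small Python sketch (standard library only). It computes a Pearson correlation from scratch on toy data, then rebuilds the t statistic and the 95% confidence interval from the reported `cor` and `df`. The helper names are hypothetical. The interval uses the Fisher z-transformation, which is the usual approach for this test, and the result should be read as a sanity check rather than a reproduction of the original code.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, df):
    """t statistic for H0: correlation = 0, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def fisher_ci(r, df, level=0.95):
    """Approximate confidence interval via the Fisher z-transformation."""
    n = df + 2
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Summary values copied from the output above.
r, df = 0.5569512568, 98398
t = t_statistic(r, df)              # close to the reported 210.3522
ci_low, ci_high = fisher_ci(r, df)  # close to the reported interval

# Toy data (made up): perfectly linear, so the correlation is 1.
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

With roughly 98,400 blocks, even a moderate correlation yields a huge t statistic, which is why the p-value is effectively zero.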

We use a logarithmic scale for the y-axis. There is definitely a big spread around the 10 million gas limit. Does the block size correlate with the number of transactions instead then?

A transaction has a minimum size, if only to include things like the sender and receiver addresses and the other necessary fields. This is why we pretty much only observe values above some diagonal. The largest blocks (in bytes) are not the ones with the most transactions.

Gas prices

Distribution of gas prices

First, some descriptive stats for the distribution of gas prices.

Quantile  Gas price (Gwei)
0.00      0
0.25      14
0.50      20
0.75      31
1.00      60440

75% of transactions post a gas price less than or equal to 31 Gwei! Plotting the distribution of gas prices under 2000 Gwei:

The y-axis is in logarithmic scale. Notice these curious, regular peaks? Turns out people love round numbers (or their wallets do). Let’s dig into this.

Do users like default prices?

How do users set their gas prices? We can hypothesize that most rely on some oracle (e.g., Eth Gas Station, or its values appearing as Metamask defaults). We show next the 50 most frequent gas prices (in Gwei) and their frequency among included transactions.

Clearly round numbers dominate here!
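A frequency table like this one can be sketched in a few lines of Python (standard library only, made-up prices). The sketch assumes gas prices arrive in wei, as in ethereum-etl transaction exports. The helper names and the definition of "round" as a whole number of Gwei are illustrative choices, not the original method.

```python
from collections import Counter

WEI_PER_GWEI = 10**9


def top_gas_prices(gas_prices_wei, k):
    """Return the k most frequent gas prices as (price_in_gwei, share) pairs."""
    counts = Counter(p / WEI_PER_GWEI for p in gas_prices_wei)
    total = sum(counts.values())
    return [(price, n / total) for price, n in counts.most_common(k)]


def round_share(gas_prices_wei):
    """Share of transactions whose gas price is a whole number of Gwei."""
    whole = sum(1 for p in gas_prices_wei if p % WEI_PER_GWEI == 0)
    return whole / len(gas_prices_wei)


# Made-up sample: three txs at 20 Gwei, two at 10 Gwei, one at 10.1 Gwei.
sample = [20 * WEI_PER_GWEI] * 3 + [10 * WEI_PER_GWEI] * 2 + [10_100_000_000]
top2 = top_gas_prices(sample, k=2)  # the post looks at the top 50
share = round_share(sample)
```

Checking roundness on the integer wei value avoids floating-point surprises that could appear when testing whether a converted Gwei float is whole.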

Evolution of gas prices

I wanted to see how the gas prices evolve over time. To compute the average gas price in a block, I do a weighted mean using gas_used as weight. I then compute the average gas price over 100 blocks by doing another weighted mean using the total gas used in the blocks.
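The two-stage weighted mean described here can be sketched as follows (Python, standard library only, made-up transactions). The dictionary keys `gas_price` and `gas_used` and all function names are assumptions for illustration. Skipping empty blocks when averaging over a window is also an assumption, since an empty block has no defined average price.

```python
def weighted_mean(values, weights):
    """Weighted mean; returns None when the total weight is zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def block_average(transactions):
    """Gas-weighted average gas price of one block.

    Returns (average_price, total_gas_used); the average is None for an
    empty block.
    """
    prices = [tx["gas_price"] for tx in transactions]
    gas = [tx["gas_used"] for tx in transactions]
    return weighted_mean(prices, gas), sum(gas)


def window_average(block_stats):
    """Gas-weighted average over (average_price, total_gas_used) pairs."""
    non_empty = [(p, g) for p, g in block_stats if p is not None]
    return weighted_mean([p for p, _ in non_empty], [g for _, g in non_empty])


# Made-up blocks; gas prices in Gwei.
block_a = [{"gas_price": 20, "gas_used": 21_000},
           {"gas_price": 40, "gas_used": 63_000}]
block_b = [{"gas_price": 10, "gas_used": 21_000}]
block_c = []  # empty block

stats = [block_average(b) for b in (block_a, block_b, block_c)]
avg_over_window = window_average(stats)  # the post uses 100-block windows
```

Because each block's average is weighted by that block's total gas, the window average equals the gas-weighted mean over all transactions in the window taken at once.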

We see a daily seasonality, with peaks and troughs corresponding to high-congestion and low-congestion hours of the day. How does this jibe with the other series we saw before? We now average over 200 blocks and compare with the series of the proportion of gas used per block.

Blocks massively unused right after a price peak? The mystery deepens.

Timestamp difference between blocks

How much time elapses between two consecutive blocks? Miners are responsible for setting the timestamp, so it’s not a perfectly objective value, but good enough!

We can do a simple difference-in-means test to check whether the average gas price of late blocks (timestamp difference greater than 20 seconds) differs significantly from that of early blocks (less than 20 seconds).

late_block  avg_gas_price
FALSE       23.47139866
TRUE        26.62337871
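The split into late and early blocks, and one possible difference-in-means statistic, can be sketched like this (Python, standard library only, made-up data). The text does not say which test was used, so Welch's t statistic is an assumption here, as are the helper names and the choice to count a difference of exactly 20 seconds as early.

```python
import math
from statistics import mean, variance

LATE_THRESHOLD = 20  # seconds, as in the text


def timestamp_diffs(timestamps):
    """Seconds elapsed between each block and its predecessor."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def welch_t(sample_a, sample_b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / std_err


# Made-up block timestamps (seconds) and average gas prices (Gwei).
timestamps = [0, 10, 35, 45, 70, 82, 110]
avg_prices = [20, 30, 22, 28, 21, 27]  # one per block after the first

diffs = timestamp_diffs(timestamps)
late = [p for d, p in zip(diffs, avg_prices) if d > LATE_THRESHOLD]
# Assumption: a difference of exactly 20 seconds counts as early.
early = [p for d, p in zip(diffs, avg_prices) if d <= LATE_THRESHOLD]
t = welch_t(late, early)
```

A large positive `t` would indicate that late blocks carry a higher average gas price than early ones, in line with the direction of the two means in the table.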