We introduce here the building blocks of agent-based simulations of EIP1559. This follows an earlier notebook that merely looked at the dynamics of the EIP 1559 mechanism. In the present notebook, agents decide on transactions based on the current basefee and form their transactions based on internal evaluations of their values and costs.
Huberman et al., 2019 introduced such a model and framework for the Bitcoin payment system. We adapt it here to study the dynamics of the basefee.
All the code is available in this repo, with some preliminary documentation here. You can also download the abm1559 package from PyPi and reproduce all the analysis here yourself!
We have several entities. Users come in randomly (following a Poisson process) and create and send transactions. The transactions are received by a transaction pool, from which the $x$ best valid transactions are included in a block created at fixed intervals. $x$ depends on how many valid transactions exist in the pool (e.g., how many post a gas price exceeding the prevailing basefee in the 1559 paradigm) and on the block gas limit. Once transactions are included in a block, and the block is included in the chain, they are removed from the transaction pool.
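The pool's selection rule can be sketched as follows. This is a simplified stand-in for the package's TxPool.select_transactions, with illustrative names and dict-based transactions rather than the package's own classes:

```python
def select_transactions(pool, basefee, gas_limit=20_000_000, gas_per_tx=21_000):
    # Keep only transactions able to pay the prevailing basefee...
    valid = [tx for tx in pool if tx["max_fee"] >= basefee]
    # ...order them by the tip they would pay the miner...
    valid.sort(key=lambda tx: min(tx["max_fee"] - basefee, tx["gas_premium"]),
               reverse=True)
    # ...and fill the block up to the gas limit
    return valid[:gas_limit // gas_per_tx]

pool = [
    {"max_fee": 15, "gas_premium": 1},
    {"max_fee": 8,  "gas_premium": 1},  # below basefee: stays in the pool
    {"max_fee": 12, "gas_premium": 2},
]
print(select_transactions(pool, basefee=10))
```

Ordering by min(max_fee - basefee, gas_premium) reflects that a miner receives at most the posted premium, capped by whatever room the user's max fee leaves above basefee.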
How do users set their parameters? Users have their own internal ways of evaluating their costs. Users obtain a certain value from having their transaction included, which we call $v$. $v$ is different for every user. This value is fixed but their overall payoff decreases the longer they wait to be included. Some users have higher time preferences than others, and their payoff decreases faster than others the longer they wait. Put together, we have the following:
$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting} - \texttt{transaction fee} $$

Users expect to wait for a certain amount of time. In this essay, we set this to a fixed value -- somewhat arbitrarily, we choose 5 blocks. This can be understood in the following way: users estimate what their payoff will be from getting included 5 blocks from now, assuming basefee remains constant. If this payoff is negative, they decide not to send the transaction to the pool (in queueing terminology, they balk). We'll play with this assumption later.
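In code, the balking rule sketches as follows (function names and the 5-block default mirror the description above, not the package's exact API):

```python
def expected_payoff(value, cost_per_block, gas_price, expected_wait=5):
    # payoff = value - cost from waiting - transaction fee,
    # all per unit of gas, in Gwei
    return value - expected_wait * cost_per_block - gas_price

def sends_transaction(value, cost_per_block, gas_price, expected_wait=5):
    # The user balks if their expected payoff is negative
    return expected_payoff(value, cost_per_block, gas_price, expected_wait) >= 0

# A patient, high-value user sends...
print(sends_transaction(value=15, cost_per_block=0.5, gas_price=10))  # True
# ...while an impatient, lower-value user balks
print(sends_transaction(value=11, cost_per_block=0.9, gas_price=10))  # False
```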
The scenario is set up this way to study stationarity: assuming some demand comes in from a fixed distribution at regular intervals, we must expect basefee to reach some stationary value and stay there. It is then reasonable for users, at this stationary point, to consider that 5 blocks from now basefee will still be at the same level. In the nonstationary case, when for instance a systemic change in the demand happens (e.g., the rate of Poisson arrivals increases), a user may want to hedge their bets by estimating their future payoffs in a different way, taking into account that basefee might increase instead. This strategy would probably be a good idea during the transition phase, when basefee shifts from one stationary point to a new one.
We make the assumption here that users choose their 1559 parameters based on their value alone. We set the transaction max_fee parameter to the value of the user and set the gas_premium parameter to a residual value -- 1 Gwei per unit of gas.
There is no loss of generality in assuming all users send the same transaction in (e.g., a simple transfer), and so all transactions have the same gas_used value (21,000). In the 1559 paradigm, with a 20M gas limit per block, this allows at most 952 transactions to be included per block, although the mechanism targets half of that, around 475 here. The protocol adjusts the basefee to apply economic pressure, towards a target gas usage of 10M per block.
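These counts follow from simple arithmetic on the block parameters:

```python
gas_limit = 20_000_000       # block gas limit
gas_target = gas_limit // 2  # 1559 targets half-full blocks
gas_per_tx = 21_000          # gas used by a simple transfer

print(gas_limit // gas_per_tx)   # 952 transactions at most per block
print(gas_target // gas_per_tx)  # 476 transactions at the target
```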
We import a few classes from our abm1559 package.
%config InlineBackend.figure_format = 'svg'
import os, sys
sys.path.insert(1, os.path.realpath(os.path.pardir))
# You may remove the two lines above if you have installed abm1559 from pypi
from abm1559.utils import constants
from abm1559.txpool import TxPool
from abm1559.users import User1559
from abm1559.userpool import UserPool
from abm1559.chain import (
Chain,
Block1559,
)
from abm1559.simulator import (
spawn_poisson_demand,
update_basefee,
)
import pandas as pd
And define the main function used to simulate the fee market.
def simulate(demand_scenario, UserClass):
    # Instantiate a couple of things
    txpool = TxPool()
    basefee = constants["INITIAL_BASEFEE"]
    chain = Chain()
    metrics = []
    user_pool = UserPool()

    for t in range(len(demand_scenario)):
        if t % 100 == 0: print(t)

        # `env` is the "environment" of the simulation
        env = {
            "basefee": basefee,
            "current_block": t,
        }

        # We draw a demand from a Poisson distribution.
        # The parameter is given by `demand_scenario[t]`, and can vary
        # over time.
        users = spawn_poisson_demand(t, demand_scenario[t], UserClass)

        # We query each new user with the current basefee value
        # Users either return a transaction or None if they prefer to balk
        decided_txs = user_pool.decide_transactions(users, env)

        # New transactions are added to the transaction pool
        txpool.add_txs(decided_txs)

        # The best valid transactions are taken out of the pool for inclusion
        selected_txs = txpool.select_transactions(env)
        txpool.remove_txs([tx.tx_hash for tx in selected_txs])

        # We create a block with these transactions
        block = Block1559(txs = selected_txs, parent_hash = chain.current_head, height = t, basefee = basefee)

        # The block is added to the chain
        chain.add_block(block)

        # A couple of metrics we will use to monitor the simulation
        row_metrics = {
            "block": t,
            "basefee": basefee / (10 ** 9),
            "users": len(users),
            "decided_txs": len(decided_txs),
            "included_txs": len(selected_txs),
            "blk_avg_gas_price": block.average_gas_price(),
            "blk_avg_tip": block.average_tip(),
            "pool_length": txpool.pool_length(),
        }
        metrics.append(row_metrics)

        # Finally, basefee is updated and a new round starts
        basefee = update_basefee(block, basefee)

    return (pd.DataFrame(metrics), user_pool, chain)
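Under the hood, update_basefee implements the EIP-1559 adjustment rule: basefee moves by at most 1/8 per block, proportionally to how far the gas used sits from the 10M target. A minimal sketch (the real implementation lives in abm1559.simulator; the function name below is ours):

```python
def update_basefee_sketch(gas_used, basefee, target=10_000_000, denom=8):
    # basefee' = basefee * (1 + 1/denom * (gas_used - target) / target)
    delta = basefee * (gas_used - target) // (target * denom)
    return basefee + delta

# A completely full 20M-gas block raises basefee by 12.5%
print(update_basefee_sketch(20_000_000, 1_000_000_000))  # 1125000000 wei

# 952 simple transfers use 19,992,000 gas, which from a 1 Gwei basefee
# gives the 1.1249 Gwei seen at block 1 of the simulation below
print(update_basefee_sketch(19_992_000, 1_000_000_000))  # 1124900000 wei
```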
As you can see, simulate takes in a demand_scenario array. Earlier we mentioned that each round, we draw the number of users wishing to send transactions from a Poisson distribution. This distribution is parameterised by the expected number of arrivals, called lambda $\lambda$. The demand_scenario array contains a sequence of such lambdas. We also provide in UserClass the type of user we would like to model (see the docs for more details).
Our users draw their value for the transaction (per unit of gas) from a uniform distribution, picking a random number between 0 and 20 (Gwei). Their cost for waiting one extra unit of time is drawn from a uniform distribution too, this time between 0 and 1 (Gwei). The closer their cost is to 1, the more impatient users are.
Say for instance that I value each unit of gas at 15 Gwei, and my cost per round is 0.5 Gwei. If I wait for 6 blocks to be included at a gas price of 10 Gwei, my payoff is $15 - 6 \times 0.5 - 10 = 2$.
The numbers above sound arbitrary, and in a sense they are! They were chosen to respect the scales we are used to (although gas prices are closer to 100 Gwei these days...). It also turns out that any distribution (uniform, Pareto, whatever floats your boat) leads to stationarity. The important part is that some users have positive value for transacting in the first place, enough to fill a block to its target size at least. The choice of sampling the cost from a uniform distribution, as opposed to having all users experience the same cost per round, allows us to simulate a scenario where some users are more in a hurry than others.
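Sketched with the standard library (the distributions described above, not the package's exact sampling code):

```python
import random

random.seed(42)  # for reproducibility
n_users = 2000

# Each user draws a value per gas unit in [0, 20] Gwei
# and a waiting cost per block in [0, 1] Gwei
values = [random.uniform(0, 20) for _ in range(n_users)]
costs = [random.uniform(0, 1) for _ in range(n_users)]

# Fraction of users who would send at a 10 Gwei gas price with a
# 5-block expected wait; analytically 37.5% for these distributions
senders = sum(v - 5 * c - 10 >= 0 for v, c in zip(values, costs))
print(senders / n_users)
```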
demand_scenario = [2000 for i in range(200)]
(df, user_pool, chain) = simulate(demand_scenario, User1559)
0
100
To study the stationary case, we create an array repeating $\lambda$ for as many blocks as we wish to simulate the market for. We set $\lambda$ to spawn on average 2000 users between two blocks.
Let's print the head and tail of the data frame holding our metrics. Each row corresponds to one round of our simulation, so one block.
df
| | block | basefee | users | decided_txs | included_txs | blk_avg_gas_price | blk_avg_tip | pool_length |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1.000000 | 2031 | 1562 | 952 | 2.000000 | 1.0 | 610 |
| 1 | 1 | 1.124900 | 1983 | 1525 | 952 | 2.124900 | 1.0 | 1183 |
| 2 | 2 | 1.265400 | 1912 | 1453 | 952 | 2.265400 | 1.0 | 1684 |
| 3 | 3 | 1.423448 | 1997 | 1493 | 952 | 2.423448 | 1.0 | 2225 |
| 4 | 4 | 1.601237 | 2001 | 1459 | 952 | 2.601237 | 1.0 | 2732 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 195 | 195 | 11.795094 | 2001 | 508 | 508 | 12.795094 | 1.0 | 1924 |
| 196 | 196 | 11.893583 | 2040 | 459 | 459 | 12.893583 | 1.0 | 1924 |
| 197 | 197 | 11.839914 | 1979 | 462 | 462 | 12.839914 | 1.0 | 1924 |
| 198 | 198 | 11.795810 | 1983 | 487 | 487 | 12.795810 | 1.0 | 1924 |
| 199 | 199 | 11.829281 | 1960 | 444 | 444 | 12.829281 | 1.0 | 1924 |
200 rows × 8 columns
At the start of the simulation we clearly see in column users a demand close to 2000 users per round. Among these 2000 or so, around 1500 decide to send their transaction in (decided_txs). The 500 who don't might have a low value or high per-round costs, meaning it is unprofitable for them to even send their transaction in. Eventually 952 of them are included (included_txs), maxing out the block gas limit. The basefee starts at 1 Gwei and steadily increases from there, reaching around 11.8 Gwei by the end.

By the end of the simulation, we note that decided_txs is always equal to included_txs. By this point, the basefee has risen enough to make it unprofitable for most users to send their transactions. This is exactly what we want! Users balk at the current prices.
In the next chart we show the evolution of basefee and tips. We define tip as the gas price minus the basefee, which is what miners receive from the transaction.
Note that tip is in general not equal to the gas premium that users set. This is particularly true when basefee plus gas premium exceeds the max fee of the user. In the graph below, the tip hovers around 1 Gwei (the premium), but is sometimes less than 1 too, especially when users see the prevailing basefee approach their posted max fees.
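This relationship can be sketched as follows (the rule as described here, not the package's exact code):

```python
def gas_price(basefee, max_fee, gas_premium):
    # The user pays basefee plus their premium, capped by their max fee
    return min(max_fee, basefee + gas_premium)

def tip(basefee, max_fee, gas_premium):
    # The miner receives whatever the gas price leaves above basefee
    return gas_price(basefee, max_fee, gas_premium) - basefee

print(tip(basefee=10, max_fee=15, gas_premium=1))    # 1: premium paid in full
print(tip(basefee=14.5, max_fee=15, gas_premium=1))  # 0.5: basefee nears max_fee
```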
df.plot("block", ["basefee", "blk_avg_tip"])
Notice the increase at the beginning followed by a short drop? At the very beginning, the pool fills up quickly with many users hoping to get their transactions in with a positive resulting payoff. The basefee increases until users start balking and the pool is exhausted. Once exhausted, basefee decreases again to settle at the stationary point, where the pool holds only transactions that are invalid at the stationary basefee (their max fees are below it).
We can see the pool length becoming stationary in the next plot, showing the length of the pool over time.
df.plot("block", "pool_length")
The remaining transactions are likely from early users who did not balk even though basefee was increasing, and who were quickly outbid by others.
So far we have looked at a stationary setting, where the new demand coming in each round follows a fixed expected rate of arrival. Demand shocks may be of two kinds: a shift in the distribution of users' values and costs (e.g., transactions suddenly becoming more valuable to everyone), or a shift in the arrival rate of users (e.g., many more users showing up each round).
We'll consider the second scenario here, simply running the simulation again and increasing the $\lambda$ parameter of our Poisson arrival process suddenly, from expecting 2000, to expecting 6000 users per round.
demand_scenario = [2000 for i in range(100)] + [6000 for i in range(100)]
(df_jump, user_pool_jump, chain_jump) = simulate(demand_scenario, User1559)
0
100
The next plot shows the number of new users each round. We note at block 100 a sudden jump from around 2000 new users to 6000.
df_jump.plot("block", "users")
df_jump.plot("block", ["basefee", "blk_avg_tip"])
We see a jump around block 100, when the arrival rate of users switches from 2000 to 6000. The basefee increases in response. With a block limit of 20M gas, about 950 transactions fit into each block. Targeting half of this value, the basefee increases until more or less 475 transactions are included in each block.
Since our users' values and costs are always drawn from the same distribution, when 2000 users show up, we expect to let in about 25% of them (~ 475 / 2000), the 25% with greatest expected payoff. When 6000 users come in, we now only expect the "richest" 8% (~ 475 / 6000) to get in, so we "raise the bar" for the basefee, since we need to discriminate more.
df_jump.plot("block", ["pool_length", "users", "decided_txs", "included_txs"])
As we see with the graph above, for a short while after block 100, blocks include more than the usual ~475 transactions. This is the transition between the old and the new stationary points.
Since we have a lot more new users each round, more of them are willing and able to pay for their transactions above the current basefee, and so get included. This keeps happening until the basefee reaches a new stationary level.
Up until now, users decided whether to join the transaction pool based on the expectation that they would be included at least 5 blocks after they join. They evaluated their payoff assuming that basefee did not change (due to stationarity) over these 5 blocks. If their value for transacting, minus the cost of waiting for 5 blocks, minus the cost of transacting, was positive, they sent their transactions in!

$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting 5 blocks} - \texttt{transaction fee} > 0 $$

Under stationary demand, however, users can expect to be included in the next block. So let's have users expect to be included in the block right after their appearance, and see what happens. We do this by subclassing our User1559 agent and overriding its expected_time method.
class OptimisticUser(User1559):
    def expected_time(self, env):
        return 0
demand_scenario = [2000 for i in range(100)] + [6000 for i in range(100)]
(df_opti, user_pool_opti, chain_opti) = simulate(demand_scenario, OptimisticUser)
0
100
df_opti.plot("block", ["basefee", "blk_avg_tip"])
The plot looks much the same as before. But let's compare the average basefee over the last 50 blocks of this scenario with the previous one.
df_opti[(df_opti.block > 150)][["basefee"]].mean()

basefee    17.433133
dtype: float64
df_jump[(df_jump.block > 150)][["basefee"]].mean()

basefee    15.030248
dtype: float64
When users expect to be included in the next block rather than wait for at least 5, the basefee increases! This makes sense if we come back to our payoff definition:
$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting} - \texttt{transaction fee} $$

The estimated cost of waiting is lower now, since users expect to be included in the next block rather than wait 5 blocks to get in. Previously, some users with high values but high time preferences might have been discouraged from joining the pool. Now these users don't expect to wait as long, and since their values are high, they don't mind bidding the basefee up either. We can check that, on average, users included in this last scenario indeed have higher values than users included in the previous one.
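A concrete, made-up example: consider a user with value 18 Gwei and a steep waiting cost of 0.9 Gwei per block, facing a 15 Gwei gas price.

```python
value, cost_per_block, gas_price = 18, 0.9, 15

# Expecting a 5-block wait, the payoff is negative: the user balks
print(value - 5 * cost_per_block - gas_price)  # -1.5

# Expecting next-block inclusion, the payoff turns positive: the user sends
print(value - 0 * cost_per_block - gas_price)  # 3.0
```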
To do so, we export to pandas DataFrames the user pool (to obtain users' values and costs) and the chain (to obtain the addresses of users included in the last 50 blocks).
user_pool_opti_df = user_pool_opti.export().rename(columns={ "pub_key": "sender" })
chain_opti_df = chain_opti.export()
Let's open these up and have a look at the data. user_pool_opti_df registers all users we spawned in our simulation.
user_pool_opti_df.tail()
| | user | sender | value | wakeup_block | user_type | cost_per_unit |
|---|---|---|---|---|---|---|
| 799774 | 1559 affine user with value 11054269619 and co... | b0930fcba0677116 | 11.054270 | 199 | user_1559 | 0.059898 |
| 799775 | 1559 affine user with value 4307381915 and cos... | cb5c2e2729133b44 | 4.307382 | 199 | user_1559 | 0.719897 |
| 799776 | 1559 affine user with value 13494438833 and co... | d21482354208b7c3 | 13.494439 | 199 | user_1559 | 0.219101 |
| 799777 | 1559 affine user with value 11267395546 and co... | 555b23fb396ecab0 | 11.267396 | 199 | user_1559 | 0.341294 |
| 799778 | 1559 affine user with value 18296621634 and co... | ba49c4f9090602b3 | 18.296622 | 199 | user_1559 | 0.406582 |
Meanwhile, chain_opti_df lists all the transactions included in the chain.
chain_opti_df.tail()
| | block_height | tx_index | basefee | tx | start_block | sender | gas_used | tx_hash | gas_premium | max_fee | tip |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 106976 | 199 | 455 | 17.483026 | 1559 Transaction 8034d24a635a2e2c: max_fee 187... | 199 | 135fa86cc8a0db19 | 21000 | 8034d24a635a2e2c | 1.0 | 18.722951 | 1.0 |
| 106977 | 199 | 456 | 17.483026 | 1559 Transaction 98baeaee522c0661: max_fee 190... | 199 | 00c9cd46792eed6d | 21000 | 98baeaee522c0661 | 1.0 | 19.015471 | 1.0 |
| 106978 | 199 | 457 | 17.483026 | 1559 Transaction cbe0f5904d87c2fa: max_fee 190... | 199 | c276db600567649e | 21000 | cbe0f5904d87c2fa | 1.0 | 19.014307 | 1.0 |
| 106979 | 199 | 458 | 17.483026 | 1559 Transaction 6ba3990799797b1a: max_fee 199... | 199 | 166757b80c27432b | 21000 | 6ba3990799797b1a | 1.0 | 19.916943 | 1.0 |
| 106980 | 199 | 459 | 17.483026 | 1559 Transaction 849ea7b7070828ca: max_fee 184... | 199 | 8932cefc77b8c395 | 21000 | 849ea7b7070828ca | 1.0 | 18.485555 | 1.0 |
With a simple join on the sender column we can associate each user with their included transaction. We look at the average value of included users after the second stationary point.
chain_opti_df[(chain_opti_df.block_height >= 150)].join(
    user_pool_opti_df.set_index("sender"), on="sender"
)[["value"]].mean()

value    19.210239
dtype: float64
When users expect to be included in the block right after they send their transaction, the average value of included users is around 19.2 Gwei.
user_pool_jump_df = user_pool_jump.export().rename(columns={ "pub_key": "sender" })
chain_jump_df = chain_jump.export()
chain_jump_df[(chain_jump_df.block_height >= 150)].join(
    user_pool_jump_df.set_index("sender"), on="sender"
)[["value"]].mean()

value    18.685197
dtype: float64
But when users expect to be included at least five blocks after sending, the average value of included users is around 18.7 Gwei. This confirms that when users expect next-block inclusion, higher-value users get in, raising the basefee in the process.
We've looked at 1559 when users with their own values and costs decide whether to join the pool or not based on the current basefee level. These users estimate their ultimate payoff by assuming stationarity: the demand between rounds follows the same arrival process and the same distribution of values and costs. In this stationary environment, basefee settles on some value and mostly stays there, allowing users to estimate their payoff should they wait for five or one blocks to be included.
We've again left aside some important questions. Here all users simply leave a 1 Gwei premium in their transactions. In reality, we should expect users to attempt to "game" the system by leaving higher tips to get in first. We can suppose that in a stationary environment, "gaming" is only possible until basefee reaches its stationary point (during the transition period) and exhausts the feasible demand. We will leave this question for another notebook.
(Temporary) non-stationarity is more interesting. Sudden demand shocks that precipitate a large influx of new, high-valued transactions should also see users try to outcompete each other on premiums alone, until basefee catches up. Whether 1559 offers anything in this case, or whether the whole situation would look like a first-price auction, may be better settled empirically, but we can intuit that 1559 would smooth the process slightly by offering a (laggy) price oracle.
And then we have the question of miner collusion, which rightfully agitates a lot of the ongoing conversation. In the simulations we do here, we instantiated one transaction pool only, which should tell you that we are looking at a "centralised", honest miner that includes transactions as much as possible, and not a collection or a cartel of miners cooperating. We can of course weaken this assumption and have several mining pools with their own behaviours and payoff evaluations, much like we modelled our users. We still would like to have a good theoretical understanding of the risks and applicability of miner collusion strategies. Onward!
Individual rationality is the idea that agents won't join a mechanism unless they hope to make some positive payoff out of it. I'd rather not transact if my value for transacting minus my costs is negative.
In general, we like this property and we want to make the mechanism individually rational to as many agents as possible. Yet, some mechanisms fail to satisfy ex post individual rationality: I might expect to make a positive payoff from the mechanism, but some realisation of the mechanism exists where my payoff is negative.
Take an auction. As long as my bid is lower than or equal to my value for the auctioned item, the mechanism is ex post individually rational for me: I can never "overpay". But if I value the item at 10 ETH and decide to bid 11 ETH, then in a first-price auction, where I pay my bid if it is the highest, there is a realisation of the mechanism where I win and am asked to pay 11 ETH. My payoff is then -1 ETH.
In the transaction fee market, ex post individual rationality is not guaranteed unless I can cancel my transaction. In the simulations here, we do not offer this option to our agents. They expect to wait for inclusion for a certain amount of blocks, and evaluate whether their payoff after that wait is positive or not to decide whether to send their transaction or not. However, some agents might wait longer than their initial estimation, in particular before the mechanism reaches stationarity. Some realisations of the mechanism then yield a negative payoff for these agents, and the mechanism is not ex post individually rational.
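A made-up numerical illustration: a user who expected a 5-block wait but was included 20 blocks later.

```python
value, cost_per_block, fee = 12, 0.5, 8

# Ex ante: a 5-block wait leaves a positive payoff, so the user sends
expected = value - 5 * cost_per_block - fee
print(expected)  # 1.5

# Ex post: included only after 20 blocks, the realised payoff is negative
realised = value - 20 * cost_per_block - fee
print(realised)  # -6.0
```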
Let's look at the agents' payoffs using the transcript of transactions included in the chain. For each transaction, we want to find out the ultimate payoff of the agent who sent it in. If the transaction was included much later than the agent's initial estimate, this payoff is negative, and the mechanism wasn't ex post individually rational for them.
user_pool_df = user_pool.export().rename(columns={ "pub_key": "sender" })
chain_df = chain.export()
user_txs_df = chain_df.join(user_pool_df.set_index("sender"), on="sender")
In the next chunk we obtain the users' payoffs: their value minus the costs incurred from the transaction fee and the time they waited.
user_txs_df["payoff"] = user_txs_df.apply(
    lambda row: row.user.payoff({
        "current_block": row.block_height,
        "gas_price": row.tx.gas_price({
            "basefee": row.basefee * (10 ** 9) # we need basefee in wei
        })
    }) / (10 ** 9), # payoff is in Gwei
    axis = 1
)
user_txs_df["epir"] = user_txs_df.payoff.apply(
    lambda payoff: payoff >= 0
)
Now we count the fraction of users in each block who received a positive payoff.
epir_df = pd.concat([
    user_txs_df[["block_height", "tx_hash"]].groupby(["block_height"]).agg(["count"]),
    user_txs_df[["block_height", "epir"]][user_txs_df.epir == True].groupby(["block_height"]).agg(["count"])
], axis = 1)
epir_df["percent_epir"] = epir_df.apply(
    lambda row: row.epir / row.tx_hash * 100,
    axis = 1
)
Let's plot it!
epir_df.reset_index().plot("block_height", ["percent_epir"])
At the very beginning, all users (100%) have positive payoff. They have only waited for 1 block to get included. This percentage steadily drops, as basefee increases: some high value users waiting in the pool get included much later than they expected, netting a negative payoff.
Once we pass the initial instability (while basefee is looking for its stationary value), all users receive a positive payoff. This is somewhat expected: once basefee has increased enough to weed out excess demand, users are pretty much guaranteed to be included in the next block, and so the realised waiting time will always be less than their estimate.
Check out also: A recent ethresear.ch post by Onur Solmaz, on a 1559-inspired mechanism for daily gas price stabilization, with simulations.