ethereum.ethash

Ethash is a proof-of-work algorithm designed to be ASIC-resistant through memory hardness.

To achieve memory hardness, computing Ethash requires access to subsets of a large structure. The particular subsets chosen are based on the nonce and block header, while the set itself is changed every epoch.

At a high level, the Ethash algorithm is as follows:

  1. Create a seed value with generate_seed, derived from the block number (specifically, from the epoch to which the block belongs).

  2. From the seed, compute a pseudorandom cache with generate_cache.

  3. From the cache, generate a dataset with generate_dataset. The dataset grows over time based on DATASET_EPOCH_GROWTH_SIZE.

  4. Miners hash slices of the dataset together, which is where the memory hardness comes in. Verifying the proof-of-work requires only the cache, which is enough to recompute the much smaller subset of the dataset that a particular proof touches (see the sketch after this list).
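
These steps map directly onto the functions documented below. A minimal sketch of the verifier's path, assuming this module is importable as ethereum.ethash and that block_number, header_hash, and nonce come from the block being verified:

# A sketch, not part of this module: light verification end to end.
# The import path assumes this spec's package layout; `block_number`,
# `header_hash`, and `nonce` are hypothetical inputs.
from ethereum.ethash import dataset_size, generate_cache, hashimoto_light

cache = generate_cache(block_number)    # steps 1 and 2 (seed, then cache)
size = dataset_size(block_number)       # step 3 fixes only the size
mix_digest, result = hashimoto_light(   # step 4, verifier side
    header_hash, nonce, cache, size
)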

EPOCH_SIZE

Number of blocks before a dataset needs to be regenerated (known as an "epoch"). See epoch.

EPOCH_SIZE = 30000

INITIAL_CACHE_SIZE

Size of the cache (in bytes) during the first epoch. The cache grows by roughly CACHE_EPOCH_GROWTH_SIZE bytes with each subsequent epoch. See cache_size.

INITIAL_CACHE_SIZE = 2**24

CACHE_EPOCH_GROWTH_SIZE

After the first epoch, the cache size grows by roughly this amount. See cache_size.

CACHE_EPOCH_GROWTH_SIZE = 2**17

INITIAL_DATASET_SIZE

Size of the dataset (in bytes) during the first epoch. The dataset grows by roughly DATASET_EPOCH_GROWTH_SIZE bytes with each subsequent epoch. See dataset_size.

INITIAL_DATASET_SIZE = 2**30

DATASET_EPOCH_GROWTH_SIZE

After the first epoch, the dataset size grows by roughly this amount. See dataset_size.

DATASET_EPOCH_GROWTH_SIZE = 2**23

HASH_BYTES

Length of a hash, in bytes.

HASH_BYTES = 64

MIX_BYTES

Width of mix, in bytes. See generate_dataset_item.

MIX_BYTES = 128

CACHE_ROUNDS

Number of times to repeat the keccak512 step while generating the cache. See generate_cache.

CACHE_ROUNDS = 3

DATASET_PARENTS

Number of parents of each dataset element. See generate_dataset_item.

DATASET_PARENTS = 256

HASHIMOTO_ACCESSES

Number of accesses in the hashimoto loop.

HASHIMOTO_ACCESSES = 64

epoch

Obtain the epoch number to which the block identified by block_number belongs. The first epoch is numbered zero.

An Ethash epoch is a fixed number of blocks (EPOCH_SIZE) long, during which the dataset remains constant. At the end of each epoch, the dataset is generated anew. See generate_dataset.

def epoch(block_number: Uint) -> Uint:
    """
    Obtain the epoch number to which the block identified by `block_number`
    belongs. The first epoch is numbered zero.

    An Ethash epoch is a fixed number of blocks ([`EPOCH_SIZE`]) long, during
    which the dataset remains constant. At the end of each epoch, the dataset
    is generated anew. See [`generate_dataset`].

    [`EPOCH_SIZE`]: ref:ethereum.ethash.EPOCH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    """
    return block_number // EPOCH_SIZE
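
For example, with EPOCH_SIZE = 30000, blocks 0 through 29999 form epoch zero and block 30000 opens epoch one. A worked check with plain integers:

assert 0 // 30000 == 0        # genesis block is in epoch 0
assert 29999 // 30000 == 0    # last block of epoch 0
assert 30000 // 30000 == 1    # first block of epoch 1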

cache_size

Obtain the cache size (in bytes) of the epoch to which block_number belongs.

See INITIAL_CACHE_SIZE and CACHE_EPOCH_GROWTH_SIZE for the initial size and linear growth rate, respectively. The cache is generated in generate_cache.

The actual cache size is smaller than simply multiplying CACHE_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.

def cache_size(block_number: Uint) -> Uint:
    """
    Obtain the cache size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_CACHE_SIZE`] and [`CACHE_EPOCH_GROWTH_SIZE`] for the initial
    size and linear growth rate, respectively. The cache is generated in
    [`generate_cache`].

    The actual cache size is smaller than simply multiplying
    `CACHE_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_CACHE_SIZE`]: ref:ethereum.ethash.INITIAL_CACHE_SIZE
    [`CACHE_EPOCH_GROWTH_SIZE`]: ref:ethereum.ethash.CACHE_EPOCH_GROWTH_SIZE
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    size = INITIAL_CACHE_SIZE + (CACHE_EPOCH_GROWTH_SIZE * epoch(block_number))
    size -= HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES

    return size
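
For instance, the epoch-0 cache size can be computed by hand. The sketch below uses plain integers and a naive trial-division primality test standing in for this module's is_prime:

def is_prime_naive(n: int) -> bool:
    # Naive stand-in for `is_prime`; fine at these magnitudes.
    return n >= 2 and all(n % d != 0 for d in range(2, int(n**0.5) + 1))

size = 2**24 - 64                  # INITIAL_CACHE_SIZE - HASH_BYTES
while not is_prime_naive(size // 64):
    size -= 2 * 64                 # step down two hash widths at a time
assert size == 16776896            # the epoch-0 cache size, in bytes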

dataset_size

Obtain the dataset size (in bytes) of the epoch to which block_number belongs.

See INITIAL_DATASET_SIZE and DATASET_EPOCH_GROWTH_SIZE for the initial size and linear growth rate, respectively. The complete dataset is generated in generate_dataset, while the slices used in verification are generated in generate_dataset_item.

The actual dataset size is smaller than simply multiplying DATASET_EPOCH_GROWTH_SIZE by the epoch number to minimize the risk of unintended cyclic behavior. It is defined as the highest prime number below what linear growth would calculate.

def dataset_size(block_number: Uint) -> Uint:
    """
    Obtain the dataset size (in bytes) of the epoch to which `block_number`
    belongs.

    See [`INITIAL_DATASET_SIZE`] and [`DATASET_EPOCH_GROWTH_SIZE`][ds] for the
    initial size and linear growth rate, respectively. The complete dataset is
    generated in [`generate_dataset`], while the slices used in verification
    are generated in [`generate_dataset_item`].

    The actual dataset size is smaller than simply multiplying
    `DATASET_EPOCH_GROWTH_SIZE` by the epoch number to minimize the risk of
    unintended cyclic behavior. It is defined as the highest prime number below
    what linear growth would calculate.

    [`INITIAL_DATASET_SIZE`]: ref:ethereum.ethash.INITIAL_DATASET_SIZE
    [ds]: ref:ethereum.ethash.DATASET_EPOCH_GROWTH_SIZE
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    """
    size = INITIAL_DATASET_SIZE + (
        DATASET_EPOCH_GROWTH_SIZE * epoch(block_number)
    )
    size -= MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES

    return size
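
The same walk for the epoch-0 dataset, reusing is_prime_naive from the cache_size sketch above:

size = 2**30 - 128                 # INITIAL_DATASET_SIZE - MIX_BYTES
while not is_prime_naive(size // 128):
    size -= 2 * 128                # step down two mix widths at a time
assert size == 1073739904          # the epoch-0 dataset size, in bytes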

generate_seed

Obtain the cache generation seed for the block identified by block_number. See generate_cache.

def generate_seed(block_number: Uint) -> Hash32:
    """
    Obtain the cache generation seed for the block identified by
    `block_number`. See [`generate_cache`].

    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    epoch_number = epoch(block_number)

    seed = b"\x00" * 32
    while epoch_number != 0:
        seed = keccak256(seed)
        epoch_number -= 1

    return Hash32(seed)
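
For example, the epoch-0 seed is 32 zero bytes and the epoch-1 seed is a single keccak256 of it. The sketch below borrows pycryptodome's keccak as a stand-in for this module's keccak256:

from Crypto.Hash import keccak  # pycryptodome; an assumption of this sketch

def keccak256_sketch(data: bytes) -> bytes:
    return keccak.new(data=data, digest_bits=256).digest()

seed = b"\x00" * 32                       # epoch 0 seed
seed = keccak256_sketch(seed)             # epoch 1 seed
assert seed.hex().startswith("290decd9")  # well-known hash of 32 zero bytes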

generate_cache

Generate the cache for the block identified by block_number. See generate_dataset for how the cache is used.

The cache is generated in two steps: filling the array with a chain of keccak512 hashes, then running CACHE_ROUNDS (three) rounds of Sergio Demian Lerner's RandMemoHash on those bytes.

def generate_cache(block_number: Uint) -> Tuple[Tuple[U32, ...], ...]:
    """
    Generate the cache for the block identified by `block_number`. See
    [`generate_dataset`] for how the cache is used.

    The cache is generated in two steps: filling the array with a chain of
    [`keccak512`] hashes, then running [`CACHE_ROUNDS`] rounds of Sergio
    Demian Lerner's [RandMemoHash] on those bytes.

    [`keccak512`]: ref:ethereum.crypto.hash.keccak512
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`CACHE_ROUNDS`]: ref:ethereum.ethash.CACHE_ROUNDS
    [RandMemoHash]: http://www.hashcash.org/papers/memohash.pdf
    """
    seed = generate_seed(block_number)
    cache_size_bytes = cache_size(block_number)

    cache_size_words = cache_size_bytes // HASH_BYTES
    cache = [keccak512(seed)]

    for index in range(1, cache_size_words):
        cache_item = keccak512(cache[index - 1])
        cache.append(cache_item)

    for _ in range(CACHE_ROUNDS):
        for index in range(cache_size_words):
            # Converting `cache_size_words` to int as `-1 + Uint(5)` is an
            # error.
            first_cache_item = cache[
                (index - 1 + int(cache_size_words)) % cache_size_words
            ]
            second_cache_item = cache[
                U32.from_le_bytes(cache[index][0:4]) % cache_size_words
            ]
            result = bytes(
                [a ^ b for a, b in zip(first_cache_item, second_cache_item)]
            )
            cache[index] = keccak512(result)

    return tuple(
        le_bytes_to_uint32_sequence(cache_item) for cache_item in cache
    )
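
The access pattern of a single RandMemoHash round can be seen in miniature below, with SHA-256 standing in for keccak512 (an assumption of this sketch; the indexing, not the hash function, is the point):

import hashlib

toy_cache = [hashlib.sha256(bytes([i])).digest() for i in range(4)]
n = len(toy_cache)
for index in range(n):
    # Each item is rehashed from its left neighbour XORed with a
    # pseudorandomly selected item, mirroring the loop above.
    first = toy_cache[(index - 1 + n) % n]
    second = toy_cache[int.from_bytes(toy_cache[index][:4], "little") % n]
    toy_cache[index] = hashlib.sha256(
        bytes(a ^ b for a, b in zip(first, second))
    ).digest()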

fnv

A non-associative substitute for XOR, inspired by the FNV hash by Fowler, Noll, and Vo. See fnv_hash, generate_dataset_item, and hashimoto.

Note that here we multiply the prime by the full 32-bit input, in contrast with the FNV-1 spec, which multiplies the prime by one byte (octet) at a time.

def fnv(a: Union[Uint, U32], b: Union[Uint, U32]) -> U32:
    """
    A non-associative substitute for XOR, inspired by the [FNV] hash by Fowler,
    Noll, and Vo. See [`fnv_hash`], [`generate_dataset_item`], and
    [`hashimoto`].

    Note that here we multiply the prime by the full 32-bit input, in
    contrast with the [FNV-1] spec, which multiplies the prime by one byte
    (octet) at a time.

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [FNV]: https://w.wiki/XKZ
    [FNV-1]: http://www.isthe.com/chongo/tech/comp/fnv/#FNV-1
    """
    # Masking with `U32.MAX_VALUE` is a faster way of doing `number % (2 ** 32)`.
    result = ((Uint(a) * 0x01000193) ^ Uint(b)) & U32.MAX_VALUE
    return U32(result)
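
A quick plain-integer check of the definition, where 0x01000193 is the 32-bit FNV prime:

FNV_PRIME = 0x01000193

def fnv_plain(a: int, b: int) -> int:
    # Multiply first, XOR second, then truncate to 32 bits.
    return ((a * FNV_PRIME) ^ b) & 0xFFFFFFFF

assert fnv_plain(1, 2) == 0x01000191           # (1 * prime) ^ 2
assert fnv_plain(0, 0xDEADBEEF) == 0xDEADBEEF  # a == 0 passes b through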

fnv_hash

Combines data into mix_integers using fnv. See hashimoto and generate_dataset_item.

def fnv_hash(
    mix_integers: Tuple[U32, ...], data: Tuple[U32, ...]
) -> Tuple[U32, ...]:
    """
    Combines `data` into `mix_integers` using [`fnv`]. See [`hashimoto`] and
    [`generate_dataset_item`].

    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    [`generate_dataset_item`]: ref:ethereum.ethash.generate_dataset_item
    [`fnv`]: ref:ethereum.ethash.fnv
    """
    return tuple(
        fnv(mix_integers[i], data[i]) for i in range(len(mix_integers))
    )
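
Element-wise, with fnv_plain from the sketch above:

mix = (1, 2, 3, 4)
data = (5, 6, 7, 8)
combined = tuple(fnv_plain(m, d) for m, d in zip(mix, data))
assert combined[0] == fnv_plain(1, 5) and len(combined) == len(mix)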

generate_dataset_item

Generate a particular dataset item 0-indexed by index by hashing pseudorandomly-selected entries from cache together. See fnv and fnv_hash for the digest function, generate_cache for generating cache, and generate_dataset for the full dataset generation algorithm.

def generate_dataset_item(
    cache: Tuple[Tuple[U32, ...], ...], index: Uint
) -> Hash64:
    """
    Generate a particular dataset item 0-indexed by `index` by hashing
    pseudorandomly-selected entries from `cache` together. See [`fnv`] and
    [`fnv_hash`] for the digest function, [`generate_cache`] for generating
    `cache`, and [`generate_dataset`] for the full dataset generation
    algorithm.

    [`fnv`]: ref:ethereum.ethash.fnv
    [`fnv_hash`]: ref:ethereum.ethash.fnv_hash
    [`generate_dataset`]: ref:ethereum.ethash.generate_dataset
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    """
    mix = keccak512(
        (
            le_uint32_sequence_to_uint(cache[index % len(cache)]) ^ index
        ).to_le_bytes(number_bytes=HASH_BYTES)
    )

    mix_integers = le_bytes_to_uint32_sequence(mix)

    for j in range(DATASET_PARENTS):
        mix_word: U32 = mix_integers[j % 16]
        cache_index = fnv(index ^ j, mix_word) % len(cache)
        parent = cache[cache_index]
        mix_integers = fnv_hash(mix_integers, parent)

    mix = Hash64(le_uint32_sequence_to_bytes(mix_integers))

    return keccak512(mix)
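
The j % 16 indexing above follows from the widths involved; in plain integers:

HASH_BYTES, WORD_BYTES, DATASET_PARENTS = 64, 4, 256
assert HASH_BYTES // WORD_BYTES == 16  # a keccak512 mix holds 16 U32 words
assert DATASET_PARENTS % 16 == 0       # 256 parents sweep the mix evenly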

generate_dataset

Generate the full dataset for the block identified by block_number.

This function is present only for demonstration purposes. It is not used while validating blocks.

def generate_dataset(block_number: Uint) -> Tuple[Hash64, ...]:
    """
    Generate the full dataset for the block identified by `block_number`.

    This function is present only for demonstration purposes. It is not used
    while validating blocks.
    """
    dataset_size_bytes: Uint = dataset_size(block_number)
    cache: Tuple[Tuple[U32, ...], ...] = generate_cache(block_number)

    # TODO: Parallelize this later on if it adds value
    return tuple(
        generate_dataset_item(cache, Uint(index))
        for index in range(dataset_size_bytes // HASH_BYTES)
    )
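
A quick scale check shows why: the epoch-0 dataset alone holds roughly seventeen million 64-byte items, about one gibibyte in total:

assert 1073739904 // 64 == 16777186  # epoch-0 item count (~1 GiB)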

hashimoto

Obtain the mix digest and the final value for a header, by aggregating data from the full dataset.

Parameters

  • header_hash is a valid RLP hash of a block header.

  • nonce is the propagated nonce for the given block.

  • dataset_size is the size of the dataset. See dataset_size.

  • fetch_dataset_item is a function that retrieves a specific dataset item based on its index.

Returns

  • The mix digest generated from the header hash and propagated nonce.

  • The final result, which will be checked for leading zeros (in byte representation) in accordance with the block difficulty.

def hashimoto(
    header_hash: Hash32,
    nonce: Bytes8,
    dataset_size: Uint,
    fetch_dataset_item: Callable[[Uint], Tuple[U32, ...]],
) -> Tuple[bytes, Hash32]:
    """
    Obtain the mix digest and the final value for a header, by aggregating
    data from the full dataset.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `dataset_size` is the size of the dataset. See [`dataset_size`].
    - `fetch_dataset_item` is a function that retrieves a specific dataset item
      based on its index.

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result, which will be checked for leading zeros (in byte
      representation) in accordance with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    """
    nonce_le = bytes(reversed(nonce))
    seed_hash = keccak512(header_hash + nonce_le)
    seed_head = U32.from_le_bytes(seed_hash[:4])

    rows = dataset_size // 128  # number of MIX_BYTES-wide rows in the dataset
    mix = le_bytes_to_uint32_sequence(seed_hash) * (MIX_BYTES // HASH_BYTES)

    for i in range(HASHIMOTO_ACCESSES):
        new_data: Tuple[U32, ...] = ()
        parent = fnv(i ^ seed_head, mix[i % len(mix)]) % rows
        for j in range(MIX_BYTES // HASH_BYTES):
            # Typecasting `parent` from U32 to Uint as `2 * parent + j` may
            # overflow U32.
            new_data += fetch_dataset_item(2 * Uint(parent) + j)

        mix = fnv_hash(mix, new_data)

    compressed_mix = []
    for i in range(0, len(mix), 4):
        compressed_mix.append(
            fnv(fnv(fnv(mix[i], mix[i + 1]), mix[i + 2]), mix[i + 3])
        )

    mix_digest = le_uint32_sequence_to_bytes(compressed_mix)
    result = keccak256(seed_hash + mix_digest)

    return mix_digest, result
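
For context, the "leading zeros" check on result is conventionally expressed as a bound on its big-endian integer value. A sketch of that check (the threshold form is standard Ethash validation, but this helper is not part of this module):

def meets_difficulty(result: bytes, difficulty: int) -> bool:
    # Hypothetical helper: the proof is valid when the result, read as a
    # big-endian integer, does not exceed 2**256 // difficulty.
    return int.from_bytes(result, "big") <= 2**256 // difficulty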

hashimoto_light

Run the hashimoto algorithm by generating dataset items using the cache instead of loading the full dataset into main memory.

Parameters

  • header_hash is a valid RLP hash of a block header.

  • nonce is the propagated nonce for the given block.

  • cache is the cache generated by generate_cache.

  • dataset_size is the size of the dataset. See dataset_size.

Returns

  • The mix digest generated from the header hash and propagated nonce.

  • The final result, which will be checked for leading zeros (in byte representation) in accordance with the block difficulty.

def hashimoto_light(
    header_hash: Hash32,
    nonce: Bytes8,
    cache: Tuple[Tuple[U32, ...], ...],
    dataset_size: Uint,
) -> Tuple[bytes, Hash32]:
    """
    Run the [`hashimoto`] algorithm by generating dataset items using the
    cache instead of loading the full dataset into main memory.

    #### Parameters

    - `header_hash` is a valid [RLP hash] of a block header.
    - `nonce` is the propagated nonce for the given block.
    - `cache` is the cache generated by [`generate_cache`].
    - `dataset_size` is the size of the dataset. See [`dataset_size`].

    #### Returns

    - The mix digest generated from the header hash and propagated nonce.
    - The final result, which will be checked for leading zeros (in byte
      representation) in accordance with the block difficulty.

    [RLP hash]: ref:ethereum.rlp.rlp_hash
    [`dataset_size`]: ref:ethereum.ethash.dataset_size
    [`generate_cache`]: ref:ethereum.ethash.generate_cache
    [`hashimoto`]: ref:ethereum.ethash.hashimoto
    """

    def fetch_dataset_item(index: Uint) -> Tuple[U32, ...]:
        item: Hash64 = generate_dataset_item(cache, index)
        return le_bytes_to_uint32_sequence(item)

    return hashimoto(header_hash, nonce, dataset_size, fetch_dataset_item)
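
Finally, a plain-integer look at why the light path is cheap: each hashimoto run touches only HASHIMOTO_ACCESSES * (MIX_BYTES // HASH_BYTES) dataset items, each regenerated from the cache on demand:

HASHIMOTO_ACCESSES, MIX_BYTES, HASH_BYTES = 64, 128, 64
items_touched = HASHIMOTO_ACCESSES * (MIX_BYTES // HASH_BYTES)
assert items_touched == 128  # versus ~16.8 million items in the full dataset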