Sharding -- Honest Validator¶
Notice: This document is a work-in-progress for researchers and implementers.
Table of contents¶
- Introduction
- Prerequisites
- Constants
- Helpers
- Sample verification
- Beacon chain responsibilities
- Validator assignments
- Sample reconstruction
- Minimum online validator requirement
Introduction¶
This document represents the changes to be made in the code of an "honest validator" to implement executable beacon chain proposal.
Prerequisites¶
This document is an extension of the Bellatrix -- Honest Validator guide. All behaviors and definitions defined in this document, and documents it extends, carry over unless explicitly noted or overridden.
All terminology, constants, functions, and protocol mechanics defined in the updated Beacon Chain doc of Sharding are requisite for this document and used throughout. Please see related Beacon Chain doc before continuing and use them as a reference throughout.
Constants¶
Sample counts¶
Name | Value |
---|---|
VALIDATOR_SAMPLE_ROW_COUNT |
2 |
VALIDATOR_SAMPLE_COLUMN_COUNT |
2 |
Helpers¶
get_validator_row_subnets
¶
TODO: Currently the subnets are public (i.e. anyone can derive them.) This is good for a proof of custody with public verifiability, but bad for validator privacy.
get_validator_column_subnets
¶
reconstruct_polynomial
¶
Sample verification¶
verify_sample
¶
Beacon chain responsibilities¶
Validator assignments¶
Attesting¶
Every attester is assigned VALIDATOR_SAMPLE_ROW_COUNT
rows and VALIDATOR_SAMPLE_COLUMN_COUNT
columns of shard samples. As part of their validator duties, they should subscribe to the subnets given by get_validator_row_subnets
and get_validator_column_subnets
, for the whole epoch.
A row or column is available for a slot
if at least half of the total number of samples were received on the subnet and passed verify_sample
. Otherwise it is called unavailable.
If a validator is assigned to an attestation at slot attestation_slot
and had his previous attestation duty at previous_attestation_slot
, then they should only attest under the following conditions:
- For all intermediate blocks
block
withprevious_attestation_slot < block.slot <= attestation_slot
: All sample rows and columns assigned to the validator were available.
If this condition is not fulfilled, then the validator should instead attest to the last block for which the condition holds.
This leads to the security property that a chain that is not fully available cannot have more than 1/16th of all validators voting for it. TODO: This claim is for an "infinite number" of validators. Compute the concrete security due to sampling bias.
Sample reconstruction¶
A validator that has received enough samples of a row or column to mark it as available, should reconstruct all samples in that row/column (if they aren't all available already.) The function reconstruct_polynomial
gives an example implementation for this.
Once they have run the reconstruction function, they should distribute the samples that they reconstructed on all pubsub that the local node is subscribed to, if they have not already received that sample on that pubsub. As an example:
- The validator is subscribed to row
2
and column5
- The sample
(row, column) = (2, 5)
is missing in the column5
pubsub - After they have reconstruction of row
2
, the validator should send the sample(2, 5)
on to the row2
pubsub (if it was missing) as well as the column5
pubsub.
TODO: We need to verify the total complexity of doing this and make sure this does not cause too much load on a validator
Minimum online validator requirement¶
The data availability construction guarantees that reconstruction is possible if 75% of all samples are available. In this case, at least 50% of all rows and 50% of all columns are independently available. In practice, it is likely that some supernodes will centrally collect all samples and fill in any gaps. However, we want to build a system that reliably reconstructs even absent all supernodes. Any row or column with 50% of samples will easily be reconstructed even with only 100s of validators online; so the only question is how we get to 50% of samples for all rows and columns, when some of them might be completely unseeded.
Each validator will transfer 4 samples between rows and columns where there is overlap. Without loss of generality, look at row 0. Each validator has 1/128 chance of having a sample in this row, and we need 256 samples to reconstruct it. So we expect that we need ~256 * 128 = 32,768 validators to have a fair chance of reconstructing it if it was completely unseeded.
A more elaborate estimate here needs about 55,000 validators to be online for high safety that each row and column will be reconstructed.