Sharding -- Honest Validator

Notice: This document is a work-in-progress for researchers and implementers.

Introduction

This document describes the changes to be made in the code of an "honest validator" to implement the executable beacon chain proposal.

Prerequisites

This document is an extension of the Bellatrix -- Honest Validator guide. All behaviors and definitions defined in this document, and documents it extends, carry over unless explicitly noted or overridden.

All terminology, constants, functions, and protocol mechanics defined in the updated Beacon Chain doc of Sharding are requisite for this document and used throughout. Please see the related Beacon Chain doc before continuing and use it as a reference throughout.

Constants

Sample counts

Name                           Value
VALIDATOR_SAMPLE_ROW_COUNT     2
VALIDATOR_SAMPLE_COLUMN_COUNT  2

Helpers

get_validator_row_subnets

TODO: Currently the subnets are public (i.e., anyone can derive them). This is good for a proof of custody with public verifiability, but bad for validator privacy.

def get_validator_row_subnets(validator: Validator, epoch: Epoch) -> List[uint64]:
    return [int.from_bytes(hash_tree_root([validator.pubkey, 0, i])) for i in range(VALIDATOR_SAMPLE_ROW_COUNT)]

get_validator_column_subnets

def get_validator_column_subnets(validator: Validator, epoch: Epoch) -> List[uint64]:
    return [int.from_bytes(hash_tree_root([validator.pubkey, 1, i])) for i in range(VALIDATOR_SAMPLE_COLUMN_COUNT)]
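
The two helpers above are written as spec pseudocode. The following is a minimal, self-contained sketch of the same idea, not the specified behavior. It assumes SHA-256 over a fixed-width encoding as a stand-in for hash_tree_root, little-endian byte order, and a hypothetical HYPOTHETICAL_SUBNET_COUNT used to reduce the hash to a subnet index; none of these are defined in this document.

```python
import hashlib
from typing import List

VALIDATOR_SAMPLE_ROW_COUNT = 2
VALIDATOR_SAMPLE_COLUMN_COUNT = 2
# Assumption for illustration only: this document does not define a subnet count.
HYPOTHETICAL_SUBNET_COUNT = 512


def _derive_subnets(pubkey: bytes, kind: int, count: int) -> List[int]:
    """Derive `count` public, deterministic subnet indices for a validator.

    `kind` is 0 for rows and 1 for columns, mirroring the helpers above.
    """
    subnets = []
    for i in range(count):
        # SHA-256 over a fixed-width encoding stands in for hash_tree_root.
        preimage = pubkey + kind.to_bytes(8, "little") + i.to_bytes(8, "little")
        digest = hashlib.sha256(preimage).digest()
        # The modulo reduction is an assumption; the helpers above return the raw integer.
        subnets.append(int.from_bytes(digest, "little") % HYPOTHETICAL_SUBNET_COUNT)
    return subnets


def row_subnets(pubkey: bytes) -> List[int]:
    return _derive_subnets(pubkey, 0, VALIDATOR_SAMPLE_ROW_COUNT)


def column_subnets(pubkey: bytes) -> List[int]:
    return _derive_subnets(pubkey, 1, VALIDATOR_SAMPLE_COLUMN_COUNT)
```

Because the only input is the public key, anyone can recompute a validator's subnets, which is the privacy concern raised in the TODO above.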

reconstruct_polynomial

def reconstruct_polynomial(samples: List[SignedShardSample]) -> List[SignedShardSample]:
    """
    Reconstructs one full row/column from at least 1/2 of the samples
    """

Sample verification

verify_sample

def verify_sample(state: BeaconState, block: BeaconBlock, sample: SignedShardSample):
    assert sample.row < 2 * get_active_shard_count(state, compute_epoch_at_slot(block.slot))
    assert sample.column < 2 * SAMPLES_PER_BLOB
    assert block.slot == sample.slot

    # Verify builder signature.
    # TODO: We should probably not do this here. It should only be done by p2p, to verify samples *before* the intermediate block is in.
    # builder = state.validators[signed_block.message.proposer_index]
    # signing_root = compute_signing_root(sample, get_domain(state, DOMAIN_SHARD_SAMPLE))
    # assert bls.Verify(sample.builder, signing_root, sample.signature)

    roots_in_rbo = list_to_reverse_bit_order(roots_of_unity(SAMPLES_PER_BLOB * FIELD_ELEMENTS_PER_SAMPLE))

    # Verify KZG proof
    verify_kzg_multiproof(block.body.payload_data.value.sharded_commitments_container.sharded_commitments[sample.row],
                          roots_in_rbo[sample.column * FIELD_ELEMENTS_PER_SAMPLE:(sample.column + 1) * FIELD_ELEMENTS_PER_SAMPLE],
                          sample.data,
                          sample.proof)
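
The slice of roots_in_rbo used in verify_sample can be hard to picture. The sketch below, which works on plain integer indices rather than actual roots of unity, shows which positions of the natural-order domain a given sample column covers once the list is put in reverse bit order. The function bodies and the helper name sample_domain_indices are local stand-ins written for this example, and the tiny domain size is chosen only for readability.

```python
from typing import List, Sequence


def reverse_bits(n: int, order: int) -> int:
    """Reverse the bits of `n`, treated as a log2(order)-bit number."""
    assert order > 0 and order & (order - 1) == 0, "order must be a power of two"
    width = order.bit_length() - 1
    if width == 0:
        return 0
    return int(format(n, f"0{width}b")[::-1], 2)


def list_to_reverse_bit_order(values: Sequence[int]) -> List[int]:
    """Permute `values` so that position i holds values[reverse_bits(i)]."""
    return [values[reverse_bits(i, len(values))] for i in range(len(values))]


def sample_domain_indices(column: int, elements_per_sample: int, domain_size: int) -> List[int]:
    """Natural-order domain indices covered by one sample column."""
    rbo = list_to_reverse_bit_order(list(range(domain_size)))
    return rbo[column * elements_per_sample:(column + 1) * elements_per_sample]
```

With a domain of size 8 and 2 field elements per sample, column 1 covers natural indices 2 and 6. In general, each contiguous chunk in reverse bit order maps to indices spaced domain_size / elements_per_sample apart, so a sample's evaluation points form a coset of a small subgroup of the roots of unity.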

Beacon chain responsibilities

Validator assignments

Attesting

Every attester is assigned VALIDATOR_SAMPLE_ROW_COUNT rows and VALIDATOR_SAMPLE_COLUMN_COUNT columns of shard samples. As part of their validator duties, they should subscribe to the subnets given by get_validator_row_subnets and get_validator_column_subnets, for the whole epoch.

A row or column is available for a slot if at least half of the total number of samples were received on the subnet and passed verify_sample. Otherwise it is called unavailable.

If a validator is assigned to an attestation at slot attestation_slot and had their previous attestation duty at previous_attestation_slot, then they should only attest if the following condition holds:

  • For every intermediate block block with previous_attestation_slot < block.slot <= attestation_slot: all sample rows and columns assigned to the validator were available.

If this condition is not fulfilled, then the validator should instead attest to the last block for which the condition holds.
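
The availability rule and the fallback can be sketched as follows. This is one possible reading, not the specified algorithm: it assumes blocks are processed in slot order and that the validator stops at the first block whose assigned rows or columns were unavailable. The BlockAvailability type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlockAvailability:
    slot: int
    # True if every row and column assigned to the validator was available.
    assigned_lines_available: bool


def is_line_available(verified_samples: int, total_samples: int) -> bool:
    """A row or column is available if at least half of its samples
    were received and passed verification."""
    return 2 * verified_samples >= total_samples


def latest_attestable_slot(blocks: List[BlockAvailability],
                           previous_attestation_slot: int,
                           attestation_slot: int) -> Optional[int]:
    """Return the slot of the last block in the window
    (previous_attestation_slot, attestation_slot] up to which the condition
    holds, or None if the first block in the window already fails."""
    latest = None
    for block in sorted(blocks, key=lambda b: b.slot):
        if not (previous_attestation_slot < block.slot <= attestation_slot):
            continue
        if not block.assigned_lines_available:
            break  # do not attest to this block or any of its descendants
        latest = block.slot
    return latest
```

A return value of None means that no block in the window qualifies. What the validator attests to in that case is left open here.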

This leads to the security property that a chain that is not fully available cannot have more than 1/16th of all validators voting for it. TODO: This claim is for an "infinite number" of validators. Compute the concrete security due to sampling bias.

Sample reconstruction

A validator that has received enough samples of a row or column to mark it as available should reconstruct all samples in that row/column (if they aren't all available already). The function reconstruct_polynomial gives an example implementation for this.

Once they have run the reconstruction function, they should distribute each reconstructed sample on every pubsub topic that the local node is subscribed to, unless they have already received that sample on that topic. As an example:

  • The validator is subscribed to row 2 and column 5
  • The sample (row, column) = (2, 5) is missing in the column 5 pubsub
  • After they have reconstructed row 2, the validator should send the sample (2, 5) on to the row 2 pubsub (if it was missing) as well as the column 5 pubsub.
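
The example above can be expressed as a small routine that decides which reconstructed samples to publish on which topic. This is a sketch under assumptions: samples are reduced to (row, column) pairs, and the topic names row_N and column_N as well as the function name samples_to_publish are invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Sample = Tuple[int, int]  # (row, column)


def samples_to_publish(reconstructed: List[Sample],
                       subscribed_rows: Set[int],
                       subscribed_columns: Set[int],
                       seen: Dict[str, Set[Sample]]) -> Dict[str, List[Sample]]:
    """Map each subscribed topic to the reconstructed samples that have not
    yet been received on that topic."""
    outgoing: Dict[str, List[Sample]] = {}
    for row, column in reconstructed:
        topics = []
        if row in subscribed_rows:
            topics.append(f"row_{row}")
        if column in subscribed_columns:
            topics.append(f"column_{column}")
        for topic in topics:
            if (row, column) not in seen.get(topic, set()):
                outgoing.setdefault(topic, []).append((row, column))
    return outgoing
```

For a validator subscribed to row 2 and column 5 that has just reconstructed row 2, the sample (2, 5) is the only one that crosses over into the column 5 topic; all other samples of row 2 belong to columns the validator is not subscribed to.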

TODO: We need to verify the total complexity of doing this and make sure this does not cause too much load on a validator.

Minimum online validator requirement

The data availability construction guarantees that reconstruction is possible if 75% of all samples are available. In this case, at least 50% of all rows and 50% of all columns are independently available: a row or column is unavailable only if more than half of its samples are missing, so with at most 25% of all samples missing, fewer than half of the rows (or columns) can be unavailable. In practice, it is likely that some supernodes will centrally collect all samples and fill in any gaps. However, we want to build a system that reliably reconstructs even in the absence of all supernodes. Any row or column with 50% of its samples will easily be reconstructed even with only hundreds of validators online, so the only question is how we get to 50% of samples for all rows and columns when some of them might be completely unseeded.

Each validator will transfer 4 samples between rows and columns where there is overlap. Without loss of generality, look at row 0. Each validator has a 1/128 chance of having a sample in this row, and we need 256 samples to reconstruct it. So we expect to need ~256 * 128 = 32,768 validators to have a fair chance of reconstructing it if it was completely unseeded.
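
The back-of-the-envelope figure above can be reproduced in a few lines. The first function is the calculation from the text. The second is an extra refinement that is not part of this document: it assumes that hits land uniformly at random on a row of 512 samples (inferred from 256 being half of a row) and accounts for several validators contributing the same sample. Both function names are made up for this sketch.

```python
import math

SAMPLES_NEEDED = 256           # samples required to reconstruct one row
ROW_HIT_PROBABILITY = 1 / 128  # chance that a validator contributes a sample to this row
ASSUMED_ROW_LENGTH = 512       # assumption: 256 is half of the row


def naive_validators_needed(samples_needed: int, hit_probability: float) -> int:
    """Validators needed so that the expected number of hits equals samples_needed."""
    return math.ceil(samples_needed / hit_probability)


def validators_needed_with_duplicates(samples_needed: int,
                                      row_length: int,
                                      hit_probability: float) -> int:
    """Validators needed so that the expected number of *distinct* samples
    reaches samples_needed, if each hit lands uniformly on the row."""
    # Expected distinct after k hits: row_length * (1 - (1 - 1/row_length) ** k)
    hits = math.ceil(
        math.log(1 - samples_needed / row_length) / math.log(1 - 1 / row_length)
    )
    return math.ceil(hits / hit_probability)


print(naive_validators_needed(SAMPLES_NEEDED, ROW_HIT_PROBABILITY))  # 32768
print(validators_needed_with_duplicates(SAMPLES_NEEDED, ASSUMED_ROW_LENGTH, ROW_HIT_PROBABILITY))
```

Under the uniform-hit assumption the second figure comes out noticeably higher than 32,768, which is one reason a more careful estimate demands more online validators than the naive product.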

A more elaborate estimate suggests that about 55,000 validators need to be online for high safety that each row and column will be reconstructed.