openethereum/ethcore/sync/src/chain/handler.rs

914 lines
34 KiB
Rust
Raw Normal View History

// Copyright 2015-2020 Parity Technologies (UK) Ltd.
// This file is part of Parity Ethereum.
// Parity Ethereum is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Parity Ethereum is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Parity Ethereum. If not, see <http://www.gnu.org/licenses/>.
use std::time::Instant;
use std::{mem, cmp};
use crate::{
snapshot_sync::ChunkType,
sync_io::SyncIo,
api::{ETH_PROTOCOL, WARP_SYNC_PROTOCOL_ID},
block_sync::{BlockDownloaderImportError as DownloaderImportError, DownloadAction},
chain::{
sync_packet::{
PacketInfo,
SyncPacket::{
self, BlockBodiesPacket, BlockHeadersPacket, NewBlockHashesPacket, NewBlockPacket,
PrivateStatePacket, PrivateTransactionPacket, ReceiptsPacket, SignedPrivateTransactionPacket,
SnapshotDataPacket, SnapshotManifestPacket, StatusPacket,
}
},
BlockSet, ChainSync, ForkConfirmation, PacketDecodeError, PeerAsking, PeerInfo, SyncRequester,
SyncState, ETH_PROTOCOL_VERSION_63, ETH_PROTOCOL_VERSION_64, MAX_NEW_BLOCK_AGE, MAX_NEW_HASHES,
PAR_PROTOCOL_VERSION_1, PAR_PROTOCOL_VERSION_3, PAR_PROTOCOL_VERSION_4,
}
};
use bytes::Bytes;
use enum_primitive::FromPrimitive;
use ethereum_types::{H256, U256};
use keccak_hash::keccak;
use network::PeerId;
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
use network::client_version::ClientVersion;
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
use log::{debug, trace, error, warn};
use rlp::Rlp;
use common_types::{
Move more types out of ethcore (#10880) * WIP move errors, pod_account and state account to own crates * Sort out dependencies, fix broken code and tests Remove botched ethcore-error crate * remove template line * fix review feedback * Remove test-only AccountDBMut::new * Extract AccountDB to account-db * Move Substate to state-account – wip * Add lib.rs * cleanup * test failure * test failure 2 * third time's the charm * Add factories crate * Use new factories crate * Use factories crate * Extract trace * Fix tests * Sort out parity-util-mem and parking_lot * cleanup * WIP port over the rest of state from ethcore * Collect all impls for Machine * some notes * Rename pod-account to pod * Move PodState to pod crate * Use PodState from pod crate * Fix use clause for json tests * Sort out evmbin * Add missing code and use PodState * Move code that depends on Machine and Executive to own module * Sort out cloning errors, fix ethcore to use new state crate * Do without funky From impls * Fix ethcore tests * Fixes around the project to use new state crate * Add back the more specific impls of StateOrBlock From conversions * Move execute to freestanding function and remove it from trait Sort out the error handling in executive_state by moving the result types from state to ethcore Undo the verbose code added to work around the StateOrBlock From conversions * cleanup * Fix "error: enum variants on type aliases are experimental" * Bring back the state tests Fix whitespace * remove ethcore/state/mod.rs * cleanup * cleanup * Cleanup state-account errors * Fix more todos Add module docs * Add error.rs * Fixup Cargo.lock * Smaller ethcore API is fine * Add `to-pod-full` feature to state-account Fix evmbin * Fix a few more test failures * Fix RPC test build * Baptize the new trait * Remove resolved TODOs * Rename state-account to account-state * Do not re-export the trace crate * Don't export state_db from ethcore * Let private-tx use StateDB. :( * Remove ethcore/src/pod_state.rs * Inner type does not need to be pub/pub(crate) * optimise imports * Revert "Inner type does not need to be pub/pub(crate)" This reverts commit 2f839f8a0f72f71334da64620f57e6dd6039f06b. * Move DatabaseExtras to ethcore-blockchain * Add database_extra module to ethcore-blockchain * Remove to-pod-full feature * cosmetics * New crate: state-db * Add new crate * Move PreverifiedBlock and BlockError to types * Sort out the merge * Add missing `license` meta data keys * wip * wip client-traits * merge conflict * verification crate type checks * Move impls for CommonParams to common_types Fix misc stuff in ethcore * Fix tests * Implement VerifyingEngine for all engines except Ethash Temporarily sort out error handling Move more types to common_types * Split Engine in two and move code around * cleanup * verification: don't rexport common_types * Use error from common_types * Consolidate error types * VerifyingEngine use Errors from common_types * verification: Use error type from common_types * SnapshotError moved to common_types * Move more code from Engne to VerifyingEngine Add a VerifyingClient trait: BlockInfo + CallContract Whitespace * Add MAX_UNCLE_AGE const * Port over remaining code from ethcore/verification * Use errors from common_types * Fix the confusing "io" naming * Move more types into common_types * Add todos * Experiment with Engine trait outside ethcore * Hook up types from common_types in ethcore Don't use verification crate Don't use client-traits crate * Revert to impl Engine for Arc<Ethash> and add note to explain why Revert moving ClientIoMessage to common_types Fix build * Remove ClientIoMessage from common_types * Cleanup * More cleanup * Sort error handling changes in the rest of parity * Remove unused code * Remove WIP types * Cleanup todos not tackled here * remove cruft * Fix some whitespace and a merge error * ethcore tests * test failures * Restore Engine impls to master to make review a bit easier * cleanup * whitespace * applied review suggestions * types does not depend on rustc-hex * ethash engine moved to engine module * applied review suggestion
2019-07-18 12:27:08 +02:00
BlockNumber,
block_status::BlockStatus,
ids::BlockId,
errors::{EthcoreError, ImportError, BlockError},
Extract the Engine trait (#10958) * Add client-traits crate Move the BlockInfo trait to new crate * New crate `machine` Contains code extracted from ethcore that defines `Machine`, `Externalities` and other execution related code. * Use new machine and client-traits crates in ethcore * Use new crates machine and client-traits instead of ethcore where appropriate * Fix tests * Don't re-export so many types from ethcore::client * Fixing more fallout from removing re-export * fix test * More fallout from not re-exporting types * Add some docs * cleanup * import the macro edition style * Tweak docs * Add missing import * remove unused ethabi_derive imports * Use latest ethabi-contract * Move many traits from ethcore/client/traits to client-traits crate Initial version of extracted Engine trait * Move snapshot related traits to the engine crate (eew) * Move a few snapshot related types to common_types Cleanup Executed as exported from machine crate * fix warning * Gradually introduce new engine crate: snapshot * ethcore typechecks with new engine crate * Sort out types outside ethcore * Add an EpochVerifier to ethash and use that in Engine.epoch_verifier() Cleanup * Document pub members * Sort out tests Sort out default impls for EpochVerifier * Add test-helpers feature and move EngineSigner impl to the right place * Sort out tests * Sort out tests and refactor verification types * Fix missing traits * More missing traits Fix Histogram * Fix tests and cleanup * cleanup * Put back needed logger import * Don't rexport common_types from ethcore/src/client Don't export ethcore::client::* * Remove files no longer used Use types from the engine crate Explicit exports from engine::engine * Get rid of itertools * Move a few more traits from ethcore to client-traits: BlockChainReset, ScheduleInfo, StateClient * Move ProvingBlockChainClient to client-traits * Don't re-export ForkChoice and Transition from ethcore * Address grumbles: sort imports, remove commented out code * Fix merge resolution error * merge failure
2019-08-15 17:59:22 +02:00
verification::Unverified,
Extract engines to own crates (#10966) * Add client-traits crate Move the BlockInfo trait to new crate * New crate `machine` Contains code extracted from ethcore that defines `Machine`, `Externalities` and other execution related code. * Use new machine and client-traits crates in ethcore * Use new crates machine and client-traits instead of ethcore where appropriate * Fix tests * Don't re-export so many types from ethcore::client * Fixing more fallout from removing re-export * fix test * More fallout from not re-exporting types * Add some docs * cleanup * import the macro edition style * Tweak docs * Add missing import * remove unused ethabi_derive imports * Use latest ethabi-contract * Move many traits from ethcore/client/traits to client-traits crate Initial version of extracted Engine trait * Move snapshot related traits to the engine crate (eew) * Move a few snapshot related types to common_types Cleanup Executed as exported from machine crate * fix warning * Gradually introduce new engine crate: snapshot * ethcore typechecks with new engine crate * Sort out types outside ethcore * Add an EpochVerifier to ethash and use that in Engine.epoch_verifier() Cleanup * Document pub members * Sort out tests Sort out default impls for EpochVerifier * Add test-helpers feature and move EngineSigner impl to the right place * Sort out tests * Sort out tests and refactor verification types * Fix missing traits * More missing traits Fix Histogram * Fix tests and cleanup * cleanup * Put back needed logger import * Don't rexport common_types from ethcore/src/client Don't export ethcore::client::* * Remove files no longer used Use types from the engine crate Explicit exports from engine::engine * Get rid of itertools * Move a few more traits from ethcore to client-traits: BlockChainReset, ScheduleInfo, StateClient * Move ProvingBlockChainClient to client-traits * Don't re-export ForkChoice and Transition from ethcore * Address grumbles: sort imports, remove commented out code * Fix merge resolution error * Extract the Clique engine to own crate * Extract NullEngine and the block_reward module from ethcore * Extract InstantSeal engine to own crate * Extract remaining engines * Extract executive_state to own crate so it can be used by engine crates * Remove snapshot stuff from the engine crate * Put snapshot traits back in ethcore * cleanup * Remove stuff from ethcore * Don't use itertools * itertools in aura is legit-ish * More post-merge fixes * Re-export less types in client * cleanup * Update ethcore/block-reward/Cargo.toml Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com> * Update ethcore/engines/basic-authority/Cargo.toml Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com> * Update ethcore/engines/ethash/Cargo.toml Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com> * Update ethcore/engines/clique/src/lib.rs Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com> * signers is already a ref * Add an EngineType enum to tighten up Engine.name() * Introduce Snapshotting enum to distinguish the type of snapshots a chain uses * Rename supports_warp to snapshot_mode * Missing import * Update ethcore/src/snapshot/consensus/mod.rs Co-Authored-By: Tomasz Drwięga <tomusdrw@users.noreply.github.com> * remove double-semicolons
2019-08-22 18:25:49 +02:00
snapshot::{ManifestData, RestorationStatus},
Move more types out of ethcore (#10880) * WIP move errors, pod_account and state account to own crates * Sort out dependencies, fix broken code and tests Remove botched ethcore-error crate * remove template line * fix review feedback * Remove test-only AccountDBMut::new * Extract AccountDB to account-db * Move Substate to state-account – wip * Add lib.rs * cleanup * test failure * test failure 2 * third time's the charm * Add factories crate * Use new factories crate * Use factories crate * Extract trace * Fix tests * Sort out parity-util-mem and parking_lot * cleanup * WIP port over the rest of state from ethcore * Collect all impls for Machine * some notes * Rename pod-account to pod * Move PodState to pod crate * Use PodState from pod crate * Fix use clause for json tests * Sort out evmbin * Add missing code and use PodState * Move code that depends on Machine and Executive to own module * Sort out cloning errors, fix ethcore to use new state crate * Do without funky From impls * Fix ethcore tests * Fixes around the project to use new state crate * Add back the more specific impls of StateOrBlock From conversions * Move execute to freestanding function and remove it from trait Sort out the error handling in executive_state by moving the result types from state to ethcore Undo the verbose code added to work around the StateOrBlock From conversions * cleanup * Fix "error: enum variants on type aliases are experimental" * Bring back the state tests Fix whitespace * remove ethcore/state/mod.rs * cleanup * cleanup * Cleanup state-account errors * Fix more todos Add module docs * Add error.rs * Fixup Cargo.lock * Smaller ethcore API is fine * Add `to-pod-full` feature to state-account Fix evmbin * Fix a few more test failures * Fix RPC test build * Baptize the new trait * Remove resolved TODOs * Rename state-account to account-state * Do not re-export the trace crate * Don't export state_db from ethcore * Let private-tx use StateDB. :( * Remove ethcore/src/pod_state.rs * Inner type does not need to be pub/pub(crate) * optimise imports * Revert "Inner type does not need to be pub/pub(crate)" This reverts commit 2f839f8a0f72f71334da64620f57e6dd6039f06b. * Move DatabaseExtras to ethcore-blockchain * Add database_extra module to ethcore-blockchain * Remove to-pod-full feature * cosmetics * New crate: state-db * Add new crate * Move PreverifiedBlock and BlockError to types * Sort out the merge * Add missing `license` meta data keys * wip * wip client-traits * merge conflict * verification crate type checks * Move impls for CommonParams to common_types Fix misc stuff in ethcore * Fix tests * Implement VerifyingEngine for all engines except Ethash Temporarily sort out error handling Move more types to common_types * Split Engine in two and move code around * cleanup * verification: don't rexport common_types * Use error from common_types * Consolidate error types * VerifyingEngine use Errors from common_types * verification: Use error type from common_types * SnapshotError moved to common_types * Move more code from Engne to VerifyingEngine Add a VerifyingClient trait: BlockInfo + CallContract Whitespace * Add MAX_UNCLE_AGE const * Port over remaining code from ethcore/verification * Use errors from common_types * Fix the confusing "io" naming * Move more types into common_types * Add todos * Experiment with Engine trait outside ethcore * Hook up types from common_types in ethcore Don't use verification crate Don't use client-traits crate * Revert to impl Engine for Arc<Ethash> and add note to explain why Revert moving ClientIoMessage to common_types Fix build * Remove ClientIoMessage from common_types * Cleanup * More cleanup * Sort error handling changes in the rest of parity * Remove unused code * Remove WIP types * Cleanup todos not tackled here * remove cruft * Fix some whitespace and a merge error * ethcore tests * test failures * Restore Engine impls to master to make review a bit easier * cleanup * whitespace * applied review suggestions * types does not depend on rustc-hex * ethash engine moved to engine module * applied review suggestion
2019-07-18 12:27:08 +02:00
};
/// The Chain Sync Handler: handles responses from peers
pub struct SyncHandler;
impl SyncHandler {
/// Handle incoming packet from peer
pub fn on_packet(sync: &mut ChainSync, io: &mut dyn SyncIo, peer: PeerId, packet_id: u8, data: &[u8]) {
let rlp = Rlp::new(data);
if let Some(packet_id) = SyncPacket::from_u8(packet_id) {
let result = match packet_id {
StatusPacket => SyncHandler::on_peer_status(sync, io, peer, &rlp),
BlockHeadersPacket => SyncHandler::on_peer_block_headers(sync, io, peer, &rlp),
BlockBodiesPacket => SyncHandler::on_peer_block_bodies(sync, io, peer, &rlp),
ReceiptsPacket => SyncHandler::on_peer_block_receipts(sync, io, peer, &rlp),
NewBlockPacket => SyncHandler::on_peer_new_block(sync, io, peer, &rlp),
NewBlockHashesPacket => SyncHandler::on_peer_new_hashes(sync, io, peer, &rlp),
SnapshotManifestPacket => SyncHandler::on_snapshot_manifest(sync, io, peer, &rlp),
SnapshotDataPacket => SyncHandler::on_snapshot_data(sync, io, peer, &rlp),
PrivateTransactionPacket => SyncHandler::on_private_transaction(sync, io, peer, &rlp),
SignedPrivateTransactionPacket => SyncHandler::on_signed_private_transaction(sync, io, peer, &rlp),
PrivateStatePacket => SyncHandler::on_private_state_data(sync, io, peer, &rlp),
_ => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "sync", "{}: Unknown packet {}", peer, packet_id.id());
Ok(())
}
};
match result {
Err(DownloaderImportError::Invalid) => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target:"sync", "{} -> Invalid packet {}", peer, packet_id.id());
io.disable_peer(peer);
sync.deactivate_peer(io, peer);
},
Err(DownloaderImportError::Useless) => {
sync.deactivate_peer(io, peer);
},
Ok(()) => {
// give a task to the same peer first
sync.sync_peer(io, peer, false);
},
}
} else {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "sync", "{}: Unknown packet {}", peer, packet_id);
}
}
/// Called when peer sends us new consensus packet
pub fn on_consensus_packet(io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) {
trace!(target: "sync", "Received consensus packet from {:?}", peer_id);
io.chain().queue_consensus_message(r.as_raw().to_vec());
}
/// Called by peer when it is disconnecting
pub fn on_peer_aborting(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId) {
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
trace!(target: "sync", "== Disconnecting {}: {}", peer_id, io.peer_version(peer_id));
sync.handshaking_peers.remove(&peer_id);
if sync.peers.contains_key(&peer_id) {
debug!(target: "sync", "Disconnected {}", peer_id);
sync.clear_peer_download(peer_id);
sync.peers.remove(&peer_id);
sync.active_peers.remove(&peer_id);
if sync.state == SyncState::SnapshotManifest {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
// Check if we are asking other peers for a snapshot manifest as well. If not,
// set our state to initial state (`Idle` or `WaitingPeers`).
let still_seeking_manifest = sync.peers.iter()
.filter(|&(id, p)| sync.active_peers.contains(id) && p.asking == PeerAsking::SnapshotManifest)
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
.next().is_some();
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
if !still_seeking_manifest {
warn!(target: "snapshot_sync", "The peer we were downloading a snapshot from ({}) went away. Retrying.", peer_id);
sync.state = ChainSync::get_init_state(sync.warp_sync, io.chain());
}
}
sync.continue_sync(io);
}
}
/// Called when a new peer is connected
pub fn on_peer_connected(sync: &mut ChainSync, io: &mut dyn SyncIo, peer: PeerId) {
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
trace!(target: "sync", "== Connected {}: {}", peer, io.peer_version(peer));
if let Err(e) = sync.send_status(io, peer) {
debug!(target:"sync", "Error sending status request: {:?}", e);
io.disconnect_peer(peer);
} else {
sync.handshaking_peers.insert(peer, Instant::now());
}
}
/// Called by peer once it has new block bodies
pub fn on_peer_new_block(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "Ignoring new block from unconfirmed peer {}", peer_id);
return Ok(());
}
let difficulty: U256 = r.val_at(1)?;
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
if peer.difficulty.map_or(true, |pd| difficulty > pd) {
peer.difficulty = Some(difficulty);
}
}
let block = Unverified::from_rlp(r.at(0)?.as_raw().to_vec())?;
let hash = block.header.hash();
let number = block.header.number();
trace!(target: "sync", "{} -> NewBlock ({})", peer_id, hash);
if number > sync.highest_block.unwrap_or(0) {
sync.highest_block = Some(number);
}
let mut unknown = false;
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
peer.latest_hash = hash;
}
let last_imported_number = sync.new_blocks.last_imported_block_number();
if last_imported_number > number && last_imported_number - number > MAX_NEW_BLOCK_AGE {
trace!(target: "sync", "Ignored ancient new block {:?}", hash);
return Err(DownloaderImportError::Invalid);
}
match io.chain().import_block(block) {
Err(EthcoreError::Import(ImportError::AlreadyInChain)) => {
trace!(target: "sync", "New block already in chain {:?}", hash);
},
Err(EthcoreError::Import(ImportError::AlreadyQueued)) => {
trace!(target: "sync", "New block already queued {:?}", hash);
},
Ok(_) => {
// abort current download of the same block
sync.complete_sync(io);
sync.new_blocks.mark_as_known(&hash, number);
trace!(target: "sync", "New block queued {:?} ({})", hash, number);
},
Err(EthcoreError::Block(BlockError::UnknownParent(p))) => {
unknown = true;
trace!(target: "sync", "New block with unknown parent ({:?}) {:?}", p, hash);
},
Err(e) => {
debug!(target: "sync", "Bad new block {:?} : {:?}", hash, e);
return Err(DownloaderImportError::Invalid);
}
};
if unknown {
if sync.state != SyncState::Idle {
trace!(target: "sync", "NewBlock ignored while seeking");
} else {
trace!(target: "sync", "New unknown block {:?}", hash);
//TODO: handle too many unknown blocks
sync.sync_peer(io, peer_id, true);
}
}
Ok(())
}
/// Handles `NewHashes` packet. Initiates headers download for any unknown hashes.
pub fn on_peer_new_hashes(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "Ignoring new hashes from unconfirmed peer {}", peer_id);
return Ok(());
}
let hashes: Vec<_> = r.iter().take(MAX_NEW_HASHES).map(|item| (item.val_at::<H256>(0), item.val_at::<BlockNumber>(1))).collect();
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
// Peer has new blocks with unknown difficulty
peer.difficulty = None;
if let Some(&(Ok(ref h), _)) = hashes.last() {
peer.latest_hash = h.clone();
}
}
if sync.state != SyncState::Idle {
trace!(target: "sync", "Ignoring new hashes since we're already downloading.");
let max = r.iter().take(MAX_NEW_HASHES).map(|item| item.val_at::<BlockNumber>(1).unwrap_or(0)).fold(0u64, cmp::max);
if max > sync.highest_block.unwrap_or(0) {
sync.highest_block = Some(max);
}
return Ok(());
}
trace!(target: "sync", "{} -> NewHashes ({} entries)", peer_id, r.item_count()?);
let mut max_height: BlockNumber = 0;
let mut new_hashes = Vec::new();
let last_imported_number = sync.new_blocks.last_imported_block_number();
for (rh, rn) in hashes {
let hash = rh?;
let number = rn?;
if number > sync.highest_block.unwrap_or(0) {
sync.highest_block = Some(number);
}
if sync.new_blocks.is_downloading(&hash) {
continue;
}
if last_imported_number > number && last_imported_number - number > MAX_NEW_BLOCK_AGE {
trace!(target: "sync", "Ignored ancient new block hash {:?}", hash);
return Err(DownloaderImportError::Invalid);
}
match io.chain().block_status(BlockId::Hash(hash.clone())) {
BlockStatus::InChain => {
trace!(target: "sync", "New block hash already in chain {:?}", hash);
},
BlockStatus::Queued => {
trace!(target: "sync", "New hash block already queued {:?}", hash);
},
BlockStatus::Unknown => {
new_hashes.push(hash.clone());
if number > max_height {
trace!(target: "sync", "New unknown block hash {:?}", hash);
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
peer.latest_hash = hash.clone();
}
max_height = number;
}
},
BlockStatus::Bad => {
debug!(target: "sync", "Bad new block hash {:?}", hash);
return Err(DownloaderImportError::Invalid);
}
}
};
if max_height != 0 {
trace!(target: "sync", "Downloading blocks for new hashes");
sync.new_blocks.reset_to(new_hashes);
sync.state = SyncState::NewBlocks;
sync.sync_peer(io, peer_id, true);
}
Ok(())
}
/// Called by peer once it has new block bodies
fn on_peer_block_bodies(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
sync.clear_peer_download(peer_id);
let block_set = sync.peers.get(&peer_id)
.and_then(|p| p.block_set)
.unwrap_or(BlockSet::NewBlocks);
Fix ancient blocks sync (#9531) * Log block set in block_sync for easier debugging * logging macros * Match no args in sync logging macros * Add QueueFull error * Only allow importing headers if the first matches requested * WIP * Test for chain head gaps and log * Calc distance even with 2 heads * Revert previous commits, preparing simple fix This reverts commit 5f38aa885b22ebb0e3a1d60120cea69f9f322628. * Reject headers with no gaps when ChainHead * Reset block sync download when queue full * Simplify check for subchain heads * Add comment to explain subchain heads filter * Fix is_subchain_heads check and comment * Prevent premature round completion after restart This is a problem on mainnet where multiple stale peer requests will force many rounds to complete quickly, forcing the retraction. * Reset stale old blocks request after queue full * Revert "Reject headers with no gaps when ChainHead" This reverts commit 0eb865539e5dee37ab34f168f5fb643300de5ace. * Add BlockSet to BlockDownloader logging Currently it is difficult to debug this because there are two instances, one for OldBlocks and one for NewBlocks. This adds the BlockSet to all log messages for easy log filtering. * Reset OldBlocks download from last enqueued Previously when the ancient block queue was full it would restart the download from the last imported block, so the ones still in the queue would be redownloaded. Keeping the existing downloader instance and just resetting it will start again from the last enqueued block.:wq * Ignore expired Body and Receipt requests * Log when ancient block download being restarted * Only request old blocks from peers with >= difficulty https://github.com/paritytech/parity-ethereum/pull/9226 might be too permissive and causing the behaviour of the retraction soon after the fork block. With this change the peer difficulty has to be greater than or euqal to our syncing difficulty, so should still fix https://github.com/paritytech/parity-ethereum/issues/9225 * Some logging and clear stalled blocks head * Revert "Some logging and clear stalled blocks head" This reverts commit 757641d9b817ae8b63fec684759b0815af9c4d0e. * Reset stalled header if useless more than once * Store useless headers in HashSet * Add sync target to logging macro * Don't disable useless peer and fix log macro * Clear useless headers on reset and comments * Use custom error for collecting blocks Previously we resued BlockImportError, however only the Invalid case and this made little sense with the QueueFull error. * Remove blank line * Test for reset sync after consecutive useless headers * Don't reset after consecutive headers when chain head * Delete commented out imports * Return DownloadAction from collect_blocks instead of error * Don't reset after round complete, was causing test hangs * Add comment explaining reset after useless * Replace HashSet with counter for useless headers * Refactor sync reset on bad block/queue full * Add missing target for log message * Fix compiler errors and test after merge * ethcore: revert ethereum tests submodule update
2018-10-09 15:31:40 +02:00
let allowed = sync.peers.get(&peer_id).map(|p| p.is_allowed()).unwrap_or(false);
if !sync.reset_peer_asking(peer_id, PeerAsking::BlockBodies) || !allowed {
trace!(target: "sync", "{}: Ignored unexpected bodies", peer_id);
return Ok(());
}
let expected_blocks = match sync.peers.get_mut(&peer_id) {
Some(peer) => mem::replace(&mut peer.asking_blocks, Vec::new()),
None => {
trace!(target: "sync", "{}: Ignored unexpected bodies (peer not found)", peer_id);
return Ok(());
}
};
let item_count = r.item_count()?;
trace!(target: "sync", "{} -> BlockBodies ({} entries), set = {:?}", peer_id, item_count, block_set);
if item_count == 0 {
Err(DownloaderImportError::Useless)
} else if sync.state == SyncState::Waiting {
trace!(target: "sync", "Ignored block bodies while waiting");
Ok(())
} else {
{
let downloader = match block_set {
BlockSet::NewBlocks => &mut sync.new_blocks,
BlockSet::OldBlocks => match sync.old_blocks {
None => {
trace!(target: "sync", "Ignored block headers while block download is inactive");
return Ok(());
},
Some(ref mut blocks) => blocks,
}
};
downloader.import_bodies(r, expected_blocks.as_slice())?;
}
sync.collect_blocks(io, block_set);
Ok(())
}
}
fn on_peer_fork_header(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
{
let peer = sync.peers.get_mut(&peer_id).expect("Is only called when peer is present in peers");
peer.asking = PeerAsking::Nothing;
let item_count = r.item_count()?;
let (fork_number, fork_hash) = sync.fork_block.expect("ForkHeader request is sent only fork block is Some; qed").clone();
if item_count == 0 || item_count != 1 {
trace!(target: "sync", "{}: Chain is too short to confirm the block", peer_id);
peer.confirmation = ForkConfirmation::TooShort;
} else {
let header = r.at(0)?.as_raw();
if keccak(&header) != fork_hash {
trace!(target: "sync", "{}: Fork mismatch", peer_id);
return Err(DownloaderImportError::Invalid);
}
trace!(target: "sync", "{}: Confirmed peer", peer_id);
peer.confirmation = ForkConfirmation::Confirmed;
if !io.chain_overlay().read().contains_key(&fork_number) {
trace!(target: "sync", "Inserting (fork) block {} header", fork_number);
io.chain_overlay().write().insert(fork_number, header.to_vec());
}
}
}
return Ok(());
}
/// Called by peer once it has new block headers during sync
fn on_peer_block_headers(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
let is_fork_header_request = match sync.peers.get(&peer_id) {
Some(peer) if peer.asking == PeerAsking::ForkHeader => true,
_ => false,
};
if is_fork_header_request {
return SyncHandler::on_peer_fork_header(sync, io, peer_id, r);
}
sync.clear_peer_download(peer_id);
let expected_hash = sync.peers.get(&peer_id).and_then(|p| p.asking_hash);
let allowed = sync.peers.get(&peer_id).map(|p| p.is_allowed()).unwrap_or(false);
let block_set = sync.peers.get(&peer_id).and_then(|p| p.block_set).unwrap_or(BlockSet::NewBlocks);
if !sync.reset_peer_asking(peer_id, PeerAsking::BlockHeaders) {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "sync", "{}: Ignored unexpected headers", peer_id);
return Ok(());
}
let expected_hash = match expected_hash {
Some(hash) => hash,
None => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "sync", "{}: Ignored unexpected headers (expected_hash is None)", peer_id);
return Ok(());
}
};
if !allowed {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "sync", "{}: Ignored unexpected headers (peer not allowed)", peer_id);
return Ok(());
}
let item_count = r.item_count()?;
trace!(target: "sync", "{} -> BlockHeaders ({} entries), state = {:?}, set = {:?}", peer_id, item_count, sync.state, block_set);
if (sync.state == SyncState::Idle || sync.state == SyncState::WaitingPeers) && sync.old_blocks.is_none() {
trace!(target: "sync", "Ignored unexpected block headers");
return Ok(());
}
if sync.state == SyncState::Waiting {
trace!(target: "sync", "Ignored block headers while waiting");
return Ok(());
}
let result = {
let downloader = match block_set {
BlockSet::NewBlocks => &mut sync.new_blocks,
BlockSet::OldBlocks => {
match sync.old_blocks {
None => {
trace!(target: "sync", "Ignored block headers while block download is inactive");
return Ok(());
},
Some(ref mut blocks) => blocks,
}
}
};
downloader.import_headers(io, r, expected_hash)?
};
Fix ancient blocks sync (#9531) * Log block set in block_sync for easier debugging * logging macros * Match no args in sync logging macros * Add QueueFull error * Only allow importing headers if the first matches requested * WIP * Test for chain head gaps and log * Calc distance even with 2 heads * Revert previous commits, preparing simple fix This reverts commit 5f38aa885b22ebb0e3a1d60120cea69f9f322628. * Reject headers with no gaps when ChainHead * Reset block sync download when queue full * Simplify check for subchain heads * Add comment to explain subchain heads filter * Fix is_subchain_heads check and comment * Prevent premature round completion after restart This is a problem on mainnet where multiple stale peer requests will force many rounds to complete quickly, forcing the retraction. * Reset stale old blocks request after queue full * Revert "Reject headers with no gaps when ChainHead" This reverts commit 0eb865539e5dee37ab34f168f5fb643300de5ace. * Add BlockSet to BlockDownloader logging Currently it is difficult to debug this because there are two instances, one for OldBlocks and one for NewBlocks. This adds the BlockSet to all log messages for easy log filtering. * Reset OldBlocks download from last enqueued Previously when the ancient block queue was full it would restart the download from the last imported block, so the ones still in the queue would be redownloaded. Keeping the existing downloader instance and just resetting it will start again from the last enqueued block.:wq * Ignore expired Body and Receipt requests * Log when ancient block download being restarted * Only request old blocks from peers with >= difficulty https://github.com/paritytech/parity-ethereum/pull/9226 might be too permissive and causing the behaviour of the retraction soon after the fork block. With this change the peer difficulty has to be greater than or euqal to our syncing difficulty, so should still fix https://github.com/paritytech/parity-ethereum/issues/9225 * Some logging and clear stalled blocks head * Revert "Some logging and clear stalled blocks head" This reverts commit 757641d9b817ae8b63fec684759b0815af9c4d0e. * Reset stalled header if useless more than once * Store useless headers in HashSet * Add sync target to logging macro * Don't disable useless peer and fix log macro * Clear useless headers on reset and comments * Use custom error for collecting blocks Previously we resued BlockImportError, however only the Invalid case and this made little sense with the QueueFull error. * Remove blank line * Test for reset sync after consecutive useless headers * Don't reset after consecutive headers when chain head * Delete commented out imports * Return DownloadAction from collect_blocks instead of error * Don't reset after round complete, was causing test hangs * Add comment explaining reset after useless * Replace HashSet with counter for useless headers * Refactor sync reset on bad block/queue full * Add missing target for log message * Fix compiler errors and test after merge * ethcore: revert ethereum tests submodule update
2018-10-09 15:31:40 +02:00
if result == DownloadAction::Reset {
sync.reset_downloads(block_set);
}
sync.collect_blocks(io, block_set);
Ok(())
}
/// Called by peer once it has new block receipts
fn on_peer_block_receipts(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
sync.clear_peer_download(peer_id);
let block_set = sync.peers.get(&peer_id).and_then(|p| p.block_set).unwrap_or(BlockSet::NewBlocks);
Fix ancient blocks sync (#9531) * Log block set in block_sync for easier debugging * logging macros * Match no args in sync logging macros * Add QueueFull error * Only allow importing headers if the first matches requested * WIP * Test for chain head gaps and log * Calc distance even with 2 heads * Revert previous commits, preparing simple fix This reverts commit 5f38aa885b22ebb0e3a1d60120cea69f9f322628. * Reject headers with no gaps when ChainHead * Reset block sync download when queue full * Simplify check for subchain heads * Add comment to explain subchain heads filter * Fix is_subchain_heads check and comment * Prevent premature round completion after restart This is a problem on mainnet where multiple stale peer requests will force many rounds to complete quickly, forcing the retraction. * Reset stale old blocks request after queue full * Revert "Reject headers with no gaps when ChainHead" This reverts commit 0eb865539e5dee37ab34f168f5fb643300de5ace. * Add BlockSet to BlockDownloader logging Currently it is difficult to debug this because there are two instances, one for OldBlocks and one for NewBlocks. This adds the BlockSet to all log messages for easy log filtering. * Reset OldBlocks download from last enqueued Previously when the ancient block queue was full it would restart the download from the last imported block, so the ones still in the queue would be redownloaded. Keeping the existing downloader instance and just resetting it will start again from the last enqueued block.:wq * Ignore expired Body and Receipt requests * Log when ancient block download being restarted * Only request old blocks from peers with >= difficulty https://github.com/paritytech/parity-ethereum/pull/9226 might be too permissive and causing the behaviour of the retraction soon after the fork block. With this change the peer difficulty has to be greater than or euqal to our syncing difficulty, so should still fix https://github.com/paritytech/parity-ethereum/issues/9225 * Some logging and clear stalled blocks head * Revert "Some logging and clear stalled blocks head" This reverts commit 757641d9b817ae8b63fec684759b0815af9c4d0e. * Reset stalled header if useless more than once * Store useless headers in HashSet * Add sync target to logging macro * Don't disable useless peer and fix log macro * Clear useless headers on reset and comments * Use custom error for collecting blocks Previously we resued BlockImportError, however only the Invalid case and this made little sense with the QueueFull error. * Remove blank line * Test for reset sync after consecutive useless headers * Don't reset after consecutive headers when chain head * Delete commented out imports * Return DownloadAction from collect_blocks instead of error * Don't reset after round complete, was causing test hangs * Add comment explaining reset after useless * Replace HashSet with counter for useless headers * Refactor sync reset on bad block/queue full * Add missing target for log message * Fix compiler errors and test after merge * ethcore: revert ethereum tests submodule update
2018-10-09 15:31:40 +02:00
let allowed = sync.peers.get(&peer_id).map(|p| p.is_allowed()).unwrap_or(false);
if !sync.reset_peer_asking(peer_id, PeerAsking::BlockReceipts) || !allowed {
trace!(target: "sync", "{}: Ignored unexpected receipts", peer_id);
return Ok(());
}
let expected_blocks = match sync.peers.get_mut(&peer_id) {
Some(peer) => mem::replace(&mut peer.asking_blocks, Vec::new()),
None => {
trace!(target: "sync", "{}: Ignored unexpected bodies (peer not found)", peer_id);
return Ok(());
}
};
let item_count = r.item_count()?;
trace!(target: "sync", "{} -> BlockReceipts ({} entries)", peer_id, item_count);
if item_count == 0 {
Err(DownloaderImportError::Useless)
} else if sync.state == SyncState::Waiting {
trace!(target: "sync", "Ignored block receipts while waiting");
Ok(())
} else {
{
let downloader = match block_set {
BlockSet::NewBlocks => &mut sync.new_blocks,
BlockSet::OldBlocks => match sync.old_blocks {
None => {
trace!(target: "sync", "Ignored block headers while block download is inactive");
return Ok(());
},
Some(ref mut blocks) => blocks,
}
};
downloader.import_receipts(r, expected_blocks.as_slice())?;
}
sync.collect_blocks(io, block_set);
Ok(())
}
}
/// Called when snapshot manifest is downloaded from a peer.
fn on_snapshot_manifest(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "Ignoring snapshot manifest from unconfirmed peer {}", peer_id);
return Ok(());
}
sync.clear_peer_download(peer_id);
if !sync.reset_peer_asking(peer_id, PeerAsking::SnapshotManifest) || sync.state != SyncState::SnapshotManifest {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Ignored unexpected/expired manifest", peer_id);
return Ok(());
}
let manifest_rlp = r.at(0)?;
let manifest = ManifestData::from_rlp(manifest_rlp.as_raw())?;
let is_supported_version = io.snapshot_service().supported_versions()
.map_or(false, |(l, h)| manifest.version >= l && manifest.version <= h);
if !is_supported_version {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
warn!(target: "snapshot_sync", "{}: Snapshot manifest version not supported: {}", peer_id, manifest.version);
return Err(DownloaderImportError::Invalid);
}
sync.snapshot.reset_to(&manifest, &keccak(manifest_rlp.as_raw()));
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
debug!(target: "snapshot_sync", "{}: Peer sent a snapshot manifest we can use. Block number #{}, block chunks: {}, state chunks: {}",
peer_id, manifest.block_number, manifest.block_hashes.len(), manifest.state_hashes.len());
io.snapshot_service().begin_restore(manifest);
sync.state = SyncState::SnapshotData;
Ok(())
}
/// Called when snapshot data is downloaded from a peer.
fn on_snapshot_data(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "Ignoring snapshot data from unconfirmed peer {}", peer_id);
return Ok(());
}
sync.clear_peer_download(peer_id);
if !sync.reset_peer_asking(peer_id, PeerAsking::SnapshotData) || (sync.state != SyncState::SnapshotData && sync.state != SyncState::SnapshotWaiting) {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Ignored unexpected snapshot data", peer_id);
return Ok(());
}
// check service status
let status = io.snapshot_service().status();
match status {
RestorationStatus::Inactive | RestorationStatus::Failed => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Snapshot restoration status: {:?}", peer_id, status);
sync.state = SyncState::WaitingPeers;
// only note bad if restoration failed.
if let (Some(hash), RestorationStatus::Failed) = (sync.snapshot.snapshot_hash(), status) {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
debug!(target: "snapshot_sync", "Marking snapshot manifest hash {} as bad", hash);
sync.snapshot.note_bad(hash);
}
sync.snapshot.clear();
return Ok(());
},
RestorationStatus::Initializing { .. } => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Snapshot restoration is initializing. Can't accept data right now.", peer_id);
return Ok(());
}
RestorationStatus::Finalizing => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Snapshot finalizing restoration. Can't accept data right now.", peer_id);
return Ok(());
}
RestorationStatus::Ongoing { .. } => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Snapshot restoration is ongoing", peer_id);
},
}
let snapshot_data: Bytes = r.val_at(0)?;
match sync.snapshot.validate_chunk(&snapshot_data) {
Ok(ChunkType::Block(hash)) => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Processing block chunk", peer_id);
io.snapshot_service().restore_block_chunk(hash, snapshot_data);
}
Ok(ChunkType::State(hash)) => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Processing state chunk", peer_id);
io.snapshot_service().restore_state_chunk(hash, snapshot_data);
}
Err(()) => {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
trace!(target: "snapshot_sync", "{}: Got bad snapshot chunk", peer_id);
io.disconnect_peer(peer_id);
return Ok(());
}
}
if sync.snapshot.is_complete() {
// wait for snapshot restoration process to complete
sync.state = SyncState::SnapshotWaiting;
}
Ok(())
}
/// Called by peer to report status
fn on_peer_status(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
let mut r = r.iter();
sync.handshaking_peers.remove(&peer_id);
let protocol_version: u8 = r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?;
let eth_protocol_version = io.protocol_version(&ETH_PROTOCOL, peer_id);
let warp_protocol_version = io.protocol_version(&WARP_SYNC_PROTOCOL_ID, peer_id);
let warp_protocol = warp_protocol_version != 0;
let private_tx_protocol = warp_protocol_version >= PAR_PROTOCOL_VERSION_3.0;
let network_id = r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?;
let difficulty = Some(r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?);
let latest_hash = r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?;
let genesis = r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?;
let forkid_validation_error = {
if eth_protocol_version >= ETH_PROTOCOL_VERSION_64.0 {
let fork_id = r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?;
sync.fork_filter.is_compatible(io.chain(), fork_id).err().map(|e| (fork_id, e))
} else {
None
}
};
let snapshot_hash = if warp_protocol { Some(r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?) } else { None };
let snapshot_number = if warp_protocol { Some(r.next().ok_or(rlp::DecoderError::RlpIsTooShort)?.as_val()?) } else { None };
let private_tx_enabled = if private_tx_protocol { r.next().and_then(|v| v.as_val().ok()).unwrap_or(false) } else { false };
let peer = PeerInfo {
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
protocol_version,
network_id,
difficulty,
latest_hash,
genesis,
asking: PeerAsking::Nothing,
asking_blocks: Vec::new(),
asking_hash: None,
asking_private_state: None,
ask_time: Instant::now(),
last_sent_transactions: Default::default(),
last_sent_private_transactions: Default::default(),
expired: false,
confirmation: if sync.fork_block.is_none() { ForkConfirmation::Confirmed } else { ForkConfirmation::Unconfirmed },
asking_snapshot_data: None,
snapshot_hash,
snapshot_number,
block_set: None,
private_tx_enabled,
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
client_version: ClientVersion::from(io.peer_version(peer_id)),
};
trace!(target: "sync", "New peer {} (\
protocol: {}, \
network: {:?}, \
difficulty: {:?}, \
latest:{}, \
genesis:{}, \
snapshot:{:?}, \
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
private_tx_enabled:{}, \
client_version: {})",
peer_id,
peer.protocol_version,
peer.network_id,
peer.difficulty,
peer.latest_hash,
peer.genesis,
peer.snapshot_number,
Snapshot restoration overhaul (#11219) * Comments and todos Use `snapshot_sync` as logging target * fix compilation * More todos, more logs * Fix picking snapshot peer: prefer the one with the highest block number More docs, comments, todos * Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems Docs, todos, comments * Tabs * Formatting * Don't build new rlp::EMPTY_LIST_RLP instances * Dial down debug logging * Don't warn about missing hashes in the manifest: it's normal Log client version on peer connect * Cleanup * Do not skip snapshots further away than 30k block from the highest block seen Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync. When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot. This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec. * lockfile * Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going) Resolve some documentation todos, add more * tweak log message * Don't warp sync twice Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync. More docs, resolve todos. Dial down some `sync` debug level logging to trace * Avoid iterating over all snapshot block/state hashes to find the next work item Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot` struct instead of counting pending chunks each time. * Address review grumbles * Log correct number of bytes written to disk * Revert ChunkType::Dup change * whitespace grumble * Cleanup debugging code * Fix docs * Fix import and a typo * Fix test impl * Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order * Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
peer.private_tx_enabled,
peer.client_version,
);
if io.is_expired() {
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
trace!(target: "sync", "Status packet from expired session {}:{}", peer_id, io.peer_version(peer_id));
return Ok(());
}
if sync.peers.contains_key(&peer_id) {
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
debug!(target: "sync", "Unexpected status packet from {}:{}", peer_id, io.peer_version(peer_id));
return Ok(());
}
let chain_info = io.chain().chain_info();
if peer.genesis != chain_info.genesis_hash {
trace!(target: "sync", "Peer {} genesis hash mismatch (ours: {}, theirs: {})", peer_id, chain_info.genesis_hash, peer.genesis);
return Err(DownloaderImportError::Invalid);
}
if peer.network_id != sync.network_id {
trace!(target: "sync", "Peer {} network id mismatch (ours: {}, theirs: {})", peer_id, sync.network_id, peer.network_id);
return Err(DownloaderImportError::Invalid);
}
if let Some((fork_id, reason)) = forkid_validation_error {
trace!(target: "sync", "Peer {} incompatible fork id (fork id: {:#x}/{}, error: {:?})", peer_id, fork_id.hash.0, fork_id.next, reason);
return Err(DownloaderImportError::Invalid);
}
if false
|| (warp_protocol && (peer.protocol_version < PAR_PROTOCOL_VERSION_1.0 || peer.protocol_version > PAR_PROTOCOL_VERSION_4.0))
|| (!warp_protocol && (peer.protocol_version < ETH_PROTOCOL_VERSION_63.0 || peer.protocol_version > ETH_PROTOCOL_VERSION_64.0))
{
trace!(target: "sync", "Peer {} unsupported eth protocol ({})", peer_id, peer.protocol_version);
return Err(DownloaderImportError::Invalid);
}
if sync.sync_start_time.is_none() {
sync.sync_start_time = Some(Instant::now());
}
sync.peers.insert(peer_id.clone(), peer);
// Don't activate peer immediatelly when searching for common block.
// Let the current sync round complete first.
sync.active_peers.insert(peer_id.clone());
Increase number of requested block bodies in chain sync (#10247) * Increase the number of block bodies requested during Sync. * Increase the number of block bodies requested during Sync. * Check if our peer is an older parity client with the bug of not handling large requests properly * Add a ClientVersion struct and a ClientCapabilites trait * Make ClientVersion its own module * Refactor and extend use of ClientVersion * Replace strings with ClientVersion in PeerInfo * Group further functionality in ClientCapabilities * Move parity client version data from tuple to its own struct. * Implement accessor methods for ParityClientData and remove them from ClientVersion. * Minor fixes * Make functions specific to parity return types specific to parity. * Test for shorter ID strings * Fix formatting and remove unneeded dependencies. * Roll back Cargo.lock * Commit last Cargo.lock * Convert from string to ClientVersion * * When checking if peer accepts service transactions just check if it's parity, remove version check. * Remove dependency on semver in ethcore-sync * Remove unnecessary String instantiation * Rename peer_info to peer_version * Update RPC test helpers * Simplify From<String> * Parse static version string only once * Update RPC tests to new ClientVersion struct * Document public members * More robust parsing of ID string * Minor changes. * Update version in which large block bodies requests appear. * Update ethcore/sync/src/block_sync.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update util/network/src/client_version.rs Co-Authored-By: elferdo <elferdo@gmail.com> * Update tests. * Minor fixes.
2019-02-07 15:27:09 +01:00
debug!(target: "sync", "Connected {}:{}", peer_id, io.peer_version(peer_id));
if let Some((fork_block, _)) = sync.fork_block {
SyncRequester::request_fork_header(sync, io, peer_id, fork_block);
}
Ok(())
}
/// Called when peer sends us new transactions
pub fn on_peer_transactions(sync: &ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), PacketDecodeError> {
// Accept transactions only when fully synced
if !io.is_chain_queue_empty() || (sync.state != SyncState::Idle && sync.state != SyncState::NewBlocks) {
trace!(target: "sync", "{} Ignoring transactions while syncing", peer_id);
return Ok(());
}
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "{} Ignoring transactions from unconfirmed/unknown peer", peer_id);
return Ok(());
}
let item_count = r.item_count()?;
trace!(target: "sync", "{:02} -> Transactions ({} entries)", peer_id, item_count);
let mut transactions = Vec::with_capacity(item_count);
for i in 0 .. item_count {
let rlp = r.at(i)?;
let tx = rlp.as_raw().to_vec();
transactions.push(tx);
}
io.chain().queue_transactions(transactions, peer_id);
Ok(())
}
/// Called when peer sends us signed private transaction packet
fn on_signed_private_transaction(sync: &mut ChainSync, _io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "{} Ignoring packet from unconfirmed/unknown peer", peer_id);
return Ok(());
}
let private_handler = match sync.private_tx_handler {
Some(ref handler) => handler,
None => {
trace!(target: "sync", "{} Ignoring private tx packet from peer", peer_id);
return Ok(());
}
};
trace!(target: "privatetx", "Received signed private transaction packet from {:?}", peer_id);
match private_handler.import_signed_private_transaction(r.as_raw()) {
Ok(transaction_hash) => {
//don't send the packet back
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
peer.last_sent_private_transactions.insert(transaction_hash);
}
},
Err(e) => {
trace!(target: "privatetx", "Ignoring the message, error queueing: {}", e);
}
}
Ok(())
}
/// Called when peer sends us new private transaction packet
fn on_private_transaction(sync: &mut ChainSync, _io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "{} Ignoring packet from unconfirmed/unknown peer", peer_id);
return Ok(());
}
let private_handler = match sync.private_tx_handler {
Some(ref handler) => handler,
None => {
trace!(target: "sync", "{} Ignoring private tx packet from peer", peer_id);
return Ok(());
}
};
trace!(target: "privatetx", "Received private transaction packet from {:?}", peer_id);
match private_handler.import_private_transaction(r.as_raw()) {
Ok(transaction_hash) => {
//don't send the packet back
if let Some(ref mut peer) = sync.peers.get_mut(&peer_id) {
peer.last_sent_private_transactions.insert(transaction_hash);
}
},
Err(e) => {
trace!(target: "privatetx", "Ignoring the message, error queueing: {}", e);
}
}
Ok(())
}
fn on_private_state_data(sync: &mut ChainSync, io: &mut dyn SyncIo, peer_id: PeerId, r: &Rlp) -> Result<(), DownloaderImportError> {
if !sync.peers.get(&peer_id).map_or(false, |p| p.can_sync()) {
trace!(target: "sync", "{} Ignoring packet from unconfirmed/unknown peer", peer_id);
return Ok(());
}
if !sync.reset_peer_asking(peer_id, PeerAsking::PrivateState) {
trace!(target: "sync", "{}: Ignored unexpected private state data", peer_id);
return Ok(());
}
let requested_hash = sync.peers.get(&peer_id).and_then(|p| p.asking_private_state);
let requested_hash = match requested_hash {
Some(hash) => hash,
None => {
debug!(target: "sync", "{}: Ignored unexpected private state (requested_hash is None)", peer_id);
return Ok(());
}
};
let private_handler = match sync.private_tx_handler {
Some(ref handler) => handler,
None => {
trace!(target: "sync", "{} Ignoring private tx packet from peer", peer_id);
return Ok(());
}
};
trace!(target: "privatetx", "Received private state data packet from {:?}", peer_id);
let private_state_data: Bytes = r.val_at(0)?;
match io.private_state() {
Some(db) => {
// Check hash of the rececived data before submitting it to DB
let received_hash = db.state_hash(&private_state_data).unwrap_or_default();
if received_hash != requested_hash {
trace!(target: "sync", "{} Ignoring private state data with unexpected hash from peer", peer_id);
return Ok(());
}
match db.save_state(&private_state_data) {
Ok(hash) => {
if let Err(err) = private_handler.private_state_synced(&hash) {
trace!(target: "privatetx", "Ignoring received private state message, error queueing: {}", err);
}
}
Err(e) => {
error!(target: "privatetx", "Cannot save received private state {:?}", e);
}
}
}
None => {
trace!(target: "sync", "{} Ignoring private tx packet from peer", peer_id);
}
};
Ok(())
}
}
#[cfg(test)]
mod tests {
use std::collections::VecDeque;
use super::{
super::tests::{dummy_sync_with_peer, get_dummy_block, get_dummy_blocks, get_dummy_hashes},
SyncHandler
};
use crate::tests::{helpers::TestIo, snapshot::TestSnapshotService};
Extract the Engine trait (#10958) * Add client-traits crate Move the BlockInfo trait to new crate * New crate `machine` Contains code extracted from ethcore that defines `Machine`, `Externalities` and other execution related code. * Use new machine and client-traits crates in ethcore * Use new crates machine and client-traits instead of ethcore where appropriate * Fix tests * Don't re-export so many types from ethcore::client * Fixing more fallout from removing re-export * fix test * More fallout from not re-exporting types * Add some docs * cleanup * import the macro edition style * Tweak docs * Add missing import * remove unused ethabi_derive imports * Use latest ethabi-contract * Move many traits from ethcore/client/traits to client-traits crate Initial version of extracted Engine trait * Move snapshot related traits to the engine crate (eew) * Move a few snapshot related types to common_types Cleanup Executed as exported from machine crate * fix warning * Gradually introduce new engine crate: snapshot * ethcore typechecks with new engine crate * Sort out types outside ethcore * Add an EpochVerifier to ethash and use that in Engine.epoch_verifier() Cleanup * Document pub members * Sort out tests Sort out default impls for EpochVerifier * Add test-helpers feature and move EngineSigner impl to the right place * Sort out tests * Sort out tests and refactor verification types * Fix missing traits * More missing traits Fix Histogram * Fix tests and cleanup * cleanup * Put back needed logger import * Don't rexport common_types from ethcore/src/client Don't export ethcore::client::* * Remove files no longer used Use types from the engine crate Explicit exports from engine::engine * Get rid of itertools * Move a few more traits from ethcore to client-traits: BlockChainReset, ScheduleInfo, StateClient * Move ProvingBlockChainClient to client-traits * Don't re-export ForkChoice and Transition from ethcore * Address grumbles: sort imports, remove commented out code * Fix merge resolution error * merge failure
2019-08-15 17:59:22 +02:00
use client_traits::ChainInfo;
use ethcore::test_helpers::{EachBlockWith, TestBlockChainClient};
use parking_lot::RwLock;
Extract the Engine trait (#10958) * Add client-traits crate Move the BlockInfo trait to new crate * New crate `machine` Contains code extracted from ethcore that defines `Machine`, `Externalities` and other execution related code. * Use new machine and client-traits crates in ethcore * Use new crates machine and client-traits instead of ethcore where appropriate * Fix tests * Don't re-export so many types from ethcore::client * Fixing more fallout from removing re-export * fix test * More fallout from not re-exporting types * Add some docs * cleanup * import the macro edition style * Tweak docs * Add missing import * remove unused ethabi_derive imports * Use latest ethabi-contract * Move many traits from ethcore/client/traits to client-traits crate Initial version of extracted Engine trait * Move snapshot related traits to the engine crate (eew) * Move a few snapshot related types to common_types Cleanup Executed as exported from machine crate * fix warning * Gradually introduce new engine crate: snapshot * ethcore typechecks with new engine crate * Sort out types outside ethcore * Add an EpochVerifier to ethash and use that in Engine.epoch_verifier() Cleanup * Document pub members * Sort out tests Sort out default impls for EpochVerifier * Add test-helpers feature and move EngineSigner impl to the right place * Sort out tests * Sort out tests and refactor verification types * Fix missing traits * More missing traits Fix Histogram * Fix tests and cleanup * cleanup * Put back needed logger import * Don't rexport common_types from ethcore/src/client Don't export ethcore::client::* * Remove files no longer used Use types from the engine crate Explicit exports from engine::engine * Get rid of itertools * Move a few more traits from ethcore to client-traits: BlockChainReset, ScheduleInfo, StateClient * Move ProvingBlockChainClient to client-traits * Don't re-export ForkChoice and Transition from ethcore * Address grumbles: sort imports, remove commented out code * Fix merge resolution error * merge failure
2019-08-15 17:59:22 +02:00
use rlp::Rlp;
#[test]
fn handles_peer_new_hashes() {
let mut client = TestBlockChainClient::new();
client.add_blocks(10, EachBlockWith::Uncle);
let queue = RwLock::new(VecDeque::new());
let mut sync = dummy_sync_with_peer(client.block_hash_delta_minus(5), &client);
let ss = TestSnapshotService::new();
let mut io = TestIo::new(&mut client, &ss, &queue, None, None);
let hashes_data = get_dummy_hashes();
let hashes_rlp = Rlp::new(&hashes_data);
let result = SyncHandler::on_peer_new_hashes(&mut sync, &mut io, 0, &hashes_rlp);
assert!(result.is_ok());
}
#[test]
fn handles_peer_new_block_malformed() {
let mut client = TestBlockChainClient::new();
client.add_blocks(10, EachBlockWith::Uncle);
let block_data = get_dummy_block(11, client.chain_info().best_block_hash);
let queue = RwLock::new(VecDeque::new());
let mut sync = dummy_sync_with_peer(client.block_hash_delta_minus(5), &client);
//sync.have_common_block = true;
let ss = TestSnapshotService::new();
let mut io = TestIo::new(&mut client, &ss, &queue, None, None);
let block = Rlp::new(&block_data);
let result = SyncHandler::on_peer_new_block(&mut sync, &mut io, 0, &block);
assert!(result.is_err());
}
#[test]
fn handles_peer_new_block() {
let mut client = TestBlockChainClient::new();
client.add_blocks(10, EachBlockWith::Uncle);
let block_data = get_dummy_blocks(11, client.chain_info().best_block_hash);
let queue = RwLock::new(VecDeque::new());
let mut sync = dummy_sync_with_peer(client.block_hash_delta_minus(5), &client);
let ss = TestSnapshotService::new();
let mut io = TestIo::new(&mut client, &ss, &queue, None, None);
let block = Rlp::new(&block_data);
SyncHandler::on_peer_new_block(&mut sync, &mut io, 0, &block).expect("result to be ok");
}
#[test]
fn handles_peer_new_block_empty() {
let mut client = TestBlockChainClient::new();
client.add_blocks(10, EachBlockWith::Uncle);
let queue = RwLock::new(VecDeque::new());
let mut sync = dummy_sync_with_peer(client.block_hash_delta_minus(5), &client);
let ss = TestSnapshotService::new();
let mut io = TestIo::new(&mut client, &ss, &queue, None, None);
let empty_data = vec![];
let block = Rlp::new(&empty_data);
let result = SyncHandler::on_peer_new_block(&mut sync, &mut io, 0, &block);
assert!(result.is_err());
}
#[test]
fn handles_peer_new_hashes_empty() {
let mut client = TestBlockChainClient::new();
client.add_blocks(10, EachBlockWith::Uncle);
let queue = RwLock::new(VecDeque::new());
let mut sync = dummy_sync_with_peer(client.block_hash_delta_minus(5), &client);
let ss = TestSnapshotService::new();
let mut io = TestIo::new(&mut client, &ss, &queue, None, None);
let empty_hashes_data = vec![];
let hashes_rlp = Rlp::new(&empty_hashes_data);
let result = SyncHandler::on_peer_new_hashes(&mut sync, &mut io, 0, &hashes_rlp);
assert!(result.is_ok());
}
}