openethereum/ethcore/sync
David 8c2199dd2a
Snapshot restoration overhaul (#11219)
* Comments and todos
Use `snapshot_sync` as logging target

* fix compilation

* More todos, more logs

* Fix picking snapshot peer: prefer the one with the highest block number
More docs, comments, todos

* Adjust WAIT_PEERS_TIMEOUT to be a multiple of MAINTAIN_SYNC_TIMER to try to fix snapshot startup problems
Docs, todos, comments

* Tabs

* Formatting

* Don't build new rlp::EMPTY_LIST_RLP instances

* Dial down debug logging

* Don't warn about missing hashes in the manifest: it's normal
Log client version on peer connect

* Cleanup

* Do not skip snapshots further away than 30k block from the highest block seen

Currently we look for peers that seed snapshots that are close to the highest block seen on the network (where "close" means withing 30k blocks). When a node starts up we wait for some time (5sec, increased here to 10sec) to let peers connect and if we have found a suitable peer to sync a snapshot from at the end of that delay, we start the download; if none is found and --warp-barrier is used we stall, otherwise we start a slow-sync.
When looking for a suitable snapshot, we use the highest block seen on the network to check if a peer has a snapshot that is within 30k blocks of that highest block number. This means that in a situation where all available snapshots are older than that, we will often fail to start a snapshot at all. What's worse is that the longer we delay starting a snapshot sync (to let more peers connect, in the hope of finding a good snapshot), the more likely we are to have seen a high block and thus the more likely we become to accept a snapshot.
This commit removes this comparison with the highest blocknumber criteria entirely and picks the best snapshot we find in 10sec.

* lockfile

* Add a `ChunkType::Dupe` variant so that we do not disconnect a peer if they happen to send us a duplicate chunk (just ignore the chunk and keep going)
Resolve some documentation todos, add more

* tweak log message

* Don't warp sync twice
Check if our own block is beyond the given warp barrier (can happen after we've completed a warp sync but are not quite yet synced up to the tip) and if so, don't sync.
More docs, resolve todos.
Dial down some `sync` debug level logging to trace

* Avoid iterating over all snapshot block/state hashes to find the next work item

Use a HashSet instead of a Vec and remove items from the set as chunks are processed. Calculate and store the total number of chunks in the `Snapshot`  struct instead of counting pending chunks each time.

* Address review grumbles

* Log correct number of bytes written to disk

* Revert ChunkType::Dup change

* whitespace grumble

* Cleanup debugging code

* Fix docs

* Fix import and a typo

* Fix test impl

* Use `indexmap::IndexSet` to ensure chunk hashes are accessed in order

* Revert increased SNAPSHOT_MANIFEST_TIMEOUT: 5sec should be enough
2019-10-31 16:07:21 +01:00
..
src Snapshot restoration overhaul (#11219) 2019-10-31 16:07:21 +01:00
Cargo.toml Snapshot restoration overhaul (#11219) 2019-10-31 16:07:21 +01:00