Parallel Initial Block Download (PIBD) TESTING

I'm re-syncing an old node (beta3) and it always gets stuck at PIBD 100%. Not sure why.

We definitely found a bug here: the old node is losing its MMR root after PIBD sync for some reason. I submitted a PR to keep the node from getting stuck (Handle invalid MMR root to prevent sync thread panic by ardocrat · Pull Request #3774 · mimblewimble/grin · GitHub), but it does not fix the underlying problem at all; it should just start a non-PIBD sync instead.

I started my node after 2 months of downtime. The node synced the missing blocks without encountering any issues.
Also did a clean sync, deleting all blockchain data. Again synced without issues.
[grin-v5.2.0-beta.2-win-x64 on Windows Pro]

I caught the same error on 5-month-old chain data, from before there was even PIBD, on grin-v5.2.0-beta.3 (Linux). With this fix (Handle invalid MMR root to prevent sync thread panic by ardocrat · Pull Request #3774 · mimblewimble/grin · GitHub), it now just restarts PIBD sync over and over again. I am going to share my chain data in the cloud so everyone can try to reproduce it.

In the logs I can see:

And after all the warnings:

20231019 22:56:13.798 ERROR grin_servers::grin::sync::state_sync - pibd_sync restart: chain reset error = Store Error: NotFoundErr("Block with hash: 00001d5480ef"), reason: DB Not Found Error: Block with hash: 00001d5480ef

Link to chain data from non-PIBD node (15.05.2023):

Update: the state here was not finalized, so the last error is reasonable, as PIBD started syncing here from 95% after updating all headers in the first step.

Link to chain data from PIBD node (18.08.2023):

PIBD started syncing here from 35% after updating all headers in the first step.

Is it possible to start a full sync without the PIBD step?

Many thanks for actively testing and debugging!

I’m not sure the warnings are related to the error you got. From a quick look, we can rewind multiple blocks in this loop: https://github.com/mimblewimble/grin/blob/master/chain/src/txhashset/txhashset.rs#L1611

Each rewind updates the output_pos db entries by removing them from the block. The output_pos is not per block, but a global state. So if block 10 creates output O1 and then block 20 spends it, O1 will be thrown out from output_pos. Now if we rewind before block 10, we’ll iterate over each block to remove the outputs that were created, but O1 no longer exists because it was spent. My theory is that this is why you see these warnings. It’s strange though that we’d have such large reorgs. Perhaps people spend their outputs right away in the next block? It’d be good to know the length of reorgs and see how outputs were spent to confirm this, but I’m not sure that’s actually an error hence the WARN logs. The error is definitely strange… we keep quite a few blocks around for a rewind iirc.
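To make the theory above concrete, here is a minimal toy model in Python. It is not Grin's implementation: `apply_block`, `rewind_block`, and the dict-based `output_pos` are hypothetical names for illustration, and the sketch deliberately models only the "remove created outputs" side of a rewind, as described above. It shows how an output that was created in one block and spent in a later one is already gone from the global index by the time the creating block is rewound.

```python
import logging

logger = logging.getLogger("output_pos_sketch")


def apply_block(output_pos, block):
    """Apply a block to the global index (toy model, not Grin code).

    Spent outputs leave the index; newly created outputs enter it.
    """
    for commit in block["spends"]:
        output_pos.pop(commit, None)
    for commit in block["creates"]:
        output_pos[commit] = block["height"]


def rewind_block(output_pos, block):
    """Undo the 'creates' side of a block.

    Assumption: only removal of created outputs is modelled here.
    Returns the commitments that were already missing from the index.
    """
    missing = []
    for commit in block["creates"]:
        if commit in output_pos:
            del output_pos[commit]
        else:
            # The output was spent by a later block, so it is no longer indexed.
            missing.append(commit)
            logger.warning("output %s not found in output_pos", commit)
    return missing


if __name__ == "__main__":
    index = {}
    block_10 = {"height": 10, "creates": ["O1"], "spends": []}
    block_20 = {"height": 20, "creates": ["O2"], "spends": ["O1"]}
    apply_block(index, block_10)
    apply_block(index, block_20)
    # Rewind newest first: block 20, then block 10.
    print(rewind_block(index, block_20))
    print(rewind_block(index, block_10))
```

In this toy model, rewinding block 10 reports O1 as missing, which would match a warning rather than a hard error.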

Only in some cases:

  • If there are no PIBD peers

For me it started after applying this fix (Handle invalid MMR root to prevent sync thread panic by ardocrat · Pull Request #3774 · mimblewimble/grin · GitHub) to this 2-month-old state (chain_data_pibd.zip - Google Drive), when the root could not be calculated and I got the message "No PIBD-enabled max-difficulty peers for the past 60 seconds".
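The fallback behaviour implied by that message can be sketched roughly as follows. This is a guess at the logic based only on the log text, not the actual code in `state_sync`: the class name `PibdFallbackTimer` and its method are hypothetical, and the 60-second value is taken from the message itself.

```python
class PibdFallbackTimer:
    """Tracks how long we have gone without a usable PIBD peer.

    Hypothetical helper: the name and shape are illustrative only.
    """

    def __init__(self, timeout_secs=60.0):
        # 60 seconds is the figure quoted in the log message.
        self.timeout_secs = timeout_secs
        self.last_seen = None

    def should_fall_back(self, pibd_peer_count, now):
        """Return True once no PIBD peer has been seen for timeout_secs."""
        if self.last_seen is None:
            # Start the clock on the first check.
            self.last_seen = now
        if pibd_peer_count > 0:
            # A usable peer resets the clock.
            self.last_seen = now
            return False
        return now - self.last_seen >= self.timeout_secs


if __name__ == "__main__":
    timer = PibdFallbackTimer()
    for t, peers in [(0.0, 0), (30.0, 0), (60.0, 0)]:
        print(t, timer.should_fall_back(peers, t))
```

Passing `now` in explicitly (rather than reading the clock inside the method) keeps the sketch easy to test.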

Some records from the logs:

20231020 02:46:23.586 DEBUG grin_chain::txhashset::txhashset - Error returned, discarding txhashset extension: Store Error: NotFoundErr("Block with hash: 0000d50d51d0"), reason: DB Not Found Error: Block with hash: 0000d50d51d0
20231020 02:46:23.586 ERROR grin_servers::grin::sync::state_sync - pibd_sync restart: chain reset error = Store Error: NotFoundErr("Block with hash: 0000d50d51d0"), reason: DB Not Found Error: Block with hash: 0000d50d51d0
20231020 02:46:32.545 DEBUG grin_chain::txhashset::txhashset - Rewind header extension to 0003f168a190 at 2494941 from 0003f168a190 at 2494941
...
20231020 02:47:27.332 INFO grin_servers::grin::sync::state_sync - No PIBD-enabled max-difficulty peers for the past 60 seconds - Aborting PIBD and falling back to TxHashset.zip download
20231020 02:47:27.342 ERROR grin_servers::grin::sync::state_sync - state_sync: error = Aborting PIBD error. restart fast sync
20231020 02:47:27.346 DEBUG grin_servers::grin::sync::state_sync - state_sync: before txhashset request, header head: 2494943 / 000328fb421f, txhashset_head: 2491920 / 00034b16ee06
20231020 02:47:27.346 DEBUG grin_p2p::peer - Asking 142.132.167.15:3414 for txhashset archive at 2491920 00034b16ee06.
20231020 02:47:27.346 DEBUG grin_chain::types - sync_state: sync_status: TxHashsetPibd { aborted: true, errored: true, completed_leaves: 0, leaves_required: 18709626, completed_to_height: 1, required_height: 2491920 } -> TxHashsetDownload(TxHashsetDownloadStats { start_time: 2023-10-19T23:47:27.346560354Z, prev_update_time: 2023-10-19T23:47:27.346560431Z, update_time: 2023-10-19T23:47:27.346560396Z, prev_downloaded_size: 0, downloaded_size: 0, total_size: 0 })
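For anyone trying to reproduce this with the shared chain data, a small script along these lines may help to pull the missing block hashes out of a long log. It is only a sketch: the line layout (`date time LEVEL module - message`) is inferred from the records above, and the function names are made up for this example.

```python
import re
from collections import Counter

# Layout inferred from the records above: "YYYYMMDD HH:MM:SS.mmm LEVEL module - message"
LINE_RE = re.compile(r"^(\d{8} \d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+)\s+(\S+)\s+-\s+(.*)$")
NOT_FOUND_RE = re.compile(r'NotFoundErr\("Block with hash: ([0-9a-f]+)"\)')


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    timestamp, level, module, message = m.groups()
    return {"timestamp": timestamp, "level": level, "module": module, "message": message}


def missing_block_hashes(lines, level="ERROR"):
    """Count the block hashes reported as not found at the given log level."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None or record["level"] != level:
            continue
        for block_hash in NOT_FOUND_RE.findall(record["message"]):
            counts[block_hash] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < path/to/your/grin/log
    for block_hash, n in missing_block_hashes(sys.stdin).most_common():
        print(block_hash, n)
```

If the same hash shows up on every restart, that would suggest the loop keeps failing on one specific missing block rather than on different ones.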