Background

On March 30th, at approximately 11:00AM UTC, Toshi alerted the NEM & Symbol Discord about a problem he was encountering with his node: a fork was occurring at block height 3,191,538.

Author’s Note: We very much appreciate his outreach, and we encourage all node operators to contact us promptly when they sense anything might be amiss!

By the time we’d responded to him, the data folder had already been purged, and we did not have any further data to perform additional diagnostics. We’d mistakenly thought that his node had recovered, and did not think too much of the issue.

A handful of people in the community reported similar issues, but we couldn’t find anything concrete and assumed they’d eventually sync back with the network.

On April 1st, Toshi reported that the same issue had occurred again. Now we were worried, and put some manpower behind the issue.

Dusan, Toshi, and the rest of the community jumped into action, helping provide diagnostics and suggestions on possible causes. By this time, only a handful of nodes (10-20) had forked - representing less than 2% of the network - so there was not a large risk profile…yet.

We shot out a tweet to let everyone know we were on it, and got to work.


Diving In For Data

In the logs of forked nodes, there was this commonly reported error: processing of block 3191538 failed with Failure_LockSecret_Hash_Already_Exists.

The error meant that an attempt was being made to replace an active SECRET_LOCK, which the network rules do not allow. Despite that, this block was both confirmed and finalized on a majority fork. Since this was the divergent block, we knew this was where we should focus our attention.

The biggest problem we faced was that nodes syncing from scratch or from a backup (a.k.a. a checkpoint) could not progress past block 3191538. We did notice that backups were being maintained by the community, so we reviewed them, but there was nothing out of the ordinary.

It soon became obvious that there was something strange with the secret lock with the composite hash of 0D0D03B478CBFA3A9C0A0A292E2FECB2C09A5DA01659CFF20C199823D0E79E81. Astute students of Symbol will remember that a secret lock composite hash is derived from the secret lock's secret and recipient: if either is changed, the secret lock's composite hash will be different.
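
To make the "secret plus recipient" relationship concrete, here is a minimal Python sketch. It assumes the composite hash is a SHA3-256 digest over the secret followed by the recipient address bytes; the exact construction in the Symbol codebase may differ, and the secret and addresses below are placeholder values, not data from the chain.

```python
import hashlib


def composite_hash(secret: bytes, recipient: bytes) -> str:
    """Assumed construction: SHA3-256 over secret followed by recipient."""
    return hashlib.sha3_256(secret + recipient).hexdigest().upper()


# Placeholder inputs (hypothetical, for illustration only).
secret = bytes.fromhex("11" * 32)
recipient_a = bytes.fromhex("98" + "00" * 23)
recipient_b = bytes.fromhex("98" + "01" * 23)

hash_a = composite_hash(secret, recipient_a)
hash_b = composite_hash(secret, recipient_b)

# Same secret, different recipient: the composite hashes differ.
print(hash_a != hash_b)
```

The practical consequence is that seeing the same composite hash more than once on chain means the same secret was locked for the same recipient each time.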

We observed the following usage in the blockchain for secret locks with that composite hash:

Hm. Interesting.

Symbol allows for a composite hash to be reused if the prior lock is inactive (meaning, it has been COMPLETED or EXPIRED). So, the recreation of the lock at 3,191,525 is valid and makes sense because the prior lock was completed at block height 3,191,098. However, the recreation of the lock at block height 3,191,538 is unexpected, as the prior lock should still be active.
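
The reuse rule can be sketched as a small Python model. This is not the actual Symbol validator: the SecretLock class, the try_create_lock helper, the 1,000-block duration, and the creation height of the first lock are all hypothetical. Only the heights 3,191,098, 3,191,525, and 3,191,538 and the failure name come from the events described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional

FAILURE = "Failure_LockSecret_Hash_Already_Exists"


@dataclass
class SecretLock:
    created_height: int
    end_height: int  # height at which the lock expires
    completed: bool = False

    def is_active(self, height: int) -> bool:
        # A lock is inactive once it is completed or has expired.
        return not self.completed and height < self.end_height


def try_create_lock(locks: Dict[str, SecretLock], key: str,
                    height: int, duration: int) -> Optional[str]:
    """Return None on success, or the failure name if an active lock exists."""
    existing = locks.get(key)
    if existing is not None and existing.is_active(height):
        return FAILURE
    locks[key] = SecretLock(height, height + duration)
    return None


KEY = "0D0D03B478CBFA3A9C0A0A292E2FECB2C09A5DA01659CFF20C199823D0E79E81"
DURATION = 1_000  # hypothetical lock duration in blocks

locks: Dict[str, SecretLock] = {}
try_create_lock(locks, KEY, 3_191_000, DURATION)  # hypothetical first creation
locks[KEY].completed = True                       # completed at 3,191,098

first_reuse = try_create_lock(locks, KEY, 3_191_525, DURATION)
second_reuse = try_create_lock(locks, KEY, 3_191_538, DURATION)
print(first_reuse, second_reuse)
```

Under these assumptions the recreation at 3,191,525 succeeds because the prior lock is completed, while the attempt at 3,191,538 is rejected because the lock created 13 blocks earlier is still active. That rejection is the behavior the forked nodes exhibited.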