Further Details Regarding June 2 Network Interruption
On Saturday, June 2, the Lisk network experienced a temporary halt. Although the issue was identified and resolved within several hours, we believe it warrants a more thorough explanation of precisely what happened to the Lisk network.
Lisk Core uses PostgreSQL, a relational database. Blocks are stored as rows in a `blocks` table. Each block contains the ID of the previous block, and a foreign key constraint on this table requires that the previous block already exist in the database. Insert operations that do not satisfy the constraint fail and are not saved. Similarly, transactions are stored as rows in a `transactions` table, with the parameters of a transaction, such as timestamp, sender, recipient, and amount, saved in columns.
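To illustrate the constraint, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for PostgreSQL and a heavily reduced `blocks` schema; the `insert_block` helper and the block IDs are hypothetical and not taken from Lisk Core.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE blocks (
        id            TEXT PRIMARY KEY,
        height        INTEGER NOT NULL,
        previousBlock TEXT REFERENCES blocks(id)  -- NULL only for the first block
    )
    """
)


def insert_block(block_id, height, previous_block):
    """Insert one block row; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO blocks (id, height, previousBlock) VALUES (?, ?, ?)",
                (block_id, height, previous_block),
            )
        return True
    except sqlite3.IntegrityError:
        return False


genesis_ok = insert_block("A", 1, None)  # first block, no predecessor required
child_ok = insert_block("B", 2, "A")     # predecessor "A" exists
orphan_ok = insert_block("D", 4, "C")    # predecessor "C" was never stored

print(genesis_ok, child_ok, orphan_ok)   # True True False
```

The last insert is rejected by the database itself, which is the behavior the rest of this post depends on.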
In the early hours of last Saturday morning, an anonymous individual broadcast a transaction to the Lisk network with a timestamp of -3704634000, which comes out to be 1898-12-31 13:00:00 CEST. This requires a manually created and signed transaction, which is how we know it was intentional. The timestamp column in the `transactions` table only accepts a 32-bit integer value (a value between -2,147,483,648 and 2,147,483,647), and the timestamp used fell below this range. The transaction can therefore be described as malformed and malicious, and it was not inserted into the database. When a block is saved to the database, the block and its corresponding transactions are written as one atomic database operation; if a transaction insert fails, the entire block insert fails as well. Thus, the whole block containing the malformed transaction was not inserted into the database. All subsequent block inserts referenced the ID of this nonexistent block and failed because of the previous-block foreign key constraint. This, in turn, halted all transaction and block inserts into the database. This is the security measure we reported on Saturday.
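The range arithmetic can be checked directly. The snippet below is only an illustration; `fits_int32` is a hypothetical helper, not a function from Lisk Core.

```python
INT32_MIN = -2**31     # -2,147,483,648
INT32_MAX = 2**31 - 1  #  2,147,483,647


def fits_int32(value: int) -> bool:
    """Return True if value can be stored in a signed 32-bit integer column."""
    return INT32_MIN <= value <= INT32_MAX


malicious_timestamp = -3704634000  # timestamp from the incident
print(fits_int32(malicious_timestamp))  # False: below INT32_MIN
print(fits_int32(INT32_MIN), fits_int32(INT32_MAX))  # True True
```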
Here are the exact errors:
```
ERROR: integer out of range
STATEMENT: INSERT INTO "trs"("id","blockId","type","timestamp","senderPublicKey","requesterPublicKey","senderId","recipientId","amount","fee","signature","signSignature","signatures") VALUES ('181175095785369468','5488578331239914243',0,-3704634000,'\x6dd24c92d91c0082f5be68f7350d87b7cdf105267543f1f61d3043a5c2d8a00b',null,'3402562013208542942L','3402562013208542942L',1,10000000,'\xcde9a3459b1f5590a9b6f32a5d9c2e85596be9d01ded14fa9d9c5f276a37e2562e40f459c727599323e1ee84435a63316b3a3a50fb3b5d687dc7ea1f1ad9e001',null,null)
ERROR: insert or update on table "blocks" violates foreign key constraint "blocks_previousBlock_fkey"
DETAIL: Key (previousBlock)=(5488578331239914243) is not present in table "blocks".
STATEMENT: INSERT INTO "blocks"("id","version","timestamp","height","previousBlock","numberOfTransactions","totalAmount","totalFee","reward","payloadLength","payloadHash","generatorPublicKey","blockSignature") VALUES ('13977984917448353211',0,63803270,6144655,'5488578331239914243',0,0,0,400000000,0,'\xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855','\x6cb825715058d2e821aa4af75fbd0da52181910d9fda90fabe73cd533eeb6acb','\x0b534abcbe640cef4df6a579bbffe00eb62401b9090e6d7f1ef72e4dafaa31bdb8594b85007f7d8ed79d4166835334ca2adb0830a175d6ac39263fe5fcf8950f')
ERROR: insert or update on table "blocks" violates foreign key constraint "blocks_previousBlock_fkey"
```
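The chain of failures in this log can be reproduced in miniature. The sketch below makes several assumptions: it uses in-memory SQLite rather than PostgreSQL, a reduced schema with only the columns needed, and a `CHECK` constraint to emulate the 32-bit column (SQLite integers are 64-bit). The `save_block` helper and the short block IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript(
    """
    CREATE TABLE blocks (
        id            TEXT PRIMARY KEY,
        previousBlock TEXT REFERENCES blocks(id)
    );
    CREATE TABLE trs (
        id        TEXT PRIMARY KEY,
        blockId   TEXT NOT NULL REFERENCES blocks(id),
        -- Assumption: this CHECK stands in for PostgreSQL's 32-bit integer column.
        timestamp INTEGER NOT NULL
                  CHECK (timestamp BETWEEN -2147483648 AND 2147483647)
    );
    """
)


def save_block(block_id, previous_block, transactions):
    """Write a block and its transactions as one atomic operation."""
    try:
        with conn:  # one database transaction: all rows are stored, or none
            conn.execute(
                "INSERT INTO blocks (id, previousBlock) VALUES (?, ?)",
                (block_id, previous_block),
            )
            conn.executemany(
                "INSERT INTO trs (id, blockId, timestamp) VALUES (?, ?, ?)",
                [(tx_id, block_id, ts) for tx_id, ts in transactions],
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    save_block("100", None, []),                      # last fully valid block
    save_block("101", "100", [("t1", -3704634000)]),  # out-of-range timestamp
    save_block("102", "101", []),                     # valid, but parent is missing
    save_block("103", "102", []),                     # same problem, one block later
]
stored = [row[0] for row in conn.execute("SELECT id FROM blocks ORDER BY id")]

print(results)  # [True, False, False, False]
print(stored)   # ['100']
```

Block 101 is rolled back together with its transaction, so every later block points at a parent that does not exist and is rejected in turn.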
Upon closer inspection following our Saturday afternoon announcement, we discovered that Lisk Core did not fully catch the error. The chain continued to move forward with mostly empty blocks that were stored only in memory and never written to the database. We can therefore rest assured that no user funds were at risk, provided that our users relied on the immutable blockchain state and not on the memory state.
Saturday, June 2, 2018 Lisk Core 0.9.15 Release
When it became apparent that we needed a Lisk Core 0.9.15 release to prevent such a transaction from being accepted again, we immediately contacted exchanges and asked them to temporarily halt withdrawals and deposits, although LSK funds were safe the entire time. Simon Warta, one of our community members, submitted a pull request with the fix (https://github.com/LiskHQ/lisk/pull/2087), which we accepted after careful peer review. Version 0.9.15 of Lisk Core, which included the fix, was then tagged, built and rolled out to our testnet.
Shortly afterwards, the new release was also applied to our mainnet. To ensure a smooth upgrade of the whole network, we had to do this as a hard fork. We first deployed the new release to one of our mainnet seed nodes, temporarily setting the `minVersion` parameter in its configuration to an earlier version (0.9.14), so that it could communicate with the rest of the network and sync up to the last fully valid block, the one before the malicious transaction.
Afterwards, we set the `minVersion` parameter on that seed node back to the latest version (0.9.15). Then, we deployed the new release to the rest of our seed nodes with the `minVersion` parameter set to the latest version (0.9.15), and let them sync up fully from the one seed node that was already in sync. Normally we don’t carry out this syncing process, because in a normal upgrade the databases of our network nodes are in a valid state. In this case they were not, so we had to execute the steps described above, which slowed down the deployment process.
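As a rough illustration of why the temporary downgrade was needed, the sketch below assumes that the parameter acts as the lowest peer version a node is willing to talk to. That reading is inferred from the description above; the `parse_version` and `accepts_peer` helpers are hypothetical, and the actual comparison logic in Lisk Core may differ.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '0.9.15' into (0, 9, 15)."""
    return tuple(int(part) for part in version.split("."))


def accepts_peer(peer_version: str, min_version: str) -> bool:
    """Assumed rule: talk only to peers at or above the configured minimum."""
    return parse_version(peer_version) >= parse_version(min_version)


# Step 1: the first upgraded seed node must still sync from 0.9.14 peers.
print(accepts_peer("0.9.14", "0.9.14"))  # True

# Step 2: once synced, the minimum is raised and 0.9.14 peers are refused.
print(accepts_peer("0.9.14", "0.9.15"))  # False
print(accepts_peer("0.9.15", "0.9.15"))  # True
```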
Late on Saturday afternoon, Lisk Core 0.9.15 was officially released to the public and delegates immediately upgraded their nodes to the latest version. By the early evening, activity on the network and exchanges had returned to normal.
June 5, 2018 Lisk Core 0.9.16 Release
As explained above, Lisk Core 0.9.15 only fixed the timestamp problem. It didn’t fix the root cause, which was that Lisk Core didn’t correctly catch the error that occurred. In an effort to prevent similar situations from arising in the future, we have released Lisk Core 0.9.16 today.
The new release contains the following fixes:
- to handle errors properly and crash the application if the database returns an error while inserting blocks or transactions (when one of the database queries fails).
- to set `lastBlock` in node memory only after the block is successfully written to the database.
- to handle errors properly and crash the application when the round tick fails (the round tick tracks the progress of each round, mainly the weight of votes for particular delegates, block rewards and transaction fee splits).
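The first two fixes in this list come down to ordering and error propagation. The sketch below contrasts the two behaviors under stated assumptions: the `Node` class, its method names and `DatabaseWriteError` are hypothetical and do not mirror the Lisk Core code, and letting the exception propagate stands in for crashing the application.

```python
class DatabaseWriteError(Exception):
    """Hypothetical error raised when a block cannot be persisted."""


class Node:
    def __init__(self, save_block):
        self._save_block = save_block  # callable that raises on database failure
        self.last_block = None         # in-memory view of the chain tip

    def apply_block_unsafe(self, block):
        # Problematic ordering: memory is updated first and the error is
        # swallowed, so the in-memory chain can run ahead of the database.
        self.last_block = block
        try:
            self._save_block(block)
        except DatabaseWriteError:
            pass

    def apply_block_safe(self, block):
        # Safer ordering: persist first and let any error propagate so the
        # caller can stop the process; memory changes only after success.
        self._save_block(block)
        self.last_block = block


def failing_save(block):
    raise DatabaseWriteError(f"could not store block {block}")


unsafe_node = Node(failing_save)
unsafe_node.apply_block_unsafe("101")
print(unsafe_node.last_block)  # 101, although nothing was stored

safe_node = Node(failing_save)
try:
    safe_node.apply_block_safe("101")
except DatabaseWriteError:
    print(safe_node.last_block)  # None, memory still matches the database
```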
Looking Forward
We’re proud of the Lisk network’s safety and security, and of the people who make it possible. The developers at Lightcurve were incredibly quick to respond; several of them came into the office on Saturday morning and worked quickly and efficiently to resolve the issue. All the while, the Lightcurve Marketing team immediately established channels of communication, particularly on Lisk.chat and Reddit, to keep our community informed about the situation.
We’d also like to take this opportunity to give a huge thanks to the Lisk delegates. They were integral in helping us get the network back to normal. We’d especially like to give a big shoutout to Simon Warta, Carbonara and Nerigal!
And finally, we want to thank our community for their patience, feedback and ongoing support.
-The Lisk Team