Improve speed #2

deffi420 · 2016-04-06T23:44:25Z

The CRC computation is too slow for files >100MB. This can be improved a lot.

resilar · 2016-08-08T07:55:10Z

We could use Nayuki's method with loss of generality. But first, let's try to speed up the CRC function with precomputed tables because the cost of forging is roughly b times the computation of CRC(msg), where b is the number of modifiable bits, and CRC(msg) is currently slow as hell. We should also handle the case that b is a lot larger than the CRC polynomial size w. After these improvements, the resulting cost is about min(b,w) * CRC(msg), which is acceptable if CRC(msg) is reasonably fast.

major rewrite to support chunk-based input message processing. the code still requires some refactoring. Forging exploits O(log n)-time sparse CRC computation of messages consisting of a single 1-bit and n-1 zeros. therefore, the time complexity of forging is roughly O(n + log n) = O(n) where the linear-time CRC computation dominates for large n in practice.

resilar · 2019-08-30T22:48:51Z

The cost of forging has been reduced to O(n + b w³ log n) by exploiting O(w³ log n)-time sparse CRC calculation for b messages consisting of n-1 zeros and a single 1-bit.

The sparse CRC calculation utilizes (w×w)-bit difference-matrix A of differences (bit flips) caused by each input bit inversion in a w-bit window. X is solved from AX=B where B is the (w×w)-bit difference-matrix of the next w-bit window so that AX²=C represents the differences of the second w-bit window, and so on. Sparse CRC calculation precomputes log n difference-matrix squarings: X¹, (X¹)²=X², (X²)²=X⁴, (X⁴)²=X⁸ ... For a n-bit input, the difference caused by the ith bit (0 ≤ i < n) is obtained as the product of O(log n) multiplications of the precomputed (w×w)-bit differences-matrices.

In conclusion, the method constructs the final (b×w)-bit matrix for forging in O(b w³ log n) time assuming each (w×w)-bit matrix multiplication takes O(w³) time and the CRC of the input message is known.

TLDR Forging of CRC checksums is fast. For long messages, calculating the non-sparse CRC of the input message dominates the total running time in practice. This issue will be closed once the (non-sparse) CRC calculation has been optimized for speed as well.

major rewrite to support chunk-based input message processing. the code still requires some refactoring. forging exploits O(log n)-time sparse crc computation of messages consisting of a single 1-bit and n-1 zeros. therefore, the time complexity of forging is roughly O(n + log n) = O(n) where the linear-time crc computation dominates for large n in practice.

resilar added the enhancement label Jan 7, 2019

resilar mentioned this issue Aug 14, 2019

Split into a dynamic lib and a CLI interface to it #7

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve speed #2

Improve speed #2

deffi420 commented Apr 6, 2016

resilar commented Aug 8, 2016

resilar commented Aug 30, 2019

Improve speed #2

Improve speed #2

Comments

deffi420 commented Apr 6, 2016

resilar commented Aug 8, 2016

resilar commented Aug 30, 2019