uint256: Introduce package. #2787
Need to have another pass over the multiplication and division sections, but looks good so far.
Halfway through the commits
Pretty amazing work. Learning a lot looking over this. In commit message of c8fc7bf In commit message of dce5714 benchmarks
🎉 🎉 🎉
Excellent work as always! Only caught a few more documentation issues.
Updated the commit messages per the @JoeGruffins review as well.
Great work as always!
Benchmarks
goos: linux
goarch: amd64
pkg: github.com/decred/dcrd/internal/staging/primitives/uint256
cpu: AMD Ryzen 3 2200G with Radeon Vega Graphics
BenchmarkUint256SetBytes-4 476069810 2.482 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntSetBytes-4 160615147 7.908 ns/op 0 B/op 0 allocs/op
BenchmarkUint256SetBytesLE-4 430822285 2.687 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntSetBytesLE-4 19294600 60.83 ns/op 32 B/op 1 allocs/op
BenchmarkUint256Bytes-4 88200577 12.33 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBytes-4 18895275 55.32 ns/op 32 B/op 1 allocs/op
BenchmarkUint256BytesLE-4 100000000 12.18 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBytesLE-4 15329814 68.77 ns/op 32 B/op 1 allocs/op
BenchmarkUint256Zero-4 1000000000 1.120 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntZero-4 451413762 2.716 ns/op 0 B/op 0 allocs/op
BenchmarkUint256IsZero-4 681048356 1.635 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntIsZero-4 613439154 1.849 ns/op 0 B/op 0 allocs/op
BenchmarkUint256IsOdd-4 797677572 1.669 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntIsOdd-4 302746519 3.462 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Eq-4 654700365 1.949 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntEq-4 112233123 10.35 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lt-4 480044094 2.420 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLt-4 100000000 10.72 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Gt-4 428775073 2.633 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntGt-4 112796678 12.01 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Cmp-4 162790765 6.916 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntCmp-4 117192634 10.42 ns/op 0 B/op 0 allocs/op
BenchmarkUint256CmpUint64-4 375092656 3.216 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntCmpUint64-4 241991760 6.247 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Add-4 575967142 1.912 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntAdd-4 8228707 138.5 ns/op 4 B/op 0 allocs/op
BenchmarkUint256AddUint64-4 462350301 2.604 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntAddUint64-4 26791534 45.15 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Sub-4 651187911 1.755 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntSub-4 22559347 55.86 ns/op 0 B/op 0 allocs/op
BenchmarkUint256SubUint64-4 448262576 2.629 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntSubUint64-4 25150804 40.20 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Mul-4 135067701 8.883 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntMul-4 2939754 388.9 ns/op 64 B/op 1 allocs/op
BenchmarkUint256MulUint64-4 321874484 3.923 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntMulUint64-4 5118799 247.4 ns/op 8 B/op 1 allocs/op
BenchmarkUint256Square-4 194034268 7.041 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntSquare-4 2914981 393.1 ns/op 64 B/op 1 allocs/op
BenchmarkUint256Div/dividend_lt_divisor-4 398487291 2.945 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/dividend_eq_divisor-4 364279308 3.101 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/1_by_1_near-4 302580482 3.855 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/1_by_1_far-4 96309692 13.61 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/2_by_1_near-4 129636962 8.936 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/2_by_1_far-4 46576605 25.07 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_1_near-4 100000000 11.71 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_1_far-4 32242239 37.36 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_1_near-4 63737451 15.75 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_1_far-4 20309397 49.48 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/2_by_2_near-4 53796532 20.07 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/2_by_2_far-4 42929392 30.03 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_2_near-4 38020507 26.79 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_2_far-4 25221169 47.17 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_2_near-4 33020859 35.22 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_2_far-4 16499960 61.37 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_3_near-4 68166744 17.98 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/3_by_3_far-4 42235713 29.73 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_3_near-4 35219878 32.05 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_3_far-4 28236904 41.50 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_4_near-4 70831348 17.26 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Div/4_by_4_far-4 38107536 31.37 ns/op 0 B/op 0 allocs/op
BenchmarkUint256DivRandom-4 60160770 20.02 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/dividend_lt_divisor-4 15166644 71.20 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/dividend_eq_divisor-4 5455660 261.5 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/1_by_1_near-4 20247571 58.81 ns/op 8 B/op 1 allocs/op
BenchmarkBigIntDiv/1_by_1_far-4 42041121 28.51 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/2_by_1_near-4 36022141 32.12 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/2_by_1_far-4 28347584 42.89 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/3_by_1_near-4 30974647 38.03 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/3_by_1_far-4 20732623 49.20 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/4_by_1_near-4 28156827 44.69 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/4_by_1_far-4 17195563 63.88 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDiv/2_by_2_near-4 5086784 216.1 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/2_by_2_far-4 5746678 212.5 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/3_by_2_near-4 5075608 253.1 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/3_by_2_far-4 4244466 257.3 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_2_near-4 4156348 295.6 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_2_far-4 4208538 295.9 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/3_by_3_near-4 4301845 269.1 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/3_by_3_far-4 5341867 221.1 ns/op 64 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_3_near-4 4738832 247.4 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_3_far-4 4631062 250.4 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_4_near-4 5088583 234.5 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDiv/4_by_4_far-4 5235777 229.4 ns/op 80 B/op 1 allocs/op
BenchmarkBigIntDivRandom-4 5518939 221.6 ns/op 72 B/op 1 allocs/op
BenchmarkUint256DivUint64-4 22925576 44.61 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntDivUint64-4 10211529 124.1 ns/op 8 B/op 1 allocs/op
BenchmarkUint256Negate-4 850187539 1.238 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntNegate-4 24669282 51.68 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_0-4 507476983 2.201 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_1-4 316203012 3.606 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_64-4 459433334 2.491 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_128-4 484224193 2.384 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_192-4 462848384 2.296 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_255-4 480112038 2.348 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Lsh/bits_256-4 448478752 2.463 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_0-4 172591105 6.000 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_1-4 93923139 12.78 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_64-4 81818536 14.59 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_128-4 79875747 14.30 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_192-4 83981918 14.12 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_255-4 87148645 13.77 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntLsh/bits_256-4 84002761 14.29 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_0-4 475755678 2.233 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_1-4 313881753 3.717 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_64-4 448903030 2.456 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_128-4 522788181 2.415 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_192-4 501043711 2.303 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_255-4 493278232 2.352 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Rsh/bits_256-4 524338425 2.198 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_0-4 163935820 7.280 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_1-4 79080118 13.90 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_64-4 107008660 10.92 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_128-4 121404361 9.779 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_192-4 136675233 8.820 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_255-4 135231632 9.048 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntRsh/bits_256-4 179642262 6.345 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Not-4 455189844 2.649 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntNot-4 56325128 21.92 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Or-4 425439813 2.828 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntOr-4 76605534 15.22 ns/op 0 B/op 0 allocs/op
BenchmarkUint256And-4 385027156 3.201 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntAnd-4 79475395 15.53 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Xor-4 415490449 3.001 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntXor-4 63178353 15.93 ns/op 0 B/op 0 allocs/op
BenchmarkUint256BitLen/bits_64-4 688524416 1.729 ns/op 0 B/op 0 allocs/op
BenchmarkUint256BitLen/bits_128-4 681189094 1.707 ns/op 0 B/op 0 allocs/op
BenchmarkUint256BitLen/bits_192-4 872857674 1.380 ns/op 0 B/op 0 allocs/op
BenchmarkUint256BitLen/bits_255-4 865733118 1.431 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBitLen/bits_64-4 540039884 2.160 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBitLen/bits_128-4 600470946 1.953 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBitLen/bits_192-4 575591056 1.964 ns/op 0 B/op 0 allocs/op
BenchmarkBigIntBitLen/bits_255-4 570677790 2.301 ns/op 0 B/op 0 allocs/op
BenchmarkUint256Text/base_2-4 2326405 532.3 ns/op 512 B/op 2 allocs/op
BenchmarkUint256Text/base_8-4 5222268 214.1 ns/op 192 B/op 2 allocs/op
BenchmarkUint256Text/base_10-4 2746434 466.3 ns/op 160 B/op 2 allocs/op
BenchmarkUint256Text/base_16-4 7155501 195.4 ns/op 128 B/op 2 allocs/op
BenchmarkBigIntText/base_2-4 2267157 523.1 ns/op 528 B/op 2 allocs/op
BenchmarkBigIntText/base_8-4 4910493 246.6 ns/op 192 B/op 2 allocs/op
BenchmarkBigIntText/base_10-4 2231427 529.7 ns/op 224 B/op 3 allocs/op
BenchmarkBigIntText/base_16-4 6620342 184.6 ns/op 135 B/op 2 allocs/op
BenchmarkUint256Format/base_2-4 1265379 966.8 ns/op 768 B/op 3 allocs/op
BenchmarkUint256Format/base_8-4 2023450 523.9 ns/op 288 B/op 3 allocs/op
BenchmarkUint256Format/base_10-4 1683890 712.5 ns/op 240 B/op 3 allocs/op
BenchmarkUint256Format/base_16-4 2705778 450.0 ns/op 192 B/op 3 allocs/op
BenchmarkBigIntFormat/base_2-4 1315701 919.3 ns/op 552 B/op 5 allocs/op
BenchmarkBigIntFormat/base_8-4 2011153 584.3 ns/op 216 B/op 5 allocs/op
BenchmarkBigIntFormat/base_10-4 1255659 889.5 ns/op 248 B/op 6 allocs/op
BenchmarkBigIntFormat/base_16-4 2304246 522.1 ns/op 160 B/op 5 allocs/op
BenchmarkUint256PutBig-4 46232820 23.74 ns/op 0 B/op 0 allocs/op
BenchmarkUint256SetBig-4 26237026 43.49 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/decred/dcrd/internal/staging/primitives/uint256 230.729s
This looks great. The documentation is excellent, making the logic easy to follow. I just had a few very minor comments inline.
Profiling the CPU usage during an initial chain sync shows that roughly 65-70% of all time is spent in garbage collection operations. This is primarily the result of a large number of in-use allocations.

Profiling the in-use allocations shows that almost a quarter of all in-use allocations (~22%) are due to standard library big integers, which require allocations. In other words, eliminating those allocations should lead to a speedup of around 10% to the initial chain sync.

More specifically, the allocations in question are the result of several important calculations which could be done without allocations, and more efficiently in terms of execution time, via fixed precision unsigned 256-bit integers.

Thus, motivated by the previous discussion, this is part of a series of commits that implements highly optimized allocation free fixed precision unsigned 256-bit integer arithmetic that can ultimately be used in place of the standard library big integers.

For the time being, the package is introduced into the internal staging area for initial review.
The following is a brief overview of the main features and benefits:

- Strong focus on performance and correctness
  - Every operation is faster than the stdlib big.Int equivalent and most operations, including the primary math operations, are significantly faster
- Allocation free
  - All non-formatting operations with the specialized type are allocation free
- Supports boolean comparison, bitwise logic, and bitwise shift operations
- All operations are performed modulo 2^256
- Ergonomic API with unary-style arguments as well as some binary variants
- Conversion-free support for interoperation with native uint64 integers
- Direct conversion to and from little and big endian byte arrays
- Full support for formatted output and common base conversions
  - Formatted output uses fewer allocations than stdlib big.Int
- 100% test coverage
- Comprehensive benchmarks

In order to help ease the review process, the full implementation will be done across many commits. This commit only contains the basic type definition and ability to set it to a uint64 or another uint256 along with tests, so it is not very useful on its own.

Future commits will implement support for interpreting and producing big and little endian bytes, the primary arithmetic operations (addition, subtraction, multiplication, squaring, division, negation), bitwise operations (lsh, rsh, not, or, and, xor), comparison operations (equals, less, greater, cmp), and other convenience methods such as determining the minimum number of bits required to represent the current value, whether or not the value can be represented as a uint64 without loss of precision, and text formatting with base conversion.
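To make the "allocation free" and "modulo 2^256" points concrete, here is a minimal sketch of what a fixed-size type of this kind could look like. The `Uint256` layout (four 64-bit words, least significant first) and the method names are assumptions for illustration only and are not taken from the package in this PR.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Uint256 is a hypothetical fixed-size 256-bit unsigned integer stored as
// four 64-bit words with the least significant word first. Because it is a
// plain array, values can live on the stack and need no heap allocation.
type Uint256 struct {
	n [4]uint64
}

// SetUint64 sets the value to v and returns the receiver for chaining.
func (u *Uint256) SetUint64(v uint64) *Uint256 {
	u.n = [4]uint64{v, 0, 0, 0}
	return u
}

// Add sets u = (u + v) mod 2^256. The carry out of the top word is
// discarded, which is what makes the result wrap modulo 2^256.
func (u *Uint256) Add(v *Uint256) *Uint256 {
	var c uint64
	u.n[0], c = bits.Add64(u.n[0], v.n[0], 0)
	u.n[1], c = bits.Add64(u.n[1], v.n[1], c)
	u.n[2], c = bits.Add64(u.n[2], v.n[2], c)
	u.n[3], _ = bits.Add64(u.n[3], v.n[3], c)
	return u
}

func main() {
	all := ^uint64(0)
	maxVal := &Uint256{n: [4]uint64{all, all, all, all}}
	one := new(Uint256).SetUint64(1)
	maxVal.Add(one)       // 2^256 - 1 + 1 wraps around to 0
	fmt.Println(maxVal.n) // words, least significant first
}
```

The carry chain through `bits.Add64` is the whole trick: no slices grow, so nothing needs to be allocated, in contrast to an arbitrary precision integer.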
This adds the ability for the uint256 to be set by interpreting arrays and slices as a 256-bit big-endian integer and associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
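As a rough illustration of interpreting bytes as a 256-bit big-endian integer, the following sketch loads a 32-byte array into four 64-bit words. The function name `setBytesBE` and the word order (least significant word first) are assumptions, not the package's actual API, and slice inputs (which need length handling) are left out.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// setBytesBE interprets b as a 256-bit big-endian integer and returns it as
// four 64-bit words with the least significant word first. The last 8 bytes
// of the input therefore become word 0.
func setBytesBE(b *[32]byte) [4]uint64 {
	return [4]uint64{
		binary.BigEndian.Uint64(b[24:32]),
		binary.BigEndian.Uint64(b[16:24]),
		binary.BigEndian.Uint64(b[8:16]),
		binary.BigEndian.Uint64(b[0:8]),
	}
}

func main() {
	var b [32]byte
	b[31] = 0x01 // least significant byte
	b[0] = 0x80  // most significant byte
	fmt.Printf("%#x\n", setBytesBE(&b))
}
```

Taking a pointer to a fixed-size array avoids both bounds ambiguity and any copying into a temporary slice.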
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name      old time/op    new time/op    delta
------------------------------------------------------------------
SetBytes  9.09ns ±13%    3.05ns ± 1%    -66.43%  (p=0.000 n=10+10)

name      old allocs/op  new allocs/op  delta
----------------------------------------------------------
SetBytes  0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds the ability to determine if a uint256 is odd along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name   old time/op    new time/op    delta
------------------------------------------------------------------
IsOdd  3.62ns ± 4%    1.64ns ± 1%    -54.65%  (p=0.000 n=10+10)

name   old allocs/op  new allocs/op  delta
---------------------------------------------------------
IsOdd  0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support for uint256 bitwise left shifting along with associated tests to ensure proper functionality. It includes left shifting an existing uint256 (a << b) and assigning the result of left shifting a uint256 to a second one (a <<= b). This is part of a series of commits to fully implement the uint256 package.
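A multi-word left shift can be sketched as below, assuming a value stored as four 64-bit words with the least significant word first. The function `lsh` is hypothetical and uses a simple loop for clarity; it is not the implementation from this PR, which may be structured quite differently for speed.

```go
package main

import "fmt"

// lsh returns n << s modulo 2^256 for a value stored as four 64-bit words,
// least significant word first. Shifts of 256 bits or more produce zero.
func lsh(n [4]uint64, s uint) [4]uint64 {
	var r [4]uint64
	if s >= 256 {
		return r
	}
	words, rem := int(s/64), s%64
	for i := 3; i >= words; i-- {
		// Move the source word up by whole words, then by the leftover bits.
		r[i] = n[i-words] << rem
		// Pull in the bits that spill over from the next lower source word.
		if rem != 0 && i-words > 0 {
			r[i] |= n[i-words-1] >> (64 - rem)
		}
	}
	return r
}

func main() {
	fmt.Printf("%#x\n", lsh([4]uint64{1, 0, 0, 0}, 255))
}
```

Bits shifted past the top word are simply dropped, which matches the modulo 2^256 behavior described for the package.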
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name          old time/op    new time/op    delta
-------------------------------------------------------------------------
Lsh/bits_0     7.1ns ± 3%    2.58ns ± 1%    -63.94%  (p=0.000 n=10+10)
Lsh/bits_1    14.8ns ± 1%     4.2ns ± 1%    -71.40%  (p=0.000 n=10+10)
Lsh/bits_64   16.7ns ± 1%     2.7ns ± 1%    -84.00%  (p=0.000 n=10+10)
Lsh/bits_128  16.9ns ± 2%     2.7ns ± 0%    -84.21%  (p=0.000 n=10+10)
Lsh/bits_192  16.6ns ± 1%     2.6ns ± 1%    -84.19%  (p=0.000 n=10+10)
Lsh/bits_255  16.3ns ± 2%     2.8ns ± 2%    -83.11%  (p=0.000 n=10+10)
Lsh/bits_256  16.9ns ± 2%     2.6ns ± 2%    -84.77%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
----------------------------------------------------------------
Lsh/bits_0    0.00           0.00           ~     (all equal)
Lsh/bits_1    0.00           0.00           ~     (all equal)
Lsh/bits_64   0.00           0.00           ~     (all equal)
Lsh/bits_128  0.00           0.00           ~     (all equal)
Lsh/bits_192  0.00           0.00           ~     (all equal)
Lsh/bits_255  0.00           0.00           ~     (all equal)
Lsh/bits_256  0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support for uint256 bitwise right shifting along with associated tests to ensure proper functionality. It includes right shifting an existing uint256 (a >> b) and assigning the result of right shifting a uint256 to a second one (a >>= b). This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name          old time/op    new time/op    delta
-------------------------------------------------------------------------
Rsh/bits_0    8.76ns ± 2%    2.57ns ± 1%    -70.63%  (p=0.000 n=10+10)
Rsh/bits_1    14.4ns ± 2%     4.3ns ± 2%    -70.28%  (p=0.000 n=10+10)
Rsh/bits_64   12.8ns ± 1%     2.9ns ± 2%    -77.31%  (p=0.000 n=10+10)
Rsh/bits_128  11.8ns ± 0%     2.9ns ± 2%    -75.51%  (p=0.000 n=10+10)
Rsh/bits_192  10.5ns ± 2%     2.6ns ± 1%    -75.17%  (p=0.000 n=10+10)
Rsh/bits_255  10.5ns ± 3%     2.8ns ± 2%    -73.89%  (p=0.000 n=10+10)
Rsh/bits_256  5.50ns ± 1%    2.58ns ± 2%    -53.15%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
----------------------------------------------------------------
Rsh/bits_0    0.00           0.00           ~     (all equal)
Rsh/bits_1    0.00           0.00           ~     (all equal)
Rsh/bits_64   0.00           0.00           ~     (all equal)
Rsh/bits_128  0.00           0.00           ~     (all equal)
Rsh/bits_192  0.00           0.00           ~     (all equal)
Rsh/bits_255  0.00           0.00           ~     (all equal)
Rsh/bits_256  0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support to compute the bitwise not of a uint256 along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name  old time/op    new time/op    delta
-----------------------------------------------------------------
Not   25.4ns ± 2%    3.3ns ± 2%     -86.79%  (p=0.000 n=10+10)

name  old allocs/op  new allocs/op  delta
--------------------------------------------------------
Not   0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support to compute the bitwise or of two uint256s along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name  old time/op    new time/op    delta
-----------------------------------------------------------------
Or    17.9ns ± 5%    3.4ns ± 6%     -80.94%  (p=0.000 n=10+10)

name  old allocs/op  new allocs/op  delta
--------------------------------------------------------
Or    0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support to compute the bitwise and of two uint256s along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name  old time/op    new time/op    delta
-----------------------------------------------------------------
And   16.7ns ± 5%    3.4ns ± 6%     -79.93%  (p=0.000 n=10+10)

name  old allocs/op  new allocs/op  delta
--------------------------------------------------------
And   0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support to compute the bitwise xor of two uint256s along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name  old time/op    new time/op    delta
-----------------------------------------------------------------
Xor   17.9ns ± 5%    3.4ns ± 6%     -80.91%  (p=0.000 n=10+10)

name  old allocs/op  new allocs/op  delta
--------------------------------------------------------
Xor   0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds support for determining the minimum number of bits required to represent the current value of a uint256 along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
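The minimum bit length of a multi-word value can be computed by finding the most significant nonzero word. The sketch below assumes four 64-bit words stored least significant first; `bitLen` is a hypothetical helper, not the package's method.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitLen returns the minimum number of bits needed to represent the value
// held in n (least significant word first). Zero has a bit length of 0.
func bitLen(n [4]uint64) int {
	for i := 3; i >= 0; i-- {
		if n[i] != 0 {
			// All lower words contribute 64 bits each; the top nonzero
			// word contributes only its significant bits.
			return i*64 + bits.Len64(n[i])
		}
	}
	return 0
}

func main() {
	fmt.Println(bitLen([4]uint64{0, 1, 0, 0}))
}
```

`bits.Len64` typically compiles down to a single leading-zero-count instruction on common platforms, which is consistent with the very small timings reported for this operation.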
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name      old time/op    new time/op    delta
---------------------------------------------------------------------
bits_64   2.24ns ± 1%    1.94ns ± 3%    -13.04%  (p=0.000 n=10+10)
bits_128  2.25ns ± 2%    1.96ns ± 2%    -13.17%  (p=0.000 n=10+10)
bits_192  2.25ns ± 1%    1.60ns ± 1%    -28.65%  (p=0.000 n=10+10)
bits_255  2.26ns ± 2%    1.61ns ± 1%    -29.04%  (p=0.000 n=10+10)

name      old allocs/op  new allocs/op  delta
------------------------------------------------------------
bits_64   0.00           0.00           ~     (all equal)
bits_128  0.00           0.00           ~     (all equal)
bits_192  0.00           0.00           ~     (all equal)
bits_255  0.00           0.00           ~     (all equal)

This is part of a series of commits to fully implement the uint256 package.
This adds full support for formatting a uint256 along with associated tests to ensure proper functionality. It includes a fmt.Formatter that supports the full suite of the fmt package format flags for integral types, a fmt.Stringer, and a separate Text method that accepts an output base directly and produces the relevant output with fewer allocations than using the standard fmt methods. This is part of a series of commits to fully implement the uint256 package.
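As a small illustration of why a direct text conversion can avoid extra allocations, the sketch below renders a value in base 16 using a fixed stack buffer, so the only allocation is the returned string. The type layout and the method name `Text16` are assumptions; the `Text` method described above accepts an arbitrary output base, which this sketch does not attempt.

```go
package main

import "fmt"

// Uint256 is a hypothetical 256-bit value, least significant word first.
type Uint256 struct {
	n [4]uint64
}

const hexDigits = "0123456789abcdef"

// Text16 returns the value in base 16 without leading zeros. Digits are
// written into a fixed-size buffer from the end toward the front.
func (u Uint256) Text16() string {
	var buf [64]byte
	pos := len(buf)
	for _, w := range u.n { // least significant word first
		for i := 0; i < 16; i++ {
			pos--
			buf[pos] = hexDigits[w&0xf]
			w >>= 4
		}
	}
	// Skip leading zeros but always keep at least one digit.
	start := 0
	for start < len(buf)-1 && buf[start] == '0' {
		start++
	}
	return string(buf[start:])
}

func main() {
	fmt.Println(Uint256{n: [4]uint64{0xff, 0, 0, 0}}.Text16())
}
```

Power-of-two bases only need shifts and masks; base 10 requires repeated division, which may be why it is slower than the base 8 and base 16 cases in the timings below.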
The following is a comparison between stdlib big integers (old) and the specialized type (new) averaging 10 runs each:

name            old time/op    new time/op    delta
---------------------------------------------------------------------------
Text/base_2     579ns ± 3%     496ns ± 2%     -14.37%  (p=0.000 n=10+10)
Text/base_8     266ns ± 1%     227ns ± 1%     -14.58%  (p=0.000 n=10+10)
Text/base_10    536ns ± 1%     458ns ± 2%     -14.58%  (p=0.000 n=10+10)
Text/base_16    205ns ± 2%     180ns ± 4%     -11.90%  (p=0.000 n=10+10)
Format/base_2   987ns ±15%     852ns ± 2%     -13.64%  (p=0.000 n=10+10)
Format/base_8   620ns ± 6%     544ns ± 3%     -12.31%  (p=0.000 n=10+10)
Format/base_10  888ns ± 1%     726ns ± 1%     -18.25%  (p=0.000 n=10+10)
Format/base_16  565ns ± 1%     449ns ± 1%     -20.41%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
--------------------------------------------------------------------------
Text/base_2     2.00 ± 0%      2.00 ± 0%      ~     (all equal)
Text/base_8     2.00 ± 0%      2.00 ± 0%      ~     (all equal)
Text/base_10    3.00 ± 0%      2.00 ± 0%      -33.33%  (p=0.000 n=10+10)
Text/base_16    2.00 ± 0%      2.00 ± 0%      ~     (all equal)
Format/base_2   5.00 ± 0%      3.00 ± 0%      -40.00%  (p=0.000 n=10+10)
Format/base_8   5.00 ± 0%      3.00 ± 0%      -40.00%  (p=0.000 n=10+10)
Format/base_10  6.00 ± 0%      3.00 ± 0%      -50.00%  (p=0.000 n=10+10)
Format/base_16  5.00 ± 0%      3.00 ± 0%      -40.00%  (p=0.000 n=10+10)

This is part of a series of commits to fully implement the uint256 package.
This adds convenience methods for converting a uint256 to a standard library big integer along with associated tests to ensure proper functionality. It includes a method that allows an existing big integer to be reused thereby potentially saving allocations as well as a method that returns a new big integer. The latter is often more convenient to use, but is also virtually guaranteed to cause an allocation. This is part of a series of commits to fully implement the uint256 package.
The following shows the typical performance of converting a uint256 to a standard library big integer using one that already exists:

Uint256PutBig  43651442  27.29 ns/op  0 B/op  0 allocs/op

This is part of a series of commits to fully implement the uint256 package.
This adds a convenience method for converting a standard library big integer to a uint256 (modulo 2^256) along with associated tests to ensure proper functionality. This is part of a series of commits to fully implement the uint256 package.
The following shows the typical performance of converting a standard library big integer that has already been reduced modulo 2^256 to a uint256:

Uint256SetBig  26944130  44.45 ns/op  0 B/op  0 allocs/op

This is part of a series of commits to fully implement the uint256 package.
This adds an example of calculating the result of dividing a max unsigned 256-bit integer by a max unsigned 128-bit integer and outputting that result in hex with leading zeros. This is part of a series of commits to fully implement the uint256 package.
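For reference, the same calculation can be reproduced with standard library big integers, which gives a value to compare the package example against. This is only a cross-check written with `math/big`, not the example added by the commit; the helper name `maxDivHex` is made up for illustration.

```go
package main

import (
	"fmt"
	"math/big"
)

// maxDivHex divides the maximum unsigned 256-bit integer by the maximum
// unsigned 128-bit integer and returns the quotient as 64 hex digits with
// leading zeros.
func maxDivHex() string {
	one := big.NewInt(1)
	max256 := new(big.Int).Sub(new(big.Int).Lsh(one, 256), one) // 2^256 - 1
	max128 := new(big.Int).Sub(new(big.Int).Lsh(one, 128), one) // 2^128 - 1
	q := new(big.Int).Quo(max256, max128)
	return fmt.Sprintf("%064x", q)
}

func main() {
	fmt.Println(maxDivHex())
}
```

Since 2^256 - 1 = (2^128 - 1)(2^128 + 1), the division is exact and the quotient is 2^128 + 1.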
Profiling the CPU usage during an initial chain sync shows that roughly 65-70% of all time is spent in garbage collection operations. This is primarily the result of a large number of in-use allocations. Profiling the in-use allocations shows that almost a quarter of all in-use allocations (~22%) are due to standard library big integers, which require allocations. In other words, eliminating those allocations should lead to a speedup of around 5-10% to the initial chain sync. However, note that this series of commits only introduces the package and does not update all of the relevant code to make use of it, as that will be done separately.
More specifically, the allocations in question are the result of several important calculations which could be done without allocations, and more efficiently in terms of execution time, via fixed precision unsigned 256-bit integers.
Thus, motivated by the previous discussion, this is part of a series of commits that implements highly optimized allocation free fixed precision unsigned 256-bit integer arithmetic that can ultimately be used in place of the standard library big integers.
For the time being, the package is introduced into the internal staging area for initial review.
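The following is a brief overview of the main features and benefits:

- Strong focus on performance and correctness
  - Every operation is faster than the stdlib big.Int equivalent and most operations, including the primary math operations, are significantly faster
- Allocation free
  - All non-formatting operations with the specialized type are allocation free
- Supports boolean comparison, bitwise logic, and bitwise shift operations
- All operations are performed modulo 2^256
- Ergonomic API with unary-style arguments as well as some binary variants
- Conversion-free support for interoperation with native uint64 integers
- Direct conversion to and from little and big endian byte arrays
- Full support for formatted output and common base conversions
  - Formatted output uses fewer allocations than stdlib big.Int
- 100% test coverage
- Comprehensive benchmarks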
README.md

The following benchmark results demonstrate the performance of most operations as compared to standard library big.Ints. The benchmarks are from a Ryzen 7 1700 processor and are the result of feeding benchstat 10 iterations of each.

[Benchmark tables not recoverable: Arithmetic Methods, Comparison Methods, Bitwise Methods, Conversion Methods, Misc Convenience Methods, and Output Formatting Methods, each comparing big.Int Time/Op against Uint256 Time/Op.]
This is work towards #2786.