Here’s how the more truncated buckets on the 2080 Ti affect the number of cycles found among the first 100 nonces:
| cycle length | cuda31.0 | cuda31.1 |
|---|---|---|
| 2 | 40 | 41 |
| 4 | 19 | 22 |
| 6 | 11 | 11 |
| 8 | 10 | 10 |
| 10 | 8 | 8 |
| 12 | 6 | 6 |
| 14 | 12 | 15 |
| 16 | 2 | 3 |
| 18 | 2 | 4 |
| 20 | 3 | 3 |
| 22 | 5 | 5 |
| 24 | 2 | 2 |
| 26 | 4 | 4 |
| 28 | 5 | 6 |
| 30 | 1 | 2 |
| 32 | 1 | 1 |
| 34 | 3 | 5 |
| 40 | 2 | 2 |
| 42 | 1 | 1 |

On this range, cuda31.1 loses no cycles compared to the reference lcuda31 (which is guaranteed to find all cycles).
cuda31.0 is the soon-to-be-released miner that can use all 64 KB of shared memory on Turing GPUs (like the 2080 Ti). cuda31.1 is the already released miner that needs to set PART_BITS=1 to cope with only 32/48 KB of shared memory, as on the 1080 Ti.
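
To put a number on the difference, the table can be tallied with a few lines of Python. This is just a quick sketch: the counts are copied by hand from the table above, and the names `counts` and `summarize` are mine, not anything from the miner source.

```python
# Cycle counts copied from the table: cycle length -> (cuda31.0, cuda31.1)
counts = {
    2: (40, 41), 4: (19, 22), 6: (11, 11), 8: (10, 10), 10: (8, 8),
    12: (6, 6), 14: (12, 15), 16: (2, 3), 18: (2, 4), 20: (3, 3),
    22: (5, 5), 24: (2, 2), 26: (4, 4), 28: (5, 6), 30: (1, 2),
    32: (1, 1), 34: (3, 5), 40: (2, 2), 42: (1, 1),
}


def summarize(counts):
    """Return both totals and the per-length gap (cuda31.1 minus cuda31.0)."""
    total_0 = sum(a for a, _ in counts.values())
    total_1 = sum(b for _, b in counts.values())
    # Keep only the cycle lengths where the two miners disagree.
    gap = {length: b - a for length, (a, b) in counts.items() if a != b}
    return total_0, total_1, gap


total_0, total_1, gap = summarize(counts)
print(f"cuda31.0 total: {total_0}")
print(f"cuda31.1 total: {total_1}")
print(f"difference:     {total_1 - total_0}")
for length, diff in sorted(gap.items()):
    print(f"  length {length:2d}: {diff} fewer in cuda31.0")
```

By this count cuda31.0 finds 137 cycles against 151 for cuda31.1 over these 100 nonces, so 14 fewer (roughly 9%), spread over cycle lengths 2, 4, 14, 16, 18, 28, 30 and 34.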