In September last year, I introduced Cuckatoo to address a weakness in Cuckoo Cycle, namely that it needs log_2(3) ~ 1.5849625 bits per node, where Cuckatoo needs only 1 bit per node.
After some more reflection, it appears that trits may not be completely out of the picture yet.
A lean solver loops over all remaining edges twice per round. Once to set bits on all adjacent endpoints in one partition, and once to kill edges whose endpoint partner bit is not set. Remember that in Cuckatoo nodes in each partition are paired up so that edges incident on one node are considered adjacent to edges incident on the other node in the pair. We can number nodes such that each pair consist of an even and an odd node.
If we define the following constants for various operations
(updating is reading plus possibly modifying shorty afterwards):
er: cost of reading an edge bit
eu: cost of updating an edge bit
sh: cost of computing a siphash
nr2: cost of reading a node bit
nu2: cost of updating a node bit
then the cost of a round with E out of M edges remaining is
M * (er + eu) + E * (2*sh + nu2 + nr2)
If we halve the node bitmap while doubling the passes over the edge bitmap, then the cost becomes
M * (2 * er + 2 * eu) + E * (4*sh + nu2 + nr2)
Using a trit per node pair reduces the node bitmap memory by ~ 40% and allows a cost of
M * (er + 2 * eu) + E * (3sh + nu3 + 0.5nr3)
where nr3/nu3 are the corresponding reading/update costs for trits.
Quite appropriately for trits, this uses a 3-pass approach:
First set trits for node pairs with an even endpoint to 1, then test trits for node pairs with an odd endpoint, killing the edge if the trit is 0, and setting it to 2 otherwise.
Finally, test trits for node pairs with an even endpoint again, killing the edge if the trit is 1.
Compared to the earlier bit-based approaches, this one is rather more messy. In the interest of simplicity, I hope that ASIC manufacturers, who might have considered this approach, deemed it not a worthwhile tradeoff.
Please let me know what you thinkā¦