Continuation from Gitter; just another crazy idea.
“One more I can think of is to start with highly asic resistant cuckoo, but gradually (based on height) remove extra bits (and/or instructions) that come into siphash24 until at some point zero is added and we are left with pure cuckoo30 again.”
A lean miner for such a scheme initially needs to read 64 bytes per edge (up from one bit in stock cuckoo): 32 B first to set the counters, then 32 B again to read the counters and trim edges. Compare that to a GPU, which reads/writes approx. 24 bytes per edge after the initial lookup. Over time this figure keeps shrinking until the ASIC needs to read the same amount of data, while otherwise being much more efficient. An ASIC maker therefore has to plan strategically for the optimal time to deploy a lean miner.
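To make the timing question concrete, here is a rough sketch of how the per-edge read cost could shrink with height. The linear decay, the 1,000,000-block decay window, and all function names are my own assumptions for illustration; only the 32 B per pass, the two passes, and the approx. 24 bytes GPU figure come from the numbers above. It only counts the extra data, not the one bit per edge that stock cuckoo still needs.

```python
# All constants are illustrative assumptions, not values from the proposal,
# except the 32 B per pass and the ~24 B GPU figure quoted in the post.
INITIAL_EXTRA_BYTES = 32     # extra bytes read per pass at height 0
DECAY_HEIGHTS = 1_000_000    # hypothetical height where extra data hits zero
GPU_BYTES_PER_EDGE = 24      # approximate GPU traffic per edge


def extra_bytes(height: int) -> int:
    """Extra bytes per pass at a given height, decaying linearly to zero."""
    if height >= DECAY_HEIGHTS:
        return 0
    remaining = DECAY_HEIGHTS - height
    # Ceiling division, so the value only reaches zero at DECAY_HEIGHTS.
    return -((-INITIAL_EXTRA_BYTES * remaining) // DECAY_HEIGHTS)


def lean_bytes_per_edge(height: int) -> int:
    """Two passes per edge: one to set counters, one to read them and trim."""
    return 2 * extra_bytes(height)


def crossover_height() -> int:
    """First height where the lean miner reads no more than the GPU figure."""
    lo, hi = 0, DECAY_HEIGHTS
    while lo < hi:  # binary search works because the cost never increases
        mid = (lo + hi) // 2
        if lean_bytes_per_edge(mid) <= GPU_BYTES_PER_EDGE:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    for h in (0, 250_000, 500_000, 750_000, DECAY_HEIGHTS):
        print(h, lean_bytes_per_edge(h))
    print("crossover at height", crossover_height())
```

With these made-up parameters the lean miner drops to the GPU's 24 bytes per edge five-eighths of the way through the decay window. A real schedule could of course be stepped or non-linear, which would move that point.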
To make it work on 8 GB GPUs and leverage their architecture, only one edge endpoint could do this, while the other might execute a short program with a 16 KB chain-data lookup. This simple measure prevents an ASIC from doubling its speed for free. The number of instructions would slowly decrease to zero, turning both endpoints into pure cuckoo for easy verification once the chain gets larger and more popular.
EDIT: Instead of a random program on one side of the edge, %1MB and %16KB D lookups can be used.