A Quantitative Research about Grin Privacy

I believe this is much harder than it appears at first. It should be relatively easy to show that little can be learned about amounts and addresses, but the transaction graph is complex to analyze. The main reason is that you don't know what side information observers hold, and that metadata can help them reason about who owns which outputs. Here's one example I expect to see in the future.

Suppose you know that output P is owned by a store at physical location S. If P is spent in a payjoin transaction T, you can probabilistically map the GPS locations of the cellular phones (and hence identities) within range of S at the time T arrived in the mempool. This gives you a set of people who are good candidates for owning one of the outputs created in T. When those outputs are spent in turn, you can use location information in the same way to drastically reduce the set, quite possibly to a single individual. From there you can start probabilistically tracking the user's spending based on this metadata.

This isn't a problem specific to payjoins. Even on a plain receive, the store owner could report that they own P, and some likely will. Nor is it specific to physical locations. Instead of a physical location you can have a digital one, and instead of GPS coordinates you have network traffic and timing analysis (which is harder, but probably not impossible). To make matters worse, the analysis can be done retroactively: even if the links are unlikely to be known today, anyone who collects all the information can run the computation later, once they know how to tie the links to identities.
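The narrowing step described above amounts to intersecting candidate sets across observations. Here is a minimal sketch of that idea, assuming the observer holds a list of `(identity, place, timestamp)` sightings and a list of `(place, timestamp)` events at which a tracked output appeared. All names and data are hypothetical and only illustrate the reasoning, not any real tool.

```python
def candidates_near(sightings, place, t, window):
    """Return identities seen at `place` within `window` seconds of time `t`."""
    return {who for (who, where, when) in sightings
            if where == place and abs(when - t) <= window}


def narrow_owner(sightings, observations, window=60):
    """Intersect the candidate sets of several (place, time) observations."""
    candidates = None
    for place, t in observations:
        seen = candidates_near(sightings, place, t, window)
        candidates = seen if candidates is None else candidates & seen
    return candidates if candidates is not None else set()


# Hypothetical location metadata: (identity, place, unix-like timestamp).
sightings = [
    ("alice", "store_S", 1000),
    ("bob", "store_S", 1010),
    ("carol", "store_S", 990),
    ("alice", "cafe_C", 5000),
    ("dave", "cafe_C", 5020),
]

# Transaction T hits the mempool near store_S at t=1000; one of its
# outputs is later spent near cafe_C at t=5000.
observations = [("store_S", 1000), ("cafe_C", 5000)]

print(narrow_owner(sightings, observations))
```

In this toy data the first observation leaves three candidates and the second leaves one. Real metadata would be noisy, so an actual analysis would assign probabilities rather than take a hard intersection.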

How do you solve this? One obvious path is to get rid of the transaction graph altogether, which is what Zcash aims to do, or to have a very large anonymity set (Lelantus). But if you retain the UTXO model, as Bitcoin and Grin do, you will have these issues. Can we fight this and make the analysis hard? I'm not sure it's possible to prevent someone from learning about an output through metadata. Location data will be available, and deriving the owner from it will likely be possible. The real issue, in my opinion, is the subsequent tracking that follows from knowing the identity behind one transaction. The fact that you bought X at store Y can always be leaked by the store; the question is whether that knowledge then lets people track your next transactions.

One attempt to mitigate this is the Mimblewimble CoinSwap proposal. If it were the default wallet behaviour, an observer might still learn which output is yours as you spend it (through metadata such as location), but that knowledge would only last for 24 hours, because after 24 hours the output becomes one of 1000 outputs. So if you only spend outputs whose ownership has been forgotten in this way, an observer would learn that the output belonged to you but would not be able to tell its history. In other words, fungibility is improved. It blinds all of your neighbours from doing analysis on your transactions, and it further blinds everyone if at least one mix node is honest. While not perfect, it seems to me like a good start at addressing the issues mentioned above.
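To put a rough number on "one of 1000 outputs", here is a small sketch of how a mixing round limits what an observer can infer. It assumes the observer has no further metadata and guesses uniformly among the mixed outputs; the function names are hypothetical, and only the figure of 1000 comes from the discussion above.

```python
import math


def link_probability(anonymity_set_size):
    """Chance of a correct uniform guess at which mixed output is the target."""
    return 1.0 / anonymity_set_size


def anonymity_bits(anonymity_set_size):
    """Uncertainty of a uniform anonymity set, in bits."""
    return math.log2(anonymity_set_size)


def link_probability_after_rounds(anonymity_set_size, rounds):
    """Chance of following an output through several independent rounds."""
    return link_probability(anonymity_set_size) ** rounds


# An unmixed output is linked with certainty; one round over 1000 outputs
# leaves a uniform guesser with a 0.1% chance.
print(link_probability(1))
print(link_probability(1000))
print(anonymity_bits(1000))
```

Any extra metadata (timing, amounts the observer controls, repeated location sightings) would make the guess non-uniform, so these figures are an upper bound on the privacy gained, not a guarantee.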
