Mimblewimble CoinSwap proposal

tromp · January 28, 2021, 11:18pm

Abstract

We present a coin shuffling proposal with the following properties:

Users submit self-spends throughout the day. No interaction needed for shuffling.
Shuffling is performed at the end of the day by a set of mixnodes that cannot steal any coins.
Invalid self-spends are automatically filtered out. No need to abort or restart the shuffling.
As long as at least one mixnode is honest, no one learns the input-output links.
The size of the shuffle is limited only by blocksize and could easily be over a thousand.
Each shuffle only grows the chainsize by a small constant (~100 bytes), thanks to MW cut-through.

Widespread use of the protocol would leave the transaction graph mostly obscured.

Introduction

This document proposes a possible design for a “trustless” Mimblewimble CoinSwap service to which users submit self spend data throughout the day (or hour or any desired period) which are validated, aggregated, and published at the end of the day.

Coinswaps are useful for obfuscating the Mimblewimble transaction graph. They can be applied to self spending outputs that were recently received, or to self-spends resulting from canceled/aborted transactions. The lack of urgency of such self spends lends itself well to the slow nature (and possible failure) of coinswaps.

Whereas self spends normally reveal their input-output link publicly in every mempool on the network, use of a CoinSwap service obscures this link from public view. If one further trusts at least one service node to be honest, then the link is obscured from the nodes’ view as well. In this sense little trust is needed to benefit from the service. If all nodes collude, they not only learn the input-output links, but could also choose to revert the entire coinswap after it confirms on-chain. Although very unlikely, this would still be a manageable problem for wallets.

This document is not concerned with aggregation of regular transactions, or what could be called a CoinJoin service. Consistent use of coinswaps on outputs received in regular transactions should make coinjoins mostly redundant, although it does come at the cost of roughly doubling fees.

This design follows the Grin forum discussion at Daily Aggregator (Partial Transactions) - #3 by antioch and combines ideas of users antioch, oryhp, vegycslol, and myself.

We first present the general design, and then discuss several refinements dealing with practical issues such as timing attacks, spam attacks and ownership proofs.

The CoinSwap protocol

MWixnet Architecture

The protocol is similar to that of mixnets, but instead of mixing messages from senders to receivers, we mix self spends from coin owners to themselves. Whereas the messages in a mixnet do not change, our mixnodes transform commitments into new ones, and then achieve mixing by sorting the results.

Let there be m self spends spend 1 … spend m, and n mixnodes node 1 … node n. We let subscript i range over spends, and subscript j range over nodes.

Let x_i,j be the randomly generated excess by which node j should additively transform the commitment in spend i: C_i,j = C_i,j-1 + x_i,j * G.

Conceptually C_i,0 and C_i,n are the input and output of spend i, except for a necessary fee adjustment:

C_i,0 = C_iⁱⁿ - fee * H
C_i,n = C_i,0 + (Σ_j x_i,j) * G = C_i^out

The excesses form a matrix such as this one for 4 self spends and 3 servers:

x_1,1 x_1,2 x_1,3

x_2,1 x_2,2 x_2,3

x_3,1 x_3,2 x_3,3

x_4,1 x_4,2 x_4,3

which correspond to the linkages in this matrix C of commitments:

C_1,0 <–> C_1,1 <–> C_1,2 <–> C_1,3

C_2,0 <–> C_2,1 <–> C_2,2 <–> C_2,3

C_3,0 <–> C_3,1 <–> C_3,2 <–> C_3,3

C_4,0 <–> C_4,1 <–> C_4,2 <–> C_4,3

Each node receives a sorted column, not knowing the i indices shown, transforms each commitment by some excess specified for that commitment, sorts the results, and passes on the new column to the next node.
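The column-by-column transformation can be sketched in Python. This is a toy model only: it assumes the multiplicative group modulo a Mersenne prime as a stand-in for the elliptic curve, so "C + x * G" becomes "C * G^x mod P". The names P, G, H, commit, transform and mix are illustrative and not part of the proposal, and the fee adjustment is left out.

```python
import secrets

# Toy stand-in for the curve group (assumption, NOT secure): the multiplicative
# group mod a Mersenne prime. "C + x*G" becomes "C * G^x mod P".
P = 2**127 - 1
G, H = 3, 5

def commit(r: int, v: int) -> int:
    """Pedersen-style commitment r*G + v*H in the toy group."""
    return pow(G, r, P) * pow(H, v, P) % P

def transform(c: int, x: int) -> int:
    """C_i,j = C_i,j-1 + x_i,j * G."""
    return c * pow(G, x, P) % P

def mix(column, excess_for):
    """One node: transform each commitment by its excess, then sort."""
    return sorted(transform(c, excess_for[c]) for c in column)

m, n = 4, 3  # 4 self spends, 3 nodes, as in the matrices above
blinds = [secrets.randbelow(P - 1) for _ in range(m)]
values = [10, 20, 30, 40]
x = [[secrets.randbelow(P - 1) for _ in range(n)] for _ in range(m)]

# Each user precomputes their own row C_i,0 ... C_i,n of the matrix.
rows = []
for i in range(m):
    row = [commit(blinds[i], values[i])]
    for j in range(n):
        row.append(transform(row[-1], x[i][j]))
    rows.append(row)

# The nodes only ever see sorted columns, never the index i.
column = sorted(row[0] for row in rows)
for j in range(n):
    # Tuples held by the (j+1)-th node: previous commitment -> excess.
    excess_for = {rows[i][j]: x[i][j] for i in range(m)}
    column = mix(column, excess_for)

# The final column holds the outputs: same values, blinding shifted by sum_j x_i,j.
expected = sorted(commit(blinds[i] + sum(x[i]), values[i]) for i in range(m))
```

Because every node re-sorts its results, the position of a commitment in one column says nothing about its position in the next; only a node's own tuples link the two.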

Data provision

Node 1 is provided, for each i, with a tuple data_i,1 containing:

commitment C_iⁱⁿ
a proof of ownership of C_iⁱⁿ
excess x_i,1

Node n is provided, for each i, with a tuple data_i,n containing:

commitment C_i,n-1
corresponding range proof BP_i
excess share x_i,n

Each other node j is provided, for each i, with a tuple data_i,j containing:

commitment C_i,j-1
excess share x_i,j

Input validation

Validation starts with node 1 computing, for each i, C_i,0 = C_iⁱⁿ - fee * H. It also verifies that each input commitment is unique, present in the UTXO set, and has a valid proof of ownership. All invalid data is removed (resulting in a smaller m).
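A minimal sketch of this filtering step, under stated assumptions: the same toy group as a stand-in for the curve, a hypothetical verify_proof callable for the proof of ownership, and the choice (an assumption here) to drop every submission whose input commitment appears more than once.

```python
from collections import Counter

P = 2**127 - 1   # toy group modulus (assumption, NOT secure)
H = 5            # toy stand-in for the value generator H

def validate_inputs(submissions, utxo_set, verify_proof, fee):
    """Return {C_i,0: submission} for every submission that passes all checks.

    submissions:  list of dicts with keys "commit", "proof", "excess"
    utxo_set:     set of currently unspent commitments
    verify_proof: callable (commit, proof) -> bool (hypothetical interface)
    """
    counts = Counter(s["commit"] for s in submissions)
    fee_shift = pow(pow(H, fee, P), -1, P)   # "- fee * H" in the toy group
    valid = {}
    for s in submissions:
        c_in = s["commit"]
        if counts[c_in] != 1:                 # input commitment must be unique
            continue
        if c_in not in utxo_set:              # must be present in the UTXO set
            continue
        if not verify_proof(c_in, s["proof"]):
            continue
        valid[c_in * fee_shift % P] = s       # C_i,0 = C_i^in - fee * H
    return valid
```

The returned keys are the fee-adjusted commitments C_i,0 that feed the output derivation stage; everything else is silently dropped, which is what shrinks m.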

Output derivation

Then for each j from 1 to n in turn, node j computes, for each i, C_i,j = C_i,j-1 + x_i,j * G, and if j<n, sends the ordered set of m commitments C_i,j to its neighbour node j+1.
Any commitments received by a node that are not among its tuples are invalid and get removed.
Also, any non-unique C_i,j are similarly invalid and removed. This will allow correct determination of validity in the later kernel derivation stage.
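One step of this forward pass might look as follows. It is a sketch in the same toy group (multiplication mod a prime standing in for curve addition); derive_outputs and its argument names are made up for illustration.

```python
from collections import Counter

P = 2**127 - 1   # toy group modulus (assumption, NOT secure)
G = 3

def derive_outputs(received, tuples):
    """One step of output derivation at node j.

    received: commitments C_i,j-1 passed on by node j-1
    tuples:   {C_i,j-1: x_i,j} as submitted by the users to node j
    Returns the sorted list of surviving commitments C_i,j.
    """
    # Commitments that are not among this node's tuples are invalid.
    known = [c for c in received if c in tuples]
    # C_i,j = C_i,j-1 + x_i,j * G
    derived = [c * pow(G, tuples[c], P) % P for c in known]
    # Non-unique results are invalid too; every copy is removed.
    counts = Counter(derived)
    return sorted(c for c in derived if counts[c] == 1)

# 99 is unknown to this node and is dropped; 10 and 20 are transformed.
print(derive_outputs([10, 20, 99], {10: 1, 20: 2, 30: 1}))
```

Removing all copies of a colliding C_i,j, rather than keeping one, means a commitment identifies at most one spend at every node, which is what the kernel derivation stage relies on.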

Output validation

Node n removes all received commitments that don’t have valid range proofs.
Let Out be the set of remaining commitments.

Kernel derivation

Then for each j from n to 1 in turn, node j computes kernel K_j for a public excess of G * Σ_valid x_i,j, and if j>1, sends Out, the set of n+1-j kernels, and the set of valid commitments C_i,j-1 to its neighbour node j-1.
This maintains an invariant that Σ_{C ∈ Out} C - Σ_valid C_i,j = Σ_k>j K_k.
Mimblewimble CoinSwap proposal - #57 by oryhp explains how to optimize this to a single kernel.
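To see why the invariant holds, write V for the set of spends that are still valid (a symbol introduced here for illustration only) and identify each kernel K_k with its public excess. Since C_i,n = C_i,j + (Σ_k>j x_i,k) * G for every spend, a short derivation gives:

```latex
\begin{aligned}
\sum_{C \in \mathrm{Out}} C \;-\; \sum_{i \in V} C_{i,j}
  &= \sum_{i \in V} \bigl( C_{i,n} - C_{i,j} \bigr) \\
  &= \sum_{i \in V} \sum_{k=j+1}^{n} x_{i,k} \, G \\
  &= \sum_{k=j+1}^{n} \Bigl( \sum_{i \in V} x_{i,k} \Bigr) G
   \;=\; \sum_{k>j} K_k .
\end{aligned}
```

Setting j = 0 gives the balance of the final transaction: the outputs minus the fee-adjusted inputs C_i,0 equal the sum of all n kernel excesses.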

Aggregation

Finally, node 1 ends up with a set of valid C_i,0 whose sum, by the invariant, differs from the Out sum by all n kernels.
It constructs the final coinswap transaction CS from the set of valid outputs C_i^out in Out, the set of valid inputs C_iⁱⁿ, the n kernels K_j, and an extra input and/or output to collect the leftover fees (or pay missing fees) as appropriate.

Once confirmed, the coinswap is irreversible unless all n nodes collude.

Practical Issues

Data provision

The data provision step above suggests that users send data directly to n different nodes. While possible, this is not the most practical.
The following improvement is inspired by TOR, The Onion Router.
Let’s assume that each node j has a known public key PK_j, and denote by ENC(PK, D) the pair <DHPK, E> where DHPK is a temporary public key generated by Diffie-Hellman for PK, and E is the symmetric key encryption of data D with the shared secret as key.
Then for each i we can recursively define an onion bundle OB_i,j for nodes j…n as follows:

OB_i,n = ENC(PK_n, data_i,n)
OB_i,j = ENC(PK_j, <data_i,j, OB_i,j+1>)

Now a user only needs to send OB_i,1 to node 1, which can decrypt it to obtain both data_i,1 and OB_i,2. In the output derivation stage, each node j decrypts each OB_i,j, checks that the received C_i,j-1 matches the one in data_i,j (removing mismatches), computes C_i,j, sorts all OB_i,j+1 by C_i,j, and passes those on to its neighbor node j+1.
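The recursive bundle definition can be illustrated with a toy implementation. Everything cryptographic here is an assumption made so the sketch runs with the standard library alone: Diffie-Hellman over a small prime field and a SHA-256 keystream XOR stand in for a real key exchange and an authenticated cipher, and the function names are hypothetical. It is not secure and not the proposal's wire format.

```python
import hashlib
import json
import secrets

P = 2**127 - 1   # toy Diffie-Hellman modulus (assumption, NOT secure)
G = 3

def keygen():
    """Return (secret key, public key) in the toy group."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def _keystream(shared: int, n: int) -> bytes:
    """Stretch the shared secret into n bytes with SHA-256 in counter mode."""
    seed, out, counter = shared.to_bytes(16, "big"), b"", 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def enc(pk: int, data):
    """ENC(PK, D) = <DHPK, E> with a fresh temporary key pair."""
    eph_sk, dhpk = keygen()
    plain = json.dumps(data).encode()
    stream = _keystream(pow(pk, eph_sk, P), len(plain))
    return [dhpk, bytes(a ^ b for a, b in zip(plain, stream)).hex()]

def dec(sk: int, bundle):
    """Invert enc using the node's secret key."""
    dhpk, e_hex = bundle
    e = bytes.fromhex(e_hex)
    stream = _keystream(pow(dhpk, sk, P), len(e))
    return json.loads(bytes(a ^ b for a, b in zip(e, stream)))

def build_onion(pks, datas):
    """OB_n = ENC(PK_n, data_n); OB_j = ENC(PK_j, <data_j, OB_j+1>)."""
    bundle = enc(pks[-1], [datas[-1]])
    for pk, data in zip(reversed(pks[:-1]), reversed(datas[:-1])):
        bundle = enc(pk, [data, bundle])
    return bundle

def peel(sk: int, bundle):
    """Node j's view: (data_j, OB_j+1), or (data_n, None) at the last node."""
    payload = dec(sk, bundle)
    return payload[0], (payload[1] if len(payload) > 1 else None)

# A user wraps one tuple per node; each node peels exactly one layer.
keys = [keygen() for _ in range(3)]
datas = [{"node": 1, "x": 11}, {"node": 2, "x": 22}, {"node": 3, "x": 33}]
ob = build_onion([pk for _, pk in keys], datas)
seen = []
for sk, _ in keys:
    data, ob = peel(sk, ob)
    seen.append(data)
```

Each node learns only its own tuple and an opaque blob for its successor, which is why a single submission to node 1 suffices.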

This onion bundling not only prevents timing attacks on users whose spend data submissions to different nodes are correlated in time, but also allows them to send their data to a single node, possibly as soon as they receive the output they want to self spend.

Fee handling (Grin)

In a more realistic setting, each node pays itself a mixfee with a 1-input 1-output transaction replacing K_j, where Σ_valid x_i,j can instead be added to the transaction offset.
The contributed fees should then match the coinswap relay fee plus n mixfees. In Grin this gives the identity (with FB being the fee base of half a milligrin, i.e. 0.05 grin-cents):

|Out| * fee = (|Out| + n) * (1 + 21) * FB + n * 3 * FB + n * mixfee, or
mixfee = |Out| * (fee - 22 * FB) / n - 25 * FB.

For simplicity, we could require a fee of (22 + n) * FB, resulting in a mixfee of (|Out| - 25) * FB. In that case, for a mixnode to earn 10 grin-cents a day requires 225 daily self spends.
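The fee arithmetic can be checked with a few lines of Python. The fee base is taken to be 500,000 nanogrin (0.05 grin-cents); that value is an assumption of this sketch, chosen because it is the one for which the 225 figure works out. The function names are illustrative.

```python
import math
from fractions import Fraction

FB_NANOGRIN = 500_000            # assumed fee base (0.05 grin-cents)
NANOGRIN_PER_CENT = 10_000_000   # 1 grin = 10^9 nanogrin = 100 grin-cents

def mixfee_fb(num_out: int, n: int, fee_fb: int) -> Fraction:
    """mixfee in units of FB: |Out| * (fee - 22 * FB) / n - 25 * FB."""
    return Fraction(num_out * (fee_fb - 22), n) - 25

def spends_needed(target_cents: int) -> int:
    """Smallest |Out| with (|Out| - 25) * FB >= target, for fee = (22 + n) * FB."""
    target_fb = Fraction(target_cents * NANOGRIN_PER_CENT, FB_NANOGRIN)
    return math.ceil(target_fb) + 25

n = 3
out = spends_needed(10)            # self spends needed for 10 grin-cents
fee = 22 + n                       # simplified per-spend fee, in FB
mixfee = mixfee_fb(out, n, fee)    # per-node mixfee, in FB

# Both sides of the fee identity, in units of FB.
lhs = out * fee
rhs = (out + n) * (1 + 21) + n * 3 + n * mixfee
print(out, mixfee, lhs == rhs)
```

With the simplified fee the result does not depend on n: each extra node adds one FB to every user's fee and takes exactly that as its own mixfee.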

Spam attacks

A deluge of bogus onion bundles could fill up node 1’s memory buffers long before the day is done. This is mostly remedied by doing immediate input validation on every new bundle received by node 1. At the end of the day, we’d still redo the UTXO membership checks since self spend inputs could be spent during the day.

Thanks to the proofs of ownership, the number of self spends of any one attacker, surviving input validation, is limited by the number of outputs they own.
Having enough buffers to cover the entire UTXO set is one solution.

If after input validation at the end of the day, more spends remain than fit in a block (about 40000/22 = 1818) then two options are available:

run a modified version of the above protocol that merely filters out all invalid data. Any invalidly spent input can be banned from future coinswapping.
randomly partition them into equal sized (under 1818) subsets, and run coinswap on each in turn.
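The second option can be sketched as below. The limit of 1818 is the estimate from the text; the partition function, and the choice of the smallest possible number of subsets, are assumptions made for illustration.

```python
import math
import random

MAX_PER_BLOCK = 1818   # about 40000 / 22, the estimate used above

def partition(spends, rng=random):
    """Randomly split spends into near-equal subsets of at most MAX_PER_BLOCK."""
    k = max(1, math.ceil(len(spends) / MAX_PER_BLOCK))
    shuffled = list(spends)
    rng.shuffle(shuffled)
    # Striding by k gives subset sizes that differ by at most one.
    return [shuffled[i::k] for i in range(k)]

# 4000 surviving spends end up in 3 coinswaps of 1333 or 1334 spends each.
parts = partition(range(4000), random.Random(1))
print([len(p) for p in parts])
```

Keeping the subsets equal in size avoids leaving one small remainder batch with a much weaker anonymity set than the others.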

A proof of ownership of commitment C can take the form of a proof of a length-1 vector commitment as described in https://forum.grin.mw/t/vector-set-commitments-generalize-schnorr-and-utxo-proof-of-ownership/
which takes only 32 bytes more than a Schnorr signature.

Bandwidth Optimization for n=2

In the case of only 2 non-colluding nodes, we can do away with the (relatively large) proofs of ownership, and reverse the flow of data. That is, the onions are constructed with the (much smaller) node 1 data inside the bundle for node 2. Node 2 starts with output validation and sends transformed valid commitments back to node 1, which then does input validation on its transformed commitments. Kernel derivation then proceeds back to node 2. In this case it’s possible to handle spam with intermediate filtering rounds. Even though both nodes determine valid sets of inputs and outputs, they would require collusion to link them together. One downside is that node 2 can try to correlate submission order with the order that final coinswap inputs appeared on-chain.

Other Bandwidth Optimization

Node n could send the large Out set directly to node 1, reducing bandwidth through intermediate nodes on the backward pass by about half, but at the cost of those intermediate nodes no longer being able to check the invariant.

Reorg protection

A deep reorg is likely to invalidate a big coinswap transaction. In order to be able to still redo most of the contained self spends, node 1 could remember all valid input onion bundles for the past few days, and reprocess any that were undone in a reorg.

grin001 · February 3, 2021, 9:50pm

For people who don’t understand technology, it’s not easy to understand. Can someone give a brief overview?

tromp · February 3, 2021, 10:10pm

When you send your output from tx1 through a coinswap service before spending it in tx2, nobody can link these two tx together except with a small 1/N probability where N is the number of other coinswappers that day.

So it’s like getting a size N anonymity set, where N is limited to the max number of self spends in a block (1818).

Suffice to say, this is pretty awesome!

grin001 · February 3, 2021, 10:22pm

If this plan can be realized, it will be great.

Neo · February 3, 2021, 11:07pm

What impact will this have on Avg tx size?

oryhp · February 3, 2021, 11:30pm

No impact on the tx size as regular transactions stay exactly the same. What changes is that the outputs you get from a regular transaction can be sent to the coinswap service. When the coinswap service broadcasts the aggregated transaction, the new outputs should have a relatively high anonymity set.

This service comes at some fee cost, but that’s free today and will be for quite some time. It also seems preferable compared to a coinjoin service where the transaction needs to wait to be joined before it is published. This gets rid of the waiting as the actual tx can go on the chain directly to get confirmed sooner, but you pay some small fees for the self-spend coinswap.

Chronos · February 3, 2021, 11:39pm

I’m not sure what’s better, the CoinSwap proposal, or this phrase. I like both.

Neo · February 3, 2021, 11:42pm

That’s what I thought, but it sounded too good to be true.

Can we estimate the max size of the anonymity set based on current usage?

tromp · February 4, 2021, 4:20pm

“Can we estimate the max size of the anonymity set based on current usage?”

It would be nice if some explorer could show number of non-coinbase txs per day (we know there are 1440 coinbase ones), but I’m not aware of any.

bl0ckch41nsm0ker · February 4, 2021, 5:40pm

Doc couple of questions.
“nobody can link these two tx together except with a small 1/N probability where N is the number of other coinswappers that day”

Does the coinswap broadcast all of the transactions to the blockchain at the end of the day if a user participates in the coinswap? In other words, do the transactions get swapped in the mempool until some set time?

If so then the coinswap will take a day to finalize or could it be done at two times a day or three?

Could it just have a minimum number of transactions before it is broadcast to the chain from the mempool, like after 100/N or maybe a smaller number like 20/N, instead of once daily?

Does grin need to hardfork to implement the coinswap?

vegycslol · February 4, 2021, 7:48pm

Coinswap broadcasts exactly 1 transaction at the end of the day (unless there are too many txs, but to get the idea you can ignore that case). This transaction is composed only of self-spend encrypted transactions (together all coinswap nodes can decrypt it, this happens at the end of the day). Those encrypted transactions are never broadcast to the other nodes by themselves, only as part of the coinswap merged transaction, so nothing changes in the mempools. If there are enough transactions coinswap could run every hour instead of once per day.

Not really, because then it’s not secure (e.g. let’s say you want to stop mixing at 20 txs. When I see coinswap broadcast a tx, I can send 19 self-spends to it and I would be able to link the next (20th) tx that is sent to the coinswap).

No, it’s an offchain solution of how to build a new valid transaction, wallets would need to change though

oryhp · February 4, 2021, 7:58pm

An interesting debate here will be if we default to sending to coinswap service:) should every wallet send to the coinswap? I’m leaning towards yes if we figure out it would work out well simply because privacy by default is what makes it fungible. We don’t want exchanges blacklisting coins that send to the coinswap or treat the users that do as higher risk.

oryhp · February 6, 2021, 1:37am

I wrote a script to parse data from grinscan.net because it also shows whether an output in a block has already been spent. Below is data on outputs and not txs since we are interested in the number of outputs for the coinswaps.

In this block range 2902 outputs were created out of which 736 were spent

This was for height in range(1078419-1440, 1078419). Now if we subtract 1440 outputs because I guess we had that many coinbase outputs, we get 1462 outputs. If every non-coinbase created output went to coinswap, we would send 1462 outputs to the coinswap service in this block range. However, since 736 of these were spent, we also need to subtract 736 because these outputs won’t pass the validation on the coinswap service because they have been spent in the meantime, so we get 1462-736=726. It’s kinda late so it’s quite possible I made a mistake somewhere.

The script I used is in this gist

bl0ckch41nsm0ker · February 7, 2021, 9:29am

I thought we should have this paper that @tromp posted in keybase because i think the general ideas are the same and it is very well written.

tromp · February 7, 2021, 9:59am

Also mentioned in bitcointalk Coinshuffle thread at

https://bitcointalk.org/index.php?topic=567625.msg56288711#msg56288711

Roelsmajor · February 8, 2021, 10:53am

Is the explanation in this article correct to describe “Coinswap” for a layman like me?

https://www.coindesk.com/coinswap-and-the-ongoing-effort-to-make-bitcoin-privacy-invisible

https://www.coindesk.com/first-coinswap-test-herald-era-stronger-bitcoin-privacy?amp=1

tromp · February 8, 2021, 12:15pm

No, MW CoinSwap is only superficially similar to Bitcoin CoinSwap, in that both look like self-spends to the users that end up getting unlinked. So those articles explain something which is quite different under the hood.

gene · March 30, 2021, 7:06pm

hi @tromp, a few questions:

what is the benefit of the swap nodes contributing random excess to the tx?
how are non-signing swap nodes able to malleate txs with extra randomness?
- does the original sender need to sign after the swap is “complete”?
could similar goals be achieved by adding coinswap rules to all nodes?
- something like: “X tx marked for coinswap, keep in coinswap pool for M blocks, then aggregate”
- all txs in the “coinswap pool” could also go through rounds of propagation (via Dandelion?)
- some metadata could be attached for how many propagation rounds a tx has gone through
- only txs on their last propagation round get aggregated

Hopefully, the questions and ideas make sense (still learning inner workings of Grin/MW). What are your thoughts?

tromp · March 30, 2021, 8:29pm

You mean the mixnodes? And what random excess are you referring to?

By txs, do you mean the user’s self-spend data? Mixnodes can’t change the input, excesses, or output of anyone.

No; as the proposal makes clear, the users just submit data. “No interaction needed for shuffling.”

That defeats the entire purpose of the coinswap protocol, since all i/o links are publicly visible in the coinswap pool.

gene · March 31, 2021, 1:57am

Yes, the mixnodes, sorry for terminology mixup. Referring to the randomness mentioned here:

Whereas the messages in a mixnet do not change, our mixnodes transform commitments into new ones
...
Let x_i,j be the randomly generated excess by which node j should additively transform the commitment in spend i: C_i,j = C_i,j-1 + x_i,j * G.

I was confused by the snippet in the previous quote, it makes it sound like the mixnodes are adding random excesses to commitments. What is the point of the mixnodes if they don’t change the inputs / commitments in any way?

I thought they would need to sign again, because I understood your proposal to include random excesses from mixnodes. Apparently, that was a mistaken understanding.

All i/o links are still observable by mixnode 1, or am I wrong? My suggestion was to decentralize it to all participating nodes, but maybe that is just the Dandelion++ protocol.

If the mixnodes don’t mutate the inputs / commitments in any way, is their purpose just to perform aggregation / cut-through on accumulated txs in a given time period? If not, is there a one or two sentence explanation for how the mixnodes obfuscate the transaction graph?

Thanks for your replies
