Recently @tevador shared a provably secure 2-step transaction scheme:
I would like to formally ask OC and the community if there is any interest in implementing this in Grin. I think the UX improvements are worth consideration, even with the increased kernel size, but that’s just me. What do you all think?
I think the 2-step protocol is more suitable for a new experimental blockchain.
There is Minglejingle, which is a non-interactive protocol based on MW that also supports permanent pruning, but the costs are higher (a fully spent 2-2 transaction is ~336 bytes for Minglejingle, ~160 bytes for the 2-step protocol and ~96 bytes for the 3-step protocol).
If I well understand what you’re guys discussing is to change 3-step TXs by slatepacks into 2-step TXs, it will be really useful.
Example: I sell stuffs to customer without online wallet, I have to ping-pong-ping to finalize payment and it could take more time if we’re not at same timezone to perform transaction.
It would continue to be lightweight, maintaining its most important scalability benefit over BTC. That is, it would still scale with the number of historical transactions/kernels and the UTXO set, as opposed to scaling with the historical TXO set (like bitcoin).
Every NITX proposal so far has come with (much) greater scaling costs.
Simplicity in everything but usage
This is a good proposal. It has enormous usability benefits, while continuing to allow full control over a wallet’s UTXOs (an oft-repeated concern by @phyro and @vegycslol), yet can be achieved for only a tiny fraction of the complexity of the NITX schemes.
We should seriously weigh the pros and cons of this proposal before just writing it off. The early GRIN community would’ve loved this, and would’ve rallied around such a clean minimalistic solution (provided it’s found to be secure, of course). It’s sad we’ve lost that spirit somewhere along the way.
I haven’t had the time to look into this scheme closely. From a quick skim at the kernel section, my understanding is that the tradeoff required is to remove CISA, or rather, cross-input-output-signature-aggregation (CIOSA?) in the case of Mimblewimble. Undoing this comes with at least some privacy (#parties leak) and storage (a sig per tx party rather than a sig per tx) downsides. I do believe 2-step txs are an improvement for obvious reasons, and it seems this proposal may be a simple construction to get that. I’m still of the opinion required protocol-level interactivity is a good thing. If we can reduce the number of rounds to 2 with the tradeoffs we’re comfortable, that’s great. My current view is somewhat half-baked as I didn’t dive deep in the proposal, I have a long todo list of things I have to do before that unfortunately.
Should we clearly weigh in the pros/cons? Absolutely.
Is throwing CISA away to achieve 2-step txs a good tradeoff to make on Grin? I’m not sure.
I hope we all share our opinion on this, would love to hear them.
That’s not my understanding. Both parties would still sum their output secrets and subtract the sum of the secrets for all inputs being spent to get their total kernel excess (which is then split between the offset and the one-time kernel public key). AFAICT, the only privacy leakage would be that you would learn the number of participants involved in a transaction based on the number of kernel public keys (Please correct me if I’m wrong, @tevador)
As a modification, we could always require all kernels 2 have 2 public keys, which would eliminate any privacy loss at the cost of some additional transaction building complexity for multi-party(3+) scenarios. I’m not sold on that idea, but it’s something that could be discussed.
One area that would need researched is what effect this new scheme would have on the various scriptless script schemes, payment channels, etc. Just speculating out loud with my (still-limited) understanding of the signature scheme, I don’t believe this would affect any of those, since they already build the equivalent of a single-party SASchnorr signature, but it’s possible there’s some nuance I’m missing.
Agreed. Introducing a new signature scheme is a significant complication, especially one whose size varies with the number of parties. Bloating the entire kernel history that must be synced by 50% is also a significant price to pay.
Having a single signature scheme with a smaller size, independent of number of transacting parties, is much more suitable for a lightweight and minimal implementation.
While it’s a moderate usability improvement, I feel that bigger improvements are possible by continued development of grin wallets and their ability to relay slates to/from various web protocols / communication channels / and directly to other wallets, with help from QR codes. Seeing some of the Bitcoin Lightning wallets in action, which also run a 3 round protocol, makes me think that there’s quite some room for improvement in the Grin wallet, without resorting to consensus changes that must be supported for eternity.
I also wonder to what extent the appearance of a 2-step protocol can be implemented with current signatures. If one transacts repeatedly with the same counterparty (and would thus have their public key already in the MWC protocol), then couldn’t the required key material (random public excess and public nonce) for round 1 of the next transaction with a party already be relayed in a signing round of the current transaction with that party?
Good point, and afaik if we want to have address book and filterable step1 slatepacks (to avoid spam attacks) then you need to send something together with grin address. If we ever implement such “hello” message then we could include public nonce and excess for the next transaction. It does complicate some things though (donations become more annoying, reinstalling wallet requires new “hello” message etc).
Hmm, this is basically performing the setup phase for the next transaction at the sign step. It does reduce a round trip, but both parties would have to exchange that which means it’d have to be done at the setup phase because the person at step2 never gets the information after the last sign. This means both parties would have to do a double-setup, for the current transaction and for the next one. Since you have no transaction context for the next one, things like amount isn’t known at the time which means it’d require the transaction to be late-locked. Makes me wonder how far we can push this, but it is a bit more complicated and I’m not sure it helps with things like exchange integration because the nonce private key will be only on one device. It will work if it’s the same one, but will obviously fail if it’s not.
Why do you say CISA has to be thrown away? From what I can tell, that should still be possible. As long as all input sigs aggregate with each other and all output sigs aggregate with each other, then the privacy properties are the same.
So, to be complete in our assessment, I think we should identify what user stories (if any) would be improved with the 2-step protocol, and then we can decide if those benefits are worth the protocol changes. I tend to think they aren’t worth the change, but we should still be complete in our analysis.
From what I can see, the UX improvement only exists where Alice and Bob have never transacted before and neither can run an always-online wallet server. Can anyone think of a user story which falls into this category? So far I can only think of one such user story:
Bob is a nonprofit who wants to accept donations, but he cannot run a server for cost reasons.
I can’t think of any other user story couldn’t be addressed with better wallet communication strategies for the current 3-step protocol.
The idea behind CISA is to only provide a single signature per Bitcoin transaction even when there are multiple inputs.
The reasoning above, which may very well be wrong, came from reading this statement.
The kernel contains a separate public key Ki for every participant.
Mimblewimble has a built-in CISA variant by design and based on these two links, it would seem this deviates away from it i.e. 1 signature per transaction. If you do half-aggregation, you end up adding the scalars together which is no longer a single signature, it’s more similar to a compressed form of two (or more) signatures. But as I said, this is based on a skim, not on real understanding of the scheme.
An aggregated Schnorr multisignature cannot be completed in 2 steps. The best you can do is to separate the kernel keys and half-aggregate the signatures, which is exactly what the 2-step protocol does.
If you want a 2-step protocol with full aggregation, you’d need to use a pairing friendly curve like BLS12-381.
There is still cross input signature aggregation, but only across inputs with a common owner. So, Alice may provide many inputs into a transaction, and still she need only provide one signature. Bob may also provide many inputs to the transaction, and he too only needs to provide one signature. They each contribute half of a single HASchnorr signature. The number of (half)signatures still scales O(1) with the number of transactions, instead of O(N) for N inputs in the transaction.
This does reveal how many parties are involved in a transaction, but it does not reveal which inputs or outputs belong to which party, which is the critical property for attaining privacy.
So, in my assessment, the scalability and privacy properties are comparable to the three step protocol. So as I seet, the only points to compare are:
2-step protocol has a slightly larger kernel size
2-step protocol requires Bob to ensure no address to avoid cross transaction linking (but not theft!) of Bob’s outputs
2-step protocol eliminates first step for users that cannot coordinate synchronously (… not many real world examples of this)
It depends how you define CISA which is why I linked one definition from a Bitcoin dev above which defines CISA as having a single signature per transaction.
I think it scales O(N) where N is the number of parties in a transaction - each party leaves a pubkey in this new kernel format. Signature size is however reduced by half as scalars get summed together so you only have a single scalar making it N*32 + 32.
Interestingly, the SASchnorr signature scheme I’m using in the protocol aggregates the nonces, not the scalars. So you have one scalar s for each participant and a single nonce R. This aggregation has the advantage of being reversible (you can extract the individual nonces) and more secure than the scalar aggregation.
Interesting, this one is new for me. I’ve only seen one that sums the scalars which makes it impossible to deaggregate the sigs unless you saw the scalars before aggregation. I think the ability to deaggregate is a tradeoff. In theory, if you can noninteractively aggregate two bitcoin sigs of separate txs, but others can’t deaggregate them, it makes the two txs atomic if they were not seen prior to aggregation. If someone aggregates 5 txs, it’s impossible to tell if they belong to one owner or they’re 5 unrelated txs. I think this allows for a new type of hidden coinjoin variant which lives not as a single tx, but inside a transaction group. That’s a bit offtopic, but it reminded me that grouping HASchnorr sigs provides some interesting cross tx binding properties.
Ah, here is the gist where I tried to explain potential new ways of camouflaging what’s going on through transaction groups atomic_txs.md · GitHub.