Objective-Dandelion

I’ve been experimenting with a modified Dandelion protocol intended to increase chances of stem-phase aggregation. I called it Kademlion, because it is inspired by Kademlia, but now think that Objective-Dandelion is a better name, both as a play on Dandelion++/C++/Objective-C, and because it introduces an “objective” routing metric that nodes agree on.

I want to note that I have no clue if this is a good idea or not!

What

All Grin nodes pick a random 256-bit node id, which is broadcast to their peers. A node’s ID need only stay the same for short periods of time, on the order of a few block heights, so it can rotate frequently enough to prevent node tracking and identification.

Upon receiving a block, a new epoch begins if block_height % 10 == 0. The epoch ID is the hash of the block that started the epoch.

At the start of each epoch, every node:

  • Calculates the relay weight of itself and all peers using the formula relay_weight = blake2b(node_id || epoch_id). The relay weight is interpreted as a 256-bit unsigned integer.
  • Selects the node with the highest relay weight, among itself and all peers, as the relay target.
  • Determines that it is in stem mode if the relay target is a peer, and in fluff mode if the relay target is itself.

Then, for every stem transaction received, each node:

  • If it is in stem mode, it immediately relays the transaction to its relay target.
  • If it is in fluff mode, it aggregates transactions until its next patience timer expires, then broadcasts them (a rough sketch of these rules follows this list).
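Here’s a rough Python sketch of the epoch and relay rules above (the helper names and the toy usage at the bottom are just for illustration, not part of the proposal):

```
import hashlib
import os

EPOCH_INTERVAL = 10  # a new epoch begins when block_height % 10 == 0

def relay_weight(node_id: bytes, epoch_id: bytes) -> int:
    # relay_weight = blake2b(node_id || epoch_id), read as a 256-bit unsigned integer
    digest = hashlib.blake2b(node_id + epoch_id, digest_size=32).digest()
    return int.from_bytes(digest, "big")

def epoch_mode(self_id: bytes, peer_ids: list, epoch_id: bytes):
    # The node with the highest relay weight (among ourselves and all peers) is
    # the relay target; we fluff if that node is us, otherwise we stem to it.
    candidates = [self_id] + list(peer_ids)
    relay_target = max(candidates, key=lambda nid: relay_weight(nid, epoch_id))
    mode = "fluff" if relay_target == self_id else "stem"
    return mode, relay_target

# Toy usage with random 256-bit node IDs and an arbitrary "block hash" as epoch ID.
if __name__ == "__main__":
    epoch_id = os.urandom(32)
    self_id = os.urandom(32)
    peers = [os.urandom(32) for _ in range(8)]
    print(epoch_mode(self_id, peers, epoch_id)[0])
```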

Why

“Heavy” peers will naturally be chosen as relay targets, creating agreed-upon stem paths and fluff sinks and leading to more opportunities for transaction aggregation.

Notes

  • In a well-connected network, one node may be the sink for all transactions during a given epoch, and thus see all unaggregated transactions during the epoch. It may thus be desirable to stick to random mode selection as in Dandelion++, so as to break up stem paths with randomly selected fluff sinks.
  • Since the epoch ID is known by all, a node could pick node IDs that had high relay weight during that epoch. To prevent this, if a node connects to a new peer, it should use a randomly assigned peer ID for the current epoch (a small sketch follows these notes).
  • I wrote a crappy aggregation and routing simulation, and the results were encouraging. Objective routing led to around 2x more aggregation in the toy network. Also, as would be expected, increasing the number of connections between nodes increased aggregation.
  • Potentially, some of the benefits of dandelion are lost, since a node can to some degree predict routing between neighbors. However, since nodes cannot observe connections between other nodes, they still don’t have a complete picture.
  • I had previously used an xor-based relay metric, but I think using a hash-based metric is better. Kademlia uses an xor metric, but Grin doesn’t require it, so better to use something more resistant to gaming.
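As referenced in the notes above, here is a small sketch of assigning a temporary random ID to a newly connected peer. The peer_table layout and the hand-off at the next epoch boundary are assumptions for illustration only:

```
import os

def register_new_peer(peer_table: dict, advertised_id: bytes) -> None:
    # A peer that connects mid-epoch gets a locally generated random ID for the
    # rest of the current epoch; its advertised ID only takes effect next epoch.
    peer_table[advertised_id] = {
        "current_epoch_id": os.urandom(32),  # used for relay-weight calculations now
        "next_epoch_id": advertised_id,      # adopted when the next epoch begins
    }
```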

Pretty pictures!

Grey nodes are in stem mode, aquamarine nodes are in fluff mode. Connections between nodes are invisible, since they clutter the graph too much. Each node has one outgoing edge if it is in stem mode, showing the node that it is routing to in the current epoch.

100 nodes w/8 connections each


All transactions will be forwarded to and aggregated by one of the 7 aquamarine nodes during the current epoch.

100 nodes w/64 connections each


As you would expect, higher connectivity leads to fewer fluff-mode nodes, in this case only one.

1000 nodes w/8 outgoing connections

1000 nodes w/8 outgoing connections, alternate layout


This somewhat nicer layout shows more clearly the groups of nodes that all share the same sink during the current epoch.

5 Likes

Nice writeup!

To prevent this, if a node connects to a new peer it should use a randomly assigned peer ID for the current epoch.

How is this actually guaranteed to be randomly selected? Would peers need to validate other peers’ ids?

Also, any thoughts on how a peer should treat two peers with the same id? Pick one at random? Ban whoever was last to announce the colliding id?

Thank you!

A peer ID cannot be verified to be randomly selected, so for this to be viable, non-random ID selection needs to lead to an outcome which is no worse than not implementing this scheme at all.

Even if it isn’t randomly selected though, since a new peer’s ID isn’t valid until an epoch after you first see it, it shouldn’t be possible to game transaction routing.

I think picking one at random is probably the right call.

This is what I was missing from the original post, thanks for clarifying.

1 Like

Hi @rodarmor, thanks for sharing!

A couple of questions based on my understanding of your proposal:

  1. You mention both node id and peer_id. Are they meant to be the same thing or different? If the latter, how are they different?
  2. In your proposal, it seems that a new epoch is determined (fairly) synchronously across all nodes (when a new block is broadcast). In Dandelion++, epochs are asynchronously determined (each node keeps its own internal clock). What is the impact of this modification on an adversary’s ability to deanonymize the network?
  3. If an adversary is able to listen on the entire network using multiple byzantine nodes, it would presumably be able to obtain the peer_id of each node as it is being broadcasted to its peers. With epoch_id already known, it would then be able to calculate the relay_weight of all nodes. How would this impact an adversary’s ability to determine the originating sources of transactions? Presumably nodes with very low relay_weight during an epoch would very infrequently fluff transactions whereas nodes with high relay_weight would almost certainly do so.
  4. More generally speaking, what was the motivation to improve on Dandelion++? The notion of a “fluff sink” also exists with Dandelion++. Did you do simulations that determined that the aggregation of ‘vanilla Dandelion++’ would not lead to enough aggregated transactions?
1 Like

Whoops, good catch. They should be the same thing; I’ve fixed the original post.

It’s not clear to me if the synchronous nature of epochs increases an attacker’s ability to deanonymize the network. I think the main potential issue is known peer relay preference, discussed below.

It would have some information, but not a complete picture. In particular, it can’t observe connections between nodes, so it can’t recover the complete graph of transaction paths. So if a node with a very low relay preference stems a transaction to you, that node is more likely to be the originating node, but it isn’t guaranteed.

Please take this with a giant grain of salt!

The metric for aggregation that I tracked was total_txs / fluffed_txs, i.e. the number of individual transactions divided by the number of aggregated transaction bundles that were eventually fluffed.
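For a purely illustrative example: a run in which 200 transactions enter the network and end up fluffed in 100 aggregated bundles scores 200 / 100 = 2.0, while a run with no aggregation at all scores exactly 1.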

When simulating the same conditions (transactions per second, node count, and graph connectivity), the aggregation metric for Dandelion++ was 1.03, whereas for Objective-Dandelion it was 1.79. So, it was relatively rare for transactions to be aggregated under Dandelion++, whereas it was the norm under Objective-Dandelion.

My simulation has a lot of simplifying assumptions though, and the network graph and transaction loads surely differ from those of the live Grin network.

Going back to the two forms of privacy afforded by Dandelion++, I don’t think we know the relative importance of each yet, and thus are poorly equipped to make decisions involving potential tradeoffs between the two.

If mixing is ubiquitous, and the transaction graph is hard for full nodes to observe, then tx-graph privacy isn’t as much of a priority. Alternatively, if users don’t use mixers, and attackers run full nodes that allow them to recover the tx graph and mount attacks on users, then the additional tx-graph privacy provided by increased protocol-level aggregation might be a high priority.

Although it would be more complicated, Objective-Dandelion could be implemented as a second phase after the Dandelion++ stem phase is complete. Thus, the origination privacy provided by Dandelion++ would be fully intact, and Objective-Dandelion could concern itself solely with maximizing aggregation.

I suspect being able to deterministically figure out who’s most likely to fluff opens up a wide range of attacks.

For example, if I know all the relay_weights of honest nodes and observe the network, and a node with a relay_weight in the 50th percentile fluffs, then I know for a fact that only nodes with a relay_weight less than that could have been stemming to this node and can exclude half my dataset immediately and work further on narrowing the list of suspects down.
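To make that concrete, the pruning could look something like this (purely illustrative):

```
def possible_stem_sources(fluffer_weight: int, node_weights: dict) -> set:
    # Under the "relay to the heaviest peer" rule, a node only ever stems toward
    # nodes heavier than itself, so anything heavier than the fluffing node can
    # be excluded as a possible source of the transactions it fluffed.
    return {node for node, weight in node_weights.items() if weight < fluffer_weight}
```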

Here’s another one: If I’m running malicious supernodes that connect to most peers on the network, what’s to stop me from setting artificially high relay_weights on my nodes? Wouldn’t this mean that the honest nodes that are connected to my malicious nodes end up forwarding their own transactions to me immediately, thus exposing themselves directly?

That’s definitely a valid criticism. As is, the proposal is a tradeoff between transaction origination privacy and transaction graph privacy, so depending on which we value more highly, it may or may not be worth it in its current form.

My hunch is that transaction graph privacy is more valuable, for a few reasons:

Transaction graph information

  • Requires running only one archival node
  • Comes in the form of kernels, inputs, and outputs which irrefutably prove the topology of part of the transaction graph
  • Since it is objective, multiple parties who don’t trust each other can aggregate data to form a more complete picture (i.e. if party A sees some kernels/inputs/outputs, party B sees some others, they can join them together into a graph without needing to believe that the other is honest)
  • Used in the majority of real-world blockchain forensics and deanonymization attacks

Transaction origination information

  • Requires running large numbers of nodes, nonstandard software, and some sophistication to collect
  • Comes in the form of “I saw this node send this transaction, so I believe that node originated the transaction”, and not objective, irrefutable evidence that can be presented to a third party (i.e. kernels + inputs + outputs)
  • Since information is not objective, you must trust the source of the information. (i.e. I can lie to you about which node I saw stem which transaction)
  • Sees limited use as an attack vector (I’m sure there are examples of this that I’m missing. They probably come in the form of targeted attacks on exchanges or large coin holders.)

Adding Objective-Dandelion as a second phase after initial stemming would remove the tradeoff between these two sources of privacy loss, at the cost of additional complexity and an increase in relay time.

Maybe it would be better to call it something other than Objective-Dandelion, to emphasize that it isn’t necessarily an alternative to Dandelion++.

The proposal contains a provision that prevents this:

Since the epoch ID is known by all, a node could pick node IDs that had high relay weight during that epoch. To prevent this, if a node connects to a new peer it should use a randomly assigned peer ID for the current epoch.

So you can pick a node ID that gives a high relay weight, but other nodes will ignore it until the next epoch, at which point the relay weight will be scrambled.

Say we simplify the various timers with something along the lines of -

Then nodes generate ids via H(ip_address|block_hash) every n blocks (say 10 for simplicity).

Each node relays stem transactions to the outbound peer with lowest id. The analogy here is stem transactions always flow downhill toward minimum peer ids.

Now instead of deciding randomly whether we are in stem or fluff mode, we decide that we fluff if our node has the lowest id among all our outbound peers, i.e. if we are a local minimum we fluff.

Stem txs will always flow toward peers with lowest ids (sinks).
Peers with lowest ids aggregate and fluff transactions.

The epoch id is simply the block hash every n blocks (every 10th block).
Nobody gets any advantage attempting to game the system with node ids.
Everybody on the network generates new (but deterministic and global) ids for their outbound peers based on peer ip address and latest epoch id.

Maybe we still maintain some level of randomness to keep stem paths randomized and not strictly “flowing downhill” - relay to lowest id 90% of the time and a random outbound node 10% of the time.

And if the tx is our own tx, then we always relay to our lowest-id peer, regardless of whether we are the current local minimum.
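For concreteness, a rough sketch of these rules, written with the lowest-id convention used above (the edit below flips this to heaviest ids, which just reverses the ordering); blake2b stands in for H and the helper names are illustrative:

```
import hashlib
import random

def peer_id(ip_address: str, epoch_block_hash: bytes) -> int:
    # id = H(ip_address | block_hash), recomputed every n blocks
    digest = hashlib.blake2b(ip_address.encode() + epoch_block_hash, digest_size=32).digest()
    return int.from_bytes(digest, "big")

def stem_relay(our_ip: str, outbound_ips: list, epoch_block_hash: bytes,
               own_tx: bool, p_random: float = 0.10):
    # Return the outbound ip to relay a stem tx to, or None to aggregate/fluff.
    ours = peer_id(our_ip, epoch_block_hash)
    ids = {ip: peer_id(ip, epoch_block_hash) for ip in outbound_ips}
    lowest = min(ids, key=ids.get)
    if own_tx:
        return lowest                       # our own txs always go to our lowest-id peer
    if ours < ids[lowest]:
        return None                         # local minimum: aggregate and fluff
    if random.random() < p_random:
        return random.choice(outbound_ips)  # 10% of the time, relay to a random peer
    return lowest                           # otherwise flow downhill to the lowest id
```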

Edit: I actually thought the “we aggregate and fluff if we are the local minimum” idea was a new one here. But the original post here actually describes this first… (apologies to @rodarmor).

And “heaviest” (aka largest) ids make more sense here, in that these are the sinks that stem txs flow toward.

2 Likes

Thanks for revisiting this discussion!

Naive question:

Then nodes generate ids via H(ip_address|block_hash) every n blocks (say 10 for simplicity).

Is this gameable? I.e. would malicious nodes be able to manipulate their id to get a very “heavy” ID by somehow spoofing their IP address?

I suppose it could be mitigated the same way as @rodarmor proposes above (emphasis mine):

  • Since the epoch ID is known by all, a node could pick node IDs that had high relay weight during that epoch. To prevent this, if a node connects to a new peer it should use a randomly assigned peer ID for the current epoch.

We should be ok. They would have to manipulate their IPs every 10 minutes, and then reconnect. If they were able to do this, we’d have bigger problems (banning would be ineffective, for example).

There are additional improvements we could make that would make it even harder to manipulate. We could split the IP address in half, hashing the first 2 octets and the last 2 octets separately. Then the hash of the first 2 becomes the primary ID (higher priority), and the hash of the last 2 just becomes a tiebreaker. This makes it so gaming the system would require having IP addresses in several regions.
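For illustration, one way the split-octet idea could be computed for IPv4 addresses (blake2b is just a stand-in for the hash; none of this is pinned down above):

```
import hashlib

def split_octet_weight(ip_address: str, epoch_block_hash: bytes) -> tuple:
    # Hash the first two octets and the last two octets separately.
    octets = ip_address.split(".")
    first = ".".join(octets[:2]).encode()
    last = ".".join(octets[2:]).encode()
    primary = hashlib.blake2b(first + epoch_block_hash, digest_size=32).digest()
    tiebreak = hashlib.blake2b(last + epoch_block_hash, digest_size=32).digest()
    # Tuples compare lexicographically, so the first-two-octet hash dominates
    # and the last-two-octet hash only breaks ties.
    return (int.from_bytes(primary, "big"), int.from_bytes(tiebreak, "big"))
```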

We’re already only considering outbound peer connections as stem relays. So you’d need to not just reconnect with your carefully chosen ip address but other nodes would need to actively reach out and connect to you (on your new ip address).

Maybe we can take a more hybrid approach that maintains longer stem paths (better anonymity via Dandelion itself) while still having txs “flow downhill” to give us good opportunity for aggregation.

Rather than choosing the heaviest peer to relay stem txs to, we simply choose a random heavier peer for that epoch. At each hop along the stem path, txs flow downhill, but they take a more random path to get to one of the final “sink” nodes for aggregation and eventual broadcast.
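A quick sketch of that relay rule (peer_weights is a hypothetical map from outbound peer to its relay weight for the epoch):

```
import random

def random_heavier_peer(our_weight: int, peer_weights: dict):
    # Relay to a uniformly random outbound peer that is heavier than us; if
    # nobody is heavier, we are a sink for this epoch and aggregate/fluff instead.
    heavier = [peer for peer, weight in peer_weights.items() if weight > our_weight]
    return random.choice(heavier) if heavier else None
```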

Intuitively at least it feels like this more random path gives transactions a better chance of finding a more global maximum and less chance to get stuck hitting a local maximum. Which would increase aggregation effectiveness.

This also has the nice benefit of paths never crossing and preventing cycles in the stem paths. Txs will always end up at an aggregation/fluff/broadcast node eventually.

We’re already only considering outbound peer connections as stem relays. So you’d need to not just reconnect with your carefully chosen ip address but other nodes would need to actively reach out and connect to you (on your new ip address).

Ah, that’s right, and yes, that makes it even harder to game.

Rather than choosing the heaviest peer to relay stem txs to we simply choose a random heavier peer for that epoch. At each hop along the stem path txs flow downhill but they take a more random path to get to one of the final “sink” nodes for aggregation and eventual broadcast.

So in a given epoch, with this approach, it will take more “stem hops” to reach the “final sink” than in the previous proposal, but the path to reach the sink would be less deterministic, and could potentially lead to a better overall degree of aggregation, is that right?

This also has the nice benefit of paths never crossing and preventing cycles in the stem paths. Txs will always end up at an aggregation/fluff/broadcast node eventually.

That’s a very nice benefit that I had not thought of.


In a construction such as this one, where the sink weights are determined using publicly available data (IP + block hash), a global adversary observing the entire p2p network would be able to calculate the weights for the entire network, and thus also understand the likely flows of stem transactions. If they are able to monitor the communication, would they also be able to determine how the nodes are connected to each other by studying these flows?

What’s the impact of that? Are we concerned about this helping to build up a topology of the network? Does it matter if network observers and participants can determine where the sinks will end up being during a given epoch?

I’m not sure to be honest.
You could do the same thing with the current Dandelion++ impl, knowing that stem txs can flow randomly from any node to an adjacent outbound peer.

With this weighting approach the only additional information would be that txs flow from lighter nodes to heavier nodes (so some subset of outbound peers for a given epoch).

You could do the same thing with the current Dandelion++ impl, knowing that stem txs can flow randomly from any node to an adjacent outbound peer.

Yes, you are right, this is already possible, but the impact today might be less. I.e. today, if I understand the topology of how nodes are connected, then I can only assign a probability that a certain node will be stemming or fluffing (based on DANDELION_STEM_PROBABILITY).

On the other hand, with this proposal, if I understand the topology, then in every epoch I will know which node will be fluffing, and which nodes most certainly will not be fluffing. Granted, your proposal of randomly choosing a heavier weighted node to stem to mitigates this somewhat. I don’t know how much or what the implications of this are.


All in all, I personally like the ideas and directions here, combined with what’s proposed in Dandelion++ Thoughts on Timers/Scheduling · Issue #2880 · mimblewimble/grin · GitHub of using blocks as timers.

We’re already making considerable deviations from the D++ spec today in our implementation, which leads to some attacks being possible.

The direction of this proposal allows nodes (and passive adversaries) to deterministically understand where fluffing will occur on the network and in which direction transactions will flow during an epoch. It’s not clear to me how these changes would lead to particularly worse privacy than what we have today. I am not convinced we today are able to adequately protect against an adversary that monitors the entire p2p network in any case. The proposed changes do not impact this as far as I can tell.

I do see however that with this proposal, our design would become simpler and more elegant, less error-prone, and lead to increased pre-fluff aggregation. Which is good.

It would be nice to get other people’s thoughts on the matter.

@rodarmor do you have thoughts or anything to add? It was your ideas above that originally put us on this path and inspired some of the thinking after all. For example, what do you think of @antioch’s suggestion to use

H(ip_address|block_hash) every n blocks

over your proposed

relay_weight = blake2b(node_id || epoch_id)

He and I did discuss this a bit when I proposed using ip addresses a few months ago. See here:

1 Like

Thanks for that @david. Apologies, I didn’t recall that the IP address approach had been originally proposed by you. So the attribution of that idea should go to you. :v:

No worries. Several of us have independently designed systems like Casey’s, each with various advantages and disadvantages. I’m not too worried about credit personally, but it would be great if we could all work together (like we’re doing) to make sure the best of our ideas make it into the final design.

On that note, we’re already adding an additional (powerful) tool that we can probably now toss into the mix: Tor. Nodes could advertise hidden services that could possibly be used to submit txs and maybe even eliminate some of the more inefficient parts of Dandelion.

1 Like

I’m definitely not picky about attribution or credit! I’m just happy if blockchain analysis becomes much more difficult :slight_smile:

I think using IP addresses and block hashes instead of introducing new concepts is definitely better, and I don’t see any downsides.

I like the idea of picking a random heavier peer. But, I don’t have a strong or well-reasoned opinion. I think that design choices like these are amenable to simulation, i.e. simulating nodes, connections, and transaction flows under different relay policies. I wrote a simulator to draw the graphs in the original post, but it’s disastrously messy code, so I wasn’t going to post it unless someone was really interested.

Regarding a potential privacy trade-off vs vanilla Dandelion, i.e. that any kind of non-random behavior can be predicted by an attacker, it could be useful to think about this as an intermediate third state, in addition to stem and fluff, that a transaction can be in. So a transaction is first stemmed, using the purely random vanilla Dandelion algorithm. At some point a node flips a coin and decides that it should enter the aggregation phase, where it follows non-random rules designed to maximize aggregation, and then at some point the now hopefully aggregated transaction bundle gets fluffed.
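As a toy sketch of the three states and the transitions between them (the coin-flip probability is purely illustrative):

```
import random

STEM, AGGREGATE, FLUFF = "stem", "aggregate", "fluff"
P_ENTER_AGGREGATION = 0.1  # illustrative hand-off probability, not specified above

def next_phase(phase: str, we_are_epoch_sink: bool) -> str:
    if phase == STEM:
        # vanilla Dandelion stemming; at some point flip a coin and hand the tx
        # over to the aggregation phase
        return AGGREGATE if random.random() < P_ENTER_AGGREGATION else STEM
    if phase == AGGREGATE:
        # objective, weight-based routing toward the epoch's sink, which fluffs
        return FLUFF if we_are_epoch_sink else AGGREGATE
    return FLUFF
```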

Of course, the introduction of a third state increases complexity with an uncertain gain.

1 Like