Thank you @vegycslol !
In regular scalable BFT the network of size
n has a subset of nodes called the root committee of size
c which is supposed to be small compared to
n. In my modification I pose a one more set much larger than n called the candidate pool, let us assume the candidate pool is of size
n^2. So we have
c << n << n^2.
The root committee nodes produce a new state. The remaining
n-c nodes are called the verifying nodes and they are only in charge of validating and replicating the state proposed by the root committee. In the picture I propose the remaining
n^2 nodes in the candidate pool are just candidates to join the network in case if the replacement is needed.
The experiment described in the paper by Jalalzai et al. numbers used are the following:
n=200. For the purpose of the example lets imagine the candidate pool has
N=10,000 nodes waiting to join (note difference between capital and lowercase symbol) and
f of those nodes are malicious nodes implanted by the attacker.
- New state is proposed only by the root committee, initiated by a single node within it called a “primary node” (which also changes randomly). The new tx to be aggregated can be encrypted using the primary node key and broacasted in the network following the Dandelion protocol to hide the origin IP. Once the encrypted tx reaches the primary node it will aggregate it, sync within the
c nodes of the root committee and then validated by remaining
- That is a good question. There are multiple possible disruptions: preventing consensus from being reached and delaying the processing, leaking information that was supposed to be secure (unaggregated transactions) and killing entire cluster. Let me address them one by one at the end of the response as it will be a bit longer.
- I assume that in order to link inputs and outputs you need to leak the unaggregated transactions (correct me if I’m wrong). It is hard to tell explicitly how many you need because answer to that question is probabilistic. Let’s say only the primary node knows the raw transactions and only for fixed and limited amount of time before it gets re-elected randomly so to leak out this information you need your malicious node to become the primary node via the random election process within the cluster. We established some numbers earlier, the probability of your malicious agent implanted in the candidate pool becoming the primary node should be
P=(1/c)*(1/(n-c))*(f/N). Imagine you put
f=100 malicious agents in the candidate pool, given that
N=10,000 the probability of one of them becoming the primary node would be
P=(1/36)*(1/(200-36))*(100/10000)=0.000001693766938. If this idea gets good feedback I can make better models, such as estimating the expected number of block heights given a number of malicious nodes in the candidates pool etc. Those might answer your question more explicitly. The plot in my article is the probability of replacing entire committee in terms of number of malicious nodes.
Now regarding the scenarios of the disruption of the aggregator, lets address them one by one
- Preventing consensus from being reached and delaying the processing, this requires
c/2 malicious nodes in the root committee, so in our little example you would need to implant 18 of them (and you could see in the point 3 the probability of introducing one of them). If you managed to prevent the consensus from being reached the loss is only in the transactions waiting to be aggregated. The overal state gets preserved and new root committee gets elected. As long as it does not happen often it is not a problem as it only affects the performance.
- Leaking information that was supposed to be secure (unaggregated transactions), for that you only need a single node in the root commitee, but this node has to be the primary node. If you put
f=100 malicious actors in the candidates pool, in our little example the probability of one of them reaching becoming the primary node is
- Killing entire cluster. It is possible to literally destroy the network. State is replicated only by
n nodes as
N nodes in the candidates pool do not replicate it. If you implant
c/2 nodes in the root committee to prevent consensus from being reached (which is 18 nodes in our example) and then among the replica nodes you would need
f/2-c/2 malicious nodes implanted to prevent the new honest root committee to be formed (that would be 82 nodes in our example). So for the case of
n=200 to destroy the aggregator you need 100 malicious nodes and 18 of them must be in the root committee. If that occurs consensus could not be reached, there would either be no new root committee or it would be purposely dominated by the malicious nodes via manipulated protocol. To my understanding such scenario is not repairable. In such case a new BFT cluster has to be formed from scratch.
In the end I would like to make few important comments:
- Apart from disrupting the cluster, leaking information and affecting the performance potential attacker has nothing to gain. It is not like (hypothetically) hacking an actual Grin network in which you could produce blocks, double spend etc.
- Even without the PoW, it would not be easy to produce those Sybil nodes. Imagine joining candidates pool for 24h requires a refundable deposit in Grin of equivalence of $1,000.00. Implanting
f=100 faulty nodes would require you to borrow $100,000.00 from someone for one day and it would only give you probability of
0.000001693766938 that one of them would become the primary node.
- I am not fluent with MW theory, I just assume that once aggregated it is safe to broadcast (of course the more txs aggregated then safer, so the later ones would benefit more from the aggregation improved privacy), also I assume it is possible to validate the aggregated transaction without knowing the tx (as only thing to check is the zero sum and matching signature). Please correct me if I’m wrong here.
I hope it helps. I’m also quite new to BFT, so we are actually learning together. While writing it I’m still before my first coffee so please let me know if the calculations don’t make sense so that I can review and fix them.