What is a Consensus Protocol?
Blockchain, as we know it, is said to be decentralised. But what is it under the hood that makes it decentralised? Today, we are going to explore this area further.
Blockchain is a subset of Distributed Ledger Technology (DLT). It is a distributed system made up of independent nodes that communicate with each other via message channels, with the objective of reading, writing and storing updated copies of the ledger. Nodes are computers connected to the network. They are not bound by geography or time zones: they can be anywhere and can connect to the network at any time.
Interestingly, the ledger is owned by everyone and, at the same time, by no one. Every node holds a copy of the ledger, and every node is made aware of any update or amendment proposed or made to it.
Here comes the problem. How do these nodes, which are strangers to one another, coordinate and make decisions as a unit? This is why a consensus protocol is needed: it serves as a set of guiding principles that the network abides by in order to come to an agreement on outputs.
Safety & Liveness of Consensus Protocols
This leads me to the first aspect of consensus protocols: Safety and Liveness. For an output to be considered ‘correct’ and acceptable to all nodes, the distributed system must abide by these two rules. First, safety means that the system never does anything bad. Second, liveness means that the system eventually does something good, that is, it eventually reaches a decision. Both are achieved via consensus algorithms that meet the following requirements:
- Validity [Complies with Safety]: Any agreed-upon output must have been proposed by one of the nodes.
- Agreement [Complies with Safety]: All non-faulty nodes must agree on the same output.
- Termination [Complies with Liveness]: All non-faulty nodes eventually decide.
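The three requirements above can be sketched as a check over the result of a single consensus round. This is only an illustration: the function name `check_consensus`, its parameters and the sample data are assumptions made for this example, not part of any real protocol. Note also that Termination is an "eventually" property, so a snapshot like this can only tell us whether every non-faulty node has decided so far.

```python
from typing import Dict, Hashable, Optional, Set


def check_consensus(
    proposals: Dict[str, Hashable],
    decisions: Dict[str, Optional[Hashable]],
    faulty: Set[str],
) -> Dict[str, bool]:
    """Check one round against Validity, Agreement and Termination.

    proposals: the value each node proposed
    decisions: the value each node decided (None = not decided yet)
    faulty:    ids of nodes treated as faulty and therefore ignored
    """
    honest = [node for node in decisions if node not in faulty]
    decided = [decisions[node] for node in honest if decisions[node] is not None]
    proposed = set(proposals.values())
    return {
        # Validity: every decided value was proposed by some node.
        "validity": all(value in proposed for value in decided),
        # Agreement: non-faulty nodes never decide on different values.
        "agreement": len(set(decided)) <= 1,
        # Termination (snapshot): every non-faulty node has decided.
        "termination": len(decided) == len(honest),
    }


# Hypothetical round: node "d" is faulty and never decides.
proposals = {"a": "tx1", "b": "tx2", "c": "tx1", "d": "tx9"}
decisions = {"a": "tx1", "b": "tx1", "c": "tx1", "d": None}
print(check_consensus(proposals, decisions, faulty={"d"}))
```

If two non-faulty nodes decided on different values, the `agreement` entry would be `False`, which is a Safety violation. A non-faulty node that never decides would show up as a failed `termination`, which is a Liveness violation.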
Next, as with all things in life, when we take some, we have to give some. You can’t have everything in life. The famous saying goes:
Fast, Good & Cheap. You can only pick 2.
In any consensus protocol, only 2 out of the following 3 can be achieved at any one time. This trade-off is known as the CAP theorem.
- Partition Tolerance: The ability to operate despite a partition in the network.
- Consistency: All nodes provide the latest state of the network.
- Availability: All nodes have constant read and write access.
Below are a few scenarios to illustrate the above situation.
If the network achieves Partition Tolerance and Consistency, it will fail to achieve Availability. Suppose a partition occurs in the network, possibly due to internet connectivity problems or electrical outages, splitting the nodes into two groups that can only communicate within their own group. The two groups will not hold the same recent state of the system. Since nodes in different groups cannot respond with the same output, they will opt not to respond at all. Hence, Availability is not achieved.
If the network achieves Partition Tolerance and Availability, it will fail to achieve Consistency. In the same scenario as above, the nodes in both groups keep responding, but because the two groups hold different states, their outputs are not the same. Hence, Consistency is not achieved.
If the network achieves Consistency and Availability, it will fail to achieve Partition Tolerance. The nodes in the network have to be in constant communication with each other to stay updated on the same recent state and thereafter respond with the same output. If a partition were to occur, the network would lose at least one of the two.
In the case of a distributed system, it is almost a given that the system will need to achieve Partition Tolerance. Because the system is maintained by independent nodes, it is impractical to expect all nodes to be active at all times and at the same time. The network is very likely to experience partitions and, in such conditions, the system must still be able to maintain its operability and recency.
Hence, most blockchains will have to make a choice between Consistency and Availability while achieving Partition Tolerance at all times.
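The choice between Consistency and Availability during a partition can be sketched with two toy replicas. This is a minimal illustration under stated assumptions: the `Replica` class, the `"CP"` and `"AP"` mode labels and the helper functions are all hypothetical, and a real system would replicate over a network rather than by direct assignment.

```python
class Replica:
    """A toy node that holds a single value and has exactly one peer."""

    def __init__(self, name: str, mode: str):
        self.name = name
        self.mode = mode            # "CP" or "AP" (labels assumed for this sketch)
        self.value = 0
        self.peer = None
        self.partitioned = False

    def write(self, value: int) -> None:
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing to serve: Availability is lost.
                raise RuntimeError(f"{self.name}: unavailable during partition")
            # Stay available by accepting locally: Consistency is lost.
            self.value = value
            return
        # No partition: update both copies together.
        self.value = value
        self.peer.value = value

    def read(self) -> int:
        if self.partitioned and self.mode == "CP":
            raise RuntimeError(f"{self.name}: unavailable during partition")
        return self.value


def make_pair(mode: str):
    """Create two connected replicas running in the same mode."""
    a, b = Replica("A", mode), Replica("B", mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a: Replica, b: Replica) -> None:
    """Cut the link between the two replicas."""
    a.partitioned = b.partitioned = True


# AP pair: both nodes keep answering, but their answers diverge.
a, b = make_pair("AP")
a.write(1)
partition(a, b)
a.write(2)
print("AP:", a.read(), b.read())

# CP pair: the nodes refuse to answer rather than risk disagreeing.
a, b = make_pair("CP")
a.write(1)
partition(a, b)
try:
    a.write(2)
except RuntimeError as err:
    print("CP:", err)
```

In the AP run the two replicas return different values after the partition. In the CP run the stored values still match, but every request is rejected until the partition heals.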
Byzantine Generals Problem
In an ideal world where everyone is honest and has the community’s interest at heart, we might be able to believe that every node will behave altruistically, and so trust all the nodes on the network. However, in reality, that is rarely the case. Humans tend to put their personal interests ahead of their communities’, leading to problems such as the ‘Tragedy Of The Commons’.
Even worse, there may be entities that simply want to harm others intentionally, which brings me to the Byzantine Generals Problem. A Byzantine node is one that may act maliciously, with the intent to harm or deceive the network, for its own interests.
There are 2 possible types of faults that nodes can produce.
- Fail-stop: A node can crash and not return values.
- Byzantine: A node can intentionally send incorrect or corrupted values. Any other form of deviation from the protocol also falls under this category.
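The difference between the two fault types can be sketched as follows. The class names and the `reply` method are hypothetical and exist only for this illustration. Here the Byzantine node equivocates, meaning it tells each peer something different, which is just one of many ways a node could deviate from the protocol.

```python
from typing import Optional


class HonestNode:
    """Follows the protocol: reports the same value to every peer."""

    def __init__(self, value: str):
        self.value = value

    def reply(self, peer: str) -> Optional[str]:
        return self.value


class FailStopNode(HonestNode):
    """Has crashed: returns nothing to anyone."""

    def reply(self, peer: str) -> Optional[str]:
        return None


class ByzantineNode(HonestNode):
    """Deviates from the protocol: sends a different forged value to each peer."""

    def reply(self, peer: str) -> Optional[str]:
        return f"{self.value}-forged-for-{peer}"


peers = ["p1", "p2"]
for node in (HonestNode("block42"), FailStopNode("block42"), ByzantineNode("block42")):
    print(type(node).__name__, [node.reply(peer) for peer in peers])
```

A fail-stop fault is comparatively easy to notice because the node simply goes silent. A Byzantine fault is harder, since each peer receives a plausible-looking reply and only discovers the conflict by comparing notes with other peers.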
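Beware: In classical Byzantine fault tolerant protocols, if a third or more of the nodes are Byzantine, then achieving a consensus that is ‘correct’ will be impossible. Put differently, tolerating f Byzantine nodes requires at least 3f + 1 nodes in total.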
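The threshold is usually stated as n ≥ 3f + 1, where n is the total number of nodes and f is the number of Byzantine nodes. The arithmetic can be sketched as below. The function names are illustrative, and the bound is the classical one for voting-based Byzantine agreement, so it should not be read as applying unchanged to every consensus mechanism.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 still holds."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 3


def tolerates(n: int, f: int) -> bool:
    """True if n nodes can reach consensus with f Byzantine nodes among them."""
    return n >= 3 * f + 1


for n in (3, 4, 6, 7, 10):
    print(f"{n} nodes tolerate up to {max_byzantine_faults(n)} Byzantine node(s)")
```

Going from 3 to 4 nodes is what first makes a single Byzantine node tolerable, and each additional tolerated fault costs three more nodes.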
There are various forms of consensus mechanisms available, and each has its own approach, advantages and disadvantages when dealing with different situations. The following are some examples, which you can look into further and compare:
- Voting-Based Consensus Mechanism, such as pBFT (Zilliqa).
- Nakamoto Consensus Mechanism, such as PoW (Bitcoin, and Ethereum before its 2022 move to PoS) & PoS (Polkadot, Polygon & EOS).
- Federated Consensus Mechanism
PS: Another popular subset of DLT that might be gaining traction is the Directed Acyclic Graph (DAG).