This post is the sixth post in a series dedicated to demystifying blockchains. The last post introduced the Proof of Stake. The third post introduced the concept of consensus and classified them into two categories: lottery-based and Byzantine Fault agreement-based protocols. This post presents the most known Byzantine Fault agreement-based consensus: Proof of Work
Byzantine Fault Tolerance is known for about four decades. A Byzantine Fault is any failure of a system. The failure may be that the system is not working, or the system is behaving maliciously. An f-Byzantine Fault-Tolerant network is a network that operates correctly under the assumption that f nodes may be concurrently experiencing Byzantine faults. The nodes are either well-behaving or ill-behaving, i.e., suffering from Byzantine Faults. The network comprises n nodes with a maximum of f ill-behaving nodes.
The best implementation of Byzantine Fault Tolerant system for asynchronous protocols is the Practical Byzantine Fault Tolerance (PBFT).  PBFT assumes that the size of the network is at least 3 f + 1.
At any given moment, one node is the primary and the other nodes are replicas. The simplified algorithm operates as follows:
- 
A client sends a request to the primary node for an operation 
- 
The primary node broadcasts this request to all replicas 
- 
Replicas perform the requested operation and return an answer to the client 
- 
The client waits for at least replicas answering the same result. This is the consensus outcome of the operation. 
 The figure illustrates the basic behavior for the case of a 1-node fault tolerant system. It has at least four nodes. The client sends its request to the primary node. The primary node broadcasts the request of operation to the three other nodes: replica 1, replica 2 and replica 3. As soon as the client receives an affirmative answer from two replicas (in the figure, replica 1 and replica 3), it considers the operation successful. If one replica is failing, the client will still receive proper answers from the two other replicas. If the primary node is failing, then the protocol switches the role of primary to one replica.
The figure illustrates the basic behavior for the case of a 1-node fault tolerant system. It has at least four nodes. The client sends its request to the primary node. The primary node broadcasts the request of operation to the three other nodes: replica 1, replica 2 and replica 3. As soon as the client receives an affirmative answer from two replicas (in the figure, replica 1 and replica 3), it considers the operation successful. If one replica is failing, the client will still receive proper answers from the two other replicas. If the primary node is failing, then the protocol switches the role of primary to one replica.
The PBFT is well adapted for fixed networks with known trusted entities. As such, it may be useful for blockchain validated by a set of known, approved authorities, i.e., permissioned blockchains. The number of authorities defines the system’s resilience. The minimal size is four which allows for one ill-behaving authority. Each authority has the same level of trust and shares trust with the other authorities.
PBFT is straightforward and robust. The theoretical background is well studied, known for a long time and used by the community. The trust is not linked to the ownership of resources or stakes. It is fast and computationally efficient. There is no notion of mining. It is deterministic. A permissioned blockchain using PBFT can have a very high through output of transactions.
Unfortunately, the trust is not flexible. PBFT requires a pre-established list of collaborating trusted participants. Any new participant requires new settings. This adjustment cannot be dynamic. The consensus of permissionless blockchains cannot employ PBFT. By participants, we mean the validation nodes and not the users of the blockchain. The system does not need to trust the users. It is the role of the validation and consensus to screen bad transactions.
A known weakness is that the system is prone to Sybil attacks. An entity may attempt to register several participants that it controls stealthily. If the cheating entity controls at least f sibyls, then it controls the full system. Therefore, the initial trust in the participants is paramount. In most permissioned blockchains, the likelihood of Sybil attacks is negligible.
Conclusion: PBFT is a well-known, well-studied protocol. For decades, distributed systems have taken their decisions using it. It is fast and computationally efficient. PBFT is an interesting consensus protocol for permissioned blockchain. Some variants of BFT are also available.
Bibliography
M. Castro and B. Liskov, “Practical Byzantine Fault Tolerance,” in Proceedings of the Third Symposium on Operating Systems Design and Implementation, New Orleans, USA, 1999