Large scale dependable distributed systems - A.A. 2012-2013

Lecturer: Leonardo Querzoni
CFUs: 3
Lecture hours:
    1st semester:
    Thursday 14:00-15:30, room A4.
    Friday 12:00-13:30, room A4.

The success of the *-as-a-service business model recently shifted the demand for distributed reliable architectures towards previously unseen scales. Modern cloud platforms represent the main result of years of research in the area of dependable distributed systems. Yet, the design of their internal architectures pushed researchers to find new solutions to well known problems in order to withstand the sheer scale and the demand for elasticity that characterize cloud scenarios.
This course aims at analyzing current trends in the design of large scale dependable distributed systems, that are at the base of cloud platforms. The course will focus on the following topics:

  • Resilience to byzantine faults (including Byzantine-resilient state machine replication)
  • Storage in dynamic settings
  • Large scale replicated data storage

Wed 19/12/2012: the lecture of Thu 20/12/2012 will start at 14:30.


November 15th, 2012 - Intro, a short recap on dependability.
November 23rd, 2012 - RSMs, consensus and the byzantine generals problem.
November 24th, 2012 - BFT.
November 29th, 2012 - BFT.
November 30th, 2012 - BFT.
December 6th, 2012 - Dynamic registers.
December 7th, 2012 - Large scale replicated data storage.
December 13th, 2012 - Large scale replicated data storage.
December 14th, 2012 - Large scale replicated data storage.
December 20th, 2012 - Concluding remarks.


The password for accessing the following PDFs is "eds"
Introduction - 1 - 2 - 3 - 4 - 5 - 6 - 7

Exam rules

Instructions (Please read carefully and contact me for further information)
Suggested topics
Paper templates in LaTeX and Word.

Useful links:
  1. A. Avizienis, J.-C. Laprie, B. Randell and C. Landwehr. Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. on Dependable and Secure Computing, vol. 1, no. 1, pp. 11–33, 2004.
  2. L. Lamport,R. Shostak and M. Pease. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4, 3 (July 1982), 382-401.
  3. F. Schneider. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Comp. Surveys, vol. 22, n. 4, 1990
  4. L. Lamport. The Part-Time Parliament. ACM Transactions on Computer Systems, vol. 16, n. 2, pp.133-169, 1998.
  5. L. Lamport. Paxos Made Simple. ACM SIGACT News vol. 32, n. 4, pp. 51-58, 2001
  6. M. Castro and B. Liskov. Practical Byzantine fault-tolerance and proactive recovery. ACM Transactions on Computer Systems (TOCS), vol. 20, n. 4, November 2002
  7. R. Kotla, L. Alvisi, M. Dahlin, A. Clement, and E. Wong. Zyzzyva: speculative byzantine fault tolerance. Proc. of 21st ACM SIGOPS symposium on Operating systems principles (SOSP '07). ACM, New York, NY, USA, 45-58, 2007.
  8. J. Yin, J-Ph. Martin, A. Venkataramani, L. Alvisi, and M. Dahlin. Separating agreement from execution for byzantine fault tolerant services. SIGOPS Oper. Syst. Rev. 37, 5 (October 2003), 253-267.
  9. Allen Clement, Edmund L. Wong, Lorenzo Alvisi, Michael Dahlin, Mirco Marchetti. Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults. NSDI 2009
  10. Rachid Guerraoui, Nikola Knezevic, Vivien Quéma, Marko Vukolic. The next 700 BFT protocols. EuroSys 2010
  11. Nancy Lynch and Seth Gilbert. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, Volume 33 Issue 2, 2002.
  12. Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni. PNUTS: Yahoo!'s hosted data serving platform, VLDB 2008
  13. Avinash Lakshman, Prashant Malik. Cassandra: a structured storage system on a P2P network. LADIS 2009.

Many of these papers are freely available. Those that require an active subscription can be downloaded from computers connected through the proxy installed at La Sapienza. Check the BIXY service (in italian), or contact me for further details.