direkt zum Inhalt springen

direkt zum Hauptnavigationsmenü

Sie sind hier

TU Berlin

Page Content

Distributed Systems

Almost every computing system nowadays is distributed, ranging from multi-core laptops to Internet-scale services; understanding the principles of distributed computing is hence important for the design and engineering of modern computing systems.  Fundamental issues that arise in reliable and efficient distributed systems include developing adequate methods for modeling failures and synchrony assumptions, determining precise performance bounds on implementations of concurrent data structures, capturing the trade-off between consistency and efficiency, and demarcating the frontier of feasibility in distributed computing.

For example, popular Internet services and applications such as CNN.com, YouTube, Facebook, Skype, BitTorrent attract millions of users every day, and only by the effective load-balancing and collaboration of many thousand machines, an acceptable Quality-of-Service/Quality-of-Experience can be guaranteed. While distributed systems promise a good scalability as well as a high robustness, they pose challenging research problems, such as: How to design robust and scalable distributed architectures and services? How to coordinate access to a shared resource, e.g., by electing a leader? Or how to provide incentives for cooperation in an open, collaborative distributed system?

Selected Publications

The Weakest Failure Detector for Solving k-Set Agreement
Citation key GK-WFDSKSA-09
Author Gafni, Eli and Kuznetsov, Petr
Title of Book 28th ACM SIGACT/SIGOPS Symposium on Principles of Distributed Computing (PODC 2009)
Pages 83–91
Year 2009
ISBN 978-1-60558-396-9
DOI http://dx.doi.org/10.1145/1582716.1582735
Location Calcary, AB, Canada
Address New York, NY, USA
Month August
Publisher ACM
Abstract A failure detector is a distributed oracle that provides processes in a distributed system with hints about failures. The notion of a weakest failure detector captures the exact amount of synchrony needed for solving a given distributed computing problem. In this paper, we determine the weakest failure detector for solving k-set agreement among n processes (n>k) using reads and writes in shared memory, regardless of the assumptions on when and where failures might occur. This failure detector is derived directly from the impossibility of wait-free k+1-process k-set agreement. Our approach can be viewed as an extension of the asynchronous BG-simulation technique to partially synchronous systems.
Link to publication Download Bibtex entry

Zusatzinformationen / Extras

Quick Access:

Schnellnavigation zur Seite über Nummerneingabe