Almost every computing system
nowadays is distributed, ranging from multi-core laptops to
Internet-scale services; understanding the principles of distributed
computing is hence important for the design and engineering of modern
computing systems. Fundamental issues that arise in reliable and
efficient distributed systems include developing adequate methods for
modeling failures and synchrony assumptions, determining precise
performance bounds on implementations of concurrent data structures,
capturing the trade-off between consistency and efficiency, and
demarcating the frontier of feasibility in distributed computing.
For example, popular Internet services and applications such as CNN.com, YouTube, Facebook, Skype, BitTorrent attract millions of users every day, and only by the effective load-balancing and collaboration of many thousand machines, an acceptable Quality-of-Service/Quality-of-Experience can be guaranteed. While distributed systems promise a good scalability as well as a high robustness, they pose challenging research problems, such as: How to design robust and scalable distributed architectures and services? How to coordinate access to a shared resource, e.g., by electing a leader? Or how to provide incentives for cooperation in an open, collaborative distributed system?
|Author||Fabian Kuhn and Thomas Locher and Schmid, Stefan|
|Title of Book||27th ACM Symposium on Principles of Distributed Computing (PODC)|
|Abstract||This paper studies the problem of computing the most frequent element (the mode) by means of a distributed algorithm where the elements are located at the nodes of a network. Let k denote the number of distinct elements and further let m_i be the number of occurrences of the element e_i in the ordered list of occurrences m_1>m_2>=...>=m_k. We give a deterministic distributed algorithm with time complexity O(D+k) where D denotes the diameter of the graph, which is essentially tight. As our main contribution, a Monte Carlo algorithm is presented which computes the mode in O(D + F_2/m_1^2 log k) time with high probability, where the frequency moment F_l is defined as F_l = Sum_(i=1)^k m_i^l. This algorithm is substantially faster than the deterministic algorithm for various relevant frequency distributions. Moreover, we provide a lower bound of Ω(D + F_5/(m_1^5 B)), where B is the maximum message size, that captures the eect of the frequency distribution on the time complexity to compute the mode.|