Topics for the Seminar on Internet Measurement, SoSe 2011
- 01 — Quantifying Path Exploration in the Internet
- 05 — Routing Stability in Static Wireless Mesh Networks
- 08 — A First Look at Mobile Hand-held Device Traffic
- 09 — Understanding Online Social Network Usage from a Network Perspective
- 10 — Seven Years and One Day: Sketching the Evolution of Internet Traffic
- 16 — TCP Revisited: A Fresh Look at TCP in the Wild
- 17 — On Dominant Characteristics of Residential Broadband Internet Traffic
- 19 — Characterizing VLAN-Induced Sharing in a Campus Network
- 20 — Live Streaming Performance of the Zattoo Network
- 23 — Measuring Serendipity: Connecting People, Locations and Interests in a Mobile 3G
- 25 — The nature of data center traffic: measurements & analysis
- 32 — Measuring availability in the Domain Name System
- 34 — TopBT: A Topology-Aware and Infrastructure-Independent BitTorrent Client
- 99 — Improving Content Delivery - Using Provider-aided Distance Information
01 — Quantifying Path Exploration in the Internet
Student/Bearbeiter: Jan Henning;
Supervisor/Betreuer: Steve Uhlig;
A number of previous measurement studies have shown the existence of path exploration and slow convergence in the global Internet routing system, and a number of protocol enhancements have been proposed to remedy the problem. However, all the previous measurements were conducted over a small number of testing prefixes. There has been no systematic study to quantify the pervasiveness of BGP slow convergence in the operational Internet, nor is there any known effort to deploy any of the proposed solutions. In this paper we present our measurement results from identifying BGP slow convergence events across the entire global routing table. Our data shows that the severity of path exploration and slow convergence varies depending on where prefixes are originated and where the observations are made in the Internet routing hierarchy. In general, routers in tier-1 ISPs observe less path exploration, hence shorter convergence delays than routers in edge ASes, and prefixes originated from tier-1 ISPs also experience less path exploration than those originated from edge ASes. Our data also shows that the convergence time of route fail-over events is similar to that of new route announcements, and significantly shorter than that of route failures, which confirms our earlier analytical results. In addition, we also developed a usage-time based path preference inference method which can be used by future studies of BGP dynamics.
- R. Oliveira, B. Zhang, D. Pei, R. Izhak-Ratzin & L. Zhang. Quantifying Path Exploration in the Internet, Internet Measurement Conference 2006
05 — Routing Stability in Static Wireless Mesh Networks
Student/Bearbeiter: Andrii Soloviov;
Supervisor/Betreuer: Ruben Merz;
Considerable research has focused on the design of routing protocols for wireless mesh networks. Yet, little is understood about the stability of routes in such networks. This understanding is important in the design of wireless routing protocols, and in network planning and management. In this paper, we present results from our measurement-based characterization of routing stability in two network deployments, the UCSB MeshNet and the MIT Roofnet. To conduct these case studies, we use detailed link quality information collected over several days from each of these networks. Using this information, we investigate routing stability in terms of route-level characteristics, such as prevalence, persistence and flapping. Our key findings are the following: wireless routes are weakly dominated by a single route; dominant routes are extremely short-lived due to excessive route flapping; and simple stabilization techniques, such as hysteresis thresholds, can provide a significant improvement in route persistence.
- K. Ramachandran, I. Sheriff, E. Belding, K. Almeroth. Routing Stability in Static Wireless Mesh Networks, Passive and Active Measurement Conference 2007
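The route-level characteristics named in this abstract can be illustrated with a small Python sketch. It is not the authors' code: the function names, the representation of a route as a string label, and the "lower cost is better" metric are assumptions made only for illustration. Prevalence is taken here as the share of observations in which the dominant route was in use, persistence as the length of runs without a route change, and hysteresis as switching only when the best route beats the current one by more than a threshold.

```python
from collections import Counter


def prevalence(routes):
    """Return the dominant route and the fraction of observations using it."""
    counts = Counter(routes)
    dominant, hits = counts.most_common(1)[0]
    return dominant, hits / len(routes)


def persistence_runs(routes):
    """Return the lengths of consecutive runs with an unchanged route."""
    runs = []
    run = 1
    for prev, cur in zip(routes, routes[1:]):
        if cur == prev:
            run += 1
        else:
            runs.append(run)
            run = 1
    runs.append(run)
    return runs


def select_with_hysteresis(costs_over_time, threshold):
    """Pick a route per time step; switch only if the best route is cheaper
    than the current one by more than `threshold` (hypothetical cost metric)."""
    chosen = []
    current = None
    for costs in costs_over_time:
        best = min(costs, key=costs.get)
        if (current is None or current not in costs
                or costs[current] - costs[best] > threshold):
            current = best
        chosen.append(current)
    return chosen


if __name__ == "__main__":
    observed = ["A", "A", "B", "A", "A", "A"]
    print(prevalence(observed))
    print(persistence_runs(observed))
    costs = [{"A": 1.0, "B": 1.2}, {"A": 1.3, "B": 1.2},
             {"A": 1.0, "B": 1.2}, {"A": 2.0, "B": 1.2}]
    print(select_with_hysteresis(costs, 0.0))
    print(select_with_hysteresis(costs, 0.5))
```

With a threshold of zero the selection follows every small change in cost, which is the flapping behaviour; a larger threshold trades some optimality for longer-lived routes.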
08 — A First Look at Mobile Hand-held Device Traffic
Student/Bearbeiter: Martin Schenck;
Supervisor/Betreuer: Pan Hui;
Although mobile hand-held devices (MHDs) are ubiquitous today, little is known about how they are used, especially at home. In this paper, we cast a first look on mobile hand-held device usage from a network perspective. We base our study on anonymized packet level data representing more than 20,000 residential DSL customers. Our characterization of the traffic shows that MHDs are active on up to 3% of the monitored DSL lines. Mobile devices from Apple (i.e., iPhones and iPods) are, by a huge margin, the most commonly used MHDs and account for most of the traffic. We find that MHD traffic is dominated by multimedia content and downloads of mobile applications.
- Gregor Maier, Fabian Schneider, Anja Feldmann. A First Look at Mobile Hand-held Device Traffic, In PAM '10: Proceedings of the 11th International Conference on Passive and Active Network Measurement, (Location: Zurich, Switzerland), Springer, April 2010.
09 — Understanding Online Social Network Usage from a Network Perspective
Student/Bearbeiter: Eric Klieme;
Supervisor/Betreuer: Gilles Trédan;
Online Social Networks (OSNs) have already attracted more than half a billion users. However, our understanding of which OSN features attract and keep the attention of these users is poor. Studies thus far have relied on surveys or interviews of OSN users or focused on static properties, e.g., the friendship graph, gathered via sampled crawls. In this paper, we study how users actually interact with OSNs by extracting clickstreams from passively monitored network traffic. Our characterization of user interactions within the OSN for four different OSNs (Facebook, LinkedIn, Hi5, and StudiVZ) focuses on feature popularity, session characteristics, and the dynamics within OSN sessions. We find, for example, that users commonly spend more than half an hour interacting with the OSNs while the byte contributions per OSN session are relatively small.
- Fabian Schneider, Anja Feldmann, Balachander Krishnamurthy, Walter Willinger. Understanding Online Social Network Usage from a Network Perspective, In IMC '09: Proceedings of the 2009 Internet Measurement Conference, (Location: Chicago, IL), Pages 35–48, ACM Press, New York, NY, USA, November 2009.
10 — Seven Years and One Day: Sketching the Evolution of Internet Traffic
Student/Bearbeiter: Dominik Barczyk;
Supervisor/Betreuer: Steve Uhlig;
This contribution aims at performing a longitudinal study of the evolution of the traffic collected every day for seven years on a trans-Pacific backbone link (the MAWI dataset). Long term characteristics are investigated both at TCP/IP layers (packet and flow attributes) and application usages. The analysis of this unique dataset provides new insights into changes in traffic statistics, notably on the persistence of Long Range Dependence, induced by the on-going increase in link bandwidth. Traffic in the MAWI dataset is subject to bandwidth changes, to congestions, and to a variety of anomalies. This allows the comparison of their impacts on the traffic statistics but at the same time significantly impairs long term evolution characterizations. To account for this difficulty, we show and explain how and why random projection (sketch) based analysis procedures provide practitioners with an efficient and robust tool to disentangle actual long term evolutions from time localized events such as anomalies and link congestions. Our central results consist in showing a strong and persistent long range dependence controlling jointly byte and packet counts. An additional study of a 24-hour trace complements the long-term results with the analysis of intraday variabilities.
- Pierre Borgnat, Guillaume Dewaele, Kensuke Fukuda, Patrice Abry, Kenjiro Cho. Seven Years and One Day: Sketching the Evolution of Internet Traffic, INFOCOM 2009
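The idea behind sketch-based analysis can be illustrated with a minimal Python sketch. This is a simplified illustration and not the authors' procedure: the hash key (a source address string), the per-bucket statistic (a plain byte count), and the function names are assumptions. Packets are split into sub-traces by hashing a flow key, a statistic is computed per sub-trace, and the median across sub-traces is used so that an anomaly concentrated in a few flows affects only a few buckets.

```python
import hashlib
from statistics import median


def bucket_of(key, num_buckets, seed=0):
    """Map a flow key to a bucket with a seeded, deterministic hash."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def sketch_byte_counts(packets, num_buckets, seed=0):
    """Sum packet sizes per bucket; `packets` is a list of (key, size) pairs."""
    totals = [0] * num_buckets
    for key, size in packets:
        totals[bucket_of(key, num_buckets, seed)] += size
    return totals


def robust_level(packets, num_buckets, seed=0):
    """Median of the per-bucket byte counts, insensitive to a few outlier buckets."""
    return median(sketch_byte_counts(packets, num_buckets, seed))


if __name__ == "__main__":
    normal = [(f"10.0.0.{i}", 100) for i in range(200)]
    flood = normal + [("198.51.100.7", 10**9)]  # one hypothetical heavy source
    print(robust_level(normal, 8), robust_level(flood, 8))
```

Because all packets of one key land in the same bucket, the heavy source inflates a single bucket while the median stays at the level of ordinary traffic.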
16 — TCP Revisited: A Fresh Look at TCP in the Wild
Student/Bearbeiter: Dönmez;
Supervisor/Betreuer: Amir Mehmood;
Since the last in-depth studies of measured TCP traffic some 6–8 years ago, the Internet has experienced significant changes, including the rapid deployment of backbone links with 1–2 orders of magnitude more capacity, the emergence of bandwidth-intensive streaming applications, and the massive penetration of new TCP variants. These and other changes beg the question whether the characteristics of measured TCP traffic in today's Internet reflect these changes or have largely remained the same. To answer this question, we collected and analyzed packet traces from a number of Internet backbone and access links, focused on the "heavy-hitter" flows responsible for the majority of traffic. Next we analyzed their within-flow packet dynamics, and observed the following features: (1) in one of our datasets, up to 15.8% of flows have an initial congestion window (ICW) size larger than the upper bound specified by RFC 3390. (2) Among flows that encounter retransmission rates of more than 10%, 5% of them exhibit irregular retransmission behavior where the sender does not slow down its sending rate during retransmissions. (3) TCP flow clocking (i.e., regular spacing between flights of packets) can be caused by both RTT and non-RTT factors such as application or link layer, and 60% of flows studied show no pronounced flow clocking. To arrive at these findings, we developed novel techniques for analyzing unidirectional TCP flows, including a technique for inferring ICW size, a method for detecting irregular retransmissions, and a new approach for accurately extracting flow clocks.
- Feng Qian, Alexandre Gerber, Z. Morley Mao, Subhabrata Sen, Oliver Spatscheck, Walter Willinger. TCP Revisited: A Fresh Look at TCP in the Wild, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
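Finding (1) in this abstract compares inferred initial congestion windows against the upper bound of RFC 3390, which is min(4*MSS, max(2*MSS, 4380 bytes)). The Python sketch below only evaluates that bound for an already known window size; the paper's technique for inferring the window from unidirectional traces is not reproduced, and the function names are chosen for illustration.

```python
def rfc3390_initial_window_bound(mss_bytes):
    """Upper bound on the initial window in bytes, as given in RFC 3390."""
    return min(4 * mss_bytes, max(2 * mss_bytes, 4380))


def exceeds_rfc3390(icw_bytes, mss_bytes):
    """True if an (already inferred) initial window is above the RFC 3390 bound."""
    return icw_bytes > rfc3390_initial_window_bound(mss_bytes)


if __name__ == "__main__":
    for mss in (536, 1460, 2190):
        print(mss, rfc3390_initial_window_bound(mss))
    print(exceeds_rfc3390(10 * 1460, 1460))
```

For the common MSS of 1460 bytes the bound is 4380 bytes, i.e. three full-sized segments.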
17 — On Dominant Characteristics of Residential Broadband Internet Traffic
Student/Bearbeiter: Michal Tadeusz Stawski;
Supervisor/Betreuer: Juhoon Kim;
While residential broadband Internet access is popular in many parts of the world, only a few studies have examined the characteristics of such traffic. In this paper we describe observations from monitoring the network activity for more than 20,000 residential DSL customers in an urban area. To ensure privacy, all data is immediately anonymized. We augment the anonymized packet traces with information about DSL-level sessions, IP (re-)assignments, and DSL link bandwidth.
Our analysis reveals a number of surprises in terms of the mental models we developed from the measurement literature. For example, we find that HTTP—not peer-to-peer—traffic dominates by a significant margin; that more often than not the home user's immediate ISP connectivity contributes more to the round-trip times the user experiences than the WAN portion of the path; and that the DSL lines are frequently not the bottleneck in bulk-transfer performance.
- Gregor Maier, Anja Feldmann, Vern Paxson, Mark Allman. On Dominant Characteristics of Residential Broadband Internet Traffic, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
19 — Characterizing VLAN-Induced Sharing in a Campus Network
Student/Bearbeiter: Ece Gürler;
Supervisor/Betreuer: Cigdem Sengul;
Many enterprise, campus, and data-center networks have complex layer-2 virtual LANs ("VLANs") below the IP layer. The interaction between layer-2 and IP topologies in these VLANs introduces hidden dependencies between IP level network and the physical infrastructure that has implications for network management tasks such as planning for capacity or reliability, and for fault diagnosis. This paper characterizes the extent and effect of these dependencies in a large campus network. We first present the design and implementation of EtherTrace, a tool that we make publicly available, which infers the layer-2 topology using data passively collected from Ethernet switches. Using this tool, we infer the layer-2 topology for a large campus network and compare it with the IP topology. We find that almost 70% of layer-2 edges are shared by 10 or more IP edges, and a single layer-2 edge may be shared by as many as 34 different IP edges. This sharing of layer-2 edges and switches among IP paths commonly results from trunking multiple VLANs to the same access router, or from colocation of academic departments that share layer-2 infrastructure, but have logically separate IP subnet and routers. We examine how this sharing affects the accuracy and specificity of fault diagnosis. For example, applying network tomography to the IP topology to diagnose failures caused by layer-2 devices results in only 54% accuracy, compared to 100% accuracy when our tomography algorithm takes input across layers.
- Ahmed Mansy, Mukarram bin Tariq, Nick Feamster, Mostafa Ammar. Characterizing VLAN-Induced Sharing in a Campus Network, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
20 — Live Streaming Performance of the Zattoo Network
Student/Bearbeiter: Michael Winkelmann;
Supervisor/Betreuer: Oliver
A number of commercial peer-to-peer systems for live streaming, such as PPLive, Joost, LiveStation, SOPCast, TVants, etc. have been introduced in recent years. The behavior of these popular systems has been extensively studied in several measurement papers. Due to the proprietary nature of these commercial systems, however, these studies have to rely on a "black-box" approach, where packet traces are collected from a single or a limited number of measurement points, to infer various properties of traffic on the control and data planes. Although such studies are useful to compare different systems from end-user's perspective, it is difficult to intuitively understand the observed properties without fully reverse-engineering the underlying systems. Our paper presents a large-scale measurement study of Zattoo, one of the largest production live streaming providers in Europe, using data collected by the provider. To highlight, we found that even when the Zattoo system was heavily loaded with as high as 20,000 concurrent users on a single overlay, the median channel join delay remained less than 2 to 5 seconds, and that, for a majority of users, the streamed signal lags over-the-air broadcast signal by no more than 3 seconds. To motivate the measurement study, we also present a description of the Zattoo network architecture.
- Hyunseok Chang, Sugih Jamin, Wenjie Wang. Live Streaming Performance of the Zattoo Network, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
23 — Measuring Serendipity: Connecting People, Locations and Interests in a Mobile 3G
Student/Bearbeiter: Francisco Javier Sanchez-Migallon Blanco;
Supervisor/Betreuer: Ingmar Poese;
Characterizing the relationship that exists between people's application interests and mobility properties is the core question relevant for location-based services, in particular those that facilitate serendipitous discovery of people, businesses and objects. In this paper, we apply rule mining and spectral clustering to study this relationship for a population of over 280,000 users of a 3G mobile network in a large metropolitan area. Our analysis reveals that (i) People's movement patterns are correlated with the applications they access, e.g., stationary users and those who move more often and visit more locations tend to access different applications. (ii) Location affects the applications accessed by users, i.e., at certain locations, users are more likely to evince interest in a particular class of applications than others irrespective of the time of day. (iii) Finally, the number of serendipitous meetings between users of similar cyber interest is larger in regions with higher density of hotspots. Our analysis demonstrates how cellular network providers and location-based services can benefit from knowledge of the inter-play between users and their locations and interests.
- Ionut Trestian, Supranamaya Ranjan, Aleksandar Kuzmanovic, Antonio Nucci. Measuring Serendipity: Connecting People, Locations and Interests in a Mobile 3G, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
25 — The nature of data center traffic: measurements & analysis
Student/Bearbeiter: Adin Sljivar;
Supervisor/Betreuer: Nadi Sarrar;
We explore the nature of traffic in data centers, designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1500 server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might be obtained instead via tomographic inference from coarser-grained counter data.
- Srikanth Kandula, Sudipta Sengupta, Albert Greenberg, Parveen Patel. The nature of data center traffic: measurements & analysis, 9th ACM SIGCOMM Conference on Internet Measurement (IMC '09), Chicago, IL, USA, 2009
32 — Measuring availability in the Domain Name System
Student/Bearbeiter: Michal Holowaty;
Supervisor/Betreuer: Bernhard Ager;
The domain name system (DNS) is critical to Internet functionality. The availability of a domain name refers to its ability to be resolved correctly. We develop a model for server dependencies that is used as a basis for measuring availability. We introduce the minimum number of servers queried (MSQ) and redundancy as availability metrics and show how common DNS misconfigurations impact the availability of domain names. We apply the availability model to domain names from production DNS and observe that 6.7% of names exhibit sub-optimal MSQ, and 14% experience false redundancy. The MSQ and redundancy values can be optimized by proper maintenance of delegation records for zones.
- Casey Deccio, Jeff Sedayao, Krishna Kant, Prasant Mohapatra. Measuring Availability in the Domain Name System, IEEE INFOCOM 2010
34 — TopBT: A Topology-Aware and Infrastructure-Independent BitTorrent Client
Student/Bearbeiter: Nehring;
Supervisor/Betreuer: Georgios Smaragdakis;
BitTorrent (BT) has carried out a significant and continuously increasing portion of Internet traffic. While several designs have been recently proposed and implemented to improve the resource utilization by bridging the application layer (overlay) and the network layer (underlay), these designs are largely dependent on Internet infrastructures, such as ISPs and CDNs. In addition, they also demand large-scale deployments of their systems to work effectively. Consequently, they require multi-efforts far beyond individual users' ability to be widely used in the Internet.
In this paper, aiming at building an infrastructure-independent user-level facility, we present our design, implementation, and evaluation of a topology-aware BT system, called TopBT, to significantly improve the overall Internet resource utilization without degrading user downloading performance. The unique feature of TopBT client lies in that a TopBT client actively discovers network proximities (to connected peers), and uses both proximities and transmission rates to maintain fast downloading while reducing the transmitting distance of the BT traffic and thus the Internet traffic. As a result, a TopBT client neither requires feeds from major Internet infrastructures, such as ISPs or CDNs, nor requires large-scale deployment of other TopBT clients on the Internet to work effectively. We have implemented TopBT based on widely used open-source BT client code base, and made the software publicly available. By deploying TopBT and other BitTorrent clients on hundreds of Internet hosts, we show that on average TopBT can reduce about 25% download traffic while achieving a 15% faster download speed compared to several prevalent BT clients. TopBT has been widely used in the Internet by many users all over the world.
- Shansi Ren, Enhua Tan, Tian Luo, Songqing Chen, Lei Guo, and Xiaodong Zhang. TopBT: A Topology-Aware and Infrastructure-Independent BitTorrent Client, IEEE INFOCOM 2010
99 — Improving Content Delivery - Using Provider-aided Distance Information
Student/Bearbeiter: Robert Philipp Skupin;
Supervisor/Betreuer: Juhoon
Content delivery systems constitute a major portion of today's Internet traffic. While they are a good source of revenue for Internet Service Providers (ISPs), the huge volume of content delivery traffic also poses a significant burden and traffic engineering challenge for the ISP. The difficulty is due to the immense volume of transfers, while the traffic engineering challenge stems from the fact that most content delivery systems themselves utilize a distributed infrastructure. They perform their own traffic flow optimization and realize this using the DNS system. While content delivery systems may, to some extent, consider the user's performance within their optimization criteria, they currently have no incentive to consider any of the ISP's constraints. As a consequence, the ISP has "lost control" over a major part of its traffic. To overcome this impairment, we propose a solution where the ISP offers a Provider-aided Distance Information System (PaDIS). PaDIS uses information available only to the ISP to rank any client-host pair based on distance information, such as delay, bandwidth or number of hops.
In this paper we show that the applicability of the system is significant. More than 70% of the HTTP traffic of a major European ISP can be accessed via multiple different locations. Moreover, we show that deploying PaDIS is not only beneficial to ISPs, but also to users. Experiments with different content providers show that improvements in download times of up to a factor of four are possible. Furthermore, we describe a high performance implementation of PaDIS and show how it can be deployed within an ISP.
- Ingmar Poese, Benjamin Frank, Bernhard Ager, Georgios Smaragdakis, Anja Feldmann. Improving Content Delivery using Provider-aided Distance Information, Internet Measurement Conference 2010
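The ranking step described for PaDIS can be pictured with a short Python sketch. It is a hypothetical illustration, not the PaDIS implementation: the `PathInfo` record, the `rank_hosts` function, and the choice to order by delay first and hop count second are all assumptions. Candidate hosts for which the ISP has no path information are placed last.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathInfo:
    """Hypothetical ISP-internal view of a client-host path."""
    delay_ms: float
    hops: int


def rank_hosts(client, candidates, path_info):
    """Sort candidate hosts for `client` by (delay, hops); unknown paths go last."""
    def sort_key(host):
        info = path_info.get((client, host))
        if info is None:
            return (1, 0.0, 0)
        return (0, info.delay_ms, info.hops)
    return sorted(candidates, key=sort_key)


if __name__ == "__main__":
    paths = {
        ("client", "h1"): PathInfo(delay_ms=30.0, hops=5),
        ("client", "h2"): PathInfo(delay_ms=10.0, hops=7),
        ("client", "h3"): PathInfo(delay_ms=10.0, hops=3),
    }
    print(rank_hosts("client", ["h1", "h2", "h3", "h4"], paths))
```

In a deployment along the lines of the abstract, the candidate list would be the set of locations from which the content can be fetched, and the metric could equally be bandwidth or any other distance information available to the ISP.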