TU Berlin

Internet Network ArchitecturesSE Internet Routing – Topics

Inhalt des Dokuments

zur Navigation

Topics for the Seminar on Internet Routing, WS 2015/16

Topics for the seminar on Internet Routing (WS 2015/16).
Themen für das Seminar über Internet Routing (WS 2015/16).

Model-Driven Data Acquisition in Sensor Networks

Declarative queries are proving to be an attractive paradigm for ineracting with networks of wireless sensors. The metaphor that "the sensornet is a database" is problematic, however, because sensors do not exhaustively represent the data in the real world. In order to map the raw sensor readings onto physical reality, a model of that reality is required to complement the readings. In this paper, we enrich interactive sensor querying with statistical modeling techniques. We demonstrate that such models can help provide answers that are both more meaningful, and, by introducing approximations with probabilistic confidences, significantly more efficient to compute in both time and energy. Utilizing the combination of a model and live data acquisition raises the challenging optimization problem of selecting the best sensor readings to acquire, balancing the increase in the confidence of our answer against the communication and data acquisition costs in the network. We describe an exponential time algorithm for finding the optimal solution to this optimization problem, and a polynomial-time heuristic for identifying solutions that perform well in practice. We evaluate our approach on several real-world sensor-network data sets, taking into account the real measured data and communication quality, demonstrating that our model-based approach provides a high-fidelity representation of the real phenomena and leads to significant performance gains versus traditional data acquisition techniques.


A Distributed and Robust SDN Control Plane for Transactional Network Updates

Software-defined networking (SDN) is a novel paradigm that outsources the control of programmable network switches to a set of software controllers. The most fundamental task of these controllers is the correct implementation of the network policy, i.e., the intended network behavior. In essence, such a policy specifies the rules by which packets must be forwarded across the network.

This paper studies a distributed SDN control plane that enables concurrent and robust policy implementation. We introduce a formal model describing the interaction between the data plane and a distributed control plane (consisting of a collection of fault- prone controllers). Then we formulate the problem of consistent composition of concurrent network policy updates (termed the CPC Problem). To anticipate scenarios in which some conflicting policy updates must be rejected, we enable the composition via a natural transactional interface with all-or-nothing semantics.

We show that the ability of an f-resilient distributed control plane to process concurrent policy updates depends on the tag complexity, i.e., the number of policy labels (a.k.a. tags) available to the controllers, and describe a CPC protocol with optimal tag complexity f + 2.


Alibi Routing

There are several mechanisms by which users can gain insight into where their packets have gone, but no mechanisms allow users undeniable proof that their packets did not traverse certain parts of the world while on their way to or from another host. This paper introduces the problem of finding “proofs of avoidance”: evidence that the paths taken by a packet and its response avoided a user-specified set of “forbidden” geographic regions. Proving that something did not happen is often intractable, but we demonstrate a low-overhead proof structure built around the idea of what we call “alibis”: relays with particular timing constraints that, when upheld, would make it impossible to traverse both the relay and the forbidden regions.

We present Alibi Routing, a peer-to-peer overlay routing system for finding alibis securely and efficiently. One of the primary distinguishing characteristics of Alibi Routing is that it does not require knowledge of —or modifications to— the Internet’s routing hardware or policies. Rather, Alibi Routing is able to derive its proofs of avoidance from user-provided GPS coordinates and speed of light propagation delays. Using a PlanetLab deployment and larger-scale simulations, we evaluate Alibi Routing to demonstrate that many source-destination pairs can avoid countries of their choosing with little latency inflation. We also identify when Alibi Routing does not work: it has difficulty avoiding regions that users are very close to (or, of course, inside of).


InterTubes: A Study of the US Long-haul Fiber-optic Infrastructure

The complexity and enormous costs of installing new long-haul fiber-optic infrastructure has led to a significant amount of infrastructure sharing in previously installed conduits. In this paper, we study the characteristics and implications of infrastructure sharing by analyzing the long-haul fiber-optic network in the US. We start by using fiber maps provided by tier-1 ISPs and major cable providers to construct a map of the long-haul US fiber-optic infrastructure. We also rely on previously under-utilized data sources in the form of public records from federal, state, and municipal agencies to improve the fidelity of our map. We quantify the resulting map's connectivity characteristics and confirm a clear correspondence between long-haul fiber-optic, roadway, and railway infrastructures. Next, we examine the prevalence of high-risk links by mapping end-to-end paths resulting from large-scale traceroute campaigns onto our fiber-optic infrastructure map. We show how both risk and latency (i.e., propagation delay) can be reduced by deploying new links along previously unused transportation corridors and rights-of-way. In particular, focusing on a subset of high-risk links is sufficient to improve the overall robustness of the network to failures. Finally, we discuss the implications of our findings on issues related to performance, net neutrality, and policy decision-making.


R2C2: A Network Stack for Rack-scale Computers

Rack-scale computers, comprising a large number of micro-servers connected by a direct-connect topology, are expected to replace servers as the building block in data centers. We focus on the problem of routing and congestion control across the rack's network, and find that high path diversity in rack topologies, in combination with workload diversity across it, means that traditional solutions are inadequate. We introduce R2C2, a network stack for rack-scale computers that provides flexible and efficient routing and congestion control. R2C2 leverages the fact that the scale of rack topologies allows for low-overhead broadcasting to ensure that all nodes in the rack are aware of all network flows. We thus achieve rate-based congestion control without any probing; each node independently determines the sending rate for its flows while respecting the provider's allocation policies. For routing, nodes dynamically choose the routing protocol for each flow in order to maximize overall utility. Through a prototype deployed across a rack emulation platform and a packet-level simulator, we show that R2C2 achieves very low queuing and high throughput for diverse and bursty workloads, and that routing flexibility can provide significant throughput gains.


Adaptive Congestion Control for Unpredictable Cellular Networks

Legacy congestion controls including TCP and its variants are known to perform poorly over cellular networks due to highly variable capacities over short time scales, self-inflicted packet delays, and packet losses unrelated to congestion. To cope with these challenges, we present Verus, an end-to-end congestion control protocol that uses delay measurements to react quickly to the capacity changes in cellular networks without explicitly attempting to predict the cellular channel dynamics. The key idea of Verus is to continuously learn a delay profile that captures the relationship between end-to-end packet delay and outstanding window size over short epochs and uses this relationship to increment or decrement the window size based on the observed short-term packet delay variations. While the delay-based control is primarily for congestion avoidance, Verus uses standard TCP features including multiplicative decrease upon packet loss and slow start. Through a combination of simulations, empirical evaluations using cellular network traces, and real-world evaluations against standard TCP flavors and state of the art protocols like Sprout, we show that Verus outperforms these protocols in cellular channels. In comparison to TCP Cubic, Verus achieves an order of magnitude (> 10x) reduction in delay over 3G and LTE networks while achieving comparable throughput (sometimes marginally higher). In comparison to Sprout, Verus achieves up to 30% higher throughput in rapidly changing cellular networks.


Enabling End-host Network Functions

Many network functions executed in modern datacenters, e.g., load balancing, application-level QoS, and congestion control, exhibit three common properties at the data-plane: they need to access and modify state, to perform computations, and to access application semantics  this is critical since many network functions are best expressed in terms of application-level messages. In this paper, we argue that the end hosts are a natural enforcement point for these functions and we present Eden, an architecture for implementing network functions at datacenter end hosts with minimal network support. Eden comprises three components, a centralized controller, an enclave at each end host, and Eden-compliant applications called stages. To implement network functions, the controller configures stages to classify their data into messages and the enclaves to apply action functions based on a packet's class. Our Eden prototype includes enclaves implemented both in the OS kernel and on programmable NICs. Through case studies, we show how application-level classification and the ability to run actual programs on the data-path allows Eden to efficiently support a broad range of network functions at the network's edge.


Packet-Level Telemetry in Large Datacenter Networks

Debugging faults in complex networks often requires capturing and analyzing traffic at the packet level. In this task, datacenter networks (DCNs) present unique challenges with their scale, traffic volume, and diversity of faults. To troubleshoot faults in a timely manner, DCN administrators must a) identify affected packets inside large volume of traffic; b) track them across multiple network components; c) analyze traffic traces for fault patterns; and d) test or confirm potential causes. To our knowledge, no tool today can achieve both the specificity and scale required for this task.

We present Everflow, a packet-level network telemetry system for large DCNs. Everflow traces specific packets by implementing a powerful packet filter on top of "match and mirror" functionality of commodity switches. It shuffles captured packets to multiple analysis servers using load balancers built on switch ASICs, and it sends "guided probes" to test or confirm potential faults. We present experiments that demonstrate Everflow's scalability, and share experiences of troubleshooting network faults gathered from running it for over 6 months in Microsoft's DCNs.


Presto: Edge-based Load Balancing for Fast Datacenter Networks

Datacenter networks deal with a variety of workloads, ranging from latency-sensitive small flows to bandwidth-hungry large flows. Load balancing schemes based on flow hashing, e.g., ECMP, cause congestion when hash collisions occur and can perform poorly in asymmetric topologies. Recent proposals to load balance the network require centralized traffic engineering, multipath-aware transport, or expensive specialized hardware. We propose a mechanism that avoids these limitations by (i) pushing load-balancing functionality into the soft network edge (e.g., virtual switches) such that no changes are required in the transport layer, customer VMs, or networking hardware, and (ii) load balancing on fine-grained, near-uniform units of data (flowcells) that fit within end-host segment offload optimizations used to support fast networking speeds. We design and implement such a soft-edge load balancing scheme, called Presto, and evaluate it on a 10 Gbps physical testbed. We demonstrate the computational impact of packet reordering on receivers and propose a mechanism to handle reordering in the TCP receive offload functionality. Presto's performance closely tracks that of a single, non-blocking switch over many workloads and is adaptive to failures and topology asymmetry.


Condor: Better Topologies Through Declarative Design

The design space for large, multipath datacenter networks is large and complex, and no one design fits all purposes. Network architects must trade off many criteria to design cost-effective, reliable, and maintainable networks, and typically cannot explore much of the design space. We present Condor, our approach to enabling a rapid, efficient design cycle. Condor allows architects to express their requirements as constraints via a Topology Description Language (TDL), rather than having to directly specify network structures. Condor then uses constraint-based synthesis to rapidly generate candidate topologies, which can be analyzed against multiple criteria. We show that TDL supports concise descriptions of topologies such as fat-trees, BCube, and DCell; that we can generate known and novel variants of fat-trees with simple changes to a TDL file; and that we can synthesize large topologies in tens of seconds. We also show that Condor supports the daunting task of designing multi-phase network expansions that can be carried out on live networks.


Poptrie: A Compressed Trie with Population Count for Fast and Scalable Software IP Routing Table Lookup

Internet of Things leads to routing table explosion. An inexpensive approach for IP routing table lookup is required against ever growing size of the Internet. We contribute by a fast and scalable software routing lookup algorithm based on a multiway trie, called Poptrie. Named after our approach to traversing the tree, it leverages the population count instruction on bit-vector indices for the descendant nodes to compress the data structure within the CPU cache. Poptrie outperforms the state-of-the-art technologies, Tree BitMap, DXR and SAIL, in all of the evaluations using random and real destination queries on 35 routing tables, including the real global tier-1 ISP's full-route routing table. Poptrie peaks between 174 and over 240 Million lookups per second (Mlps) with a single core and tables with 500800k routes, consistently 4578% faster than all competing algorithms in all the tests we ran. We provide the comprehensive performance evaluation, remarkably with the CPU cycle analysis. This paper shows the suitability of Poptrie in the future Internet including IPv6, where a larger route table is expected with longer prefixes.


Central Control Over Distributed Routing

Centralizing routing decisions offers tremendous flexibility, but sacrifices the robustness of distributed protocols. In this paper, we present Fibbing, an architecture that achieves both flexibility and robustness through central control over distributed routing. Fibbing introduces fake nodes and links into an underlying link-state routing protocol, so that routers compute their own forwarding tables based on the augmented topology. Fibbing is expressive, and readily supports flexible load balancing, traffic engineering, and backup routes. Based on high-level forwarding requirements, the Fibbing controller computes a compact augmented topology and injects the fake components through standard routing-protocol messages. Fibbing works with any un-modified routers speaking OSPF. Our experiments also show that it can scale to large networks with many forwarding requirements, introduces minimal overhead, and quickly reacts to network and controller failures.


C3: Cutting Tail Latency in Cloud Data Stores via Adaptive Replica Selection

Achieving predictable performance is critical for many distributed applications, yet difficult to achieve due to many factors that skew the tail of the latency distribution even in well-provisioned systems. In this paper, we present the fundamental challenges involved in designing a replica selection scheme that is robust in the face of performance fluctuations across servers. We illustrate these challenges through performance evaluations of the Cassandra distributed database on Amazon EC2. We then present the design and implementation of an adaptive replica selection mechanism, C3, that is robust to performance variability in the environment. We demonstrate C3’s effectiveness in reducing the latency tail and improving throughput through extensive evaluations on Amazon EC2 and through simulations. Our results show that C3 significantly improves the latencies along the mean, median, and tail (up to 3 times improvement at the 99.9th percentile) and provides higher system throughput.


Beacon-Based Routing Optimization in Data-Gathering Wireless Sensor Networks

In this paper we focus on beacon-based routing optimization of data-gathering wireless sensor networks. Such a network consists of a sink node and a number of scattered sensor nodes which send data packets back to the sink in a multihop fashion. Usually the routing messages, denoted as “beacon”, are initiated by the sink and spread out the whole network by flooding or gossiping. We believe that the mechanism of selecting upstream node as relay has great impact to the packet delivery performance. In this work several beacon rebroadcasting mechanism are inspected and the data packet delivery performance of them are compared by simulations. The simulation result shows that current Zigbee tree-based routing has great flaw in large scale networks. Based on the comparison of the results we propose an optimal link-state beacon-based routing (OLSBR) which shows that parent selection by link state can greatly improve the packet delivery performance.


A General Model for the Virtual Router

Router virtualization technology allows multiple router instances to run on the same physical platform. And two features of high performance and flexible resource scheduling are the main challenges for the design of virtual router. In this paper, we firstly analyze, discuss and illustrate the features and limitations of the existent virtual router models proposed by research communities to support the network virtualization technology. Then, we put forward a general model for the virtual router which has a light-weight VMM layer (Virtual Machine Monitor) that can provide multiple standard APIs (Application Programming Interfaces) to achieve a fine-grained resource scheduling and make the developers design one flexible and plentiful virtual router instance without knowing anything of the underlying physical platform. At last, this paper proposes three resource evaluation norms to estimate the CPU and I/O resource of the general model and presents the advantages of the general model.


Hypercube-Based Multipath Social Feature Routing in Human Contact Networks

Most routing protocols for delay tolerant networks resort to the sufficient state information, including trajectory and contact information, to ensure routing efficiency. However, state information tends to be dynamic and hard to obtain without a global and/or long-term collection process. In this paper, we use the internal social features of each node in the network to perform the routing process. In this way, feature-based routing converts a routing problem in a highly mobile and unstructured contact space to a static and structured feature space. This approach is motivated from several human contact networks, such as the Infocom 2006 trace and MIT reality mining data, where people contact each other more frequently if they have more social features in common. Our approach includes two unique processes: social feature extraction and multipath routing. In social feature extraction, we use entropy to extract the m most informative social features to create a feature space (F-space): (F1, F2,..., Fm), where Fi corresponds to a feature. The routing method then becomes a hypercube-based feature matching process, where the routing process is a step-by-step feature difference resolving process. We offer two special multipath routing schemes: node-disjoint-based routing and delegation-based routing. Extensive simulations on both real and synthetic traces are conducted in comparison with several existing approaches, including spray-and-wait routing, spray-and-focus routing, and social-aware routing based on betweenness centrality and similarity. In addition, the effectiveness of multipath routing is evaluated and compared to that of single-path routing.


An Empirical Reexamination of Global DNS Behavior

The performance and operational characteristics of the DNS protocol are of deep interest to the research and network operations community. In this paper, we present measurement results from a unique dataset containing more than 26 billion DNS query-response pairs collected from more than 600 globally distributed recursive DNS resolvers. We use this dataset to reaffirm findings in published work and notice some significant differences that could be attributed both to the evolving nature of DNS traffic and to our differing perspective. For example, we find that although characteristics of DNS traffic vary greatly across networks, the resolvers within an organization tend to exhibit similar behavior. We further find that more than 50% of DNS queries issued to root servers do not return successful answers, and that the primary cause of lookup failures at root servers is malformed queries with invalid TLDs. Furthermore, we propose a novel approach that detects malicious domain groups using temporal correlation in DNS queries. Our approach requires no comprehensive labeled training set, which can be difficult to build in practice. Instead, it uses a known malicious domain as anchor, and identifies the set of previously unknown malicious domains that are related to the anchor domain. Experimental results illustrate the viability of this approach, i.e. , we attain a true positive rate of more than 96%, and each malicious anchor domain results in a malware domain group with more than 53 previously unknown malicious domains on average.


Trinocular: Understanding Internet Reliability Through Adaptive Probing

Natural and human factors cause Internet outages -from big events like Hurricane Sandy in 2012 and the Egyptian Internet shutdown in Jan. 2011 to small outages every day that go unpublicized. We describe Trinocular, an outage detection system that uses active probing to understand reliability of edge networks. Trinocular is principled: deriving a simple model of the Internet that captures the information pertinent to outages, and populating that model through long-term data, and learning current network state through ICMP probes. It is parsimonious, using Bayesian inference to determine how many probes are needed. On average, each Trinocular instance sends fewer than 20 probes per hour to each /24 network block under study, increasing Internet "background radiation" by less than 0.7%. Trinocular is also predictable and precise: we provide known precision in outage timing and duration. Probing in rounds of 11 minutes, we detect 100% of outages one round or longer, and estimate outage duration within one-half round. Since we require little traffic, a single machine can track 3.4M /24 IPv4 blocks, all of the Internet currently suitable for analysis. We show that our approach is significantly more accurate than the best current methods, with about one-third fewer false conclusions, and about 30% greater coverage at constant accuracy. We validate our approach using controlled experiments, use Trinocular to analyze two days of Internet outages observed from three sites, and re-analyze three years of existing data to develop trends for the Internet.


A First Look at Cellular Machine-to-Machine Traffic – Large Scale Measurement and Characterization

Cellular network based Machine-to-Machine (M2M) communication is fast becoming a market-changing force for a wide spectrum of businesses and applications such as telematics, smart metering, point-of-sale terminals, and home security and automation systems. In this paper, we aim to answer the following important question: Does traffic generated by M2M devices impose new requirements and challenges for cellular network design and management? To answer this question, we take a first look at the characteristics of M2M traffic and compare it with traditional smartphone traffic. We have conducted our measurement analysis using a week- long traffic trace collected from a tier-1 cellular network in the United States. We characterize M2M traffic from a wide range of perspectives, including temporal dynamics, device mobility, application usage, and network performance. Our experimental results show that M2M traffic exhibits significantly different patterns than smartphone traffic in multiple aspects. For instance, M2M devices have a much larger ratio of uplink to downlink traffic volume, their traffic typically exhibits different diurnal patterns, they are more likely to generate synchronized traffic resulting in bursty aggregate traffic volumes, and are less mobile compared to smartphones. On the other hand, we also find that M2M devices are generally competing with smartphones for network resources in co-located geographical regions. These and other findings suggest that better protocol design, more careful spectrum allocation, and modified pricing schemes may be needed to accommodate the rise of M2M devices.


The Need for End-to-End Evaluation of Cloud Availability

People’s computing lives are moving into the cloud, making understanding cloud availability increasingly critical. Prior studies of Internet outages have used ICMP-based pings and traceroutes. While these studies can detect network availability, we show that they can be inaccurate at estimating cloud availability. Without care, ICMP probes can underestimate availability because ICMP is not as robust as application-level measurements such as HTTP. They can overestimate availability if they measure reachability of the cloud’s edge, missing failures in the cloud’s back-end. We develop methodologies sensitive to five “nines” of reliability, and then we compare ICMP and end-to-end measurements for both cloud VM and storage services. We show case studies where one fails and the other succeeds, and our results highlight the importance of application-level retries to reach high precision. When possible, we recommend end-to-end measurement with application-level protocols to evaluate the availability of cloud services.


Volume-based Transit Pricing: Is 95 The Right Percentile?

The 95 th percentile billing mechanism has been an industry de facto standard for transit providers for well over a decade. While the simplicity of the scheme makes it attractive as a billing mechanism, dramatic evolution in traffic patterns, associated interconnection practices and industry structure over the last two decades motivates an obvious question: is it still appropriate? In this paper, we evaluate the 95 th percentile pricing mechanism from the perspective of transit providers, using a decade of traffic statistics from SWITCH (a large research/academic network), and more recent traffic statistics from 3 Internet Exchange Points (IXPs). We find that over time, heavy-inbound and heavy-hitter networks are able to achieve a lower 95th-to-average ratio than heavy-inbound and moderate-hitter networks, possibly due to their ability to better manage their traffic profile. The 95 th percentile traffic volume also does not necessarily reflect the cost burden to the provider, motivating our exploration of an alternative metric that better captures the costs imposed on a network. We define the provision ratio for a customer, which captures its contribution to the provider's peak load.


Cell vs. WiFi: On the Performance of Metro Area Mobile Connections

Cellular and 802.11 WiFi are compelling options for mobile Internet connectivity. The goal of our work is to understand the performance afforded by each of these technologies in diverse environments and use conditions. In this paper, we compare and contrast cellular and WiFi performance using crowd-sourced data from Speedtest.net. Our study considers spatio-temporal performance (upload/download throughput and latency) using over 3 million user-initiated tests from iOS and Android apps in 15 different metro areas collected over a 15 week period. Our basic performance comparisons show that (i) WiFi provides better absolute download/upload throughput, and a higher degree of consistency in performance; (ii) WiFi networks generally deliver lower absolute latency, but the consistency in latency is often better with cellular access; (iii) throughput and latency vary widely depending on the particular access type e.g., HSPA, EVDO, LTE, WiFi, etc.) and service provider. More broadly, our results show that performance consistency for cellular and WiFi is much lower than has been reported for wired broadband. Temporal analysis shows that average performance for cell and WiFi varies with time of day, with the best performance for large metro areas coming at non-peak hours. Spatial analysis shows that performance is highly variable across metro areas, but that there are subregions that offer consistently better performance for cell or WiFi. Comparisons between metro areas show that larger areas provide higher throughput and lower latency than smaller metro areas, suggesting where ISPs have focused their deployment efforts. Finally, our analysis reveals diverse performance characteristics resulting from the rollout of new cell access technologies and service differences among local providers.


Analysis of a “/0” Stealth Scan from a Botnet

Botnets are the most common vehicle of cyber-criminal activity. They are used for spamming, phishing, denial-of-service attacks, brute-force cracking, stealing private information, and cyber warfare. Botnets carry out network scans for several reasons, including searching for vulnerable machines to infect and recruit into the botnet, probing networks for enumeration or penetration, etc. We present the measurement and analysis of a horizontal scan of the entire IPv4 address space conducted by the Sality botnet in February 2011. This 12-day scan originated from approximately 3 million distinct IP addresses and used a heavily coordinated and unusually covert scanning strategy to try to discover and compromise VoIP-related (SIP server) infrastructure. We observed this event through the UCSD Network Telescope, a /8 darknet continuously receiving large amounts of unsolicited traffic, and we correlate this traffic data with other public sources of data to validate our inferences. Sality is one of the largest botnets ever identified by researchers. Its behavior represents ominous advances in the evolution of modern malware: the use of more sophisticated stealth scanning strategies by millions of coordinated bots, targeting critical voice communications infrastructure. This paper offers a detailed dissection of the botnet's scanning behavior, including general methods to correlate, visualize, and extrapolate botnet behavior across the global Internet.


Understanding the Domain Registration Behavior of Spammers

Spammers register a tremendous number of domains to evade blacklisting and takedown efforts. Current techniques to detect such domains rely on crawling spam URLs or monitoring lookup traffic. Such detection techniques are only effective after the spammers have already launched their campaigns, and thus these countermeasures may only come into play after the spammer has already reaped significant benefits from the dissemination of large volumes of spam. In this paper we examine the registration process of such domains, with a particular eye towards features that might indicate that a given domain likely has a malicious purpose at registration time, before it is ever used for an attack. Our assessment includes exploring the characteristics of registrars, domain life cycles, registration bursts, and naming patterns. By investigating zone changes from the .com TLD over a 5-month period, we discover that spam- mers employ bulk registration, that they often re-use domains previously registered by others, and that they tend to register and host their domains over a small set of registrars. Our findings suggest steps that registries or registrars could use to frustrate the efforts of miscreants to acquire domains in bulk, ultimately reducing their agility for mounting large-scale attacks.


A Fistful of Bitcoins: Characterizing Payments Among Men with No Names

Bitcoin is a purely online virtual currency, unbacked by either physical commodities or sovereign obligation; instead, it relies on a combination of cryptographic protection and a peer-to-peer protocol for witnessing settlements. Consequently, Bitcoin has the un-intuitive property that while the ownership of money is implicitly anonymous, its flow is globally visible. In this paper we explore this unique characteristic further, using heuristic clustering to group Bitcoin wallets based on evidence of shared authority, and then using re-identification attacks (i.e., empirical purchasing of goods and services) to classify the operators of those clusters. From this analysis, we characterize longitudinal changes in the Bitcoin market, the stresses these changes are placing on the system, and the challenges for those seeking to use Bitcoin for criminal or fraudulent purposes at scale.


Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+

Social network structure and evolution has important implications for many aspects of network and system design including provisioning, bootstrapping trust and reputation systems via social networks, and defenses against Sybil attacks. Several recent results suggest that augmenting the social network structure with user attributes (e.g., location, employer, communities of inter- est) can provide a more fine-grained understanding of social networks. However, there have been few studies to provide a systematic understanding of these effects at scale.

We bridge this gap using a unique dataset collected as the Google+ social network grew over time since its release in late June 2011. We observe novel phenomena with respect to both standard social network metrics and new attribute-related metrics (that we define). We also observe interesting evolutionary patterns as Google+ went from a bootstrap phase to a steady invitation-only stage before a public release.

Based on our empirical observations, we develop a new generative model to jointly reproduce the social structure and the node attributes. Using theoretical analysis and empirical evaluations, we show that our model can accurately reproduce the social and attribute structure of real social networks. We also demonstrate that our model provides more accurate predictions for practical application contexts.


Mapping the Expansion of Google’s Serving Infrastructure

Modern content-distribution networks both provide bulk content and act as “serving infrastructure” for web services in order to reduce user-perceived latency. Serving infrastruc- tures such as Google’s are now critical to the online economy, making it imperative to understand their size, geographic distribution, and growth strategies. To this end, we develop techniques that enumerate IP addresses of servers in these infrastructures, find their geographic location, and identify the association between clients and clusters of servers. While general techniques for server enumeration and geolocation can exhibit large error, our techniques exploit the design and mechanisms of serving infrastructure to improve accuracy. We use the EDNS-client-subnet DNS extension to measure which clients a service maps to which of its serving sites. We devise a novel technique that uses this mapping to geolocate servers by combining noisy information about client loca- tions with speed-of-light constraints. We demonstrate that this technique substantially improves geolocation accuracy relative to existing approaches. We also cluster server IP ad- dresses into physical sites by measuring RTTs and adapting the cluster thresholds dynamically. Google’s serving infras- tructure has grown dramatically in the ten months, and we use our methods to chart its growth and understand its con- tent serving strategy. We find that the number of Google serving sites has increased more than sevenfold, and most of the growth has occurred by placing servers in large and small ISPs across the world, not by expanding Google’s backbone.


A Measurement-based Study of MultiPath TCP Performance over Wireless Networks

With the popularity of mobile devices and the pervasive use of cellular technology, there is widespread interest in hybrid networks and on how to achieve robustness and good per- formance from them. As most smart phones and mobile de- vices are equipped with dual interfaces (WiFi and 3G/4G), a promising approach is through the use of multi-path TCP, which leverages path diversity to improve performance and provide robust data transfers. In this paper we explore the performance of multi-path TCP in the wild, focusing on sim- ple 2-path multi-path TCP scenarios. We seek to answer the following questions: How much can a user benefit from using multi-path TCP over cellular and WiFi relative to using the either interface alone? What is the impact of flow size on average latency? What is the effect of the rate/route con- trol algorithm on performance? We are especially interested in understanding how application level performance is af- fected when path characteristics (e.g., round trip times and loss rates) are diverse. We address these questions by con- ducting measurements using one commercial Internet service provider and three major cellular carriers in the US.


3GOL: Power-boosting ADSL using 3G OnLoading

The co-existence of cellular and wired networks has been exploited almost exclusively in the direction of OffLoading traffic from the former onto the latter. In this paper we claim that there exist cases that call for the exact opposite, i.e., use the cellular network to assist a fixed wired network. In par- ticular, we show that by “OnLoading” traffic from the wired broadband network onto the cellular network we can usefully speedup wired connections, on the downlink or the uplink. We consider the technological challenges pertaining to this idea and implement a prototype 3G OnLoading service that we call 3GOL, that can be deployed by an operator provid- ing both the wired and cellular network services. By strate- gically OnLoading a fraction of the data transfers to the 3G network, one can significantly enhance the performance of particular applications. In particular we demonstrate non- trivial performance benefits of 3GOL to two widely used applications: video-on-demand and multimedia upload. We also consider the case when the operator that provides wired and cellular services is different, adding the analysis on eco- nomic constraints and volume cap on cellular data plans that need to be respected. Simulating 3GOL over a DSLAM trace we show that 3GOL can reduce video pre-buffering time by at least 20% for 50% of the users while respecting data caps and we design a simple estimator to compute the daily allowance that can be used towards 3GOL while re- specting caps. Our prototype is currently being piloted in 30 households in a large European city by a large network provider.


Bobtail: Avoiding Long Tails in the Cloud

Highly modular data center applications such as Bing, Facebook, and Amazon’s retail platform are known to be susceptible to long tails in response times. Services such as Amazon’s EC2 have proven attractive platforms for building similar applications. Unfortunately, virtualization used in such platforms exacerbates the long tail problem by factors of two to four. Surprisingly, we find that poor response times in EC2 are a property of nodes rather than the network, and that this property of nodes is both pervasive throughout EC2 and persistent over time. The root cause of this problem is co-scheduling of CPU-bound and latency-sensitive tasks. We leverage these observations in Bobtail, a system that proactively detects and avoids these bad neighboring VMs without significantly penalizing node instantiation. With Bobtail, common communication patterns benefit from reductions of up to 40% in 99.9th percentile response times.


CONGA: distributed congestion-aware load balancing for datacenters

We present the design, implementation, and evaluation of CONGA, a network-based distributed congestion-aware load balancing mechanism for datacenters. CONGA exploits recent trends including the use of regular Clos topologies and overlays for network virtualization. It splits TCP flows into flowlets, estimates real-time congestion on fabric paths, and allocates flowlets to paths based on feedback from remote switches. This enables CONGA to efficiently balance load and seamlessly handle asymmetry, without requiring any TCP modifications. CONGA has been implemented in custom ASICs as part of a new datacenter fabric. In testbed experiments, CONGA has 5x better flow completion times than ECMP even with a single link failure and achieves 2-8x better throughput than MPTCP in Incast scenarios. Further, the Price of Anarchy for CONGA is provably small in Leaf-Spine topologies; hence CONGA is nearly as effective as a centralized scheduler while being able to react to congestion in microseconds. Our main thesis is that datacenter fabric load balancing is best done in the network, and requires global schemes such as CONGA to handle asymmetry.



Schnellnavigation zur Seite über Nummerneingabe