Contents
Topics for the Seminar on Internet Measurement, SS 2013
- Analysis of a "/0" Stealth Scan from a Botnet
- Classifying Internet One-way Traffic
- Content delivery and the natural evolution of DNS: remote dns trends, performance issues and alternative solutions
- Longtime Behavior of Harvesting Spam Bots
- Measuring the Deployment of IPv6: Topology, Routing and Performance
- Mitigating Sampling Error when Measuring Internet Client IPv6 Capabilities
- Obtaining In-Context Measurements of Cellular Network Performance
- Anatomy of a Large European IXP
- Deadline-Aware Datacenter TCP (D2TCP)
- Measuring and Fingerprinting Click-Spam in Ad Networks
- Tracking Millions of Flows in High Speed Networks for Application Identification
- Unreeling Netflix: Understanding and Improving Multi-CDN Movie Delivery
Analysis of a "/0" Stealth Scan from a Botnet
Botnets are the most common vehicle of cyber-criminal activity. They are used for spamming, phishing, denial of service attacks, brute-force cracking, stealing private information, and cyber warfare. Botnets carry out network scans for several reasons, including searching for vulnerable machines to infect and recruit into the botnet, probing networks for enumeration or penetration, etc. We present the measurement and analysis of a horizontal scan of the entire IPv4 address space conducted by the Sality botnet in February of last year. This 12-day scan originated from approximately 3 million distinct IP addresses, and used a heavily coordinated and unusually covert scanning strategy to try to discover and compromise VoIP-related (SIP server) infrastructure. We observed this event through the UCSD Network Telescope, a /8 darknet continuously receiving large amounts of unsolicited traffic, and we correlate this traffic data with other public sources of data to validate our inferences. Sality is one of the largest botnets ever identified by researchers. Its behavior represents ominous advances in the evolution of modern malware: the use of more sophisticated stealth scanning strategies by millions of coordinated bots, targeting critical voice communications infrastructure. This work offers a detailed dissection of the botnet’s scanning behavior, including general methods to correlate, visualize, and extrapolate botnet behavior across the global Internet.
A. Dainotti, A. King, K. Claffy, F. Papale, and A. Pescapè, in Internet Measurement Conference (IMC), Nov 2012.
Classifying Internet One-way Traffic
In this work we analyze a massive data-set that
captures 5.23 petabytes of traffic to shed light on the
composition of one-way traffic towards a large network based on a
novel one-way traffic classifier. We find that one-way traffic
makes up a very large fraction of all traffic in terms of flows,
can be primarily attributed to malicious causes, and has
declined since 2004 because of a relative decrease in scan traffic.
In addition, we show how our classifier is useful for detecting
Eduard Glatz, Xenofontas Dimitropoulos, in Internet Measurement Conference (IMC), Nov 2012.
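The abstract does not describe the classifier itself. As a minimal sketch of only the preliminary step, separating one-way flows (flows for which no reply in the opposite direction is seen) from two-way flows, something like the following could be used. The `Flow` record, its field names, and the sample addresses are hypothetical, and the sketch ignores time windows and flow timeouts that a real measurement would need.

```python
from collections import namedtuple

# Hypothetical flow record: a plain 5-tuple, no timestamps or byte counts.
Flow = namedtuple("Flow", "src dst sport dport proto")


def reverse_key(flow):
    """Return the 5-tuple a reply to `flow` would carry."""
    return Flow(flow.dst, flow.src, flow.dport, flow.sport, flow.proto)


def split_one_way(flows):
    """Split flows into (one_way, two_way) by looking for a reverse flow."""
    seen = set(flows)
    one_way = [f for f in flows if reverse_key(f) not in seen]
    two_way = [f for f in flows if reverse_key(f) in seen]
    return one_way, two_way


# Illustrative input: one answered web request and one unanswered probe.
flows = [
    Flow("10.0.0.1", "192.0.2.7", 40000, 80, "tcp"),
    Flow("192.0.2.7", "10.0.0.1", 80, 40000, "tcp"),
    Flow("198.51.100.9", "192.0.2.8", 51515, 23, "tcp"),
]
one_way, two_way = split_one_way(flows)
```

Attributing the one-way flows to causes such as scanning is the part the paper contributes and is not attempted here.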
Content delivery and the natural evolution of DNS: remote dns trends, performance issues and alternative solutions
Content Delivery Networks (CDNs) rely on the
Domain Name System (DNS) for replica server selection. DNS-based
server selection builds on the assumption that, in the absence of
information about the client's actual network location, the
location of a client's DNS resolver provides a good approximation.
The recent growth of remote DNS services breaks this assumption
and can negatively impact clients' web performance.
In this paper, we assess the end-to-end impact of using remote DNS services on CDN performance and present the first evaluation of an industry-proposed solution to the problem. We find that remote DNS usage can indeed significantly impact clients' web performance and that the proposed solution, if available, can effectively address the problem for most clients. Considering the performance cost of remote DNS usage and the limited adoption base of the industry-proposed solution, we present and evaluate an alternative approach, Direct Resolution, to readily obtain comparable performance improvements without requiring CDN or DNS participation.
John S. Otto, Mario A. Sánchez, John P. Rula, and Fabián E. Bustamante, in Internet Measurement Conference (IMC), Nov 2012.
Longtime Behavior of Harvesting Spam Bots
Our observations suggest that simple obfuscation methods are still efficient for protecting addresses from being harvested. A key finding is that search engines are used as proxies, either to hide the identity of the harvester or to optimize the harvesting process.
Oliver Hohlfeld, Thomas Graf, and Florin Ciucu, in Internet Measurement Conference (IMC), Nov 2012.
Measuring the Deployment of IPv6: Topology, Routing and Performance
We use historical BGP data and recent active
measurements to analyze trends in the growth, structure, dynamics
and performance of the evolving IPv6 Internet, and compare them to
the evolution of IPv4. We find that the IPv6 network is maturing,
albeit slowly. While most core Internet transit providers have
deployed IPv6, edge networks are lagging. Early IPv6 network
deployment was stronger in Europe and the Asia-Pacific region
than in North America. Current IPv6 network deployment still shows
the same pattern. The IPv6 topology is characterized by a single
dominant player -- Hurricane Electric -- which appears in a large
fraction of IPv6 AS paths, and is more dominant in IPv6 than the
most dominant player in IPv4. Routing dynamics in the IPv6
topology are largely similar to those in IPv4, and churn in both
networks grows at the same rate as the underlying topologies. Our
measurements suggest that performance over IPv6 paths is
comparable to that over IPv4 paths if the AS-level paths are the
same, but can be much worse than IPv4 if the AS-level paths differ.
Amogh Dhamdhere, Matthew Luckie, Bradley Huffaker, kc claffy, Ahmed Elmokashfi, Emile Aben, in Internet Measurement Conference (IMC), Nov 2012.
Mitigating Sampling Error when Measuring Internet Client IPv6 Capabilities
Despite the predicted exhaustion of
unallocated IPv4 addresses between 2012 and 2014, it remains
unclear how many current clients can use its successor, IPv6, to
access the Internet. We propose a refinement of previous
measurement studies that mitigates intrinsic measurement biases,
and demonstrate a novel web-based technique using Google ads to
perform IPv6 capability testing on a wider range of clients. After
applying our sampling error reduction, we find that 6% of
world-wide connections are from IPv6-capable clients, but only
1-2% of connections preferred IPv6 in dual-stack (dual-stack
failure rates less than 1%). Except for an uptick around IPv6-day
2011 these proportions were relatively constant, while the
percentage of connections with IPv6-capable DNS resolvers has
increased to nearly 60%. The percentage of connections from clients
with native IPv6 using happy eyeballs has risen to over 20%.
Sebastian Zander, Lachlan L.H. Andrew, Grenville Armitage, Geoff Huston, George Michaelson, in Internet Measurement Conference (IMC), Nov 2012.
Obtaining In-Context Measurements of Cellular Network Performance
Network service providers, and other parties,
require an accurate understanding of the performance cellular
networks deliver to users. In particular, they often seek a
measure of the network performance users experience solely when
they are interacting with their device, a measure we call
in-context. Acquiring such measures is challenging due to the many
factors, including time and physical context, that influence
cellular network performance. This paper makes two contributions.
First, we conduct a large scale measurement study, based on data
collected from a large cellular provider and from hundreds of
controlled experiments, to shed light on the issues underlying
in-context measurements. Our novel observations show that
measurements must be conducted on devices which (i) recently used
the network as a result of user interaction with the device, (ii)
remain in the same macro-environment (e.g., indoors and stationary),
and in some cases the same micro-environment (e.g., in the user's
hand), during the period between normal usage and a subsequent
measurement, and (iii) are currently sending/receiving little or no
user-generated traffic. Second, we design and deploy a prototype
active measurement service for Android phones based on these key
insights. Our analysis of 1650 measurements gathered from 12
volunteer devices shows that the system is able to obtain average
throughput measurements that accurately quantify the performance
experienced during times of active device and network usage.
Aaron Gember, Aditya Akella, Jeffrey Pang, Alexander Varshavsky, Ramon Caceres, in Internet Measurement Conference (IMC), Nov 2012.
Anatomy of a Large European IXP
The largest IXPs
carry on a daily basis traffic volumes in the petabyte range,
similar to what some of the largest global ISPs reportedly handle.
This little-known fact is due to a few hundred member ASes
exchanging traffic with one another over the IXP's infrastructure.
This paper reports on a first-of-its-kind and in-depth analysis of
one of the largest IXPs worldwide based on nine months' worth of
sFlow records collected at that IXP in 2011.
A main finding of our study is that the number of actual peering links at this single IXP exceeds the number of total AS links of the peer-peer type in the entire Internet known as of 2010! To explain such a surprisingly rich peering fabric, we examine in detail this IXP's ecosystem and highlight the diversity of networks that are members at this IXP and connect there with other member ASes for reasons that are similarly diverse, but can be partially inferred from their business types and observed traffic patterns. In the process, we investigate this IXP's traffic matrix and illustrate what its temporal and structural properties can tell us about the member ASes that generated the traffic in the first place. While our results suggest that these large IXPs can be viewed as a microcosm of the Internet ecosystem itself, they also argue for a re-assessment of the mental picture that our community has about this ecosystem.
Bernhard Ager, Nikolaos Chatzis, Anja Feldmann, Nadi Sarrar, Steve Uhlig, Walter Willinger, in ACM SIGCOMM, Aug 2012.
Deadline-Aware Datacenter TCP (D2TCP)
An important class of datacenter
applications, called Online Data-Intensive (OLDI) applications,
includes Web search, online retail, and advertisement. To achieve
good user experience, OLDI applications operate under
soft-real-time constraints (e.g., 300 ms latency) which imply
deadlines for network communication within the applications.
Further, OLDI applications typically employ tree-based algorithms
which, in the common case, result in bursts of children-to-parent
traffic with tight deadlines. Recent work on datacenter network
protocols is either deadline-agnostic (DCTCP) or is deadline-aware
(D3) but suffers under bursts due to race conditions. Further, D3
has the practical drawbacks of requiring changes to the switch
hardware and not being able to coexist with legacy TCP. We propose
Deadline-Aware Datacenter TCP (D2TCP), a novel transport protocol,
which handles bursts, is deadline-aware, and is readily
deployable. In designing D2TCP, we make two contributions: (1)
D2TCP uses a distributed and reactive approach for bandwidth
allocation which fundamentally enables D2TCP's properties. (2)
D2TCP employs a novel congestion avoidance algorithm, which uses
ECN feedback and deadlines to modulate the congestion window via a
gamma-correction function. Using a small-scale implementation and
at-scale simulations, we show that D2TCP reduces the fraction of
missed deadlines compared to DCTCP and D3 by 75% and 50%, respectively.
Balajee Vamanan, Jahangir Hasan, T.N. Vijaykumar, in ACM SIGCOMM, Aug 2012.
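The abstract names the gamma-correction function without defining it. As a rough sketch, assuming the commonly described form in which the DCTCP-style congestion estimate alpha is raised to a deadline-imminence factor d to give a penalty p = alpha^d, the window update could look like the following. The function names, the gain g = 1/16, and the clamping bounds 0.5 and 2.0 are assumptions made for illustration, not values taken from the abstract.

```python
def update_alpha(alpha, marked_fraction, g=1.0 / 16):
    """Moving average of the fraction of ECN-marked packets (DCTCP-style)."""
    return (1 - g) * alpha + g * marked_fraction


def deadline_factor(time_needed, time_to_deadline, lo=0.5, hi=2.0):
    """Deadline imminence d: larger when the deadline is closer.

    Assumed form: time still needed divided by time remaining, clamped.
    """
    if time_to_deadline <= 0:
        return hi
    return min(hi, max(lo, time_needed / time_to_deadline))


def next_cwnd(cwnd, alpha, d, marked):
    """One congestion-window update using the penalty p = alpha ** d."""
    if not marked:
        return cwnd + 1  # no congestion signal: additive increase
    penalty = alpha ** d  # alpha <= 1, so a larger d gives a smaller penalty
    return cwnd * (1 - penalty / 2)


# Same congestion level, different deadlines (illustrative numbers).
near = next_cwnd(100, 0.5, 2.0, marked=True)  # close to its deadline
far = next_cwnd(100, 0.5, 0.5, marked=True)   # far from its deadline
```

With alpha = 0.5, the flow close to its deadline (d = 2) gives up 12.5% of its window, while the flow far from its deadline (d = 0.5) gives up about 35%; d = 1 reduces the window by alpha/2, the DCTCP-style cut.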
Measuring and Fingerprinting Click-Spam in Ad Networks
Advertising plays a vital role in supporting free
websites and smartphone apps. Click-spam, i.e., fraudulent or
invalid clicks on online ads where the user has no actual interest
in the advertiser's site, results in advertising revenue being
misappropriated by click-spammers. While ad networks take active
measures to block click-spam today, the effectiveness of these
measures is largely unknown. Moreover, advertisers and third
parties have no way of independently estimating or defending
In this paper, we take the first systematic look at click-spam. We propose the first methodology for advertisers to independently measure click-spam rates on their ads. We also develop an automated methodology for ad networks to proactively detect different simultaneous click-spam attacks. We validate both methodologies using data from major ad networks. We then conduct a large-scale measurement study of click-spam across ten major ad networks and four types of ads. In the process, we identify and perform in-depth analysis on seven ongoing click-spam attacks not blocked by major ad networks at the time of this writing. Our findings highlight the severity of the click-spam problem, especially for mobile ads.
Vacha Dave, Saikat Guha, Yin Zhang, in ACM SIGCOMM, Aug 2012.
Tracking Millions of Flows in High Speed Networks for Application Identification
applications exhibit increased diversity, while the Internet
routers are still oblivious to this trend. To improve the
end-to-end application QoS, one solution is to embed the application
information explicitly in packet headers, but it will bring global
changes. Another local solution is router-assisted traffic
differentiation. To achieve this, the functionalities including
packet identification and flow tracking inside the router are
required. While most existing studies focus on the former, less
effort has been put into the latter. Given that a large flow table is involved,
how to track millions of concurrent flows in a cost-effective manner
on a router's line card raises a great space-time challenge. To
address this, we design an on-chip/off-chip flow tracking system to
accommodate millions of flows and achieve the throughput at tens of
Gigabits. By exploiting temporal locality and heavy-tailedness of
Layer-4 traffic, we design the Adaptive Least Frequently Evicted
(ALFE) replacement policy to catch elephant flows and thereby
maintain a high cache hit rate. To alleviate the performance penalty due
to cache misses, we organize the flow table in a fixed-allocated
manner to fully utilize modern DRAM's burst feature. We have
implemented a research prototype using FPGA for performance
evaluation. The experiment results show that our system can reach
80% hit rate with a small-sized cache of 16K entries, while
achieving 70Mpps throughput. This enables backbone line rate
processing. Further, more than 40% power saving can be achieved by
our system, which is fast and accurate with only 3% FPGA resource
Tian Pan, Xiaoyu Guo, Chenhui Zhang, Junchen Jiang, Hao Wu, Bin Liu, in IEEE Infocom, Mar 2012.
Unreeling Netflix: Understanding and Improving Multi-CDN Movie Delivery
Netflix is the leading provider of
on-demand Internet video streaming in the US and Canada,
accounting for 29.7% of the peak downstream traffic in the US.
Understanding the Netflix architecture and its performance can
shed light on how to best optimize its design as well as on the
design of similar on-demand streaming services. In this paper, we
perform a measurement study of Netflix to uncover its architecture
and service strategy. We find that Netflix employs a blend of data
centers and Content Delivery Networks (CDNs) for content
distribution. We also perform active measurements of the three
CDNs employed by Netflix to quantify the video delivery bandwidth
available to users across the US. Finally, as improvements to
Netflix's current CDN assignment strategy, we propose a
measurement-based adaptive CDN selection strategy and a
multiple-CDN-based video delivery strategy, and demonstrate their
potential to significantly increase users' average bandwidth.
Vijay Kumar Adhikari, Yang Guo, Fang Hao, Matteo Varvello, Volker Hilt, Moritz Steiner, Zhi-Li Zhang, in IEEE Infocom, Mar 2012.
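The abstract does not spell out the proposed selection strategy. As a deliberately simple sketch of the general idea of measurement-based CDN selection, assuming a client that has a few recent bandwidth samples per CDN and simply prefers the CDN with the highest average, one could write the following. The function name, the CDN labels, and the sample values are hypothetical.

```python
def pick_cdn(bandwidth_samples):
    """Return the CDN with the highest mean measured bandwidth.

    bandwidth_samples maps a CDN label to a list of recent measurements
    (for example in Mbit/s). CDNs without samples are ignored.
    """
    averages = {
        cdn: sum(samples) / len(samples)
        for cdn, samples in bandwidth_samples.items()
        if samples
    }
    if not averages:
        raise ValueError("no bandwidth measurements available")
    return max(averages, key=averages.get)


# Illustrative measurements for three placeholder CDNs.
samples = {"cdn_a": [10.0, 12.0], "cdn_b": [20.0, 18.0], "cdn_c": [5.0]}
best = pick_cdn(samples)
```

An adaptive variant would refresh the samples periodically and re-run the selection; how often to probe and how to weight old samples are design choices left open here.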
Teaching, WS 2012/13
- Network Protocols and Architectures (VL+UE) 
- NA: Internet Routing (SE) 
- Meshlab (PR) 
0432 L 822
Lecturer: Anja Feldmann et al.
Location: TEL 1118/19
From 19.10.2012, 16:00
19.10.2012: Preparatory meeting. The dates for the seminar itself will be fixed later.