Archive for the 'Measurement' Category

Dataset Comparison: IPv4 vs IPv6 traffic seen at the DNS Root Servers

Wednesday, October 1st, 2014 by Bradley Huffaker


As economic pressure imposed by IPv4 address exhaustion has grown, we seek methods to track deployment of IPv6, IPv4’s designated successor. We examine per-country allocation and deployment rates through the lens of the annual “Day in the Life of the Internet” (DITL) snapshots collected at the DNS roots by the DNS Operations, Analysis, and Research Center (DNS-OARC) from 2009 to 2014.

For more details of data sources and analysis, see:
http://www.caida.org/research/policy/dns-country/

DRoP: DNS-based Router Positioning

Saturday, September 6th, 2014 by Bradley Huffaker

As part of CAIDA’s ongoing research into Internet topology mapping, we have been working on improving our ability to geolocate backbone router infrastructure. Determining the physical locations of Internet routers is crucial for characterizing Internet infrastructure and understanding geographic pathways of global routing, as well as for creating more accurate geographic-based maps. Current commercial geolocation services focus predominantly on geolocating clients and servers, that is, edge hosts rather than routers in the core of the network.

Figure 1: The inputs and steps used by the DRoP process to generate hostname decoding rules.

In a recent paper, DRoP: DNS-based Router Positioning, we presented a new methodology for extracting and decoding geography-related strings from fully qualified domain names (DNS hostnames). We first compiled an extensive dictionary associating geographic strings (e.g., airport codes) with geophysical locations. We then searched a large set of router hostnames for these strings, assuming each autonomous naming domain uses geographic hints consistently within that domain. We used topology and performance data continually collected by our global measurement infrastructure to ascertain whether a given hint appears to co-locate the different hostnames in which it is found. Finally, we combined geolocation hints into domain-specific rule sets. We generated a total of 1,711 rules covering 1,398 different domains, and validated them using domain-specific ground truth we gathered for six domains. Unlike previous efforts that relied on labor-intensive domain-specific manual analysis, our process for inferring domain-specific heuristics is automated, representing a measurable advance in the state of the art of methods for geolocating Internet resources.
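The dictionary-lookup step described above can be illustrated with a minimal Python sketch. It is not the DRoP implementation: the three-entry dictionary, the `find_hints` function, and the tokenizing rule (split on dots and hyphens, strip digits) are all assumptions made for illustration, and the sketch omits the topology and performance checks that decide whether a hint is actually trustworthy within a domain.

```python
import re

# Hypothetical, tiny dictionary of geographic hints (IATA airport codes).
# A real dictionary would be far larger and cover other hint types.
GEO_HINTS = {
    "lax": "Los Angeles, US",
    "jfk": "New York, US",
    "fra": "Frankfurt, DE",
}

def find_hints(hostname):
    """Return (hint, location) pairs for dictionary strings found in a hostname."""
    hits = []
    # Split the hostname on dots and hyphens, then strip leading and
    # trailing digits so that a label such as "lax1" matches "lax".
    for token in re.split(r"[.\-]", hostname.lower()):
        stem = token.strip("0123456789")
        if stem in GEO_HINTS:
            hits.append((stem, GEO_HINTS[stem]))
    return hits

# Illustrative hostname under a reserved example domain.
print(find_hints("ae-1.r02.lax1.example.net"))
```

A candidate hint found this way is only a guess; a short string can match an airport code by coincidence, which is why the process described above checks hints against measurement data before turning them into rules.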

Figure 2: How users interact with DDec to decode hostnames.

In order to provide a public interface and gather feedback on our inferences, we have developed DDec. DDec allows users to decode individual hostnames, examine rulesets for individual domains, and provide feedback on rulesets. In addition to DRoP’s inferences, we have also included undns rules.

For more details please review the paper or the slides.

Under the Telescope: Time Warner Cable Internet Outage

Friday, August 29th, 2014 by Vasco Asturiano

In the early hours of August 27th 2014, Time Warner Cable (TWC) suffered a major Internet outage, which started around 9:30am and lasted until 11:00am UTC (4:30am-6:00am EST). According to Time Warner, this disconnect was caused by an issue with its Internet backbone during a routine network maintenance procedure.

A few sources have documented the outage based on BGP and/or active measurements, including Renesys and RIPE NCC. Here we present a view from passive traffic measurement, specifically from the UCSD Network Telescope, which continuously listens for Internet Background Radiation (IBR) traffic. IBR is a constantly changing mix of traffic caused by benign misconfigurations, bugs, malicious activity, scanning, responses to spoofed traffic (backscatter), etc.  In order to extract a signal usable for our inferences, we count the number of unique source IP addresses (in IBR observed from a certain AS or geographical area) that pass a series of filters. Our filters try to remove (i) spoofed traffic, (ii) backscatter, and (iii) ports/protocols that generate significant noise.
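As a rough illustration of this signal, the Python sketch below counts unique source addresses per minute after dropping filtered packets. It is an assumption-laden sketch, not the telescope's pipeline: the packet record layout, the precomputed `spoofed` and `backscatter` flags, and the `NOISY_PORTS` set are all hypothetical, since the actual filter definitions are not given here.

```python
from collections import defaultdict

# Hypothetical set of destination ports treated as too noisy to count.
NOISY_PORTS = {445}

def unique_sources_per_minute(packets, noisy_ports=NOISY_PORTS):
    """Count distinct source IPs per minute among packets that pass the filters.

    Each packet is a dict with keys: ts (epoch seconds), src, dst_port,
    spoofed, backscatter. The two boolean flags are assumed to have been
    set by an earlier classification step.
    """
    buckets = defaultdict(set)
    for p in packets:
        if p["spoofed"] or p["backscatter"] or p["dst_port"] in noisy_ports:
            continue
        minute = p["ts"] // 60 * 60  # start of the minute containing ts
        buckets[minute].add(p["src"])
    return {minute: len(srcs) for minute, srcs in sorted(buckets.items())}
```

Counting distinct sources rather than packets keeps a single very chatty host from dominating the signal, which matters when the goal is to infer how many hosts in a network can still reach the Internet.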

Most of TWC’s Autonomous Systems seem to have been affected during the time of the reported outage. Our indicators from the telescope show a total absence of traffic from TWC’s ASes, indicating a complete network outage.

Figure 1: Number of unique IBR source IPs (after filtering) observed per minute for the TWC ASes

Figure 1 shows the number of unique source IPs originated by TWC ASes per minute, as observed by the network telescope; we plot only TWC ASes from which there was any IBR traffic observed just before and after the event. For reference, these ASes are: AS7843, AS10796, AS11351, AS11426, AS11427, AS11955, AS12271 and AS20001.

TWC is a large Internet access provider in the United States, and this IBR signal can also reveal insight into the impact of this outage across the country. Figure 2 shows the same metric as Figure 1, but for source IPs across the entire country, indicating a drop of about 12% in the number of (filtered) IBR sources, which suggests that during the incident, a significant fraction of the US population lost Internet access.

Figure 2: Number of unique IBR source IPs (after filtering) observed in the US 

Drilling down to a regional level shows which US states seem to have suffered a larger relative drop in traffic.

Figure 3: Decrease ratio of unique IBR source IPs per US state 

Figure 3 compares the number of IBR sources observed in the 5-minute interval just before the incident (9:25-9:30 UTC) to the 5-minute interval just after it began (9:30-9:35 UTC). The yellow to red color gradient represents the ratio at which a certain state’s IBR sources have decreased (redder means larger drop). States that did not suffer a substantial relative decrease are shown in yellow. This geographical spread is likely correlated with market penetration of TWC connectivity within each state.
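One plausible way to compute such a per-state decrease ratio is sketched below in Python. The exact formula behind Figure 3 is not stated, so the relative-drop definition, the `decrease_ratio` function, and the clamping of increases to zero are assumptions; the region names and counts in the usage example are made up.

```python
def decrease_ratio(before, after):
    """Fractional drop in unique sources per region.

    before, after: dicts mapping region name to the number of unique
    sources seen in the interval before and after the event.
    Returns 0.0 for regions with no decrease and 1.0 for a total loss.
    """
    ratios = {}
    for region, n_before in before.items():
        if n_before == 0:
            continue  # no baseline, so the ratio is undefined
        n_after = after.get(region, 0)
        ratios[region] = max(0.0, (n_before - n_after) / n_before)
    return ratios

# Made-up counts for two hypothetical regions.
print(decrease_ratio({"region-a": 200, "region-b": 100},
                     {"region-a": 150, "region-b": 100}))
```

Normalizing by the pre-event count makes states of very different sizes comparable, which is what lets the map highlight relative rather than absolute losses.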

 

 

network mapping and measurement conference

Tuesday, May 28th, 2013 by kc

I had the honor of presenting an overview of CAIDA’s recent research activities at the Network Mapping and Measurement Conference hosted by Sean Warnick and Daniel Zappala. Talk topics included social learning behavior in complex networks, re-routing based on expected network outages along current paths, and Twitter data mining to analyze suicide risk factors and political sentiments (three different talks). James Allen Evans gave a sociology of science talk, an interview form of which seems to be archived by the Oxford Internet Institute. The organizers even arranged a talk from a local startup, NUVI, doing some fascinating real-time visualization and analytics of social network data (including Twitter, Facebook, Reddit, YouTube).

The workshop was held at Sundance, Utah, one of the most beautiful places I’ve ever been for a workshop. This workshop series was originally DoD-sponsored with lots of government attendees interested in Internet infrastructure protection, but sequester and travel freezes this year yielded only two USG attendees, and budget constraints may keep this workshop from happening again next year. I hope it does happen again; it was really a unique environment and exposed me to a range of work I would not otherwise have discovered anytime soon. Kudos to the organizers and sponsors.

Carna botnet scans confirmed

Monday, May 13th, 2013 by Alistair King

On March 17, 2013, the authors of an anonymous email to the “Full Disclosure” mailing list announced that last year they conducted a full probing of the entire IPv4 Internet. They claimed they used a botnet (named “carna” botnet) created by infecting machines vulnerable due to use of default login/password pairs (e.g., admin/admin). The botnet instructed each of these machines to execute a portion of the scan and then transfer the results to a central server. The authors also published a detailed description of how they operated, along with 9TB of raw logs of the scanning activity.

Online magazines and newspapers reported the news, which triggered some debate in the research community about the ethical implications of using such data for research purposes. A more fundamental question received less attention: since the authors went out of their way to remain anonymous, and the only data available about this event is the data they provide, how do we know this scan actually happened? If it did, how do we know that the resulting data is correct?

(more…)

2001:deba:7ab1:e::effe:c75

Tuesday, January 22nd, 2013 by Robert Beverly

[This blog entry is guest written by Robert Beverly at the Naval Postgraduate School.]

In many respects, the deployment, adoption, use, and performance of IPv6 has received more recent attention than IPv4. Certainly the longitudinal measurement of IPv6, from its infancy to the exhaustion of ICANN v4 space to native 1% penetration (as observed by Google), is more complete than IPv4. Indeed, there are many parties with a vested interest in either the success or failure of IPv6, and numerous IPv6 measurement efforts afoot.

Researchers from Akamai, CAIDA, ICSI, NPS, and MIT met in early January 2013, first to share and make sense of current measurement initiatives, and second to plot a path forward for the community in measuring IPv6. A specific objective of the meeting was to understand which aspects of IPv6 measurement are “done” (in the sense that there exists a sound methodology, even if measurement should continue), and which IPv6 questions/measurements remain open research problems. The meeting agenda and presentation slides are archived online.

(more…)

Packet Loss Metrics from Darknet Traffic

Thursday, January 17th, 2013 by Karyn Benson

At the CoNEXT Student Workshop, in Nice, France on December 10, 2012, CAIDA shared recent research on Internet outages in a poster entitled “Gaining Insight Into AS-Level Outages through Analysis of Internet Background Radiation.”

(more…)

Syria disappears from the Internet

Wednesday, December 5th, 2012 by Alistair King and Alberto Dainotti

On the 29th of November, shortly after 10am UTC (12pm Damascus time), the Syrian state telecom (AS29386) withdrew the majority of BGP routes to Syrian networks (see reports from Renesys, Arbor, CloudFlare, BGPmon). Five prefixes allocated to Syrian organizations remained reachable for another several hours, served by Tata Communications. By midnight UTC on the 29th, as reported by BGPmon, these five prefixes had also been withdrawn from the global routing table, completing the disconnection of Syria from the rest of the Internet.

(more…)

CAIDA at the NSF Secure and Trustworthy Cyberspace (SaTC) Principal Investigators’ Meeting

Tuesday, December 4th, 2012 by Alberto Dainotti

Last week CAIDA researchers (Alberto and kc) visited National Harbor (Maryland) for the 1st NSF Secure and Trustworthy Cyberspace (SaTC) Principal Investigators’ Meeting. The National Science Foundation’s SaTC program is an interdisciplinary expansion of the old Trustworthy Computing program sponsored by CISE, extended to include the SBE, MPS, and EHR directorates. The SaTC program also includes a bold new Transition to Practice category of project funding, which addresses the challenge of moving from research to capability, and which we are excited and honored to be a part of.

(more…)

Twelve Years in the Evolution of the Internet Ecosystem

Tuesday, April 10th, 2012 by Amogh Dhamdhere

Our recent study of the evolution of the Internet ecosystem over the last twelve years (1998-2010) appeared in the IEEE/ACM Transactions on Networking in October 2011. Why is the Internet an ecosystem? The Internet, commonly described as a network of networks, consists of thousands of Autonomous Systems (ASes) of different sizes, functions, and business objectives that interact to provide the end-to-end connectivity that end users experience. ASes engage in transit (or customer-provider) relations, and also in settlement-free peering relations. These relations, which appear as interdomain links in an AS topology graph, indicate the transfer of not only traffic but also economic value between ASes. The Internet AS ecosystem is highly dynamic, experiencing growth (birth of new ASes), rewiring (changes in the connectivity of existing ASes), as well as deaths (of existing ASes). The dynamics of the AS ecosystem are determined both by external business environment factors (such as the state of the global economy or the popularity of new Internet applications) and by complex incentives and objectives of each AS. Specifically, ASes attempt to optimize their utility or financial gains by dynamically changing, directly or indirectly, the ASes they interact with.

The goal of our study was to better understand this complex ecosystem, the behavior of entities that constitute it (ASes), and the nature of interactions between those entities (AS links). How has the Internet ecosystem been growing? Is growth a more significant factor than rewiring in the formation of new links? Is the population of transit providers increasing (implying diversification of the transit market) or decreasing (consolidation of the transit market)? As the Internet grows in its number of nodes and links, does the average AS-path length also increase? Which ASes engage in aggressive multihoming? Which ASes are especially active, i.e., constantly adjust their set of providers? Are there regional differences in how the Internet evolves?
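To make the growth-versus-rewiring question concrete, here is a minimal Python sketch that compares two AS-link snapshots. It is an illustration under stated assumptions, not the study's method: the `compare_snapshots` function, the representation of links as unordered AS pairs, and the rule that a new link counts as growth when it touches a newly born AS are all choices made for this example. The AS numbers used are from the private-use range.

```python
def compare_snapshots(old_links, new_links):
    """Classify changes between two sets of AS links.

    Each link is a frozenset of two AS numbers (direction is ignored).
    Only ASes that appear in at least one link are counted as present.
    """
    old_ases = {asn for link in old_links for asn in link}
    new_ases = {asn for link in new_links for asn in link}
    born = new_ases - old_ases
    died = old_ases - new_ases
    added = new_links - old_links
    # Assumption: a new link is "growth" if either endpoint is a new AS,
    # and "rewiring" if both endpoints already existed.
    growth = {link for link in added if link & born}
    rewiring = added - growth
    return {"born": born, "died": died,
            "growth_links": growth, "rewiring_links": rewiring}

def link(a, b):
    return frozenset((a, b))

old = {link(64512, 64513), link(64513, 64514), link(64514, 64515)}
new = {link(64512, 64513), link(64513, 64514),
       link(64512, 64514), link(64513, 64516)}
print(compare_snapshots(old, new))
```

Comparing the sizes of `growth_links` and `rewiring_links` across successive snapshots is one simple way to ask which of the two processes contributes more new links over time.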

(more…)