Archive for the 'Topology' Category

What’s in a Ranking? comparing Dyn’s Baker’s Dozen and CAIDA’s AS Rank

Thursday, July 2nd, 2015 by Bradley Huffaker

The Internet infrastructure is composed of thousands of independent networks (Autonomous Systems, or ASes) that engage in typically voluntary bilateral interconnection (“peering”) agreements to provide reachability to each other. Underlying these peering relationships are business relationships between networks, although whether, and how much, money ASes exchange when they interconnect is not generally published. Some of these business relationships are relatively easy to infer with a high degree of confidence using a basic economic assumption that commercial providers do not give away traffic transit services (i.e., route announcements) for free.

For several years CAIDA has used publicly available BGP data to infer business relationships among ASes and, consequently, rank Autonomous Systems based on a measure of their influence in the global routing system, specifically the size of their customer cone. (An AS’s customer cone is the set of ASes, IPv4 prefixes, or IPv4 addresses that the AS can reach via its customers, i.e., by crossing only customer links.) The methodology behind our ranking is described in detail in our IMC2013 paper (“AS Relationships, Customer Cones, and Validation”). By default, CAIDA’s AS Rank sorts by the number of other ASes in each AS’s customer cone (an AS granularity), but the AS Rank web interface also supports sorting by the number of IPv4 prefixes or IPv4 addresses observed in each AS’s customer cone (which the web interface calls prefix or IP address granularities).
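The customer cone definition above can be illustrated with a minimal sketch. It assumes the provider-to-customer links have already been inferred and simply follows them transitively, which is a simplification and not necessarily the procedure used in the IMC 2013 paper. The function names, the toy AS numbers, and the choice to count an AS as a member of its own cone are all assumptions made for illustration.

```python
from collections import defaultdict


def customer_cones(p2c_links):
    """Map each AS to the set of ASes reachable by crossing only customer links.

    p2c_links: iterable of (provider, customer) AS-number pairs.
    Assumption: each AS is counted as a member of its own cone.
    """
    customers = defaultdict(set)
    ases = set()
    for provider, customer in p2c_links:
        customers[provider].add(customer)
        ases.update((provider, customer))

    cones = {}
    for asn in ases:
        cone, stack = {asn}, [asn]
        while stack:  # depth-first walk down the customer links
            for cust in customers[stack.pop()]:
                if cust not in cone:
                    cone.add(cust)
                    stack.append(cust)
        cones[asn] = cone
    return cones


def rank_by_cone_size(cones):
    """Order ASes by decreasing cone size (AS granularity), ties by AS number."""
    return sorted(cones, key=lambda asn: (-len(cones[asn]), asn))


# Toy topology: AS1 sells transit to AS2 and AS3, which both sell to AS4.
toy_links = [(1, 2), (1, 3), (2, 4), (3, 4)]
toy_cones = customer_cones(toy_links)
toy_order = rank_by_cone_size(toy_cones)
```

Ranking at prefix or address granularity would replace `len(cones[asn])` with the number of prefixes or addresses originated by the ASes in the cone.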

Other organizations also provide rankings of ASes; the most well-known is Dyn’s IP Transit Intelligence AS ranking. Since both CAIDA’s and Dyn’s rankings aim to use a metric that reflects some notion of “predominant role in the global Internet routing system”, we have received several inquiries on how our ranking methodology and results differ from theirs. In this essay we try to answer this question to the best of our ability, acknowledging that their methodology is proprietary and we do not know exactly what they are doing beyond what they have released publicly. This 2013 MENOG presentation (Dyn bought the Renesys company in 2014) states that their ranking is based on quantity of transited IP space, so the closest possible comparison to what we currently do would be to compare their ranking with our IP-address-based customer cone ranking (which is not currently our default). For this exercise we will compare CAIDA’s 1st January 2015 AS ranking by customer cone with the chronologically last value on Dyn’s 2014 Baker’s Dozen, which is based on data observed around the same date.

Dyn’s web site provides the following image showing their rankings throughout 2014:

[Image: Dyn Baker’s Dozen 2014, all rankings]

In order to compare not only the computed ranking, but also the values of the metrics being ranked (i.e., transited IPv4 space vs. number of addresses in customer cone), we create a mapping between the two spaces. Dyn does not put numbers on their y-axis, and they plot only the top 13 ranked ASes, so we do not know the range of y-values represented. In order to make the comparison possible, we will (make a leap of faith and) assume that the top thirteen ranked ASes for each metric cover roughly the same range of values. (We caution that this assumption may be unjustified and are trying to validate it with Dyn.) So we map the top-ranked AS in Dyn (Level 3 AS3356) to the top-ranked AS in CAIDA (also Level 3 AS3356), and map the 13th-ranked AS in Dyn (Hurricane AS6939) to the 13th-ranked AS in CAIDA (Korea Telecom AS4766). These upper and lower thresholds result in the following mapping between the transited IPv4 space and the number of IPv4 addresses in the customer cone:

ASdyn_i.dyn_y = (ASdyn_i.transit_ip - ASdyn_13.transit_ip) / (AScaida_0.num_addresses - AScaida_13.num_addresses) + AScaida_13.num_addresses

Dyn vs CAIDA's AS Ranking
An AS’s rank is based on the number of ASes with a value (of the ranked metric) greater than the given AS. CAIDA’s 8th, 11th, and 13th ranked ASes are gray because we do not know their Dyn ranking.
Hilbert map visualization showing utilization of the IPv4 address space, rendered in two dimensions using a space-filling continuous fractal Hilbert curve of order 12. Each pixel in the full-resolution image represents a /24 block; red indicates used blocks, green unassigned blocks, and blue RFC special blocks. Routed unused blocks are grey and unrouted assigned blocks are black.

Although their order changes, the top nine ASes are the same in both rankings. Three of Dyn’s top-ranked ASes, namely China Telecom (AS4134), Beyond (AS3491), and Level 3 (AS3549), are not in CAIDA’s top 14 ranked ASes; instead CAIDA’s top 14 includes AT&T (AS7018), Deutsche Telekom (AS3320), and Korea Telecom (AS4766). Some of this discrepancy can be explained by Dyn’s curation of the data, including “dealing with anomalies, discounting pre-CIDR allocations, ignoring short-lived announcements, counting remaining prefixes (non-linearly) based on size (/8 – /24 only), etc”. We assume these heuristics aim to make the number of transited addresses a closer approximation to the amount of transited traffic, which Dyn suggests is the more interesting ranking (in the same 2013 MENOG presentation).

We agree with Dyn that the number of IP addresses is not representative of traffic, and have always emphasized that we are not in a position to rank ASes by traffic transited. Not only is there huge variation in traffic to/from different IP addresses (e.g., home users versus popular web servers), but many announced IP addresses are not even assigned to any hosts. In an October 2013 study, CAIDA researchers found that of the 10.4M /24 address blocks announced in that month, only 5.3M (51%) were observed sending traffic (these “used” address blocks are shown as red in the Hilbert map on the right). This observation suggests another arguably more meaningful (but computationally expensive) method to rank ASes: normalizing by the amount of observably actively used address space.

July, August, and September 2013



Since we do not yet have census information for January 2015, we use July, August, and September 2013 usage data to compare Dyn’s 2013 ranking with CAIDA’s AS ranking weighted by the number of observably used /24 IPv4 prefixes in the customer cone. (A /24 is defined as “used” if the census observed it as in use.)

The results of this ranking by “observably used IPv4 address /24 blocks”-based customer cone (i.e., the number of apparently used /24 blocks in an AS’s customer cone) look more similar to the Dyn rankings, consistent with the fact that this method of calculating customer cones accounts for some of the effect Dyn captures by discounting pre-CIDR blocks, which are less likely to be fully utilized.

Dyn vs CAIDA's AS Ranking
An AS’s ranking is based on the number of ASes with a value greater than the given AS. CAIDA’s 8th, 12th, and 13th ranked ASes are colored gray because we do not know their Dyn ranking.
                  2015              2013
                  dyn    address    address  used   dyn
2015  dyn         1.00   0.82       0.83     0.86   0.82
2015  address     0.82   1.00       0.74     0.66   0.49
2013  address     0.83   0.74       1.00     0.96   0.86
2013  used        0.86   0.66       0.96     1.00   0.90
2013  dyn         0.82   0.49       0.86     0.90   1.00

We computed the Pearson correlation coefficient between the results of each pair of ranking methods. A value of 1 indicates perfect correlation, i.e., the two systems produce identical rankings. A value of 0 indicates no linear correlation between the two rankings. Apart from the comparison of each ranking with itself, which by definition produces 1.00, the two most similar rankings from different sources are Dyn’s 2013 transit addresses and CAIDA’s 2013 used /24s, with a correlation of 0.90.
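As a rough illustration of the computation described above, the sketch below derives ranks from metric values and computes a Pearson coefficient between two lists. Several points are assumptions: the text does not say whether the coefficient was computed over ranks or over the underlying metric values, the rank is taken here as one plus the number of ASes with a strictly greater value, and the function names are hypothetical.

```python
import math


def ranks(values):
    """Rank of each entry: 1 + number of entries with a strictly greater value.

    Entries with equal values therefore share the same rank.
    """
    return [1 + sum(other > v for other in values) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Each cell of the table would correspond to one call such as `pearson(ranks(metric_a), ranks(metric_b))`, where the two lists hold values for the same ASes in the same order.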

This approach improves the correlation between Dyn’s and CAIDA’s ranking (e.g., the Pearson correlation coefficient increases from 0.82 to 0.90, see Table), but it amplifies the dominance of the top-ranked AS (Level 3 AS3356) for CAIDA’s census-derived customer cone ranking.

If we correlate how the rankings have changed over the last two years — which we cannot do for the census-based ranking since we only have 2013 data — we find that Dyn’s ranking showed greater consistency (a correlation between the 2013 and 2015 rankings of 0.82 compared with CAIDA’s 0.74), perhaps due to their data curation process.

In summary, CAIDA’s IPv4 address-based customer cone and Dyn’s transited IPv4 address space roughly agree on the top ASes, although their relative weighting diverges.


CAIDA Delivers More Data To the Public

Wednesday, February 12th, 2014 by Paul Hick

As part of our mission to foster a collaborative research environment in which data can be acquired and shared, CAIDA has developed a framework that promotes wide dissemination of our datasets to researchers. We classify a dataset as either public or restricted based on a consideration of privacy issues involved in sharing it, as described in our data sharing framework document Promotion of Data Sharing (http://www.caida.org/data/sharing/).

Public datasets are available for download from our public dataserver (http://data.caida.org) subject to conditions specified in our Acceptable Use Agreement (AUA) for public data (http://www.caida.org/home/legal/aua/public_aua.xml). CAIDA provides access to restricted datasets conditionally to qualifying researchers of academic and CAIDA-member institutions agreeing to a more restrictive AUA (http://www.caida.org/home/legal/aua/).

In January 2014 we reviewed our collection of datasets in order to re-evaluate their classification. As a result, as of February 1, we have converted several popular restricted CAIDA datasets into public datasets, including most of one of our largest and most popular data collections: topology data from the (now retired) skitter measurement infrastructure (operational between 1998 and 2008), and its successor, the Archipelago (or Ark) infrastructure (operational since September 2007). We have now made all IPv4 measurements older than two years (which includes all skitter data) publicly available. In addition to the raw data, this topology data includes derived datasets such as the Internet Topology Data Kits (ITDKs). Further, to encourage research on IPv6 deployment, we made our IPv6 Ark topology and performance measurements, from December 2008 up to the present, publicly available as a whole. We have added these new public data to the existing category of public datasets, which includes AS links data inferred from traceroute measurements taken by skitter and Ark platforms.

Several other datasets remain under consideration for public release, so stay tuned. For an overview of all datasets currently provided by CAIDA (both public and restricted) see our data overview page (http://www.caida.org/data/overview/).

Support for this data collection and sharing is provided by DHS Science and Technology Directorate’s PREDICT project via Cooperative Agreement FA8750-12-2-0326 and NSF’s Computing Research Infrastructure Program via CNS-0958547.

 

 

IPv4 and IPv6 AS Core 2013

Friday, August 9th, 2013 by Bradley Huffaker

We recently released a visualization at http://www.caida.org/research/topology/as_core_network/ that represents our macroscopic snapshots of IPv4 and IPv6 Internet topology samples captured in 2013. The plots illustrate both the extensive geographical scope as well as rich interconnectivity of nodes participating in the global Internet routing system.

IPv4 and IPv6 AS Core Graph, Jan 2013

This AS core visualization addresses one of CAIDA’s topology mapping project goals: to develop techniques to illustrate structural relationships and depict critical components of the Internet infrastructure. These IPv4 and IPv6 graphs show the relative growth of the two Internet topologies, and in particular the steady continued growth of the IPv6 topology. Although both IPv4 and IPv6 topologies experienced a lot of churn, the net change in number of ASes was 3,290 (10.7%) in our IPv4 graph and 495 (25.7%) in our IPv6 graph.

In order to improve our AS Core visualization over previous years, this year we made two major refinements to our graphing methodology, including how we rank individual ASes. First, we now rank ASes based on their transit degree rather than their outdegree. Second, we now infer links across Internet eXchange (IX) point address space, rather than considering the IX itself a node to which various ISPs attach. Details at http://www.caida.org/research/topology/as_core_network/.
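To make the difference between the two ranking metrics concrete, here is a minimal sketch that computes both from a set of observed AS paths. The definitions are assumptions, since the post does not spell them out: outdegree is taken as the number of distinct next-hop ASes seen after an AS, and transit degree as the number of distinct neighbors seen on either side of an AS when it appears in the middle of a path. The function name and the toy AS numbers are hypothetical, and paths are assumed to be free of repeated consecutive ASes.

```python
from collections import defaultdict


def out_and_transit_degree(as_paths):
    """Return {asn: (outdegree, transit_degree)} from a list of AS paths.

    as_paths: iterable of sequences of AS numbers, ordered from source to
    destination. Assumed definitions (not taken from the post):
      outdegree      = distinct ASes immediately following the AS
      transit degree = distinct neighbors adjacent to the AS when the AS
                       is neither the first nor the last hop of a path
    """
    next_hops = defaultdict(set)
    transit_neighbors = defaultdict(set)
    ases = set()
    for path in as_paths:
        ases.update(path)
        for i, asn in enumerate(path):
            if i + 1 < len(path):
                next_hops[asn].add(path[i + 1])
            if 0 < i < len(path) - 1:
                transit_neighbors[asn].update((path[i - 1], path[i + 1]))
    return {a: (len(next_hops[a]), len(transit_neighbors[a])) for a in ases}


# Toy example: AS20 carries traffic between four distinct neighbors.
toy_paths = [(10, 20, 30), (10, 20, 40), (50, 20, 30)]
toy_degrees = out_and_transit_degree(toy_paths)
```

In the toy example AS20 has an outdegree of 2 but a transit degree of 4, while the edge ASes have a transit degree of 0, which is why a transit-based metric tends to separate networks that carry traffic for others from networks that merely have many neighbors.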

[For details on a more sophisticated methodology for ranking AS interconnectivity, based on inferring AS relationships from BGP data, see http://www.caida.org/data/active/as-relationships/.]

2001:deba:7ab1:e::effe:c75

Tuesday, January 22nd, 2013 by Robert Beverly

[This blog entry is guest written by Robert Beverly at the Naval Postgraduate School.]

In many respects, the deployment, adoption, use, and performance of IPv6 have received more recent attention than IPv4. Certainly the longitudinal measurement of IPv6, from its infancy to the exhaustion of ICANN v4 space to native 1% penetration (as observed by Google), is more complete than IPv4. Indeed, there are many parties vested in either the success or the failure of IPv6, and numerous IPv6 measurement efforts are afoot.

Researchers from Akamai, CAIDA, ICSI, NPS, and MIT met in early January 2013, first to share and make sense of current measurement initiatives, and second to plot a path forward for the community in measuring IPv6. A specific objective of the meeting was to understand which aspects of IPv6 measurement are “done” (in the sense that there exists a sound methodology, even if measurement should continue), and which IPv6 questions/measurements remain open research problems. The meeting agenda and presentation slides are archived online.

(more…)

IPv6: What could be (but isn’t yet)

Monday, June 4th, 2012 by Matthew Luckie

With IPv6 Launch approaching, there is increasing interest in measuring the readiness of the IPv6 infrastructure. A major concern, particularly for networks that source or sink content, is the performance that is achievable over IPv6, and how it compares to the performance over IPv4. A recent study by Nikkhah et al. argues that data plane performance, as measured by web page download times, is largely comparable in IPv4 and IPv6, as long as the AS-level paths in IPv4 and IPv6 are identical. We have confirmed these findings with our own measurements covering 593 dual-stack ASes: we found that 79% of paths had IPv6 performance within 10% of IPv4 (or IPv6 had better performance) if the forward AS-level path was the same in both protocols, while only 63% of paths had similar performance if the forward AS-level path was different.

Given the apparent importance of congruent AS-level paths in IPv4 and IPv6, we measured to what extent such congruence exists today, and how this has evolved historically. We measure IPv4 and IPv6 AS paths from seven vantage points (ACOnet/AS1853, IIJ/AS2497, NTT/AS2914, Tinet/AS3257, HE/AS6939, AT&T/AS7018, NL-BIT/AS12859) which have provided BGP data to Routeviews and RIPE RIS since 2003. The figure below plots the fraction of dual-stack paths that are identical in IPv4 and IPv6 from each vantage point over time. According to this metric, IPv6 paths are maturing slowly. In January 2004, 10-20% of paths were the same for IPv4 and IPv6; eight years later, 40-50% of paths are the same for six of the seven vantage points.
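The congruence metric described above can be sketched as follows for a single vantage point and a single point in time. This is an illustration under stated assumptions, not the measurement code behind the figure: paths are assumed to be keyed by destination (origin) AS, a destination counts as dual-stack when the vantage point has both an IPv4 and an IPv6 path to it, and the function name and toy AS numbers are hypothetical.

```python
def identical_path_fraction(v4_paths, v6_paths):
    """Fraction of dual-stack destinations whose IPv4 and IPv6 AS paths match.

    v4_paths, v6_paths: dicts mapping a destination AS to the AS path
    (sequence of AS numbers) observed from one vantage point.
    Returns None when no destination is reachable over both protocols.
    """
    dual_stack = v4_paths.keys() & v6_paths.keys()
    if not dual_stack:
        return None
    identical = sum(
        1 for dst in dual_stack if tuple(v4_paths[dst]) == tuple(v6_paths[dst])
    )
    return identical / len(dual_stack)


# Toy data: two destinations are dual-stack, and one of them has the
# same AS path in both protocols.
toy_v4 = {100: (1, 2, 100), 200: (1, 3, 200), 300: (1, 4, 300)}
toy_v6 = {100: (1, 2, 100), 200: (1, 5, 200), 400: (1, 400)}
toy_fraction = identical_path_fraction(toy_v4, toy_v6)
```

Repeating the computation per vantage point over successive BGP snapshots would yield one time series per vantage point, which is the shape of data the plot describes.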

Fraction of identical dual-stack paths over time

(more…)

Shutting the phone network off while you’re running out of internet protocol numbers

Friday, January 20th, 2012 by kc

I ended 2011 with a short (20 December) visit to a pleasantly warm Washington, D.C. for my 5th FCC Technical Advisory Council meeting. Some of the discussions from the third meeting were extended, others cut off for lack of time. We spent over an hour on the suggestion made by the Legacy Transition working group two meetings ago to advise the FCC to move forward in sunsetting (although we shunned that term at this meeting — “It’s a new beginning, not an end!”) the public-switched telephone network (PSTN). Many questions have arisen repeatedly in the discussions over the course of the last two meetings (and two FCC workshops in between), notably, “What happens to the telephony numbering system?” The initial strategy was imprecise, “The numbering plan will continue to exist but governance and allocation process needs to be considered.” Another repeated question has been “What exactly do we mean by PSTN?”

(more…)

Model for Internet Evolution Predicts Consolidation in Tier-1 Transit Market

Friday, July 15th, 2011 by Amogh Dhamdhere

Although the outcome is not good news, it is gratifying to see the predictions of a model of the Internet ecosystem being validated by the real world. Specifically, the recent spate of ISP consolidations is precisely what our network formation model predicts. First, Level 3 acquired Global Crossing in a deal valued at $3B. A few months later, CenturyLink (Qwest) acquired Savvis for $2.5B. Our model predicts that this consolidation will continue unless ailing tier-1 providers find a new source of revenue to compensate for their losses on IP transit.

(more…)

CAIDA participation in IPv6 day

Sunday, June 5th, 2011 by kc

On June 8, 2011, a group of content providers, including Google, Yahoo, and Facebook, are going to dual-stack their content, in an event called World IPv6 Day. This trial will enable content providers to gain experience with increased levels of IPv6 traffic and gauge the extent and effect of broken dual-stack end-users. CAIDA is cooperating with RIPE NCC’s measurements on this day, providing a dozen Ark monitors to increase the number of vantage points from which RIPE will actively test a set of dual-stacked websites for levels of IPv6 support: existence of AAAA records; ping/ping6 response; traceroute/traceroute6; and HTTP reachability.

(more…)

Exhausted IPv4 address architectures

Tuesday, May 3rd, 2011 by kc

In light of available data on global IPv6 deployment, ISPs, and those who build equipment for them, have already accepted that multi-level network address translation (NAT, between IPv4 and IPv6 networks) is here for the foreseeable future, with all its limits on end-to-end reachability and application functionality, and its required unscalable per-protocol hacks. Whether “carrier-grade” NAT (CGN) technology supports a transition to IPv6 or becomes the endgame itself is irrelevant to the planning horizon of public companies, who must now develop sustainable business models that accommodate, if not support, IPv4 scarcity. I’ve heard a few notable predicted outcomes from engineers in the field.

(more…)

CAIDA’s IPv6 measurement and analysis activities

Friday, April 29th, 2011 by kc

In pursuit of more rigorous data on IPv6 deployment, CAIDA has undertaken four IPv6 measurement and analysis exercises: address allocation data; traceroute-based topology; DNS queries from root servers; and a global survey of network operators in 2008.

(more…)