Archive for the 'Commentaries' Category

1st CAIDA BGP Hackathon brings students and community experts together

Thursday, February 18th, 2016 by Josh Polterock

We set out to conduct a social experiment of sorts, to host a hackathon to hack streaming BGP data. We had no idea we would get such an enthusiastic reaction from the community and that we would reach capacity. We were pleasantly surprised at the response to our invitations when 25 experts came to interact with 50 researchers and practitioners (30 of whom were graduate students). We felt honored to have participants from 15 countries around the world and experts from companies such as Cisco, Comcast, Google, Facebook and NTT, who came to share their knowledge and to help guide and assist our challenge teams.

Having so many domain experts from so many institutions and companies with deep technical understanding of the BGP ecosystem together in one room greatly increased the kinetic potential for what we might accomplish over the course of our two days.

So, you want to draw the Internet?

Saturday, February 6th, 2016 by Bradley Huffaker

When visualizing the Internet, one can consider several different levels of abstraction, including the Internet Protocol (IP) address, router, and Autonomous System (AS) levels. IP addresses identify interfaces on devices that connect to the Internet. Routers are devices that route traffic by accepting it on one interface and forwarding it out another interface. (Routers may have many interfaces.) An Autonomous System (AS) is a set of IP addresses operated under a single administrative umbrella. The three granularities are illustrated below:

Internet level Abstraction

Most Internet mapping methods have focused on characterizing and modeling network structure at the level of interconnected Autonomous Systems (ASes). We have developed different ways to annotate ASes, using a variety of available datasets, to support visualizations of AS topology:
three-views.

I gave a class lecture at UCSD in January 2016 on visualizing Internet AS topology. I also prepared a supplemental data set to facilitate student exploration and experimentation. Comments and feedback welcome!

CAIDA BGP Hackathon 2016 Attendees

Wednesday, January 13th, 2016 by Josh Polterock

We are pleased to post the attendees list for the upcoming CAIDA BGP Hackathon 2016 organized jointly with Colorado State University, University of Southern California, University of Waikato, the Route Views Project, RIPE NCC, Universidade Federal de Minas Gerais and FORTH. We look forward to hosting over 80 attendees, including more than 20 domain experts, from over 50 organizations who will come from around the world to participate in the first CAIDA BGP Hackathon at the San Diego Supercomputer Center at UC San Diego in La Jolla, CA. The hackathon is sponsored by industry, professional organizations, and government agencies with an interest in promoting the development of tools to model, measure, and monitor the routing infrastructure of the Internet. This support allowed us to provide 33 travel grants.

Due to the overwhelming interest in the hackathon, we have reached capacity. We are no longer accepting applications for this year’s hackathon.

We would like to give special thanks to our sponsors.

  • ACM SIGCOMM
  • Cisco
  • Comcast
  • Department of Homeland Security (DHS)
  • Google NetOps and Google Open Source Research Group
  • The Internet Society (ISOC)
  • National Science Foundation (NSF)
  • San Diego Supercomputer Center

Please send any questions or media inquiries regarding the hackathon to bgp-hackathon-info at caida dot org.

Report from the 2nd NDN Community Meeting (NDNcomm 2015)

Tuesday, November 10th, 2015 by kc

The report for the Second NDN Community Meeting (NDNcomm 2015) is available online now. The meeting, held at UCLA in Los Angeles, California on September 28-29, 2015, provided a platform for attendees from 63 institutions across 13 countries to exchange recent NDN research and development results, to debate existing and proposed functionality in NDN forwarding, routing, and security, and to provide feedback to the NDN architecture design evolution.

[The workshop was partially supported by the National Science Foundation CNS-1345286, CNS-1345318, and CNS-1457074. We thank the NDNcomm Program Committee members for their effort of putting together an excellent program. We thank all participants for their insights and feedback at the workshop.]

CAIDA releases the August 2015 Internet Topology Data Kit (ITDK 2015-08)

Friday, November 6th, 2015 by Josh Polterock

Nothing feels better than publishing fresh data for the research community, especially when fresh brings new features. Today, CAIDA released the August 2015 version of our popular Internet Topology Data Kit (ITDK) that includes topologies for both IPv4 and IPv6. CAIDA’s ITDK provides researchers with data that describes connectivity and routing observations gathered from a large cross-section of the global Internet. This dataset enables the study of the topology of the IPv4 and IPv6 Internet at the router level, with inferences for assignments of routers to Autonomous Systems (ASes). The August 2015 release of the ITDK includes two related IPv4 router-level topologies; an IPv6 router-level topology; assignments of routers to ASes; geographic locations of each router; and Domain Name System (DNS) lookups of all observed IP addresses.

We produce the ITDKs from active measurements conducted on our Archipelago (Ark) measurement infrastructure. This release made use of 94 Ark monitors located in 36 countries to produce the IPv4 topologies and 26 monitors located in 15 countries for the IPv6 topology.

CAIDA restricts access to recent ITDKs less than two years old. CAIDA provides unrestricted public access to ITDKs older than two years.

For complete details about the ITDK collection process, data files and formats, data availability, and more, please see Macroscopic Internet Topology Data Kit (ITDK).

DHS S&T DDoS Defense PI Meeting

Monday, August 31st, 2015 by kc

Earlier this month, Marina and I went to our first Principal Investigators meeting for a new DHS program on distributed denial of service defense (DDoS Defense), led by DHS S&T Cybersecurity Division Program Manager Dan Massey. Dan is one of Doug Maughan’s team, and he seems to have picked up Doug’s impressive talent for running effective meetings. I presented these slides on our new spoofer project, a collaboration with Dr. Matthew Luckie, now a senior lecturer at U. Waikato, and Rob Beverly at NPS.

CAIDA’s Annual Report for 2014

Wednesday, July 22nd, 2015 by kc

[Executive Summary from our annual report for 2014:]

This annual report covers CAIDA’s activities in 2014, summarizing highlights from our research, infrastructure, data-sharing and outreach activities. Our research projects span Internet topology, routing, traffic, security and stability, future Internet architecture, economics and policy. Our infrastructure activities support measurement-based Internet studies, both at CAIDA and around the world, with focus on the health and integrity of the global Internet ecosystem.

Panel on Cyberwarfare and Cyberattacks at 9th Circuit Judicial Conference

Monday, July 20th, 2015 by kc

I had the honor of contributing to a panel on “Cyberwarfare and cyberattacks: protecting ourselves within existing limitations” at this year’s 9th Circuit Judicial Conference. The panel moderator was Hon. Thomas M. Hardiman, and the other panelists were Professor Peter Cowhey, of UCSD’s School of Global Policy and Strategy, and Professor and Lt. Col. Shane R. Reeves of West Point Academy. Lt. Col. Reeves gave a brief primer on the framework of the Law of Armed Conflict, distinguished an act of cyberwar from a cyberattack, and described the implications for political and legal constraints on governmental and private sector responses. Professor Cowhey followed with a perspective on how economic forces also constrain cybersecurity preparedness and response, drawing comparisons with other industries for which the cost of security technology is perceived to exceed its benefit by those who must invest in its deployment. I used a visualization of an Internet-wide cybersecurity event to illustrate technical, economic, and legal dimensions of the ecosystem that render the fundamental vulnerabilities of today’s Internet infrastructure so persistent and pernicious. A few people said I talked too fast for them to understand all the points I was trying to make, so I thought I should post the notes I used during my panel remarks. (My remarks borrowed heavily from Dan Geer’s two essays: Cybersecurity and National Policy (2010), and his more recent Cybersecurity as Realpolitik (video), both of which I highly recommend.) After explaining the basic concept of a botnet, I showed a video derived from CAIDA’s analysis of a botnet scanning the entire IPv4 address space (discovered and comprehensively analyzed by Alberto Dainotti and Alistair King). I gave a (too) quick rundown of the technological, economic, and legal circumstances of the Internet ecosystem that facilitate the deployment of botnets and other threats to networked critical infrastructure.

What’s in a Ranking? comparing Dyn’s Baker’s Dozen and CAIDA’s AS Rank

Thursday, July 2nd, 2015 by Bradley Huffaker

The Internet infrastructure is composed of thousands of independent networks (Autonomous Systems, or ASes) that engage in typically voluntary bilateral interconnection (“peering”) agreements to provide reachability to each other. Underlying these peering relationships are business relationships between networks, although whether and how much money ASes exchange when they interconnect is not generally published. Some of these business relationships are relatively easy to infer with a high degree of confidence using a basic economic assumption that commercial providers do not give away traffic transit services (i.e., route announcements) for free.

For several years CAIDA has used publicly available BGP data to infer business relationships among ASes and, consequently, rank Autonomous Systems based on a measure of their influence in the global routing system, specifically the size of their customer cone. (An AS’s customer cone is the set of ASes, IPv4 prefixes, or IPv4 addresses that the AS can reach via its customers, i.e., by crossing only customer links.) The methodology behind our ranking is described in detail in our IMC2013 paper (“AS Relationships, Customer Cones, and Validation”). By default, CAIDA’s AS Rank sorts by the number of other ASes in each AS’s customer cone (an AS granularity), but the AS Rank web interface also supports sorting by the number of IPv4 prefixes or IPv4 addresses observed in each AS’s customer cone (which the web interface calls prefix or IP address granularities).
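The customer cone definition above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not the inference method from the IMC2013 paper: it assumes the provider-to-customer relationships are already known, takes the cone to be the plain transitive closure over customer links, and counts the AS itself as part of its own cone. The AS numbers and the `customers` mapping are hypothetical toy data.

```python
def customer_cone(as_id, customers):
    """Return the set of ASes reachable from as_id by crossing only
    provider-to-customer links (as_id itself is included by assumption)."""
    cone = {as_id}
    stack = [as_id]
    while stack:
        current = stack.pop()
        for cust in customers.get(current, ()):
            if cust not in cone:
                cone.add(cust)
                stack.append(cust)
    return cone


# Hypothetical provider -> customers mapping (toy AS numbers, not real inferences).
customers = {
    1: [2, 3],
    2: [4],
    3: [4, 5],
    6: [5],
}

# AS-granularity cone size for every AS, then sort largest cone first
# (ties broken by AS number so the order is deterministic).
cone_sizes = {asn: len(customer_cone(asn, customers)) for asn in (1, 2, 3, 4, 5, 6)}
ranking = sorted(cone_sizes, key=lambda asn: (-cone_sizes[asn], asn))

for position, asn in enumerate(ranking, start=1):
    print(position, asn, cone_sizes[asn])
```

Sorting by prefix or address granularity would, under the same assumptions, replace `len(cone)` with a sum of the prefixes or addresses originated by each AS in the cone.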

Other organizations also provide rankings of ASes; the most well-known is Dyn’s IP Transit Intelligence AS ranking. Since both CAIDA’s and Dyn’s rankings aim to use a metric that reflects some notion of “predominant role in the global Internet routing system”, we have received several inquiries on how our ranking methodology and results differ from theirs. In this essay we try to answer this question to the best of our ability, acknowledging that their methodology is proprietary and we do not know exactly what they are doing beyond what they have released publicly. This 2013 MENOG presentation (Dyn bought the Renesys company in 2014) states that their ranking is based on quantity of transited IP space, so the closest possible comparison to what we currently do would be to compare their ranking with our IP-address-based customer cone ranking (which is not currently our default). For this exercise we will compare CAIDA’s 1st January 2015 AS ranking by customer cone with the chronologically last value on Dyn’s 2014 Baker’s Dozen, which is based on data observed around the same date.

Dyn’s web site provides the following image showing their rankings throughout 2014: Dyn-Bakers-Dozen-2014-All

In order to compare not only the computed ranking, but the values of the metrics being ranked (i.e., transited IPv4 space vs. number of addresses in customer cone), we create a mapping between the two spaces. Dyn does not put numbers on their y-axis, and they plot only the top 13 ranked ASes, so we do not know the range of y-values represented. In order to make the comparison possible, we will (make a leap of faith and) assume that the top thirteen ranked ASes for each metric cover roughly the same range of values. (We caution that this assumption may be unjustified and are trying to validate it with Dyn.) So we map the top-ranked AS in Dyn (Level 3 AS3356) to the top-ranked AS in CAIDA (also Level 3 AS3356), and map the 13th-ranked AS in Dyn (Hurricane AS6939) to the 13th-ranked AS in CAIDA (Korea Telecom AS4766). These upper and lower thresholds result in the following mapping between the transited IPv4 space and number of IPv4 addresses in customer cone:

ASdyn_i.dyn_y = (ASdyn_i.transit_ip – ASdyn_13.transit_ip) / (AScaida_0.num_addresses – AScaida_13.num_addresses) + AScaida_13.num_addresses

Dyn vs CAIDA's AS Ranking
An AS’s rank is based on the number of ASes with a value (of the ranked metric) greater than the given AS. CAIDA’s 8th, 11th, and 13th ranked ASes are gray because we do not know their Dyn ranking.
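The ranking rule in the caption above can be illustrated as follows. This is a sketch under one assumption: that the rank equals one plus the number of ASes with a strictly greater metric value, so that tied ASes share a rank. The AS labels and metric values are hypothetical.

```python
def rank_by_metric(values):
    """Rank each AS as 1 + the number of ASes whose metric value is
    strictly greater (assumed offset; ties share the same rank)."""
    return {
        asn: 1 + sum(1 for other in values.values() if other > value)
        for asn, value in values.items()
    }


# Hypothetical metric values (e.g. addresses in the customer cone).
metric = {"AS_A": 900, "AS_B": 450, "AS_C": 450, "AS_D": 100}
ranks = rank_by_metric(metric)
print(ranks)
```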
The Hilbert map visualization shows utilization of IPv4 address space, rendered in two dimensions using a space-filling continuous fractal Hilbert curve of order 12. Each pixel in the full resolution image represents a /24 block; red indicates used blocks, green unassigned blocks and blue RFC special blocks. Routed unused blocks are grey and unrouted assigned blocks are black.

Although their order changes, the top nine ASes are the same in both rankings. Three of Dyn’s top-ranked ASes, namely China Telecom (AS4134), Beyond (AS3491), and Level 3 (AS3549), are not in CAIDA’s top 14 ranked ASes; instead CAIDA’s top 14 includes AT&T (AS7018), Deutsche Telekom (AS3320), and Korea Telecom (AS4766). Some of this discrepancy can be explained by Dyn’s curation of the data, including “dealing with anomalies, discounting pre-CIDR allocations, ignoring short-lived announcements, counting remaining prefixes (non-linearly) based on size (/8 – /24 only), etc”. We assume these heuristics aim to make the number of transited addresses a closer approximation to the amount of transited traffic, which Dyn suggests is the more interesting ranking (in the same 2013 MENOG presentation).

We agree with Dyn that the number of IP addresses is not representative of traffic, and have always emphasized that we are not in a position to rank ASes by traffic transited. Not only is there huge variation in traffic to/from different IP addresses (e.g., home users versus popular web servers), but many announced IP addresses are not even assigned to any hosts. In an October 2013 study, CAIDA researchers found that of the 10.4M /24 address blocks announced in that month, only 5.3M (51%) were observed sending traffic (these “used” address blocks are shown in red in the Hilbert map). This observation suggests another arguably more meaningful (but computationally expensive) method to rank ASes: normalizing by the amount of observably actively used address space.
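For readers curious how a /24 block ends up at a particular pixel of the Hilbert map, the sketch below uses the textbook distance-to-coordinate conversion for a Hilbert curve. An order-12 curve covers a 4096 x 4096 grid, which is exactly 2^24 pixels, one per /24 block. The function names are hypothetical, and the orientation of the curve is an assumption: the tool that rendered the actual map may rotate or mirror the layout differently.

```python
def hilbert_d2xy(order, d):
    """Convert distance d along a Hilbert curve of the given order
    into (x, y) grid coordinates (standard iterative algorithm)."""
    side = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def slash24_to_pixel(address, order=12):
    """Map a dotted-quad IPv4 address to the pixel of its /24 block.
    The top 24 bits of the address are used as the curve distance."""
    a, b, c, _ = (int(part) for part in address.split("."))
    block_index = (a << 16) | (b << 8) | c
    return hilbert_d2xy(order, block_index)


print(slash24_to_pixel("192.0.2.1"))
```

The useful property of this layout is locality: numerically adjacent /24 blocks land on neighboring pixels, so contiguous allocations appear as compact regions rather than thin stripes.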

July, August, and September 2013



Since we do not yet have census information for January 2015, we use July, August, and September 2013 usage data to compare Dyn’s 2013 ranking with CAIDA’s AS ranking weighted by the number of observably used /24 IPv4 prefixes in the customer cone. (A /24 is defined as “used” if the census observed it as in use.)

The results of this ranking by “observably used IPv4 address /24 blocks”-based customer cone (i.e., the number of apparently used /24 blocks in an AS’s customer cone) look more similar to the Dyn rankings, consistent with the fact that this method of calculating customer cones accounts for some of the effect Dyn captures by discounting pre-CIDR blocks, which are less likely to be fully utilized.

Dyn vs CAIDA's AS Ranking
An AS’s rank is based on the number of ASes with a value greater than the given AS. CAIDA’s 8th, 12th, and 13th ranked ASes are gray because we do not know their Dyn ranking.
               2015 dyn  2015 address  2013 address  2013 used  2013 dyn
2015 dyn         1.00        0.82          0.83         0.86       0.82
2015 address     0.82        1.00          0.74         0.66       0.49
2013 address     0.83        0.74          1.00         0.96       0.86
2013 used        0.86        0.66          0.96         1.00       0.90
2013 dyn         0.82        0.49          0.86         0.90       1.00

We computed the Pearson correlation coefficient between the results of each pair of ranking methods. A value of 1 indicates perfect positive correlation, meaning the two systems have identical rankings. A value of 0 indicates no linear correlation between the two rankings. Outside the comparison with themselves, which by definition produces 1.00, the two most similar rankings are Dyn’s 2013 transit addresses and CAIDA’s 2013 used /24s with a correlation of 0.90.
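For reference, the Pearson correlation coefficient between two equal-length lists of values can be computed as in the sketch below. The input lists are hypothetical rank positions for the same five ASes under two ranking methods; they are not the data behind the table, and whether the table was computed over ranks or over the underlying metric values is not stated here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ranks of the same five ASes under two methods.
ranks_method_a = [1, 2, 3, 4, 5]
ranks_method_b = [2, 1, 3, 5, 4]
r = pearson(ranks_method_a, ranks_method_b)
print(round(r, 2))
```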

This approach improves the correlation between Dyn’s and CAIDA’s ranking (e.g., the Pearson correlation coefficient increases from 0.82 to 0.90, see Table), but it amplifies the dominance of the top-ranked AS (Level 3 AS3356) for CAIDA’s census-derived customer cone ranking.

If we correlate how the rankings have changed over the last two years — which we cannot do for the census-based ranking since we only have 2013 data — we find that Dyn’s ranking showed greater consistency (a correlation between the 2013 and 2015 rankings of 0.82 compared with CAIDA’s 0.74), perhaps due to their data curation process.

In summary, CAIDA’s IPv4 address-based customer cone and Dyn’s transited IPv4 address space roughly agree on the top ASes, although their relative weighting diverges.


Comments on Cybersecurity Research and Development Strategic Plan

Wednesday, July 1st, 2015 by kc

An excerpt from a comment that David Clark and I wrote in response to Request for Information (RFI)-Federal Cybersecurity R&D Strategic Plan, posted by the National Science Foundation on 4/27/2015.

The RFI asks “What innovative, transformational technologies have the potential to enhance the security, reliability, resiliency, and trustworthiness of the digital infrastructure, and to protect consumer privacy?”

We believe that it would be beneficial to reframe and broaden the scope of this question. The security problems that we face today are not new, and do not persist because of a lack of a technical breakthrough. Rather, they arise in large part in the larger context within which the technology sits, a space defined by misaligned economic incentives that exacerbate coordination problems, lack of clear leadership, regulatory and legal barriers, and the intrinsic complications of a globally connected ecosystem with radically distributed ownership of constituent parts of the infrastructure. Worse, although the public and private sectors have both made enormous investments in cybersecurity technologies over the last decade, we lack relevant data that can characterize the nature and extent of specific cybersecurity problems, or assess the effectiveness of technological or other measures intended to address them.

We first examine two inherently disconnected views of cybersecurity, the correct-operation view and the harm view. These two views do not always align. Attacks on specific components, while disrupting correct operation, may not map to a specific and quantifiable harm. Classes of harms do not always derive from a specific attack on a component; there may be many stages of attack activity that result in harm. Technologists tend to think about assuring correct operation while users, businesses, and policy makers tend to think about preventing classes of harms. Discussions of public policy including research and development funding strategies must bridge this gap.

We then provide two case studies to illustrate our point, and emphasize the importance of developing ways to measure the return on federal investment in cybersecurity R&D.

Full comment:
http://www.caida.org/publications/papers/2015/comments_cybersecurity_research_development/

Background on authors: David Clark (MIT Computer Science and Artificial Intelligence Laboratory) has led network architecture and security research efforts for almost 30 years, and has recently turned his attention toward non-technical (including policy) obstacles to progress in cybersecurity through a new effort at MIT funded by the Hewlett Foundation. kc claffy (UC San Diego’s Center for Applied Internet Data Analysis (CAIDA)) leads Internet research and data analysis efforts aimed at informing network science, architecture, security, and public policy. CAIDA is funded by the U.S. National Science Foundation, Department of Homeland Security’s Cybersecurity Division, and CAIDA members. This comment reflects the views of its authors and not necessarily the agencies sponsoring their research.