Archive for the 'Review' Category

Streamlining Access to BGP Routing Data

Monday, October 7th, 2024 by Elena Yulaeva

Users can now request access to the CAIDA BGP2GO (https://bgp2go.caida.org) platform. BGP2GO lets users find the MRT files that contain a specific resource and thus avoid downloading and processing unrelated data. Users can compile a customized list of relevant MRT files, share that exact list with others, or stream the matching MRT files (e.g., using BGPStream).

Finding the needle in the haystack 

Public BGP route collectors receive update messages from over 1,000 BGP routers worldwide. These updates are archived and made available for download and analysis. However, the data is organized in a way that often requires downloading vast amounts of unrelated information, making it increasingly difficult to focus on the needles in the haystack.

Imagine you’re a network operator announcing a new prefix and you want to analyze its propagation. You may not know exactly which collector or MRT file contains the relevant data, forcing you to download and sift through unnecessary information. 

BGP2GO solves this problem by allowing users to easily select only the files they need, saving significant time and effort. 

We have developed a comprehensive index of all prefixes, ASNs, and communities across RouteViews update files (https://archive.routeviews.org), along with BGP2GO (https://bgp2go.caida.org), a user interface that allows you to easily select the specific files you need.

Use Case: Analyzing Prefix Propagation with PEERING Testbed, BGP2GO, and BGPStream 

Let’s walk through a real-world example. Suppose you’re a network operator advertising a prefix and want to examine how it propagates across RouteViews collectors. Using the PEERING testbed (https://peering.ee.columbia.edu/), we performed a controlled advertisement of the prefix 184.164.246.0/24 during August 2024. 

Step 1: Looking up the prefix and setting filters 

In this case, we search for the prefix 184.164.246.0/24 in BGP2GO, filtering for data from August 2024. The platform identifies 33,251 announcements and withdrawals, spread across 746 files (2.32GB) from 26 collectors. This curated selection lets us focus only on the data we need, saving time and resources. 

https://bgp2go.caida.org/details?pre=184.164.246.0/24&years=2024&months=8
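The link above encodes the search as three query parameters (pre, years, months). As a minimal sketch, assuming those parameter names behave the same way for other prefixes and dates (only this one combination appears in the post), such a link could be built in Python. The helper name details_url is hypothetical, and opening the link still requires an authorized BGP2GO account.

```python
from urllib.parse import urlencode

# Base of the details page, taken from the link shown in this post.
BGP2GO_DETAILS = "https://bgp2go.caida.org/details"


def details_url(prefix, year, month):
    """Build a BGP2GO details link for one prefix, year, and month.

    Hypothetical helper: the parameter names (pre, years, months) are
    copied from the example link; other values are an assumption.
    """
    # safe="/" keeps the slash of the prefix length unescaped, as in the post.
    query = urlencode({"pre": prefix, "years": year, "months": month}, safe="/")
    return f"{BGP2GO_DETAILS}?{query}"


print(details_url("184.164.246.0/24", 2024, 8))
```

For the prefix and month used here, this reproduces the link shown above.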

Step 2: Stream selected files in the terminal 

Once the relevant MRT files are selected, you can stream them for further processing using BGPStream. Clicking the BGPSTREAM button in the top right corner, just above the “collectors” chart, gives instructions on how to stream the files in your terminal.

 

For this example, we use the following command (see step 4a above) in the terminal to process the relevant files: 

bgpreader -k 184.164.246.0/24 -d csvfile -o csv-file="bgp2go.csv" 

This command uses bgpreader to read the files and output the lines that pertain to the resource (prefix 184.164.246.0/24). The following screenshot shows an excerpt of routes related to the prefix 184.164.246.0/24 extracted from all files that contain this prefix in August 2024.

 

To learn more about the bgpreader command and its options, visit the BGPStream website (https://bgpstream.caida.org/).
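If you redirect bgpreader's standard output to a file, a few lines of Python can summarize how the prefix was seen at each collector. The sketch below assumes bgpreader's pipe-delimited text output with the element type (A for announcement, W for withdrawal) in the second field and the collector name in the fifth. Please check that layout against the BGPStream documentation for your version. The sample lines are synthetic and only illustrate the shape of the data.

```python
from collections import Counter


def summarize(lines):
    """Count announcements (A) and withdrawals (W) per collector.

    Assumes pipe-delimited bgpreader output where field 1 is the
    element type and field 4 is the collector name (0-based indices).
    """
    counts = Counter()
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        elem_type, collector = fields[1], fields[4]
        if elem_type in ("A", "W"):
            counts[(collector, elem_type)] += 1
    return counts


# Synthetic sample lines (not real measurements).
sample = [
    "U|A|1722470400.000000|routeviews|route-views2|||64500|192.0.2.1|184.164.246.0/24|192.0.2.1|64500 47065|47065|||",
    "U|W|1722470460.000000|routeviews|route-views2|||64500|192.0.2.1|184.164.246.0/24||||||",
    "U|A|1722470500.000000|routeviews|route-views.sg|||64501|198.51.100.7|184.164.246.0/24|198.51.100.7|64501 47065|47065|||",
]

for (collector, elem_type), n in sorted(summarize(sample).items()):
    print(collector, elem_type, n)
```

To run it on real output, pass an open file instead of the sample list, e.g. summarize(open("bgpreader-output.txt")), where the file name is whatever you redirected the output to.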

Getting Access to BGP2GO 

To request access to the BGP2GO platform, first create an account with the CAIDA Services Single Sign On (SSO) system (https://auth.caida.org/realms/CAIDA/account) by providing basic information and authenticating via Keycloak. After authentication, you can request access to the BGP2GO platform (which requires a CAIDA-authorized account with the bgp2go-api:read role) by going to https://bgp2go.caida.org/ and filling out the request form. If you have any questions or problems, please contact us at data-info@caida.org.

Seeking Beta Users for 100 GB link Anonymized Passive Traces

Sunday, August 11th, 2024 by Elena Yulaeva

We are seeking beta users for our new Anonymized Two-Way Passive Trace dataset, captured on a 100 GB link between Los Angeles and San Jose. Since April 2024, we have captured a one-hour trace each month. To protect privacy, we strip all packet payloads after the layer 4 headers and anonymize IP (v4 and v6) addresses with CryptoPan. The monthly data is provided in two separate files, one for each direction of traffic.

This dataset includes the following metadata fields:

  • Monitor Name
  • Year and month (including a link to a graphical display of the breakdown by protocol, application, and country)
  • Start time of trace (UTC)
  • Stop time of trace (UTC)
  • Number of IPv4 packets
  • Number of IPv6 packets
  • Unknown packets (as a fraction of the total number of packets)
  • Transmission rate in packets per second
  • Transmission rate in bits per second
  • Link load (as a fraction of the nominal maximum load for a 100 GB link)
  • Average packet size (bytes) (including a link to a graph of the packet size distribution).
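To make the relationship between these fields concrete, here is a rough Python sketch of how the derived values could be computed from raw counts. It is an illustration rather than a description of how CAIDA produces the metadata. The function name, its inputs, and the nominal capacity of 100e9 bits per second are all assumptions, as is the choice of which bytes count toward the totals.

```python
def trace_stats(ipv4_packets, ipv6_packets, unknown_packets,
                total_bytes, duration_s, link_capacity_bps=100e9):
    """Derive per-trace summary values from raw counts (illustrative only).

    link_capacity_bps is an assumed nominal capacity in bits per second.
    """
    total_packets = ipv4_packets + ipv6_packets + unknown_packets
    bits_per_second = total_bytes * 8 / duration_s
    return {
        "unknown_fraction": unknown_packets / total_packets,
        "packets_per_second": total_packets / duration_s,
        "bits_per_second": bits_per_second,
        "link_load": bits_per_second / link_capacity_bps,
        "avg_packet_size_bytes": total_bytes / total_packets,
    }


# Made-up round numbers for a one-hour (3600 s) capture in one direction.
stats = trace_stats(
    ipv4_packets=3_000_000,
    ipv6_packets=900_000,
    unknown_packets=100_000,
    total_bytes=3_600_000_000,
    duration_s=3600,
)
for name, value in stats.items():
    print(name, value)
```

Real traces on a link of this size hold far more packets. The small numbers only keep the arithmetic easy to follow.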

The data is stored in our OpenStack Swift object storage. Each monthly one-directional anonymized pcap file is approximately 1TB in size, so users will need more than 2TB of space to download the entire one-hour capture. If you lack such storage or processing capacity, contact us and we will discuss alternatives. We are also releasing statistical information for each hourly trace.

Academic researchers can request access to the data by filling out and submitting the request form.

We will prioritize users who:

  • Have significant experience with network traffic analysis
  • Demonstrate a clear plan for how they will use the dataset
  • Can commit to regular feedback and participation throughout the beta testing period

Help CAIDA Refine and Enhance the FANTAIL Traceroute Analytics platform.

Friday, August 9th, 2024 by Elena Yulaeva

We are excited to announce the beta testing phase of the Facilitating Advances in Network Topology Analysis (FANTAIL) platform (https://www.caida.org/projects/fantail/), a cutting-edge topology query system designed to search vast archives of raw Internet end-to-end path (traceroute) measurement data. FANTAIL is poised to support and advance various research domains within the Computer and Information Science and Engineering (CISE) field that heavily rely on the emerging sub-discipline of Internet cartography. Key areas of focus include:

  • Understanding the intricate ownership and interconnection structures and dynamics of Internet infrastructure.
  • Exploring methods for device identification and characterization within the digital landscape.
  • Enhancing the ability to detect and respond to network outages and route hijacking incidents.
  • Investigating network congestion patterns and their impact on data flow and quality of service.
  • Identifying and mitigating vulnerabilities within network topologies.

FANTAIL consists of four components:

  1. Interactive Web Interface: FANTAIL Web Interface
  2. Application Programming Interface (API): Built on web standards (FANTAIL API Documentation)
  3. Full-Text Search System
  4. Big Data Processing System

The system’s central data type is the traceroute path, representing the inferred IP-level Internet path that network traffic would take between two hosts, from the measurement vantage point to the destination, as determined with the traceroute technique by scamper (https://catalog.caida.org/software/scamper). FANTAIL leverages annotated and indexed data generated with Spark, SQLite, and Elasticsearch from CAIDA Internet traceroute probing data dating back to 2015.

Academic researchers interested in accessing the platform can request access by emailing fantail-info@caida.org.

We will prioritize users who can commit to regular feedback and participation throughout the beta testing period. 

CAIDA’s 2015 Annual Report

Tuesday, July 19th, 2016 by kc

[Executive summary and link below]

The CAIDA annual report summarizes CAIDA’s activities for 2015, in the areas of research, infrastructure, data collection and analysis. Our research projects span Internet topology, routing, security, economics, future Internet architectures, and policy. Our infrastructure, software development, and data sharing activities support measurement-based Internet research, both at CAIDA and around the world, with a focus on the health and integrity of the global Internet ecosystem. The executive summary is excerpted below:

Mapping the Internet. We continued to pursue Internet cartography, improving our IPv4 and IPv6 topology mapping capabilities using our expanding and extensible Ark measurement infrastructure. We improved the accuracy and sophistication of our topology annotation capabilities, including classification of ISPs and their business relationships. Using our evolving IP address alias resolution measurement system, we collected, curated, and released another Internet Topology Data Kit (ITDK).

Mapping Interconnection Connectivity and Congestion. We used the Ark infrastructure to support an ambitious collaboration with MIT to map the rich mesh of interconnection in the Internet, with a focus on congestion induced by evolving peering and traffic management practices of CDNs and access ISPs, including methods to detect and localize the congestion to specific points in networks. We undertook several studies to pursue different dimensions of this challenge: identification of interconnection borders from comprehensive measurements of the global Internet topology; identification of the actual physical location (facility) of an interconnection in specific circumstances; and mapping observed evidence of congestion at points of interconnection. We continued producing other related data collection and analysis to enable evaluation of these measurements in the larger context of the evolving ecosystem: quantifying a given ISP’s global routing footprint; classification of autonomous systems (ASes) according to business type; and mapping ASes to their owning organizations. In parallel, we examined the peering ecosystem from an economic perspective, exploring fundamental weaknesses and systemic problems of the currently deployed economic framework of Internet interconnection that will continue to cause peering disputes between ASes.

Monitoring Global Internet Security and Stability. We conduct other global monitoring projects, which focus on security and stability aspects of the global Internet: traffic interception events (hijacks), macroscopic outages, and network filtering of spoofed packets. Each of these projects leverages the existing Ark infrastructure, but each has also required the development of new measurement and data aggregation and analysis tools and infrastructure, now at various stages of development. We were tremendously excited to finally finish and release BGPStream, a software framework for processing large amounts of historical and live BGP measurement data. BGPStream serves as one of several data analysis components of our outage-detection monitoring infrastructure, a prototype of which was operating at the end of the year. We published four other papers that either use or leverage the results of Internet scanning and other unsolicited traffic to infer macroscopic properties of the Internet.

Future Internet Architectures. The current TCP/IP architecture is showing its age, and the slow uptake of its ostensible upgrade, IPv6, has inspired NSF and other research funding agencies around the world to invest in research on entirely new Internet architectures. We continue to help launch this moonshot from several angles — routing, security, testbed, management — while also pursuing and publishing results of six empirical studies of IPv6 deployment and evolution.

Public Policy. Our final research thrust is public policy, an area that expanded in 2015, due to requests from policymakers for empirical research results or guidance to inform industry tussles and telecommunication policies. Most notably, the FCC and AT&T selected CAIDA to be the Independent Measurement Expert in the context of the AT&T/DirecTV merger, which turned out to be as much of a challenge as it was an honor. We also published three position papers each aimed at optimizing different public policy outcomes in the face of a rapidly evolving information and communication technology landscape. We contributed to the development of frameworks for ethical assessment of Internet measurement research methods.

Our infrastructure operations activities also grew this year. We continued to operate active and passive measurement infrastructure with visibility into global Internet behavior, and associated software tools that facilitate network research and security vulnerability analysis. In addition to BGPStream, we expanded our infrastructure activities to include a client-server system for measuring compliance with BCP38 (ingress filtering best practices) across government, research, and commercial networks, and for analyzing the resulting data in support of compliance efforts. Our 2014 efforts to expand data sharing by making older topology and some traffic data sets public have dramatically increased use of our data, as reflected in our data sharing statistics. In addition, we were happy to help launch DHS’ new IMPACT data sharing initiative toward the end of the year.

Finally, as always, we engaged in a variety of tool development and outreach activities, including maintaining web sites, publishing 27 peer-reviewed papers, 3 technical reports, 3 workshop reports, 33 presentations, 14 blog entries, and hosting 5 workshops. This report summarizes the status of our activities; details about our research are available in papers, presentations, and interactive resources on our web sites. We also provide listings and links to software tools and data sets shared, and statistics reflecting their usage. Finally, we offer a “CAIDA in numbers” section: statistics on our performance, financial reporting, and supporting resources, including visiting scholars and students, and all funding sources.

For the full 2015 annual report, see http://www.caida.org/home/about/annualreports/2015/

The 2nd NDN Project Retreat

Sunday, February 5th, 2012 by kc

I kicked off 2012 with a visit to Colorado State University in Fort Collins, CO to attend the principal investigators (PI) retreat for the Named Data Networking Project, one of four projects funded under NSF’s “Future Internet Architecture” (FIA) program. Impressive progress since the first FIA meeting, with substantial development and coordination of the NDN Testbed connecting the initial participating institutions, including network status reporting, state of (phase-one) OSPF routing, and testbed status pages. This two-day meeting packed in a wide range of collaborative discussions of architecture and implementation issues, including: topology and namespace structure and constraints; organizational structure and network management; routing and forwarding strategy; security issues such as attribution and privacy; early experiences with application development; evaluation and measurement; social and ethical values in technology design; and educational outreach (classes teaching NDN concepts). We also discussed how to dispel the misconception that NDN is simply collaborative web caching. (The caching is essential but the most revolutionary piece of this new communication model is retrieving data by names.)

(more…)

my third FCC TAC meeting — the most exciting yet

Monday, July 25th, 2011 by kc

My third FCC Technological Advisory Council meeting (3-hr. video archive here) was the most exciting yet. The TAC’s Critical Legacy Transition working group, studying the legacy public switched telephone network, recommended that the Council advise the FCC to set a concrete date to sunset (shut down) the Public Switched Telephone Network (PSTN). (!) The working group recommended the year 2018 as a starting point for lively discussion.

(more…)

Exhausted IPv4 address architectures

Tuesday, May 3rd, 2011 by kc

In light of available data on global IPv6 deployment, ISPs, and those who build equipment for them, have already accepted that multi-level network address translation (NAT, between IPv4 and IPv6 networks) is here for the foreseeable future, with all its limits on end-to-end reachability and application functionality, and its required unscalable per-protocol hacks. Whether “carrier-grade” NAT (CGN) technology supports a transition to IPv6 or becomes the endgame itself is irrelevant to the planning horizon of public companies, who must now develop sustainable business models that accommodate, if not support, IPv4 scarcity. I’ve heard a few notable predicted outcomes from engineers in the field.

(more…)

my second FCC TAC meeting, and its IPv6 promise

Saturday, April 30th, 2011 by kc

I recently remotely attended my second meeting of the FCC’s Technological Advisory Council (slides but no video archives). The chairs of four working groups created at the first TAC meeting (Critical Transitions; IPv6; Broadband Infrastructure Deployment; and Sharing Opportunities) presented their interim results. The FCC then issued a set of “TAC recommendations” (which the TAC never saw); it is mostly a wish list from industry to the FCC. Ironically, IPv6 did not appear anywhere in the recommendations, despite being the most popular topic at the first TAC meeting last November, and despite us running out of IPv4 addresses since the last TAC meeting. But the TAC’s IPv6 WG did commit to (on slide 53) delivering a report by November 2011 on what the FCC could or should do to help promote IPv6 deployment. Specifically, the WG has the following charter:

(more…)

my first “Future Internet Architecture” PI meeting

Wednesday, January 5th, 2011 by kc

Among the interesting meetings I attended in 2010 was the principal investigators (PI) meeting for NSF’s new “Future Internet Architecture” (FIA) program. The FIA program builds on the successes of NSF’s previous Future Internet Design (FIND) program, the recommendations of a review panel, and a community summit in October 2009. (The FIND program itself has been integrated into NSF’s new Network Science and Engineering research program, while the four FIA teams are attempting to implement some of the ideas developed thus far.) CAIDA is participating in one of these projects, Named Data Networking (NDN), led by Van Jacobson at Xerox PARC and Lixia Zhang at UCLA. (Background links to 2010 technical report describing the proposed architecture, Van’s August 2006 video lecture and 2009 ACM Queue Q&A on NDN ideas.)

(more…)

my first FCC TAC meeting

Monday, November 15th, 2010 by kc

I recently attended my first FCC Technological Advisory Council meeting (video archives). A week before the meeting we received a memo from the chairman of the committee (Tom Wheeler) notifying the committee of a “clear and challenging mandate from Chairman Genachowski: to generate ideas and spur actions that lead to job creation and economic growth in the ICT [information and communication technologies] ecosystem.” Specifically, “The TAC will focus on the short term implementation of innovative ideas to create investment and jobs, as opposed to long term regulatory changes.”

(more…)