{"id":5079,"date":"2022-11-14T08:05:02","date_gmt":"2022-11-14T15:05:02","guid":{"rendered":"https:\/\/blog.caida.org\/best_available_data\/?p=5079"},"modified":"2023-11-14T17:07:04","modified_gmt":"2023-11-15T00:07:04","slug":"new-caida-prefix-to-as-mapping-data-set","status":"publish","type":"post","link":"https:\/\/blog.caida.org\/best_available_data\/2022\/11\/14\/new-caida-prefix-to-as-mapping-data-set\/","title":{"rendered":"New CAIDA Prefix-to-AS Mapping Data Set"},"content":{"rendered":"<p>Since May 9th, 2005, CAIDA has produced a data set that maps IPv4 prefixes (and later also IPv6 prefixes) to the AS (Autonomous System) originating that prefix into the global BGP routing system, as observed via a single BGP data collector of the <a href=\"http:\/\/www.routeviews.org\/\">Route Views<\/a> data collection system. We have called this data set <a href=\"https:\/\/catalog.caida.org\/details\/dataset\/routeviews_ipv4_prefix2as\">&#8220;RouteViews Prefix to AS&#8221;<\/a>. We used CAIDA&#8217;s <a href=\"https:\/\/www.caida.org\/catalog\/software\/straighten_rv\/\">straighten_rv<\/a> script to filter the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Routing_table\">RIB<\/a> (routing information base) file used as input data. We will discontinue this data set on December 31st, 2022 and replace it with a new, more complete data set that we call <a href=\"https:\/\/catalog.caida.org\/details\/dataset\/caida_prefix2as\">CAIDA&#8217;s Prefix-to-AS<\/a> data set.<\/p>\n<p>CAIDA will use the <a href=\"https:\/\/catalog.caida.org\/details\/software\/bgpstream\">BGPStream<\/a> software package (and in particular the <a href=\"https:\/\/github.com\/CAIDA\/bgpview\">bgpview<\/a> library) to include data from all available BGP collectors from both of the primary global publicly available collection systems: <a href=\"http:\/\/www.routeviews.org\/routeviews\/index.php\/collectors\/\">Route Views<\/a> and <a href=\"https:\/\/www.ripe.net\/publications\/docs\/ripe-200\">RIPE NCC Routing Information Service<\/a>. 
We will backfill Prefix-to-AS data to 2000. As part of this transition, CAIDA will no longer use <a href=\"https:\/\/www.caida.org\/catalog\/software\/straighten_rv\/\"><em>straighten_rv<\/em><\/a> to preprocess AS paths. We will create two files: an annotated file with all the data observed in BGP, and a simple file that filters out data of no interest to many researchers as described below.<\/p>\n<p><b>Annotated files.<\/b> The annotated file will include information about the stability and visibility of prefixes by different peers and collectors. Individuals who wish to produce a more refined mapping can fairly easily filter this data. The table below compares the older &#8220;Routeviews2&#8221; (a single Route Views collector) and the new annotated CAIDA Prefix-to-AS dataset (all collectors from both RIPE RIS and Route Views) for 1 June 2022. Most public ASes (99.6%) and public prefixes (87.2%) appeared in both datasets. Note that multiple ASNs announced the prefix 0.0.0.0\/0; we exclude it because it covers the entire IPv4 address space.<\/p>\n<table>\n<tbody>\n<tr>\n<th>ASN<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th>filtered<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">Routeviews2 only<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">Routeviews+RIPE<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">both<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: right;\">total<\/th>\n<\/tr>\n<tr>\n<th style=\"text-align: 
right;\">Multiorigin\/set<\/th>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: right;\">128<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">4.10%<\/td>\n<td style=\"text-align: right;\">1552<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">49.73%<\/td>\n<td style=\"text-align: right;\">1441<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">46.17%<\/td>\n<td style=\"text-align: right;\">3121<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: right;\">public<\/th>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">295<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.40%<\/td>\n<td style=\"text-align: right;\">73294<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">99.60%<\/td>\n<td style=\"text-align: right;\">73589<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: right;\">reserved<\/th>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">1379<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">88.97%<\/td>\n<td style=\"text-align: right;\">171<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">11.03%<\/td>\n<td style=\"text-align: right;\">1550<\/td>\n<\/tr>\n<tr>\n<th colspan=\"5\"><\/th>\n<\/tr>\n<tr>\n<th>Prefix<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th>filtered<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">Routeviews2 only<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: 
collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">Routeviews+RIPE<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\" colspan=\"2\">both<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: right;\">total<\/th>\n<\/tr>\n<tr>\n<th style=\"text-align: right;\">larger than \/8<\/th>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">100.00%<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">1<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: right;\">private<\/th>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">504<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">84.85%<\/td>\n<td style=\"text-align: right;\">90<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">15.15%<\/td>\n<td style=\"text-align: right;\">594<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: right;\">public<\/th>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: right;\">0<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">0.00%<\/td>\n<td style=\"text-align: right;\">138498<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">12.81%<\/td>\n<td style=\"text-align: right;\">942469<\/td>\n<td style=\"font-size: 70%%; font-color: grey; text-align: right;\">87.19%<\/td>\n<td 
style=\"text-align: right;\">1080967<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><b>Simple files.<\/b> The simple file will exclude very large prefixes (mask lengths &lt; 8), private addresses (<a href=\"https:\/\/www.rfc-editor.org\/rfc\/rfc1918.html\">RFC 1918<\/a>), and prefixes announced exclusively by reserved ASNs (<a href=\"https:\/\/www.iana.org\/assignments\/iana-as-numbers-special-registry\/iana-as-numbers-special-registry.xhtml\">Special-Purpose ASN<\/a>). The resulting simple prefix-to-ASN mapping covers 99.7% of the address space captured by the annotated file. In the table below (also reflecting 1 June 2022), 0.94% of prefixes and 0.42% of addresses had an additional origin AS that was not observed in the Routeviews2-only dataset. This reflects the expanded visibility provided by more collectors and peers. A further 4.92% of CAIDA&#8217;s prefixes and 1.82% of addresses were not covered by the Routeviews2-only prefix2as data at all. Overall, the combined data set provides visibility of 5.86% of prefixes and 2.24% of addresses not covered by the Routeviews2-only data.<\/p>\n<p><b> CAIDA&#8217;s Prefix to AS &#8220;simple&#8221; (99.7% of addresses observed in annotated files) <\/b><\/p>\n<table>\n<tbody>\n<tr>\n<th colspan=\"3\"><\/th>\n<th style=\"border: 1px; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"1\"><\/th>\n<th style=\"text-align: center;\" colspan=\"3\">ASN type<\/th>\n<th style=\"border: 1px; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"1\"><\/th>\n<th style=\"text-align: center;\" colspan=\"5\">prefixes<\/th>\n<th style=\"border: 1px; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"1\"><\/th>\n<th style=\"text-align: center;\" colspan=\"5\">addresses<\/th>\n<\/tr>\n<tr>\n<th>source<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th>agreement<\/th>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" 
rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\">Routeviews2<br \/>\nonly<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th style=\"text-align: center;\">Routeviews<br \/>\n+ RIPE<\/th>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"4\"><\/th>\n<th>number<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"20\"><\/th>\n<th>group %<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"20\"><\/th>\n<th>all %<\/th>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"20\"><\/th>\n<th>number<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"20\"><\/th>\n<th>group %<\/th>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"20\"><\/th>\n<th>all %<\/th>\n<\/tr>\n<tr>\n<td>both<\/td>\n<td>different<\/td>\n<td>multiorigin<\/td>\n<td>multiorigin<\/td>\n<td>626<\/td>\n<td>11.43%<\/td>\n<td>0.11%<\/td>\n<td>1241088<\/td>\n<td>9.65%<\/td>\n<td>0.04%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td>public<\/td>\n<td>multiorigin<\/td>\n<td>4816<\/td>\n<td>87.95%<\/td>\n<td>0.82%<\/td>\n<td>11442617<\/td>\n<td>88.93%<\/td>\n<td>0.37%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td>set<\/td>\n<td>multiorigin<\/td>\n<td>34<\/td>\n<td>0.62%<\/td>\n<td>0.01%<\/td>\n<td>183039<\/td>\n<td>1.42%<\/td>\n<td>0.01%<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: left;\" colspan=\"8\"><\/th>\n<td><b>5476<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>0.94%<\/b><\/td>\n<td><b>12866744<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>0.42%<\/b><\/td>\n<\/tr>\n<tr>\n<td>both<\/td>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" 
rowspan=\"3\"><\/th>\n<td>same<\/td>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td>multiorigin<\/td>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td>multiorigin<\/td>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td>9869<\/td>\n<td>1.79%<\/td>\n<td>1.69%<\/td>\n<td>12609229<\/td>\n<td>0.42%<\/td>\n<td>0.41%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td>public<\/td>\n<td>public<\/td>\n<td>540032<\/td>\n<td>98.20%<\/td>\n<td>92.45%<\/td>\n<td>2988739528<\/td>\n<td>99.58%<\/td>\n<td>97.35%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td>set<\/td>\n<td>set<\/td>\n<td>8<\/td>\n<td>0.00%<\/td>\n<td>0.00%<\/td>\n<td>9216<\/td>\n<td>0.00%<\/td>\n<td>0.00%<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: left;\" colspan=\"8\"><\/th>\n<td><b>549909<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>94.14%<\/b><\/td>\n<td><b>3001357973<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>97.76%<\/b><\/td>\n<\/tr>\n<tr>\n<td>Routeviews+RIPE<\/td>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td>N\/A<\/td>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td><\/td>\n<th style=\"border: 1px solid darkgrey; border-collapse: collapse; padding: 0; margin: 0;\" rowspan=\"3\"><\/th>\n<td>multiorigin<\/td>\n<th style=\"border: 1px solid black; border-collapse: collapse; padding: 0; margin: 0;\" 
rowspan=\"3\"><\/th>\n<td>1884<\/td>\n<td>6.55%<\/td>\n<td>0.32%<\/td>\n<td>908601<\/td>\n<td>1.63%<\/td>\n<td>0.03%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td>public<\/td>\n<td>26856<\/td>\n<td>93.44%<\/td>\n<td>4.60%<\/td>\n<td>54919321<\/td>\n<td>98.37%<\/td>\n<td>1.79%<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td>set<\/td>\n<td>2<\/td>\n<td>0.01%<\/td>\n<td>0.00%<\/td>\n<td>2816<\/td>\n<td>0.01%<\/td>\n<td>0.00%<\/td>\n<\/tr>\n<tr>\n<th style=\"text-align: left;\" colspan=\"8\"><\/th>\n<td><b>28742<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>4.92%<\/b><\/td>\n<td><b>55830738<\/b><\/td>\n<td><b>100.00%<\/b><\/td>\n<td><b>1.82%<\/b><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>You can find the new <a href=\"https:\/\/catalog.caida.org\/dataset\/caida_prefix2as\">CAIDA Prefix-to-AS Mapping Data Set here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since May 9th, 2005, CAIDA has produced a data set that maps IPv4 prefixes (and later also IPv6 prefixes) to the AS (Autonomous System) originating that prefix into the global BGP routing system, as observed via a single BGP data collector of the Route Views data collection system. 
We have called this data set &#8220;RouteViews [&hellip;]<\/p>\n","protected":false},"author":18,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"coauthors":[27],"_links":{"self":[{"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/posts\/5079"}],"collection":[{"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/comments?post=5079"}],"version-history":[{"count":44,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/posts\/5079\/revisions"}],"predecessor-version":[{"id":5255,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/posts\/5079\/revisions\/5255"}],"wp:attachment":[{"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/media?parent=5079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/categories?post=5079"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/tags?post=5079"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blog.caida.org\/best_available_data\/wp-json\/wp\/v2\/coauthors?post=5079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}