Archive for the 'Measurement' Category
[Last month, I remotely attended the second meeting of the FCC's current Technical Advisory Committee (TAC), where chairs of several working groups set up at the first meeting (in November) reported on their progress and plans. I'm a member of the FCC TAC's IPv6 working group (more on this soon), and so far have been asked to answer two questions I've been thinking about for a couple of years: what data do we have to gauge IPv6 deployment by Internet service providers, and what data do we need? Last November I addressed the first question in a (still pending) NSF proposal to measure IPv6 deployment, with the following text. I'll post some updates shortly.]
Amidst the recent political unrest in the Middle East, researchers have observed significant changes in Internet traffic and connectivity. In this article we tap into a previously unused source of data: unsolicited Internet traffic arriving from Libya. The traffic data we captured shows distinct changes in unsolicited traffic patterns since 17 February 2011.
Most of the information already published about Internet connectivity in the Middle East has been based on four types of data:
We have analyzed the IP-AS mapping obtained from Routeviews/RIPE collectors.
A crucial step in various research efforts that study the Internet infrastructure is to map an IP address to the Autonomous System (AS) to which it is assigned. The most common approach to mapping IP addresses to ASes uses BGP table dumps from public repositories such as Routeviews and RIPE. We assign “ownership” of an IP address to the AS that originates the longest BGP prefix matching that address. Routeviews and RIPE, however, operate multiple collectors, each of which peers with a diverse set of ASes. Consequently, the IP-AS mapping obtained from one collector's BGP table dump could differ from that obtained from another. The obvious solution is to aggregate views from as many vantage points as possible to obtain the most complete IP-AS mapping. In practice, however, it is common to use data from just one or two collectors, since doing so greatly simplifies data collection and pre-processing. The goal of our analysis is to compare collectors in terms of the metrics we care about: address space coverage, IP-AS mapping, unique ASes, unique prefixes, unique more-specific prefixes, AS links, and AS paths. Further, we study the utility of adding data from more collectors, in terms of the resulting change in these metrics. Finally, we compare the IP-AS mapping from Routeviews and RIPE tables with that obtained from Team Cymru’s whois service.
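The longest-prefix-match rule described above can be sketched in a few lines of Python. This is a minimal illustration, not our actual pipeline: it assumes a pre-parsed list of (prefix, origin AS) pairs rather than a real MRT-format table dump (which would require a tool like bgpdump to parse), uses a linear scan instead of the radix trie one would use at scale, and the prefixes and AS numbers below are hypothetical documentation values.

```python
import ipaddress

def build_table(entries):
    """Parse (prefix, origin_as) pairs from simplified dump lines of the
    assumed form 'prefix origin_as' (not real Routeviews/RIPE format)."""
    table = []
    for line in entries:
        prefix, asn = line.split()
        table.append((ipaddress.ip_network(prefix), asn))
    return table

def ip_to_as(addr, table):
    """Assign an IP address to the AS originating the longest matching prefix."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, asn in table:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None

# Hypothetical example: a covering /16 and a more-specific /24
# originated by different ASes, as often seen across collectors.
table = build_table([
    "198.51.0.0/16 64500",
    "198.51.100.0/24 64501",  # more specific, so it wins for 198.51.100.x
])
print(ip_to_as("198.51.100.7", table))  # -> 64501
print(ip_to_as("198.51.1.1", table))    # -> 64500
```

Note that the answer can change depending on which collector's table feeds `build_table`: a collector that never sees the /24 announcement would map 198.51.100.7 to AS 64500 instead, which is exactly the cross-collector inconsistency our analysis quantifies.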
This post is our submitted response to NSF’s call for expressions of interest in the Future Internet Architectures summit, which I am attending this week.
What scientific contributions will you bring to the discussion about Future Internet architectures?
As scientists, we are compelled to explore how the peculiar structure relates to the function(s) of complex networks. Many complex networks in nature share the peculiar structural character of the Internet, but they also manifest phenomenal behavior: they efficiently route information without any observable routing protocol overhead. This achievement is currently beyond the reach of man-made networks. The Internet still uses a 30-year old routing architecture with fundamentally unscalable overhead requirements. Yet in those 30 years, the Internet’s inter-domain topology has evolved toward a structure for which nature has superior routing technology, if only we can figure out how to use it!
I was delighted to see Sid Faber and Tim Shimeall co-teaching a “Network situational awareness” course at Carnegie Mellon University last semester, using DatCat and DITL data; they even put the class projects online. Not only did some of the students use DITL data (contributed by Japanese academics), as well as Internet2's netflow data, but they used DatCat to find both data sets. To quote Sid,
“About three weeks into the class, we finally got across one of the key features to the students: we were looking at how things really work on the internet, not just a theoretical discussion of RFCs. The data sets were invaluable, but we had challenges dealing with anonymization, sampling, and the overall volume of the data sets — kind of understandable for the first offering of the course.”
Last month I responded to the National Cyber Leap Year call for input from the U.S. Networking and Information Technology Research and Development (NITRD) Program. I submitted two ideas: the International Bureau of Internet Statistics, and Cooperative Measurement and Modeling of Open Networked Systems (COMMONS, a two-year-old idea). The Bureau of Internet Statistics still strikes some as batty, but over the holidays I caught up on some panicky OECD state-of-malware-landscape papers on how uninformed we are and how little data we have, while the only concrete recommendation in the “ITU’s study on the financial aspects of network security: malware and spam” report was
Although the financial aspects of malware and spam are increasingly documented, serious gaps and inconsistencies exist in the available information. This sketchy information base also complicates finding meaningful and effective responses. For this reason, more systematic efforts to gather more reliable information would be highly desirable.
#7: The traditional mode of getting data from public infrastructures to inform policymaking — regulating its collection — is a quixotic path, since government regulatory agencies have as much reason as providers to be reluctant about disclosing how the Internet is engineered, used, and financed.