DatCat and DITL (day-in-the-life) data used in classroom curriculum — anonymization revisited

Friday, January 23rd, 2009 by kc

I was delighted to see Sid Faber and Tim Shimeall co-teaching a “Network situational awareness” course at Carnegie Mellon University last semester using DatCat and DITL data; they even put the class projects online. Not only did some of the students use DITL data (contributed by Japanese academics), as well as Internet2’s netflow data, but they used DatCat to find both data sets. To quote Sid:

“About three weeks into the class, we finally got across one of the key features to the students: we were looking at how things really work on the internet, not just a theoretical discussion of RFCs. The data sets were invaluable, but we had challenges dealing with anonymization, sampling, and the overall volume of the data sets — kind of understandable for the first offering of the course.”

covers caida’s recent work in Nature

Wednesday, January 14th, 2009 by kc

As a follow-up to the recent press flurry about Dima’s routing research, Voice of San Diego interviewed us for several hours last week, and no doubt spent twice that time trying to get a complex story mostly right. Hyperbolic headlines notwithstanding, the journalist who interviewed us, David Washburn, did an outstanding job of fact-checking and making sure he accurately represented our views. If this is the future of journalism, I’m not the least bit worried about the death of 20th-century journalism models. The real fourth estate is in good hands.

proposition: International Bureau of Internet Statistics

Friday, January 9th, 2009 by kc

Last month I submitted two proposals to the National Cyber Leap Year call for input from the U.S. Networking and Information Technology Research and Development (NITRD) Program: the International Bureau of Internet Statistics, and Cooperative Measurement and Modeling of Open Networked Systems (COMMONS, a two-year-old idea). The Bureau of Internet Statistics still strikes some as batty, but over the holidays I caught up on some panicky OECD state-of-malware-landscape papers on how uninformed we are and how little data we have, while the only concrete recommendation in the ITU’s “Study on the financial aspects of network security: malware and spam” report was:

“Although the financial aspects of malware and spam are increasingly documented, serious gaps and inconsistencies exist in the available information. This sketchy information base also complicates finding meaningful and effective responses. For this reason, more systematic efforts to gather more reliable information would be highly desirable.”


in (re)search of scalable routing..

Tuesday, January 6th, 2009 by kc

I’ve written before about the growing consensus among experts that the Internet’s underlying communications routing algorithms are fundamentally unscalable, so I am delighted to have CAIDA’s routing research group led by Dima Krioukov achieve some fundamental routing research results worth extensive media coverage. We have not solved the Internet’s routing scalability problem, but these recent discoveries will help that cause.


an amazing trip talking IP in Santiago and Patagonia

Monday, January 5th, 2009 by kc

In November 2008 I had the honor of being invited to speak at the Chilean Computer Science Society Annual Meeting, this year at the Universidad de Magallanes in Punta Arenas, Chile. I followed a colleague who has been visiting CAIDA for the last two years, Sebastian Castro, back to his sponsoring institution, NIC Chile. We started out with an interesting meeting with a core of technical folk where I learned about the activities of NIC Chile’s recently established research arm (NIC Labs). We exchanged valuable information on the common (and less common) challenges of doing successful research in our respective environments.


the inevitable conflict between data privacy and science

Sunday, January 4th, 2009 by kc

Balancing individual privacy against other needs, such as national security, critical infrastructure protection, or even science, has long been a challenge for law enforcement, policymakers and scientists. It’s good news when regulations prevent unauthorized people from examining the contents of your communications, but current privacy laws often make it hard, and sometimes impossible, to provide academic researchers with the data needed to scientifically study the Internet. Our critical dependence on the Internet has rapidly grown much stronger than our comprehension of its underlying structure, performance limits, dynamics, and evolution, and unfortunately current privacy law is part of the problem: legal constraints intended to protect individual communications privacy also leave researchers and policymakers trying to analyze the global Internet ecosystem essentially in the dark. To make matters worse, the few data points available suggest a dire picture, casting doubt on the Internet’s ability to sustain its role as the world’s preferred communications substrate. In the meantime, Internet science struggles to make progress with much less empirical data than most fields of scientific inquiry have available.