IP-AS mappings

July 28th, 2010 by amogh

We have performed an analysis of the IP-AS mapping obtained from Routeviews/RIPE collectors.

A crucial step in various research efforts that study the Internet infrastructure is to map an IP address to the Autonomous System (AS) to which it is assigned. The most common approach to map IP addresses to ASes is by using BGP table dumps from public repositories such as Routeviews and RIPE. We assign “ownership” of an IP address to the AS that originates the longest BGP prefix that matches the IP address. Routeviews and RIPE, however, have multiple collectors, each of which peers with a diverse set of ASes. Consequently, the IP-AS mapping obtained by using the BGP table dump from one collector could be different from that obtained from a different collector. The obvious solution is to aggregate views from as many vantage points as possible to obtain the most complete IP-AS mapping possible. In practice, however, it is common to use data from just one or two collectors, as it greatly simplifies the process of collecting and pre-processing data. The goal of our analysis is to compare different collectors, in terms of the different metrics that we are interested in, viz. address space coverage, IP-AS mapping, unique ASes, unique prefixes, unique more specific prefixes, AS links, and AS paths. Further, we study the utility of adding data from more collectors, in terms of the resulting change in the aforementioned metrics. Finally, we compare the IP-AS mapping from Routeviews and RIPE tables with that obtained from Team Cymru’s whois service.
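The longest-prefix-match rule described above can be sketched in a few lines. This is a minimal illustration, not the pipeline we used: the function names `build_table` and `lookup` are made up for this example, the prefixes and AS numbers come from the documentation ranges, and a linear scan stands in for the trie a real tool would use.

```python
import ipaddress

def build_table(routes):
    """Map each announced prefix to the set of origin ASes seen for it."""
    table = {}
    for prefix, origin in routes:
        net = ipaddress.ip_network(prefix)
        table.setdefault(net, set()).add(origin)
    return table

def lookup(table, address):
    """Return (prefix, origins) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for net, origins in table.items():
        # Only compare addresses and prefixes of the same IP version.
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, origins)
    return best

# Hypothetical routes: (prefix, origin AS) pairs as they might be
# extracted from one collector's table dump.
routes = [
    ("198.51.100.0/24", 64500),
    ("198.51.100.128/25", 64501),   # more specific prefix, different origin
    ("192.0.2.0/24", 64502),
    ("192.0.2.0/24", 64503),        # same prefix, second origin (MOAS)
]
table = build_table(routes)
print(lookup(table, "198.51.100.200"))
```

An address covered by both the /24 and the /25 is assigned to the origin of the /25, and a prefix announced by two origins maps to both of them, which is why the lookup returns a set rather than a single AS.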

The figures above show the relative changes in address space coverage when we start with table dumps from Routeviews’ LINX collector (which we choose as the base table, as it provides the largest coverage of IPv4 address space of any single table in Routeviews), and successively add data from collectors in decreasing order of address space coverage. The first plot above shows that data from additional collectors results in less than 1% increase in address space coverage, and the second plot shows that additional collectors incur a change in the IP-AS mapping for fewer than 1% of addresses represented in the tables.
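As a rough sketch of the procedure just described, the code below starts from the table with the largest coverage and adds the remaining tables in decreasing order of coverage, reporting the relative gain at each step. The collector names and prefixes are invented for illustration, and the helper names `coverage` and `incremental_coverage` are assumptions of this example.

```python
import ipaddress

def coverage(prefixes):
    """Number of distinct IPv4 addresses covered by a collection of prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges overlapping and adjacent networks,
    # so addresses covered by several prefixes are counted once.
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))

def incremental_coverage(collectors):
    """Add collectors in decreasing order of their own coverage.

    Returns a list of (name, cumulative addresses, percent gain over the
    previous step); the first (base) entry has a gain of 0.0.
    """
    order = sorted(collectors, key=lambda name: coverage(collectors[name]),
                   reverse=True)
    seen = set()
    previous = 0
    results = []
    for name in order:
        seen.update(collectors[name])
        total = coverage(seen)
        gain = 100.0 * (total - previous) / previous if previous else 0.0
        results.append((name, total, gain))
        previous = total
    return results

# Hypothetical per-collector prefix lists (IPv4 only).
collectors = {
    "base": ["192.0.2.0/24", "198.51.100.0/24"],
    "extra": ["198.51.100.0/25", "203.0.113.0/25"],
}
for name, total, gain in incremental_coverage(collectors):
    print(name, total, round(gain, 2))
```

The same loop works for the other metrics (unique ASes, prefixes, AS links) if `coverage` is swapped for a function that counts the metric of interest over the accumulated tables.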

We repeated the same analysis for other metrics of interest – unique prefixes, unique more-specific prefixes, unique ASes, unique origin ASes, and unique AS links. We find that additional table dumps yield fewer than 1% additional (previously unseen) ASes and origin ASes, confirming previous reports that most ASes are observable from even a few vantage points. However, for other metrics, additional table dumps matter: adding a table dump can yield up to 4.8% more AS links (shown in the above figure), up to 4.6% more prefixes, and up to 4.7% more more-specific prefixes than seen in the base table. Furthermore, between 10% and 70% of the more-specific prefixes seen in additional table dumps are originated by a different origin AS than in the base table.

We also compared the IP-AS mapping obtained from Routeviews/RIPE table dumps with that obtained from Team Cymru’s whois service. We find that the difference between the IP-AS mapping from Cymru and that obtained by combining data from all Routeviews and RIPE collectors is small (0.7% of queried addresses returned a different origin AS). 56% of these IP-AS mismatches are due to cases where Cymru and the table dumps return a single, but different AS. A significant number (41%) of mismatches are due to Multi-Origin ASes (MOASes). In particular, 34% of IP-AS mismatches are due to MOASes where the Cymru mapping does not contain one of the ASes returned by the table dumps.
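One way to bucket the mismatches discussed above is sketched below. The category names and the exact rules are our reading of the text for illustration (in particular, "MOAS missing in whois" is taken to mean that the table dumps return several origins and the whois answer lacks at least one of them); the addresses and AS numbers are made up.

```python
from collections import Counter

def classify(table_origins, whois_origins):
    """Compare the origin-AS sets that two sources return for one address."""
    if table_origins == whois_origins:
        return "match"
    if len(table_origins) == 1 and len(whois_origins) == 1:
        return "single_different"
    # Assumed rule: table dumps show a MOAS and whois lacks one of its origins.
    if len(table_origins) > 1 and not table_origins <= whois_origins:
        return "moas_missing_in_whois"
    return "moas_other"

def tally(table_map, whois_map):
    """Count categories over the addresses present in both mappings."""
    counts = Counter()
    for address in table_map.keys() & whois_map.keys():
        counts[classify(table_map[address], whois_map[address])] += 1
    return counts

# Hypothetical query results: address -> set of origin ASes.
table_map = {
    "192.0.2.1": {64500},
    "192.0.2.2": {64500},
    "198.51.100.1": {64501, 64502},
    "203.0.113.1": {64503, 64504},
}
whois_map = {
    "192.0.2.1": {64500},
    "192.0.2.2": {64510},
    "198.51.100.1": {64501},
    "203.0.113.1": {64503, 64504, 64505},
}
print(tally(table_map, whois_map))
```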

In summary, our findings are reassuring. In terms of IP-AS mapping, using data from just a few of the largest Routeviews/RIPE collectors is sufficient; adding data from more collectors does not significantly change the IP-AS mapping or coverage of IPv4 address space. Also, using data from Routeviews/RIPE is not significantly different from using Team Cymru’s whois service. In fact, in the data we compared, the combination of table dumps from all Routeviews/RIPE collectors gave a better view of MOAS prefixes than Cymru’s lookup service.

Growth trends in the AS-level Internet

May 7th, 2010 by amogh

We have studied growth trends in the number of ASes seen advertised in the global routing system from different regional registries (similar to Geoff Huston’s 32-bit AS Number Report, but with per-registry trends). We used Routeviews and RIPE BGP dumps over the last 12 years, and Team Cymru’s WHOIS lookup service to map ASNs to registries as of March 2010. To our knowledge, historical data to map an ASN to a regional registry at any given time in the past is not available, so we cannot account for ASN movement between registries. More information about the data collection and pre-processing is in our IMC 2008 paper, “Ten Years in the Evolution of the Internet Ecosystem” and our supplemental data page.

Our most interesting observation is that the two largest registries in terms of the number of advertised ASes (ARIN and RIPE) have shown distinctly different growth trends since 2001. Both registries showed exponential growth until mid-2001, but since then ARIN’s AS count has grown linearly while RIPE’s has continued to grow exponentially, though with a smaller exponent than in the pre-2001 period. The number of advertised ASes allocated from RIPE is now larger than ARIN’s.
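A simple way to check whether a series of AS counts looks more linear or more exponential is to fit both models by least squares and compare the residuals. The sketch below is only a heuristic on synthetic numbers, not the analysis behind the plot: the exponential model is fitted in log space, and the function names are invented for this example.

```python
import math

def least_squares(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sum_squared_error(observed, fitted):
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))

def better_model(times, counts):
    """Return 'linear' or 'exponential', whichever fit has the smaller error."""
    slope, intercept = least_squares(times, counts)
    linear_fit = [intercept + slope * t for t in times]
    # Exponential model count = a * exp(r * t), fitted as a line in log space.
    rate, log_a = least_squares(times, [math.log(c) for c in counts])
    exp_fit = [math.exp(log_a + rate * t) for t in times]
    if sum_squared_error(counts, linear_fit) <= sum_squared_error(counts, exp_fit):
        return "linear"
    return "exponential"

# Synthetic series (not measured data): one grows by a fixed amount per
# period, the other by a fixed percentage per period.
periods = list(range(10))
steady = [1000 + 500 * t for t in periods]
compounding = [1000 * 1.3 ** t for t in periods]
print(better_model(periods, steady), better_model(periods, compounding))
```

Applied per registry and per time window (for example, before and after mid-2001), this kind of comparison makes the change of regime described above explicit.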


We conjecture a couple of possible reasons for the shift. One possible contribution is the presence of companies in Europe that assist enterprises in obtaining Provider Independent (PI) address space and AS numbers. RIPE Labs’ Forum published a recent discussion of this issue. A second, related factor is the inclination of enterprises to seek PI address space (and ASNs) in the first place. The primary objective of enterprises in obtaining an ASN and PI address space is to multihome, i.e., to connect to multiple upstream providers for reliability, performance, and other traffic engineering goals. A larger concentration of Internet Exchange Points (IXPs) and more competition in the European transit market could make multihoming more attractive for enterprise customers in Europe than in North America. Measurements in our IMC 2008 paper confirm that the transit market is now larger and more dynamic in Europe than in North America. But we cannot confirm this theory directly with BGP routing data, since only networks with ASNs show up in BGP, and the majority of customer ASes (both in Europe and North America) are multihomed (otherwise, technically, they should not need an ASN in the first place). It would be illuminating to examine whether enterprise customers that did not request an ASN in North America would have pursued an ASN and PI address space if they were in Europe, simply because the competitive transit market in Europe makes multihoming more attractive.

data collection and reporting requirements for broadband stimulus recipients

November 12th, 2009 by kc


No one was more surprised than I to see data collection requirements in the NTIA’s Notice of Funds Availability (NOFA) for the Rural Utilities Service’s (RUS) Broadband Initiatives Program (BIP) and the Broadband Technology Opportunities Program (BTOP):

Awardees receiving Last Mile or Middle Mile Broadband Infrastructure grants must report, for each specific BTOP project, on the following:

  1. The terms of any interconnection agreements entered into during the reporting period;
  2. Traffic exchange relationships (e.g., peering) and terms;
  3. Broadband equipment purchases;
  4. Total and peak utilization of access links;
  5. Total and peak utilization on interconnection links to other networks;
  6. Internet protocol address utilization and IPv6 implementation;
  7. Any changes or updates to their network management practices;


Incumbents have fought hard against far less onerous data collection requirements — indeed, the above requirements in part kept incumbents away from applying for BTOP funds. So the pragmatist in me cannot imagine these requirements actually being enforced, much less extended to existing providers of access to the public Internet as part of the national broadband plan, which the FCC owes Congress by February 17th. However, the researcher in me can imagine such requirements, in conjunction with privacy-sensitive data sharing frameworks (e.g., one we’ve proposed), positively transforming the state of Internet science and cybersecurity. Kudos to NTIA for this earnest attempt to improve the transparency of an industry more opaque than the financial sector (for similar reasons, and in the face of just as profound risks).

‘academic’ thoughts about a ‘future Internet’

October 12th, 2009 by kc

[we didn't intentionally take the summer off from blogging, we just got mired in more proposal and paper deadlines than we've ever experienced. will catch up this quarter.]

This post is our submitted response to NSF’s call for expressions of interest in the Future Internet Architectures summit, which i am attending this week.

What scientific contributions will you bring to the discussion about Future Internet architectures?

As scientists, we are compelled to explore how the peculiar structure relates to the function(s) of complex networks. Many complex networks in nature share the peculiar structural character of the Internet, but they also manifest phenomenal behavior: they efficiently route information without any observable routing protocol overhead. This achievement is currently beyond the reach of man-made networks. The Internet still uses a 30-year-old routing architecture with fundamentally unscalable overhead requirements. Yet in those 30 years, the Internet’s inter-domain topology has evolved toward a structure for which nature has superior routing technology, if only we can figure out how to use it!

The prospect of zero-overhead routing is sufficiently attractive that in our previous NeTS-FIND project we developed a new theoretical framework to study it. In our framework, nodes in real networks exist in a separate but related “hidden metric space,” which guides routing without overhead or topology knowledge. We found strong evidence that not only do hidden metric spaces underlie real complex network topologies including the Internet (http://www.caida.org/publications/papers/2008/self_similarity/), but that a greedy routing mechanism applied to such topologies and underlying spaces yields a maximum percentage of paths that successfully reach their destinations. Remarkably, these successful paths almost always are shortest, regardless of the hidden space structure (http://www.caida.org/publications/papers/2009/navigating_ultrasmall/). This explanation for why (if not how) complex networks are naturally navigable had sufficiently high interdisciplinary impact for recent publication in Nature (http://www.caida.org/publications/papers/2009/navigability_complex_networks/).
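To make the greedy routing idea concrete, here is a toy sketch. It assumes each node knows only its own coordinates, its neighbors' coordinates, and the destination's coordinates; plain 2-D Euclidean distance stands in for the hidden metric space, and the graph, coordinates, and function name are invented for illustration.

```python
import math

def greedy_route(graph, coords, source, target):
    """Forward to the neighbor closest to the target in the hidden space.

    Returns the path as a list of nodes, or None if the packet reaches a
    node with no neighbor closer to the target than itself.
    """
    path = [source]
    current = source
    while current != target:
        here = math.dist(coords[current], coords[target])
        nxt = min(graph[current],
                  key=lambda n: math.dist(coords[n], coords[target]),
                  default=None)
        # Stop at a local minimum: no neighbor makes progress toward the target.
        if nxt is None or math.dist(coords[nxt], coords[target]) >= here:
            return None
        path.append(nxt)
        current = nxt
    return path

# Hypothetical topology (adjacency lists) and hidden coordinates.
graph = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b"], "d": ["a"], "e": []}
coords = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (0, 1), "e": (3, 3)}

print(greedy_route(graph, coords, "a", "c"))
print(greedy_route(graph, coords, "d", "e"))
```

The fraction of source-destination pairs for which such a function returns a path is the success ratio referred to above; no node ever consults a routing table or exchanges routing updates.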

We have also developed a model of Internet growth which provides strong evidence that preferential attachment is a driving force behind Internet evolution (http://www.caida.org/publications/papers/2009/AS_evolution/).  The model yields AS-level topologies with links annotated by AS business relationships (customer-provider or peer-to-peer), and suggests that preferential attachment must be related to economic realities of ISP business decisions. To our knowledge, this is the first Internet evolution model that is realistic, parsimonious, analytically tractable, uses only measurable parameters, and “closes the loop.” The last feature means that having all model parameters measured from real Internet data, and substituted in analytic solutions, we can predict peculiar structural and dynamical properties of the Internet.

Finally, CAIDA contributes an active measurement platform (Ark) as well as vital data resources to the creation of an underlying discipline that formalizes our observations and understanding of large-scale, complex networked systems such as the Internet. Ark directly addresses a short-term call in the Network Science and Engineering Council’s recently published research agenda, namely to improve the quality of measurement-driven research in the computer networking community and a broader range of scientific disciplines. Ark provides an opportunity to test and validate hypotheses about how the current Internet operates. We are planning modules for integration with other data sources, as well as external validation of measurements and inferences against reported reality, to help balance the inevitable trade-off between fidelity and utility of network models.

Discuss how your research ideas might contribute to an overall network architecture, where the focus is on the system as a whole and on the interactions among the components.

We are using our current FIND funding to investigate exactly this question, including implementing and potentially deploying greedy routing over hidden metric spaces in an experimental control plane such as LISP (http://www.ietf.org/dyn/wg/charter/lisp-charter.html and http://tools.ietf.org/wg/lisp/). This implementation/deployment initiative will require a coordinated effort among different groups working on future Internet architectures.

Indeed, there are many practical technical details that still need to be worked out. Which components can we deploy incrementally? For example, we must change the semantics of IP packets to hold hidden space coordinates of the packet’s destination. Since we cannot touch end systems, we need address family gateways that translate between IP and hidden space headers, similar to the mapping function implemented as part of LISP. Based on our preliminary estimates, the IPv6 header provides enough bit space to hold hidden coordinates, so that LISP does appear quite close to what we need at the control plane. However, the data plane changes are more involved. We have begun discussions with router vendors regarding implementation constraints.

And then we still have policy, security, ownership, trust, and business models to worry about. But information dissemination (e.g., routing and forwarding) is the core function of any network. Our approach is to modernize how we understand and implement this primary function as well as the associated implications for realistic future network architectures.

What is your experience in working collaboratively in a multidisciplinary setting, across disciplines and areas of expertise as well as across academe and industry?

CAIDA has extensive experience participating in as well as coordinating and hosting interdisciplinary conversations, as illustrated by the listing of our ISMA workshop series titles (http://www.caida.org/workshops/). Many of our workshops have focused on interdisciplinary conversations explicitly structured to bridge gaps between and build connections across domains. In August 2008 we co-hosted a workshop on Networks and Navigation at the Santa Fe Institute, a unique institution dedicated to multidisciplinary collaborations on complex systems.

A strength of our recent work is the composition of researcher skills including first-hand knowledge of operational and engineering realities of the Internet, expertise in the theory and practice of Internet routing and data collection, skills in mathematical analysis and modeling of complex networks, the ability to provide realistic approximations to analytically intractable problems, experience with large-scale network simulation and emulation, and the interdisciplinary capability to broaden the impact of this project to other disciplines. The collection of results we have achieved so far demonstrates that an interdisciplinary research team can make rapid progress in formalizing our understanding of large-scale, complex networked systems.

AIMS 2009 Workshop Report

July 15th, 2009 by kc

We finally posted the final report for our workshop on Active Internet Measurements (AIMS ‘09). The abstract:

Measuring the global Internet is a perpetually challenging task for technical as well as economic and policy reasons, which leaves scientists as well as policymakers navigating critical questions in their field with little if any empirical grounding. On February 12-13, 2009, CAIDA hosted the Workshop on Active Internet Measurements (AIMS) as part of our series of Internet Statistics and Metrics Analysis (ISMA) workshops which provide a venue for researchers, operators, and policymakers to exchange ideas and perspectives. The two-day workshop included presentations, discussion after each presentation, and breakout sessions focused on how to increase potential and mitigate limitations of active measurements in the wide area Internet. We identified relevant stakeholders who may support and/or oppose measurement, and explored how collaborative solutions might maximize the benefit of research at minimal cost. This report describes the findings of the workshop, outlines open research problems identified by participants, and concludes with recommendations that can benefit both Internet science and communications policy. Slides from workshop presentations are available at http://www.caida.org/workshops/isma/0902/.

What’s Belmont Got To Do With It?

June 12th, 2009 by erin

Recently a group of Internet technology researchers, attorneys and policy professionals participated in a DHS-sponsored workshop, “Ethical Principles and Guidelines for the Protection of Human Subjects in Information and Communications Technology Network and Security Research.” Possible nickname: Belmont Flux Workshop. If you’re still glassy-eyed: (1) you have yet to engage the depths of an Institutional Review Board (IRB) in the context of network and security research; (2) you gave up after seeing “Ethical principles”; and/or (3) you think human subjects issues and network research are orthogonal.

Here’s a summary of the event, and hopefully some inspiration. The purpose of the workshop was to attempt to interpret the guidelines set forth in the three-decades-old Belmont Report as they might translate to the newer and more dynamic domain of Internet, and particularly Internet security, research. The Belmont Report was promulgated by a Commission spawned from the National Research Act of 1974. It provided guidance for protecting human subjects involved in biomedical and behavioral research supported by the now-named Dept. of Health and Human Services (HHS). This Belmont Report became the basis for HHS regulations (codified at 45 CFR part 46) which in turn became the model for the uniform rules (the “Common Rule”) for human subjects research for 14 other Federal departments and agencies.

The important takeaway from this recount of authoritative history is understanding what catalyzed it. The ground truth of our individual and collective human nature is to not take precautionary, preventative or remedial measures until we’ve been damaged, materially or otherwise. This practical truth is institutionalized in our system of law and regulation, which largely reacts to appreciable harm by proscribing and prescribing certain actions. The original Belmont Report occurred ex post facto to infamous abuses of human subjects experimentation by doctors and scientists such as in WWII concentration camps and the 1940’s Tuskegee syphilis study. As a result of these abuses, the government recognized a need to develop standards for judging doctors, scientists and researchers whose work involves human subjects. These principle-based standards have been applied in the context of formal judicial proceedings, e.g., the Nuremberg War Crime Trials, down to researchers concerned about ethically sound experiment design and review committees (e.g., IRBs) to assess whether research risks are justified.

Fast forward to today’s information and communication technology (ICT) landscape, and in particular to network and security research on the global Internet, a domain that has evolved similar principles to the Belmont Report, but has no ratified method of applying them. Rather than wait for the first ‘Electronic Guantanamo Experiment’, the ultimate goal of this workshop series (there is likely to be at least another workshop) is to establish ethically defensible guidelines for current and future network and security research, so that both individually and collectively we can more effectively avoid and/or mitigate risks of harm to persons. Guidelines ratified by the research community will also help navigate the legal grey area of ICT transactions in daily operations.

To map the Belmont principles from traditional scientific disciplines into a blueprint for network and security research, we considered three axes:

  1. the boundaries between ICT network research and the accepted and routine practice of network operations management;
  2. the basic ethical principles of: (a) respect for persons (research should consider persons’ choice and opinions, should provide adequate notice and allow voluntariness, and persons with diminished autonomy deserve protection); (b) beneficence (research should maximize possible benefits and minimize possible harms); (c) justice (benefits should accrue to those who bear any burden of the research and the burdens of the research should be distributed to the extent reasonable); and
  3. the application of those principles by way of (a) informed consent (how does it apply to different types of network measurement and experimental research?); (b) risk-benefit analysis (does the research merit the risk to subjects?); and (c) selection of subjects (are the research subjects in the same population who will benefit from the results?), respectively.

Our Game Plan:

Day 1 consisted largely of foundational presentations to help frame the discussions of the three components above. The first panel gave background information and perspectives from Institutional Review Boards, including HHS and several academic and research organizations. The second panel was comprised of network and security researchers disclosing common and prominent scenarios that vividly illustrate the need for interpretation of these ethical principles in the expanding domain of Internet research. Finally, a few attorneys addressed prominent legal issues in empirical Internet research.

The remainder of the two-day workshop consisted of two breakout sessions, both tasked with a gap analysis between the earlier presented research data use cases and the Belmont framework, recognizing that some aspects of the framework will not translate well to the network research domain, e.g., pregnant persons being in a diminished capacity category, and other aspects will need to be added to a viable framework for network research. The case-based scenarios included: botnet research (e.g., infiltration of botnets and monitoring or disrupting traffic); wide-scale network survey research (e.g., port and wireless scanning); experiments involving reputation services (e.g., scoring and publishing blacklist data); network traffic analysis (e.g., backbone tapping, P2P research); and research involving deception of individuals (e.g., phishing research, honey-* research).

Interestingly, each group produced quite different but complementary results. One group took a high-level approach and crafted the beginnings of a fleshed out Belmont framework that could generally apply to network research, including some but not all portions of Belmont while including additional principles and application guidance.

The other group anchored off the general use cases and similarly highlighted the components of the original Belmont Report that were irrelevant and those in need of interpretation. For the latter, this group expounded on specific technology, privacy and risk-assessment issues to consider.

The cost-benefit element of Belmont was arguably the most fundamental dimension of our task, and certainly the most vexing. We cannot expect otherwise, as our electronic operational lives — both individual citizen-consumers and information age institutions — are forming the risk-based synapses that we take for granted in traditional, analog (meatspace) activities. At the least, we are challenged to understand and frame if not (help policymakers) define boundaries of psychological, physical, legal, social and economic harms in the electronic landscape. (If you thought measuring one-way packet delay was hard..)

While few if any would argue against the benefits of empirically grounded network, security, and critical infrastructure protection research, there is little fundamental appreciation and understanding of those benefits — and much well-founded concern regarding privacy. Ethical and legal challenges inhibit access to network data or impede in vivo network experimentation for measurement and analysis, and generalize across familiar spaces such as on-line crime, computer systems security threats, and infrastructure vulnerabilities.

Other thoughts from the workshop on how to effectively balance network research utility and ethical obligations among various stakeholders:

  1. Network researchers pursuing scientific and intellectual freedom, and empirical knowledge that will inform business models and policies predicated on economic and usage patterns, security, and social behavior, etc.;
  2. Data subjects and owners seeking the benefits of technology advancement without having to surrender control of personal information or renounce liberties and freedom of on-line movement;
  3. Network/platform owners exercising their rights in a free market economy to create wealth and cultivate business and customer relationships; and,
  4. Collective right of network and data owners to build and enhance the networks within which norms, transactions and livelihoods are maturing.

Motivated to try to get ahead of the metaphorical Milgram experiment (not to be confused with his small world experiment; we’re actively trying to emulate that one in a future routing architecture) in the field of Internet research, this workshop was an initial step in that direction. I’d say we succeeded in raising the level of discourse surrounding the application of ethical principles to ICT network research and upon which specific rules may be formulated, criticized and interpreted. We’ll supplement the intellectual capital we produced with subsequent workshops, dialogue and research. The eventual outcome will be a formal report (I’ll codename it “Belmont Flux Report” just for the moment) designed to serve as guiding policy for stakeholders. Stay tuned.

Where you see risks, I see opportunity. — Alfred Blalock (performed first heart surgery, documented in Something the Lord Made).

a recent visit to the fcc

June 9th, 2009 by kc

I spent a few hours at the FCC two weeks back, presented a slide version of a top ten list I wrote last year. Requested discussion topics: obstacles to data collection, how data is collected and used, policy-making based on inference, how to develop an objective knowledge base for science and policy, privacy expectations/rights versus the need for understanding the system as critical infrastructure. Audience mostly lawyers, worried about how they are going to accomplish a reasonable broadband plan. As I tried to describe in my five-minute presentation slot (and 1 slide, and more expansive blog entry) on the broadband panel at the DOC ten weeks ago, solutions begin with recognition of some underlying empirical facts, starting with one that is strangely not being emphasized by lobbyists: you can’t make Wall-Street-approved margins moving bits around over long distances. Lot of implications to that reality; the sooner we admit it, the more realistic our broadband plan will be.

CAIDA’s Annual Report for 2008

June 3rd, 2009 by josh

2008 was an exciting year for the Internet and no less exciting for CAIDA. As network-capable personal/computing devices became ever more affordable and ubiquitous, and developers continued the flow of [open] applications/protocols that make it easier to create, capture, edit, publish and share information at the increasing speeds allowed by optical fiber, cable, and wifi services, we continue to make vast empirically untested assumptions about how the Internet is financed, operated, and used. What’s going on under the hood of the engine of our new digitized economy?

Over the last two decades, the Internet operational and research communities have gathered overwhelming evidence that underneath the exciting developments at the application level, the Internet’s architecture faces overwhelming and relatively near-term challenges with arguably intractable technological, political, social, and economic dimensions. We have previously taxonomized these problems into four categories of concerns for the Internet as emerging critical infrastructure: safety, scalability, sustainability, and stewardship.

CAIDA’s 2008 Annual Report describes our recent efforts to illuminate these aspects of the Internet, providing highlights from our research, infrastructure, and outreach activities. Our current research projects, primarily funded by the U.S. National Science Foundation (NSF), include several measurement-based studies of the Internet’s core infrastructure, focused on the health and integrity of the global Internet topology, routing, addressing, and naming systems.

We made fundamental advances in several of our research projects this year, supported by increased coverage by our measurement infrastructure, and increased collaborations with colleagues around the world. Highlights from the annual report include:

  • The first full calendar year of the most comprehensive annotated view of IPv4 topology thus far. We also began to deploy IPv6 topology measurement instrumentation.
  • Some of our topology research focused on how different routing approaches in nature are maximally efficient on certain types of peculiarly structured topologies, conveniently, those structured like the Internet AS graph. Further, we found that self-similarity of clustering in real complex networks provides strong empirical evidence that some hidden metric spaces underlie these networks. In trying to model self-similar (scale-free) networks embedded into such a hidden space, we discover that a certain approach to routing — greedy routing — is phenomenally successful and efficient in such a model. We are still exploring the ramifications of this intense discovery, and the even more intriguing breakthrough that this hidden space seems to be hyperbolic.
  • Our research into network growth dynamics also yielded two papers with surprising results about different regimes of network growth: (1)  that there may be a vast pre-asymptotic regime of complex network growth that gives rise to power-law like effects in degree distribution; (2)  a simple customer-provider-based modification of the preferential attachment model can account for Internet topology evolution, including the ISP consolidation toward monopoly.
  • increased active and passive measurement infrastructure as well as continued maintenance of a catalog of Internet measurement data sets.
  • coordination and analysis of another DITL (Day in the Life of the Internet) worth of data.
  • updated our real-time traffic report generator, geographical visualizations of DNS workload to a given set of servers, updates to our IPv4 and IPv6 AScore posters, and visual maps of IPv4 address space consumption.
  • a set of blog entries that became a short Internet research tutorial for policy folks.

For all the exciting details, we encourage you to read the full report and post comments/questions, which we can integrate into next year’s update of our strategic program plan.

Proposal for ICANN/RIR scenario planning exercise

May 25th, 2009 by kc

[“Internet infrastructure economics research”, and how to do reasonable examples of it, has come up a lot lately, so i’m posting a brief description of an academic+icann community workshop i’ve been recommending for a few years, which has yet to happen, and (I still believe) is long past due, and specifically more important than passing policies, especially emergency ones to allow IP address markets with no supporting research on the impact on security and stability of the Internet, and even at the risk of killing IPv6 altogether.]

Goal: a more structured conversation according to the established discipline of scenario planning.

Objective: help understand what we don’t know. different way of seeing, thinking, ‘re-perceiving’ link system structure and behavior — “model what you don’t know”


Phase 1: Strategic assumption surfacing and testing (SAST). Start with a specific decision (in our case, IPv4 address markets/transfer), build out toward environment/context:
(1) what are driving forces/trends in macro environment
(2) what is uncertain, inevitable? rank forces by importance
(3) what do decision makers want to know?
(4) what will they see as success or failure?
(5) what considerations will shape these outcomes?

Phase 2: Interview key players

Phase 3: Create proposed scenarios (~4; no probability assignment, since this is not about predicting the future, but understanding and preparing for the future). Effective scenarios are:
(a) plausible and surprising
(b) have the power to break old stereotypes
(c) decision-makers assume ownership of the scenario
(d) participatory (help thoroughly flesh out scenario)
(e) few in number, the differences among which matter to decision-makers.

So we would need scenarios to cover routing table explosion, nationalization of the addressing allocation function (and thus likely other aspects of Internet infrastructure), and market cartelization, as well as a takeoff of IPv6 growth.

Phase 4: Create scenarios as a group (workshop #1, 2 days)
(a) understand present, past, demographic and technology changes
(b) describe variety of possible futures
(c) delineate how scenarios above evolve
(d) identify indicators to track what may trigger scenarios
(e) link to specific decisions
(f) link to analysis process
(g) link to organizational procedures
(h) involve decision makers

(So (c) above is where you would make sure someone writes up a neutral analysis of the “NAT tax”, that allegedly kills growth by strangling new applications and paving nonneutral networks. no easy trick, but the RIRs should make sure there is evidence of an earnest attempt.)

Workshop day 1: 1 hour defining issue; 3-4 hours key factors, environmental forces, setting on scenario matrix; 3-4 hours socialize informally, compare impressions

Workshop day 2: 2nd thoughts on skeletal scenario logic; 1-2 hours: fleshing out one scenario together: beginning, middle, end. afternoon: break up into smaller groups to flesh out other scenarios, including preliminary and strategic impacts of each

Phase 5: follow up after workshop: 4-6 weeks of interim research while writing final scenarios and exploring implications. circulate drafts, more interviews.

Phase 6: (possibly another workshop to) develop a framework for how to monitor indicators and reevaluate scenarios in light of empirical data.

Participants:
– at least 1-2 representatives from each RIR
– 1-2 representatives from ICANN and advisory councils
– 4 economists/media policy folks
– 2-4 Internet routing operational experts
– 1-2 from U.S. DOD (who have elephantine amounts of legacy IPv4 space)
– researchers from related disciplines, with accepted abstract submission

(need representation/support/participation from: top management, key decision makers and implementers, broad range of functions and divisions represented, imaginative, open minds, at least 2 people who can write up results in an unbiased way)



References

Learning from the Future: Competitive Foresight Scenarios

The Sixth Sense: Accelerating Organisational Learning with Scenarios

Inevitable Surprises: Thinking Ahead in a Time of Turbulence

Creating Futures: Scenario Planning As a Strategic Management Tool

A handbook for scenario planning: practicing futurists Bill Ralston and Ian Wilson offer practical guidelines for using scenarios in business settings

The Changing Foundation of the Internet: Address Transfers and Markets

Reform Establishing the Rule of Law (pdf)

According to the Best Available Data: internet telemetry, v6

[disclosure: ARIN has sponsored CAIDA research efforts in gauging IPv6 penetration and obstacles, some results presented at ARIN meetings (October 2005, April 2008, and October 2008), others on the research pages of CAIDA’s website. ARIN has also told me it is planning to launch a more formal research program, which could be used to inform current and future policy debates.]

ethical phishing experiments have to lie?

May 4th, 2009 by kc

Stefan pointed me at a paper titled “Designing and Conducting Phishing Experiments” (in IEEE Technology and Society Special Issue on Usability and Security, 2007) that makes an amazing claim: it might be more ethical to not debrief the subjects of your phishing experiments after the experiments are over. In particular, you might ‘do less harm’ if you do not reveal that some of the sites you had them browse were phishing sites.

This brings us to the question: Does a phishing experiment that deceives a subject and exposes the subject to a fake phishing attack adversely affect the subject’s rights or welfare? As noted above, as long as the researcher can ensure the security of any personal information of any information released by the subject (the procedures of which are outlined below), neither a laboratory phishing study nor a naturalistic phishing study should adversely affect the welfare of the subject. However, we question whether the use of debriefing in naturalistic phishing studies might, in fact, adversely affect the welfare of the subject and propose that this, in part, is justification for not debriefing subjects in these types of phishing studies. In regards to adversely affecting the rights of subjects, the use of deception or waiving consent is not seen as a violation of a personal right, see 45 CFR 46 [5], 116 and [7]. Although laudable, the right to know the truth is not a recognized absolute right. However, the federal regulations and ethicists recognize that it is advisable to address this issue and use debriefing to provide the pertinent information relevant to the truth, when appropriate, see 45 CFR 46 [5], 116(d)4, and [7]. The question we raise is whether using debriefing in a naturalistic phishing study is appropriate.

“Designing and Conducting Phishing Experiments”, Peter Finn and Markus Jakobsson, http://www.indiana.edu/~phishing/papers/finn-conducting.pdf

This is an interesting, but questionable position: “If people know what’s happening, then they will be upset. But what they will be upset by is learning they were deceived, therefore we must completely deceive them.” That’s an argument that makes a case against itself in one sentence.

There are other problems with the approach, including the assumption of implicit rationality in the users; it does not address the prevalence or degree of anxiety and even fear of being observed in digital media. The researchers present the problem as dichotomous, choosing not to explore methods that could establish the degree of difference between behavior during informed consent and non-consent. At what sample size and study interval do informed consent procedures change behavior? (If you told someone you were studying their behavior on the Internet for the next hour, they’d probably change. But over the next year?) Also, what’s wrong with knowing only conservative values of phishing vulnerability? If it’s such a big problem, wouldn’t even those estimates be influential in designing anti-phishing sites and informing policymakers and law enforcement?

There is a lot of research that is compromised, or completely impossible, with informed consent. But in cases where those compromises can be studied, and estimates of uncertainty established, perhaps researchers (especially psychology researchers?) should not be exempt from that process.

However, I’ve also heard from commercial security consultants that the “tricking users into getting phished without telling them” approach is exactly how many corporations measure the extent to which their own employees are getting phished on corporate networks. Of course, commercial entities don’t need their internal research projects to pass IRB approval, or peer review, much less public review. The paper’s most important contribution may be its acknowledgement of the lack of current guidelines for how to conduct ethical Internet research. DHS S&T’s upcoming workshop on Ethical Issues in Network Research (26-27 May, by invitation) is happening not a moment too soon. More on this workshop later.