Guiding principles for a “Bureau of Cyber Statistics”

Saturday, April 24th, 2021 by David Clark and kc claffy

The recent Cyberspace Solarium Commission report (1) set out a strategic plan to improve the security of cyberspace. Among its many recommendations is that the government establish a Bureau of Cyber Statistics, to provide the government with the information that it needs for informed planning and action. A recent report from the Aspen Institute echoed this call. (2) Legal academics and lobbyists have already started to consider its structure. (3) The Internet measurement community needs to join this conversation.

The Solarium report proposed some specific characteristics: it recommends a bureau located in the Department of Commerce, funded and authorized to gather necessary data. The report also says that “the center should be funded and equipped to host academics as well as private sector and independent security researchers as a part of extended exchanges”. We appreciate that the report acknowledges the value of academic researchers, but achieving this objective requires careful thought. The report specifically mentions “purchasing private or proprietary data repositories”. Will “extended exchanges” act as the only pattern of access, where an academic would work under a Non-Disclosure Agreement (NDA), unable to publish results that relied on proprietary data? Would this allow graduate students to participate? If so, how would they publish a thesis? The proposal does not indicate deep understanding of how academic research works. As an illustrative example, CAIDA/UCSD and MIT were hired by AT&T as “independent measurement experts” to propose and oversee methods for AT&T to satisfy FCC reporting requirements imposed as a merger condition. (4) AT&T covered all the data we received by an NDA, and we were not able to publish any details about what we learned. This sort of work does not qualify as academic research. It is consulting.

In our view, the bureau must be organized in such a way that academics are able and incentivized to utilize the resources of the bureau for research on questions that motivate the creation of the bureau in the first place. But this requires that when the U.S. government establishes the bureau, it makes apparent the value of academic participation and the modes of operation that will allow it.

These reports focus on cybersecurity, and indeed, security is the most prominent national challenge of the Internet. But the government needs to understand many other issues related to the character of the Internet ecosystem, many of which are inextricably related to security. We cannot secure what we do not understand, and we cannot understand what we do not measure. Measurement of the Internet raises epistemological challenges that span many disciplines, from network engineering and computer science to economics, sociology, ethics, law, and public policy. The following guiding principles can help accommodate these challenges, and the sometimes conflicting incentives across academic, government, commercial, and civic stakeholders.

  1. Incentivize academic participation. A national infrastructure must be organized in such a way that academics are able and incentivized to utilize its resources. This requires designing and implementing modes of operation that will incentivize independent researcher participation.
  2. Demonstrate innovation and value through real projects that address national-scale problems with data-intensive science and engineering research. To justify substantial U.S. government investment in cyberinfrastructure, the research community must demonstrate its value as an independent voice with important results that help to inform the future of the Internet. This demonstration will not be effective if it is hypothetical. Real projects are tricky, because the data does not necessarily exist yet, and if it does, it may be proprietary. So researchers must overcome the chicken-and-egg problem of how to demonstrate the value of an independent research community before the Bureau exists.
  3. Start with public data and shared community infrastructure. The starting point must be to work with public data, and translate research results into forms that are meaningful to a constituency broader than the research community. But this path reveals more specific barriers: Who would fund such research? What are the incentives of the academic research community to undertake it? Yet if we do not recognize and overcome this challenge, the independent research community may essentially be written out of the story, as more and more data is proprietary and hidden away.
  4. Make specific and concrete calls for data of national importance. In our view, the community needs a focal point for discussion about collection and use of data, presenting an opportunity and responsibility to transform abstract calls for access to data into more specific and concrete articulations.
  5. Prioritize a framework for research access to proprietary data. Sharing of proprietary data must address the reasons that data is considered proprietary. Understanding these reasons is required to design approaches to allow reasonable access for research purposes.
  6. Integrate a focus on workforce training efforts, with metrics to evaluate them. The other risk of continuing on the current path, rather than confronting the data access problem, is the lost opportunity to train students to interpret complex operational data about Internet infrastructure, which is crucial to developing a globally competitive U.S. cybersecurity workforce capable of securing Internet infrastructure.

Other parts of the globe have moved to regularize cybersecurity data, and they have explicitly recognized the importance of engaging and sustaining the academic research establishment in developing cybersecurity tools to secure network infrastructure (5). If the U.S. does not take coherent steps to support its research community, there is a risk that it is sidelined in shaping the future of the Internet. The European Union’s proposed regulation for Digital Services (6) also discussed the importance of ensuring access to proprietary data by the academic research community:

Investigations by researchers on the evolution and severity of online systemic risks are particularly important for bridging information asymmetries and establishing a resilient system of risk mitigation, informing online platforms, Digital Services Coordinators, other competent authorities, the Commission and the public. This Regulation therefore provides a framework for compelling access to data from very large online platforms to vetted researchers.

They clarify what they mean by “vetted researchers”:

In order to be vetted, researchers shall be affiliated with academic institutions, be independent from commercial interests, have proven records of expertise in the fields related to the risks investigated or related research methodologies, and shall commit and be in a capacity to preserve the specific data security and confidentiality requirements corresponding to each request.

This regulation emphasizes a structure that allows the academic community to work with proprietary data, sending an important signal that the EU intends to make its academic research establishment a recognized part of shaping the future of the Internet. The U.S. needs to take a similar proactive stance.


  1. Cyberspace Solarium Commission report
  2. The Aspen Institute: A National Cybersecurity Agenda for Digital Infrastructure
  3. Lawfare: Considerations for the Structure of the Bureau of Cyber Statistics
  4. CAIDA: First Amended Report of AT&T Independent Measurement Expert: Reporting requirements and measurement methods
    CAIDA: Report of AT&T Independent Measurement Expert Background and supporting arguments for measurement and reporting requirements
  5. Directive of the European Parliament and of the Council on measures for a high common level of cybersecurity across the Union, repealing Directive (EU) 2016/1148
  6. Regulation of the European Parliament and of the Council on a Single Market For Digital Services