Detecting Internet outages worldwide and in real time is no small feat. It requires distributed measurement infrastructure, tools and processing power to analyze the resulting data, plenty of storage to save it, and a powerful user interface to visualize it. IODA (short for Internet Outage Detection and Analysis) is CAIDA’s solution to this problem.
In an attempt to make IODA more useful, we just launched @caida_ioda, a Twitter account to bring attention to select Internet outages. We inaugurated this account by revealing an outage that took place in Morocco, on July 19, from 11:30 pm to 3:50 am local time. The visualization below illustrates this outage. The blue time series represents our active probing data. This data comes from a cluster of twenty software instances, located at SDSC in San Diego, that repeatedly ping active hosts in the IPv4 address space. Each data point of the time series captures the normalized number of /24 network blocks in Morocco that responded to these pings. The data is normalized with respect to the maximum value observed in the inspected time interval. Starting at 10:30 pm UTC (11:30 pm local time), this number dropped significantly (from ~19,700 /24 network blocks to as low as ~13,400) and slowly started to recover after a few hours. The green time series, which represents the normalized number of /24 network blocks that are reachable according to BGP and geolocated to Morocco, exhibits a drop at the same time. The gaps in the BGP time series are due to missing data points caused by temporary issues with our infrastructure. You can use our interactive dashboard to investigate this outage yourself.
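To make the normalization concrete, here is a minimal sketch of scaling a time series by the maximum value observed in the inspected window. It is an illustration only, not IODA's actual code: the function name `normalize_by_max` and the sample window are hypothetical, and only the two figures of roughly 19,700 and 13,400 /24 blocks come from the text above. Missing data points are kept as gaps rather than treated as zeros.

```python
from typing import List, Optional, Sequence


def normalize_by_max(counts: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Scale each count by the largest count in the window; keep gaps as None."""
    observed = [c for c in counts if c is not None]
    if not observed:
        # Nothing was measured in this window, so there is nothing to scale.
        return [None for _ in counts]
    peak = max(observed)
    if peak == 0:
        # Avoid dividing by zero when every observed count is zero.
        return [None if c is None else 0.0 for c in counts]
    return [None if c is None else c / peak for c in counts]


# Hypothetical window of responsive /24 block counts. Only 19700 and 13400
# are taken from the post; the other values are made up, and None marks a
# missing data point.
responsive_blocks = [19700, 19650, None, 13400, 15200, 18900]

normalized = normalize_by_max(responsive_blocks)
lowest = min(v for v in normalized if v is not None)
print(f"lowest normalized value: {lowest:.2f}")  # prints 0.68
```

Under these assumptions, the drop from about 19,700 to about 13,400 responsive blocks shows up as a dip from 1.0 to roughly 0.68 in the normalized series.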
Internet outages do not always affect entire countries; their scope is frequently limited to regions or autonomous systems (ASes). IODA can detect such sub-national outages and, coming back to our example, did so for Morocco. The map below suggests that not all of the country’s regions were affected equally. Note, however, that IP address geolocation (that is, the mapping from IP address to geographical location) is far from perfect, so take this information with a grain of salt.
IODA determines an anomaly score for each outage that it detects. Our help page provides more details on how we determine this score, but in essence it is a number that captures the severity of the outage. A look at IODA’s AS-level breakdown confirms that Maroc Telecom was affected the most: the ISP’s overall anomaly score is more than twice that of Itissalat Al-Maghrib, the ISP that ranked second.
So, what happened? IODA reveals where Internet outages happen, but it cannot tell us why. Understanding an outage’s root cause still requires a human in the loop, mostly to read news reports and social media postings that mention the outage. In our example, a search of the Arabic-speaking part of the Internet for “morocco internet” led us to Maroc Telecom’s Facebook page, which cited a power outage as the cause.
The time span quoted by Maroc Telecom roughly confirms what IODA saw, but our data suggests that the outage began earlier: our active probers first saw a decline in connectivity at 11:30 pm local time, about half an hour before the alleged start of the outage.
We are supporting public access to IODA’s dashboard for exploration of this and other outages; please use it and send feedback to ioda-info AT caida DOT org.