A glimpse at open Internet performance data: M-Lab

With the net neutrality debate heating up in Europe and the US, the question of objective indicators for measuring the performance of broadband connections is as relevant as ever. Broadband quality of service was in fact addressed in the French regulator’s study “State of the Internet in France”, published last May, as ARCEP announced it was discontinuing its long-standing measurement tools, likely in favour of crowdsourced means in the future.

An article by the British advice service Cable.co.uk (which helps consumers compare deals from ISPs) recently shed light on one such crowdsourced tool: Measurement Lab (aka M-Lab). The article itself shows how much differences in test methodology can affect the results, and why transparent, independent tools are needed: “The UK’s average speed of 16.51Mbps is less than half that reported by Ofcom in its UK Home Broadband Performance earlier this year. The regulator, which used data provided by SamKnows, said the UK’s average download speed in 2016 was 36.2Mbps.”

The results provided by Cable.co.uk are worth exploring, and I will analyze them in another article, as I find it useful to explain how M-Lab works first. The M-Lab initiative relies on a distributed network of servers located in datacenters all around the world, on which researchers can deploy tools to measure the quality of service of broadband connections and to identify potential blocking and throttling. M-Lab currently hosts a few tests (all published as open source), the Network Diagnostic Tool (NDT) being the most commonly used. It provides measurements such as download and upload speeds, latency, jitter, and other technical indicators by transferring as much data as possible in ten seconds through a single connection (more info on the NDT methodology here).
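To make the ten-second principle concrete, here is a minimal sketch of how a speed figure can be derived from such a transfer. It is not NDT's actual code: the function name `throughput_mbps` and the sample byte count are assumptions chosen for illustration, and one megabit is taken to be 1,000,000 bits.

```python
def throughput_mbps(bytes_transferred: int, duration_s: float) -> float:
    """Average throughput in megabits per second (hypothetical helper, not part of NDT)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    bits = bytes_transferred * 8          # bytes -> bits
    return bits / duration_s / 1_000_000  # bits per second -> Mbps


# Illustrative numbers only: 20,637,500 bytes received during a ten-second test.
speed = throughput_mbps(20_637_500, 10.0)
print(f"{speed:.2f} Mbps")
```

With these made-up numbers the result happens to match the 16.51Mbps average quoted above, which shows the order of magnitude involved: roughly 20 MB received in ten seconds.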

M-Lab’s goal is both to assist researchers, by providing a distributed platform and raw data, and to empower the public by giving them the means to test their Internet connection. All the data collected by M-Lab (tens of billions of rows) is openly available: it can be downloaded as is, retrieved through queries, or explored via the visualization tools. The raw data is mostly aimed at researchers, as some skill is required to understand how to use it. The visualization tool is however sufficient for some interesting comparisons between different service providers, regions (or even cities), and times of the day:

M-Lab’s visualization tool

M-Lab is a wonderful tool for anyone seeking broadband metrics: finding data of this quality, from a totally transparent source and for free, is more than one could hope for. It is however important to keep in mind the limitations of such tools to avoid misusing and misinterpreting the data. Many of these limitations come from what makes M-Lab such a great tool: it is a crowdsourced initiative. Every single test run on M-Lab happens in a somewhat uncontrolled environment, even if NDT collects a number of parameters that allow for a better understanding of this environment. The large number of tests makes the data more statistically reliable. There are, however, some biases that need to be anticipated to get meaningful analysis from the data:

  • A significant frequency bias exists: some users perform more tests on the M-Lab platform than others. While this bias can lead to incorrect results, weighting the results by distinct IP address rather than by test easily corrects it.
  • Another bias could come from the users’ equipment (PC, WiFi…), which can significantly impact the results of the broadband connection tests. It is not unlikely that users with better hardware or a better home network are more commonly found among the customers of ISPs that cater to tech-savvy users, or in specific cities more appealing to such people. What’s more, some ISPs actually provide better tools for the home network (free PLC plugs, 802.11ac WiFi…).
  • Some ISPs may also have a higher market share in rural / isolated areas, where broadband connections will inherently have worse characteristics than in urban areas in most countries.
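
The correction for the first bias in the list above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `(ip, download_mbps)` record layout and the sample values are all made up and do not reflect M-Lab's actual schema. Treating one IP address as one user is also an approximation, since addresses can be shared or change over time.

```python
from collections import defaultdict
from statistics import mean


def mean_per_test(tests):
    """Naive average: every test counts once, so frequent testers dominate."""
    return mean(speed for _, speed in tests)


def mean_per_ip(tests):
    """Average each IP's tests first, then average across IPs (one vote per IP)."""
    by_ip = defaultdict(list)
    for ip, speed in tests:
        by_ip[ip].append(speed)
    return mean(mean(speeds) for speeds in by_ip.values())


# Made-up sample: one user on a fast line runs four tests, two others run one each.
tests = [
    ("203.0.113.1", 50.0),
    ("203.0.113.1", 50.0),
    ("203.0.113.1", 50.0),
    ("203.0.113.1", 50.0),
    ("203.0.113.2", 10.0),
    ("203.0.113.3", 12.0),
]

print(mean_per_test(tests))  # 37.0, pulled up by the frequent tester
print(mean_per_ip(tests))    # 24.0, each IP weighted equally
```

In this toy sample the per-test average overstates the typical connection by more than half, simply because one user tested four times. The other two biases (home equipment and rural market share) cannot be fixed by reweighting alone and need additional context about the users.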

With a good understanding of possible biases, M-Lab is a very interesting source of information for analysts and researchers, and it can certainly be useful for regulators willing to test the waters of crowdsourced measurements, such as ARCEP in France. But it is hard to find the right way to display the results of such measurements, and the current visualization tool, although very interesting to play with, may fall short of the ambition to truly empower the end customer. As download and upload speeds alone appear less and less relevant for describing the quality of a broadband connection, indicators closer to actual use, obtained by measuring access to different types of content, can seem appealing. But what kind of content should be considered? Is measuring the quality of the connection to some content providers (Facebook, Netflix, Skype, Spotify, YouTube…) sufficient to evaluate the quality of a broadband connection?