The truth behind the coronavirus statistics

At university we were taught that if the statistics felt wrong, they probably were. You get rogue numbers just as you get rogue people, writes Hamish McRae

Tuesday 20 October 2020 17:01 BST
Number crunching: UK Covid stats make for grim reading (PA)

Are Covid cases in Manchester really going down, not up? Could deaths in Russia actually be three times the level that the official statistics say they are? What should we make of the prediction that another 170,000 people will die this winter in the United States as a result of the virus?

We are all deluged by statistics. If they aren’t about the economy, they are about political opinions, and now there is the new tsunami about public health. Yesterday was World Statistics Day, which, as the economist and author Tim Harford observes, comes around only every five years. It was started by the UN and serves as a reminder of just how much numbers seem to rule our lives. Yet we distrust them, and with reason.

A good UK example of economic statistics being wrong was the reported double-dip recession in 2012. That was embarrassing for the chancellor, George Osborne, who took the blame, but subsequent revisions showed there was no second recession at all – merely a pause in growth.

We distrust the statistics about Covid partly because of the forecasting models that predicted far more deaths than have actually occurred. In April a study by the University of Washington said that by early August 66,314 people would die of the virus in the UK. The actual figure was about 40,000, still terrible and worse than some UK academic predictions. However, even that lower figure may be an over-estimate because of the way the UK counts everyone who had contracted coronavirus and then died, even if something else actually killed them.

So what should we believe?

The first thing is to apply some basic common sense. When I was taught stats at Trinity College Dublin, I was told that we should look at how they were collected. Was it a sample survey and if so, how big was the sample? Or was it real data, hard numbers that you could rely on?

There are good examples of both right now. The UK counts everyone who dies, whatever the cause, and we know those numbers are solid. They come out every week, two weeks in arrears, and are published by the Office for National Statistics. So 9,954 people died in England and Wales in the week to 9 October, which is 1.5 per cent above the five-year average but well within the normal range. That is a really useful number: it tells us about the total impact of Covid, including deaths among people who never sought medical treatment.
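For readers who like to see the working, here is a minimal sketch of that arithmetic in Python. The weekly death count is the one quoted above; the five-year-average baseline is back-calculated from the 1.5 per cent figure, so treat it as illustrative rather than an official ONS number.

```python
# A minimal sketch of the excess-mortality arithmetic behind the ONS figure.
# The weekly death count is from the article; the five-year-average baseline
# is back-calculated from the quoted "1.5 per cent above" figure (assumption).

weekly_deaths = 9_954        # England and Wales, week to 9 October 2020
five_year_average = 9_807    # implied baseline, not an official ONS number

excess = weekly_deaths - five_year_average
pct_above_average = 100 * excess / five_year_average

print(f"Deaths above the five-year average: {excess}")
print(f"Percentage above the average: {pct_above_average:.1f}%")  # roughly 1.5%
```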

As for samples, the most useful current study in Britain is the huge one being done by the ONS to try to discover how many people are being infected. The news is not good, as they calculate that there are currently more than 300,000 people in England who are infected and that the number is still rising. The sample size is massive (150,000), there are plans to go up to 400,000 people being tested every week, and the tests are being done competently. It is not perfect because a lot of people drop out, people mislabel samples (I am on it and managed to do so), and – well – it is only a sample rather than hard data. But it is real, unbiased research, unlike, I am afraid, some of the other material doing the rounds.
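To see why the size of the sample matters so much, here is a rough sketch of how a prevalence estimate and its margin of error can be read off a survey of this kind. The number of positive swabs below is a hypothetical figure chosen to land near the 300,000 estimate; the real ONS study uses a more sophisticated statistical model with weighting, so this only illustrates the principle, not the ONS method.

```python
import math

# Rough sketch: estimating infection prevalence from a sample survey.
# The positive count is a hypothetical assumption, not an ONS figure.

sample_size = 150_000      # roughly the ONS sample quoted in the article
positives = 830            # hypothetical number of positive swabs

prevalence = positives / sample_size
# Normal-approximation 95 per cent confidence interval for a proportion
std_err = math.sqrt(prevalence * (1 - prevalence) / sample_size)
low, high = prevalence - 1.96 * std_err, prevalence + 1.96 * std_err

population_england = 56_000_000   # approximate population of England
print(f"Estimated prevalence: {prevalence:.2%}")
print(f"Implied infections in England: "
      f"{low * population_england:,.0f} to {high * population_england:,.0f}")
```

Even with 150,000 swabs, the interval on these illustrative numbers spans tens of thousands of infections, which is one reason for the plan to expand the sample to 400,000 tests a week.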

Common sense will take us quite a long way. We can usually smell whether a survey is skewing the data and using them for political or commercial purposes. Even renowned academics can be biased – indeed that has been one of the depressing aspects of the response to this emergency. We also need to understand and accept uncertainty. We were all taught to ask whether the stats we were looking at felt intuitively right. If they felt wrong, they probably were. You get rogue numbers just as you get rogue people.

If all this sounds a bit unsatisfactory, there is hope. The big advance of the past few years has been the availability of much more real-time data and the ability of artificial intelligence to examine it. We can see the economic impact of the current recession by looking at Google searches for holidays, or rush-hour traffic. We can look at what is actually happening, rather than what people think in surveys. That great wash of numbers being spewed out would not be much use without AI to chomp through them and allow us to figure out what might be useful and what is just background noise.

So, statistics are getting better. Andy Haldane, chief economist at the Bank of England, has compared economic forecasting to weather forecasting, saying that the failure to predict the financial crash of 2008 was akin to that of Michael Fish, the BBC weather presenter, failing to predict the great storm of 1987. His point was that weather forecasting has got much better since then, and the same could be expected of economic forecasting.

If we are also learning to improve public health statistics, then that would be most welcome, and not before time.
