Data science in times of COVID-19

Models, statistical models. As a biology student who could understand them, and now and then make some of my own, I belonged to the unloved group that professors categorized as “can’t do laboratory work, does play with computers.” Our revenge came a few years down the road, when the same professors hired my kind to show that whatever they had found in the lab was also statistically significant… without quite knowing what that might possibly mean. So my group rolled with the wave and spent the last few decades keeping a delicate balance: we tell our clients what they want to hear, if we can, but we also tell them how to understand statistical lingo, how to correct procedures, how to measure better, how to think sharper. And so we work on, between data and reality.

And then, somehow, we became data scientists.

The change took a long time, of course. The reasons we stopped being boring statisticians and became exciting data engineers or sophisticated data scientists are many, but most are driven by the prominent role that computers have taken in our lives. A crucial one? What we today call infographics. We all knew that a picture is worth a thousand words, but when my people began delivering fancy graphics as results, we jumped from the dark corner inhabited by numerical tables to the warmth of the lamps in the television studios. And we can influence people, oh yes!

By way of example, take a look at that golden article, the one on flattening the curve. In the last few days it has been quoted to me through pretty much every channel that connects me to reality: newspapers, friends, WhatsApp groups… Everybody wants to flatten the curve by doing their own social distancing. A great success for a data scientist! Everybody gets it!

But does everybody get it, actually?

The most important point of that article was that, in order to prevent a collapse of the health system, we needed to create social isolation early on. With the emphasis on early. After a certain point in the epidemic, a sudden surge of cases is impossible to prevent. I thought that the article was very clear in conveying a simple message: get some distance before it is too late! But then, it wasn’t. In most countries the government has adopted social distancing, citing the fight to flatten the curve, when it is obviously too late. And here is the thing. It is easy to lie with statistics, as we all know from that very old book… but it is also easy to confuse a complex reality with a couple of easy-to-understand graphics. And they give us the feeling that we can do something big with our small actions when, in reality, we just don’t know.
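For readers who like to play with the computer, the "early" argument can be sketched with a toy SIR model (susceptible, infected, recovered). This is only an illustration under loud assumptions: it is not taken from the flattening-the-curve article, the function name `simulate_sir` is hypothetical, and the transmission rate, recovery rate, 50% contact reduction and the start days (20 and 40) are made-up numbers, not fitted to any country's data.

```python
def simulate_sir(days, beta, gamma, distancing_day, reduction, i0=1e-4):
    """Toy SIR model on population fractions, one Euler step per day.

    beta: transmission rate, gamma: recovery rate (both assumed values).
    From `distancing_day` onward, beta is scaled down by `reduction`.
    Returns the infected fraction for each simulated day.
    """
    s, i = 1.0 - i0, i0
    infected = []
    for day in range(days):
        b = beta * (1.0 - reduction) if day >= distancing_day else beta
        new_infections = b * s * i
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        infected.append(i)
    return infected


# Illustrative parameters only (assumptions, not estimates).
BETA, GAMMA, DAYS = 0.3, 0.1, 300

no_action = simulate_sir(DAYS, BETA, GAMMA, distancing_day=DAYS, reduction=0.0)
early = simulate_sir(DAYS, BETA, GAMMA, distancing_day=20, reduction=0.5)
late = simulate_sir(DAYS, BETA, GAMMA, distancing_day=40, reduction=0.5)

for label, curve in [("no action", no_action), ("early", early), ("late", late)]:
    peak = max(curve)
    print(f"{label:>9}: peak infected fraction {peak:.3f} on day {curve.index(peak)}")
```

In this toy world the same measure, applied twenty days later, leaves a much higher peak. Whether the real epidemic behaves like a three-compartment model is, of course, exactly the kind of thing we should doubt.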

Take another example. A couple of weeks ago, I told a group that I expected to see a decrease in COVID-19 cases around this past weekend. The data from the last two days seem to agree with my prediction. But was I right? My prediction was based on the idea that the carnival celebrations were the main driver of the epidemic in the Netherlands. So I expected a bunch of people to be infected then, to arrive at hospitals over the following two weeks, and then to stop arriving. If you look at the official data, that idea seems to be in good agreement with reality. But was my thinking back then correct? Hardly. In the meantime we have other hypotheses explaining the dynamics of COVID-19. We now think that the social structure of each country is key to the spread. We have realized that the period during which schools were open has to influence what we see (due to asymptomatic transmission). And studies talk of incubation times from 6 up to 24 days. So, even if cases keep on dropping tomorrow, it might very well be that my original logic in making that prediction was flawed. A data scientist can be right… for the wrong reasons.

Of course, I would like to round off an article on this blog of mine with some self-promotion, given that, after all, I do data science for a living. But perhaps the best self-promotion in times of crisis, or the kind that I am comfortable with, is to say: be aware! Knowledge changes by the day! Doubt yourself and you will be better off… or at least closer to whatever reality actually is.

(This article was partly pushed into existence through discussions held with Steven and Alexandra, two of my iaido sempai. Thanks!)