data science

Science is, above all, a method: a more or less well-defined set of instructions that brings us closer to the truth.

So what can data science possibly mean?

When we talk about the biological or the physical sciences, we know that there is such a thing as the truth. We also know that we will never reach it, yet we keep getting closer to it.

So… can’t a data scientist get to what a client’s data really is? To that truth?

My preferred answer is that any data scientist must know that a set of data allows layers of interpretation, some shallow and some deeper. Which depth is reached depends on the client and on the capacity of the data scientist.

Now and then a data scientist tells a client “this is it, this is the only depth that really matters.” If you happen to be the client, beware: he is lying to you. There is always another iteration possible, another test to try, another interpretation to give.

Also, if a data scientist tells you, again and again, that more analysis is needed, beware: he is incompetent. That is precisely what you need a data scientist for: knowing when the layer that he is at <em>now</em> is the one yielding the answers that are relevant for you <em>now</em>. Sure, more questions are always possible, and new answers might be coming. But you, as a client, have questions. And there are answers.

So this is what data science is for me: a delicate and far from obvious balancing act, a continuous dialogue between the scientist and the problem owner, and subtle decision making. To inform Ulysses once again… between Scylla and Charybdis!