Data science: an argument for contingentism?

I recently discovered a fascinating debate in the philosophy of science about the contingentism/inevitabilism issue (see this book). Quoting Lena Soler, one of the authors of the book and a major expert in this domain, the debate is roughly about this: “Could science have been otherwise? Could it have been dramatically different from science as we know it today? Is there something inevitable in a sound scientific enterprise? Could we have developed an alternative successful science based on different notions, conceptions, results?” Note that the aim here is not to discuss the importance of some exotic pseudo-science but to ask whether the scientific notions, concepts and techniques we use today are really necessary, or whether a different scientific path (i.e. still scientific in Popper’s terminology) would have been possible.

In my opinion, it is quite natural to think that science, being a human enterprise, could have evolved in a very different manner. How many choices, decisions, conclusions (or Nobel prizes) in the scientific world have been dictated by contingencies, social aspects, historical contexts, politics, economics or nationalistic considerations? Was all of that inevitable? For instance, simply think of the obscure fate that could today await a revolutionary article unfortunately written in very bad English…

This debate prompted some considerations about the role of data science in all this. The success of data science is living proof that, starting from the same (or very similar) premises, modeling can have multiple, heterogeneous outcomes. Think for instance of addressing a scientific prediction problem in a data-driven manner, and consider the overabundance of techniques, methods and algorithms that you could use to solve it. From a data scientist’s perspective, contingentism is self-evident: the same problem can be tackled in many different ways with roughly the same accuracy from an external prediction perspective. So the question arises spontaneously: what would have happened if Kepler or Newton had had the same attitude (or better computational power)? Would the gravitational laws have the same form, the same aspect? Would notions like mass and gravity be the same? Would the subsequent course of science be the same?
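To make the "many methods, roughly the same accuracy" point concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the synthetic law `y = 2x + 1` plus noise, the two model families (a least-squares line and a k-nearest-neighbour average) and all the names. The two models describe the data in very different terms, one with two coefficients and the other with no formula at all, yet on this toy problem their held-out errors should come out close to each other.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regressor: no formula, just stored observations."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def mse(model, xs, ys):
    """Mean squared prediction error of a model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(n, rng):
    """Synthetic 'experiment' (an assumption): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = sample(200, rng)
x_test, y_test = sample(200, rng)

line = fit_line(x_train, y_train)
knn = fit_knn(x_train, y_train)

mse_line = mse(line, x_test, y_test)
mse_knn = mse(knn, x_test, y_test)
print(f"least-squares line  test MSE: {mse_line:.3f}")
print(f"5-nearest-neighbour test MSE: {mse_knn:.3f}")
```

Only the first model hands back something that looks like a "law"; the second predicts comparably well while carrying no notion of slope or intercept at all.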

Also, is data science not a formidable way of replaying the history of science? What kind of scientific product would a present-day data scientist return if confronted with the same experimental evidence as the renowned scientists of past centuries?

Bias/variance interpretation of consciousness

Every philosopher (e.g. a scientist) is a slave to some formalism and tends to apply it as much as possible to every aspect of reality. This is indeed a form of bias, and my bias, recently, is that I tend to interpret everything in terms of bias/variance… So why not push this to the extreme and apply it to nothing less than the hardest issue of science and philosophy? Consciousness, as simple as that…

In particular, I aim here to address questions like: does consciousness really exist, what is its function, could robots have one, and so on…

Let’s go straight to the end of my reasoning: we could use the bias/variance formalism, useful to describe any learning procedure, to support the idea that consciousness is not merely an epiphenomenon but rather a necessary component of every rational cognitive process. In particular, consciousness is required for interacting with a complex, multivariate, multi-agent and uncertain reality, where the criterion of effectiveness/success of such interaction is itself complex, multivariate, uncertain and dependent on others too.

In other terms, what I claim is that surviving in our reality requires any intelligent agent to roughly decompose its intelligent activities into two parts: a first part (made of several submodules if needed) which can be addressed as an optimization problem (according to a univariate cost function) and implemented like a fast regulator or automatic controller; and a second part which has to deal with exploration, exception, multiple criteria, interaction, uncertainty and adaptation.

As far as the first part is concerned, think for instance of the visuomotor control loops which allow us to survive every day in a hostile environment, thanks to fast and unconscious mechanisms able to monitor, control or optimize some tasks. This part corresponds to the bias component of any cognitive effort: a stable, situated and limited module aiming to exploit some pre-existing or acquired ability in a specific application domain. This module is rapid and effective as long as the application domain is respected and the cost function it addresses is of interest.


Let me quote Christof Koch from his book “Consciousness: Confessions of a Romantic Reductionist”:

« The mystery deepens with the realization that much of the ebb and flow of daily life does indeed take place beyond the pale of consciousness. This is patently true for most of the sensory-motor actions that compose our daily routine: tying shoelaces, typing on a computer keyboard, driving a car, returning a tennis serve, running on a rocky trail, dancing a waltz. These actions run on automatic pilot, with little or no conscious introspection. Indeed, the smooth execution of such tasks requires that you not concentrate too much on any one component.  »


Consciousness boils down to all that cannot be dealt with in this manner, in other terms to all that escapes the domain of automatic, fast yet biased servomotor modules. The no-free-lunch theorems show that no optimization or learning algorithm is optimal across all problems or all data distributions. Any biased approach, though effective in its own application domain, is doomed to failure (or better, to low performance) in a complex world which cannot be interpreted in the light of a single criterion or a single cost function.

So the two facets of our cognitive process address different aspects. On one side: bias, exploitation, unconsciousness, automation, regularity, single-variate criterion, rapidity, optimized solution. On the other: variance, exploration, awareness, attention, exception, multiple criteria, delay, assessment of alternatives.

So as Koch said, consciousness is useful because « life sometimes throws you a curveball!  »

According to this interpretation, consciousness is a necessary component of high-level cognitive functionality. In other terms, I reject the possibility of a zombie with cognitive capabilities equivalent to those of a conscious being, just as I don’t believe that an overly biased learning agent could be effective in the long run in an ever-changing world. Not being conscious would reduce the functionalities to automatic, biased learning or control processes, constraining the resulting behavior to limited objectives, settings and criteria. Though a zombie could emulate the activities of a conscious agent for a short time and in specific contexts, I believe that it is relatively easy for a conscious being to recognize that the zombie is only simulating intelligence (or at least conscious intelligence) and to unmask it.

Think for example how easy it is for a young child to expose the limitations of a highly expensive robot after just a few minutes of free interaction.

Think now about the rising success of self-driving cars, and about how often people drive their car a very long way only to realize that they were thinking of something else the whole time. The growing presence of self-driving cars seems to confirm that consciousness is not necessarily needed for implementing driving functionality in conventional settings. Think now about the dramatic eventuality of a car having to choose in a fraction of a second between two potential victims during an accident: no automatic algorithm would be considered adequate to deal with such an ethical issue, and we would be uncomfortable dictating to the robot rules of behavior for such a context. We are indeed entering the domain of consciousness, where the conventional mechanistic way of proceeding is no longer relevant.

I consider all these examples as evidence of the impossibility of attaining a high level of cognitive capability without consciousness, just as it is impossible to attain knowledge with a biased, constrained, precooked modeling approach.





Bias/variance gnosiology

We learn only when we create a regularity, and all that remains from our learning efforts is some sort of comfortable simplification. Now, reality escapes or diverges from our regular expectations every time we want to use or enforce them to explain or predict the course of nature. Faced with the inescapable gap between our regular Eden and the natural hell of observations, we can take two extreme attitudes: deny or discredit reality and reduce all divergences to some sort of noise (measurement error), or try to incorporate discordant data and measures in our model. Of course there is a continuum of intermediate positions between these two extremes, and it is conceivable that we change/adapt our strategy according to the context, the topic, our age or mood. However, this post supports the idea that a large part of our approach to the understanding of reality can be simplified (again a regularity 🙂) by making explicit how we position ourselves in this range between ideological defense of our model and acceptance of the refuting power of data. This trade-off is well known in (frequentist) statistics, where the process of estimating models from data is described in terms of the bias/variance trade-off. An estimator is a generic name for any function/algorithm mapping data to an estimate: we could generalize here to any data/observation process returning a sort of model, regularization or belief.

A biased estimator is typically an estimator which is insensitive to data: its strength derives from its intrinsic robustness and coherence, while its weaknesses may originate in the (in)sane attitude of disregarding data or incoming evidence. A high-variance estimator adapts rapidly to data and observations, but it can easily be criticized for its excessive instability.
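The two attitudes can be simulated in a few lines. The sketch below is purely illustrative: the true value, the noise level and the three estimators (one that ignores the data, the sample mean, and one that trusts only the latest observation) are all assumptions chosen to make the contrast visible. It repeats the same small experiment many times and measures, for each estimator, how far it is off on average (bias) and how much it jumps around from one dataset to the next (variance).

```python
import random
import statistics

TRUE_VALUE = 1.0   # the quantity to estimate (assumed for the simulation)
NOISE_SD = 2.0     # standard deviation of the measurement noise
N_OBS = 10         # observations per experiment
N_REPEATS = 5000   # number of simulated experiments


def stubborn(data):
    """Fully biased: sticks to its prior belief and ignores the data."""
    return 0.5


def sample_mean(data):
    """Averages all the observations."""
    return sum(data) / len(data)


def last_observation(data):
    """Fully data driven: believes only the most recent measurement."""
    return data[-1]


def bias_variance(estimator, seed=0):
    """Empirical bias, variance and mean squared error over repeated datasets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(N_REPEATS):
        data = [rng.gauss(TRUE_VALUE, NOISE_SD) for _ in range(N_OBS)]
        estimates.append(estimator(data))
    bias = statistics.fmean(estimates) - TRUE_VALUE
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean((e - TRUE_VALUE) ** 2 for e in estimates)
    return bias, variance, mse


results = {f.__name__: bias_variance(f)
           for f in (stubborn, sample_mean, last_observation)}

for name, (bias, variance, mse) in results.items():
    # For every estimator, mse equals bias**2 + variance (up to rounding).
    print(f"{name:17s} bias={bias:+.3f} variance={variance:.3f} mse={mse:.3f}")
```

The stubborn estimator never moves (zero variance) but stays wrong by a fixed amount; the last-observation estimator is right on average but unstable; the sample mean sits in between. In all three cases the mean squared error is the squared bias plus the variance, which is the decomposition behind the trade-off.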

So, nothing really new, but I sometimes take delight in mapping attitudes, beliefs and ideologies onto this trade-off (definitely another illusion of almighty regularity), or in characterizing/explaining differences in terms of this classification.

Bias/variance tradeoffs

On the biased side of the world | On the variance side of the world
------------------------------- | ---------------------------------
Right-wing | Left-wing
Old | Young
Parent | Son
Idealism | Empiricism
Self-confident | Doubtful
Optimist | Pessimist
Reformist | Revolutionary
Wojtyła | Bergoglio
German football team | Italian football team
Classical art | Modern art
Academia | Université du peuple
Official press | Social networks
European institutions | Populism
Mainstream science | Scientific breakthrough
Mathematics | Statistics
Parametric statistics | Nonparametric statistics
Expert driven | Data driven
Faithful | Playboy
Boring | Charming
Bill Gates | Steve Jobs
Long-term | Short-term
Conventional | Breakthrough
Official medicine | Homeopathy
Apple | Start-up
Book | Webpage
Raiuno | Raitre
Classical music | Rock
Rock | Rap
Risk-averse | Risk-taker
Orthodox | Unconventional
Dogma | Unconventional
Aristotle | Galileo
Formal | Informal
Descartes | Popper
Manzoni | Leopardi
Idealism | Relativism
Truth | Opinion
Linearity | Nonlinearity
Simplicity | Complexity
Certainty | Doubt
Exploitation | Exploration
Communist | Populist
Automatic | Conscious (?)

And now it is up to you…

PS. OK, but after all, is there a better side to be on? Hmm, if you think there is, welcome to the biased side ;-). If you think it depends, welcome to the variance side of the world.

From open data to open decisions

Open data is the next (or already current) big thing. Is it enough?

We should be doing data science, not (only) for the sake of having good models or nice predictions, but for providing quantified, data driven and assessed evidence to decision makers.

Is a good data science process enough? I would say no. Whatever evidence data scientists are able to provide, it will be affected (or better, annotated) by uncertainty, risk, confidence intervals, variance. The role of the decision maker is not to blindly accept the outcome of the data science process but to properly weigh the risks and the costs.

Let us take an academic example: a doctor deciding whether or not to prescribe a treatment (e.g. chemotherapy) to a patient. It is not only about the potential success (and risk) of the treatment. It is also about the cost of a false positive (prescribing a treatment whose only outcome is side effects) and of a false negative (withholding the treatment and letting the patient’s condition deteriorate).

In the end, it is the doctor who decides, on the basis of

  1. a model (implicit in his knowledge or made explicit for instance in a statistical model)
  2. a measure of utility or cost (associated to false positives and false negatives)

whether or not it is more beneficial for the patient to administer the drug.
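The two ingredients listed above can be combined into a simple expected-cost rule. What follows is a sketch under stated assumptions, not a clinical procedure: `p_benefit` stands for the probability, returned by some model, that the patient actually needs the treatment, and the two costs are hypothetical numbers on an arbitrary common scale. All function names are invented for the illustration.

```python
def expected_costs(p_benefit, cost_false_positive, cost_false_negative):
    """Expected cost of each action, given the model output and the costs.

    Treating is a mistake only if the patient did not need the treatment
    (false positive); withholding is a mistake only if they did (false negative).
    """
    cost_if_treat = (1.0 - p_benefit) * cost_false_positive
    cost_if_withhold = p_benefit * cost_false_negative
    return cost_if_treat, cost_if_withhold


def decide(p_benefit, cost_false_positive, cost_false_negative):
    """Choose the action with the lower expected cost."""
    cost_if_treat, cost_if_withhold = expected_costs(
        p_benefit, cost_false_positive, cost_false_negative)
    return "treat" if cost_if_treat < cost_if_withhold else "withhold"


def treatment_threshold(cost_false_positive, cost_false_negative):
    """Probability above which treating has the lower expected cost."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)


# Same model output (0.3), two different cost assessments, two decisions.
print(decide(0.3, cost_false_positive=1.0, cost_false_negative=4.0))  # treat
print(decide(0.3, cost_false_positive=4.0, cost_false_negative=1.0))  # withhold
```

The point of the example is that the statistical model alone does not determine the decision: with the same predicted probability, changing the relative cost of the two errors flips the outcome.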

If the data that led to the statistical model are (or presumably will be) open, and thus available in the future for the sake of reproducibility and scientific validation, what about the final choice of the doctor (or, more generally, of the decision maker)?

Decision making is either irrational or rational. In the first case, let us just cross our fingers. In the second case, it deserves a description, documentation and (why not) open sharing. I advocate that, as for open data, a comparable (or greater) effort should be devoted to providing tools, repositories and dashboards to edit, store and disseminate open decision models describing

  1. the decision making setting (date, author, target, expected impact)
  2. the evidence it relied on (informal knowledge, literature, statistical models)
  3. in case of statistical evidence, the (open) data  that were used for inferring it
  4. the utility (or cost) function used for the decision
  5. the decision making process, specifically how the material in points 2., 3. and 4. was used to deliver the final decision
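As a rough illustration of what such an open decision model could look like once formalized, here is a sketch of a record with one field per item of the list above, serialized to JSON so that it can be stored and shared. The class name, the field names and every value are hypothetical placeholders invented for this example; no existing standard or tool is implied.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class OpenDecision:
    # 1. the decision making setting
    date: str
    author: str
    target: str
    expected_impact: str
    # 2. the evidence the decision relied on
    evidence: list
    # 3. the (open) data behind any statistical evidence
    data_sources: list
    # 4. the utility (or cost) function used
    utility: dict
    # 5. how items 2, 3 and 4 were combined, and the outcome
    process: str
    decision: str


# Placeholder values only: nothing here refers to a real case or dataset.
record = OpenDecision(
    date="2024-01-01",
    author="Dr. Example",
    target="patient cohort A",
    expected_impact="reduced relapse rate",
    evidence=["clinical guidelines", "statistical risk model v1"],
    data_sources=["open-dataset-placeholder"],
    utility={"cost_false_positive": 1.0, "cost_false_negative": 4.0},
    process="treat when the expected cost of treating is the lower one",
    decision="treat",
)

# Store the record as text, then read it back.
serialized = json.dumps(asdict(record), indent=2, sort_keys=True)
restored = OpenDecision(**json.loads(serialized))
print(serialized)
```

A plain, text-based format like this is one possible design choice: it keeps the record readable by humans, easy to version, and simple to reload later when someone wants to re-examine how the decision was reached.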

And what about confidentiality? The decision model (once formalized) could be kept confidential or have restricted access if needed. The issue here is not really the disclosure of sensitive information but rather the degree of reproducibility of a decision. We can only learn from our own (or others’) errors. Think about political decision makers, democratically required to document and safely store their decisions, and the possibility for a citizen of rerunning their decisions (once disclosed) in the near (or far) future.

The regularity gamble

All human knowledge relies on a gamble: “regularity exists”. Equivalently, only what is regular (e.g. a pattern), or what seems to be regular, has the right to enter our knowledge and scientific heritage.

Note that regular does not necessarily mean boring (a constant), shallow or deterministic. We can find regularity in the behavior of a spring, as well as in the volatility of the stock market or in the way a complex dynamical system evolves over time.

Nevertheless, humans start to consider that they know something only when they put that something within a pattern, a model, a map. All the rest is unknown (or not yet known) and deserves labels like noise, error, uncertainty.