The AI Sophie’s choice

One of the best (and most intense) movies I ever watched is “Sophie’s Choice”. What is most shocking in this movie is that it makes incredibly explicit (thanks also to the amazing virtuosity of Meryl Streep) the pain of a human choice (in this case, a mother forced to choose which of her two children to save). It somewhat suggests that what is still more tragic than the outcome of the decision (definitely horrible in this case, since it implies the loss of a child) is the process of formulating it, i.e. the process of putting a number (or a score) on our sentiments and rationally using that number to decide. The saddest part of the story is that humans (in this case, Nazis) force other humans to realize that they can put a measure on situations (that we would like to remain incommensurable) and act accordingly for the lesser evil.

Why am I using such an analogy in an arid blog meant to target scientific matters? Because I believe that most of the discomfort we feel about accepting autonomous agents comes from the fact that their design inevitably asks humans to walk a similar path. Whatever autonomous agent we create (e.g. a car, a weapon, a worker), it will be the outcome of a design process which, implicitly or explicitly, passed through the step of putting a number on events and situations: for instance, how much a human life is worth with respect to passenger comfort, or the cost of suffering a terrorist attack with respect to the risk of killing an innocent. This is related to the consequentialist approach to ethics, which interprets a moral act as the one with the best overall consequences. But any optimization step requires a cost function to be optimized.
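To make this explicit, here is the kind of formula such a design process ends up encoding, a minimal sketch of consequentialist decision making (the symbols are illustrative, not taken from any specific system):

```latex
a^* \;=\; \arg\min_{a \in A} \; \mathbb{E}\left[\, C \mid a \,\right]
    \;=\; \arg\min_{a \in A} \; \sum_{o \in O} p(o \mid a)\, C(o)
```

where A is the set of available actions, O the set of possible outcomes, p(o|a) the probability of outcome o given action a, and C(o) the cost assigned to each outcome. C is precisely where the incommensurable gets a number: a human life and passenger comfort must end up on the same scale for the argmin to be computable at all.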

It appears then that we are deliberately acting as our own torturers, forcing a human designer (or the entire society) to put numbers on things we would rather keep out of measurement (e.g. our innermost feelings and our ethical values). Every AI approach will inevitably imply getting rid of the incommensurability of ethical values (though I would be glad to hear a counter-argument).

It follows that the most worrying consequence of AI is no longer the creation of an artificial being (or intelligent zombie). It is instead that, to reach this goal, we will force ourselves to fathom and measure the innermost and most private secrets of our soul. Once we have quantified our soul for the sake of making ethical robots, once incommensurability has disappeared into the ranking of real numbers, could we still claim to be more human than they are?


F**k the experts!

So it is not only an Italian syndrome, nor simply an inadequacy complex. You don’t need to be a dunce-at-school minister (or a jumped-up racist hooligan posing as a Nobel laureate in medicine) to claim that “experts suck”. You can also be an Eton-educated politician singing the “fuck business” anthem because people have had “enough of experts”.

At the same time, never more than now, the news is full of headlines reporting impressive advances in science and futuristic technological scenarios.

What is the role of experts nowadays, if knowledge is spread over the Internet and Wikipedia (or Siri) seems able to answer all sorts of questions (or crack jokes on request)?

Sounds like sour grapes to me. The world is too complex, it does not fit within a tweet, understanding it is too difficult, so let’s make it simple… fuck the experts. But why is the world too complex in the eyes of politicians or decision makers?

  1. Time scale: we are witnessing a growing mismatch between the time horizon of the problems we are concerned with and the (inevitably short and fixed) human time scale. Our society became civilized by learning to solve many problems or issues whose solution (or progress) could be seen within a fraction of a lifetime. Building a house or a bridge, winning a war, becoming rich can be done in a time which is much shorter than (or at most comparable to) a human lifetime. Nowadays the issues we are concerned with (e.g. climate change, stagnation, globalization) have time scales which largely exceed the time humans have to see them solved. Suppose we want to solve an evident top priority (e.g. plastic pollution in the seas). In the very optimistic perspective that we are able to have an effective impact on this issue, the first results will probably be visible in centuries, and the people who profit from them will have only a pale memory of who initiated the effort. And if this time is long for a normal human being, guess how much longer it is than a political mandate (or the time to the next elections, especially in Italy 😉)…
  2. Multiple criteria: most issues at stake nowadays are complex issues, i.e. characterized by nonlinearity, large dimensionality, uncertainty (see below) and many competing criteria. If we go back to our previous example (solving pollution), any honest approach to such a priority will have a major short-term impact on our lifestyle, our economy, our jobs. How many politicians are brave enough to accept being criticized (and fired) today for the sake of our grandchildren?
  3. Uncertainty: a side effect of complexity is that experts (if honest…) do not know what should be done to solve most of the problems we are in trouble with. Or rather, they know that possible actions or countermeasures will have only a certain probability of success. Have you ever discussed risk or probability with a politician? Let us suppose we are able to forecast the next earthquake in Naples with 70% probability. Who would be brave enough to inform the population? Who would be able to explain to the public what this means? And if the probability were 90%?
  4. Non-observability: if reality is complex, its dynamics is mostly non-observable, and so is the effect or impact of our own actions. Do you really believe that Macron can influence French unemployment on a time scale of 5 or 10 years? If he is successful, it will obviously be his own merit; if not, it will be the unfortunate conjuncture or the butterfly effect of whatever world event (the Korean crisis, Trump’s impeachment, the Iran embargo, Belgium’s victory at the football World Cup…). Will Brexiteers obtain what they were looking for, and moreover, who will really be able to assess whether (whatever happens) it was the merit (or the fault) of their decision?


Of course all this is of no importance to a politician (or most decision makers). If your personal agenda has a horizon of at most 5 years, the only thing that matters to you is to make it simple and make a good story out of it. For instance: take an enemy (better if drowning in the Mediterranean sea…), fuck the experts (better if in Brussels), remove uncertainty (only a yes/no, win/lose world, as in sport), take a single criterion (better if understandable by the average football supporter) and, above all, overfit the data: in a complex world you will always find a signal (or an economic indicator) which is (spuriously) correlated with your political action. In that sense big data will (unfortunately) help you… And if you are really not so lucky, do not worry: by the time people realize it, your retirement (or golden parachute clause) will already be there.
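To see how easy overfitting of this kind is, here is a minimal sketch (with purely synthetic data; all numbers are illustrative) showing that, among many random and meaningless indicators, one almost always turns out to be strongly correlated with a given “policy outcome”:

```python
import numpy as np

rng = np.random.default_rng(42)
n_years = 10          # a short political horizon
n_indicators = 1000   # many economic indicators to choose from

# A "political outcome" that is pure noise...
outcome = rng.normal(size=n_years)
# ...and a large pool of equally meaningless indicators.
indicators = rng.normal(size=(n_indicators, n_years))

# Correlation of each indicator with the outcome.
correlations = np.array(
    [np.corrcoef(ind, outcome)[0, 1] for ind in indicators]
)

best = np.abs(correlations).argmax()
print(f"Best 'signal': indicator {best}, correlation {correlations[best]:.2f}")
# With 1000 candidates and only 10 observations, correlations
# above 0.8 in absolute value appear by chance alone.
```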

Marxism 2.0

Artificial Intelligence scientists (including data scientists) are rapidly turning from the architects of a brilliant future made of technology-driven happiness into the accomplices of a dark economic scenario where millions of employees will probably lose their jobs because of technological change, job automation and machine learning. Is it the inevitable destiny of any technology to distort the utopian vision of its inventors and eventually become a lethal weapon in the hands of the final, greedy users?

Can we use data science as a shield to defend mankind instead of making it an accelerator of its enslavement? A wide and complex question… but let me dream of some scenarios where data science could shuffle the cards and play the friend of the meek rather than the assertive ally of the mighty Capital. There is indeed a nice problem for data science whose tackling may immediately appear revolutionary, extremist and, in a nutshell, “communist”. From the abstract perspective of a data scientist it is simply a matter of estimating a hidden variable; from the capital’s perspective it may be the breach that cracks the entire system: the estimation of the surplus value in an organization. Let us think of a scenario where open data and open analytics make it possible for workers to calculate the value of their work and how this value (and the related profit) is divided between employer and employees. Let us imagine that it becomes possible to calculate the socially necessary labor time and surplus labor time, worker wages and employer profits in a particular workplace. For the capitalists, that is the most dangerous information a worker might possess. For the data scientist, it is a challenging and interesting prediction problem.

What if Marx had had access to modern data science and data to design (and validate) his surplus-value models? And what if companies, instead of punishing whistleblowers, appointed an official data marxist: a data scientist who, in an independent, transparent and reproducible manner, would

1) in collaboration with the company and the trade unions, statistically analyse the (time evolution of the) value produced in the company and the possible factors influencing it (CEO decision making, market, wages, costs of resources, work time, work conditions, investment in innovation, fiscal incentives, …)

2) assess (e.g. with statistical techniques like causal inference methods) to what degree the above-mentioned factors influence the production of real value in the company

3) compute the fair repartition of the profits among the company’s stakeholders (workers, management, stockholders and owners) according to the value of their respective work

4) keep track, in a transparent and reproducible manner, of the algorithm determining the relation between profits and wages (a toy sketch of these steps follows below)
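As a toy illustration of steps 1) to 3), here is a minimal sketch on entirely synthetic data. Ordinary least squares stands in, very naively, for the proper causal-inference machinery, and every number is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 120

# Hypothetical monthly data for one company (all synthetic).
labor_hours = rng.normal(10_000, 500, n_months)
capital_invested = rng.normal(2_000, 100, n_months)
# Value produced, with labor contributing most of it plus noise.
value_produced = (50 * labor_hours + 30 * capital_invested
                  + rng.normal(0, 20_000, n_months))

# Steps 1-2: estimate each factor's contribution to value.
# (OLS as a naive stand-in for causal inference methods.)
X = np.column_stack([labor_hours, capital_invested])
coef, *_ = np.linalg.lstsq(X, value_produced, rcond=None)
contrib = coef * X.mean(axis=0)          # average value per factor

# Step 3: repartition of the average value proportional to
# each factor's estimated contribution.
shares = contrib / contrib.sum()
print(f"labor share: {shares[0]:.1%}, capital share: {shares[1]:.1%}")

# Surplus value in Marx's sense: value produced by labor
# minus what labor is actually paid (hypothetical wage of 30/hour).
avg_wage_bill = 30 * labor_hours.mean()
surplus = contrib[0] - avg_wage_bill
print(f"estimated monthly surplus value: {surplus:,.0f}")
```

A real "data marxist" would of course need confounder-aware methods and audited data; the point of the sketch is only that, once the data are open, the computation itself is unremarkable.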

And who knows whether the CEO-to-employee pay ratio would still be the same?

And how revolutionary would it be to apply the same reasoning to the assessment of the decision making of our politicians?

DATA WORKERS OF THE WORLD, UNITE!


About history (and modeling)

A nice quotation from H.I. Marrou’s book on historical knowledge. He talks about the historian’s effort to write history (no history without a historian), but it sounds as if we were hearing words about modeling:

History is the result of the effort, in a sense a creative one, by which the historian, the knowing subject, establishes the relation between the past he evokes and the present which is his own.

History is a battle of the mind, an adventure, and, like all human expeditions, it only ever knows partial and quite relative successes, out of proportion with the initial ambition; as from any struggle engaged with the bewildering depths of being, man returns from it with an acute sense of his limits, of his weakness, of his humility.

No science without the creative effort of the scientist.


Unknown unknowns: a data science perspective

If you have forgotten Rumsfeld’s catchphrase about “unknown unknowns”, it is time to read its interesting history on Wikipedia. What I want to discuss here is the fact that the related (in)famous matrix of knowledge can also be interpreted from a data science perspective.

According to the matrix of knowledge we can distinguish between:

Known Knowns (KK)   | Known Unknowns (KU)
Unknown Knowns (UK) | Unknown Unknowns (UU)


To adopt the data science perspective I will consider here that the target of the knowledge process is a random variable Y. Knowing Y can be interpreted in two ways:

  1. in a predictive perspective, where knowing Y means reducing the uncertainty of Y thanks to a set of explanatory variables X: in information theory this boils down to finding some variables X bringing information about (i.e. reducing the uncertainty of) Y, i.e. such that the mutual information I(X;Y) is greater than zero. For unaware readers, by I(X;Y) I mean H(Y) − H(Y|X), i.e. the reduction in the uncertainty of Y once X is observed.
  2. in a causal perspective, where we look for causal variables X, i.e. variables that, once manipulated, change the distribution of Y. We could denote this asymmetric causal relation by I(X→Y).

For the sake of simplicity I will not distinguish between the two cases here and will limit myself to the predictive one.

Our degree of knowledge is instead related to the correspondence between our estimate Î(X;Y) and reality: we know when Î(X;Y) is in concordance with I(X;Y).
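As a minimal sketch of how Î(X;Y) and its significance might be assessed in practice, here is a histogram plug-in estimator combined with a permutation test (the data, bin count and permutation budget are all arbitrary choices for illustration):

```python
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(1)
n = 2000

# X genuinely informative about Y, so I(X;Y) > 0 in reality.
x = rng.normal(size=n)
y = x + rng.normal(scale=0.5, size=n)

def mi_hat(a, b, bins=10):
    """Histogram-based plug-in estimate of the mutual information."""
    a_d = np.digitize(a, np.histogram_bin_edges(a, bins))
    b_d = np.digitize(b, np.histogram_bin_edges(b, bins))
    return mutual_info_score(a_d, b_d)

observed = mi_hat(x, y)

# Permutation test: distribution of the estimate under I(X;Y) = 0.
null = np.array([mi_hat(rng.permutation(x), y) for _ in range(200)])
p_value = (null >= observed).mean()

print(f"I_hat = {observed:.3f}, p = {p_value:.3f}")
# Small p: the estimate is significantly larger than zero and we
# claim knowledge. Large p: as far as our data can tell, X brings
# no information about Y.
```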

So, given those premises, we can distinguish between the following four cases:

  • KK: as an example, consider the laws of mechanics. Y (e.g. a planet’s position) is measurable, X (the gravitational force) is measurable too, and X provides information about Y, i.e. I(X;Y)>0. Furthermore we have a good estimate Î(X;Y) of the mutual information I(X;Y), and this estimate is significantly larger than zero. Then we are reasonably certain that I(X;Y)>0. There is something to know (I(X;Y)>0) and we know it (Î(X;Y)>0).
  • KU: as an example, consider the understanding of cancer onset (Y): we know that we don’t know its causes. We are not able to find (causal or predictive) variables X such that I(X;Y)>0. Nevertheless, we may have access to other variables Z for which we have enough evidence that I(Z;Y)=0. In other words, we have a good estimate of I(Z;Y), but this estimate is not significantly larger than zero. Think also of the dependency between the price of a stock today and its price in 15 days, assessed on the basis of a very long historical record. We have enough evidence from the data that, whatever our effort, we are no better than a random predictor. We are reasonably certain that I(Z;Y)=0.
  • UK: as an example, consider the laws of mechanics before the Copernican revolution. Scientists had access to sufficient data to infer the dependencies, but they were not able to do so. In the UK setting there are some variables X and Y for which I(X;Y)>0, but because of our lack of data or our fallacious inferential method we either disregard X or assume that Î(X;Y)=0. We are not able to show that this regularity exists.
  • UU: as an example, consider wrong models that we take for correct, or fallacious reasoning (e.g. spurious causal relationships) that we take for granted. In this case we are considering variables X for which I(X;Y)=0, but because of our lack of data or our fallacious inferential method (e.g. selection bias, Simpson’s paradox, overfitting, bad assessment of uncertainty) we deem that Î(X;Y)>0. At the same time it could be that I(Z;Y)>0, but either we do not have access to Z or we wrongly deem that Î(Z;Y)=0. We don’t know (i.e. our estimate is wrong) that we don’t know (i.e. we take the wrong dependency or regularity for granted). This is the typical situation of black swans (e.g. financial crises). We cannot forecast them because the variables X we are taking into account are not the right ones (Z). In other terms, we wrongly believe that some variables (e.g. bank profits) have an impact on our phenomenon of interest (e.g. the stock market) and we disregard the important variables (subprime mortgages).
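Putting the four cases together, the quadrant we sit in is determined by two bits: whether a dependency really exists (I(X;Y)>0) and whether our estimate detects one (Î(X;Y) significantly greater than zero). A purely illustrative mapping:

```python
def knowledge_quadrant(dependency_exists: bool, estimate_detects: bool) -> str:
    """Map (reality, belief) to a quadrant of the matrix of knowledge."""
    if dependency_exists and estimate_detects:
        return "KK: there is something to know, and we know it"
    if not dependency_exists and not estimate_detects:
        return "KU: nothing to find, and we correctly find nothing"
    if dependency_exists and not estimate_detects:
        return "UK: a real regularity that our inference misses"
    return "UU: we 'detect' a dependency that is not there"
```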

Self-driving cars, the highway code and clinical trials

Changing the highway code to speed up the introduction of autonomous vehicles sounds like getting rid of clinical trials to speed up the commercialization of new drugs… Is this really what people want? https://lnkd.in/eG3WQPF https://lnkd.in/erjqvfB https://lnkd.in/eePC7fz

By changing the code we are directly authorizing the test of a potentially dangerous technology in the final user environment. This is in principle not allowed for drugs (or similar technologies): they have to pass a certain number of clinical trials https://en.wikipedia.org/wiki/Clinical_trial and approvals before being tested in real conditions, without forgetting that a clinical trial typically starts only after a peer-review process (e.g. scientific publication) has been passed. Clinical study design aims to ensure the scientific validity and reproducibility of the results, since only 10 percent of all drugs that enter human clinical trials become approved drugs. At this stage I have not yet seen either sufficient scientific validity or reproducibility in the autonomous car domain. We are probably at a stage very similar to the preliminary publication stage; for drugs this means (in the best case) that you are still decades away from commercialization.

In the self-driving car case, if the code is changed now, we will directly allow testing in real conditions by commercial actors who have not passed any scientific or protocol procedure and whose reproducibility has not yet been proven. If we think that probably only a small percentage of them deserve the authorization to drive in a real environment, is it really worth taking the risk? Overall I have the feeling that, under excessive economic pressure, we are trading safety for commercialization: this is a very delicate issue that cannot be addressed without an accurate assessment. Unfortunately autonomous cars will have as many deleterious side effects as drugs: before changing the law, let us create a serious protocol for the assessment of algorithms and solutions. The debate is open…

Atemporal truth, science and moving target

Is the pervasive use and effectiveness of Data Science in many scientific domains leading to the refusal of an atemporal truth as the final target of the scientific endeavor?
Even if an atemporal truth existed, the methodological approach of Data Science brings a lot of arguments about the objective impossibility of attaining it on the basis of empirical observations. The adoption of a model is due more to its usefulness than to its degree of absolute certainty. Models describing a phenomenon of interest change and evolve as new measurements are collected, new variables are measured, new validation opportunities are discovered, new computational resources become available and new objectives are set.
The degree of quality of a model is more a conditional property, depending on practical and historical contexts, than an absolute value measuring its presumed distance from the truth. Consider all the models and laws describing a biological organism before and after the invention of measurement devices at the genomic and genetic level: we cannot consider the models before the advent of genomic data better or worse than the ones created afterwards. These two families of models are simply incommensurable: they refer to two different descriptions of the phenomenon of interest, one containing the notion of genes while the other does not.

However, we already knew that “all models are wrong”; what is still more intriguing is: are we sure we are still talking about the same phenomenon? Once we introduce the notion of genes into our ontology, is the organism we are referring to still the same? Is the phenomenon of interest the same? Of course what we perceive with our human senses is the same (i.e. the ontology of our senses is the same), but is the object of our scientific endeavor the same entity as before? Even if we are willing to believe in some atemporal true description of the phenomenon, we cannot deny that the introduction of a new measurement technology (or sensor) has also changed the definition of what the phenomenon is. So if our object of interest changes with the measurement device, can we expect an atemporal, definitive truth?

An important debate in the philosophy of science concerns the evolution of science: is progress cumulative, or do we witness a discontinuous path characterized by paradigm shifts? From a data science perspective, where the phenomenon of interest is defined by the data we measure, is it still possible to define science as an asymptotic journey converging to the definitive truth if the target itself is moving?

Asimov and self-driving cars

Asimov stated the extended version of the well-known Laws of Robotics as follows:

  1. A tool must not be unsafe to use. It is of course possible for a person to injure himself with one of these tools, but that injury would be due only to his incompetence, not the design of the tool.
  2. A tool must perform its function efficiently unless this would harm the user.

Let us suppose now that the tool is a robot acting in a real setting with finite knowledge of its environment (or, equivalently, in a condition of uncertainty). If this robot is a self-driving car, it is expected to act in a closed-loop setting, i.e. to perform a control action continuously. As a rational decision maker, the intelligent robot is expected to:

  1. assess the probability of the potential outcomes of its actions
  2. assign a cost to every outcome, notably, in a binary case, the costs of false positives and false negatives
  3. take the action which minimizes the expected cost or maximizes the expected benefit

In a situation affected by uncertainty (by the way, are you aware of any situation deprived of it?) a rational agent asked to choose between “action” and “no action” should always take the option which minimizes the cost of failure or, equivalently, maximizes the gain.
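As a toy illustration of this expected-cost reasoning, here is a minimal sketch; every number and name in it is hypothetical, chosen only to make the trade-off visible:

```python
# Hypothetical costs: all numbers are made up for illustration.
COST_COLLISION = 1_000_000   # harming a human
COST_HARD_BRAKE = 10         # discomfort, delay

def expected_cost(action: str, p_obstacle: float) -> float:
    """Expected cost of an action given the obstacle probability."""
    if action == "brake":
        return COST_HARD_BRAKE
    # "continue": a collision happens only if the obstacle is real.
    return p_obstacle * COST_COLLISION

def decide(p_obstacle: float) -> str:
    """Pick the action with minimal expected cost."""
    return min(("brake", "continue"),
               key=lambda a: expected_cost(a, p_obstacle))

for p in (1e-7, 1e-5, 1e-3):
    print(f"p={p:.0e} -> {decide(p)}")
# With these costs the car continues only when the obstacle
# probability is below COST_HARD_BRAKE / COST_COLLISION = 1e-5.
```

Under Asimov’s laws, COST_COLLISION is effectively infinite, so the cautious action always wins whatever the probability: exactly the deterministic behavior discussed next.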

Asimov’s laws, on the contrary, do not foresee any uncertainty. They state that a robot MUST not be unsafe, not that it should maximize the degree of safety. No probability of failure (e.g. of harming a human) is foreseen. The logical consequence of his laws is that the cost of a false positive (taking an action which seems good but is indeed harmful) is infinite. Asimov’s laws are deterministic; reality is stochastic.

What is interesting is that Asimov’s laws seem self-evident to any science fiction reader as well as to any human being. A robot should not harm; otherwise it is better not to rely on it. Humans tend to forget uncertainty, but they live in an uncertain world, constantly perform actions in uncertain settings, and have a considerable rate of failure.

Every second, human drivers make errors, unluckily also mortal ones, but no one would consider it acceptable to forbid humans from driving cars because of their mistakes or the number of casualties. We accept that humans as decision makers do their best to minimize losses; are we ready to accept robots making the same errors (while minimizing losses)? If we adopt Asimov’s laws, the answer should be no: no error is tolerated.

Consider now the following situation. In 2019 Elon Musk wants to show Di Maio (the new Italian prime minister) the value of his brand-new self-driving Tesla, not only to wander in the Nevada desert, but also to equip the Italian police with a smart ally to fight drug dealers riding mopeds in the crowded streets of central Naples. After a smooth start, the Tesla suddenly gets stuck and has no intention of moving further: the car seems afraid to deal with an unceasing flow of motorbikes passing it on the left and on the right, and a crowd of “scugnizzi” in blue T-shirts playing football on the sidewalk. As coded in its software, a self-driving Tesla must perform its function efficiently unless this would harm the user. Elon knew it was a great occasion for Tesla; no error was allowed, and he had asked his Texan engineers to stay on the safe side of the decision. Unfortunately, Texan engineers do not know that driving in Naples is a risky business for brave people. No risk means no gain. Zero (human) pain will mean no gain either.

End of the story: Asimov’s laws have been fulfilled. The car could move autonomously only after 8:30 PM, when the match Napoli-Roma began and everyone went home. The expected signature of the contract is postponed to the future. Di Maio gets out of the car, extremely disappointed but very comforted in his heart (“I knew that a US MA in Physics is not worth a non-graduate Italian taxi driver”)… and Elon tweets on the US president’s profile, lobbying for a change in the status of Asimov’s laws: it is definitely time to move to stochastic laws.

Truth is a model

The most common misunderstanding about science is that scientists seek and find truth. They don’t. They make and test models…. Making sense of anything means making models that can predict outcomes and accommodate observations. Truth is a model. (Neil Gershenfeld, American physicist, 2011)