Skoltech is an international graduate research-focused university that was founded by a group of world-renowned scientists in 2011. Skoltech's curriculum focuses on technology and innovation, offering Master's programs in 11 technological disciplines. Students receive rigorous theoretical and practical training, design their own research projects, participate in internships and gain entrepreneurial skills, all in English. The faculty comprises active researchers with international recognition and achievements.

Pavel Osinenko: “The dialogue between regulators and specialists is still poor”

How can the concept of trust be formalized? Can people trust artificial intelligence systems? On the eve of the Trustworthy AI conference at Skoltech, we talked with Assistant Professor Pavel Osinenko about how technical specialists view trust and whether we need standards in the field of artificial intelligence today.


Let's talk about the concept of Trustworthy AI, that is, artificial intelligence worthy of our trust. Why are we talking about trust at all?

I believe that at this point, the term “artificial intelligence” is not particularly informative and often leads to confusion, whereas research publications always require precise formulations. For instance, we can formally define terms such as an algorithm, a system, and a mathematical model. With artificial intelligence, the situation is more complicated, and the line between philosophy, ethics and science gets blurry. I am not a specialist in either philosophy or ethics, so I do not want to make unsubstantiated claims on such aspects of artificial intelligence.

As for complex systems based on machine learning, when we talk about trustworthiness, we mean clearly formalized properties, such as stability, robustness, convergence, safety, adversarial robustness, and things like differential privacy.

Tell us more about these features.

Stability means, roughly speaking, that the system’s state remains bounded in some suitable sense, i.e., there are no blow-ups. What could such a blow-up be? For instance, a reactor explosion, an autonomous car crash, etc.
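
As a toy illustration of boundedness versus a blow-up, consider a scalar linear system x_{k+1} = a*x_k; the minimal sketch below (with assumed numbers, purely for illustration) shows both behaviours.

```python
# Toy illustration of stability as boundedness: a scalar system x_{k+1} = a * x_k.
# With |a| < 1 the state stays bounded (decays); with |a| > 1 it "blows up".

def simulate(a, x0=1.0, steps=50):
    """Return the trajectory x_0, x_1, ..., x_steps of x_{k+1} = a * x_k."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        x = a * x
        trajectory.append(x)
    return trajectory

bounded = simulate(0.9)    # stable: the state shrinks towards zero
blow_up = simulate(1.1)    # unstable: the state grows without bound

print(f"|x_50| for a = 0.9: {abs(bounded[-1]):.4f}")
print(f"|x_50| for a = 1.1: {abs(blow_up[-1]):.1f}")
```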

Safety is usually understood in the sense of constraint satisfaction. For example, an autonomous car must respect the traffic rules, not leave the road, etc.

Robustness is a meta-property. It refers to the ability to retain some other basic property, say, stability, under uncertainty. For example, a system can be stable, but a slight uncertainty, say, a fluctuation of some system parameter, immediately makes it unstable. If, on the contrary, it is robustly stable, it will withstand parameter perturbations up to some limit.

Adversarial robustness is the ability to withstand adversarial attacks. The most trivial example is when a slight distortion of visual information leads to severe misclassification: a car can mistake a pedestrian for a pillar or for a fly on the windshield.
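
A minimal numerical sketch of such an attack on a toy linear classifier (the labels "pedestrian" and "pillar" and all numbers are assumptions made for illustration; real perception systems are far more complex):

```python
import numpy as np

# Toy adversarial perturbation: a linear classifier with score = w . x,
# decision "pedestrian" if the score is positive, "pillar" otherwise.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)                                    # classifier weights
x = 0.05 * np.sign(w) + rng.normal(scale=0.01, size=1000)    # input classified as "pedestrian"

print("clean score:", w @ x)          # clearly positive -> "pedestrian"

# FGSM-style attack: nudge every feature slightly against the sign of the weights.
eps = 0.1
x_adv = x - eps * np.sign(w)

print("perturbed score:", w @ x_adv)                     # negative -> misclassified as "pillar"
print("max per-feature change:", np.max(np.abs(x_adv - x)))  # only 0.1 per feature
```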

Differential privacy is a property of the system that concerns the security of user data. Let us imagine the most straightforward example: there are two people in the database, and we know the age of one of them but not the other. If I query the average age, I will be able to calculate the age of the second person. In a large database this becomes difficult, and, moreover, the queries are specially randomized. Differential privacy is formalized probabilistically, and we can write down the respective formulas.
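
The two-person example and the randomization of queries can be sketched in a few lines (made-up ages and a textbook Laplace mechanism, purely for illustration):

```python
import numpy as np

# The averaging attack from the example: two people, one known age, query the mean.
ages = [34, 61]                   # hypothetical database; the second age is "unknown" to the attacker
known_age = ages[0]
mean = sum(ages) / len(ages)
recovered = 2 * mean - known_age  # exactly the second person's age
print("recovered age:", recovered)

# Differentially private version: randomize the answer (Laplace mechanism).
# For the mean of n ages bounded by age_max, the sensitivity is age_max / n.
def dp_mean(values, epsilon, age_max=120):
    sensitivity = age_max / len(values)
    noise = np.random.default_rng().laplace(scale=sensitivity / epsilon)
    return sum(values) / len(values) + noise

noisy_mean = dp_mean(ages, epsilon=0.5)
print("noisy answer:", noisy_mean, "-> the attack now yields", 2 * noisy_mean - known_age)
```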

There is also fault tolerance, or the ability to maintain normal operation, say, staying stable and within safety constraints, in the case of a sensor or actuator fault, etc.

A question that often arises when people discuss the trustworthiness of these systems is bias in the results. What is the reason for this problem?

There are indeed several questions here. The first is the data. The user chooses the dataset; if you take an inadequate sample, you will get an inadequate result (in the sense of the resulting model). If a system optimizes a certain objective, then a “bias” might be observed in the respective optima. This can be, for instance, the selection of potential candidates for a job based on some prediction of their performance and some formal criteria about their profiles. To address this bias, you need to adjust your objective, say, prioritize this or that profile criterion.
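
A tiny, purely hypothetical sketch of what "adjusting the objective" means in the candidate-selection example: the ranking follows directly from the weights the user puts on the profile criteria.

```python
# Hypothetical candidate scoring: the "bias" lives in the weights of the objective,
# which are chosen by the user, not by the learning algorithm itself.
candidates = {
    "A": {"test_score": 0.9, "years_experience": 0.2},
    "B": {"test_score": 0.6, "years_experience": 0.9},
}

def rank(weights):
    score = lambda c: sum(weights[k] * v for k, v in candidates[c].items())
    return sorted(candidates, key=score, reverse=True)

print(rank({"test_score": 0.8, "years_experience": 0.2}))  # ['A', 'B']
print(rank({"test_score": 0.2, "years_experience": 0.8}))  # ['B', 'A']
```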

Do I understand correctly that machine learning by itself does not solve the problem if a distortion exists in the form of some kind of structural inequality in society? Human intervention is needed at some point.

Machine learning is a tool; its hyperparameters, including the parameters of the objective function, are set by the user. Systems based on machine learning algorithms are also tools. Moreover, they do not differ in essence from some simple control system, say, that of a washing machine; they differ rather in complexity.

What do experts mean when they say that systems based on artificial intelligence technologies are a “black box”? 

Of course, this is also an overused term, and one must look at the context. There is a formal concept: there are so-called white-box models, and there are black-box models. White-box models are based on the laws of physics, the laws of chemistry etc.

Black-box models refer to the situation when a physical description of the object is not available. So we take some abstract structure, a neural network for instance, simply feed some input to the object and measure the output, and thus build a black-box model: an abstract model which has no direct connection to the laws of physics.

The same is true in the case of algorithms, for example, for control or decision support. Some algorithms do not use any model, only the input and output. The simplest controller just looks at the error between the actual value of the signal (could be, say, the spin speed of the washing machine drum) and the target one, multiplies it by a factor and thus (hopefully) achieves the said target.
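
A minimal sketch of such a model-free proportional controller on the drum-speed example (the plant response and all numbers here are made up for illustration):

```python
# Model-free proportional controller: it only sees the error between the
# target drum speed and the measured one, and multiplies it by a gain.
def p_controller(target, measured, gain=0.5):
    error = target - measured
    return gain * error            # control action, e.g. additional motor drive

# Toy closed loop with an assumed, very simplified "drum" response,
# just to show the error shrinking over time.
speed, target = 0.0, 1200.0        # rpm
for _ in range(100):
    u = p_controller(target, speed)
    speed += 0.1 * u               # hypothetical reaction of the drum to the action
print(f"drum speed after 100 steps: {speed:.0f} rpm (target {target:.0f} rpm)")
```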

Can people trust the decisions of these black-box systems?

Here, we return to the very first question. If we understand trustworthiness as a set of formal properties, we deal with questions of systems and control theory. This is in general a tricky technical issue.

For example, the stability of linear systems has been studied thoroughly. We can talk about the stability of a linear system under a linear controller even if we do not know the system’s parameters at all and only know that it is linear. So, can we say that we deal with a black box in this case? We do not know the parameters, but we know the system structure (linear).

If nothing is known about the object or process to be controlled, no guarantees can be given. There is no black magic.

At least some assumptions have to be met to guarantee something.
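
As a toy sketch of what the structural assumption buys: for a scalar linear system x_{k+1} = a*x_k + b*u_k whose parameters are known only to lie in some ranges (the ranges below are assumptions for illustration), one fixed linear feedback can be certified to keep the closed loop stable for every admissible parameter value.

```python
import numpy as np

# Robust-stability check for a scalar linear system x_{k+1} = a*x_k + b*u_k
# whose parameters a, b are unknown except for the ranges they lie in.
a_range = np.linspace(0.8, 1.2, 41)   # assumed uncertainty interval for a
b_range = np.linspace(0.5, 1.0, 41)   # assumed uncertainty interval for b
K = 0.9                               # candidate linear feedback u = -K * x

# With u = -K*x the closed loop is x_{k+1} = (a - b*K) * x_k,
# which is stable if and only if |a - b*K| < 1.
worst = max(abs(a - b * K) for a in a_range for b in b_range)
verdict = "stable for all admissible parameters" if worst < 1 else "not guaranteed"
print(f"worst-case closed-loop factor: {worst:.3f} -> {verdict}")
```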

If we talk about technical challenges regarding trustworthiness, what are the most interesting questions today?

Adversarial robustness and differential privacy are relatively new concepts, which can be considered subtypes of robustness and stability, but they have specific applications. As the complexity of systems grows, new trustworthiness properties may be formulated.

In my view, one of the most attractive directions of research into the trustworthiness of systems employing machine learning is reinforcement learning. I suppose reinforcement learning algorithms are the future of control systems for robots, self-driving vehicles, etc. A lot of effort is currently being put into guaranteeing the trustworthiness of reinforcement learning.

What do you think is the most important issue worth mentioning in this context?

It seems that there is a lack of communication between lawmakers and technical specialists. Judging from numerous working group and committee meetings, the former have a rather limited understanding of the technical side of things. Rosstandart is currently involved in the standardization of artificial intelligence initiated by ISO, the International Organization for Standardization. Judging by the documents that have emerged from this activity, they are of little use to specialists. At best they offer literature overviews or general definitions, many of which are rather debatable. If I work in, say, gas distribution networks, I want to take a standardization document and find out, for instance, which pipe to pick for gas at such-and-such pressure. What do I get from the artificial intelligence documents? Not much, unfortunately. To conclude, I would say that the technical, formalizable aspects of the trustworthiness of machine learning and reinforcement learning should be elaborated first.

Contact information:
Skoltech Communications
+7 (495) 280 14 81
