Motivation Selection Method

The second set of tools for controlling the undesired behaviour of Superintelligence tries to motivate it to pursue goals that are in our (human) interest, which is why this approach is called the Motivation Selection Method. John Danaher provides a summary of these methods in his article “Bostrom on Superintelligence: Limiting an AI’s Capabilities”, parts of which I have used below to convey the essence of Motivation Selection in a less technical way.

It is in some ways an extension of the ‘Incentive Method’ from the Capability Control set of tools. As with the Control Problem approach, Bostrom is clear that this set of methods would have to be implemented before an AI achieves Superintelligence. Otherwise, the Superintelligence could have a decisive strategic advantage over human beings, and it may be impossible to constrain or limit it in any way. That is why I have already stressed that we really have about one decade, until about 2030, to implement mechanisms for controlling Superintelligence.

The second suggested method of motivation selection is called “domesticity”. The analogy here is with the domestication of wild animals: dogs and cats have been successfully tamed over many generations. The suggestion is that something similar could be done with superintelligent agents, so that they too could be domesticated. A classic example of a domesticated Superintelligence would be the so-called “oracle” device. This functions as a simple question-answering system whose final goal is to produce correct answers to any questions it is asked. Even a simplistic micro AI gadget like “Alexa”, which I mentioned earlier, can already do that. A superintelligent oracle would usually do just that from within a confined environment (a “box”). This would make it domesticated, in a sense, since it would be happy to work in a constrained way within a confined environment.

However, giving Superintelligence the seemingly benign goal of giving correct answers to questions could have startling implications. To answer a question, Superintelligence may require quite a lot of information, as anyone who has tried to talk with Google Home or Amazon Alexa will appreciate. Once that information is stored in its memory, it will make the superintelligent agent more knowledgeable and more capable, increasing the risk of its misbehaviour, including a potential ‘runaway’, i.e. a total loss of control by humans.

Indirect Normativity

The third possible method of motivation selection is Indirect Normativity. The idea here is that instead of directly programming ethical or moral standards into Superintelligence, you give it some procedure for determining its own ethical and moral standards. If you get the procedure just right, Superintelligence might turn out to be benevolent and perhaps even supportive of human interests and needs. The intention is for Superintelligence to function much like an ideal, hyper-rational human being, which can “achieve that which we would have wished it to achieve if we had thought about the matter long and hard”.

One of the problems with this method of motivation selection is ensuring you’ve got the right norm-picking procedure. Getting it slightly wrong could have devastating implications, particularly if a superintelligent machine has a decisive strategic advantage over us.