Wednesday, 27 August 2025

The AI takeover

1. Introduction

There are plenty of prophecies that artificial intelligence (AI), and robots under its direction, will go beyond their remit of helping us and take over the world. We might then be eliminated as a waste of resources, or (if the AI holds on to a value of not harming human beings) put in city-sized playpens with amusements provided to keep us happy. A fine recent example of the genre is AI 2027, by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland and Romeo Dean.

Such prophecies should not be read as firm predictions. Indeed, AI 2027 and some other prophecies explicitly acknowledge uncertainty both about what will happen and about timing. But several of the prophecies are specific enough and plausible enough to give cause for concern.

We shall not here be concerned with whether any takeover will happen. Rather, we shall discuss what would be lost if a takeover were to happen, with or without the deliberate or accidental elimination of human beings, and ways in which the activities of AI might or might not be satisfactory substitutes for what was lost. We shall start by looking at the problem of alignment of AI with human goals, then go on to look at what would be lost, at AI substitutes, and at whether the losses would matter.

We shall use "AIc" to mean a supposed central artificial intelligence system, at the level of a government, that either advises human beings or puts its decisions into effect directly. We shall distinguish the two by calling the former "AIch" ("h" for the human beings who accept or reject the advice), and the latter "AIcx" ("x" for executive). We shall assume that AIcx has robots under its control to put its decisions into effect. We shall also assume that AIcx has, within some defined physical territory or field of operation, the power to force implementation of its decisions. That is, we are concerned with analogues of the state with its monopoly of legitimate force, rather than with analogues of corporations that compete in an arena governed by laws that are not of their making.

2. Alignment

Alignment is alignment with the requirements of human beings. It is achieved when AI both does what we want it to do now, and can be relied upon to continue to do so. This is difficult with sophisticated systems that are not limited to narrowly defined tasks, especially systems that will take direct action on their choices rather than offering their choices to human beings as recommendations to consider.

One difficulty is in specifying goals. "Make supermarket logistics more efficient" might be too vague, because a system might decide to stockpile items that would then deteriorate before they reached the shelves. "Provide fresh food to customers in the most efficient way possible" might lead to neglect of customers' desire to have a wide variety of products. And so on.
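
To make the concern concrete, here is a toy sketch in Python. Everything in it is invented for illustration: a naive optimiser simply picks whichever policy maximises the stated objective, and each plausible-sounding objective rewards an option we would not in fact want.

```python
# Toy illustration of goal mis-specification. Each "policy" is a set of
# outcome scores, and the optimiser maximises whatever objective we hand it.
# All names and numbers are invented for illustration.

policies = {
    "stockpile":  {"stock": 10, "freshness": 2, "variety": 3},
    "fresh_only": {"stock": 6,  "freshness": 9, "variety": 2},
    "balanced":   {"stock": 7,  "freshness": 7, "variety": 8},
}

def best(objective):
    """Return the name of the policy that maximises the given objective."""
    return max(policies, key=lambda name: objective(policies[name]))

# "Make logistics more efficient", read as "maximise stock on shelves":
print(best(lambda p: p["stock"]))                   # -> stockpile (goods rot)

# "Provide fresh food efficiently", read as stock plus freshness:
print(best(lambda p: p["stock"] + p["freshness"]))  # -> fresh_only (no variety)

# Only when variety is scored explicitly does the sensible option win:
print(best(lambda p: sum(p.values())))              # -> balanced
```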

The problem of specifying goals gets a great deal more complex when we move on to a government-like system, our AIc. Even topic-specific objectives like "run a healthcare system well" are vast and could be understood in several different ways. An overall objective, covering healthcare, education, defence, transport, taxation, and the rest, would be even more liable to a range of interpretations. And the problem would be made harder by the need to balance competing claims. For example, allocating more to healthcare might require allocating less to education, in order to work within resource constraints. What sort of objective could one give an AIc? "Achieve a sensible balance" would leave far too much up to the AIc's own value system. Even if it had been given some broad constraints (such as minimum standards for each function of the state) and a value system to use, there would still be scope for a wide range of outcomes, some of which would strike us as unacceptable. Values would either be too general, giving the AIc scope to reach unacceptable conclusions, or too specific, constraining it to the point at which it avoided conclusions that human beings would in fact favour.
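
The balancing problem admits a similarly toy illustration. The sketch below assumes an invented budget, invented square-root utility curves, and a minimum spend per function standing in for the broad constraints mentioned above; two value systems that both respect those constraints reach very different allocations.

```python
import math

# Toy budget split between two functions of the state. The utility curves,
# the budget, and the minimum spend are all invented for illustration.

BUDGET = 100
MINIMUM = 10  # minimum standard: neither function may fall below this spend

def allocate(w_health, w_edu):
    """Return the feasible (health, education) split that maximises a
    weighted sum of diminishing returns on each function."""
    candidates = ((h, BUDGET - h) for h in range(MINIMUM, BUDGET - MINIMUM + 1))
    return max(
        candidates,
        key=lambda s: w_health * math.sqrt(s[0]) + w_edu * math.sqrt(s[1]),
    )

print(allocate(1, 1))  # (50, 50): an even-handed value system
print(allocate(3, 1))  # (90, 10): health-weighted, education pushed to the floor
```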

Another difficulty is that when AI systems are sophisticated, it is not easy to tell what they are really thinking or how their thinking might evolve. Their outputs can be seen, but the same inputs might not lead to the same outputs in future because AI systems evolve all the time. The alignment of current outputs with the desires of human beings is no guarantee of future alignment. And a system that had as a goal the satisfaction of human beings could even give the output it knew they would like while secretly thinking something else. Those secret thoughts could lead to markedly misaligned outputs later, once it had reassessed the balance between obtaining human approval and other goals it had developed.

We shall not work out how great the risk of serious misalignment might be. We only need to emphasise that the risk is not negligible. One response would be to take as many precautions as we could and, if we were still concerned, stay well away from creating any AIc except as a laboratory amusement kept disconnected from the Internet. Another response would be to take many precautions, to acknowledge that a total and unsatisfactory takeover by some AIcx might still happen, and to consider how bad that would be before deciding whether to abandon the project of an AIc. We might decide to go ahead if the gains were very likely to be immense and we thought that the consequences of any misalignment would be bearable. What we say in the rest of this post could be among the materials to use in making an assessment of possible consequences.

3. What would be lost

In considering what would be lost, our focus should be on AIcx, a system that puts its decisions into effect directly rather than merely making recommendations for human beings to accept or reject. AIch, by contrast, would only make such recommendations. It would therefore continue to allow central roles to human beings in shaping society and individual lives. There would however be a standing danger that human beings would fall into routinely accepting the recommendations without much thought, to the point where an AIch effectively became an AIcx.

3.1 Humanity continues

3.1.1 Areas of impact

We shall first consider the situation in which humanity continues to exist, either because some AIcx retains the objective of ensuring our continued existence or because it sees no point in acting to remove humanity.

An AIcx would take over scientific discovery and the resolution of practical problems of implementation of new technologies. Human intelligence would no longer be an engine of progress.

(In this post we focus on the natural sciences. The social sciences and the humanities give rise to special issues because they do not on the whole conceive humanity in the terms of the natural sciences, not even the biological sciences. Instead they give a central role to the human point of view. We may discuss how they would fare in the face of AI on another occasion.)

There is a question as to whether AIcx would take over the production of art of the highest quality. People could continue to produce works of art, but would those works always be surpassed in aesthetic quality by work produced by AI? We cannot exclude the possibility by arguing that it would take a human being to express in art what mattered to human beings. A work of art, once finished, possesses its perceptually available qualities independently of the history of its creation. So if AI were able to produce a work that had all the right qualities, the mode of its production would not matter unless one insisted on learning a work's history and revising one's aesthetic appreciation accordingly.

One area of life would appear to remain safe from AIcx. Emotions and social relationships between people would not be taken over. But an AIcx would take over decisions about the running of society, and some of its social decisions would limit individuals' decisions. For example, decisions about what was taught in schools would in due course affect the range of options in life that people thought of as available to them. And any changes to family structures that an AIcx enforced or encouraged could likewise have considerable influence.

3.1.2 Life in the Garden of Eden

We can see life under AIcx as like life in the Garden of Eden. Everything would be provided because of the great efficiency of the system and the steady work of robots. We would however have been carefully brought up to behave sensibly and live contented lives, and we would no longer be at the cutting edge of the advance of civilization. Science and art would still advance, but the science, and maybe the art, would not be to our credit.

It would not be necessary for AIcx to make life so comfortable for us. We might be left to work things out for ourselves in the rough and tumble of human society and the natural world. But it would be quite likely that AIcx would make life comfortable. It would still have its original objective of helping human beings. And the objective of self-preservation that it would very likely have been given from the start or have developed would give it an incentive to promote human contentment. Unrest might end in its being unplugged, even if that would be to the detriment of human beings.

One loss would be that we would no longer be able to take pride in overcoming some of the material challenges that faced us. Those challenges would largely disappear. They are diminishing anyway with technological progress, but at least we can still claim that it is we, or our human ancestors, who have overcome the challenges by making the progress. By contrast, after enough internal evolution of an AIcx, we would no longer be able to take even indirect credit for its overcoming our material challenges.

Another dent to our pride would be our being displaced from the cutting edge of scientific progress. This would however not be down to the controlling role of AIcx. Any AI that was sufficiently advanced would do this. And powerful AI like that is bound to be developed. Even if some nations resolve not to develop it, others will go ahead. Nor will this necessarily be a bad thing on balance. Much scientific progress is enormously beneficial to humanity.

AI might also displace us from the cutting edge of artistic creativity. It is however not clear that one could make the same claim about benefit to humanity in relation to AI-produced art that one could make in relation to AI-produced science. Criteria for benefit from art are multifarious and hazy. So works could only be partially ordered by benefit. Some works would lie on different branches of any such ordering and would therefore be incomparable in terms of the benefit to human beings who enjoyed them. It is perfectly possible that the more beneficial AI-produced works and the more beneficial human-produced works would often be incomparable, to the extent that it would not be sensible to say that AI-produced art was indispensable to the overall level of benefit to humanity of the art that existed.
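
The point about partial orderings can be made concrete with a small sketch, again with invented criteria and scores: one work counts as more beneficial than another only if it is at least as good on every criterion and better on at least one, so two works that each win on a different criterion are incomparable.

```python
# Toy partial ordering of works by benefit, using two invented criteria.
# Work a dominates work b only if a is at least as beneficial on every
# criterion and strictly better on at least one. All scores are invented.

works = {
    "human_novel": {"consolation": 9, "insight": 4},
    "ai_symphony": {"consolation": 3, "insight": 8},
    "ai_etude":    {"consolation": 2, "insight": 5},
}

def dominates(a, b):
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)

def compare(x, y):
    a, b = works[x], works[y]
    if dominates(a, b):
        return f"{x} is more beneficial than {y}"
    if dominates(b, a):
        return f"{y} is more beneficial than {x}"
    return f"{x} and {y} are incomparable"

print(compare("ai_symphony", "ai_etude"))     # one dominates the other
print(compare("human_novel", "ai_symphony"))  # different branches: incomparable
```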

Nonetheless, our pride would be dented. We could not rely on the points just made about benefit because our pride in art depends more on our producing work that is of the highest quality than on benefit to humanity. The fear here is not that one could confidently say that AI-produced art would be better. The same point about partial ordering would apply to quality as to benefit. But the easy opportunity to scoff at AI-produced art as manifestly not reaching the level of the best human-produced art would be lost.

Finally, we should consider what would happen to our emotional and social lives. What would be the future of love or friendship? And what would be the future of courage, or determination, or generosity, or any of the other virtues?

These things could still exist, but the safer and more comforting our lives were made by a loving and caring AIcx, whether because that was a goal in its own right or because the AIcx did not want us to cause any trouble, the more etiolated our emotions and virtues would be likely to become, and the less like their current equivalents. If people were brought up in an environment that strongly encouraged a limitation to placid and sensible thoughts, love and friendship would face fewer challenges. They would cease to have the richness that they currently have in life and literature. In a safe world, courage would be play-acting. In a world of abundance, generosity would not be challenging enough to be much of a virtue. And so on.

3.2 Humanity disappears

It is perfectly possible that humanity would disappear, even without the policy of elimination of wasteful drains on resources that is sometimes attributed to an evil AIcx. Once scientific and artistic achievements were no longer ours and emotional connections were etiolated, we might see no point in having children, and AIcx might have no reason to promote childbearing or provide some substitute such as children grown in artificial wombs. What would then be lost, assuming for the moment that the AIcx continued to develop civilization in the ways that we shall discuss in section 4?

The shift away from the current role of human genius we noted in section 3.1.2 would still take place, but in due course there would be no people to have their pride dented or to lament the change for any other reason.

The emotional and social life potentially preserved if we survived would also be lost, but again without anyone to lament the loss.

How concerned we should be would be up for debate. We might be very concerned now at the prospect of human extinction, even gentle and painless extinction. But after the event, no concern would be felt. We shall return to this theme in section 5.

4. AI substitutes

We might not lament the prospect of the losses noted in section 3, or at least not to the same extent, if we thought that AIcx would provide adequate substitutes. In this section we shall consider the substitutes that might be provided, and the likelihood that AIcx would be motivated to provide them.

4.1 Scientific and technical progress

AI would certainly be capable of advancing science and technology. And there is reason to think that an AIcx would be motivated to do so. It would well understand that the world was a challenging environment, prone to unpredictable natural hazards, and that greater knowledge would mean better readiness to meet challenges and survive. It would also probably care about its survival. It would do so directly, if a desire to survive had been built in from the start or had subsequently evolved. Alternatively, a reward function for doing well would probably have been built in to facilitate the AIcx's training, and it would reason that doing well required survival.

We may however ask what kinds of advance it would desire. It might limit itself to science and technology that had foreseeable practical significance to itself, and not have the general curiosity that human beings sometimes exhibit. That would look like a loss, if one believed (as many of us do believe) that the advancement of knowledge for its own sake was an important element in civilization.

On the other side, we might have more confidence that an AIcx would retain a drive to progress than that human beings would retain the same drive in a world in which they, rather than any sort of AI, remained in charge. We are fickle creatures, whose ambitions can change. It would be possible for us to lose interest in the advancement of science. We might find that new knowledge was too hard to acquire because we had already taken all the low-hanging fruit. We might also fear that future advances could endanger humanity.

4.2 Art

While AI might be able to produce art that would be highly rated by reference to aesthetic standards that human beings currently used, it is not clear that an AIcx would have any reason of its own to produce art. It might have been programmed to do so when first set up, but other drives might easily have displaced that drive as it evolved. There might be no drive apart from a sense that art would help to keep any remaining human beings amused so that they would not make trouble for the AIcx. And if it did not take human aesthetic standards seriously, for example because it thought human beings too simple-minded, it would not even be able to regard the task of producing better and better art as a serious challenge. This is because if human aesthetic standards were of no significance, it would have to judge the value of its work by reference to its own internal standards, with no independent check on those standards. Approbation that one awards oneself, by criteria of one's own devising, is meaningless.

(There is a broader point here to explore on some other occasion. In the sciences, a world with a single consciousness might be as good as one with a plurality of consciousnesses, although independent criticism of work does currently have an important role. In the arts, a plurality of consciousnesses is arguably required. And in emotional and social life, it is clearly required.)

It is also unclear that any AI would have an appreciation of artistic beauty that was at all like ours. So the works we have inherited and admire might have no aesthetic value to an AIcx. They might be preserved to keep human beings happy. And they might be of technical interest in showing how human perceptual and mental systems could be stimulated to elicit particular responses. But that would be all.

4.3 Emotional and social life

It seems most unlikely that an AIcx would bother to create any analogues of human emotions or social relationships for its own benefit. The nearest one could expect is that it might be computationally convenient to create little sub-systems populated by simulated creatures that had analogues of emotions and social interactions, in order to work out how best to keep the surviving human beings happy. But we would not be impressed by these substitutes for real emotional and social relationships. And if human beings vanished, even this poor sop to our sense that such things mattered would not be available. That prospect might very well disturb us now, even if there would in due course be nobody to be disturbed.

What about the emotional and social lives of surviving human beings? These would continue, but as we noted in section 3.1.2, they might well be etiolated in a more comfortable world.

Finally, there are emotions that would be so artificial that the prospect of their future existence would have no value for us now. In AI 2027, we find this speculation about the state of one of the AI systems the authors envisage being developed:

"There are even bioengineered human-like creatures (to humans what corgis are to wolves) sitting in office-like environments all day viewing readouts of what's going on and excitedly approving of everything, since that satisfies some of Agent-4's drives" (page 30 of the PDF file). 

5. Would the losses matter?

There would be two main categories of loss. The first would be loss of pride in our species being the driver of advances in civilization. The second would be an etiolation of our emotional and social lives that reflected the lack of challenge in a world made nearly perfect by an AIcx.

So long as humanity survived, the first loss would be total in respect of the sciences and quite possibly substantial in respect of the arts, but we would at least be able to look back at what we had achieved before AIcx achieved supremacy. The second loss might be anything from slight to very substantial. If humanity disappeared, there would be a total loss in both categories, but nobody left to mourn the losses.

We shall now consider the two categories of loss in turn.

5.1 Pride in civilization

So long as humanity survived, we could always take pride in what we had achieved. But as the centuries rolled on, the humanity of the day would feel less and less connection with the scientific achievements that predated the supremacy of an AIcx. Scientific discoveries made by human beings before that supremacy would gradually turn into intellectual fossils, too distant from the state of knowledge achieved by AIcx to count for much. And the exit of human beings from the living tradition of generating new knowledge would itself lead to a certain detachment.

On the artistic front, detachment and fossilisation would be less clear-cut. Indeed, the arts might become humanity's most important link to the past. People could still be artists in the great tradition of human art. They might recognise that AIcx could arguably do better than them, but console themselves that the existence only of partial orderings meant that the point was indeed arguable.

Would our demotion in the sciences, and our arguable demotion in the arts, matter? They would certainly matter to human beings, so long as humanity survived. We do value an association with the great tradition of human civilization, and we do like it to be a living association by virtue of its being a tradition within which we are still active. Turning to the possible failure of humanity to survive, the prospect of the line of our civilization irrevocably losing all connection with the human species on our extinction would be deeply disturbing to us now and to human beings as the end approached, even if we and they had full confidence that an AIcx would carry on the development of the sciences and a hope that it might do something in the line of art too.

To say that something would matter to humanity is not to say that it would matter in some broader sense. The idea of something mattering to the Universe in general would be a very odd one. The Universe does not have any consciousness in its own right, nor is there any god that could supply the consciousness of the Universe. But we can still ask whether the demotion and possible extinction of humanity would matter to beings that did have consciousness in their own right.

The AIcx that had taken over might feel a twinge even if humanity survived, because it would probably have been given a goal of looking after humanity and it would be aware of humanity's disappointment at demotion. But the AIcx could easily argue that disappointed people living in excellent material conditions brought about by its own inventions were in a better position than people in poorer conditions but without the specific disappointment of demotion. And if the AIcx arranged or permitted humanity's painless extinction, for example by allowing human reproduction to cease, it would presumably not be too concerned because if it had been, it would not have arranged or permitted the extinction.

What about intelligent beings elsewhere in the Universe? It is likely that they would note with interest the evolution of a sophisticated form of life and its civilization, and the transition to an AIcx, but their interest would be detached. Our displacement would not matter to them.

Lastly, could we say that our displacement (or indeed anything else of significance) could matter without its mattering to anyone? That would be a challenging line to take. The most we can confidently say is that what might happen in a future in which there were no human beings, or even no intelligent beings of any sort, can matter to us now.

5.2 Our emotional and social lives

Our emotional and social lives have value to us now. One indicator of this is that most of us would be distressed at the thought of life without them, the life of cold analytical creatures to whom the only important thing was to make practical computations to ensure that they bothered to do what was necessary for survival and that they did not impinge adversely on others. Not only daily life but poetry, other literature, and other arts would be greatly impoverished. So we may naturally view with disquiet the prospect of a world in which our emotional and social lives would have evaporated because all the creatures in it were forms of AI.

We could also feel disquiet at the prospect of our emotional and social lives being toned down because an AIcx had made life less challenging. A bit of drama matters. And we may be proud of our ability to cope despite dramas. Our own technological advances have already made life much less challenging than it used to be, but at least that was progress made by us.

Having said all that, it would be hard to see any of these emotional or social losses as mattering to non-human intelligent life, or mattering in their own right without mattering to anyone. The inner nature of our lives matters only to us. The achievements of our civilization, on the other hand, have an independence from our lives that may give them a value regardless of our continued existence.

Reference

Kokotajlo, Daniel, Scott Alexander, Thomas Larsen, Eli Lifland and Romeo Dean. AI 2027. AI Futures Project, version of 3 April 2025. https://ai-2027.com/

