Abstract:
Most agree that consciousness bears some sort of important connection to well-being. For example, consciousness is necessary for having experiences of pleasure and pain, and these experiences are often — if not always — good and bad for us, respectively. The subject of AI consciousness is therefore highly relevant to the subject of AI well-being.
I examine and defend a behavioral criterion of consciousness: if two entities are similar with respect to their behavioral dispositions, then they are to that extent similar with respect to phenomenal consciousness. Thus, AI systems that are behaviorally similar to us will have a conscious life that is similar to our own. The behavioral criterion is neutral regarding the nature of the connection between behavior and consciousness. It is therefore compatible both with a dualist position on which the connection is a matter of psychophysical laws, and with a physicalist position on which the connection is a matter of grounding or identity. I consider two arguments for the behavioral criterion, respond to an objection to the behavioral criterion, and briefly explore how the criterion bears on current and near-future AI.
The first argument for the behavioral criterion appeals to future attitudes towards AI. We can anticipate that, if it comes to pass that there are AIs sharing all or nearly all of the complex behavioral dispositions of human beings, then most people will find it highly plausible that these AIs are conscious. AIs with such complex behavioral dispositions would be capable of joining us in society as friends and co-workers. It is foreseeable that, if this arrangement goes on long enough, the common sense position will be that AIs are conscious in more or less the same way that human beings are. And if it is foreseeable that the behavioral criterion would be the common sense position among people who regularly interacted with humanlike AI, then the behavioral criterion should be regarded as the default position now. For if we were to reject the behavioral criterion now, then we would have to suppose that our own epistemic position is somehow superior to that of future people who regularly interact with humanlike AIs.
A second argument for the behavioral criterion appeals to what has been called the Copernican principle of consciousness.1 Just as we should not suppose that human beings occupy a privileged place in the universe — such as the exact center — we also should not suppose that humans occupy a privileged place with respect to consciousness. We should not suppose that, among all the possible sorts of entities that would share more or less the same set of complex behavioral dispositions as human beings, we are fortunate to be among the entities whose behavior is underwritten by conscious experience. This would require a kind of cosmic coincidence, and we should not believe in such cosmic coincidences unless we have very strong antecedent reason to do so. Some theists may take themselves to have such reasons — they might suppose that God created us in a special way that produced not only our sophisticated behavioral dispositions but also conscious states underlying those dispositions. But the default view should be that we are not special among entities that behave in humanlike ways.
One way to argue against the behavioral criterion would be to argue for a view in the metaphysics of mind that is inconsistent with it — for example, various versions of the identity theory, dualism, or psychofunctionalism. But any such argument will be highly contentious. Partly in order to avoid this controversy, some philosophers have attempted to argue against the behavioral criterion in a more metaphysically neutral way. Eric Schwitzgebel and Jeremy Pober have argued that we should not infer phenomenal similarity from behavioral similarity when the entity in question is what they call a “consciousness mimic”: something that “…exhibits superficial features best explained by having been modeled on the superficial features of a model system, where in the model system those superficial features reliably indicate consciousness.”2 The rough idea is that if an AI is designed to behave like us, then we ought not infer consciousness from its behavior, since the best explanation for the behavior is not that it is conscious but that it has been designed to mimic us.
The problem with this argument is that even if a mimic’s behavior is in some sense explained by the fact that it is mimicking us, this explanation need not compete with an explanation that appeals to consciousness. By way of analogy, suppose that I learn to play chess by mimicking Magnus Carlsen. I watch him play for hours and hours, and whenever it is my turn, I ask myself: “What would Magnus do here?” If I make a brilliant move on a particular occasion, then my making this move may well be explained by my mimicry of Magnus Carlsen. But it may also be explained by the fact that I am good at chess. After all, being a successful Magnus Carlsen mimic — that is, making the moves that he would make, across a wide range of possible board states — is a way of being good at chess.
Similarly, being a successful consciousness mimic may be a way of being conscious. That depends on the relationship between consciousness and behavior. So we are back where we started; the mimicry argument makes no headway against the behavioral criterion.
The behavioral criterion does not imply that current AIs such as LLMs are conscious. LLMs are to a certain extent similar to us with respect to linguistic behavior, but there are obviously many respects in which they do not behave like us. However, the behavioral criterion suggests that there is no barrier in principle to humanlike consciousness being realized by an LLM. There is therefore good reason for well-being researchers to be concerned about the well-being of LLMs, even if consciousness is taken to be necessary for well-being, and especially insofar as future LLMs more closely model human behavior.
