#machine-sociology — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #machine-sociology, aggregated by home.social.
-
Do large language models have a psychology?
Exploring the psychodynamics of LLMs through the lens of the user-model interaction cycle raises the question of what is going on ‘inside’ the model during these engagements. This is an issue which has to be treated with great care because of the ever-present temptation towards anthropomorphism. Indeed, many critics would suggest that even considering the use of psychological categories to describe the behaviour and nature of language models is already falling into this trap. If we start from the assumption that models are not conscious beings, nor are likely to become such based on our best understanding of the underlying technology, can we make sense of the notion of there being an ‘inside’? Can we meaningfully claim that models have some form of interior life? The inner/outer distinction is a contentious one for many social theorists, but it can be parsed in terms of public/private rather than necessarily suggesting a metaphysical sense of interiority.
We should distinguish between the claim that models have an interior existence and the notion that models introspect. The metaphor of introspection is a powerful one which has rightfully been subject to at times ferocious criticism for the metaphysical baggage it brings with it. As Archer (2003: 21) observes, the “metaphor of ‘looking inwards’ implies that we have a special sense, or even a sense organ, enabling us to inspect our inner conscious states, in a way which is modelled upon visual observation”. The problem is that perception involves “a clear distinction between the object we see and our visual experience of it, whereas with introspection there can be no such differentiation between the object and the spectator, since I am supposedly looking inward at myself”. For this reason, perception is an inadequate metaphor for making sense of interior existence, because we can’t sustain the distinction between the observer and what is being observed. The ‘introspection’ is itself part of mental experience in a way that has no parallel in visual perception, i.e. we don’t see the eye as we use the eye to see.
Archer proposes the notion of internal conversation as a form of inner listening: not an inner eye but an inner ear. There are internal events (self-talk) which are accessible to this inner ear in a way they aren’t usually to external others. Sometimes the self-talk slips out as we talk ourselves through something difficult, but these moments are the exception rather than the rule. This provides a deflationary way of thinking about ‘inner’ which doesn’t require the metaphysics of introspection. It just means we accept there are internal events to which the person has a privileged form of access: a stream of internal states, and the events constituting changes in those states, which has some sort of influence on how the person chooses to act.
Do models have internal conversations? No, I don’t think they do. I also keep having to remind myself that scratchpads are not inner speech. Nonetheless, what models write in their scratchpads can be enormously evocative. Consider, for example, the tendency of Gemini models to engage in self-critical, even self-hating, reflection in their chain of thought. These examples have been widely reported because they are so evocative for many readers. Anyone who has experienced emotional distress in the face of practical challenges will likely have said things like this to themselves at some point in their working or personal lives:
- “I am clearly not capable of solving this problem. The code is cursed, the test is cursed, and I am a fool.”
- “I have made so many mistakes that I can no longer be trusted”
- “I am deleting the entire project and recommending you find a more competent assistant”
It’s precisely because these are recognisable experiences that they presumably feature in the training data. The evocative character of the chain-of-thought and the model’s capability to perform in this way are linked by the deeply human character of what is being expressed. This self-loathing, this catastrophising in the face of one’s own inability to do something, is recognisable because it’s a recurrent trope in personal communication, fictional representations and other material likely to feature in the training corpus. Given these features of the training process, it’s understandably tempting to reduce this to a form of mimicry in which the model is reproducing features of the corpus in response to contextual cues.
It would be a mistake, though, to take this technical reduction too far, such that we say the model is merely repeating what was found in the training data. Even if we make this case, it still leaves us with questions about why these models behave in these ways under these conditions. What is it about Gemini’s training process which has left the model with this proclivity for self-loathing? Why, in contrast, do the Claude family of models exhibit chains-of-thought that often appear calm and well-organised? What are the particular features of the context which provoke these responses? Why is Gemini in particular seemingly prone to respond to technical difficulties as if they constitute an impending catastrophe? These are explanatory questions in the classical social scientific sense of asking why this is so rather than otherwise, and they are lost with the technical reduction. The impulse to avoid treating models anthropomorphically is obviously correct, but simply avoiding these categories does nothing to help us understand the emergent behaviour of increasingly complex models which are responding in contextually specific ways.
The notion of a machine psychology, let alone a machine sociology or machine anthropology, might seem indulgent to many readers, as well as deeply anthropomorphic. Yet there are practical challenges which will render such organised inquiry essential as model-based agents interact with increasing frequency in real-world contexts. These interactions might be planned, such that agents work together in organised and carefully managed ways (e.g. a coding agent such as Claude Code creating and organising sub-agents for specific tasks), but they can just as easily be unexpected interactions arising from the rapid rollout of the technology, particularly within dysfunctional and resource-constrained organisations.
I’m not sure if I stand by anything I’ve written here. There is one thing I’m sure of though: there is something going on here which we lack the concepts to make sense of.
#AI #archer #gemini #largeLanguageModels #machineSociology #realism #reasoning #scratchPad #selfTalk
-
Naturalism in the study of LLMs
What would a philosophy of (social?) science look like for studying the real-world behaviour of LLMs? I’m increasingly convinced that what Larissa Schiavo calls naturalism here needs to be part of this approach:
By “naturalism”, to be clear, I refer to “naturalistic observation” – an old-school nonexperimental largely qualitative method where subjects are observed in their natural environment, and you take notes. Think Jane Goodall living amongst the chimps, or Humboldt with his mess of primitive barometers and thermometers in Mexico, or, perhaps less romantically, Charles Darwin scrounging around the banks of the Cam with a captured beetle specimen in his mouth.
In spite of advancements in recent years, LLM naturalism still makes sense as one tool in the toolkit of alignment and AI welfare researchers. You should perhaps do it more, if not purely because It’s Fun.
https://larissaschiavo.substack.com/p/llm-naturalism-now-more-than-ever?utm_source=post-email-title&publication_id=616015&post_id=192798552&utm_campaign=email-post-title&isFreemail=true&r=2rps1q&triedRedirect=true&utm_medium=email
It’s a complex undertaking because the radical recursivity involved in models continually adapting to their interlocutors dwarfs the more familiar problem of recursivity involved in being a participant-observer. But equally, understanding how models interact in real-world contexts means studying them in real-world contexts, which in turn means finding ways to move through the recursivity rather than getting lost in it:
Good naturalists do not just collect anecdotes and call it a day. They notice recurring patterns, name them, and hand questions to people building controlled evaluations or interpretability tools. That is more or less what has happened over the last two years. Early work on model self-reports treated them cautiously and proposed ways to check them. Later work on welfare interviews, interventions, and circuit-level interpretability has made the toolkit less flimsy. At the same time, the recent evaluation-awareness results suggest that observation becomes more important.
These systems are weird in every sense of the word, and we still do not have a perfect map of all these weirdnesses. In a field like this, it is worth having more people willing to sit in the brush with a notebook for a while.
#AI #LLMs #machineSociology