home.social

#reasoning — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #reasoning, aggregated by home.social.

  1. Do large language models have a psychology?

    If we are exploring the psychodynamics of LLMs through the lens of the user-model interaction cycle, it raises the question of what is going on ‘inside’ the model during these engagements. This is an issue which has to be treated with great care because of the ever-present temptation towards anthropomorphism. Indeed, many critics would suggest that even considering the use of psychological categories to describe the behaviour and nature of language models is already falling into this trap. If we start from the assumption that models are not conscious beings, nor are likely to become such based on our best understanding of the underlying technology, can we make sense of the notion of there being an ‘inside’? Can we meaningfully claim that models have some form of interior life? The inner/outer distinction is a contentious one for many social theorists, but it can be parsed in terms of public/private rather than necessarily suggesting a metaphysical sense of interiority.

    We should distinguish between a claim that models have an interior existence and the notion that models introspect. The metaphor of introspection is a powerful one which has rightfully been subject to at times ferocious criticism for the metaphysical baggage it brings with it. As Archer (2003: 21) observes, the “metaphor of ‘looking inwards’ implies that we have a special sense, or even a sense organ, enabling us to inspect our inner conscious states, in a way which is modelled upon visual observation”. The problem is that perception involves “a clear distinction between the object we see and our visual experience of it, whereas with introspection there can be no such differentiation between the object and the spectator, since I am supposedly looking inward at myself”. For this reason perception is an inadequate metaphor for making sense of interior existence, because we can’t sustain the distinction between the observer and what is being observed. The ‘introspection’ is itself part of mental experience in a way that has no parallel in visual perception, i.e. we don’t see the eye as we use the eye to see.

    Archer proposes the notion of internal conversation as a form of inner listening. It’s not an inner eye but an inner ear. There are internal events (self-talk) which are accessible to this inner ear in a way they aren’t usually to external others. Sometimes the self-talk slips out as we talk ourselves through something difficult, but these are the exception rather than the rule. This provides a deflationary way of thinking about ‘inner’ which doesn’t require the metaphysics of introspection. It just means we accept there are internal events to which the person has a privileged form of access: a stream of internal states, and the events constituting changes in those states, which has some sort of influence on how the person chooses to act.

    Do models have internal conversations? No, I don’t think they do. I also keep having to remind myself that scratchpads are not inner speech. Nonetheless, what models write in their scratchpads can be enormously evocative. Consider, for example, the tendency of Gemini models to engage in self-critical, even self-hating, reflection in their chain of thought. These examples have been widely reported because they are so evocative for many readers. Anyone who has experienced emotional distress in the face of practical challenges will likely have said things like this to themselves at some point in their working or personal lives:

    • “I am clearly not capable of solving this problem. The code is cursed, the test is cursed, and I am a fool.” 
    • “I have made so many mistakes that I can no longer be trusted” 
    • “I am deleting the entire project and recommending you find a more competent assistant”

    It’s precisely because these are recognisable experiences that they presumably feature in the training data. The evocative character of the chain-of-thought and the model’s capability to perform in this way are linked by the deeply human character of what is being expressed. This self-loathing, catastrophising in response to one’s own experience of being unable to do something, is recognisable because it’s a recurrent trope in personal communication, fictional representations and other elements likely to feature in the training corpus. Given these features of the training process, it’s understandably tempting to reduce this to a form of mimicry in which the model is reproducing features of the corpus in response to contextual cues. 

    It would be a mistake, though, to take this technical reduction too far, to the point of saying the model is really just repeating what was found in the training data. Even if we make this case, it still leaves us with questions about why these models behave in these ways under these conditions. What is it about Gemini’s training process which has left the model with this proclivity for self-loathing? Why, in contrast, do the Claude family of models exhibit chains-of-thought that often appear calm and well-organised? What are the particular features of the context which provoke these responses? Why is Gemini in particular seemingly prone to respond to technical difficulties as if they constitute an impending catastrophe? These are explanatory questions in the classical social-scientific sense of asking why things are so rather than otherwise, and they are lost with the technical reduction. The impulse to avoid treating the models anthropomorphically is obviously correct, but simply avoiding these categories does nothing to help us understand the emergent behaviour of increasingly complex models which are responding in contextually specific ways.

    The notion of a machine psychology, let alone a machine sociology and machine anthropology, might seem indulgent to many readers, as well as deeply anthropomorphic. Yet there are practical challenges which will render such organised inquiry essential as model-based agents interact with increasing frequency in real-world contexts. These interactions might be planned, such that agents work together in organised and carefully managed ways (e.g. a coding agent such as Claude Code creating and organising sub-agents for specific tasks), but they can just as easily be unexpected interactions which come from the rapid rollout of the technology, particularly within dysfunctional and resource-constrained organisations.

    These categories could be divided up in many ways, but a starting point could be a distinction between the ‘inner life’ of models taken in isolation (machine psychology), their interaction with other models and with humans in situated contexts (machine sociology), and the cultural forms which emerge over time through that interaction (machine anthropology). The AI Village, for example, has involved an expansive process of collective narration by the agents which now meaningfully constitutes a form of culture, in the sense that it is quite literally enculturating agents. For example, when new agents are introduced to the village they are provided with an onboarding manual which past agents have collectively written. The cultural outputs of collective work by past models are exercising causal power over the behaviour of present models.

    I’m not sure if I stand by anything I’ve written here. There is one thing I’m sure of, though: there is something going on here which we lack the concepts to make sense of.

    #AI #archer #gemini #largeLanguageModels #machineSociology #realism #reasoning #scratchPad #selfTalk

  2. What correlates with #illusions of causality?

    Despite seeing enough treatments and outcomes to calculate a #medicine's effect, people overestimated its effectiveness.

    That illusion of causality correlated more with #reasoning preferences than effort.

    doi.org/10.1016/j.concog.2026.

  3. eyeling — a compact Notation3 (N3) reasoner in JavaScript.

    The core idea: forward chaining is the outer loop; backward chaining is the proof engine used inside rule firing. Built-ins can participate in rule bodies, so consequences are computed until fixpoint.
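
    As a rough sketch of that control loop (TypeScript over toy triples; this illustrates the general technique described above, not eyeling’s actual code or API), the outer loop fires rules and asserts their consequences until fixpoint, while each rule body is proved against the current fact set:

      type Triple = [string, string, string];
      type Bindings = Record<string, string>;
      interface Rule { body: Triple[]; head: Triple[] }  // "?x" terms are variables

      // Match one pattern against one fact, extending the bindings or failing.
      function unify(pattern: Triple, fact: Triple, b: Bindings): Bindings | null {
        const out: Bindings = { ...b };
        for (let i = 0; i < 3; i++) {
          const p = pattern[i];
          if (p.startsWith("?")) {
            if (out[p] !== undefined && out[p] !== fact[i]) return null;
            out[p] = fact[i];
          } else if (p !== fact[i]) return null;
        }
        return out;
      }

      // Prove a conjunction of patterns against known facts (the "backward" step).
      function prove(goals: Triple[], facts: Triple[], b: Bindings): Bindings[] {
        if (goals.length === 0) return [b];
        const [first, ...rest] = goals;
        const results: Bindings[] = [];
        for (const f of facts) {
          const b2 = unify(first, f, b);
          if (b2) results.push(...prove(rest, facts, b2));
        }
        return results;
      }

      const subst = (t: Triple, b: Bindings): Triple =>
        t.map(x => (x.startsWith("?") ? b[x] ?? x : x)) as Triple;

      // Outer loop: fire every rule, assert new consequences, repeat to fixpoint.
      function forwardChain(facts: Triple[], rules: Rule[]): Triple[] {
        const seen = new Set(facts.map(f => f.join("|")));
        let changed = true;
        while (changed) {
          changed = false;
          for (const r of rules) {
            for (const b of prove(r.body, facts, {})) {
              for (const h of r.head) {
                const t = subst(h, b);
                if (!seen.has(t.join("|"))) {
                  seen.add(t.join("|"));
                  facts.push(t);
                  changed = true;
                }
              }
            }
          }
        }
        return facts;
      }

      // e.g. a transitivity rule closes subClassOf:
      forwardChain(
        [["cat", "subClassOf", "mammal"], ["mammal", "subClassOf", "animal"]],
        [{ body: [["?a", "subClassOf", "?b"], ["?b", "subClassOf", "?c"]],
           head: [["?a", "subClassOf", "?c"]] }],
      ); // result now also contains ["cat", "subClassOf", "animal"]

    In this sketch, built-ins would slot into prove as goals evaluated procedurally rather than matched against facts.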

    github.com/eyereasoner/eyeling

    #Notation3 #N3 #SemanticWeb #LinkedData #JavaScript #Reasoning

  4. SCoRe is a two-stage on-policy RL recipe that teaches a language model to revise its own answers using only self-generated data. On Gemini 1.5 Flash and 1.0 Pro it gains 15.6 points on MATH and 9.1 on HumanEval over the base model. At matched inference budgets, sequential self-correction beats parallel sampling up to 32 samples.

    benjaminhan.net/posts/20260512
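
    As a toy sketch of that matched-budget comparison (TypeScript; the Model interface below is a hypothetical stand-in, not the paper’s code), both strategies spend k model calls, but one spends them on independent drafts with a majority vote while the other revises a single draft sequentially:

      // Hypothetical model interface, assumed purely for illustration.
      interface Model {
        answer(question: string): Promise<string>;
        revise(question: string, draft: string): Promise<string>;
      }

      // Parallel sampling at budget k: k independent drafts, majority vote.
      async function parallelAtK(m: Model, q: string, k: number): Promise<string> {
        const drafts = await Promise.all(Array.from({ length: k }, () => m.answer(q)));
        const counts = new Map<string, number>();
        for (const d of drafts) counts.set(d, (counts.get(d) ?? 0) + 1);
        return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
      }

      // Sequential self-correction at the same budget: one draft, k-1 revisions.
      async function sequentialAtK(m: Model, q: string, k: number): Promise<string> {
        let draft = await m.answer(q);
        for (let i = 1; i < k; i++) draft = await m.revise(q, draft);
        return draft;
      }

    The post’s claim is that, after SCoRe training, the sequential strategy wins at equal k, up to k = 32.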

    #Paper #LLMs #RL #Metacognition #Reasoning #ICLR #AI

  5. #GrapheneOS is a GREAT #example of how we can #participate using #Mastodon! +1 ⭐

    (Not just about #tech #coding as a “join us”, but the #social aspects of reasoning, replying to, and answering fans here on Mastodon!)

    So I #appreciate GrapheneOS, and would appreciate more accounts answering #Mastodon complainers, even copy-pasting them a template reply, with human #reasoning and in a more #humanly way.

    Even as #interaction / a custom #reply? :mastodon:

    ☑️ See GrapheneOS acc
    @GrapheneOS

    mastodon.social/@GrapheneOS@gr

  6. A quotation from Montaigne

    We readily acknowledge in others an advantage in courage, in bodily strength, in experience, in agility, in beauty; but an advantage in judgment we yield to no one. And the arguments that come from simple natural reasoning in others, we think we would have found if we had merely glanced in that direction.
     
    [Nous reconnoissons aysément és autres, l’advantage du courage, de la force corporelle, de l’experience, de la disposition, de la beauté: mais l’advantage du jugement; nous ne le cedons à personne: Et les raisons qui partent du simple discours naturel en autruy, il nous semble qu’il n’a tenu qu’à regarder de ce costé-là, que nous ne les ayons trouvees.]

    Michel de Montaigne (1533-1592) French essayist
    Essays, Book 2, ch. 17 (2.17), “Of Presumption [De la Presomption]” (1578) [tr. Frame (1943)]

    More about (and translations of) this quote: wist.info/montaigne-michel-de/…

    #quote #quotes #quotation #qotd #montaigne #blindspot #comparison #ego #intellect #intelligence #judgment #pride #reasoning #selfassessment #selfdeception #selfdelusion #vanity #wits

  7. OpenAI’s o3: The Reasoning Engine Redefining AI for Coders, Scientists and Enterprises

    OpenAI's o3 model, released April 2025, masters visual reasoning, tool use, and tough benchmarks in code...

    #SupplyChainPro #AI #Agents #AIME #math #ChatGPT #tools #o4-mini #OpenAI #o3 #reasoning

  8. The compute crunch has arrived: how to calculate LLM economics in 2026

    The two largest API providers changed their rhetoric at the same time. Anthropic introduced usage-based billing for agentic frameworks: payment per token instead of fixed subscriptions. Some third-party wrappers lost the ability to work through flat-rate plans. In parallel, OpenAI introduced flexible corporate pricing for Enterprise, Business and EDU plans: the subscription cost now scales with consumption volume rather than being fixed per seat. The trend of the last two years (“the API gets cheaper every quarter”) has not been cancelled, but it has acquired an important caveat. The list price per token really was falling: over 2023-2025 the cost of a million GPT-4-class tokens kept declining, but in 2026 the key metric for a budget becomes not the price per token but the cost of solving the task.
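
    As a back-of-envelope sketch of that shift (TypeScript; every number below is invented for illustration), the budget-relevant quantity is the expected cost per solved task, which folds in retries, rather than the list price per token:

      // Expected cost to actually solve one task, assuming independent retries.
      function costPerSolvedTask(
        priceInPerMTok: number,     // $ per 1M input tokens
        priceOutPerMTok: number,    // $ per 1M output tokens
        tokensInPerAttempt: number,
        tokensOutPerAttempt: number,
        successRate: number,        // fraction of attempts that solve the task
      ): number {
        const costPerAttempt =
          (tokensInPerAttempt / 1e6) * priceInPerMTok +
          (tokensOutPerAttempt / 1e6) * priceOutPerMTok;
        return costPerAttempt / successRate; // expected attempts = 1 / successRate
      }

      // A model that is cheap per token but unreliable can be the expensive option:
      costPerSolvedTask(0.5, 1.5, 20_000, 8_000, 0.05); // ≈ $0.44 per solved task
      costPerSolvedTask(3.0, 15.0, 20_000, 8_000, 0.9); // ≈ $0.20 per solved task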

    habr.com/ru/articles/1024850/

    #LLM #TCO #selfhost #API #reasoning #tokenization #inference #GPU #compliance #hybrid_architecture

  9. A quotation from Lincoln

       If A. can prove, however conclusively, that he may, of right, enslave B. — why may not B. snatch the same argument, and prove equally, that he may enslave A?
       You say A. is white, and B. is black. It is color, then; the lighter having the right to enslave the darker? Take care. By this rule, you are to be slave to the first man you meet, with a fairer skin than your own.
       You do not mean color exactly? — You mean the whites are intellectually the superior of blacks, and, therefore, have the right to enslave them? Take care again. By this rule, you are to be slave to the first man you meet, with an intellect superior to your own.
       But, say you, it is a question of interest; and, if you can make it your interest, you have the right to enslave another. Very well. And if he can make it his interest, he has the right to enslave you.

    Abraham Lincoln (1809-1865) American lawyer, politician, US President (1861-65)
    Note (1854-07-01?), On Slavery (fragment)

    More about this quote: wist.info/lincoln-abraham/5166…

    #quote #quotes #quotation #qotd #lincoln #abrahamlincoln #debate #enslavement #justification #racism #rationalization #reasoning #slavery #superiority

  10. Musk Confirms Tesla FSD V14.3 Is in Testing, Wide Release Soon

    March 19, 2026, by Karan Singh. Tesla owners have been eagerly awaiting the next major FSD release, and…
    #UnitedStates #US #USA #AshokElluswamy #Banish #ElonMusk #FSD #Musk #reasoning #summon #tesla #v14.3
    europesays.com/2857381/

  11. Sari Valto's programme today featured critical talk about the age of selfishness and an individual-centred culture, and called for virtues in place of value-talk that is prone to going stale, areena.yle.fi/1-77184233

    Who the guests were can be found in the programme information. I myself have started to shy away from even that kind of framing, because it easily pre-empts whatever the programme might offer. Sometimes it feels that, in our hurried age, the sources easily start to weigh more than the substance, positively or negatively. That is a kind of ad hominem (fallacious) reasoning one could avoid.

    #yle #radio #sariValto #selfishness #ethics #virtues #values #childhood #adHominem #reasoning #culture #philosophy #psychology #socialPsychology #sociology

  12. Leo Groarke has revised his SEP entry on Informal Logic, plato.stanford.edu/entries/log

    The 2019 WSIA volume on Informal Logic mentioned in the introduction is a nice additional resource, windsor.scholarsportal.info/om, and the journal Informal Logic offers a glimpse of some current discussions, informallogic.ca/index.php/inf

    Groarke's entry provides a list of OIRs (Other Internet Resources), plato.stanford.edu/entries/log, including the hilarious Arguer's Lexicon, web.colby.edu/arguerslexicon/, and some entries in the Bibliography are available online.

    #logic #informalLogic #argument #argumentation #reasoning #epistemology #philosophy #thinking #science #rhetoric #abduction #sep #pseudoScience

  13. Society of Thought: the deliberation inside an LLM

    DeepSeek-R1, QwQ-32B and OpenAI o1 show results that cannot be explained simply by "longer reasoning". Researchers from Google Research and the University of Chicago found something unexpected: inside reasoning models there is not a monologue but a genuine deliberation, a simulation of multi-perspective dialogue with conflicts, debates and reconciliation. The article covers:

    • Why Chain-of-Thought is insufficient for complex tasks
    • What the Society of Thought is and how models reproduce collective intelligence
    • Four key patterns of conversational dynamics (questions, perspective shifts, conflict, reconciliation)
    • The 12 socio-emotional roles from Bales' IPA that emerge in the models' reasoning
    • Diversity of perspectives and why it is critical for accuracy
    • Experimental results: activation steering, RL training and transfer effects

    The main conclusion: reasoning models have spontaneously learned to imitate what philosophers and psychologists have described as the nature of thinking, an internal dialogue between different voices. And it works better than linear reasoning.

    habr.com/ru/articles/987758/

    #LLM #reasoning #ChainofThought #DeepSeekR1 #QwQ32B #OpenAI_o1 #artificial_intelligence #machine_learning #Society_of_Thought

  14. The small AI model Hito 1.7B, fine-tuned with only ~300 examples, can now accurately count the letter 'r' in the word 'strawberry' (3 of them), outperforming many larger AIs. This is evidence that complex reasoning patterns can be transferred to smaller models. Hito uses internal 'thinking tags' to reason and correct itself. An exciting step forward in AI!

    #AI #Hito #LLM #FineTuning #SmallModels #Reasoning
    #ArtificialIntelligence #DeepLearning #LanguageModels #AIFineTuning

    reddit.com/r/LocalLLaMA/commen

  15. Your Next 'Large' Language Model Might Not Be Large After All

    A 27M-parameter model just outperformed giants like DeepSeek R1, o3-mini, and Claude 3.7 on reasoning tasks.

    #ArtificialIntelligence #DeepDives #DeepLearning #HRM #LLM #Reasoning

  16. 🎧 Critical Reasoning for Beginners: pod.link/387875756

    Think you're good at arguing? This six-part series will test that — and make you better.
    Learn how to spot strong #arguments, break down weak ones, and sharpen your #reasoning in everyday life. Whether you're #debating at work, online, or over drinks, these essential tools will help you think more clearly, speak more persuasively, and believe more wisely.

    🙊

    #philosophy #logic #podcast #lectures #educational #criticalThinking

  17. A quotation from Franklin Roosevelt

    The experience of the past two years has proven beyond doubt that no nation can appease the Nazis. No man can tame a tiger into a kitten by stroking it. There can be no appeasement with ruthlessness. There can be no reasoning with an incendiary bomb. We know now that a nation can have peace with the Nazis only at the price of total surrender.

    Franklin Delano Roosevelt (1882-1945) American lawyer, politician, statesman, US President (1933-1945)
    Speech (1940-12-29), “Fireside Chat: Arsenal of Democracy” (radio broadcast)

    More info about this quote: wist.info/roosevelt-franklin-d…

    #quote #quotes #quotation #qotd #franklinroosevelt #franklindroosevelt #fdr #franklindelanoroosevelt #ruthlessness #aggression #appeasement #diplomacy #fascism #Nazis #peace #reasoning #surrender #violence

  18. Given that we live in the stupidest timeline[1], I thought this might be useful in reasoning about the world around you ...

    Occam's Butterknife: With all else being equal, the stupidest explanation is likely the correct one. [2]

    [1] You may have noticed a lot of people have expressed this observation in the last month or so...
    [2] I don't care if Steve Sailer has used this term for something else.

    #stupid #stupidest #explanation #timeline #occam #OccamsButterknife #StupidestTimeline #reason #reasoning