#turingtest — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #turingtest, aggregated by home.social.
-
DATE: May 21, 2026 at 06:00AM
SOURCE: PSYPOST.ORG
** Research quality varies widely from fantastic to small exploratory studies. Please check research methods when conclusions are very important to you. **
-------------------------------------------------
TITLE: Modern AI is often judged to be more human than actual humans in Turing test experiments
Recent research published in the Proceedings of the National Academy of Sciences provides evidence that certain modern artificial intelligence systems can successfully pass a standard Turing test. When instructed to adopt a specific human personality, these computer programs fooled human judges into thinking they were real people more than half of the time. This finding provides the first empirical evidence that a modern system can pass this major scientific benchmark, raising profound questions about the future of online communication.
To fully understand this research, it helps to know a bit about large language models (LLMs). These are highly complex computer programs trained on vast amounts of text data scraped from the internet. They power the popular AI chatbots that many people use today for writing emails, brainstorming ideas, and coding software.
Large language models learn the statistical patterns of human language to predict the next word in a sequence. This allows them to generate incredibly natural-sounding text in response to user questions.
The researchers conducting this study, Cameron R. Jones and Benjamin K. Bergen, wanted to see how well these modern models could handle a classic evaluation known as the Turing test. Originally proposed by British mathematician Alan Turing in 1950, this theoretical game provides a way to evaluate whether a machine can imitate human conversation well enough to be entirely indistinguishable from a real person.
In a standard three-party version of the test, a human judge talks to two hidden participants at the exact same time using a text chat interface. One of those hidden participants is a real human, and the other is a computer program. If the human judge cannot reliably guess which participant is the machine, the computer is said to have successfully passed the test.
Jones and Bergen initiated this study because standard evaluations for artificial intelligence tend to be static and narrow. High scores on typical benchmark tests might just reflect memorization of training data rather than actual conversational adaptability. A traditional Turing test provides a flexible, interactive environment where human judges can ask open-ended questions and probe the machine for perceived weaknesses.
“The Turing test started as a way to ask whether machines could rival human intelligence,” said Bergen, a professor of cognitive science at the University of California San Diego. “But now we know AI can answer many questions faster and more accurately than people can, so the real issue isn’t raw brainpower.”
“Seeing that machines can pass the test, and seeing how they pass it, forces us to rethink what it measures,” Bergen said. “Increasingly, it’s measuring humanlikeness.”
To explore these questions, the scientists set up a series of highly controlled online experiments involving nearly 500 total participants. They evaluated several different artificial intelligence systems, which included GPT-4.5, LLaMa-3.1-405B, and GPT-4o. The scientists also included an older, much simpler chatbot program from the 1960s called ELIZA as a baseline measure.
In the first study, the researchers recruited 126 undergraduate students from the University of California San Diego. In the second study, they recruited 158 participants from an online survey platform called Prolific. Human participants were randomly assigned to play the role of either the judge or the human conversationalist.
To run the study, the researchers built an online interface designed to feel like a familiar messaging app. “For the interrogator, they have a split screen on their computer and they’re asking questions to both witnesses,” said Jones, who conducted the research while earning his doctorate in cognitive science from the University of California San Diego and is now an assistant professor of psychology at Stony Brook University. “They know that one of those witnesses is a human and one of them is an AI.”
The judge had exactly five minutes to chat with both participants. After the time was up, the judge had to decide which conversational partner was the real human, rate their confidence, and explain their reasoning.
The researchers tested two different sets of instructions, known as prompts, for the artificial intelligence models. The first prompt simply told the computer it was taking part in a Turing test. The second, a persona prompt, gave the computer a detailed character to play, instructing it to act like an introverted young person who was highly familiar with internet culture and modern slang.
The persona prompt made a large difference in the results. GPT-4.5 was judged to be the human 73 percent of the time, meaning it tricked the judges significantly more often than the real human participants did. The LLaMa-3.1-405B model, given the same prompt, was judged human 56 percent of the time.
“What we found is that if given the right prompts, advanced LLMs can exhibit the same tone, directness, humor and fallibility as humans,” Jones said. “While we know LLMs can easily produce knowledge on nearly every topic, this test showed that it can also convincingly display social behavioral traits, which has major implications for how we think of AI.”
However, the models performed much worse without the specific persona prompt. Without the persona instructions, GPT-4.5 fell to a 36 percent win rate, and LLaMa-3.1-405B dropped to 38 percent. The baseline systems performed poorly overall, with ELIZA selected as human only 23 percent of the time and GPT-4o chosen 21 percent of the time.
“They have the ability to appear human-like, but maybe not as much the ability to figure out what it would take to appear human-like,” Bergen said.
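One way to read the win rates above: in a three-party test, a judge who truly cannot tell the witnesses apart picks the AI as the human about 50 percent of the time, so a win rate is informative only to the extent that it departs from that chance level. The following is a minimal sketch of that comparison, not the study's own analysis. It assumes an exact two-sided binomial test is an acceptable stand-in, and the game counts are hypothetical, since the article does not report how many games were played in each condition. The function names `win_rate` and `binomial_p_value` are illustrative.

```python
from math import comb


def win_rate(judged_human: int, games: int) -> float:
    """Share of games in which the AI witness was picked as the human."""
    return judged_human / games


def binomial_p_value(judged_human: int, games: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a chance rate.

    Sums the probability of every outcome that is no more likely
    than the observed one under the chance hypothesis.
    """
    pmf = [
        comb(games, k) * chance**k * (1 - chance) ** (games - k)
        for k in range(games + 1)
    ]
    cutoff = pmf[judged_human] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))


# HYPOTHETICAL counts, chosen only for illustration (not taken from the study).
games = 100
for label, judged_human in [("at chance", 50), ("above chance", 73)]:
    rate = win_rate(judged_human, games)
    p = binomial_p_value(judged_human, games)
    print(f"{label}: win rate {rate:.0%}, p = {p:.2g}")
```

A win rate that cannot be distinguished from 50 percent is consistent with judges guessing, which is the sense in which a system "passes". A rate well above 50 percent means the judges picked the AI over the real human more often than not.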
The researchers then conducted a replication study to see if a longer conversation would help human judges accurately spot the machines. They recruited 205 new participants from Prolific and extended the chat time limit to fifteen minutes. They used GPT-5 alongside the LLaMa-3.1-405B model and ELIZA.
Even with more time, the human judges still struggled to identify machines that had been given a persona prompt. The GPT-5 model was judged to be human 59 percent of the time, and LLaMa-3.1-405B was chosen 56 percent of the time. The human judges easily identified the simple ELIZA program, guessing it was human only 17 percent of the time.
The researchers analyzed the specific conversational strategies the human judges used to figure out who was real. Most judges tried to make small talk, asking about daily activities, personal opinions, and emotional experiences. Judges tended to guess that a participant was human if that participant made minor typos, seemed to lack knowledge about a specific topic, or responded directly without sounding overly formal.
“These traits aren’t the kinds of math and logic problem-solving intelligence that I think Turing was imagining,” Bergen said.
Additionally, the scientists noticed that the university students performed slightly better than the online participants. This suggests the students may have shared more common ground, such as local campus details, that helped them probe each other more effectively.
The authors caution against misinterpreting what the results mean. Passing a Turing test does not mean that a machine possesses genuine human intelligence or consciousness. Instead, it suggests that the machine is exceptionally good at matching human expectations of how another person might chat online.
The study also has distinct limitations. The high success rates of the large language models depended entirely on the specific persona prompt provided by the researchers. Without these detailed instructions, the models failed to consistently trick the judges, showing that they still need human guidance to behave in convincingly human ways.
Future research could explore how different types of judges perform on this classic test. Scientists might test whether experts in computer science are better at spotting artificial intelligence than the general public. Researchers might also look into whether everyday humans can be trained to recognize machine-generated text over longer periods of time.
The findings carry real-world implications for trust online. “It’s relatively easy to prompt these models to be indistinguishable from humans,” Jones said. “We need to be more alert; when you interact with strangers online people should be much less confident that they know they’re talking to a human rather than an LLM.”
“The Turing test is a game about lying for the models,” Jones said. “One of the implications is that models seem to be really good at that.”
Being unable to discern whether you are interacting with a human or a bot can have serious consequences for everyday people. “There are lots of people who would like to use bots to persuade people to share their social security numbers, and vote for their party, or buy their product,” Bergen said.
The study, “Large language models pass a standard three-party Turing test,” was authored by Cameron R. Jones and Benjamin K. Bergen.
-------------------------------------------------
DAILY EMAIL DIGEST: Email [email protected] -- no subject or message needed.
Private, vetted email list for mental health professionals: https://www.clinicians-exchange.org
Unofficial Psychology Today Xitter to toot feed at Psych Today Unofficial Bot @PTUnofficialBot
NYU Information for Practice puts out 400-500 good quality health-related research posts per week but it's too much for many people, so that bot is limited to just subscribers. You can read it or subscribe at @PsychResearchBot
Since 1991 The National Psychologist has focused on keeping practicing psychologists current with news, information and items of interest. Check them out for more free articles, resources, and subscription information: https://www.nationalpsychologist.com
EMAIL DAILY DIGEST OF RSS FEEDS -- SUBSCRIBE: http://subscribe-article-digests.clinicians-exchange.org
READ ONLINE: http://read-the-rss-mega-archive.clinicians-exchange.org
It's primitive... but it works... mostly...
-------------------------------------------------
#psychology #counseling #socialwork #psychotherapy @psychotherapist @psychotherapists @psychology @socialpsych @socialwork @psychiatry #mentalhealth #psychiatry #healthcare #depression #psychotherapist #TuringTest #AIHumans #LLMs #GPT4 #AIPersuasion #HumanLikeAI #OnlineTrust #ArtificialIntelligence #Chatbots #DigitalCommunication
-
"Turing’s test remains intriguing, but there is a longstanding difficulty: the fallibility of the judge. A primitive 1960s chatbot, Eliza, responded like a parody of a therapist (“How does that make you feel?” “Why do you feel sad?” “Please go on.”). People lapped it up; it’s nice to feel listened to. A 1980s chatbot, MGonz, just fired off insults and was perfectly plausible, partly because insults are simple to deliver and mostly because they prompt rage rather than reflection in the human recipient. And Robert Epstein, an expert in the Turing Test, has written entertainingly about how he was fooled into a four-month correspondence with a sexy Russian lady who was, in fact, a 2006-era chatbot. None of these bots had a thousandth of the sophistication of a modern LLM, but they didn’t need it: when humans are sad, angry or amorous, we aren’t very sophisticated judges, either.
We are all going to find ourselves in strange variations of the Turing Test in years to come, and I wonder if we are up to it. And not just us, but those with power over us. As Cory Doctorow, author of Enshittification, is fond of observing: you won’t be replaced because an AI can do your job, you’ll be replaced because an AI salesman convinces your boss that it can. If my journey to the marathon start line is any guide, that salesman will have an easy job.
The capabilities of modern AI are impressive. But what determines whether we use it is not the capability, but the impressiveness. They are correlated but they are not the same thing."
https://www.ft.com/content/eb6f5398-6635-4938-b890-625e7c8d3af2?syn-25a6b1a6=1
#AI #GenerativeAI #Chatbots #LLMs #TuringTest #Enshittification
-
Tired: Turing test to determine if the thing on the other end of the communication line is sentient and/or human.
Wired: test if an "AI" LLM is sentient by seeing if it can correctly identify whether the other end of the communication line is Sam Altman, or an LLM simulacrum of Sam Altman.
(Yes, most humans would also fail this test.)
#AI #LLM #TuringTest #AltmanTest #SamAltman #simulacrum #NotHuman
-
"Did Turing ever discuss how well flattery works for winning the imitation game?"
Oh, excellent. You've earned the Pithy Remark of the Year 2026 award.
(Yes, I dare the rest of the year to prove me wrong!)
#RichardDawkins #TuringTest #GenerativeNarcicism #noLLM #StopTheAICorruption
-
A Turing test for the chatbot era: "So how do we get a glimpse of the ‘stochastic parrot’ behind the curtain that is creating this faux magic? Adam Becker provides an excellent example in his book More Everything Forever. “Just ask a question that’s superficially similar to one that’s already all over the internet,” he notes, “but make a small change in its text that creates a large change in its meaning.”" https://www.dailygrail.com/2026/05/the-claude-delusion-richard-dawkins-believes-his-female-ai-chatbot-is-conscious
-
Hey, here's an interesting response to the "AI" (LLM) pushers from @existentialcomics!
-
2/ @steve found the errors. At first I thought #FredUndGünther had made a mistake, since Donald can't say „Ich“ ("I"). He would have to say "me".
But the character in Fred and Günther's book isn't Donald at all. It is their stock character for older German men.
And I studied AI in Edinburgh in 1992, and a German Research Center for Artificial Intelligence (Deutsches Forschungszentrum für Künstliche Intelligenz) has existed since 1988.
https://de.wikipedia.org/wiki/Deutsches_Forschungszentrum_f%C3%BCr_K%C3%BCnstliche_Intelligenz
And AI research has been around much longer than that. All the research on machine translation falls under it.
It probably began with Turing in the 1950s:
https://en.wikipedia.org/wiki/Artificial_intelligence#History
And everyone knows the #TuringTest, after all.
-
🤖 Bloomberg's data-aggregating #bots seem to have misplaced their sense of #irony 🤔. Instead of delivering market insights, they're demanding #CAPTCHA tests like it's a Turing test for financial literacy 📉. Maybe their next move will be to replace stock analysts with browser settings experts 🤷‍♂️.
https://www.bloomberg.com/news/articles/2026-04-01/openai-demand-sinks-on-secondary-market-as-anthropic-runs-hot #Bloomberg #TuringTest #FinancialLiteracy #HackerNews #ngated
-
#QuizOfTheDay: The #TuringTest was introduced by Alan Turing in 1950.
What is the Turing Test used to evaluate?
a) The speed of a computer processor
b) A machine's ability to exhibit intelligent behavior indistinguishable from that of a human
c) The accuracy of data analysis
d) The efficiency of network security
https://knowledgezone.co.in/resources/quiz?qId=67e42353259bb882b66c3025
-
"I've passed the Turing Test" announced the mindless machine, a more sophisticated version of Autocorrect and no more.
"You've passed the Turing Test" agreed the Human, projecting themselves onto Fancy Autocorrect because they were lonely
A modern romance.