home.social

#llms — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #llms, aggregated by home.social.

  1. I’m really tired of all the anthropomorphic language used with #LLMs.

    Current headline on @arstechnica: “LLMs believe false statements even after explicit warnings that they're false”.

    Does software “believe” the data you feed it? It processes it, operates on it, etc.

    How about “LLMs treat explicitly labeled false statements as true”. Seems a lot clearer to me what’s happening than muddying the story with “belief”. Treat this like software. It is.

  2. "Amazon has shut down an internal leaderboard that tracked employees’ use of AI tools after workers tried to boost their scores with unnecessary activity that increased the company’s computing costs.

    Employees at the $2.9tn group were told this week its “Kirorank” service — which scored users of Amazon’s Kiro developer platform based on their AI activity — had been taken offline, according to two people familiar with the matter.

    The decision came after the tool led some workers to assign AI agents — autonomous bots that can take actions on behalf of users — to carry out needless tasks in an apparent attempt to climb the rankings.

    Dave Treadwell, an Amazon senior vice-president, told staff earlier this week that the leaderboard had been built with “good intentions”, according to people familiar with his remarks.

    But he added that the result had been additional costs for Amazon due to employees “tokenmaxxing” or inflating their consumption of AI tokens — units of data processed by models.

    “Please don’t use AI just for the sake of using AI,” he told staff."

    ft.com/content/b1a62a7f-6df5-4

    #AI #GenerativeAI #LLMs #Amazon #BigTech

  7. "Steven Rosenbaum started writing his book The Future of Truth: How AI Reshapes Reality in 2022, around when ChatGPT launched. Initially he didn’t use it at all, “But as the writing moved forward into 2023, 2024, it got better and I got better at using it,” he said. “To be clear, it never wrote a page of the book,” he added. “But it became a research partner. I would ask it for quotes on certain things, and it would deliver them. They would occasionally be spectacular, often serviceable, and then, in very odd ways, just staggeringly wrong.”

    “I kept thinking, I’ll be really careful, and I’ll double-check everything,” he said.

    In May, the New York Times reported that Rosenbaum had included “more than a half-dozen misattributed or fake quotes” in the book seemingly generated by AI. Rosenbaum, a media entrepreneur, had previously acknowledged that he’d used AI tools during the research, writing, and editing process, but the Times investigation was nevertheless mortifying — for both Rosenbaum and his publisher, Simon & Schuster. The book-publishing industry had already been wrestling with the prospect of a flood of AI-authored texts in the fiction market, and now the Rosenbaum scandal was showing the way AI could blow a hole in the nonfiction sector, too.

    Nonfiction publishing is uniquely vulnerable to AI because the industry has long neglected to do anything to ensure the books it publishes are factually accurate. “People outside of the industry don’t understand that, contractually, publishers are not obligated to fact-check,” said Paul Bogaards, the longtime marketing and publicity executive at Knopf who now has his own PR firm. Fact-checking is not a service publishers will pay for, though they sometimes encourage authors to seek it out on their own dime. But fact-checking is expensive: Hiring an outside checker can cost between $7,000 to $10,000 per book, or even more...

    nymag.com/intelligencer/articl

    #AI #GenerativeAI #LLMs #Chatbots #Books #Publishing #NonFiction

  12. 🚨🚨 ALERT: Internet discovers that #LLMs generate #recycled sentences! Our brave blogger thought he'd stumbled upon linguistic gold, only to realize he was just mining #AI #clichés. 🔄🧠 Next up: the shocking revelation that the sky is blue. 🌤️
    shvbsle.in/various-llm-smells/ #Sentences #Blogging #Humor #Linguistics #HackerNews #ngated

  13. How drones are transforming traditional warfare

    Iran and Ukraine avoid certain defeat and deter major powers with swarms of low-cost drones. But would this shift mean a turning of the tables for the “weakest”? The US is already applying reverse engineering, and the advent of AI speeds up response times and creates a dangerous asymmetry.

    outraspalavras.net/geopolitica

  14. AI jobs apocalypse canceled.

    OpenAI CEO Sam Altman backtracks on his prediction about the job market.

    time.com/article/2026/05/26/sa

  15. "Retrieval-Augmented Generation: A Field Report and Guide to the Use of Knowledge-Based Chatbots at HU Berlin"
    Part 1: doi.org/10.1515/iwp-2026-3007
    Part 2: doi.org/10.1515/iwp-2026-3008
    #Bibliothek #Chatbot #RAG #LLMs

  16. On whether LLMs can abstain effectively and whether chain-of-thought can help, two recent papers seem at odds on the surface. COLING 2025 finds prompted CoT raises abstention on instruct models. AbstentionBench (NeurIPS 2025) finds extending the reasoning budget lowers it on a trained reasoner. What gives?

    benjaminhan.net/posts/20260527

    #Metacognition #LLMs #Reasoning #Evaluation #AI

  21. Wow. Until now, the New York Times has been little more than free advertising, disguised as news, for new AI tech companies like OpenAI and Anthropic. If the Times is now running op-eds that accuse the Democratic Party of falling for the hype about LLMs (the hype about them soon becoming sentient and taking over the world), against all of the best expert opinion to the contrary, then the Times is clearly beginning to see a wave of revolt rising against this technology. They are probably trying to do a little damage control for all the propaganda they have published until now.

    #tech #AI #LLMs #TechPolicy #Policy #Law

    RE: https://sfba.social/@gypsyvegan/116649711978778077

  22. "How newsrooms should use AI — or if they should at all — has been a recurrent debate within the media industry over the last several years. Increasingly, these rules are being hammered out at the bargaining table between unions and publishers. Right now, employees at The New York Times are gearing up for a fight.

    Unionized staff with the Tech Guild say Times management has refused to provide the union with information related to how the company has used AI, its plans for AI use in the future, and how it will affect employees’ jobs and workflow. (The union filed an unfair labor practice charge earlier this month.) The Tech Guild, a NewsGuild of New York unit of around 700 software engineers, designers, product and project managers, and data analysts, also filed grievances saying Times management violated their collective bargaining agreement when it started using two internal AI tools that track and evaluate employee performance and activity.

    One of the AI tools, called DX, advertises itself as an engineering productivity tool that lets companies track employees’ output, generative AI use, and efficiency, among other metrics. DX was originally announced internally as a way to improve the developer experience, says Ben Harnett, a software engineer at the Times and chair of the unit’s generative AI committee. The goal, at least according to Times management, was to measure the company as a whole. Over the last few months, though, the DX data has become more personalized, with benchmarks being applied to individuals, Harnett says."

    theverge.com/ai-artificial-int

    #AI #GenerativeAI #LLMs #Media #News #Newsrooms #Journalism #Newspapers #HR #NYT

  27. "In the past few years, I’ve spoken to a number of academics and instructors at the college and high-school level who have said similar things. They talk about a sense of loss and of despair, because the one thing that brought them meaning has been erased, or blotted out, by the arrival of A.I. Most, like Peters, do not blame the students, nor do they believe all students welcome the changes wrought by the new technology. “I’ve seen students respond with this disdain for teachers who just let A.I. use happen,” Peters said. “There’s this indignance, like, ‘Why don’t you want more from us than this?’ So, even if they’re using it, they’re still wanting us to hold them to a higher standard.”

    “This is an exacerbation of a transactional model of education that has lasted for a long time,” Peters told me. Students are told that they’re in school to get a degree, one that comes with a high price tag and, for many, a debt burden. They are told that they will be assessed by the work they turn in. And, because A.I. allows them to turn in what Peters admitted was superficially “pretty good quality material,” they might not see why it’s such a big deal when they can’t explain what they have generated.

    “There are these waves of relief that wash over me when I see misspellings and poor grammatical structure in sentences,” Peters said. “When I can tell that they’re really working through it themselves.”

    The teachers and professors I’ve spoken to have varying perspectives on what A.I. is doing and what it may yet do. But common concerns emerge. What follows are testimonials from eleven faculty members at colleges across the country on how A.I. has changed their work."

    newyorker.com/news/fault-lines

    #AI #GenerativeAI #LLMs #Education #HigherEd #Universities #Academia

  28. "The world’s leading AI labs are hiring philosophers to think through ethical edge cases and grand questions of mind and morality. Are they another instrument of hype?

    “It’s probably the best time to be a philosopher since Aristotle was hired as tutor to Alexander the Great,” says Henry Ajder, a philosophy postgraduate who advises the UK government and a slew of startups on artificial intelligence. He’s only half joking.

    Philosophers have never seemed like the most employable bunch. But AI, the same technology that’s expected to drive many other people out of work, has given new weight to the kinds of questions they’re trained to ask (and sometimes maybe even answer): What is intelligence? What is a mind? “You have philosophers from hundreds of years ago who thought about some of the same problems,” Ajder says. “Now they are becoming material.”

    Two of the foremost AI labs have recruited teams of in-house philosophers. “There are significantly more philosophers now—that’s a sound intuition,” says ethicist Iason Gabriel, who leads Google DeepMind’s team of research scientists specializing in the societal impact of AI. At Anthropic, resident philosopher Amanda Askell has become one of the company’s most recognizable faces. Both labs declined to disclose the number of philosophers they employ, citing company policy. WIRED counts at least 10 at DeepMind and four at Anthropic."

    wired.com/story/to-land-a-job-

    #AI #GenerativeAI #LLMs #Philosophy #BigTech #DeepMind #Anthropic

  29. There is no solution to the AI and assessment problem

    This is the core message of a surprisingly upbeat paper. There is no solution to the AI and assessment problem because it’s a classic example of a wicked problem. This means that, as they put it on pg 2:

    Wicked problems, as opposed to ‘tame’ problems, do not have ‘correct’ or ‘incorrect’ solutions (Rittel and Webber 1973). This does not mean there are no ways forward, nor does it mean that all ways forward are equally valuable. However, it does mean that responses must look very different. For one, they require a shift from seeking definitive answers to engaging in ongoing, adaptive work shaped by competing priorities and evolving conditions.

    There are a number of reasons they claim it is a wicked problem:

    • It cannot be clearly or conclusively defined
    • There are no clear criteria for knowing when ‘the solution’ has been reached
    • There are only better or worse options involving trade-offs
    • There is a lack of clear metrics to adjudicate between these better or worse options
    • They cannot be studied through trial and error because every trial has real-world consequences, which means decision makers are on the line for them
    • The range of putative solutions and potential approaches is pretty much limitless
    • They exist because of deeper structural issues and reflect these issues
    • The framing determines which approaches show up for us as relevant

    This means academics are “put in the position of needing to make continuous professional judgments in conditions of permanent uncertainty” (pg 12). This is not a good position to be in and it’s not going away. Rather than a counsel of despair, recognising the character of wicked problems is necessary for helping us cope with being placed in that position:

    • “First, it lifts the impossible burden on teachers and institutions to immediately get things right once and for all. When problems are unsolvable and ever-changing, missteps and course corrections are not failures. They are part of doing the work well.” (pg 12)
    • “Alternatively, a wicked problem frame suggests that trade-off are necessary and there is no optimal balance nor solution. The teacher who wondered ‘Have I struck the right balance? I don’t know’ (T6) was describing the uncertainty inherent in weighing pedagogical goals against workload, security against authenticity, current needs against future preparation.” (pg 13)
    • “Permission to diverge recognizes that in wicked problems, context determines everything. What transforms learning in a 20-student philosophy seminar becomes logistically impossible with 250 business students. What prepares future lawyers for AI-integrated practice might undermine the clinical skills nurses need.” (pg 13)

    It means we can accept that there is no fix, only iterative and evaluative design work, which is necessary because the environment has shifted in a fundamental sense. What matters is that we are moving in the ‘right’ direction while building up a variegated (and always provisional) sense of what ‘right’ is, one that reflects the range of different practices and imperatives within a multidisciplinary university.

    #AI #assessment #higherEducation #LLMs
  30. At the Google I/O conference last week, where the only topic was LLMs, Google distributed baseball caps adorned with the LLM prompt that would theoretically bring the cap about.

    Not only does the three-line, small-font prompt make the cap absolutely ugly on its own, but it also omits any mention of the prompt decoration itself, which means the prompt would have led to the creation of a different cap.

    Of the problems LLMs are purported to solve, short-sightedness still isn’t one.

    #AI #LLMs #ArtificialIntelligence #LargeLanguageModels

  31. Why Google’s new AI-saturated search page will be a disaster

    Google didn’t invent full-text search of the Internet – that honour belongs to early pioneers such as WebCrawler, Lycos and AltaVista. But for the last 25 years or so, Google has been synonymous with online searching, providing the quickest and most effective way to find things online (although its results may be getting worse). More recently, it has been adding to its search engine more […]

    #agentic #agents #ai #altavista #blackBox #chatbot #creators #dependency #google #interface #links #llms #lycos #magazines #newspapers #publishing #search #training #webcrawler #worldWideWeb walledculture.org/why-googles-
  36. Can language models monitor and steer their own internal activations? A neuroscience-inspired neurofeedback paradigm finds yes, but only within a low-dimensional metacognitive space: semantically interpretable directions are accessible, raw-variance directions aren't. The prerequisite for spoofing activation-based oversight already partially exists.

    benjaminhan.net/posts/20260526

    #Paper #Metacognition #LLMs #AISafety #Neuroscience #NeurIPS #AI

  37. Can frontier coding agents rebuild a program from scratch given only its executable and docs? No: a new 200-task benchmark finds that across nine models none fully resolves any task. The best passes 95% of tests on just 3% of them. Same models score well on bug-fix benchmarks but zero here, so headline progress numbers don't extrapolate.

    benjaminhan.net/posts/20260526

    #Paper #LLMs #AgenticSystems #SoftwareEngineering #AI