#model-collapse — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #model-collapse, aggregated by home.social.
-
Yay! My essay on the impacts of Large Language Model (#LLM) #AI in #archaeology was just published:
https://doi.org/10.11141/ia.71.15
It looks at the effects of #bots and mass scraping on the infrastructure supporting #opendata and #openaccess. It also looks at the incentives that encourage the mass-production of #bullshit, which may lead to #modelcollapse but more likely to less dramatic and more dreary outcomes.
-
@ApostateEnglishman This spelling appears in some dictionaries because it was used in 1646. As far as I can tell it was used only _once_. This misspelling of "loyalty" was probably a typo or mistranslation by the original author. Or, it may be an error introduced in more recent times during scanning/OCR. I haven't seen a photo of the original page so I can't confirm, but I have seen this sort of glitch happen.
Nevertheless, it's bizarre to include such a rare and archaic word in spell-check dictionaries!
How did this happen? I think it may be a consequence of LLMs scraping content from online sources, using what they find without the intelligence to discern between quality and slop, and of negligent humans failing to review machine-generated content before declaring "LGTM, ship it!" Next, that LLM's output gets scraped by other LLMs, which indiscriminately incorporate the errors into their own training corpora in an ever-worsening "Habsburg AI" feedback loop.
Thus, it seems one person's typo nearly 400 years ago has resurfaced and is contributing to AI Model Collapse.
#AI #LLM #LLMs #AISlop #HabsburgAI #AIModelCollapse #ModelCollapse #AutoCarrot
-
By relying so heavily on #AI, do #journalists risk impoverishing the language?
https://theconversation.com/a-force-dutiliser-lia-les-journalistes-risquent-ils-dappauvrir-la-langue-283938
When systems start being trained on texts produced by other AIs, #modelcollapse (#effondrement du modèle) sets in: a process of degeneration (#dégénérescence) in which data generated by one model ends up contaminating the training of later generations.
The more artificial texts there are, the less models are exposed to the real diversity of human language use.
-
🔴 LIVE NOW ON VORTEX
📻 Vortex Night ⛓️ (Industrial metal)
──────────────
🎵 MODEL COLLAPSE - SILENT PATH
▶️ Listen: VorteX [Radio]
https://lesonduvortex.net
💬 Join us on Discord:
https://discord.gg/d82hJZBeDE
-
🇺🇦 #NowPlaying on #KEXP's #MechanicalBreakdown MODEL COLLAPSE: 🎵 SILENT PATH #MODELCOLLAPSE ▶️ 🪄 Automagic 🔊 show 📻 playlist on Spotify ▶️ Song on #Spotify:
SILENT PATH
-
AI makes mistakes – I still notice them because I have prior knowledge.
But what about young people who use AI as their primary source of information?
And: what happens when this generation trains the next AI – with the knowledge they got from AI?
Does ignorance compound itself?
-
🟩 𝗘𝗫𝗛𝗜𝗕𝗜𝗧𝗜𝗢𝗡: 𝐿𝑎𝑡𝑒𝑛𝑡 𝑆𝑝𝑎𝑐𝑒
1–30 April | Aksioma Project Space
❕ 𝗢𝗽𝗲𝗻𝗶𝗻𝗴: 1 April at 8 PM
In her installation, artist #FelicityHammond offers a speculative glimpse into a not-too-distant future where this new approach to space-based computation has become the dominant position in the AI industry. However, the system continues to battle with the effects of #modelcollapse...
> https://aksioma.org/becomingimage/exhibitions/latent-space/
-
“The half-life of cultural relevance has collapsed below the minimum viable generation cycle for coherent slop.”
WTFH?!…
https://open.substack.com/pub/ediblspaceships/p/is-the-mediaplex-happening-faster
-
RE: https://wandering.shop/@cstross/115961174452820573
Model collapse: “The owners of the right-wing press read their own media and it rotted their brains.” Ha ha, yes! The same thing happened to the (so-called) Liberal Party in Australia & they lost the last two elections, badly. ⤵️
#AUSPol #ModelCollapse #LiberalParty #AustralianElections #reconnectingConsequencesToCauses
-
The snake eats its own tail. "Professional" GPT-5.2 caught citing the controversial Grokipedia
According to OpenAI's assurances, it was meant to be the pinnacle of technology, a tool built for lawyers, bankers and scientists. Instead, the flagship GPT-5.2 model has been caught cheating on the exam. And copying from whom? From its less bright cousin at xAI.
Recycling digital content
An investigation by The Guardian revealed a mechanism that engineers in San Francisco would rather not publicize. GPT-5.2, intended by its makers as an "enterprise"-class model, cites Grokipedia in its answers as a credible source.
Some explanation is needed here: Grokipedia (part of Elon Musk's xAI project) is not a traditional encyclopedia edited by humans. It is a dynamic aggregator that generates summaries in real time, often pulling content directly from X (formerly Twitter). The result? Alongside facts, it takes in conspiracy theories and content from extremist forums, which the algorithm treats on a par with news.
Iran, the Holocaust and hallucinations
The problem is not about trivia. Journalists showed that GPT-5.2 drew on Grok-generated content on heavyweight topics:
- The Iranian government's links to the telecommunications company MTN-Irancell.
- The British historian Richard Evans, an expert witness in the trial of Holocaust denier David Irving.
In both cases the "serious" ChatGPT, searching the web for answers, treated the synthetic product of Elon Musk's algorithm as a reliable source of information. It is as if a university professor cited a random, unverified social media post in a scholarly paper.
OpenAI: "We filter, but…"
OpenAI's response is the standard one: the company explains that the model searches a broad range of publicly available sites and applies safety filters to screen out harmful content.
The Grokipedia slip-up shows, however, that these filters are leaky. If the system cannot tell reliable journalism from an automated aggregate of opinions from X, the promise of "professionalism" is called into question.
The era of "Artificial Knowledge"
This incident is evidence that the internet in 2026 is becoming a closed loop. AI models are finding it increasingly hard to reach "clean" human knowledge, so they start processing the output of other machines (the phenomenon known as Model Collapse).
For companies that planned to build their business on uncritical trust in GPT-5.2, this is a warning sign. Human verification of sources is still essential, especially when the source for one artificial intelligence is another artificial intelligence.
#Grokipedia #halucynacjeAI #ModelCollapse #news #OpenAIGPT52 #TheGuardian #weryfikacjaźródeł #xAIElonMusk
-
two or three years of mass use and we've already made another intelligence stupid
#AIslop #ModelCollapse #brainrot #infoxication
https://www.techbuzz.ai/articles/ai-models-get-brain-rot-from-social-media-training-data
-
"The co-degeneration thesis is not a prediction about distant futures. It describes dynamics already in motion, already documented in peer-reviewed research, already observable in the declining quality of online discourse and the increasing unreliability of AI systems that should, by simple scaling laws, only be improving.
The feedback loops are active. Engagement-optimized content degrades training data. Degraded models produce degraded outputs. Humans consuming and delegating to these systems experience cognitive effects that reduce their capacity to recognize and correct the degradation. The cycle continues.
But this is not a counsel of despair. The research also suggests intervention points. Model collapse can be prevented through data accumulation strategies that preserve genuine human content. Cognitive debt can be mitigated through usage protocols that maintain human engagement. Platform incentives can be restructured through regulation, competition, or user demand.
The question is whether institutional actors—corporations, governments, investors, educators—recognize the dynamics in time to intervene effectively, or whether they continue optimizing for metrics that accelerate the degradation."
https://substack.com/inbox/post/180851372?r=6p7b5o&utm_medium=ios&triedRedirect=true
-
. @glitter mentioned a few days ago that AI-generated images are becoming more and more yellow as the LLMs are trained on the output of other LLM runs. #ModelCollapse #AI #LLMs
-
#HoloWrites 1200-odd words today! I'm finding it super difficult to fake writing LLM output in a way that's engaging, funny, and obvious to the reader, but I think I'm getting there with the last chapter of #ModelCollapse. Shouldn't keep my audience of three waiting too long :D
-
I've read that LLMs and other generative models will eventually collapse if they are trained on their own output. I did a search and found this paper for example https://www.nature.com/articles/s41586-024-07566-y . Shouldn't this problem affect humans as well? Humans "generate" books which other humans use to "train" themselves. Then these trained humans generate new books and the cycle continues. What prevents the quality and diversity of the human output from collapsing in the same way that LLM output collapses?
My guess is that there are sometimes cases where the quality of human thought does decrease over time. Groupthink comes to mind. In science, experimental work helps keep theory grounded. Also, humans live in the real world, so they suffer if their internal world model differs from the real world.
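
A toy sketch of the mechanism asked about above, not the method of the linked paper: here a "model" is just a table of token frequencies, and each generation is re-estimated from a finite sample drawn from the previous one. The function name `resample_generations`, the token labels, and all parameter values are made up for illustration. Once a rare token happens to draw zero samples, its estimated probability is zero and it can never come back, so the number of surviving tokens can only shrink.

```python
import random
from collections import Counter


def resample_generations(probs, n_samples, generations, seed=0):
    """Repeatedly re-estimate token frequencies from the previous generation's samples.

    Returns the number of tokens with non-zero probability after each
    generation, starting with the original distribution.
    """
    rng = random.Random(seed)  # fixed seed so a run is repeatable
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    support_sizes = [sum(1 for w in weights if w > 0)]

    for _ in range(generations):
        # "Generate" a finite corpus from the current model.
        draws = rng.choices(tokens, weights=weights, k=n_samples)
        counts = Counter(draws)
        # "Train" the next model on that corpus alone (no fresh human data).
        weights = [counts[t] / n_samples for t in tokens]
        support_sizes.append(sum(1 for w in weights if w > 0))

    return support_sizes


# Hypothetical starting distribution with a long-ish tail.
start = {"common": 0.90, "uncommon": 0.09, "rare": 0.01}
sizes = resample_generations(start, n_samples=50, generations=30)
print(sizes)  # surviving token count per generation; never increases
```

In this toy setting, the "grounding" the post speculates about would correspond to mixing fresh draws from the original distribution back into each generation's sample, which gives lost tokens a way to reappear.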
-
In big news overnight, #Anthropic have made a major change to their user data retention and training policy - giving customers until September 28th to opt out, or have their chats, code sessions and other artefacts used for training for up to five years.
This is a major departure from their previous privacy-first stance.
But what's really behind this change? As Connie Loizos points out in this @Techcrunch article, it's all about the #data.
As I've spoken about recently, we've passed #PeakToken - the point in history where we have the maximum amount of authentic, human-generated data available. Now, the internet is polluted with synthetically-generated #AIslop. If you're an #AI company scraping the web for new data to train on, that's bad news, because you also scoop up the AI slop. If models are trained on AI slop, they're likely to encounter #ModelCollapse - like a bad photocopy.
Anthropic's play here is all about the #TokenCrisis - the voracious appetite for new, authentic, human-generated data to train on - part of a broader phenomenon I've termed the #TokenWars.
As new data becomes scarcer and more valuable, it will be more sought after and contested. We're still in the early days of the #TokenWars, and we should expect to see more moves like this to secure more data for AI training.
-
#ModelCollapse is not inevitable, but together we can make it happen :why2025: :aMarxParty: :tetrapod:
-
New #review today: "Or you could just listen to #AncientPsychicTripleHyperOctopus and find yourself in a sound-world of weird electronics, percussion, and trumpet that floats along without rhyme or reason, but manifests as a fascinating journey. The perpetrators of this experiment are #AlexBonney (trumpet, bass recorder, Strohviol), #WillGlaser (drums, percussion), and #IsambardKhroustaliov (aka #SamBritton, electronics)." #ExposeOnline #ExperimentalMusic #ModelCollapse http://expose.org/index.php/articles/display/ancient-psychic-triple-hyper-octopus-put-emojis-on-my-grave-2.html
-
Whole new meaning to the impact of #ModelCollapse
https://tomkahe.com/@GiftArticles/114857402911829126
-
"We are happy to tell you that we accept your proposal: The Well Is Poisoned — Now What Shall We Drink?" :blobcatchristmasglowsticks:
Looking forward to talking about genAI pollution of the infosphere and what we can do about it at #WHY2025 :why2025:
Read my proposal here: https://martinh.net/hacks/poisoned-well/
#SearchClub #ModelCollapse #AISlop #SmallWeb #LowBackgroundInformation
-
Are we facing a "digital pollution" that threatens the future of #ArtificialIntelligence?
Since the launch of #ChatGPT in 2022, AI experts have likened what happened to the explosion of the first atomic bomb! Why?
👇👇👇
#AI #ModelCollapse #DataQuality #ChatGPT #ArtificialIntelligence #Ethics #TechPolicy
-
I can definitely see this for #EdTech - things have to work.
#ModelCollapse is for example a risk that keeps coming up when talking about #AI in #academia
-
I love how even the car people are noticing and aware of the AI/LLM/GenAI slop machine.
https://www.youtube.com/watch?v=gK_vt3xa6xI
-
e516 with Michael and Michael - #AI #prompts, #browsers, #ModelCollapse & #automation along with a teeny tiny #PicoMacNano and so much more. https://gamesatwork.biz/2025/06/02/e516-model-behavior/