home.social

#tokenwars — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #tokenwars, aggregated by home.social.

  1. In big news overnight, #Anthropic have made a major change to their user data retention and training policy - giving customers until September 28th to opt out, or have their chats, code sessions and other artefacts used for training for up to five years.

    This is a major departure from their previous privacy-first stance.

    But what's really behind this change? As Connie Loizos points out in this @Techcrunch article, it's all about the #data.

    As I've spoken about recently, we've passed #PeakToken - the point in history where we have the maximum amount of authentic, human-generated data available. Now, the internet is polluted with synthetically-generated #AIslop. If you're an #AI company scraping the web for new data to train on, that's bad news, because you also scoop up the AI slop. If models are trained on AI slop, they're likely to encounter #ModelCollapse - like a bad photocopy.

    Anthropic's play here is all about the #TokenCrisis - the voracious appetite for new, authentic, human-generated data to train on - part of a broader phenomenon I've termed the #TokenWars.

    As new data becomes scarcer and more valuable, it will be more sought after and contested. We're still in the early days of the #TokenWars, and we should expect to see more moves like this to secure more data for AI training.

    techcrunch.com/2025/08/28/anth

  2. In just a few days I will travel to Tarntanya/Adelaide - fulfilling a desire to take the Overland across north-western Victoria and into South Australia - to present my talk on the #TokenWars at @everythingopen #EO2025 #EverythingOpen.

    The future of many open source, volunteer-run conferences is precarious.

    Rising costs of hosting, dwindling sponsorship, and reluctance to fund employees to attend, as well as the increasing burn-out of the dedicated folks who pitch in thousands of hours a year to make them happen - on top of the erosion caused by the pandemic - mean that this may be the last year in many that I get to catch up with the community I've come to call "my people" over the last 15 years.

    So, let's make it a blast.

    Three stellar #keynotes lead the proceedings - maker, technologist and Skill Seeker @sjpiper145; critical technologist and FOI expert @daedalus; and passionate advocate for the power of libraries @Trishh.

    On top of that, I'm also anticipating great talks from Andy Gelme, @Unixbigot, @saera, @nnye, @dtbell91, @emmadavidson, @kattekrab, @[email protected], Aleisha Amohia and Sara King, just to name a few - people I have admired and respected for a long time.

    See you there, perhaps for the last time in a long while?

  7. You might be familiar with what I'm terming the "Token Wars" - in which #LLM and #GenAI companies seek to ingest text, image, audio and video content to create their #ML models. Tokens are the basic unit of data input into these models - meaning that #scraping of web content is widespread.

    In retaliation, many sites - such as Reddit, Inc. and Stack Overflow - are entering into content sharing deals with companies like OpenAI, or making their sites subscription only.

    Another solution that has emerged recently is content blocking based on user agent. In web programming, the client requesting a web page identifies itself - usually as a browser or a bot - through its user agent string.
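
A minimal sketch of what user-agent-based blocking can look like on the server side, assuming the check is a simple substring match against the `User-Agent` request header. The function names and the short blocklist are hypothetical and chosen only for illustration; this is not how any particular product implements it.

```python
from typing import Mapping, Optional, Sequence

# Illustrative blocklist: substrings that some AI-related crawlers include in
# their User-Agent strings. The selection is an assumption, not a complete list.
BLOCKED_AGENT_TOKENS: Sequence[str] = ("GPTBot", "CCBot", "ClaudeBot")


def is_blocked(user_agent: Optional[str],
               tokens: Sequence[str] = BLOCKED_AGENT_TOKENS) -> bool:
    """Return True if the User-Agent string contains a blocked token."""
    if not user_agent:
        # No header at all: this sketch lets the request through.
        return False
    lowered = user_agent.lower()
    return any(token.lower() in lowered for token in tokens)


def status_for(headers: Mapping[str, str]) -> int:
    """Pick an HTTP status: 403 for blocked agents, 200 otherwise."""
    # Assumes the header key is spelled exactly "User-Agent" in the mapping.
    return 403 if is_blocked(headers.get("User-Agent")) else 200


if __name__ == "__main__":
    print(status_for({"User-Agent": "Mozilla/5.0 (compatible; GPTBot/1.0)"}))
    print(status_for({"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"}))
```

Since the client chooses its own user agent string, a check like this only catches bots that identify themselves honestly.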

    User agents can be blocked by a website's robots.txt file - but only if the user agent respects the robots.txt protocol. Many web scrapers do not. Taking this a step further, network providers like Cloudflare are now offering solutions which block known token scraper bots at a network level.
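
To make the robots.txt limitation concrete, here is a small sketch of the check a well-behaved crawler would run before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper and the example.com URL are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: refuse one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"
    print(allowed("GPTBot", page))
    print(allowed("SomeBrowser", page))
```

The check runs entirely on the crawler's side, so a scraper that never calls it is unaffected by the file - which is why blocking at the server or network level has become attractive.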

    I've been playing with one of these solutions called #DarkVisitors for a couple of weeks after learning about it on The Sizzle and was **amazed** at how much of the traffic to my websites was bots, crawlers and content scrapers.

    darkvisitors.com

    (No backhanders here, it's just a very insightful tool)

    #TokenWars #tokenization #scraping #bots #scrapy #WebScraping