home.social

#pleias — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #pleias, aggregated by home.social.

  1. Common Corpus, an open training set for AI, goes global – and so should support for it

    As many of the AI stories on Walled Culture attest, one of the most contentious areas in the latest stage of AI development concerns the sourcing of training data. To create high-quality large language models (LLMs) massive quantities of training data are required. In the current genAI stampede, many companies are simply scraping everything they can off the Internet. Quite how that will work […]

    #aiAlliance #commonCorpus #curation #euAiAct #financeCommons #france #gdpr #github #legalCommons #llms #multilingual #openCulture #openGovernment #openScience #openSource #openWeb #pdf #permissiveLicensing #pleias #publicDomain #scraping #tokens #toxicity #wikimedia #youtube walledculture.org/common-corpu
  2. > Today, we are announcing #Amazon, #Meta, #Microsoft, #mistralai , and #Perplexity for the first time as they join our roster of partners, which includes #Google, #Ecosia, #Nomic, #Pleias, #ProRata, and #ReefMedia. All these organizations utilize #WikimediaEnterprise to integrate human-governed knowledge into their platforms at scale. By doing so, they help ensure that the work of our global volunteer community reaches billions of people with the accuracy and transparency that Wikipedia represents.

    And that is good news for me.

    #wikimedia #wikipedia #ai

    enterprise.wikimedia.com/blog/

  3. Really happy to see a new #copyleft -based #LLM , and this one seems to be more general-purpose than former attempts such as #PleIAs. The #Comma model is trained with #CommonPile, a new training pile with 8 TB of public domain and copyleft data. huggingface.co/papers/2506.052…
