#textprocessing — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #textprocessing, aggregated by home.social.
-
sentencex - by Wikimedia:
https://github.com/wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
Written in #Rust.
Bindings are available for #Python, #NodeJS and #WASM
Might be useful for my #SpeechToText system! 👀
-
The palindrome problem – Unicode edition
https://wiesmann.codiferes.net/wordpress/archives/41500
#C++ #CodePoints #GraphemeClusters #java #Javascript #ProgrammingLanguage #Python #Swift #TextProcessing #Unicode
-
In the 1990s, statistical n-gram language models, trained on vast text collections, became the backbone of NLP research. They fueled advances in nearly all NLP techniques of the era, laying the groundwork for today's AI.
F. Jelinek (1997), Statistical Methods for Speech Recognition, MIT Press, Cambridge, MA
#NLP #LanguageModels #HistoryOfAI #TextProcessing #AI #historyofscience #ISE2025 @fizise @fiz_karlsruhe @tabea @enorouzi @sourisnumerique
-
🚀 Behold the epic tale of Janet's #PEG #module, where the author heroically excludes regular expressions like they're yesterday's news. 💥 Marvel at the labyrinth of #parsing magic that claims to be more readable, but only if you have a PhD in arcane text processing. 📜✨
https://bakpakin.com/writing/how-janets-peg-works.html #Janet #readability #textprocessing #regex #HackerNews #ngated
-
Getting ready to run an online introductory XSLT course for people writing or maintaining stylesheets.
#XSLT #XML #Schematron #XSpec #declarative #functionalProgramming #textProcessing #digitalHumanities #JATS
-
Dusting off an old idea that may be nearing practicality with the rise of large language models:
A model of human universal values—multilevel, fine-grained, probabilistic, intertwingled, and evolving—could facilitate useful discourse and collective decision-making.
It seems to me that our worldwide, multicultural corpus of fiction must contain examples, within context, of nearly all human values and situations in which they may apply.
Such fictional examples are of course highly biased toward the dramatic and the sensational, since that's what people tend to enjoy reading, but might we now be close to having the tools to help us collect, categorize, and make sense of such a data-store and apply it to effective discussion and decision-making?
#decisionmaking #values #HumanValues #HumanUniversals #CollectiveDecisionmaking #TextProcessing #CorpusAnalytics #analytics #LLM #Cooperation #emotion