#language-modeling — Public Fediverse posts on home.social

Arint - SEO+KI @[email protected] · 2026-07-04 · 10:04 UTC

RT @NVIDIAAI: We split a 30B model into two halves to process tokens in parallel instead of sequentially. Introducing Nemotron-Labs-TwoTower, a diffusion language model from NVIDIA Research built on Nemotron-3-Nano-30B-A3B. Here's how it works: one half holds the context while the other writes the tokens, with both reusing the pretrained model architecture rather than training a new model from scratch. We found that it retains 98.7% of the original model's quality while generating 2.42× faster.

More at Arint.info

#AI #DiffusionModel #LanguageModeling #MachineLearning #Nemotron #NVIDIAResearch #arint_info

https://x.com/NVIDIAAI/status/2072394812301480067#m

#ai #diffusionmodel #languagemodeling #machinelearning #nemotron #nvidiaresearch

N-gated Hacker News @[email protected] · 2026-06-01 · 14:46 UTC

🔮 Behold, the mystical prophecy of CS336! 🌟 Students are invited to glimpse the future of 2026, where they'll be serenaded by the dulcet tones of "Language Modeling from Scratch" every Monday and Wednesday. 🤖 No need to travel through time, just teleport to Skilling Auditorium and prepare to be mind-boggled by the repeat spectacle of springtime syllabi! 🎉
https://cs336.stanford.edu/ #CS336 #LanguageModeling #Future2026 #SkillingAuditorium #SpringSyllabi #MindBoggled #HackerNews #ngated

#cs336 #languagemodeling #future2026 #skillingauditorium #springsyllabi #mindboggled

Hacker News @[email protected] · 2026-06-01 · 14:46 UTC

CS336: Language Modeling from Scratch

https://cs336.stanford.edu/

#HackerNews #CS336 #LanguageModeling #StanfordAI #MachineLearning #NaturalLanguageProcessing #TechEducation

#hackernews #cs336 #languagemodeling #stanfordai #machinelearning #naturallanguageprocessing

Hacker News @[email protected] · 2026-03-04 · 18:20 UTC

NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute

https://qlabs.sh/slowrun

#HackerNews #NanoGPT #Slowrun #LanguageModeling #LimitedData #InfiniteCompute

#hackernews #nanogpt #slowrun #languagemodeling #limiteddata #infinitecompute

Hacker News @[email protected] · 2025-05-30 · 10:01 UTC

Tokenization for language modeling: BPE vs. Unigram Language Modeling (2020)

https://ndingwall.github.io/blog/tokenization

#HackerNews #Tokenization #LanguageModeling #BPE #Unigram #NLP

#hackernews #tokenization #languagemodeling #bpe #unigram #nlp

Harald Klinke @[email protected] · 2025-04-21 · 12:09 UTC

Andrey Markov & Claude Shannon Counted Letters to Build the First Language-Generation Models
Shannon’s model said: “OCRO HLI RGWR NMIELWIS”
#Shannon #Markov #NLP #AIhistory #LanguageModeling
https://spectrum.ieee.org/andrey-markov-and-claude-shannon-built-the-first-language-generation-models

#shannon #markov #nlp #aihistory #languagemodeling

michabbb @[email protected] · 2024-11-06 · 14:29 UTC

🎯 #OuteTTS introduces a novel approach to text-to-speech synthesis using pure #languagemodeling
🔧 Built on #LLaMa architecture with just 350M parameters, featuring:

Zero-shot #voicecloning capability
Integration with #WavTokenizer (75 tokens/sec)
Local deployment via #llamacpp
#GGUF format compatibility

🔍 Technical Implementation:

Audio tokenization process
CTC forced alignment
Structured prompt system
Temperature-adjustable outputs

⚠️ Current Limitations:

Limited vocabulary range
String-only input support
Best performance with shorter sentences
Variable temperature sensitivity

https://github.com/edwko/OuteTTS
https://huggingface.co/OuteAI/OuteTTS-0.1-350M

#outetts #languagemodeling #llama #voicecloning #wavtokenizer #llamacpp

Naomi Saphra @[email protected] · 2023-09-15 · 04:51 UTC

New #languagemodeling #nlp #ai #paper, led by Angelica Chen! We break the steepest MLM training loss drop into *2* phase changes: first in internal grammatical structure, then external capabilities. Big implications for emergence, simplicity bias, and interpretability! https://arxiv.org/abs/2309.07311

#languagemodeling #nlp #ai #paper

Netherlands eScience Center @[email protected] · 2023-06-01 · 09:20 UTC

#LanguageModeling is trending, to a large extent because of #ChatGPT. But did you know language modeling has been with us for more than a century? And that it was born of the collaboration of a poet and a mathematician?

Our engineer Carsten Schnober tells us more:
https://blog.esciencecenter.nl/language-modeling-the-first-100-years-357556816148

#languagemodeling #chatgpt
