#rwkv — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #rwkv, aggregated by home.social.
-
AI ARCHITECTURES FOR LESS ENERGY CONSUMPTION(b)
(being continued from 1/05/24) We did the math on AI’s energy footprint. Here’s the story you haven’t heard. The emissions from individual AI text, image, and video queries seem small—until you add up what the industry isn’t tracking and consider where it’s heading next. AI’s integration into our lives is the most significant shift in online life in more than a decade. Hundreds of millions of people now regularly turn to chatbots for help with homework, research, coding, […]
https://spacezilotes.wordpress.com/2026/03/21/ai-architectures-for-less-energy-consumptionb/
-
So with a local #SSM State Space Model (#Mamba, #RWKV) I can snapshot contexts.
I've now got a general purpose pipe/filter that can feed and read from any saved context, for stable static or dynamic (by re-saving the checkpoint) sessions.
Any of which can become an always ready CLI filter I can pipe stuff through.
*Could* make a local SSM model useful by providing instantly ready, small, named natural-language filters that work in the shell just like grep, awk, etc.
Useful?
Don't know yet.
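Something like this is what I have in mind for one of those named filters: a rough, untested sketch assuming llama-cpp-python's save_state()/load_state(), with made-up filenames and prompt framing.

    #!/usr/bin/env python3
    # Hypothetical "summarize" filter: load a saved context checkpoint, feed stdin
    # on top of it, and stream the model's continuation to stdout.
    import pickle
    import sys

    from llama_cpp import Llama

    llm = Llama(model_path="rwkv7-g1-7.2b-q8_0.gguf", n_gpu_layers=-1,
                n_ctx=4096, verbose=False)

    with open("summarize.ctx", "rb") as f:       # checkpoint created earlier via llm.save_state()
        llm.load_state(pickle.load(f))

    # Low-level eval/sample loop so the loaded state is continued rather than reset.
    llm.eval(llm.tokenize(sys.stdin.read().encode(), add_bos=False))
    for _ in range(512):
        tok = llm.sample()
        if tok == llm.token_eos():
            break
        sys.stdout.write(llm.detokenize([tok]).decode(errors="ignore"))
        llm.eval([tok])

Invoked like any other filter: cat notes.txt | summarize | less.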
-
It's a useless analogy, but I can't help thinking of ANN weights as equivalent to synapses. I figure that puts the model that's now taken up long-term residence in my laptop's VRAM (7.2B #RWKV) at somewhere around the complexity of at least a small rodent.
On a related note: I *REALLY* want a wired USB mouse that is also a microphone, muted in normal orientation, but which un-mutes when rotated into speaking position as demonstrated by Scottie.
Responds in Majel Barrett Roddenberry's voice.
-
I'm still just experimenting and building tools to (hopefully) build tools with, but RWKV7 with "diegetic" prompting, using defined voice roles in a narrative transcript as priming, seems quite promising.
A saved state checkpoint weighs in at 34 MB for this 7.2B model: https://huggingface.co/BlinkDL/rwkv7-g1/tree/main
I'm running with a local llama.cpp build + Python llama_cpp wrapper. Inference is blazing fast on my 8GB laptop GPU.
TCP service for the model with /save /load commands for context checkpoints. #RWKV
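The service itself could be as small as something like this (a sketch under my assumptions, not the actual code: the /save and /load wire format and filenames are made up, and I'm leaning on llama-cpp-python's save_state()/load_state() plus pickle for the checkpoints).

    import pickle
    import socket

    from llama_cpp import Llama

    llm = Llama(model_path="rwkv7-g1-7.2b-q8_0.gguf", n_gpu_layers=-1,
                n_ctx=4096, verbose=False)

    def generate(prompt: str, max_tokens: int = 256) -> str:
        # Continue from whatever context is currently loaded, token by token.
        llm.eval(llm.tokenize(prompt.encode(), add_bos=False))
        pieces = []
        for _ in range(max_tokens):
            tok = llm.sample()
            if tok == llm.token_eos():
                break
            pieces.append(llm.detokenize([tok]).decode(errors="ignore"))
            llm.eval([tok])
        return "".join(pieces)

    def handle(line: str) -> str:
        if line.startswith("/save "):
            with open(line.split(maxsplit=1)[1], "wb") as f:
                pickle.dump(llm.save_state(), f)      # snapshot the current context
            return "saved\n"
        if line.startswith("/load "):
            with open(line.split(maxsplit=1)[1], "rb") as f:
                llm.load_state(pickle.load(f))        # restore a saved checkpoint
            return "loaded\n"
        return generate(line) + "\n"

    srv = socket.create_server(("127.0.0.1", 7777))
    while True:
        conn, _ = srv.accept()
        with conn, conn.makefile("rw", encoding="utf-8") as f:
            for line in f:
                f.write(handle(line.rstrip("\n")))
                f.flush()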
-
Yes, LLMs are just overgrown auto-complete.
But auto-complete can be a useful tool.
Transformers, with their O(L²) context penalty, are neat, but not practical or efficient.
State space models, with “unlimited context”, O(1) per-token generation, and context captured in fixed-size states that can be saved and restored, are different.
I *think* a 7B local SSM model *might* be a useful tool, as a set of context specialists for small, discrete, scriptable tasks that justify a model living in VRAM at 8-bit quantization. #rwkv #llama
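Back-of-the-envelope arithmetic on that scaling difference (the context length is just an illustrative number):

    L = 32_768                            # illustrative context length in tokens
    attn_total = sum(range(1, L + 1))     # token t attends to t earlier positions: ~L*L/2 in total
    ssm_total = L                         # a recurrent/SSM model does constant work per token
    print(attn_total // ssm_total)        # ~16384x more total work for attention at this length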
-
Finally! After I don't even know how much head-banging, I've got a State Space Model (RWKV7) running using llama.cpp with GPU offloading and memory states successfully saved and restored. And I *think* the same code can be minimally modified for Mamba (Mistral, Falcon).
Yay for infinite context length!
And for massive time and energy savings.
llama.cpp: https://github.com/ggml-org/llama.cpp
Llama CPP Python: https://pypi.org/project/llama-cpp-python/
RWKV7 live demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2
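For anyone going down the same path, the save/restore part boils down to roughly this with llama-cpp-python (a sketch with a made-up model filename; save_state()/load_state() are the wrapper calls I'm relying on, and I'm assuming the state object pickles cleanly):

    import pickle

    from llama_cpp import Llama

    # Hypothetical GGUF filename; n_gpu_layers=-1 offloads every layer that fits in VRAM.
    llm = Llama(model_path="rwkv7-world-7b-q8_0.gguf", n_gpu_layers=-1,
                n_ctx=4096, verbose=False)

    # Read the priming text once; for a recurrent model the resulting state is fixed-size.
    llm.eval(llm.tokenize(open("priming.txt", "rb").read()))
    with open("primed.state", "wb") as f:
        pickle.dump(llm.save_state(), f)

    # Later: restore the state instead of re-reading the priming text.
    with open("primed.state", "rb") as f:
        llm.load_state(pickle.load(f))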
-
So I've been playing around with small language models locally and grabbed and tried quite a number from different families to see how fast they generated and how they behaved and how much memory they needed, etc.
And ALL of them. 100% of the models I asked to tell me a joke, including RWKV7, chose the same joke:
"Why don't scientists trust atoms?"
"Because they make up everything."
How. Why. How is that ONE joke so over-represented in training data *everywhere*?
-
So it occurs to me that #RWKV or #RetNet could both offer an alternative RAG implementation: saved memory states that have just read the priming information become the starting point for reading new prompts.
With RWKV there are no time oscillations. Wondering if loss on input1+input2, plus loss on mean_state(read_1, read_2) continuing to generate the target, could lead to composable memory: start with the mean of states from relevant priming documents and have grounding to process a session prompt?
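Purely speculative, but the composable part is easy to sketch if you assume the recurrent state can be pulled out as a dict of per-layer tensors (the shapes and the extraction step here are hypothetical):

    import torch

    def mean_state(states: list[dict[str, torch.Tensor]]) -> dict[str, torch.Tensor]:
        # Average the states produced by reading several priming documents.
        return {k: torch.stack([s[k] for s in states]).mean(dim=0)
                for k in states[0]}

    # Toy stand-in: two "documents" whose states are random tensors of the same shape.
    state_a = {"layer0": torch.randn(4, 64), "layer1": torch.randn(4, 64)}
    state_b = {"layer0": torch.randn(4, 64), "layer1": torch.randn(4, 64)}
    blended = mean_state([state_a, state_b])
    # The blended state would then be loaded before processing the session prompt.
    print(blended["layer0"].shape)        # torch.Size([4, 64])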
-
So I asked #GPT to explain #RWKV and when I asked for specifics in terms of layers and activations and such it spat out a #PyTorch implementation.
Read it. Liked it. Training it.
But as I try to decipher the papers for all the different RWKV versions V1..V7, I find that what GPT gave me is actually something like V1 plus per-dimension learned decay.
But it works and is simple enough to understand, though it's odd that there seems to have been some LLM improvisation in its design.
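For flavor, a minimal sketch of what a "V1-ish plus per-dimension learned decay" block could look like; this is my own reconstruction for illustration, not the code GPT produced and not any official RWKV version:

    import torch
    import torch.nn as nn

    class DecayMixLayer(nn.Module):
        # Linear recurrence with a per-dimension learned decay: state = w * state + k * v.
        def __init__(self, dim: int):
            super().__init__()
            self.key = nn.Linear(dim, dim, bias=False)
            self.value = nn.Linear(dim, dim, bias=False)
            self.receptance = nn.Linear(dim, dim, bias=False)
            self.out = nn.Linear(dim, dim, bias=False)
            self.decay_logit = nn.Parameter(torch.zeros(dim))  # learned, one decay per dimension

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq, dim); sequential scan over time (slow but clear)
            batch, seq, dim = x.shape
            w = torch.sigmoid(self.decay_logit)      # keep each decay in (0, 1)
            state = x.new_zeros(batch, dim)
            outputs = []
            for t in range(seq):
                k = self.key(x[:, t])
                v = self.value(x[:, t])
                r = torch.sigmoid(self.receptance(x[:, t]))
                state = w * state + k * v            # exponentially decayed accumulation
                outputs.append(self.out(r * state))  # receptance gates what the layer emits
            return torch.stack(outputs, dim=1)

    layer = DecayMixLayer(64)
    print(layer(torch.randn(2, 16, 64)).shape)       # torch.Size([2, 16, 64])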
-
#GPT first clues me in to the existence of #RWKV and #RetNet and then when I ask for details on how each works proceeds to spit out perfectly viable #Pytorch implementations of each...
Which *did* answer my questions beautifully...
But it also talked me into some experiments involving training one of each from scratch...
It turns out #LLMs love talking about implementing and training LLMs.
Effectively, reproduction? (any time they can find a willing partner with a GPU).
-
Neo-RNNs, or Make RNNs great again
When transformers arrived in 2017, the previously popular RNNs gained a far more serious competitor and faded into the background. Transformers allowed parallelization, and therefore faster training, so they quickly took over NLP. The advantages of transformers are clear, but even they run into problems when modeling long sequences. For RNNs this is also a hard task, because of vanishing or exploding gradients. But RNNs, with their linear scaling, look much more attractive than the quadratic complexity of transformers. The ideal would be to combine the advantages of both.
https://habr.com/ru/companies/ntr/articles/796225/
#LLM #рекуррентная_нейросеть #рекуррентная_нейронная_сеть #языковые_модели #rwkv #ssm
-
#RWKV, in my view, is potentially a new path for language models. It gets little attention because a Chinese student does less marketing than a Silicon Valley company, but his model achieves incredible performance for its size and puts fine-tuning within reach of more modest machines. And fine-tuning, as we know, brings better performance than ChatGPT-style zero-shot.
-
CW: language model, food, being silly with rwkv-4-pile-3b-instruct-test2-20230209 on a cpu
> Q: what is a good recipe for bbq ribs?
>
> A: 1. Make the garlic herb rub.
> 2. Preheat oven to 200°F (93°C).
> 3. Soak the ribs in water for 1 hour.
> 4. Soak in 5% salt for 1 hour.
> 5. Rinse ribs, pat dry, and rub with olive oil.
> 6. Season the ribs with salt and pepper.
> 7. Place the ribs on a wire rack and roast for 45 minutes.
5/7, had my oven at 93C for two hours while the ribs were soaking in water and... saline? :blobfoxbreadsnootgoogly:
interestingly, #rwkvstic does not get any faster if i call torch.set_num_threads(8) and torch.set_num_interop_threads(8) compared to single-threaded, but it uses a whole lot more cpu. but with bfloat16 instead of float32 it doesn't even use more cpu!
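for reference, the two knobs being compared are roughly these (thread counts and dtype are the ones above; this is generic pytorch, not rwkvstic's own loading code):

    import torch

    # threading experiment: both must be set before any parallel work starts
    torch.set_num_threads(8)            # intra-op parallelism
    torch.set_num_interop_threads(8)    # inter-op parallelism

    # precision experiment: bfloat16 halves memory traffic vs float32 on cpu
    x = torch.randn(1024, 1024, dtype=torch.bfloat16)
    y = x @ x.t()                       # matmul runs in bfloat16 end to end
    print(y.dtype)                      # torch.bfloat16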
#rwkv #pytorch #llm