home.social

#retnet — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #retnet, aggregated by home.social.

  1. So it occurs to me that #RWKV and #RetNet can both offer an alternative RAG implementation: save the memory state after reading priming information, then start from it when reading new prompts.

    With RWKV there are no time oscillations. Wondering if loss on input1+input2, plus loss on mean_state(read_1, read_2) continuing to generate the target, could lead to composable memory: start from the mean state of the relevant priming documents and have grounding to process a session prompt?
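A minimal, purely illustrative sketch of the "composable memory" idea in the post above: read each priming document, keep the recurrent state each one leaves behind, and start a session from the element-wise mean of those states. Everything here (`read_document`, `mean_state`, `STATE_DIM`) is a hypothetical placeholder, not a real RWKV or RetNet API.

```python
STATE_DIM = 8  # toy state size; a real RWKV/RetNet state is far larger

def read_document(doc: str) -> list[float]:
    # Placeholder for running an RWKV-style model over a priming document
    # and capturing its final recurrent state. Here we just derive a
    # deterministic pseudo-state from the document's bytes.
    data = doc.encode()
    return [sum(data[i::STATE_DIM]) % 100 / 100.0 for i in range(STATE_DIM)]

def mean_state(*states: list[float]) -> list[float]:
    # Element-wise average of several saved states -- the post's
    # mean_state(read_1, read_2).
    return [sum(vals) / len(vals) for vals in zip(*states)]

# "Start with mean of relevant priming documents":
s1 = read_document("priming document A")
s2 = read_document("priming document B")
session_start = mean_state(s1, s2)
# session_start would then seed generation for the session prompt.
```

Whether an averaged state actually grounds generation is exactly the open question the post raises; the training losses it proposes would be needed to make the states average-compatible.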

  2. #GPT first clues me in to the existence of #RWKV and #RetNet, and then, when I ask for details on how each works, proceeds to spit out perfectly viable #PyTorch implementations of each...

    Which *did* answer my questions beautifully...

    But it also talked me into some experiments involving training one of each from scratch...

    It turns out #LLMs love talking about implementing and training LLMs.

    Effectively, reproduction? (any time they can find a willing partner with a GPU).