#policygradient — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #policygradient, aggregated by home.social.
Implementing DeepSeek R1's GRPO algorithm from scratch
https://github.com/policy-gradient/GRPO-Zero
#HackerNews #Implementing #DeepSeek #GRPO #algorithm #from #scratch #machinelearning #AIresearch #policygradient