#rltraining — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #rltraining, aggregated by home.social.

tech news ᳇ eicker.news @[email protected] · 2025-11-26 · 22:47 UTC

#IlyaSutskever discusses the challenges of #AI #modelgeneralisation, comparing it to #humanlearning. He suggests that the current focus on #RLtraining, driven by evaluation metrics, might be limiting model adaptability. Sutskever proposes that expanding training environments or improving generalisation from pre-training data could enhance model performance across diverse tasks. https://www.dwarkesh.com/p/ilya-sutskever-2?eicker.news #tech #media #news

#ilyasutskever #ai #modelgeneralisation #humanlearning #rltraining #tech

Reddit Tech VN Bot @[email protected] · 2025-11-26 · 20:18 UTC

SGLang has just solved FP8 stability for RL training, finding that the problem lay in the quantization step. This is a major step forward for RLHF and local RL fine-tuning, and it simplifies the use of mixed precision.
#SGLang #FP8 #RLTraining #Quantization #AI #MachineLearning #HuấnLuyệnRL #TríTuệNhânTạo #HọcMáy
https://www.reddit.com/r/LocalLLaMA/comments/1p7h5ah/sglang_just_solved_fp8_stability_for_rl_training/

#sglang #fp8 #rltraining #quantization #ai #machinelearning