#iclr — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #iclr, aggregated by home.social.
-
SCoRe is a two-stage on-policy RL recipe that teaches a language model to revise its own answers using only self-generated data. On Gemini 1.5 Flash and 1.0 Pro it gains 15.6 points on MATH and 9.1 on HumanEval over the base model. At matched inference budgets, sequential self-correction beats parallel sampling at up to 32 samples.
https://benjaminhan.net/posts/20260512-score/?utm_source=mastodon&utm_medium=social
-
Let's Verify Step by Step compares process and outcome supervision on MATH. The process-reward model reaches 78.2% at best-of-1860 vs 72.4% for the outcome-reward model. But that gap narrows fast at small N, where most deployments actually live.
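The best-of-N reranking both reward models are evaluated under can be sketched in a few lines; `generate` and `reward` here are hypothetical stand-ins for a sampler and a trained reward model, not the paper's code:

```python
# Sketch of best-of-N reranking: sample N candidate solutions,
# score each with a reward model, return the highest-scoring one.
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int) -> str:
    """Return the candidate the reward model ranks highest.

    With a process-reward model, `reward` would aggregate
    per-step scores; with an outcome-reward model it scores
    the final answer directly.
    """
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda sol: reward(prompt, sol))
```

The point of the paper's comparison is only the choice of `reward`; the outer loop is identical for both supervision styles.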
-
Conformal Language Modeling (CLM) adapts conformal prediction to generative LMs: sample candidates, stop when a calibrated rule fires, return a set guaranteed to contain an acceptable answer. The more interesting half is the component-level filter — per-phrase coverage, not just set-level. That's the primitive for hallucination flagging: highlight the vetted phrases, leave the rest for review.
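The set-level loop the post describes (sample, stop when a calibrated rule fires, return the admitted set) can be sketched as follows; the threshold names `lambda_admit` and `lambda_stop` and the helper callables are hypothetical, and in the actual method the thresholds are calibrated via conformal risk control rather than chosen by hand:

```python
# Sketch of a CLM-style outer loop: sample candidates, admit those
# whose quality score clears a calibrated admission threshold, and
# stop once the best score so far clears a calibrated stopping
# threshold (or the sample budget runs out).
from typing import Callable, List

def clm_sample(prompt: str,
               generate: Callable[[str], str],
               quality: Callable[[str], float],
               lambda_admit: float,
               lambda_stop: float,
               k_max: int = 32) -> List[str]:
    admitted: List[str] = []
    best = float("-inf")
    for _ in range(k_max):
        y = generate(prompt)
        s = quality(y)
        best = max(best, s)
        if s >= lambda_admit:   # set-inclusion rule
            admitted.append(y)
        if best >= lambda_stop:  # calibrated stopping rule
            break
    return admitted
```

The component-level filter praised above would run inside this loop, scoring individual phrases of each admitted candidate against a separate calibrated threshold.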
-
DSPy turns LM pipelines into typed-module graphs and compiles them end-to-end against a single metric, bootstrapping its own few-shot demonstrations.
The programming-model layer is the real contribution, not any specific teleprompter. Once pipelines are typed graphs, pipeline-level search (MASS, MIPRO) becomes possible in a way it wasn't with string-template prompts.
https://benjaminhan.net/posts/20260430-dspy/?utm_source=mastodon&utm_medium=social
-
SelfReflect measures whether an LLM's text summary of its uncertainty matches its actual answer distribution. Across 20 modern models: it doesn't, unless the model sees samples of its own answers first.
The negative result does more work than the metric itself. It fits a growing line of evidence that LLM self-reports shouldn't be trusted as introspection. The practical workaround isn't cheap: N forward passes to sample, then a summarization pass.
-
Big congratulations to all authors! 🚀
#ICLR2026 #MachineLearning #AIResearch #RepresentationLearning #InformationRetrieval #DenseRetrieval #SelfSupervisedLearning #LanguageModels #NLP #UKPLab #ICLR
-
A major #AIconference, the International Conference on Learning Representations (#ICLR), discovered that 21% of #peerreviews were fully #AIgenerated. #Researchers raised concerns about AI-generated #reviews, citing issues like #hallucinatedcitations and #vaguefeedback. Organisers will now use automated tools to assess submissions and reviews for AI use. https://www.nature.com/articles/d41586-025-03506-6?eicker.news #tech #media #news
-
Got invited to speak at an #iclr workshop in May and I'm over the moon.
-
Looking to follow (and be followed!) #mlbio #bioml #biotech #bayes #bayesian #generativemodels #startup #startups #medicine #biology #representationlearning #techbio #immunology #neurips #icml #iclr #cambridge #nyc