home.social

Search

1000 results for “Benja”

  1. europesays.com/people/76583/ Netanyahu announces projects for Jerusalem, including Western Wall – Israel & Jewish News #BenjaminNetanyahu

  2. europesays.com/people/76515/ Netanyahu says Israeli army nearing completion of Gaza mission, signals readiness for all Iran scenarios – Middle East Monitor #BenjaminNetanyahu

  3. europesays.com/people/76364/ As Netanyahu spotlights Israel’s ties to the UAE, its rulers prefer to be discreet #BenjaminNetanyahu

  4. europesays.com/people/76056/ Israeli court postpones Netanyahu corruption hearing for ‘security’ reasons – Middle East Monitor #BenjaminNetanyahu

  5. @BenjaminHCCarr I can't imagine that this could be legal outside of the USA.
    And for science, it should be a question of ethics even there.

    #ethics #Science #academicChatter #education #school

  6. This weekend, in addition to my #99 #Parkrun (sigmoid.social/@BenjaminHan/11), I also did a second run in the afternoon on the trails: a solid 10K with a bit of hail coming down on me at some point!

    #Running #Trailrunning #Photo #PNW

  7. "The Night Has a Thousand Eyes" is a song written by #BenjaminWeisman, Dorothy Wayne, and Marilyn Garrett. It became a popular hit in 1962 for #BobbyVee and has had several cover versions over the years.
    youtube.com/watch?v=LJfpDaFOkA0

  11. @pbloem That's a good question! I wrote up a longer answer to your question at benjaminhan.net/posts/20260517

    The short version: yes, recent reasoning-model training *internalizes* what used to be inference-time external signals. The question is whether we can do it universally.

    #LLMs #Reasoning #Metacognition

  12. Reflexion splits self-correction in two: an Evaluator that detects success or failure, and a Self-Reflection model that diagnoses what went wrong. The Evaluator's external signal (heuristic, exact-match, or test execution) gates whether diagnosis fires. When that signal misfires, as with the high false-negative rate on MBPP Python, Self-Reflection rewrites correct code into incorrect code, exactly the failure mode Cannot-Self-Correct documented.

    benjaminhan.net/posts/20260516

    #LLMs #AI #Reasoning #Agents #Metacognition

  16. Cannot-Self-Correct tests the strong claim that LLMs can revise their own answers on reasoning tasks without any external signal about correctness. Across three benchmarks (GSM8K, CommonSenseQA, HotPotQA), the answer is no: the model's confidence carries over from the initial answer into the revision, and the self-correction loop tends to degrade rather than improve performance. The result refutes the class of approaches that Self-Refine belongs to.

    benjaminhan.net/posts/20260516

    #LLMs #AI #Reasoning #Metacognition

  17. In Self-Refine, a single frozen LLM acts as generator, critic, and rewriter in a prompt-only loop, and the paper reports about 20 points of average lift across seven tasks without any training, RL, or external signal. The gains vary widely by task: small on math reasoning, but large on dialogue and constrained generation, where what counts as "good" is hardest to define from a one-line critique.

    benjaminhan.net/posts/20260516

    #SelfRefine #LLMs #AI #Reasoning #Metacognition

  18. Anthropic launched The Anthropic Institute (TAI), a four-pillar research agenda that introduces a third type of governance document at frontier labs, alongside declared values and deployment gates, and is set up to produce empirical findings the other layers can be checked against. OpenAI's recent "Adaptability" principle commits to updating positions as evidence comes in; TAI is built for that.

    benjaminhan.net/posts/20260516

    #AI #Anthropic #Policy #Society #Economics #Ethics
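
For readers who want the contrast between the Self-Refine and Reflexion posts above in concrete form, here is a minimal Python sketch. It is not code from either paper. Every name in it (self_refine, evaluator_gated, the toy_* stubs, the two evaluators) is hypothetical, the "models" are plain functions rather than LLM calls, and the fixed number of rounds in the prompt-only loop is a simplification. It only illustrates the control flow the posts describe: one loop always critiques and rewrites, the other lets an external evaluator decide whether diagnosis fires, and a false negative from that evaluator can turn a correct draft into a wrong one.

```python
from typing import Callable

# Hypothetical signatures: each "model" is just a function over strings.
Generate = Callable[[str], str]            # task -> draft
Critique = Callable[[str, str], str]       # (task, draft) -> feedback
Rewrite = Callable[[str, str, str], str]   # (task, draft, feedback) -> new draft
Evaluate = Callable[[str], bool]           # draft -> did it pass?


def self_refine(task: str, generate: Generate, critique: Critique,
                rewrite: Rewrite, max_iters: int = 3) -> str:
    """Prompt-only loop: critique and rewrite every round, no external signal."""
    draft = generate(task)
    for _ in range(max_iters):
        feedback = critique(task, draft)
        draft = rewrite(task, draft, feedback)
    return draft


def evaluator_gated(task: str, generate: Generate, evaluate: Evaluate,
                    reflect: Critique, rewrite: Rewrite,
                    max_iters: int = 3) -> str:
    """Reflexion-style loop: diagnosis fires only when the evaluator says 'fail'."""
    draft = generate(task)
    for _ in range(max_iters):
        if evaluate(draft):
            break  # external signal reports success, so leave the draft alone
        feedback = reflect(task, draft)
        draft = rewrite(task, draft, feedback)
    return draft


# Toy stand-ins (assumptions for illustration only).
def toy_generate(task: str) -> str:
    return "a + b"  # the first draft happens to be correct


def toy_reflect(task: str, draft: str) -> str:
    return "the result looks wrong; try another operator"


def toy_rewrite(task: str, draft: str, feedback: str) -> str:
    return draft.replace("+", "-")  # a "fix" that breaks a correct draft


def accurate_evaluator(draft: str) -> bool:
    return draft == "a + b"


def misfiring_evaluator(draft: str) -> bool:
    return False  # always reports failure, i.e. a false negative on correct code


if __name__ == "__main__":
    task = "add two numbers"
    print(evaluator_gated(task, toy_generate, accurate_evaluator,
                          toy_reflect, toy_rewrite))
    print(evaluator_gated(task, toy_generate, misfiring_evaluator,
                          toy_reflect, toy_rewrite))
    print(self_refine(task, toy_generate, toy_reflect, toy_rewrite))
```

With the accurate evaluator the correct draft is returned untouched. With the misfiring evaluator, or with no gate at all, the toy rewriter replaces it with a wrong one, which is the failure mode the posts attribute to unreliable or absent external signals.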