home.social

#paper — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #paper, aggregated by home.social.

  1. i wrote my share of wild talk submissions and frantic emails because growing up is hard. i've empathy for it. in that spirit what is also sent are the self-published manifesto adjacent subgenre..lovely. this one is sort of a new world order meets sovereign citizen misadventure complete with scans of personal id and various gas station receipts etc. we do take writing seriously and still value quality over quantity. succinct mania ftw. #2600 #paper #weightlifting #nsfw #usa

  4. For research results to be reusable, they should be professionally archived. The LRZ is developing tools for storing data according to the international #FAIR principles.
    Datasets can be published online on the FAIR Data Portal (rdm.lab.lrz.de/). Managing research data (#Forschungsdaten) benefits science (#Wissenschaft). It allows papers (#Paper) and results to be verified, which is increasingly important in times of AI and fake information: lrz.de/news/detail/forschungsd

    #researchdata

  5. Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements

    For modern car design, aerodynamics are important. Incredibly important, even. Why else would carmakers spend so much time…
    #NewsBeep #News #US #USA #UnitedStates #UnitedStatesOfAmerica #Physics #aerodynamics #drag #paper #Research #Science #study
    newsbeep.com/us/669219/

  7. Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements

    For modern car design, aerodynamics are important. Incredibly important, even. Why else would carmakers spend so much time…
    #NewsBeep #News #Physics #aerodynamics #AU #Australia #drag #Paper #research #Science #study
    newsbeep.com/au/699151/

  9. Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements

    For modern car design, aerodynamics are important. Incredibly important, even. Why else would carmakers spend so much time…
    #NewsBeep #News #Physics #aerodynamics #CA #Canada #drag #paper #research #Science #study
    newsbeep.com/ca/698484/

  10. europesays.com/uk/989265/ Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements #aerodynamics #drag #paper #Physics #Research #Science #study #UK #UnitedKingdom

  11. europesays.com/ie/506170/ Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements #Aerodynamics #drag #Éire #IE #Ireland #paper #Physics #Research #Science #Study

  12. Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements

    For modern car design, aerodynamics are important. Incredibly important, even. Why else would carmakers spend so much time…
    #NewsBeep #News #US #USA #UnitedStates #UnitedStatesOfAmerica #Physics #aerodynamics #drag #paper #Research #Science #study
    newsbeep.com/us/668808/

  14. Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements

    For modern car design, aerodynamics are important. Incredibly important, even. Why else would carmakers spend so much time…
    #NewsBeep #News #Physics #aerodynamics #AU #Australia #drag #Paper #research #Science #study
    newsbeep.com/au/698667/

  16. europesays.com/ie/505783/ Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements #Aerodynamics #drag #Éire #IE #Ireland #paper #Physics #Research #Science #Study

  17. europesays.com/uk/988698/ Incredible Discovery Changes Over 80 Years Of Thinking About Aerodynamics And Suggests Dramatic Aero Improvements #aerodynamics #drag #paper #Physics #Research #Science #study #UK #UnitedKingdom

  18. ✈️📄 May 26 was National #Paper #Airplane Day, and #NASA has free tutorials for folding versions of their X-57 Maxwell and X-59 experimental #aircraft. The activity page includes printable patterns and coloring sheets.

    👉 popsci.com/technology/how-to-m

    #aeronautics #stem #origami #physics #flight #kids #activities #crafts #education

  19. Can language models monitor and steer their own internal activations? A neuroscience-inspired neurofeedback paradigm finds yes, but only within a low-dimensional metacognitive space: semantically interpretable directions are accessible, raw-variance directions aren't. The prerequisite for spoofing activation-based oversight already partially exists.

    benjaminhan.net/posts/20260526

    #Paper #Metacognition #LLMs #AISafety #Neuroscience #NeurIPS #AI

  20. Can frontier coding agents rebuild a program from scratch given only its executable and docs? No: a new 200-task benchmark finds that across nine models none fully resolves any task. The best passes 95% of tests on just 3% of them. Same models score well on bug-fix benchmarks but zero here, so headline progress numbers don't extrapolate.

    benjaminhan.net/posts/20260526

    #Paper #LLMs #AgenticSystems #SoftwareEngineering #AI

  25. Given a problem queue and a token budget, can an LLM plan which to attempt, in what order, and how much to spend on each — before any execution feedback? TRIAGE tests 20 frontier and open-source LLMs. Most plan worse than random. Reasoning-trained modes systematically lose to standard ones. Even when shown its own per-problem budget, the best complier respects it on 37% of attempts.

    benjaminhan.net/posts/20260523

    #Paper #AI #LLMs #Metacognition #Evaluation #AgenticSystems

  30. Do current LLMs know when to say "I don't know"? AbstentionBench (NeurIPS '25) tests 20 frontier models across 20 unanswerable-question datasets. Reasoning fine-tuning degrades abstention recall by ~24% — RLVR has no "abstain" action, so there's no gradient toward "I don't know." Models hedge in CoT and commit anyway in the final answer.

    benjaminhan.net/posts/20260523

    #Paper #AI #LLMs #Metacognition #Benchmark #Reasoning #NeurIPS

  35. If the universe is deterministic, can we have free will?

    The "compatibilism" debate is important in #philosophy, but there have also been surveys of what people around the world believe!

    #paper: philpapers.org/rec/CHAIBI-2