home.social

#humanityslastexam — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #humanityslastexam, aggregated by home.social.

  1. Google's new Gemini 3 Flash promises faster AI with a leaner footprint, challenging larger frontier models. Early benchmark tests show it hitting 2.5‑model performance levels while staying more accessible. Could this be the answer to the ‘Humanity’s Last Exam’ of scaling? Dive into the details and see how parameter counts stack up. #Gemini3Flash #FrontierModels #2point5Models #HumanitysLastExam

    🔗 aidailypost.com/news/gemini-3-

  2. Gemini Deep Research agent just topped the Humanity’s Last Exam (HLE) and DeepSearchQA benchmarks, and now leads BrowseComp—outperforming Google Search and NotebookLM. The results showcase a new AI model’s capabilities and set a fresh standard for open‑source research tools. Curious how it did it? Read the full breakdown. #GeminiDeepResearch #HumanitysLastExam #DeepSearchQA #BrowseComp

    🔗 aidailypost.com/news/gemini-de

  3. Testing the limits of AI

    Reuters and the New York Times report on a new test: Humanity's Last Exam. With 3,000 questions from more than 100 subject areas, it probes the limits of modern AI systems. Thorben Jansen of the IPN was involved in its development.

    🔗 More: lastexam.ai

    New York Times: reuters.com/technology/artific

    Reuters: reuters.com/technology/artific

    #AI #AIBenchmark #KI #HumanitysLastExam