#livecodebenchpro — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #livecodebenchpro, aggregated by home.social.

  1. Google just rolled out Gemini 3.1 Pro, smashing the GPQA Diamond benchmark at 94.3% and climbing to an Elo 2 on LiveCodeBench Pro. It also tops SWE‑Bench, showing leaps in AI reasoning, scientific knowledge, and vibe‑coding. Curious how it reshapes open‑source AI research? Read the full breakdown. #Gemini3_1Pro #GPQADiamond #LiveCodeBenchPro #SWEBench

    🔗 aidailypost.com/news/google-un