home.social

#reasoningmodels — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #reasoningmodels, aggregated by home.social.

  1. OpenAI claims it solved an 80-year-old math problem — for real this time

    OpenAI claims its new reasoning model has produced an original mathematical proof disproving a famous unsolved conjecture in…
    #NewsBeep #News #US #USA #UnitedStates #UnitedStatesOfAmerica #Artificialintelligence #AI #ArtificialIntelligence #ChatGPT #erdosproblems #OpenAI #reasoningmodels #Technology
    newsbeep.com/us/655490/

  3. Arcee AI released Trinity-Large-Thinking, a 400B parameter open-source reasoning model that scores within 2 points of Claude Opus on PinchBench while costing 96% less at $0.90 per million tokens. Uses sparse architecture activating only 13B parameters per token. Trained for $20M by 30-person team. #OpenSource #AI #ReasoningModels

    implicator.ai/arcee-ai-release

  4. xAI’s co‑founder exits keep coming, while Lambda outlines a 2025 shift toward bigger context windows, multimodal reasoning models and open‑source inference for AI production. What could this mean for the future of machine learning? Read on for the full story. #AIProduction #ReasoningModels #MultimodalAI #OpenSourceInference

    🔗 aidailypost.com/news/xai-co-fo

  5. AI that thinks instead of guessing?

    Reasoning models use techniques like chain of thought and tree of thought to decompose problems, explore alternatives, and choose better answers, often at the cost of more compute and latency.

    A practical explainer:
    🔗 techglimmer.io/what-is-ai-thin

    #AI #ReasoningModels #ChainOfThought #TreeOfThought #GenAI #FediTech #MachineLearning

  6. 2025 saw significant advancements in #LLMs, particularly in the areas of #reasoning and #agent based systems. #Reasoningmodels, capable of breaking down #complextasks and utilising tools, revolutionised #coding and #search. The year witnessed the rise of #codingagents, exemplified by #ClaudeCode, which can autonomously write, execute, and refine code. simonwillison.net/2025/Dec/31/ #tech #media #news

  11. "The point is that with each advance in AI, new hurdles become apparent; when one missing aspect of “intelligence” is filled in, we find ourselves bumping up against another gap. When I speculated about GPT-5 last year, it didn’t occur to me to question whether it would know how to set priorities, because the models of the time weren’t even capable enough for that to be a limiting factor. In a post from November, AI is Racing Forward – on a Very Long Road, I wrote:

    …the real challenges may be things that we can’t easily anticipate right now, weaknesses that we will only start to put our finger on when we observe [future models] performing astonishing feats and yet somehow still not being able to write that tightly-plotted novel.

    In April 2024, it seemed like agentic AI was going to be the next big thing. The ensuing 16 months have brought enormous progress on many fronts, but very little progress on real-world agency. With projects like AI Village shining a light on the profound weakness of current AI agents, I think robust real-world capability is still years away."

    secondthoughts.ai/p/gpt-5-the-

    #AI #GenerativeAI #LLMs #Chatbots #AIAgents #AgenticAI #ReasoningModels

  12. 🧠 What if you could tell AI how much to think before answering?
    Seed-OSS 36B gives builders a thinking budget knob + 512K context window—control depth vs speed like never before. ⚡

    👉 See how it changes product SLAs, costs, and user experience:
    medium.com/@rogt.x1997/seed-os

    #AI #ReasoningModels #LongContext

  13. Seven so-called "replies" to Apple's paper on reasoning models, or as I like to call them, seven exercises in missing the point entirely. 📚🤦‍♂️ It's almost like a bad magic trick: look over here at these rebuttals while we pretend the original issue just vanishes! 🎩✨
    garymarcus.substack.com/p/seve #AppleReplies #ReasoningModels #MissingThePoint #BadMagicTrick #TechCritique #HackerNews #ngated

  14. No more guessing games! 🕵️‍♂️ 's new 'think' feature cleanly separates the model's internal thinking from the content. Easy to enable - just 'think': true in your API request. youtu.be/yBD598s5g8c

  15. 🤖 AI
    🔴 OpenAI Unveils o3 & o4-mini Reasoning Models

    🔸 o3 outperforms all models in math, coding & visual tasks; o4-mini balances price & power.
    🔸 First OpenAI models to "think with images" — can analyze blurry PDFs or sketches.
    🔸 Both run Python, browse the web, and will be accessible via APIs & ChatGPT.

    #OpenAI #AI #o3 #o4mini #GPT5 #ReasoningModels

  16. Oh, the irony! An article on "reasoning models" that can't reason its way past a #JavaScript prompt. 🤖🧠✨ Maybe it should model how to enable #cookies first before philosophizing! 🍪🙄
    anthropic.com/research/reasoni #irony #reasoningmodels #techhumor #codingstruggles #HackerNews #ngated

  17. Apparently AI reasoning models like DeepSeek-R1 and OpenAI o1 suffer from "underthinking": they abandon promising solutions too quickly, which wastes compute. To address this, researchers developed a "thought switching penalty" (TIP), which improved accuracy across math and science problems.

    the-decoder.com/reasoning-mode

  18. The Chinese firm said training the model cost just $5.6 million. Alibaba Cloud followed with a new generative AI model, while Microsoft alleges DeepSeek ‘distilled’ OpenAI’s work. #artificialintelligence #chatgpt #deepseek #deepseekr1 #deepseek-v3 #generativeai #Microsoft #nvidia #openai #reasoningmodels
    DeepSeek Chatbot Beats OpenAI on App Store Leaderboard

  19. OpenAI's o1 marks a major shift in the AI industry, moving away from prediction-based LLMs to reasoning models that aim to overcome their limitations. 🔍🤖 #OpenAI #AI #MachineLearning #ReasoningModels #ArtificialIntelligence #TechInnovation #AIShift
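
The "thought switching penalty" mentioned in post 17 can be pictured as a decoding-time adjustment to the model's logits. The following is only a minimal sketch under stated assumptions, not the researchers' implementation: the function name `apply_switch_penalty`, the set `switch_token_ids` (tokens assumed to signal a change of approach, such as "Alternatively"), and the values of `alpha` (penalty strength) and `beta` (how many steps the penalty stays active) are all hypothetical and chosen for illustration.

```python
def apply_switch_penalty(logits, switch_token_ids, steps_in_thought,
                         alpha=3.0, beta=600):
    """Return a copy of `logits` with thought-switch tokens discouraged.

    logits           -- list of floats, one score per vocabulary token
    switch_token_ids -- set of token ids assumed to start a new line of thought
    steps_in_thought -- tokens generated since the current thought began
    alpha, beta      -- illustrative penalty strength and duration (assumptions)
    """
    # Once the current thought has run for `beta` steps, switching is allowed freely.
    if steps_in_thought >= beta:
        return list(logits)
    # Otherwise subtract `alpha` from every switch token's score.
    return [
        score - alpha if token_id in switch_token_ids else score
        for token_id, score in enumerate(logits)
    ]


def argmax(values):
    """Index of the largest value (greedy decoding over a toy vocabulary)."""
    return max(range(len(values)), key=lambda i: values[i])


# Toy vocabulary of three tokens; token 0 plays the role of "Alternatively".
toy_logits = [2.0, 1.0, 0.5]
early = apply_switch_penalty(toy_logits, {0}, steps_in_thought=10)
late = apply_switch_penalty(toy_logits, {0}, steps_in_thought=700)
print(argmax(toy_logits), argmax(early), argmax(late))
```

In this toy setup the penalty only changes the greedy choice while the current thought is still young; after `beta` steps the scores are passed through unchanged, so the model remains free to switch approaches once it has explored one for a while.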