home.social

#sparseattention — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #sparseattention, aggregated by home.social.

  1. Understand DeepSeek V3.2: Pushing the Frontier of Open LLMs
    Recently, I joined the MLSys 2026 NVIDIA competition track! So I’m trying to understand DeepSeek V3.2, sparse attention, and learn GPU...

    #gpu #sparse-attention #llm #machine-learning #deepseek

  2. #ZAI: #GLM5, a new large language model, is designed for #complexsystemsengineering and long-horizon agentic tasks. It boasts 744 billion parameters and integrates #DeepSeek #SparseAttention for improved efficiency. GLM-5 outperforms previous models on various benchmarks, including #reasoning, #coding, and #agentictasks, and is open-sourced for wider accessibility. z.ai/blog/glm-5?AIagents.at #AIagent #AI #ML #NLP #LLM #GenAI

  7. DeepSeek releases two free AI models in a challenge to its GPT-5-class competition
    The Chinese AI startup DeepSeek has unveiled two new models that, according to the company, match or outperform OpenAI's GPT-5 and Google's Gemini-3.0-Pro. The models are...
    apfeltalk.de/magazin/news/deep
    #KI #News #China #DeepSeek #Gemini #GPT5 #KI #OpenSource #Regulierung #SparseAttention

  8. 🚨 DeepSeek just dropped V3.2-Exp — an experimental spin on V3.1-Terminus.

    The twist? DeepSeek Sparse Attention (DSA) → fine-grained sparse attention that makes long-context training & inference way more efficient ⚡

    Benchmarks? Basically the same (some even better 👀).

    dropletdrift.com/deepseek-rele

    #AI #DeepSeek #LLM #SparseAttention #MachineLearning #TechNews #Innovation #Coding #OpenSource #AIModels #Efficiency #NeuralNetworks #GPU #AICommunity #ArtificialIntelligence #AIResearch #NextGen #Tech