home.social

#deepseek-v3 — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #deepseek-v3, aggregated by home.social.

  1. DeepSeek-V3 from Scratch: Mixture of Experts (MoE). Table of contents: The Scaling Challenge in Neural Networks; Mixture of Experts (MoE): Mathematic...

    #Deep #Learning #DeepSeek #Machine #Learning #Neural #Networks #Tutorial #deepseek-v3 #expert #routing

  3. Build DeepSeek-V3: Multi-Head Latent Attention (MLA) Architecture. Table of contents: The KV Cache Memory Problem in DeepSeek-V3; Mult...

    #Deep #Learning #Large #Language #Models #PyTorch #Transformers #Tutorial #attention #mechanisms #deepseek-v3

  4. DeepSeek-V3 Model: Theory, Config, and Rotary Positional Embeddings. Table of contents: Introduction to the DeepSeek-V3 Model; The F...

    #DeepSeek-V3 #KV #Cache #MultiHead #Latent #Attention #RoPE #Tutorial #deepseekv3 #kv #cache

  5. Beating GPT-5: DeepSeekMath-V2 Self-Corrects Logic Errors. Introduction: Mathematics, with the aid of artificial intelligence, is advancing rapidly. Innovations such as informal th...

    #ai-in-mathematics #deepseekmath-v2 #deepseek-v3 #open-source-ai-model #theorem-proving

  6. 🚀 Welcome GLM-4.6, the latest flagship #opensource #AI #llm with advanced agentic, reasoning & coding capabilities

    ⚡ Performance improvements over #GLM45 with competitive advantages against #DeepSeekV3 and #ClaudeSonnet4 across 8 public benchmarks covering agents, reasoning & coding

    🧵 👇

  7. How censored and dangerous is DeepSeek?

    How biased is artificial intelligence? It is customary to criticize neural networks for reproducing stereotypes of human thinking that were picked up from pretraining datasets. In practice, AI is far more careful than one might expect. A good example is generating photos of butterflies. As a rule, human designers are very fond of depicting butterflies as dead specimens. The reason is that entomologists follow strict visual standards: top-down view, wings spread to 180°, clean background, symmetry.

    habr.com/ru/articles/949540/

    #DeepSeek #DeepSeekR1 #DeepSeekV3 #КНР #Китай #большие_языковые_модели #БЯМ #искусственный_интеллект #предвзятость #цензура

  8. 🧩 #Llama4Maverick uses 128 experts for considerably more computing power and even beats #GPT4o and #Gemini20 in benchmarks, with only half the active parameters of #DeepSeekv3.

    🎓 Both AI models (#KIModelle) were trained with the help of the huge teacher model #Llama4 Behemoth, which, with 288 billion active parameters, ranks among the most powerful in the world.

    👉 eicker.TV #Technik #Medien #Politik #Wirtschaft (2/2)

  9. Study: AI (#KI) #Chatbots are useless at citing #News
    derstandard.at/story/300000026

    "The study examined #ChatGPT Search (#OpenAI), #Perplexity, Perplexity Pro (Perplexity AI), #Gemini 2.0 Flash (#Google), #DeepseekV3 Search (#Deepseek), #Grok-2 Search, Grok-3 Search Beta (#xAI), and #Copilot (#Microsoft and OpenAI)."

    "#Grok3 [...] gave wrong answers in no less than 96 percent of all cases." 🤣

    #Nachrichten #Algorithmen #Automatisierung

  10. »Chinese #AIlab #DeepSeek just released the latest version of their enormous #DeepSeekv3 model: The license is #MIT (that's new - previous DeepSeek v3 had a custom license).« simonwillison.net/2025/Mar/24/ #tech #media

  11. 🚀 DeepSeek V3 vs ChatGPT-4o: Which One Reigns Supreme?🤖

    AI is evolving fast! 🏎️ DeepSeek V3 and ChatGPT-4o are two of the most powerful LLMs in 2025. But which one is better?

    🔍 We compare:
    ✅ Accuracy & performance
    ✅ Multimodal capabilities
    ✅ Speed & efficiency
    ✅ Real-world applications

    📖 Read the full breakdown here:

    radargit.com/2025/02/03/deepse

    Which AI model do you prefer? Comment below! 👇

    #AI #DeepSeekV3 #ChatGPT4o #ArtificialIntelligence #Tech #MachineLearning #AICompari

  12. "A key component of the success is that it is #opensource. #DeepSeek-V3 is on GitHub with detailed docs on how it can be replicated. This has fueled a rush of people to try to make their own models." https://baixacultura.org/2025/01/29/a-corrida-da-ia-ganha-um-novo-capitulo-chines-e-open-source/

    A corrida da IA ganha um novo ...

  13. The Chinese firm said training the model cost just $5.6 million. Alibaba Cloud followed with a new generative AI model, while Microsoft alleges DeepSeek ‘distilled’ OpenAI’s work. #artificialintelligence #chatgpt #deepseek #deepseekr1 #deepseek-v3 #generativeai #Microsoft #nvidia #openai #reasoningmodels
    DeepSeek Chatbot Beats OpenAI on App Store Leaderboard
  14. Has anyone out there tried to run #deepseek V3 locally on a #linux machine? I'm curious whether it can run on a consumer #nvidia or #amd card.

    #deepseekv3