home.social

#localllm — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #localllm, aggregated by home.social.

  1. Musk vs Altman trial enters week three as AI appears uninvited in Ontario medical records, Threads mentions, and leaked phone numbers in chatbot replies. Also, a Game Boy Color now runs a transformer model.

    ai0.news/posts/2026-05-14-dail

    #AI #OpenAI #LocalLLM #OpenSource

  2. We benchmarked 8 vision LLMs on screenshot grounding for a browser agent.

    Surprising finding: Qwen 3.5-9B correctly classifies a dropdown affordance that the 308B-parameter MiMo V2.5 misses. Affordance detection doesn't scale with parameter count.

    Only 1 of the 8 models (Qwen 3.6-35B-A3B) shows honest uncertainty in its calibration.

    Detailed write-up + VRAM recommendations:
    webbrain.one/blog

    We'd really appreciate a ⭐ on GitHub 🙏
    github.com/esokullu/webbrain

    #LocalLLM #VLM #AIAgents #Qwen #AI #yapayzeka

  3. So today is vLLM setup day, as I want to run a few experiments with parallel inference.

    Funnily enough, LLM inference doesn't take twice the time and energy if you batch 2 requests at the same time. So what I'm trying to do is have the same model come up with 2 or 3 different solutions for a function or test, so I can then choose the one that needs the least editing.

    Nothing here that isn't a year old already, but regardless I imagine it's super useful in a local-only setup (a quick sketch of the idea follows below).

    #ai #llm #localLlm #vllm
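
    A minimal sketch of that batching idea with vLLM's offline Python API; the model name and prompt are placeholders, not the poster's setup. `SamplingParams(n=3)` requests three candidate completions for one prompt in a single batched pass:

    ```python
    # Sample 3 candidate solutions for one prompt in a single batched pass.
    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model choice

    # n=3 yields three completions per prompt; they share the same prefill,
    # so wall-clock time and energy stay well under 3x a single request.
    params = SamplingParams(n=3, temperature=0.8, max_tokens=512)

    prompt = "Write a Python function that parses ISO 8601 dates."
    outputs = llm.generate([prompt], params)

    for i, candidate in enumerate(outputs[0].outputs):
        print(f"--- candidate {i} ---")
        print(candidate.text)
    ```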

  4. Have pushed the 0.9.5-dev branch of foxing to Codeberg (codeberg.org/aenertia/foxing/s) in preparation for release tagging. A LOT of features and a couple of bug fixes now that the packet/file processing engine has stabilized, including Semantic Routing to Parsers for Metadata Extraction and in-path binary analysis using local ORT/BERT models, letting you get semantic-search powers for free when you copy something with foxingd/fxcp. #linux #filesystem #bert #vectordb #postgres #xfs #stratis #blake3 #localllm

  5. v1 of DoomSummarizer is out.
    It's a crazy deep-research / auto-knowledgebase system. Point it at a directory of Word docs, PDFs, and Markdown and it'll index it all, then answer questions about the contents. Point it at a URL and it'll parse the content, index it, and tell you what it's about.
    Crawl your company's knowledgebase? It'll automatically become a support AI.

    Want to know what your biggest invoice was, when you sent that angry letter, etc.? All local, all private, all open source (Unlicense). Quick, too, as unlike most RAG systems it MINIMIZES token use.

    #llm #ai #rag #search #localllm #ollama #onnx github.com/scottgal/lucidrag/r

  6. Wax: a single-file, pure-Swift memory engine for on-device AI – no server, no DB. It packs data, embeddings, index, and WAL into one deterministic file. Hybrid search (lexical + vector + temporal), crash-safe, GPU support on Apple Silicon. Open source, a good fit for AI assistants and offline/privacy-focused apps. #Wax #OnDeviceAI #Swift #RAG #AI #LocalLLM #TríTuệNhânTạo #AItrênThiếtBị #SwiftUI #MachineLearning

    reddit.com/r/LocalLLaMA/commen

  7. 📢 Looking for the best LLM for "agentic" Unity programming (limited to 12GB VRAM)! I need suggestions for models that handle fine-grained code edits (SEARCH/REPLACE) rather than rewriting whole files. Setup: RTX 3060 12GB, Ryzen 5 600x, LM Studio + Zed/Aider. Tried qwen3-53b, glm-4.7, mistral-nemo, etc., but results haven't been great. Which model suits Unity/C#? 🔍

    #LocalLLM #UnityDev #AIProgramming #ViTinHoc #LLM #Mastodon #TechVietNam

    reddit.com/r/LocalLLaMA/commen

  8. Worked out the cost of running Moltbot on a personal machine: 128 GB DDR5 + RTX 4090 + motherboard ≈ $3.2k, power ~$30/month → $3.56k for the first year. Cloud hosting is only $25/month ≈ $300/year, a saving of ~90% (sanity-checked below). If the bot only orchestrates APIs, cloud is the more sensible choice; only consider investing in hardware when you actually need in-house inference. #Moltbot #LLM #AI #CloudComputing #ChiPhí #MáyTính #LocalLLM

    reddit.com/r/LocalLLaMA/commen
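
    The post's arithmetic is easy to verify; a quick check in Python using the figures quoted above:

    ```python
    # First-year cost comparison, using the figures quoted in the post above.
    hardware = 3200               # 128 GB DDR5 + RTX 4090 + board, USD
    power_monthly = 30            # estimated electricity, USD/month
    local_year_one = hardware + 12 * power_monthly   # 3200 + 360 = 3560

    cloud_monthly = 25
    cloud_year_one = 12 * cloud_monthly              # 300

    savings = 1 - cloud_year_one / local_year_one    # ≈ 0.916
    print(f"local: ${local_year_one}  cloud: ${cloud_year_one}  "
          f"cloud saves {savings:.0%} in year one")
    ```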

  9. On the performance of a dual RTX PRO 6000 AI workstation with 1.15TB RAM: comparing GPU-only (INT4) vs CPU+GPU (fp8) serving of the MiniMax-M2.1 model. Results: GPU-only is 2–4x faster at prefill but handles at most ~3 concurrent requests due to KV-cache limits; fp8 is slower but scales better to 10+ users, especially with long contexts. Queue time is the critical bottleneck. A good fit for in-house coding agents. #AIWorkstation #LLMBenchmark #MultiUserAI #GPUvsCPU #LocalLLM #HPC #MachineLearning #Tín

  10. For my old personal MacBook Air M1 (16 GB) I found an LLM that fits and performs reasonably well for my simple use cases. It's Mistral's `ministral-3:8b`, which gives an average of 10 tokens per second. I use it in conjunction with the Raycast AI chat app. #llms #ai #genai #localai #localllm #raycast #mistral

  11. Is there any way to get a useful and fast local LLM for agentic coding on 8GB VRAM (RTX 3060 Ti)?

    I tried #gemma3 4b, #deepseekr1 7b, #phi4mini and #qwen3 4b using #Ollama with #Cline but got poor results

    #localllm #agenticai

  12. AGI is just around the corner!

    I'm learning to use DSPy with GEPA (Genetic-Pareto) prompt optimization. In GEPA, a larger "teacher" LLM adjusts the prompt of a smaller "student" LM so that it performs a specific task as well as possible. The teacher tries many different prompts and evaluates the outcome, in my case the quality of a metadata extraction task (a rough sketch of this wiring follows the post).

    The larger model (GPT-OSS 120B) just added this to the prompt for the smaller model (Gemma 3 4B):

    > Good luck! 🎯

    😅

    #LLM #LocalLLM #DSPy #GEPA
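
    For readers unfamiliar with the setup, a rough sketch of the GEPA teacher/student wiring in DSPy. The model identifiers, the `document -> metadata` signature, the toy dataset, and the exact-match metric are all illustrative assumptions, not the poster's actual configuration:

    ```python
    # Rough GEPA sketch: a large reflection ("teacher") LM rewrites the prompt
    # of a small ("student") LM. Models, signature, and data are placeholders.
    import dspy

    student = dspy.LM("ollama_chat/gemma3:4b")      # small local student
    teacher = dspy.LM("ollama_chat/gpt-oss:120b")   # large teacher for reflection
    dspy.configure(lm=student)

    program = dspy.Predict("document -> metadata")  # hypothetical extraction task

    trainset = [
        dspy.Example(document="Invoice #42, due 2026-06-01",
                     metadata="invoice").with_inputs("document"),
    ]

    def metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
        # Task-specific quality score in [0, 1]; exact match is just a stand-in.
        return float(gold.metadata == pred.metadata)

    optimizer = dspy.GEPA(metric=metric, reflection_lm=teacher, auto="light")
    optimized = optimizer.compile(program, trainset=trainset, valset=trainset)
    ```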

  21. "With 15GB VRAM, Unsloth allows you to transform any model up to 15B parameters like Llama 3.1 (8B), Phi-4 (14B), Mistral (7B) or Qwen2.5 (7B) into a reasoning model"

    Train your own R1 reasoning model with Unsloth

    unsloth.ai/blog/r1-reasoning

    #LocalLLM #LLM #reasoning #unsloth #GRPO
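
    The linked blog pairs Unsloth's 4-bit loading with TRL's GRPOTrainer; a condensed sketch of that recipe, where the model name, toy prompts, and reward function are illustrative stand-ins for the blog's full setup:

    ```python
    # Condensed GRPO recipe in the spirit of the Unsloth blog: load a model in
    # 4-bit, attach LoRA adapters, then train against a reward function.
    from unsloth import FastLanguageModel
    from trl import GRPOConfig, GRPOTrainer
    from datasets import Dataset

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Meta-Llama-3.1-8B-Instruct",  # placeholder choice
        max_seq_length=1024,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model, r=16, lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Toy prompts; the blog trains on real reasoning data.
    dataset = Dataset.from_dict({"prompt": ["Solve step by step: 12 * 7 = ?"] * 8})

    def format_reward(completions, **kwargs):
        # Toy reward: favour completions that show explicit reasoning.
        # A real setup scores answer correctness and output format too.
        return [1.0 if "step" in c.lower() else 0.0 for c in completions]

    trainer = GRPOTrainer(
        model=model,
        processing_class=tokenizer,
        reward_funcs=[format_reward],
        args=GRPOConfig(output_dir="grpo-out", per_device_train_batch_size=4,
                        num_generations=4, max_completion_length=256),
        train_dataset=dataset,
    )
    trainer.train()
    ```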