#localllm — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #localllm, aggregated by home.social.
-
<8B multilingual models for language learning chatbots
https://piefed.social/c/localllama/p/2061114/8b-multilingual-models-for-language-learning-chatbots
-
Musk vs Altman trial enters week three as AI appears uninvited in Ontario medical records, Threads mentions, and leaked phone numbers in chatbot replies. Also a Game Boy Color now runs a transformer model.
-
We benchmarked 8 vision LLMs on screenshot grounding for a browser agent.
Surprising finding: Qwen 3.5-9B correctly classifies a dropdown affordance that the 308B-parameter MiMo V2.5 misses. Affordance detection doesn't scale with parameter count.
Only 1 of the 8 models (Qwen 3.6-35B-A3B) shows honest uncertainty under calibration.
Detailed write-up + VRAM recommendations:
https://webbrain.one/blog
A ⭐ on GitHub would make our day 🙏
https://github.com/esokullu/webbrain
-
So today is vLLM setup day, as I want to run a few experiments with parallel inference.
Funnily enough, LLM inference doesn't take twice the time and energy if you batch 2 requests at the same time. So what I'm trying to do is have the same model come up with 2 or 3 different solutions for a function or test, so I can then choose the one that needs the least editing.
Nothing that isn't a year old already, but regardless I imagine it's super useful in a local-only setup.
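The sample-several-then-pick-one idea can be sketched without vLLM itself. Here the candidates are hardcoded stand-ins for what one batched request would return, and the "needs less editing" judgment is replaced by a quick automated check (parse + smoke test) — that ranking heuristic is my assumption, not the poster's workflow:

```python
import ast

def pick_candidate(candidates, smoke_test):
    """Rank n sampled completions: prefer ones that parse and pass a quick test."""
    def score(src):
        try:
            ast.parse(src)            # must at least be valid Python
        except SyntaxError:
            return 0
        ns = {}
        try:
            exec(src, ns)             # define the candidate function
            return 2 if smoke_test(ns) else 1
        except Exception:
            return 1
    return max(candidates, key=score)

# Three hypothetical samples of the same function from one batched request:
cands = [
    "def double(x): return x + x",
    "def double(x) return 2 * x",     # syntax error
    "def double(x): return x ** 2",   # wrong for x != 2
]
best = pick_candidate(cands, lambda ns: ns["double"](3) == 6)
```

With vLLM the n candidates would come from a single generate call with n > 1 in the sampling parameters, which is what makes the batched approach cheaper than n separate runs.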
-
Have pushed the 0.9.5-dev branch of foxing to Codeberg ( https://codeberg.org/aenertia/foxing/src/branch/0.9.5-dev ) in preparation for release tagging. A LOT of features and a couple of bug fixes now that the packet/file processing engine has stabilized; including Semantic Routing to Parsers for Metadata Extraction and in-path binary analysis using local ORT/BERT models; letting you get semantic search powers for free when you copy something with foxingd/fxcp #linux #filesystem #bert #vectordb #postgres #xfs #stratis #blake3 #localllm
-
v1 Of DoomSummarizer is out.
It's a crazy deep research / auto knowledgebase system. Point it at a directory of Word docs, PDFs, and Markdown and it'll index it all, then answer questions about the contents. Point it at a URL and it'll parse the content, index it, and tell you what it's about.
Crawl your company's knowledgebase? It'll automatically become a support AI. Want to know what your biggest invoice was, or when you sent that angry letter? All local, all private, all open source (Unlicense). Quick, too, as unlike most RAG systems it MINIMIZES token use.
#llm #ai #rag #search #localllm #ollama #onnx https://github.com/scottgal/lucidrag/releases/
-
Wax: a single-file, pure-Swift memory engine for on-device AI. No server, no DB. Data, embeddings, index, and WAL combined in one deterministic file. Hybrid search (lexical + vector + temporal), crash-safe, GPU support on Apple Silicon. Open source; a good fit for AI assistants and offline/private apps. #Wax #OnDeviceAI #Swift #RAG #AI #LocalLLM #TríTuệNhânTạo #AItrênThiếtBị #SwiftUI #MachineLearning
https://www.reddit.com/r/LocalLLaMA/comments/1qtdejw/i_built_a_swiftnative_singlefile_memory
-
📢 Looking for the best LLM for agentic Unity programming (12GB VRAM limit)! Seeking suggestions for models that support targeted code edits (SEARCH/REPLACE) rather than rewriting whole files. Setup: RTX 3060 12GB, Ryzen 5 600x, LM Studio + Zed/Aider. Already tried qwen3-53b, glm-4.7, mistral-nemo, etc., without great results. Which model is a good fit for Unity/C#? 🔍
#LocalLLM #UnityDev #AIProgramming #ViTinHoc #LLM #Mastodon #TechVietNam
https://www.reddit.com/r/LocalLLaMA/comments/1qqibnr/seeking_best_
-
Worked out the cost of running Moltbot on a personal machine: 128 GB DDR5 + RTX 4090 + board ≈ $3.2k, electricity ~$30/month → $3.56k in the first year. Cloud hosting is only $25/month ≈ $300/year, a saving of ~90%. If the bot only orchestrates APIs, cloud is the sensible choice; only consider investing in hardware when you need local inference. #Moltbot #LLM #AI #CloudComputing #ChiPhí #MáyTính #LocalLLM
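The post's first-year arithmetic can be checked in a few lines (all figures taken from the post; the break-even framing is mine):

```python
# Local rig vs. cloud hosting, first-year cost (USD, figures from the post).
hardware = 3200          # 128 GB DDR5 + RTX 4090 + board
power_per_month = 30     # electricity
cloud_per_month = 25     # hosted option

local_year1 = hardware + 12 * power_per_month   # 3200 + 360 = 3560
cloud_year1 = 12 * cloud_per_month              # 300
savings = 1 - cloud_year1 / local_year1         # ≈ 0.92, i.e. ~90% cheaper
```

Note the hardware cost dominates year one; in later years the comparison is $360/year of electricity vs. $300/year of cloud, which is why local only pays off if you actually need the inference capacity.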
-
On the performance of a dual RTX PRO 6000 AI workstation with 1.15TB RAM: comparing GPU-only (INT4) vs CPU+GPU (fp8) processing on the MiniMax-M2.1 model. Results: GPU-only is 2–4x faster at prefill but tops out at ~3 concurrent requests due to KV-cache limits; fp8 is slower but scales better to 10+ users, especially with long context. Queue time is the key bottleneck. A good fit for internal coding agents. #AIWorkstation #LLMBenchmark #MultiUserAI #GPUvsCPU #LocalLLM #HPC #MachineLearning #Tín
-
ollama pull mistral-nemo / learn how to use the tools #Ollama #MistralNemo #WSL2 #OpenAIWhisper #LocalLLM #Linux #PowerShell #PromptEngineering #TechLearning #SelfHosted #AI
-
For my old personal MacBook Air M1 (16GB) I found an LLM that fits and performs reasonably well for my simple use cases: Mistral's `ministral-3:8b`, which gives an average of 10 tokens per second. I use it in conjunction with the Raycast AI chat app #llms #ai #genai #localai #localllm #raycast #mistral
-
AGI is just around the corner!
I'm learning to use DSPy with GEPA (Genetic-Pareto) prompt optimization. In GEPA a larger "teacher" LLM adjusts the prompt for a smaller "student" LM to perform a specific task as well as possible. The teacher will try many different prompts and evaluate the outcome, in my case the quality of a metadata extraction task.
The larger model (GPT-OSS 120B) just added this to the prompt for the smaller model (Gemma 3 4B):
> Good luck! 🎯
😅
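The loop described above — teacher proposes prompt variants, each is scored on the task, the best survives — can be sketched in a few lines. This is a toy illustration, not DSPy's actual GEPA API: the candidate prompts and the scoring metric here are mocked.

```python
def optimize_prompt(candidates, metric):
    """Keep the candidate prompt with the highest task score."""
    return max(candidates, key=metric)

# Mock metric standing in for "quality of the metadata extraction":
# reward prompts that ask for structured output and mention the task.
def metric(prompt):
    score = 0
    if "JSON" in prompt:
        score += 2
    if "metadata" in prompt:
        score += 1
    return score

# Mock variants a teacher model might propose for the student:
candidates = [
    "Extract the metadata.",
    "Extract the metadata and reply in JSON.",
    "Good luck! 🎯",
]
best = optimize_prompt(candidates, metric)
```

In real GEPA the metric runs the student LM on held-out examples and the teacher mutates prompts genetically rather than picking from a fixed list, but the select-by-score core is the same.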
-
"With 15GB VRAM, Unsloth allows you to transform any model up to 15B parameters like Llama 3.1 (8B), Phi-4 (14B), Mistral (7B) or Qwen2.5 (7B) into a reasoning model"
Train your own R1 reasoning model with Unsloth