home.social

#localllm — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #localllm, aggregated by home.social.

  1. Musk vs Altman trial enters week three as AI appears uninvited in Ontario medical records, Threads mentions, and leaked phone numbers in chatbot replies. Also, a Game Boy Color now runs a transformer model.

    ai0.news/posts/2026-05-14-dail

    #AI #OpenAI #LocalLLM #OpenSource

  2. We benchmarked 8 vision LLMs on screenshot grounding for a browser agent.

    Surprising finding: Qwen 3.5-9B correctly classifies a dropdown affordance that the 308B-parameter MiMo V2.5 misses. Affordance detection doesn't scale with parameter count.

    Only 1 of the 8 models (Qwen 3.6-35B-A3B) shows honest uncertainty in its calibration.

    Detailed write-up + VRAM recommendations:
    webbrain.one/blog

    We'd really appreciate a ⭐ on GitHub 🙏
    github.com/esokullu/webbrain

    #LocalLLM #VLM #AIAgents #Qwen #AI #yapayzeka

  3. So today is vLLM setup day, as I want to run a few experiments with parallel inference.

    Funnily enough, LLM inference doesn't take twice the time and energy if you batch 2 requests at the same time. So what I'm trying to do is have the same model come up with 2 or 3 different solutions for a function or test, so I can then choose the one that needs the least editing.

    Nothing here that isn't a year old already, but regardless I imagine it's super useful in a local-only setup (a quick sketch of the idea follows below).

    #ai #llm #localLlm #vllm
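
    A minimal sketch of that batching idea with vLLM's offline Python API; the model name and prompt are placeholders, not the poster's setup. `SamplingParams(n=3)` requests three candidate completions for one prompt in a single batched pass:

    ```python
    # Sample 3 candidate solutions for one prompt in a single batched pass.
    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model choice

    # n=3 yields three completions per prompt; they share the same prefill,
    # so wall-clock time and energy stay well under 3x a single request.
    params = SamplingParams(n=3, temperature=0.8, max_tokens=512)

    prompt = "Write a Python function that parses ISO 8601 dates."
    outputs = llm.generate([prompt], params)

    for i, candidate in enumerate(outputs[0].outputs):
        print(f"--- candidate {i} ---")
        print(candidate.text)
    ```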

  4. Have pushed the 0.9.5-dev branch of foxing to Codeberg (codeberg.org/aenertia/foxing/s) in preparation for release tagging. A LOT of features and a couple of bug fixes now that the packet/file processing engine has stabilized, including Semantic Routing to Parsers for Metadata Extraction and in-path binary analysis using local ORT/BERT models, letting you get semantic-search powers for free when you copy something with foxingd/fxcp. #linux #filesystem #bert #vectordb #postgres #xfs #stratis #blake3 #localllm

  5. v1 of DoomSummarizer is out.
    It's a crazy deep-research / auto-knowledgebase system. Point it at a directory of Word docs, PDFs, and Markdown and it'll index it all, then answer questions about the contents. Point it at a URL and it'll parse the content, index it, and tell you what it's about.
    Crawl your company's knowledgebase? It'll automatically become a support AI.

    Want to know what your biggest invoice was, when you sent that angry letter, etc.? All local, all private, all open source (Unlicense). Quick, too, as unlike most RAG systems it MINIMIZES token use.

    #llm #ai #rag #search #localllm #ollama #onnx github.com/scottgal/lucidrag/r

  6. Wax: a single-file, pure-Swift memory engine for on-device AI – no server, no DB. It packs data, embeddings, index, and WAL into one deterministic file. Hybrid search (lexical + vector + temporal), crash-safe, GPU support on Apple Silicon. Open source, a good fit for AI assistants and offline/privacy-focused apps. #Wax #OnDeviceAI #Swift #RAG #AI #LocalLLM #TríTuệNhânTạo #AItrênThiếtBị #SwiftUI #MachineLearning

    reddit.com/r/LocalLLaMA/commen

  7. 📢 Looking for the best LLM for "agentic" Unity programming (limited to 12GB VRAM)! I need suggestions for models that handle fine-grained code edits (SEARCH/REPLACE) rather than rewriting whole files. Setup: RTX 3060 12GB, Ryzen 5 600x, LM Studio + Zed/Aider. Tried qwen3-53b, glm-4.7, mistral-nemo, etc., but results haven't been great. Which model suits Unity/C#? 🔍

    #LocalLLM #UnityDev #AIProgramming #ViTinHoc #LLM #Mastodon #TechVietNam

    reddit.com/r/LocalLLaMA/commen

  8. Worked out the cost of running Moltbot on a personal machine: 128 GB DDR5 + RTX 4090 + motherboard ≈ $3.2k, power ~$30/month → $3.56k for the first year. Cloud hosting is only $25/month ≈ $300/year, a saving of ~90% (sanity-checked below). If the bot only orchestrates APIs, cloud is the more sensible choice; only consider investing in hardware when you actually need in-house inference. #Moltbot #LLM #AI #CloudComputing #ChiPhí #MáyTính #LocalLLM

    reddit.com/r/LocalLLaMA/commen
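
    The post's arithmetic is easy to verify; a quick check in Python using the figures quoted above:

    ```python
    # First-year cost comparison, using the figures quoted in the post above.
    hardware = 3200               # 128 GB DDR5 + RTX 4090 + board, USD
    power_monthly = 30            # estimated electricity, USD/month
    local_year_one = hardware + 12 * power_monthly   # 3200 + 360 = 3560

    cloud_monthly = 25
    cloud_year_one = 12 * cloud_monthly              # 300

    savings = 1 - cloud_year_one / local_year_one    # ≈ 0.916
    print(f"local: ${local_year_one}  cloud: ${cloud_year_one}  "
          f"cloud saves {savings:.0%} in year one")
    ```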

  9. On the performance of a dual RTX PRO 6000 AI workstation with 1.15TB RAM: comparing GPU-only (INT4) vs CPU+GPU (fp8) serving of the MiniMax-M2.1 model. Results: GPU-only is 2–4x faster at prefill but handles at most ~3 concurrent requests due to KV-cache limits; fp8 is slower but scales better to 10+ users, especially with long contexts. Queue time is the critical bottleneck. A good fit for in-house coding agents. #AIWorkstation #LLMBenchmark #MultiUserAI #GPUvsCPU #LocalLLM #HPC #MachineLearning #Tín

  10. For my old personal MacBook Air M1 (16 GB) I found an LLM that fits and performs reasonably well for my simple use cases. It's Mistral's `ministral-3:8b`, which gives an average of 10 tokens per second. I use it in conjunction with the Raycast AI chat app. #llms #ai #genai #localai #localllm #raycast #mistral

  11. Is there any way to get a useful and fast local LLM for agentic coding on 8GB VRAM (RTX 3060 Ti)?

    I tried #gemma3 4b, #deepseekr1 7b, #phi4mini and #qwen3 4b using #Ollama with #Cline but got poor results

    #localllm #agenticai

  12. AGI is just around the corner!

    I'm learning to use DSPy with GEPA (Genetic-Pareto) prompt optimization. In GEPA, a larger "teacher" LLM adjusts the prompt of a smaller "student" LM so that it performs a specific task as well as possible. The teacher tries many different prompts and evaluates the outcome, in my case the quality of a metadata extraction task (a rough sketch of this wiring follows the post).

    The larger model (GPT-OSS 120B) just added this to the prompt for the smaller model (Gemma 3 4B):

    > Good luck! 🎯

    😅

    #LLM #LocalLLM #DSPy #GEPA
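
    For readers unfamiliar with the setup, a rough sketch of the GEPA teacher/student wiring in DSPy. The model identifiers, the `document -> metadata` signature, the toy dataset, and the exact-match metric are all illustrative assumptions, not the poster's actual configuration:

    ```python
    # Rough GEPA sketch: a large reflection ("teacher") LM rewrites the prompt
    # of a small ("student") LM. Models, signature, and data are placeholders.
    import dspy

    student = dspy.LM("ollama_chat/gemma3:4b")      # small local student
    teacher = dspy.LM("ollama_chat/gpt-oss:120b")   # large teacher for reflection
    dspy.configure(lm=student)

    program = dspy.Predict("document -> metadata")  # hypothetical extraction task

    trainset = [
        dspy.Example(document="Invoice #42, due 2026-06-01",
                     metadata="invoice").with_inputs("document"),
    ]

    def metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
        # Task-specific quality score in [0, 1]; exact match is just a stand-in.
        return float(gold.metadata == pred.metadata)

    optimizer = dspy.GEPA(metric=metric, reflection_lm=teacher, auto="light")
    optimized = optimizer.compile(program, trainset=trainset, valset=trainset)
    ```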

  21. "With 15GB VRAM, Unsloth allows you to transform any model up to 15B parameters like Llama 3.1 (8B), Phi-4 (14B), Mistral (7B) or Qwen2.5 (7B) into a reasoning model"

    Train your own R1 reasoning model with Unsloth

    unsloth.ai/blog/r1-reasoning

    #LocalLLM #LLM #reasoning #unsloth #GRPO
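
    The linked blog pairs Unsloth's 4-bit loading with TRL's GRPOTrainer; a condensed sketch of that recipe, where the model name, toy prompts, and reward function are illustrative stand-ins for the blog's full setup:

    ```python
    # Condensed GRPO recipe in the spirit of the Unsloth blog: load a model in
    # 4-bit, attach LoRA adapters, then train against a reward function.
    from unsloth import FastLanguageModel
    from trl import GRPOConfig, GRPOTrainer
    from datasets import Dataset

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Meta-Llama-3.1-8B-Instruct",  # placeholder choice
        max_seq_length=1024,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model, r=16, lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Toy prompts; the blog trains on real reasoning data.
    dataset = Dataset.from_dict({"prompt": ["Solve step by step: 12 * 7 = ?"] * 8})

    def format_reward(completions, **kwargs):
        # Toy reward: favour completions that show explicit reasoning.
        # A real setup scores answer correctness and output format too.
        return [1.0 if "step" in c.lower() else 0.0 for c in completions]

    trainer = GRPOTrainer(
        model=model,
        processing_class=tokenizer,
        reward_funcs=[format_reward],
        args=GRPOConfig(output_dir="grpo-out", per_device_train_batch_size=4,
                        num_generations=4, max_completion_length=256),
        train_dataset=dataset,
    )
    trainer.train()
    ```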