“thb” — Fediverse search results on home.social

Thomas @[email protected] · 2026-05-26 · 14:25 UTC

New week, more slides: Run LLMs Locally

Now including wllama to run GGUF models inside your browser!

wllama uses llama.cpp, WebAssembly and WebGPU, bringing a completely new experience of LLMs into the web.
It has no 4 GB limitation and is faster than Transformers.js.

I also added translations using the HY-MT model from Tencent.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #wllama #stablediffusion #qwen3 #glm #localai #gemma4 #webgpu #opencode #mtp #webassembly

Thomas @[email protected] · 2026-05-18 · 22:03 UTC

New week, new slides: Run LLMs Locally

Now including multi-token prediction using Qwen3.6 35B-A3B with Nextn quantization. Also speech recognition using Qwen-3-ASR is now working directly with Llama.cpp and included in the slides.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #qwen3 #glm #localai #gemma4 #webgpu #opencode #mtp

Thomas @[email protected] · 2026-05-11 · 09:15 UTC

New week, small update: Run LLMs Locally

Now with a new setup for OpenCode with Qwen 3.6 and Gemma 4, including permissions and thinking variants.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #qwen3 #glm #localai #gemma4 #webgpu #opencode

Holger Schurig DH3HS @[email protected] · 2026-05-09 · 09:44 UTC

@thbley Ah, I almost never ask an LLM knowledge questions. For those I tend to use search engines.

With search engines I can often tell from the URL alone how reliable an answer is. With an LLM I see no source, so I have no idea whether the answer was hallucinated.

I have only just downloaded gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL, i.e. the MoE model. On my machine it took 4 seconds to answer. That was the first question after startup.

One second for the prompt, 3 for the answer. The reasoning was in English, the answer then in German. The reasoning was also garnished with George Sand and information about illnesses.

I use LLMs almost exclusively for coding. And Gemma4 was the first model that was reasonably usable for #Bazel #Starlark. Earlier models are total junk for that.

Thomas @[email protected] · 2026-05-05 · 00:02 UTC

New week, more slides: Run LLMs Locally

Now with LFM 2 and new slides for using Transformers.js with WebGPU for Privacy Filter, Function Calling and Embeddings, running completely in your browser.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4 #nemotron #webgpu

Thomas @[email protected] · 2026-04-29 · 05:05 UTC

New week, new slides: Run LLMs Locally

Now including Nemotron 3 Nano Omni from Nvidia, Llama.cpp built-in tools and new slides about using Transformers.js with WebGPU for Image Recognition and OCR.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4 #nemotron #webgpu

Máy đo khoảng cách THB Việt Na @[email protected] · 2026-04-23 · 20:56 UTC

THB Việt Nam has compiled the top 12 best-selling outdoor laser distance meters. These devices feature modern green laser technology and a digital camera for highly accurate measurement both indoors and outdoors, giving optimal support for any construction project. Contact us now! Hotline: 0916.610.499. Store: 579 Phạm Văn Đồng, P. Xuân Đỉnh, Hà Nội. #MayDoKhoangCach #ThietBiXayDung #DoLaserChinhXac https://thbvietnam.com/tin-tuc/top-3-may-do-khoang-cach-laser-200m-dang-mua-nhat-hien-nay-2530.html

Thomas @[email protected] · 2026-04-14 · 09:46 UTC

New week, new update for the slides of my talk "Run LLMs Locally":

Now including Gemma4 and Qwen3-Omni with Vision and Audio support and new slides describing Llama.cpp server parameters.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4