#koboldcpp — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #koboldcpp, aggregated by home.social.
-
<8B multilingual models for language learning chatbots
https://piefed.social/c/localllama/p/2061114/8b-multilingual-models-for-language-learning-chatbots
-
We're finally able to add music generation to #KoboldCpp! And since it reuses all the large backend code from ggml, adding it won't bloat us. Don't like music and just want your LLM backend to stay unbloated? No problem: don't load a music model and it won't impact you.
But if you like music gen, what we are integrating can be very fast and produce really fun results. I've been having a blast with ace-step 1.5 since it was released, and soon you can enjoy it inside KoboldCpp too.
And of course, for those who don't yet know, it's an #ai program that lets you run #llm models locally alongside #imagegen and more! But with a portable binary that is small for what it can do. -
Recording some #LLM performance tests run with #koboldcpp
Llama 3 8B model, IQ4_XS quantization
Flags: NoAVX2=False Threads=7 HighPriority=False Cublas_Args=None Tensor_Split=None BlasThreads=7 BlasBatchSize=512 FlashAttention=True KvCache=2
Timestamp: 2025-05-26 07:36:07.196685+00:00
Backend: koboldcpp_vulkan.so
Layers: 49
Model: L3-8B-Stheno-v3.2-NEO-V1-D_AU-IQ4_XS-imat13
MaxCtx: 8192
GenAmount: 100
-----
ProcessingTime: 55.082s
ProcessingSpeed: 146.91T/s
GenerationTime: 8.004s
GenerationSpeed: 12.49T/s
TotalTime: 63.086s
Next is Mistral Nemo Small 12B, also with IQ4_XS quantization
Flags: NoAVX2=False Threads=7 HighPriority=False Cublas_Args=None Tensor_Split=None BlasThreads=7 BlasBatchSize=512 FlashAttention=True KvCache=2
Timestamp: 2025-05-26 07:43:55.416620+00:00
Backend: koboldcpp_vulkan.so
Layers: 49
Model: MN-GRAND-Gutenburg-Lyra4-Lyra-12B-DARKNESS-D_AU-IQ4_XS
MaxCtx: 8192
GenAmount: 100
-----
ProcessingTime: 80.623s
ProcessingSpeed: 100.37T/s
GenerationTime: 11.601s
GenerationSpeed: 8.62T/s
TotalTime: 92.224s
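The speeds in both logs look consistent with the benchmark processing MaxCtx minus GenAmount prompt tokens and then generating GenAmount tokens. A minimal Python sketch of that arithmetic follows; the token split is an assumption inferred from the posted numbers, not taken from KoboldCpp documentation, and the function name is hypothetical.

```python
def benchmark_speeds(max_ctx, gen_amount, processing_time_s, generation_time_s):
    """Recompute tokens-per-second figures from a benchmark log.

    Assumption: the prompt is (max_ctx - gen_amount) tokens long. This is
    inferred from the posted numbers, not from any documentation.
    """
    prompt_tokens = max_ctx - gen_amount
    return {
        "processing_speed": round(prompt_tokens / processing_time_s, 2),
        "generation_speed": round(gen_amount / generation_time_s, 2),
        "total_time": round(processing_time_s + generation_time_s, 3),
    }


# Values copied from the two logs above.
llama3_8b = benchmark_speeds(8192, 100, 55.082, 8.004)
nemo_12b = benchmark_speeds(8192, 100, 80.623, 11.601)
print(llama3_8b)
print(nemo_12b)
```

Under that assumption both runs reproduce the posted figures: 146.91 and 12.49 T/s for the 8B model, 100.37 and 8.62 T/s for the 12B model.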
Hardware specs
GMKtec K8 Plus
Ryzen 7 7840HS w/ Radeon 780M, 8 GB VRAM
64GB DDR5-5600 Dual channel
OS: Proxmox VE 8.2 -
I want to do a good write-up in my README for PixelPolygot as one of my last touches, but I need the damn #rocm fork of #KoboldCpp to update so I can do some more testing with Qwen2.5-VL locally. It works with Vulkan on the main branch, but way slower than with ROCm. :notlikeblob: -
Tried a roleplay with an LLM, which more or less turned into a story. I actually quite like it. It could have more action, though.
Not sure if chat mode was the right choice.
Also, I still have no idea how to do multi-character chats. -
Yeah, this is what I mean by "the model gets confused".
Apparently KoboldAI/cpp can't distinguish the stop words anymore. The initial prompt was:
User: This is a roleplay between two animal characters, which I will define later.
User: Please generate and describe an animal character, which you will play. Choose a mammal. Please prefix your messages by a newline and your characters name, like so: '
Bob: waves "Hi, how are you?"'. -
And apparently, you can't have two characters as "you"...
-
Hm, I'm unsure whether I should instruct the model to prefix its replies with its character's name.
I have no idea how much "automagic" the chat mode has built in.
I actually wanna get away from only seeing "KoboldAI" and "user". Also, changing "my name" in the chat mode settings seems to confuse the model?
-
#koboldcpp has a nice interface.
I kind of wish they had a better explanation of the difference between "chat" and "KoboldGPT chat". Also, I'm simply overwhelmed by the number of models. Each of them needs to be treated / prompted differently, it seems.
And then I still occasionally have problems with the model generating my character's actions / speech, and I don't know whether that's due to my insufficient prompting or to the "end token" (e.g. <|im_end|>) not being configured correctly, which again seems to differ between models.
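One way to keep a reply from running on into the user's turn is to cut it at a stop sequence, such as the model's end token or the next speaker prefix. Below is a minimal Python sketch of that idea done client-side. The function name, the speaker names, and the example strings are hypothetical, and the correct end token depends on the model.

```python
def truncate_at_stop(text, stop_sequences):
    """Return text cut at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# Hypothetical raw model output that continues into the user's turn.
raw = 'Bob: waves "Hi, how are you?"<|im_end|>\nUser: I wave back.'
stops = ["<|im_end|>", "\nUser:"]
reply = truncate_at_stop(raw, stops)
print(reply)
```

Backends typically also accept stop sequences so that generation halts server-side; the exact setting or request field should be checked in the KoboldCpp documentation rather than assumed from this sketch.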
-
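Running #koboldcpp in foreground mode with the AI Horde job dispatcher open, it is quite noticeable that there are more users in the morning than in the afternoon. Also, the ratio of RP to ERP is roughly 1:4.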
#LLM -
How to run Mixtral on your own computer
Whenever a good new AI model comes out, Habr fills up with questions like "How do we try it?" and with wrong answers claiming you have to pay for some service or own hardware worth a hundred million. So once again I am writing a guide on how to run the newest mixtral-8x7 on ordinary mid-range computers.
https://habr.com/ru/articles/781702/
#LLM #Mixtral #KoboldCPP #GGUF #18+