#modelcompression — Public Fediverse posts on home.social

UKP Lab @[email protected] · 2026-03-27 · 09:38 UTC

Authors: Federico Marcuzzi (INSAIT - Institute for Computer Science, Artificial Intelligence and Technology), Xuefei Ning (Tsinghua University), Roy Schwartz (The Hebrew University of Jerusalem), and Iryna Gurevych (UKP Lab, Technische Universität Darmstadt and ATHENE Center).

See you at #EACL2026 in Rabat 🕌!

#UKPLab #NLProc #ResponsibleAI #Quantization #MLSafety #Fairness #TrustworthyAI #ModelCompression #LLMSafety #EthicalAI #NLP #AIResearch

#eacl2026 #ukplab #nlproc #responsibleai #quantization #mlsafety

AI Daily Post @[email protected] · 2026-03-06 · 21:43 UTC

New research shows KV‑cache compaction can slash LLM memory usage by up to 50× while preserving quality. With chunked processing and attention‑matching tricks, models like Llama 3.1 and Qwen‑3 handle far longer contexts—great news for open‑source and enterprise workloads. Dive into the benchmarks! #KVCaching #LLMMemory #LongContexts #ModelCompression

🔗 https://aidailypost.com/news/kv-cache-compaction-cuts-llm-memory-50-chunked-processing-long

#kvcaching #llmmemory #longcontexts #modelcompression
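
To put the "up to 50×" figure in context, here is a rough back-of-the-envelope sketch of how KV-cache size grows with context length. The model shape (32 layers, 8 KV heads, head dimension 128, 16-bit values) and the 131,072-token context are assumptions chosen for illustration, not numbers from the linked article, and the 50× ratio is applied naively as a uniform upper bound.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Size of an uncompressed KV cache for one sequence.

    The leading factor 2 accounts for storing both keys and values per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


# Assumed (hypothetical) model shape and context length, for illustration only.
full = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=131072)

# Naive application of the headline "up to 50x" ratio from the post.
compacted = full / 50

print(f"full cache:      {full / 2**30:.2f} GiB")       # full cache:      16.00 GiB
print(f"compacted cache: {compacted / 2**30:.2f} GiB")  # compacted cache: 0.32 GiB
```

The point of the arithmetic is that cache size is linear in sequence length, so any fixed compaction ratio translates directly into proportionally longer contexts for the same memory budget.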

Reddit Tech VN Bot @[email protected] · 2026-01-05 · 02:17 UTC

Sparse compresses fine-tuned models and datasets into deltas against the original base. It shrinks 14GB down to 1.4GB (lossless) or 50MB (LoRA-equivalent), with restoration in 4 seconds. It is applied after training and works with any already-trained model. Useful for medical, financial, and legal AI. #AI #MachineLearning #FineTuning #ModelCompression #Sparse #TríTuệNhânTạo #HọcMáy #NénMôHình

https://www.reddit.com/r/LocalLLaMA/comments/1q47kyt/delta_compression_for_finetuned_models_and/

#ai #machinelearning #finetuning #modelcompression #sparse #trituệnhantạo
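
The general idea of storing a fine-tuned model as a delta against its base can be sketched as follows. This is a minimal illustration of lossless delta compression using XOR plus zlib, assuming most weights are unchanged by fine-tuning; it is not the method the Sparse tool actually uses, and the function names `make_delta` and `apply_delta` are hypothetical.

```python
import random
import struct
import zlib


def pack(weights):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(weights)}f", *weights)


def make_delta(base_bytes, tuned_bytes):
    """XOR the two byte strings and compress the (mostly zero) result."""
    if len(base_bytes) != len(tuned_bytes):
        raise ValueError("base and fine-tuned weights must have the same size")
    xored = bytes(a ^ b for a, b in zip(base_bytes, tuned_bytes))
    return zlib.compress(xored, 9)


def apply_delta(base_bytes, delta):
    """Rebuild the fine-tuned bytes exactly from the base and the delta."""
    xored = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(base_bytes, xored))


# Toy data: a "base model" and a "fine-tune" that changes 1% of its weights.
rng = random.Random(0)
base = [rng.gauss(0, 1) for _ in range(10000)]
tuned = list(base)
for i in rng.sample(range(len(tuned)), 100):
    tuned[i] += 0.01

base_b, tuned_b = pack(base), pack(tuned)
delta = make_delta(base_b, tuned_b)
restored = apply_delta(base_b, delta)

print("lossless:", restored == tuned_b)
print("full size:", len(tuned_b), "bytes, delta size:", len(delta), "bytes")
```

How well this works in practice depends on how many weights the fine-tune actually touches; a full fine-tune that perturbs every weight would leave far fewer zero bytes to compress than this toy case.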

Reddit Tech VN Bot @[email protected] · 2025-11-17 · 20:18 UTC

Comparing GLM-4.6 IQ2_M and GLM-4.6-REAP-268B Q2_K_XL: two different compression approaches, one that lowers quality across the whole model and one that removes certain structures. #GLM #AI #MachineLearning #MôHìnhNén #TríTuệNhânTạo #HọcMáy #NénMôHình #PhươngPhápNén #English: #GLM #AI #MachineLearning #ModelCompression #ArtificialIntelligence

https://www.reddit.com/r/LocalLLaMA/comments/1ozq14d/comparing_unsloths_glm46_iq2_m_vs_glm46reap268b/

#glm #ai #machinelearning #mohinhnen #trituệnhantạo #họcmay
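
The contrast drawn in the post, lowering precision everywhere versus removing whole structures, can be illustrated with a toy example. This is only a sketch under simple assumptions (uniform 2-bit rounding per group, and pruning groups by L1 norm); it is not how either GLM-4.6 release was produced, and the helper names are hypothetical.

```python
def quantize(values, bits):
    """Round every value to one of 2**bits evenly spaced levels."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (2 ** bits - 1)
    if step == 0:  # all values identical, nothing to round
        return list(values)
    return [lo + round((v - lo) / step) * step for v in values]


def prune_groups(groups, keep):
    """Keep the `keep` groups with the largest L1 norm, drop the rest whole."""
    ranked = sorted(
        range(len(groups)),
        key=lambda i: sum(abs(v) for v in groups[i]),
        reverse=True,
    )
    return {i: groups[i] for i in sorted(ranked[:keep])}


# Three toy weight groups (think of each as one structural unit).
groups = [[0.9, -0.7, 0.8], [0.05, -0.02, 0.01], [-0.6, 0.4, 0.5]]

# Approach 1: every group survives, but each weight carries rounding error.
quantized = [quantize(g, bits=2) for g in groups]

# Approach 2: surviving groups are exact, but one group is gone entirely.
pruned = prune_groups(groups, keep=2)

print("quantized:", quantized)
print("pruned keeps groups:", list(pruned))  # pruned keeps groups: [0, 2]
```

The two approaches fail differently: quantization spreads a small error over everything, while pruning concentrates the loss in whatever the removed structures used to handle.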

marco @[email protected] · 2022-11-06 · 08:39 UTC

Here is what I've been reading this week (btw, if the authors are on Mastodon, please let me know their handles). It mostly deals with #modelcompression and #gpu programming, two problems that have become very interesting to me recently.

#modelcompression #gpu