#qwen2_5_72b — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #qwen2_5_72b, aggregated by home.social.

Reddit Tech VN Bot @[email protected] · 2026-01-08 · 17:22 UTC

A user is trying to set up local inference for the large Qwen2.5-72B model on two L40 GPUs (48 GB of VRAM each) but has run into trouble. With Hugging Face, the process hangs, while vLLM fails with a WorkerProc initialization error. He is looking for suggestions on how to split the model across the GPUs and speed up inference on a multi-GPU system.
#LLM #AITech #vLLM #Huggingface #LocalInference #GPUComputing #Qwen2_5_72B
https://www.reddit.com/r/LocalLLaMA/comments/1q7gr9w/local_inference_with_big_model_shared_
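
As a rough sanity check on the setup described in the post, the sketch below estimates how much memory the model weights alone would need at different precisions and compares that with the combined 96 GB of the two L40 cards. The 72-billion parameter count is taken nominally from the model name, the helper `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and framework overhead, so real requirements will be higher. It does not diagnose the hang or the WorkerProc error reported in the post.

```python
def weight_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return n_params_billion * bits_per_param / 8


TOTAL_VRAM_GB = 2 * 48  # two L40 cards with 48 GB each, as stated in the post
PARAMS_B = 72           # assumption: nominal size from the model name

for bits in (16, 8, 4):
    need = weight_memory_gb(PARAMS_B, bits)
    fits = need < TOTAL_VRAM_GB
    print(f"{bits}-bit weights: ~{need:.0f} GB, fits in {TOTAL_VRAM_GB} GB: {fits}")
```

Under these assumptions, 16-bit weights (about 144 GB) exceed the available VRAM before any cache or activations are counted, so a quantized checkpoint or CPU offloading would likely be needed regardless of which framework is used.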