home.social

#llmd — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #llmd, aggregated by home.social.

  1. Red Hat and Tesla engineers tackled a real production problem together.

    3x output tokens/sec, 2x faster TTFT on Llama 3.1 70B with KServe + llm-d + vLLM. Fixes pushed upstream to KServe along the way.

    This is what open source looks like. 🤝 🚀

    llm-d.ai/blog/production-grade

    #RedHat #Tesla #RedHatAI #vLLM #Pytorch #Kubernetes #OpenShift #KServe #llmd #Llama #OpenSource

  2. Big thanks to everyone contributing code, reviews, and ideas — this integration is shaping up to be a game-changer for Kubernetes-native LLM serving. Stay tuned for the next release!

  3. 🎉 Behold! The #llmd #community emerges from the depths of the #tech abyss, promising the holy grail of Kubernetes-native distributed #LLM #inference. 🤖 Because who doesn't want their #AI #deployments served with extra buzzwords and a side of "competitive performance per dollar"? 🍽️
    llm-d.ai/blog/llm-d-announce #Kubernetes #news #HackerNews #ngated