home.social

Search

4 results for “AMDmi3”

  1. LLM Inference Takes Aim at Production Realities

    A new disaggregated LLM serving approach is faster and cheaper than older aggregated methods for businesses running AI. Tests show better performance.

    #LLMServing, #AIefficiency, #OracleCloud, #AMDMI300X, #TechNews

    newsletter.tf/disaggregated-ll

  2. New tests show a disaggregated LLM serving method is 2x faster than older methods while using fewer resources. This means AI services will work better.

    #LLMServing, #AIefficiency, #OracleCloud, #AMDMI300X, #TechNews
    newsletter.tf/disaggregated-ll

  3. It looks like axum is finally going to get proper route matching with prefix and suffix support (e.g. `/images/{foo}.jpg`, whereas before only `/images/{foo}` was supported).
    github.com/tokio-rs/axum/pull/