home.social

#jina — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #jina, aggregated by home.social.

  1. 2 new #jina models have entered the embedding arena — v5 multilingual:
    * small: 1024 dim, 32K context
    * nano: 768 dim, 8K context
    both support matryoshka dimension truncation (32+) and are at the top of current benchmarks — especially for their parameter size
    publicly accessible (non-commercial)

    full announcement blog post: jina.ai/news/jina-embeddings-v
    hugging face: huggingface.co/collections/jin
    paper for even more details: arxiv.org/abs/2602.15547
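
The Matryoshka truncation mentioned in the post above is usually applied by keeping only the leading components of an embedding and re-normalizing the result. The following is a minimal sketch of that idea, not the model's own API: `truncate_embedding` is a hypothetical helper, and the 8-dimensional vector is made-up toy data standing in for real model output.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize them.

    Hypothetical helper for illustration; assumes the embedding was trained
    so that its leading components remain useful on their own.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy vector (made up), truncated from 8 to 4 dimensions.
full = [0.5, -0.3, 0.2, 0.1, 0.05, -0.02, 0.01, 0.0]
short = truncate_embedding(full, 4)
```

Re-normalizing after truncation keeps cosine similarity and dot product interchangeable for the shortened vectors.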

  2. #VoyageAI introduces voyage-context-3, a contextualized chunk #embedding #llm that captures both chunk details and full document context 🔍 #ai

    🔄 Outperforms #OpenAI-v3-large by 14.24% on chunk-level and 12.56% on document-level retrieval tasks

    📊 Beats #Cohere-v4 by 7.89% and 5.64% respectively, and #Jina-v3 late chunking by 23.66% and 6.76%

    🛠️ Drop-in replacement for standard embeddings without requiring downstream workflow changes

  3. However, the distances were always similar, without any significant differences or ranking. So, I improved the embeddings with context (author notes from the account), tags, etc. I also tried different #jina models (base, small, large) with different max-token configurations. The embeddings seem to be OK; the clusters were mostly random, but after some trial and error I got a configuration I like. So, what could be bad?
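
One way to check the "distances were always similar" symptom described above is to look at the spread of pairwise cosine distances. This is a minimal sketch under stated assumptions: `cosine_distance` and `distance_spread` are hypothetical helpers, and the 2-dimensional vectors are toy data, not output from any #jina model.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def distance_spread(vectors):
    """Summarize all pairwise cosine distances (needs at least two vectors)."""
    dists = [cosine_distance(a, b) for a, b in combinations(vectors, 2)]
    return {
        "min": min(dists),
        "max": max(dists),
        "mean": sum(dists) / len(dists),
        "range": max(dists) - min(dists),
    }


# Toy vectors (made up) standing in for post embeddings.
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
stats = distance_spread(toy)
```

A very small `range` relative to `mean` would suggest the embeddings barely separate the posts, which is consistent with clusters that look random.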

  4. @raccoonbits Because of my work, I have been involved in other "AI" #LLM #RAG projects, and once I found #jina I thought it would be fun to integrate it using embeddings.

  5. CW: iran pol

    Funeral of #MohammadMehdiKarmi, young Kurdish-Iranian executed by the regime for protesting.

    Even at his funeral, intelligence officers didn’t let his family breathe.

    This #IranRevolution started with #Jina. It won’t end here. We will be their voice. • Source: nitter.notraxx.ch/AlinejadMasi