Dominik Weckmüller (@DomeGIS@fosstodon.org) — Public Fediverse posts on home.social

Dominik Weckmüller @DomeGIS · 2024-03-05 · 17:23 UTC

#SemanticFinder meets @ollama!
You can now add superpowers to your favorite in-browser semantic search tool🥳
Chat with your document or an entire book and fuel #LLAMA with your relevant paragraphs!
App: https://do-me.github.io/SemanticFinder/
Announcement: https://reddit.com/r/ollama/comments/1b79c23/inbrowser_rag_feeding_ollama/

#semanticfinder #llama
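
The "fuel the model with your relevant paragraphs" idea from the post above can be sketched roughly as follows. This is a minimal illustration, not SemanticFinder's actual code: the prompt wording, the helper names and the `llama2` model name are assumptions, and the URL is Ollama's default local endpoint. The sketch only builds the request body and does not send it.

```python
import json

# Default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_rag_prompt(question: str, paragraphs: list[str]) -> str:
    """Join the retrieved paragraphs into a context block, then append the question."""
    context = "\n\n".join(f"[{i + 1}] {p.strip()}" for i, p in enumerate(paragraphs))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def build_ollama_payload(question: str, paragraphs: list[str], model: str = "llama2") -> str:
    """Serialize a non-streaming generate request as JSON (model name is an assumption)."""
    body = {
        "model": model,
        "prompt": build_rag_prompt(question, paragraphs),
        "stream": False,
    }
    return json.dumps(body)


# Hypothetical search results standing in for paragraphs found by semantic search.
payload = build_ollama_payload(
    "Who is Jean Valjean?",
    ["Jean Valjean is released after nineteen years.", "He takes the name Madeleine."],
)
```

POSTing `payload` to `OLLAMA_URL` would then return the model's answer, assuming a local Ollama server with that model pulled.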

Dominik Weckmüller @DomeGIS · 2024-02-12 · 17:12 UTC

#SemanticFinder now offers built-in dimensionality reduction with wasm-powered BH-tSNE. Try it now: https://do-me.github.io/SemanticFinder/

#semanticfinder

Dominik Weckmüller @DomeGIS · 2024-01-16 · 19:00 UTC

Private semantic & hybrid search for large documents like books in 2 seconds in your browser with #SemanticFinder! 🚀 Examples:
- The Bible (en) https://do-me.github.io/SemanticFinder/?hf=King_James_Bible_24f6dc4c
- Les Misérables (fr) https://do-me.github.io/SemanticFinder/?hf=Les_Mis%C3%A9rables_2239df51
- Das Kapital (de) https://do-me.github.io/SemanticFinder/?hf=Das_Kapital_c1a84fba
Full catalogue: https://huggingface.co/datasets/do-me/semanticfinder

You can save your own index and keep it private or share it publicly. Proposals for books or docs of public interest always welcome - just share a source URL and I'll add it to the collection :)

#semanticfinder
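
One common way to combine semantic and keyword matching into a hybrid score is a weighted sum, sketched below. This is an assumption about how hybrid search can work in general, not a description of SemanticFinder's implementation. The `embed` function is a toy word-count stand-in for a real embedding model, and the `alpha` weight is a hypothetical parameter.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercase word tokens with punctuation stripped."""
    return re.findall(r"\w+", text.lower())


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(tokens(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_score(query: str, chunk: str) -> float:
    """Fraction of query terms that occur verbatim in the chunk."""
    terms = tokens(query)
    words = set(tokens(chunk))
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def hybrid_search(query: str, chunks: list[str], alpha: float = 0.7):
    """Rank chunks by alpha * semantic score + (1 - alpha) * keyword score."""
    q = embed(query)
    scored = [
        (alpha * cosine(q, embed(c)) + (1 - alpha) * keyword_score(query, c), c)
        for c in chunks
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


chunks = [
    "In the beginning God created the heaven and the earth.",
    "Capital is dead labour.",
    "The bishop welcomed the stranger.",
]
results = hybrid_search("dead labour", chunks)
```

With a real embedding model the semantic term would also reward paraphrases, while the keyword term keeps exact matches near the top.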

Dominik Weckmüller @DomeGIS · 2023-12-21 · 19:30 UTC

Just added the current #MTEB leader #WhereIsAI/UAE-Large-V1 to #SemanticFinder and it's performing great! Would love a base or small version to make in-browser semantic search slightly faster.
Test here: https://do-me.github.io/SemanticFinder/

#mteb #whereisai #semanticfinder

Dominik Weckmüller @DomeGIS · 2023-11-05 · 17:28 UTC

✨ Open source RAG (Retrieval Augmented Generation) right in your browser! ✨

#SemanticFinder now offers an 𝐚𝐝𝐯𝐚𝐧𝐜𝐞𝐝 𝐜𝐡𝐚𝐭 & 𝐬𝐮𝐦𝐦𝐚𝐫𝐲 𝐟𝐞𝐚𝐭𝐮𝐫𝐞 for your search results - all in your browser.

💡There are very few capable small LLMs that offer high-quality results. Quantized LaMini-Flan-T5-783M offers good performance with 3-4s load time and >6 tokens/s after model download on an old i7.

https://do-me.github.io/SemanticFinder/

#transformers #RAG #AI #LLM #embeddings #semanticsearch #text2text #Flan #T5

#semanticfinder #transformers #rag #ai #llm #embeddings

Dominik Weckmüller @DomeGIS · 2023-10-27 · 20:23 UTC

Just indexed the whole Bible in my browser with Jina AI's new 8k embeddings and #SemanticFinder (https://do-me.github.io/SemanticFinder/). 742 pages, 80,500 lines or 4,641,000 chars, and my browser doesn't even crash. 30-40 mins indexing time but less than 60s for any subsequent search! 🚀

#semanticfinder
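
Why indexing is slow while later searches are fast can be illustrated with a small sketch: the model runs once per chunk while building the index, but only once per query afterwards. The `embed` function here is a toy word-count stand-in (an assumption, not the Jina model), and the chunking by line count is hypothetical.

```python
import math
import re
from collections import Counter

embed_calls = 0  # counts how often the (expensive) model would run


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: word counts instead of a dense vector."""
    global embed_calls
    embed_calls += 1
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def chunk_lines(text: str, lines_per_chunk: int = 2) -> list[str]:
    """Group non-empty lines into fixed-size chunks."""
    lines = [line for line in text.splitlines() if line.strip()]
    return ["\n".join(lines[i:i + lines_per_chunk]) for i in range(0, len(lines), lines_per_chunk)]


def build_index(text: str):
    """Expensive step: one embedding call per chunk."""
    return [(chunk, embed(chunk)) for chunk in chunk_lines(text)]


def search(index, query: str, top_k: int = 1) -> list[str]:
    """Cheap step: one embedding call for the query, then similarity comparisons."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


text = """In the beginning God created the heaven and the earth.
And the earth was without form, and void.
And God said, Let there be light: and there was light.
And God saw the light, that it was good."""

index = build_index(text)            # two chunks, so two embedding calls
calls_after_indexing = embed_calls
hits = search(index, "let there be light")  # one more embedding call
```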

Dominik Weckmüller @DomeGIS · 2023-10-26 · 11:06 UTC

𝐉𝐢𝐧𝐚 𝐀𝐈 𝟖𝐤 𝐭𝐞𝐱𝐭 𝐞𝐦𝐛𝐞𝐝𝐝𝐢𝐧𝐠𝐬 📄
I just quantized both available versions for 𝟒𝐱 𝐫𝐞𝐝𝐮𝐜𝐞𝐝 𝐟𝐢𝐥𝐞 𝐬𝐢𝐳𝐞, for use in #transformers.js and #SemanticFinder (https://do-me.github.io/SemanticFinder/) when dealing with a large corpus:

- 𝟐𝟖.𝟓 𝐌𝐁 jina-embeddings-v2-small-en (https://huggingface.co/do-me/jina-embeddings-v2-small-en)
- 𝟏𝟎𝟗 𝐌𝐁 jina-embeddings-v2-base-en (https://huggingface.co/do-me/jina-embeddings-v2-base-en)

⚠️ I noticed, however, that the base version seems to perform somewhat poorly on smaller text chunks. Test it in SemanticFinder.

Jina AI announcement: https://jina.ai/news/jina-ai-launches-worlds-first-open-source-8k-text-embedding-rivaling-openai/

#transformers #semanticfinder
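
The 4x figure matches what you would expect when 32-bit float weights are stored as 8-bit integers. The sketch below shows that arithmetic with a simple symmetric quantization scheme. It is an illustration only and an assumption on my part: it is not the tooling used for the models in the post, and the weight values are made up.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back to 1.0 for all-zero input
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantized]


# Made-up example weights stored as 32-bit floats.
weights = array("f", [0.5, -1.0, 0.25, 0.0, 0.75, -0.125, 1.0, -0.5])
q, scale = quantize_int8(weights)

float_bytes = weights.itemsize * len(weights)  # 4 bytes per float32 value
int8_bytes = q.itemsize * len(q)               # 1 byte per int8 value
ratio = float_bytes / int8_bytes
max_error = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))
```

The rounding error per weight is bounded by half the scale, which is the precision traded for the smaller file.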

Dominik Weckmüller @DomeGIS · 2023-10-23 · 15:18 UTC

𝐆𝐮𝐞𝐫𝐢𝐥𝐥𝐚 𝐒𝐞𝐦𝐚𝐧𝐭𝐢𝐜 𝐒𝐞𝐚𝐫𝐜𝐡 𝐓𝐮𝐭𝐨𝐫𝐢𝐚𝐥 🔎

Create an open source semantic search web app for the #Copernicus Services Catalogue:

📦 Mine data
📝 Chunk and index the documents
🖋️ Write a static web app
🚀 Host for free on GitHub pages

⭐ Lots of practical tricks. Built with #pandas, #pytorch, #haystack, #transformers.js, #pako.js.

𝐓𝐮𝐭𝐨𝐫𝐢𝐚𝐥: https://geo.rocks/post/semantic-search-tutorial/
𝐆𝐢𝐭𝐇𝐮𝐛 repo with Jupyter Notebook: https://github.com/do-me/copernicus-services-semantic-search
𝐖𝐞𝐛 𝐚𝐩𝐩: https://do-me.github.io/copernicus-services-semantic-search/

#semanticsearch #eo #earthobservation

#copernicus #pandas #pytorch #haystack #transformers #pako
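
The "chunk and index" step, plus shipping the result to a static web app, could look roughly like the sketch below. It is a stdlib-only stand-in under stated assumptions, not the tutorial's code (which uses pandas and haystack): the chunk sizes, record fields and document texts are hypothetical, and embeddings are left out. The output is gzip-compressed JSON, a format that pako can decompress in the browser.

```python
import gzip
import json


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def pack_index(docs: dict[str, str]) -> tuple[bytes, int]:
    """Chunk every document, serialize the records as JSON and gzip them."""
    records = [
        {"doc": doc_id, "chunk": n, "text": chunk}
        for doc_id, text in docs.items()
        for n, chunk in enumerate(chunk_words(text))
    ]
    raw = json.dumps(records).encode("utf-8")
    return gzip.compress(raw), len(raw)


# Made-up documents standing in for mined catalogue entries.
docs = {
    "service-a": "sea surface temperature " * 60,
    "service-b": "land cover map " * 60,
}
blob, raw_size = pack_index(docs)

# Round trip, mirroring what the web app would do after downloading the file.
restored = json.loads(gzip.decompress(blob))
```

Writing `blob` to a file next to the static page keeps the whole index servable from GitHub Pages without a backend.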

Dominik Weckmüller @DomeGIS · 2023-09-27 · 12:04 UTC

🔍 𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰𝗙𝗶𝗻𝗱𝗲𝗿 𝗕𝗿𝗼𝘄𝘀𝗲𝗿 𝗘𝘅𝘁𝗲𝗻𝘀𝗶𝗼𝗻 🔍
Varun and I are super excited to present the open source Chrome extension for #SemanticFinder - in-browser, privacy-preserving #SemanticSearch! 🥳
GitHub: http://github.com/do-me/SemanticFinder
Web: https://do-me.github.io/SemanticFinder
Contributors wanted!

#semanticfinder #semanticsearch

Dominik Weckmüller @DomeGIS · 2023-06-26 · 20:09 UTC

#SemanticFinder just received a major feature contribution from @IamVarunSri! Interactively explore all the results for your semantic search query right in your browser for any text up to hundreds of pages: github.com/do-me/SemanticFinder

Made with github.com/xenova/transformers.js

#semanticfinder

Dominik Weckmüller @DomeGIS · 2023-04-17 · 08:32 UTC

You can now pre-index documents with #SemanticFinder, enabling blazingly fast semantic search over very large documents (e.g. 100 pages) right in your browser!
See the example:
𝗜𝗣𝗖𝗖 𝗥𝗲𝗽𝗼𝗿𝘁: https://geo.rocks/semanticfinder/ipcc
𝗚𝗶𝘁𝗛𝘂𝗯: https://github.com/do-me/SemanticFinder

#semanticfinder

Dominik Weckmüller @DomeGIS · 2023-04-15 · 14:09 UTC

Just updated the UI of #SemanticFinder

https://geo.rocks/semanticfinder/

You can scroll through the results now with one click, enabling you to quickly find what you're looking for! Also working on mobile!

#semanticfinder

Dominik Weckmüller @DomeGIS · 2023-04-12 · 17:12 UTC

𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰𝗙𝗶𝗻𝗱𝗲𝗿 - A browser-based semantic search engine you can use to query your own texts!

Demo: https://geo.rocks/semanticfinder/
Blog Post: https://geo.rocks/post/semanticfinder-semantic-search-frontend-only/
GitHub: https://github.com/do-me/SemanticFinder/

Built with amazing open-source software: #SentenceTransformers (all-MiniLM-L6-v2), #transformers.js, #CodeMirror and #Bootstrap. #SemanticFinder

#sentencetransformers #transformers #codemirror #bootstrap #semanticfinder

Dominik Weckmüller @DomeGIS · 2023-04-06 · 06:04 UTC

Create a semantic search engine with only a vector database and a lightweight frontend - keep the inference client-side instead of running an inference server!

Tutorial with demo: https://geo.rocks/post/qdrant-transformers-js-semantic-search/

Powered by amazing open-source software from #Qdrant, #transformers.js and #SentenceTransformers!

#qdrant #transformers #sentencetransformers