home.social

#multimodal — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #multimodal, aggregated by home.social.

  1. @sinabhfuil Definitely, I think so - although a mere Jackeen, I think bikeshare.ie/ should be in each city in the country, as well as all major towns, and all rail line interchanges.

    In this case yesterday, I was in Athlone, which is home to a TUS campus (fka Athlone IT).

    There, I had to use a private bike share scheme, and a company I'm not fond of but which appears to have the monopoly on this location.

    #Rothar #BikeWeek #BikeTooter #Cycling #Trains #MultiModal

  6. AI 2026: Key impact areas 🌍🚀

    Impact today is defined on three fronts:
    Multimodality: AI now understands video, audio, and technical drawings in real time, making diagnostics and audits easier.

    Autonomous Agents: Systems such as OpenClaw that carry out complex tasks (appointments, payments, code) without supervision.

    Logical Reasoning: Models that verify their own steps, vital in science and finance.

    AI no longer just suggests, now it acts! 🛠️✨

    #AI #Multimodal #FinTech #HealthTech #Automation

  7. The assumption around multimodal AI has mostly been the same: if you want serious capability, you need serious hardware.

    MiniCPM-V 4.6 is trying to challenge that idea. It’s a 1.3B parameter multimodal model built to run on phones across iOS, Android, and HarmonyOS, while still handling image understanding, video analysis, OCR, and multi-image reasoning workloads that normally push users toward much larger systems.
    firethering.com/minicpm-v-4-6-
    #ai #technews #news #multimodal #opensource #trending

  8. Thinking Machines Lab announced a research preview of "interaction models", trained from scratch for real-time multimodal collaboration: 200ms micro-turns, with audio, video, text, and tools handled concurrently. Their bet: today's chat UX fits "answering inference", not collaboration, so capable AI defaults to autonomous use and looks like labor substitution. Could we change the debate by changing the UI/UX?

    benjaminhan.net/posts/20260512

    #AI #HumanInTheLoop #Multimodal #HCI #FutureOfWork

  13. 🌟✨ OMG, hold the press! The Gemini API File Search is now "multimodal"—whatever that means in techie-speak 🤯. Clearly, the #innovation world is SHOOK by the #groundbreaking ability to search files in more than one way. 🚀 Maybe soon they'll invent a search that can actually find something useful! 😂
    blog.google/innovation-and-ai/ #GeminiAPI #FileSearch #multimodal #technews #HackerNews #ngated

  18. 🤖 Another day, another incomprehensible jargon-filled soup about how machines are getting better at understanding pictures and words. Who knew that blending buzzwords with indecipherable acronyms could make #AI sound like it just discovered fire? 🔥 Let's all pretend we're not terrified by the impending takeover of our #multimodal #overlords. 🚀
    arxiv.org/abs/2604.26752 #Revolution #Jargon #Overload #TechTrends #HackerNews #ngated

  23. 10 current RAG approaches: which are actually useful and when to apply them?

    Hi everyone. Against the backdrop of updates to the LLM stack over the past year, I decided to put together a practical list of RAG approaches that are actually used in production, based on my own experience and on what I have studied in other cases.

    habr.com/ru/articles/1029616/

    #aiразработка #rag_ai #rag_pipeline #retrieval_augmented_generation #llm #llmмодели #vector_search #hybrid_search #graphrag #multimodal

  27. Today I was at a family celebration at a restaurant in Gutmadingen. On a Sunday. For lunch and coffee. Not reachable by public transport.

    Cycling was unfortunately not an option either, because a roughly 150 km ride with a dressy family celebration in the middle doesn't work. The weather forecast was rainy.

    The cost calculator from @nes works out a price of 54 €. Smallest #eAuto because I'm travelling alone. Not bad.

    However, I have a #Deutschlandticket
    🤔💭
    In the end I took the :diebahn: from #Freiburg to #Donaueschingen and picked up an #eCarsharing there for the last few km.

    Result: 15.53 € carsharing costs + time to read on the 🚈
    Savings: 38.47 € 💶🤑💰
    #multimodal #intermodlitat #Carsharing

  28. #Qwen36 35B-A3B, a new #opensource MoE model with 35 billion parameters, showcases exceptional #agentic #coding performance and strong #multimodal perception and #reasoning abilities. It outperforms its predecessor and rivals larger models, making it a versatile choice for various tasks. qwen.ai/blog?id=qwen3.6-35b-a3b #tech #media #news

  33. How I made Claude multimodal by connecting it to Qwen Omni

    Claude is blind. Unfortunately, none of Anthropic's models works directly with video. Yes, you can slice a video into individual frames and feed them to it, but that is not the same. The context of motion is lost, and without it this is just breaking a pile of frames down into components and trying to piece the context back together. For me as a visual artist this is a big pain point, because I often want to send video references and ask for a breakdown of the camera movement, the character's movement, or the design, for that matter. And here is a concrete task: 29 generated video references of character animation sit in the project folder, and they need to be sorted into categories with each movement described. Naturally, I'm too lazy to do this by hand. It is an hour to an hour and a half on a tedious task. Then I remembered Qwen Omni, which I already use to build a digital real-time character assistant. And I thought, why not make them work together.

    habr.com/ru/articles/1023852/

    #claudecode #multimodal #qwen #opensource #cowork #plugin

  34. How I pick moments for Shorts: why LLM + transcript almost always produces garbage

    This is the third article about my "anime factory", a system that automatically turns long episodes into Shorts. If you want the full context, here are the previous parts:

    habr.com/ru/articles/1021552/

    #llm #shorts #python #cv #computer_vision #signal_processing #multimodal #transcript #youtube_shorts #ai