home.social

#mllms — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #mllms, aggregated by home.social.

  1. If you would like to learn more about how it works: Guiding Instruction-based Image Editing via Multimodal Large Language Models. Check out the code repository for the ICLR'24 Spotlight paper by Tsu-Jui Fu, Wenze Hu, Xianzhi Du, William Yang Wang, Yinfei Yang, and Zhe Gan.
    github.com/apple/ml-mgie
    #ICLR24 #ImageEditing #MLLMs #AIResearch

  2. Apple has released Ferret, a new type of multimodal large language model (MLLM) that excels at both image understanding and language processing and is particularly strong at understanding spatial references.

    Paper: arxiv.org/abs/2310.07704
    GitHub: github.com/apple/ml-ferret?tab

    Source: threads.net/@luokai/post/C1OE1

    #ai #LLMs #mllms