home.social

#iqdb — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #iqdb, aggregated by home.social.

  1. Hmm, I probably have the most ridiculous #robotstxt for a #Misskey instance right now, lol. I just want to let #Mojeek and #Marginalia crawl #Makai while keeping out #Google and the AI scrapers...

    If there are user-agents of other independent #searchengines I should allow in https://makai.chaotic.ninja/robots.txt, please let me know! I'm actually searching for #SauceNAO, #TinEye, and #IQDB's #useragent strings so I can let them fetch our media for their reverse image search.

    User-Agent: MojeekBot
    User-Agent: FeedFetcher-Mojeek
    User-Agent: search.marginalia.nu
    Allow: /
    Allow: /notes
    Disallow: /admin
    Disallow: /settings
    Disallow: /my/
    
    User-Agent: *
    User-Agent: Googlebot
    User-Agent: Google-Extended
    User-Agent: GoogleOther
    User-Agent: AdsBot-Google
    User-Agent: AdsBot-Google-Mobile
    User-Agent: Mediapartners-Google
    User-Agent: CCBot
    User-Agent: ChatGPT-User
    User-Agent: GPTBot
    User-Agent: Omgilibot
    User-Agent: omgili
    User-Agent: FacebookBot
    User-Agent: Twitterbot
    User-Agent: cohere-ai
    User-Agent: anthropic-ai
    User-Agent: Bytespider
    User-Agent: Amazonbot
    User-Agent: Applebot
    User-Agent: PerplexityBot
    User-Agent: YouBot
    User-Agent: AwarioRssBot
    User-Agent: AwarioSmartBot
    User-Agent: ClaudeBot
    User-Agent: Claude-Web
    User-Agent: DataForSeoBot
    User-Agent: FriendlyCrawler
    User-Agent: ImagesiftBot
    User-Agent: magpie-crawler
    User-Agent: Meltwater
    User-Agent: peer39_crawler
    User-Agent: PiplBot
    User-Agent: Seekr
    Disallow: /
    
    # todo: sitemap

    #sysadmin #fediadmin
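
One way to sanity-check rules like these before deploying them is to feed them to a parser and ask which crawlers may fetch which paths. The sketch below uses Python's standard `urllib.robotparser` on an abbreviated copy of the rules above (the blocklist is trimmed to three agents for brevity). `build_parser`, `SomeUnlistedBot`, and the sample path `/notes/abc` are illustrative names, not anything taken from the live site. Python's parser applies rules in file order rather than by longest match, so the sketch only checks paths where both interpretations agree and does not test the `Disallow` lines that follow `Allow: /`.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules from the post; the real blocklist is longer.
ROBOTS_TXT = """\
User-Agent: MojeekBot
User-Agent: FeedFetcher-Mojeek
User-Agent: search.marginalia.nu
Allow: /
Allow: /notes
Disallow: /admin
Disallow: /settings
Disallow: /my/

User-Agent: *
User-Agent: Googlebot
User-Agent: GPTBot
User-Agent: ClaudeBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text offline, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical agent/path pairs chosen for illustration.
    checks = [
        ("MojeekBot", "/notes/abc"),
        ("search.marginalia.nu", "/"),
        ("GPTBot", "/"),
        ("Googlebot", "/notes/abc"),
        ("SomeUnlistedBot", "/"),  # falls through to the "*" group
    ]
    for agent, path in checks:
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(f"{agent:24} {path:12} {verdict}")
```

Because `User-Agent: *` shares a group with the named bots, any crawler not listed in the first group is blocked as well. Crawlers that follow RFC 9309 pick the longest matching rule, so for them `Disallow: /admin` would still take precedence over `Allow: /` in the first group.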