home.social

#solr — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #solr, aggregated by home.social.

  1. @drmorrisj The monetization aspect is based on their licensing, which is excellent. Plus, the value proposition of concrete search engine results that are internal and custom is great IP that can add to business workflows and continuity. That is my concrete, non-abstract take on YaCy and why people may want to use it. It also makes for a good RAG pipeline and structured data (JSON), and they have a nice API. #open source value leader #solr dump #nutch

  2. Apache Solr next major release 10.0.0 is out.
    Congrats to the maintainers and all involved 👏
    solr.apache.org/news.html#apac

    @TheASF

  3. There is a ton of info here, but I spider and it indexes, so you can find what you are looking for quicker. This is about 7 GB. #lib archive #index #nutch #solr #common crawl

  4. @Earl @gary_alderson

    That's an incompatibility of the internal #solr.
    Usually, when the major version number increases, the internal Solr is upgraded as well. And Solr is able to upgrade only one version up, not two. So you can upgrade #YaCy, for example, from 1.93 to 1.94, but not to 1.96.
    Exporting and reimporting the whole index is a way around this limitation, but it can take time and disk space, depending on the size of the index.
    See: eldar.cz/yacydoc/dev/solr.html
    and
    eldar.cz/yacydoc/operation/ind
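
The rule described in this post (an in-place upgrade works one version up, anything further needs an export and reimport) can be sketched as a small check. This is only an illustration: the function name and the ordered release list are hypothetical, and the real list of YaCy releases and their Solr versions should be taken from the project's own documentation.

```python
def upgrade_strategy(releases, current, target):
    """Pick an upgrade strategy from an ordered list of releases.

    Assumption taken from the post: an index can be carried over in place
    only when the target is at most one release ahead of the current one.
    """
    steps = releases.index(target) - releases.index(current)
    if steps < 0:
        raise ValueError("downgrades are not covered by this sketch")
    if steps <= 1:
        return "in-place"
    return "export-and-reimport"


# Hypothetical ordered release list, for illustration only.
RELEASES = ["1.93", "1.94", "1.95", "1.96"]

print(upgrade_strategy(RELEASES, "1.93", "1.94"))
print(upgrade_strategy(RELEASES, "1.93", "1.96"))
```

With the hypothetical list above, the first call picks the in-place route and the second falls back to exporting and reimporting the index.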

  9. 📚 New Blog Post: Building an 'X-Ray Machine' for Drupal 11
    I just published Part 3 of my "Automated Librarian" series. Today is all about moving from "thin metadata" to full deep-text indexing using Solr and Tika.
    If you've ever struggled with making thousands of PDFs truly searchable in #Drupal, this one is for you.
    Read the full breakdown: drupalodyssey.com/go/rOxKwcy
    #Drupal11 #OpenSource #Solr #SearchAPI #WebDevelopment #PHP

  11. I've been a digital hoarder of eBooks for two decades. It was time to stop collecting and start discovering.
    Part 1 of my new series on building an intelligent library with #Drupal and #Solr is live.

    📖 Read the breakdown: drupalodyssey.com/go/WRJQM

    #OpenSource #Drupal11 #DevLog #PHP #HomeLab

  16. @benjamin_e @orbiterlab

    I'm trying to build a news search engine as well, using #YaCy and the eldar.cz/news/ aggregator. Search relevancy is not great. The pseudo-PageRank ("citation rank") doesn't help much and is so computationally heavy that I switched it off:
    community.searchlab.eu/t/how-t

    Vector search would certainly be a big help. #solr already has it, but it is not implemented in YaCy so far.

    For distinguishing news sites, I just use the "collections" feature. See community.searchlab.eu/t/what-
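
For context on the vector search mentioned in this post, here is a minimal sketch of how a query for Solr's dense vector support might be assembled, using the `knn` query parser available in recent Solr releases. The field name `vector`, the helper name, and the example embedding are assumptions for illustration; the field would need to be defined as a dense vector field in the schema, and nothing here is part of YaCy.

```python
def knn_query(field, embedding, top_k=10):
    """Build a query string for Solr's knn query parser.

    `field` is assumed to be a dense vector field in the schema; the
    embedding must have the dimension that field was configured with.
    """
    vector = ", ".join(repr(float(x)) for x in embedding)
    return "{!knn f=%s topK=%d}[%s]" % (field, top_k, vector)


# Hypothetical four-dimensional embedding of a search phrase.
params = {"q": knn_query("vector", [0.1, 0.2, 0.3, 0.4], top_k=5)}
print(params["q"])
```

The resulting string would be sent as the `q` parameter of an ordinary select request; producing the embedding itself is left to whatever model the site uses.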

  22. @kajer You need a pipeline but also real-time sentiment ranking, checksums for all files seen, CVE, IOC - all of a sudden you went from stodgy SIEM to real-world NOC.

    I like the topic even though I do look at it sarcastically.

    I think you want combinatorials of top 10 db and real-time #mentions. I think you are going to get people in federated enclaves to join together and work on problems but also make a typical image - everybody can optimize and get better by sharing. There will be a representative manifest of software that will vary by sector. #thomas register OCR scan and convert into semantic/vectordb/graphs #rss #solr #keywords #page rank... It reminds me of the site that correlates IPs to domains and more - great OSINT info if found to be true. How do you advertise and make the site have revenue streams? You can advertise to people trolling the sector. You need template bots and spiders: get all the info you can, cache sites, and then get it into a db. The real-time part is basically a tag cloud. #i ching #book of changes #backlinks

  26. My books on Search & Recsys

    Friends, I have finally published my third book on search (plus one more on the closely related topic of recommender systems). They are very niche and aimed at specialists, so I thought Habr was simply the ideal place to announce them. All four books have zero filler and present the material very densely, with references to scientific papers and with illustrations where they are really needed. Anatomy of Ecommerce Search testmysearch.com/books/anatomy Let's start with the one that came out today: Anatomy of Ecommerce Search.

    habr.com/ru/articles/974598/

    #search #recsys #ecommerce #solr
