#document_processing — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #document_processing, aggregated by home.social.

  1. Hey, Fedi, what's the best way on Linux to OCR a scanned PDF and embed the resulting text in the PDF? I haven't found any particularly convincing recipes yet. (I know Tesseract handles the OCR part, but what's the best way to get the text into the PDF for searchability and text selection? Ideally without disturbing any annotations I've already made.)

    #pdf #linux #ocr #tesseract #document_processing