home.social

#pdfconversion — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #pdfconversion, aggregated by home.social.

  1. 🎉 Oh, joy! Another tool promising to revolutionize your life by converting PDFs into a smorgasbord of formats you never knew you needed. 🤖✨ Complete with buzzword bingo: #PaddleOCR, #LLM, TypeScript SDK, and WebSocket updates! Don't forget to remind your grandma to self-host it! 😂🙄
    github.com/majcheradam/ocrbase #PDFconversion #TypeScriptSDK #selfhosting #HackerNews #ngated

  2. Oh joy, another thrilling guide on turning #PDFs into something more useful than doorstops, using #tools with names straight out of a sci-fi B-movie 🍿🤖. Just what we needed—more reasons to believe #AI will soon replace all our human jobs with endless lines of code and nonsensical buzzwords. 🌪️📄
    github.com/feast-dev/feast/tre #PDFConversion #JobAutomation #TechHumor #HackerNews #ngated

  3. @thornAvery My own approaches are:

    • Find LITERALLY ANY FORMAT OTHER THAN PDF. HTML, text, ePub, etc., if possible.

    • Try pdftotext, part of Poppler utils: poppler.freedesktop.org/ It's available for most Linux distros and for macOS under Homebrew, or you can check out the source via Git.

    If I can get something vaguely reasonable, that's usually sufficient.

    • OCR is an option. I've never had good luck with that, and there's such a tremendous amount of tedious correcting that retyping is frequently preferable. That said, I operate at fairly low scale.

    • Retype by hand. Since I'm usually reading the work, this actually turns out to be a pretty good reading method for content-retention.

    PDF itself is a container around a bunch of other formats. Asking how to convert a PDF is a bit like asking how to cook a bag full of groceries. It really depends on what's in it, and what you're hoping to get.
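
The pdftotext step above can be scripted. What follows is a minimal sketch, not a definitive recipe: it assumes the Poppler `pdftotext` binary is on the PATH, and the helper names `build_pdftotext_cmd` and `extract_text` are made up for illustration. It uses only the documented `-layout` option and the `-` output argument, which sends the text to stdout.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, keep_layout=False):
    """Build the argument list for Poppler's pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout
        cmd.append("-layout")
    # "-" as the output file means "write the text to stdout"
    cmd += [str(pdf_path), "-"]
    return cmd


def extract_text(pdf_path, keep_layout=False):
    """Return the extracted text, or None if pdftotext is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    # pdftotext writes UTF-8 by default; replace any undecodable bytes
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text("paper.pdf")` (the file name is a placeholder) returns whatever text layer the PDF has. Whether that output is "vaguely reasonable" still needs a human look.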

    #PDF #PDFConversion #kfc #docfs #webfs

  4. @thornAvery I'm trying to find what I thought I remembered as an excellent HN comment discussing how to do this at scale.

    It turns out to be really complicated.

    That said, maybe tell us what it is you're trying to do, specifically:

    • How many documents.
    • How large.
    • What languages / character sets.
    • What budget (if any).
    • What end-use.

    #webfs #docfs #kfc #PDFConversion #pdf

  5. @thornAvery There's no such creature that will cover all cases. You may get lucky in many instances with easier options.

    Your best bet is to find another form of the document that's closer to text. For many published documents there are good odds of this.

    If the PDF is actually rendered from a text source, pdftotext is pretty good at extracting the actual text.

    If it's not ... you're left with a much more challenging job. I find with rather startling frequency that simply re-typing the document from scratch is often the best option.
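
One rough way to tell the two cases apart is to look at whatever text came out of the extractor. The sketch below is only a heuristic under stated assumptions: the function name `looks_like_text` and both thresholds are arbitrary illustrative choices, not anything from Poppler. Scanned or image-only PDFs tend to yield nothing, or only a few stray characters, and this check flags that.

```python
def looks_like_text(extracted, min_words=20, min_ratio=0.8):
    """Guess whether extracted output is real prose (hypothetical heuristic).

    min_words and min_ratio are arbitrary thresholds; tune them per corpus.
    """
    words = extracted.split()
    if len(words) < min_words:
        # Too little text: probably a scan with no text layer
        return False
    # Count non-whitespace characters that are letters, digits,
    # or common punctuation
    visible = [ch for ch in extracted if not ch.isspace()]
    good = sum(ch.isalnum() or ch in ".,;:!?'\"()-" for ch in visible)
    return good / len(visible) >= min_ratio
```

A result of False suggests the harder path (OCR or retyping). A result of True only means the output deserves a closer read.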

    #pdf #PDFConversion #kfc #docfs #webfs

    1/

  6. The new Altova MapForce PDF Extractor makes it easy to define rules for extracting data from PDF files in a structured format to make it available for mapping to other popular formats like Excel, XML, JSON, databases, and more. Check out our new blog post and video tutorial:
    altova.com/blog/extract-pdf-da
    #pdf #pdfconversion #pdftoexcel #data #database #extraction #xml #json

  7. manual.calibre-ebook.com/conve

    Reading PDFs on ebook readers and smart phones is a pain.
    I hope content creators get to use a better format. (EPUB? Single-file HTML?)

    PDF format is perfect for many use cases, but not for everything.

  8. 🚀 Simplify your PDF conversion process! In this episode, we guide you through using Shortcuts to automate webpage-to-PDF conversion. Perfect for streamlining reading and research, whether you're into machine learning or just want an easier way to read articles!

    Visit RoutineHub Academy: routinehub.link/get-it-3.7

    Key takeaways:

    Create a Siri shortcut for seamless web-to-PDF conversion, enhancing reading and document management.

    #RoutineHub #PDFConversion #Shortcuts #ProductivityHacks