#document_processing — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #document_processing, aggregated by home.social.
-
Hey, Fedi, what's the best way under Linux to OCR a scanned PDF and put the resulting text into the PDF? I haven't found any particularly convincing recipes yet. (I mean, Tesseract for the OCR part, I know that much - but what's the best way to get the text into the PDF for searchability and text selection? Ideally without disturbing any annotations I've already made.)
-
Benchmarking the Most Reliable Document Parsing API
https://www.tensorlake.ai/blog/benchmarks
#ycombinator #context_engineering #document_processing #machine_learning #LLM #RAG #vector_database #knowledge_graphs #document_parsing #structured_extraction #AI_workflows #Document_Parsing #OCR #Benchmarks #TEDS #Enterprise_AI