#ocr4all — Public Fediverse posts on home.social

Monika Barget @[email protected] · 2025-09-04 · 09:36 UTC

@daelba @KathyReid I also recommend #OCR4all. While e-Scriptorium runs on the kraken engine, OCR4all uses Calamari. Once you have training data in e-Scriptorium, you can potentially also use it to train models in OCR4all. Depending on your discipline, the existing models for e-Scriptorium are 'better' than those for OCR4all or vice versa, but both tools are highly recommended.

#ocr4all

Monika Barget @[email protected] · 2025-04-13 · 15:13 UTC

Every now & then, I give #ChatGPT a scan of my handwriting to test its skills in working with #handwrittentexts. Initially, it responded that it could not process the scans or gave me entirely fictional output, but today it got almost everything right. These results are better than those I achieved with #HWR models in #Tesseract & #OCR4all without additional training. I also asked ChatGPT what it "thought" about my writing & it called it "consistently shaped & large with stylistic strokes."

#chatgpt #handwrittentexts #hwr #tesseract #ocr4all

Daniela Schneider @[email protected] · 2025-02-07 · 08:39 UTC

Hi #histodons,
I need your expertise. We want to integrate an #opensource #ocr tool into our #useGalaxy Platform so you can better analyse your texts, etc.
I worked with #tesseract some years ago, and I heard about #ocr4all.
Do you have experience with any of these - or other recommendations?
We are also integrating #transkribus via API but want another OCR-specific option.
Looking forward to your experiences!

@galaxyfreiburg
@NFDI4Memory

#histodons #opensource #ocr #usegalaxy #tesseract #ocr4all

Frederik Elwert @[email protected] · 2024-09-26 · 08:51 UTC

Re OCR/ATR, interestingly the #OCR4all paper also offers a very good overview of the different steps and workflows. It has a different purpose, but I think it can still be used in a class context.

Reul, Christian et al. 2019. “OCR4all—An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings.” Applied Sciences 9 (22): 4853. https://doi.org/10.3390/app9224853.

#ocr4all

Benjamin Rosemann @[email protected] · 2024-07-06 · 06:49 UTC

@tkinias As far as I understand, you want to implement a PDF -> Text -> PDF workflow. Using plain text as the intermediate format is problematic, as you may lose a lot of layout information.

For high-quality full text you may need a more sophisticated intermediate format like #PageXML or #AltoXML. However, these also require a more sophisticated editing tool such as #OCR4All.
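To illustrate the point about layout information, here is a minimal sketch in Python that reads a tiny PAGE XML document and keeps each line's position alongside its text. The element names (`TextRegion`, `TextLine`, `Coords`, `TextEquiv`, `Unicode`) follow the PAGE XML schema as I understand it; the sample document, the file name `page_0001.png`, and the helper names `bounding_box` and `extract_lines` are invented for illustration and are not taken from OCR4all.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2019-07-15 PAGE XML schema (assumed version; older files use other dates).
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

# Hypothetical, hand-written sample page with one region and two lines.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page_0001.png" imageWidth="1000" imageHeight="1400">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,300 100,300"/>
      <TextLine id="r1_l1">
        <Coords points="100,100 900,100 900,180 100,180"/>
        <TextEquiv><Unicode>First line of text</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1_l2">
        <Coords points="100,200 900,200 900,280 100,280"/>
        <TextEquiv><Unicode>Second line of text</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
""".strip()


def bounding_box(points):
    """Turn a PAGE 'points' polygon string into (x_min, y_min, x_max, y_max)."""
    pairs = [tuple(int(v) for v in p.split(",")) for p in points.split()]
    xs, ys = zip(*pairs)
    return (min(xs), min(ys), max(xs), max(ys))


def extract_lines(page_xml):
    """Return one dict per TextLine with its id, bounding box, and text."""
    root = ET.fromstring(page_xml)
    lines = []
    for line in root.iterfind(".//pc:TextLine", NS):
        coords = line.find("pc:Coords", NS)
        text_el = line.find("pc:TextEquiv/pc:Unicode", NS)
        lines.append({
            "id": line.get("id"),
            "bbox": bounding_box(coords.get("points")),
            "text": text_el.text if text_el is not None and text_el.text else "",
        })
    return lines


if __name__ == "__main__":
    lines = extract_lines(SAMPLE)
    for entry in lines:
        print(entry["id"], entry["bbox"], entry["text"])
    # Flattening to plain text keeps the words but discards every bounding box.
    plain_text = "\n".join(entry["text"] for entry in lines)
    print(plain_text)
```

With the coordinates preserved, a later step can place the recognized text back onto the page image, which is not possible once everything has been flattened to a plain-text string.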

#pagexml #altoxml #ocr4all

Frederik Elwert @[email protected] · 2024-06-13 · 15:01 UTC

A colleague just asked me about a good, free OCR software for a historical book they are scanning. I was checking out #OCR4all to see if I could recommend it. First thing on the "Getting started" page: A Linux terminal command to start docker … 😵‍💫 I’m not criticizing the project, which I think does important work, but it’s a rather peculiar definition of "all" …

#ocr4all

🌈 Lascapi ⁂ @[email protected] · 2024-04-16 · 13:47 UTC

Hi there :)
I am testing #ocr4all to recognize handwriting. ( #ocr #hwr #htr )
But I'm getting nowhere.
Maybe it's because of the models?! I only have the default ones, which are optimized for old French … that doesn't help … 😅

Has anyone already tried this and succeeded??

#question #RT appreciated 😌

#rt #question #htr #hwr #ocr #ocr4all

Monika Barget @[email protected] · 2024-04-10 · 21:05 UTC

@jomla @stabihh In the meantime, we have set up #ocr4all on our DSRI (Data Science Research Environment), and the workflow as a whole seems very transparent to us. However, we failed at #Layouterkennung (layout recognition) on the very first document. So... "read the docs"!

#ocr4all #layouterkennung

Monika Barget @[email protected] · 2023-11-26 · 10:14 UTC

@jomla @stabihh Unfortunately, I missed the workshop. But I am interested in inviting people with #OCR4all expertise to #Maastricht as speakers. Is anyone in the community interested? If so, feel free to PM me.

#ocr4all #maastricht

Monika Barget @[email protected] · 2023-11-26 · 10:10 UTC

@jomla @stabihh Once again I can't see any replies to the post, and I hope I'm not asking twice: how were your experiences? I am currently thinking about which #OCR infrastructure would be best for me and my faculty in the long term. I enjoy working with #Transkribus, but #OCR4all of course earns #OpenScience bonus points. However, I still know too little about practical experience with it for the #FrüheNeuzeit (early modern period) and would welcome an exchange.

#ocr #transkribus #ocr4all #openscience #fruheneuzeit

Annika Rockenberger (she/they) @[email protected] · 2023-07-11 · 08:53 UTC

#Day2 of #DH2023 pre-conference workshops. Today I am learning how to use #OCR4All. Hopefully, I can teach and tutor folks at the #UniversityOfOslo later. It could be especially useful for #MedievalManuscripts since we have a couple of projects that require good #OCR #HTR processing!

#day2 #dh2023 #ocr4all #universityofoslo #medievalmanuscripts #ocr

Clemens Radl @[email protected] · 2023-04-12 · 11:22 UTC

On his blog, Jonathan Green recommends #OCR4all for early printed books: http://researchfragments.blogspot.com/2023/04/ocr4all-is-good.html #ocr #digitalhumanities #dh (via https://archivalia.hypotheses.org/171036)

#dh #digitalhumanities #ocr #ocr4all