PDF Editor for repairing book scan OCR?

zabadoh@lemmy.ml · edited 2 years ago

ChickenBoo@lemmy.jnks.xyz · edited 2 years ago

If you want to host it locally, Stirling PDF can be run in Docker, and its OCR relies on a library built on Tesseract. It has a bunch of other handy PDF operations, too. I keep it around for the two times a year I need to merge, split, or decrypt PDFs.

https://github.com/Frooodle/Stirling-PDF/blob/main/HowToUseOCR.md

It can run OCR directly on a PDF and process multiple files at a time.
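
If you would rather script the batch step yourself, here is a rough sketch of the same idea outside Stirling PDF. It assumes the OCRmyPDF command-line tool (`ocrmypdf`, which is Tesseract-based) is installed and on your PATH. The function names, folder layout, and `_ocr` suffix are made up for illustration, and `--redo-ocr` is one reasonable choice for scans that already carry a bad text layer, not the only one.

```python
from pathlib import Path
import shutil
import subprocess


def build_ocr_command(src: Path, out_dir: Path, language: str = "eng") -> list[str]:
    """Build an ocrmypdf command line for one PDF (hypothetical helper)."""
    dst = out_dir / f"{src.stem}_ocr.pdf"
    # --redo-ocr asks ocrmypdf to replace an existing OCR text layer;
    # -l selects the Tesseract language pack to use.
    return ["ocrmypdf", "--redo-ocr", "-l", language, str(src), str(dst)]


def ocr_folder(in_dir: Path, out_dir: Path, language: str = "eng") -> list[Path]:
    """Run OCR on every PDF in in_dir and return the inputs that were processed."""
    # Assumption: ocrmypdf is installed locally; fail early if it is not.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for src in sorted(in_dir.glob("*.pdf")):
        # check=True raises CalledProcessError if ocrmypdf exits with an error.
        subprocess.run(build_ocr_command(src, out_dir, language), check=True)
        processed.append(src)
    return processed
```

Calling `ocr_folder(Path("scans"), Path("scans_ocr"))` would then write one `*_ocr.pdf` per input file, leaving the originals untouched.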

sibloure@beehaw.org · 2 years ago

This is amazing. I did not realize it existed. Thank you for sharing.