PDF OCR

UUID: pdf-ocr@schorschii
Last edited: 2024-11-26, 05:49
Last commit: [8fef138e] Update Czech translations (#570)

Quickly recognize text in images of PDF files via OCR

README

PDF OCR

Runs optical character recognition (OCR) on the selected PDF files.

DESCRIPTION

A scanned PDF file contains only an image of the text. OCR adds a text layer so that you can select and copy text from the PDF. The resulting PDF is saved as a new file with the suffix ".ocr.pdf".
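
The step described above can be sketched roughly as follows. This is a minimal illustration and not the action's actual script: the function names are hypothetical, and it assumes that the suffix replaces the original ".pdf" extension (so "scan.pdf" becomes "scan.ocr.pdf"). The only external call is the basic "ocrmypdf INPUT OUTPUT" command-line form.

```python
import subprocess
from pathlib import Path


def ocr_output_path(pdf: Path) -> Path:
    """Return the output path next to the input file.

    Assumption: ".ocr.pdf" replaces the final ".pdf" extension.
    """
    return pdf.with_suffix(".ocr.pdf")


def build_ocr_command(pdf: Path) -> list[str]:
    """Build the ocrmypdf call as an argument list.

    Each path is a single list element, so a file name that
    contains spaces is not split into several arguments.
    """
    return ["ocrmypdf", str(pdf), str(ocr_output_path(pdf))]


def run_ocr(pdf: Path) -> None:
    """Run ocrmypdf on one file; raises CalledProcessError on failure."""
    subprocess.run(build_ocr_command(pdf), check=True)
```

Calling run_ocr(Path("scan.pdf")) would then write "scan.ocr.pdf" next to the original, provided ocrmypdf is installed.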

DEPENDENCIES

The following programs must be installed and available:

  • ocrmypdf for PDF processing
  • zenity for the progress dialog
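
One way to check whether both programs are available on the PATH is sketched below. It is an illustrative helper, not part of the action itself; the names REQUIRED_PROGRAMS and missing_programs are made up for this example.

```python
import shutil

# Programs listed under DEPENDENCIES above.
REQUIRED_PROGRAMS = ("ocrmypdf", "zenity")


def missing_programs(programs=REQUIRED_PROGRAMS, which=shutil.which):
    """Return the names of programs that cannot be found on the PATH.

    The lookup function is a parameter so that the check can be
    exercised without the programs actually being installed.
    """
    return [name for name in programs if which(name) is None]


if __name__ == "__main__":
    missing = missing_programs()
    if missing:
        print("Missing programs: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

If anything is reported as missing, install it with your distribution's package manager before using the action.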

3 Comments

uditarenos, 9 months ago
Thank you for your work! I've used it before. There might be a problem with space characters in file names. For a file named "Date Example.pdf", the dialog gives me:
ERROR - InputFileError: File not found - Date
ERROR - InputFileError: File not found - Example.pdf
Nassos Kranidiotis, 10 months ago
Thank you for the nice action. Please consider adding support for the Greek language.
Julian Groß, 5 months ago
This should work now if you update: https://github.com/linuxmint/cinnamon-spices-actions/pull/413
You just need the relevant tesseract language package installed, e.g. tesseract-ocr-ell.