Process Plan: Extract text using OCR

This Process Plan creates OCR text on all pages and extracts the text into a .txt file next to the input file.

The earliest version with full support for “Export text using OCR.kfpx” is pdfToolbox 15.

  1. Any existing OCR text is deleted to make sure that the new OCR text is good.
  2. Each page is converted into an image.
  3. New OCR text is created for all pages.
  4. The text is extracted into a .txt file that is saved next to the input file.
  5. After the engine has extracted the OCR text into the text file, the original file is picked up again.
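
The outcome of the steps above can be checked from a script. The following is a minimal Python sketch, not part of pdfToolbox: the helper names (expected_text_path, file_digest, check_result) are hypothetical, and it assumes that the .txt file uses the same base name as the input file, which the description above does not state explicitly. It checks that the text file exists and that the input PDF is byte-for-byte unchanged, as expected when the original file is picked up in the last step.

```python
import hashlib
from pathlib import Path


def expected_text_path(input_pdf: Path) -> Path:
    """Return the assumed location of the extracted text file.

    Assumption: the .txt file sits next to the input file and
    shares its base name (e.g. sample.pdf -> sample.txt).
    """
    return input_pdf.with_suffix(".txt")


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_result(input_pdf: Path, digest_before: str) -> bool:
    """Return True if the text file exists and the input PDF is unchanged."""
    text_file = expected_text_path(input_pdf)
    return text_file.is_file() and file_digest(input_pdf) == digest_before
```

To use it, record file_digest(input_pdf) before running the Process Plan and call check_result(input_pdf, digest_before) afterwards.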

Why does pdfToolbox not show a "Save as" dialog when executing the Process Plan?

Because the last step in the Process Plan is a "File pick up", the original file is restored, so there is no modified result that needs to be saved. In such cases, the "Do not save modified input file" checkbox can be activated in the Process Plan parameters to suppress the "Save as" dialog.