Quick Check – Introduction

Quick Check functionality in pdfToolbox makes it possible to retrieve certain information from a PDF very quickly.

Internal tests at callas software have shown that it typically takes 5 seconds to retrieve and return complete page size information for all pages in a 40,000-page PDF document.

For deep and extensive analysis of a PDF file, using a preflight profile remains the way to go. Where more basic (and even not so basic) information is of interest, Quick Check provides an extremely fast and lightweight mechanism to retrieve it. Quick Check is highly configurable, which makes it possible to retrieve and return only the information that is actually needed.

Typical examples of information that Quick Check can retrieve:

  • number of pages
  • page sizes
  • color spaces used (e.g. whether RGB is used)
  • spot colors used
  • fonts used
  • layers used
  • whether the PDF claims conformance with PDF/X, PDF/A, PDF/UA, PDF/VT or PDF/E
  • output intents
  • information about bookmarks
  • information about embedded files
  • use of transparency (opacity, blend mode, soft masks)
  • image resolution (or any other information pertaining to images)
  • XMP metadata
  • information about markup annotations or other annotations

If you need information that is not currently available in Quick Check, please use the commenting feature at the end of this article to request it for a future version.

How to use Quick Check

Quick Check is available in two ways:

  • As a step in a Process Plan:
    This makes it possible to first quickly retrieve some core data for a PDF that is about to be processed, and then use that data (via a JavaScript step or JavaScript inside other Process Plan steps) to decide how to process the PDF. For example, if no non-CMYK color data is present, there is no need to run a "Convert to CMYK" Fixup.
  • As a call on the command line (requires the Server or command line version of pdfToolbox):
    In a special Quick Check execution mode, only a small part of the pdfToolbox command line executable is launched, cutting down on launch time and use of resources; core data retrieved can be used by the system that drives pdfToolbox, for example to deliver immediate information on incoming PDF files or to decide about further processing.

Configuring Quick Check

Quick Check is configured by a list of filter expressions that whitelist or blacklist different sets and subsets of the available data. For a Quick Check step in a Process Plan, the filter expressions have to be provided as an array through a JavaScript variable. On the command line, a simple text-based configuration file with one filter expression per line is used.
A separate article describes Quick Check configuration in more detail.
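
The two configuration forms described above carry the same list of filter expressions, so one can be derived from the other. The following is a minimal sketch in plain JavaScript, not part of pdfToolbox: the helper name toConfigFileText is hypothetical, and the filter strings are placeholders rather than real Quick Check filter syntax, which is covered in the configuration article.

```javascript
// Hypothetical helper (not a pdfToolbox API): turns the array form used in a
// Process Plan into the one-expression-per-line text used on the command line.
function toConfigFileText(filters) {
  if (!Array.isArray(filters)) {
    throw new TypeError("filters must be an array of strings");
  }
  const cleaned = filters.map((f) => String(f).trim());
  // Each expression has to fit on exactly one non-empty line.
  if (cleaned.some((f) => f.length === 0 || /[\r\n]/.test(f))) {
    throw new Error("each filter expression must be a non-empty single line");
  }
  return cleaned.join("\n") + "\n";
}

// Placeholder expressions; replace them with real Quick Check filters.
const filters = ["<filter expression 1>", "<filter expression 2>"];
const configText = toConfigFileText(filters);
```

Keeping the filter list in one place and generating the text file from it is one way to make sure a Process Plan and a command line integration request the same data.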

Quick Check output

For a Quick Check step in a Process Plan, the result of running a Quick Check is added to the app.vars data object (a data structure in the form of a JavaScript variable that is maintained and enriched throughout the execution of a Process Plan). Subsequent steps in a Process Plan can retrieve data from the app.vars data object and use it for processing.

For Quick Check execution on the command line, the output is created in the form of a JSON file.

The content of this JSON file is completely equivalent to the data substructure added to the app.vars object when using Quick Check in a Process Plan.
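
Because both routes deliver equivalent data, the same decision logic can be applied to either. The sketch below illustrates this with the "Convert to CMYK" decision mentioned earlier. The result shape used here (a colorSpaces array of color space names) and the function name needsCmykConversion are assumptions made for illustration only; the actual Quick Check data structure may be organized differently.

```javascript
// Assumed, hypothetical result shape (the real Quick Check structure may differ):
//   { colorSpaces: ["DeviceCMYK", "DeviceRGB", ...] }
function needsCmykConversion(quickCheckResult) {
  const spaces = quickCheckResult && quickCheckResult.colorSpaces;
  // If the color space data is missing, stay conservative and convert.
  if (!Array.isArray(spaces)) {
    return true;
  }
  // Conversion is needed as soon as any non-CMYK color space is reported.
  return spaces.some((name) => name !== "DeviceCMYK");
}

// Command line case: content of the JSON output file, inlined as a string here.
const jsonText = '{"colorSpaces": ["DeviceCMYK", "DeviceRGB"]}';
const fromFile = JSON.parse(jsonText);

// Process Plan case: an equivalent object as it might be read from app.vars.
const fromVars = { colorSpaces: ["DeviceCMYK"] };

const convertFile = needsCmykConversion(fromFile); // true
const convertVars = needsCmykConversion(fromVars); // false
```

Treating missing data as "conversion needed" is a design choice of this sketch: it avoids skipping a Fixup just because the relevant information was filtered out of the Quick Check configuration.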