Logging comparison-related information

The "Compare documents" feature in pdfToolbox compares two PDFs using a resolution setting, among other parameters.

This article explains the two available options for logging comparison-related information:

  1. Compare.log
  2. Compare JSON report (available starting with pdfToolbox 17)

Compare.log

For comparison-related information to be logged, a file named 'Compare.log' has to be present in the user preferences. Once the file is in place, simply running the compare functionality logs all comparison parameters to it.

What's inside 'Compare.log'

Compare.log is a tab-delimited file containing the following information (a sample is attached at the bottom of this article):

timestamp	doc1	page1	resolution1	rect1 [left,bottom,right,top]	doc2	page	resolution1	rect2 [left,bottom,right,top]	algorithm	anchor	anchorBox	threshold	areaThreshold	diff.min	diff.max	area

Please note that the headers should be tab-separated.
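
As an alternative to downloading the sample, the header-only file can be generated with a short script. The following is a minimal Python sketch, not an official tool: the function names and the `preferences_dir` argument are hypothetical, and it assumes that a file containing only the header line shown above is sufficient. The location of the preferences folder is not filled in here and must be supplied by the caller.

```python
from pathlib import Path

# Column names copied verbatim from the header line shown above
# (the repeated "resolution1" is taken over as-is).
COMPARE_LOG_COLUMNS = [
    "timestamp", "doc1", "page1", "resolution1",
    "rect1 [left,bottom,right,top]",
    "doc2", "page", "resolution1",
    "rect2 [left,bottom,right,top]",
    "algorithm", "anchor", "anchorBox",
    "threshold", "areaThreshold", "diff.min", "diff.max", "area",
]


def header_line() -> str:
    """Return the header as a single tab-separated line."""
    return "\t".join(COMPARE_LOG_COLUMNS)


def create_compare_log(preferences_dir) -> Path:
    """Create Compare.log with the header line unless it already exists.

    preferences_dir is a hypothetical parameter: pass the preferences
    folder that applies to your installation.
    """
    log_path = Path(preferences_dir) / "Compare.log"
    if not log_path.exists():
        log_path.write_text(header_line() + "\n", encoding="utf-8")
    return log_path
```

An existing file is left untouched, so entries that have already been logged are not overwritten.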

Where to add Compare.log

The log file has to be stored in the User Preferences, as shown below:

Server/CLI

To log comparisons on the CLI, place 'Compare.log' here.

Desktop

To log comparisons on Desktop, place 'Compare.log' here.

Example

A simple compare command such as the following

pdfToolbox --compare <PDF file 1> <PDF file 2>

logs the information to 'Compare.log', provided that 'Compare.log' is present under User Preferences, as shown here:

timestamp	doc1	page1	resolution1	rect1 [left,bottom,right,top]	doc2	page	resolution1	rect2 [left,bottom,right,top]	algorithm	anchor	anchorBox	threshold	areaThreshold	diff.min	diff.max	area			
20211102_134920	PDF file 1.pdf	0	72	[0,420,578,0]	PDF file 2.pdf	0	72	[0,420,578,0]	RGB	TopLeft	CropBox	0	0	0	0	1.96078	10.6706	
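
Because the file is tab-delimited, logged entries can be read back with a few lines of code. The following Python sketch is an illustration only; `parse_compare_log` and `parse_timestamp` are hypothetical helper names. It returns rows as plain lists and relies on positional access, because the header repeats the name `resolution1` and the sample row above carries one more value than the header has names. Stripping trailing tabs and the timestamp layout `YYYYMMDD_HHMMSS` are assumptions inferred from the sample.

```python
from datetime import datetime


def parse_compare_log(text):
    """Split Compare.log content into (header_names, data_rows).

    Blank lines are skipped and trailing tabs are stripped
    (an assumption based on the sample shown above).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return [], []
    cells = [line.rstrip("\t").split("\t") for line in lines]
    return cells[0], cells[1:]


def parse_timestamp(value):
    """Parse a timestamp such as 20211102_134920 (layout assumed from the sample)."""
    return datetime.strptime(value, "%Y%m%d_%H%M%S")
```

With the sample entry above, `rows[0][1]` is the name of the first document and `rows[0][3]` its resolution. Since names and values do not line up one to one in the sample, check the column positions against your own log before relying on them.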

Sample

Simply download the attached 'Compare.log' and copy it into your Preferences.

Machine-readable JSON comparison report

Starting with pdfToolbox 17, comparison results can also be written as a machine-readable JSON report. The JSON report contains the same comparison data that was previously available only through Compare.log.

Unlike the visual PDF comparison report, the JSON report is intended for automated workflows and integrations. Its structured format lets scripts, quality assurance systems, or other external applications parse and process comparison results without having to interpret a visual PDF report.

The compare JSON report can be generated in three different ways:

  1. In a Process Plan, on the outgoing connection of a Compare PDFs step.
  2. In pdfToolbox Desktop, as an additional report when using Compare documents.
  3. From the command line, by using the --compare action together with the --format=json option.
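
For the command-line route, a script can assemble the call and hand it to the operating system. The sketch below is a hedged Python illustration: `build_compare_command` and `run_compare` are hypothetical helpers, the executable name `pdfToolbox` is taken from the example command earlier in this article, and placing `--format=json` directly after `--compare` is an assumption. Where the JSON report ends up (standard output or a file) is not stated in this article, so the sketch only returns the finished process and leaves reading the report to the caller.

```python
import subprocess


def build_compare_command(pdf_a, pdf_b, executable="pdfToolbox"):
    """Build the argument list for a compare run with JSON output.

    The position of --format=json is an assumption; adjust it if your
    pdfToolbox version expects a different order.
    """
    return [executable, "--compare", "--format=json", str(pdf_a), str(pdf_b)]


def run_compare(pdf_a, pdf_b, executable="pdfToolbox"):
    """Run the compare command and return the CompletedProcess.

    Output is captured as text; check returncode, stdout and stderr
    to see what your installation actually produced.
    """
    command = build_compare_command(pdf_a, pdf_b, executable)
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, such as 'PDF file 1.pdf'.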