The JSON log files

There are three JSON logging files that will get created during each successfully completed invocation of pdfToolbox:

  1. launch.json:
    created immediately upon launching pdfToolbox CLI, or upon invoking processing of a PDF in the desktop version of pdfToolbox. At this point the kfpx file has not been loaded yet and the PDF file to be processed has not been accessed yet; only minimal information about the environment and context is collected, including the content of the command line, an automatically generated unique ID (app_uuid) and the job ID (if one was passed as a parameter on the command line)
  2. init.json:
    gets written once the kfpx file is loaded and parsed, the PDF file to be processed is opened and some basic information extracted from the PDF file
  3. finish.json:
    gets written once pdfToolbox exits (regardless of whether processing was successful or not).

If init.json or finish.json is missing from the log files, this indicates that pdfToolbox terminated prematurely without writing it, which can serve as a good basis for post-mortem analysis. If init.json was not written, something probably went wrong while reading the kfpx file, accessing the PDF file for some basic analysis, or similar. If only finish.json is missing, something went wrong during execution of the kfpx profile, or when creating reports or producing some other output.
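
The diagnosis described above can be automated. The following is a minimal sketch, assuming that the three files of one invocation sit together in a single folder; the function name diagnose_invocation and the returned status strings are illustrative and not part of pdfToolbox.

```python
from pathlib import Path


def diagnose_invocation(log_folder):
    """Classify one invocation by the log files present in its folder."""
    folder = Path(log_folder)
    present = {
        name
        for name in ("launch", "init", "finish")
        if (folder / f"{name}.json").is_file()
    }

    if "launch" not in present:
        # Nothing was logged for this folder at all.
        return "no_launch"
    if "init" not in present:
        # Probably a problem reading the kfpx file or accessing the PDF.
        return "failed_before_init"
    if "finish" not in present:
        # Probably a problem while executing the profile or creating output.
        return "failed_during_processing"
    return "complete"
```

A monitoring script could call diagnose_invocation for every log folder and flag all results other than "complete" for closer inspection.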

Structure of 'launch.json'

name value
verb type of logging file. Possible values: launch. 
Note: The other values (init, finish) are only used by the corresponding init.json and finish.json files.
app_uuid automatically generated unique ID on a per invocation basis. This unique ID is guaranteed to be the same value across all log files belonging to the same invocation (and also identical to the parent folder name containing the corresponding .json files)
timestamp timestamp. Format: YYYY/MM/DD hh:mm:ss
timestamp_hour the hour portion of the timestamp in the format hh (e.g. 20)
timestamp_month the month portion of the timestamp in the format MM (e.g. 07)
timestamp_weekday the weekday portion of the timestamp in the format W (e.g. 1 for Monday)
job_id a job ID provided via command line argument (not available in desktop version)
process_id process ID of the process in the operating system
filename file name of the file to be processed
filepath folder path and file name of the file to be processed
cli_params command line parameters
program_name name of the program (e.g. callas pdfToolbox CLI (x64))
program_version version of the program (e.g. 10.0.461)
platform platform on which pdfToolbox is executed (e.g. Mac OS X 10.10.5)
machine_ips list of IP addresses of the machine on which pdfToolbox is executed, in the form of an array of strings. Example: machine_ips : [ "192.168.1.1", "123.45.67.89"]
Note: As machines can have more than one IP address, this entry is structured as a list of entries.
Known limitation: implemented for Windows/Mac OS X/Linux platforms only
machine_name machine name of the machine on which pdfToolbox is executed
machine_uuid UUID of the machine on which pdfToolbox is executed
Note: this is a UUID derived from hardware parameters of the current machine, with some parts of the information removed such that the UUID is still unique but the hardware parameters as such cannot be derived from the UUID.
temp_folder folder path to the temp folder used during invocation of pdfToolbox

Example content for launch.json

{
    "verb" : "launch",
    "app_uuid" : "0a0ceba3-7ca9-421f-a973-8caae2950690",
    "timestamp" : "2016/10/31 20:08:27",
    "timestamp_hour" : 20,
    "timestamp_month" : 10,
    "timestamp_weekday" : 1,
    "process_id" : 29543,
    "job_id" : "some_job_ID_provided_through_cli_parameter",
    "filename" : "4-Catching Text in PDFs - Michael Fuchs.pdf",
    "filepath" : "/var/folders/80a56def-aa5f-418b-9685-8614ab6d41c2/0x10dd99000/4-Catching Text in PDFs.pdf",
    "cli_params" : ["--hitsperpage=50", "--report=ERROR,WARNING,TEMPLATE=OVERVIEW"],
    "program_name" : "callas pdfToolbox CLI (x64)",
    "program_version" : "9.1.417",
    "platform" : "Mac OS X 10.10.5",
    "machine_ips" : ["192.168.17.63"],
    "machine_name" : "pdftoolbox_satellite_3",
    "machine_uuid" : "---6E851-41E8-5060-B0A7-C0550F43E418",
    "temp_folder" : "/var/folders/yb/16cjr4dn2c5_27r2khx1bvbh0000gn/T/com.callassoftware.pdfToolbox/32e5717f-240a-4a17-86fe-a95551808721",
    "temp_folder_hdd" : ""
}
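
The timestamp entries lend themselves to a quick plausibility check. The following is a minimal sketch, assuming the content of launch.json has already been loaded into a Python dictionary; the helper names parse_timestamp and derived_fields_match are illustrative only.

```python
from datetime import datetime

# Corresponds to the documented format YYYY/MM/DD hh:mm:ss
TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M:%S"


def parse_timestamp(entry):
    """Turn the timestamp entry of a log dictionary into a datetime."""
    return datetime.strptime(entry["timestamp"], TIMESTAMP_FORMAT)


def derived_fields_match(entry):
    """Check that timestamp_hour and timestamp_month agree with timestamp."""
    ts = parse_timestamp(entry)
    return (
        entry["timestamp_hour"] == ts.hour
        and entry["timestamp_month"] == ts.month
    )


launch = {
    "timestamp": "2016/10/31 20:08:27",
    "timestamp_hour": 20,
    "timestamp_month": 10,
    "timestamp_weekday": 1,
}
print(derived_fields_match(launch))
```

For the timestamp in the example, Python's isoweekday() returns 1 (Monday), which matches the timestamp_weekday value shown. Which number is used for Sunday is not stated here, so the weekday is deliberately left out of the check.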

Structure of init.json

Description of each entry in init.json, in addition to those entries described above for launch.json. The verb entry has a value of "init".

name value
doc_created timestamp when document was created; same format as timestamp entry, with milliseconds appended (e.g. "2015/06/03 12:23:53.000")
doc_id1 first of the two values in the document ID entry in the document (e.g. "DB10E96543FE2B4E93226FA065FE83BC")
doc_id2 second of the two values in the document ID entry in the document (e.g. "00AFB863B1344FDDBE90D24E513AB992")
doc_modified timestamp when document was last modified; same format as timestamp entry, with milliseconds appended (e.g. "2015/06/08 17:03:18.000")
doc_pages number of pages in the document
doc_size size of the PDF file in bytes (e.g. "863672", equivalent to ca. 863 KB)
firstpage_size structure reflecting the size of the first page in the PDF (see further below for a definition of the firstpage_size structure)
pdf_creator value of the creator entry in the document metadata (e.g. "Acrobat PDFMaker 10.1 for PowerPoint")
pdf_encrypted whether the PDF is encrypted; possible values: 0 (not encrypted) and 1 (encrypted)
Note: for encrypted PDF files an init.json file is only written when a correct password is specified
pdf_version PDF version of the PDF file (e.g. 1.7)
pdf_writer value of the writer entry in the document metadata (e.g. Adobe PDF Library 10.0)
pdfa_version if present, the PDF/A version (e.g. "PDF/A-3u")
pdfe_version if present, the PDF/E version (e.g. "PDF/E-2r")
pdfua_version if present, the PDF/UA version (e.g. "PDF/UA-1")
pdfx_oi_icc_name if present, the name of the PDF/X OutputIntent profile (e.g. "PSO Coated v3")
pdfx_oi_info if present, the text in the OutputIntent Info field
pdfx_oi_output_cond_id if present, the value of the PDF/X OutputIntentIdentifier (e.g. "FOGRA39")
pdfx_version if present, the PDF/X version (e.g. "PDF/X-1a")
profile_filename file name of the kfpx profile (e.g. "Convert to PDFA-1a.kfpx")
profile_id internal ID of the kfpx profile (e.g. "P959c755539c8439e62c516c66a4a9097")
profile_name human readable name of the kfpx profile (e.g. "Sheetfed offset (CMYK, RGB and spot colors) (GWG 2015)")
variables data structure representing the variables as evaluated upon initiating processing (equivalent to the JavaScript object app.variables; for details see documentation on "Variables and JavaScript")

'firstpage_size' entry

The 'firstpage_size' structure reflects the various page geometry boxes:

  • Each of the page geometry boxes (mediabox, cropbox, bleedbox, trimbox, artbox) has an array of four entries as its value, in the order left, bottom, right, top.
  • Each value in the array is expressed in pt (1/72 inch).
  • The cropsize represents the effective width (w) and height (h) of the CropBox, or in its absence that of the MediaBox.
  • The trimsize represents the effective width (w) and height (h) of the TrimBox, or in its absence that of the CropBox, or in its absence that of the MediaBox.
  • The bleed represents the effective bleed on the four sides (in the order left, bottom, right, top), based on the trimsize. If the BleedBox is missing, all four values are 0 (zero).

Structure of firstpage_size entry

"firstpage_size" : {
  "mediabox" : [l, b, r, t],
  "cropbox"  : [l, b, r, t],
  "bleedbox" : [l, b, r, t],
  "trimbox"  : [l, b, r, t],
  "artbox"   : [l, b, r, t],
  "cropsize" : [w, h],
  "trimsize" : [w, h],
  "bleed"    : [l, b, r, t]
}
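
The fallback rules for cropsize, trimsize and bleed can be sketched as follows. This is an illustration only: the function names are made up, and the assumption that bleed is the distance between the effective trim box and the BleedBox on each side is derived from the description above, not from the pdfToolbox implementation.

```python
def box_size(box):
    """Return [width, height] of a box given as [left, bottom, right, top]."""
    left, bottom, right, top = box
    return [right - left, top - bottom]


def derive_sizes(mediabox, cropbox=None, bleedbox=None, trimbox=None):
    """Derive cropsize, trimsize and bleed; missing or empty boxes fall back."""
    effective_crop = cropbox or mediabox
    effective_trim = trimbox or effective_crop

    if bleedbox:
        # Assumed: bleed is measured outwards from the effective trim box.
        bleed = [
            effective_trim[0] - bleedbox[0],  # left
            effective_trim[1] - bleedbox[1],  # bottom
            bleedbox[2] - effective_trim[2],  # right
            bleedbox[3] - effective_trim[3],  # top
        ]
    else:
        bleed = [0, 0, 0, 0]

    return {
        "cropsize": box_size(effective_crop),
        "trimsize": box_size(effective_trim),
        "bleed": bleed,
    }
```

For a page with a TrimBox of [10, 10, 551, 565] and a BleedBox of [5, 5, 559, 570], this yields a trimsize of [541, 555] and a bleed of [5, 5, 8, 5].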

Example content for init.json

{
    "verb" : "init",
    "app_uuid" : "0a0ceba3-7ca9-421f-a973-8caae2950690",
    "timestamp" : "2016/10/31 20:08:27",
    "timestamp_hour" : 20,
    "timestamp_month" : 10,
    "timestamp_weekday" : 1,
/*  ... and all further entries defined for "launch.json" */


    "profile" : "P959c755539c8439e62c516c66a4a9097",
    "profile_name" : "Sheetfed offset (CMYK, RGB and spot colors) (GWG 2015)",

    "doc_size" : 863672,
    "doc_created" : "2015/06/03 12:23:53.000",
    "doc_modified" : "2016/10/31 20:08:28.000",
    "pdf_version" : "1.5",
    "pdf_creator" : "Acrobat PDFMaker 10.1 für PowerPoint",
    "pdf_writer" : "Adobe PDF Library 10.0",
    "doc_id1" : "DB10E96543FE2B4E93226FA065FE83BC",
    "doc_id2" : "00AFB863B1344FDDBE90D24E513AB992",

    "doc_pages" : 23,
    "firstpage_size" : {
      "mediabox" : [-10, 0, 581, 615],
      "cropbox"  : [-10, 0, 581, 615],
      "bleedbox" : [5, 5, 559, 570],
      "trimbox"  : [10, 10, 551, 565],
      "artbox"   : [],
      "cropsize" : [591, 615],
      "trimsize" : [541, 555],
      "bleed"    : [5, 5, 8, 5]
    },
 
 
    "variables" : {
        "Calcs_for_LFP_Preflight_-_viewing_distance" : {
            "eff_min_fontsize":null,
            "eff_min_imageresolution":null
        },
        "eff_min_fontsize":200,
        "eff_min_imageresolution":40,
        "input_scalingfactor":100,
        "input_viewingdistance":10
    }
}

Structure of finish.json

Description of each entry in finish.json (in addition to those entries described above for "launch.json" and for "init.json")

name value
retcode the program exit code (see pdfToolbox CLI manual for details).
Note: value is provided as an integer.
duration duration of processing (essentially the difference between timestamp at finish and timestamp at launch), formatted as hh:mm:ss:ttt (e.g. "0:00:11:045")
doc_corrections number of corrections applied during processing
doc_max_severity maximum severity; defined values are:  3 = error, 2 = warning, 1 = info, 0 = no message
doc_messages number of messages, i.e. the combined total of error, warning and info messages
doc_errors number of error messages
doc_errors_list an array of error details; each array entry contains an error-name together with its counter
doc_warnings number of warning messages
doc_warnings_list an array of warning details; each array entry contains a warning-name together with its counter
doc_infos number of info messages
doc_infos_list an array of info details; each array entry contains an info-name together with its counter
num_images number of images
num_fonts number of fonts; two different font resources where the font name happens to be the same are counted as two fonts
fonts data structure representing the fonts in the PDF file
num_spotcolors number of spot colors; i.e. all Separation colour spaces whose name is not one of Cyan, Magenta, Yellow, Black, All or None
spotcolor_names array of spot colour names; for example: "spotcolor_names" : [ "Orange", "Purple" ]
num_icc_profiles number of all ICC profiles; excluding ICC profiles in output intents
icc_profiles_gray array of names of gray ICC profiles; excluding ICC profiles in output intents; the CalGray colourspace, which strictly speaking is not an ICC based colourspace, is reported here as "CalGray"; for example: "icc_profiles_gray" : [ "Generic Gray Profile", "Gamma 2.2 Gray" ]
icc_profiles_rgb array of names of 3-component ICC profiles; excluding ICC profiles in output intents. RGB ICC profiles are reported by their name (content of the 'desc' field); the CalRGB colourspace, which strictly speaking is not an ICC based colourspace, is reported here as "CalRGB"; the Lab colourspace, which strictly speaking is not an ICC based colourspace either, is reported here as "Lab"; for example: "icc_profiles_rgb" : [ "eciRGB v2", "CalRGB", "Lab" ]
icc_profiles_cmyk array of names of CMYK ICC profiles; excluding ICC profiles in output intents; for example: "icc_profiles_cmyk" : [ "PSO Coated v3", "US Web Coated SWOP" ]
icc_profiles_lab array of names of "Lab" ICC profiles; the "Lab" colour space, which strictly speaking is not an ICC based colour space, is still reported here as "Lab"; for example: "icc_profiles_lab" : [ "Lab" ]
pdfx_version if present, the PDF/X version (e.g. "PDF/X-1a")
pdfx_oi_output_cond_id if present, the value of the PDF/X OutputIntentIdentifier (e.g. "FOGRA39")
pdfx_oi_info if present, the text in the OutputIntent Info field
pdfx_oi_icc_name if present, the name of the PDF/X OutputIntent profile (e.g. "PSO Coated v3")
pdfa_version if present, the PDF/A version (e.g. "PDF/A-3u")
pdfua_version if present, the PDF/UA version (e.g. "PDF/UA-1")
pdfe_version if present, the PDF/E version (e.g. "PDF/E-2r")
pdf_encrypted whether the PDF is encrypted; possible values: 0 (not encrypted) and 1 (encrypted)
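
For further processing it can be handy to convert the duration string and the doc_max_severity number into native values. The following is a minimal sketch; the names parse_duration and SEVERITY_LABELS are illustrative, while the label texts follow the values listed above.

```python
from datetime import timedelta

# Labels as documented for doc_max_severity
SEVERITY_LABELS = {0: "no message", 1: "info", 2: "warning", 3: "error"}


def parse_duration(text):
    """Convert a duration such as "0:00:11:045" (hh:mm:ss:ttt) to a timedelta."""
    hours, minutes, seconds, millis = (int(part) for part in text.split(":"))
    return timedelta(
        hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis
    )


duration = parse_duration("0:00:11:045")
print(duration.total_seconds(), SEVERITY_LABELS[3])
```

Working with timedelta values makes it straightforward to sum or average processing times across many finish.json files.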

Sub-structure for fonts

"fonts" : [
  {
    "fontname" : "TimesNewRomanPS-BoldMT",
    "fonttype" : "Type1",
    "embedded" : 1,
    "subset"   : 1
  },
  {
    "fontname" : "MyriadPro-BoldItalic",
    "fonttype" : "Type0",
    "embedded" : 1,
    "subset"   : 1
  }
]

Sub-structure for doc_errors_list/doc_warnings_list/doc_infos_list

Each entry in each of the three lists contains a key-value pair, where the key is the name of a check (as configured in the kfpx profile used) and the value is a number (shown below as the placeholders n, m, o, p, q, r, s, t and u) that reflects how many times the respective error, warning or info was triggered.

"doc_errors_list" : [
   { 
     "name of check that has triggered an error" : n 
   },
   {
      "another name of a check that has triggered an error" : m
   },
   {
     "yet another name of a check that has triggered an error" : o
   }
]
"doc_warnings_list" : [
   {
     "name of check that has triggered a warning" : p
   },
   {
     "another name of a check that has triggered a warning" : q
   },
   {
     "yet another name of a check that has triggered a warning" : r
   }
]
"doc_infos_list" : [
   {
     "name of check that has triggered an info message" : s
   },
   {
     "another name of a check that has triggered an info message" : t
   },
   {
     "yet another name of a check that has triggered an info message" : u
   }
]
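
Because each list entry is a single key-value pair, the lists are easier to work with once they are merged into one dictionary per list. The following is a minimal sketch with an illustrative function name (flatten_check_list) and sample values.

```python
def flatten_check_list(entries):
    """Merge a list of {check name: count} objects into one dictionary."""
    counts = {}
    for entry in entries:
        for check_name, occurrences in entry.items():
            # Add up counts in case a check name appears more than once.
            counts[check_name] = counts.get(check_name, 0) + occurrences
    return counts


doc_errors_list = [
    {"DeviceRGB used": 28},
    {"TrimBox entry missing": 1},
]
errors = flatten_check_list(doc_errors_list)
print(errors["DeviceRGB used"], sum(errors.values()))
```

The same function works unchanged for doc_warnings_list and doc_infos_list.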

Example content for finish.json

{
    "verb" : "finish",
    "app_uuid" : "0a0ceba3-7ca9-421f-a973-8caae2950690",
    "timestamp" : "2016/10/31 20:08:27",
    "timestamp_hour" : 20,
    "timestamp_month" : 10,
    "timestamp_weekday" : 1,
/*  ... and all further entries defined for "launch.json" */
 
 
    "profile" : "P959c755539c8439e62c516c66a4a9097",
    "profile_name" : "Sheetfed offset (CMYK, RGB and spot colors) (GWG 2015)",
/*  ... and all further entries defined for "init.json" */


    "retcode" : 8,
    "duration" : "0:00:11:045",


    "doc_corrections" : 870,
    "doc_max_severity" : 3,
    "doc_messages" : 236,
    "doc_errors" : 160,
    "doc_errors_list" : [
       {
         "Font not embedded" : 17
       },
       {
         "DeviceRGB used" : 28
       },
       {
         "TrimBox entry missing" : 1
       }
    ],
    "doc_warnings" : 76,
    "doc_warnings_list" : [
       {
         "Resolution less than 200ppi for continuous tone image" : 3
       },
       {
         "Page empty" : 2
       }
    ],
    "doc_infos" : 0,
    "doc_infos_list" : [
       {
         "Uses spot color" : 61
       },
       {
         "Uses transparency" : 39
       }
    ],
 
    "pdf_encrypted" : 0,   
 
    "num_images" : 70,
 
    "num_fonts" : 2,
 
    "fonts" : [
      {
        "fontname" : "TimesNewRomanPS-BoldMT",
        "fonttype" : "Type1",
        "embedded" : 1,
        "subset"   : 1
      },
      {
        "fontname" : "MyriadPro-BoldItalic",
        "fonttype" : "Type0",
        "embedded" : 1,
        "subset"   : 1
      }
    ],
 
 
    "num_spotcolors" : 3,
    "spotcolor_names" : [
      "Orange",
      "Purple",
      "Varnish"
    ],
 
 
    "num_icc_profiles" : 3,
    "icc_profiles_gray" : [
    ],
    "icc_profiles_rgb" : [
      "sRGB",
      "eciRGB v2"
    ],
    "icc_profiles_cmyk" : [
       "PSO Coated v3"
    ],


    "pdfx_version" : "PDF/X-4",
    "pdfx_oi_output_cond_id" : "FOGRA39",
    "pdfx_oi_info" : "Prepared for ISO 12647-2:2013, coated sheet fed offset",
    "pdfx_oi_icc_name" : "PSO Coated v3",
    "pdfa_version" : "PDF/A-2b",
    "pdfua_version" : "",
    "pdfe_version" : ""
}
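
When consuming finish.json it may be useful to cross-check the num_* counters against the corresponding lists. The following is a minimal sketch, assuming that each counter equals the length of its list and that num_icc_profiles equals the combined length of all icc_profiles_* lists. This holds for the example values, but it is an assumption rather than documented behaviour, and the function name count_mismatches is illustrative.

```python
ICC_LIST_KEYS = (
    "icc_profiles_gray",
    "icc_profiles_rgb",
    "icc_profiles_cmyk",
    "icc_profiles_lab",
)


def count_mismatches(finish):
    """Return the names of num_* entries that disagree with their lists."""
    expected = {
        "num_fonts": len(finish.get("fonts", [])),
        "num_spotcolors": len(finish.get("spotcolor_names", [])),
        # Assumed: the counter covers all icc_profiles_* lists together.
        "num_icc_profiles": sum(len(finish.get(k, [])) for k in ICC_LIST_KEYS),
    }
    return [name for name, count in expected.items() if finish.get(name) != count]


finish = {
    "num_fonts": 2,
    "fonts": [{"fontname": "TimesNewRomanPS-BoldMT"}, {"fontname": "MyriadPro-BoldItalic"}],
    "num_spotcolors": 3,
    "spotcolor_names": ["Orange", "Purple", "Varnish"],
    "num_icc_profiles": 3,
    "icc_profiles_gray": [],
    "icc_profiles_rgb": ["sRGB", "eciRGB v2"],
    "icc_profiles_cmyk": ["PSO Coated v3"],
}
print(count_mismatches(finish))
```

An empty result means that all three counters agree with their lists.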

