How to 'dockerize' pdfToolbox

Docker is an open platform for running applications in containers, a lightweight form of virtualization. It differs from the "full" operating system virtualization provided by, for example, VMware, VirtualBox, Parallels or Xen in that:

  • containers start very quickly (within milliseconds)
  • changes are not persistent (until explicitly committed)
  • a container stops when the dockerized application terminates (and is deleted if it was started with the --rm option)

Current limitations:

  • for now, only pdfToolbox for Linux can be dockerized
  • for now, only console-based applications can be dockerized; GUI applications in particular are not supported

Important to note here is that:

  • Docker works well on Linux and macOS.
  • A Windows variant is also available, but at the time of writing it proved unusable (slow and unstable).

Terminology

  • Docker image: An image is a collection of filesystem layers on the hard disk. You can think of it as a special kind of (ISO) image that includes everything needed to run an application: the program binary, the runtime files, configuration files etc.
  • Docker container: A container is a running instance of a Docker image. Note: you can have multiple containers started from the same Docker image.

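Installation on Linux

The distribution's package manager already provides a package named docker, but that is something different. We want the docker-ce package instead, which lives in a separate repository that the package manager does not know by default. The commands below are for Ubuntu (the repository line names the bionic release); they add the repository and install docker-ce: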

sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
sudo apt-get update
sudo apt-get install docker-ce
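
The repository line above hard-codes the Ubuntu release name bionic. On another Ubuntu release the codename differs. The following is a minimal sketch of building that line for an arbitrary codename; the function name docker_repo_line is made up for this illustration, and it assumes the lsb_release command is available to report the running release.

```shell
# Build the apt repository line for Docker CE for a given Ubuntu codename.
# docker_repo_line is a hypothetical helper name, not part of Docker or apt.
docker_repo_line() {
    local codename="$1"
    printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$codename"
}

# Usage (not executed here; assumes lsb_release is installed):
#   sudo add-apt-repository "$(docker_repo_line "$(lsb_release -cs)")"
```

Called with bionic, the function reproduces the repository line used in the commands above.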

Add users to the docker group:

sudo adduser username docker

Note: Users who are already logged in need to log out and log in again for the group change to take effect.
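
To check whether the group change has taken effect, something along the following lines may help. It is only a sketch: has_docker_group is a hypothetical helper name, and the hello-world test image mentioned in the usage comment is downloaded from Docker Hub on first use.

```shell
# Succeed if the given space-separated list of group names contains "docker".
# has_docker_group is a hypothetical helper name used only in this sketch.
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (not executed here): "id -nG" prints the group names of the current user.
#   if has_docker_group "$(id -nG)"; then
#       docker run --rm hello-world
#   else
#       echo "not in the docker group yet; log out and log in again" >&2
#   fi
```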

Create and run your own docker images

We recommend using convenience wrapper scripts. As an example, the following scripts are attached below:

  • common_docker_env.sh (a script that provides a common, unified docker environment)
  • debian_docker_env.sh (a script that runs a debian docker container)
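
To give an idea of what such a wrapper does, here is a minimal sketch. It is NOT the attached debian_docker_env.sh; only the variable names dimage and DUID are taken from this article, while the function name run_docker_env and the chosen docker run options are assumptions made for illustration.

```shell
# Hypothetical sketch in the spirit of debian_docker_env.sh (not the attached script).
dimage="${dimage:-debian}"     # image to start; default assumed for this sketch
DUID="${DUID:-$(id -u)}"       # user id inside the container (0 = root)

# Start an interactive shell in a throw-away container.
# --rm deletes the container when the shell exits, so uncommitted changes are lost.
run_docker_env() {
    docker run --rm -it --user "$DUID" "$dimage" /bin/bash
}

# Usage (not executed here):
#   run_docker_env
```

With a wrapper of this kind, running as root is a matter of setting DUID=0 before the call, which matches the way the script is invoked in the steps below.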

Below are the typical workflow steps:

  • Edit debian_docker_env.sh and set the variable dimage=debian
  • Execute debian_docker_env.sh (as user root)
DUID=0 bash debian_docker_env.sh
  • Install the sudo command
apt update
apt install sudo
  • From a different terminal find out the container id
docker ps
CONTAINER ID   IMAGE    COMMAND
7aa90c66a4ae   debian   "/bin/bash"
  • Next, we want to use our own image enriched with some useful packages, so we commit the 7aa90c66a4ae container to a new image
docker commit 7aa90c66a4ae callas/debian
  • Leave the 7aa90c66a4ae container (by typing exit inside the container's terminal window)
  • Edit debian_docker_env.sh to use our new callas/debian image by setting the variable dimage=callas/debian
  • Execute debian_docker_env.sh
  • Enrich the callas/debian image with useful packages
sudo apt update
sudo apt install vim-tiny 
etc.

Important to note here is that:

  • You have to explicitly commit these changes by running docker commit from a different terminal again.
docker commit <id> <image name>
  • You can reuse the existing image name (callas/debian), or you can choose a different name.
docker commit <id> callas/debian
docker commit <id> callas/debian-new

Reusing a name does *NOT* overwrite the existing image. Only the image name is moved to the new image id; the old image stays on disk without a name.
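
Looking up the container id by hand can be avoided. The following sketch finds a running container that was started from a given image and commits it; commit_running is a hypothetical helper name, while the -q and --filter ancestor=... options are standard docker ps options.

```shell
# Commit the first running container started from SRC_IMAGE to NEW_IMAGE.
# commit_running is a hypothetical helper name used only in this sketch.
commit_running() {
    local src_image="$1" new_image="$2" id
    # -q prints only container ids; the ancestor filter selects by image.
    id=$(docker ps -q --filter "ancestor=$src_image" | head -n 1)
    if [ -z "$id" ]; then
        echo "no running container for image $src_image" >&2
        return 1
    fi
    docker commit "$id" "$new_image"
}

# Usage (not executed here), from a second terminal while the container is running:
#   commit_running debian callas/debian
```

If several containers of the same image are running, the sketch simply takes the first one listed by docker ps, so check the id manually in that case.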

You can find the full Docker documentation at https://docs.docker.com/.

pdfToolbox installation

So far, we have installed docker and prepared a docker image with the prerequisites needed by pdfToolbox. Now we are going to actually install pdfToolbox.

Start the debian container:

bash debian_docker_env.sh

Download and unpack the latest pdfToolbox:

wget 'http://www.callassoftware.com/downloads/callas_pdfToolboxCLI_x64_Linux.tar.gz'
mkdir -p callas_pdfToolbox_CLI
tar -C callas_pdfToolbox_CLI --strip 1 -zxvpf callas_pdfToolboxCLI_x64_Linux.tar.gz

Note: Install additional packages as needed.

Now, as usual in the docker environment, we have to commit the container changes. Find out the container id and, from a different terminal, type:

docker commit <id> callas/pdftoolbox

Note: Docker image names must be lowercase, hence callas/pdftoolbox.

For convenience, we recommend that you copy the debian_docker_env.sh script to a different name (e.g. ptb_docker.sh) and change the dimage variable to point to the callas/pdftoolbox docker image.
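
The copy-and-edit step can also be scripted. This is a sketch under the assumption that the wrapper script sets the image on a line beginning with "dimage="; the function name make_wrapper is made up for illustration.

```shell
# Copy a wrapper script and point its dimage variable at another image.
# Assumes the source script contains a line that starts with "dimage=".
# make_wrapper is a hypothetical helper name used only in this sketch.
make_wrapper() {
    local src="$1" dst="$2" image="$3"
    # "|" is used as the sed delimiter because image names contain "/".
    sed "s|^dimage=.*|dimage=$image|" "$src" > "$dst"
    chmod +x "$dst"
}

# Usage (not executed here; Docker image names must be lowercase):
#   make_wrapper debian_docker_env.sh ptb_docker.sh callas/pdftoolbox
```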

In case you want us to provide you with pre-packaged docker images, simply send an email to support@callassoftware.com with your requirement details.
