plom-scan

Plom tools for scanning tests and pushing to servers.

usage: plom-scan [-h] [--version] {process,upload,status,clear} ...

Positional Arguments

command

Possible choices: process, upload, status, clear

options

--version

show program’s version number and exit

Sub-commands:

process

Process one scanned PDF into page images, read QR codes and check info with the server (e.g., that versions match).

plom-scan process [-h] [--gamma-shift | --no-gamma-shift]
                  [--extract-bitmaps | --no-extract-bitmaps | --demo] [-s SERVER[:PORT]] [-w PASSWORD]
                  scanPDF

Positional Arguments

scanPDF

The PDF file of scanned pages.

options

--gamma-shift

Apply white balancing to the scan, if the image format is lossless (PNG). By default, this gamma shift is NOT applied; this is because it may worsen some poor-quality scans with large shadow regions. Has not been extensively tested recently: NOT recommended.

Default: False

--no-gamma-shift

Do not apply white balancing.

Default: True

--extract-bitmaps

We recommend this option if you scanned the papers yourself. If a PDF page seems to contain exactly one bitmap image and nothing else, then extract that losslessly instead of rendering the page as a new PNG file. This will be MUCH FASTER and will typically give nicer images for the common case where pages are simply JPEG/PNG images embedded in a PDF file. But some care must be taken that the image is not annotated in any way and that no other markings appear on the page. If the papers were produced by other people, this option is NOT RECOMMENDED, in case it misses markings made on top of a bitmap base (e.g., from annotation software). For this reason, it is not yet the default.

Default: False

--no-extract-bitmaps

Don’t try to extract bitmaps; just render each page. This is safer but not always ideal for image quality.

Default: True

--demo

Simulate scanning with random rotations, adding noise etc. Obviously not intended for production use.

Default: False

-s, --server

Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.

-w, --password

Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.
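As a minimal sketch of the environment-variable fallback described for -s and -w, both values can be set once per shell session so that later commands need neither flag. The host name and password below are placeholders, not real values.

```shell
# Placeholder values: substitute your own server and "scanner" password.
# The port is omitted here, so the documented default (41984) applies.
export PLOM_SERVER="plom.example.com"
export PLOM_SCAN_PASSWORD="replace-me"

# Later plom-scan commands in this shell can then omit -s and -w, e.g.:
#   plom-scan process scans1.pdf
```

Keeping the password in an environment variable avoids typing it on every command line, but it is still visible to anything that can read your shell environment.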

upload

Upload page images to the server.

plom-scan upload [-h] [-u] [-c] [-y] [-s SERVER[:PORT]] [-w PASSWORD] bundleName

Positional Arguments

bundleName

Usually the name of the PDF file.

options

-u, --unknowns

Upload “unknowns”, pages from which the QR-codes could not be read.

Default: False

-c, --collisions

Upload “collisions”, pages which appear to already be on the server. You should not need this option except under exceptional circumstances.

Default: False

-y, --yes

Assume yes to any prompts (skipping the confirmation prompts for --collisions).

Default: False

-s, --server

Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.

-w, --password

Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.

status

Get scanning status report from server. You can customize the report using the switches below or omit all switches to get the full report.

plom-scan status [-h] [--papers] [--unknowns] [--bundles] [-s SERVER[:PORT]] [-w PASSWORD]

options

--papers

Show paper info

Default: False

--unknowns

Show info about unknowns

Default: False

--bundles

Show bundle info

Default: False

-s, --server

Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.

-w, --password

Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.

clear

Clear “scanner” login after a crash or other unexpected event.

plom-scan clear [-h] [-s SERVER[:PORT]] [-w PASSWORD]

options

-s, --server

Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.

-w, --password

Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.

## Overview of the scanning process

  1. Decide on a working directory for your scans, copy your PDFs into that directory and then cd into it.

  2. Use the process command to split your first PDF into bitmaps of each page. This will also read any QR codes from the pages and match these against expectations from the server.

  3. Use the upload command to send pages to the server. There are additional flags for dealing with special cases:

    1. Pages that could not be identified are called “Unknowns”. They can include “Extra Pages” without QR codes, poor-quality scans where the QR reader failed, folded papers, etc. A small number is normal but large numbers are cause for concern and sanity checking. A human will (eventually) have to identify these manually.

    2. If the system detects you trying to upload a test page corresponding to one already in the system (but not identical) then those pages are filed as “Collisions”. If you have good paper-handling protocols then this should not happen, except in exceptional circumstances (such as rescanning an illegible page). Force the upload of these if you really need to; the manager will then have to look at them.

  4. Run “plom-scan status” to get a summary of scanning to date.

  5. If something goes wrong such as crashes or interruptions, you may need to clear the “scanner” login with the clear command.

These steps may be repeated as new PDF files come in: it is not necessary to wait until scanning is complete to start processing and uploading.
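
The steps above could be wrapped in a small shell helper along the following lines. This is only a sketch under stated assumptions: the function names `run` and `scan_bundle` and the `PLOM_DRY_RUN` variable are hypothetical conveniences, not part of Plom; the server and password are assumed to come from PLOM_SERVER and PLOM_SCAN_PASSWORD; and the bundle name is passed explicitly because its exact form may depend on your setup.

```shell
#!/bin/sh
# Sketch of one pass through the scanning workflow for a single PDF.
# Assumes PLOM_SERVER and PLOM_SCAN_PASSWORD are already set in the environment.

# run CMD...: execute a command, or just print it when PLOM_DRY_RUN=1
# (PLOM_DRY_RUN is a hypothetical switch for this sketch, not a Plom feature).
run() {
    if [ "${PLOM_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# scan_bundle PDF BUNDLE: process one PDF, upload its pages (including
# unknowns), then print the status report. Stops at the first failing step.
scan_bundle() {
    pdf=$1
    bundle=$2
    run plom-scan process "$pdf" &&
    run plom-scan upload "$bundle" &&
    run plom-scan upload --unknowns "$bundle" &&
    run plom-scan status
}
```

Run from the working directory that holds the PDFs, a call such as `scan_bundle scans1.pdf scans1` would handle one file (both names are placeholders); setting `PLOM_DRY_RUN=1` first prints the commands without contacting the server. Collisions are deliberately left out of the helper, since `--collisions` is meant only for exceptional circumstances, and `plom-scan clear` remains a manual step after a crash.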