plom-scan
Plom tools for scanning tests and pushing to servers.
usage: plom-scan [-h] [--version] {process,upload,status,clear} ...
Positional Arguments
- command
Possible choices: process, upload, status, clear
options
- --version
show program’s version number and exit
Sub-commands:
process
Process one scanned PDF into page images, read QR codes and check info with server (e.g., versions match).
plom-scan process [-h] [--gamma-shift | --no-gamma-shift]
[--extract-bitmaps | --no-extract-bitmaps | --demo] [-s SERVER[:PORT]] [-w PASSWORD]
scanPDF
Positional Arguments
- scanPDF
The PDF file of scanned pages.
options
- --gamma-shift
Apply white balancing to the scan, if the image format is lossless (PNG). By default, this gamma shift is NOT applied; this is because it may worsen some poor-quality scans with large shadow regions. Has not been extensively tested recently: NOT recommended.
Default: False
- --no-gamma-shift
Do not apply white balancing.
Default: True
- --extract-bitmaps
We recommend this option if you scanned the papers yourself. If a PDF page seems to contain exactly one bitmap image and nothing else, then extract that losslessly instead of rendering the page as a new PNG file. This will be MUCH FASTER and will typically give nicer images for the common case where pages are simply JPEG/PNG images embedded in a PDF file. But some care must be taken that the image is not annotated in any way and that no other markings appear on the page. If the papers were produced by other people, this option is NOT RECOMMENDED, in case it misses markings made on top of a bitmap base (e.g., from annotation software). For this reason, it is not yet the default.
Default: False
- --no-extract-bitmaps
Don’t try to extract bitmaps; just render each page. This is safer but not always ideal for image quality.
Default: True
- --demo
Simulate scanning with random rotations, added noise, etc. Obviously not intended for production use.
Default: False
- -s, --server
Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.
- -w, --password
Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.
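As an illustration of the options above, here is a minimal sketch of two shell wrappers around `plom-scan process`. The server address and password are placeholders, and `PLOM_SCAN`, `process_scan` and `process_own_scan` are hypothetical names introduced only for this example (`PLOM_SCAN` lets you substitute `echo` for a dry run); none of them is part of Plom.

```shell
# Placeholder values: replace with your own server and password.
export PLOM_SERVER="localhost:41984"
export PLOM_SCAN_PASSWORD="replace-me"

# Hypothetical override, not part of plom-scan: set PLOM_SCAN=echo
# to print the commands instead of running them.
PLOM_SCAN="${PLOM_SCAN:-plom-scan}"

# Process one scanned PDF with the defaults (each page is rendered).
process_scan() {
    "$PLOM_SCAN" process "$1"
}

# Process a PDF you scanned yourself, extracting embedded bitmaps.
# Only suitable when no annotations sit on top of the page images.
process_own_scan() {
    "$PLOM_SCAN" process --extract-bitmaps "$1"
}
```

For example, `process_scan batch1.pdf` (a hypothetical file name) would run `plom-scan process batch1.pdf`, with the server and password taken from the environment.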
upload
Upload page images to the server.
plom-scan upload [-h] [-u] [-c] [-y] [-s SERVER[:PORT]] [-w PASSWORD] bundleName
Positional Arguments
- bundleName
Usually the name of the PDF file.
options
- -u, --unknowns
Upload “unknowns”, pages from which the QR-codes could not be read.
Default: False
- -c, --collisions
Upload “collisions”, pages which appear to already be on the server. You should not need this option except under exceptional circumstances.
Default: False
- -y, --yes
Assume yes to any prompts (skipping the confirmation prompts for --collisions).
Default: False
- -s, --server
Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.
- -w, --password
Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.
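The three upload modes described above can be sketched as separate shell helpers. The function names and the `PLOM_SCAN` override are hypothetical and exist only for this example. The bundle name is passed through unchanged, so supply whatever name your bundle was processed under.

```shell
# Hypothetical override, not part of plom-scan: set PLOM_SCAN=echo
# to print the commands instead of running them.
PLOM_SCAN="${PLOM_SCAN:-plom-scan}"

# Upload the identified pages of a processed bundle.
upload_bundle() {
    "$PLOM_SCAN" upload "$1"
}

# Upload pages whose QR codes could not be read ("unknowns").
upload_unknowns() {
    "$PLOM_SCAN" upload --unknowns "$1"
}

# Upload pages that appear to be on the server already ("collisions").
# Rarely needed; --yes is deliberately left off so that the
# confirmation prompt still appears.
upload_collisions() {
    "$PLOM_SCAN" upload --collisions "$1"
}
```

Keeping the modes in separate functions makes the exceptional cases explicit: a collision upload has to be requested by name rather than happening as a side effect of a normal upload.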
status
Get scanning status report from server. You can customize the report using the switches below or omit all switches to get the full report.
plom-scan status [-h] [--papers] [--unknowns] [--bundles] [-s SERVER[:PORT]] [-w PASSWORD]
options
- --papers
Show paper info
Default: False
- --unknowns
Show info about unknowns
Default: False
- --bundles
Show bundle info
Default: False
- -s, --server
Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.
- -w, --password
Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.
clear
Clear the “scanner” login after a crash or other unexpected event.
plom-scan clear [-h] [-s SERVER[:PORT]] [-w PASSWORD]
options
- -s, --server
Which server to contact; the port defaults to 41984. Also checks the environment variable PLOM_SERVER if omitted.
- -w, --password
Password for the “scanner” user. Also checks the environment variable PLOM_SCAN_PASSWORD.
## Overview of the scanning process
Decide on a working directory for your scans, copy your PDFs into that directory and then cd into it.
Use the process command to split your first PDF into bitmaps of each page. This will also read any QR codes from the pages and match these against expectations from the server.
Use the upload command to send pages to the server. There are additional flags for dealing with special cases:
- Pages that could not be identified are called “Unknowns”. They can include “Extra Pages” without QR codes, poor-quality scans where the QR reader failed, folded papers, etc. A small number is normal, but large numbers are cause for concern and warrant a sanity check. A human will (eventually) have to identify these manually.
- If the system detects that you are trying to upload a test page corresponding to one already in the system (but not identical), those pages are filed as “Collisions”. With good paper-handling protocols this should not happen, except in exceptional circumstances (such as rescanning an illegible page). Force the upload of these if you really need to; the manager will then have to look at them.
Run “plom-scan status” to get a summary of scanning to date.
If something goes wrong such as crashes or interruptions, you may need to clear the “scanner” login with the clear command.
These steps may be repeated as new PDF files come in: it is not necessary to wait until scanning is complete to start processing and uploading.
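The steps above can be sketched as a small shell function that processes and uploads every PDF in the working directory and then prints the status report. This is an illustration, not a recommended production script. `scan_all` and `PLOM_SCAN` are hypothetical names. It assumes the bundle name is the PDF file name exactly as given; strip the extension if your setup expects that. It also assumes the server and password are already set through `PLOM_SERVER` and `PLOM_SCAN_PASSWORD`.

```shell
# Hypothetical override, not part of plom-scan: set PLOM_SCAN=echo
# to print the commands instead of running them.
PLOM_SCAN="${PLOM_SCAN:-plom-scan}"

# Process and upload each PDF in the current directory, then report.
# Stops at the first command that fails.
scan_all() {
    for pdf in *.pdf; do
        [ -e "$pdf" ] || continue   # the glob matched nothing
        bundle="$pdf"               # assumption: bundle name = PDF file name
        "$PLOM_SCAN" process "$pdf" || return 1
        "$PLOM_SCAN" upload "$bundle" || return 1
        # Also send pages whose QR codes could not be read.
        "$PLOM_SCAN" upload --unknowns "$bundle" || return 1
    done
    "$PLOM_SCAN" status
}
```

Collisions are deliberately left out of the loop, since forcing those uploads should be a conscious decision. If the function stops partway because of a crash or interruption, running `plom-scan clear` before retrying may be necessary.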