Stamp Date and Time from Image OCR in Parallel — read_img_stamp_date_in_parallel

This function scans a list of images for a printed date and time using OCR (Tesseract), then writes that information into each image's EXIF metadata. It uses parallel processing to speed up the operation.

Usage

read_img_stamp_date_in_parallel(
  files,
  tzone = "America/Bogota",
  n_workers = max(1L, parallel::detectCores() - 1L),
  verbose = TRUE
)

Arguments

files: character vector. Full paths to the image files to process.
tzone: character. Time zone for the parsed dates (default "America/Bogota").
n_workers: integer. Number of parallel workers (default: one fewer than parallel::detectCores(), with a minimum of 1).
verbose: logical. If TRUE, prints progress and summary.
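
The default for n_workers can be reproduced and adjusted by hand before calling the function. The following is a minimal sketch, not part of the package: the file paths are hypothetical, and the NA guard and the cap at the number of files are assumptions added for illustration (parallel::detectCores() is documented to return NA when the core count is unknown).

```r
# Hypothetical input: full paths to three image files
files <- c("img/IMG_0001.jpg", "img/IMG_0002.jpg", "img/IMG_0003.jpg")

# detectCores() may return NA on some platforms, so guard against it
n_cores <- parallel::detectCores()
n_workers <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

# Assumption for this sketch: no point in starting more workers than files
n_workers <- min(n_workers, length(files))
```

The resulting value could then be passed explicitly as n_workers = n_workers.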

Value

A character vector of processed file paths.

Details

The function assumes that the date and time are printed as text on the image and can be recognized by Tesseract. It currently takes the last two text elements that Tesseract identifies and parses them with lubridate::mdy_hms, so the stamp must be in month/day/year hour:minute:second order. As a result, it will not work well when the date and time are followed by extra information such as moon phase or temperature. A fix is planned for camera models that print other text after the date and time.
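
The parsing step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual code: the vectors ocr_words and bad_words are made-up stand-ins for OCR output, and joining the two elements with a space is assumed.

```r
library(lubridate)

# Hypothetical OCR output: text elements in reading order
ocr_words <- c("CAM01", "03/15/2023", "14:32:07")

# Keep the last two elements and join them as "date time"
stamp_text <- paste(tail(ocr_words, 2), collapse = " ")

# Parse as month/day/year hour:minute:second in the requested time zone
stamp <- mdy_hms(stamp_text, tz = "America/Bogota")

# With trailing extras (here a temperature), the last two elements
# are no longer the date and the time
bad_words <- c("03/15/2023", "14:32:07", "24C")
bad_text <- paste(tail(bad_words, 2), collapse = " ")
```

In the second case the text handed to the parser is the time followed by the temperature, which is why stamps with trailing information are not handled well.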

Examples

if (FALSE) { # \dontrun{
files <- list.files("path/to/images", pattern = "\\.jpg$", full.names = TRUE)
read_img_stamp_date_in_parallel(files)
} # }