SMRU SRDL QC workflow — smru

Wrapper function that executes the complete SMRU QC workflow, from data download to writing SSM-appended tag data files as CSV files. All settings are specified in a JSON config file, including the program (currently IMOS, ATN or OTN). The program field determines which ArgosQC workflow functions are called within the wrapper function.

smru_qc(wd, config)

Arguments

wd

the path to the working directory that contains: 1) the data directory where tag data files are stored (if harvest$download = FALSE) or downloaded to (if harvest$download = TRUE); 2) the metadata directory where all metadata files are stored; and 3) the destination directory for QC outputs.

config

a hierarchical JSON configuration file containing the following blocks, each with a set of block-specific parameters:

setup config block specifies paths to required data, metadata & output directories:
- program the national (or other) program to which the data belong. Current options are: imos, atn, or otn.
- data.dir the name of the data directory. Must reside within the wd.
- meta.file the metadata filename. Must reside within the wd. Can be NULL, in which case, the meta config block (see below) must be present & tag-specific metadata are scraped from the SMRU data server.
- maps.dir the directory path to write diagnostic maps of QC'd tracks.
- diag.dir the directory path to write diagnostic time-series plots of QC'd lon & lat.
- output.dir the directory path to write QC output CSV files. Must reside within the wd.
- return.R logical; should the function return a list of QC-generated objects to the R workspace? If so, the result is a single large object containing the following elements:
  - cid the SMRU campaign ID
  - dropIDs the SMRU Reference IDs dropped from the QC process
  - smru the SMRU tag data tables extracted from the downloaded .mdb file
  - meta the working metadata
  - locs_sf the projected location data to be passed as input to the SSM
  - fit1 the initial SSM output fit object
  - fit2 the final SSM output fit object including re-routed locations if specified.
  - smru_ssm the SSM-annotated SMRU tag data tables. This output object can be useful for troubleshooting undesirable results during supervised or delayed-mode QC workflows.
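As a rough illustration, a setup block might look like the fragment below. The parameter names come from the list above; the directory and file names are hypothetical placeholders, and the nesting (a top-level "setup" key holding the parameters) is an assumption based on the description of the config file as hierarchical JSON.

```json
{
  "setup": {
    "program": "imos",
    "data.dir": "tagdata",
    "meta.file": "metadata.csv",
    "maps.dir": "maps",
    "diag.dir": "diag",
    "output.dir": "output",
    "return.R": false
  }
}
```

Setting "meta.file" to null instead would require the meta config block to be present.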
harvest config block specifies data harvesting parameters:
- download a logical indicating whether tag data are to be downloaded from the SMRU data server or read from the local data.dir.
- cid SMRU campaign ID.
- smru.usr SMRU data server username as a string.
- smru.pwd SMRU data server password as a string.
- timeout extends the download timeout period a specified number of seconds for slower internet connections.
- dropIDs the SMRU ref IDs that are to be ignored during the QC process. Can be NULL.
- p2mdbtools (optional) provides the path to the mdbtools library if it is installed in a non-standard location (e.g., on Macs when installed via Homebrew).
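A harvest block could be sketched as follows. The parameter names are taken from the list above; the campaign ID, credentials and timeout value are hypothetical placeholders, and the top-level "harvest" key is an assumption about how blocks are nested. The optional p2mdbtools parameter is omitted here.

```json
{
  "harvest": {
    "download": true,
    "cid": "my_campaign_id",
    "smru.usr": "my_username",
    "smru.pwd": "my_password",
    "timeout": 120,
    "dropIDs": null
  }
}
```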
model config block specifies model- and data-specific parameters:
- model the aniMotum SSM model to be used for the location QC - typically either rw or crw.
- vmax the maximum travel rate (m/s) used during SSM fitting to identify implausible locations.
- time.step the prediction interval (in decimal hours) to be used by the SSM.
- proj the proj4string to be used for the location data & for the SSM-estimated locations. Can be NULL, which will result in one of 5 projections being used, depending on whether the centroid of the observed latitudes lies in N or S polar regions, temperate or equatorial regions, or if tracks straddle (or lie close to) -180,180 longitude.
- reroute a logical; whether QC'd tracks should be re-routed off of land (default is FALSE). Note, in some circumstances this can substantially increase processing time. Default land polygon data are sourced from the ropensci/rnaturalearthhires R package.
- dist the distance in km from outside the convex hull of observed locations from which to select land polygon data for re-routing. Ignored if reroute = FALSE.
- barrier the file path (must be within the working directory) for a shapefile to use for the land barrier. If NULL (default) then the default rnaturalearth coastline polygon data is used.
- buffer the distance in km to buffer rerouted locations from the coastline. Ignored if reroute = FALSE.
- centroids whether centroids are to be included in the visibility graph mesh used by the rerouting algorithm. See ?pathroutr::prt_visgraph for details. Ignored if reroute = FALSE.
- cut logical; should predicted locations be dropped if they lie within a large data gap? (default is FALSE).
- min.gap the minimum data gap duration (h) to be used for cutting predicted locations (default is 72 h).
- QCmode either nrt for Near Real-Time QC or dm for Delayed Mode QC.
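For orientation, a model block might be written as below. The values for reroute, barrier, cut and min.gap follow the defaults stated above; the values for vmax, time.step, dist, buffer and centroids are hypothetical and chosen only to show the expected types. The top-level "model" key is an assumption about how blocks are nested.

```json
{
  "model": {
    "model": "rw",
    "vmax": 3,
    "time.step": 6,
    "proj": null,
    "reroute": false,
    "dist": 1500,
    "barrier": null,
    "buffer": 0.5,
    "centroids": true,
    "cut": false,
    "min.gap": 72,
    "QCmode": "nrt"
  }
}
```

With reroute set to false, as here, dist, buffer and centroids are ignored, and a null proj leaves the projection to be chosen automatically from the observed locations.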
meta config block specifies species and deployment location information. This config block is only necessary when no metadata file is provided in the setup config block.
- common_name the species common name (e.g., "southern elephant seal")
- species the species scientific name (e.g., "Mirounga leonina")
- release_site the location where tags were deployed (e.g., "Iles Kerguelen")
- state_country the country/territory name (e.g., "French Overseas Territory")
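A meta block built from the example values above might look like this. The top-level "meta" key is an assumption about how blocks are nested in the config file.

```json
{
  "meta": {
    "common_name": "southern elephant seal",
    "species": "Mirounga leonina",
    "release_site": "Iles Kerguelen",
    "state_country": "French Overseas Territory"
  }
}
```

This block is only needed when meta.file in the setup block is NULL.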