Wrapper function that executes the complete QC workflow, from data download through to writing SSM-appended tag data files as CSV files.

wc_qc(wd, config)

Arguments

wd

the path to the working directory that contains: 1) the data directory where tag data files are stored (if source = local); 2) the metadata directory where all metadata files are stored; and 3) the destination directory for QC output.

config

a hierarchical JSON configuration file containing the following blocks, each with a set of block-specific parameters:

  • setup config block specifies paths to required data, metadata & output directories:

    • program the national (or other) program to which the data belong. Current options are atn or irap.

    • data.dir the name of the data directory. Must reside within the wd.

    • meta.file the metadata filename. Must reside within the wd. Can be NULL, in which case, the meta config block (see below) must be present & tag-specific metadata are scraped from the SMRU data server.

    • maps.dir the directory path to write diagnostic maps of QC'd tracks.

    • diag.dir the directory path to write diagnostic time-series plots of QC'd lon & lat.

    • output.dir the directory path to write QC output CSV files. Must reside within the wd.

    • return.R logical; should the function return a list of QC-generated objects to the R workspace? If TRUE, the result is a single large object containing the following elements:

      • cid the SMRU campaign ID

      • dropIDs the SMRU Reference IDs dropped from the QC process

      • smru the SMRU tag data tables extracted from the downloaded .mdb file

      • meta the working metadata

      • locs_sf the projected location data to be passed as input to the SSM

      • fit1 the initial SSM output fit object

      • fit2 the final SSM output fit object including re-routed locations if specified.

      • smru_ssm the SSM-annotated SMRU tag data tables. This output object can be useful for troubleshooting undesirable results during supervised or delayed-mode QC workflows.

  • harvest config block specifies data harvesting parameters:

    • download a logical indicating whether tag data are to be downloaded from the WC data portal (TRUE) or read from the local data.dir (FALSE).

    • collab.id (optional) the WC data owner ID associated with the data to be downloaded. Ignored (if provided) when harvest$download:FALSE.

    • wc.akey (optional) the WC access key for API access to the data portal. Ignored (if provided) when harvest$download:FALSE.

    • wc.skey (optional) the WC secret key for API access to the data portal. Ignored (if provided) when harvest$download:FALSE.

    • dropIDs the WC UUID(s) for specific tag data set(s) that is/are to be ignored during the QC process. Can be NULL.

  • model config block specifies model- and data-specific parameters:

    • model the aniMotum SSM model to be used for the location QC - typically either rw or crw.

    • vmax the maximum travel rate (m/s) used during SSM fitting to identify implausible locations.

    • time.step the prediction interval (in decimal hours) to be used by the SSM

    • proj the proj4string to be used for the location data & for the SSM-estimated locations. Can be NULL, which will result in one of 5 projections being used, depending on whether the centroid of the observed latitudes lies in N or S polar regions, temperate or equatorial regions, or if tracks straddle (or lie close to) -180,180 longitude.

    • reroute a logical; whether QC'd tracks should be re-routed off of land (default is FALSE). Note, in some circumstances this can substantially increase processing time. Default land polygon data are sourced from the ropensci/rnaturalearthhires R package.

    • dist the distance in km beyond the convex hull of observed locations within which land polygon data are selected for re-routing. Ignored if reroute = FALSE.

    • barrier the file path (must be within the working directory) for a shapefile to use as the land barrier. If NULL (default), the default rnaturalearth coastline polygon data are used.

    • buffer the distance in km to buffer rerouted locations from the coastline. Ignored if reroute = FALSE.

    • centroids whether centroids are to be included in the visibility graph mesh used by the rerouting algorithm. See ?pathroutr::prt_visgraph for details. Ignored if reroute = FALSE.

    • cut logical; should predicted locations be dropped if they lie within a large data gap? (default is FALSE).

    • min.gap the minimum data gap duration (h) to be used for cutting predicted locations (default is 72 h)

    • QCmode one of either nrt for Near Real-Time QC or dm for Delayed Mode QC.

    • pred.int the prediction interval (h) to use for sub-sampling predicted locations prior to interpolation of QC'd locations to tag data file event times.
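
To make the block structure above concrete, the following is a minimal sketch of a working directory and configuration file for a local (no download) run. It is not taken from the package: the sub-directory names, the metadata file name, the config file name, and every numeric value are placeholders chosen for illustration, and the assumptions that paths are given relative to wd and that config is passed to wc_qc() as a file path are exactly that, assumptions. Only the block names and parameter names come from the descriptions above.

```r
# Illustrative sketch only: directory names, file names and parameter values
# are assumptions, not package defaults.
wd <- file.path(tempdir(), "wc_qc_demo")

# Hypothetical sub-directories inside wd (tag data, metadata, maps,
# diagnostic plots, QC output)
sub_dirs <- c("tagdata", "metadata", "maps", "diag", "qc_output")
for (d in sub_dirs) {
  dir.create(file.path(wd, d), recursive = TRUE, showWarnings = FALSE)
}

# JSON text holding the three blocks described above: setup, harvest, model.
# The optional collab.id, wc.akey and wc.skey entries are left out because
# download is false in this sketch.
config_json <- '{
  "setup": {
    "program": "atn",
    "data.dir": "tagdata",
    "meta.file": "metadata.csv",
    "maps.dir": "maps",
    "diag.dir": "diag",
    "output.dir": "qc_output",
    "return.R": false
  },
  "harvest": {
    "download": false,
    "dropIDs": null
  },
  "model": {
    "model": "crw",
    "vmax": 3,
    "time.step": 6,
    "proj": null,
    "reroute": false,
    "dist": 1500,
    "barrier": null,
    "buffer": 0.5,
    "centroids": false,
    "cut": false,
    "min.gap": 72,
    "QCmode": "nrt",
    "pred.int": 6
  }
}'

# Write the configuration to a (hypothetically named) file inside wd
config_file <- file.path(wd, "config.json")
writeLines(config_json, config_file)

# Assumed call form: config supplied as the path to the JSON file.
# Left commented out because it needs the package and real tag data.
# res <- wc_qc(wd = wd, config = config_file)
```

With reroute set to false, the dist, buffer and centroids entries are ignored according to the descriptions above; they are shown only so that every documented model parameter appears once.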