QC Tests & Thresholds
This page describes the QC tests applied to the Coastal Monitoring Program Water Quality data, and the general methods for selecting the most appropriate thresholds for each test.
Thresholds
Where possible, the thresholds for each QC test and variable were determined from historical data, which provide a baseline of “normal” and “outlying” conditions. The historical data used here was the Coastal Monitoring Program Water Quality data sets submitted to the Nova Scotia Open Data Portal in December 2022. Preliminary quality control measures (e.g., obvious outliers and suspected biofouling removed) were applied to these datasets before submission. Additional QC was applied where required throughout the thresholds analysis. For example, freshwater and other outlier stations were excluded to provide a better representation of “normal” coastal ocean conditions.
The historical data was reviewed carefully prior to calculating thresholds. Depending on the number of observations and the spatial and temporal resolution of observations, data was pooled together or separated into different groups (e.g., county, sensor type).
The distribution of observations was then reviewed to determine which statistics to use to quantify outlying conditions. The mean plus/minus 3 standard deviations was used for relatively normally distributed variables (OOI 2022), while upper quartiles were used for skewed distributions.
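The two approaches can be sketched as follows. This is an illustrative Python example, not the CMAR R code; the simulated data and the quartile rule shown (Tukey's Q3 + 1.5 × IQR, used here as a stand-in for the upper-quartile approach described above) are assumptions.

```python
import numpy as np

def outlier_thresholds(obs, skewed=False):
    """Return (lower, upper) outlier thresholds from historical observations."""
    obs = np.asarray(obs, dtype=float)
    if not skewed:
        # Relatively normal distribution: mean +/- 3 standard deviations
        mean, sd = obs.mean(), obs.std(ddof=1)
        return mean - 3 * sd, mean + 3 * sd
    # Skewed distribution: quartile-based limits (Tukey's rule, shown here
    # as one common stand-in for an upper-quartile approach)
    q1, q3 = np.percentile(obs, [25, 75])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Simulated "historical" temperature observations (roughly normal)
rng = np.random.default_rng(42)
temps = rng.normal(loc=10, scale=2, size=5000)
lo, hi = outlier_thresholds(temps)          # lo ~ 4, hi ~ 16 degrees C

# Simulated skewed variable: quartile-based limits instead
chl = rng.lognormal(mean=0, sigma=1, size=5000)
chl_lo, chl_hi = outlier_thresholds(chl, skewed=True)
```

With historical data pooled or grouped as described above, the same calculation would simply be repeated per group.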
These thresholds may be re-evaluated in several years, when more data is available.
QC Tests
Three QARTOD tests, two CMAR-developed tests, and a human in the loop test were applied to the CMAR Water Quality data.
Automated QC tests were applied to each sensor string deployment using the CMAR-developed R package qaqcmar, which is available to view and install from GitHub. The human in the loop test was applied during data review using qaqcmar and the qc_tests_water_quality R repository, which is also available on GitHub.
Gross Range Test
Following QARTOD, the Gross Range Test aims to identify observations that fall outside of the sensor measurement range (flagged Fail) and observations that are statistical outliers (flagged Suspect/Of Interest).
Thresholds for failed observations are based on the sensor measurement range, while thresholds for suspect/of interest observations are statistical limits estimated from historical data.
Following the OOI Biogeochemical Sensor Data: Best Practices & User Guide, the suspect/of interest thresholds were calculated from historical data as the mean +/- three standard deviations (Equation 1, Equation 2):

suspect_min = mean - 3 * sd   (Equation 1)
suspect_max = mean + 3 * sd   (Equation 2)

where mean and sd are the average and standard deviation of the historical observations for the variable (the names suspect_min and suspect_max are written generically here; the package documentation defines the exact threshold names).
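As an illustration, the two sets of thresholds can be applied as below. This Python sketch uses QARTOD's numeric flag scheme (1 = Pass, 3 = Suspect/Of Interest, 4 = Fail) and placeholder threshold names; it is not the qaqcmar implementation.

```python
import numpy as np

def gross_range_flags(obs, sensor_min, sensor_max, user_min, user_max):
    """Illustrative Gross Range flags: 1 = Pass, 3 = Suspect/Of Interest,
    4 = Fail (QARTOD numeric flag convention)."""
    obs = np.asarray(obs, dtype=float)
    flags = np.ones(obs.shape, dtype=int)               # Pass by default
    flags[(obs < user_min) | (obs > user_max)] = 3      # statistical outlier
    flags[(obs < sensor_min) | (obs > sensor_max)] = 4  # outside sensor range
    return flags

# Suspect thresholds from historical data: mean +/- 3 sd (Equations 1 and 2)
hist = np.array([9.8, 10.1, 10.4, 9.9, 10.0, 10.2, 9.7, 10.3])
user_min = hist.mean() - 3 * hist.std(ddof=1)
user_max = hist.mean() + 3 * hist.std(ddof=1)

# A plausible value, a statistical outlier, and an impossible reading
flags = gross_range_flags([10.0, 11.5, -45.0], -5, 35, user_min, user_max)
print(flags)  # [1 3 4]
```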
Climatological Test
The Climatological Test is a variation of the Gross Range Test that accounts for seasonal variability. Under QARTOD, there is no Fail flag associated with this test for temperature, salinity, or dissolved oxygen due to the dynamic nature of these variables (IOOS 2020, 2018). Following this guidance, CMAR chose to assign the flag Suspect/Of Interest to seasonal outliers for all variables.
The Climatological thresholds are statistical limits calculated separately for each season (or month) from the historical data, so that the flag limits follow the expected seasonal cycle of each variable. Note that OOI used a more complex method (harmonic analysis, as described here) to estimate seasonal expected values.
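One plausible way to implement season-aware limits is to compute the same mean +/- 3 standard deviation statistics within each calendar month. The monthly grouping, function name, and data below are assumptions for illustration, not the qaqcmar source.

```python
import numpy as np
from collections import defaultdict

def monthly_thresholds(months, values):
    """Compute mean +/- 3 sd thresholds per calendar month (1-12).

    Assumes monthly grouping; other seasonal groupings work the same way.
    """
    grouped = defaultdict(list)
    for m, v in zip(months, values):
        grouped[m].append(v)
    out = {}
    for m, vals in grouped.items():
        vals = np.asarray(vals, dtype=float)
        mean, sd = vals.mean(), vals.std(ddof=1)
        out[m] = (mean - 3 * sd, mean + 3 * sd)
    return out

# Example: cold January temperatures vs warm July temperatures
months = [1] * 4 + [7] * 4
temps = [2.0, 2.5, 1.8, 2.2, 16.0, 17.1, 15.5, 16.4]
th = monthly_thresholds(months, temps)
# th[1] brackets ~2 degrees C; th[7] brackets ~16 degrees C, so a 16 degree
# reading passes in July but would be a seasonal outlier in January
```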
Spike Test
The QARTOD Spike Test identifies single observations that are unexpectedly high (or low) based on the previous and following observations.
For each observation, a spike reference is calculated as the average of the previous and following observations; the spike value is the absolute difference between the observation and this reference.
Due to the dependence on the neighbouring observations, the test cannot be evaluated for the first and last observations of each deployment.
As a simple example, consider several observations that increase linearly over time (Example 1). Here, the spike value is zero for every observation, because each observation equals the average of its two neighbours.
Now consider that the value of one of these observations lies above or below the linear pattern (Example 2). This value will have a relatively high spike value and may be flagged.
CMAR uses two Spike Test thresholds: a lower threshold, above which observations are flagged Suspect/Of Interest, and a higher threshold, above which observations are flagged Fail. Values for these thresholds were determined for each variable from the historical data.
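The spike calculation can be sketched as follows. This is an illustrative Python version using QARTOD-style numeric flags (1 = Pass, 2 = Not Evaluated, 3 = Suspect/Of Interest, 4 = Fail); the threshold values are invented.

```python
import numpy as np

def spike_flags(obs, spike_suspect, spike_fail):
    """Illustrative QARTOD-style Spike Test.

    The spike reference for each interior observation is the average of
    its previous and following observations; the spike value is the
    absolute difference between the observation and that reference.
    """
    obs = np.asarray(obs, dtype=float)
    flags = np.full(obs.shape, 2, dtype=int)  # endpoints stay Not Evaluated
    ref = (obs[:-2] + obs[2:]) / 2            # average of the two neighbours
    spike = np.abs(obs[1:-1] - ref)
    inner = np.ones(spike.shape, dtype=int)   # Pass by default
    inner[spike > spike_suspect] = 3
    inner[spike > spike_fail] = 4
    flags[1:-1] = inner
    return flags

# Linear series (Example 1): all interior spike values are zero
print(spike_flags([1, 2, 3, 4, 5], 2.5, 5.0))  # [2 1 1 1 2]
# One value off the line (Example 2): it is flagged
print(spike_flags([1, 2, 9, 4, 5], 2.5, 5.0))  # [2 3 4 3 2]
```

Note in the second example that the neighbours of the spike also acquire nonzero spike values (and here are flagged Suspect/Of Interest), because the spike contaminates their reference average.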
Rolling Standard Deviation
The Rolling Standard Deviation test was developed by CMAR to identify suspected biofouling in the dissolved oxygen data. The test assumes that there is a 24-hour oxygen cycle, with net oxygen production during the day, and net oxygen consumption during the night. Biofouling is suspected when the amplitude of this cycle, as measured by the standard deviation, increases above a threshold (Figure 1).
The rolling standard deviation is calculated for each observation from a 24-hour window centered on that observation.
Although this test was designed to identify suspected biofouling, it was also applied to the other Water Quality variables as a general test of the standard deviation. In particular, it is expected to flag rapid changes in temperature due to fall storms and upwelling.
The Rolling Standard Deviation Test uses a single threshold: observations whose rolling standard deviation exceeds this value are flagged Suspect/Of Interest. Values for this threshold were determined for each variable from the historical data.
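A minimal sketch of how such a test might behave, using pandas; the simulated data, the amplitude jump, and the threshold value are invented for illustration and this is not the qaqcmar implementation.

```python
import numpy as np
import pandas as pd

# Simulated hourly dissolved oxygen: a clean daily cycle for five days,
# then a "biofouled" stretch where the daily amplitude grows
t = pd.date_range("2023-06-01", periods=24 * 10, freq="h")
hours = np.arange(len(t))
amplitude = np.where(hours < 24 * 5, 1.0, 4.0)  # amplitude jumps on day 5
do = 8 + amplitude * np.sin(2 * np.pi * hours / 24)

# 24-hour (24 hourly observations) rolling standard deviation, centered
# on each observation (possible because the data is post-processed)
roll_sd = pd.Series(do, index=t).rolling(window=24, center=True).std()

rolling_sd_threshold = 2.0  # hypothetical threshold value
suspect = roll_sd > rolling_sd_threshold
print(bool(suspect.iloc[48]), bool(suspect.iloc[168]))  # False True
```

On day 2 the daily cycle's standard deviation sits well below the threshold; by day 7 the inflated amplitude pushes it above, and those observations would be flagged.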
Depth Crosscheck
The Depth Crosscheck Test was developed by CMAR to flag deployments where the measured sensor depth does not align with the estimated sensor depth in the sensor_depth_at_low_tide_m column.
For this test, the difference between the minimum value of measured depth and the estimated depth is calculated. If the absolute difference exceeds the threshold, the deployment is flagged Suspect/Of Interest.
Note that the Depth Crosscheck Test is a deployment-level test; all observations from a deployment will have the same depth crosscheck flag value. If there is more than one sensor on the string that measures depth, the worst (highest) flag will be assigned to the deployment. This is because a Suspect/Of Interest flag for the Depth Crosscheck test is an indication that the sensor string was moored in an area deeper (or shallower) than expected. For example, if the string was moored in an area 10 m deeper than anticipated, all sensors will likely be 10 m deeper than recorded in the sensor_depth_at_low_tide_m column.
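The deployment-level logic can be sketched as follows; the sensor names, data, and threshold value are hypothetical, and the numeric flags follow the QARTOD convention (1 = Pass, 3 = Suspect/Of Interest).

```python
def depth_crosscheck_flag(measured_depths, estimated_depth_m, threshold_m):
    """Illustrative per-sensor Depth Crosscheck.

    Compares the minimum measured depth to the estimated
    sensor_depth_at_low_tide_m value; returns 1 (Pass) or
    3 (Suspect/Of Interest).
    """
    diff = abs(min(measured_depths) - estimated_depth_m)
    return 3 if diff > threshold_m else 1

# One flag per depth-measuring sensor; the deployment is assigned the
# worst (highest) flag
measured = {"sensor_a": [5.2, 5.8, 6.9], "sensor_b": [17.5, 18.0, 18.6]}
estimated = {"sensor_a": 5.0, "sensor_b": 15.0}
flags = [depth_crosscheck_flag(obs, estimated[s], threshold_m=1.0)
         for s, obs in measured.items()]
deployment_flag = max(flags)
print(deployment_flag)  # 3
```

Here sensor_b sits about 2.5 m deeper than recorded, so the whole deployment is flagged Suspect/Of Interest even though sensor_a passes.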
Human in the Loop
Human experts reviewed the results of the automated QC tests to identify poor quality observations that were not adequately flagged. Results of the automated tests were not changed, but an additional human in the loop flag of Suspect/Of Interest or Fail was added to identify these observations.
Situations where observations were upgraded to Suspect/Of Interest or Fail by human experts include:
- Spikes with multiple observations (e.g., Spike Test Example 3 above).
- Known issues with the deployment or sensor, for example:
  - sensor malfunctioned for most of the deployment
  - string sank due to biofouling
  - evidence that sensor was exposed to air at low tide
References
Footnotes
Note that a centered window is possible because the data is being post-processed after collection. Real-time data would likely use a left-aligned window, i.e., a window made up of the current observation and the observations immediately preceding it.↩︎