QC Overview

CMAR applies automated Quality Control (QC) tests and “human in the loop” QC to the Coastal Monitoring Program Water Quality data.

An automated QC test is an algorithm that evaluates each data observation and assigns a flag to the observation indicating the test results. These flags are typically reviewed by human experts, which is referred to as “human in the loop” QC. End users can then filter the data set for records that meet their quality criteria (UNESCO 2013).

Numerous QC flagging schemes and tests exist and have been applied to oceanographic data sets (e.g., Appendix A in UNESCO 2013). CMAR has adopted the well-known QARTOD flag scheme and several QARTOD tests.

Note that it is beyond the scope of the Program to produce analysis-ready data products for all potential users, and some users may wish to apply additional QC.

QARTOD Guidance

QARTOD stands for “Quality Assurance / Quality Control of Real-Time Oceanographic Data”. It is a project led by the U.S. Integrated Ocean Observing System (IOOS) that aims to develop guidelines for producing high-quality oceanographic data. QARTOD has developed QC manuals for core ocean variables that outline best practices and describe QC tests with codable instructions. QC manuals are formally reviewed by subject-matter experts and are updated as needed to reflect new technology, additional knowledge, or growth of the project (IOOS 2020a).

QARTOD manuals focus on QC of real-time data [1], but acknowledge that other data types may also benefit from these flags and tests (IOOS 2020b). The CMAR Water Quality data is not processed in real time [2]. Instead, data is logged and offloaded from the sensors for processing every 6 to 12 months. Some QARTOD guidance was therefore not applicable to this data, and procedures were adapted to reflect the nature of CMAR data and processing.

Flag Scheme

CMAR has adopted the QARTOD flag scheme (Table 1) with some adaptations. The QARTOD flag scheme provides information to data users on the expected quality of the data. This means that even “bad” data observations may be published, and it is up to the user to determine which records to include in their application. Details on how the QARTOD flag scheme was developed are provided in UNESCO (2013).

Table 1: QARTOD flag scheme. Modified from IOOS (2020b).
Flag Label | Flag Value | Description
Pass | 1 | Data have passed critical real-time quality control tests and are deemed adequate for use as preliminary data.
Not Evaluated | 2 | Data have not been QC-tested, or the information on quality is not available.
Suspect/Of Interest | 3 | Data are considered to be either suspect or of high interest to data providers and users. They are flagged suspect to draw further attention to them by operators.
Fail | 4 | Data are considered to have failed one or more critical real-time QC checks. If they are disseminated at all, it should be readily apparent that they are not of acceptable quality.
Missing Data | 9 | Data are missing; used as a placeholder.

The main CMAR adaptation to this flag scheme is that the Missing Data flag is not used. This flag is meant to alert real-time operators that an expected observation was not received, and may trigger efforts to fix recording and transmission issues. Since CMAR does not receive real-time data, this placeholder flag was deemed unnecessary. Note that data gaps may still occur due to sensor failure, delays between retrieval and re-deployment, and vandalism. It is the responsibility of the data users to identify and address these data gaps if required.
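
As an illustration of how a data user might work with this flag scheme, the following Python sketch encodes the flag values from Table 1 (without the Missing Data flag, which CMAR does not use) and filters a set of records by flag. The record layout, the `flag` field name, and the `filter_by_flag` function are assumptions made for this example and are not part of the CMAR data products.

```python
from enum import IntEnum


class QartodFlag(IntEnum):
    """Flag values from Table 1 (Missing Data is not used by CMAR)."""
    PASS = 1
    NOT_EVALUATED = 2
    SUSPECT_OF_INTEREST = 3
    FAIL = 4


def filter_by_flag(records, accepted=(QartodFlag.PASS,)):
    """Return the records whose flag is in the accepted set.

    `records` is assumed to be a list of dicts with a "flag" key.
    """
    accepted_values = {int(flag) for flag in accepted}
    return [rec for rec in records if rec["flag"] in accepted_values]


# Hypothetical observations, for illustration only.
observations = [
    {"value": 12.1, "flag": 1},
    {"value": 12.3, "flag": 3},
    {"value": 45.0, "flag": 4},
    {"value": 12.2, "flag": 1},
]

# Keep only records that passed.
strict = filter_by_flag(observations)

# Keep records that passed or were flagged Suspect/Of Interest.
lenient = filter_by_flag(
    observations, accepted=(QartodFlag.PASS, QartodFlag.SUSPECT_OF_INTEREST)
)
print(len(strict), len(lenient))
```

Which flags to accept depends on the application; a user who keeps Suspect/Of Interest records would normally inspect them first.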

Note that QARTOD uses a flag of 3 to denote observations that are Suspect (e.g., of dubious quality) or Of High Interest (e.g., an unusual event). This is meant to encourage human in the loop decision making (IOOS 2020b). Where possible, CMAR has defined when these test results are likely Suspect vs. Of Interest, although data users should inspect these records carefully before deciding how to use (or discard) them.

Tests

CMAR applied five automated QC tests to the Coastal Monitoring Program Water Quality data (Table 2). Three are QARTOD tests (Gross Range, Climatological, and Spike), and the remaining two were developed by CMAR to address specific data quality concerns. Finally, a manual Human in the Loop test was applied, where experts reviewed the results of the automated tests and flagged additional observations where necessary [3].

More detail on each test is provided on the QC Tests page.

Table 2: Automated Quality Control tests applied to Water Quality Data.
Test | Description | Reference
Gross Range | Flags observations that fall outside of the sensor measurement range and observations that are statistical outliers. | IOOS (2018), IOOS (2020c)
Climatological | Flags observations that are statistical outliers for a given month. | IOOS (2018), IOOS (2020c), OOI (2022)
Spike | Flags single-value spikes. | IOOS (2018), IOOS (2020c)
Rolling Standard Deviation | Flags observations with statistically high rolling standard deviation (e.g., multiple-value spikes). | CMAR
Depth Crosscheck [4] | Flags deployments where the sensor depth at low tide is different from the measured depth at low tide. | CMAR
Human in the Loop | Flags observations that human experts recognize as poor quality. | CMAR
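
To make the Gross Range and Spike descriptions in Table 2 more concrete, the Python sketch below follows the general form these tests take in the QARTOD manuals, as understood here. Gross Range compares each value to a sensor range (Fail) and a narrower user-defined range (Suspect), and Spike compares each value to the mean of its two neighbours. All function names, parameter names, and threshold values are hypothetical, and this is not CMAR's implementation.

```python
PASS, NOT_EVALUATED, SUSPECT, FAIL = 1, 2, 3, 4


def gross_range_flag(value, sensor_min, sensor_max, user_min, user_max):
    """Flag one observation against sensor and user-defined ranges."""
    if value < sensor_min or value > sensor_max:
        return FAIL
    if value < user_min or value > user_max:
        return SUSPECT
    return PASS


def spike_flags(values, spike_low, spike_high):
    """Flag each observation by its distance from the mean of its neighbours.

    The first and last observations have only one neighbour, so they are
    marked Not Evaluated in this sketch.
    """
    flags = [NOT_EVALUATED] * len(values)
    for i in range(1, len(values) - 1):
        reference = (values[i - 1] + values[i + 1]) / 2
        difference = abs(values[i] - reference)
        if difference > spike_high:
            flags[i] = FAIL
        elif difference > spike_low:
            flags[i] = SUSPECT
        else:
            flags[i] = PASS
    return flags


# Hypothetical temperature series (degrees Celsius) with one obvious spike.
temperature = [10.0, 10.2, 10.1, 18.0, 10.3, 10.4]

gross = [gross_range_flag(t, -5.0, 35.0, 0.0, 15.0) for t in temperature]
spike = spike_flags(temperature, spike_low=2.0, spike_high=5.0)
print(gross)
print(spike)
```

In this example the spike at 18.0 is flagged Fail by the Spike test, and its two neighbours are flagged Suspect because the spike pulls their reference values upward. Results like these are one reason the flags benefit from human review.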

Thresholds

Each automated QC test requires threshold(s) that determine the results of the test. Choosing appropriate thresholds for each test and variable is a considerable part of the QC effort (IOOS 2018). Following best practices, CMAR has developed thresholds based on historical data where possible (IOOS 2020b; Taylor and Loescher 2013; OOI 2022). The QC Tests page provides an overview of each QC test and how the associated thresholds were calculated. The Thresholds page under each variable in the menu above provides additional details on how the threshold(s) were determined.
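
One common way to derive thresholds from historical data is to take the mean plus or minus a multiple of the standard deviation. The Python sketch below shows that approach for a pair of user-defined thresholds. The multiplier of 3, the function name `user_thresholds`, and the sample values are assumptions for illustration only; the thresholds CMAR actually uses are described on the pages referenced above.

```python
from statistics import mean, stdev


def user_thresholds(historical_values, n_sd=3.0):
    """Return (lower, upper) thresholds as mean -/+ n_sd standard deviations."""
    if len(historical_values) < 2:
        raise ValueError("need at least two historical observations")
    centre = mean(historical_values)
    spread = stdev(historical_values)  # sample standard deviation
    return centre - n_sd * spread, centre + n_sd * spread


# Hypothetical historical temperatures (degrees Celsius).
history = [8.0, 9.0, 10.0, 11.0, 12.0]
lower, upper = user_thresholds(history)
print(round(lower, 2), round(upper, 2))
```

Observations outside the resulting interval would be candidates for a Suspect/Of Interest flag in a test such as Gross Range.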

References

IOOS. 2018. “QARTOD Manual for Real-Time Quality Control of Dissolved Oxygen Observations.” https://ioos.noaa.gov/ioos-in-action/manual-real-time-quality-control-dissolved-oxygen-observations/.
———. 2020a. “QARTOD - Prospects for Real-Time Quality Control Manuals, How to Create Them, and a Vision for Advanced Implementation.” https://doi.org/10.25923/ysj8-5n28.
———. 2020b. “QARTOD Manual for Real-Time Oceanographic Data Quality Control Flags.” https://cdn.ioos.noaa.gov/media/2020/07/QARTOD-Data-Flags-Manual_version1.2final.pdf.
———. 2020c. “QARTOD Manual for Real-Time Quality Control of in-Situ Temperature and Salinity Data: A Guide to Quality Control and Quality Assurance for in-Situ Temperature and Salinity Observations.” https://ioos.noaa.gov/ioos-in-action/temperature-salinity/.
OOI. 2022. “OOI Biogeochemical Sensor Data: Best Practices & User Guide.” https://repository.oceanbestpractices.org/bitstream/handle/11329/2112/OOI%20Biogeochemical%20Sensor%20Data%20Best%20Practices%20and%20User%20Guide.pdf?sequence=1&isAllowed=y.
Taylor, J. R., and H. L. Loescher. 2013. “Automated Quality Control Methods for Sensor Data: A Novel Observatory Approach.” Biogeosciences 10 (7): 4957–71. https://doi.org/10.5194/bg-10-4957-2013.
UNESCO. 2013. “Recommendation for a Quality Flag Scheme for the Exchange of Oceanographic and Marine Meteorological Data.”

Footnotes

  1. e.g., minimal delay from when data are recorded to when they are ready for use

  2. due to technical, logistical, and budget constraints

  3. e.g., when there was a known issue with the deployment that was not flagged by the automated tests

  4. note that this is a deployment-level test.