Thresholds

Quality control (QC) test thresholds for salinity were based on historical Coastal Monitoring Program data. Preliminary QC was applied to the historical data: obvious outliers and observations affected by suspected sensor drift (due to biofouling) were omitted.

Separate thresholds were calculated for salinity measured in Inverness County and for salinity measured elsewhere.

Gross Range Test

Sensor Thresholds

The sensor thresholds were determined based on the manual for the aquaMeasure SAL (Table 1).

Table 1: Salinity sensor thresholds for the Gross Range Test.

User Thresholds

The salinity observations are approximately normally distributed for both groups of data (Figure 1), so the mean and standard deviation were used to determine \(user_{min}\) and \(user_{max}\). The statistics and threshold values are shown in Table 2.

Figure 1: Distribution of salinity observations (binwidth = 0.5 PSU). Dotted orange lines indicate the user thresholds.

Table 2: Gross Range Test statistics and user thresholds for salinity.
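
As a rough illustration of how thresholds of this kind could be derived and applied, the sketch below computes \(user_{min}\) and \(user_{max}\) as the mean minus and plus a multiple of the standard deviation, then classifies a single observation. The multiplier n_sd, the function names, and the pass/suspect/fail labels (borrowed from the usual QARTOD convention) are assumptions made for this example; the text above does not state the multiplier that was used.

```python
from statistics import mean, stdev


def user_thresholds(salinity, n_sd=3.0):
    """Return (user_min, user_max) as mean -/+ n_sd sample standard deviations.

    n_sd is a hypothetical multiplier; the source does not state its value.
    """
    mu = mean(salinity)
    sd = stdev(salinity)  # sample standard deviation (n - 1 denominator)
    return mu - n_sd * sd, mu + n_sd * sd


def gross_range_flag(value, sensor_min, sensor_max, user_min, user_max):
    """Classify one observation against sensor and user thresholds.

    The labels follow the common QARTOD convention (an assumption here):
    outside the sensor range is "fail", outside the user range is "suspect".
    """
    if value < sensor_min or value > sensor_max:
        return "fail"
    if value < user_min or value > user_max:
        return "suspect"
    return "pass"
```

In practice the thresholds would be computed once per data group (Inverness County and all other counties) from the preliminarily QC'd historical observations, and the sensor limits would come from Table 1.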

Climatological Test

The Climatological Test was not applied to the salinity data because there were insufficient observations to calculate robust seasonal thresholds. Both data groups were missing observations for at least one month (Figure 2). Additionally, there is no substantial seasonal salinity cycle, and so the Gross Range Test is expected to identify any outlying observations.

Figure 2: Mean +/- 3 standard deviations of the monthly salinity observations.

Figure 3: Seasonal distribution of salinity observations in Inverness County (binwidth = 0.5 PSU).

Figure 4: Seasonal distribution of salinity observations for all counties except Inverness (binwidth = 0.5 PSU).
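
The monthly summary plotted in Figure 2 and the check for months without data can be sketched as follows. This is only an illustration: the input layout (pairs of month number and salinity) and the function name are assumptions, not the structure of the Coastal Monitoring Program data.

```python
from collections import defaultdict
from statistics import mean, stdev


def monthly_summary(observations):
    """Summarise (month, salinity) pairs, with month given as 1..12.

    Returns a dict mapping month -> (mean - 3 sd, mean + 3 sd) and a sorted
    list of the months that have no observations at all.
    """
    by_month = defaultdict(list)
    for month, value in observations:
        by_month[month].append(value)

    bounds = {}
    for month, values in by_month.items():
        if len(values) < 2:
            continue  # standard deviation is undefined for a single value
        mu, sd = mean(values), stdev(values)
        bounds[month] = (mu - 3 * sd, mu + 3 * sd)

    missing = [m for m in range(1, 13) if m not in by_month]
    return bounds, missing
```

A non-empty list of missing months for a data group is the situation described above, in which seasonal thresholds cannot be calculated robustly.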

Spike Test

The distribution of the spike value is right-skewed (Figure 5), so several upper percentiles (90th, 95th, and 99.7th) were evaluated as candidates for \(spike_{low}\).

There were relatively few large single-value spikes in the salinity data, so the 99.7th percentile was selected to avoid false positives. \(spike_{high}\) was defined as \(3 \times spike_{low}\) to identify especially egregious spike values.

Figure 5: Distribution of the spike value of salinity observations (binwidth = 0.1 %). Dotted orange line indicates \(spike_{low}\); dotted red line indicates \(spike_{high}\).

Table 3: Spike thresholds for salinity.
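
The selection of \(spike_{low}\) and \(spike_{high}\) can be sketched as below. The spike value is taken here to be the absolute difference between an observation and the mean of its two neighbours, which is the usual QARTOD-style definition but is an assumption for this example, since the definition is not given in this section. The helper names and the linear-interpolation percentile are likewise illustrative.

```python
def spike_values(x):
    """Spike value for each interior point: |x[i] - mean of its two neighbours|.

    This definition is an assumption; the source does not define the spike value.
    """
    return [abs(x[i] - (x[i - 1] + x[i + 1]) / 2) for i in range(1, len(x) - 1)]


def percentile(values, q):
    """Percentile q (0..100) of a non-empty sequence, by linear interpolation."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def spike_thresholds(x, q=99.7, multiplier=3):
    """Return (spike_low, spike_high), with spike_high = multiplier * spike_low."""
    spike_low = percentile(spike_values(x), q)
    return spike_low, multiplier * spike_low
```

Calling spike_thresholds with q set to 90, 95, and 99.7 in turn would reproduce the kind of comparison described above, with the resulting flags inspected for false positives.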

Rolling Standard Deviation Test

The distribution of the rolling standard deviation is right-skewed (Figure 6), so several upper percentiles were evaluated as candidates for \(rolling\_sd\_max\).

The 90th, 95th, and 99.7th percentile values were each applied to the raw data (no preliminary QC) and the results inspected. The 90th percentile was determined to be too stringent, as it generated false positives, while the 99.7th percentile was determined to be too lenient, as it resulted in false negatives. The 95th percentile was considered the most useful threshold for both data groups.

Figure 6: Distribution of the 24-hour rolling standard deviation of salinity observations (binwidth = 0.1 PSU). Dotted orange line indicates \(rolling\_sd\_max\).

Table 4: Rolling standard deviation threshold for salinity.
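
A minimal sketch of deriving \(rolling\_sd\_max\) from a percentile of the rolling standard deviation is given below. It assumes regularly sampled observations, so that a fixed number of samples (window) spans 24 hours, and it uses only complete windows. The window alignment and the function names are assumptions made for this example.

```python
from statistics import stdev


def rolling_sd(x, window):
    """Sample standard deviation of every complete run of `window` consecutive values.

    With regular sampling, `window` is the number of samples in 24 hours.
    """
    return [stdev(x[i - window + 1 : i + 1]) for i in range(window - 1, len(x))]


def percentile(values, q):
    """Percentile q (0..100) of a non-empty sequence, by linear interpolation."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def rolling_sd_max(x, window, q=95):
    """Candidate threshold: the q-th percentile of the rolling standard deviation."""
    return percentile(rolling_sd(x, window), q)
```

Observations whose rolling standard deviation exceeds the returned value would then be flagged for review. Repeating the calculation with q set to 90 and 99.7 gives the stricter and more lenient candidates that were compared above.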