The Colorado River is an important source of water for the United States. It is the primary source of drinking water for roughly 40 million people in the southwestern United States and is critical to agriculture in the region. As a result, it is subject to a large body of interstate compacts, federal laws, court decisions, and contracts. These are referred to collectively as the “Law of the River.”

The foundation of the “Law of the River” is the Colorado River Compact of 1922. This was an agreement among seven U.S. states (Colorado, Utah, Wyoming, Arizona, New Mexico, California, Nevada) governing the allocation of water rights associated with the river. The Compact divided the Colorado River Basin into the Upper Basin and Lower Basin. The division point between the two is Lee’s Ferry, a point in the river about 30 river miles south of the Utah-Arizona boundary. As such, Lee’s Ferry is the principal point at which river flow is measured to determine water allocations. The Compact apportions 7.5 million acre-feet of water per year to each basin, which was intended as an equal split of the river’s flow between the two.

Given the importance of the water rights to the livelihood of each state, we have detailed records of the flow of the Colorado River going back to 1920. For more recent years, we also have measurements of the cleanliness of the river, among other attributes.

In this analysis, we investigate whether there is an association between the strength of the river’s flow and its cleanliness. In particular, we model conductance as a function of discharge using a regression with SARMA errors. By convention, we measure the strength of the flow by the flow rate, in cubic feet per second (\(\text{ft}^3/\text{s}\)), and refer to this measure as discharge. We use specific conductance, in microsiemens per centimeter, as a proxy for the cleanliness of the river. Specific conductance measures how well water conducts an electrical current across a unit length and unit cross-section at a reference temperature (25 °C by convention). Hereafter, we refer to this measure simply as conductance. Water conducts electricity better when it contains dissolved solids such as chloride, nitrate, sulfate, phosphate, sodium, magnesium, calcium, and iron. Therefore, in general, higher conductance indicates a higher concentration of dissolved solids, which we interpret as a dirtier river.

We obtained our data from <http://waterdata.usgs.gov>. We begin our analysis by visualizing monthly discharge and conductance values over the sample period, January 1995 through 2015. We use this sample period because 1995 is the earliest year for which consistent conductance measurements are available at Lee’s Ferry. Note, however, that we imputed around 20 missing conductance measurements using the same month’s value from the prior year.
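The imputation rule described above can be sketched in R as follows. This is a minimal illustration rather than the code used in the analysis: the function name `impute_prior_year` and the toy values are hypothetical, and gaps are assumed to be coded as `NA`.

```r
# Fill a missing monthly value with the value from the same month one year earlier.
# `x` is a numeric vector of monthly values in time order; NA marks a gap.
impute_prior_year <- function(x, period = 12) {
  for (i in seq_along(x)) {
    # Only fill when a prior-year value exists and is itself available.
    if (is.na(x[i]) && i > period && !is.na(x[i - period])) {
      x[i] <- x[i - period]
    }
  }
  x
}

# Toy series (made-up numbers, not USGS data): two years with two gaps in year two.
conductance <- c(seq(700, 810, by = 10), seq(705, 815, by = 10))
conductance[c(15, 20)] <- NA
filled <- impute_prior_year(conductance)
```

Because the loop runs in time order, a gap in the same month of consecutive years is filled from the most recent available year. Gaps in the first year are left as `NA`, since there is no prior year to borrow from.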

From these charts, we see that both discharge and conductance exhibit annual cyclic behavior; however, it is hard to see how the cycles relate to one another. To clarify this, we visualize the average value for each calendar month over the sample period below.
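Monthly averages of this kind can be computed by grouping a monthly time series on its position within the year. The sketch below uses simulated values in place of the USGS series, and the helper name `monthly_mean` is an assumption made for illustration.

```r
# Simulated monthly series starting January 1995 (stand-in values, not USGS data).
set.seed(1)
n_years <- 4
discharge <- ts(rnorm(12 * n_years, mean = 100, sd = 10),
                start = c(1995, 1), frequency = 12)

# cycle() returns the month index (1 to 12) of each observation in a monthly ts,
# so tapply() averages all Januaries together, all Februaries together, and so on.
monthly_mean <- function(x) {
  tapply(as.numeric(x), as.numeric(cycle(x)), mean)
}

avg <- monthly_mean(discharge)
names(avg) <- month.abb
```

Applying the same helper to both series gives two twelve-value profiles that can be plotted on a shared month axis.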

From this chart, we gather that conductance and discharge have an inverse relationship. That is, the months with typically low discharge (such as March and April) are those with high conductance, while the months with high discharge (the summer months, when snow is melting in the Rocky Mountains) are associated with lower conductance. We also note that discharge appears to have two cycles per year while conductance appears to have a single annual cycle. We further explore this phenomenon by plotting the spectral density estimates below.

The spectral density estimates largely confirm what we learned from the monthly analysis above. That is, discharge seems to have two cycles per year while conductance has one cycle per year. Both series also appear to have longer-term trends associated with low frequencies.
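A spectral density estimate of this kind can be obtained in R with a smoothed periodogram. The sketch below is illustrative only: the two series are simulated, the smoothing spans are an arbitrary choice, and `peak_frequency` is a hypothetical helper.

```r
set.seed(2)
n <- 240
t <- seq_len(n)

# Simulated monthly series: one with an annual cycle, one with a six-month cycle.
annual     <- ts(sin(2 * pi * t / 12) + rnorm(n, sd = 0.5), frequency = 12)
semiannual <- ts(sin(2 * pi * t / 6)  + rnorm(n, sd = 0.5), frequency = 12)

# Smoothed periodogram. With frequency = 12, the frequency axis is in cycles per year.
peak_frequency <- function(x, spans = c(3, 3)) {
  s <- spectrum(x, spans = spans, plot = FALSE)
  s$freq[which.max(s$spec)]
}

peak_annual     <- peak_frequency(annual)      # expected near 1 cycle per year
peak_semiannual <- peak_frequency(semiannual)  # expected near 2 cycles per year
```

Setting `plot = TRUE` instead draws the estimated spectrum, which is how a dominant peak near one or two cycles per year would be read off visually.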

We next move to the modeling phase. We write \(e^{*}_{1:N}\) for the observed monthly conductance values and \(u^{*}_{1:N}\) for the corresponding monthly discharge values.

We analyze \(e^{*}_{1:N}\) using a regression with SARMA errors, \[ E_n = \alpha + \beta u^{*}_n + \epsilon_n,\] where \(\{\epsilon_n\}\) is a SARMA\((3,3)\times(1,1)_{12}\) process. We arrived at this model after exploring a subset of SARMA combinations. The model fit is presented below.
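For reference, one common way to write out the SARMA\((3,3)\times(1,1)_{12}\) error model is sketched here. The coefficient symbols \(\phi_i\), \(\Phi_1\), \(\psi_j\), \(\Psi_1\), the white noise term \(\omega_n\), and the backshift operator \(B\) are notation assumed for this sketch rather than taken from the analysis. The sign convention follows that of R's arima function.

\[ \phi(B)\,\Phi(B^{12})\,\epsilon_n = \psi(B)\,\Psi(B^{12})\,\omega_n, \]
\[ \phi(x) = 1 - \phi_1 x - \phi_2 x^2 - \phi_3 x^3, \qquad \Phi(x) = 1 - \Phi_1 x, \]
\[ \psi(x) = 1 + \psi_1 x + \psi_2 x^2 + \psi_3 x^3, \qquad \Psi(x) = 1 + \Psi_1 x, \]

where \(\{\omega_n\}\) is Gaussian white noise with variance \(\sigma^2\) and \(B\epsilon_n = \epsilon_{n-1}\). Under this convention, the eight ARMA coefficients correspond to ar1 through ar3, ma1 through ma3, sar1, and sma1 in the fitted model output.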

## 
## Call:
## arima(x = grouped$conductance, order = c(3, 0, 3), seasonal = list(order = c(1, 
##     0, 1), period = 12), xreg = grouped$discharge)
## 
## Coefficients:
##           ar1     ar2     ar3     ma1     ma2      ma3    sar1     sma1
##       -0.6570  0.6647  0.6857  1.2411  0.1812  -0.3942  0.8593  -0.4698
## s.e.   0.1223  0.0710  0.0982  0.1350  0.1766   0.0834  0.0813   0.1684
##       intercept  grouped$discharge
##        780.9032            -0.1712
## s.e.    43.7858             0.1014
## 
## sigma^2 estimated as 1108:  log likelihood = -1244.69,  aic = 2511.37
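A fit of this general form, together with basic residual checks, can be sketched on simulated data as follows. Everything here is a stand-in: the series are simulated rather than taken from USGS, the orders are kept small so the example runs quickly, and `method = "ML"` is chosen for numerical robustness, whereas the call above uses the default method.

```r
set.seed(3)
n <- 240
month <- rep(1:12, length.out = n)

# Simulated stand-ins for the real series (hypothetical values).
discharge   <- 10 + 3 * sin(2 * pi * month / 12) + rnorm(n)
errors      <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 30))
conductance <- 800 - 5 * discharge + errors

# Regression with SARMA errors: conductance on discharge, ARMA structure in the errors.
fit <- arima(conductance, order = c(1, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 12),
             xreg = discharge, method = "ML")

# Residual diagnostics: sample autocorrelations and a Ljung-Box test.
res     <- residuals(fit)
res_acf <- acf(res, lag.max = 24, plot = FALSE)
lb      <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 2)  # 2 fitted ARMA coefficients
```

Residual autocorrelations that stay within the usual confidence bands, and a Ljung-Box p-value that is not small, would be consistent with an adequate error model.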

The standard errors for this model, which are computed from the observed Fisher information, do not suggest a statistically significant association between conductance and discharge at the 5% level: the estimated discharge coefficient is less than two standard errors from zero. A likelihood ratio test gives a p-value of 0.09, which is consistent with this conclusion. Although the model does not detect an association between conductance and discharge, it does fit acceptably, as indicated by the residuals over time and the residual autocorrelation function (ACF).
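The significance calculation can be illustrated with the reported estimate and standard error. The sketch below computes a Wald z-statistic under a normal approximation, which gives a p-value of roughly 0.09. The helper `lrt_pvalue` is a hypothetical function showing how a likelihood ratio test p-value would be obtained from two nested fits; the log likelihood of the model without discharge is not reported here, so it is left as an input.

```r
# Reported estimate and standard error for the discharge coefficient.
beta_hat <- -0.1712
beta_se  <- 0.1014

# Wald z-statistic and two-sided p-value under a normal approximation.
z      <- beta_hat / beta_se
p_wald <- 2 * pnorm(-abs(z))

# Likelihood ratio test for nested models: twice the log likelihood difference
# is compared with a chi-squared distribution (df = number of dropped parameters).
lrt_pvalue <- function(loglik_full, loglik_null, df = 1) {
  stat <- 2 * (loglik_full - loglik_null)
  pchisq(stat, df = df, lower.tail = FALSE)
}
```

With one restricted parameter, the Wald and likelihood ratio tests are asymptotically equivalent, so similar p-values from the two are expected.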

In conclusion, although we are not able to detect a statistically significant association between conductance and discharge in this particular model, we do see a strong pattern in the exploratory analysis between these two variables. In particular, it appears that the periods of higher flow clean the river and result in lower conductance. This also appeals strongly to our intuition. Therefore, this relationship deserves further exploration using more complex models.

Supplementary Analysis

Below we present a table of AIC values for various SARMA\((p,q)\times(1,1)_{12}\) models.

        MA0      MA1      MA2      MA3      MA4      MA5
AR0  2720.45  2618.66  2569.62  2559.23  2539.29  2541.05
AR1  2529.31  2516.90  2517.77  2518.66  2518.85  2519.74
AR2  2520.33  2517.45  2518.64  2513.34  2514.10  2515.69
AR3  2521.00  2519.39  2516.19  2511.37  2513.19  2515.19
AR4  2516.60  2514.69  2515.78  2513.19  2509.36  2516.33
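A table of this shape can be produced by looping over the non-seasonal orders and recording the AIC of each fit. The sketch below is an assumption about how such a table might be built, not the code used here: `aic_table` is a hypothetical helper, the data are simulated, and fits that fail are recorded as `NA` rather than stopping the loop.

```r
# AIC for regression with SARMA(p,q)x(1,1)_period errors, for p in 0..P and q in 0..Q.
aic_table <- function(y, x, P, Q, period = 12) {
  table <- matrix(NA_real_, P + 1, Q + 1,
                  dimnames = list(paste0("AR", 0:P), paste0("MA", 0:Q)))
  for (p in 0:P) {
    for (q in 0:Q) {
      table[p + 1, q + 1] <- tryCatch(
        arima(y, order = c(p, 0, q),
              seasonal = list(order = c(1, 0, 1), period = period),
              xreg = x)$aic,
        error = function(e) NA_real_  # keep going if a fit fails
      )
    }
  }
  table
}

# Small simulated example (stand-in values, not USGS data).
set.seed(4)
n <- 120
x <- rnorm(n)
y <- 800 - 5 * x + as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 30))
tab <- aic_table(y, x, P = 1, Q = 1)
```

AIC values from numerical optimization can be imperfect, so small differences between neighboring cells are best read as a rough guide rather than a strict ranking.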