Introduction

On April 15th, 2008, Delta and Northwest Airlines announced that they were merging. In the press release, Delta claimed that the benefits of this merger would include greater financial stability and improved customer satisfaction (Delta Air Lines 2008). While oil prices may have been a large reason for the financial instability, it is also worth asking whether decreased customer satisfaction drove the merger. One way to measure customer satisfaction is through whether planes arrived on time, because late arrivals tend to leave customers dissatisfied.

Thanks to the Bureau of Transportation Statistics, we have data on planes’ arrival times from 1987 to 2017 (Transportation Statistics 2018). As a result, we can look into whether the timeliness of Northwest’s flights changed between 1987 and 2008. We include 2008 because, although the merger had been announced, Delta and Northwest continued to operate separately (Wikipedia 2018). Because there is a lot of data, we will look at the number of planes arriving late at the Detroit Metropolitan Airport, a hub for Northwest Airlines, on the first day of each month. Outside of New Year’s Day, no holidays fall on the first of the month, so we assume that the first day is roughly representative of the month. By contrast, if we had looked at the total or average number of late flights per month, some months might show more late flights simply because there are more flights around the holidays, which is something else we would have needed to control for.

Using the data on the number of planes that were late on the first day of the month, we will attempt to fit an ARMA model with a linear trend based on the year. A hypothesis test on whether the regression coefficient for the linear trend is zero will help us determine if the timeliness of Northwest’s planes has changed.

Model

Let n = 1 denote October 1987, n = 2 denote November 1987, …, and n = 255 denote December 2008. Let \(y_n\) be the number of planes arriving late on the first day of month n and \(year_n\) be the year in which month n falls. Our model is then the following:

\[ \begin{aligned} y_n &= Intercept + year_n * \beta + \eta_n\\ \eta_n &= \sum_{i = 1}^{p}\phi_i\eta_{n - i} + \sum_{j = 0}^{q}\psi_j\epsilon_{n - j}\\ \epsilon_n &\sim N(0, \sigma^2) \end{aligned} \]
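To make the structure of this model concrete, the following is a minimal simulation sketch in Python rather than the code used for the analysis. It assumes the usual convention \(\psi_0 = 1\), which the model statement leaves implicit, and every coefficient value passed in is an illustrative placeholder, not a fitted estimate. The function name `simulate_trend_arma` is hypothetical.

```python
import random

def simulate_trend_arma(years, intercept, beta, phi, psi, sigma, seed=0):
    """Simulate y_n = intercept + beta * year_n + eta_n with ARMA(p, q) errors.

    phi: AR coefficients phi_1..phi_p.
    psi: MA coefficients psi_1..psi_q (psi_0 is fixed at 1, an assumption).
    """
    rng = random.Random(seed)
    eta, eps, y = [], [], []
    for n, year in enumerate(years):
        eps.append(rng.gauss(0.0, sigma))
        # AR part: weighted sum of earlier errors eta_{n-i}
        ar = sum(c * eta[n - i] for i, c in enumerate(phi, start=1) if n - i >= 0)
        # MA part: current shock plus weighted earlier shocks eps_{n-j}
        ma = eps[n] + sum(c * eps[n - j] for j, c in enumerate(psi, start=1) if n - j >= 0)
        eta.append(ar + ma)
        y.append(intercept + beta * year + eta[n])
    return y

# Months n = 1..255 run from October 1987 through December 2008.
years = [1987 + (9 + k) // 12 for k in range(255)]
# Coefficients below are illustrative placeholders, not fitted estimates.
y = simulate_trend_arma(years, intercept=20.0, beta=0.0,
                        phi=[0.5, -0.2], psi=[0.3, 0.1], sigma=5.0)
```

Setting `beta=0.0` corresponds to the null hypothesis tested below, while a nonzero `beta` adds the linear trend in the year.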

We will then use a likelihood ratio test to test the following hypotheses:

\[ \begin{aligned} \mathbf{H_0}: \beta = 0\\ \mathbf{H_1}: \beta \neq 0 \end{aligned} \]

Note that for p and q in our model, we will select one set that minimizes the AIC under the null hypothesis and another set that minimizes the AIC under the alternative hypothesis.
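The selection rule can be sketched as a search for the smallest entry of an AIC table. The snippet below is only an illustration: it uses the AR0 to AR3 by MA0 to MA3 corner of the null-hypothesis table reported in the Analysis section, and the helper name `best_order` is hypothetical.

```python
# AIC values copied from the AR0-AR3 by MA0-MA3 corner of the
# null-hypothesis table in the Analysis section (rows: p, columns: q).
aic_null_corner = [
    [2638.590, 2618.321, 2620.303, 2612.121],
    [2617.556, 2615.828, 2615.346, 2611.755],
    [2619.301, 2616.365, 2609.891, 2611.889],
    [2609.968, 2611.781, 2611.890, 2613.342],
]

def best_order(aic_table):
    """Return (p, q, aic) for the smallest entry of an AIC table."""
    return min(
        ((p, q, aic) for p, row in enumerate(aic_table) for q, aic in enumerate(row)),
        key=lambda entry: entry[2],
    )

p, q, aic = best_order(aic_null_corner)
print(f"ARMA({p}, {q}) has the lowest AIC in this corner: {aic}")
```

The same function would be applied separately to the table under the alternative hypothesis.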

Analysis

Our data looks like the following:

We get the following AIC tables under the null and alternative hypotheses:

## AIC table under the null hypothesis
##           MA0      MA1      MA2      MA3      MA4      MA5      MA6
## AR0  2638.590 2618.321 2620.303 2612.121 2613.065 2613.405 2615.158
## AR1  2617.556 2615.828 2615.346 2611.755 2613.155 2611.983 2613.872
## AR2  2619.301 2616.365 2609.891 2611.889 2612.152 2614.065 2615.004
## AR3  2609.968 2611.781 2611.890 2613.342 2614.148 2615.860 2611.808
## AR4  2611.767 2610.331 2611.637 2613.327 2611.388 2613.422 2614.755
## AR5  2613.763 2615.759 2612.928 2613.306 2615.865 2615.370 2616.649
## AR6  2615.598 2612.671 2614.625 2616.827 2612.170 2613.442 2613.427
## AR7  2615.100 2613.448 2612.695 2611.215 2613.098 2615.044 2613.159
## AR8  2615.523 2615.259 2614.628 2613.061 2613.358 2610.989 2612.517
## AR9  2615.877 2616.823 2615.628 2618.006 2615.290 2615.691 2614.574
## AR10 2616.826 2618.492 2619.220 2617.077 2614.807 2615.235 2614.617
##           MA7      MA8      MA9     MA10
## AR0  2611.174 2612.969 2614.950 2616.265
## AR1  2612.903 2614.224 2616.161 2618.261
## AR2  2614.855 2616.596 2618.099 2613.555
## AR3  2610.156 2611.214 2613.257 2614.551
## AR4  2611.456 2613.213 2615.877 2622.814
## AR5  2612.341 2612.533 2616.202 2615.546
## AR6  2610.610 2613.929 2618.352 2614.498
## AR7  2614.073 2615.657 2616.531 2616.363
## AR8  2618.534 2614.287 2615.157 2615.475
## AR9  2614.491 2616.269 2620.403 2617.404
## AR10 2618.473 2617.036 2622.417 2615.450
## AIC table under the alternative hypothesis
##           MA0      MA1      MA2      MA3      MA4      MA5      MA6
## AR0  2640.570 2620.313 2622.295 2614.121 2615.064 2615.405 2617.158
## AR1  2619.552 2617.818 2617.316 2613.755 2615.138 2613.983 2615.872
## AR2  2621.299 2618.338 2611.855 2613.850 2614.141 2616.050 2616.973
## AR3  2611.966 2613.778 2613.852 2615.266 2616.135 2618.137 2613.783
## AR4  2613.765 2612.320 2613.619 2615.318 2613.363 2615.383 2616.727
## AR5  2615.761 2617.756 2614.928 2612.871 2617.830 2617.098 2618.641
## AR6  2617.597 2614.664 2616.618 2618.802 2614.131 2612.856 2614.004
## AR7  2617.084 2615.434 2614.689 2613.187 2615.064 2617.031 2615.135
## AR8  2617.517 2617.250 2616.624 2615.040 2615.367 2612.966 2614.437
## AR9  2617.854 2618.807 2617.612 2619.982 2617.215 2617.662 2617.576
## AR10 2618.821 2620.485 2621.219 2619.025 2619.127 2619.279 2616.618
##           MA7      MA8      MA9     MA10
## AR0  2613.171 2614.967 2616.948 2618.265
## AR1  2614.901 2616.215 2618.153 2620.261
## AR2  2616.853 2618.573 2620.072 2615.555
## AR3  2612.154 2613.213 2615.256 2616.545
## AR4  2613.454 2616.154 2617.847 2624.644
## AR5  2614.339 2616.299 2615.991 2619.833
## AR6  2612.641 2615.903 2620.582 2619.561
## AR7  2616.094 2617.631 2618.537 2618.324
## AR8  2619.013 2616.250 2616.601 2618.600
## AR9  2616.491 2616.953 2622.362 2618.493
## AR10 2618.457 2619.585 2621.221 2617.195

Note that there are convergence issues in the AIC tables: adding a single parameter to a nested model should raise the AIC by at most 2, yet some neighboring entries increase by more than that, for example from ARMA(6, 8) to ARMA(6, 9). According to the tables, the ARMA(2, 2) model has the lowest AIC under both the null and alternative hypotheses, with the ARMA(3, 0) model a close second in both. Looking at the default ACF plot from R,

either choice makes sense because the autocorrelations at the zeroth, first, and third lags clearly fall outside the range expected for white noise. Because the ARMA(2, 2) model reduces the AIC the most and gives us a model with both AR and MA terms, so that past noise terms also enter the model, we’ll proceed with the ARMA(2, 2) model.
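The convergence check on the AIC tables can be sketched as a scan over neighboring entries. Adding one AR or MA term adds one parameter, and the maximized log likelihood cannot decrease, so the AIC should rise by no more than 2. The snippet below is a sketch applied to four entries of the AR6 row of the null-hypothesis table, and the function name `aic_violations` is hypothetical.

```python
def aic_violations(aic_table, tol=2.0):
    """List neighbouring (smaller, larger) model pairs whose AIC rises by more than tol.

    Entry [p][q] is the AIC of ARMA(p, q). Adding one AR or MA term adds
    one parameter, so a correctly maximised likelihood cannot raise the
    AIC by more than 2.
    """
    found = []
    for p, row in enumerate(aic_table):
        for q, aic in enumerate(row):
            if q + 1 < len(row) and row[q + 1] - aic > tol:
                found.append(((p, q), (p, q + 1)))
            if p + 1 < len(aic_table) and aic_table[p + 1][q] - aic > tol:
                found.append(((p, q), (p + 1, q)))
    return found

# AR6 row, columns MA7..MA10, copied from the null-hypothesis table.
ar6_row = [[2610.610, 2613.929, 2618.352, 2614.498]]
# Indices are relative to this slice: column 0 is MA7.
print(aic_violations(ar6_row))
```

In this slice both the MA7 to MA8 and the MA8 to MA9 steps are flagged, which suggests that the numerical optimization did not fully converge for some of the larger models.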

If we fit the ARMA(2,2) models under the null hypothesis, we get the following estimates:

Under the alternative hypothesis, we get the following estimates:

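Note that, unlike other analyses of airline data (Box and Jenkins 1970), we don’t include a seasonal term. If we look at the periodogram,

there is a spike at the inverse of the number of observations, which is the lowest frequency the periodogram resolves and points to slow, trend-like variation rather than an annual cycle; a yearly seasonal pattern in monthly data would instead show up near a frequency of 1/12 cycles per month.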
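To illustrate how a periodogram separates these two cases, here is a small sketch using a direct discrete Fourier transform. Both series are synthetic and purely illustrative, not the flight data, and the function name `periodogram` is a hypothetical helper rather than the R routine used for the plot.

```python
import cmath
import math

def periodogram(series):
    """Return (frequency, power) pairs at the Fourier frequencies k/n, k = 1..n//2."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    result = []
    for k in range(1, n // 2 + 1):
        dft = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(centered))
        result.append((k / n, abs(dft) ** 2 / n))
    return result

# Synthetic monthly series with an annual cycle (illustrative, not the flight data).
seasonal = [math.cos(2 * math.pi * t / 12) for t in range(120)]
peak_freq, _ = max(periodogram(seasonal), key=lambda fp: fp[1])
# Synthetic series dominated by a slow drift instead of a cycle.
drift = [t / 120 for t in range(120)]
drift_freq, _ = max(periodogram(drift), key=lambda fp: fp[1])
```

For the seasonal series the peak lands at 1/12 cycles per month, while for the drifting series it lands at the lowest frequency, 1/n, which is the pattern described above.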

Because only one parameter differs between the null and alternative hypotheses, twice the log likelihood ratio is asymptotically distributed, under the null hypothesis, as a chi-square distribution with one degree of freedom. Twice the difference between the log likelihoods under the alternative and null hypotheses is 0.0362248, which has a p-value of 0.8490519. Thus, we fail to reject the null hypothesis that \(\beta\) is 0 at the 5% significance level.
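The p-value can be reproduced from the reported test statistic with a short calculation. This sketch relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal, so its upper tail can be written with the complementary error function. The helper name `chi2_sf_1df` is hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    If Z is standard normal, Z**2 is chi-square(1), so
    P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

lr_statistic = 0.0362248   # reported in the text: 2 * (loglik_alt - loglik_null)
p_value = chi2_sf_1df(lr_statistic)
reject_at_5_percent = p_value < 0.05
print(round(p_value, 4), reject_at_5_percent)
```

Equivalently, the statistic is far below the 5% critical value of about 3.84 for one degree of freedom.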

Discussion

Based on our analysis, it does not seem that the number of Northwest planes arriving late has changed linearly over time. This is consistent with the estimate for \(\beta\) being within one standard error of zero, although we don’t know its exact sampling distribution. Further, if we compare a simulation of our fitted ARMA(2,2) model without the linear trend to our actual data, we have the following plot:

While the plots don’t overlap, we still get something similar to our actual data. As such, it doesn’t appear that adding a linear trend based on the year helps.

However, there are improvements that can be made to this analysis. First, for the ARMA model that we fit, the MA roots are the following:

## [1]  1.506919+0i -1.066246+0i
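The roots printed above can be checked against the unit circle with a short sketch. It assumes the MA polynomial is written as \(1 + \psi_1 x + \psi_2 x^2\), so the model is invertible when both roots have modulus greater than 1. The coefficients are derived here from the printed roots for illustration only (they are not reported in the text), and the helper names `ma2_roots` and `is_invertible` are hypothetical.

```python
import cmath

def ma2_roots(psi1, psi2):
    """Roots of the MA(2) polynomial 1 + psi1 * x + psi2 * x**2 (psi2 != 0)."""
    disc = cmath.sqrt(psi1 * psi1 - 4.0 * psi2)
    return ((-psi1 + disc) / (2.0 * psi2), (-psi1 - disc) / (2.0 * psi2))

def is_invertible(roots, margin=0.0):
    """True when every root lies outside the unit circle by more than margin."""
    return all(abs(r) > 1.0 + margin for r in roots)

reported_roots = (1.506919, -1.066246)   # roots printed in the text
# Coefficients implied by those roots (derived here, not reported in the text):
# 1 + psi1*x + psi2*x**2 = (1 - x/r1) * (1 - x/r2)
r1, r2 = reported_roots
psi2 = 1.0 / (r1 * r2)
psi1 = -(r1 + r2) / (r1 * r2)
recovered = ma2_roots(psi1, psi2)
print(is_invertible(reported_roots), is_invertible(reported_roots, margin=0.1))
```

Both roots lie outside the unit circle, but the second clears it by less than 0.1, so the check fails once a modest safety margin is required.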

Because one of the roots, -1.066246, has an absolute value close to 1, the fitted model is close to being non-invertible. As stated in the class notes, this might have led to numerical instabilities (Ionides 2018). Further, when we check the assumptions of our model, the autocorrelation plot and QQ plot of the residuals look okay, but the residuals plot does not. Looking at the autocorrelation plot of the residuals,

we see that there is no trend in the autocorrelation and no clearly significant lag past the zeroth lag. Next, looking at the QQ plot of the residuals,

all but a few points at the ends follow a slightly nonlinear trend. Still, it seems reasonable to fit a straight line to the QQ plot because there is only a slight bend. As such, it might be reasonable to treat the residuals as normal. However, if we look at the residuals plot, we get the following plot:

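which looks similar to the plot of our original data. Residuals from a well-fitting model should look like unstructured noise, so this suggests that the ARMA(2,2) model leaves much of the structure unexplained. As a result, we might have been able to better model this data with other techniques.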

Indeed, along these lines, we assumed a linear relationship with the year. However, if we look at the log-transformed data,

the data looks sinusoidal with sharper downward spikes. Perhaps there is a non-linear trend to this data. Or, we might need a different time series model. If we expand the autocorrelation plot to show up to 100 lags,

we see a sinusoidal relationship, but with a changing period. There appears to be a complete cycle by lag 40, but lags 40 to 80 complete only half a cycle.
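For readers who want to reproduce this kind of extended autocorrelation plot outside of R, a minimal sketch of the sample autocorrelation function follows. The series used is synthetic (a cycle of period 40 plus noise) and stands in for the flight data only as an illustration. The band of 1.96 divided by the square root of n is the usual approximate 95% band for white noise, and the helper names are hypothetical.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag using the usual biased estimator."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(max_lag + 1)]

def white_noise_band(n):
    """Approximate 95% band for the sample ACF of white noise."""
    return 1.96 / math.sqrt(n)

# Illustrative series only: a cycle of period 40 plus noise, 255 points.
rng = random.Random(1)
demo = [math.sin(2 * math.pi * t / 40) + rng.gauss(0, 0.3) for t in range(255)]
acf = sample_acf(demo, 100)
outside = [h for h, r in enumerate(acf) if h > 0 and abs(r) > white_noise_band(len(demo))]
```

A series with a fixed period produces an autocorrelation that repeats at that period, so a cycle length that changes across lags, as described above, hints that a single sinusoid would not describe the data well.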

Taking a step back, we also faced data limitations. While it is possible to pull information for multiple airports and for other factors, we could only pull one month of one year at a time, so aggregating information across time was restricted by our queries. With more information, we might have been able to filter out cancelled flights for a more accurate representation of late flights. We could also have compared flights from Northwest against flights from other airlines. However, more information on late flights might not have helped us, because our hypothesis test showed no linear relationship between the year and the number of planes arriving late.

Conclusion

In this report, we sought to explore whether Northwest’s planes arriving late might have led to its merger with Delta. Based on our analysis of its Detroit hub, we saw no evidence that the number of planes arriving late on the first of the month changed linearly over time. By this measure, it is unlikely that late arrivals led to Northwest’s merger with Delta.

References

Box, G.E.P., and G. Jenkins. 1970. Time Series Analysis, Forecasting and Control. Holden-Day.

Delta Air Lines. 2008. “Delta Air Lines, Northwest Airlines Combining to Create America’s Premier Global Airline.” https://web.archive.org/web/20080415215425/http://news.delta.com/article_display.cfm?article_id=11034.

Ionides, Ed. 2018. “Stats 531 (Winter 2018) ‘Analysis of Time Series’.” https://ionides.github.io/531w18/#class-notes.

Transportation Statistics, Bureau of. 2018. “Detailed Statistics Arrivals.” https://www.transtats.bts.gov/ONTIME/Arrivals.aspx.

Wikipedia. 2018. “Delta Air Lines–Northwest Airlines Merger.” https://en.wikipedia.org/wiki/Delta_Air_Lines–Northwest_Airlines_merger.