Blinded.
Bitcoin and the NASDAQ Composite represent two distinct asset classes: cryptocurrency and traditional equities, respectively. Bitcoin is characterized by its high volatility and decentralized nature, while the NASDAQ, a stock market index, primarily tracks the performance of the technology sector. Understanding the relationship between these two assets is crucial for investors seeking to diversify their portfolios across both traditional and emerging markets. Specifically, analyzing the correlation between Bitcoin and the NASDAQ Composite can offer valuable insights into the investor base of Bitcoin. It may reveal whether Bitcoin investors are traditional market participants who also engage in stock market investments or a distinct group of early-stage investors drawn solely to the innovative potential of this new technology.
This analysis investigates whether Bitcoin and NASDAQ prices exhibit correlated movements over time. Such a correlation could indicate shared market sentiment or highlight divergent behaviors during periods of economic change. We hypothesize that the prices of the NASDAQ Composite and Bitcoin are positively correlated. This assumption is based on the premise that investors who are drawn to Bitcoin are likely to have a strong understanding of the technology sector and may already be invested in technology companies listed on the NASDAQ. By exploring this relationship, we aim to shed light on the broader dynamics between traditional equities and emerging digital assets.
Drawing on monthly data from January 2012 to February 2025, our study builds on these foundational ideas by applying time series methods. We perform log transformations and differencing to stabilize variance and achieve stationarity, and we use LOESS smoothing to decompose the data into trend, noise, and cyclical components. An ARIMA-based framework is also employed to model the short-term dynamics of the differenced series. These methods enhance our analysis and provide a clear diagnostic framework to examine our hypothesis. Through this approach, we aim to provide further insight into whether the observed price movements are driven by shared market sentiment or are coincidental, ultimately informing investors about potential diversification strategies in a changing financial landscape.
Data for the NASDAQ Composite and Bitcoin closing prices were obtained from Investing.com. The dataset spans from January 1, 2012, to February 1, 2025, capturing monthly observations on the first day of each month.
The NASDAQ and Bitcoin datasets were merged, and only the closing prices were retained for the analysis. A log transformation was applied to both series to stabilize the variance. We then differenced the log-transformed data to remove the trend.
To decompose the monthly data from 2012 to 2024 using LOESS (locally estimated scatterplot smoothing), we first fit a trend component with a locally weighted regression model. Because we want to capture long-term trends and reduce sensitivity to medium-term fluctuations, we use a span of 0.6 rather than the 0.5 used in the lecture notes for Chapter 8, which gives a more stable representation of the trend. We then extract the noise component using a LOESS fit with a much smaller span: setting the noise span to 0.07 ensures that only high-frequency fluctuations remain in this component. Finally, the cyclical component is obtained by subtracting both the trend and the noise from the data, capturing medium-term fluctuations.
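The decomposition described above can be sketched in base R as follows. This is a minimal illustration on simulated data, not the report's code: the series `price_series`, the seed, and the simulated trend and cycle are all hypothetical, and the sketch assumes the common construction in which the noise is the residual left by the narrow-span fit.

```r
# Simulated stand-in for a monthly series (NOT the report's data).
set.seed(531)
n <- 158                                   # Jan 2012 to Feb 2025, monthly
time_index <- seq_len(n)
price_series <- 0.02 * time_index + sin(2 * pi * time_index / 48) +
  rnorm(n, sd = 0.2)

# Trend: LOESS fit with a wide span (0.6), so only slow movements survive.
trend <- loess(price_series ~ time_index, span = 0.6)$fitted

# Noise (assumed construction): residual after a narrow-span (0.07) fit,
# which leaves only the highest-frequency variation.
noise <- price_series - loess(price_series ~ time_index, span = 0.07)$fitted

# Cycles: the remainder once trend and noise are subtracted.
cycles <- price_series - trend - noise

decomposition <- cbind(price_series, trend, noise, cycles)
```

By construction the three components add back up to the original series, so the choice of spans only changes how the variation is divided among them.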
An autoregressive integrated moving average (ARIMA) model was fitted to the differenced data. The numbers of AR terms (p) and MA terms (q) were chosen using the Akaike information criterion (AIC).
Bitcoin exhibits exponential growth, while NASDAQ grows steadily. Periods of alignment (e.g., the 2020-2021 COVID recovery) and divergence (e.g., the 2022 market downturn) are evident (Figure 1). We first analyze each series separately to understand its trend, noise, and cyclical components.
We observe a persistent upward trajectory in the NASDAQ index throughout the period, with the slope becoming notably steeper after 2020. This indicates that the core value of the technology companies that dominate the NASDAQ has been growing more quickly in recent years, even after setting aside short-term changes and economic cycles [1]. The noise component shows consistent variation throughout, with a larger amplitude after 2020, indicating that monthly market volatility has grown substantially. The lengths of the NASDAQ cycles are similar to those of the Bitcoin cycles described below. The most pronounced cycle occurs from 2020 to 2022, with a peak around late 2021 followed by a significant dip. This pattern aligns with the tech sector’s rapid growth during the post-COVID recovery and its subsequent decline as interest rates increased [2].
We observe that, unlike NASDAQ, Bitcoin’s trend line remained relatively flat until 2020, after which there was a rapid upward shift. According to Investopedia, “The pandemic shutdown and subsequent government policies fed investors’ fears about the global economy and accelerated Bitcoin’s rise”. The trend continues to rise at an accelerated rate from then on. The noise component shows minimal variation until 2018 and small-magnitude variation from 2018 to 2020; from 2020 onward the variation is clearly more pronounced. The cyclical component reflects several key episodes in Bitcoin’s price history. The first small cycle, in late 2017, captures Bitcoin’s first major bull run. The next major cycle spans 2020-2022, with a strong positive deviation that starts in 2020, peaks in 2021, and returns to baseline by 2022. Finally, a positive recovery cycle begins in mid-2023 and strengthens through 2024.
To obtain a clear picture of the underlying periodic structure, we apply a smoothing technique using repeated rectangular windows. This non-parametric smoothing helps to average out the random fluctuations present in the raw periodogram, allowing us to better identify the dominant frequencies in the data without assuming a specific model.
We use spans = c(20,20) to apply two successive smoothing windows of 20 points each, balancing variance reduction (through averaging neighboring frequencies) with preservation of spectral resolution. This moderate smoothing suppresses spurious peaks while maintaining sufficient detail to identify both the dominant cycles and potential secondary oscillations.
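A minimal sketch of this smoothing step in base R is shown below. The input series is simulated white noise used purely as a placeholder (an assumption, not the report's data). Note that R's `spec.pgram`, which `spectrum` calls, interprets `spans` as the widths of modified Daniell smoothers.

```r
set.seed(531)
n <- 158
x <- ts(rnorm(n))     # placeholder for a detrended or differenced series

# spans = c(20, 20) applies two successive modified Daniell smoothers.
smoothed <- spectrum(x, spans = c(20, 20), plot = FALSE)

# Frequency (cycles per observation) with the largest smoothed spectral density,
# and the corresponding period in months for monthly data.
peak_freq <- smoothed$freq[which.max(smoothed$spec)]
peak_period_months <- 1 / peak_freq
```

A four-year cycle in monthly data would correspond to a frequency near 1/48 cycles per month, so that is the region of the smoothed periodogram to inspect for a peak.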
The smoothed periodogram of Bitcoin shows no evidence of strong cyclical patterns, as there is no distinct peak at any specific frequency. This calls into question the widely cited four-year cycle in Bitcoin [10], although roughly 13 years of monthly data can contain only about three such cycles, which limits how firmly this can be concluded.
The smoothed periodogram of the NASDAQ shows properties similar to that of Bitcoin: we do not observe any consistent cyclical pattern throughout the period of observation, despite a few irregular cycles capturing the bullish and bearish trends of the market, as shown in the decomposition plot above.
We began by merging the NASDAQ and Bitcoin datasets to facilitate a comparative analysis, retaining only the closing prices as the primary variable of interest. Initial visual inspection of the raw price series (Figures 1 and 2) revealed clear non-stationarity, characterized by trends and time-dependent variance. To address this, we applied a log transformation to the data. The log transformation is a common preprocessing step for financial time series because it stabilizes variance [7] and, since prices are strictly positive, it is well defined for every observation and keeps back-transformed values positive.
However, the log-transformed series (Figure 7) still exhibited non-stationarity. To achieve stationarity, we applied first-order differencing to the log-transformed series. Differencing is a widely used technique to remove trends and make a series stationary by eliminating time-dependent structures [3] [7]. Specifically, we transformed the log-transformed series \(y_{1:n}\) into a differenced series \(z_{2:n}\) by taking the first order difference \(z_n = \Delta y_n = y_n - y_{n-1}\) [3].
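As a small numerical illustration of this transformation, the sketch below uses four hypothetical closing prices (not taken from the dataset) and checks that the differenced log series equals the log of the ratio of consecutive prices, which is why it is commonly read as a log return.

```r
price <- c(100, 110, 99, 120)          # hypothetical closing prices
log_price <- log(price)                # y_1, ..., y_n
log_return <- diff(log_price)          # z_n = y_n - y_{n-1}, n = 2, ..., 4

# Equivalent form: log of the ratio of consecutive prices.
ratio_form <- log(price[-1] / price[-length(price)])
same_values <- isTRUE(all.equal(log_return, ratio_form))
```

The differenced series has one fewer observation than the original, which matches the indexing \(z_{2:n}\) used above.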
Figure 8 and Figure 9 suggest the transformed data look appropriate for a stationary model. Both transformed datasets have constant means. The data is spread out around 0 (Figure 8). The ACF plots show that the transformed data, overall, are not significantly correlated. Most of the autocorrelation values are within the confidence interval, except for lag 8 in the transformed Bitcoin data. The ACF plots also suggest that both transformed datasets could possibly be modeled as white noise processes (Figure 9).
From Figure 8, we observe that the transformed NASDAQ and Bitcoin series show signs of volatility clustering: periods where large changes tend to be followed by other large changes, and small changes by other small changes [14]. For NASDAQ, this is visible particularly around 2020, where there are clusters of larger price swings.
For Bitcoin, there is notably high volatility in the early period around 2015. A key characteristic of financial time series that call for GARCH modeling is that, while the returns themselves may be uncorrelated, their squares often show significant correlation, indicating volatility persistence. Some sources suggest that GARCH is better suited to daily than to monthly data, since volatility patterns are usually more pronounced over shorter intervals; we can formally test whether it is useful in our case using Engle’s ARCH LM test [15].
In Engle’s ARCH LM test, a low p-value (typically below 0.05) indicates the presence of ARCH effects (autoregressive conditional heteroscedasticity): we reject the null hypothesis of no ARCH effects and conclude that the variance of the errors is not constant over time. A high p-value suggests no evidence of ARCH effects.
Engle’s ARCH test examines the following hypotheses [16]:
\[ H_0: \text{There are no ARCH effects (homoscedasticity)} \]
\[ H_1: \text{ARCH effects are present (heteroscedasticity)} \]
The test is based on the auxiliary regression:
\[ r_t^2 = \alpha_0 + \alpha_1 r_{t-1}^2 + \alpha_2 r_{t-2}^2 + \cdots + \alpha_{12} r_{t-12}^2 + \varepsilon_t. \]
In our auxiliary regression, we regress the squared returns on 12 lagged values to capture one full year of observations and any potential seasonal patterns in volatility. The \(R^2\) obtained measures the proportion of the variation in current squared returns explained by these 12 lags. Multiplying \(R^2\) by the sample size \(n\) gives the LM test statistic, \[ LM = nR^2. \]
This quantifies the overall explanatory power of the lagged squared returns. Under the null hypothesis that all 12 lag coefficients are zero (indicating no ARCH effects), this statistic approximately follows a \(\chi^2\) distribution with 12 degrees of freedom, one for each lag coefficient tested.
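The auxiliary regression and LM statistic above can be written out by hand in base R. The sketch below is an illustration under stated assumptions rather than the code used for the report: `arch_lm_test` is a hypothetical helper, the input is simulated i.i.d. noise, and \(n\) is taken to be the number of rows actually used in the auxiliary regression.

```r
arch_lm_test <- function(returns, lags = 12) {
  squared <- returns^2
  # embed() builds rows of (r_t^2, r_{t-1}^2, ..., r_{t-lags}^2).
  lagged <- embed(squared, lags + 1)
  fit <- lm(lagged[, 1] ~ lagged[, -1])
  r_squared <- summary(fit)$r.squared
  n_used <- nrow(lagged)            # observations in the auxiliary regression
  statistic <- n_used * r_squared   # LM = n * R^2
  p_value <- pchisq(statistic, df = lags, lower.tail = FALSE)
  list(statistic = statistic, df = lags, p_value = p_value)
}

set.seed(531)
simulated_returns <- rnorm(157)     # placeholder series, not the report's data
result <- arch_lm_test(simulated_returns)
```

Packaged implementations may differ in details such as demeaning the series or the exact sample size used, so small numerical differences from a by-hand version are to be expected.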
##
## ARCH Test Results:
##
## NASDAQ p-value: 0.05843282
##
## Bitcoin p-value: 0.3204877
The p-values obtained for Bitcoin (0.3205) and NASDAQ (0.0584) in Engle’s ARCH test both exceed 0.05, so we fail to reject the null hypothesis at the 5% level, although the NASDAQ result is borderline. We conclude that there is no significant evidence of ARCH effects in either series.
We proceed by fitting an ARMA model to the differenced data while investigating the relationship between Bitcoin prices and the NASDAQ index. An ARMA(p, q) model for the differenced series \(z_{2:N}\) is referred to as an integrated autoregressive moving average (ARIMA) model for the original series \(y_{1:N}\), denoted ARIMA(p, 1, q). Formally, the ARIMA(p, d, q) model with intercept \(\mu\) is \(\phi(\mathrm{B})[(1 - \mathrm{B})^dY_n-\mu] = \psi(\mathrm{B})\epsilon_n\), where \(\{\epsilon_n\}\) is a white noise process, \(\phi(x)\) and \(\psi(x)\) are the ARMA polynomials, \(\mathrm{B}\) is the backshift operator, and d is the order of differencing [3]. Since we computed the differences between consecutive observations manually, the d term in ARIMA(p, d, q) is set to 0 when fitting, reducing the model to an ARMA(p, q) formulation for the differenced series \(z_{2:N}\). We choose p and q by selecting the model with the lowest Akaike information criterion (AIC), given by \(\text{AIC} = -2 \cdot l(\widehat{\theta}) + 2D\), where \(l(\widehat{\theta})\) is the maximized log likelihood and \(D\) is the number of estimated parameters [4].
| | MA0 | MA1 | MA2 | MA3 | MA4 | MA5 |
|---|---|---|---|---|---|---|
| AR0 | 15.16 | 15.41 | 15.86 | 17.72 | 19.06 | 20.41 |
| AR1 | 15.12 | 16.85 | 17.80 | 19.31 | 20.88 | 18.51 |
| AR2 | 16.40 | 15.83 | 10.16 | 12.02 | 14.02 | 15.44 |
| AR3 | 17.40 | 18.90 | 12.02 | 12.68 | 13.63 | 14.98 |
| AR4 | 18.44 | 20.41 | 13.90 | 13.78 | 21.11 | 15.67 |
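A table like this can be screened programmatically. The sketch below re-enters the AIC values from the table and looks for two things: the cell with the smallest AIC, and any step that adds a single AR or MA term yet raises the AIC by more than 2. Such a rise should not occur under exact maximization, because adding a parameter to a nested model cannot lower the maximized log likelihood. The variable names and the small numerical tolerance are illustrative choices.

```r
aic <- matrix(
  c(15.16, 15.41, 15.86, 17.72, 19.06, 20.41,
    15.12, 16.85, 17.80, 19.31, 20.88, 18.51,
    16.40, 15.83, 10.16, 12.02, 14.02, 15.44,
    17.40, 18.90, 12.02, 12.68, 13.63, 14.98,
    18.44, 20.41, 13.90, 13.78, 21.11, 15.67),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("AR", 0:4), paste0("MA", 0:5))
)

# Location of the smallest AIC.
best <- which(aic == min(aic), arr.ind = TRUE)
best_model <- c(rownames(aic)[best[1, "row"]], colnames(aic)[best[1, "col"]])

# Change in AIC when one MA term (moving right) or one AR term (moving down)
# is added; the tolerance guards against floating-point ties at exactly 2.
rise_ma <- aic[, -1] - aic[, -ncol(aic)]
rise_ar <- aic[-1, ] - aic[-nrow(aic), ]
tolerance <- 1e-8
n_violations <- sum(rise_ma > 2 + tolerance) + sum(rise_ar > 2 + tolerance)
```

With the values above, the minimum sits at AR2/MA2, and three single-parameter steps raise the AIC by more than 2 (AR2/MA1 to AR3/MA1, AR4/MA3 to AR4/MA4, and AR3/MA4 to AR4/MA4).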
Here, the ARIMA(2,0,2) model yields the lowest AIC value of 10.16. However, some adjacent entries in the table increase by more than 2 when a single parameter is added (for example, from AR4/MA3 at 13.78 to AR4/MA4 at 21.11). Because adding a parameter to a nested model cannot lower the maximized log likelihood, the AIC should rise by at most 2 in such a step, so these jumps suggest numerical instability during the optimization. To troubleshoot, we validate our results using an alternative implementation: the auto.arima function from the forecast package in R. The auto.arima function automates model selection by searching over combinations of p and q (stepwise by default) and selecting the model that minimizes an information criterion; its default criterion is AICc, with AIC and BIC available as options [8].
## Series: bitcoin$log_first_order_difference_Bitcoin_Price
## Regression with ARIMA(0,0,0) errors
##
## Coefficients:
## intercept xreg
## 0.0442 1.4667
## s.e. 0.0205 0.4152
##
## sigma^2 = 0.06287: log likelihood = -4.58
## AIC=15.16 AICc=15.31 BIC=24.32
The auto.arima function suggests the ARIMA(0,0,0) model. We first fit the ARIMA(2,0,2) model to the data and then compare it to the ARIMA(0,0,0) model.
The ARIMA(2,0,2) model has AR roots that are numerically similar to its MA roots, indicating the potential presence of common factors that could be canceled out (Figure 10). Additionally, all the roots of the ARIMA(2,0,2) model lie close to the boundary of the unit circle, suggesting that the model is near the threshold of being non-causal and non-invertible. This proximity to instability implies that the ARIMA(2,0,2) analysis may not be highly reliable or robust.
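The root check described above can be sketched with `polyroot` from base R. The helper `arma_root_moduli` and the coefficients below are hypothetical, chosen only to mimic nearly canceling roots close to the unit circle; they are not the fitted values from the report.

```r
# Moduli of the roots of the AR polynomial 1 - phi_1 x - ... - phi_p x^p
# and the MA polynomial 1 + psi_1 x + ... + psi_q x^q.
arma_root_moduli <- function(ar, ma) {
  list(ar = Mod(polyroot(c(1, -ar))), ma = Mod(polyroot(c(1, ma))))
}

# Hypothetical coefficients: AR polynomial 1 + 0.30 x + 0.95 x^2,
# MA polynomial 1 + 0.32 x + 0.97 x^2.
hypothetical <- arma_root_moduli(ar = c(-0.30, -0.95), ma = c(0.32, 0.97))

# Roots with modulus barely above 1 sit close to the unit circle.
near_unit_circle <- all(unlist(hypothetical) < 1.05)
```

In this illustration both pairs of complex roots have modulus just above 1 and nearly coincide, the kind of pattern that would allow the AR and MA factors to approximately cancel.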
To formally compare the two models, we employ a hypothesis test using Wilks’ approximation. In this test, the null hypothesis corresponds to the ARIMA(0,0,0) model (a white noise model), while the alternative hypothesis corresponds to the ARIMA(2,0,2) model. Wilks’ approximation is given by:
\[\Delta = 2(l_1 - l_0) \approx \chi_{D_1 - D_0}^2\]
where \(l_i\) is the maximum log likelihood under hypothesis \(H_i\) and \(D_i\) is the number of parameters estimated under hypothesis \(H_i\). When comparing the ARIMA(0,0,0) and ARIMA(2,0,2) models, the log likelihood increases by \(l_1 - l_0 = 6.5\), so \(\Delta = 13.0\). This value exceeds the critical value of 9.49 for a \(\chi^2\) distribution with 4 degrees of freedom at the 5% significance level, so taken at face value the test favors the larger model. However, the approximately canceling roots of the ARIMA(2,0,2) model and the numerical inconsistencies in the AIC table indicate that this likelihood gain is not reliable evidence of additional structure. Given these results, we proceed with the white noise model for our analysis [5].
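The arithmetic behind this comparison can be cross-checked from numbers already reported: the AIC values of 15.16 and 10.16 in the table and the log likelihood of -4.58 in the auto.arima output. The sketch below assumes that the parameter count \(D\) includes the intercept, the NASDAQ coefficient, \(\sigma^2\), and the ARMA terms (3 and 7 parameters respectively); `wilks_test` and `loglik_from_aic` are hypothetical helper names.

```r
wilks_test <- function(loglik_null, loglik_alt, df, level = 0.05) {
  statistic <- 2 * (loglik_alt - loglik_null)
  list(
    statistic = statistic,
    critical_value = qchisq(1 - level, df = df),
    p_value = pchisq(statistic, df = df, lower.tail = FALSE)
  )
}

# Invert AIC = -2 * loglik + 2 * D to recover the log likelihood.
loglik_from_aic <- function(aic, n_params) (2 * n_params - aic) / 2

l0 <- loglik_from_aic(15.16, n_params = 3)   # ARIMA(0,0,0): gives -4.58
l1 <- loglik_from_aic(10.16, n_params = 7)   # ARIMA(2,0,2): assumed count
result <- wilks_test(l0, l1, df = 4)
```

Under that assumption the log-likelihood gain is 6.5, so the statistic \(2(l_1 - l_0)\) comes to about 13.0, against a critical value of about 9.49.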
An interesting observation is that when the difference series is white noise, the model for the original series can be written as
\[y_t - y_{t-1} = \epsilon_t\]
which is equivalent to a random walk model: \(y_t = y_{t-1} + \epsilon_t\). “Random walk models are often used to model financial and economic data, The forecasts from a random walk model are equal to the last observation, as future movements are unpredictable, and are equally likely to be up or down.” [6].
We examined the cross-correlation between Bitcoin log returns and NASDAQ index log returns. The strong positive cross-correlation at lag zero supports the association between the two series. In addition, we performed a likelihood ratio test to determine if the NASDAQ index is associated with Bitcoin prices. The null hypothesis corresponds to the ARIMA(0,0,0) model without the NASDAQ index, while the alternative includes the NASDAQ index. Again, we employ a hypothesis test using Wilks’ approximation, given by \(2(l_1 - l_0) \approx \chi_1^2\).
## Likelihood ratio test p-value: 0.0005290291
The p-value of approximately 0.0005 indicates a statistically significant association between NASDAQ and Bitcoin log returns. This finding could offer valuable insights for investors looking to diversify their portfolios across traditional and emerging markets. However, it is important to interpret this association with caution, as correlation does not necessarily imply causation.
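The cross-correlation and likelihood ratio steps can be sketched in base R as follows. The two return series are simulated with an arbitrary built-in dependence (an assumption for illustration only), so the numbers produced are unrelated to the reported p-value.

```r
set.seed(531)
n <- 157
nasdaq_return <- rnorm(n, sd = 0.05)                      # simulated regressor
bitcoin_return <- 0.04 + 1.5 * nasdaq_return + rnorm(n, sd = 0.25)

# Sample cross-correlation at lag zero.
cross_corr <- ccf(bitcoin_return, nasdaq_return, lag.max = 12, plot = FALSE)
lag0_corr <- cross_corr$acf[cross_corr$lag == 0]

# White noise errors without and with the NASDAQ regressor.
fit_null <- arima(bitcoin_return, order = c(0, 0, 0))
fit_alt <- arima(bitcoin_return, order = c(0, 0, 0), xreg = nasdaq_return)

# Wilks' approximation with one extra parameter under the alternative.
lr_statistic <- 2 * (fit_alt$loglik - fit_null$loglik)
p_value <- pchisq(lr_statistic, df = 1, lower.tail = FALSE)
```

The lag-zero value returned by `ccf` coincides with the ordinary sample correlation of the two series, which is why the two diagnostics tend to agree.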
We inspect the residuals of the ARIMA(0,0,0) model, and look at their sample autocorrelation to ensure that they are white noise. The residuals should be uncorrelated, have zero mean, and constant variance.
The residuals’ time series plot does not show any unusual patterns, and the residuals appear to be centered around zero (Figure 12).
Examining the autocorrelation plot of the residuals, although there is one lag (lag 8) that is outside the confidence interval, the residuals are mostly within the confidence interval, suggesting that the residuals are uncorrelated (Figure 13).
Finally, we check the normality of the residuals using a QQ plot. With the exception of the last two points deviating from the qqline, the residuals appear to be normally distributed, as most of the points fall along the QQ line (Figure 14).
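These residual checks can be reproduced numerically as well as graphically. The sketch below uses simulated residuals as a placeholder (not the fitted model's residuals) and counts how many sample autocorrelations fall outside the approximate 95% band under the white noise hypothesis.

```r
set.seed(531)
residuals_sim <- rnorm(157)          # placeholder for model residuals
n <- length(residuals_sim)

# Sample autocorrelations at lags 1 to 20 (lag 0 dropped) and the
# approximate 95% band under the white noise hypothesis.
resid_acf <- acf(residuals_sim, lag.max = 20, plot = FALSE)$acf[-1]
band <- qnorm(0.975) / sqrt(n)
lags_outside <- which(abs(resid_acf) > band)

# Normal QQ coordinates (plot.it = FALSE returns them without drawing).
qq <- qqnorm(residuals_sim, plot.it = FALSE)
```

Because the band is a pointwise 95% interval, roughly one lag in twenty is expected to fall outside it even for true white noise, so a single excursion is weak evidence of remaining autocorrelation.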
Using monthly data from January 2012 to February 2025, our analysis finds that the log prices of Bitcoin and the NASDAQ Composite exhibit near-random-walk behavior: their first differences (log returns) are well described by white noise. LOESS decomposition reveals distinct cyclical patterns: Bitcoin shows pronounced boom-bust cycles, while the NASDAQ displays a steadier trend. Although a strong correlation at lag zero indicates some synchronous market sentiment, the significant volatility gap between the two suggests that Bitcoin carries unique risk factors not present in traditional equity indices.
We explored models beyond ARIMA and, based on literature review, found that the GARCH(1,1) model is preferred by economists for its discrete-time framework and tractable likelihood function; however, a formal hypothesis test provided no statistical evidence to justify its application to our data [11].
Our study introduces several methodological enhancements that improve upon previous work. For example, the “Time Series Midterm Project Report on Bitcoin’s Price Behavior” [12] relied on automated ARIMA selection without performing detailed diagnostic checks. In our analysis, we supplement automated model selection with formal likelihood ratio tests and an inspection of the AR and MA roots. This additional diagnostic rigor allowed us to detect issues such as nearly canceling AR and MA roots, which informed our decision to favor a simpler ARIMA(0,0,0) specification for the differenced series.
Similarly, compared to the “Nasdaq and Gold Price”[13] project which primarily employed basic trend elimination and standard decomposition techniques, we refined the data preprocessing step. Our use of LOESS smoothing enhances the separation of trend, noise, and cyclical components, thereby offering a more precise understanding of underlying market movements. This tailored approach provides clearer input for subsequent time series modeling.
Future work might incorporate macroeconomic or sentiment data (e.g., interest rates, social media metrics) to see whether these exogenous factors explain periods of tight coupling or decoupling. It could also adopt nonlinear methods to handle potential heavy tails, volatility clustering, and abrupt shifts in either market, and expand to high-frequency data to capture intraday interactions, which may be masked by monthly aggregation but are crucial for algorithmic or short-term traders. Ultimately, these extensions could help explain how “old” and “new” markets sometimes converge or sharply diverge.
Blinded.
[1] https://www.reuters.com/markets/us/stunning-rally-big-tech-drives-nasdaq-20000-2024-12-11/
[2] https://en.wikipedia.org/wiki/2022_stock_market_decline
[3] Ionides, E. Lecture Notes for University of Michigan, STATS 531 Winter 2025. Modelling and Analysis of Time Series Data. Chapter 6, Slide 11.
[4] Ionides, E. Lecture Notes for University of Michigan, STATS 531 Winter 2025. Modelling and Analysis of Time Series Data. Chapter 5, Slide 21.
[5] Ionides, E. Lecture Notes for University of Michigan, STATS 531 Winter 2025. Modelling and Analysis of Time Series Data. Homework 3 Solutions
[6] https://otexts.com/fpp2/stationarity.html
[7] Huang, Wanqi & Li, Yizhuo & Zhao, Yuhang & Zheng, Lanfeng. (2022). Time Series Analysis and Prediction on Bitcoin. BCP Business & Management. 34. 1223-1234. 10.54691/bcpbm.v34i.3163.
[8] https://www.rdocumentation.org/packages/forecast/versions/8.23.0/topics/auto.arima
[9] UM ChatGPT was used to polish sentences and correct grammar.
[10] https://calebandbrown.com/blog/bitcoins-market-cycle/
[11] https://math.berkeley.edu/~btw/thesis4.pdf
[12] https://ionides.github.io/531w24/midterm_project/project10/blinded.html
[13] https://ionides.github.io/531w22/midterm_project/project09/blinded.html
[14] Wikipedia - Volatility Clustering https://en.wikipedia.org/wiki/Volatility_clustering
[15] Mathworks - Engle’s ARCH Test https://www.mathworks.com/help/econ/engles-arch-test.html
[16] Tsay, R. S. (2010). Analysis of Financial Time Series (3rd ed.). Wiley