1 Introduction

This analysis investigates the behaviour of the Non-Fungible Token (NFT)\(^{[1]}\) market and compares it with the well-studied stock market. While the stock market has been around for a long time, NFTs are only just gaining popularity, which makes the comparison non-trivial. As a proxy for the NFT market, this study uses the daily opening price of MANA\(^{[2]}\), the largest cryptocurrency by market cap\(^{[3]}\) that is used to buy and sell NFTs in a metaverse\(^{[4]}\) called Decentraland\(^{[5]}\).

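The primary question being addressed here is whether the MANA price, as a proxy for the NFT market, follows a random walk, as the Efficient Market Hypothesis\(^{[6]}\) would suggest.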

2 Analysis

2.1 Visualization

While the Decentraland platform and the MANA cryptocurrency have been around since late 2017, they have only gained popularity in recent years. We therefore analyze data from 2021 onwards.

The first striking observation is the sharp rise in late 2021. Further investigation revealed that on the date Facebook officially announced their rebranding to Meta\(^{[7]}\), there was an 82% increase in the MANA price owing to the sentiment of wanting to get in on the metaverse and NFT market.

This is a one-off event, and we don’t expect it to occur often. Even if it did, we do not have the means to successfully model it. For the purpose of this analysis, we treat the spike as an artificial breakpoint and aim to model the data on either side of it.
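The split around the breakpoint could be set up along the following lines. This is only a minimal sketch: the data frame `mana`, its columns `Date` and `Open`, the toy prices and the exact cut-off date are illustrative assumptions, not the values used in this analysis. Only the name `y_after` mirrors the object used in the model fits later on.

```r
# Hypothetical daily opening prices (toy values, not real MANA data).
mana <- data.frame(
  Date = seq(as.Date("2021-10-25"), by = "day", length.out = 8),
  Open = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Assumed breakpoint: the date of the Meta rebranding announcement.
breakpoint <- as.Date("2021-10-28")

# Keep the observations on either side of the breakpoint separately.
y_before <- mana[mana$Date <  breakpoint, ]
y_after  <- mana[mana$Date >= breakpoint, ]
```

Whether the breakpoint day itself belongs to the earlier or the later segment is a modelling choice; here it is placed in the later one.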

Before proceeding with any further analysis, we perform a log-transformation on the data. In the context of finance, the difference of log-transformed price values is referred to as the return\(^{[8]}\) of a stock or an index. It is convenient for us to make this transformation, as that enables us to look for a random walk model fit, which would provide evidence to suggest that the MANA price movement follows the Efficient Market Hypothesis.
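To make the definition of the return concrete, here is a small sketch with toy prices (the values are illustrative, not MANA data) that grow by 10% at each step:

```r
# Toy prices that grow by 10% per step (illustrative values only).
price <- c(100, 110, 121)

log_price  <- log(price)        # log-transformed series
log_return <- diff(log_price)   # first difference of the log series

# Each log return equals log(1.10), about 0.0953,
# which is close to the 10% simple return.
print(log_return)
```

For small price changes the log return and the simple percentage return are nearly equal, which is one reason the log scale is convenient.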

We re-plot the log-transformed data for clarity:

There is visual evidence of a trend in both parts, so we look at ways to model and remove the trend for further analysis.

2.3 Model Selection

We attempt to approach the model selection without any prior biases, so we fit a series of ARIMA models for a range of \(p\) and \(q\) values. We set the differencing order to 1 (the "I" in ARIMA), which corresponds to first-order differencing. The general equation is of the form\(^{[9]}\):

\[ \Delta y_n = \frac{\Psi(B)}{\Phi(B)}\epsilon_n \]

where \(\Phi(x)\) is a polynomial of order p, \(\Psi(x)\) is a polynomial of order q, and \(B\) is the backshift operator.
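As a special case, setting \(p = q = 0\) makes both polynomials equal to 1, so the equation reduces to \(\Delta y_n = \epsilon_n\), which is a random walk. The sketch below illustrates this with simulated Gaussian noise; the seed, the sample size and the variable names are arbitrary choices for illustration.

```r
set.seed(1)            # arbitrary seed for reproducibility
eps <- rnorm(500)      # white noise epsilon_n
y   <- cumsum(eps)     # random walk: y_n = y_{n-1} + epsilon_n

# First-order differencing recovers the noise (apart from the first value).
dy <- diff(y)

# Fitting ARIMA(0,1,0) estimates only the noise variance.
fit <- arima(y, order = c(0, 1, 0))
print(fit$sigma2)
```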

2.3.1 AIC

We look at all the models along with their AIC scores. A lower AIC score indicates a better model fit, and is given by the equation\(^{[10]}\):

\[ AIC = -2 \ell(\theta^{*}) + 2D \]

where \(\ell(\theta^{*})\) is the log-likelihood and \(D\) is the number of parameters.

         MA0      MA1      MA2      MA3      MA4
AR0  -233.46  -231.46  -231.60  -229.87  -230.98
AR1  -231.46  -230.00  -229.69  -230.81  -230.97
AR2  -232.06  -230.15  -228.34  -226.39  -232.60
AR3  -230.23  -229.56  -233.57  -232.60  -230.60
AR4  -228.89  -228.29  -226.61  -230.75  -235.74
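A table of this shape could be produced with a helper like the one sketched here. The function name `aic_table`, the simulated series `y_sim` and the seed are assumptions made for illustration; the sketch is run on a simulated random walk rather than on the MANA data, and uses a smaller grid to keep it quick.

```r
# Build a (P+1) x (Q+1) table of AIC values for ARIMA(p, 1, q) fits.
aic_table <- function(data, P, Q) {
  table <- matrix(NA_real_, P + 1, Q + 1)
  for (p in 0:P) {
    for (q in 0:Q) {
      # Some orders may fail to fit; leave those cells as NA.
      table[p + 1, q + 1] <- tryCatch(
        arima(data, order = c(p, 1, q))$aic,
        error = function(e) NA_real_
      )
    }
  }
  dimnames(table) <- list(paste0("AR", 0:P), paste0("MA", 0:Q))
  table
}

set.seed(531)                            # arbitrary seed
y_sim <- cumsum(rnorm(200, sd = 0.1))    # simulated stand-in for a log-price series
tab <- aic_table(y_sim, P = 2, Q = 2)
print(round(tab, 2))
```

Note that R's `arima` counts the noise variance as a parameter, so for ARIMA(0,1,0) the reported AIC equals \(-2\ell + 2\).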

We notice that ARIMA(4,1,4) provides the lowest AIC score. However, we only use AIC as a guideline and not as a concrete means of model selection. We also consider the ARIMA(0,1,0) and ARIMA(3,1,2) models, as their AIC values are only slightly larger. Considering ARIMA(0,1,0) also helps us answer our question of whether the data follow a random walk, since a random walk is exactly the ARIMA(0,1,0) model. ARIMA(3,1,2) has a slightly higher AIC but is simpler than ARIMA(4,1,4) in terms of the number of parameters.

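# Fit the three candidate models to the opening prices after the breakpoint
arima414 <- arima(y_after$Open, order = c(4, 1, 4))  # lowest AIC
arima312 <- arima(y_after$Open, order = c(3, 1, 2))  # simpler alternative
arima010 <- arima(y_after$Open, order = c(0, 1, 0))  # random walk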

We fit the three models and perform some tests to see if one is objectively better than the others.

2.3.2 Likelihood Ratio Tests

We compare nested models using a Likelihood Ratio Test, given by the Wilks’ approximation\(^{[11]}\):

\[ \ell_1 - \ell_0 \sim (1/2)\chi^2_{D_1 - D_0} \]

where \(\ell_1\) and \(D_1\) correspond to the log-likelihood and number of parameters of the larger model, and the subscript 0 refers to the smaller, nested model.

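\(H_0:\) The extra AR and MA coefficients of ARIMA(4,1,4) are all zero, so the model reduces to ARIMA(0,1,0)

\(H_a:\) At least one of the extra coefficients of ARIMA(4,1,4) is non-zero, so the larger model is objectively better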

The LRT gives a test statistic of \(2(\ell_1 - \ell_0) = 18.2726243\), which corresponds to a p-value of 0.0192729 under the \(\chi^2\) distribution with 8 degrees of freedom.

The p-value is \(< 0.05\) so we reject the null hypothesis, and consider the alternative that the larger model is indeed better.
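A test of this kind could be wrapped in a small helper, sketched here on simulated data. The function name `lrt`, the simulated series and the ARIMA(1,1,0) alternative are assumptions chosen to keep the example small; they are not the models compared in this analysis.

```r
# Likelihood ratio test for two nested models fitted with arima().
# fit0 is the smaller (nested) model, fit1 the larger one.
lrt <- function(fit0, fit1) {
  stat <- 2 * (fit1$loglik - fit0$loglik)           # twice the log-likelihood gain
  df   <- length(coef(fit1)) - length(coef(fit0))   # number of extra coefficients
  p    <- pchisq(stat, df = df, lower.tail = FALSE) # Wilks' chi-squared approximation
  c(statistic = stat, df = df, p.value = p)
}

set.seed(531)                 # arbitrary seed
y_sim <- cumsum(rnorm(300))   # simulated random walk
fit0 <- arima(y_sim, order = c(0, 1, 0), method = "ML")
fit1 <- arima(y_sim, order = c(1, 1, 0), method = "ML")
res <- lrt(fit0, fit1)
print(res)
```

Because the simulated series is a true random walk, the larger model would usually not be favoured here, although any single simulated sample can behave differently.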

Next we compare the ARIMA(4,1,4) and ARIMA(3,1,2) models, and perform the same Likelihood Ratio Test.

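\(H_0:\) The extra AR and MA coefficients of ARIMA(4,1,4) are all zero, so the model reduces to ARIMA(3,1,2)

\(H_a:\) At least one of the extra coefficients of ARIMA(4,1,4) is non-zero, so the larger model is objectively better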

The LRT gives a test statistic of \(2(\ell_1 - \ell_0) = 8.1700722\), which corresponds to a p-value of 0.0426245 under the \(\chi^2\) distribution with 3 degrees of freedom.

In this case as well, the p-value is \(< 0.05\) so we reject the null hypothesis, and consider the alternative that the ARIMA(4,1,4) model is indeed better.
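The reported p-values can be reproduced from the test statistics alone, and the statistics themselves can be cross-checked against the AIC table, since \(2(\ell_1 - \ell_0) = AIC_0 - AIC_1 + 2(D_1 - D_0)\) follows from the AIC definition. The sketch below does both; the variable and function names are ours, and the AIC values are the rounded entries from the table, so the agreement is only to about two decimal places.

```r
# Reported statistics, i.e. 2 * (l1 - l0), with their degrees of freedom.
stat_414_vs_010 <- 18.2726243
stat_414_vs_312 <- 8.1700722

# Upper-tail chi-squared probabilities.
p1 <- pchisq(stat_414_vs_010, df = 8, lower.tail = FALSE)  # about 0.0193
p2 <- pchisq(stat_414_vs_312, df = 3, lower.tail = FALSE)  # about 0.0426

# Cross-check: recover the statistic from two AIC values and the
# difference in the number of parameters.
lrt_from_aic <- function(aic0, aic1, df) aic0 - aic1 + 2 * df
s1 <- lrt_from_aic(-233.46, -235.74, 8)   # about 18.28
s2 <- lrt_from_aic(-233.57, -235.74, 3)   # about 8.17
```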

We perform further diagnostic tests on these models.

2.4 Model Diagnostics

As the first step, we compute the AR and MA roots of the models and plot them with a unit circle for reference. This is a check for causality and invertibility of the models, which are properties we desire in a model. For these properties, all roots must be outside the unit circle.
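The root check can be sketched with base R's `polyroot`, which takes polynomial coefficients in increasing order of power. The coefficients below are hypothetical, chosen only so that one root falls outside the unit circle and one falls exactly on it; they are not estimates from our models.

```r
# Hypothetical coefficients, chosen only to illustrate the check.
ar_coef <- c(0.5)   # phi_1
ma_coef <- c(1.0)   # psi_1

# AR polynomial: 1 - phi_1 x - ...   MA polynomial: 1 + psi_1 x + ...
ar_roots <- polyroot(c(1, -ar_coef))
ma_roots <- polyroot(c(1,  ma_coef))

Mod(ar_roots)   # 2: outside the unit circle, so causal
Mod(ma_roots)   # 1: on the unit circle, so not invertible
```

For a model fitted with `arima`, the same check would use the entries of `coef(fit)` whose names start with `ar` and `ma` in place of the hypothetical values.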

The ARIMA(4,1,4) model has its MA roots (blue) very close to (and on) the unit circle, while the AR roots (red) are safely outside the unit circle. This implies the model is causal but not invertible.

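For the ARIMA(3,1,2) model, all of the roots except one AR root lie on the unit circle. This model is also problematic: with AR roots on the unit circle it is not causal and not stationary, and with MA roots on the unit circle it is not invertible. The remaining AR root is large and falls outside the range of the plot, so it is not depicted here.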

A non-invertible or non-causal model is not ideal, so we are not keen to accept them. We also refrain from further diagnostic tests for these models for this same reason. Further discussions follow in the next section.

3 Conclusions

While the likelihood ratio test indicates that the larger model, ARIMA(4,1,4), is a better fit than either ARIMA(0,1,0) or ARIMA(3,1,2), our diagnostics revealed that the larger model is non-invertible. We prefer causal and invertible models, so we may be inclined to accept the ARIMA(0,1,0) model instead, even though it has a higher AIC and is not favoured by the likelihood ratio test.

Since the random walk model also agrees with economic theory, we could have enough support to say that the NFT market adheres to the Efficient Market Hypothesis. However, the answer is not clear-cut. We conclude that there is evidence to build an argument either way, and further work is required to present more concrete results.

4 References

[1]: https://en.wikipedia.org/wiki/Non-fungible_token
[2]: https://www.coinbase.com/price/decentraland
[3]: https://coinmarketcap.com/view/collectibles-nfts/
[4]: https://en.wikipedia.org/wiki/Metaverse
[5]: https://decentraland.org/whitepaper.pdf
[6]: https://en.wikipedia.org/wiki/Efficient-market_hypothesis
[7]: https://about.fb.com/news/2021/10/facebook-company-is-now-meta/
[8]: https://ionides.github.io/531w22/01/slides-annotated.pdf (Slide 23)
[9]: https://ionides.github.io/531w22/04/slides-annotated.pdf (Slide 16)
[10]: https://ionides.github.io/531w22/05/slides-annotated-part1.pdf (Slide 21)
[11]: https://ionides.github.io/531w22/05/slides-annotated-part1.pdf (Slide 19)