Introduction

A non-fungible token (NFT) is a non-interchangeable unit of data stored on a blockchain that can be sold and traded. NFTs may be associated with digital files such as photos, videos, and audio, or with skins in video games like Counter-Strike: Global Offensive. Because each NFT has a unique identifier, it differs from cryptocurrencies such as bitcoin, whose units are interchangeable and exist only as a chain of transactions. To analyze the recent social interest in NFTs, we collected data from NonFungible.com, one of the largest databases of blockchain gaming and crypto-collectible market activity. The sales data, described on the website as “Total usd spent on completed sales”, range from June 22nd 2017 to February 20th 2022 and are measured in USD. Additionally, we collected Google Trends data described as representing average weekly searches on the topic of NFTs, which notably is not limited to the query “nft”. We examine both datasets together to establish a relationship between the sales data and the search-trend data. Unlike stocks, NFTs are given value by the people who buy and sell them. With our analysis we wish to explore how public interest and the total money spent on NFTs are related.

Data Preprocessing

Google Trends reports a weekly average, but we noticed that most of the values in the early part of the series, starting in 2017, are zero. These values are unhelpful for analysis. We take them to mean that NFTs had not yet attracted public attention at that time, so this period is not relevant to our question. We therefore restrict the analysis to September 27th 2020 through February 20th 2022. The NFT sales data are daily, which gives 512 observations stretching over this period of usable Google Trends data.

Exploratory Data Analysis

Our final data consist of 511 observations of the daily USD amount of NFT sales (analyzed in millions of USD) from September 27, 2020 to February 16, 2022. The mean daily sale is $35,292,401 and the standard deviation is $47,384,351, which is notably higher than the mean. Beyond the general trend in NFT sales, we also want to link sales to the Google search trend for ‘NFT’, so we attached the weekly Google Trends data as well.

One problem, however, is that Google Trends reports only weekly values, which are coarser than our daily sales data. We therefore smoothed the trend series using a generalized additive model (GAM) with a natural spline, setting the number of knots as large as possible (30) to preserve the underlying trend.
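The weekly-to-daily smoothing step can be sketched as follows. This is a minimal illustration rather than the code behind the figures: the weekly series is simulated, the names `weekly`, `daily`, and `fit` are hypothetical, and the smoother is an ordinary least-squares fit on a natural cubic spline basis from the `splines` package (which is what a Gaussian additive model with a single natural-spline term amounts to), using 30 interior knots.

```r
library(splines)  # ships with base R

# Simulated weekly search index (placeholder for the Google Trends series)
set.seed(1)
weekly <- data.frame(day = seq(0, 511, by = 7))
weekly$trend <- 50 + 30 * sin(weekly$day / 80) + rnorm(nrow(weekly), sd = 3)

# Natural cubic spline with 30 interior knots (df = 31 when no intercept column is used)
fit <- lm(trend ~ ns(day, df = 31), data = weekly)

# Evaluate the smooth on a daily grid so it lines up with the daily sales
daily <- data.frame(day = 0:511)
daily$trend_smooth <- predict(fit, newdata = daily)

head(daily)
```

With roughly 74 weekly points, 30 knots leave little room for further flexibility, so the smooth stays close to the weekly values while filling in the days between them.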

Figure 1: General Visualization for NFT sales and search trend


The first large sales boom occurs on May 3, 2021, the first spike seen above. This spike lags the rise in ‘NFT’ search interest in April. The most active trading period in the series runs from July 29 to September 7, 2021, with exceptionally high transaction volume, while search interest over the same period grew steadily but relatively slowly. Afterwards, NFT sales remain fairly steady with little growth, while the term appears more and more often in searches.

It may be difficult to model the sales data directly, since we cannot guarantee stationarity. In contrast, the differenced sales series looks more stationary. Its autocorrelation plot also shows an approximately white-noise pattern, so we decided to build our model on the sales differences.
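The effect of differencing on the autocorrelation can be illustrated with a small sketch. The series here is a simulated random walk standing in for the daily sales, and the variable names are illustrative only.

```r
# Simulated non-stationary series (random walk); placeholder for daily sales
set.seed(2)
sales <- cumsum(rnorm(512))

sales_diff <- diff(sales)  # first difference: x[t] - x[t-1]

acf_level <- acf(sales, plot = FALSE)
acf_diff  <- acf(sales_diff, plot = FALSE)

# Lag-1 autocorrelation: close to 1 for the level, near 0 for the difference
c(level = acf_level$acf[2], difference = acf_diff$acf[2])
```

A slowly decaying ACF for the level series is the usual sign that differencing is needed; an ACF that drops quickly after differencing suggests one difference is enough.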

Figure 2: Difference for NFT sales and search trend & ACF for sales difference


ARIMA fitting

We implemented a loop similar to the procedure introduced in lecture to fit ARIMA models with all combinations of AR and MA orders, from \(ARIMA(0,1,0)\) to \(ARIMA(5,1,5)\), using AIC as the model selection criterion. Our fitting results are the following:

         MA0      MA1      MA2      MA3      MA4      MA5
AR0  4452.30  4453.72  4426.60  4421.22  4422.42  4415.05
AR1  4453.96  4427.03  4423.07  4423.03  4422.19  4415.70
AR2  4435.41  4419.01  4406.77  4408.53  4414.75  4409.53
AR3  4417.54  4415.21  4408.21  4407.81  4409.74  4410.85
AR4  4411.04  4412.01  4411.50  4409.62  4411.52  4412.86
AR5  4411.07  4411.33  4411.46  4410.30  4412.26  4403.16
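A table of this kind can be produced with a short loop. The sketch below is an assumption about the general shape of such a procedure, not the exact code used for the table above: the function name `aic_table` and the simulated series `x` are illustrative, and the example uses a smaller grid so that it runs quickly.

```r
# Tabulate AIC for ARIMA(p, 1, q) with p = 0..P and q = 0..Q
aic_table <- function(x, P, Q) {
  tab <- matrix(NA_real_, nrow = P + 1, ncol = Q + 1,
                dimnames = list(paste0("AR", 0:P), paste0("MA", 0:Q)))
  for (p in 0:P) {
    for (q in 0:Q) {
      # Some orders may fail to converge; record NA instead of stopping
      tab[p + 1, q + 1] <- tryCatch(
        arima(x, order = c(p, 1, q))$aic,
        error = function(e) NA_real_
      )
    }
  }
  tab
}

# Simulated series standing in for daily sales in million USD
set.seed(3)
x <- cumsum(as.numeric(arima.sim(list(ar = 0.5), n = 300)))
tab <- aic_table(x, P = 2, Q = 2)
round(tab, 2)
which(tab == min(tab, na.rm = TRUE), arr.ind = TRUE)  # lowest-AIC cell
```

The lowest-AIC cell is only a starting point: large models can win narrowly, so it is worth comparing the minimum with the best of the simpler candidates.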

The \(ARIMA(2,1,2)\) model has the lowest AIC (4406.77) among the models with a relatively simple structure; only \(ARIMA(5,1,5)\) scores lower (4403.16), and it needs six more parameters. To take a closer look at the fitted model, we examine the roots of its characteristic polynomials, computed from the coefficient estimates printed below using R's convention \(\phi(z)=1-\phi_1z-\phi_2z^2\) and \(\theta(z)=1+\theta_1z+\theta_2z^2\): \[ \begin{aligned} Roots\ for\ AR&:\alpha_{1,2}=0.6868\pm0.9800i,\quad |\alpha_{1,2}|\approx 1.20 \\ Roots\ for\ MA&:\beta_{1,2}=0.9172\pm0.9429i,\quad |\beta_{1,2}|\approx 1.32 \end{aligned} \] All four roots lie outside the unit circle, so the fitted model is causal and invertible. The abnormal NFT sales boom in late August 2021 nevertheless brings considerable uncertainty to our model.

## 
## Call:
## arima(x = nfts$`Sales USD`/1e+06, order = c(2, 1, 2))
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.9592  -0.6983  -1.0601  0.5779
## s.e.  0.1223   0.0684   0.1476  0.1017
## 
## sigma^2 estimated as 324.6:  log likelihood = -2198.39,  aic = 4406.77
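The causality and invertibility conditions can be checked numerically from the printed coefficients. This is a minimal sketch, assuming R's `arima` parameterization, in which the AR polynomial is \(1-\phi_1z-\phi_2z^2\) and the MA polynomial is \(1+\theta_1z+\theta_2z^2\); the variable names are illustrative and the coefficients are the rounded values from the output above.

```r
# Rounded coefficient estimates from the ARIMA(2,1,2) fit
ar <- c(0.9592, -0.6983)
ma <- c(-1.0601, 0.5779)

# polyroot() takes coefficients in increasing order of power
ar_roots <- polyroot(c(1, -ar))  # roots of 1 - ar1*z - ar2*z^2
ma_roots <- polyroot(c(1, ma))   # roots of 1 + ma1*z + ma2*z^2

ar_roots
ma_roots

# A modulus greater than 1 means the root lies outside the unit circle
Mod(ar_roots)
Mod(ma_roots)
```

A model is causal when every AR root has modulus greater than 1 and invertible when every MA root does. Note the sign difference between the two polynomials; applying the AR sign convention to the MA coefficients gives different roots.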
Figure 3: ARIMA(2,1,2) Residuals and its ACF


The residual and ACF plots give relatively desirable results for our assumptions. The residuals are centered at 0 with few outliers, and there seems to be no significant correlation among them.

Frequency and Seasonality

Another intuitive guess is that the NFT market, like conventional markets, exhibits some periodicity. Perhaps people trade more on weekdays than on weekends, or most deals take place in the second half of each month. Hence, we examine the spectrum for evidence of periodicity.

Figure 4: Unsmoothed and span-smoothed spectrum


The maximum of the spectral density occurs at a frequency of 0.1680 cycles per day, corresponding to a period of \(1/0.1680=5.95\approx 6\) days. This suggests adding a seasonal term to our model. We tried \(SARIMA(2,1,2)\times(1,0,0)_6\) and compared its log likelihood with that of the ARIMA model. The comparison gives \(l_{SARIMA}=-2204.03 < l_{ARIMA}=-2202.21\), so the seasonal model does not fit better. Because the ARIMA model is nested within the SARIMA model, a lower maximized likelihood for the larger model points to imperfect numerical optimization rather than a genuinely worse fit; either way, there is no evidence that a seasonal term improves the fit. The spectral peak is possibly already well explained by the ordinary ARIMA model.
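Reading a dominant period off the spectrum can be sketched as follows. The series is simulated with a known 6-day cycle so that the result can be checked; the `spans` values and variable names are illustrative assumptions, not the settings used for Figure 4.

```r
# Simulated daily series with a 6-day cycle plus noise (placeholder data)
set.seed(4)
n <- 512
x <- sin(2 * pi * (1:n) / 6) + rnorm(n, sd = 0.5)

raw    <- spectrum(x, plot = FALSE)                      # unsmoothed periodogram
smooth <- spectrum(x, spans = c(3, 5, 3), plot = FALSE)  # span-smoothed version

# Frequency is in cycles per day, so its reciprocal is the period in days
peak_freq <- smooth$freq[which.max(smooth$spec)]
period    <- 1 / peak_freq
c(frequency = peak_freq, period = period)
```

Because the periodogram is evaluated on a discrete grid of frequencies, the recovered period is only approximately 6 days, in the same way that 0.1680 corresponds to 5.95 days in the text.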

Trend analysis

Our ultimate goal is to find a potential trend in NFT trading and link it with the search trend. Hence, in this part we return to the original sales data rather than the differenced series, and regress sales on the search trend. In general, we assume the model \[ Y_i = \beta_0 + \beta_1 X_i+\epsilon_i,\quad \epsilon_i\sim ARMA(p,q) \] where \(Y_i\) is the sales amount on day \(i\) and \(X_i\) is the smoothed search trend on that day.

We begin with a similar procedure to find a favorable model for this analysis. The model we choose is \(ARMA(4,2)\), with log likelihood \(l_{ARMA}=-1750.33\) and \(AIC_{ARMA}=3516.65\).

Our comparison model, which adds a regression on the weekly search trend, gives the following result:

## 
## Call:
## arima(x = nfts$`Sales USD`/1e+06, order = c(4, 0, 2), xreg = nfts$nft_s)
## 
## Coefficients:
##          ar1      ar2     ar3      ar4      ma1     ma2  intercept  nfts$nft_s
##       2.4616  -2.4026  1.0607  -0.1473  -1.6103  0.9036    13.9618      0.8821
## s.e.  0.0633   0.1255  0.1080   0.0470   0.0502  0.0495    10.7378      0.3014
## 
## sigma^2 estimated as 309.4:  log likelihood = -2191.57,  aic = 4401.13

We conduct the significance test again, and this time we have:

\[ \begin{aligned} \lambda_{LR} &= -2[l_{ARMA}-l_{ARMA_{reg}}] \\ &= -2\times(-2198.89+2195.59) \\ &= 6.6, \qquad \lambda_{LR}\sim \chi_1^2 \ \text{under}\ \beta_1=0 \end{aligned} \]

The test statistic gives a p-value of 0.01, indicating that adding the regression term significantly improves the fit. The estimate of \(\beta_1\) is a positive slope of 0.87, with 95% confidence interval [0.2773, 1.4652], indicating a significantly positive trend. Under this model there is significant evidence that NFT sales increase with search interest in NFTs.
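The likelihood ratio test and the confidence interval for the slope can be sketched as follows. The data are simulated and the names `x`, `y`, `fit_null`, and `fit_reg` are hypothetical; an AR(1) error model is used instead of \(ARMA(4,2)\) to keep the example small. The last line applies the same chi-square calculation to the two test statistics reported in this analysis.

```r
# Simulated data (placeholders): x is a search index, y is sales with AR(1) errors
set.seed(5)
n <- 300
x <- as.numeric(arima.sim(list(ar = 0.8), n = n)) + 50
y <- 10 + 0.9 * x + as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 1))

fit_null <- arima(y, order = c(1, 0, 0))            # ARMA errors only
fit_reg  <- arima(y, order = c(1, 0, 0), xreg = x)  # adds the regression term

# Likelihood ratio statistic; one extra parameter, so compare with chi-square(1)
lr <- 2 * (fit_reg$loglik - fit_null$loglik)
p_value <- pchisq(lr, df = 1, lower.tail = FALSE)

# Wald 95% confidence interval for the slope
b1 <- coef(fit_reg)["x"]
se <- sqrt(diag(fit_reg$var.coef))["x"]
ci <- b1 + c(-1.96, 1.96) * se

# p-values for the reported statistics 6.6 and 1.36
reported_p <- pchisq(c(6.6, 1.36), df = 1, lower.tail = FALSE)
round(reported_p, 4)
```

A confidence interval that excludes zero and a small likelihood ratio p-value are two views of the same evidence, and they should normally agree.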

Truncated analysis after Boom

The huge NFT boom in August 2021 draws our attention, since this actively traded period may make our model unstable. In addition, the trend data suggest that the term ‘NFT’ has become increasingly popular since October 2021. In this part, we therefore remove all data up to the end of the boom (September 7, 2021) and refit the models on the remaining 166 observations, which is still a reasonably large sample. We hope the result will be more favorable:

Figure 5: Top Left: Difference series; Top Right: ACF of difference series; Bottom Left: Smoothed Periodogram; Bottom Right: ARIMA(1,1,1) Residuals


However, the results are essentially unchanged. The difference series still looks stationary, now with an \(ARMA(1,1)\) model for the differences, and a seasonal term is again not needed.

The interesting part lies in the regression, where we suggest \(ARMA(1,0)\) errors. The new \(\beta_1\) estimate is 0.2477, with 95% confidence interval [-0.1608, 0.6561] and a likelihood ratio test statistic of 1.36, which means the search trend is no longer informative.

Estimates        Whole set          Truncated set
\(\beta_1\)      0.8713             0.2477
95% C.I.         [0.2773, 1.4652]   [-0.1608, 0.6561]
Test statistic   6.6                1.36
P-value          0.0102             0.2435

This may be because the NFT buzz drops off in mid-January 2022, with NFT sales dropping after a lag. Also, although search interest did not rise to the same degree as NFT sales during the August boom, the two did move in the same direction. That period is therefore informative as well, and removing it may not be a good choice.

Conclusion

After analyzing the Google Trends data and the NonFungible.com market data from September 27th 2020 to February 20th 2022, we draw the following conclusions:

During the exploratory stage, the \(ARIMA(2,1,2)\) model gave the best fit to the sales data among the simpler candidates. Its characteristic roots all lie outside the unit circle, so the model is causal and invertible, although the abnormal NFT sales boom in late August 2021 adds uncertainty.

We originally assumed that the data would show seasonality, as conventional markets do. However, after adding a seasonal term with the 6-day period found in the spectral density, we find that the SARIMA model does not perform better than the original model.

We tried to link the Google Trends data to the sales. The regression with \(ARMA(4,2)\) errors supports our assumption that rising search interest in NFTs is associated with higher sales.

Because of the surge in NFT sales in August 2021, our model could be unstable. To examine the trend with extra caution, we took the data after the boom (September 7, 2021) and refit on the remaining 166 observations. The results were only slightly different: an \(ARMA(1,1)\) model for the differences, without seasonality. However, the regression suggests that the search trend is no longer informative in this period, possibly because interest in NFTs drops off in mid-January 2022.

Acknowledgements

  1. Non-fungible token from Wikipedia, the free encyclopedia
  2. NonFungible Market History
  3. Google Trends
  4. Lecture slides and notes
  5. Models for Bitcoin Prices