Modelling and Forecasting Crude Oil Prices during COVID-19 Pandemic

Currently, the world is suffering from the COVID-19 pandemic, which affects almost every aspect of daily life, gives rise to recession and affects world crude oil prices. This study aims to model the highly uncertain volatility of daily crude oil prices during the pandemic and to forecast those prices. The econometric model applied is the Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model, which allows more accurate and appropriate statistical analyses. The study also discusses how to address economic issues when the stability of daily crude oil prices is disturbed. The findings suggest that the AR(1)-GARCH(1,1) model fits the data well and produces forecasts with relatively small errors. This model can act as a foundation for determining future strategies under such uncertain circumstances.


INTRODUCTION
Today's global economy is facing its worst circumstances as COVID-19 continues to spread. The pandemic has affected trading and global supply chains, has put pressure on asset pricing, and forces multinational businesses to make difficult decisions with limited information (Ayittey et al., 2020). Ivanov (2020) stated that supply chain risk is marked by disturbances and ripple effects that may carry high uncertainty. Furthermore, the pandemic is lowering people's consumption levels, forcing markets to be more wary with their budgets. Therefore, while production is still ongoing, there is less demand, causing high uncertainty in prices, such as those in the oil industry.
The COVID-19 pandemic has had the worst impact on crude oil, whose price plummeted and turned negative for the first time. Some analysts argued that the decrease might be because some investors were worried about low demand. Besides, economic mobility in regions under lockdown will probably be paralysed, indicating a decreasing consumption level. Moreover, the effect of the macroeconomy on oil price volatility is a crucial factor for oil-importing as well as oil-exporting countries (Drachal, 2016). Therefore, it becomes necessary to forecast uncertain daily crude oil prices with a reduced error level. In economic statistics, forecasting financial time-series data with high accuracy is one way to make better decisions.
Forecasting daily crude oil prices (COPs) is important and challenging because oil price movements can raise or lower most economic and non-economic factors (Safari and Davallou, 2018). Abdulmajeed et al. (2020) stated that applied mathematical models, artificial intelligence, big data and forecasting methods are potential tools to predict oil prices. Statistically, the Generalised Autoregressive Conditional Heteroscedasticity (GARCH) forecasting model shows a good ability in forecasting time-series datasets (Engle, 1982; Tse and Tsui, 2002). Ahmed et al. (2018) showed in their empirical study that the GARCH model can be a suitable alternative for describing volatility behaviour.
This Journal is licensed under a Creative Commons Attribution 4.0 International License

METHODOLOGY AND STATISTICAL MODELLING
The data analysed in this study are worldwide COPs.
Specifically, we use the COPs from late 2019 to May 2020. Bollerslev (1986) introduced the GARCH(p,q) model to describe the behaviour of volatility, which makes it a suitable basis for a forecasting model. Several procedures are applied to identify the best-fitting GARCH(p,q) model for predicting short-term future prices.

Stationary Satisfaction
To satisfy the requirements of the GARCH(p,q) model, the first condition is to have a stationary dataset. One check is to inspect the data plot; if the series does not fluctuate stably around zero, it is considered non-stationary. Dickey and Fuller (1979) introduced the Augmented Dickey-Fuller (ADF) test to check stationarity, which is formulated as follows.
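For reference, a commonly used form of the ADF regression is sketched below. The symbols ($\alpha$, $\beta$, $\gamma$, $\delta_i$ and the lag length $k$) are assumptions of this sketch and may differ from the notation of the original test equation. Under the unit-root null hypothesis $\gamma = 0$, and the test statistic is the t-ratio of the estimate of $\gamma$; the zero-mean variant drops both $\alpha$ and $\beta t$.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t
```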
The hypotheses are defined as follows.
$H_0$: $DF_\tau > 2.57$ (non-stationary)
$H_1$: $DF_\tau < 2.57$ (stationary)
In addition, Tsay (2005) tested for a stationary dataset by computing the autocorrelation function (ACF) and the partial autocorrelation function (PACF), where a non-stationary dataset can be identified by the slow decay of these functions across lags. Since most financial data series are not stationary in either the mean or the variance, they should be transformed into a stationary dataset by differencing. Granger and Joyeux (1980) introduced the method of differencing to transform a non-stationary time-series dataset into a stationary one and so stabilise its mean and volatility. The mathematical equation is as follows.

Differencing
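A form consistent with the symbol definitions that follow, assuming the standard differencing filter of Granger and Joyeux (1980), is sketched here. The series names $y_t$ (original) and $x_t$ (differenced) are labels introduced for illustration only; for $d = 1$ the filter reduces to the first difference.

```latex
a(B) = (1 - B)^{d}, \qquad x_t = a(B)\, y_t, \qquad B\, y_t = y_{t-1}
% special case d = 1:
x_t = (1 - B)\, y_t = y_t - y_{t-1}
```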
where B is the backward (backshift) operator, d is the order of differencing, and a(B) is the integrating filter of order d. Once stationarity has been achieved, the GARCH model for the mean and volatility can be applied after the series has been tested for the ARCH effect (Tsay, 2014).

ARCH Effect Test
It is worth noting that, in modelling financial time series, the probability of heteroscedasticity is quite high (Engle, 1982), which makes the parameter estimates of the forecasting model less accurate. The presence of the ARCH effect is examined with the Lagrange Multiplier (LM) test (Lee and King, 1993), and the order of ARCH can be determined with the Wong and Li (1995) test. If the probability value of the LM test is significant (<0.001) at any given order, the heteroscedasticity in the model requires a long-memory process with a larger order (Ahmad et al., 2016). Since variance estimation in the ARCH(q) model (p = 0) is a short-memory process, the GARCH model (p > 0) is applied, in which past conditional variances as well as past squared residuals are used to estimate the variance (Tsay, 2005).
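The idea behind an LM test for the ARCH effect can be sketched in a few lines. The example below assumes Engle's (1982) $T \cdot R^2$ form with a single lag, which may differ from the Lee and King (1993) variant cited above; the function name `arch_lm_stat` and both residual series are hypothetical and are not taken from the study.

```python
def arch_lm_stat(residuals):
    """Engle-type LM statistic for ARCH(1): n * R^2 from regressing e_t^2 on e_{t-1}^2."""
    sq = [e * e for e in residuals]
    y = sq[1:]    # e_t^2
    x = sq[:-1]   # e_{t-1}^2
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # Squared residuals are constant: R^2 is undefined, treat as no evidence of ARCH.
        return 0.0
    r_squared = (sxy * sxy) / (sxx * syy)
    return n * r_squared


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841

# Hypothetical residuals: a calm regime followed by a turbulent one (volatility clustering).
clustered = [0.1, -0.1] * 20 + [2.0, -2.0] * 20
# Hypothetical residuals with constant magnitude (no volatility clustering).
flat = [1.0, -1.0] * 40

print("clustered:", arch_lm_stat(clustered) > CHI2_1DF_5PCT)
print("flat:", arch_lm_stat(flat) > CHI2_1DF_5PCT)
```

A statistic above the critical value rejects the null hypothesis of no ARCH effect, which is the situation in which a conditional variance model is warranted.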

The Mean and Variance Model of AR(p)-GARCH(p,q)
The AR(p) mean model has lag order p, while in the GARCH(p,q) variance model p is the order of the lagged conditional variance and q is the order of the squared residuals. Equations 3 and 4 present the proposed models mathematically.
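A standard statement of the two models, consistent with the order convention above, is sketched here. The symbols $\phi_i$, $\omega$, $\alpha_i$, $\beta_j$ and the standardised innovation $z_t$ are assumptions of this sketch rather than the authors' notation, and the AR lag order is written as $m$ to avoid reusing $p$. Setting $p = 0$ removes the lagged variance terms and leaves the ARCH(q) model.

```latex
% mean model, AR(m)
y_t = \phi_0 + \sum_{i=1}^{m} \phi_i\, y_{t-i} + e_t, \qquad e_t = \sigma_t z_t, \quad z_t \sim \text{i.i.d.}(0, 1)
% variance model, GARCH(p,q)
\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i\, e_{t-i}^2 + \sum_{j=1}^{p} \beta_j\, \sigma_{t-j}^2
```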
If the mean square error (MSE) and root mean square error (RMSE) are small relative to the descriptive statistics of the data, the model is considered well fitted for forecasting.
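The two error measures can be computed as in the minimal sketch below. It assumes the plain definitions (average squared error and its square root); statistical packages may instead divide by the residual degrees of freedom. The price and forecast values are hypothetical and are not the study's data.

```python
import math


def mse(actual, predicted):
    """Mean square error: average of squared forecast errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical prices and forecasts, for illustration only.
actual = [35.0, 36.5, 37.0, 38.2]
predicted = [34.0, 37.0, 36.0, 39.2]

print("MSE :", round(mse(actual, predicted), 4))
print("RMSE:", round(rmse(actual, predicted), 4))
```

Because the RMSE is expressed in the units of the series (here, dollars per barrel), it is the easier of the two to compare against the typical price level.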

Data Description and Stationary Diagnostics
The observations are taken from COPs data during the COVID-19 pandemic, with a study period from December 2019 to May 2020. A total of 152 observations are used to examine the impact on COPs, which have become very volatile, and to forecast both the mean and volatility for the next 10 days.
The analysis starts by plotting the time series to understand its behaviour visually, i.e. whether it is stationary or non-stationary. Figure 1 shows that in the first 50 observations, COPs were relatively stable around $60. As the pandemic spread around the globe, they decreased gradually until approximately the 80th observation. Afterwards, the COVID-19 pandemic caused COPs to plummet to around $15 and then to a negative price on the 101st day of observation, the first negative price in the history of COPs. However, for the sake of statistical analysis in this study, the 101st observation is assumed to be equal to the previous one. The price then climbed sharply over the following few days owing to a positive reaction in the market. Nevertheless, COPs could not sustain the recovery, and the pandemic pushed them to their lowest price, nearly zero, at the 125th observation. Subsequently, the market turned positive again as the world economy began to recover, shown by the gradual upward movement of COPs to the last date of the study, when they reached approximately $40 per barrel.
The descriptive plot indicates that the dataset is non-stationary because of the unstable movement of the mean and variance. Hence, it is necessary to test stationarity statistically by applying the ADF test, as presented in Table 1.
From Table 1, the P-value for the zero-mean case at lag 3 is not significant (>0.05), which agrees with the previous data description and indicates that the mean and variance of the data series are not stationary. Furthermore, Figure 2 confirms the statistical result in Table 1, as its three graphs do not indicate a stationary dataset. Figure 2a shows that the data are not normally distributed, as they exceed the interval curve, and Figure 2b shows that the autocorrelation decays too smoothly. These first two graphs in Figure 2 affirm that the mean and variance of the dataset are non-stationary. Figure 3c meets the stationary condition, as the PACF is around zero after lag 1.

Conversion of a Stationary Dataset
The dataset of world COPs has been confirmed as non-stationary. The next step in time-series modelling is to convert it to a stationary series by differencing one or more times. This study applies first-order differencing (d = 1) and tests whether the result is stationary. As shown in Figure 3, all graphs satisfy the stationary condition. After first differencing, the series plotted in Figure 3a fluctuates around zero, with some observations reaching two standard deviations. This reflects the fast movement of COPs during the COVID-19 pandemic, although the series still stays centred on zero. Figures 3b and 3c show a rapid decline towards zero after lag 1, and the dataset is normally distributed and fits the curve shape shown in Figure 4.
The ADF unit-root test is then applied to confirm stationarity statistically. Table 2 shows statistical significance (P < 0.05) for the zero-mean case, which suggests that after first differencing the dataset is statistically stationary. As a result, we can proceed to the next stages of modelling and forecasting COPs. Azhar et al. (2020) stated that heteroscedasticity is frequent in financial time-series data. Hence, although the ARIMA(p,d,q) model can fit the data for forecasting, the ARCH effect must be tested to examine whether the model involves heteroscedasticity. We deliberately skipped the ARIMA(p,d,q) modelling step in this study and went on to test for the presence of heteroscedasticity using the ARCH-LM test as follows.
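The effect of differencing on the autocorrelation can be illustrated with a small sketch. The helper names `difference` and `sample_acf` and the trending price series are hypothetical, chosen only to show the mechanics; they are not the study's data or software.

```python
def difference(series, d=1):
    """Apply the differencing operator (1 - B) to the series d times."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def sample_acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: autocorrelation is undefined, report zero
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical downward-trending prices with a small repeating wobble.
prices = [60.0 - 0.4 * t + (0.5 if t % 3 == 0 else -0.25) for t in range(60)]
changes = difference(prices, d=1)

print("lag-1 ACF, levels     :", round(sample_acf(prices, 1), 3))
print("lag-1 ACF, differences:", round(sample_acf(changes, 1), 3))
```

In this illustration the level series has a lag-1 autocorrelation close to 1 because the trend dominates, while the differenced series no longer carries that trend.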

Mean and Variance Model
From Table 3, the ARIMA(p,d,q) model involves the ARCH effect, as the portmanteau Q and Lagrange Multiplier (LM) tests both return p-values below the significance level. The null hypothesis of no ARCH effect is therefore rejected. Consequently, it is necessary to model the mean and variance more accurately, which requires a generalised model. The GARCH(p,q) model is applied to generalise the conditional heteroscedasticity and obtain a good fit for modelling and forecasting the variance, while the AR(p) model is used for the mean. Table 4 shows that the AR(1)-GARCH(1,1) model fits the stationary dataset, which is indicated by P < 0.005 for each parameter estimate. The model can be written as follows.
AR(1) for modelling the mean:
$\mathrm{COPCOV}_t = 55.8089 - 0.9680\,\mathrm{COPCOV}_{t-1} + e_t$
GARCH(1,1) for modelling the variance, in general form with the parameter estimates reported in Table 4:
$\sigma_t^2 = \omega + \alpha_1 e_{t-1}^2 + \beta_1 \sigma_{t-1}^2$
The AR(1)-GARCH(1,1) model is statistically a good fit, as shown in Table 5. It is worth noting that the MSE is relatively small at 4.91458, which means that the mean error is also considerably small. The RMSE is then calculated as 2.21688, which is small relative to the unconditional variance. This implies that the variances of the model and the dataset are close to each other. The R-squared is also high, at 98.27%. In sum, the model gives accurate predictions as a forecasting model.
The persistence of the model can be analysed by summing the ARCH and GARCH parameter estimates. If the sum of the coefficients is close to 1, shocks to the conditional variance die out slowly, so volatility is highly persistent. This implies that the model provides a better prediction.
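The persistence calculation can be sketched as follows for a GARCH(1,1) model. The parameter values are hypothetical placeholders, not the estimates from Table 4, and the function names are introduced here for illustration. The unconditional variance and half-life formulas are the standard ones for a covariance-stationary GARCH(1,1) process, i.e. one whose persistence is below 1.

```python
import math


def garch11_persistence(alpha1, beta1):
    """Sum of the ARCH and GARCH coefficients."""
    return alpha1 + beta1


def garch11_unconditional_variance(omega, alpha1, beta1):
    """Long-run variance omega / (1 - alpha1 - beta1); requires persistence < 1."""
    persistence = garch11_persistence(alpha1, beta1)
    if persistence >= 1.0:
        raise ValueError("unconditional variance is not finite when persistence >= 1")
    return omega / (1.0 - persistence)


def garch11_half_life(alpha1, beta1):
    """Number of periods for a variance shock to decay to half its size."""
    persistence = garch11_persistence(alpha1, beta1)
    if not 0.0 < persistence < 1.0:
        raise ValueError("half-life is defined only for 0 < persistence < 1")
    return math.log(0.5) / math.log(persistence)


# Hypothetical parameter values, for illustration only.
omega, alpha1, beta1 = 0.1, 0.15, 0.80

print("persistence        :", round(garch11_persistence(alpha1, beta1), 4))
print("long-run variance  :", round(garch11_unconditional_variance(omega, alpha1, beta1), 4))
print("half-life (periods):", round(garch11_half_life(alpha1, beta1), 2))
```

The closer the persistence is to 1, the longer the half-life, which is why a near-unit sum indicates that a volatility shock affects forecasts for many days.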

Forecasting the Daily World COPs
Since 1 June 2020, with the COVID-19 pandemic still ongoing, most countries have decided to reopen their economies. The AR(1)-GARCH(1,1) model is used to predict world COPs for the next 10 days, from 1 June to 12 June 2020. As shown in Figure 4, there is a gradual upward trend in the daily COPs forecast, but the prices are still not approaching their level before the pandemic. However, as regulators attempt to stabilise the economy, the rise in the world oil price is expected to continue as predicted by the model.
The study was completed after the 10-day forecast period had passed. Consequently, the predictions can be compared with the real data to check their accuracy. Table 6 shows a comparison between the predicted prices from the established model and the real prices of world COPs from 1 June to 12 June 2020.
From Table 6, the predicted prices fit the respective real prices well for the first eight days. However, for the last two days, the gaps are wider. This is because the confidence interval of the AR(1)-GARCH(1,1) forecast widens as the horizon lengthens. Therefore, the model is only suitable for forecasting over a short period.
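Why the interval widens with the horizon can be seen from the forecast error variance of an AR(1) model. The sketch below makes a simplifying assumption that the innovation variance is constant, whereas in a GARCH model it evolves over time; the accumulation of error over the horizon is the same in both cases. The coefficient `phi`, the variance `sigma2` and the function name are hypothetical and are not the study's estimates.

```python
import math


def ar1_forecast_halfwidths(phi, sigma2, horizon, z=1.96):
    """Half-widths of approximate 95% intervals for 1..horizon step-ahead AR(1) forecasts.

    The h-step forecast error variance is sigma2 * (1 + phi^2 + ... + phi^(2(h-1))).
    """
    halfwidths = []
    error_variance = 0.0
    for h in range(1, horizon + 1):
        error_variance += sigma2 * phi ** (2 * (h - 1))
        halfwidths.append(z * math.sqrt(error_variance))
    return halfwidths


# Hypothetical AR(1) coefficient and innovation variance, for illustration only.
widths = ar1_forecast_halfwidths(phi=0.9, sigma2=1.0, horizon=10)

for day, width in enumerate(widths, start=1):
    print(f"day {day:2d}: +/- {width:.2f}")
```

Each additional day adds a positive term to the error variance, so the interval for day 10 is necessarily wider than the interval for day 1.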

DISCUSSION
Forecasting models of oil prices have been widely applied as a benchmark for determining future strategies. Crude oil, as a global commodity, is a crucial element of the world economy. Increases and decreases in daily COPs could determine the price levels of other commodities. In a previous study on macroeconomic changes and the oil price, Hamilton (1983) found that oil price shocks are a factor contributing to recessions in the United States of America. Furthermore, upward trends in oil prices might increase companies' production costs, reduce profits and affect stock prices (Apergis and Miller, 2009).
As we encounter many economic issues caused by the fluctuation of oil prices, forecasts could be used to prepare for the worst. While the COVID-19 pandemic has caused the oil price to fall, it has also dramatically reduced the consumption level in communities. On the other hand, it is almost impossible to stop the production of crude oil because of the high cost of exploration. Therefore, an alternative solution is to store production in the available oil tanks while the oil price is falling.
The dramatic decrease in world COPs gives a lesson on how important the speed of the oil supply chain is in reaching selling tanks. Governments, as buyers, should set aside a budget that can be used whenever oil prices fall, so that the purchase is highly likely to be beneficial. Oil refinery companies should also provide spare tanks to store production even if there is zero demand. Scheduling oil production based on forecasts of economic conditions could also be a wise planning strategy. A decrease in the oil price could be a sign of declining demand; therefore, during such a period, it is appropriate to slow down production.

CONCLUSION
The COVID-19 pandemic has had a dramatic effect on most sectors, including the economic sector. Global COPs are the most affected, dropping to the lowest price in history and reaching a negative price during the outbreak. The ultimate aim of this study was to construct a fitted model of the mean and variance in order to forecast daily COPs for the period in which most countries lift their lockdowns, namely the first 10 days of June 2020.