FUTURE NATURAL GAS PRICE FORECASTING MODEL AND ITS POLICY IMPLICATION

Future natural gas (FNG) price is a collected data over the years and is a volatile movement in the market. In other words, FNG price is categorised as a time series data with volatility in both variance and mean, as well as most likely in some cases having heteroscedasticity problem. To come up with an estimated prediction model, some analysis tools, such as autoregressive integrated moving average (ARIMA) and generalised autoregressive conditional heteroscedasticity (GARCH), are introduced to find the best-fitted model having the smallest error value with high significance of probability value. This study aims to examine the best-fitted model that allows us to forecast FNG prices more accurately in the near future. There are 2842 observed data of daily FNG prices from 2009 to 2019 as the input of study objects. The finding suggests that the first measurement model of ARIMA (1,1,1) does not fit the model as having a non-significant probability value. Thus, it is required to check its heteroscedasticity by conducting an ARCH effect test. It is concluded that a data set has an effect of ARCH, so AR (p)–GARCH (p,q) model is then tested. AR (1)–GARCH (1,1) model is believed to be a best-fitted model having a significant P


INTRODUCTION
Currently, natural gas massively needed among communities, particularly Indonesia. Gas is basically needed in almost every aspect, both in domestic and businesswise. However, gas prices are frequently changed, affecting its consumption level and quality. This issue can be the most risky issue on the natural gas market. Whitcomb Jr. (1988) mentioned in his empirical study that future natural gas (FNG) prices can be affected by the uncontained number of its supply. Therefore, in some emerging countries like Indonesia, there is a wise solution as a policy of subsidy to lower the cost in civil society as the impact of price volatility in the gas global market. Subsidy is launched into communities due to some factors, including an increase of population and society consumption, immediately challenging the government to produce FNG; other factors might be a lack of renewable energy sources as well as development and penetration of alternative energy sources (Azadeh et al., 2013).
In making decision of subsidy-based policy application, the government is necessarily required to allocate their budgeting for subsidy. The government then accurately needs to early calculate the amount of subsidy to be launched in societies. If the daily prices of FNG are one of basic assumptions of subsidy policy, forecasting time series data of FNG is highly required, because bear in mind that an error in prediction model of FNG prices could be an anomaly of budgeting need for subsidy. An accurate prediction model might use proper mathematics assumptions to reduce error. Autoregressive integrated moving average (ARIMA) and generalised autoregressive conditional heteroscedasticity (GARCH) are simply two analysis tools to construct a model of estimated FNG prices. Some related studies have been conducted such as the new hybrid-forecasting model studied by Zhang and Zhang (2017) that applied HMM, EGARCH and LSSVM. However, this model has not been widely used in seeing their better performance in future values prediction compared to other related methods. Other studies focusing on more real-time forecasting models with short time in financial data such as daily price, interest rates and business cycle have been conducted by Skiadas et al. (2017) and Azhar et al. (2020).

DATA AND MODELLING PROCESS
In accordance with the data analysis tested by applying the proper method, this study follows some important steps. First, the method in both ARIMA (p,d,q) and GARCH (p,q) models is to form an equation model. This equation then is used as a foundation in doing the whole study, as it can be led the study into econometrical considerations, which are discussed as follows.

Stationary Transformation
The first econometric to satisfy is to check the stationarity of the data set. Stationary data set is required to make them more stable in mean and variance. There are some ways to check the stationarity of data set both statistically and non-statically. Statistic measurement of stationary data can be tested by considering the results of the augmented Dickey-Fuller (ADF) test, autocorrelation function (ACF), partial autocorrelation (PACF) and distribution of normality data (Azhar et al., 2020). A visual graphical data set can also be one way to see whether a data set is stationary or not. The following is mathematical equation of the ADF test (Dickey and Fuller, 1979): The rejection area of H 0 is DF τ < −2.57, and the probability value is <0.05 (Brockwell and Davis, 2002). To ensure statistic tests of stationarity from the data set, white noise test is run to check the indication of autocorrelation (Ljung and Box, 1978). In the Ljung and Box test, if Q statistic is indifferent significantly around zero, then it is a sign of white noise, and the model can fit the data very well (Enders, 2010).
From those tests, if the data set is determined as a non-stationary data, then the next step is to conduct differencing, which is to stabilise its mean by eliminating changes in a time series data (Hyndman and Athanasopoulos, 2013).

Adequacy Model
Given the data set has been stationary for its mean and variance, the next step is to select the appropriate forecasting model. In this study, we will examine two available models of ARIMA (p,d,q) and AR (p)-GARCH (p,q). Wold (1938) was the first man to introduce the combination of autocorrelation (AR) and moving average (MA) with a selected order of differencing. The prime condition of which is to have stationary data set to ARIMA (p,d,q) model. The following is an equation of ARIMA (p,d,q) model:

ARIMA (p,d,q) model
where μ is the constant of AR (p); ϕ i is the regression coefficient; i is 1,2,…, p.; p is the order of AR; λ k is the parameter estimation of MA; k is 1,2,…., q.; q is the order of MA; and ε t is error at time t.
Furthermore, time series data might show a correlation among them with the errors correlating with each other due to factor of time correlation. The Durbin-Watson (DW) test was then introduced to solve the issue to hypothesise zero conditional of non-residual autocorrelation. More specifically, the null hypothesis of DW test is a coefficient of autocorrelation such that the first order of p is all zero, where p is a chosen variable. The following is statistic equation of the DW test: Statistic test reports of DW test are as follows: • DW equals 2 is no autocorrelation.

ARCH effect tests
If the first model in this study, which is ARIMA (p,d,q) model, does not fit the data set, indicating a non-significant result of its probability value on estimated parameters, then it can be concluded that a presence of heteroscedasticity is available on the data set. It is therefore necessary to run another test, that is, autoregressive conditional heteroscedasticity (ARCH) effect test (Azhar et al., 2020). Virginia et al. (2018) argued that the effect can be tested using the Lagrange multiplier (LM) test, as heteroscedasticity is an issue in modelling time series data (Engle, 1982). In the LM test, a significant probability value indicates that ARCH test has a high effect order, so heteroscedasticity needs to be modelled (Wong and Li, 1995).

AR (p)-GARCH (p,q) model
Having ARCH effect, the data set having a heteroscedasticity effect can be modelled by applying the GARCH model. This model is applicable in processing a long memory, where the current variance is estimated by all past squared residuals (Tsay, 2005).
The order of p in AR (p) is denoted for the lag order of mean model, whereas the order of p and q in GARCH (p,q) are the order of conditional variances and squared residuals, respectively. Mathematically, they can be shown follows: Equations (4) and (5) can be considered the best-fitted model in forecasting model if its statistical descriptions show a significantly small root of mean squared error (MSE) and root mean squared error (RMSE) (Virginia et al., 2018).

FINDINGS AND SUGGESTED MODEL
In this empirical study, the data collected is the daily prices of FNG during the last decade, from 2009 to 2019. The number of observed data is 2842 which is then analysed to find the fitted model in forecasting its future prices in the next 30 days.
It is argued that the first thing to do in forecasting model is to ensure that the observed data is in a stationary form (Becker et al., 2006). Some measurements, such as checking the behaviour of visual plotting graph of the data series, statistical test of ADF unit root test, ACF and PACF test, as well as white noise inspection, are conducted to examine its stationarity. Figure 1 presents a picture of daily FNG prices that is visible to conclude as non-stationary series data, because its movement of mean and variance is not stable around zero.
It starts with a much-fluctuated data where in about the 300 th data it has almost 6 basis points and in the 800 th drops to <2 basis points. The series then reaches its peak on the 1300 th data to more than 6 basis points. The FNG price significantly drops to approximately 1.5 basis points at the 1800 th data. Afterwards, the data series is back to having an upward slope up to around the 2500 th data to reach back almost 5 basis points. Nevertheless, the decline trend is not evidently avoided until the end of period that drops to two basis points.
This non-stationary data is affirmed statistically by first studying the ADF unit root test. Table 1 approves what has been shown in Figure 1, as the probability value for mean of zero in lags of 3 exceeds the critical value of alpha (0.05). Further investigation is verified by the inspection of white noise (Table 2), of which its   autocorrelations for any given lags are close to one, suggesting that we can reject its null hypothesis to have a non-stationary data.
The following figures support both Tables 1 and 2 statistically. Figure 2a depicts the normal distribution of the daily FNG prices data, which can be said as not normally distributed. On the basis of autocorrelations, the ACF graph, as shown on Figure 2b, is moving down very slowly, indicating a non-stationary data. However, PACF (Figure 2c) has already shown a little good shape of autocorrelations, depicting that after lag of 1, the autocorrelations are around zero mean. Therefore, from the statistical graphs on Figure 2, the stationarity of daily FNG prices is justified as a non-stationary data.

Stationary Transformation
After guaranteeing that the daily FNG prices are not stationary, for the sake of further analysis, it is then suggested to transform them into stationary series data. The first test is to run differencing. Differencing 1 is selected to statistically convert the stationarity data. The following graphs imply the series data having stationary data already after differencing 1 was conducted. Figure 3a has visually proved that after examining differencing 1, the spread of mean and variance for series data of daily FNG prices is everywhere of zero, which means that the stationary condition is now satisfied. This graph also supports Figure 1 where its fluctuation occurs, which in its first 300 th data, the variance exceeds 1 standard deviation. In addition, on around 1300 th data, the variance also exceeded more than 0.5 standard deviation, confirming the peak of data on Figure 1. Similarly, at the end of the period, the standard deviation also deviates to almost 1 that upholds what have been on Figure 1 at the same period. Next, the ACF graph (Figure 3b) now shows the fast movement after lag of 1, revealing it as stationary data. This is also confirmed by Figure 3c, in which the PACF graph shows the autocorrelations inside the circle of zero.
Furthermore, Table 3 establishes the stationary data statistically by testing unit root test of ADF. Here it is shown that the probability  value of the series data is significantly <0.05, making it now stationary. White noise inspection as shown on Table 4 also verifies the argument, in which autocorrelations providing any lags are adjacent to zero; thus, we cannot reject the null hypothesis. In conclusion, those proofs make clear that after testing differencing 1, the daily FNG prices data are stationary. This stationary condition then enables us to conduct the next steps in having the best selected model for forecasting FNG prices.

ARIMA (p,d,q) model
The first model to forecast the stationary data set in this study is ARIMA. The conditional assumption is that the model should have been in differencing 1 as the lag of transformation into stationary. With differencing 1 from the output of estimation, as shown in the following Tables 5 and 6, we can construct the model of ARIMA (p,d,q).

FNG FNG
However, the model is not that significant as shown on Table 5 the probability value of both AR(1) and MA(1) with q=1 are more than 5%. Thus, we cannot rely on such this model to predict daily FNG prices.

ARCH effect inspection
ARIMA (1,1,1), as previously elaborated, is not significantly fitted to model the data set. This might be suspected that the presence of instable homoscedasticity error in the model of ARIMA (1,1,1). What this means that the model has an issue of heteroscedasticity, making the application of ordinary least square (OLS) as the basic assumption of estimation model is not efficient and effective. Therefore, such this problem is required to solve by first running the inspection of ARCH effect before continuing to test another model as the best selection. In this study we test ARCH-LM to identify the presence of heteroscedasticity in the data set.

Model comparison of AR (p)-GARCH (p,q)
Because the ARIMA (1,1,1) model does not fit the data set to be a prediction model, the next step is to find another model that fits the data sets wherein the condition of heteroscedasticity problem is generalised. Having p order equal to one and q order equal to one with differencing 1 in the ARIMA model, we can then analyse them to have mean model of AR (1) and variance model of GARCH (1,1) as the following statistical descriptive and parameter estimation. Table 8 explains to us the statistic description for the model that we analyse. It is argued statistically that the AR (1)-GARCH (1,1) model fits the data because an MSE value of 0.012 is significantly small, indicating that the error in means of the model is also very small. From this MSE, we can then estimate its root MSE (RMSE), which is 0.108. This value implies a very small RMSE, indicating that the variance of the model is very close to the variance of the observed data. The total R-square also indicates a very good value, which is 98.38%, giving a very strong explanation for the model. In other words, the model has a good accuracy as forecasting model.
Furthermore, as it is found that the model has a good ability and accuracy for forecasting data set, we can then be able to construct the model estimators as presented in Table 9. Here, both models for its mean and variance are presented.
Mean model of AR (1) As can be seen in Table 9, the AR (1)-GARCH (1,1) model is statistically significant, in which the probability value of <0.0001 is significantly small. Additionally, the sum of both ARCH and GARCH coefficient is argued as the GARCH model. If the value of the GARCH model is approaching one, the disturbances to the conditional variance will be significantly constant. Because GARCH coefficient is significant, it can be said that GARCH is a better-fitted model in forecasting compared to the ARCH or ARIMA model.
The AR (1)-GARCH (1,1) model is then applied to predict the FNG prices accurately for next 30 days. The following graph shows the forecast results. Figure 4 shows the forecasted data of FNG prices for the following 1 month with its upper and lower confidence level of 5%. It is suggested that the prediction of data series experiences a modest increase or even seems level visually from the start. Nevertheless, it is worth noting that the confidence interval increases its variance over time. Thus, the AR (1)-GARCH (1,1) model fits to only a short period to predict the data set.

POLICY IMPLICATIONS
Given the forecasted series data of FNG prices, some implications exist for policymakers, particularly in such emerging countries like Indonesia where daily consumption of gas is very high.       High price economically can affect low demand as stated on an empirical study by Liviu and Claudia (2011) that theoretically, demand and supply are a linear function evolving from inelastic to elastic perfectly and vice versa. This condition applies to supply and demand of FNG usage. The higher prices will result in the suffering of its end users, lowering their consumption.
In addition, Indonesia's industries are very dependent to FNG supply, in which Data and Information Centre Ministry of Industry Indonesia (Pusdatin Kemenprin (2018) argued that almost 90% of the manufacturing sector uses FNG domestically. In addition, they use FNG to expand their products and increase their added economic value. These phenomena should be a high consideration for the policymakers.
Subsidy as the offered implication as stated on Trinomics' working paper (2018) argued that it affects negatively and positively both industries and domestic households' purchasing power, which can directly be connected to economic growth ratio. In the case of natural gas, subsidy should be regulated wisely, as this might become a drawback instead. Thus, the highlight of this policy is to give it to end users most needing of natural gas, as there might be a possible opportunity to abuse it by exporting natural gas in some different products, such as using it in a form of industrial products with high value.
Therefore, to conduct and maintain the economic growth, analysis before determining subsidy policy is essential. On the basis of government budgeting, the accurate calculation should be a top priority; thus, the predicted AR (1)-GARCH (1,1) model can be a good foundation in making a wise policy.

CONCLUSION
Daily prices, as they are time series data, from certain commodities have sometimes high volatility. In fact, daily FNG prices experience vicissitude trends over the years. The mean and variance volatility in time series data are predictable as long as they are stationary. This study aims to find the fitted selection model to predict both mean and variance of one data set, and for this observation, we use daily FNG prices. After the series was stationary in mean and variance with differencing 1, we first constructed the ARIMA (p,d,q) model to forecast data set. As the model is not significant with p-value above t-statistic of 5%, then another forecasting model was introduced. We formulate AR (p)-GARCH (p,q) as the second trial model in condition with the same order of p and q as with the previous model, and the finding indicates the AR (1)-GARCH (1,1) model is significant with p-value under 5% as well as relatively small MSE and RMSE values.
The purpose of the selected model is to forecast daily FNG prices in the future. The expected data for the next 30 days shows an upward trend, although it only increases moderately. This urges policymakers to consider the better policy regarding this matter. A strategy of subsidy allotment in FNG prices will impact on an increased use of FNG, and industrial businesses as the end and most users of FNG are expected to create and expand their products for economic growth.