GAUSSIAN PROCESS REGRESSION FOR FORECASTING GASOLINE PRICES IN JORDAN

The purpose of this paper is to forecast monthly gasoline prices in Jordan by applying Gaussian process regression to monthly prices of two types of gasoline (octane-90 and octane-95) over the period January 2008–December 2019. Accurate predictions of gasoline prices have several fiscal policy implications concerning fuel subsidies and taxes. They also affect consumption and production decisions, and they are crucial for designing and analyzing environmental policies. The Gaussian process model is used to treat a geometric Brownian motion with an unknown deterministic drift function. Prediction performance was measured using the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE). The numerical results show that the model's predictions of gasoline prices were accurate.


INTRODUCTION
Worldwide, policy makers face the dilemma that gasoline is an important catalyst of economic growth but also a major source of negative externalities including, among others, air pollution, car accidents, and traffic congestion. Faced with this dilemma, some countries choose to keep tight control over gasoline wholesale and retail prices by imposing fuel taxes and providing fuel subsidies. However, countries differ significantly in their fuel tax and subsidy policies (Burke and Nishitateno, 2013). Some countries subsidize gasoline prices, thus encouraging overconsumption, while others employ price and non-price policies to reduce gasoline consumption (Moshiri, 2020). The ultimate goal of price policies is to reduce gasoline consumption by raising its price. This is customarily accomplished either by imposing fuel taxes or by eliminating fuel subsidies (Moshiri, 2020). However, the effectiveness of such policies is widely questioned given that the demand for gasoline is relatively inelastic in response to price changes; this has created serious challenges for policy makers seeking to control gasoline consumption (Lin and Prince, 2013). Despite these shortcomings, these policies are widely used across the globe.
In many countries, taxes levied on gasoline are used to reinforce government revenues, as well as to cross-subsidize other petroleum products largely consumed by the poor, such as kerosene and Liquefied Petroleum Gas (LPG). Subsidies, in turn, are used, among other purposes, to expand access to energy and protect the poor against high fuel costs (Fattouh and El-Katiri, 2013). However, recent spikes in oil prices have compelled many countries to adopt energy price reforms to face growing fiscal constraints. These reforms took the form of either imposing extra fuel taxes or reducing fuel subsidies, or even removing them completely. Savings accruing from the elimination of fuel subsidies are used to lessen fiscal pressures on government budgets. In sum, subsidies and taxes are two principal tools used to influence prices, which in turn are supposed to influence the behavior of consumers, producers and governments. They also have several important implications for the design and analysis of environmental policies. Baumeister et al. (2017) conclude that fluctuations in gasoline prices directly affect the purchasing power of consumers, and also affect their decisions concerning which cars to purchase, as well as whether to live close to or distant from their places of work. Molloy and Shan (2013) reached similar conclusions concerning the choice of residential locations. Xu et al. (2018) further suggest that gasoline prices directly affect people's choices regarding the mode of transportation. Busse et al. (2013) argue that in the long run, changes in gasoline prices might induce automobile manufacturers to manufacture more fuel-efficient cars, or even to change fuel technologies to hybrid or electric vehicles. Xu et al. (2018) note that forecasts of gasoline prices enable automobile manufacturers to adjust their designs, production levels and marketing plans.
This Journal is licensed under a Creative Commons Attribution 4.0 International License
On a similar note, governments can use forecasts of gasoline prices to estimate the revenues from ad valorem gasoline taxes as well as to form expectations concerning inflation and economic growth (Baumeister et al., 2017). These considerations strongly support the need for accurate forecasts of gasoline prices to inform the decisions made by various actors in the economy. A survey of the literature on gasoline prices reveals that considerable research is concerned with analyzing the welfare and fiscal impacts of altering fuel subsidy and tax regimes. Many studies are also devoted to examining the relationship between gasoline prices and the demand for gasoline. Nevertheless, research on forecasting gasoline prices has not received sufficient attention. Baumeister et al. (2017) attribute the shortage of forecasting studies to the widely accepted belief that accurate forecasts of gasoline prices cannot extend beyond a few days and hence that current prices are the best predictors of future prices. In spite of this, various methods have been used in the literature to forecast gasoline prices. Some of these methods use the price expectations of consumers (e.g. Anderson et al., 2011; 2013). Other studies use regression-based models (e.g. Baumeister et al., 2017 and Xu et al., 2018). Recently, artificial intelligence and machine learning techniques have been utilized (e.g. Mustaffa et al., 2014; Chiroma et al., 2014). However, the Gaussian Process Regression (GPR) method, a powerful nonparametric machine learning method for regression, has not been used so far to forecast gasoline prices, although it has been used to forecast other energy variables (see for example Blum and Riedmiller, 2013; Leith et al., 2004 for forecasting demand for electricity; Yang et al., 2018 for forecasting power load; Laib et al., 2018 for natural gas consumption prediction).
Therefore, this study aims to partially fill this gap in the literature and to form a benchmark against which future research can be compared. It also aspires to set the stage for future research on the prices of other petroleum products using the GPR technique. In particular, this study employs the GPR method to produce forecasts of gasoline prices using monthly time series data on gasoline prices in Jordan. The GPR is a powerful state-of-the-art nonparametric Bayesian regression method able to handle complex relationships contained in time series (Yang et al., 2018).
Jordan is an interesting case to study given that it has recently witnessed a dramatic shift in its fuel-pricing regime from a highly subsidized regime to a subsidy-free one, except for LPG. This shift has had considerable fiscal and welfare implications, in addition to its impact on consumption patterns. The fiscal and welfare consequences have been thoroughly analyzed by the International Monetary Fund (IMF) and the World Bank (e.g. Atamanov et al., 2017 and Gillingham et al., 2006). However, no research has been undertaken to explore the behavior and properties of the stochastic processes that have generated gasoline prices, despite their crucial role in predicting the future price trajectories needed for formulating sound energy policies and for predicting consumers' behavior and other variables.
The rest of the paper is organized as follows. Section 2 describes the key features of the Jordanian oil market, the fuel price reform and gasoline prices. Section 3 summarizes some relevant literature on forecasting gasoline prices. Section 4 describes the statistical models of geometric Brownian motion and Gaussian process regression. In Section 5, we describe the gasoline prices data set. Section 6 contains the model estimation and results, and Section 7 concludes the paper.

JORDAN OIL MARKET
Jordan is a small non-oil-producing country located in the MENA region. It imports crude oil and refines it into diverse petroleum products. The gap between domestic supply and demand for petroleum products is met by importing refined petroleum products. Starting in 1958 and for half a century, a single firm named the Jordan Petroleum Refinery Company (the JPRC, henceforth) monopolized the Jordanian oil market. In particular, the Government of Jordan (GoJ, hereafter) awarded the JPRC a 50-year concession scheduled to expire in 2008. The concession granted the JPRC the exclusive right to import, store and refine crude oil, as well as to import, store, distribute and sell petroleum products throughout Jordan. For its part, the JPRC was required to provide the GoJ with a detailed list of the costs of production. In light of these costs, the GoJ set the retail prices of the various products. In 2008, the concession agreement with the JPRC expired. Concurrently, the features of a new oil market structure began to take shape, together with efforts to reform energy prices. Indeed, the market was partially liberalized as the GoJ broke the JPRC's monopoly over distribution activities and passed legislation allowing the private sector to invest in these activities. Currently the oil market comprises three privately owned distributing (marketing) companies, one of which is owned by the JPRC. Formerly, the GoJ compelled the marketing companies to purchase fuel exclusively from the JPRC. Recently these restrictions have been removed and the companies are authorized to import petroleum products.

Fuel Prices Reform
In 2005, in response to growing fiscal pressures on the government budget, Jordan initiated a reform of petroleum product prices by raising prices and removing subsidies. The discontinuation of crude oil supply from Iraq at below-market prices, combined with hikes in world oil prices, was the primary motive for the reform. Before 2003, Jordan used to purchase oil from Iraq at concessional prices. This situation, together with low international prices, enabled Jordan to subsidize retail prices heavily and thus avoid the need either to charge higher prices or to free fuel prices (Gillingham et al., 2006; World Bank, 2009). After 2003, Jordan lost Iraq as a source of cheap oil. This shock, along with hikes in international fuel prices, spurred the GoJ to embrace a plan for eliminating fuel subsidies. In 2005, the GoJ increased the prices of fuels, and in 2006 the prices were increased again (World Bank, 2009). In Feb 2008, oil price subsidies were eliminated except for LPG (Liquefied Petroleum Gas), and an automatic fuel pricing mechanism was put in place. Under the new mechanism, prices are revised on a monthly basis to better reflect the actual costs of production. At the end of 2010, as oil prices reached US$ 90 a barrel, the government discontinued the monthly petroleum price adjustments and reintroduced petroleum subsidies. In December 2012, increasing fiscal pressures forced the GoJ to resume monthly price adjustments, again except for LPG (see Atamanov et al., 2017 and Kojima, 2009 for more details).

Prices of Gasoline
After the 1989 crisis, the GoJ introduced subsidies for selected petroleum products. In 1992, the government introduced a cross-subsidization scheme. This scheme entailed charging above-market prices for gasoline while setting the prices of other products, consumed mostly by the poor, below world market levels. In order to reflect increases in world oil prices, the cross-subsidization scheme was coupled with a periodic adjustment of fuel prices. To reduce the negative impact of fluctuations in world oil prices on the budget, the GoJ introduced a 2 percent GST on petroleum products in 2002, which was raised to 4 percent in 2003. In the same year, the GoJ raised fuel prices by 4-20 percent to reduce the growing pressure on the budget caused by losing Iraq as a source of cheap oil (Mansur, 2004).
Traditionally gasoline has been taxed with the aim of generating revenues, which were used to cross-subsidize other products (Gillingham et al., 2006). Notably, prior to 2005 retail prices were occasionally adjusted on an ad hoc basis (see Figure 1 for a historical overview of the average annual nominal prices for various types of gasoline during 1979-2007)¹. Starting in Feb 2008, the prices of gasoline and other petroleum products were set according to an automatic fuel pricing mechanism that is revised on a monthly basis. Since then, prices have fluctuated considerably, creating uncertainty and disturbing households' budget planning and spending. Other than some ex ante predictions published in press releases a few days before the end of each month, no other sources of forecasts are available. Hence, establishing a reliable forecasting framework would be of great assistance to consumers in planning their expenditures and smoothing consumption.
¹ Historically and until 2007, two types of gasoline were traded in Jordan; namely, regular and super. In 1995, a third type called unleaded was first introduced into the market. In 2008, the year when fuel subsidies were eliminated, two new types of gasoline were traded in the market; namely, octane 90 and octane 95.

BRIEF LITERATURE REVIEW ON FORECASTING GASOLINE PRICES
Noel and Chu (2015) assert that forecasting prices receives great attention in economics because forecasts help economic agents make optimal intertemporal decisions. Noticeably, forecasting oil prices has received greater attention than forecasting the prices of petroleum products such as gasoline. This can be explained in part by the common belief that current prices are the best predictors of future prices, and that it is impossible to forecast prices accurately beyond a few days (Baumeister et al., 2017). Various streams of applied research have tried to forecast gasoline prices. One line of research used the prices of crude oil to forecast retail gasoline prices. This line of research was pioneered by Bacon (1991), who coined the term "rockets and feathers" to indicate that retail gasoline prices respond quickly to increases in oil prices but slowly to decreases. Other studies employed regression-based models to forecast gasoline prices. Among these is the study by Baumeister et al. (2017), who employ regression-based forecasting methods to forecast future gasoline prices in the USA. More specifically, they employed autoregressive, autoregressive moving average and exponential smoothing models. Based on the Mean-Squared Prediction Error (MSPE) measure, they found that the bivariate VAR(1) model is the most accurate forecasting model. An interesting finding was that pooling forecasts results in a further reduction in MSPE. As noted by Xu et al. (2018), time series forecasting models are preferable to other models for predicting future gasoline prices, since it is difficult to obtain accurate estimates of the external factors that affect gasoline prices. In order to predict gasoline prices in China, Xu et al. (2018) estimated five time-series forecasting models: ARIMA-GARCH, exponential smoothing, grey system, artificial neural network (ANN), and support vector regression (SVR) models. Using the Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE) criteria, it was found that the ARIMA model was the best predictor in the short run, while SVR was the best predictor in

Geometric Brownian Motion Model
The modeling of option prices and many other financial data is often conducted via a stochastic differential equation (SDE) of the form (Iacus, 2008):

$$dZ(t) = \big(a_1(t) + a_2(t)\,Z(t)\big)\,dt + \big(b_1(t) + b_2(t)\,Z(t)\big)\,dW(t), \tag{1}$$
where $Z(t)$ is a stochastic process, $W(t)$ is a Brownian motion and $a_i(t)$ and $b_i(t)$, $i=1,2$, are real-valued functions. The SDE is called a non-homogeneous geometric Brownian motion (GBM) when $a_1(t)=f(t)$, $b_1(t)=\sigma$ and $a_2(t)=b_2(t)=0$ for all $t$, where $f(t)$ is a deterministic continuous function. So, for the log returns of prices, the GBM arises as a special case of the above SDE:
$$dZ(t) = f(t)\,dt + \sigma\,dW(t). \tag{2}$$
In the discrete case, with a time grid $t_1, t_2, \ldots, t_n$ of step $\Delta t$, if we set $Z_{t_i} = \log Y_{t_i}$, then
$$\log\frac{Y_{t_i}}{Y_{t_{i-1}}} = f(t_i)\,\Delta t + \sigma\sqrt{\Delta t}\,\epsilon_{t_i}, \tag{3}$$
where $Y_{t_i}$ is the price at time $t_i$, $f(t_i)$ is a deterministic function representing the drift, and $\sigma\sqrt{\Delta t}\,\epsilon_{t_i}$ is a random shock, with the $\epsilon_{t_i}$ independently distributed as $N(0,1)$.
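To make the discrete model concrete, the following minimal Python sketch simulates a price path whose log returns equal a time-varying drift plus Gaussian noise. It assumes the Euler-type form log(Y_i / Y_{i-1}) = f(t_i) Δt + σ √Δt ε_i; the function names, the sinusoidal drift and all numerical values are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_prices(y0, drift, sigma, n_steps, dt=1.0, seed=0):
    """Simulate prices whose log returns are drift(t)*dt + sigma*sqrt(dt)*eps."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    prices = [y0]
    for i in range(1, n_steps + 1):
        t = i * dt
        eps = rng.gauss(0.0, 1.0)  # eps ~ N(0, 1), independent across steps
        log_return = drift(t) * dt + sigma * math.sqrt(dt) * eps
        prices.append(prices[-1] * math.exp(log_return))
    return prices


def example_drift(t):
    """Hypothetical non-constant drift with a 12-month cycle (illustration only)."""
    return 0.01 * math.sin(2.0 * math.pi * t / 12.0)


# Hypothetical starting price of 500 and monthly volatility of 0.05.
prices = simulate_prices(y0=500.0, drift=example_drift, sigma=0.05, n_steps=24)
log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
```

Setting `sigma=0.0` removes the random shock, so the simulated log returns reproduce the drift exactly, which is a convenient sanity check.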
In the next section, we introduce the reader to the GPR model and then use it to construct a prediction of the GBM model (2) with an unknown drift function f(t).

Gaussian Process Regression Model
The Gaussian process regression model constitutes a general and flexible model for nonlinear regression. Over the past two decades, it has received considerable attention in the machine learning community. It also allows the modeler to treat regression problems in a fully Bayesian framework, without the complexities of Markov chain Monte Carlo methods, since it provides a closed-form posterior distribution of predictions (Rasmussen and Williams, 2006). The GPR can be described as follows. Let $Y_t$ be a response variable measured at time $t$. A nonparametric regression model is expressed by
$$Y_t = f(t) + \epsilon_t, \tag{4}$$
where $f(\cdot)$ is an unknown function and $\epsilon_t$ is a random error, i.e. the value of the function $f(\cdot)$ is measured as $Y_t$ but corrupted by random noise $\epsilon_t$. The Gaussian process regression (GPR) model is a nonparametric model which enjoys nice features. For example, its predictive distribution is also Gaussian. The model is summarized as follows. Suppose that we have observed $n$ data points $Y_{t_1}, \ldots, Y_{t_n}$ on the variable $Y$ at the times $t_1, \ldots, t_n$, respectively, and let the set $D = \{(t_1, Y_1), \ldots, (t_n, Y_n)\}$ denote the observed data. It is assumed that the observed data are governed by model (4) as follows:
$$Y_{t_i} = f(t_i) + \epsilon_{t_i}, \quad i = 1, \ldots, n, \tag{5}$$
where the $\epsilon_{t_i}$ are iid $N(0, \sigma^2)$. Since we have no information about the functional form of $f(t)$, a Gaussian process is used as a prior process over the space of all functions, i.e., we assume that for every finite choice of distinct times $t_1, t_2, \ldots, t_k$, the random vector $(f(t_1), \ldots, f(t_k))^T$ has a $k$-variate normal distribution with mean vector 0 and covariance matrix $\Sigma = (V_{ij})_{i,j=1}^{k}$ with entries given by $V_{ij} = C(t_i, t_j)$, where $C(\cdot,\cdot)$ is some covariance function. A common choice of $C$ is the linear exponential covariance function (Brahim-Belhouari and Bermak, 2004), whose parameters $\theta_1$, $\theta_2$ and $\theta_3$ are unknown and are estimated from the data.
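As an illustration of how a covariance function generates the prior covariance matrix, the sketch below builds the matrix with entries V_ij = C(t_i, t_j). The exact parameterization of the linear exponential covariance is not reproduced here, so the form used in the code, a linear term plus a squared-exponential term, C(s, t) = θ1·s·t + θ2·exp(−θ3·(s − t)²), is an assumption made for illustration only; the function names and parameter values are likewise hypothetical.

```python
import math


def linear_exponential_cov(s, t, theta1, theta2, theta3):
    """Assumed form: theta1*s*t (linear part) + theta2*exp(-theta3*(s-t)^2)."""
    return theta1 * s * t + theta2 * math.exp(-theta3 * (s - t) ** 2)


def covariance_matrix(times, theta1, theta2, theta3):
    """Matrix with entries V[i][j] = C(t_i, t_j) over the given time points."""
    return [
        [linear_exponential_cov(s, t, theta1, theta2, theta3) for t in times]
        for s in times
    ]


# Hypothetical time grid and parameter values, chosen only for illustration.
times = [1.0, 2.0, 3.0, 4.0]
V = covariance_matrix(times, theta1=0.01, theta2=1.0, theta3=0.5)
```

Under this assumed form, θ1 scales a linear trend component, θ2 scales the smooth component, and θ3 controls how quickly the correlation between two time points decays with their distance.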
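Rasmussen and Williams (2006) have shown that the conditional predictive distribution of $f(t)$ given the data $D$, at a new input say $t_*$, is
$$f(t_*) \mid D \sim N\big(\mu_*, \sigma_*^2\big), \tag{7}$$
where $\mu_* = k_*^T(\Sigma + \sigma^2 I_n)^{-1} Y$, $\sigma_*^2 = C(t_*, t_*) - k_*^T(\Sigma + \sigma^2 I_n)^{-1} k_*$, $k_* = (C(t_*, t_1), \ldots, C(t_*, t_n))^T$, $\Sigma$ is the $n \times n$ matrix with entries $C(t_i, t_j)$, $Y = (Y_1, Y_2, \ldots, Y_n)^T$ and $I_n$ is the identity matrix of size $n$. So, once the parameters have been estimated, the following predictor is used to predict the value of $f(t)$ at the new input $t_*$:
$$\hat{f}(t_*) = k_*^T(\Sigma + \sigma^2 I_n)^{-1} Y, \tag{8}$$
with an uncertainty measured by
$$\hat{\sigma}_*^2 = C(t_*, t_*) - k_*^T(\Sigma + \sigma^2 I_n)^{-1} k_*. \tag{9}$$
The marginal distribution of the data $Y$ is an $n$-variate normal distribution with mean vector 0 and covariance matrix $\Sigma + \sigma^2 I_n$, i.e. $Y \sim N_n(0, \Sigma + \sigma^2 I_n)$. So the parameters of the model, which are denoted by $\theta_1$, $\theta_2$, $\theta_3$ and $\sigma^2$, are estimated by the Maximum Likelihood (ML) method, i.e. we find the values of $\theta_1$, $\theta_2$, $\theta_3$ and $\sigma^2$ which maximize the log-likelihood (10) associated with the marginal data $Y$ (Neal, 1996; Rasmussen and Williams, 2006):
$$\ell(\theta_1, \theta_2, \theta_3, \sigma^2) = -\tfrac{1}{2}\log\big|\Sigma + \sigma^2 I_n\big| - \tfrac{1}{2}\,Y^T(\Sigma + \sigma^2 I_n)^{-1} Y - \tfrac{n}{2}\log(2\pi). \tag{10}$$
In model (3), we assume that the drift function is a deterministic function of time. So, we treat this model using a Bayesian approach via the GPR method. The fitted version of the model is
$$\log\frac{Y_{t_i}}{Y_{t_{i-1}}} = \hat{f}(t_i)\,\Delta t + \hat{\sigma}\sqrt{\Delta t}\,\epsilon_{t_i}, \tag{11}$$
where $\hat{\sigma}$ is the maximum likelihood estimator of $\sigma$ and $\hat{f}(t_i)$ is the predicted value of the drift function at time point $t_i$, which can be obtained based on equation (8).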
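The predictor, its uncertainty and the log-likelihood can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the implementation used in the paper, which relies on the R package GPFDA. The covariance form θ1·s·t + θ2·exp(−θ3·(s − t)²) is an assumption, all function names and data values are hypothetical, and the hyperparameters are held fixed instead of being maximized numerically.

```python
import math


def cov(s, t, th):
    """Assumed covariance: th[0]*s*t + th[1]*exp(-th[2]*(s - t)**2)."""
    return th[0] * s * t + th[1] * math.exp(-th[2] * (s - t) ** 2)


def cholesky(a):
    """Lower-triangular L with L L^T = a, for symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def chol_solve(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_fit(times, y, th, noise_var):
    """Factor Sigma + noise_var*I and compute alpha = (Sigma + noise_var*I)^{-1} y."""
    n = len(times)
    a = [[cov(times[i], times[j], th) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(a)
    return low, chol_solve(low, y)


def gp_predict(t_star, times, low, alpha, th):
    """Predictive mean and variance of f(t_star) given the data."""
    k_star = [cov(t_star, t, th) for t in times]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = chol_solve(low, k_star)
    var = cov(t_star, t_star, th) - sum(k * w for k, w in zip(k_star, v))
    return mean, var


def log_marginal_likelihood(y, low, alpha):
    """Gaussian log-likelihood of y under N(0, Sigma + noise_var*I)."""
    n = len(y)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(yi * ai for yi, ai in zip(y, alpha))
    return -0.5 * log_det - 0.5 * quad - 0.5 * n * math.log(2.0 * math.pi)


# Hypothetical monthly log returns and hyperparameters (illustration only).
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0.02, 0.01, -0.01, -0.03, -0.02, 0.00, 0.02, 0.03]
th = (1e-4, 1e-3, 0.1)
low, alpha = gp_fit(times, y, th, noise_var=1e-4)
mean_star, var_star = gp_predict(9.0, times, low, alpha, th)
loglik = log_marginal_likelihood(y, low, alpha)
```

In practice the log-likelihood would be maximized over the hyperparameters with a numerical optimizer. The Cholesky factor is reused for the mean, the variance and the log-determinant, which avoids forming an explicit matrix inverse.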

DATA
We use monthly prices of gasoline octane-90 and octane-95 over the period January 2008 to December 2019². The prices are measured in Fils/Liter³. As noted earlier, at the end of 2010, when the government discontinued the monthly petroleum price adjustments and reintroduced petroleum subsidies, the prices remained constant for almost 16 months (Feb 2011 to May 2012). Therefore, to increase variability in the data, nominal prices were deflated by the Consumer Price Index (CPI) to obtain real figures. Many researchers traditionally transform energy data into logarithmic form (e.g. Mishra and Smyth, 2014); we followed this tradition in our analysis of the data. Table 1 shows the descriptive statistics for the monthly prices of Gasoline 90 and 95.
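The preprocessing described above can be sketched as follows. The snippet deflates nominal prices by a CPI series and computes log returns; the deflation convention (real price = nominal price × base / CPI), the function names and all numbers are hypothetical, since the exact convention and data are not given here.

```python
import math


def deflate(nominal_prices, cpi, base=100.0):
    """Real price = nominal price * base / CPI (assumed deflation convention)."""
    return [p * base / c for p, c in zip(nominal_prices, cpi)]


def log_returns(prices):
    """Differences of consecutive log prices."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]


# Hypothetical values: the nominal price is frozen for three months.
nominal = [600.0, 600.0, 600.0, 630.0]
cpi = [100.0, 101.0, 102.0, 103.0]
real = deflate(nominal, cpi)
returns = log_returns(real)
```

Even while the nominal price is frozen, the real series keeps changing because the CPI changes, which is the variability that the deflation step is meant to recover.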
In Figure 2, we display the price series for Octane 90 and Octane 95. It can be seen that, in general, the difference between the two prices increases with time. Figure 3 shows the histogram and the boxplot for each gasoline type. The distribution of octane 90 is less skewed than that of octane 95, and both distributions are left-skewed. The boxplots show that the middle part of the data has about the same variation for the two types, but octane 95 has less variation in the upper quarter, while octane 90 has less variation in the lower quarter.

MODEL ESTIMATION AND RESULTS
To test the validity of the underlying assumptions of the GBM model on gasoline prices, the GBM was applied to the log return of monthly data of gasoline Octane 90 and Octane 95 (Figure 4). It can be noticed from Figure 4 that the drift is not constant over time. Therefore, this motivates us to use a model with nonconstant drift.
² Prices are compiled from the Jordanian Ministry of Energy and Mineral Resources (MEMR).
³ The local currency used in Jordan is the Jordanian Dinar (JD). One JD equals
The log returns of the price series are fitted using model (3) via the Gaussian process regression method with ∆t=1. The R package for Gaussian Process Function Data Analysis (GPFDA) was used to obtain the predictions and the parameter estimates of model (3). The estimates of these parameters for both types of gasoline are shown in Table 2. It can be seen from Table 2 that the parameter estimates obtained from the Octane 90 data and those from the Octane 95 data are close to each other. This is expected, since both prices move together (the correlation coefficient of the two time series is 96.9% and 93.9% for nominal prices and real prices, respectively). The values of RMSE and MAPE are very small, which supports GPR as a good fitting model for both Octane 90 and Octane 95. Figures 5 and 6 show the log returns of the octane 90 and octane 95 series together with their fitted values, which were obtained using equation (11). It can be seen from Figures 5 and 6 that the GPR model fits the two price series very well.
In order to validate the model, we divided the data into two parts. The first part consists of the first 134 observations, which are used to estimate the parameters of the model, and the second part consists of the last 10 observations. These ten observations were predicted using equation (8), with parameters estimated from the first part of the data, and are displayed in Figure 7. The RMSE and the MAPE for the predicted values are computed using equations (12) and summarized in Table 3.
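The validation scheme can be sketched as below: the series is split into an estimation part and a hold-out part of the last ten observations, and the two accuracy measures are computed on the hold-out part. The formulas used are the standard textbook definitions of RMSE and MAPE, which are assumed here to match equations (12); the function names are hypothetical.

```python
import math


def split_series(series, n_test=10):
    """Return (estimation part, hold-out part) with the last n_test points held out."""
    return series[:-n_test], series[-n_test:]


def rmse(actual, predicted):
    """Root mean square error: sqrt of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# 144 monthly observations split into 134 for estimation and 10 for validation.
train, test = split_series(list(range(144)), n_test=10)
```

Because MAPE divides by the actual values, it becomes unstable when those values are close to zero, which is worth keeping in mind if it is applied to log returns rather than price levels.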

CONCLUSION
In this paper, we have assumed that gasoline prices can be modelled by a GBM model with an unknown deterministic drift function. The model is then treated from a Bayesian viewpoint using the GPR, and its parameters are estimated via the ML method. The numerical results show that the model accurately predicts gasoline prices.
As future work, one may extend the present model by placing a prior on the volatility parameter σ 2 , such as an inverse Gaussian or generalized inverse Gaussian distribution. Further, one may assume that the volatility parameter σ 2 varies with time, so that another Gaussian process prior may be placed on σ 2 (t), which gives a more general treatment of the GBM model.