
An ANFIS algorithm for improved forecasting of oil consumption: a case study of USA, Russia, India and Brazil


*Author for correspondence E-mail: aazadeh@ut.ac.ir


Ali Azadeh1*, Morteza Saberi2 and Sara Ghorbani3

1Department of Industrial Engineering and Center of Excellence for Intelligent Experimental Mechanics, Department of Engineering Optimization Research, College of Engineering, University of Tehran, Iran

2Department of Industrial Engineering, University of Tafresh, Member of Young Researcher Club of Islamic Azad University of Tafresh, Iran

3Department of Industrial and System Engineering, State University of New Jersey, Rutgers, USA

Received 04 March 2009; revised 06 January 2010; accepted 08 January 2010

This paper proposes an adaptive network-based fuzzy inference system (ANFIS) algorithm for oil consumption forecasting based on monthly oil consumption (January 2001 - September 2006) in USA, Russia, India and Brazil. The efficiency of different ANFIS models was examined using mean absolute percentage error (MAPE). The proposed algorithm uses the autocorrelation function (ACF) to define input variables instead of the trial and error method (TEM). The algorithm evaluates ANFIS performance through both its closed and open simulation abilities.

Keywords: Adaptive network based fuzzy inference system (ANFIS), Mean absolute percentage error (MAPE), Oil consumption estimation

Introduction

Because demand can change abruptly and uncontrollably across countries, estimating required oil production accurately is critical. Adaptive neuro-fuzzy inference system (ANFIS) combines the knowledge representation of fuzzy logic with the learning ability of an artificial neural network (ANN)1. ANFIS has shown significant results in modeling nonlinear functions and has been applied successfully in scientific and industrial research1,2. Traditional time series concepts have rarely been combined with neuro-fuzzy models through preprocessing and post processing; existing work is mostly limited to data preprocessing3-10. Also, selection of input variables in most heuristic methods is experimental or based on the trial and error method (TEM)3-5, 11-17. The fuzzy inference process refers to Mamdani’s fuzzy inference method18, which is similar to the Takagi-Sugeno method of fuzzy inference19.

This paper proposes a methodology that integrates conventional time series techniques with ANFIS for oil consumption forecasting with non-stationary data.

ANFIS for Oil Consumption Forecasting

Data Preprocessing

In forecasting models, a preprocessing method should make the process stationary and should allow preprocessed data to be transformed back into its original scale (post processing). The most useful preprocessing methods are as follows:

Normalization

Normalization is reported3-10 to be used for estimating time series functions with heuristic approaches. There are different normalization algorithms.

Min-Max Normalization

This method scales numbers in a data set to improve accuracy of subsequent numeric computations. Let $x_{old}$, $x_{max}$ and $x_{min}$ be the main value, maximum and minimum of raw data, respectively, and $x'_{max}$, $x'_{min}$ be the maximum and minimum of normalized data, respectively. Then the normalization of $x_{old}$, called $x'_{new}$, can be obtained as

$$x'_{new} = \frac{x_{old} - x_{min}}{x_{max} - x_{min}}\,(x'_{max} - x'_{min}) + x'_{min} \qquad \ldots(1)$$

In a second method (z-score normalization), data are changed so that their mean is 0 and variance is 1, as given by Eq. (2).

Sigmoidal Normalization

This method uses a sigmoid function to scale data in range of [-1, 1] as

$$x_{new} = \frac{1 - e^{-\alpha}}{1 + e^{-\alpha}}, \qquad \alpha = \frac{x_{old} - \text{mean}}{\text{std}} \qquad \ldots(3)$$
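The three normalization schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the population standard deviation is assumed for "std", and a constant series (zero range or zero std) is not handled. The last function shows the post processing step for min-max normalization, i.e. mapping values back to the original scale.

```python
import math
from statistics import mean, pstdev


def min_max_normalize(xs, new_min=0.0, new_max=1.0):
    """Eq. (1): rescale xs linearly into [new_min, new_max]."""
    x_min, x_max = min(xs), max(xs)
    return [(x - x_min) / (x_max - x_min) * (new_max - new_min) + new_min
            for x in xs]


def min_max_denormalize(ys, x_min, x_max, new_min=0.0, new_max=1.0):
    """Inverse of Eq. (1): map normalized values back to the raw scale."""
    return [(y - new_min) / (new_max - new_min) * (x_max - x_min) + x_min
            for y in ys]


def z_score_normalize(xs):
    """Zero mean, unit variance (population std is an assumption here)."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def sigmoid_normalize(xs):
    """Eq. (3): squash standardized values with a sigmoid; results lie between -1 and 1."""
    m, s = mean(xs), pstdev(xs)
    out = []
    for x in xs:
        alpha = (x - m) / s
        out.append((1 - math.exp(-alpha)) / (1 + math.exp(-alpha)))
    return out
```

Whichever method is chosen, the quantities needed for the inverse transform (here the raw minimum and maximum) have to be taken from the training data and stored, so that forecasts can be returned to the original scale.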

First Difference Method

First step in Box-Jenkins method20 is to transform data so as to make it stationary. In this method, transformation is applied as

$$y_t = x_t - x_{t-1} \qquad \ldots(4)$$

First difference of Logarithm

In this method, transformation is applied as

$$y_t = \log(x_t) - \log(x_{t-1}) \qquad \ldots(5)$$
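Both differencing transforms can be written directly from Eqs (4) and (5). The sketch below uses hypothetical function names and the natural logarithm (the base of the logarithm is an assumption; any fixed base only rescales the result). Note that the transformed series is one sample shorter than the input.

```python
import math


def first_difference(xs):
    """Eq. (4): y_t = x_t - x_(t-1)."""
    return [xs[t] - xs[t - 1] for t in range(1, len(xs))]


def log_first_difference(xs):
    """Eq. (5): y_t = log(x_t) - log(x_(t-1)); requires positive values."""
    return [math.log(xs[t]) - math.log(xs[t - 1]) for t in range(1, len(xs))]


# Example with three consecutive monthly values
print(first_difference([9170, 9160, 9260]))  # [-10, 100]
```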

Error Estimation Methods

Four basic error estimation methods [mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE)] are given in Eq. (6).

Among all these methods, MAPE is most suitable to estimate relative error because the input data are on different scales21-28.
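For reference, the four measures can be computed as follows. This is a sketch with hypothetical function names, using the standard definitions; MAPE is returned as a percentage, which matches how it is reported in the tables of this paper. MAPE is undefined when an actual value is zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

Because MAPE divides each error by the actual value, it stays comparable between a series measured in hundreds and one measured in thousands, which is the property the text relies on.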

Methodology

An algorithm was developed to model time series process. Neuro Fuzzy model was considered to determine impact of preprocessing on ANFIS model for estimation.

Proposed algorithm has following basic steps:

Step 1

Divide data into the following two sets: i) train data set8 (containing 80% or 90% of all data), for estimating the model; and ii) test data set, for evaluating validity of the estimated model. In this study, 63 samples were assigned to training and the remaining 6 samples were used to test the model's predicting capability.
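A minimal sketch of this split is shown below (the function name is hypothetical). It assumes the test set is the most recent part of the series, so the time order is preserved and no shuffling is done.

```python
def split_series(series, n_test):
    """Return (train, test) where test holds the last n_test samples."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


# 69 monthly samples split into 63 training and 6 test samples
train, test = split_series(list(range(69)), n_test=6)
print(len(train), len(test))  # 63 6
```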

Step 2

Stationary assumption should be studied for ANFIS model. If process is not covariance stationary, most suitable preprocessing method should be selected and applied to the model.

Step 3

Input variables for ANFIS models can be selected using the autocorrelation function (ACF). However, in most heuristic methods, selection of input variables is experimental or based on TEM3-5, 11-17. The importance of the ACF approach becomes clear when the difficulty and imprecision of TEM are considered. Irregular input selection is the cause of the lack of precision in TEM, and if all combinations of previous lags are tried, TEM becomes time-consuming.

For example, if all combinations are selected from the 12 most recent lags, the number of combinations will be

$$\sum_{i=1}^{12} \binom{12}{i} = 2^{12} - 1 = 4095 \qquad \ldots(7)$$

In contrast, the ACF approach introduces only a few combinations for model input.
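The size of the TEM search space can be checked with a short calculation. The sketch below counts the non-empty subsets of 12 candidate lags; counting the empty subset as well gives 2^12 = 4096.

```python
from math import comb

n_lags = 12

# Number of ways to choose at least one of the 12 lags as model input
non_empty_subsets = sum(comb(n_lags, i) for i in range(1, n_lags + 1))

# All subsets, including the empty one
all_subsets = 2 ** n_lags

print(non_empty_subsets, all_subsets)  # 4095 4096
```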

Step 4

Plausible architectures of ANFIS are constructed in this step. The ANFIS model is first run on the train data set and then on the test data set.

Step 5

Post processing the estimated preprocessed data is performed next.

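Error estimation measures of Eq. (6), where $x_t$ is the actual value, $x'_t$ is the estimated value and $n$ is the number of observations:

$$MAE = \frac{\sum_{t=1}^{n} |x_t - x'_t|}{n}, \qquad MSE = \frac{\sum_{t=1}^{n} (x_t - x'_t)^2}{n}, \qquad RMSE = \sqrt{\frac{\sum_{t=1}^{n} (x_t - x'_t)^2}{n}}, \qquad MAPE = \frac{1}{n}\sum_{t=1}^{n} \left|\frac{x_t - x'_t}{x_t}\right| \qquad \ldots(6)$$

Normalization to zero mean and unit variance, Eq. (2), where mean and variance are those of the raw data:

$$x_{new} = \frac{x_{old} - \text{mean}}{\sqrt{\text{variance}}} \qquad \ldots(2)$$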

Step 6

Predicting ability of model is evaluated in this step.

Furthermore, comparison of model with actual data is performed and relative error of model is calculated (MAPE). Efficiency of model is calculated in two ways (Fig. 1): 1) open simulation; and 2) closed simulation.

MAPE error for model in each way was calculated in present algorithm.

Step 7

Model parameters, which consist of the input membership functions and their number, are tuned by running the model several times to minimize MAPE error.

Open and Closed Simulation

ANFIS was trained by a time series y(t); t=1,…,m, using a set of lagged samples y(t–i) (the ith lag of y(t); i=1,…,k; 1< k <<m) as input (independent) variables. In a closed loop simulation, y(t+j); j >=1 is generated. To test the simulation power of an ANFIS, use samples m–k+1 to m and then let the model generate all succeeding samples, y(t); t=m+1, …, m+n. If the network is capable of regenerating data without any need for correction by measurements, it is validated as an acceptable ANFIS; otherwise, its structure and/or parameters should be modified. Closed simulation is preferable to open simulation because the former considers a more realistic situation than the latter.
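The difference between the two simulation modes can be sketched as follows. Here `model` is a hypothetical callable standing in for a trained ANFIS (it maps a list of lagged inputs to one output), and open loop simulation is assumed to mean one-step-ahead prediction in which every input is a measured value. In the closed loop version each prediction is appended to the history and reused as an input for later steps.

```python
def open_loop(model, series, lags, start, n_steps):
    """One-step-ahead forecasts; inputs always come from measured data."""
    if start < max(lags) or start + n_steps > len(series):
        raise ValueError("not enough measured samples for the requested range")
    return [model([series[t - k] for k in lags])
            for t in range(start, start + n_steps)]


def closed_loop(model, history, lags, n_steps):
    """Multi-step forecasts; each prediction is fed back as a future input."""
    if len(history) < max(lags):
        raise ValueError("history is shorter than the largest lag")
    buf = list(history)
    out = []
    for _ in range(n_steps):
        y = model([buf[-k] for k in lags])
        buf.append(y)  # the prediction replaces the unavailable measurement
        out.append(y)
    return out


# Toy stand-in for a trained model: previous value plus one
toy_model = lambda inputs: inputs[0] + 1
series = [0, 1, 2, 3, 5, 8]
print(open_loop(toy_model, series, lags=[1], start=3, n_steps=3))  # [3, 4, 6]
print(closed_loop(toy_model, series[:3], lags=[1], n_steps=3))     # [3, 4, 5]
```

In the toy run the open loop forecasts follow the measured jumps in the series, while the closed loop forecasts depend only on the first three samples, so errors can accumulate from step to step.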

Case Studies

Proposed algorithm is applied to four data sets related to four countries (Russia, India, Brazil and USA).

There are 69 rows of data (monthly oil consumption from January 2001 to September 2006) for each country. Each step of algorithm is explained as follows:

Step 1

Raw data (Figs 2-5) are divided into training data (63) and test data (6). Preprocessed data are likewise divided into training data and test data.

Fig. 1: Close and open loop simulation: a) close loop simulation; b) open loop simulation (ANFIS with lagged inputs y(t-1), y(t-2), … and output y(t))

Step 2

It can be seen that raw data based on Russia (Fig. 2), Brazil (Fig. 3) and USA (Fig. 4) have a trend, whereas India raw data (Fig. 5) has no trend. As removal of trend is needed for more precise estimation in time series methods and also for studying impact of preprocessing on ANFIS, all preprocessing methods were applied on Russia, Brazil and USA raw data.

Finally, best preprocessing method for data, which can make the used model covariance stationary, was selected.

For Russia (Fig. 2), Brazil (Fig. 3), and USA (Fig. 4), the differenced series appears to have a constant mean and stationary variance, except for two data values. The India data set (Fig. 5) needs no preprocessing because it already has a constant mean and stationary variance.

Step 3

For the ANFIS model, input variables are selected by using ACF, as available in EViews software. Lags with ACF values (>0.2) were considered as follows (Fig. 6):

i) Russia: by using ACF on preprocessed data, input variables have been reduced to two lags (5th and 12th months); ii) Brazil: by using ACF on preprocessed data, input variables have been reduced to two lags (8th and 12th months); iii) USA: by using ACF on preprocessed data, input variables have been reduced to two lags (4th and 12th months); and iv) India: by using ACF on actual data, input variables have been reduced to four lags (1st, 2nd, 3rd and 4th months).
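The lag selection rule can be illustrated with a plain sample autocorrelation. The study itself used EViews; the estimator below (the usual biased sample ACF) and the function names are assumptions made for illustration. The rule is applied literally as "ACF value above the threshold"; a variant that compares the absolute value would also pick up strongly negative lags.

```python
def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    m = sum(xs) / n
    denom = sum((x - m) ** 2 for x in xs)
    num = sum((xs[t] - m) * (xs[t - lag] - m) for t in range(lag, n))
    return num / denom


def select_lags(xs, max_lag=12, threshold=0.2):
    """Return the lags whose ACF value exceeds the threshold."""
    return [k for k in range(1, max_lag + 1)
            if autocorrelation(xs, k) > threshold]


# Synthetic series with period 4: only multiples of 4 are selected
demo = [1, 0, -1, 0] * 6
print(select_lags(demo))  # [4, 8, 12]
```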

Fig. 2: Raw and preprocessed data charts for Russia: a) raw data for Russia; b) preprocessed data for Russia (first difference method); x-axis: months

Step 4

In this step, model was run for preprocessed test data set for Russia, Brazil, USA and actual data for India.

Output of each country was obtained in this part.

Step 5

Since data were preprocessed for the ANFIS models of some countries, the estimated data obtained by these models should be post processed. Estimated data that were not preprocessed need no post processing. As estimated data were obtained for the test data set, only test data were post processed. For preprocessed data, let x-old be the actual data (last data of the train set), y-new(i) be the ith estimated data (difference between actual data), and x-new(i) be the ith post processed data. Then

x-new(1)=x-old+y-new(1) …(8)

x-new(i)=x-new(i-1)+y-new(i) for i=2,..,N, where N is the number of test data …(9)
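Equations (8) and (9) amount to a running sum that starts from the last training value. A minimal sketch, with a hypothetical function name:

```python
def undo_first_difference(last_train_value, predicted_diffs):
    """Eqs (8)-(9): rebuild levels from predicted first differences."""
    levels = []
    level = last_train_value
    for d in predicted_diffs:
        level = level + d  # x-new(i) = x-new(i-1) + y-new(i)
        levels.append(level)
    return levels


print(undo_first_difference(100, [5, -3, 2]))  # [105, 102, 104]
```

Because each restored value depends on the previous one, an error in an early predicted difference carries over to all later post processed values.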

Step 6

In this step, models were run and tuned, and the efficiency of each model was evaluated. MAPE error was calculated in open and closed simulation. The parameters of the ANFIS model are the membership function type (mf type) and the number of membership functions (numMf). As numMf decreases, MAPE error increases. The default membership function is usually bell-shaped (gbell, the generalized bell-shaped built-in membership function).
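The two membership function shapes used in the tuning process can be written as below. These are the standard generalized bell and triangular forms; the parameter names (a, b, c) follow common convention and are not taken from the paper.

```python
def gbell(x, a, b, c):
    """Generalized bell: 1 / (1 + |(x - c) / a|^(2b)); c is the center, a the width."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def triangle(x, a, b, c):
    """Triangular: rises from a to the peak at b, falls to c (a < b < c)."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


print(gbell(3.0, a=2.0, b=4.0, c=3.0))     # 1.0 at the center
print(gbell(5.0, a=2.0, b=4.0, c=3.0))     # 0.5 at distance a from the center
print(triangle(0.5, a=0.0, b=1.0, c=2.0))  # 0.5 halfway up the left side
```

With numMf membership functions per input and a grid partition, the number of fuzzy rules is the product of the numMf values over all inputs, so the tuning process trades model flexibility against the number of parameters to fit from 63 training samples.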

Models were run by applying several combinations of these two parameters and MAPE error was calculated for each model. According to least MAPE error, best parameter was selected for model for each country.

Fig. 3: Raw and preprocessed data charts for Brazil: a) raw data for Brazil; b) preprocessed data for Brazil (first difference method); x-axis: months

Fig. 4: Raw and preprocessed data charts for United States: a) raw data for United States; b) preprocessed data for United States (first difference method); x-axis: months

Fig. 5: Raw data for India (it needs no preprocessing); x-axis: months

For Russia, best MAPE error was obtained by running model with 3 numMf for each input variable and applying triangle and bell-shape membership function (Tables 1 & 5). For Brazil, best MAPE error

Fig. 6: Input selection for ANFIS models: a) ACF for Russia and input variable selection; b) ACF for India and input variable selection; c) ACF for Brazil and input variable selection; d) ACF for United States and input variable selection

Table 1: ANFIS tuning process for Russia

Iteration  Mf type for    Mf type for     NumMf for      NumMf for       MAPE by open    MAPE by closed
           first input    second input    first input    second input    simulation, %   simulation, %
           variable       variable        variable       variable
1          triangle       triangle        4              3               100.93          -
2          gbell          bell            3              3               10.5            -
3          gbell          triangle        3              3               1.97            7.53

Table 2: ANFIS tuning process for Brazil

Iteration  Mf type for    Mf type for     NumMf for      NumMf for       MAPE by open    MAPE by closed
           first input    second input    first input    second input    simulation, %   simulation, %
           variable       variable        variable       variable
1          triangle       triangle        3              3               24.76           27
2          gbell          triangle        3              3               40.74           23
3          gbell          gbell           3              3               70.46           19
4          gbell          gbell           3              2               61              18
5          triangle       triangle        2              2               2.68            2.68

was obtained by running model with 2 numMf for each input variable and applying bell-shape membership function (Tables 2 & 5). In this case, for both open simulation and closed simulation, MAPE error is identical because the estimated data were obtained by using the train data in both ways. For USA, best MAPE error was obtained by running model with 3 numMf for each input variable and applying triangle membership function (Tables 3 & 5). For India, best MAPE error was obtained by running model with 3 numMf for each input variable and applying bell-shape membership function (Tables 4 & 5).

Step 7

MAPE estimation is as follows: India, 3.30%; Russia, 1.97%; USA, 2.19%; and Brazil, 2.68%.

In general, ANFIS is used when sample space is limited or small in comparison to conventional methods.

If MAPE reported by ANFIS is relatively small, it indicates acceptability and reasonability. Moreover, in actual case of this study (69 sets of data for monthly oil consumption), a very small value of MAPE is reported (1.97-3.3%). Selection of 69 sets of data is due to lack of data from standard sources for selected countries of this study. ANFIS is capable of dealing both data

Table 3: ANFIS tuning process for USA

Iteration  Mf type for    Mf type for     NumMf for      NumMf for       MAPE by open    MAPE by closed
           fourth input   twelfth input   fourth input   twelfth input   simulation, %   simulation, %
           variable       variable        variable       variable
1          gbell          gbell           2              2               32.32           -
2          triangle       gbell           2              2               6.99            -
3          triangle       gbell           3              3               5.03            -
4          triangle       triangle        3              3               2.19            2.13

Table 4: ANFIS tuning process for India

Iteration  Mf type    Mf type    Mf type    Mf type    NumMf    NumMf    NumMf    NumMf    MAPE by open    MAPE by closed
           first      second     third      fourth     first    second   third    fourth   simulation, %   simulation, %
           input      input      input      input      input    input    input    input
1          triangle   triangle   triangle   triangle   3        2        2        3        220.68          -
2          gbell      gbell      triangle   gbell      3        3        3        3        10.85           -
3          gbell      triangle   gbell      gbell      3        2        3        3        38.04           -
4          gbell      gbell      gbell      triangle   3        3        3        3        29.75           -
5          gbell      triangle   gbell      triangle   3        3        2        3        55.76           -
6          gbell      gbell      gbell      gbell      3        3        3        3        4.14            8.52

Table 5: Comparison of output of ANFIS and original test data

Country  Series        Month 1  Month 2  Month 3  Month 4  Month 5  Month 6
Russia   Actual data   9170     9160     9260     9260     9330     9280
         ANFIS         9029     9054     9037     9095     9052     9101
Brazil   Actual data   1733     1703     1725     1630     1748     1737
         ANFIS         1654     1661     1674     1695     1720     1717
USA      Actual data   5120     5158     5171     5219     5100     5067
         ANFIS         5105     5089     4940     5041     5084     4974
India    Actual data   650      691      704      689      685      683
         ANFIS         656      643      711      688      620      692
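As a worked check, MAPE can be recomputed from the rows of Table 5. The sketch below does this for the Russia and India rows only; the helper name is hypothetical and the values are copied from the table.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


russia_actual = [9170, 9160, 9260, 9260, 9330, 9280]
russia_anfis = [9029, 9054, 9037, 9095, 9052, 9101]
india_actual = [650, 691, 704, 689, 685, 683]
india_anfis = [656, 643, 711, 688, 620, 692]

mape_russia = mape(russia_actual, russia_anfis)
mape_india = mape(india_actual, india_anfis)
print(round(mape_russia, 2), round(mape_india, 2))  # 1.97 3.3
```

Both values agree with the MAPE figures reported in the text for Russia (1.97%) and India (3.30%).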


complexity and ambiguity due to its intelligent mechanism. Although some features (global and/or regional economic issues, even political ones) may have important impacts on oil consumption, this study did not consider global, regional and political issues; it considered engineering and economic aspects of oil consumption. Moreover, the low error of the integrated ANFIS algorithm (without political or global issues) showed the suitability of the selected data.

Conclusions

An algorithm was developed to model time series processes. A non-covariance stationary process was converted to covariance stationary process by a suitable data preprocessing method. Model was run for three set data and MAPE error for each country (Russia, Brazil, USA, India) was calculated. Comparison of MAPE errors showed high efficiency of model. In order to extend proposed model, seasonal changes from data can be removed and compared with two ANFIS models.

Moreover, comparison of proposed algorithm with some of the conceptual methods shows its advantages over previous models (Table 6).

References

1 Edwin R D J & Kumanan S, ANFIS for prediction of weld bead width in a submerged arc welding process, J Sci Ind Res, 66 (2007) 335-338.

2 Gokdag M, Hasiloglu A S, Karsli N, Atalay A & Akbas A, Modeling of vehicle delays at signalized intersection with an adaptive neuro-fuzzy (ANFIS), J Sci Ind Res, 66 (2007) 736-740.

3 Nayak P C, Sudheer K P, Rangan D M & Ramasastri K S, A neuro-fuzzy computing technique for modeling hydrological time series, J Hydrol, 291 (2004) 52-66.

4 Karunasinghe D S K & Liong S Y, Chaotic time series prediction with a global model artificial neural network, J Hydrol, 323 (2006) 92-105.

5 Tseng F M, Yu H C & Tzeng G H, Combining neural network model with seasonal time series ARIMA model, Technol Forecasting & Social Change, 69 (2002) 71-87.

6 Oliveira A L I & Meira S R L, Detecting novelties in time series through neural networks forecasting with robust confidence intervals, Neurocomputing, 70 (2006) 79-92.

7 Niska H, Hiltunen T, Karppinen A, Ruuskanen J & Kolehmainen M, Evolving the neural network model for forecasting air pollution time series, Engg Appl Artific Intell, 17 (2004) 159-167.

8 Aznarte J L, Sanchez J M B, Lugilde D N, Fernandez C D L, Guardia C D & Sanchez F A, Forecasting airborne pollen concentration time series with neural and neuro-fuzzy models, Expert Systems with Applications, 32 (2007) 1218-1225.

9 Gareta R, Romeo L M & Gil A, Forecasting of electricity prices with neural networks, Energy Convers Manage, 47 (2006) 1770-1778.

10 Jain A & Kumar A M, Hybrid neural network models for hydrologic time series forecasting, Appl Soft Compu (in press).

11 Hwarng H B, Insights into neural-network forecasting of time series corresponding to ARMA (p; q) structures, Omega, 29 (2001) 273-289.

12 Zhang G P & Qi M, Neural network forecasting for seasonal and trend time series, Eur J Operat Res, 160 (2005) 501-514.

13 Zhang G P, An investigation of neural networks for linear time-series forecasting, Compu & Operat Res, 28 (2001) 1183-1202.

14 Palmer A, Montano J J & Sese A, Designing an artificial neural network for forecasting tourism time series, Tourism Manage, 27 (2006) 781-790.

15 Kim T Y, Oh K J, Kim C & Do J D, Artificial neural networks for non-stationary time series, Neurocomputing, 61 (2004) 439-447.

Table 6: ANFIS approach versus other methods

Features compared (columns): Data complexity and non-linearity; Data uncertainty and non-crisp data set; Intelligent modeling and forecasting; Fuzzy data modeling; High precision and reliability; Dealing ambiguity; Data pre-processing and post-processing

Methods compared (rows): ANFIS approach; ANN; Fuzzy regression; Linear regression; Nonlinear regression; Decision tree; Genetic algorithm

[Table cell marks not preserved]

16 … neural network model for forecasting time series events, Int J Forecast, 21 (2005) 341-362.

17 Al-Saba T & El-Amin I, Artificial neural networks as applied to long-term demand forecasting, Artific Intell Engg, 13 (1999) 189-197.

18 Mamdani E H & Assilian S, An experiment in linguistic synthesis with a fuzzy logic controller, Int J Man-Machine Stud, 7 (1975) 1-13.

19 Takagi T & Sugeno M, Fuzzy identification of systems and its application to modeling and control, IEEE Trans Syst, Man & Cybernetics, 15 (1985) 116-132.

20 Box G E P & Jenkins G M, Time series analysis: Forecasting and Control (Holden Day, San Francisco) 1970 (rev edn 1976).

21 Azadeh A & Tarverdian S, Integration of genetic algorithm, computer simulation and design of experiment for forecasting electrical consumption, JEPO, 35 (2007) 5229-5241.

22 Azadeh A, Saberi M, Ghaderi S F, Gitiforouz A & Ebrahimipour V, Improved estimation of electricity demand function by integration of fuzzy system and data mining approach, Energy Convers & Manage, 49 (2008) 2165-2177.

23 Azadeh A, Ghaderi S F, Anvari M & Saberi M, Measuring performance electric power generations using artificial neural net… Industrial Electronics Society, edited by G A Capolino & L G Franquelo (IECON, Paris, France) 2006.

24 Azadeh A, Ghaderi S F, Anvari M & Saberi M, Performance assessment of electric power generations using an adaptive neural network algorithm, JEPO, 35 (2007) 3155-3166.

25 Azadeh A, Ghaderi S F, Anvari M, Saberi M & Izadbakhsh H, An integrated artificial neural network and fuzzy clustering algorithm for performance assessment of decision making units, Appl Math & Compu, 187 (2007) 584-599.

26 Azadeh A, Ghaderi S F & Sohrabkhani S, Forecasting electrical consumption by integration of neural network, time series and ANOVA, Appl Math & Compu, 186 (2007) 1753-1761.

27 Azadeh A, Ghaderi S F, Tarverdian S & Saberi M, Integration of artificial neural networks and GA to predict electrical energy consumption, in Proc 32nd Annual Conf IEEE Industrial Electronics Society, edited by G A Capolino & L G Franquelo (IECON, Paris, France) 2006.

28 Azadeh A, Ghaderi S F, Tarverdian S & Saberi M, Integration of artificial neural networks and genetic algorithm to predict electrical energy consumption, Appl Math & Compu, 187 (2007) 1731-1741.
