
*For correspondence. (e-mail: dsena.mtech2013.ee@nitrr.ac.in)



A time-series forecasting-based prediction model to estimate groundwater levels in India

Debasish Sena* and Naresh Kumar Nagwani

National Institute of Technology, Raipur 492 010, India

India is one of the fastest-developing countries in the world, with a growth rate of 6.4%. Rapid industrialization is the main cause of this growth. Although industrialization is of utmost importance for growth, ecological sustainability is also a matter of concern. India has a vast coastline, but saline water is not suitable for industrial use, so groundwater is the primary source for both industry and human consumption. Agriculture plays a major role in India's economy, and irrigation also depends on groundwater to some extent. Hence the study of groundwater levels is the need of the hour. In this study, time-series techniques, namely fuzzy time-series analysis and ARIMA, are used to forecast monthly groundwater levels. Experiments are performed on datasets collected from different regions of India. The experimental results demonstrate that fuzzy time-series analysis yields more accurate forecasts of groundwater levels than the ARIMA model. The results of this study can be used to plan a suitable policy for groundwater use and its proper regulation to avoid a future crisis.

Keywords: Fuzzy logic, groundwater level, prediction models, time-series forecasting.

GROUNDWATER is a major resource in our country. In fact, India tops the list of groundwater-abstracting countries. Groundwater is essential for the sustainability of the ecosystem; it provides stream water during drought conditions. Given the effects of climate change, land-use change and global environmental changes such as changes in precipitation, rising temperature and the growing demand for groundwater caused by population growth, it is important to assess these effects (ref. 1). Water being a dynamic resource, its storage changes continuously, either through recharge from various sources or through discharge due to extraction or natural basin outflow. Hence periodic monitoring of groundwater levels is imperative for planning the systematic development and management of groundwater resources (ref. 2).

Groundwater-level prediction in India is of utmost importance, as our large population depends heavily on groundwater for daily consumption. Groundwater is also heavily used for both irrigation and industry in India, and a lot of it is wasted because of faulty irrigation systems. Prediction of groundwater levels is the need of the hour to avoid a future crisis. Earth science data such as groundwater levels are large and complex and often represent a time series, making them difficult to analyse (ref. 3). A wide variety of data mining, machine learning and information-theoretic approaches are applicable to groundwater-level data. Artificial neural networks and model tree ensemble methods have generally been employed for predicting groundwater levels (ref. 3). This communication focuses on the auto-regressive integrated moving average (ARIMA) model, built using the Box–Jenkins methodology, and on fuzzy time-series analysis for forecasting groundwater level.

In the past, various approaches have been suggested for predicting water level, including physical and statistical models. However, none of them is considered the best because of the high degree of uncertainty and the time-varying characteristics of the hydro-system. van de Weg (ref. 4) designed principal component analysis (PCA) and neural network models for predicting the water level at Hoek van Holland during storm situations. However, the PCA method has the disadvantage that the covariance matrix is difficult to calculate accurately (ref. 5). Chang and Chang (ref. 6) proposed an adaptive neuro-fuzzy inference system for forecasting the water level in reservoirs. The main issue with fuzzy logic is the lack of a systematic method for designing the controller, while a standard neural network needs a lot of computational resources to implement fully. Scitovski et al. (ref. 7) exploited the periodicity of water-level behaviour, using trigonometric regression for long-term forecasting and the nearest neighbour method for short-term forecasting. Forecasting of groundwater level using conceptual physical models has also been proposed, but these models are constrained by too many dependent variables (ref. 8). Wang and Zhao (ref. 8) proposed a hybrid model combining a genetic algorithm and a wavelet network model, but the genetic algorithm does not guarantee a globally optimal solution.

The ARIMA model and the fuzzy time-series model used in this study are built from past data and a random noise component through mathematical manipulation, and can predict groundwater levels more accurately. As fuzzy time-series analysis provides a better result than ARIMA, the ARIMA model can be used in combination with various parametric and non-parametric forecasting methods. Artificial neural networks (ANN), k-nearest neighbour (k-NN), Markov chains, etc. can be used with the ARIMA model to reduce the forecasting error. ARIMA forecast models are usually governed by three components: the variables of the model, the coefficients of the variables and some unobserved errors or random shocks (ref. 9). All three components contribute to the uncertainty of forecasting. In this communication, all three components have been analysed thoroughly through experiments. The effect of temporal aggregation on ARIMA processes has been discussed by Stram and Wei (ref. 10). Fuzzy time-series analysis of the groundwater-level dataset follows the work of Song and Chissom (ref. 11). Trend analysis of pre- and post-monsoon groundwater levels has been performed by Gokhale and Sohoni (ref. 12). The ARIMA model accounts for the seasonal variation of groundwater level in addition to the trend component.

For the ARIMA model, monthly groundwater-level data have been collected from the Groundwater Information System of the Central Ground Water Board (CGWB), Ministry of Water Resources, Government of India (GoI) (ref. 13). In this study the ARIMA model is used to predict future groundwater levels following the Box–Jenkins methodology. As multiple ARIMA models can be proposed for a single dataset, a suitable model has been chosen by examining criteria such as the mean square error (MSE) and the mean magnitude of relative error (MMRE).

The Box–Jenkins methodology applies to stationary time series, i.e. series with a stable mean, variance and autocorrelation over time (ref. 14). For a stationary series, the correlogram dies down rapidly, or stays above the significance level for only four to five lags. One way of removing non-stationarity from a time series is to apply the difference operation to it.

The first-order differencing is expressed as

$$\nabla X_t = X_t - X_{t-1},$$

where $X_t$ is the value of the time-series variable at time $t$, $X_{t-1}$ the value of the time-series variable at time $t - 1$ and $\nabla X_t$ the first-differenced time-series value. Likewise, second-order differencing is expressed as $\nabla^2 X_t = \nabla X_t - \nabla X_{t-1}$. In most cases, differencing up to the second order is performed for a time series. The backshift operator $B$, defined by $B X_t = X_{t-1}$, is often used to write these equations compactly: the first-order difference operation is expressed as $\nabla X_t = (1 - B)X_t$ and the second-order difference operation as $\nabla^2 X_t = (1 - B)^2 X_t$.
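As an illustration of the difference operation, the following Python sketch applies $(1 - B)$ repeatedly to a list of values. The function name `difference` and the sample values are illustrative assumptions; they are not taken from the groundwater dataset.

```python
def difference(series, order=1):
    """Apply the difference operator (1 - B) `order` times."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces X_t by X_t - X_{t-1} and shortens the list by one.
        values = [values[i] - values[i - 1] for i in range(1, len(values))]
    return values


levels = [5.0, 5.5, 6.5, 8.0]      # made-up values for illustration only
first = difference(levels, 1)      # first-order differences
second = difference(levels, 2)     # second-order differences
```

Each order of differencing shortens the series by one observation, which is worth keeping in mind for short monthly records.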

A time series generally consists of two parts: a deterministic part representing the time-series values and a white noise part induced implicitly. The ARIMA model includes both. The auto-regressive part represents the deterministic component and determines how the data values of a time series regress upon themselves. The moving-average part corresponds to the memory of the time series for the preceding random noise components (ref. 23). The integrated part represents the degree of differencing needed to convert the time series into a stationary one.

An auto-regressive model of order p (AR(p)) describes how the current time-series value is regressed upon the p past time-series values. An AR(p) model can be represented mathematically as

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \varepsilon_t, \qquad (1)$$

where $\{X_i\}$ are the time-series values at instance $i$, $\{\phi_i\}$ are the auto-regressive parameters and $\varepsilon_t$ is the white noise component at instance $t$. Equation (1) can be written in terms of the backshift operator as

$$\phi_p(B) X_t = \varepsilon_t, \qquad (2)$$

where $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ is the AR characteristic polynomial evaluated at $B$ (ref. 16).
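To make the auto-regressive recursion concrete, the sketch below evaluates the deterministic part of an AR(p) model, i.e. the weighted sum of the p most recent values. The function name `ar_one_step` and the coefficients are hypothetical and serve only as an example.

```python
def ar_one_step(history, phi):
    """One-step-ahead AR(p) prediction: sum of phi_i * X_{t-i}, i = 1..p.

    history : past values, oldest first (history[-1] is X_{t-1}).
    phi     : auto-regressive parameters [phi_1, ..., phi_p].
    """
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    return sum(phi[i] * history[-1 - i] for i in range(p))


history = [1.0, 2.0, 3.0]               # illustrative values only
phi = [0.5, 0.25]                       # illustrative AR(2) parameters
prediction = ar_one_step(history, phi)  # 0.5 * 3.0 + 0.25 * 2.0
```

The white noise term has zero mean, so it does not appear in the point prediction.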

A moving-average model of order q (MA(q)) expresses the current value of the time series as a linear regression on the q previous white noise values (ref. 15). Mathematically, an MA(q) model can be expressed as

$$X_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}, \qquad (3)$$

where $\{\theta_i\}$ are the moving-average parameters. Equation (3) can be written in terms of the backshift operator as

$$X_t = \theta_q(B)\varepsilon_t, \qquad (4)$$

where $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$ is the MA characteristic polynomial evaluated at $B$ (ref. 16).

A time series differenced $d$ times can be expressed in terms of the backshift operator $B$ as $(1 - B)^d X_t$. An ARIMA(p, d, q) model is therefore a combination of an auto-regressive model of order p and a moving-average model of order q applied to a d-times differenced time series. Mathematically, an ARIMA(p, d, q) model can be expressed as

$$\phi_p(B)(1 - B)^d X_t = \theta_q(B)\varepsilon_t. \qquad (5)$$

The values of $p$, $d$, $q$, $\{\phi_i\}$ and $\{\theta_i\}$ can be determined by building appropriate ARIMA models.

When the ARIMA model is applied to a time series, differencing is first performed to convert the series into a stationary one, and an auto-regressive moving average (ARMA) model is then applied to the differenced series, as in eq. (5). Sometimes the time series displays seasonality, i.e. the dependence on past data is prominent at multiples of some seasonal lag $s$. The ARIMA model for such a time series comprises a seasonal auto-regressive component and a seasonal moving-average component applied to a seasonally differenced time series. The model is referred to as ARIMA(P, D, Q)$_s$ and is expressed as

$$\Phi_P(B^s)(1 - B^s)^D X_t = \Theta_Q(B^s)\varepsilon_t, \qquad (6)$$

where $\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$ and $\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \cdots - \Theta_Q B^{Qs}$ are the seasonal AR and MA operators of orders $P$ and $Q$ respectively, with seasonal lag $s$ (ref. 27). In general, the seasonal and non-seasonal operators can be combined into a multiplicative seasonal ARIMA model, denoted by SARIMA(p, d, q)(P, D, Q)$_s$ and expressed as

$$\phi_p(B)\Phi_P(B^s)(1 - B)^d(1 - B^s)^D X_t = \theta_q(B)\Theta_Q(B^s)\varepsilon_t. \qquad (7)$$

For a given time series, the order of differencing d is determined first. Then the order of auto-regression p and the order of moving average q are determined. Several sets of p, q, d, P, Q and D may be possible for a particular time series. To derive a suitable model, three steps (ref. 20) are followed according to the Box–Jenkins methodology: (a) model structure identification; (b) parameter estimation and calibration; (c) validation or model testing.

In model structure identification, the order of auto-regression (p), order of seasonal auto-regression (P), order of moving average (q), order of seasonal moving average (Q), order of differencing (d) and seasonal order of differencing (D) are estimated. The order of differencing is the number of difference operations (time-period changes) applied to the time series to make it stationary. The order of auto-regression is determined from the number of significant partial auto-correlation values, and the order of moving average from the number of significant auto-correlation values, with the auto-correlation function either decaying exponentially or following a damped sine wave. Once a stationary time series has been obtained, the model can be identified from the theory (refs 20, 21) summarized in Table 1.

Seasonal differencing (D) is indicated by a correlogram that decays gradually at multiples of the seasonal period but is negligible between consecutive periods (ref. 17). The seasonal auto-regression order (P) is the number of significant partial auto-correlation values occurring at seasonal lags, and the seasonal moving-average order (Q) is the number of significant auto-correlation values occurring at seasonal lags. The seasonality orders can be identified from Table 2.

In parameter estimation and calibration, after the order of auto-regression (p) and the order of moving average (q) have been determined, the next task is to estimate the auto-regressive parameters $\{\phi_i\}$ and the moving-average parameters $\{\theta_i\}$. The auto-regressive parameters can be determined from the Yule–Walker equations (refs 21, 22), and the moving-average parameters can be determined using the equation

$$\rho_k = \frac{-\theta_k + \theta_1\theta_{k+1} + \theta_2\theta_{k+2} + \cdots + \theta_{q-k}\theta_q}{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2}, \qquad (8)$$

for k = 1, 2, ..., q, where $\rho_k$ is the auto-correlation at lag $k$. Various tools are available for parameter estimation, such as Marquardt's algorithm, the 'armax' toolbox in MATLAB and the 'arima' function in R (ref. 23). Maximum likelihood estimation is also used for estimating parameters (ref. 24).
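One standard way of solving the Yule–Walker equations is the Levinson–Durbin recursion, which also yields the partial auto-correlations used for order identification. The sketch below is a minimal pure-Python version that assumes the auto-correlations are already available. It is not the estimation routine used in this study, and the function name `yule_walker` is hypothetical.

```python
def yule_walker(rho, p):
    """Solve the Yule-Walker equations for AR(p) by Levinson-Durbin.

    rho : autocorrelations, rho[0] == 1 and rho[k] the value at lag k.
    Returns (phi, pacf): the AR parameters phi_1..phi_p and the
    partial autocorrelations at lags 1..p.
    """
    phi = []    # parameters of the AR(k-1) fit from the previous step
    pacf = []
    for k in range(1, p + 1):
        num = rho[k] - sum(phi[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order parameters, then append the new one.
        phi = [phi[j] - phi_kk * phi[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return phi, pacf


# Autocorrelations of an exact AR(1) process with parameter 0.6.
rho = [0.6 ** k for k in range(4)]
phi, pacf = yule_walker(rho, 3)
```

For this AR(1) input the recursion recovers a first parameter of 0.6 and partial auto-correlations that vanish beyond lag 1, which is the cut-off behaviour summarized in Table 1.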

Table 1. Behaviour of ACF and PACF for ARMA models

| Model | ACF | PACF |
|---|---|---|
| AR(p) | Dies down | Cuts off after p lags |
| MA(q) | Cuts off after q lags | Dies down |
| ARMA(p, q) | Dies down | Dies down |

Table 2. Behaviour of ACF and PACF for seasonal ARMA models

| Model | ACF* | PACF* |
|---|---|---|
| AR(P)s | Tails off at lags ks, k = 1, 2, ... | Cuts off after lag Ps |
| MA(Q)s | Cuts off after lag Qs | Tails off at lags ks, k = 1, 2, ... |
| ARMA(P, Q)s | Tails off at lags ks | Tails off at lags ks |

*Values at non-seasonal lags h ≠ ks, for k = 1, 2, ..., are zero.

As there are many possible models, choosing the appropriate one is of utmost importance in time-series analysis. Model testing and validation are used to validate the proposed model. Among the candidate models, the one best suited to the time series is determined by maximum likelihood estimation (MLE) (ref. 22), MSE (ref. 23) or MMRE criteria. Criteria such as AIC, AICc and BIC are also used to decide on a suitable model for a particular time series (ref. 17). The MLE method is more often used for long-term data generation, whereas the MSE and MMRE methods are advisable for short-term forecasting. In this study, MMRE was calculated for each model and the one with the minimum MMRE was selected as the best model for forecasting. Residual analysis was also performed to check the fit of the model.
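The two error measures are not written out in the text. Under their usual definitions, which are assumed here, with $y_i$ the actual value, $\hat{y}_i$ the forecast and $n$ the number of forecasts compared, they read:

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{MMRE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|y_i - \hat{y}_i\right|}{\left|y_i\right|}.
```

Multiplying the MMRE by 100 gives a percentage error.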

Fuzzy time-series analysis is a recent forecasting technique based on fuzzy set theory. The drawback of conventional set theory is that many real-world concepts cannot be described simply by membership or non-membership of a set. Fuzzy set theory addresses this limitation by allowing degrees of membership.

Let $U$ be the universe of discourse divided into $n$ intervals as $U = \{u_1, u_2, \ldots, u_n\}$, where $u_i$ is an interval in the universe of discourse $U$. A fuzzy set $A_i$ of $U$ is defined as

$$A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + \cdots + f_{A_i}(u_n)/u_n,$$

where $f_{A_i}$ is the membership function of the fuzzy set $A_i$, $f_{A_i}: U \to [0, 1]$; $u_k$ is an element of the fuzzy set $A_i$ and $f_{A_i}(u_k)$ is the degree of membership of $u_k$ in $A_i$, with $f_{A_i}(u_k) \in [0, 1]$ and $1 \le k \le n$.

Let $Y(t)$ ($t = \ldots, 0, 1, 2, \ldots$) be a subset of $\mathbb{R}$, the universe of discourse on which the fuzzy sets $f_i(t)$ ($i = 1, 2, \ldots$) are defined, and let $F(t)$ be the collection of $f_i(t)$. Then $F(t)$ is called a fuzzy time series on $Y(t)$ ($t = \ldots, 0, 1, 2, \ldots$). $F(t)$ can be regarded as a linguistic variable (ref. 28), and the $f_i(t)$ ($i = 1, 2, \ldots$) as the possible linguistic values of $F(t)$, represented by fuzzy sets. $F(t)$ is time-dependent and, according to Song and Chissom (ref. 11), if $F(t)$ is caused by $F(t-1)$ only, the relationship can be represented by $F(t-1) \to F(t)$. This dependency can be written as $F(t) = F(t-1) \circ R(t-1, t)$, where $R(t-1, t)$ represents the fuzzy relationship between $F(t)$ and $F(t-1)$, and '$\circ$' represents an operator (max–min (ref. 11), min–max (ref. 29) or an arithmetic operator (ref. 30)). If $F(t-1)$ is represented by $A_{i-1}$ and $F(t)$ by $A_i$, then $F(t-1) \to F(t)$ can be represented as $A_{i-1} \to A_i$.

A fuzzy logical relationship group is constructed by grouping all right-hand-side fuzzy sets that are preceded by the same fuzzy set on the left-hand side of a fuzzy logical relationship (ref. 28). If there are fuzzy logical relationships such that $A_i \to A_j$, $A_i \to A_k$, $A_i \to A_l$, ..., they can be merged into the fuzzy logical relationship group $A_i \to A_j, A_k, A_l, \ldots$.
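The grouping step can be sketched in a few lines of Python. The sketch assumes that the series has already been fuzzified into a sequence of fuzzy-set indices and that each right-hand side is recorded once per group. The function name and the sample sequence are illustrative, not taken from the study.

```python
def relationship_groups(fuzzified):
    """Group the transitions A_i -> A_j by their left-hand side.

    fuzzified : sequence of fuzzy-set indices (5 stands for A5).
    Returns a dict {left-hand index: [right-hand indices]}, each
    right-hand side kept once, in order of first appearance.
    """
    groups = {}
    for lhs, rhs in zip(fuzzified, fuzzified[1:]):
        rhs_list = groups.setdefault(lhs, [])
        if rhs not in rhs_list:
            rhs_list.append(rhs)
    return groups


sequence = [4, 5, 5, 6, 4, 7]        # made-up labels, not the Jainath series
groups = relationship_groups(sequence)
```

Here the transitions 4 → 5 and 4 → 7 merge into the single group A4 → A5, A7.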

Determining the length of the intervals used to divide the fuzzy time series into fuzzy sets is an important task, as different interval lengths may produce different forecasting results. An effective interval length should be neither too large nor too small: intervals that are too large leave no fluctuation in the fuzzy time series, while intervals that are too small diminish the meaning of the fuzzy time series (ref. 31). The heuristic for determining the effective length is that at least half of the fluctuations in the time series should be reflected by the interval. Based on this concept, two approaches have been proposed (ref. 31): average-based length and distribution-based length. In this communication, the distribution-based approach is used to determine the effective length.

Forecasts are calculated by the following procedure, as given by Chen (ref. 28).

(a) If the fuzzified value at time $i$ is $A_i$, there exists a fuzzy logical relationship $A_i \to A_j$ and the maximum membership value of $A_j$ occurs in the interval $u_j$, then the forecasted value for time $i + 1$ is $m_j$, where $m_j$ is the midpoint of $u_j$.

(b) If the fuzzified value at time $i$ is $A_i$, there exist fuzzy logical relationships $A_i \to A_{j1}$, $A_i \to A_{j2}$, ..., $A_i \to A_{jp}$ and the maximum membership values of $A_{j1}, A_{j2}, \ldots, A_{jp}$ occur in the intervals $u_1, u_2, \ldots, u_p$ respectively, then the forecasted value for time $i + 1$ is $(m_1 + m_2 + \cdots + m_p)/p$, where $m_1, m_2, \ldots, m_p$ are the midpoints of the intervals $u_1, u_2, \ldots, u_p$ respectively.

(c) If the fuzzified value at time $i$ is $A_i$, there does not exist any fuzzy logical relationship group whose current state is $A_i$ and the maximum membership value of $A_i$ occurs in the interval $u_i$ with midpoint $m_i$, then the forecasted value for time $i + 1$ is $m_i$.
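Rules (a) to (c) can be written as one short function. The sketch below assumes that the relationship groups and the interval midpoints are stored in dictionaries keyed by the fuzzy-set index. The names and the numbers are hypothetical and chosen only to exercise each rule.

```python
def chen_forecast(current, groups, midpoints):
    """Forecast the next value from the current fuzzy-set index.

    groups    : {index: [right-hand-side indices of its group]}.
    midpoints : {index: midpoint of the corresponding interval}.
    """
    rhs = groups.get(current)
    if not rhs:
        # Rule (c): no group has the current state on its left-hand side.
        return midpoints[current]
    # Rules (a) and (b): average the midpoints of the right-hand sides;
    # with a single right-hand side this is that midpoint itself.
    return sum(midpoints[j] for j in rhs) / len(rhs)


midpoints = {1: 1.0, 2: 2.0, 3: 4.0}   # illustrative midpoints
groups = {1: [2], 2: [1, 3]}           # illustrative relationship groups
rule_a = chen_forecast(1, groups, midpoints)
rule_b = chen_forecast(2, groups, midpoints)
rule_c = chen_forecast(3, groups, midpoints)
```

Treating rule (a) as the one-element case of rule (b) keeps the function short without changing the result.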

We now describe the experiment, performed on the monthly groundwater level of the Jainath region, Adilabad district, Andhra Pradesh, India, using the methodology described above.

The dataset is taken from the Groundwater Information System, GoI, and consists of monthly groundwater-level data from 2005 to 2012. Missing values in the dataset are filled in by linear interpolation. The dataset is also analysed for possible outliers, which are replaced by the average value of the corresponding month.
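A minimal sketch of the gap-filling step is given below. It assumes that missing months are marked with `None` and that only gaps with known values on both sides are filled. The function name and the sample series are illustrative, and the outlier replacement is not covered.

```python
def fill_missing(values):
    """Fill interior gaps (None) by linear interpolation between the
    nearest known neighbours; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


series = [4.0, None, None, 7.0, 6.0]   # made-up levels with a two-month gap
filled = fill_missing(series)
```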

The steps for designing the ARIMA model for the dataset are illustrated using the R software package (ref. 25); several other software packages are also available for time-series analysis. The time series of monthly groundwater levels has been used for building the ARIMA models.

Figure 1 shows the dataset taken for the analysis, i.e. the monthly groundwater level of the Jainath region from 2005 to 2012. Figure 2 shows the time plot of the dataset, with time (in years) on the X-axis and the monthly groundwater level (in metres) on the Y-axis. The plot shows a stationary time series with seasonality s = 12. Hence no difference operation is needed for the dataset, and the order of integration (d) for the time series is zero.

Figure 1. Dataset of monthly groundwater level.

Figure 2. Plot of monthly groundwater level.

The orders of auto-regression and moving average are then determined from the ACF and PACF plots of the stationary series, shown in Figures 3 and 4 respectively. The dotted lines around the abscissa represent the 95% confidence interval, and ACF and PACF values within this interval are considered insignificant. The ACF shows a damped sine wave with significant auto-correlation values at lag 1, lag 12 and lag 24, so the order of moving average (q) and the seasonal moving-average order (Q) are determined as 1 and 2 respectively. The PACF plot shows significant partial auto-correlation values at lag 0 and lag 12, so the order of auto-regression (p) and the seasonal auto-regression order (P) are determined as 1 and 1 respectively. Hence the model identified for the monthly groundwater level is ARIMA(1, 0, 1)(1, 0, 2)$_{12}$.
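The significance check on the correlogram can be reproduced with a short sketch. It assumes the common large-sample 95% bound of $\pm 1.96/\sqrt{n}$ and uses a synthetic series with a period of 12, since the observed series is not reproduced here. The function names are hypothetical.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def significant_lags(series, max_lag):
    """Lags whose |r_k| exceeds the approximate 95% bound 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Synthetic monthly series with an annual cycle (not the Jainath data).
synthetic = [math.sin(2 * math.pi * t / 12) for t in range(96)]
lags = significant_lags(synthetic, 24)
```

For a series with a 12-month cycle, lags 12 and 24 are among those flagged, which is the kind of pattern used to read off the seasonal orders.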

Figure 3. ACF plot of monthly groundwater level.

Figure 4. PACF plot of monthly groundwater level.

The auto-regressive parameters $\phi_1$ and $\Phi_1$ are estimated to be 0.7735 and 0.9181 respectively, and the moving-average parameters $\theta_1$, $\Theta_1$ and $\Theta_2$ are –0.3038, –0.6261 and 0.1494 respectively. There is an intercept of 6.0654 for the time series. Taking the moving-average estimates with the signs in which R reports them (R writes the moving-average polynomial with plus signs), the mathematical expression for the built model is as follows:

$$(1 - 0.9181B^{12})(1 - 0.7735B)X_t = 6.0654 + (1 - 0.6261B^{12} + 0.1494B^{24})(1 - 0.3038B)\varepsilon_t. \qquad (9)$$

Equation (9) can be expanded as given in eq. (10).

$$X_t = 0.7735X_{t-1} + 0.9181X_{t-12} - 0.7101X_{t-13} - 0.045\varepsilon_{t-25} + 0.1494\varepsilon_{t-24} + 0.1902\varepsilon_{t-13} - 0.6261\varepsilon_{t-12} - 0.3038\varepsilon_{t-1} + \varepsilon_t + 6.0654. \qquad (10)$$

The monthly groundwater level for the year 2013 is forecast using the model designed above and is shown in Figure 5. The plot indicates a seasonal groundwater-level fluctuation, with 80% and 95% confidence intervals.

Figure 5. Forecast plot of monthly groundwater level.

In this experiment, groundwater-level data from 2005 to 2012 have been used to design the model and data for 2013 to verify it. As the groundwater-level data for 2013 contain information only for January, May, August and November, only the forecast values of these months have been used to calculate the MMRE. The MMRE of the forecast values gives a percentage error of 9.39. The fitted model is therefore suitable for predicting earth science data such as groundwater level. To test the goodness of the model, diagnostic checking has been performed using residual analysis. The quantile–quantile (Q–Q) plot shown in Figure 6 is almost linear, which implies that the residuals are normally distributed. Figure 7 shows a symmetric histogram with a normal curve. Figures 6 and 7 thus confirm a good fit of the model (ref. 26).

Figure 6. Quantile–quantile plot for monthly groundwater level.

Figure 7. Histogram of the residuals of monthly groundwater level.

The dataset considered for the fuzzy time-series implementation is the monthly groundwater-level data of the Jainath region from 2005 to 2012. The groundwater level varies from 3.07 to 10.17 m, so the universe of discourse is chosen from 3.00 to 10.20. To fuzzify the universe, the overall interval is divided using the distribution-based length approach discussed by Huarng (ref. 31). First, the average of the absolute values of the first differences of the series is calculated; it is 0.85 m. The base for the length of the interval is then determined as 0.1 according to Table 3. The length of the interval is chosen as 0.4, which is the largest length smaller than at least half of the first differences.
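The distribution-based choice of interval length can be sketched as follows. The code assumes that candidate lengths are multiples of the base and that a candidate qualifies when at least half of the absolute first differences are greater than or equal to it; the treatment of exact ties is not stated in the text. The function names and the sample series are illustrative only.

```python
def base_for(average):
    """Base for the interval length, following the ranges of Table 3."""
    if average <= 1.0:
        return 0.1
    if average <= 10:
        return 1
    if average <= 100:
        return 10
    return 100


def distribution_based_length(series, base):
    """Largest multiple of `base` reached by at least half of the
    absolute first differences (None if no multiple qualifies)."""
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    if not diffs:
        raise ValueError("need at least two observations")
    length = None
    k = 1
    while True:
        candidate = round(k * base, 10)
        covered = sum(1 for d in diffs if d >= candidate)
        if 2 * covered < len(diffs):
            break
        length = candidate
        k += 1
    return length


series = [5.0, 5.45, 4.5, 4.75, 6.0, 5.05]   # made-up levels
average = sum(abs(b - a) for a, b in zip(series, series[1:])) / (len(series) - 1)
base = base_for(average)
length = distribution_based_length(series, base)
```

With the average first difference of 0.85 m reported above, `base_for` returns 0.1, in agreement with Table 3.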

After determining the effective length of the intervals, the universe of discourse is divided into 18 intervals as shown in Table 4.

While defining fuzzy sets on the universe, the linguistic variable is ‘monthly water level’ and the universe of discourse is divided into 18 fuzzy sets, $A_1, A_2, \ldots, A_{18}$. Each $A_i$ ($i = 1, 2, \ldots, 18$) is defined on the intervals $u_1, u_2, \ldots, u_{18}$ as follows:

$A_1 = \{1/u_1, 0.5/u_2\}$;
$A_2 = \{0.5/u_1, 1/u_2, 0.5/u_3\}$;
$A_3 = \{0.5/u_2, 1/u_3, 0.5/u_4\}$;
$A_4 = \{0.5/u_3, 1/u_4, 0.5/u_5\}$;
$A_5 = \{0.5/u_4, 1/u_5, 0.5/u_6\}$;
$A_6 = \{0.5/u_5, 1/u_6, 0.5/u_7\}$;
$A_7 = \{0.5/u_6, 1/u_7, 0.5/u_8\}$;
$A_8 = \{0.5/u_7, 1/u_8, 0.5/u_9\}$;
$A_9 = \{0.5/u_8, 1/u_9, 0.5/u_{10}\}$;
$A_{10} = \{0.5/u_9, 1/u_{10}, 0.5/u_{11}\}$;
$A_{11} = \{0.5/u_{10}, 1/u_{11}, 0.5/u_{12}\}$;
$A_{12} = \{0.5/u_{11}, 1/u_{12}, 0.5/u_{13}\}$;
$A_{13} = \{0.5/u_{12}, 1/u_{13}, 0.5/u_{14}\}$;
$A_{14} = \{0.5/u_{13}, 1/u_{14}, 0.5/u_{15}\}$;
$A_{15} = \{0.5/u_{14}, 1/u_{15}, 0.5/u_{16}\}$;
$A_{16} = \{0.5/u_{15}, 1/u_{16}, 0.5/u_{17}\}$;
$A_{17} = \{0.5/u_{16}, 1/u_{17}, 0.5/u_{18}\}$;
$A_{18} = \{0.5/u_{17}, 1/u_{18}\}$.

After the fuzzy sets have been defined, each value of the monthly groundwater-level series is assigned its corresponding fuzzy set.
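The assignment of an observation to its fuzzy set can be sketched as below, using the universe of discourse (3.0 to 10.2) and the interval length (0.4) stated in the text. The function names are hypothetical, and the handling of values that fall exactly on a shared end point is an assumption, since the text does not state a convention.

```python
LOWER, UPPER, LENGTH = 3.0, 10.2, 0.4   # universe and interval length from the text


def make_intervals(lower, upper, length):
    """Split [lower, upper] into consecutive intervals of equal length."""
    count = round((upper - lower) / length)
    return [(lower + k * length, lower + (k + 1) * length) for k in range(count)]


def fuzzify(value, lower=LOWER, upper=UPPER, length=LENGTH):
    """Return i such that `value` lies in interval u_i, i.e. maps to A_i."""
    if not lower <= value <= upper:
        raise ValueError("value lies outside the universe of discourse")
    count = round((upper - lower) / length)
    index = int((value - lower) / length) + 1
    return min(index, count)   # keep the upper end point in the last interval


intervals = make_intervals(LOWER, UPPER, LENGTH)
labels = [fuzzify(x) for x in (4.69, 6.35, 4.27, 4.82)]
```

The four sample values are the 2013 actual levels listed in Table 6, and the resulting indices (5, 9, 4, 5) agree with the FC column of that table.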

After the whole dataset has been fuzzified, the fuzzy logical relationship groups are obtained from the fuzzy logical relationships by following the theory described earlier in the text. The fuzzy logical relationship groups are given in Table 5.

Using the monthly groundwater-level dataset of the Jainath region, the groundwater level for the last 12 months of the dataset has been forecast, and the relative error has been calculated by comparing the forecasted groundwater levels with their actual values for January, May, August and November. Table 6 shows the details of the forecast. The mean relative error (MMRE) obtained is 0.0687, i.e. a percentage error of 6.87.

Table 3. Base mapping table

| Range | Base |
|---|---|
| 0.1 to 1.0 | 0.1 |
| 1.1 to 10 | 1 |
| 11 to 100 | 10 |
| 101 to 1000 | 100 |

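Table 4. Fuzzy set intervals

| Interval | Range | Interval | Range | Interval | Range |
|---|---|---|---|---|---|
| u1 | [3.0, 3.4] | u2 | [3.4, 3.8] | u3 | [3.8, 4.2] |
| u4 | [4.2, 4.6] | u5 | [4.6, 5.0] | u6 | [5.0, 5.4] |
| u7 | [5.4, 5.8] | u8 | [5.8, 6.2] | u9 | [6.2, 6.6] |
| u10 | [6.6, 7.0] | u11 | [7.0, 7.4] | u12 | [7.4, 7.8] |
| u13 | [7.8, 8.2] | u14 | [8.2, 8.6] | u15 | [8.6, 9.0] |
| u16 | [9.0, 9.4] | u17 | [9.4, 9.8] | u18 | [9.8, 10.2] |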

Table 5. Monthly groundwater level fuzzy logical relationship group

$A_9 \to A_{11}, A_7, A_{12}, A_{10}$
$A_{11} \to A_{13}, A_1, A_{12}, A_3, A_{11}, A_{10}, A_{15}$
$A_{13} \to A_{15}$
$A_{15} \to A_{16}, A_{18}$
$A_{16} \to A_{12}, A_5, A_{18}, A_{10}$
$A_{12} \to A_8, A_2, A_9, A_{15}, A_{18}, A_{16}$
$A_8 \to A_4, A_9, A_{10}, A_7$
$A_4 \to A_4, A_5, A_3, A_7$
$A_5 \to A_5, A_6, A_4, A_7$
$A_6 \to A_6, A_7, A_5, A_8$
$A_7 \to A_8, A_9, A_3, A_4, A_7, A_6, A_{14}, A_{11}, A_5$
$A_1 \to A_2$
$A_2 \to A_2, A_4, A_7$
$A_3 \to A_4, A_7, A_3, A_5$
$A_{10} \to A_{12}, A_{10}, A_{11}, A_1, A_3$
$A_{14} \to A_{16}$
$A_{18} \to A_{12}, A_{11}$

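Table 6. Forecast details of monthly groundwater level

| Year | Month | AGL | FC | FGL | RE |
|---|---|---|---|---|---|
| 2013 | January | 4.69 | A5 | 5 | 0.066 |
| 2013 | May | 6.35 | A9 | 6.8 | 0.0708 |
| 2013 | August | 4.27 | A4 | 4.7 | 0.1007 |
| 2013 | November | 4.82 | A5 | 5 | 0.0373 |

AGL, actual groundwater level (m); FC, fuzzy set assigned to the actual level; FGL, forecasted groundwater level (m); RE, relative error.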
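The error figures for the fuzzy forecast can be recomputed from the AGL and FGL columns of Table 6. The sketch below assumes the usual definition of MMRE as the mean of the absolute relative errors; the function names are illustrative.

```python
def relative_errors(actual, forecast):
    """Absolute relative error |actual - forecast| / |actual| per point."""
    return [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]


def mmre(actual, forecast):
    """Mean magnitude of relative error."""
    errors = relative_errors(actual, forecast)
    return sum(errors) / len(errors)


# AGL and FGL columns of Table 6 (January, May, August, November 2013).
actual = [4.69, 6.35, 4.27, 4.82]
forecast = [5.0, 6.8, 4.7, 5.0]
errors = relative_errors(actual, forecast)
score = mmre(actual, forecast)
percentage_error = 100 * score
```

The individual errors match the RE column to the precision shown, and the mean is close to the reported 0.0687; small differences in the last digit come from rounding the individual errors before averaging.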

In this study, models for groundwater-level prediction have been proposed for the Jainath region using ARIMA and fuzzy time-series analysis. The models have been built using past groundwater-level fluctuation patterns. In the ARIMA model, the predicted groundwater level is linearly related to its previous values because ARIMA models are based on auto-correlations. The models are verified using the groundwater-level values of 2013. The percentage error for ARIMA and fuzzy time-series analysis is calculated as 9.35 and 6.87 respectively, which indicates that fuzzy time-series analysis is better than the ARIMA model for forecasting this dataset. The CGWB, along with the state groundwater agencies, can apply these models in the quinquennial periodic groundwater assessment (GWA) for estimating the dynamic groundwater resource. The groundwater estimation committee conducting the national GWA exercise can adopt these models for estimating the groundwater level of individual GWA units. There is scope for further improvement of the present model by combining it with other parametric and non-parametric models.

Conflict of interest: The authors certify that there is no conflict of interest regarding the publication of this paper.

1. Stoll, S., Hendricks Franssen, H. J., Barthel, R. and Kinzelbach, W., What can we learn from long-term groundwater data to improve climate change impact studies? Hydrol. Earth Syst. Sci., 2011, 15(12), 3861–3875.

2. Reddy, A. G. S., Water level variation in fractured, semi-confined aquifers of Anantpur district, southern India. J. Geol. Soc. India, 2012, 80, 111–118.

3. Hoffman, F. M. et al., Data mining in earth system science. In Proceedings of the International Conference on Computational Science, Reykjavik, Iceland, 2011, pp. 1450–1455.

4. van de Weg, M. C., Prediction of water level during storm situations using neural networks, Department of Computer Science, Thesis, Leiden University, The Netherlands, 1997.

5. Karamizadeh, S., Abdullah, S. M., Manaf, A. A., Zamani, M. and Hooman, A., An overview of principal component analysis. J. Signal Inf. Proc., 2006, 4(1), 173–175.

6. Chang, F. J. and Chang, Y. T., Adaptive neuro-fuzzy inference system for prediction of water level in reservoir. Adv. Water Resour., 2006, 29, 1–10.

7. Scitovski, R., Maričić, S. and Scitovski, S., Short term and long term water level prediction at one river measurement location. Croat. Oper. Res. Rev., 2012, 3, 1–11.

8. Wang, L. and Zhao, W., Forecasting groundwater level based on WNM with GA. J. Comput. Inf. Syst., 2011, 7(1), 160–167.

9. Gavirangaswamy, V. B., Gupta, G., Gupta, A. and Agrawal, R., Assessment of ARIMA based prediction techniques for road traffic volume. In Fifth International Conference on Management of Emergent Digital EcoSystems, New York, USA, 2013, pp. 246–251.

10. Stram, D. O. and Wei, W. W. S., Temporal aggregation in the ARIMA process. J. Time Series Anal., 1986, 7(4), 279–292.

11. Song, Q. and Chissom, B. S., Forecasting enrolments with fuzzy time series part-I. Fuzzy Sets Syst., 1993, 54, 1–9.

12. Gokhale, R. and Sohoni, M., Detecting appropriate groundwater-level trends for safe groundwater development. Curr. Sci., 2015, 108(3), 395–404.

13. Groundwater Information System, Government of India (on-line); http://gis2.nic.in/cgwb/Gemsdata.aspx (accessed October 2014).

14. Goulão, M., Fonte, N., Wermelinger, M. and Abreu, F. B., Software evolution prediction using seasonal time analysis: a comparative study. In 16th European Conference on Software Maintenance and Reengineering, Szeged, Hungary, 2012, pp. 213–222.

15. Wu, W., Zhang, W., Yang, Y. and Wang, Q., Time series analysis for bug number prediction. In Second International Conference on Software Engineering and Data Mining, Chengdu, China, 2010, pp. 589–596.

16. Cryer, J. D. and Chan, K. S., Time Series Analysis with Applications in R, Springer Verlag, New York, 2010, 2nd edn, ISBN: 978-0-387-75958-6.

17. Shumway, R. H. and Stoffer, D. S., Time Series Analysis and its Application with R Examples, Springer, New York, 2010, 3rd edn, ISBN 978-1-4419-7864-6.

18. Contreras, J., Espínola, R., Nogales, F. J. and Conejo, A. J., ARIMA models to predict next day electricity prices. IEEE Trans.

Power Syst., 2003, 18(3), 1014–1020.

19. Ahmad, S. and Latif, H. A., Forecasting on the crude palm oil and kernel palm production: seasonal ARIMA approach. In IEEE Col- loquium on Humanities, Science and Engineering Research, Penang, Malaysia, 2011, pp. 939–944.

20. Singh, L. L., Abbas, A. M., Ahmad, F. and Ramaswamy, S., Pre- dicting software bugs using ARIMA model. In 48th Annual Southeast Regional Conference, New York, USA, 2010.

21. Nielsen, H. B., Univariate time series analysis; ARIMA models.

Econometrics, 2005, 2, 1–21.

22. Chatfield, C., Time Series Forecasting, Chapman & Hall/CRC, Boca Raton, Florida, USA, 2000, ISBN: 1-58488-063-5.

23. National Program on Technology Enhanced Learning, Govern- ment of India (on-line); http://nptel.ac.in/courses/105108079/ (ac- cessed September 2014).

24. Hagan, M. T. and Behr, M., The time series approach to short term load forecasting. IEEE Power Eng. Rev., 1987, 2(8), 56–57.

25. Gentleman, R., Ihaka, R. and Bates, D., The R project for statisti- cal computing (on-line); http://www.r-project.org

26. Yan, Z., Traj-ARIMA: a spatial-time series model for network- constrained trajectory. In Second International Workshop on Computational Transportation Science, New York, USA, 2010, pp. 11–16.

27. Engle, R. F. and White, H., Co-integration, Causality and Fore- casting, EconPapers, Oxford University Press, Oxford, UK, 1999, pp. 1–44.

28. Chen, S. M., Forecasting enrollments based on fuzzy time series.

Fuzzy Sets Syst., 1996, 81, 311–319.

29. Song, Q. and Chissom, B. S., Forecasting enrollments with fuzzy time series part-II. Fuzzy Sets Syst., 1994, 62, 1–8.

30. Chen, S. M. and Hwang, J. R., Temperature prediction using fuzzy time series. IEEE Trans. Syst., Man Cybernatics – Part B, 2000, 30(2), 263–275.

31. Huarng, K., Effective length of intervals to improve forecasting in fuzzy time series. Fuzzy Sets Syst., 2001, 123, 387–394.

ACKNOWLEDGEMENTS. We thank the National Institute of Technology Raipur for providing the necessary facilities to carry out this work.

Received 19 February 2015; revised accepted 22 April 2016

doi: 10.18520/cs/v111/i6/1083-1090
