Does ENSO Significantly Affect Rice Production In Indonesia? A Preliminary Study Using Computational Time-Series Approach

effects and try to ﬁt it with year-to-year rice production in Java. This study did not use MEI directly while the index is the ENSO’s intensity representation. There are also another recent study had try to build and ﬁnd any correlation between ENSO phenomena and rice production speciﬁcally in Asia and South East Asia (SEA) where rice is one of the primary food. ENSO is a phenomenon that is suspected to inﬂuence rice production in Indonesia. In this study, we try to ﬁnd direct correlation between ENSO and rice production in this region by using various latest computational time series methods, such as Dynamic Time Warping, Wavelet Coherence, and Bayesian Structural Time Series to quantify the statistical relationship between the Multivariate ENSO Index on annual rice production in 1961- 2019. We did not ﬁnd a direct correlation between these two variables, which may be due to the local inﬂuence of ENSO on different rice production areas in Indonesia. This study would also point out the importance of shifting the theme of research in Indonesia from mapping to monitoring and freely share the data. This step would bring science to progress further and faster.


Introduction
Indonesia is the world's third biggest rice producer in 2019. The rice crop, which is also known as paddy, itself is one of the crops heavily influenced by rainfall. For this reason, it is very important for Indonesia to know the effect of climate on rice production. This article hopes to find relationship between Multivariate ENSO Index (MEI), a phenomena which will influenced rainfall periodically, and rice production. With a simple model, due to the needs of its simplicity so that it can be digest easily by researchers from the various field of studies, we try to find if there is any correlation between these two variables. Aside from using simple model, consistent with our accessibility information, we also try to execute our study by using open-source software to maximize the reproducibility and replicability.
One of the most similar study by, which also inspires this paper, had found that ENSO has strong connections with Java's rice production. However, this study is using year-to-year change of rainfall and 4-month sea surface temperature anomalies (SSTA) which represent ENSO's effects and try to fit it with year-to-year rice production in Java. This study did not use MEI directly while the index is the ENSO's intensity representation. There are also another recent study had try to build and find any correlation between ENSO phenomena and rice production specifically in Asia and South East Asia (SEA) where rice is one of the primary food. ENSO is a phenomenon that is suspected to influence rice production in Indonesia. In this study, we try to find direct correlation between ENSO and rice production in this region by using various latest computational time series methods, such as Dynamic Time Warping, Wavelet Coherence, and Bayesian Structural Time Series to quantify the statistical relationship between the Multivariate ENSO Index on annual rice production in 1961-2019. We did not find a direct correlation between these two variables, which may be due to the local influence of ENSO on different rice production areas in Indonesia. This study would also point out the importance of shifting the theme of research in Indonesia from mapping to monitoring and freely share the data. This step would bring science to progress further and faster.
The first one is tackling the same issue in Philippines with same method as in Java's study with addition of ecosystems and fitting target variability. The study also find association between rainfall, SSTA, and rice production, harvested area, also the yield in one of the Philippines' Island, Luzon. There is also one other study conducted in China which adopts this method (using SSTA rather than MEI) for their work which do not find any strong correlation between ENSO event and rice production even when the SSTA has a strong correlation with rainfall. Then, there is one study by that we could find which are using MEI as one of other variables in their pooled mean group estimation to find the correlation between the ENSO event and rice production in SEA. However, the MEI is used together with irrigation area, fertilizer consumption, and harvested area. We believe those other variables are arising more complexities which will affect the difficulties of the result interpretation. This study too find that there is an association between ENSO intensity and rice production in SEA.
With all of those precedents, as stated before, this study will try to differentiate by building a simpler model that investigate the correlation between MEI Index and rice production only. As for the method, to test the robustness of our result, we are using three kind of correlation analysis which consist of Dynamic Time Wrapping (DTW), Wavelet Coherence (WTC), and Bayesian Structural Time Series (BSTS). After estimating their correlation, we also provide further discussion about the results.

Data and Code
The analysis was based rice production data provided by FAO (https://www.fao.org/faostat/en/#home) and the Multivariate Enso Index (MEI) provided by NOAA (https://psl.noaa.gov/enso/mei/). We processed the data using two programming languages: Python with Numpy, Matplotlib and Pandas package and R with biwavelet, bsts, and readr packages. Data and code are stored in the project' Github page (https://github.com/sandyherho/ensoIndoRice).

Computational Analysis
The first step we take is to calculate annual rice yield index, which is a standardized comparison of total rice production P (in tonnes) with the area of production A (in hectare) based on FAO data (equation (1)). (1) We define the annual standardized rice yield index (ASRYI) through equation (2) below, where and are respectively the mean and standard deviation of the data defined according to equation (3) as follows, where N is the number of data points, i.e., 59. The results of the time-series reconstruction of this index can be seen in Figure 1 below, In order to measure the similarity between ASRYI and annual average of MEI signals, we use a dynamic time warping (DTW) algorithm. DTW is an algorithm that is commonly used to determine the optimal distance between two sequential data pieces to determine the similarity between the two time-series patterns. Identification of the similarity of the sequential data is done by comparing the optimal computation time to find the shortest distance between the two sequences.
The first step is to input both time-series and data, which have the same sequence length, so that we can form a cost matrix . Then, we initialize the cost matrix values as follows, ; for every up to row, ; and for every up to column, . Then, for each row component and column to the row and column, the components of the cost matrix are calculated using the following equation, In this study we use Euclidean distance to measure . We need to trace back the components of the cost matrix from to to get the alignment of these two sequences. To automate this computation, we use an implementation of the FastDTW algorithm which is implemented in the Python computing environment via the dtw-python library. The 2D cost matrix representation is shown in Figure 2.
With the minimum DTW path with minimum distance of 54.87, it can be concluded that the synchronization process between the two-time series is not efficient in terms of computational costs. However, we cannot conclude whether there is no statistical relationship between the ENSO and rice production in Indonesia. For this reason, we explore causal relationship between these two signals using the Wavelet Coherence (WTC) algorithm.
The wavelet transform algorithm is intended to localize the change in time ( ) and angular frequency ( ) of the time-series data into a function of frequency with respect to time, so that we can know changes in time and frequency simultaneously. One of the wavelet functions that is often used in the fields of climatology is the Morlet transform, known as the continuous wavelet transform (CWT), with the following equation,  (5) where is the normalized term and η = the dimensionless temporal parameter which is the ratio of n as a time parameter and s as a scale parameter. The angular frequency parameter is defined as . In the Morlet transform, the period of oscillation is = 1.03s. CWT for a timeseries is formally defined in the equation (6)  In order to determine the relationship between the two-time series, in the context of this study between ASRYI and annual average of MEI, a complex conjugate multiplication these two CWT is needed as shown in equation (7), Phase coherence in WTC is obtained by normalizing and filtering the power spectrum on the scale and time. Time smoothing is performed by normalizing using a Gaussian profile, while scale smoothing is performed using a boxcar filter. These two operations are indicated by the expression in the following equation, To calculate the phase difference between these two-time series, we use the comparison of the imaginary term and the real term of as follows, We use the biwavelet package in the R computing environment to automate the WTC calculation process. This package is an implementation of the algorithm used by. This 2D representation of the WTC is shown in Figure 3. Time is displayed on the horizontal axis, while the vertical axis shows the frequency (the lower the frequency, the higher the scale). Regions in time-frequency space where the two-time series co-vary are located by the WTC. Warmer colors represent regions with significant interrelation, while colder colors signify lower dependence between the sequences. Cold regions beyond the significant areas represent time and frequencies with no dependence in the series. Arrows in this figure represents the lead/lag-phase relationship between annual average of MEI and ASRYI. A zero-phase difference means that the two sequences move together on a particular scale. Arrows point to the right when the time series are in phase, yet if the arrows are pointing to the right, then both time sequences are in anti-phase. When this two-series data are in phase, it indicates that they move in the same direction, and anti-phase means that they move in the opposite direction. Arrows pointing to the right-down or left-up indicate that annual average of MEI is leading, while arrows pointing to the right-up or left-down show that the ASRYI is leading.
In Figure 3, we do not see any significant driving pattern of annual average of MEI on ASRYI, which belongs to the 95% cone of influence (COI) determined by two degrees of freedom in the distribution.

Results and Discussions
Based on the results we get from the DTW and WTC algorithms, we do not find a causal relationship between the annual average of MEI and the ASRYI in Indonesia. However, to further ensure that there is no statistical significance between these two variables, we try to make MEI as a covariate component of the ASRYI sequence in the univariate Bayesian structural time-series (BSTS) framework. We model the ASRYI into a local linear trend with seasonality and regression (LLTSR) model as follows,  (10) where is a local linear trend, is a seasonal component which is defined in equation (11) as follows, (11) According to the total data points, the number of seasons that we use in this model is 59. We use a single vector covariate, the annual average of MEI as the spike in the slab-spike priors, or in other words we have the assumption that this variable will have a significant effect on ASRYI. The regression coefficient is calculated numerically. In this model, we also assume that the residual terms and follow Gaussian distribution. We obtained the posterior density distribution of ASRYI by using the Kalman filter, Kalman smoother, and 1000 times Markov Chain Monte Carlo (MCMC) process automatically by utilizing the bsts package in the R computing environment. Based on the posterior density distribution of the non-residual components of the LLTSR model ( Figure 4), we can see that the prior regression component that we expect to have an effect, namely the annual average of MEI, actually barely contributes to the ASRYI predictability. This is also further confirmed by the negative inclusion probability value of the regression coefficient which does not reach 20% ( Figure 5).