This study focuses on a novel forecasting method, SutteARIMA, and its application to predicting the infant mortality rate in Indonesia. It compares SutteARIMA with three widely used forecasting methods: ARIMA, Neural Network Time Series (NNAR), and Holt-Winters. The data, obtained from the World Bank website, consist of the annual infant mortality rate (per 1000 live births) from 1991 to 2019. To determine the most suitable method for predicting the infant mortality rate, the forecasts of the four methods were compared using the mean absolute percentage error (MAPE) and the mean squared error (MSE). The results show that the forecast errors of the SutteARIMA method (MAPE: 0.83%; MSE: 0.046) were smaller than those of the other three methods: ARIMA(0,2,2) with a MAPE of 1.21% and an MSE of 0.146; NNAR with a MAPE of 7.95% and an MSE of 3.90; and Holt-Winters with a MAPE of 1.03% and an MSE of 0.083.
In this era of globalization and continuous industrial development, every human being wants to get information as fast as possible. Statistics, which is one of the fields of science related to the acquisition of information in several scientific disciplines, has made progress. This advancement usually requires different methods of solving different problems. Statistics has been known for a long time and has even been used in dealing with problems in everyday life such as in the fields of health, economics, social sciences, atmospheric sciences, and other fields. In addition, the development of data mining and big data analysis also requires an understanding of statistics. This is in line with the opinion of Sivarajah, Kamal, Irani, and Weerakkody [
Statistics are usually used by data analysts to consider possible events that may recur. Therefore, the likelihood of future events is strongly influenced by the frequency and routine of events that have occurred in the past. This is in line with the opinion of Edwards [
In the health sector, forecasting is often used as a means of evaluating the implementation, success and failure of a health program or health service that is being implemented. In addition, forecasting is also often used as a means of planning and decision making in the implementation of future activities. For example, Ranapurwala [
The infant mortality rate is one of the health problems in Indonesia that needs to be highlighted, because it is one of the indicators commonly used to assess public health. It is not surprising that health programs in Indonesia focus heavily on reducing infant mortality. In 2008, the infant mortality rate in Indonesia was still quite high, around 31 per 1000 live births; in other words, 31 babies died for every 1000 live births. This rate is higher than in Malaysia and Singapore, at 16.39 and 2.3 per 1000 live births, respectively.
According to WHO data, in 2019, globally, as many as 7000 newborns died every day and 185 cases per day occurred in Indonesia with an infant mortality rate of 24 per 1000 live births, with details of 75% of neonatal deaths occurring in the first week, and 40% died within the first 24 h [
The Autoregressive Integrated Moving Average (ARIMA) model was presented by George Box and Gwilym Jenkins in 1976, and their names are often associated with the ARIMA process applied to time series analysis, namely the Box-Jenkins ARIMA approach. In general, the model is written as ARIMA(p, d, q), where p is the order of the autoregressive (AR) process, d is the order of differencing, and q is the order of the moving average (MA) process.
The autoregressive model is a form of regression that connects the observed values at a certain time with the values of previous observations at certain intervals [
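In general, the autoregressive process of order p, AR(p), is written as [
$$Z_t = \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \cdots + \phi_p Z_{t-p} + a_t,$$
or, with the backshift operator $B$ defined by $B^k Z_t = Z_{t-k}$, as $\phi_p(B) Z_t = a_t$, where $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $Z_t$ is the mean-centered series, and $a_t$ is white noise.
Because of this finite autoregressive form, the process is always invertible; it is stationary when the roots of $\phi_p(B) = 0$ lie outside the unit circle.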
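The moving average process describes phenomena in which an event produces an immediate effect that lasts only for a short period of time. The general moving average process of order q, MA(q), is as follows [
$$Z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q},$$
or $Z_t = \theta_q(B) a_t$, where $\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, $B$ is the backshift operator ($B^k a_t = a_{t-k}$), and $a_t$ is white noise.
Because $1 + \theta_1^2 + \cdots + \theta_q^2 < \infty$, a finite moving average process is always stationary.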
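The mixed autoregressive moving average process, ARMA(p, q), is written as [
$$\phi_p(B) Z_t = \theta_q(B) a_t,$$
where $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$.
For an invertible process, it is required that the roots of $\theta_q(B) = 0$ lie outside the unit circle; for a stationary process, the roots of $\phi_p(B) = 0$ must likewise lie outside the unit circle.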
The ARIMA process is closely related to the ARMA process: a stationary and invertible process can be represented in either moving average or autoregressive form. The AR, MA, and ARMA models require the data to be stationary in both mean and variance. A time series is stationary in the mean if its level is relatively constant over time, and stationary in the variance if the magnitude of its fluctuations remains constant over time. A nonstationary mean is handled by differencing, and a nonstationary variance by a power transformation.
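As a small illustration of the differencing step, the sketch below applies first- and second-order differencing to the seven test observations of the infant mortality rate reported later in this paper. The helper name `difference` is an assumption introduced here for illustration and is not part of the original analysis.

```python
def difference(series, d=1):
    """Apply d-th order differencing: each pass replaces x[t] with x[t] - x[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


# Test observations of the infant mortality rate quoted later in the paper.
imr = [25.1, 24.2, 23.3, 22.5, 21.7, 20.9, 20.2]

first = difference(imr, d=1)    # every value is negative: the downward trend remains
second = difference(imr, d=2)   # values close to zero once the trend is removed

print([round(v, 1) for v in first])
print([round(v, 1) for v in second])
```

Each pass shortens the series by one observation, which is why a model with d = 2 is fitted on two fewer points than the original series contains.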
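The ARIMA model extends the ARMA model with a differencing step for data that are not stationary in the mean. If differencing of order d is needed to achieve stationarity, the ARIMA(0, d, 0) model becomes:
$$(1 - B)^d Z_t = a_t.$$
With the stationary AR operator $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and the invertible MA operator $\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, the general ARIMA(p, d, q) model is
$$\phi_p(B)(1 - B)^d Z_t = \theta_q(B) a_t.$$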
In forecasting time series data, the ARIMA(p, d, q) method proceeds in stages. The stages are as follows [
Model identification
Model identification examines the autocorrelation structure and the stationarity of the data to determine whether a transformation or differencing is necessary. This stage yields a provisional model, which is then tested to determine whether it fits the data.
Model Assessment and Testing
After the model identification process has been carried out, the next step is to assess and test the model. This stage consists of two parts, namely parameter assessment and model diagnostic examination.
Parameter Assessment
After obtaining one or more provisional models, the next step is to find estimates for the parameters in that model.
Model Diagnostic
Diagnostic checking determines whether the estimated model is adequate for the data. It is based on residual analysis. The basic assumption of the ARIMA model is that the residuals are independent random variables that are normally distributed with zero mean and constant variance.
Independent Test
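This independence test is performed using the Box-Pierce Q statistic, which can be calculated using the formula [
$$Q = n \sum_{k=1}^{K} r_k^2,$$
where $n$ is the number of residuals, $K$ is the maximum lag, and $r_k$ is the residual autocorrelation at lag $k$.
If the value is $Q < \chi^2_{\alpha,\,K-p-q}$, or equivalently the p-value exceeds the significance level $\alpha$, the residuals are considered independent.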
Normality Test
Residual analysis examines whether the residuals of the model are white noise. White noise is the basic assumption of the ARIMA model: the residuals are independent random variables that are normally distributed with zero mean and constant variance.
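The independence check based on the Box-Pierce Q statistic can be sketched as follows. This is a minimal illustration, assuming the usual definition Q = n Σ r_k² over lags 1 to K; the function names and the residual values are hypothetical and are not taken from the paper. The comparison of Q with a chi-square critical value is left out.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation r_k of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    numer = sum((residuals[t] - mean) * (residuals[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom


def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic Q = n * sum of squared autocorrelations, lags 1..max_lag."""
    n = len(residuals)
    return n * sum(autocorrelation(residuals, k) ** 2 for k in range(1, max_lag + 1))


# Hypothetical residuals, for illustration only (not taken from the paper).
example_residuals = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
q_stat = box_pierce_q(example_residuals, max_lag=3)
print(round(q_stat, 3))
```

A small Q relative to the chi-square critical value suggests that the residual autocorrelations are jointly negligible, which is what an adequate model should produce.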
Holt-Winters is a method for modeling and predicting the behavior of time series data and one of the most widely used time series forecasting methods. It is decades old but is still used in a variety of applications, including monitoring tasks such as anomaly detection and capacity planning. The Holt-Winters model uses three aspects of the time series: a typical value (level), trend, and seasonality. Because it smooths these three components, Holt-Winters is also known as triple exponential smoothing. It uses three smoothing parameters, α, β, and γ, each of which takes a value between 0 and 1.
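The formula of Holt-Winters, in its additive form, is [
Level Smoothing:
$$S_t = \alpha (X_t - SN_{t-L}) + (1 - \alpha)(S_{t-1} + T_{t-1})$$
Trend Smoothing:
$$T_t = \beta (S_t - S_{t-1}) + (1 - \beta) T_{t-1}$$
Seasonal Smoothing:
$$SN_t = \gamma (X_t - S_t) + (1 - \gamma) SN_{t-L}$$
Forecast:
$$F_{t+m} = S_t + m T_t + SN_{t-L+m}$$
where
$X_t$ = the observed value at time t,
$S_t$, $S_{t-1}$ = the smoothed level at times t and t − 1,
$T_t$, $T_{t-1}$ = the smoothed trend at times t and t − 1,
$SN_t$, $SN_{t-L}$ = the smoothed seasonal component at times t and t − L,
$F_{t+m}$ = the forecast for m periods ahead,
m = the time period to be predicted,
L = seasonal length.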
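The level and trend recursions of the method can be sketched in a few lines. This is a minimal illustration of the non-seasonal case only (the annual data in this paper have no seasonal component); the function name `holt_linear`, the start values, and the example series are assumptions made here, not the implementation used for the results.

```python
def holt_linear(series, alpha, beta, horizon):
    """Holt's linear-trend smoothing (Holt-Winters without a seasonal component).

    Returns the final level, the final trend, and `horizon` point forecasts.
    """
    # Assumed start values: level = second observation, trend = first difference.
    level = series[1]
    trend = series[1] - series[0]
    for x in series[2:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)     # level update
        trend = beta * (level - prev_level) + (1 - beta) * trend   # trend update
    forecasts = [level + m * trend for m in range(1, horizon + 1)]
    return level, trend, forecasts


# Hypothetical, perfectly linear series: the forecasts simply continue the line.
level, trend, forecasts = holt_linear([10.0, 9.0, 8.0, 7.0, 6.0],
                                      alpha=0.5, beta=0.5, horizon=3)
print(level, trend, forecasts)
```

Values of α and β close to 1 make the level and trend follow the most recent observations closely, while values close to 0 give more weight to the smoothed history.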
An artificial neural network (ANN) is a system that processes information with characteristics and performance close to that of a biological neural network. Artificial neural networks are a generalization of biological neural network modeling with several assumptions, including:
Information processing lies in a number of components called neurons.
The signal spreads from one neuron to another via a connecting line.
Each connecting line has an associated weight which, in typical networks, multiplies the incoming signal.
Each neuron applies an activation function (usually nonlinear) to the sum of its weighted inputs to determine its output signal.
Neural networks are useful for estimation and regression analysis, including forecasting and modeling; for classification, including pattern and sequence recognition; and for data processing tasks such as filtering, clustering, and compression, as well as for programming robots that move without human assistance. According to Wuryandari et al. [
Patterns of relationships between neurons (network architecture)
The method for determining and updating the connection weights (the training method or network learning process)
Activation function
Artificial neural networks are also referred to by terms such as brain metaphors, computational neuroscience, and parallel distributed processing. Neural networks are used for complex nonlinear forecasting. One neural network model designed for time series is the neural network autoregressive (NNAR) model, in which lagged values of the time series are used as inputs to the network, just as lagged values are used in a linear autoregressive model. The NNAR model is generally denoted by NNAR (
The NNAR model is a feedforward neural network that involves a combination of linear and activation functions. This function formulation is defined as:
SutteARIMA is a short-term forecasting method developed by Ahmar et al. in 2019 [
The formula of SutteARIMA [
In this paper, we use annual time series data on the infant mortality rate (per 1,000 live births) for Indonesia, obtained from the World Bank Database. Data for this paper are available at:
For the fitting and testing data, two forecasting accuracy indicators are used to assess the goodness of fit and the accuracy of the forecasts. The indicators are as follows [
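The two indicators named in this paper, MAPE and MSE, are commonly defined as below. The notation is assumed here, since the symbols of the original formulas are not shown: $Y_t$ is the observed value, $\hat{Y}_t$ the forecast, and $n$ the number of forecasts.

```latex
\[
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{Y_t - \hat{Y}_t}{Y_t} \right|
\qquad
\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( Y_t - \hat{Y}_t \right)^2
\]
```

MAPE is scale-free and reported in percent, whereas MSE is in squared units of the data and penalizes large errors more heavily.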


In the case of infant mortality rates, the data obtained is in the form of a trend and has decreased every year (see
To obtain forecasting models and forecasts from the data using the ARIMA, Neural Network Time Series, Holt-Winters, and SutteARIMA models, we use the
$Forecast_AutoARIMA
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
23 25.05055 24.44849 25.65261 24.12978 25.97132
24 24.11898 23.31295 24.92502 22.88626 25.35171
25 23.18742 22.07214 24.30271 21.48174 24.89310
26 22.25586 20.75409 23.75763 19.95910 24.55262
27 21.32430 19.37637 23.27223 18.34520 24.30340
28 20.39274 17.94907 22.83641 16.65547 24.13001
29 19.46118 16.47840 22.44395 14.89941 24.02294
$AutoARIMA
Series: al_mi_10
ARIMA(0,2,2)
Coefficients:
          ma1     ma2
      -1.1098  0.5000
s.e.   0.2471  0.2149
sigma^2 estimated as 0.2207:  log likelihood=-12.88
AIC=31.76   AICc=33.26   BIC=34.75
The analysis output yields an ARIMA(0,2,2) model, that is, a model with second-order differencing and MA coefficients ma1 = −1.1098 and ma2 = 0.5000. The form of the model is as follows.
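Written out with the backshift operator $B$ (where $B Z_t = Z_{t-1}$), the fitted model can be sketched as below. This assumes the sign convention of the R output, in which MA coefficients enter with a plus sign; $Z_t$ and $a_t$ are the symbols assumed here for the series and the white-noise error.

```latex
\[
(1 - B)^2 Z_t = \left(1 - 1.1098\,B + 0.5000\,B^2\right) a_t
\]
\[
Z_t = 2 Z_{t-1} - Z_{t-2} + a_t - 1.1098\,a_{t-1} + 0.5000\,a_{t-2}
\]
```

In the Box-Jenkins convention, where the MA polynomial is written $1 - \theta_1 B - \theta_2 B^2$, the same model corresponds to $\theta_1 = 1.1098$ and $\theta_2 = -0.5000$.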
$Forecast_NNETAR
Point Forecast
23 25.40859
24 24.90951
25 24.49229
26 24.14630
27 23.86136
28 23.62803
29 23.43788
$NNETAR
Series: al_mi_10
Model: NNAR(1,1)
Call: nnetar(y = al_mi_10)
Average of 20 networks, each of which is
a 1–1–1 network with 4 weights
options were - linear output units
sigma2 estimated as 0.1567
As with the ARIMA method, fitting the Neural Network Time Series method yields a forecasting model, namely NNAR(1,1), which uses one lagged value as input and one node in the hidden layer.
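To make the structure of such a model concrete, the sketch below counts the weights of a p-k-1 feedforward network and computes a one-step forecast with a logistic hidden layer and a linear output unit. The function names and all weight values are hypothetical; the fitted weights are not reported in this paper, and the input is assumed to be already scaled.

```python
import math


def nnar_weight_count(p, k):
    """Number of weights in a p-k-1 feedforward network with bias terms."""
    return (p + 1) * k + (k + 1)


def nnar_one_step(lags, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-step forecast: logistic hidden nodes followed by a linear output unit."""
    hidden = []
    for w_row, b in zip(hidden_weights, hidden_biases):
        net = b + sum(w * x for w, x in zip(w_row, lags))
        hidden.append(1.0 / (1.0 + math.exp(-net)))      # logistic activation
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))


print(nnar_weight_count(1, 1))   # 4, matching the "1-1-1 network with 4 weights"

# Hypothetical weights for illustration only; not the fitted values.
forecast = nnar_one_step(lags=[0.5], hidden_weights=[[0.8]], hidden_biases=[0.1],
                         output_weights=[1.2], output_bias=-0.3)
print(forecast)
```

With p = 1 and k = 1 the network has one input weight and one bias for the hidden node, plus one weight and one bias for the output unit, which gives the four weights reported in the output.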
$Forecast_HoltWinters
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
23 25.05943 24.53140 25.58745 24.25188 25.86698
24 24.15237 23.47042 24.83433 23.10941 25.19533
25 23.24532 22.31579 24.17485 21.82372 24.66691
26 22.33826 21.09096 23.58556 20.43068 24.24584
27 21.43120 19.81246 23.04994 18.95555 23.90685
28 20.52415 18.49000 22.55830 17.41319 23.63511
29 19.61709 17.12949 22.10469 15.81264 23.42154
$HoltWinters
Holt-Winters exponential smoothing with trend and without seasonal component.
Call:
HoltWinters(x = al_mi_10, gamma = FALSE)
Smoothing parameters:
alpha: 0.4384154
beta: 0.8642498
gamma: FALSE
Coefficients:
        [,1]
a 25.9664829
b -0.9070558
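As a quick consistency check, the point forecasts in the Holt-Winters output above can be reproduced from the two reported coefficients. The sketch assumes that, without a seasonal component, the h-step-ahead point forecast is the final level plus h times the final trend.

```python
# Final level (a) and trend (b) copied from the Holt-Winters output above.
a = 25.9664829
b = -0.9070558

# Without a seasonal component, the h-step-ahead point forecast is a + h * b.
point_forecasts = [round(a + h * b, 5) for h in range(1, 8)]
print(point_forecasts)
```

The seven values agree with the Point Forecast column for periods 23 to 29 to five decimal places.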
From the analysis for forecasting using the Holt-Winters method (
$Tes_Data
[1] 25.1 24.2 23.3 22.5 21.7 20.9 20.2
$Forecast_AlphaSutte
[1] 25.01820 24.15097 23.28372 22.41644 21.64915 20.88184 20.11449
$Forecast_AutoARIMA
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
23 25.05055 24.44849 25.65261 24.12978 25.97132
24 24.11898 23.31295 24.92502 22.88626 25.35171
25 23.18742 22.07214 24.30271 21.48174 24.89310
26 22.25586 20.75409 23.75763 19.95910 24.55262
27 21.32430 19.37637 23.27223 18.34520 24.30340
28 20.39274 17.94907 22.83641 16.65547 24.13001
29 19.46118 16.47840 22.44395 14.89941 24.02294
$Forecast_SutteARIMA
Point Forecast Low 95 High 95
23 25.0344 23.7827 26.2861
24 24.1350 22.9282 25.3417
25 23.2356 22.0738 24.3973
26 22.3362 21.2193 23.4530
27 21.4867 20.4124 22.5611
28 20.6373 19.6054 21.6692
29 19.7878 18.7984 20.7772
After obtaining the forecasting models in the model specification section, the forecasting results for the testing data are shown in
Based on the forecasts for the testing data from the prediction methods presented above, the comparison of the forecasting results is presented in
ARIMA(0,2,2)
Data   Forecast   Low 95     High 95    APE           SE
25.1   25.05055   24.12978   25.97132   0.001293059   0.004307
24.2   24.11898   22.88626   25.35171   0.001324584   0.004228
23.3   23.18742   21.48174   24.89310   0.004135937   0.004151
22.5   22.25586   19.95910   24.55262   0.007163493   0.026847
21.7   21.32430   18.34520   24.30340   0.015005208   0.045486
20.9   20.39274   16.65547   24.13001   0.023422265   0.069017
20.2   19.46118   14.89941   24.02294   0.032479571   0.169880
Mean
NNAR(1,1)
Data   Forecast   Low 95     High 95    APE           SE
25.1   25.40859   24.13816   26.67902   0.012294422   0.095228
24.2   24.90951   23.66403   26.15499   0.029318595   0.503404
23.3   24.49229   23.26768   25.7169    0.051171245   1.421555
22.5   24.14630   22.93899   25.35362   0.073168889   2.710304
21.7   23.86136   22.66829   25.05443   0.099601843   4.671477
20.9   23.62803   22.44663   24.80943   0.130527751   7.442148
20.2   23.43788   22.26599   24.60977   0.160291089   10.483870
Mean
Holt-Winters
Data   Forecast   Low 95     High 95    APE           SE
25.1   25.05943   24.25188   25.86698   0.001616335   0.001646
24.2   24.15237   23.10941   25.19533   0.001968182   0.002269
23.3   23.24532   21.82372   24.66691   0.002346781   0.002990
22.5   22.33826   20.43068   24.24584   0.007188444   0.026160
21.7   21.43120   18.95555   23.90685   0.012387097   0.072253
20.9   20.52415   17.41319   23.63511   0.017983254   0.141263
20.2   19.61709   15.81264   23.42154   0.028856931   0.339784
Mean
SutteARIMA
Data   Forecast   Low 95    High 95   APE        SE
25.1   25.0344    23.7827   26.2861   0.002615   0.004307
24.2   24.1350    22.9282   25.3417   0.002687   0.004228
23.3   23.2356    22.0738   24.3973   0.002765   0.004151
22.5   22.3362    21.2193   23.4530   0.007282   0.026847
21.7   21.4867    20.4124   22.5611   0.009828   0.045486
20.9   20.6373    19.6054   21.6692   0.012570   0.069017
20.2   19.7878    18.7984   20.7772   0.020404   0.169880
Mean
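The accuracy measures can be recomputed directly from the observed values and the SutteARIMA forecasts for the testing data reported above. The sketch below assumes the usual definitions of MAPE (in percent) and MSE; the function names are introduced here for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Observed test data and SutteARIMA point forecasts reported above.
actual = [25.1, 24.2, 23.3, 22.5, 21.7, 20.9, 20.2]
sutte_arima = [25.0344, 24.1350, 23.2356, 22.3362, 21.4867, 20.6373, 19.7878]

print(round(mape(actual, sutte_arima), 2))   # about 0.83 (percent)
print(round(mse(actual, sutte_arima), 3))    # about 0.046
```

These values agree with the MAPE of 0.83% and the MSE of 0.046 reported for SutteARIMA.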
SutteARIMA forecast
Year   Forecast
2020   19.7557
2021   19.0523
2022   18.7566
2023   18.0748
2024   17.9185
Based on the forecast results in
The purpose of this research is to model infant mortality rate data and to find the best model for predicting it. To achieve this goal, four models (ARIMA, Holt-Winters, Neural Network Time Series, and SutteARIMA) are used to predict the infant mortality rate. To determine which model is the most suitable and precise, the MAPE and MSE values of each forecasting method are calculated and compared. Based on the findings of this study, the most suitable model, with the smallest forecast errors for the infant mortality data, is SutteARIMA, followed by Holt-Winters, ARIMA, and NNAR. The data trend and the forecasts indicate that the infant mortality rate is decreasing from year to year. The SutteARIMA method gives an estimated infant mortality rate of 19.7557 for 2020 and 17.9185 for 2024, a decline from 2019. These findings can help promote policies to reduce infant mortality in the coming years and can serve as a basis for implementing appropriate strategies so that Indonesia's SDGs targets can be achieved. Although the infant mortality rate can be predicted with a satisfactory level of accuracy, the predictions may prove imprecise because of human behavior and the decisions taken by policy makers.