Double Seasonal Autoregressive Integrated Moving Average (DSARIMA) for Stock Forecasting

Background: Stock price forecasting helps investors anticipate risks and opportunities, make prudent investments, and maximize returns. Objective: This study aims to identify the most accurate model for stock forecasting. Methods: This paper used the daily closing stock price of Unilever Indonesia, Tbk (UNVR) from January 1, 2018 to July 31, 2022. The Double Seasonal Autoregressive Integrated Moving Average (DSARIMA) model was applied. Mean Absolute Scaled Error (MASE) and Median Absolute Percentage Error (MdAPE) were used to compare forecasting accuracy. Results: After fitting each candidate model, we found that the best model is DSARIMAX (0,1,[4])([3],1,1)^5(1,1,0)^253, with MASE and MdAPE of approximately 1.423 and 0.111, respectively. The study is limited to a test set covering a one-month forecast period. Conclusion: As stock prices rise, investors require precise forecasts, so forecasting models must perform well. This analysis shows how DSARIMA can forecast stock prices more accurately. Among the models evaluated on the UNVR closing price, DSARIMAX (0,1,[4])([3],1,1)^5(1,1,0)^253 has the lowest MASE and MdAPE values, 1.423 and 0.111, respectively. The forecast horizon was one month. Future research may combine forecasts to improve their accuracy.


INTRODUCTION
Stocks play a significant role in the modern financial environment, serving as an essential channel for investment, asset creation, and business growth. Individuals can accumulate capital by purchasing stocks, which let investors benefit from capital appreciation and dividends and share in the success of businesses. Stock markets also facilitate international capital flows and so contribute to global integration: they enable foreign investors to participate in the expansion of other economies, thereby fostering economic interdependence.
The market's interest in Unilever (UNVR) stock is intense. This is because UNVR has a favorable reputation for quality and its products are readily accessible. In addition, UNVR is a defensive (movement-stable) stock. In recent years, however, the stock price of UNVR has declined. The decline was attributable to falling earnings in the home and personal care (HPC) segment. Although this segment accounted for over 67% of total annual revenue in 2021, it continued to decline. In addition, COVID-19 and market competition have affected the business. Stock prices fluctuate erratically from day to day, requiring investors to conduct technical analysis to reduce their risk exposure. Forecasting future fluctuations in stock prices is one strategy for mitigating this risk.
The importance of stock forecasting in today's volatile financial markets cannot be overstated. Forecasting future price movements and trends gives investors and traders a better understanding of stock behaviour and so supports informed investment decisions. It helps investors identify prospective winners, avoid poor investments, and optimise portfolio composition. Furthermore, it plays an important role in decision-making activities (Khoiriyah and Cahyani, 2022). Future stock prices can be forecast by time series analysis (Rifai, 2019). Time series analysis is an approach used to forecast future values based on historical data (Silfiani and Lembang, 2023). The autoregressive integrated moving average (ARIMA) model is one of the most common and well-known time series models, and it and its enhanced variants have provided excellent forecasting accuracy in a wide range of domains (Li, Wu, and Liu, 2023). Fields in which ARIMA has been applied to forecasting include the unemployment rate (Yamacli and Yamacli, 2023), oil production (Ning, Kazemi and Tahmasebi, 2022), daily reservoir inflow (Gupta and Kumar, 2022), and stock prices (Hayati and Ulama, 2016; Putri and Setiawan, 2015). ARIMA is frequently employed because its application and interpretation are simple (Perone, 2020). Furthermore, the ARIMA model has extensions that accommodate seasonality in time series data.
Stock market prices change over time and sometimes have seasonal patterns (Rahmadianto, Lesmana and Budiarti, 2022). Therefore, this study applied the double seasonal autoregressive integrated moving average approach to the UNVR stock price in order to forecast future stock price changes.

Research Design
The research design of this study is shown in Figure 1.

Research Subjects
The dataset in this research is the stock price of UNVR, the stock of PT Unilever Indonesia Tbk. The company produces, sells, and distributes household goods in Indonesia. Its products include soaps, detergents, dairy products, ice cream, savoury snacks, soy sauce, cosmetics, tea, and fruit juices. Unilever Indonesia Tbk is a subsidiary of Unilever PLC.

Data Analysis Technique
We applied DSARIMA, a variation of the autoregressive integrated moving average (ARIMA) model, to construct the forecasting model. In addition, we calculated the mean absolute scaled error (MASE) and median absolute percentage error (MdAPE) to evaluate forecasting accuracy. DSARIMA and the accuracy measures are explained as follows. DSARIMA extends ARIMA with two seasonal periods; the orders of the two seasonal components imply that the observed data contain two seasonal patterns. In general, the DSARIMA model is denoted by (1) (Dinata, et al., 2020).
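As a point of orientation, a multiplicative double seasonal ARIMA model is commonly written in the form below. This is a sketch under assumed notation (the symbols $\phi_p$, $\Phi_{P_1}$, $\Phi_{P_2}$, $\theta_q$, $\Theta_{Q_1}$, $\Theta_{Q_2}$, $a_t$ and the periods $s_1$, $s_2$ are chosen here for illustration) and is not necessarily the exact expression used in the cited source.

```latex
% Assumed notation: B is the backshift operator, B^k Y_t = Y_{t-k};
% a_t is a white-noise error term; s_1 and s_2 are the two seasonal periods.
\phi_p(B)\,\Phi_{P_1}(B^{s_1})\,\Phi_{P_2}(B^{s_2})\,
(1-B)^{d}\,(1-B^{s_1})^{D_1}\,(1-B^{s_2})^{D_2}\,Y_t
  = \theta_q(B)\,\Theta_{Q_1}(B^{s_1})\,\Theta_{Q_2}(B^{s_2})\,a_t
% Shorthand: ARIMA (p,d,q)(P_1,D_1,Q_1)^{s_1}(P_2,D_2,Q_2)^{s_2}
```

Read this way, a model labelled $(0,1,[4])([3],1,1)^{5}(1,1,0)^{253}$ would have $s_1 = 5$ and $s_2 = 253$, one regular difference and one seasonal difference at each period, and bracketed orders indicating that only the listed lags are included.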
(1) where:
The same techniques as for non-seasonal ARIMA apply to the construction of DSARIMA. Box and Jenkins built ARIMA models in four steps, namely identification, estimation, diagnostic checking, and forecasting (Hyndman and Koehler, 2006). Through model identification, features of the data series such as seasonality and stationarity are discovered (Bowerman, O'Connell and Koehler, 2005).
Through model estimation, the DSARIMA parameters are estimated. Diagnostic checking then examines whether the residuals are white noise and normally distributed. Outliers are incorporated into the DSARIMA model if the residuals do not follow a normal distribution. In time series, additive outliers (AO) and level shifts (LS) are two types of outliers. In general, the DSARIMA model with outliers follows (2) (Bowerman, O'Connell and Koehler, 2005). To measure the accuracy of the forecasting model, we applied MASE and MdAPE, which are expressed as (3) and (4), respectively.
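The two accuracy measures can be illustrated with a minimal sketch. It assumes the widely used definitions: MASE scales the mean absolute forecast error by the in-sample mean absolute error of a one-step naive forecast, and MdAPE is the median of the absolute percentage errors. The function names and the toy numbers are hypothetical, and whether the reported MdAPE is expressed in percent or as a fraction is not stated in the text, so the percent scaling here is an assumption.

```python
from statistics import mean, median


def mase(actual, forecast, train):
    """Mean absolute scaled error.

    The scale is the in-sample MAE of the one-step naive forecast
    (each training value predicted by the previous one).
    """
    scale = mean([abs(train[i] - train[i - 1]) for i in range(1, len(train))])
    mae = mean([abs(a - f) for a, f in zip(actual, forecast)])
    return mae / scale


def mdape(actual, forecast):
    """Median absolute percentage error, in percent (assumed convention)."""
    return median([100 * abs(a - f) / abs(a) for a, f in zip(actual, forecast)])


# Hypothetical toy numbers, not UNVR data.
train = [100, 102, 101, 104]   # naive in-sample errors: 2, 1, 3
actual = [100, 200, 50]
forecast = [110, 190, 49]      # absolute errors: 10, 10, 1

mase_value = mase(actual, forecast, train)
mdape_value = mdape(actual, forecast)
print(mase_value, mdape_value)
```

A MASE above 1 means the forecasts have a larger average error than the in-sample naive forecast, which is how a value such as 1.423 would be read under this definition.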

RESULTS AND DISCUSSION
Closing prices of Unilever (UNVR) were collected on weekdays, giving five data points per week. The central tendency and dispersion of the dataset are summarized by the mean and standard deviation. According to Table 1, the largest and smallest annual averages for UNVR from 1 January 2018 to 29 July 2022 are IDR 9447.5 and IDR 4205, respectively. In 2018 and 2021, the largest and smallest standard deviations were IDR 850.7 and IDR 1123.7, respectively. Table 1 also indicates that there are 253 observations per year.
The construction of a DSARIMA model requires four phases: identification, estimation, diagnostic checking, and forecasting. Stationarity and seasonality were examined during the identification phase. As shown in Figure 2a, UNVR exhibits a declining trend. Based on the features of the data, UNVR is expected to have two types of seasonality, weekly and annual, with periods of 5 (5 trading days in a week) and 253 (253 trading days in a year), respectively. In addition to mean stationarity, we also analyzed variance stationarity. We found that neither the mean nor the standard deviation of UNVR was constant (Figure 2b). Figure 2c shows that the autocorrelation function does not cut off at any lag, indicating that UNVR is a nonstationary process. To address this, we applied nonseasonal differencing of order d=1 and double seasonal differencing of orders D1=1 and D2=1 at seasonal periods 5 and 253.
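The differencing step can be sketched as follows. This is a minimal illustration that assumes one regular difference followed by one seasonal difference at each of the periods 5 and 253; the helper names and the toy series are hypothetical and do not reproduce the UNVR data.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for t = lag, ..., n - 1."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def double_seasonal_difference(series, s1=5, s2=253):
    """Apply (1 - B)(1 - B^s1)(1 - B^s2) to the series."""
    w = difference(series, 1)   # regular difference removes a linear trend
    w = difference(w, s1)       # removes a pattern repeating every s1 steps
    return difference(w, s2)    # removes a pattern repeating every s2 steps


# Hypothetical toy series: declining trend plus weekly and annual cycles.
n = 1500
toy = [
    9000.0
    - 2.0 * t
    + 50.0 * math.sin(2 * math.pi * (t % 5) / 5)
    + 200.0 * math.cos(2 * math.pi * (t % 253) / 253)
    for t in range(n)
]

w = double_seasonal_difference(toy)
print(len(toy), len(w))  # 1 + 5 + 253 = 259 observations are lost
```

On this purely deterministic toy series the differenced values are zero up to rounding error, because each operator removes one component. With real prices, what remains is the stochastic part that the ARMA terms are fitted to.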
The estimation phase determines the values of the DSARIMA parameters and examines the significance of each parameter using the t-test. We used conditional least squares to estimate the parameters and set the significance level (α) at 5%. If one or more parameters fail the t-test, the model must be respecified until every parameter is significant. The subsequent step is the diagnostic checking phase, which examines whether the DSARIMA residuals are white noise and normally distributed. We applied the Ljung-Box test for white noise and the Kolmogorov-Smirnov test for normality, each at a significance level (α) of 5%.
Suppose the residuals do not satisfy the white noise assumption up to lag 60. In that case, the model must be respecified and the estimation process repeated until all parameters are significant and the white noise criterion is met. If the residuals are not normally distributed at this stage, additional steps such as outlier detection are required. Outliers frequently prevent the residuals from satisfying the normality assumption, so they must be incorporated into the model before the residuals can be expected to follow a normal distribution.
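The white noise check can be illustrated by computing the Ljung-Box statistic directly. This is a minimal sketch of the standard formula Q = n(n+2) Σ r_k² / (n − k), not the software routine used in the study, which the text does not name. The function name and the toy residuals are hypothetical. The sketch returns only the statistic; the decision compares Q with a chi-square critical value whose degrees of freedom are the number of lags minus the number of estimated ARMA parameters.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    avg = sum(residuals) / n
    centered = [r - avg for r in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        r_k = num / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals that alternate in sign: strongly autocorrelated,
# so they should fail the white noise check.
toy_residuals = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
q_stat = ljung_box_q(toy_residuals, max_lag=10)

# For illustration: the 5% chi-square critical value with 10 degrees of
# freedom is about 18.31 (this assumes no ARMA parameters were estimated).
print(q_stat, q_stat > 18.31)
```

A statistic far above the critical value, as here, rejects the white noise hypothesis and would send the modeller back to the identification and estimation phases.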
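The two outlier types mentioned in the method can be represented as indicator variables added to the model. The sketch below assumes the usual definitions: an additive outlier (AO) is a pulse that affects a single time point, and a level shift (LS) is a step that affects every point from its onset onward. The function names and the index T are hypothetical, and the procedure the authors used to detect outlier positions is not described in the text.

```python
def additive_outlier(n, T):
    """Pulse indicator: 1 at time T, 0 elsewhere."""
    return [1 if t == T else 0 for t in range(n)]


def level_shift(n, T):
    """Step indicator: 0 before time T, 1 from time T onward."""
    return [1 if t >= T else 0 for t in range(n)]


# Hypothetical example with 6 time points and an outlier at index 2.
ao = additive_outlier(6, 2)
ls = level_shift(6, 2)
print(ao)  # [0, 0, 1, 0, 0, 0]
print(ls)  # [0, 0, 1, 1, 1, 1]

# First-differencing a level shift yields a pulse at the same position,
# which matters once the series has been differenced.
ls_diff = [ls[t] - ls[t - 1] for t in range(1, len(ls))]
print(ls_diff)  # [0, 1, 0, 0, 0]
```

Columns like these would enter the model as exogenous regressors, which is presumably what the X in DSARIMAX refers to.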