Multiscale geographically weighted regression with LASSO and Group LASSO: Review and application to micro and small enterprises revenue
DOI:
https://doi.org/10.58524/jgsa.v1i3.55
Keywords:
Group structured data, Kernel function, Micro Small Enterprises' income, Multiscale Spatial Modelling, Variable selection
Abstract
Micro and Small Enterprises (MSE) play a crucial role in the economy because they contribute 55% of the state's income, yet the sector still has many deficiencies, so prompt optimization is vital. The purpose of this study is to model and map MSE income at the regency level in West Java using Multiscale Geographically Weighted Regression (MGWR) with a variable selection process. MGWR captures spatial heterogeneity by allowing effects to vary over space: each local estimate "borrows" data from nearby locations, and the extent of that borrowing is controlled by a separate bandwidth for each variable. This research also adds variable selection procedures, namely LASSO and Group LASSO, as an improvement to MGWR for modelling group-structured data. The response variable is MSE income in 27 regencies/cities in West Java province, Indonesia, with 144 candidate independent variables from which LASSO and Group LASSO select the predictor variables for the MGWR model. The spatial modelling results show that the best model is MGWR with Group LASSO using the bi-square kernel function. Based on this model, the groups of variables that significantly affect MSE income are fertility, energy source, natural disaster, industry, and tourism. Fertility and energy source significantly affect MSE income in all regencies, although fertility on its own has no significant effect in big cities. Within the industry and tourism groups, the number of visiting foreign tourists has the most significant effect.
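The abstract names two building blocks without giving formulas: the bi-square kernel that turns distances into spatial weights, and the Group LASSO penalty that keeps or drops a whole group of variables together. The sketch below illustrates both under stated assumptions. It uses the commonly cited bi-square form w = (1 - (d/b)^2)^2 for d < b and 0 otherwise, and the standard group soft-thresholding step. The function names, distances, and penalty values are hypothetical, and this is not the authors' implementation.

```python
def bisquare_weights(distances, bandwidth):
    """Bi-square kernel weights (assumed standard form).

    w = (1 - (d / b)^2)^2 when d < b, otherwise 0, so locations at or
    beyond the bandwidth contribute nothing to the local fit.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = []
    for d in distances:
        if d < bandwidth:
            weights.append((1.0 - (d / bandwidth) ** 2) ** 2)
        else:
            weights.append(0.0)
    return weights


def group_soft_threshold(coefs, penalty):
    """Group soft-thresholding, the shrinkage step behind Group LASSO.

    The whole group is set to zero when its Euclidean norm does not
    exceed the penalty; otherwise every coefficient in the group is
    scaled by the same factor.
    """
    norm = sum(c * c for c in coefs) ** 0.5
    if norm <= penalty:
        return [0.0 for _ in coefs]
    scale = 1.0 - penalty / norm
    return [scale * c for c in coefs]


# Hypothetical distances (arbitrary units) from one regency to four others.
example_weights = bisquare_weights([0.0, 5.0, 10.0, 20.0], bandwidth=10.0)

# A hypothetical two-variable group under a weak and a strong penalty.
kept_group = group_soft_threshold([3.0, 4.0], penalty=2.5)
dropped_group = group_soft_threshold([3.0, 4.0], penalty=5.0)
```

In MGWR each predictor has its own bandwidth, so a weighting function like `bisquare_weights` would be evaluated with a different `bandwidth` per variable. A small bandwidth means the effect is estimated from close neighbours only, while a large one approaches a global estimate.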
License
Copyright (c) 2025 Juli Yandi Rahman, Farell Fillyanno Zevic, Wawan Hendriawan Nur, Irsyad Ramli (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

