An econometric and Machine Learning study on individuals who became poor during the pandemic based on the Continuous PNAD

Authors

Roberto Santolin

Patrick Gomes de Oliveira

DOI:

https://doi.org/10.20435/multi.v28i69.4104

Keywords:

Econometrics, poverty, Machine Learning

Abstract

This study aims to investigate the relationship between poverty and the COVID-19 pandemic using microdata from the Continuous PNAD. Two methodologies were used to approach the topic from different angles: 1) Econometrics and 2) Machine Learning. The study focuses on identifying the main determinants of poverty during the pandemic period and on predicting individuals' vulnerability to poverty using Machine Learning. The results indicate a higher likelihood of transitioning into poverty for non-white individuals, women, residents of metropolitan areas, individuals in larger families, and those with lower educational attainment. Furthermore, the XGBoost algorithm performed best in predicting poverty after data balancing. These results can support decision-making in combating poverty in Brazil.
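
The abstract notes that prediction improved after data balancing but does not say which balancing method was used. As a minimal sketch only, assuming plain random undersampling of the majority class (an illustrative stand-in, not necessarily the authors' procedure), the step could look like the following. The labels are synthetic, not Continuous PNAD microdata, and the function name `undersample_majority` is hypothetical.

```python
import random
from collections import Counter


def undersample_majority(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    Assumption: random undersampling, i.e. every class is cut down
    to the size of the smallest class by sampling without replacement.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Size of the smallest class sets the target size for all classes.
    n_min = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_min))
    return sorted(kept)


# Synthetic labels (NOT survey data): 1 = became poor, 0 = did not.
labels = [1] * 50 + [0] * 950

balanced_idx = undersample_majority(labels, seed=42)
balanced_counts = Counter(labels[i] for i in balanced_idx)
print(balanced_counts)
```

The returned indices would then select the rows used to train a classifier such as XGBoost. Balancing should be applied to the training split only, so that the evaluation data keep the original class proportions.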

Author Biographies

Roberto Santolin, Universidade Federal Rural do Rio de Janeiro (UFRRJ)

PhD in Economics from the Center for Regional Development and Planning of the Universidade Federal de Minas Gerais (CEDEPLAR/UFMG). Associate Professor at the Universidade Federal Rural do Rio de Janeiro (UFRRJ), Três Rios campus. Permanent faculty member of the Graduate Program in Applied Economics at the Universidade Federal de Ouro Preto (PPEA/UFOP).

Patrick Gomes de Oliveira, Universidade Federal Rural do Rio de Janeiro (UFRRJ)

Bachelor's degree in Economics from the Universidade Federal Rural do Rio de Janeiro (UFRRJ), Seropédica campus.

Published

2023-10-04

How to Cite

Santolin, R., & Oliveira, P. G. de. (2023). An econometric and Machine Learning study on individuals who became poor during the pandemic based on the Continuous PNAD. Multitemas, 28(69), 233–257. https://doi.org/10.20435/multi.v28i69.4104