A Data Mining Approach to Detecting Bias and Favoritism in Public Procurement
Abstract:
Corruption can occur at any stage of a public procurement process when a contracting entity favors a participant with whom it has a prior agreement. The result can be overpricing, the purchase of substandard products, and gender discrimination. This paper aims to detect biased purchases using a Spanish-language corpus, analyzing the text that applicants post on the question-and-answer registry platform of public procurement processes in Ecuador. It also detects gender bias, with the goal of enabling men and women to participate under the same conditions. To detect gender bias and favoritism towards certain providers by contracting entities, the study proposes a hybrid model that combines Artificial Intelligence algorithms and Natural Language Processing (NLP). The experimental work analyzed 303,076 public procurement processes over 10 years (since 2010), comprising 1,009,739 questions and answers exchanged between suppliers and public institutions. Gender bias and favoritism were analyzed using a Word2Vec word-embedding model, together with sentiment analysis of the questions and answers using the VADER algorithm. In 32% of cases (96,984 answers), responses from contracting entities showed evidence of favoritism or gender bias. The proposed model achieves an accuracy of 88% for detecting favoritism and 90% for detecting gender bias. Consequently, about one-third of the procurement processes carried out by the state show indications of corruption and bias. Government corruption is one of the most significant challenges in Latin America, which makes the resulting classifier useful for detecting bias and favoritism in public procurement processes.
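As an illustration of how word embeddings can expose a gender lean, the following minimal sketch projects words onto a "hombre" minus "mujer" direction and reports the cosine similarity. It is not the authors' model: the three-dimensional vectors are invented toy values rather than learned Word2Vec embeddings, and the names `TOY_VECTORS`, `cosine`, and `gender_lean` are hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. A real model
# would learn high-dimensional vectors from the question-and-answer corpus.
TOY_VECTORS = {
    "hombre": [1.0, 0.0, 0.2],
    "mujer": [-1.0, 0.0, 0.2],
    "ingeniero": [0.6, 0.8, 0.1],
    "proveedor": [0.0, 1.0, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_lean(word, vectors=TOY_VECTORS):
    """Score in [-1, 1]: positive leans towards 'hombre', negative towards 'mujer'."""
    # Assumed gender direction: the difference between the two anchor words.
    direction = [m - w for m, w in zip(vectors["hombre"], vectors["mujer"])]
    return cosine(vectors[word], direction)


if __name__ == "__main__":
    for term in ("ingeniero", "proveedor"):
        print(term, round(gender_lean(term), 3))
```

With these toy values, "ingeniero" scores positive and "proveedor" scores zero. With learned embeddings, a score far from zero for an occupation or role word would be one possible signal of gendered language worth reviewing.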
Publication year:
2023
Keywords:
- Natural language processing
- Bias
- Favoritism
- Word embeddings
- Word2Vec
- Sentiment analysis
Source:
Document type:
Article
Status:
Open access
Knowledge areas:
- Data mining
- Auditing
Subject areas:
- Library and archive operations