Exploring the Impact of Toxic Comments in Code Quality


Abstract:

Software development has an important human side: developers' feelings have a significant impact on software development and can affect developers' quality, productivity, and performance. In this paper, we explore how to find, understand, and relate the effects of toxic emotions on code quality. We propose a tool and a sentiment dataset: a clean set of commit messages that combines SonarQube code quality metrics with toxic comments obtained from GitHub. We also perform a preliminary statistical analysis of the dataset. We apply natural language processing techniques to identify toxic developer sentiments in commits that could affect code quality. Our study describes the data retrieval process and the tools used for the preliminary analysis. The preliminary dataset is available in CSV format to make it easy to query the data and to investigate in depth the factors that affect developer emotions. Preliminary results suggest a relationship between toxic comments and code quality that may affect the quality of the software project. Future work will develop a complete dataset and an in-depth analysis, including efficiency validation experiments and a linear regression. Finally, we will estimate code quality as a function of developers' toxic comments.
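As an illustration of the planned final step, estimating code quality as a function of toxic comments, the following is a minimal sketch of a one-variable least-squares fit. It is not the authors' implementation: the function name `fit_line`, the variable names, and the toy numbers are all hypothetical, and the abstract does not say which SonarQube metric or toxicity measure the regression would use.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical toy values, NOT data from the paper:
# toxic comments counted per project, and a code-quality issue count per project.
toxic_comment_counts = [0, 1, 2, 3, 4]
quality_issue_counts = [2, 4, 6, 8, 10]

slope, intercept = fit_line(toxic_comment_counts, quality_issue_counts)
print(f"issues = {intercept:.2f} + {slope:.2f} * toxic_comments")
```

With real data, the predictor would come from the toxic-comment classifier and the response from a SonarQube metric; a fitted slope only describes association and does not by itself show that toxic comments cause lower quality.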

Publication year:

2022

Keywords:

  • Commits
  • GitHub
  • Sentiment Analysis
  • SonarQube
  • Toxic comment classification
  • Software Engineering
  • Software quality

Source:

Scopus

Document type:

Conference Object

Status:

Open access

Knowledge areas:

  • Software engineering
  • Computer science
  • Software

Subject areas:

  • Computer programming, programs, data, security
  • Financial economics
  • Applied physics