On the value of parameter tuning in heterogeneous ensembles effort estimation


Abstract:

Accurate software development effort estimation (SDEE) is fundamental to the efficient management of software development projects, as it helps software managers manage their human resources efficiently. Over the last four decades, software engineering researchers have used several effort estimation techniques, including those based on statistical and machine learning methods, yet no consensus has been reached on which technique performs best in all circumstances. To tackle this challenge, ensemble effort estimation, which predicts software development effort by combining more than one solo estimation technique, has recently been investigated. In this paper, heterogeneous ensembles based on four well-known machine learning techniques (K-nearest neighbor, support vector regression, multilayer perceptron and decision trees) were developed and evaluated by investigating the impact of the parameter values of the ensemble members on estimation accuracy. In particular, this paper evaluates whether setting ensemble parameters using two optimization techniques (grid search and particle swarm optimization) permits more accurate estimates of SDEE. The heterogeneous ensembles of this study were built using three combination rules (mean, median and inverse ranked weighted mean) over seven datasets. The results obtained suggest that: (1) single techniques optimized using grid search or particle swarm optimization provide more accurate estimation; (2) in general, ensembles achieve higher accuracy than their single techniques whatever the optimization technique used, even though ensembles do not dominate all single techniques; (3) heterogeneous ensembles based on optimized single techniques provide more accurate estimation; and (4) generally, particle swarm optimization and grid search generate ensembles with the same predictive capability.
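
The three combination rules named in the abstract can be illustrated with a minimal sketch. The abstract does not define the weighting scheme, so the inverse ranked weighted mean shown here is an assumption: the member with the lowest validation error receives weight n, the next n - 1, and so on down to 1. The function name, the estimates and the error values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def inverse_ranked_weighted_mean(estimates, errors):
    """Combine member estimates, weighting each by its inverse error rank.

    Assumed scheme (not taken from the paper): with n members, the member
    with the lowest error gets weight n and the worst member gets weight 1.
    """
    n = len(estimates)
    # Indices of the members sorted from lowest error (best) to highest.
    order = sorted(range(n), key=lambda i: errors[i])
    weights = [0] * n
    for rank, i in enumerate(order, start=1):
        weights[i] = n - rank + 1
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


# Hypothetical effort estimates for one project from four ensemble members
# (KNN, SVR, MLP, decision tree), with hypothetical validation errors.
estimates = [120.0, 150.0, 135.0, 200.0]
errors = [0.30, 0.20, 0.25, 0.50]

combined = {
    "mean": mean(estimates),
    "median": median(estimates),
    "irwm": inverse_ranked_weighted_mean(estimates, errors),
}
print(combined)
```

Under this assumed scheme, the weighted rule pulls the combined estimate toward the members that ranked best on validation data, whereas the mean treats all members equally and the median discards the influence of extreme estimates.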

Year of publication:

2018

Keywords:

  • Swarm optimization
  • Software effort estimation
  • Heterogeneous ensembles
  • Machine learning
  • Optimization techniques

Source:

Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Software engineering
  • Software
  • Statistical model

Subject areas: