Exploring the QSAR’s predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets


Abstract:

Graph derivative indices (GDIs) have recently been defined over N atoms simultaneously (N = 2, 3 and 4). They are based on the concept of the derivative in discrete mathematics (finite differences), analogous to the derivative in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). The GDIs have been successfully applied to the description of physicochemical properties such as reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure-activity/property relationship (QSAR/QSPR) studies. Although previous modelling studies with these indices gave satisfactory results, new and more rigorous analyses are needed to assess the true predictive performance of the novel structure codification. The present paper therefore assesses and statistically validates the performance of these approaches in QSAR studies, and compares it with that of other QSAR procedures reported in the literature. To this end, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation and validation of QSAR methods and of many different MDs (mainly 3D MDs). Models with three to seven variables were built for each chemical dataset, following the original division into training and test sets. The models were developed using multiple linear regression (MLR) coupled with a genetic algorithm as the wrapper feature selection technique in the MobyDigs software. Each family of GDIs (duplex, triplex and quadruplex) behaved similarly across the modelling experiments, with some exceptions. However, when all families were used in combination, the results were quantitatively better than those reported by other authors in similar experiments. Comparisons of the external correlation coefficients (q²ext) revealed that the models based on GDIs have superior predictive ability in seven of the eight datasets analysed, outperforming methodologies based on similar or more complex techniques and confirming the good predictive power of the models obtained. For the q²ext values, the non-parametric comparison revealed results significantly different from those reported so far, demonstrating that the models based on DIVATI’s indices achieved the best global performance and yielded significantly better predictions than the twelve 0–3D QSAR procedures used in the comparison. GDIs are therefore suitable for the structural codification of molecules and constitute a good alternative for building QSARs to predict physicochemical, biological and environmental endpoints.
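As an illustration of the external validation statistic mentioned in the abstract, the sketch below computes an external q² for a test set. It assumes one commonly used definition, 1 - PRESS / SS, where SS is the sum of squared deviations of the observed test values from the training-set mean. The abstract does not state which formula was used, so this choice, the function name `q2_ext` and the toy numbers are assumptions for illustration only and are not data from the paper.

```python
def q2_ext(y_train, y_test, y_test_pred):
    """External q2 under an assumed definition: 1 - PRESS / SS_ext.

    PRESS  : sum of squared prediction errors on the external test set.
    SS_ext : sum of squared deviations of the observed test values
             from the mean of the training-set responses.
    """
    if not y_train or not y_test:
        raise ValueError("training and test responses must be non-empty")
    if len(y_test) != len(y_test_pred):
        raise ValueError("observed and predicted test values differ in length")

    train_mean = sum(y_train) / len(y_train)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_test_pred))
    ss_ext = sum((obs - train_mean) ** 2 for obs in y_test)
    if ss_ext == 0:
        raise ValueError("test responses all equal the training mean")
    return 1.0 - press / ss_ext


# Toy values, purely hypothetical (not taken from any benchmark dataset).
y_train = [1.0, 2.0, 3.0, 4.0]   # training responses, mean 2.5
y_test = [1.5, 3.5]              # observed test responses
y_pred = [1.0, 3.0]              # model predictions for the test set

print(q2_ext(y_train, y_test, y_pred))  # 0.75
```

Under this definition a value of 1 means perfect external prediction, while values at or below 0 mean the model predicts the test set no better than the training-set mean.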

Publication year:

2017

Keywords:

  • graph derivative indices
  • genetic algorithm
  • Friedman test for multiple comparisons
  • TOMOCOMD system
  • free and open source software
  • CARDD suite
  • feature selection
  • Keysfinder framework
  • DIVATI module
  • multiple linear regression
  • QSAR model
  • molecular descriptors

Source:

Scopus

Document type:

Article

Status:

Restricted access

Knowledge areas:

  • Quantitative structure-activity relationship

Subject areas:

  • Computer programming, programs, data, security
  • Physical chemistry
  • Human physiology