Examining the predictive accuracy of the novel 3D N-linear algebraic molecular codifications on benchmark datasets


Abstract:

Background: Novel 3D alignment-free molecular descriptors (known as QuBiLS-MIDAS) based on two-linear, three-linear and four-linear algebraic forms have recently been introduced. These descriptors codify chemical information for relations between two, three and four atoms by using several (dis-)similarity metrics and multi-metrics. Several studies have assessed the quality of these descriptors, but a deeper analysis of their performance is still necessary. This manuscript therefore presents an assessment and statistical validation of their performance in QSAR studies.

Results: Eight molecular datasets widely used as benchmarks in the evaluation of several procedures are utilized (angiotensin converting enzyme, acetylcholinesterase inhibitors, benzodiazepine receptor, cyclooxygenase-2 inhibitors, dihydrofolate reductase inhibitors, glycogen phosphorylase b, thermolysin inhibitors, thrombin inhibitors). QSAR models with three to nine variables, based on Multiple Linear Regression, are built for each chemical dataset according to the original division into training/test sets. Comparisons of the leave-one-out cross-validation correlation coefficients ($Q_{loo}^{2}$) reveal that the models based on QuBiLS-MIDAS indices possess superior predictive ability in 7 of the 8 datasets analyzed, outperforming methodologies based on similar or more complex techniques such as Partial Least Squares, Neural Networks, Support Vector Machines and others. In addition, superior external correlation coefficients ($Q_{ext}^{2}$) are attained in 6 of the 8 test sets considered, confirming the good predictive power of the obtained models. Non-parametric statistical tests performed on the $Q_{ext}^{2}$ values demonstrated that the models based on QuBiLS-MIDAS indices have the best global performance and yield significantly better predictions than 11 of the 12 QSAR procedures used in the comparison. Lastly, the performance of the indices was studied across several conformer generation methods. This showed that the quality of the predictions of the QSAR models based on QuBiLS-MIDAS indices depends on the 3D structure generation method considered, although in this preliminary study the differences among the results were not statistically significant.

Conclusions: The QuBiLS-MIDAS indices are suitable for extracting structural information from molecules and thus constitute a promising alternative for building models that contribute to the prediction of pharmacokinetic, pharmacodynamic and toxicological properties of novel compounds.
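The two validation statistics cited in the Results can be illustrated with a minimal sketch. It is not the authors' code: the descriptor matrix is synthetic random data standing in for QuBiLS-MIDAS indices, the helper names (`fit_mlr`, `predict_mlr`, `q2_loo`, `q2_ext`) are hypothetical, and the formulas are assumptions, since the abstract does not give them. Here $Q_{loo}^{2}$ is taken as 1 - PRESS/SS_total, and $Q_{ext}^{2}$ as one common variant that references the training-set mean.

```python
import numpy as np

def fit_mlr(X, y):
    """Fit a multiple linear regression with intercept by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses for the rows of X using fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def q2_loo(X, y):
    """Leave-one-out Q^2 (assumed form): 1 - PRESS / SS_total."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # drop compound i
        coef = fit_mlr(X[keep], y[keep])      # refit without it
        pred = predict_mlr(coef, X[i:i + 1])[0]
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def q2_ext(X_train, y_train, X_test, y_test):
    """External Q^2 (assumed variant): test errors vs. training-set mean."""
    coef = fit_mlr(X_train, y_train)
    pred = predict_mlr(coef, X_test)
    ss_res = np.sum((y_test - pred) ** 2)
    ss_ref = np.sum((y_test - y_train.mean()) ** 2)
    return 1.0 - ss_res / ss_ref

# Synthetic stand-in data: 40 "compounds", 3 "descriptors" (assumption).
rng = np.random.default_rng(0)
w = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(40, 3))
y = X @ w + 2.0 + 0.1 * rng.normal(size=40)

# Fixed training/test split, mirroring the use of a predefined division.
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

loo = q2_loo(X_train, y_train)
ext = q2_ext(X_train, y_train, X_test, y_test)
print(f"Q2_loo = {loo:.3f}, Q2_ext = {ext:.3f}")
```

Values close to 1 indicate that held-out activities are predicted well. Other definitions of the external coefficient exist (for example, one that references the test-set mean), so figures computed this way are not directly comparable with the paper's unless the same definition is used.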

Publication year:

2016

Keywords:

  • 3D-QSAR
  • QuBiLS-MIDAS
  • multiple linear regression
  • ToMoCoMD-CARDD

Source:

Scopus

Document type:

Article

Status:

Open access

Knowledge areas:

  • Chemistry

Subject areas:

  • Computer science