Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches


Abstract:

Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, showed good statistical parameters for the learning series: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthews correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was carried out by using two test sets and five external test sets, with an average accuracy of 0.854 (±0.062) and a coverage of 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model against results and tools reported in the literature (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a grouping pattern very similar to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. A noncommercial and fully cross-platform software for DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra. This software was used to screen seven data sets, containing natural products, leads, toxic materials, and FDA-approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.

Publication year:

2020

Keywords:

    Source:

    Scopus

    Document type:

    Article

    Status:

    Restricted access

    Knowledge areas:

    • Machine learning
    • Pharmacology

    Subject areas:

    • Computer programming, programs, data, security
    • Medicine and health
    • Diseases