Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks

  • Identifying data

    Identifier: imarina:9287714
    Authors:
    Saldivar-Espinoza, Bryan; Macip, Guillem; Garcia-Segura, Pol; Mestres-Truyol, Julia; Puigbo, Pere; Cereto-Massague, Adria; Pujadas, Gerard; Garcia-Vallve, Santiago
    Abstract:
    Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity value of 0.79, and an Area Under the Curve (AUC) of 0.8, showing that the prediction of recurrent SARS-CoV-2 mutations is feasible. Subsequently, we compared our predictions with updated data from January 2022, showing that some of the false positives in our prediction model become true positives later on. The most important variables detected by the model's Shapley Additive exPlanation (SHAP) are the nucleotide that mutates and RNA reactivity. This is consistent with the SARS-CoV-2 mutational bias pattern and the preference of some host deaminases for specific sequences and RNA secondary structures. We extend our investigation by analyzing the mutations from the variants of concern Alpha, Beta, Delta, Gamma, and Omicron. Finally, we analyzed amino acid changes by looking at the predicted recurrent mutations in the M-pro and spike proteins.
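The specificity, sensitivity, and AUC values quoted in the abstract follow the standard definitions for binary classifiers. A minimal sketch of how such metrics can be computed is given here. The labels and scores are toy values invented for illustration (not data from the article), the function names are hypothetical, and the 0.5 decision threshold is an assumption rather than the authors' setting.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = recurrent mutation)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.35, 0.6, 0.75]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc_value:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why a study may report all three.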
  • Other:

    Author(s) as listed in the article: Saldivar-Espinoza, Bryan; Macip, Guillem; Garcia-Segura, Pol; Mestres-Truyol, Julia; Puigbo, Pere; Cereto-Massague, Adria; Pujadas, Gerard; Garcia-Vallve, Santiago
    Department: Bioquímica i Biotecnologia (Biochemistry and Biotechnology)
    URV author(s): Cereto Massagué, Adrián José / Garcia Vallve, Santiago / Macip Sancho, Guillem / Puigbò Avalos, Pedro / Pujadas Anguiano, Gerard / Saldivar Espinoza, Bryan Percy
    Keywords: SARS-CoV-2; Mutations; Machine learning; COVID-19
    Subject areas: Animal science / fisheries resources; Spectroscopy; Public health; Chemistry; Psychology; Physical and theoretical chemistry; Organic chemistry; Dentistry; Nutrition; Molecular biology; Medicine (miscellaneous); Veterinary medicine; Medicine III; Medicine II; Medicine I; Materials; Interdisciplinary; Inorganic chemistry; Geosciences; Pharmacy; Engineering IV; Engineering II; Engineering I; Physical education; Computer science applications; Biological sciences III; Biological sciences II; Biological sciences I; Environmental sciences; Agricultural sciences I; Food science; Computer science; Chemistry, multidisciplinary; Catalysis; Biotechnology; Biodiversity; Biochemistry & molecular biology; Astronomy / physics
    Usage license: https://creativecommons.org/licenses/by/3.0/es/
    Author email addresses: adrianjose.cereto@urv.cat; bryanpercy.saldivar@estudiants.urv.cat; guillem.macip@estudiants.urv.cat; santi.garcia-vallve@urv.cat; gerard.pujadas@urv.cat
    Author identifiers: 0000-0002-9667-2818; 0000-0002-0348-7497; 0000-0003-2598-8089
    Record creation date: 2024-09-07
    Version of the deposited article: info:eu-repo/semantics/publishedVersion
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: International Journal of Molecular Sciences. 23 (23):
    Item reference in APA style: Saldivar-Espinoza, Bryan; Macip, Guillem; Garcia-Segura, Pol; Mestres-Truyol, Julia; Puigbo, Pere; Cereto-Massague, Adria; Pujadas, Gerard; Garcia-Val (2022). Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks. International Journal of Molecular Sciences, 23(23), -. DOI: 10.3390/ijms232314683
    Entity: Universitat Rovira i Virgili
    Journal publication year: 2022
    Publication type: Journal Publications
  • Keywords:

    Biochemistry & Molecular Biology; Catalysis; Chemistry, Multidisciplinary; Computer Science Applications; Inorganic Chemistry; Medicine (Miscellaneous); Molecular Biology; Organic Chemistry; Physical and Theoretical Chemistry; Spectroscopy