Contamination of fungal genomes of Onygenaceae (Phylum Ascomycota) in public databases: incidence, detection, and impact

  • Identification data

    Identifier:  imarina:9469348
    Authors:  Granados-Casas, AO; Fernández-Bravo, A; Stchigel, AM; Cano-Lira, JF
    Abstract:
    Genomic datasets often contain unwanted, foreign, or erroneous nucleotide sequences that do not belong to the organism under study. Such contamination can significantly compromise genome analyses, reducing the accuracy and reliability of the results. Despite its potential impact, few studies have addressed the contamination of fungal genomes by exogenous sequences. Here, we analyzed eleven publicly available genomes of fungi from the family Onygenaceae, retrieved from the National Center for Biotechnology Information (NCBI) database. A comprehensive quality assessment was performed, evaluating genome completeness, contiguity, and contamination levels. Genomes with lower quality statistics and putative contamination were selected for further improvement. To enhance assembly quality, we built a custom Kraken 2 database including four high-quality genomes of closely related fungal taxa. After filtering, we reassessed the genomes to compare contiguity, completeness, and contamination levels before and after the process. Furthermore, structural and functional annotation was conducted to evaluate changes in predicted proteins, protein families, and domains. Additionally, average nucleotide identity and phylogenetic analyses were performed to further assess the impact of the filtering. Four genomes showed low-quality statistics and contamination levels between 5 and 12%, mainly of bacterial origin. After removing the contaminated regions, assembly quality metrics improved, and contamination levels dropped below 3% in all cases. Functional annotation of the filtered assemblies revealed a reduction in bacteria-associated protein families. Our results demonstrate the presence of contamination in publicly available Onygenaceae fungal genomes and highlight its potential to bias downstream analyses. We emphasize the importance of contamination screening and removal to ensure reliable genomic data for fungal research.
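    The filtering step described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline. It assumes Kraken 2's standard per-sequence output (tab-separated, with `C` or `U` in the first column and the sequence ID in the second), and it assumes, as a simplification, that contigs classified against a custom database of related fungi are retained while unclassified contigs are discarded. The function names `classified_contigs`, `read_fasta`, and `filter_assembly` are hypothetical.

```python
from typing import Iterable, Iterator, List, Set, Tuple


def classified_contigs(kraken_lines: Iterable[str]) -> Set[str]:
    """Return IDs of sequences that Kraken 2 marked as classified ('C')."""
    keep: Set[str] = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        status, seq_id = fields[0], fields[1]
        if status == "C":
            keep.add(seq_id)
    return keep


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_assembly(
    fasta_lines: Iterable[str], keep_ids: Set[str]
) -> List[Tuple[str, str]]:
    """Keep only contigs whose ID (first word of the header) is in keep_ids."""
    kept = []
    for header, seq in read_fasta(fasta_lines):
        seq_id = header.split()[0]
        if seq_id in keep_ids:
            kept.append((header, seq))
    return kept
```

    In practice the retention rule is the key design choice: with a database built only from related fungal genomes, "classified" is being used here as a proxy for "likely fungal", which is an assumption of this sketch rather than something the record states.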
  • Others:

    Link to the original source: https://link.springer.com/journal/12864
    APA: Granados-Casas, AO; Fernández-Bravo, A; Stchigel, AM; Cano-Lira, JF (2025). Contamination of fungal genomes of Onygenaceae (Phylum Ascomycota) in public databases: incidence, detection, and impact. BMC Genomics, 26(1), 1057. DOI: 10.1186/s12864-025-12223-3
    Paper original source: BMC Genomics. 26 (1): 1057
    Article's DOI: 10.1186/s12864-025-12223-3
    Journal publication date: 2025-11-19
    Entity: Universitat Rovira i Virgili
    Paper version: info:eu-repo/semantics/publishedVersion
    Record's date: 2026-02-11
    URV's Author/s: Cano Lira, José Francisco / Granados Casas, Alan Omar / Stchigel Glikman, Alberto Miguel
    Department: Ciències Mèdiques Bàsiques
    Licence document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication Type: Journal Publications
    Authors, as they appear in the article: Granados-Casas, AO; Fernández-Bravo, A; Stchigel, AM; Cano-Lira, JF
    Licence for use: https://creativecommons.org/licenses/by/3.0/es/
    e-ISSN: 1471-2164
    Research group: Unitat de Micologia i Microbiologia Ambiental
    Thematic Areas: Animal science / fisheries resources, Public health, Chemistry, Dentistry, Veterinary medicine, Medicine III, Medicine II, Medicine I, Mathematics / probability and statistics, Interdisciplinary, Genetics & heredity, Genetics, Pharmacy, Engineering IV, Engineering III, Engineering II, Physical education, Biological sciences III, Biological sciences II, Biological sciences I, Environmental sciences, Agricultural sciences I, Food science, Computer science, Biotechnology & applied microbiology, Biotechnology, Biodiversity, Astronomy / physics
    Authors' email: alanomar.granados@urv.cat, albertomiguel.stchigel@urv.cat, jose.cano@urv.cat
  • Keywords:

    Whole genome sequencing
    Taxonomy
    Software
    Quality assessment
    Phylogeny
    Onygenales
    Molecular sequence annotation
    Genomics
    Genome, fungal
    Fungi
    DNA contamination
    Databases, genetic
    Coverage
    Contamination
    Bacteria
    Ascomycota
    Algorithm