Consistencies and inconsistencies between model selection and link prediction in networks

  • Identifying data

    Identifier: imarina:5133290
    Authors:
    Vallès-Català T, Peixoto TP, Sales-Pardo M, Guimerà R
    Abstract:
    A principled approach to understand network structures is to formulate generative models. Given a collection of models, however, an outstanding key task is to determine which one provides a more accurate description of the network at hand, discounting statistical fluctuations. This problem can be approached using two principled criteria that at first may seem equivalent: selecting the most plausible model in terms of its posterior probability; or selecting the model with the highest predictive performance in terms of identifying missing links. Here we show that while these two approaches yield consistent results in most cases, there are also notable instances where they do not, that is, where the most plausible model is not the most predictive. We show that in the latter case the improvement of predictive performance can in fact lead to overfitting both in artificial and empirical settings. Furthermore, we show that, in general, the predictive performance is higher when we average over collections of models that are individually less plausible than when we consider only the single most plausible model.
  • Other:

    Author as listed in the article: Vallès-Català T, Peixoto TP, Sales-Pardo M, Guimerà R
    Department: Chemical Engineering (Enginyeria Química)
    URV author(s): Guimera Manrique, Roger / Sales Pardo, Marta
    Keywords: Hashtag, «#» tag, @uroweb, @residentesaeu, @infoAeu
    Subject areas: Animal science / fishery resources; Statistics and probability; Statistical and nonlinear physics; Public health; Chemistry; Physics, mathematical; Physics, fluids & plasmas; Dentistry; Medicine II; Medicine I; Materials; Mathematics / probability and statistics; Interdisciplinary; Geosciences; General medicine; Pharmacy; Engineering IV; Engineering III; Engineering II; Physical education; Education; Economics; Condensed matter physics; Biological sciences II; Biological sciences I; Environmental sciences; Agricultural sciences I; Computer science; Biotechnology; Biodiversity; Astronomy / physics
    ISSN: 1063651X
    Author's e-mail address: roger.guimera@urv.cat marta.sales@urv.cat
    Author identifier: 0000-0002-3597-4310 0000-0002-8140-6525
    Record creation date: 2024-09-07
    Journal volume: 97
    Deposited article version: info:eu-repo/semantics/publishedVersion
    Link to original source: https://journals.aps.org/pre/abstract/10.1103/PhysRevE.97.062316
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: Physical Review E. 97 (6-1): 062316-
    Item reference in APA style: Vallès-Català T, Peixoto TP, Sales-Pardo M, Guimerà R (2018). Consistencies and inconsistencies between model selection and link prediction in networks. Physical Review E, 97(6-1), 062316-. DOI: 10.1103/PhysRevE.97.062316
    Article DOI: 10.1103/PhysRevE.97.062316
    Institution: Universitat Rovira i Virgili
    Journal publication year: 2018
    First page: Article number 062316
    Publication type: Journal Publications
  • Keywords:

    Condensed Matter Physics; Physics, Fluids & Plasmas; Physics, Mathematical; Statistical and Nonlinear Physics; Statistics and Probability