Authors, as they appear in the article: Piga, Angelo; Font-Pomarol, Lluc; Sales-Pardo, Marta; Guimera, Roger
Department: Chemical Engineering (Enginyeria Química)
URV's Author/s: Font Pomarol, Lluc / Guimera Manrique, Roger / Piga, Angelo / Sales Pardo, Marta
Keywords: Bayesian estimation; Entropy estimation; Inference; Information theory; Kullback-Leibler divergence; Shannon entropy; Sparse sampling
Abstract: Estimating the Shannon entropy of a discrete distribution from which we have only observed a small sample is challenging. Estimating other information-theoretic metrics, such as the Kullback-Leibler divergence between two sparsely sampled discrete distributions, is even harder. Here, we propose a fast, semi-analytical estimator for sparsely sampled distributions. Its derivation is grounded in probabilistic considerations and uses a hierarchical Bayesian approach to extract as much information as possible from the few observations available. Our approach provides estimates of the Shannon entropy with precision at least comparable to the benchmarks we consider, and most often higher; it does so across diverse distributions with very different properties. Our method can also be used to obtain accurate estimates of other information-theoretic metrics, including the notoriously challenging Kullback-Leibler divergence. Here, again, our approach has less bias, overall, than the benchmark estimators we consider.
Thematic Areas: Applied mathematics; Astronomy / physics; Computer science; Biological sciences I; Biological sciences II; Law; Economics; Engineering I; Engineering II; Engineering III; Engineering IV; General mathematics; General physics and astronomy; Geosciences; Interdisciplinary; Mathematics / probability and statistics; Materials; Mathematical physics; Mathematics (all); Mathematics (miscellaneous); Mathematics, applied; Mathematics, interdisciplinary applications; Physics; Physics and astronomy (all); Physics and astronomy (miscellaneous); Physics, mathematical; Physics, multidisciplinary; Chemistry; Statistical and nonlinear physics
Licence for use: https://creativecommons.org/licenses/by/3.0/es/
Author's mail: marta.sales@urv.cat; lluc.fonti@estudiants.urv.cat; roger.guimera@urv.cat
Record's date: 2024-10-19
Paper version: info:eu-repo/semantics/publishedVersion
Link to the original source: https://www.sciencedirect.com/science/article/pii/S0960077924001152?via%3Dihub
Paper original source: Chaos, Solitons & Fractals, 180, 114564
APA: Piga, A., Font-Pomarol, L., Sales-Pardo, M., & Guimera, R. (2024). Bayesian estimation of information-theoretic metrics for sparsely sampled distributions. Chaos, Solitons & Fractals, 180, 114564. https://doi.org/10.1016/j.chaos.2024.114564
Licence document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
Article's DOI: 10.1016/j.chaos.2024.114564
Entity: Universitat Rovira i Virgili
Journal publication year: 2024
Publication Type: Journal Publications