Alvarado-Perez, Juan Carlos; Garcia, Miguel Angel; Puig, Domenec (2025). Online dimensionality reduction through stacked generalization of spectral methods with deep networks. Machine Learning, 114(5), 125-. DOI: 10.1007/s10994-024-06715-8
Paper original source:
Machine Learning. 114 (5): 125-
Abstract:
Analyzing large volumes of high-dimensional data poses significant challenges. Dimensionality reduction aims to reveal the most prominent properties of data by embedding them into a low-dimensional representation. Spectral dimensionality reduction methods using kernel matrices have been proven to yield optimal results. Online versions of those methods are desirable to incrementally project new data without recomputing the whole embedding from the complete dataset. In addition, integrating different spectral methods may have a synergistic effect. This paper presents an online dimensionality reduction method based on deep neural networks that integrates embeddings optimized by statistical approximation of neighborhoods and induced by different spectral methods through stacking ensemble learning. In particular, the proposed method first applies a self-supervised stage to train a set of deep encoders based on the embeddings induced by different spectral methods applied to a given input dataset. Those basis encoders are optimized and then integrated through a metamodel constituted by a fully connected network. A supervised and an unsupervised approach have been designed, depending on whether the final aim is to enforce topological preservation or cluster induction. The proposed method has been experimentally validated on well-known image datasets and compared to some of the most relevant dimensionality reduction techniques using widely used quality measures.
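The pipeline outlined in the abstract (spectral embeddings as targets, one encoder per embedding, a fully connected metamodel on top) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two spectral methods (RBF kernel PCA and Laplacian eigenmaps), the small one-hidden-layer `MLP` standing in for a deep encoder, the synthetic data, and in particular the metamodel target, which is a placeholder average of the base embeddings rather than the paper's supervised or unsupervised objective.

```python
import numpy as np

def sq_dists(X):
    # Pairwise squared Euclidean distances.
    s = np.sum(X**2, axis=1)
    return np.maximum(s[:, None] + s[None, :] - 2.0 * X @ X.T, 0.0)

def kernel_pca(X, k=2):
    # Spectral method 1 (assumed choice): PCA on a centered RBF kernel matrix.
    n, d = X.shape
    K = np.exp(-sq_dists(X) / d)
    H = np.eye(n) - 1.0 / n                      # centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)
    top = np.argsort(vals)[::-1][:k]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def laplacian_eigenmaps(X, k=2):
    # Spectral method 2 (assumed choice): normalized graph Laplacian.
    n, d = X.shape
    W = np.exp(-sq_dists(X) / d)
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    L = np.eye(n) - W / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    return vecs[:, 1:k + 1] / np.sqrt(deg)[:, None]  # skip the trivial eigenvector

def standardize(E):
    return (E - E.mean(axis=0)) / (E.std(axis=0) + 1e-12)

class MLP:
    # Hypothetical stand-in for a deep encoder: one tanh hidden layer,
    # trained by full-batch gradient descent on mean squared error.
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def predict(self, X):
        return np.tanh(X @ self.W1 + self.b1) @ self.W2 + self.b2

    def fit(self, X, Y, lr=0.01, epochs=300):
        n = len(X)
        for _ in range(epochs):
            H = np.tanh(X @ self.W1 + self.b1)
            err = H @ self.W2 + self.b2 - Y
            dH = (err @ self.W2.T) * (1.0 - H**2)  # backprop through tanh
            self.W2 -= lr * (H.T @ err) / n
            self.b2 -= lr * err.mean(axis=0)
            self.W1 -= lr * (X.T @ dH) / n
            self.b1 -= lr * dH.mean(axis=0)
        return self

rng = np.random.default_rng(0)
# Synthetic two-cluster data (illustrative only).
X_train = np.vstack([rng.normal(0, 1, (60, 10)), rng.normal(3, 1, (60, 10))])
X_new = rng.normal(3, 1, (5, 10))

# Stage 1: each spectral embedding becomes the regression target of one encoder.
targets = [standardize(f(X_train)) for f in (kernel_pca, laplacian_eigenmaps)]
encoders = [MLP(10, 32, 2, rng).fit(X_train, T) for T in targets]

def stack(X):
    # Concatenate the outputs of all base encoders.
    return np.hstack([enc.predict(X) for enc in encoders])

# Stage 2: a fully connected metamodel over the stacked encoder outputs.
# Placeholder target (an assumption): the average of the base embeddings.
meta_target = np.mean(targets, axis=0)
meta = MLP(4, 16, 2, rng).fit(stack(X_train), meta_target)

# Online projection: new points need only forward passes, no eigendecomposition.
Z_new = meta.predict(stack(X_new))
```

The point of the sketch is the last line: once the encoders and the metamodel are trained, projecting unseen samples costs a few matrix products, whereas the spectral methods themselves would require recomputing an eigendecomposition over the enlarged dataset.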
Title:
Online dimensionality reduction through stacked generalization of spectral methods with deep networks