
Statistical Learning on Heterogeneous Medical Data with Bayesian Latent Variable Models: Application to Neuroimaging Dementia Studies

Luigi Antelmi
Abstract : This thesis presents new computational tools for the joint modeling of multi-modal biomedical data that are robust to missing data, with application to neuroimaging studies in dementia. The theoretical basis for this work is the Variational Autoencoder (VAE), a latent variable generative model well suited to complex data: it maps them into a simpler low-dimensional space while still capturing non-linearities in the data. The core of this thesis is the Multi-Channel Variational Autoencoder (MCVAE), an extension of the VAE that jointly models latent relationships across multi-modal observations. This is achieved by: 1) constraining the latent distribution of each data modality to a common target prior, and 2) forcing these latent distributions to generate all the data modalities through their associated generative functions. Moreover, we adapt the MCVAE to a Multi-Task setting, where missing data are handled with a specific optimization scheme consisting of the following steps: 1) defining tasks across datasets by identifying data subsets with compatible modalities, 2) stacking multiple instances of the MCVAE, where each instance models a specific task, and 3) sharing the model parameters of common modalities between modeling tasks. As a result, the Multi-Task MCVAE can learn a joint model for all the data points, leveraging all the available information. Overall, this thesis provides a novel investigation of flexible approaches to account for data heterogeneity in the analysis of biomedical information. This work enables new research directions in which medical information can be consistently modeled within a joint probabilistic framework accounting for multiple data modalities, missing information, and biases across different datasets. Lastly, thanks to their general formulation, the methodologies proposed here can find applications beyond the neuroimaging research field.
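The two MCVAE ingredients named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a standard normal prior, diagonal Gaussian posteriors, unit-variance Gaussian likelihoods, and hypothetical linear encoders and decoders (all function and variable names below are illustrative only).

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def gaussian_log_lik(x, x_hat, noise_var=1.0):
    """log N(x | x_hat, noise_var * I), summed over features."""
    d = x.shape[-1]
    return -0.5 * (np.sum((x - x_hat) ** 2, axis=-1) / noise_var
                   + d * np.log(2.0 * np.pi * noise_var))

def mcvae_loss(channels, encoders, decoders, rng):
    """Negative multi-channel ELBO (one-sample Monte Carlo estimate),
    averaged over samples and over encoding channels.

    channels: list of (n_samples, n_features_c) arrays, one per modality
    encoders: list of (W_mu, W_logvar) pairs (assumed linear encoders)
    decoders: list of W_dec matrices (assumed linear decoders)
    """
    total = 0.0
    for x_c, (W_mu, W_logvar) in zip(channels, encoders):
        mu, logvar = x_c @ W_mu, x_c @ W_logvar
        # 1) pull this channel's latent distribution toward the common prior
        kl = kl_to_standard_normal(mu, logvar)
        # reparameterized sample z ~ q(z | x_c)
        z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
        # 2) decode every channel from this channel's latent code
        log_lik = sum(gaussian_log_lik(x_i, z @ W_dec)
                      for x_i, W_dec in zip(channels, decoders))
        total += np.mean(kl - log_lik)
    return total / len(channels)

# Toy usage with two synthetic modalities (5 and 3 features, 2 latent dims).
rng = np.random.default_rng(0)
n, dims, latent = 8, [5, 3], 2
channels = [rng.standard_normal((n, d)) for d in dims]
encoders = [(0.1 * rng.standard_normal((d, latent)),
             0.1 * rng.standard_normal((d, latent))) for d in dims]
decoders = [0.1 * rng.standard_normal((latent, d)) for d in dims]
loss = mcvae_loss(channels, encoders, decoders, rng)
```

In this sketch the KL term corresponds to constraint 1) and the inner sum over all decoders corresponds to constraint 2); a practical model would replace the linear maps with trainable networks and minimize the loss by gradient descent.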
Contributor : Luigi Antelmi
Submitted on : Sunday, July 18, 2021 - 7:02:20 PM
Last modification on : Wednesday, July 21, 2021 - 3:49:07 AM




  • HAL Id : tel-03289782, version 1


Luigi Antelmi. Statistical Learning on Heterogeneous Medical Data with Bayesian Latent Variable Models: Application to Neuroimaging Dementia Studies. Statistics [math.ST]. INRIA Sophia Antipolis - Méditerranée; Université Côte d'Azur, 2021. English. ⟨tel-03289782⟩


