Regularized Statistical Techniques for High Dimensional Medical Imaging Data Processing

Liang, Jingsai
Middle Tennessee State University
This dissertation consists of two topics. The first topic is IMSmining: A Tool for Imaging Mass Spectrometry Data Biomarker Selection and Classification. We developed IMSmining, a free software tool that combines intuitive visualization of imaging mass spectrometry (IMS) data with advanced analysis algorithms in a single, easy-to-operate package. Its main functions are data visualization, biomarker selection, and classification using advanced multivariate analysis methods such as elastic net, sparse PCA, and wavelets. By incorporating the spatial information of the entire image cube, it can be used to study the correlation and distribution of IMS data, and it helps to distinguish the possible features caused by biological structure and to identify potential biomarkers.
The second topic is Non-Gaussian Penalized PARAFAC Analysis for Functional Magnetic Resonance Imaging (fMRI) Data. Independent Component Analysis (ICA) has been used widely and successfully in fMRI data analysis for both single subjects and groups of subjects. As an extension of ICA, Tensorial Probabilistic ICA (TPICA) decomposes group fMRI data into three modes: subject, temporal, and spatial. However, because of the independence constraint on the spatial components, TPICA is not very efficient when the active regions of different spatial components overlap. Parallel Factor Analysis (PARAFAC) is another method for processing three-mode data and can be solved by alternating least squares. PARAFAC may converge to a degenerate solution if the matrix of one mode is collinear, yet significant collinear relationships in the subject mode are to be expected when two subjects in group fMRI data are similar. Thus both TPICA and PARAFAC have unavoidable drawbacks. In this topic, we try to alleviate both the overlap and the collinearity issues by combining the characteristics of PARAFAC and TPICA: a non-Gaussian penalty term is imposed on each spatial component under the PARAFAC framework. The proposed algorithm regularizes the spatial components, since high non-Gaussianity can help to avoid the degenerate solutions caused by collinearity, and it drops the independence constraint on the spatial components to bypass the overlap issue. The proposed algorithm outperforms TPICA and PARAFAC on simulated data. Its performance on real fMRI data is comparable with that of other algorithms.
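As an illustration of the alternating least-squares step mentioned above, the following is a minimal sketch of plain, unpenalized PARAFAC for a three-mode array, written with NumPy. It is not the dissertation's algorithm: the non-Gaussian penalty is omitted, and the function names (khatri_rao, unfold, reconstruct, parafac_als), the random initialization, the fixed iteration count, and the array sizes in the demo are all assumptions made for the example.

```python
import numpy as np


def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R) -> (J*K x R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)


def unfold(X, mode):
    """Matricize a 3-way array along `mode`, flattening the other axes in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def reconstruct(factors):
    """Rebuild the 3-way array from its three factor matrices."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def parafac_als(X, rank, n_iter=200, seed=0):
    """Plain PARAFAC by alternating least squares (hypothetical helper).

    Each sweep fixes two factor matrices and solves a linear least-squares
    problem for the third. No penalty term and no convergence check are
    included; both are simplifications made for this sketch.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            # unfold(X, mode) is approximately factors[mode] @ kr.T,
            # so the least-squares update uses the pseudoinverse of kr.
            factors[mode] = unfold(X, mode) @ np.linalg.pinv(kr).T
    return factors


if __name__ == "__main__":
    # Synthetic rank-2 data with assumed sizes: 4 subjects, 15 time points, 20 voxels.
    rng = np.random.default_rng(1)
    true_factors = [rng.standard_normal((n, 2)) for n in (4, 15, 20)]
    X = reconstruct(true_factors)
    estimated = parafac_als(X, rank=2)
    rel_err = np.linalg.norm(X - reconstruct(estimated)) / np.linalg.norm(X)
    print("relative reconstruction error:", rel_err)
```

When two columns of one factor matrix are nearly collinear, the least-squares problems for the other modes become ill-conditioned, which is the setting in which the degenerate solutions described above can appear.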