Statistical Computing Schemes for Proteomics Data Processing and Insurance Solvency Modeling

dc.contributor.advisor Hong, Don en_US
Xiong, Lu en_US
dc.contributor.committeemember Hong, Don en_US
dc.contributor.committeemember Robertson, William en_US
dc.contributor.committeemember Zha, Xiaoya en_US
dc.contributor.committeemember Wu, Qiang en_US
dc.contributor.committeemember Wallin, John en_US
dc.contributor.department Basic & Applied Sciences en_US
2014-12-19T19:02:45Z 2014-12-19T19:02:45Z 2014-11-14 en_US
dc.description.abstract The accumulation of big data, such as medical and insurance data, calls for more advanced computational methods of statistical data analysis. In this interdisciplinary computational science research, we study the mathematical methods of multi-resolution analysis (MRA), the statistical techniques of Bayes classifiers and Markov Random Fields (MRF), and the computing tools of pyramid imaging matching and Markov Chain Monte Carlo (MCMC), and we develop new statistical computing schemes for two applications: Imaging Mass Spectrometry (IMS) proteomic data analysis and insurance solvency modeling. en_US
dc.description.abstract IMS is an important tool for discovering biomarkers and detecting cancer early. However, the high dimensionality of IMS data makes processing difficult, and the development of computational methods for IMS data analysis lags behind the technology itself. To overcome this difficulty, we propose an MRA method that reduces the dimensionality of IMS data. By transforming IMS data into the wavelet coefficient space and analyzing them from low-resolution to high-resolution scales, an idea inspired by the pyramid imaging matching technique, we reduce the computational complexity while still selecting important biomarkers. To improve IMS classification results, we select feature variables from the wavelet coefficients and use a Bayes classifier to classify each IMS pixel based on its feature variables. To incorporate the spatial information in IMS data, we consider the Markovianity of cancer growth, namely that the state (cancer or non-cancer) of a sample point (pixel) is largely determined by the configuration of its neighborhood system, and we model this dependence with an MRF. This algorithm is implemented using MCMC sampling, and its result is probabilistic, which provides more information than a deterministic result. We also tested different neighborhood definitions. en_US
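The MCMC-MRF step described in the abstract above can be illustrated with a small sketch. This is not the dissertation's implementation: it assumes a two-state Ising-style prior in which each agreeing neighbor adds a weight `beta`, takes per-pixel evidence as a log-odds value (for example, from a Bayes classifier), and estimates the per-pixel cancer probability by Gibbs sampling. The names `gibbs_mrf`, `beta`, `FOUR`, and `EIGHT` are hypothetical and chosen only for illustration.

```python
import math
import random

# 4- and 8-neighbor offsets; the neighborhood definition is a modeling choice.
FOUR = ((-1, 0), (1, 0), (0, -1), (0, 1))
EIGHT = FOUR + ((-1, -1), (-1, 1), (1, -1), (1, 1))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gibbs_mrf(log_odds, beta=1.0, offsets=FOUR, sweeps=200, burn_in=50, seed=0):
    """Estimate P(cancer) for each pixel by Gibbs sampling.

    log_odds[r][c] is the pixel's own evidence,
    log p(x | cancer) - log p(x | non-cancer).
    beta (an assumed parameter) weights agreement with neighboring labels.
    """
    rng = random.Random(seed)
    rows, cols = len(log_odds), len(log_odds[0])
    # Start from the labels that the per-pixel evidence alone would choose.
    labels = [[1 if v > 0 else 0 for v in row] for row in log_odds]
    counts = [[0] * cols for _ in range(rows)]
    for sweep in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                n1 = n0 = 0
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        if labels[rr][cc]:
                            n1 += 1
                        else:
                            n0 += 1
                # Conditional log-odds: own evidence plus neighbor agreement.
                p1 = sigmoid(log_odds[r][c] + beta * (n1 - n0))
                labels[r][c] = 1 if rng.random() < p1 else 0
        if sweep >= burn_in:
            # After burn-in, count how often each pixel is labeled cancer.
            for r in range(rows):
                for c in range(cols):
                    counts[r][c] += labels[r][c]
    kept = sweeps - burn_in
    return [[counts[r][c] / kept for c in range(cols)] for r in range(rows)]


# Toy 5x5 image: every pixel favors "cancer" except a weakly dissenting center.
evidence = [[2.0] * 5 for _ in range(5)]
evidence[2][2] = -1.0
probs = gibbs_mrf(evidence, beta=1.0)
print(round(probs[2][2], 2))
```

Setting `beta=0.0` removes the spatial prior and reduces the sampler to independent per-pixel classification, while passing `offsets=EIGHT` switches to an 8-neighbor system. The output is a probability map rather than a hard label, which is the sense in which the result is probabilistic.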
dc.description.abstract As another application of statistical computing techniques, we study insurance solvency modeling. Solvency is one of the most important measures of an insurance company's financial health. It is directly related to the financial security of the company and the benefits of its policyholders. Current solvency prediction methods are largely deterministic rather than probabilistic, and a deterministic method cannot provide information such as percentiles and probabilities the way a probabilistic method can. In this application, we design an innovative model to predict captive insurance solvency using a probabilistic method with Monte Carlo simulation. Based on a pre-built financial report for a captive insurer, we simulate future losses according to a loss distribution to predict solvency scores in coming years. We score solvency from 0 to 1. The solvency score measures the probability that any of the future Insurance Regulatory Information System (IRIS) ratios falls outside its upper and lower bounds. Users can define these bounds according to their business situations. en_US
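A minimal sketch of the Monte Carlo scoring idea in the abstract above, under stated assumptions: losses are drawn from a lognormal distribution, surplus evolves as surplus plus premium minus expenses minus loss, and two simplified ratios stand in for the IRIS ratios. The function name `solvency_score`, the ratio names, and all parameter values are hypothetical. Neither the ratio definitions nor the bounds are the official IRIS ones, and the dissertation's actual model may differ.

```python
import random


def solvency_score(surplus, premium, expenses, mu, sigma, bounds,
                   years=5, n_paths=10_000, seed=0):
    """Return the estimated probability that any ratio leaves its bounds.

    bounds maps a ratio name to a user-defined (lower, upper) pair.
    Losses are assumed lognormal with parameters mu and sigma.
    """
    rng = random.Random(seed)
    breaches = 0
    for _path in range(n_paths):
        s = surplus
        for _year in range(years):
            loss = rng.lognormvariate(mu, sigma)
            s += premium - expenses - loss
            if s <= 0:
                # Surplus exhausted: count the path as a breach.
                breached = True
            else:
                # Illustrative stand-ins for IRIS ratios (assumed definitions).
                ratios = {
                    "premium_to_surplus": premium / s,
                    "loss_ratio": loss / premium,
                }
                breached = any(
                    not (bounds[name][0] <= value <= bounds[name][1])
                    for name, value in ratios.items()
                )
            if breached:
                breaches += 1
                break  # one breach in any year is enough for this path
    return breaches / n_paths


# Hypothetical captive: surplus 100, annual premium 10, annual expenses 2.
user_bounds = {"premium_to_surplus": (0.0, 3.0), "loss_ratio": (0.0, 1.0)}
score = solvency_score(100.0, 10.0, 2.0, mu=1.5, sigma=0.8, bounds=user_bounds)
print(score)
```

Because the score is estimated from simulated paths, the same loop could also record the simulated ratios themselves to report percentiles, which is the kind of information a single deterministic projection cannot give.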
dc.description.abstract The data experiments show that the MRA methods in proteomic data analysis can select important biomarkers and also achieve higher classification accuracy with less computational complexity. The data experiment for the MCMC-MRF method shows that it can improve classification accuracy significantly. In addition, the captive insurance solvency model designed in this research can be a useful tool for captive managers and gives more probabilistic information than the traditional deterministic IRIS models. en_US
Ph.D. en_US
dc.publisher Middle Tennessee State University en_US
dc.subject Bayes classifier en_US
dc.subject Big data analysis en_US
dc.subject Captive Insurance Solvency Mod en_US
dc.subject Monte Carlo Markov chain en_US
dc.subject Proteomics Data Processing en_US
dc.subject Wavelet methods en_US
dc.subject.umi Statistics en_US
dc.thesis.degreegrantor Middle Tennessee State University en_US
dc.thesis.degreelevel Doctoral en_US
dc.title Statistical Computing Schemes for Proteomics Data Processing and Insurance Solvency Modeling en_US
dc.type Dissertation en_US
Original bundle
4.02 MB
Adobe Portable Document Format