Statistical Computing Schemes for Proteomics Data Processing and Insurance Solvency Modeling

dc.contributor.advisorHong, Donen_US
dc.contributor.authorXiong, Luen_US
dc.contributor.committeememberHong, Donen_US
dc.contributor.committeememberRobertson, Williamen_US
dc.contributor.committeememberZha, Xiaoyaen_US
dc.contributor.committeememberWu, Qiangen_US
dc.contributor.committeememberWallin, Johnen_US
dc.contributor.departmentBasic & Applied Sciencesen_US
dc.date.accessioned2014-12-19T19:02:45Z
dc.date.available2014-12-19T19:02:45Z
dc.date.issued2014-11-14en_US
dc.description.abstractThe accumulation of big data, such as medical and insurance data, calls for more advanced computational methods of statistical data analysis. In this interdisciplinary computational science research, we study the mathematical methods of multi-resolution analysis (MRA), the statistical techniques of Bayes classifiers and Markov Random Fields (MRF), and the computing tools of pyramid imaging matching and Markov Chain Monte Carlo (MCMC), and we develop new statistical computing schemes for two applications: Imaging Mass Spectrometry (IMS) proteomic data analysis and insurance solvency modeling.en_US
dc.description.abstractIMS is an important tool for discovering biomarkers and detecting cancer early. However, the high dimensionality of IMS data makes processing difficult, and the development of computational methods for IMS data analysis lags behind the technology itself. To overcome this difficulty, we propose an MRA method that reduces the dimensionality of IMS data. By transforming IMS data into the wavelet coefficient space and analyzing them from the low-resolution scale to the high-resolution scale, an idea inspired by the pyramid imaging matching technique, we reduce the computational complexity while still selecting important biomarkers. For better IMS classification results, we select feature variables from the wavelet coefficients and use a Bayes classifier to classify IMS pixels based on their feature variables. To incorporate the spatial information in IMS data, we consider the Markovianity of cancer growth, namely that the state (cancer or non-cancer) of a sample point (pixel) is largely determined by the configuration of its neighborhood system, and model it with an MRF. The algorithm is implemented using MCMC sampling, and its result is probabilistic, which provides more information than a deterministic result. We also test different neighborhood definitions.en_US
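The MRF step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a binary Ising-type prior on a 4-neighbor pixel grid, a Gibbs sampler as the MCMC scheme, and per-pixel log-likelihood ratios supplied by some pixelwise classifier. The function name `gibbs_mrf` and the parameters `beta`, `sweeps`, and `burn_in` are hypothetical.

```python
import math
import random


def gibbs_mrf(log_odds, beta=1.0, sweeps=200, burn_in=50, seed=0):
    """Estimate per-pixel posterior P(cancer) under an Ising-type MRF prior.

    log_odds[i][j] is assumed to be log P(data | cancer) - log P(data | non-cancer)
    for pixel (i, j), e.g. from a pixelwise Bayes classifier (an assumption here).
    beta controls how strongly neighboring pixels prefer the same label.
    """
    rng = random.Random(seed)
    h, w = len(log_odds), len(log_odds[0])
    # Start from the pixelwise decision that ignores spatial information.
    labels = [[1 if log_odds[i][j] > 0 else 0 for j in range(w)] for i in range(h)]
    counts = [[0] * w for _ in range(h)]
    kept = 0
    for sweep in range(sweeps):
        for i in range(h):
            for j in range(w):
                # Count 4-neighbors currently labeled cancer (1) and non-cancer (0).
                n1 = n0 = 0
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    a, b = i + di, j + dj
                    if 0 <= a < h and 0 <= b < w:
                        if labels[a][b] == 1:
                            n1 += 1
                        else:
                            n0 += 1
                # Conditional log-odds of label 1: data term plus neighbor term.
                z = log_odds[i][j] + beta * (n1 - n0)
                z = max(-50.0, min(50.0, z))  # clamp to avoid overflow in exp
                p1 = 1.0 / (1.0 + math.exp(-z))
                labels[i][j] = 1 if rng.random() < p1 else 0
        if sweep >= burn_in:
            # After burn-in, accumulate label frequencies as posterior estimates.
            kept += 1
            for i in range(h):
                for j in range(w):
                    counts[i][j] += labels[i][j]
    return [[counts[i][j] / kept for j in range(w)] for i in range(h)]
```

The output is a grid of probabilities rather than hard labels, which is the sense in which such a result is probabilistic. Changing the tuple of neighbor offsets (for example, adding the four diagonals) is one way to try a different neighborhood definition.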
dc.description.abstractAs another application of statistical computing techniques, we study insurance solvency modeling. Insurance solvency is one of the most important measures of an insurance company's financial health. It is directly related to the financial security of the company and to the benefits of its policyholders. Current solvency prediction methods are deterministic rather than probabilistic, and a deterministic method cannot provide information such as percentiles and probabilities, as a probabilistic method can. In this application, we design an innovative model that predicts captive insurance solvency using a probabilistic method with Monte Carlo simulation. Based on a pre-built financial report for a captive insurance company, we simulate future losses according to a loss distribution to predict solvency scores in the coming years. We score solvency from 0 to 1. The solvency score measures the probability that any of the future Insurance Regulatory Information System (IRIS) ratios falls outside its upper and lower bounds. Users can define these bounds according to their business situations.en_US
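The Monte Carlo scoring idea above can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's model: annual losses are drawn from a lognormal distribution, a single premium-to-surplus ratio stands in for the full set of IRIS ratios, and the names `solvency_score`, `premium`, `surplus`, `mu`, and `sigma` are hypothetical.

```python
import random


def solvency_score(premium, surplus, years, lower, upper,
                   mu, sigma, n_paths=10_000, seed=0):
    """Estimate the probability that the ratio leaves [lower, upper] in any year.

    Each simulated path draws one lognormal loss per year (an assumed loss
    distribution), updates surplus, and checks a premium-to-surplus ratio
    (a stand-in for the IRIS ratios) against user-defined bounds.
    """
    rng = random.Random(seed)
    breaches = 0
    for _ in range(n_paths):
        s = surplus
        for _ in range(years):
            loss = rng.lognormvariate(mu, sigma)
            s += premium - loss
            # Treat non-positive surplus as an unbounded ratio.
            ratio = premium / s if s > 0 else float("inf")
            if not (lower <= ratio <= upper):
                breaches += 1
                break  # one breach in any year is enough for this path
    return breaches / n_paths


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from any financial report.
    score = solvency_score(premium=100.0, surplus=200.0, years=5,
                           lower=0.0, upper=3.0, mu=4.4, sigma=0.5)
    print(f"estimated probability of a bound breach: {score:.3f}")
```

Because the simulated paths are retained one at a time, the same loop could also record percentiles of the ratio in each year, which is the kind of information a deterministic projection does not give.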
dc.description.abstractThe data experiments show that the MRA methods for proteomic data analysis select important biomarkers and achieve higher classification accuracy with lower computational complexity. The data experiment for the MCMC-MRF method shows that it improves classification accuracy significantly. In addition, the captive insurance solvency model designed in this research can be a useful tool for captive managers and gives more probabilistic information than the traditional deterministic IRIS models.en_US
dc.description.degreePh.D.en_US
dc.identifier.urihttp://jewlscholar.mtsu.edu/handle/mtsu/4331
dc.publisherMiddle Tennessee State Universityen_US
dc.subjectBayes classifieren_US
dc.subjectBig data analysisen_US
dc.subjectCaptive Insurance Solvency Moden_US
dc.subjectMonte Carlo Markov chainen_US
dc.subjectProteomics Data Processingen_US
dc.subjectWavelet methodsen_US
dc.subject.umiStatisticsen_US
dc.thesis.degreegrantorMiddle Tennessee State Universityen_US
dc.thesis.degreelevelDoctoralen_US
dc.titleStatistical Computing Schemes for Proteomics Data Processing and Insurance Solvency Modelingen_US
dc.typeDissertationen_US

Files

Original bundle

Name:
Xiong_mtsu_0170E_10344.pdf
Size:
4.02 MB
Format:
Adobe Portable Document Format