New Algorithms for Supervised Dimension Reduction

dc.contributor.advisor Wu, Qiang
dc.contributor.author Zhang, Ning
dc.contributor.committeemember Hong, Don
dc.contributor.committeemember Li, Cen
dc.contributor.committeemember Robertson, William
dc.contributor.department Basic & Applied Sciences en_US
dc.date.accessioned 2019-06-13T18:00:31Z
dc.date.available 2019-06-13T18:00:31Z
dc.date.issued 2019
dc.date.updated 2019-06-13T18:00:33Z
dc.description.abstract Advances in data collection and storage capabilities during the past decades have led to information overload in most sciences and have ushered in a big data era. Data of large volume, as well as high dimensionality, have become ubiquitous in many scientific domains. They present many mathematical challenges, as well as some opportunities, and are bound to give rise to new theoretical developments. Dimension reduction aims to find low-dimensional representations of high-dimensional data. It promotes understanding of the data structure through visualization and enhances the predictive performance of machine learning algorithms by mitigating the “curse of dimensionality.” As high-dimensional data become ubiquitous in modern sciences, dimension reduction methods play increasingly important roles in data analysis. The contribution of this dissertation is a set of new algorithms for supervised dimension reduction that handle high-dimensional data more efficiently. The first new algorithm is overlapping sliced inverse regression (OSIR). Sliced inverse regression (SIR) is a pioneering tool for supervised dimension reduction. It identifies the subspace of significant factors with intrinsically lower dimensionality, known as the effective dimension reduction (EDR) space. OSIR refines SIR through an overlapping slicing scheme and can estimate the EDR space and determine the number of effective factors more accurately. We show that the overlapping procedure has the potential to identify the information contained in the derivatives of the inverse regression curve, which helps explain the superiority of OSIR. We prove that the OSIR algorithm is √n-consistent. We also propose the use of bagging and bootstrapping techniques to further improve the accuracy of OSIR. Online learning has attracted great attention due to the increasing demand for systems that can learn and evolve. When the data to be processed are also high dimensional, and dimension reduction is necessary for visualization or prediction enhancement, online dimension reduction plays an essential role. We propose four new online learning approaches for supervised dimension reduction: incremental sliced inverse regression, covariance-free incremental sliced inverse regression, incremental overlapping sliced inverse regression, and covariance-free incremental overlapping sliced inverse regression. All four methods can update the EDR space quickly and efficiently as new observations arrive. The effectiveness and efficiency of all four algorithms are verified through simulations and real-data applications. (An illustrative sketch of the classical SIR estimator appears after this record.)
dc.description.degree Ph.D.
dc.identifier.uri http://jewlscholar.mtsu.edu/xmlui/handle/mtsu/5887
dc.language.rfc3066 en
dc.publisher Middle Tennessee State University
dc.subject Statistics
dc.subject Computer science
dc.thesis.degreegrantor Middle Tennessee State University
dc.title New Algorithms for Supervised Dimension Reduction
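The abstract above describes SIR as identifying the EDR space from the inverse regression curve. The following is a minimal, illustrative Python sketch of the classical (non-overlapping) SIR estimator only, not the dissertation's OSIR or incremental variants; the function name sir_directions and its parameters n_slices and n_components are hypothetical choices for this example.

import numpy as np

def sir_directions(X, y, n_slices=10, n_components=2):
    """Minimal sketch of classical sliced inverse regression (SIR).

    Returns estimated EDR directions as columns of a (p, n_components) array.
    """
    n, p = X.shape
    # Center and whiten the predictors: Z = (X - mean) @ Sigma^{-1/2}
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(cov)
    evals = np.clip(evals, 1e-12, None)  # guard against near-singular covariance
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt

    # Partition observations into (roughly) equal-size slices by sorted response
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the slice means of Z
    M = np.zeros((p, p))
    for idx in slices:
        m_h = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m_h, m_h)

    # Leading eigenvectors of M span the EDR space in the whitened scale;
    # map them back to the original predictor scale
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:n_components]]
    return inv_sqrt @ top

# Example usage on synthetic data where y depends on X through one direction
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + 0.1 * rng.normal(size=500)
B = sir_directions(X, y, n_slices=10, n_components=1)

OSIR, as described in the abstract, refines this step with an overlapping slicing scheme, and the incremental variants update the estimate as observations arrive; those refinements are omitted from this sketch.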
Files
Original bundle (1 of 1)
Name: Zhang_mtsu_0170E_11130.pdf
Size: 770.38 KB
Format: Adobe Portable Document Format