APPLICATIONS OF MODERN NLP TECHNIQUES FOR PREDICTIVE MODELING IN ACTUARIAL SCIENCE

dc.contributor.advisor Hong, Don
dc.contributor.advisor Barbosa, Salvador E.
dc.contributor.author Xu, Shuzhe
dc.contributor.committeemember Manathunga, Vajira
dc.contributor.committeemember Sinkala, Zachariah
dc.contributor.committeemember Wu, Qiang
dc.date.accessioned 2021-11-17T17:03:03Z
dc.date.available 2021-11-17T17:03:03Z
dc.date.issued 2021
dc.date.updated 2021-11-17T17:03:03Z
dc.description.abstract This dissertation studies applications of Natural Language Processing (NLP) in actuarial science. NLP techniques are powerful text-analytic tools that can help actuaries automatically exploit the information in textual data. Many NLP techniques have recently been applied in other research fields, but only a few applications can be found in actuarial science. This dissertation investigates NLP techniques in an actuarial context and proposes NLP solutions for actuarial applications. It consists of five chapters. The first chapter introduces NLP and some opportunities for its use in actuarial science, discusses how traditional actuarial applications might incorporate NLP, and presents a few NLP applications proposed by actuaries as references. The second chapter reviews the literature on relevant NLP techniques, covering basic technologies such as word embeddings and tokenization as well as advanced tools such as Bidirectional Encoder Representations from Transformers (BERT) and related techniques. The third chapter presents an NLP application based on extended truck warranty data. It develops a BERT-based aggregate loss model, with severity rescaled to a 10-value scale, to predict future losses based on the frequency distribution of claim counts with contracts and the severity distribution of claim records. The NLP tool extracts information from the textual descriptions in the data, and the extracted values are used to predict loss severity. The fourth chapter presents another NLP application, this time for basic truck warranty data. A data-based portfolio allocation model is proposed to predict losses using modern portfolio theory (MPT), introduced by Nobel laureate Harry Markowitz in 1952. In this chapter, BERT is applied to improve the accuracy of multi-class classification in the BERT-enhanced data-based portfolio allocation model.
In addition, a technique similar to the one used in Chapter 3 is applied to derive a BERT-based severity model for multi-class aggregate loss prediction through a different approach with the BERT-enhanced data-based portfolio allocation model. The last chapter summarizes the described applications. Applications of modern NLP techniques to predictive analytics are practical and promising, yet applications to actuarial science are almost nonexistent. This dissertation demonstrates how NLP can improve predictive modeling in actuarial science: NLP techniques can help gather information from textual descriptions that traditional models discard. Possible improvements for future research are also described in this chapter.
dc.description.degree Ph.D.
dc.identifier.uri https://jewlscholar.mtsu.edu/handle/mtsu/6557
dc.language.rfc3066 en
dc.publisher Middle Tennessee State University
dc.source.uri http://dissertations.umi.com/mtsu:11506
dc.subject Actuarial science
dc.subject Machine Learning
dc.subject Natural language processing
dc.subject Neural network
dc.subject Predictive modeling
dc.subject Text mining
dc.subject Computer science
dc.subject Statistics
dc.subject Language
dc.thesis.degreelevel doctoral
dc.title APPLICATIONS OF MODERN NLP TECHNIQUES FOR PREDICTIVE MODELING IN ACTUARIAL SCIENCE
Files
Original bundle
Name: Xu_mtsu_0170E_11506.pdf
Size: 992.21 KB
Format: Adobe Portable Document Format