APPLICATIONS OF MODERN NLP TECHNIQUES FOR PREDICTIVE MODELING IN ACTUARIAL SCIENCE

Date
2021
Authors
Xu, Shuzhe
Publisher
Middle Tennessee State University
Abstract
This dissertation focuses on Natural Language Processing (NLP) applications in actuarial science. NLP techniques are powerful text-analytic tools that can help actuaries automatically exploit the information in textual data. Many NLP techniques have recently been applied in other research fields, but only a few NLP applications can be found in actuarial science. This dissertation investigates NLP techniques in an actuarial setting and proposes NLP solutions for actuarial applications. It consists of five chapters. The first chapter introduces NLP and some opportunities for its use in actuarial science. It also discusses how traditional actuarial applications could incorporate NLP and introduces a few NLP applications proposed by actuaries as references. The second chapter reviews the literature on relevant NLP techniques. It introduces basic techniques such as word embeddings and tokenization, and discusses advanced NLP tools such as Bidirectional Encoder Representations from Transformers (BERT) and related techniques. The third chapter presents an NLP application based on extended truck warranty data. It develops a BERT-based aggregate loss model, with severity rescaled to a 10-value scale, that predicts future losses from the frequency distribution of claim counts across contracts and the severity distribution of claim records. The NLP tool extracts information from the textual descriptions in the data, and the extracted values are used to predict loss severity. The fourth chapter presents another NLP application, this time for basic truck warranty data. A data-based portfolio allocation model is proposed to predict losses using modern portfolio theory (MPT), developed by Nobel laureate Harry Markowitz in 1952. In this chapter, BERT is applied to improve the accuracy of multi-class classification in the BERT-enhanced data-based portfolio allocation model.
Also, a technique similar to the one used in Chapter 3 is applied to derive a BERT-based severity model for multi-class aggregate loss prediction through a different approach with the BERT-enhanced data-based portfolio allocation model. The last chapter summarizes the described applications. Applications of modern NLP techniques to predictive analytics are practical and promising, yet applications to actuarial science are almost nonexistent. This dissertation demonstrates how NLP applications can improve predictive modeling in actuarial science. NLP techniques can help gather information from textual descriptions that traditional models discard. Possible improvements for future research are also described in this chapter.
Keywords
Actuarial science, Machine Learning, Natural language processing, Neural network, Predictive modeling, Text mining, Computer science, Statistics, Language