APPLICATIONS OF MODERN NLP TECHNIQUES FOR PREDICTIVE MODELING IN ACTUARIAL SCIENCE
Date
2021
Authors
Xu, Shuzhe
Publisher
Middle Tennessee State University
Abstract
This dissertation focuses on Natural Language Processing (NLP)
applications in actuarial science. NLP techniques are powerful text analytics tools
that can help actuaries automatically exploit the information in textual data. Recently,
many NLP techniques have been applied in different research fields, but
only a few NLP applications can be found in actuarial science. This dissertation
studies NLP techniques in an actuarial context and proposes NLP solutions
for actuarial applications.
This dissertation consists of five chapters. The first chapter introduces
NLP and some opportunities for its use in actuarial science. It also discusses
how traditional actuarial applications could incorporate NLP and introduces,
as references, a few NLP applications proposed by actuaries.
The second chapter reviews the literature on relevant NLP techniques. It
introduces basic techniques such as word embeddings and tokenization, and
then discusses advanced NLP tools such as Bidirectional Encoder Representations
from Transformers (BERT) and related techniques.
The third chapter presents an NLP application based on extended truck warranty
data. This chapter develops a BERT-based aggregate loss model, with severity
rescaled to a 10-value scale, to predict future losses based on the frequency distribution
of claim counts with contracts and the severity distribution of claim records. The NLP
tool helps to extract information from the textual descriptions in the data, and the
extracted values are exploited to predict loss severity.
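To make the frequency-severity idea concrete, the following is a minimal sketch of a generic compound (aggregate) loss calculation, not the dissertation's actual model. It assumes a Poisson claim count per contract and a fixed probability vector over the ten severity levels; in the setting described above, those probabilities would come from a text classifier such as BERT. The function names, the claim rate of 2, and the uniform probabilities are all hypothetical.

```python
import math
import random


def expected_aggregate_loss(claim_rate, severity_levels, severity_probs):
    """Return E[S] = E[N] * E[X] for independent claim count N and severity X."""
    mean_severity = sum(x * p for x, p in zip(severity_levels, severity_probs))
    return claim_rate * mean_severity


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) claim count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_mean_loss(n_contracts, claim_rate, severity_levels, severity_probs, seed=0):
    """Average simulated aggregate loss per contract over n_contracts contracts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_contracts):
        n_claims = sample_poisson(claim_rate, rng)
        # Each claim draws a severity level from the assumed class probabilities.
        total += sum(rng.choices(severity_levels, weights=severity_probs, k=n_claims))
    return total / n_contracts


# Hypothetical inputs: severity levels 1..10 with uniform probabilities
# (a stand-in for classifier output) and an assumed rate of 2 claims per contract.
LEVELS = list(range(1, 11))
PROBS = [0.1] * 10
analytic = expected_aggregate_loss(2.0, LEVELS, PROBS)
simulated = simulate_mean_loss(20_000, 2.0, LEVELS, PROBS)
```

The analytic value and the simulated average should agree closely, which is a quick sanity check before replacing the assumed inputs with fitted frequency and severity distributions.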
The fourth chapter presents another NLP application, this time for basic truck warranty data.
A data-based portfolio allocation model is proposed to predict losses using
modern portfolio theory (MPT), developed by Nobel Laureate Harry Markowitz
in 1952. In this chapter, BERT is applied to improve the accuracy of multi-class
classification in the BERT-enhanced data-based portfolio allocation model. Also,
a technique similar to the one used in Chapter 3 is applied to derive a BERT-based
severity model for multi-class aggregate loss prediction through a different
approach with the BERT-enhanced data-based portfolio allocation model.
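As background on the MPT building block mentioned above, the sketch below computes the classic two-asset minimum-variance weights. It illustrates Markowitz's mean-variance idea only and is not the data-based portfolio allocation model of this dissertation. The function names and the sample variances are hypothetical.

```python
def min_variance_weights(var_a, var_b, cov_ab):
    """Weights (w_a, w_b), summing to 1, that minimize two-asset portfolio variance."""
    denom = var_a + var_b - 2.0 * cov_ab
    if denom <= 0:
        raise ValueError("variance of the difference must be positive")
    w_a = (var_b - cov_ab) / denom
    return w_a, 1.0 - w_a


def portfolio_variance(w_a, w_b, var_a, var_b, cov_ab):
    """Variance of w_a * A + w_b * B."""
    return w_a**2 * var_a + w_b**2 * var_b + 2.0 * w_a * w_b * cov_ab


# Hypothetical inputs: two uncorrelated components with variances 0.04 and 0.09.
w_a, w_b = min_variance_weights(0.04, 0.09, 0.0)
combined = portfolio_variance(w_a, w_b, 0.04, 0.09, 0.0)
```

With uncorrelated components, the combined variance falls below the smaller of the two individual variances, which is the diversification effect MPT formalizes. No sign constraint is imposed here, so strongly correlated inputs can produce a negative weight.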
The last chapter summarizes the described applications. The applications of
modern NLP techniques for predictive analytics are practical and promising. However,
applications to actuarial science are almost nonexistent. This dissertation
demonstrates the possibilities of NLP applications to improve predictive modeling
in actuarial science. NLP techniques can help gather information from
textual descriptions that traditional models discard. Possible improvements
for future research are also described in this chapter.
Keywords
Actuarial science,
Machine Learning,
Natural language processing,
Neural network,
Predictive modeling,
Text mining,
Computer science,
Statistics,