Text Summarization and Sentiment Analysis of Drug Reviews: A Transfer Learning Approach

Abuka, Gloria

dc.contributor.advisor	Ranganathan, Jaishree
dc.contributor.author	Abuka, Gloria
dc.contributor.committeemember	Dong, Zhijiang
dc.contributor.committeemember	Gu, Yi
dc.date.accessioned	2023-04-25T16:06:29Z
dc.date.available	2023-04-25T16:06:29Z
dc.date.issued	2023
dc.date.updated	2023-04-25T16:06:29Z
dc.description.abstract	Transfer learning is a machine learning method in which a model trained on a specific or general task (the source domain) is reused as the starting point for a new model on a similar task (the target domain). It is an important concept in natural language processing because it can produce strong results from small datasets. Text summarization produces a concise and meaningful version of a longer text, while sentiment analysis identifies the polarity expressed in the text. News and scientific articles have been used in text summarization models over the years, but drug reviews have received considerably less attention. This study proposes a text summarization and sentiment analysis method based on the transformer architecture for the 10 most useful reviews for 500 different drugs from a dataset of drug reviews. We manually created human-written summaries for the drug reviews and compared the performance of fine-tuned Text-to-Text Transfer Transformer (T5) and Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS) models with that of a Long Short-Term Memory (LSTM) model. Additionally, we assessed the impact of various preprocessing steps on the ROUGE scores. We also fine-tuned the Bidirectional Encoder Representations from Transformers (BERT) model for sentiment analysis and compared it with an LSTM model. Our T5-Base model had the best results, with average ROUGE-1, ROUGE-2, and ROUGE-L scores of 50.31, 29.14, and 40.06, respectively, while the BERT model achieved an accuracy of 84% on the sentiment analysis task. We also evaluated our fine-tuned summarization models on a dataset of BBC news summaries and achieved average ROUGE-1, ROUGE-2, and ROUGE-L scores of 72.20, 63.59, and 57.42, respectively. Our models outperformed two previous works, which reported ROUGE-1, ROUGE-2, and ROUGE-L scores of 47.0, 33.0, and 42.0 and of 47.30, 26.50, and 36.10, respectively.
dc.description.degree	M.S.
dc.identifier.uri	https://jewlscholar.mtsu.edu/handle/mtsu/6884
dc.language.rfc3066	en
dc.publisher	Middle Tennessee State University
dc.source.uri	http://dissertations.umi.com/mtsu:11682
dc.subject	LSTM
dc.subject	PEGASUS
dc.subject	Sentiment Analysis
dc.subject	T5
dc.subject	Text Summarization
dc.subject	Computer science
dc.thesis.degreelevel	masters
dc.title	Text Summarization and Sentiment Analysis of Drug Reviews: A Transfer Learning Approach

Files

Original bundle

Name: Abuka_mtsu_0170N_11682.pdf
Size: 485.86 KB
Format: Adobe Portable Document Format

Collections

Masters Theses