Preprint

The Dark Side of Sentiment Analysis: An Exploratory Review Using Lexicons, Dictionaries, and a Statistical Monkey and Chimp

SSRN

Authors: Jim Samuel, Gavin Rozzi, Ratnakar Palle

Published: January 4, 2022

Abstract

This article discusses the inconsistencies, inaccuracies and challenges of sentiment analysis and demonstrates problems with using sentiment analysis lexicons or dictionaries for estimating sentiment in textual artifacts, comparing multiple methods on stock market and vaccine tweets.

Overview

Sentiment analysis, an important dimension of natural language processing (NLP), has seen an exponential adoption rate across research and practitioner disciplines. Many interesting developments in NLP methods continue to improve the accuracy of sentiment analysis.

However, the plethora of sentiment analysis methods, dictionaries and lexicons, tools, open source code for machine learning based sentiment analysis, and off-the-shelf sentiment analysis solutions has led to a flurry of research and applied solutions without sufficient concern for the limitations, context, and inaccuracies of sentiment analysis.

Research Approach

This study reviews known issues with sentiment analysis as documented by prior research and then compares the application of multiple off-the-shelf lexicon and dictionary methods to stock market and vaccine tweets.

The intention is not to improve the accuracy of sentiment analysis as compared to prior benchmarks but to identify and discuss critical aspects of the “dark side” and develop a conceptual discussion of the characteristics of the dark side of sentiment analysis.
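
The kind of comparison described above can be illustrated with a minimal sketch. It scores the same short texts with two small word lists and reports how often the resulting labels disagree. Everything in it is an assumption made for illustration: the two lexicons (`LEXICON_A`, `LEXICON_B`), their word scores, the sample texts, and the helper names are hypothetical and are not the dictionaries, data, or procedure used in the paper.

```python
# Hypothetical toy lexicons: word scores invented for illustration only,
# not taken from any published sentiment dictionary.
LEXICON_A = {"gain": 1.0, "strong": 1.0, "loss": -1.0,
             "crash": -2.0, "shot": -1.0, "safe": 1.0}
LEXICON_B = {"gain": 0.5, "strong": 0.5, "loss": -0.5,
             "crash": -1.0, "shot": 0.0, "safe": 1.0, "effective": 1.0}

# Hypothetical sample texts in the style of stock market and vaccine tweets.
SAMPLE_TEXTS = [
    "Got my shot today, feeling safe",
    "Market crash wiped out my gain",
    "The vaccine is effective",
]


def score(text, lexicon):
    """Sum the lexicon scores of the words in text; unknown words count as 0."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def label(value):
    """Map a numeric score to a coarse sentiment label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def compare(texts, lexicons):
    """Label each text with every lexicon and flag texts where labels differ."""
    rows = []
    for text in texts:
        labels = {name: label(score(text, lex)) for name, lex in lexicons.items()}
        disagree = len(set(labels.values())) > 1
        rows.append((text, labels, disagree))
    return rows


def disagreement_rate(rows):
    """Share of texts on which at least two lexicons assign different labels."""
    if not rows:
        return 0.0
    return sum(1 for _, _, disagree in rows if disagree) / len(rows)


if __name__ == "__main__":
    results = compare(SAMPLE_TEXTS, {"A": LEXICON_A, "B": LEXICON_B})
    for text, labels, disagree in results:
        print(f"{text!r}: {labels} disagree={disagree}")
    print(f"disagreement rate: {disagreement_rate(results):.2f}")
```

Even on three sentences, the two word lists can assign different labels to the same text because they cover different vocabulary and weight shared words differently. This is one simple way such inconsistencies between lexicon-based methods can surface.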

Key Contributions

Comprehensive review of sentiment analysis limitations
Empirical comparison of multiple lexicon-based methods
Conceptual framework for understanding sentiment analysis challenges
Recommendations for future research directions

Implications

This research helps align researcher and practitioner expectations with the limits and boundaries of NLP-based solutions for sentiment analysis and estimation.

Keywords: nlp, sentiment-analysis, data-science, machine-learning, research
