Comparing VADER and Text Blob to Human Sentiment | by Leah Pope | Dec, 2020
Let’s start by loading the labeled tweets and creating a new column for the VADER sentiment.
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

tweets = pd.read_csv('../data/prepped_sxsw_tweets.csv')

# Get the VADER sentiments
def get_vader_sentiment(analyzer, tweet):
    tweet = tweet.replace('#','') # include hashtag text
    vader_scores = analyzer.polarity_scores(tweet)
    compound_score = vader_scores['compound']
    vader_sentiment = None
    # using thresholds from VADER developers/researchers
    if (compound_score >= 0.05):
        vader_sentiment = 'positive'
    elif (compound_score < 0.05 and compound_score > -0.05):
        vader_sentiment = 'neutral'
    elif (compound_score <= -0.05):
        vader_sentiment = 'negative'
    return vader_sentiment

analyzer = SentimentIntensityAnalyzer()
tweets['vader_sentiment'] = tweets.apply(lambda row: get_vader_sentiment(analyzer, row['tweet_text']), axis=1)
tweets.head(3)
Cool! We just got the sentiment from VADER. I’m using the compound score to label the tweet as Positive/Neutral/Negative per the VADER documentation.
- positive: compound score >= 0.05
- neutral: (compound score > -0.05) and (compound score < 0.05)
- negative: compound score <= -0.05
Text Blob Sentiment
Now, let’s get the sentiment from Text Blob using the code below.
from textblob import TextBlob

def get_text_blob_sentiment(tweet):
    polarity = TextBlob(tweet).sentiment.polarity
    # The polarity score is a float within the range [-1.0, 1.0].
    textblob_sentiment = None
    if (polarity > 0):
        textblob_sentiment = 'positive'
    elif (polarity == 0):
        textblob_sentiment = 'neutral'
    elif (polarity < 0):
        textblob_sentiment = 'negative'
    return textblob_sentiment

tweets['text_blob_sentiment'] = tweets.apply(lambda row: get_text_blob_sentiment(row['tweet_text']), axis=1)
tweets.head(3)
Great! We just got the sentiment from Text Blob. I’m using the sentiment polarity score to label the tweet as Positive/Neutral/Negative. The polarity score is a float within the range [-1.0, 1.0].
A Deeper Look
Now let’s compare the human-labeled sentiment to the tool-labeled sentiment. I used bar plots to visualize the comparison with totals at the top of each bar for clarity. If you just looked at the bar heights and didn’t pay attention to the “Number of Tweets” on the side of each subplot, you could be misled.
The two tools do seem to be pretty comparable, with similar breakdowns for Negative, Neutral, and Positive tweets.
- 12.90% VADER Negative, 13.70% Text Blob Negative
- 41.70% VADER Neutral, 37.00% Text Blob Neutral
- 45.40% VADER Positive, 49.30% Text Blob Positive
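Here is a minimal sketch of how a breakdown like the one above could be computed. The helper name `label_breakdown` and the toy labels are mine, for illustration only; they are not the article's data. It uses plain Python so it runs anywhere, but a pandas column such as `tweets['vader_sentiment']` can be passed in directly because it is iterable.

```python
from collections import Counter

def label_breakdown(labels):
    """Return {label: percentage of all labels}, rounded to 2 decimals."""
    labels = list(labels)
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Toy labels (hypothetical, not taken from the tweet datasets)
vader_labels = ['positive', 'neutral', 'positive', 'negative']
print(label_breakdown(vader_labels))
```

Running the same helper on each tool's column gives two dictionaries that can be compared side by side.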
Unsurprisingly, there is a notable difference between the human and tool sentiment labels. 47.10% of tweets had differing human and VADER sentiments. 51.50% of tweets had differing human and Text Blob sentiments.
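One way to get a "percent differing" figure like this is to compare the two label columns row by row. The sketch below is my own illustration, not necessarily how the numbers above were produced. It assumes the human labels have already been normalized to the same lowercase strings the tool functions return ('positive', 'neutral', 'negative'); the function name `disagreement_rate` is hypothetical.

```python
def disagreement_rate(human_labels, tool_labels):
    """Percentage of rows where the human label and the tool label differ."""
    human_labels = list(human_labels)
    tool_labels = list(tool_labels)
    if len(human_labels) != len(tool_labels):
        raise ValueError('label sequences must have the same length')
    if not human_labels:
        return 0.0
    differing = sum(h != t for h, t in zip(human_labels, tool_labels))
    return round(100 * differing / len(human_labels), 2)

# Toy labels (hypothetical, not taken from the tweet datasets)
human = ['positive', 'neutral', 'negative', 'neutral']
vader = ['positive', 'positive', 'negative', 'neutral']
print(disagreement_rate(human, vader))
```

If the human labels use different casing or wording than the tool labels, they would need to be mapped first, otherwise every row counts as a disagreement.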
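The dataset provider mentions that the human-labeled sentiment is the sentiment “directed at” a brand or product. VADER and Text Blob score the tweet text as a whole rather than the sentiment toward a particular target, so this may account for some of the difference. Let’s try with another, similar dataset.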
# Get the VADER sentiments
analyzer = SentimentIntensityAnalyzer()
apple_tweets['vader_sentiment'] = apple_tweets.apply(lambda row: get_vader_sentiment(analyzer, row['text']), axis=1)

# Get the Text Blob sentiments
apple_tweets['text_blob_sentiment'] = apple_tweets.apply(lambda row: get_text_blob_sentiment(row['text']), axis=1)
apple_tweets.head(3)
In this case, we notice some larger differences between the two tools. VADER and Text Blob have similar numbers of Positive Tweets but differ quite a bit on Negative and Neutral Tweets.
- 33.00% VADER Negative, 24.80% Text Blob Negative
- 34.90% VADER Neutral, 42.80% Text Blob Neutral
- 32.10% VADER Positive, 32.40% Text Blob Positive
Again, no big surprise here. We see another notable difference between the human and tool sentiment labels. 39.70% of tweets had differing human and VADER sentiments. 42.70% of tweets had differing human and Text Blob sentiments.
Second opinion on Neutral?
While we probably will trust the human sentiment labels over the tool sentiment labels, we could use the tools to get a “second opinion”.
Human opinions on Neutrals might vary from human to human and even as a single human goes about labeling neutral text. Also, we might want to create a binary, Positive/Negative classifier. These are both compelling scenarios for using the tools to get a second opinion.
How might we do that? One way is to get the sentiment from both tools and if both agree, use that sentiment in place of the human Neutral label. I used this approach to get the following guidance on re-labeling Neutrals:
---- Apple and Google @ SXSW Tweets -----
28.60% of Neutral Tweets could be re-labeled.
24.50% of Neutral Tweets could be re-labeled to Positive.
4.10% of Neutral Tweets could be re-labeled to Negative.
---- Apple Tweets -----
21.00% of Neutral Tweets could be re-labeled.
18.50% of Neutral Tweets could be re-labeled to Positive.
2.50% of Neutral Tweets could be re-labeled to Negative.
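The second-opinion rule described above can be sketched in a few lines. This is an illustration under my own assumptions, not the exact code behind the numbers: I assume all three labels use the same lowercase strings, and I read "both agree" as "both tools agree on a non-neutral label" (agreeing on neutral changes nothing). The names `second_opinion` and `relabel_summary` are hypothetical.

```python
def second_opinion(human, vader, text_blob):
    """Return the label to use after consulting both tools.

    Only human 'neutral' labels are reconsidered, and only when VADER and
    Text Blob agree on a non-neutral label.
    """
    if human == 'neutral' and vader == text_blob and vader != 'neutral':
        return vader
    return human

def relabel_summary(rows):
    """rows: iterable of (human, vader, text_blob) label tuples.

    Returns percentages relative to the number of human-neutral rows.
    """
    neutrals = [r for r in rows if r[0] == 'neutral']
    if not neutrals:
        return {'relabeled': 0.0, 'to_positive': 0.0, 'to_negative': 0.0}
    new_labels = [second_opinion(*r) for r in neutrals]

    def pct(count):
        return round(100 * count / len(neutrals), 2)

    return {
        'relabeled': pct(sum(label != 'neutral' for label in new_labels)),
        'to_positive': pct(new_labels.count('positive')),
        'to_negative': pct(new_labels.count('negative')),
    }

# Toy rows (hypothetical, not taken from the tweet datasets)
rows = [
    ('neutral', 'positive', 'positive'),   # tools agree: relabel to positive
    ('neutral', 'negative', 'negative'),   # tools agree: relabel to negative
    ('neutral', 'positive', 'neutral'),    # tools disagree: stays neutral
    ('neutral', 'neutral', 'neutral'),     # tools agree on neutral: unchanged
    ('positive', 'negative', 'negative'),  # human label is not neutral: ignored
]
print(relabel_summary(rows))
```

Requiring both tools to agree is a conservative choice: a single tool's opinion never overrides the human label on its own.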