How to do sentiment analysis in Python?

Mining product reviews, recommending movies, predicting the stock market... sentiment analysis is useful everywhere. This article walks you step by step through producing your own sentiment analysis results in Python. Don't you want to give it a try?



If you follow data science research or business practice, the term "sentiment analysis" is not new to you, right?

Wikipedia defines sentiment analysis as follows:

Text sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text mining, and computational linguistics to identify and extract subjective information from source material.

Sounds lofty, doesn't it? Let's make it more concrete.

Given a passage of text, you can use automated sentiment analysis to find out what emotional tone that passage carries.

Magical, right?

Sentiment analysis is not a tool for showing off. It is a quiet way to make a fortune. Back in 2010, researchers reported that sentiment analysis of publicly available Twitter posts could predict the rise and fall of the stock market with an accuracy of 87.6%!

In the view of these academics, once you have access to large amounts of real-time social media text data and you use the black magic of sentiment analysis, you've got a crystal ball for predicting near-term investment market trends.

Isn't this feeling of crushing the competition with data science wonderful?

There is so much text data available to us in the age of big data. The reviews on Dianping, Douban and Amazon alone are enough to keep us digging with shovel and pickaxe for a long time.

Are you wondering how you, a non-computer science major, can apply such advanced technology?

No need to worry. Sentiment analysis used to be the closely guarded secret of research labs and big companies, but it has long since reached ordinary households. With the barrier to entry lowered, ordinary people like us can run sentiment analysis on large amounts of text with a few lines of Python.

Are you itching to try it out?

Then let's get started.


To better use Python and related packages, you need to install the Anaconda suite first. For detailed process steps, please refer to the article "How to make a word cloud in Python".

Open your system's "Terminal" (macOS, Linux) or "Command Prompt" (Windows), change to our working directory demo, and run the following commands.

pip install snownlp
pip install -U textblob
python -m textblob.download_corpora

OK, at this point your sentiment analysis environment is configured.
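If you want to double-check the setup before opening the notebook, a small script like the one below can confirm that both packages import. This is only a sketch: the helper name is_installed is made up for illustration, and it checks importability only, not whether the TextBlob corpora were downloaded.

```python
def is_installed(package_name):
    """Return True if `package_name` can be imported in this environment."""
    try:
        __import__(package_name)
        return True
    except ImportError:
        return False


# Package names taken from the pip commands above.
for name in ("textblob", "snownlp"):
    status = "OK" if is_installed(name) else "missing"
    print("%s: %s" % (name, status))
```

If either line reports "missing", rerun the corresponding pip command.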

In a terminal or at a command prompt, type:

jupyter notebook

You'll see some earlier files in the directory; just ignore them.

Now we can happily write Python programs that do text sentiment analysis.


We start with sentiment analysis of English text.

Here we need the TextBlob package.

In fact, as you can see from the image above, this package can do many things related to text processing. In this article we focus only on sentiment analysis. We'll cover the other features later when we have time.

Let's create a new Python 2 notebook and name it "sentiment-analysis".

Prepare the English text data first.

text = "I am happy today. I feel sad today."

Here we have entered two sentences and stored them in the variable text. Having studied English for over a decade, you should immediately recognize the sentiment of each one. The first sentence, "I am happy today", is positive; the second, "I feel sad today", is negative.

Now let's see whether the sentiment analysis tool TextBlob can correctly identify the sentiment of these two sentences.

First we bring in TextBlob.

from textblob import TextBlob
blob = TextBlob(text)
blob

Press Shift+Enter to execute it. The result seems to just print the two sentences as they are.

Don't worry, TextBlob has already split the text into separate sentences for us. Let's check whether it split them correctly.

blob.sentences

After execution, the output lists the two sentences as separate items.

The split is spot on. But what's the big deal about splitting sentences? I want the results of the sentiment analysis!

Why such a hurry? One step at a time. Okay, let's output the sentiment analysis result for the first sentence.

blob.sentences[0].sentiment

After execution, you will see an interesting result: polarity 0.8, subjectivity 1.0. To clarify, polarity ranges over [-1, 1], with -1 representing completely negative and 1 representing completely positive. Subjectivity ranges over [0, 1], with 0 being very objective and 1 being very subjective.

Since I said I was "happy", it's only right that the sentiment analysis is positive.
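The check on a single sentence can be wrapped into a small helper that scores every sentence in one go. This is a sketch, not part of TextBlob: label_polarity and sentence_report are hypothetical names, and treating 0 as the boundary between positive and negative is an assumption made for illustration. TextBlob is imported inside the function so that the labeling helper can be used on its own.

```python
def label_polarity(polarity):
    """Turn a polarity score in [-1, 1] into a rough label (0 is an assumed cutoff)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def sentence_report(text):
    """Return (sentence, polarity, label) for each sentence TextBlob finds in `text`."""
    from textblob import TextBlob  # requires the textblob package and its corpora

    rows = []
    for sentence in TextBlob(text).sentences:
        polarity = sentence.sentiment.polarity
        rows.append((str(sentence), polarity, label_polarity(polarity)))
    return rows


print(label_polarity(0.8))   # positive
print(label_polarity(-0.5))  # negative
```

In the notebook, sentence_report(text) returns one row per sentence.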

While we're at it, let's look at the second sentence.

blob.sentences[1].sentiment

The result: "sad" corresponds to a polarity of -0.5. No problem!

More interestingly, we can also have TextBlob assess the sentiment of the entire text.

blob.sentiment

What is the result?

I'll give you 10 seconds. Guess.

Without further ado: the overall polarity comes out positive.

This may not make sense to you. How can a "happy" sentence and a "sad" sentence combine into a positive result?

First, words of different polarity carry different numerical weights: "happy" scored 0.8, while "sad" scored only -0.5, and we could surely find a more negative word than "sad". Second, it is only logical: who would describe their own feelings in such a contradictory way, up in the clouds one moment and down on the ground the next?
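One way to see why the total is positive is to average the two scores reported above. This is only a back-of-the-envelope check: it assumes the overall polarity behaves like a plain mean of the scored words, which is a simplification of whatever TextBlob does internally.

```python
# Polarity scores reported above for the two sentences.
polarities = [0.8, -0.5]

# Plain arithmetic mean (an assumed stand-in for the library's own aggregation).
overall = sum(polarities) / len(polarities)

print(round(overall, 2))  # 0.15
print(overall > 0)        # True
```

Because 0.8 outweighs -0.5, the average stays above zero.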


Having experimented with sentiment analysis of English texts, it's time for us to return to our native language. After all, the text we usually come across most on the Internet is still in Chinese.

For Chinese text analysis, we use the SnowNLP package. Like TextBlob, this package is versatile too.

Let's prepare the text. This time we'll try two different adjectives.

text = u"我今天很快乐。我今天很愤怒。"  # "I'm happy today. I'm angry today."

Notice that we added the letter u before the opening quotation mark. This matters, because it tells Python: "The text we are entering here is Unicode, so don't get it wrong." As for the details of text encodings, we'll talk more about that when we get a chance.
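To see what the u prefix buys you, compare the number of characters in a Unicode string with the number of bytes in its UTF-8 encoding. The sample sentence is an assumption chosen for illustration. On Python 3 every string is already Unicode, so the prefix is accepted but optional.

```python
# -*- coding: utf-8 -*-
sample = u"我今天很快乐。"  # "I'm happy today." (illustrative sample)

# Encode the Unicode text into raw UTF-8 bytes.
encoded = sample.encode("utf-8")

print(len(sample))   # characters: 7
print(len(encoded))  # bytes: 21
```

Tools that work character by character, such as a Chinese sentence splitter, need the 7-character view rather than the 21-byte one.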

Okay, the text is there, so let's put SnowNLP to work.

from snownlp import SnowNLP
s = SnowNLP(text)

We want to see whether SnowNLP can split our input into sentences correctly, as TextBlob did, so we run the following loop.

for sentence in s.sentences:
    print(sentence)

The result looks like this.

OK, it seems SnowNLP splits the sentences correctly.

Let's look at the sentiment analysis result for the first sentence.

s1 = SnowNLP(s.sentences[0])
s1.sentiments

The result is very close to 1. It seems the keyword "happy" really speaks for itself; it basically gets full marks.

Let's look at the second sentence.

s2 = SnowNLP(s.sentences[1])
s2.sentiments

The result is a small positive number.

Here you must have noticed a problem: why does the word "angry", which expresses such a strong negative emotion, still get a positive score?

This is because SnowNLP and TextBlob score differently. SnowNLP's sentiment value is "the probability that the sentence expresses a positive sentiment", so it always lies between 0 and 1. In other words, for the sentence "I'm angry today", SnowNLP believes the probability that it expresses a positive emotion is very, very low.
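If you want to put SnowNLP's scores side by side with TextBlob's, one option is to stretch the probability onto the [-1, 1] scale. The linear mapping below is an assumption made for illustration, not something either library provides, and the two numbers still come from different models, so treat any comparison as rough.

```python
def probability_to_polarity(p_positive):
    """Map a probability of positive sentiment in [0, 1] onto [-1, 1]."""
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return 2.0 * p_positive - 1.0


print(probability_to_polarity(1.0))  # 1.0
print(probability_to_polarity(0.5))  # 0.0
print(probability_to_polarity(0.0))  # -1.0
```

Under this mapping, a SnowNLP score just above 0 lands close to -1, which matches the intuition that "angry" is strongly negative.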

It makes so much more sense to explain it that way.


Fun to learn the basic moves, isn't it? Now you can find some Chinese and English texts and practice sentiment analysis on your own.

But you may soon run into problems. For example, you type in a clearly negative statement and get a positive result.

Don't conclude that you've been fooled again. Let me explain what is going on.

First, judging the sentiment of many utterances requires context and background knowledge. When that information is missing, accuracy suffers. This is where humans are still more capable than machines (at least for now).

Second, every sentiment analysis tool is, in fact, trained. The text used for training directly affects what kind of text the model handles well.

Take SnowNLP, for example: its training text is review data. So you should get good results if you use it to analyze Chinese reviews. However, if you use it on other types of text, such as fiction or poetry, the results will be much worse, because it has never seen text put together that way.

The solution, of course, is to train it on other types of text. Once it has seen plenty of them, they will no longer surprise it. As for how to train, please contact the author of the relevant software package for advice.
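To make the point about training data concrete, here is a toy word-count Naive Bayes classifier trained on a few made-up review snippets. It is an illustration only, not SnowNLP's actual code; the class name TinyNaiveBayes and the training sentences are invented for this sketch.

```python
import math
from collections import Counter


class TinyNaiveBayes(object):
    """Toy word-count Naive Bayes sentiment model, for illustration only."""

    def __init__(self):
        self.word_counts = {"pos": Counter(), "neg": Counter()}
        self.doc_counts = {"pos": 0, "neg": 0}

    def train(self, labeled_texts):
        """Count words per label from (text, label) pairs."""
        for text, label in labeled_texts:
            self.word_counts[label].update(text.lower().split())
            self.doc_counts[label] += 1

    def prob_positive(self, text):
        """Return the estimated probability that `text` is positive."""
        vocab = set(self.word_counts["pos"]) | set(self.word_counts["neg"])
        total_docs = float(sum(self.doc_counts.values()))
        log_score = {}
        for label in ("pos", "neg"):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                if word not in vocab:
                    continue  # a word never seen in training carries no evidence
                smoothed = self.word_counts[label][word] + 1.0  # add-one smoothing
                score += math.log(smoothed / (total_words + len(vocab)))
            log_score[label] = score
        # Convert the two log scores into a probability for the positive label.
        return 1.0 / (1.0 + math.exp(log_score["neg"] - log_score["pos"]))


# Made-up "review" training data.
reviews = [
    ("great phone fast delivery", "pos"),
    ("love it works great", "pos"),
    ("terrible quality broke fast", "neg"),
    ("awful service never again", "neg"),
]

model = TinyNaiveBayes()
model.train(reviews)

p_review_good = model.prob_positive("great phone")
p_review_bad = model.prob_positive("awful quality")
p_poem = model.prob_positive("moonlight drifts over silent hills")

print(p_review_good)  # above 0.5
print(p_review_bad)   # below 0.5
print(p_poem)         # 0.5: no word was seen in training
```

The model is confident about review-style text, but for the poetry-like line it has nothing to go on and falls back to a coin flip. Feeding it labeled examples from the new genre is what changes that.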


Besides the application areas mentioned in this article, what other tasks do you know of that sentiment analysis could help automate? Besides TextBlob and SnowNLP, what other free and open software packages do you know that can help with sentiment analysis? Feel free to leave a comment and share with everyone, and let's discuss together.
