Sentiment Analysis of Online Reviews
Computers do not naturally understand the intricacies of human language. However, machine learning lets them extract information from it. Sentiment analysis, as the name suggests, teaches a computer to analyze a text and return a value indicating whether it is written in a positive, negative, or neutral manner. This in turn makes opinion mining possible. Both have become more practical as social media websites publish ever more data that can be used. We will therefore discuss sentiment analysis of reviews found on social media. The topic is significant not only for research, given the nature of machine learning, but also economically, because it allows business leaders to make more informed decisions using more reliable data.
Our text will come from social media websites such as Yelp, because they allow users to crowdsource influence. For example, if you are looking to purchase something or eat at a specific restaurant, you can read the thoughts of past customers rather than asking your friends, who are less likely to have purchased the item or visited the place (Liu). More than 70% of readers of reviews say that the reviews largely influence their purchases (Pang and Lee, 2). While reviews may be impactful, it is important to acknowledge the existence of “fake” reviews fabricated for the benefit of the business, as well as spam reviews written to advertise another business.
When mining data from social media websites such as Yelp, we can parse the text in different ways for sentiment analysis. One approach is to sort words into categories, such as a set of positive words and a set of negative words, and use those categories to assign a sentiment rating (Pak and Paroubek). For example, if a text contains many words from the positive set, we can infer that it is written in a positive sense. While this may seem simple, it becomes more complicated because human language can use seemingly positive words in a negative sense and vice versa (Vinodhini and Chandrasekaran). For example, the phrase “not bad” contains a negative word but is written with a positive intent. We therefore have to take such cases into consideration when opinion mining texts online.
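The word-set idea and the “not bad” problem can be illustrated with a minimal Python sketch. The word sets, the list of negators, and the function name `lexicon_sentiment` are all hypothetical and chosen only for illustration. The rule of flipping a word's polarity when a negator comes directly before it is a simplifying assumption, not the method of any of the cited works.

```python
import re

# Hypothetical toy word sets; a real system would use a much larger, curated lexicon.
POSITIVE = {"good", "great", "excellent", "delicious", "friendly"}
NEGATIVE = {"bad", "terrible", "rude", "slow", "bland"}
NEGATORS = {"not", "never", "no"}


def lexicon_sentiment(text):
    """Label a text as positive, negative, or neutral by counting lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Assumption: a negator directly before the word flips its polarity.
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("The food was not bad"))      # positive
print(lexicon_sentiment("The service was terrible"))  # negative
```

Without the negation check, “not bad” would be counted as negative. Even with it, this sketch misses contractions such as “wasn't”, sarcasm, and negators that sit further from the word they modify.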
There are different types of opinions: regular and comparative, and explicit and implicit (Liu). When we parse our data, it is wise to determine which types of opinions are in the data set. Regular opinions are either direct, stating a sentiment about something, or indirect, stating a sentiment as a result of something. Comparative opinions compare one thing with another. An explicit opinion is a subjective statement that expresses a sentiment outright, while an implicit opinion is an objective statement that implies one. Sentiment analysis becomes more difficult and more prone to error if these types are not accounted for. There will undoubtedly be problems with sentiment analysis; as previously mentioned, computers are only as intelligent as we make them.
As texts get longer and more linguistically complex, we have to evaluate their expressions separately. If a given text has more than one expression and those expressions carry contradicting sentiments, it is difficult to determine the sentiment of the text as a whole. It is therefore better to segment the text into expressions and determine the polarity of each one in order to retrieve the contextual polarity (Wilson et al.). By using the polarity of certain expressions to modify the polarity of others, we can achieve a more precise result that fits the context of the expression itself.
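A rough Python sketch shows why segmenting helps. It assumes, purely for illustration, that expressions can be separated at sentence punctuation and at contrastive words such as “but”, and it scores each expression with a hypothetical toy word list. This is far simpler than the phrase-level approach of Wilson et al., and the names `segment`, `expression_polarity`, and `polarity_by_expression` are invented for this example.

```python
import re

# Hypothetical toy word sets, for illustration only.
POSITIVE = {"good", "great", "excellent", "delicious", "friendly"}
NEGATIVE = {"bad", "terrible", "rude", "slow", "bland"}
NEGATORS = {"not", "never", "no"}

# Assumption: expressions end at sentence punctuation or at a contrastive word.
BOUNDARY = re.compile(r"[.!?;]+|,?\s+(?:but|however|although)\b\s*,?", re.IGNORECASE)


def segment(text):
    """Split a text into expressions at the assumed boundaries."""
    return [part.strip() for part in BOUNDARY.split(text) if part and part.strip()]


def expression_polarity(expression):
    """Return a signed score: above zero is positive, below zero is negative."""
    tokens = re.findall(r"[a-z']+", expression.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # A negator directly before a lexicon word flips that word's polarity.
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def polarity_by_expression(text):
    """Pair each expression in the text with its own polarity score."""
    return [(expr, expression_polarity(expr)) for expr in segment(text)]


review = "The food was delicious but the service was slow."
for expr, score in polarity_by_expression(review):
    print(score, expr)
```

Scored as a whole, this review sums to zero and would look neutral. Scored by expression, it shows one positive and one negative opinion, which is more useful to a reader or a business owner.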
Machine learning, sentiment analysis, and opinion mining will allow us to learn more about human language on a larger scale, thanks to the processing power of computers. Many Computer Science and Natural Language Processing problems still need to be solved before these methods can achieve reliably successful results. Nevertheless, it is interesting what the results tell us about our own language. By applying these concepts, we are able to break down our language and evaluate its subjective side by objective means.
Works Cited
1. Liu, Bing. “Sentiment analysis and opinion mining.” Synthesis lectures on human language technologies 5.1 (2012): 1-167.
2. Pak, Alexander, and Patrick Paroubek. “Twitter as a Corpus for Sentiment Analysis and Opinion Mining.” LREC. Vol. 10. 2010.
3. Pang, Bo, and Lillian Lee. “Opinion mining and sentiment analysis.” Foundations and trends in information retrieval 2.1-2 (2008): 1-135.
4. Vinodhini, G., and R. M. Chandrasekaran. “Sentiment analysis and opinion mining: a survey.” International Journal 2.6 (2012).
5. Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. “Recognizing contextual polarity in phrase-level sentiment analysis.” Proceedings of the conference on human language technology and empirical methods in natural language processing. Association for Computational Linguistics, 2005.