Sentiment Analysis

Academic year: 2022


PhD Seminar

Balamurali A R (08405401)

Under the guidance of

Prof. Pushpak Bhattacharyya

Dept. of CSE, IIT Bombay, Mumbai

(2)

Outline

o Introduction
o Motivation
o Challenges
o General Model
o Word level sentiment analysis
o Sentence level sentiment analysis
o Comparative sentence analysis
o Document level sentiment analysis
o Conclusion & Future work
o References

(3)

Sentiment Analysis (SA) - Introduction

• Advent of user-generated content (UGC): two-way communication.
• Vast amount of information, most of it direct feedback.
• Objective: to find the sentiment or opinion of a user with regard to an entity/object.
• Fine-grained version of subjectivity analysis.
• Subjectivity analysis: finding whether a phrase, sentence or document is subjective or objective.

(4)

Motivation

• Businesses and organizations:
  – Product and service benchmarking.
  – Market intelligence.
• People:
  – Finding opinions while purchasing a new product.
  – Finding opinions on political topics.
• Advertisement: placing ads in user-generated content.
  – Place an ad when someone praises a product.
  – Place an ad from a competitor when someone criticizes a product.
• Information search & retrieval:
  – Providing general search for "opinions".

(5)

General Model

• Opinion holder (source): the person who holds the sentiment.
  E.g. “I love playing hockey.”
  E.g. “I agree with what the Pope said: ‘hate the sin, not the sinners’.” Holders: <I, Pope>
• Object (target): the product, person, organization or topic on which the sentiment is expressed.
  E.g. “I like Nano. But I don’t like the steering of Nano.”
• Opinion/sentiment: a view or appraisal of an object.
  E.g. “It’s a pity (negative) that she didn’t marry.”
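The three parts of the general model can be held in a small record type. The sketch below is only an illustration: the `Opinion` class, its field names and the `polarity_by_target` helper are assumptions made for this example, not something the slides define.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Opinion:
    """One opinion: who holds it, what it is about, and its polarity (hypothetical type)."""
    holders: Tuple[str, ...]  # opinion holder(s), e.g. ("I",) or ("I", "Pope")
    target: str               # object on which the sentiment is expressed
    polarity: str             # "positive", "negative" or "neutral"


# The two opinions in "I like Nano. But I don't like the steering of Nano."
opinions: List[Opinion] = [
    Opinion(holders=("I",), target="Nano", polarity="positive"),
    Opinion(holders=("I",), target="steering of Nano", polarity="negative"),
]


def polarity_by_target(ops: List[Opinion]) -> Dict[str, str]:
    """Index opinions by target so the whole and its parts stay separate."""
    return {op.target: op.polarity for op in ops}
```

Keeping the target explicit is what lets the object and one of its features carry opposite polarities in the same review.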

(6)

Challenges

• Identifying source and target: the sum of the parts is not equal to the whole [Turney’02]
  – Movies and the themes they include: how to separate the sentiment?
    “The movie was a classic; in fact, Gabbar Singh was the epitome of villainy!”
• Differentiating features and attributes
    “I hate iPod, but I like the scroll technology”
• Role of semantics
    “How could anyone sit through this movie?”
• Issue of ideology [Sack’94]
    “Saddam Hussein”: mixed opinion?

(7)

Sentiment Analysis: How to do it?

• Word level sentiment analysis
• Sentence level sentiment analysis
• Comparative sentence analysis
• Sentiment classification at document level

(8)

Word level Sentiment Analysis

• Used for grammatically incoherent text such as short newspaper headlines, e.g.
  “Almost Perfection” [The Hindu, 22/04/09]
• Direct computation using a lexical resource: SentiWordNet, WordNet-Affect
• SentiWordNet
  – WordNet entries graded with pos(c), neg(c) and obj(c) scores, e.g. “love”
  – Created using classifiers
  – Interesting finding: opinionated content is mostly carried by modifiers (adjectives and adverbs), e.g. “smart IITian”

Source: http://sentiwordnet.isti.cnr.it
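A direct lexicon-based computation can be sketched as follows. This is a minimal illustration, not the SentiWordNet API: `TOY_LEXICON` is a hypothetical stand-in whose (pos, neg, obj) triples are invented for the example (they sum to 1, as SentiWordNet scores do), and averaging over words is an assumed scoring rule.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a SentiWordNet-style resource.
# Each word maps to (pos, neg, obj) scores; the numbers are invented.
TOY_LEXICON: Dict[str, Tuple[float, float, float]] = {
    "perfection": (0.75, 0.0, 0.25),
    "smart": (0.625, 0.0, 0.375),
    "almost": (0.0, 0.0, 1.0),
    "disaster": (0.0, 0.75, 0.25),
}


def word_polarity(word: str) -> float:
    """Return pos - neg for a word; unknown words count as fully objective."""
    pos, neg, _obj = TOY_LEXICON.get(word.lower(), (0.0, 0.0, 1.0))
    return pos - neg


def headline_polarity(headline: str) -> float:
    """Average the word polarities of a short, possibly ungrammatical text."""
    words = headline.split()
    if not words:
        return 0.0
    return sum(word_polarity(w) for w in words) / len(words)


score = headline_polarity("Almost Perfection")  # positive with this toy lexicon
```

Because no parsing is involved, the same function works on headlines that are not full sentences, which is the use case named above.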

(9)

UPAR7 [Chaumartin’07]

E.g. “Manmohan insists troops stay in Guwahati, predicts midterm victory”

Processing stages:
• Pre-processing
• POS tagging
• Dependency parsing
• Global sentence rating
• Addition of rules
• Linear combination of words
• Evaluation

The system achieves a valence accuracy of 55%.

(10)

Sentence level Sentiment Analysis

• Contextual information is necessary for SA at the sentence level
  “Indian observers were not happy about things happening in its border country, even though the West was enjoying the show.”
• A prior polarity cannot be assigned to all words!
  E.g. “long battery life” vs. “long time to recharge”
• Issue of negation: “not happy”
• Issue of syntactic role: “polluters are ...” vs. “they are polluters”
• Issue of neutral polarity: “look forward to”

(11)

How to include Contextual Polarity

Prior clues

• Assumption: sentence polarity is the product of its clues
• Different clues are detected manually

Subjectivity detection

• Classify each sentence as polar or neutral
• Create a feature vector from each polar sentence

Polarity classification

• Disambiguate contextual polarity
• Assign polarity

(12)

How to include Contextual Polarity (contd.)

• The neutral-polar classifier used 28 features and achieved an accuracy of 75.9%
• The polarity classifier used 10 features and achieved an accuracy of 65.7%
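The two-step flow (neutral vs. polar first, then polarity) can be sketched with hand-written rules standing in for the trained classifiers. Everything here is an assumption for illustration: the `PRIOR` lexicon, the `NEGATORS` set and the rule that a negator directly before a clue flips its polarity. The system described above uses learned classifiers with 28 and 10 features, which this sketch does not reproduce.

```python
from typing import List

# Hypothetical prior polarities for a few clue words (+1 positive, -1 negative).
PRIOR = {"happy": 1, "enjoying": 1, "pity": -1, "hate": -1}
# Hypothetical negation words.
NEGATORS = {"not", "never", "no"}


def tokenize(sentence: str) -> List[str]:
    """Lowercase and split on whitespace after dropping commas and full stops."""
    return sentence.lower().replace(",", " ").replace(".", " ").split()


def is_polar(tokens: List[str]) -> bool:
    """Step 1 (neutral vs. polar): polar if any clue word occurs."""
    return any(tok in PRIOR for tok in tokens)


def contextual_polarity(tokens: List[str]) -> str:
    """Step 2: add up clue polarities, flipping a clue that directly follows a negator."""
    score = 0
    for i, tok in enumerate(tokens):
        if tok in PRIOR:
            negated = i > 0 and tokens[i - 1] in NEGATORS
            score += -PRIOR[tok] if negated else PRIOR[tok]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify(sentence: str) -> str:
    """Run both steps on one sentence."""
    tokens = tokenize(sentence)
    if not is_polar(tokens):
        return "neutral"
    return contextual_polarity(tokens)
```

With these rules, a sentence containing both “not happy” and “enjoying” cancels out to neutral, which shows why richer contextual features are needed at the sentence level.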

(13)

Comparative sentence analysis

• Detects a preference between entities
  “I prefer Prof. X’s class to Prof. Y’s class”
• More closely related to opinion mining
• Common feature: presence of a comparison word, e.g. “IIT Bombay is better than IIT Y”
• The comparison word may or may not be opinionated (express an emotional state), e.g. “longer”
• Finding the preference is easy when the sentence is opinionated
  – Find the context: <feature, comparison word>, e.g. <battery life, longer>
  – Use the context to determine the opinion orientation (explained on later slides)

(14)

Comparative sentence analysis

• Different types of comparatives:
  – Non-equal gradable (“less than”)
  – Equative (“same”)
  – Superlative (“longest”)
  – Non-gradable (“Nano and Supera have different features”)
• Comparative relation (CR): <long, battery, S1, S2>
• Objective: given a CR, find whether S1 or S2 is preferred.
• Some more categories of comparatives:
  – Type 1 (-er, -est)
  – Type 2 (more, most)
  – Increasing comparatives (“longer”)
  – Decreasing comparatives (“fewer”)
• The final analysis depends on the type of comparative word (C) and the feature involved (F).
  – Opinionated comparative
  – Comparative with context-dependent opinion (“higher mileage”)

(15)

Comparative sentence analysis

Different cases:

• C opinionated
  – E.g. “better”
  – Assign S1 as preferred
• Only F opinionated
  – E.g. “X makes more noise than Y”
  – Use comparative rules
  – Get the preferred entity
• Both C and F not opinionated
  – E.g. “long (battery) life”
  – Use an external source
  – Find OSA(F, C) = log( Pr(F, C) · Pr(C | F) / (Pr(F) · Pr(C)) )
  – Apply a decision rule for the preference
• C is a feature indicator
  – E.g. “Nano is smaller than Indica”
  – Count the number of times C appears in pros and in cons
  – If #pros(C) > #cons(C), S1 is preferred
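The OSA score and the pros/cons count rule can be sketched from raw counts. This is a sketch under stated assumptions: probabilities are estimated as relative frequencies over a hypothetical corpus of pros/cons entries, the function and parameter names are invented, and falling back to S2 when the count rule does not favour S1 is an assumption (the slide only states the S1 case).

```python
import math


def osa(n_fc: int, n_f: int, n_c: int, n_total: int) -> float:
    """OSA(F, C) = log( Pr(F, C) * Pr(C | F) / (Pr(F) * Pr(C)) ).

    n_fc: entries containing both feature F and comparative word C
    n_f, n_c: entries containing F, respectively C
    n_total: total number of entries in the corpus
    """
    if min(n_fc, n_f, n_c, n_total) <= 0:
        raise ValueError("all counts must be positive")
    p_fc = n_fc / n_total
    p_f = n_f / n_total
    p_c = n_c / n_total
    p_c_given_f = n_fc / n_f
    return math.log(p_fc * p_c_given_f / (p_f * p_c))


def prefer_by_counts(n_pros: int, n_cons: int) -> str:
    """Feature-indicator case: S1 is preferred if C appears more often in pros.

    Returning "S2" otherwise is an assumption made for this sketch.
    """
    return "S1" if n_pros > n_cons else "S2"


# Hypothetical counts for <battery life, longer> in a pros corpus.
score = osa(n_fc=10, n_f=20, n_c=50, n_total=1000)
```

Comparing the score obtained from pros with the score obtained from cons is one plausible decision rule for the context-dependent case.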

(16)

Comparative sentence analysis

• Baseline (default preference S1): 84%
• System accuracy: 94%
• Inference:
  – In comparative sentences, people usually prefer S1

(17)

Sentiment Analysis – Document Level

• Objective: to classify documents as positive or negative
  E.g. “Manali travel review”: recommended / not recommended
• Steps:
  1. Extract phrases two words long (with context), e.g. “good place”
  2. Calculate the semantic orientation of the extracted phrases using PMI with “excellent” and “poor”
  3. Classify the document by the average semantic orientation of its phrases, with a threshold of zero

(18)

Sentiment Analysis – Document Level

\[
\mathrm{SO}(\text{phrase}) = \log_2 \frac{\mathrm{hits}(\text{phrase NEAR ``excellent''}) \cdot \mathrm{hits}(\text{``poor''})}{\mathrm{hits}(\text{phrase NEAR ``poor''}) \cdot \mathrm{hits}(\text{``excellent''})}
\]

E.g. SO(“unethical practices”) = -8.484

• Different categories were tested: automobiles, banks, movies, travel destinations.
• Average accuracy: 74%, except for movies.
• Movie reviews describe a theme while expressing a sentiment
  E.g. “Raj’s arrogance and sadistic mentality towards society is mercilessly shown by director Mani Ratnam. The film can be regarded as one of the all-time best nonfiction movies.”
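The semantic orientation score and the zero-threshold decision can be sketched as below. The hit counts are hypothetical inputs (no search engine is queried), the function names are invented, and the small `smoothing` constant that guards against zero hits is an assumption of this sketch.

```python
import math
from typing import List


def semantic_orientation(hits_near_excellent: float, hits_near_poor: float,
                         hits_excellent: float, hits_poor: float,
                         smoothing: float = 0.01) -> float:
    """SO(phrase) = log2( hits(phrase NEAR "excellent") * hits("poor")
                          / (hits(phrase NEAR "poor") * hits("excellent")) ).

    `smoothing` is added to the two co-occurrence counts to avoid division by zero.
    """
    numerator = (hits_near_excellent + smoothing) * hits_poor
    denominator = (hits_near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


def classify_review(phrase_orientations: List[float], threshold: float = 0.0) -> str:
    """Recommend the item if the average phrase orientation exceeds the threshold."""
    if not phrase_orientations:
        raise ValueError("need at least one phrase")
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > threshold else "not recommended"


# Hypothetical hit counts: the phrase co-occurs four times more often with "excellent".
so = semantic_orientation(8, 2, 1000, 1000, smoothing=0.0)
```

A single strongly negative phrase such as the -8.484 example can outweigh several mildly positive ones, since the decision uses the plain average.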

(19)

Sentiment Analysis – Document Level: A graph based method

[Figure: baseline version vs. graph-based min-cut system. Source: Pang & Lee, 2004]

(20)

Sentiment Analysis – Document Level: A graph based method (contd.)

• Objective: minimize the cost of partitioning the items x1, ..., xn into two classes, using two kinds of scores:
  – Individual score ind_j(xi): non-negative estimate of each xi’s preference for being in class Cj
  – Association score assoc(xi, xj): non-negative estimate of how important it is that xi and xj are in the same class
• Solution: build a graph G with vertices V = {v1, v2, ..., vn, s, t} and partition it with a cut of minimum cost.
• In our case, s and t stand for subjective and objective
  – ind_j(si) = Pr(si | sub)
  – assoc(si, sj) = function of the distance between si and sj
• Accuracy improved from 85.2% to 86.4%
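The partition objective can be made concrete with a tiny example. The cost used here follows the min-cut formulation of Pang & Lee (2004) as I understand it: an item placed in one class pays its individual score for the other class, and every pair split across classes pays its association score. The search below is brute force over all labelings, standing in for a real minimum-cut solver, and all scores are invented for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def partition_cost(labels: Sequence[str], ind_sub: List[float], ind_obj: List[float],
                   assoc: Dict[Tuple[int, int], float]) -> float:
    """Cost of one labeling: individual penalties plus associations that get cut."""
    cost = 0.0
    for i, label in enumerate(labels):
        # An item labeled "sub" gives up its preference for "obj", and vice versa.
        cost += ind_obj[i] if label == "sub" else ind_sub[i]
    for (i, j), weight in assoc.items():
        if labels[i] != labels[j]:
            cost += weight
    return cost


def best_partition(ind_sub: List[float], ind_obj: List[float],
                   assoc: Dict[Tuple[int, int], float]) -> Tuple[str, ...]:
    """Exhaustively find the cheapest labeling (only feasible for very small n)."""
    candidates = product(("sub", "obj"), repeat=len(ind_sub))
    return min(candidates, key=lambda labels: partition_cost(labels, ind_sub, ind_obj, assoc))


# Three sentences; the middle one is borderline but strongly tied to the first.
ind_sub = [0.9, 0.45, 0.1]
ind_obj = [0.1, 0.55, 0.9]
assoc = {(0, 1): 0.5, (1, 2): 0.1}
labels = best_partition(ind_sub, ind_obj, assoc)
```

Without association scores the borderline sentence would be labeled objective; the strong tie to its subjective neighbour pulls it into the subjective class, which is the effect the graph formulation is meant to capture.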

(21)

Conclusion & Future work

Conclusion:

• Different levels of text require different treatment for assessing the sentiment.
• The domain of the text also plays an important role.

Future work:

• Finding the target of the sentiment
• Dealing with sarcasm
• Multilingual sentiment analysis
• Ideology and its handling

(22)

References

[1] Warren Sack. 1994. On the Computation of Point of View. Proceedings of the Twelfth National Conference on Artificial Intelligence, p. 1488.

[2] Pang & Lee. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, ACL, July 2002, pp. 79-86.

[3] Peter Turney. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 417-424.

[4] C. Strapparava and A. Valitutti. 2004. WordNet-Affect: an Affective Extension of WordNet. Proceedings of LREC, Vol. 4, pp. 1083-1086.

[5] Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Proceedings of ACL.

[6] Wiebe et al. 2005. Annotating Expressions of Opinions and Emotions in Language. Computers and the Humanities, Vol. 39, No. 2-3 (May 2005), pp. 165-210.

[7] Theresa Wilson, Janyce Wiebe and Paul Hoffmann. 2005. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proceedings of the Human Language Technologies Conference/Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005).

(23)

References (contd.)

[8] J. Wiebe and R. Mihalcea. 2006. Word Sense and Subjectivity. Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (Sydney, Australia, July 17-18, 2006), pp. 1065-1072.

[9] Andrea Esuli and Fabrizio Sebastiani. 2006. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining. Proceedings of the 5th Conference on Language Resources and Evaluation (LREC’06), pp. 417-422.

[10] François-Régis Chaumartin. 2007. UPAR7: A Knowledge-based System for Headline Sentiment Tagging. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Association for Computational Linguistics, pp. 422-425.

[11] Bing Liu. 2007. Web Data Mining. Springer, chapter 11.

[12] Ganapathibhotla & Liu. 2008. Mining Opinions in Comparative Sentences. Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 241-248.

[13] http://emetrics.org/2007/washingtondc/track_web20_measurement.php#usergenerated

[14] http://en.wikipedia.org/wiki/User-generated_content
