
Sentiment Analysis

PhD Seminar

Balamurali A R(08405401)

Under guidance of

Prof. Pushpak Bhattacharyya

Dept. of CSE, IIT Bombay, Mumbai

Outline

o Introduction
o Motivation
o Challenges
o General Model
o Word level sentiment analysis
o Sentence level sentiment analysis
o Comparative sentence analysis
o Document level sentiment analysis
o Conclusion & Future work
o References

Sentiment Analysis (SA) - Introduction

• Advent of user-generated content (UGC): a two-way communication.
• Vast amount of information, most of it direct feedback.
• Objective: to find the sentiment or opinion of a user with regard to an entity/object.
• A fine-grained version of subjectivity analysis.
• Subjectivity analysis: finding whether a phrase, sentence, or document is subjective or objective.

Motivation

• Businesses and organizations:
  – Product and service benchmarking.
  – Market intelligence.
• People:
  – Finding opinions while purchasing a new product.
  – Finding opinions on political topics.
• Advertisement: placing ads in user-generated content.
  – Place an ad when one praises a product.
  – Place an ad from a competitor if one criticizes a product.
• Information search & retrieval:
  – Providing general search for "opinions".

General Model

• Opinion holder (source): the person who holds the sentiment.
  E.g. "I love playing hockey."
  E.g. "I agree with what the Pope said: 'hate the sin, not the sinners'" - <I, Pope>
• Object (target): the product, person, organization, or topic on which the sentiment is expressed.
  E.g. "I like Nano. But I don't like the steering of Nano."
• Opinion/sentiment: a view or appraisal on an object.
  E.g. "It's a pity (negative) that she didn't marry."
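The three components of the general model can be held in a small record. The following is a minimal sketch; the names `Opinion`, `holder`, `target`, and `polarity` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One opinion: who holds it, what it is about, and its polarity."""
    holder: str    # opinion holder (source)
    target: str    # object (target) on which the sentiment is expressed
    polarity: str  # "positive", "negative" or "neutral"


# "I like Nano. But I don't like the steering of Nano."
# gives two opinions from the same holder on two different targets.
opinions = [
    Opinion(holder="I", target="Nano", polarity="positive"),
    Opinion(holder="I", target="steering of Nano", polarity="negative"),
]

for op in opinions:
    print(f"{op.holder} -> {op.target}: {op.polarity}")
```

Keeping the target explicit is what lets one sentence carry opposite sentiments about a product and one of its parts.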

Challenges

• Identifying source and target:
  – The sum of the parts is not equal to the whole [Turney'02]
  – Movies and the themes included: how to separate the sentiment?
    "Movie was classic; in fact Gabbar Singh was the epitome of villainy!"
• Differentiating feature and attributes:
  "I hate iPod, but I like the scroll technology"
• Role of semantics:
  "How could anyone sit through this movie?"
• Issue of ideology [Sack'94]:
  "Saddam Hussein" - mixed opinion?

Sentiment Analysis: How to do?

Word level Sentiment Analysis

• Used for grammatically incoherent text, such as short newspaper headlines,
  e.g. "Almost Perfection" [The Hindu, 22/04/09]
• Direct computation using lexical resources:
  SentiWordNet, WordNet-Affect
• SentiWordNet:
  – WordNet graded with pos(c), neg(c) and obj(c) scores, e.g. "love"
  – Created using classifiers
  – Interesting finding: opinionated content is mostly carried by modifiers (adjectives & adverbs),
    e.g. "smart IITian"

Source: http://sentiwordnet.isti.cnr.it
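Direct computation from a lexical resource can be sketched as a lookup followed by a comparison of totals. The lexicon below only imitates the SentiWordNet layout of three scores (pos, neg, obj) that sum to 1; its entries and numbers are made up for illustration, and `headline_polarity` is a hypothetical helper name.

```python
# Toy lexicon in the spirit of SentiWordNet: word -> (pos, neg, obj).
# The values are invented and are NOT taken from SentiWordNet.
TOY_LEXICON = {
    "perfection": (0.75, 0.0, 0.25),
    "smart": (0.625, 0.0, 0.375),
    "almost": (0.0, 0.125, 0.875),
    "poor": (0.0, 0.75, 0.25),
}


def headline_polarity(headline: str) -> str:
    """Sum the pos and neg scores of known words and compare the totals."""
    pos_total = 0.0
    neg_total = 0.0
    for token in headline.lower().split():
        # Unknown words are treated as fully objective.
        pos, neg, _obj = TOY_LEXICON.get(token, (0.0, 0.0, 1.0))
        pos_total += pos
        neg_total += neg
    if pos_total > neg_total:
        return "positive"
    if neg_total > pos_total:
        return "negative"
    return "neutral"


print(headline_polarity("Almost Perfection"))
```

A real system would also have to pick the right word sense before the lookup, since the scores are attached to senses rather than to surface words.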

UPAR7 [Chaumartin'07]

E.g. "Manmohan insists troops stay in Guwahati, predicts midterm victory"

The system achieves a valence accuracy of 55%.

Sentence level Sentiment Analysis

• Contextual information is necessary for SA at the sentence level:
  "Indian observers were not happy about things happening in its border country, even though the West was enjoying the show."
• Cannot assign prior polarity to all words!
  E.g. "long battery life" vs. "long time to recharge".
• Issue of negation: "not happy"
• Issue of syntactic role: "polluters are" vs. "they are polluters"
• Issue of neutral polarity: "look forward to"
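The negation issue can be illustrated with a small rule that flips a word's prior polarity when a negation word appears shortly before it. This is only a sketch under stated assumptions: the word lists, the two-token window, and the function name `contextual_polarities` are all hypothetical, and the rule does not address syntactic role or neutral expressions.

```python
# Hypothetical prior polarities: +1 is positive, -1 is negative.
PRIOR_POLARITY = {"happy": 1, "enjoying": 1, "pity": -1}
NEGATIONS = {"not", "no", "never"}


def contextual_polarities(sentence: str, window: int = 2) -> dict:
    """Flip a word's prior polarity if a negation occurs within
    `window` tokens before it."""
    tokens = [t.strip(".,!?\"'").lower() for t in sentence.split()]
    result = {}
    for i, token in enumerate(tokens):
        if token not in PRIOR_POLARITY:
            continue
        preceding = tokens[max(0, i - window):i]
        negated = any(p in NEGATIONS for p in preceding)
        polarity = PRIOR_POLARITY[token]
        result[token] = -polarity if negated else polarity
    return result


print(contextual_polarities(
    "Indian observers were not happy, even though the West was enjoying the show."
))
```

Here "happy" ends up negative because of "not", while "enjoying" keeps its positive prior polarity.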

How to include Contextual Polarity

Contd.

• The neutral-polar classifier uses 28 features and achieves an accuracy of 75.9%.
• The polarity classifier uses 10 features.
• The polarity classifier achieves an accuracy of 65.7%.
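The two classifiers can be read as a cascade: first decide neutral vs. polar, then assign a polarity only to the polar cases. The sketch below shows that control flow with simple stand-in rules; the word lists and function names are hypothetical, and a real system would replace both steps with classifiers trained on the 28 and 10 features mentioned above.

```python
# Stand-in word lists, for illustration only.
POLAR_WORDS = {"happy", "hate", "love", "pity"}
NEGATIVE_WORDS = {"hate", "pity"}


def is_polar(phrase: str) -> bool:
    """Step 1: neutral vs. polar."""
    return any(w in POLAR_WORDS for w in phrase.lower().split())


def polarity_of(phrase: str) -> str:
    """Step 2: polarity, only meaningful for phrases marked polar."""
    words = phrase.lower().split()
    return "negative" if any(w in NEGATIVE_WORDS for w in words) else "positive"


def classify(phrase: str, step1=is_polar, step2=polarity_of) -> str:
    """Run the cascade: neutral phrases never reach the second step."""
    if not step1(phrase):
        return "neutral"
    return step2(phrase)


for phrase in ["look forward to", "I hate iPod", "I love playing hockey"]:
    print(phrase, "->", classify(phrase))
```

Passing the two steps as arguments makes it easy to swap the stand-in rules for trained models later.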

Comparative sentence analysis

• Preferential emotion detection: "I prefer Prof. X's class to Prof. Y's class"
• More related to opinion mining
• Common feature: presence of a comparison word, e.g. "IIT Bombay is better than IIT Y"
• The comparison word may or may not be opinionated (carry an emotional state), e.g. "longer"
• Determining the preferred entity is easy when the comparison word is opinionated; otherwise the context is needed:
  – Find the context -> <feature, comparison word>, e.g. <battery life, longer>
  – Use the context to determine the opinion orientation (explained later).

Comparative sentence analysis

• Different types of comparatives:
  non-equal gradable ("less than"), equative ("same"), superlative ("longest"), non-gradable ("Nano and supera have different features")
• Comparative relation (CR): <long, battery life, S1, S2>
• Objective: given a CR, find whether S1 or S2 is preferred.
• Some more categories of comparatives:
  Type 1 (-er, -est), Type 2 (more, most), increasing comparatives ("longer"), decreasing comparatives ("fewer")
• The final analysis depends on the type of comparative word (C) and the feature involved (F):
  – Opinionated comparative
  – Comparative with context-dependent opinion ("higher mileage")
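Given a comparative relation, the choice between S1 and S2 can be sketched as a lookup of the orientation of the comparative word, falling back to the (feature, comparative) pair when the word alone is not opinionated. Everything below is illustrative: the two lexicons, the field names, and `preferred_entity` are hypothetical, and negation is not handled.

```python
from typing import NamedTuple


class ComparativeRelation(NamedTuple):
    comparative: str  # comparison word C, e.g. "longer"
    feature: str      # feature F, e.g. "battery life"
    s1: str           # first entity in the relation
    s2: str           # second entity in the relation


# Hypothetical lexicons, for illustration only.
OPINIONATED = {"better": 1, "worse": -1}
# Orientation of (feature, comparative) pairs for context-dependent words.
CONTEXT_ORIENTATION = {
    ("battery life", "longer"): 1,
    ("recharge time", "longer"): -1,
    ("mileage", "higher"): 1,
}


def preferred_entity(cr: ComparativeRelation) -> str:
    """Return S1 if the comparison favours S1, otherwise S2.

    Raises KeyError when neither lexicon knows the comparative.
    """
    if cr.comparative in OPINIONATED:
        orientation = OPINIONATED[cr.comparative]
    else:
        orientation = CONTEXT_ORIENTATION[(cr.feature, cr.comparative)]
    return cr.s1 if orientation > 0 else cr.s2


print(preferred_entity(ComparativeRelation("longer", "battery life", "S1", "S2")))
print(preferred_entity(ComparativeRelation("longer", "recharge time", "S1", "S2")))
```

The same word "longer" favours S1 for battery life and S2 for recharge time, which is why the feature has to be part of the lookup.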

Comparative sentence analysis

• Different cases

Comparative sentence analysis

• Baseline (default preference for S1): 84%
• System accuracy: 94%
• Inference:
  – People usually prefer S1 in a comparative sentence

Sentiment Analysis – Document Level

• To classify documents as positive or negative,
  e.g. "Manali travel review": recommended / not recommended

Sentiment Analysis – Document Level

$$\mathrm{SO}(\mathit{phrase}) = \log_2 \frac{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``excellent''}) \cdot \mathrm{hits}(\text{``poor''})}{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``poor''}) \cdot \mathrm{hits}(\text{``excellent''})}$$

e.g. "unethical practices": -8.484

• Different categories were tested: automobiles, banks, movies, travel destinations.
• Average accuracy 74%, except for movies.
• Movie reviews describe the film's theme while expressing a sentiment,
  e.g. "Raj's arrogance and sadistic mentality towards society is mercilessly shown by director Mani Ratnam. The film can be regarded as one of the all-time best nonfiction movies."
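The semantic orientation of a phrase is the base-2 logarithm of hits(phrase NEAR "excellent") · hits("poor") divided by hits(phrase NEAR "poor") · hits("excellent"). A minimal sketch of that computation follows; the function and parameter names are hypothetical, and the hit counts are made up rather than taken from a search engine.

```python
import math


def semantic_orientation(hits_near_excellent: float, hits_near_poor: float,
                         hits_excellent: float, hits_poor: float) -> float:
    """SO(phrase) = log2( hits(phrase NEAR excellent) * hits(poor)
                        / (hits(phrase NEAR poor) * hits(excellent)) )."""
    return math.log2(
        (hits_near_excellent * hits_poor) / (hits_near_poor * hits_excellent)
    )


# Invented counts: the phrase co-occurs 8 times more often with "poor".
so = semantic_orientation(hits_near_excellent=10, hits_near_poor=80,
                          hits_excellent=1000, hits_poor=1000)
print(so)  # -3.0
```

A negative value means the phrase keeps company with "poor" more than with "excellent"; a document can then be labelled from the average over its phrases. The sketch assumes all four counts are non-zero.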

Sentiment Analysis – Document Level: A graph based method

[Figure: baseline version vs. graph-based min-cut system. Source: Pang & Lee, 2004]

Contd.

• Objective: minimize a cost that combines two scores:
  – Individual score ind_j(x_i): non-negative estimate of each x_i's preference for being in class C_j.
  – Association score assoc(x_i, x_j): non-negative estimate of how important it is that x_i and x_j are in the same class.
• Solution: create a graph G with vertices V = {v1, v2, ..., vn, s, t} and partition it with a cut of minimum cost.
• In our case, s and t would be the subjective and objective classes:
  – ind_j(s_i) = Pr(s_i | sub)
  – assoc(s_i, s_j) = function of the distance between s_i and s_j
• Accuracy improved from 85.2% to 86.4%.
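The effect of the two scores can be seen on a tiny example. In the formulation of Pang & Lee (2004), a partition pays, for every item, the individual score of the class it was not assigned to, plus the association score of every pair that ends up in different classes. The sketch below minimizes that cost by exhaustive search instead of a min-cut solver, which is only feasible for a handful of sentences; a max-flow/min-cut algorithm finds the same minimum efficiently. The scores, the choice ind_obj = 1 - ind_sub, and the function names are assumptions made for illustration.

```python
from itertools import product


def partition_cost(labels, ind_sub, assoc):
    """Cost of a labelling; labels[i] is True if sentence i is subjective."""
    cost = 0.0
    for i, is_sub in enumerate(labels):
        # Pay the individual score of the class the sentence was NOT put in.
        # Assumption: ind_obj(s_i) = 1 - ind_sub(s_i).
        cost += (1.0 - ind_sub[i]) if is_sub else ind_sub[i]
    for (i, j), weight in assoc.items():
        if labels[i] != labels[j]:
            cost += weight  # a separated pair pays its association score
    return cost


def best_partition(ind_sub, assoc):
    """Exhaustive search over all 2^n labellings (stand-in for min cut)."""
    n = len(ind_sub)
    return min(product([True, False], repeat=n),
               key=lambda labels: partition_cost(labels, ind_sub, assoc))


# Made-up scores for three consecutive sentences.
ind_sub = [0.9, 0.45, 0.8]          # Pr(s_i | sub)
assoc = {(0, 1): 0.3, (1, 2): 0.3}  # neighbouring sentences are associated

print(best_partition(ind_sub, {}))     # individual scores only
print(best_partition(ind_sub, assoc))  # with association scores
```

With individual scores alone the middle sentence is labelled objective; once the association scores are added, it is pulled into the subjective class along with its neighbours.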

Conclusion & Future work

Conclusion:
• Different levels of text require different treatment for assessing the sentiment.
• The domain of the text also plays an important role.

Future work:
• Finding the target of the sentiment
• Dealing with sarcasm
• Multilingual sentiment analysis
• Ideology and its handling

References

[1]. Warren Sack. 1994. On the Computation of Point of View. Proceedings of the Twelfth National Conference on Artificial Intelligence, p. 1488.
[2]. Pang & Lee. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, ACL, July 2002, pp. 79-86.
[3]. Peter Turney. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 417-424.
[4]. C. Strapparava, A. Valitutti. 2004. WordNet-Affect: an affective extension of WordNet. Proceedings of LREC, Vol. 4, pp. 1083-1086.
[5]. Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Proceedings of ACL.
[6]. Wiebe et al. 2005. Annotating Expressions of Opinions and Emotions in Language. Computers and the Humanities, Vol. 39, No. 2-3 (May 2005), pp. 165-210.
[7]. Theresa Wilson, Janyce Wiebe, Paul Hoffmann. 2005. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proceedings of the Human Language Technologies Conference / Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005).
[8]. Wiebe, J. and Mihalcea, R. 2006. Word Sense and Subjectivity. Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (Sydney, Australia, July 17-18, 2006), pp. 1065-1072.
[9]. Andrea Esuli, Fabrizio Sebastiani. 2006. SENTIWORDNET: A publicly available lexical resource for opinion mining. Proceedings of the 5th Conference on Language Resources and Evaluation (LREC'06), pp. 417-422.
[10]. François-Régis Chaumartin. 2007. UPAR7: A knowledge-based system for headline sentiment tagging. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Association for Computational Linguistics, pp. 422-425.
[11]. Bing Liu. 2007. Web Data Mining. Springer, chapter 11.
[12]. Ganapathibhotla & Liu. 2008. Mining Opinions in Comparative Sentences. Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 241-248.
[13]. http://emetrics.org/2007/washingtondc/track_web20_measurement.php#usergenerated
[14]. http://en.wikipedia.org/wiki/User-generated_content
