CS626- Speech, NLP, Web
Harnessing Annotation Process Data for Natural Language Processing
An investigation based on eye-tracking
Lecture 37
Presented by: Abhijit Mishra (Roll no. 114056002)
Under the guidance of: Prof. Pushpak Bhattacharyya (PhD advisor) and Prof. Michael Carl (Mentor)
Background
Eye-movement data as a form of annotation
Translation Complexity and Sentiment Annotation Complexity measurement
Eye-movement data for cognitive modeling
Extracting “signature scanpaths” to study trends in linguistic-task-oriented reading
User-specific topic modeling using eye-gaze information
Eye-movement data for cognitive studies in linguistics
Study of subjectivity extraction in sentiment annotation
Conclusion and future work
Roadmap
Joshi, Aditya and Mishra, Abhijit and S., Nivvedan and Bhattacharyya, Pushpak. 2014. Measuring Sentiment Annotation Complexity of Text. Association for Computational Linguistics, Baltimore, USA.
Mishra, Abhijit and Joshi, Aditya and Bhattacharyya, Pushpak. 2014. A Cognitive Study of Subjectivity Extraction in Sentiment Annotation. 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), ACL 2014, Baltimore, USA.
Mishra, Abhijit and Singhal, Shubham and Bhattacharyya, Pushpak. 2014. Combining Scanpaths to identify Common Eye-Movement Strategies in Linguistic-Task Oriented Reading. Under review
Mishra, Abhijit and Bhattacharyya, Pushpak and Carl, Michael. 2013. Automatically Predicting Sentence Translation Difficulty. Association for Computational Linguistics, Sofia, Bulgaria
Mishra, A and Carl, M and Bhattacharyya, P. 2012. A heuristic-based approach for systematic error correction of gaze data for reading. Proceedings of the First Workshop on Eye-tracking and Natural Language Processing, COLING 2012, Mumbai, India.
Additional collaborative work
Kunchukuttan, Anoop and Mishra, Abhijit and Chatterjee, Rajen and Shah, Ritesh and Bhattacharyya, Pushpak. 2014. Shata-Anuvadak: Tackling Multiway Translation of Indian Languages. LREC 2014, Reykjavik, Iceland, 26-31 May, 2014.
Publications
The eye-tracking and sentiment database created at IIT Bombay as part of this work, along with several scripts for data processing and analysis, are released under a Creative Commons License:
http://www.cfilt.iitb.ac.in/˜cognitive-nlp/
The predictive frameworks for Sentiment Annotation
Complexity and Translation Complexity computation are made available as web-services at
http://www.cfilt.iitb.ac.in/TCI http://www.cfilt.iitb.ac.in/SAC.
Released Resources and Tools
BACKGROUND
NLP state of the art:
Machine learning + Linguistics
Supervised/Semi-supervised methods are popular and relatively accurate.
The BIG Picture
[Figure: Raw data, together with annotation process data (gaze patterns, key-stroke sequences), yields training data (features + labels) for an annotation model; the model then makes predictions on new raw data, informed by how humans annotate.]
Annotation Process Data – A byproduct of Annotation
Annotation is the task of labelling text, images or other data with comments, explanations, tags or markup.
Example: The movie was good.
Part of Speech Annotation: The/DT movie/NN was/VBD good/JJ ./.
Translation (to Hindi): फिल्म अच्छी थी।
Sentiment Annotation: Positive
Annotation involves visualization, comprehension (understanding) and production (producing annotations) requiring “reading” and “typing”
Annotation Process Data – Organized representation of such activities.
Data representing reading and writing activities.
Gaze data
• Gaze points : Position of eye-gaze on the screen
• Fixations: A long stay of the gaze on a particular object on the screen.
Fixations have both Spatial (coordinates) and Temporal (duration) properties.
• Saccade: A very rapid movement of eye between the positions of rest.
• Scanpath: A path connecting a series of fixations.
• Regression: Revisiting a previously read segment
Keystrokes
• Insertion: Insertion of a word or character inside running text.
• Deletion: Deleting a word or character.
• Selection: Highlighting
• Rearrangement: Dragging or Copy/Pasting highlighted text.
Other types of data like EEG signals, speech etc.
Annotation Process Data
Human eye movement is poised between perception and cognition.
The eye-movement pattern during goal-oriented reading is driven by:
• Perceptual properties of the text
• Cognitive processes underlying language processing
Eye-movement is controlled by the “occipital lobe” in the brain. The duration of
fixation and the saccadic distance and direction (progression/regression) vary based on the complexity of information to be processed (ref: Neuroergonomics: The brain at work by Parasuraman and Matthew, 2008).
Motivation : “Cognition, Linguistic Complexity and Eye-movement are related.”
Annotation and Eye-tracking
Psycholinguistics
Bicknell and Levy (2010) model eye-movement control in readers. Using Bayesian inference on sentence identity, their model predicts how long to fixate on the current position and where to fixate next.
Demberg and Keller (2008) and Boston et al. (2008) relate eye-movement during reading to the underlying syntactic complexity of text.
Scott et al. (2012) show that emotion words affect eye fixations during reading.
Computational Linguistics
Kliegl (2011) established a technique to predict word frequency and pattern from eye movements.
Doherty et al. (2010) introduced eye-tracking as an automatic Machine Translation evaluation technique.
Stymne et al. (2012) explained eye-tracking as a tool for Machine Translation (MT) error analysis in which they identified and classified MT errors.
Dragsted (2010) observed co-ordination of reading and writing process during translation.
Joshi et al. (2011) studied the cognitive aspects of sense annotation process using eye-tracking along with a sense marking tool.
Literature
PhD THEME
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used
PhD Theme
1. Eye-movement data as a form of annotation
a. Useful where direct manual annotation is unintuitive and prone to subjectivity
b. Problems addressed: Predicting Translation Complexity, Sentiment Annotation Complexity
2. Eye-movement data to find out strategies employed by humans for language processing
a. Find out strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
b. Studies done: Subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
a. User profiling and user-specific modeling; finding out reading trends
b. Problems addressed: User-specific topic modeling, generating consensus eye-gaze patterns
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used
PhD Theme – Part 1
1. Eye-movement data as a form of annotation
a. Useful where direct manual annotation is unintuitive and prone to subjectivity
b. Problems addressed: Predicting Translation Complexity, Sentiment Annotation Complexity
2. Eye-movement data to find out strategies employed by humans for language processing
a. Find out strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
b. Studies done: Subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
a. User profiling and user-specific modeling; finding out reading trends
b. Problems addressed: User-specific topic modeling, generating consensus eye-gaze patterns
EYE-MOVEMENT DATA AS ANNOTATION
Towards Measuring and Predicting Annotation Complexity
Eye-movement data – “Subconscious Annotation”
Objective: To consider this form of annotation for tasks for which manual/direct annotation is
“unintuitive” and “subjective”
Example:
• Assigning fluency/adequacy scores to a translation,
• Giving readability/translatability scores to paragraphs
Using eye-gaze parameters as ‘subconscious annotation’, we propose frameworks to measure and predict:
• Translation Complexity of text
• Sentiment Annotation Complexity of text
Introduction
Translation Complexity Index (TCI): A measure of inherent complexity in text translation.
Application:
Categorization of sentences into different levels of difficulty.
Can be used for better translation cost modeling in a crowdsourcing/outsourcing scenario.
Can provide a way of monitoring the progress of second language learners.
Study 1: Predicting Translation Complexity of Text
Length is not a good indicator of translation difficulty.
Example:
1. The camera-man shot the policeman with a gun. (Length = 8)
2. I was returning from my old office yesterday. (Length = 8)
Sentence 1 is lexically and structurally ambiguous due to the presence of polysemous word “shot” and the prepositional phrase attachment.
Translation Complexity Index: Insight
Framework for Prediction of TCI
[Figure: Prediction framework for TCI. Training data is labeled via the translator’s eye-tracking information; together with linguistic and translation features it trains a regressor, which predicts TCI for test data.]
• Direct manual labeling of training examples is fraught with subjectivity.
• Instead, sentences are labeled with the “time for which translation-related processing is carried out by the brain”, or “Translation Processing Time” (Tp): the total duration of the fixations and saccades recorded while the sentence is translated,

Tp = Σ_{f ∈ F} dur(f) + Σ_{s ∈ S} dur(s)

where F and S are the sets of fixations and saccades on the text.

TCImeasured = Tp / sentence_length

• TCImeasured is then mapped to a score between 1-10 using MinMax normalization (the higher the score, the greater the complexity)
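As an illustration, the labeling step above can be sketched in a few lines of Python; the Tp values and sentence lengths below are invented, and the function names are ours:

```python
def measured_tci(tp_ms, sentence_length):
    """Raw TCI label: translation processing time (Tp) per word."""
    return tp_ms / sentence_length

def minmax_scale(scores, lo=1.0, hi=10.0):
    """Map raw TCI values onto the 1-10 range (higher = more complex)."""
    mn, mx = min(scores), max(scores)
    return [lo + (s - mn) * (hi - lo) / (mx - mn) for s in scores]

# Invented (Tp in ms, sentence length in words) pairs for illustration
raw = [measured_tci(tp, n) for tp, n in [(4000, 8), (12000, 8), (6000, 10)]]
scaled = minmax_scale(raw)  # raw 500, 1500, 600 -> 1.0, 10.0, 1.9
```

The min-max step only makes the labels comparable across the corpus; the regressor is trained on the scaled values.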
Lexical Features
Sentence Length (L)
• Word count
• Intuition: Lengthy sentences are more complex to translate
Degree of Polysemy (DP)
• Average senses per word, as per WordNet
• Intuition: The more polysemous a word is, the harder it is to disambiguate its sense
Out-of-vocabulary measure (OOV)
• Percentage of words not present in the General Service List (GSL) and the Academic Word List (AWL)
• Intuition: Words not present in the working vocabulary of the translator clearly pose challenges to translation
Average syllables per word (SPW)
• Intuition: SPW is an indicator of readability, and readability relates to translatability
Fraction of Nouns, Verbs and Prepositions (PN, PV, PJ) - selected by trial and error
Presence of Digits (DIG) - selected by trial and error
Named Entity Count (NEC) - selected by trial and error
Linguistic Features for TCI
Syntactic Features
Lin’s Structural Complexity (SC):
• Mean of the total length of the dependency links appearing in the dependency parse tree of a sentence (Lin, 1996)
• SC for the example sentence = 15/7 = 2.14
• Intuition: the farther apart syntactically linked elements are, the harder it would be to parse and comprehend the sentence.
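A minimal sketch of this computation, assuming the dependency parse is already available as (head, dependent) word-position pairs; the arcs below are invented for illustration, not the slide’s example sentence:

```python
def structural_complexity(dep_arcs):
    """Lin (1996)-style structural complexity: the mean length of the
    dependency links, where a link's length is the distance between the
    word positions of its head and its dependent."""
    lengths = [abs(head - dep) for head, dep in dep_arcs]
    return sum(lengths) / len(lengths)

# Seven invented arcs whose link lengths sum to 15, giving SC = 15/7
arcs = [(1, 2), (3, 4), (1, 3), (5, 7), (2, 5), (4, 7), (5, 8)]
sc = structural_complexity(arcs)  # 15/7 ~ 2.14
```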
Non-Terminal to Terminal Ratio (NTR):
• The ratio of non-terminals to terminals in the constituency parse-tree of a sentence
• Example sentence “It is possible that this is a tough sentence”
(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (JJ possible)) (SBAR (IN that) (S (NP (DT this)) (VP (VBZ is) (NP (DT a) (JJ tough) (NN sentence))))))))
The Non-terminal to Terminal Ratio is thus 10 / 9 = 1.11.
• Intuition: the ratio would be higher for sentences with nested structures which add to syntactic difficulty and thus translation complexity.
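The NTR computation can be sketched directly from the bracketed parse string; this is our own bracket-counting implementation, not the one used in the experiments:

```python
def ntr(parse):
    """Non-terminal to terminal ratio from a bracketed constituency parse.
    Pre-terminals (POS tags) are not counted as non-terminals, which
    reproduces the 10/9 figure quoted for the example sentence."""
    tokens = parse.replace("(", " ( ").replace(")", " ) ").split()
    nonterminals = terminals = 0
    for i, tok in enumerate(tokens):
        if tok == "(":
            # a node whose first child is another bracket is a true
            # non-terminal; otherwise it is a POS tag over a word
            if tokens[i + 2] == "(":
                nonterminals += 1
        elif tok != ")" and tokens[i - 1] != "(":
            terminals += 1  # a bare token not in label position is a word
    return nonterminals / terminals

tree = ("(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (JJ possible)) "
        "(SBAR (IN that) (S (NP (DT this)) (VP (VBZ is) "
        "(NP (DT a) (JJ tough) (NN sentence))))))))")
ratio = ntr(tree)  # 10 non-terminals / 9 terminals ~ 1.11
```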
Linguistic Features for TCI (2)
Semantic Features
Co-reference Distance (CRD)
• The sum of distances, in number of words, between all pairs of co-referring text segments in a sentence.
• Example: John and Mary live together but she likes cats while he likes dogs. (CRD = 14)
• Intuition: Large portion of text has to be kept in translator’s working memory if CRD is high.
Count of Discourse connectors (DSC)
• Discourse connectors are linking words or phrases that connect multiple discourses of different semantic content to bring about semantic coherence (e.g. although, however).
• Intuition: Since the presence of discourse connectors semantically links two discourses, a translator is required to have the old discourse in active working memory.
Passive Clause Count (PCC)
• The number of clauses in passive voice, in a sentence.
• Example: The house is guarded by the dog that is taken care of by the home owner. (PCC = 2)
• Intuition: Passive voice is harder to translate than active voice.
Linguistic Features for TCI(3)
Semantic Features (2)
Height of hypernymy (HH)
• The path-length to the common hypernymy (parental node) in WordNet.
• Example: Adaptation and mitigation efforts must therefore go hand in hand. (HH = 4.94)
• Intuition: It is indicative of the level of abstractness or specificity of a sentence. (The feature offered not much insight.)
• Perplexity (PX)
• Perplexity is the degree of uncertainty of N-grams in a sentence.
• A highly perplexed N-gram induces a higher degree of surprise and slows down the process of comprehension.
• For our experiments, we computed trigram perplexity of sentences using language models trained on a mixture of sentences from the Brown corpus, a corpus containing more than one million words.
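The perplexity computation can be sketched as follows; the toy trigram table stands in for the Brown-corpus language model actually used:

```python
import math

def trigram_perplexity(sentence, trigram_prob, floor=1e-6):
    """Perplexity = 2^(-1/N * sum of log2 p) over the sentence's trigrams.
    Unseen trigrams fall back to a small floor probability, a crude
    stand-in for proper language-model smoothing."""
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    log_sum, n = 0.0, 0
    for i in range(2, len(words)):
        p = trigram_prob.get(tuple(words[i - 2:i + 1]), floor)
        log_sum += math.log2(p)
        n += 1
    return 2 ** (-log_sum / n)

# Toy model: every trigram of "a b" has probability 0.5 -> perplexity 2
toy = {("<s>", "<s>", "a"): 0.5, ("<s>", "a", "b"): 0.5,
       ("a", "b", "</s>"): 0.5}
```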
Linguistic Features for TCI (4)
Translation Feature:
Translation Model Entropy (TME)
• Translation model entropy of a phrase expresses the uncertainty involved in selecting a candidate translation of a source phrase from a set of possible translations. For a source phrase r, with T as the set containing all the possible translations and translation probabilities summing to 1, the translation entropy H(r) is:

H(r) = − Σ_{t ∈ T} p(t|r) · log p(t|r)

• The TME of a sentence (following Koehn et al.) is computed as the minimum, over all phrase segmentations R = (r1, r2, … , rn) of the sentence, of the joint entropy:

TME = min_{∀R} H(R), where H(R) = Σ_i H(ri)

Here H(R), the joint entropy of the components of R, is the sum of the entropy of each phrase (by an independence assumption).
• Intuitively, if the TME is high, it should take more time even for humans to decide between the available translation options.
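Under the definitions above, TME can be sketched with a small dynamic program over segmentation points. The phrase table below is invented, natural log is used, and the penalty for out-of-table phrases is our own simplification:

```python
import math

def phrase_entropy(translations):
    """H(r) = -sum over t of p(t|r) * log p(t|r)."""
    return -sum(p * math.log(p) for p in translations.values())

def tme(words, phrase_table, max_len=3, miss_penalty=10.0):
    """Sentence-level TME: the minimum over phrase segmentations R of
    H(R) = sum of per-phrase entropies (independence assumption),
    found by dynamic programming over split points. Phrases missing
    from the table receive a fixed penalty entropy."""
    n = len(words)
    best = [0.0] + [float("inf")] * n
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            phrase = " ".join(words[j:i])
            h = (phrase_entropy(phrase_table[phrase])
                 if phrase in phrase_table else miss_penalty)
            best[i] = min(best[i], best[j] + h)
    return best[n]

# Invented table: "the cat" has one certain translation (entropy 0),
# so segmenting it as a single phrase minimizes the sentence TME.
table = {"the": {"le": 0.5, "la": 0.5}, "cat": {"chat": 1.0},
         "the cat": {"le chat": 1.0}}
```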
Linguistic Features for TCI (5)
Correlation: Features and Measured TCI
For 80 sentences extracted from the TPR database.
Training data consists of 80 examples obtained from the TPR database (Carl 2012). We applied Support Vector Regression (Joachims et al., 1999) with different kernels and
“trade-off” parameter C.
Table 1 Table 2
Comparison between Mean Square Error (MSE) between old framework (Table – 1, considering only L, DP, SC) and new framework (Table- 2 considering all the features)
Experiment and Results
[Figure: Ablation test showing Mean Squared Error (0 to 1.6) for the features L, DP, SC, NTR, CRD, DSC, HH, NEC, DIG, OOV, PCC, PN, PV, PJ, SPW, PX and TME.]
Ablation Test
We observed the correlation between Machine Translation quality estimates (METEOR and TER) and TCI: Google’s translations were obtained for the 80 sentences and compared with the reference translations from the TPR database to compute TER and METEOR scores.
Man vs Machine :: TCI vs MT Quality estimates
TCI is not very well correlated, but some features are. (Can they be used in MT systems to get better output?)
Central Message: The effort required for a human annotator to detect sentiment is not uniform for all texts. Our metric called “Sentiment Annotation Complexity (SAC)”
quantifies this effort, using linguistic properties of text.
Example: “Just what I wanted: a good pizza” versus “Just what I wanted: a cold pizza”
Complexity in sentiment exists at the:
• Lexical level
• Syntactic level
• Semantic and pragmatic level
Study 2: Measuring Sentiment Annotation Complexity
• “It is messy, uncouth, incomprehensible, vicious and absurd.”
• “A somewhat crudely constructed but gripping, questing look at a person so racked with self-loathing, he becomes an enemy to his own race.”
• “It’s like an all-star salute to disney’s cheesy commercialism.”
Framework for Prediction of SAC
[Figure: Prediction framework for SAC. Training data is labeled via the annotator’s eye-tracking information; together with linguistic and sentiment features it trains a regressor, which predicts SAC for test data.]
• TCI-like approach.
• SAC of a piece of text is measured using the “total fixated time”, analogously to TCI.
There is no existing database containing eye-tracking information and sentiment annotation (like TPR for translation). We created annotated data as follows:
Five annotators read 566 sentences from movie review dataset and 493 sentences from a tweet dataset (total: 1059)
Annotators were asked to give polarities (Positive/Negative/Objective) to each of the texts
While they annotated, the eye-movement data were recorded using an eye-tracker and Translog-II software
SAC is measured using the previously discussed equation.
Collection of eye-tracking and SA database
Statistics related to the eye-tracking data
Table 1: Corpus statistics
Table 2: Annotator’s reading speed (Average fixation duration per word)
Figure 1: Distribution of measured and normalized SAC (between 0-10) over 1059 documents; here each document is a sentence.
Table 3: Annotation agreement level
Linguistic Features for SAC
Correlation: Feature and measured SAC
Support Vector Regression for predicting SAC using linguistic features.
Prediction framework for SAC
Table 4: Performance of Predictive Framework for 5-fold in-domain and cross-domain validation using Mean Squared Error (MSE), Mean Absolute Error (MAE) and Mean Percentage
Error (MPE) estimates and correlation with the gold labels.
Ablation tests
Error in parsing:
Movie reviews are sometimes ungrammatical; tweets are mostly ungrammatical. The SC and NTR features are flawed due to incorrect parsing.
Example: Just saw the new Night at the Museum movie
(the Stanford parser is unable to resolve the PP-attachment ambiguity, making it a verb attachment)
Error in Co-reference resolution:
Anaphora is not resolved appropriately by Stanford CoreNLP in most cases, so co-reference distance computation is flawed.
Error in Sentiment Feature Extraction:
SentiWordNet is used for this purpose; it is automatically annotated and therefore noisy.
In order to calculate sentiment features using SentiWordNet, one needs Word Sense Disambiguation (WSD).
Example: I checked the score and it was 20 love.
The word “love” is used here in the sense of “a score of zero in tennis or squash”, which is the fifth sense of “love” in WordNet. It is not sentiment bearing (but may be taken as a positive word if the first sense is considered).
Error Analysis
We wanted to check if the confidence scores of a sentiment classifier are negatively correlated with SAC, implying that something which is difficult for humans is also difficult for machines.
Three P/N/O classifiers were trained using Naïve Bayes (NB), Maximum Entropy (MaxEnt) and Support Vector Classification (SVC) techniques.
Training data – Random 10000 movie reviews from Amazon Corpus and 20000 tweets from twitter corpus
Features – Presence/absence of unigrams, bigrams and trigrams + SAC features
Test data – All 566 movie reviews and 493 tweets from the eye-tracking database
Classifiers’ confidence:
Man vs Machine: SAC vs Classifier Confidence
SAC based on human readings is able to capture the difficulty experienced by classifiers as well.
Goal: To check whether different classifiers can be used to handle text with different levels of SAC.
We categorized all the sentences in the training data into three groups: Easy (SAC 0.1- 3), Medium (3.1-7) and Hard (SAC 7.1-10).
Accuracies of classifiers (in %) given in the table below.
No conclusion could be drawn from this experiment
SAC and Ensemble Classifiers
Till now we have considered gaze fixation/saccade duration as a measure of complexity for SAC and TCI.
Fixation duration is still prone to subjectivity and distractions. Can other eye-movement features be exploited?
A small experiment was done incorporating saccadic distance and regression count into the SAC formulation.
A better way of measuring SAC/TCI
Qualitative analysis of modified SAC
# | Sentence | SACold | SACnew
1 | it is messy, uncouth, incomprehensible, vicious and absurd | 3.3 | 3.1
2 | the drama was so uninspiring that even a story immersed in love, lust, and sin couldn’t keep my attention. | 3.44 | 1.06
3 | it’s like an all-star salute to disney’s cheesy commercialism. | 8.3 | 10
For many examples scoring using the modified SAC formula gives better ranking of
sentences based on complexity.
Support Vector Regression was applied using the same features and modified SAC labels.
Results – more erroneous than the previous formulation.
Next steps: error analysis, better feature engineering.
Predicting SAC with modified SAC labels
In this section, we
Demonstrated how eye-tracking data can be taken as a form of sub- conscious annotation.
Applied annotation obtained from eye-tracking to build predictive
framework for the prediction of Translation and Sentiment Annotation Complexity of text.
Summary
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used
PhD Theme – Part 2
1. Eye-movement data as a form of annotation
a. Useful where direct manual annotation is unintuitive and prone to subjectivity
b. Problems addressed: Predicting Translation Complexity, Sentiment Annotation Complexity
2. Eye-movement data to find out strategies employed by humans for language processing
a. Find out strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
b. Studies done: Subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
a. User profiling and user-specific modeling; finding out reading trends
b. Problems addressed: User-specific topic modeling, generating consensus eye-gaze patterns
EYE-MOVEMENT DATA
FOR COGNITIVE MODELING
Text + eye-gaze information = multimodality. Can this multimodal data be modeled for:
• Automatically figuring out the “reading trend” with respect to specific linguistic subtleties?
• User profiling and user-specific modeling to find out personal preferences?
Introduction
Eye-tracking studies:
Till now, gaze features like fixations and saccades have been used.
Scanpaths can be more powerful since they comprise both fixations and saccades (in terms of gaze progressions and regressions) [von der Malsburg et al., 2012].
Scanpaths as line graphs:
• Fixations as nodes
• Saccades as edges
Study 3: Synthesizing “signature scanpaths” that represent consensus eye-movement patterns
Nature of scanpaths
• Vary from task to task: reading, surfing, viewing
• Vary with: goal, person, time
Nature of scanpaths (Reading)
Eye-movement trajectories (scanpaths) for different task-oriented reading. Each row corresponds to the scanpath of a single human subject. Columns (a), (b) and (c) refer to scanpaths for the tasks of Sentiment Analysis, Summarization and Translation respectively.
Sentence: I hate movie trailers that spoil all the movie also. But Dark City is a great movie, I havent seen it in blue ray yet. But I think it has a very good story, kind of dark (as the title says). My type of story.
Goal: To synthesize a common eye-movement representation or “signature scanpath” from N individual scanpaths corresponding to N readers performing a “task-oriented” reading.
Utility:
Represents common behavior; helps explain linguistic phenomena in the light of common perception.
Reduction of noise and subjectivity.
Can be used to predict the expected eye-movement trajectory during reading
• Helpful for UI design
• Computational Advertisement
Signature Scanpaths
Expected output
Figure 3: Combining Scanpaths
1. Clustering of fixations to find common Regions of Interest (ROIs) based on spatial information (position of the word in terms of cluster count). Each region is assigned an id called ROI_id.
2. Modeling Markov chains to get the most likely sequence of ROI_ids. This sequence represents the skeleton of the signature scanpath.
3. Computing the duration of each fixation in the final scanpath using interpolation.
Method
One-dimensional clustering of fixations to find common ROIs; cursor distance is used for distance-matrix computation.
The K-means clustering algorithm is used for clustering; the value of K changes dynamically with the length of the considered sentence.
The words/phrases in each common ROI are believed to exhibit peculiar characteristics (e.g. elements causing sarcasm).
Clustering to find common region of interest
Figure: Clustering of fixations (features: word positions mapped to fixations)
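A toy version of this clustering step (plain 1-D k-means over fixation word positions; the real system varies K with sentence length and uses cursor distance, and its initialization differs from the naive one used here):

```python
def kmeans_1d(positions, k, iters=20):
    """Plain 1-D k-means over fixation word positions. Returns the
    cluster centers (candidate ROIs) and a cluster id per fixation.
    Initialization is naive: k evenly spaced sorted positions."""
    step = max(1, len(positions) // k)
    centers = sorted(positions)[::step][:k]
    assign = [0] * len(positions)
    for _ in range(iters):
        # assign every fixation to its nearest center, then recenter
        assign = [min(range(len(centers)), key=lambda c: abs(p - centers[c]))
                  for p in positions]
        for c in range(len(centers)):
            members = [p for p, a in zip(positions, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, assign

# Invented fixation positions forming three loose groups of words
fix_positions = [1, 2, 2, 3, 10, 11, 12, 25, 26]
centers, roi_ids = kmeans_1d(fix_positions, k=3)
```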
Objective: find the most likely ROI sequence T* = (t1, t2, … , tn) for an observed fixation sequence F = (f1, f2, … , fn):

T* = argmax_T P(T|F) ∝ P(F|T) × P(T) = Π_i P(fi|ti) × P(ti|ti−1) ……(1)

(using independence and Markov assumptions), where each ti belongs to the “ROI set”, which comprises the cluster IDs 1, 2, … , k.
Applying HMM(1)
Emission probabilities P(fi|ti):
P(fi|ti) = count(fixations in ti) / total fixations
Transition probabilities P(ti|ti−1):
P(ti|ti−1) = count(transitions from ti−1 to ti) / total transitions from ti−1 to other ROIs
Decoding using Viterbi
Heuristic: time-based segmentation – after a certain time interval T, the transition and emission probabilities are updated.
• To deal with zero probability values, we apply “add-delta” smoothing.
(Flaw: We have not included valuable information such as distinct users visiting the same ROI, or the same ROI being visited multiple times.)
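The decoding step can be sketched as a standard Viterbi pass with add-delta smoothing folded into the probability lookups; the toy transition and emission probabilities below are invented, and a uniform initial distribution is assumed:

```python
def viterbi(fixations, rois, trans, emit, delta=0.01):
    """Most likely ROI sequence for a fixation sequence under the HMM,
    with add-delta smoothing for unseen transitions and emissions."""
    def t(a, b): return trans.get((a, b), 0.0) + delta
    def e(f, r): return emit.get((f, r), 0.0) + delta

    best = {r: e(fixations[0], r) for r in rois}  # best path prob per ROI
    back = []                                     # backpointers per step
    for f in fixations[1:]:
        prev, best, step = best, {}, {}
        for r in rois:
            p, arg = max((prev[q] * t(q, r), q) for q in rois)
            best[r] = p * e(f, r)
            step[r] = arg
        back.append(step)
    last = max(best, key=best.get)
    path = [last]
    for step in reversed(back):   # follow backpointers to recover the path
        path.append(step[path[-1]])
    return list(reversed(path))

# Invented sticky-transition HMM over two ROIs
trans = {(1, 1): 0.9, (1, 2): 0.1, (2, 2): 0.9, (2, 1): 0.1}
emit = {("a", 1): 0.9, ("a", 2): 0.05, ("b", 1): 0.05, ("b", 2): 0.9}
skeleton = viterbi(["a", "a", "b", "b"], [1, 2], trans, emit)  # [1, 1, 2, 2]
```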
Applying HMM(2)
Experiment Setup
• Datasets: Dundee readability corpus, IITB Sentiment Analysis paragraph and sentence level corpora, TPR database translation Post-editing dataset for English-Hindi, English-Spanish
• Dataset statistics given below (Ann represents the number of readers/annotators)
Table: Statistics of the data used
Challenges
It is hard to collect gold-standard consensus scanpaths against which the algorithm output can be compared.
There is no other existing literature reporting such algorithms against which a comparison can be done.
Validation Methods:
Quantitative validation
Qualitative validation
Validation
Validation 1: By Fixation-Word Overlap Estimation
We extract all the words underlying the fixated regions in all individual scanpaths and the signature scanpath.
For each word in the signature scanpath, we check what fraction of the population fixate on the same word as expressed by the individual scanpaths.
Intuition: “What percentage of the collectively focused regions are reflected in the signature scanpath?”
Validation 2: By Measuring Variance in scanpath distance
We find out the distance of the consensus scanpath from each individual scanpath using Scasim (von der Malsburg et al., 2012).
Scasim is a modified edit-distance based similarity measure that considers both spatial (co-ordinates) and temporal (durations) properties of the fixations to calculate the distance between two scanpaths.
We compute the pair-wise distance between the consensus scanpath and each individual scanpath and then compute the variance in distance.
Intuition: A consensus scanpath is believed to have taken components from every individual scanpath if the variance of the pair-wise distance is low.
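Validation 2 reduces to a few lines once a scanpath distance function is available; below, a trivial length-difference distance stands in for Scasim, and the scanpaths are invented:

```python
from statistics import mean, stdev

def distance_spread(consensus, individuals, dist):
    """Pairwise distances between the consensus scanpath and each
    individual scanpath, summarized as relative standard deviation.
    A low value suggests the consensus borrows from every reader."""
    ds = [dist(consensus, s) for s in individuals]
    return stdev(ds) / mean(ds)

# Toy stand-in for Scasim: distance = difference in scanpath length
toy_dist = lambda a, b: abs(len(a) - len(b))
scanpaths = [[0] * 4, [0] * 6, [0] * 4]   # invented individual scanpaths
spread = distance_spread([0] * 5, scanpaths, toy_dist)  # all distances equal
```

With all pairwise distances equal, the relative standard deviation is 0, the ideal case described above.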
Qualitative Validation
Quantitative Validation
TABLE: Validation results for different datasets shown in terms of Word overlap (AOIol) and Relative Standard Deviation (Diststdev)
• Modeling does not consider many factors (like individual vs. collective eye-transitions, eye-regressions etc.)
• Algorithms have to be compared against a baseline
• Qualitative evaluation: study whether variation in the signature is caused by the underlying linguistic characteristics
• Modeling using random-walk theory?
Scopes for improvements
Topic model: A set of algorithms that discover latent topics in a data set based on observed words.
Simple topic models: Latent Dirichlet Allocation (LDA) based models.
Recently started working (in collaboration with Aditya) on user profiling and user-specific modeling (incorporating eye-gaze information for term weighting in LDA; experiments ongoing).
Utility of such personalized models
• Recommendation systems
• Readers’ sentiment Analysis
• In classroom teaching where different teaching standards can be adopted based on overlaps in individual topics of interest.
Study 4: User specific Topic Models
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used
PhD Theme – Part 3
1. Eye-movement data as a form of annotation
a. Useful where direct manual annotation is unintuitive and prone to subjectivity
b. Problems addressed: Predicting Translation Complexity, Sentiment Annotation Complexity
2. Eye-movement data to find out strategies employed by humans for language processing
a. Find out strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
b. Studies done: Subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
a. User profiling and user-specific modeling; finding out reading trends
b. Problems addressed: User-specific topic modeling, generating consensus eye-gaze patterns
COGNITIVE STUDY OF SUBJECTIVITY EXTRACTION IN SENTIMENT
ANNOTATION
Subjectivity extraction: Extracting subjective (sentiment-bearing) portions from a text before predicting the sentiment polarity.
Conducted eye-tracking experiments involving humans annotating paragraphs with sentiment labels
Two different kinds of subjective documents:
• Linear sentiment: each subjective sentence follows the same sentiment throughout.
• Oscillating sentiment: sentiment flips throughout the document.
Two distinct strategies employed by humans for the task of sentiment analysis:
(a) Subjectivity Extraction through Anticipation, where sentences following a series of either positive or negative polar sentences are skipped and the sentiment is anticipated without reading the whole document.
(b) Subjectivity Extraction through Homing, where some portions in complex documents (with sentiment flips) are rigorously revisited even after a complete pass of reading.
(*A detailed research is being carried out)
Study 5: Subjectivity extraction during sentiment annotation
CONCLUSION AND FUTURE PLANS
Three different ways to utilize shallow cognitive information from eye-tracking
Focused on annotation complexity- measurement, prediction and application
Proposed and implemented a method to extract signature eye movement patterns to study reading trends.
Proposed a method to obtain user-specific topic models by “term-weighting” basic LDA models.
Cognitive study explaining “Homing” and “Anticipation”
strategies during sentiment annotation.
Conclusion – Work done so far (or ongoing)
Towards generalization of annotation complexity: the study can be applied to other NLP tasks like WSD.
Leveraging unlabeled data to address the data-scarcity problem – transductive SVR (experiments ongoing).
Extraction of “signature scanpaths”
Better formulation of the synthesis step
Proper validation against a baseline
User specific topic modeling
Validation of the topics using quantitative and qualitative evaluation
Modeling and predicting ‘pause’ placement in social media text.
Does pause placement affect sentiment? Grounding through cognitive studies.
Prediction framework for automatic pause insertion.
Future Work
THANK YOU
Parasuraman, Raja, and Matthew Rizzo. Neuroergonomics: The brain at work. Oxford University Press, Inc., 2008.
Klinton Bicknell and Roger Levy. 2010. A rational model of eye movement control in reading. In Proceedings of the 48th annual meeting of the association for computational linguistics, pages 1168–
1178. Association for Computational Linguistics.
Steven Bird. 2006. Nltk: the natural language toolkit. In Proceedings of the COLING/ACL on Interactive presentation sessions, pages 69–72. Association for Computational Linguistics.
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.
Marisa Boston, John Hale, Reinhold Kliegl, Umesh Patil, and Shravan Vasishth.2008. Parsing costs as predictors of reading difficulty: An evaluation using the potsdam sentence corpus. Mind Research Repository
Michael Carl. 2012a. The CRITT TPR-DB 1.0: A database for empirical human translation process research. AMTA Workshop on Post-Editing Technology and Practice.
Michael Carl. 2012b. Translog-II: a program for recording user activity data for empirical reading and writing research. In LREC, pages 4108–4112.
References
Vera Demberg and Frank Keller. 2008. Data from eye-tracking corpora as evidence for theories of syntactic processing complexity. Cognition, 109(2):193–210.
Michael Denkowski and Alon Lavie. 2011. Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 85–91. Association for Computational Linguistics
Stephen Doherty, Sharon O’Brien, and Michael Carl. 2010. Eye tracking as an MT evaluation technique. Machine Translation, 24(1):1–13.
Barbara Dragsted. 2010. Coordination of reading and writing processes in translation.Translation and cognition, 15:41.
Andrea Esuli and Fabrizio Sebastiani. 2006. Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC, volume 6, pages 417–422.
Karën Fort, Adeline Nazarenko, Sophie Rosset, et al. 2012. Modeling the complexity of manual annotation tasks: A grid of analysis. In Proceedings of the International Conference on Computational Linguistics (COLING 2012), pages 895–910.
Joseph Goldberg and Jonathan Helfman. 2010. Scanpath clustering and aggregation. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, pages 227–234. ACM
Graham G. Scott, Patrick J. O’Donnell, and Sara C. Sereno. 2012. Emotion words affect eye fixations during reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(3):783.
Thorsten Joachims. 2006. Training linear svms in linear time. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 217–226. ACM
Salil Joshi, Diptesh Kanojia, and Pushpak Bhattacharyya. 2013. More than meets the eye: Study of human cognition in sense annotation. In Proceedings of NAACLHLT, pages 733–738
Dekang Lin. 1996. On the structural complexity of natural language sentences. In Proceedings of the 16th conference on Computational linguistics-Volume 2, pages 729–733. Association for Computational Linguistics
Pascual Martínez-Gómez and Akiko Aizawa. 2013. Diagnosing causes of reading difficulty using bayesian networks. IJCNLP, Japan.