Language Modeling for Information Retrieval

Manoj Kumar Chinnakotla

KReSIT, IIT Bombay

Language Technologies for the Web

March 2006

Outline

1. Introduction
2. The Classical Approach
3. The Language Modeling Approach
4. Smoothing Techniques
5. Relation with Classical Approach

Probabilistic Models

The Central Problem in IR

Is this Document Relevant?

Probabilistic Models

Probabilistic models capture the uncertainties in the problem well. Examples:

- Is this term relevant?
- Is this document relevant?

The Random Variables

- Relevance: $R \in \{0, 1\}$
- Documents: $D \in \{D_1, D_2, \ldots, D_N\}$
- Query: $Q \in \{\text{all possible queries}\}$
- A term: $A \in \{0, 1\}$

Probabilistic Models

The Ranking Function

Rank documents based on the posterior probability of relevance:

$$\text{Score}(D, Q) = P(R = 1 \mid D, Q) \quad (1)$$

Ranking using the following log-odds ratio is equivalent:

$$\text{Score}(D, Q) = \log \frac{P(R = 1 \mid D, Q)}{P(R = 0 \mid D, Q)} \quad (2)$$

Probabilistic Models

Probabilistic Ranking Principle

Due to Robertson [6]; the central theorem of probabilistic IR.

Theorem: Ranking documents using the log-odds ratio of the posterior probability of relevance is optimal with respect to various retrieval measures (such as Average Precision).

The Classical Approach

Due to Robertson and Sparck Jones [7]: a generative model of relevance.

Rank documents based on the following log-odds ratio:

$$\text{Score}(D, Q) = \log \frac{P(D \mid Q, R = 1)}{P(D \mid Q, R = 0)} \quad (3)$$

For a query Q, most of the collection C is irrelevant, so the non-relevance distribution can be approximated by the collection model:

$$P(\cdot \mid Q, R = 0) \approx P(\cdot \mid Q, C) \quad (4)$$

Binary Independence Retrieval Model

Assuming term independence, we have

$$\text{Score}(D, Q) = \sum_{A_i \in D} \log \frac{P(A_i \mid Q, R = 1)}{P(A_i \mid Q, C)} \quad (5)$$

This is popularly known as the "Binary Independence Retrieval (BIR)" model.

We still need to estimate the relevance distribution $P(\cdot \mid Q, R = 1)$.
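To make eq. (5) concrete, here is a minimal sketch of BIR scoring. The dictionaries `p_rel` and `p_coll`, standing in for the two estimated distributions, are hypothetical inputs; the slides do not fix a representation.

```python
import math

def bir_score(doc_terms, query_terms, p_rel, p_coll):
    """Score a document with the BIR model (eq. 5).

    doc_terms   -- set of terms present in the document
    query_terms -- set of terms in the query
    p_rel       -- dict: estimated P(term | Q, R=1)
    p_coll      -- dict: collection model P(term | Q, C)
    """
    score = 0.0
    # Sum log-odds only over query terms that occur in the document (A_i in D).
    for term in query_terms & doc_terms:
        if p_rel.get(term, 0) > 0 and p_coll.get(term, 0) > 0:
            score += math.log(p_rel[term] / p_coll[term])
    return score
```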

Are we back to the Original Problem?

Estimating $P(\cdot \mid Q, R = 1)$ is equivalent to solving the original problem!

Challenge: no sample relevant documents are available initially.

Current approaches (sketched in code below):

- Choose some initial estimates for $P(w \mid Q, R = 1)$
- Iteratively assume the top k documents retrieved are relevant
- Update the estimates

Accuracy depends on the initial separation achieved.
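The iterative scheme above amounts to a pseudo-relevance feedback loop. A minimal sketch, assuming hypothetical helpers `retrieve` (ranks documents under the current estimates) and `reestimate` (fits a new term distribution to the assumed-relevant documents); neither is specified in the slides.

```python
def pseudo_relevance_feedback(query_terms, collection, retrieve, reestimate,
                              k=10, iterations=3):
    """Iteratively refine P(w | Q, R=1) by assuming the top-k results are relevant.

    query_terms -- list of query words
    retrieve(query_terms, collection, p_rel) -- hypothetical ranking function
    reestimate(top_docs)                     -- hypothetical re-estimation step
    """
    # One common starting point: uniform estimates over the query terms.
    p_rel = {w: 1.0 / len(query_terms) for w in query_terms}
    for _ in range(iterations):
        ranked = retrieve(query_terms, collection, p_rel)
        top_docs = ranked[:k]          # assume the top k documents are relevant
        p_rel = reestimate(top_docs)   # update the relevance distribution
    return p_rel
```

As the slide notes, how well this converges depends on how well the initial estimates separate relevant from non-relevant documents.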

The Language Modeling Approach

Basic Idea (Ponte and Croft [5])

Assuming document D is relevant, what is the likelihood of the user choosing the current query Q to retrieve D?

- Model the language of each document as a distribution over words (a unigram model).
- The individual document distributions $P(w \mid D)$ are called "language models".
- Rank documents based on the posterior probability of the document given the query:

$$P(D \mid Q) \propto P(Q \mid D) \overbrace{P(D)}^{\text{document prior}} \quad (6)$$

A Shift in Paradigm

Some Immediate Benefits

- Allows integration of document importance through the document prior $P(D)$.
- The document prior can be estimated from link-analysis algorithms (PageRank, HITS).
- Ease of estimation: a document is usually larger than the query, and the document language models $P(w \mid D)$ can be pre-computed at index time.

Assuming uniform document priors, the query-likelihood ranking function is given by (a code sketch follows):

$$\text{Score}(D, Q) = \prod_{w \in Q} P(w \mid D) \quad (7)$$
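A minimal sketch of eq. (7), computed in log space since products of small probabilities underflow; it assumes `p_w_d` is an already smoothed document model assigning non-zero probability to every query word (smoothing is the topic of the next slide).

```python
import math

def query_likelihood_score(query_terms, p_w_d):
    """log Score(D, Q) = sum over query words of log P(w | D); log of eq. (7).

    query_terms -- list of query words (repeated words count multiply)
    p_w_d       -- dict mapping word -> smoothed P(w | D)
    """
    return sum(math.log(p_w_d[w]) for w in query_terms)
```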

Smoothing Techniques

Motivation

The maximum likelihood estimator (MLE) for $P(w \mid D)$ is given by

$$P_{ml}(w \mid D) = \frac{c(w, D)}{\sum_{w' \in D} c(w', D)} \quad (8)$$

where $c(w, D)$ is the count of word w in document D. Since document length is limited, the MLE $P_{ml}$:

- Assigns zero probability to words not observed in D
- Has high variance

Solution: smooth the MLE using the collection model $P(w \mid C)$.

Example: Jelinek-Mercer smoothing, which linearly interpolates the two models (a code sketch follows):

$$P_\lambda(w \mid D) = (1 - \lambda)\, P_{ml}(w \mid D) + \lambda\, P(w \mid C) \quad (9)$$
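A minimal sketch of Jelinek-Mercer smoothing as interpolated in eq. (9); the choice `lam=0.5` is illustrative only, as the slides do not fix a value.

```python
from collections import Counter

def jelinek_mercer(doc_tokens, collection_tokens, lam=0.5):
    """Return a function w -> P_lambda(w | D) per the JM interpolation.

    doc_tokens        -- list of tokens in the document
    collection_tokens -- list of tokens in the whole collection
    lam               -- interpolation weight on the collection model
    """
    doc_counts, doc_len = Counter(doc_tokens), len(doc_tokens)
    coll_counts, coll_len = Counter(collection_tokens), len(collection_tokens)

    def p(w):
        p_ml = doc_counts[w] / doc_len          # document MLE (eq. 8)
        p_coll = coll_counts[w] / coll_len      # collection model P(w | C)
        return (1 - lam) * p_ml + lam * p_coll  # eq. (9)
    return p
```

Note that as long as `lam > 0` and the word occurs somewhere in the collection, the smoothed probability is non-zero, which fixes the zero-probability problem of the MLE.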

Modeling of Relevance

Relation with Classical Approach

Figure: Two different factorizations of the same joint $P(D, Q \mid R)$.

- The two approaches are equivalent.
- The LM approach makes additional assumptions.
- Justification for the assumptions.

Modeling of Relevance

Where is Relevance?

- The notion of relevance is assumed implicitly in the model.
- This is a problem when handling "relevance feedback".
- Solution: query models or relevance models [3, 4].
- A relevance model (query model) $\theta_R$ is a distribution encoding the information need.
- Assume the query Q to be a sample from the relevance model $\theta_R$.

New Ranking Function (divergence-based; a code sketch follows):

$$\text{Score}(D) = \mathrm{KL}(D \parallel \theta_R) = \sum_{w} P(w \mid D) \log \frac{P(w \mid D)}{P(w \mid \theta_R)} \quad (10)$$

Since a smaller divergence means a closer match to the relevance model, documents are ranked by increasing score.
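A minimal sketch of the divergence score of eq. (10), assuming both models are plain word-to-probability dicts and that the relevance model is smoothed so its probabilities are non-zero wherever the document model's are.

```python
import math

def kl_score(p_doc, p_rel_model):
    """KL(D || theta_R) per eq. (10): smaller values mean the document model
    is closer to the relevance model, so rank documents by ascending score.

    p_doc       -- dict: smoothed P(w | D)
    p_rel_model -- dict: smoothed P(w | theta_R)
    """
    return sum(p * math.log(p / p_rel_model[w])
               for w, p in p_doc.items() if p > 0)
```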


Modeling of Relevance

Implications for the LM Approach

The problem of retrieval reduces to estimating two distributions:

- The relevance model $P(w \mid \theta_R)$
- The document language models $P(w \mid D)$

This offers a natural way to incorporate "relevance feedback": given relevant documents, update the relevance model $\theta_R$ (one simple update is sketched below).
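One simple, illustrative way to update $\theta_R$ from feedback documents is to average their (smoothed) language models. This uniform mixture is an assumption made for the sketch, not the estimator of [3, 4].

```python
from collections import defaultdict

def update_relevance_model(feedback_doc_models):
    """Estimate P(w | theta_R) as the uniform average of the feedback
    documents' (already smoothed) language models.

    feedback_doc_models -- list of dicts mapping word -> P(w | D)
    """
    theta_r = defaultdict(float)
    for p_w_d in feedback_doc_models:
        for w, p in p_w_d.items():
            theta_r[w] += p / len(feedback_doc_models)
    return dict(theta_r)
```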

References

[1] Croft, W. B., and Lafferty, J. Language Modeling for Information Retrieval. Kluwer Academic Publishers, 2003.

[2] Lafferty, J., and Zhai, C. Probabilistic Relevance Models Based on Document and Query Generation. In Language Modeling for Information Retrieval (2003), vol. 13 of the Kluwer International Series on IR, pp. 1-10.

[3] Lafferty, J., and Zhai, C. Document Language Models, Query Models, and Risk Minimization for Information Retrieval. In SIGIR '01: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (New York, NY, USA, 2001), ACM Press, pp. 111-119.

[4] Lavrenko, V., and Croft, W. B. Relevance Based Language Models. In SIGIR '01: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (New York, NY, USA, 2001), ACM Press, pp. 120-127.

[5] Ponte, J. M., and Croft, W. B. A Language Modeling Approach to Information Retrieval. In SIGIR '98: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1998), ACM Press, pp. 275-281.

[6] Robertson, S. E. The Probability Ranking Principle in IR. In Readings in Information Retrieval (1997), pp. 281-286.

[7] Robertson, S. E., and Sparck Jones, K. Relevance Weighting of Search Terms. Journal of the American Society for Information Science 27 (1976), 129-146.

[8] Baeza-Yates, R., and Ribeiro-Neto, B. Modern Information Retrieval. Pearson Education, 2005.


Thank You
