Subject: Statistics
Paper: Regression Analysis III Module: Components of GLM
Regression Analysis III 1 / 14
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Content writers: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata
Content reviewer: Department of Statistics, University of Calcutta
Components of a GLM
There are three components of a GLM:
- Random component
- Systematic component
- Link function
The random component
- Consider $Y$ to be the response variable.
- $Y = (Y_1, Y_2, \ldots, Y_n)$: the $Y_i$'s are independently distributed.
- We consider one-parameter exponential families here, so we assume the distribution of each $Y_i$ belongs to that family and has the form
  $$f(y) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + C(y, \phi) \right\}$$
- $\theta$: natural or canonical parameter
- $\phi$: dispersion parameter
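As a worked check (not part of the original slides), the Poisson distribution can be written in this exponential-family form with $\theta = \log\lambda$, $b(\theta) = e^{\theta}$, $a(\phi) = 1$ and $C(y, \phi) = -\log y!$. A minimal Python sketch verifying that this decomposition reproduces the usual pmf:

```python
import math

# Sketch: the Poisson pmf in the exponential-family form
# f(y) = exp{ (y*theta - b(theta)) / a(phi) + C(y, phi) },
# with theta = log(lam), b(theta) = exp(theta), a(phi) = 1, C(y, phi) = -log(y!).

def poisson_pmf_direct(y, lam):
    """Standard Poisson pmf: lam^y * exp(-lam) / y!."""
    return lam**y * math.exp(-lam) / math.factorial(y)

def poisson_pmf_expfam(y, lam):
    """Same pmf via the exponential-family decomposition."""
    theta = math.log(lam)                 # natural parameter
    b = math.exp(theta)                   # b(theta) = lam
    c = -math.log(math.factorial(y))      # C(y, phi)
    return math.exp(y * theta - b + c)    # a(phi) = 1

for y in range(6):
    assert math.isclose(poisson_pmf_direct(y, 2.5), poisson_pmf_expfam(y, 2.5))
```

The illustrative rate 2.5 is arbitrary; any positive $\lambda$ gives the same agreement.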
Examples of the one-parameter exponential family
- Normal: $f(y) = \exp\left\{ \dfrac{y\mu - \mu^2/2}{\sigma^2} + \left[ -\dfrac{y^2}{2\sigma^2} - \log(\sqrt{2\pi}\,\sigma) \right] \right\}$
- Binomial: $f(y) = \exp\left\{ y \log\dfrac{\pi}{1-\pi} + n\log(1-\pi) + \log\binom{n}{y} \right\}$
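The binomial rewrite above can be checked numerically (an illustration, not from the slides): the exponential-family expression with natural parameter $\log\frac{\pi}{1-\pi}$ matches the usual pmf $\binom{n}{y}\pi^y(1-\pi)^{n-y}$.

```python
import math

# Verifying f(y) = exp{ y*log(pi/(1-pi)) + n*log(1-pi) + log C(n, y) }
# against the standard binomial pmf, at illustrative n = 10, pi = 0.3.

def binom_pmf_direct(y, n, p):
    return math.comb(n, y) * p**y * (1 - p)**(n - y)

def binom_pmf_expfam(y, n, p):
    theta = math.log(p / (1 - p))  # natural parameter: the logit of pi
    return math.exp(y * theta + n * math.log(1 - p) + math.log(math.comb(n, y)))

for y in range(11):
    assert math.isclose(binom_pmf_direct(y, 10, 0.3), binom_pmf_expfam(y, 10, 0.3))
```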
Members of the one-parameter exponential family
Other members of the exponential family:
- Poisson
- Binomial
- Negative Binomial
- Gamma
- Inverse Gaussian
Systematic component
- Consists of the explanatory variables $x$ and a linear function of $x$ and $\beta$
- $x = (x_1, x_2, \ldots, x_p)$: set of $p$ explanatory variables
- $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$: parameter vector
- $\eta = x'\beta = \sum_{j=1}^{p} \beta_j x_j$: linear predictor
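The linear predictor is just an inner product; a minimal sketch with made-up values for $x$ and $\beta$ (illustrative only, not from the slides):

```python
# Linear predictor eta = x'beta = sum_j beta_j * x_j,
# with illustrative (made-up) values for x and beta, p = 3.

x = [1.0, 2.0, 0.5]        # explanatory variables for one observation
beta = [0.3, -1.2, 2.0]    # parameter vector

eta = sum(b * xj for b, xj in zip(beta, x))  # linear predictor
```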
Link function
- It is the link between the random and systematic components.
- It is a function that specifies the relationship between the expected value of the random component and the systematic component.
- $\eta_i = g(\mu_i)$, where $\mu_i = E(Y_i)$ and $g(\cdot)$ is a continuous, monotone and differentiable function
- Identity link: $g(\mu_i) = \mu_i$
- Canonical link: $g(\mu_i) = \theta_i$
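As a concrete illustration of the canonical link (assuming the Poisson case, which is not worked on this slide): since $\mu = b'(\theta) = e^{\theta}$ for Poisson, the canonical link is the function that recovers $\theta$ from $\mu$, namely $g(\mu) = \log\mu$. A small numerical check:

```python
import math

# For Poisson, b(theta) = exp(theta), so mu = b'(theta) = exp(theta)
# and the canonical link g(mu) = theta is the log.

def b_prime(theta):
    return math.exp(theta)   # mean function mu(theta) for Poisson

def canonical_link(mu):
    return math.log(mu)      # inverts mu = exp(theta)

theta = 0.7
mu = b_prime(theta)
assert math.isclose(canonical_link(mu), theta)  # g(mu_i) = theta_i
```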
Example
Example: $Y_i \sim \text{Bernoulli}(p_i)$, so $\mu_i = E(Y_i) = p_i$
- Logit link: $g(\mu_i) = \log\dfrac{\mu_i}{1-\mu_i}$
- Probit link: $g(\mu_i) = \Phi^{-1}(\mu_i)$
- Linear probability model: $g(\mu_i) = \mu_i$
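The three Bernoulli links can be evaluated side by side; a minimal sketch at an illustrative $\mu_i = 0.8$ (the value is made up for the example):

```python
import math
from statistics import NormalDist

# The three links from the slide, evaluated at mu_i = 0.8.

mu = 0.8
logit = math.log(mu / (1 - mu))      # g(mu) = log(mu / (1 - mu))
probit = NormalDist().inv_cdf(mu)    # g(mu) = Phi^{-1}(mu)
linear = mu                          # linear probability model: identity

# Logit and probit map (0, 1) onto the whole real line;
# inverting either one recovers mu.
assert math.isclose(1 / (1 + math.exp(-logit)), mu)
assert math.isclose(NormalDist().cdf(probit), mu)
```

Both transformed values are positive here because $\mu > 0.5$; the linear probability model leaves $\mu$ untouched, which is why its fitted values can escape $(0, 1)$.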
Likelihood estimation
Log-likelihood function based on a single observation:
- $L_i = \dfrac{y_i\theta_i - b(\theta_i)}{a(\phi_i)} + C(y_i, \phi_i)$
- $\dfrac{\partial L_i}{\partial \theta_i} = \dfrac{y_i - b'(\theta_i)}{a(\phi_i)}$
- $E\left(\dfrac{\partial L_i}{\partial \theta_i}\right) = 0 \Rightarrow E(Y_i) = \mu_i = b'(\theta_i)$
- $\dfrac{\partial^2 L_i}{\partial \theta_i^2} = -\dfrac{b''(\theta_i)}{a(\phi_i)}$
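The identity $E(Y_i) = b'(\theta_i)$ can be verified numerically; a sketch for the Poisson case (an illustration, assuming $\theta = \log\lambda$ and $b(\theta) = e^{\theta}$ as before):

```python
import math

# Check E(Y) = b'(theta) for Poisson(lam): b(theta) = exp(theta), so
# b'(theta) = exp(theta) = lam, which should equal the pmf-weighted mean.

lam = 2.0
theta = math.log(lam)
b_prime = math.exp(theta)   # b'(theta) = lam

# E(Y) computed directly from the pmf (truncated sum; the tail is negligible).
mean = sum(y * lam**y * math.exp(-lam) / math.factorial(y) for y in range(60))

assert math.isclose(mean, b_prime, rel_tol=1e-9)
```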
Variance function
$$E\left(\frac{\partial L_i}{\partial \theta_i}\right)^2 = E\left[\frac{y_i - b'(\theta_i)}{a(\phi_i)}\right]^2 = \frac{Var(Y_i)}{[a(\phi_i)]^2} = \frac{b''(\theta_i)}{a(\phi_i)}$$
$$\Rightarrow Var(Y_i) = b''(\theta_i)\,a(\phi_i)$$
$V = b''(\theta_i)$: variance function
$$V = V(\mu_i) = \frac{d\,b'(\theta_i)}{d\theta_i} = \frac{d\mu_i}{d\theta_i}$$
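A numerical illustration of $Var(Y_i) = b''(\theta_i)\,a(\phi_i)$, again assuming the Poisson case: there $b''(\theta) = e^{\theta} = \mu$ and $a(\phi) = 1$, so the variance function is $V(\mu) = \mu$.

```python
import math

# Check Var(Y) = b''(theta) * a(phi) for Poisson: b''(theta) = lam, a(phi) = 1,
# so the pmf-weighted variance should equal lam itself (V(mu) = mu).

lam = 3.5
pmf = [lam**y * math.exp(-lam) / math.factorial(y) for y in range(80)]
mean = sum(y * p for y, p in enumerate(pmf))
var = sum((y - mean)**2 * p for y, p in enumerate(pmf))

assert math.isclose(var, lam, rel_tol=1e-8)   # variance function V(mu) = mu
```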
Salient features of GLM
- A GLM does not transform $Y$; it transforms only the mean $E(Y)$ and models it as a function of the linear predictor.
- The objective is to investigate whether, and how, the mean varies as a function of the levels of our predictor or explanatory variables.
- The link function transforms the model to a linear model in the predictors, with a different mean for each observation $Y_i$.
Salient features of GLM contd.
- GLM relaxes the normality assumption.
- GLM allows for non-uniform variance.
- The variance of each observation $Y_i$ is a function of its mean $\mu_i$.
- The distribution is completely specified in terms of its mean and variance.
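The mean-variance tie can be seen directly in the Bernoulli case (an illustration, not from the slides): $Var(Y) = \mu(1-\mu)$, so observations with different means automatically carry different variances.

```python
# Non-uniform variance: for Bernoulli(mu), Var(Y) = mu * (1 - mu),
# computed here as E(Y^2) - E(Y)^2 with Y taking values in {0, 1}.

def bernoulli_var(mu):
    return mu - mu**2   # E(Y^2) = E(Y) = mu for a 0/1 variable

assert bernoulli_var(0.5) == 0.25                # largest at mu = 0.5
assert bernoulli_var(0.9) < bernoulli_var(0.5)   # shrinks toward the ends
```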
Summary
- Random component: the response variable $Y_i$, whose distribution belongs to the one-parameter exponential family.
- Systematic component: a linear function of the explanatory variables (the linear predictor).
- Link function: links the random component with the systematic component to make the relationship linear.
- The variance of each observation is a function of the mean of that observation.
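The three components can be put together in one small end-to-end sketch (an illustration, not from the slides): a Bernoulli random component, a linear predictor $\eta_i = \beta_0 + \beta_1 x_i$, and the logit link, fitted by Newton-Raphson on a tiny made-up data set.

```python
import math

# Random component: Y_i ~ Bernoulli(mu_i); systematic: eta_i = b0 + b1*x_i;
# link: logit. Newton-Raphson on the Bernoulli log-likelihood (made-up data).

x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [0, 0, 1, 0, 1, 1]

b0, b1 = 0.0, 0.0
for _ in range(25):                                  # Newton iterations
    mu = [1 / (1 + math.exp(-(b0 + b1 * xi))) for xi in x]
    # score vector: sum of (y_i - mu_i) times each covariate
    g0 = sum(yi - mi for yi, mi in zip(y, mu))
    g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
    # information matrix uses the variance function V(mu) = mu(1 - mu)
    w = [mi * (1 - mi) for mi in mu]
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h00 * h11 - h01 * h01
    b0 += (h11 * g0 - h01 * g1) / det                # solve H * step = g
    b1 += (h00 * g1 - h01 * g0) / det

probs = [1 / (1 + math.exp(-(b0 + b1 * xi))) for xi in x]
assert all(a < b for a, b in zip(probs, probs[1:]))  # fitted probs rise with x
```

At convergence the intercept score equation forces the fitted probabilities to sum to the observed successes, which is a useful sanity check on any logistic fit.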