Subject: Statistics
Paper: Introduction to R
Module: Maximum Likelihood Estimation in R-II
Development Team
Principal investigator:
Dr. Bhaswati Ganguli,
Professor, Department of Statistics, University of Calcutta
Paper co-ordinator:
Dr. Arindam Basu,
University of Canterbury, New Zealand
Content writer: Dr. Santu Ghosh,
Lecturer, Department of Environmental Health Engineering, Sri Ramachandra University, Chennai
Content reviewer: Dr. Kalyan Das,
Professor, Department of Statistics, University of Calcutta
Intro-R-MLE 2 /23
Outline
Maximum Likelihood Estimation by mle(stats4) function
Maximum Likelihood Estimation by mle2(bbmle) function
Maximum Likelihood Estimation by the formula interface of the
mle2(bbmle) function
MLE by mle(stats4) function
MLE of a normal distribution: first we generate some data.
set.seed(1001)
N <- 100
x <- rnorm(N, mean = 3, sd = 2)
Then we formulate the log-likelihood function.
LL <- function(mu, sigma) {
  R = dnorm(x, mu, sigma)
  -sum(log(R))
}
MLE by mle(stats4) function
Let us use the mle command from the stats4 package to get the estimates as follows:
library(stats4)
library(methods)
mle(LL, start = list(mu = 1, sigma=1))
## Warning in dnorm(x, mu, sigma): NaNs produced
## Warning in dnorm(x, mu, sigma): NaNs produced
## Warning in dnorm(x, mu, sigma): NaNs produced
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1))
##
## Coefficients:
## mu sigma
## 2.998305 2.277506
MLE by mle(stats4) function
Those warnings are a little disconcerting! They are produced when negative values are attempted for the standard deviation.
There are two ways to sort this out. The first is to apply constraints on the parameters. The mean does not require a constraint but we insist that the standard deviation is positive.
mle(LL, start = list(mu = 1, sigma = 1), method = "L-BFGS-B", lower = c(-Inf, 0), upper = c(Inf, Inf))
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1), method = "L-BFGS-B",
## lower = c(-Inf, 0), upper = c(Inf, Inf))
##
## Coefficients:
## mu sigma
## 2.998304 2.277506
This works because mle() calls optim(), which has a number of optimisation methods. The default method is BFGS. An alternative, the L-BFGS-B method, allows box constraints.
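Since mle() ultimately hands the work to optim(), the same box-constrained fit can be sketched with optim() directly. This is a minimal sketch, not part of the original slides, assuming the simulated x from the earlier slide is still in the workspace:

```r
# Negative log-likelihood written for optim(), which passes one parameter vector.
nll <- function(par) -sum(dnorm(x, mean = par[1], sd = par[2], log = TRUE))

# Box constraints keep sigma strictly positive, avoiding the NaN warnings.
opt <- optim(par = c(mu = 1, sigma = 1), fn = nll,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-8), upper = c(Inf, Inf))
opt$par  # should agree closely with the mle() coefficients above
```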
MLE by mle(stats4) function
The other solution is simply to suppress the warnings. It’s neater and produces the same results.
LL <- function(mu, sigma) {
  R = suppressWarnings(dnorm(x, mu, sigma))
  -sum(log(R))
}
mle(LL, start = list(mu = 1, sigma=1))
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1))
##
## Coefficients:
## mu sigma
## 2.998305 2.277506
Linear Model estimates by MLE
Let us simulate some data as follows:
x <- runif(100)
y <- 2.5 * x + 3 + rnorm(100,0,0.5)
Define the log-likelihood function as follows:
LL <- function(beta0, beta1, mu, sigma) {
  R = y - x * beta1 - beta0
  R = suppressWarnings(dnorm(R, mu, sigma, log = TRUE))
  -sum(R)
}
Linear Model estimates by MLE
Let us get the estimates using the mle command:
fit <- mle(LL, start = list(beta0 = 4, beta1 = 2, mu = 0, sigma=1))
summary(fit)
## Maximum likelihood estimation
##
## Call:
## mle(minuslogl = LL, start = list(beta0 = 4, beta1 = 2, mu = 0,
## sigma = 1))
##
## Coefficients:
## Estimate Std. Error
## beta0 3.5270106 2.372657e+04
## beta1 2.4758093 1.465961e-01
## mu -0.4729894 2.372657e+04
## sigma 0.4391135 3.104928e-02
##
## -2 log L: 119.1882
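Note the enormous standard errors on beta0 and mu: the model is over-parameterised, since only the sum beta0 + mu is identifiable from the data. As a sanity check (not in the original slides), the point estimates can be compared against ordinary least squares, which for a normal-error linear model gives the same fitted line:

```r
# OLS comparison, assuming x and y were simulated as on the previous slide.
fit.lm <- lm(y ~ x)
coef(fit.lm)  # intercept should be close to beta0 + mu, slope close to beta1
```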
MLE by bbmle package: mle2()
The bbmle package, designed to simplify maximum likelihood estimation and analysis in R, extends and modifies the mle function and class in the stats4 package that comes with R by default.
The major differences between mle and mle2 are:
- mle2 is more robust, with additional warnings (e.g. if the Hessian can’t be computed by finite differences, mle2 returns a fit with a missing Hessian rather than stopping with an error)
- mle2 uses a data argument to allow different data to be passed to the negative log-likelihood function
MLE by bbmle package: mle2()
- mle2 has a formula interface like that of (e.g.) gls in the nlme package. For relatively simple models the formula for the maximum likelihood can be written in-line, rather than defining a negative log-likelihood function. The formula interface also simplifies fitting models with categorical variables. Models fitted using the formula interface also have applicable predict and simulate methods.
- bbmle defines anova, AIC, AICc, and BIC methods for mle2 objects, as well as AICtab, BICtab, and AICctab functions for producing summary tables of information criteria for a set of models.
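As an illustrative sketch of the information-criterion helpers (this example is not from the original slides; the data and both fits are hypothetical):

```r
library(bbmle)
set.seed(101)
# Overdispersed counts: a negative binomial model should beat a Poisson model.
z <- rnbinom(200, mu = 4, size = 2)
fit.pois <- mle2(z ~ dpois(lambda), start = list(lambda = 1),
                 data = data.frame(z))
fit.nb <- mle2(z ~ dnbinom(mu = mu, size = k), start = list(mu = 4, k = 1),
               data = data.frame(z))
AICtab(fit.pois, fit.nb, weights = TRUE)  # delta-AIC and Akaike weights
```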
MLE on simulated beta-binomial data
First, generate a single beta-binomially distributed set of points as a simple test of the fit.
library(emdbook)
set.seed(1001)
x1 <- rbetabinom(n=1000,prob=0.1,size=50,theta=10)
Construct a simple negative log-likelihood function:
library("bbmle")
mtmp <- function(prob, size, theta) {
  -sum(dbetabinom(x1, prob, size, theta, log = TRUE))
}
MLE on simulated beta-binomial data
Fit the model, using data to pass the size parameter (since it wasn’t hard-coded in the mtmp function):
(m0 <- mle2(mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50)))
##
## Call:
## mle2(minuslogl = mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50))
##
## Coefficients:
## prob theta
## 0.1030974 10.0758090
##
## Log-likelihood: -2723.5
MLE on simulated data
The summary method for mle2 objects shows the parameters; approximate standard errors (based on the quadratic approximation to the curvature at the maximum likelihood estimate); and a test of the parameter difference from zero based on this standard error and on an assumption that the likelihood surface is quadratic (or, equivalently, that the sampling distribution of the estimated parameters is normal).
summary(m0)
## Maximum likelihood estimation
##
## Call:
## mle2(minuslogl = mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50))
##
## Coefficients:
## Estimate Std. Error z value Pr(z)
## prob 0.1030974 0.0031626 32.599 < 2.2e-16 ***
## theta 10.0758090 0.6213319 16.216 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## -2 log L: 5446.995
MLE on simulated data
You can apply confint directly to m0, but if you’re going to work with the likelihood profile [e.g. plotting, or looking for confidence intervals at several different α values] then it is more efficient to compute the profile likelihood once:
p0 <- profile(m0)
Compare the confidence interval estimates based on inverting a spline fit to the profile (the default); based on the quadratic approximation at the maximum likelihood estimate; and based on root-finding to find the exact point where the profile crosses the critical level.
MLE on simulated data
confint(p0)
## 2.5 % 97.5 %
## prob 0.09709228 0.1095103
## theta 8.91708205 11.3559592
confint(m0, method = "quad")
## 2.5 % 97.5 %
## prob 0.09689875 0.1092961
## theta 8.85802088 11.2935972
confint(m0, method = "uniroot")
## 2.5 % 97.5 %
## prob 0.09709908 0.1095033
## theta 8.91691019 11.3559746
MLE on simulated data: plot of profiles
plot(p0, plot.confstr = TRUE)
[Figure: likelihood profiles for prob (left) and theta (right); signed square-root deviance z against each parameter, with the 50%, 80%, 90%, 95%, and 99% confidence levels marked.]
MLE on simulated data
The plot method for likelihood profiles displays the square root of the deviance difference
- that is, twice the difference in negative log-likelihood from the best fit, so it will be V-shaped for cases where the quadratic approximation works well (as in this case).
For a better visual estimate of whether the profile is quadratic, use the absVal = FALSE option to the plot method.
You can also request confidence intervals calculated using uniroot, which may be more exact when the profile is not smooth enough to be modeled accurately by a spline.
- However, this method is also more sensitive to numeric problems.
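For example, the non-V-shaped view mentioned above can be requested directly:

```r
plot(p0, absVal = FALSE)  # signed square-root deviance; roughly a straight line
                          # through zero when the quadratic approximation is good
```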
MLE by formula interface
Instead of defining an explicit function for minuslogl, we can also use the formula interface.
The formula interface assumes that the density function given
(i) has x as its first argument (if the distribution is multivariate, then x should be a matrix of observations), and
(ii) has a log argument that will return the log-probability or log-probability density if log = TRUE.
Some of the extended functionality (prediction etc.) depends on the existence of an s-variant function for the distribution that returns (at least) the mean and median as a function of the parameters (currently defined: snorm, sbinom, sbeta, snbinom, spois).
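A user-written density that follows conventions (i) and (ii) works in the formula interface as well. A minimal sketch (the name dmydist and the exponential data are hypothetical, not from the slides):

```r
library(bbmle)

# Custom density: the data vector x comes first, and log = TRUE returns the
# log-density, as the formula interface requires.
dmydist <- function(x, rate, log = FALSE) {
  dexp(x, rate = rate, log = log)
}

set.seed(1)
w <- rexp(200, rate = 2)
mle2(w ~ dmydist(rate), start = list(rate = 1), data = data.frame(w))
```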
MLE by formula interface
Note that you must specify the data via the data argument when using the formula interface.
m0f <- mle2(x1 ~ dbetabinom(prob, size = 50, theta),
    start = list(prob = 0.2, theta = 9), data = data.frame(x1))
This may be slightly more unwieldy than just pulling the data from your workspace when you are doing simple things, but in the long run it makes tasks like predicting new responses much simpler.
MLE by formula interface
It’s convenient to use the formula interface to try out likelihood estimation on the transformed parameters:
m0cf <- mle2(x1 ~ dbetabinom(prob = plogis(lprob),
    size = 50, theta = exp(ltheta)),
    start = list(lprob = 0, ltheta = 2), data = data.frame(x1))
confint(m0cf, method = "uniroot")
## 2.5 % 97.5 %
## lprob -2.229960 -2.095759
## ltheta 2.187954 2.429738
confint(m0cf, method = "spline")
## 2.5 % 97.5 %
## lprob -2.229963 -2.095756
## ltheta 2.187948 2.429742
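These intervals are on the logit and log scales; applying the inverse transforms maps them back to the original parameters, where they can be compared with the untransformed profile intervals. A sketch using the m0cf fit above:

```r
ci <- confint(m0cf, method = "uniroot")
plogis(ci["lprob", ])  # back-transformed to the prob scale
exp(ci["ltheta", ])    # back-transformed to the theta scale
```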
References
1 http://www.r-bloggers.com/fitting-a-model-by-maximum-likelihood/
2 https://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html
3 https://cran.r-project.org/web/packages/bbmle/vignettes/mle2.pdf