Subject: Statistics
Paper: Introduction to R
Module: Maximum Likelihood Estimation in R-II
Development Team
Principal investigator:
Dr. Bhaswati Ganguli,
Professor, Department of Statistics, University of Calcutta
Paper co-ordinator:
Dr. Arindam Basu,
University of Canterbury, New Zealand
Content writer: Dr. Santu Ghosh,
Lecturer, Department of Environmental Health Engineering, Sri Ramachandra University, Chennai
Content reviewer: Dr. Kalyan Das,
Professor, Department of Statistics, University of Calcutta
Intro-R-MLE 2 /23
Outline
Maximum Likelihood Estimation by mle(stats4) function
Maximum Likelihood Estimation by mle2(bbmle) function
Maximum Likelihood Estimation by the formula interface of the
mle2(bbmle) function
MLE by mle(stats4) function
MLE of a normal distribution: first we generate some data.
set.seed(1001)
N <- 100
x <- rnorm(N, mean = 3, sd = 2)
Then we formulate the log-likelihood function.
LL <- function(mu, sigma) {
  R = dnorm(x, mu, sigma)
  -sum(log(R))
}
MLE by mle(stats4) function
Let us use the mle command from the stats4 package to get the estimates as follows:
library(stats4)
library(methods)
mle(LL, start = list(mu = 1, sigma=1))
## Warning in dnorm(x, mu, sigma): NaNs produced
## Warning in dnorm(x, mu, sigma): NaNs produced
## Warning in dnorm(x, mu, sigma): NaNs produced
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1))
##
## Coefficients:
## mu sigma
## 2.998305 2.277506
MLE by mle(stats4) function
Those warnings are a little disconcerting! They are produced when negative values are attempted for the standard deviation.
There are two ways to sort this out. The first is to apply constraints on the parameters. The mean does not require a constraint but we insist that the standard deviation is positive.
mle(LL, start = list(mu = 1, sigma = 1), method = "L-BFGS-B", lower = c(-Inf, 0), upper = c(Inf, Inf))
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1), method = "L-BFGS-B",
## lower = c(-Inf, 0), upper = c(Inf, Inf))
##
## Coefficients:
## mu sigma
## 2.998304 2.277506
This works because mle() calls optim(), which has a number of optimisation methods. The default method is BFGS. An alternative, the L-BFGS-B method, allows box constraints.
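Since mle() ultimately hands the work to optim(), the same box-constrained fit can be sketched with optim() directly. This is a minimal sketch, not part of the original slides, assuming the simulated x from the earlier slide is still in the workspace:

```r
# Negative log-likelihood written for optim(), which passes one parameter vector.
nll <- function(par) -sum(dnorm(x, mean = par[1], sd = par[2], log = TRUE))

# Box constraints keep sigma strictly positive, avoiding the NaN warnings.
opt <- optim(par = c(mu = 1, sigma = 1), fn = nll,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-8), upper = c(Inf, Inf))
opt$par  # should agree closely with the mle() coefficients above
```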
MLE by mle(stats4) function
The other solution is simply to suppress the warnings. It’s neater and produces the same results.
LL <- function(mu, sigma) {
  R = suppressWarnings(dnorm(x, mu, sigma))
  -sum(log(R))
}
mle(LL, start = list(mu = 1, sigma=1))
##
## Call:
## mle(minuslogl = LL, start = list(mu = 1, sigma = 1))
##
## Coefficients:
## mu sigma
## 2.998305 2.277506
Linear Model estimates by MLE
Let us simulate some data as follows:
x <- runif(100)
y <- 2.5 * x + 3 + rnorm(100,0,0.5)
Define the log-likelihood function as follows:
LL <- function(beta0, beta1, mu, sigma) {
  R = y - x * beta1 - beta0
  R = suppressWarnings(dnorm(R, mu, sigma, log = TRUE))
  -sum(R)
}
Linear Model estimates by MLE
Let us get the estimates using the mle command:
fit <- mle(LL, start = list(beta0 = 4, beta1 = 2, mu = 0, sigma=1))
summary(fit)
## Maximum likelihood estimation
##
## Call:
## mle(minuslogl = LL, start = list(beta0 = 4, beta1 = 2, mu = 0,
## sigma = 1))
##
## Coefficients:
## Estimate Std. Error
## beta0 3.5270106 2.372657e+04
## beta1 2.4758093 1.465961e-01
## mu -0.4729894 2.372657e+04
## sigma 0.4391135 3.104928e-02
##
## -2 log L: 119.1882
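Note the enormous standard errors on beta0 and mu: the model is over-parameterised, since only the sum beta0 + mu is identifiable from the data. As a sanity check (not in the original slides), the point estimates can be compared against ordinary least squares, which for a normal-error linear model gives the same fitted line:

```r
# OLS comparison, assuming x and y were simulated as on the previous slide.
fit.lm <- lm(y ~ x)
coef(fit.lm)  # intercept should be close to beta0 + mu, slope close to beta1
```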
MLE by bbmle package: mle2()
The bbmle package, designed to simplify maximum likelihood estimation and analysis in R, extends and modifies the mle function and class in the stats4 package that comes with R by default.
The major differences between mle and mle2 are:
- mle2 is more robust, with additional warnings (e.g. if the Hessian can’t be computed by finite differences, mle2 returns a fit with a missing Hessian rather than stopping with an error)
- mle2 uses a data argument to allow different data to be passed to the negative log-likelihood function
MLE by bbmle package: mle2()
- mle2 has a formula interface like that of (e.g.) gls in the nlme package. For relatively simple models the formula for the maximum likelihood can be written in-line, rather than defining a negative log-likelihood function. The formula interface also simplifies fitting models with categorical variables. Models fitted using the formula interface also have applicable predict and simulate methods.
- bbmle defines anova, AIC, AICc, and BIC methods for mle2 objects, as well as AICtab, BICtab, and AICctab functions for producing summary tables of information criteria for a set of models.
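As an illustrative sketch of the information-criterion helpers (this example is not from the original slides; the data and both fits are hypothetical):

```r
library(bbmle)
set.seed(101)
# Overdispersed counts: a negative binomial model should beat a Poisson model.
z <- rnbinom(200, mu = 4, size = 2)
fit.pois <- mle2(z ~ dpois(lambda), start = list(lambda = 1),
                 data = data.frame(z))
fit.nb <- mle2(z ~ dnbinom(mu = mu, size = k), start = list(mu = 4, k = 1),
               data = data.frame(z))
AICtab(fit.pois, fit.nb, weights = TRUE)  # delta-AIC and Akaike weights
```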
MLE on simulated beta-binomial data
First, generate a single beta-binomially distributed set of points as a simple test of the fit.
library(emdbook)
set.seed(1001)
x1 <- rbetabinom(n=1000,prob=0.1,size=50,theta=10)
Construct a simple negative log-likelihood function:
library("bbmle")
mtmp <- function(prob, size, theta) {
  -sum(dbetabinom(x1, prob, size, theta, log = TRUE))
}
MLE on simulated beta-binomial data
Fit the model, using data to pass the size parameter (since it wasn’t hard-coded in the mtmp function):
(m0 <- mle2(mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50)))
##
## Call:
## mle2(minuslogl = mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50))
##
## Coefficients:
## prob theta
## 0.1030974 10.0758090
##
## Log-likelihood: -2723.5
MLE on simulated data
The summary method for mle2 objects shows the parameters; approximate standard errors (based on the quadratic approximation to the curvature at the maximum likelihood estimate); and a test of the parameter difference from zero based on this standard error and on an assumption that the likelihood surface is quadratic (or, equivalently, that the sampling distribution of the estimated parameters is normal).
summary(m0)
## Maximum likelihood estimation
##
## Call:
## mle2(minuslogl = mtmp, start = list(prob = 0.2, theta = 9), data = list(size = 50))
##
## Coefficients:
## Estimate Std. Error z value Pr(z)
## prob 0.1030974 0.0031626 32.599 < 2.2e-16 ***
## theta 10.0758090 0.6213319 16.216 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## -2 log L: 5446.995
MLE on simulated data
You can apply confint directly to m0, but if you’re going to work with the likelihood profile [e.g. plotting, or looking for confidence intervals at several different α values] then it is more efficient to compute the profile likelihood once:
p0 <- profile(m0)
Compare the confidence interval estimates based on inverting a spline fit to the profile (the default); based on the quadratic approximation at the maximum likelihood estimate; and based on root-finding to find the exact point where the profile crosses the critical level.
MLE on simulated data
confint(p0)
## 2.5 % 97.5 %
## prob 0.09709228 0.1095103
## theta 8.91708205 11.3559592
confint(m0, method = "quad")
## 2.5 % 97.5 %
## prob 0.09689875 0.1092961
## theta 8.85802088 11.2935972
confint(m0, method = "uniroot")
## 2.5 % 97.5 %
## prob 0.09709908 0.1095033
## theta 8.91691019 11.3559746
MLE on simulated data: plot of profiles
plot(p0, plot.confstr = TRUE)
[Figure: likelihood profiles for prob (left) and theta (right); signed square-root deviance z against each parameter, with the 50%, 80%, 90%, 95%, and 99% confidence levels marked.]
MLE on simulated data
The plot method for likelihood profiles displays the square root of the deviance difference
- that is, twice the difference in negative log-likelihood from the best fit, so it will be V-shaped for cases where the quadratic approximation works well (as in this case).
For a better visual estimate of whether the profile is quadratic, use the absVal = FALSE option to the plot method.
You can also request confidence intervals calculated using uniroot, which may be more exact when the profile is not smooth enough to be modeled accurately by a spline.
- However, this method is also more sensitive to numeric problems.
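For example, the non-V-shaped view mentioned above can be requested directly:

```r
plot(p0, absVal = FALSE)  # signed square-root deviance; roughly a straight line
                          # through zero when the quadratic approximation is good
```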
MLE by formula interface
Instead of defining an explicit function for minuslogl, we can also use the formula interface.
The formula interface assumes that the density function given
(i) has x as its first argument (if the distribution is multivariate, then x should be a matrix of observations), and
(ii) has a log argument that will return the log-probability or log-probability density if log = TRUE.
Some of the extended functionality (prediction etc.) depends on the existence of an s-variant function for the distribution that returns (at least) the mean and median as a function of the parameters (currently defined: snorm, sbinom, sbeta, snbinom, spois).
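A user-written density that follows conventions (i) and (ii) works in the formula interface as well. A minimal sketch (the name dmydist and the exponential data are hypothetical, not from the slides):

```r
library(bbmle)

# Custom density: the data vector x comes first, and log = TRUE returns the
# log-density, as the formula interface requires.
dmydist <- function(x, rate, log = FALSE) {
  dexp(x, rate = rate, log = log)
}

set.seed(1)
w <- rexp(200, rate = 2)
mle2(w ~ dmydist(rate), start = list(rate = 1), data = data.frame(w))
```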
MLE by formula interface
Note that you must specify the data via the data argument when using the formula interface.
m0f <- mle2(x1 ~ dbetabinom(prob, size = 50, theta),
    start = list(prob = 0.2, theta = 9), data = data.frame(x1))
This may be slightly more unwieldy than just pulling the data from your workspace when you are doing simple things, but in the long run it makes tasks like predicting new responses much simpler.
MLE by formula interface
It’s convenient to use the formula interface to try out likelihood estimation on the transformed parameters:
m0cf <- mle2(x1 ~ dbetabinom(prob = plogis(lprob),
    size = 50, theta = exp(ltheta)),
    start = list(lprob = 0, ltheta = 2), data = data.frame(x1))
confint(m0cf, method = "uniroot")
## 2.5 % 97.5 %
## lprob -2.229960 -2.095759
## ltheta 2.187954 2.429738
confint(m0cf, method = "spline")
## 2.5 % 97.5 %
## lprob -2.229963 -2.095756
## ltheta 2.187948 2.429742
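These intervals are on the logit and log scales; applying the inverse transforms maps them back to the original parameters, where they can be compared with the untransformed profile intervals. A sketch using the m0cf fit above:

```r
ci <- confint(m0cf, method = "uniroot")
plogis(ci["lprob", ])  # back-transformed to the prob scale
exp(ci["ltheta", ])    # back-transformed to the theta scale
```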
References
1 http://www.r-bloggers.com/fitting-a-model-by-maximum-likelihood/
2 https://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html
3 https://cran.r-project.org/web/packages/bbmle/vignettes/mle2.pdf