A weighting function for ‘nls’ / ‘nlsLM’

July 19, 2012

Standard nonlinear regression assumes homoscedastic data, that is, all response values y_i are normally distributed around the curve with constant variance. In the case of heteroscedastic data (i.e. when the variance depends on the magnitude of the data), weighting the fit is essential. In nls (or nlsLM of the minpack.lm package), weighting can be conducted by two different methods:

1) by passing a vector of weighting values w_i, one for every y_i, to the ‘weights’ argument.

2) by modifying the objective function so that the weights are included in its body. An example of this is given in the documentation to nls.

1) is easy of course, but all w_i values have to be derived from somewhere, for example the reciprocal of the response values, w_i = 1/y_i. 2) is a bit more cumbersome, and one does not always want to define a new weighted version of the objective function.
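To make the two options concrete, here is a minimal sketch on the Puromycin data that are also used in the examples below, with w_i = 1/y_i in both cases (option 2 is modeled on the weighted.MM example in the nls documentation; weightedMM is just an illustrative name):

Treated <- Puromycin[Puromycin$state == "treated", ]

## option 1: precompute the weight vector w_i = 1/y_i and pass it to 'weights'
nls(rate ~ Vm * conc/(K + conc), data = Treated,
    start = list(Vm = 200, K = 0.05), weights = 1/Treated$rate)

## option 2: build the weights into the objective function itself,
## i.e. return residuals scaled by sqrt(w_i) = 1/sqrt(y_i)
weightedMM <- function(resp, conc, Vm, K) {
    pred <- Vm * conc/(K + conc)
    (resp - pred)/sqrt(resp)
}
nls(~ weightedMM(rate, conc, Vm, K), data = Treated,
    start = list(Vm = 200, K = 0.05))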

I thought it was about time to design a weighting function that can be supplied to the ‘weights’ argument and that makes weighting by different regimes a breeze. wfct() returns a vector of weights that are calculated from a user-defined expression and passes this vector to nls.

wfct <- function(expr)
{
    expr <- deparse(substitute(expr))

    ## create new environment
    newEnv <- new.env()

    ## get call
    mc <- sys.calls()[[1]]
    mcL <- as.list(mc)

    ## get data and write to newEnv
    DATA <- mcL[["data"]]
    DATA <- eval(DATA)
    DATA <- as.list(DATA)
    NAMES <- names(DATA)
    for (i in seq_along(DATA)) assign(NAMES[i], DATA[[i]], envir = newEnv)

    ## get parameter, response and predictor names
    formula <- as.formula(mcL[[2]])
    VARS <- all.vars(formula)
    RESP <- VARS[1]
    RHS <- VARS[-1]
    PRED <- match(RHS, names(DATA))
    PRED <- names(DATA)[na.omit(PRED)]

    ## calculate standard deviations of replicate response values
    ## if "error" occurs in the expression, and write them to newEnv
    if (length(grep("error", expr)) > 0) {
        y <- DATA[[RESP]]
        x <- DATA[[PRED]]
        ## test for replication
        if (!any(duplicated(x))) stop("No replicates available to calculate error from!")
        ## calculate error
        error <- tapply(y, x, function(e) var(e, na.rm = TRUE))
        error <- as.numeric(sqrt(error))
        ## expand to the original replicates (note: this assumes the data
        ## are ordered by the predictor, since tapply/table sort by level)
        error <- rep(error, as.numeric(table(x)))
        assign("error", error, envir = newEnv)
    }

    ## calculate fitted or residual values if "fitted"/"resid" occurs
    ## in the expression, and write them to newEnv
    if (length(grep("fitted", expr)) > 0 || length(grep("resid", expr)) > 0) {
        mc2 <- mc
        mc2$weights <- NULL
        MODEL <- eval(mc2)
        fitted <- fitted(MODEL)
        resid <- residuals(MODEL)
        assign("fitted", fitted, newEnv)
        assign("resid", resid, newEnv)
    }

    ## return evaluation in newEnv: vector of weights
    OUT <- eval(parse(text = expr), envir = newEnv)
    return(OUT)
}

The weighting function understands 5 different variable definitions, and combinations thereof (a combined expression is shown after the examples below):

the name of the predictor (independent) variable,
the name of the response (dependent) variable,
error: if replicates y_{ij} exist, the error \sigma(y_{ij}),
fitted: the fitted values \hat{y}_i of the model,
resid: the residuals y_i - \hat{y}_i of the model.

For the last two, the model is first fit unweighted, the fitted values and residuals are extracted, and the model is then refit with the defined weights.

Let’s have some examples (taken from the nls doc):

Treated <- Puromycin[Puromycin$state == "treated", ]

Weighting by inverse of response 1/y_i:

nls(rate ~ Vm * conc/(K + conc), data = Treated, 
    start = c(Vm = 200, K = 0.05), weights = wfct(1/rate))
Nonlinear regression model
  model:  rate ~ Vm * conc/(K + conc) 
   data:  Treated 
       Vm         K 
209.59669   0.06065 
 weighted residual sum-of-squares: 12.27

Number of iterations to convergence: 5 
Achieved convergence tolerance: 5.2e-06

Weighting by square root of predictor \sqrt{x_i}:

nls(rate ~ Vm * conc/(K + conc), data = Treated, 
    start = c(Vm = 200, K = 0.05), weights = wfct(sqrt(conc)))
Nonlinear regression model
  model:  rate ~ Vm * conc/(K + conc) 
   data:  Treated 
       Vm         K 
216.74134   0.07106 
 weighted residual sum-of-squares: 332.6

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.403e-06

Weighting by inverse square of fitted values 1/\hat{y_i}^2:

nls(rate ~ Vm * conc/(K + conc), data = Treated, 
    start = c(Vm = 200, K = 0.05), weights = wfct(1/fitted^2))
Nonlinear regression model
  model:  rate ~ Vm * conc/(K + conc) 
   data:  Treated 
       Vm         K 
200.84098   0.04943 
 weighted residual sum-of-squares: 0.2272

Number of iterations to convergence: 4 
Achieved convergence tolerance: 1.47e-06 
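The names can also be combined into a single weights expression, for example the predictor and the error together. A sketch of such an illustrative call (output omitted; like all 'error'-based weighting, this requires replicates in the data):

nls(rate ~ Vm * conc/(K + conc), data = Treated, 
    start = c(Vm = 200, K = 0.05), weights = wfct(conc/error^2))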

A common weighting regime is to use the inverse variance 1/\sigma^2(y_i):
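Since the Treated data contain replicates for every conc value, wfct can compute the error variable, and the call would look like the following sketch (output omitted):

nls(rate ~ Vm * conc/(K + conc), data = Treated, 
    start = c(Vm = 200, K = 0.05), weights = wfct(1/error^2))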


A better ‘nls’ (?)

July 5, 2012

Those who do a lot of nonlinear regression will love the nls function of R. In most cases it works really well, but there are some mishaps that can occur when using bad starting values for the parameters. One of the most dreaded is the “singular gradient matrix at initial parameter estimates” error, which brings the function to a stop because the gradient check in stats:::nlsModel terminates if the QR decomposition of the gradient matrix is not of full column rank.

Nearly all nonlinear fitting programs out there use the Levenberg-Marquardt (LM) algorithm for nonlinear regression, because its switching between Gauss-Newton and gradient descent makes it highly robust against far-off-optimal starting values. Unfortunately, the standard nls function has no LM algorithm implemented; instead it houses the Gauss-Newton type, the PORT routines and a partial linear fitter. The fabulous minpack.lm package from Katherine M. Mullen offers an R frontend to a Fortran LM implementation from the MINPACK library. Its nls.lm function must be supplied with an objective function that returns a vector of residuals to be minimized.

Together with Kate I developed a function nlsLM that has the gee-whiz formula interface of nls but calls LM instead of Gauss-Newton. This has some advantages, as we will see below. The function returns the usual result of class ‘nls’, and due to some modifications, all standard generics work. The modifications were made so that the formula is transformed into a function that returns a vector of (weighted) residuals whose sum of squares is minimized by nls.lm. The optimized parameters are then transferred to stats:::nlsModel in order to obtain an object of class ‘nlsModel’. The internal C function C_nls_iter and stats:::nls_port_fit were removed from the call path to avoid subsequent “Gauss-Newton”, “port” or “plinear” optimization of nlsModel.
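To illustrate the interface that nls.lm expects, here is a minimal sketch of fitting the DNase1 model from below without the formula interface (it follows the residual-function examples in the nls.lm documentation; residFun is just an illustrative name):

library(minpack.lm)
DNase1 <- subset(DNase, Run == 1)

## residual function: returns the vector of residuals for a given
## parameter list; nls.lm minimizes its sum of squares
residFun <- function(par, data) {
    pred <- par$Asym/(1 + exp((par$xmid - log(data$conc))/par$scal))
    data$density - pred
}

out <- nls.lm(par = list(Asym = 3, xmid = 0, scal = 1),
              fn = residFun, data = DNase1)
summary(out)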

So, what’s similar and what’s, well, better…

library(minpack.lm)
### Examples from 'nls' doc ###
DNase1 <- subset(DNase, Run == 1)
fm1DNase1 <- nlsLM(density ~ Asym/(1 + exp((xmid - log(conc))/scal)), 
data = DNase1, start = list(Asym = 3, xmid = 0, scal = 1))
## all generics are applicable
coef(fm1DNase1)
confint(fm1DNase1)
deviance(fm1DNase1)
df.residual(fm1DNase1)
fitted(fm1DNase1)
formula(fm1DNase1)
logLik(fm1DNase1)
predict(fm1DNase1)
print(fm1DNase1)
profile(fm1DNase1)
residuals(fm1DNase1)
summary(fm1DNase1)
update(fm1DNase1)
vcov(fm1DNase1)
weights(fm1DNase1)

nlsLM can fit zero-noise data, where nls fails:

x <- 1:10
y <- 2*x + 3                           
nls(y ~ a + b * x, start = list(a = 0.12345, b = 0.54321))
nlsLM(y ~ a + b * x, start = list(a = 0.12345, b = 0.54321))

Nonlinear regression model
model: y ~ a + b * x
data: parent.frame()
a b
3 2
residual sum-of-squares: 0

Number of iterations to convergence: 3
Achieved convergence tolerance: 1.49e-08

nlsLM often converges when nls gives the dreaded "singular gradient" error. An example:

x <- 0:140
y <- 200 / (1 + exp(17 - x)/2) * exp(-0.02*x)
yeps <- y + rnorm(length(y), sd = 2)
nls(yeps ~ p1 / (1 + exp(p2 - x)) * exp(p4 * x), start=list(p1=410,p2=18,p4=-.03))

Near-optimal starting parameters work:
Nonlinear regression model
model: yeps ~ p1/(1 + exp(p2 - x)) * exp(p4 * x)
data: parent.frame()
p1 p2 p4
200.67285 16.32430 -0.02005
residual sum-of-squares: 569

Number of iterations to convergence: 6
Achieved convergence tolerance: 2.611e-07

But off-optimal starting parameters give an error:

nls(yeps ~ p1 / (1 + exp(p2 - x)) * exp(p4 * x), start=list(p1=10,p2=18,p4=-.03))

Error in nls(yeps ~ p1/(1 + exp(p2 - x)) * exp(p4 * x), start = list(p1 = 10, :
  singular gradient

nlsLM is more robust with these starting parameters:

nlsLM(yeps ~ p1 / (1 + exp(p2 - x)) * exp(p4 * x), start=list(p1=10,p2=18,p4=-.03))

Nonlinear regression model
model: yeps ~ p1/(1 + exp(p2 - x)) * exp(p4 * x)
data: parent.frame()
p1 p2 p4
200.67285 16.32430 -0.02005
residual sum-of-squares: 569

Number of iterations to convergence: 10
Achieved convergence tolerance: 1.49e-08

Have fun!
Cheers, Andrej