## Monte Carlo-based prediction intervals for nonlinear regression

May 11, 2018

Calculation of the propagated uncertainty $\sigma_y$ using $\nabla \Sigma \nabla^T$ (1), where $\nabla$ is the gradient and $\Sigma$ the covariance matrix of the coefficients $\beta_i$, is called the “Delta Method” and is widely applied in nonlinear least-squares (NLS) fitting. However, this method is based on first-order Taylor expansion and thus assummes linearity around $\hat{y} = f(x, \hat{\beta})$. The second-order approach can partially correct for this restriction by using a second-order polynomial around $\hat{y}$, which is $\nabla \Sigma \nabla^T + \frac{1}{2} \rm{tr}(H \Sigma H \Sigma)$ (2), where $\rm{tr}(\cdot)$ is the matrix trace and $\rm{H}$ is the Hessian.
Confidence and prediction intervals for NLS models are calculated using $t(1 - \frac{\alpha}{2}, \nu) \cdot \sigma_y$ (3) or $t(1 - \frac{\alpha}{2}, \nu) \cdot \sqrt{\sigma_y^2 + \sigma_r^2}$ (4), respectively, where the residual variance $\sigma_r^2 = \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\nu}$ (5).
Now, how can we employ the matrix notation of error propagation for creating Taylor expansion- and Monte Carlo-based prediction intervals?
The inclusion of $\sigma_r^2$ in the prediction interval can be implemented as an extended gradient and “augmented” covariance matrix. So instead of using $\hat{y} = f(x, \hat{\beta})$ (6), we take $\hat{y} = f(x, \hat{\beta}) + \sigma_r^2$ (7) as the expression and augment the $i \times i$ covariance matrix $\Sigma$ to an $(i+1) \times (i+1)$ covariance matrix, where $\Sigma_{i+1, i+1} = \sigma_r^2$. Partial differentiation and matrix multiplication will then yield, for example with two coefficients $\beta_1$ and $\beta_2$ and their corresponding covariance matrix $\Sigma$:
$\left[\frac{\partial f}{\partial \beta_1}\; \frac{\partial f}{\partial \beta_2}\; 1\right] \left[ \begin{array}{ccc} \;\sigma_1^2\;\;\; \sigma_1\sigma_2\;\; 0 \\ \sigma_2\sigma_1\;\; \sigma_2^2\;\;\;\; 0 \\ \;\;\;0\;\;\;\;\;\; 0\;\;\;\;\; \sigma_r^2 \end{array} \right] \left[ \begin{array}{c} \frac{\partial f}{\partial \beta_1} \\ \frac{\partial f}{\partial \beta_2} \\ 1 \end{array} \right]$ (8)
$= \left(\frac{\partial f}{\partial \beta_1}\right)^2\sigma_1^2 + 2 \frac{\partial f}{\partial \beta_1} \frac{\partial f}{\partial \beta_2} \sigma_1 \sigma_2 + \left(\frac{\partial f}{\partial \beta_2}\right)^2\sigma_2^2 + \sigma_r^2$ (9)
$\equiv \sigma_y^2 + \sigma_r^2$, which then goes into (4).
The advantage of the augmented covariance matrix is that it can be exploited for creating Monte Carlo-based prediction intervals. This is new from propagate version 1.0-6 and is based on the paradigm that we add another dimension by employing the augmented covariance matrix of (8) in the multivariate t-distribution random number generator (in our case rtmvt), with $\mu = 0$. All $n$ simulations are then evaluated with (7) and the usual $[1 - \frac{\alpha}{2}, \frac{\alpha}{2}]$ quantiles calculated for the prediction interval. Using the original covariance matrix with (6) will deliver the MC-based confidence interval.
Application of second-order Taylor expansion or the MC-based approach demonstrates nicely that for the majority of nonlinear models, the confidence/prediction intervals around $\hat{y}$ are quite asymmetric, which the classical Delta Method does not capture:
 library(propagate)

 

DNase1 <- subset(DNase, Run == 1) fm3DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)), data = DNase1, start = list(Asym = 3, xmid = 0, scal = 1)) ## first-order prediction interval set.seed(123) PROP1 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000, second.order = FALSE, interval = "prediction") t(PROP1$summary)  $\begin{array}{cc} Prop.Mean.1 & 0.74804722 \\ Prop.sd.1 & 0.02081131 \\ Prop.2.5 & 0.70308712 \\ Prop.97.5 & 0.79300731 \\ \end{array}$  ## second-order prediction interval and MC set.seed(123) PROP2 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000, second.order = TRUE, interval = "prediction") t(PROP2$summary) 

$\begin{array}{cc} Prop.Mean.1 & 0.74804722 \\ Prop.Mean.2 & 0.74815136 \\ Prop.sd.1 & 0.02081131 \\ Prop.sd.2 & 0.02081520 \\ Prop.2.5 & 0.70318286 \\ Prop.97.5 & 0.79311987 \\ Sim.Mean & 0.74815598 \\ Sim.sd & 0.02261884 \\ Sim.2.5 & 0.70317629 \\ Sim.97.5 & 0.79309874 \\ \end{array}$

What we see here is that
i) the first-order prediction interval [0.70308712; 0.79300731] is symmetric and slightly down-biased compared to the second-order one [0.70317629; 0.79309874],
and
ii) the second-order prediction interval tallies nicely up to the 4th decimal with the new MC-based interval (0.70318286 and 0.70317629; 0.79311987 and 0.79309874).
I believe this clearly demonstrates the usefulness of the MC-based approach for NLS prediction interval estimation…