When simulating linear models based on some defined slope/intercept and added gaussian noise, the parameter estimates vary after least-squares fitting. Here is some code I developed that does a double transform of these models as to obtain a fitted model with EXACT defined parameter estimates `a`

(intercept) and `b`

(slope).

It does so by:

1) Fitting a linear model #1 to the `x,y`

data.

2) Correcting `y`

by : .

3) Refitting linear model #2: .

4) Correcting `y`

by : .

5) Refitting linear model #3: , which is the final model with parameter estimates `a`

and `b`

.

Below is the code:

exactLM <- function(

x = 1:100, ## predictor values

b = 0.01, ## slope

a = 3, ## intercept

error = NULL, ## homoscedastic error

n = 1, ## number of replicates

weights = NULL, ## possible weights, i.e. when heteroscedastic

plot = TRUE, ## plot data and regression

... ## other parameters to 'lm'

)

{

if (is.null(error)) stop("Please define some error!")

```
``` ## create x and y-values

x <- rep(x, n)

y <- a + b * x

if (!is.null(error) & length(error) != length(x)) stop("'x' and 'error' must be of same length!")

if (!is.null(weights) & length(weights) != length(x)) stop("'x' and 'weights' must be of same length!")

## add error

y <- y + error

## create linear model #1

LM1 <- lm(y ~ x, weights = weights, ...)

COEF1 <- coef(LM1)

## correct slope and create linear model #2

y <- y * (b/COEF1[2])

LM2 <- lm(y ~ x, weights = weights, ...)

COEF2 <- coef(LM2)

## correct intercept and create linear model #3

y <- y + (a - COEF2[1])

LM3 <- lm(y ~ x, weights = weights, ...)

## plot data and regression

plot(x, y, pch = 16)

abline(LM3, lwd = 2, col = "darkred")

` return(list(model = LM3, x = x, y = y))`

}

Here are some applications using replicates and weighted fitting:

############ Examples #################

## n = 1

exactLM(x = 1:100, a = 0.5, b = 0.2, error = rnorm(100, 0, 2))

```
```## n = 3

exactLM(x = 1:100, a = 0.5, b = 0.2, error = rnorm(300, 0, 2), n = 3)

## weighted by exact 1/var

x <- 1:100

error <- rnorm(100, 0, 0.1 * x)

weights <- 1/(0.1 * x)^2

exactLM(x = x, a = 0.5, b = 0.2, error = error, weights = weights)

`## weighted by empirical 1/var`

x <- rep(1:100, 3)

error <- rnorm(300, 0, 0.1 * x)

weights <- rep(1/(tapply(error, x, mean)^2), 3)

exactLM(x = x, a = 0.5, b = 0.2, error = error, weights = weights)

I am curious on comments concerning simplification and more importantly, application (other than cheating data…)!

Cheers,

Andrej