
Error propagation based on interval arithmetics

September 27, 2014

I added an interval function to my ‘propagate’ package (now on CRAN) that conducts error propagation based on interval arithmetic. It calculates the uncertainty of a model using (what I call) a “combinatorial sequence grid evaluation” approach, which mitigates the classical dependency problem that often inflates the result interval.
This is how it works:
For two variables $x, y$ with intervals $[x_1, x_2]$ and $[y_1, y_2]$ , the four basic arithmetic operations $\langle \mathrm{op} \rangle \in \{+, -, \cdot, /\}$ are
$[x_1, x_2] \,\langle\!\mathrm{op}\!\rangle\, [y_1, y_2] =$
$\left[ \min(x_1 \langle\!\mathrm{op}\!\rangle y_1, x_1 \langle\!\mathrm{op}\!\rangle y_2, x_2 \langle\!\mathrm{op}\!\rangle y_1, x_2 \langle\!\mathrm{op}\!\rangle y_2) \right.,$
$\left. \max(x_1 \langle\!\mathrm{op}\!\rangle y_1, x_1 \langle\!\mathrm{op}\!\rangle y_2, x_2 \langle\!\mathrm{op}\!\rangle y_1, x_2 \langle\!\mathrm{op}\!\rangle y_2)\right]$
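The min/max rule above can be sketched in a few lines of base R. This is only an illustration: `interval_op` is a hypothetical helper, not a function of the 'propagate' package, and it assumes intervals are passed as `c(lower, upper)` and that a divisor interval must not contain zero.

```r
# Hypothetical helper (not part of 'propagate'): applies one of the four
# basic operations to two intervals given as c(lower, upper).
interval_op <- function(x, y, op = c("+", "-", "*", "/")) {
  op <- match.arg(op)
  # Assumption: division is only handled when the divisor excludes zero.
  if (op == "/" && y[1] <= 0 && y[2] >= 0) {
    stop("divisor interval contains zero")
  }
  # outer() yields all four endpoint pairings x_i <op> y_j.
  vals <- as.vector(outer(x, y, op))
  c(min(vals), max(vals))
}

interval_op(c(1, 2), c(3, 5), "+")    # 4 7
interval_op(c(1, 2), c(3, 5), "-")    # -4 -1
interval_op(c(-1, 2), c(3, 5), "*")   # -5 10
interval_op(c(1, 2), c(4, 5), "/")    # 0.2 0.5
```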

So for a function $f([x_1, x_2], [y_1, y_2], [z_1, z_2], ...)$ with $k$ variables, we have to create all $2^k$ combinations of interval endpoints $C_i \in \{x_1, x_2\} \times \{y_1, y_2\} \times \{z_1, z_2\} \times \cdots$ , evaluate their function values $R_i = f(C_i)$ and select $R = [\min R_i, \max R_i]$ .
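As a rough base-R sketch of this endpoint evaluation (not the package's internal code), one could build the combinations with `expand.grid` and take the range of the results. The helper name `endpoint_range` and the example expression are assumptions made for illustration. Endpoint evaluation alone gives the exact range only when the function is monotone in each variable over its interval, which holds for the example chosen here.

```r
# Hypothetical helper: evaluates an expression at every combination of
# interval endpoints and returns c(min, max) of the results.
endpoint_range <- function(expr, intervals) {
  # One row per endpoint combination; 2^k rows for k variables.
  grid <- expand.grid(intervals)
  # Evaluating inside the data frame applies the expression to all rows.
  res <- eval(expr[[1]], grid)
  c(min(res), max(res))
}

expr  <- expression(x * y - z)
ivals <- list(x = c(1, 2), y = c(3, 4), z = c(0, 1))

nrow(expand.grid(ivals))     # 8 combinations for k = 3
endpoint_range(expr, ivals)  # 2 8
```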
The so-called dependency problem is a major obstacle to the application of interval arithmetic. It arises when the same variable occurs in several terms of a complicated and often nonlinear function. In these cases, the result interval can be significantly wider than the true range of the function, i.e. $\min R_i \ll \min f(x, y, z, ...)$ and $\max R_i \gg \max f(x, y, z, ...)$ . For an example, see here under “Dependency problem”.
A partial solution to this problem is to refine $R_i$ by dividing $[x_1, x_2]$ into $i$ smaller subranges to obtain the sequence $(x_1, x_{1.1}, x_{1.2}, ..., x_{1.i}, x_2)$ . Again, all combinations are evaluated as described above, resulting in a larger number of $R_i$ whose $\min R_i$ and $\max R_i$ may be closer to $\min f(x, y, z, ...)$ and $\max f(x, y, z, ...)$ , respectively. This is the “combinatorial sequence grid evaluation” approach. It works quite well in scenarios where monotonicity changes direction, and it obviates the need to create multivariate derivatives (Hessians) or use some multivariate minimization algorithm.
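The refinement step can be sketched in base R as follows. This is an illustrative approximation of the idea, not the implementation inside 'propagate': `grid_range` and its argument `n` (number of sequence points per variable) are hypothetical names. Note that the grid grows as `n^k` for `k` variables, so large sequences become expensive quickly.

```r
# Hypothetical helper: splits every interval into a sequence of n points,
# evaluates the expression on the full grid and returns c(min, max).
grid_range <- function(expr, intervals, n = 10) {
  # Equally spaced sequence from lower to upper bound for each variable.
  seqs <- lapply(intervals, function(iv) seq(iv[1], iv[2], length.out = n))
  # All combinations of sequence points (n^k rows).
  grid <- expand.grid(seqs)
  res <- eval(expr[[1]], grid)
  c(min(res), max(res))
}

expr  <- expression(x^2 - x + 1)
ivals <- list(x = c(-2, 1))

grid_range(expr, ivals, n = 2)    # 1 7: endpoints only, minimum is missed
grid_range(expr, ivals, n = 100)  # lower bound approaches the true minimum 0.75
```

With only the endpoints, the minimum at $x = 0.5$ is not seen; the finer grid places points close to it, so the lower bound moves towards 0.75.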
If the interval straddles zero, i.e. $x_1 < 0 < x_2$ , a zero is included in the middle of the sequence to avoid wrong results in case of even powers. For example, naive interval multiplication gives $[-1, 1]^2 = [-1, 1] \cdot [-1, 1] = [-1, 1]$ , when actually the correct interval is $[0, 1]$ , as exemplified by curve(x^2, -1, 1). Some examples to illustrate:
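Before the package examples, the zero-inclusion rule can be checked with a small base-R sketch. `seq_with_zero` is a hypothetical helper written for this illustration; how 'propagate' inserts the zero internally is not shown here and may differ.

```r
# Hypothetical helper: builds a sequence over an interval and adds 0
# when the interval has a negative lower and a positive upper bound.
seq_with_zero <- function(iv, n = 2) {
  s <- seq(iv[1], iv[2], length.out = n)
  if (iv[1] < 0 && iv[2] > 0) {
    s <- sort(unique(c(s, 0)))
  }
  s
}

x <- c(-1, 1)
range(outer(x, x, "*"))    # -1 1: the two factors are treated as independent
range(x^2)                 #  1 1: endpoints alone miss the minimum at 0
range(seq_with_zero(x)^2)  #  0 1: correct interval once 0 is included
```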

```r
library(propagate)

## Example 2: A complicated nonlinear model.
## Reduce sequence length to 2 => original interval
## for quicker evaluation.
EXPR2 <- expression(C * sqrt((520 * H * P)/(M * (t + 460))))
H <- c(64, 65)
M <- c(16, 16.2)
P <- c(361, 365)
t <- c(165, 170)
C <- c(38.4, 38.5)
DAT2 <- makeDat(EXPR2)
interval(DAT2, EXPR2, seq = 2)
## [1317.494, 1352.277]

## Example 5: Overestimation from dependency problem.
# Original interval with seq = 2 => [1, 7]
EXPR5 <- expression(x^2 - x + 1)
x <- c(-2, 1)
DAT5 <- makeDat(EXPR5)
interval(DAT5, EXPR5, seq = 2)
## [1, 7]

# Refine with large sequence => [0.75, 7]
interval(DAT5, EXPR5, seq = 100)
## [0.7502296, 7]

# Tallies with curve function.
curve(x^2 - x + 1, -2, 1)
```