Adam Fleischhacker’s question was impressive for me, because I’m using the R numerical integrate function. But I have not thought deeply about that. Now the time to dig it.

I planned an experiment. Reading the source code will reveal all questions, but it requires more time and patience. I love experiments than reading source codes.

- make a step function that have a square peak of unit width and height.

position of the peak is (offset, offset + 1). - integrate the step function by the range of (0, offset + 1).
- check the results by variable offset.

The value of integrate must be exactly 1. But the value of numerical integrate differs.

sf0 <- function(x, offset) ifelse(x < offset, 0, ifelse(x < offset + 1, 1, 0)) sf1 <- function(n, ...) unclass(integrate(function(x) sf0(x, n), 0, n + 1, ...)) sf2 <- function(n, ...) sapply(as.list(n), function(a) sf1(a, ...)$value) sf3 <- function(n, ...) sapply(as.list(n), function(a) sf1(a, ...)$subdivisions) sf4 <- function(n, ...) t(sapply(as.list(n), function(a) { b <- sf1(a, ...) c(value=b$value, subdivisions=b$subdivisions) }) ) Fig.1 <- function() { StepFunction <- function(x) sf0(x,3) curve(StepFunction, 0, 5, col='purple') } Fig.2 <- function() { par(mfrow=c(2,1), mar=c(4,4,1,1)) curve(sf2, 0, 700, xlab='offset', ylab='value', col='blue') curve(sf3, 0, 700, xlab='offset', ylab='subdivisions', col='red') }

At offset of 460, the value quickly turns to zero, and the subdivisions also drops to 1. This is the point of unreliability begins. The function cannot integrate the step function when the offset is greater than 459. At this offset, the integrand is 99.8% flat and the width of step peak is 0.2%. We must doubt the result, when the integrand has a very narrow peak that has a major contribution to the integrate while the other range is almost flat or periodical. In this case, the function is fooled, and thinks the entire range is flat.

> sf4(458:462) value subdivisions [1,] 0.9999569 21 [2,] 0.9999932 21 offset 459 [3,] 0.0000000 1 offset 460 [4,] 0.0000000 1 [5,] 0.0000000 1

The subdivisions can likely be used as a signal of reliability. That is a number of division to calculate the whole integration. So if the number is 1 or a few, the integrate function thought the integrand function is flat and calculate the integration as a single rectangle. If the integrand is really flat, the value may take 1 and the value is reliable. At least, checking the lower value of subdivisions is a good way to ensure the integrate value.

I colored the result of integrate by this value. The red color is warnings. At the lowest point (offset=0), a red point is seen, but it is correct because the integrand is really a flat line. And the bad values are all colored by red.

Fig.3 <- function() { x<-sf4(0:700) y <- ifelse(x[,'subdivisions'] <= 1, 2, 1) plot(x[,'value'], col=c('#00ff0022','#ff000066')[y], cex=c(6,4)[y], pch=c(4,21)[y], xlab='offset', ylab='value') }

Advertisements

The stupidest thing...

Statistics, genetics, programming, academics

ЯтомизоnoR

R, Statistics

%d bloggers like this:

## Recent Comments