
Originally Posted by markandersen
The number of significant digits far exceeds the number of decimal places being shown. For example, a value of 0.000000045821 is a valid value (a probability density function, where the probability for all cells across the study area sum to 1), but would be rounded to 0.000000 as it's being displayed when I use the Get Raster Properties. When I sum the grid values, if there is this much rounding in the calculations, they will sum to much less than 1. That's what I mean by erroneous.
I think we might be miscommunicating about sig figs. 0.000000045821 has only five sig figs; the leading zeros don't contribute to the significant figures. There will be no internal rounding of this value.
The "Get Raster Properties" tool might be rounding to a fixed number of decimal places for display, but it is not the same code that sums the values, so its behavior is not relevant.
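To make the distinction between stored precision and displayed precision concrete, here is a small Python sketch. It assumes the grid stores 32-bit single-precision floats and that the display uses six fixed decimal places (as the 0.000000 in your example suggests); the struct round trip is only a stand-in for the grid's storage, not anything from the ESRI tools.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value (assumed grid storage)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.000000045821
stored = f32(value)

# What a fixed six-decimal display would show.
shown = f"{stored:.6f}"

# What is actually kept: all five significant figures.
kept = f"{stored:.5g}"

# Relative error introduced by storing the value in single precision.
rel_err = abs(stored - value) / value

print(shown, kept, rel_err)
```

The fixed-decimal display shows nothing but zeros, while the stored value still carries all five significant figures, with a relative error below one part in ten million.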
The main difficulty with summing a pdf on a grid occurs when the values vary by orders of magnitude. This is easy to see by doing the calculations on a hypothetical base-10 computer having only two significant digits (instead of on an actual base-2 computer, which has 24 significant binary digits in single precision). On such a computer a grid might have 900 values of 0.001 (stored as 1.0 * 10^-3), one value of 0.1 (stored as 1.0 * 10^-1), and 99 values of 0. These 1000 values sum to 1. The computer is capable of accurately performing computations like

    0.001 + 0.001 = 0.002

and

    0.01 + 0.001 = 0.011,

but using only two significant figures it will determine, for example, that

    0.1 + 0.001 = 0.1

because the correct sum of 0.101 must be rounded to two significant figures, yielding 0.10. Thus, in the worst case the sum would be performed as

    (...(((0.1 + 0.001) + 0.001) + 0.001) + ... + 0.001) + 0 + ... + 0 = 0.1

because every partial sum rounds back to 0.1. In reality the 0.1 is unlikely to appear first, so the error won't be this big, but it will still be substantial. It is attributable to the fact that the numbers in the sum vary too much in magnitude compared to the inherent precision of the computer's addition mechanism.
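If you want to replay the two-digit example yourself, Python's decimal module can imitate the hypothetical machine. This is only a sketch: the two-digit context and the helper name two_digit_sum are mine, for illustration.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Hypothetical base-10 machine that keeps two significant digits per addition.
ctx = Context(prec=2, rounding=ROUND_HALF_EVEN)

def two_digit_sum(values):
    """Add the values left to right, rounding every partial sum to two digits."""
    total = Decimal(0)
    for v in values:
        total = ctx.add(total, v)
    return total

small = [Decimal("0.001")] * 900
large = [Decimal("0.1")]
zeros = [Decimal("0")] * 99

# Worst case: the large value comes first, so every 0.001 is rounded away.
worst = two_digit_sum(large + small + zeros)

# Large value last: the small values accumulate, but only until they stall.
small_first = two_digit_sum(small + zeros + large)

# Exact reference using Python's default 28-digit decimal arithmetic.
exact = sum(large + small + zeros)

print(worst, small_first, exact)
```

With the large value first the total stays at 0.1. With it last the small values pile up only until they reach 0.10 and then stall, so the total is 0.2. The exact sum is 1.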
With ESRI floating grids, this will become an issue when numeric magnitudes vary by four or more orders of magnitude within a grid, approximately. (The threshold depends on grid size; the problem becomes more acute with larger grids.) One way to cope is to split the original grid into pieces by size of the values: create a grid of all values between 0.0001 and 1, for example (with zeros elsewhere); create another of all the values between 0.0001 and 0.00000001; etc. Sum the values in these grids separately--the sums will individually be fairly accurate--then sum the sums in VBA (or Python or whatever). In this fashion you won't have to pick the grids apart into individual cells (which I suspect might be a relatively slow operation).
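The split-and-sum idea can be sketched in Python as follows. This is an illustration under stated assumptions, not ESRI code: it assumes the grid holds 32-bit single-precision floats (emulated here with struct), it uses a plain list of made-up cell values in place of a grid, and the names f32, sum32 and binned_sum are hypothetical. In practice the per-range grids and their sums would come from the GIS tools, and only the final "sum the sums" step would be done in a script.

```python
import math
import struct
from collections import defaultdict

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def sum32(values):
    """Left-to-right sum with every partial sum rounded to single precision."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total

def binned_sum(values, decades_per_bin=4):
    """Group positive values by magnitude, sum each group, then sum the sums."""
    bins = defaultdict(list)
    for v in values:
        if v > 0:  # zeros contribute nothing to the total
            bins[math.floor(math.log10(v) / decades_per_bin)].append(v)
    # Each group is summed in single precision; the few group totals are
    # then combined in double precision.
    return math.fsum(sum32(group) for group in bins.values())

# Made-up pdf: one dominant cell plus 5000 tiny cells, true total 1.0.
cells = [0.9999] + [2e-8] * 5000

naive = sum32(cells)        # dominant cell first: the tiny cells are lost
binned = binned_sum(cells)  # tiny cells summed separately, then combined

print(naive, binned)
```

In this example the straightforward single-precision sum never moves off the dominant value, so it falls short of 1 by about 0.0001, while the binned total agrees with 1 to better than one part in a million.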