In numerical analysis, catastrophic cancellation[1][2] is the phenomenon that subtracting good approximations to two nearby numbers may yield a very bad approximation to the difference of the original numbers.
For example, if there are two studs, one 253.51 cm long and the other 252.49 cm long, and they are measured with a ruler that is good only to the centimeter, then the approximations could come out to be 254 cm and 252 cm.
These may be good approximations, in relative error, to the true lengths: each approximation is in error by less than 2% of the true length, since |254 cm − 253.51 cm|/253.51 cm < 2% and |252 cm − 252.49 cm|/252.49 cm < 2%.
However, if the approximate lengths are subtracted, the difference will be 254 cm − 252 cm = 2 cm, even though the true difference between the lengths is 253.51 cm − 252.49 cm = 1.02 cm.
The difference of the approximations, 2 cm, is in error by almost 100% of the magnitude of the difference of the true values, 1.02 cm.
Catastrophic cancellation is not affected by how large the inputs are—it applies just as much to large and small inputs.
It depends only on how large the difference is, and on the error of the inputs.
Exactly the same error would arise by subtracting 52 cm from 54 cm as approximations to 52.49 cm and 53.51 cm, or by subtracting 2.00052 km from 2.00054 km as approximations to 2.0005249 km and 2.0005351 km.
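This behavior is easy to reproduce; the following Python sketch (with illustrative variable names) rounds the true lengths to the nearest centimeter and subtracts, at both scales:

    # True stud lengths, in centimeters, from the example above.
    true_a, true_b = 253.51, 252.49

    # A ruler good only to the centimeter rounds each length to an integer.
    approx_a, approx_b = round(true_a), round(true_b)   # 254 and 252

    print(approx_a - approx_b)   # 2 cm
    print(true_a - true_b)       # 1.02 cm (up to tiny binary rounding noise)

    # The same error at a different scale: kilometer-sized inputs rounded
    # to the nearest 0.00001 km (= 1 cm) lose just as much in the difference.
    true_a_km, true_b_km = 2.0005351, 2.0005249
    approx_a_km, approx_b_km = round(true_a_km, 5), round(true_b_km, 5)
    print(approx_a_km - approx_b_km)   # about 2e-05 km (2 cm), versus true 1.02 cm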
Catastrophic cancellation may happen even if the difference is computed exactly, as in the example above—it is not a property of any particular kind of arithmetic like floating-point arithmetic; rather, it is inherent to subtraction, when the inputs are approximations themselves. Indeed, in floating-point arithmetic, when the inputs are close enough, the floating-point difference is computed exactly, by the Sterbenz lemma—there is no rounding error introduced by the floating-point subtraction operation.
Subtracting nearby numbers in floating-point arithmetic does not always cause catastrophic cancellation, or even any error—by the Sterbenz lemma, if the numbers are close enough the floating-point difference is exact.
But cancellation may amplify errors in the inputs that arose from rounding in other floating-point arithmetic.
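A minimal Python illustration of this point, using the classic decimal constants 0.1, 0.2, and 0.3 (an illustration of the phenomenon, not an example from the text): each input carries a rounding error from decimal-to-binary conversion, the subtraction itself is exact by the Sterbenz lemma, and yet the result bears no resemblance to the true difference of the decimal values.

    # fl(0.1) + fl(0.2): both conversions and the addition round, giving
    # 0.30000000000000004 rather than 0.3.
    x = 0.1 + 0.2
    # fl(0.3): also not exactly 0.3.
    y = 0.3

    # x and y are within a factor of two of each other, so by the Sterbenz
    # lemma the floating-point subtraction below is exact: no new error.
    print(x - y)   # 5.551115123125783e-17, although 0.1 + 0.2 - 0.3 = 0

The subtraction introduced no rounding error of its own; it merely exposed the errors already present in the inputs, and the result is 100% wrong relative to the true difference, which is zero.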
Example: Difference of squares
Given numbers x and y, the naive attempt to compute the mathematical function

x² − y²

by the floating-point arithmetic

fl(fl(x·x) − fl(y·y))

is subject to catastrophic cancellation when x and y are close in magnitude, because the subtraction can expose the rounding errors in the squaring.
The alternative factoring

(x + y)(x − y),

evaluated by the floating-point arithmetic

fl(fl(x + y)·fl(x − y)),

avoids catastrophic cancellation because it avoids introducing rounding error leading into the subtraction.[2]
For example, if

x = 1 + 2⁻²⁹ and y = 1 + 2⁻³⁰,

then the true value of the difference is

x² − y² = 2⁻²⁹ + 2⁻⁵⁸ − 2⁻⁶⁰ = 1.8626451518330422…×10⁻⁹.

In IEEE 754 binary64 arithmetic, evaluating the alternative factoring fl(fl(x + y)·fl(x − y)) gives the correct result exactly (with no rounding), but evaluating the naive expression fl(fl(x·x) − fl(y·y)) gives the floating-point number

2⁻²⁹ = 1.8626451492309570…×10⁻⁹,

of which less than half the digits are correct; the remaining digits are wrong because they reflect the missing terms 2⁻⁵⁸ − 2⁻⁶⁰, lost due to rounding when calculating the intermediate squared values.
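This worked example can be checked in a few lines of Python, whose floats are IEEE 754 binary64 on typical platforms (the printed values are what shortest-round-trip formatting shows for the two results):

    x = 1.0 + 2.0**-29
    y = 1.0 + 2.0**-30

    # Naive: squaring rounds away the tiny terms 2**-58 and 2**-60, and the
    # subtraction then exposes that loss.
    naive = x*x - y*y
    print(naive)      # 1.862645149230957e-09, which is exactly 2**-29

    # Factored: x + y, x - y, and their product are all exact for these
    # inputs, so the result is the true difference.
    factored = (x + y) * (x - y)
    print(factored)   # 1.8626451518330422e-09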
Example: Complex arcsine
When computing the complex arcsine function, one may be tempted to use the logarithmic formula directly:

arcsin(z) = −i log(iz + √(1 − z²)),

since this identity holds for every complex z.
However, suppose z = iy for real y > 0.
Then iz = −y and √(1 − z²) = √(1 + y²), so the formula must effectively compute the difference between them; call it ε = √(1 + y²) − y, a very small difference, nearly zero, since √(1 + y²) ≈ y when y is large.
If √(1 + y²) is evaluated in floating-point arithmetic giving

fl(√(1 + y²)) = √(1 + y²)·(1 + δ)

with relative error δ, where fl denotes floating-point rounding, then computing the difference

√(1 + y²)·(1 + δ) − y

of two nearby numbers, both very close to y, may amplify the error δ in one input by a factor of √(1 + y²)/ε ≈ 2y², a very large factor because ε was nearly zero.
For instance, if y = 10⁶, the true value of arcsin(z) is approximately 14.5086577385…·i, but evaluating the naive logarithmic formula in IEEE 754 binary64 arithmetic may give approximately 14.5086501·i, of which only about seven of the sixteen digits are correct; the remainder reflect the rounding error of the square root, amplified by the cancellation.
In the case of z = iy for y > 0, using the identity arcsin(z) = −arcsin(−z) avoids cancellation, because for −z the formula requires i(−z) + √(1 − z²) = y + √(1 + y²), in which both terms are positive, so the subtraction is effectively addition with the same sign, which does not cancel.
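A Python sketch of both evaluations with the cmath module, using the same illustrative value y = 10⁶ as above; the trailing digits of the naive result depend on the accuracy of the platform's complex square root and logarithm:

    import cmath

    z = 1e6j   # z = iy with y = 10**6 (an illustrative value)

    # Naive: iz + sqrt(1 - z**2) = sqrt(1 + y**2) - y subtracts two nearby
    # numbers, both close to 10**6, amplifying the square root's rounding.
    naive = -1j * cmath.log(1j*z + cmath.sqrt(1 - z*z))

    # Using arcsin(z) = -arcsin(-z): now i(-z) + sqrt(1 - z**2) evaluates
    # y + sqrt(1 + y**2), an addition of same-sign numbers, so nothing cancels.
    rewritten = 1j * cmath.log(1j*(-z) + cmath.sqrt(1 - z*z))

    print(naive)          # roughly 14.5086501j; digits past the seventh are garbage
    print(rewritten)      # roughly 14.508657738524j
    print(cmath.asin(z))  # library arcsine, for comparison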