Skip to content

slarfgp/dlarfgp: avoid overflow when computing 1/ALPHA#1290

Open
jschueller wants to merge 1 commit into
Reference-LAPACK:masterfrom
jschueller:issue938
Open

slarfgp/dlarfgp: avoid overflow when computing 1/ALPHA#1290
jschueller wants to merge 1 commit into
Reference-LAPACK:masterfrom
jschueller:issue938

Conversation

@jschueller

Copy link
Copy Markdown
Contributor

In the general case (xnorm > eps*|alpha|, beta >= 0), the Householder reflector formula computes

ALPHA = -XNORM^2 / (alpha + beta)

which can be subnormal even when beta itself is safely above SMLNUM. When this subnormal ALPHA was used in the subsequent

CALL SSCAL( N-1, ONE / ALPHA, X, INCX )

the reciprocal overflowed (e.g. ALPHA = 2^-137 → ONE/ALPHA = 2^137 which exceeds SP max ~3.4e38).

Fix: guard the SSCAL with ABS(ALPHA) < SMLNUM. When ALPHA is very small, scale X by ONE/SMLNUM then by SMLNUM/ALPHA — both safe since ONE/SMLNUM = BIGNUM is below the overflow threshold and SMLNUM/ALPHA ≤ MAX_EXPONENT.

The complex variants (clarfgp/zlarfgp) are already safe because they compute 1/ALPHA via CLADIV/ZLADIV, which internally handles overflow/underflow without intermediate overflow.

Fixes #938

In the general case (xnorm > eps*|alpha|, beta >= 0), the
Householder reflector formula computes

  ALPHA = -XNORM^2 / (alpha + beta)

which can be subnormal even when beta itself is safely above
SMLNUM.  When this subnormal ALPHA was used in the subsequent

  CALL SSCAL( N-1, ONE / ALPHA, X, INCX )

the reciprocal overflowed (e.g. ALPHA = 2^-137 → ONE/ALPHA = 2^137
which exceeds SP max ~3.4e38).

Fix: guard the SSCAL with ABS(ALPHA) < SMLNUM.  When ALPHA is very
small, scale X by ONE/SMLNUM then by SMLNUM/ALPHA — both safe
since ONE/SMLNUM = BIGNUM is below the overflow threshold and
SMLNUM/ALPHA ≤ MAX_EXPONENT.

The complex variants (clarfgp/zlarfgp) are already safe because
they compute 1/ALPHA via CLADIV/ZLADIV, which internally handles
overflow/underflow without intermediate overflow.

Fixes Reference-LAPACK#938
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

xLARFGP unnecessarily overflows

2 participants