dbbcsd/sbbcsd/cbbcsd/zbbcsd: fix insufficiently accurate U2 in CS dec…#1293

Open
jschueller wants to merge 1 commit into Reference-LAPACK:master from jschueller:issue965
…omposition

The Golub-Reinsch-SVD-style iteration in xBBCSD used an absolute convergence threshold THRESH ~ 90*EPS to decide when off-diagonal bulges are negligible and can be skipped. For rows where the diagonal entries are small (tiny singular values in the ratio), this absolute threshold is too large relative to the diagonal, causing the algorithm to stop chasing bulges prematurely and producing inaccurate singular vectors (||X21 - U2 D2 V^*|| up to 50*EPS ||X21||).

Fix: change all bulge-convergence checks (RESTART flags in the inner loop, initial bulge-chase decisions, and the IMAX-1 cleanup) from absolute to relative by scaling THRESH by the adjacent diagonal entries:

RESTART11 = |B11E|^2+|BULGE|^2 <= (THRESH * MAX(|B11D(I-1)|,|B11D(I)|,UNFL))^2

This mirrors DBDSQR's relative convergence check |E| <= TOL*|D|. Applied to all 4 bidiagonal blocks (B11/B21/B12/B22) at all 3 check-points in the iteration.
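
For illustration only, the difference between the two checks can be sketched in Python. This is not the Fortran in the patch: the function names are hypothetical, THRESH is set to the approximate value 90*EPS quoted above, EPS and UNFL are taken from Python's double-precision limits as stand-ins for the DLAMCH values, and the form of the old absolute check is an assumption based on the description.

```python
import sys

# Assumed stand-ins for the LAPACK machine constants (not read from DLAMCH).
EPS = sys.float_info.epsilon      # about 2.2e-16
UNFL = sys.float_info.min         # smallest normalized double
THRESH = 90.0 * EPS               # approximate value quoted in the description


def negligible_absolute(e, bulge, thresh):
    """Assumed old check: compare |e|^2 + |bulge|^2 with thresh^2 directly."""
    return e * e + bulge * bulge <= thresh * thresh


def negligible_relative(e, bulge, d_prev, d_cur, thresh, unfl):
    """New check: scale thresh by the larger adjacent diagonal entry,
    floored at unfl so the scale is never exactly zero."""
    scale = max(abs(d_prev), abs(d_cur), unfl)
    return e * e + bulge * bulge <= (thresh * scale) ** 2


# Tiny diagonal entries: the off-diagonal is about 1% of the diagonal,
# yet it is below the absolute threshold.
print(negligible_absolute(1e-14, 0.0, THRESH))                       # True
print(negligible_relative(1e-14, 0.0, 1e-12, 2e-12, THRESH, UNFL))   # False

# Diagonal entries of order one: the relative check still accepts it.
print(negligible_relative(1e-14, 0.0, 1.0, 0.5, THRESH, UNFL))       # True
```

With diagonal entries near 1e-12, the absolute check would skip an off-diagonal entry that is far from negligible at that scale, while the relative check keeps chasing it.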

Fixes #965

Development

Successfully merging this pull request may close these issues.

DBBCSD may compute insufficiently accurate singular vectors