EFT-based compensated arithmetic for spherical geometry (GCA intersection, point-in-face, face bounds)#1513
EFT-based compensated arithmetic for spherical geometry (GCA intersection, point-in-face, face bounds)#1513rajeeja wants to merge 9 commits into
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
We have an existed implementation for these functions in
uxarray/uxarray/utils/computing.py
Line 189 in 735a1dd
And I would suggest use other name instead of eft here, the term "EFT" specifically refers to "Error-Free" floating point operations, but the AccuCross and diff_of_productthemself is not completely Error-Free, is just more accurate than normal floating point cross (doubling the precision)
There was a problem hiding this comment.
Done: merged this into uxarray/utils/computing.py and renamed the docs/API section to compensated arithmetic. I only call two_sum/two_prod EFT now; the cross-product helpers are described as compensated algorithms.
|
|
||
|
|
||
| @njit(cache=True) | ||
| def _generate_lat_lon_bounds_local(face_vertices, z_min, z_max, snap_tol_deg): |
There was a problem hiding this comment.
For a better vectorization, consider keeping the current design in https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L107:
It uses lots of masking and avoid branching for computation steps.
There was a problem hiding this comment.
Done: rewrote this path closer to the AccuSphGeom structure. The local bounds computation now uses mask-selection for extrema/snap decisions instead of branching through separate cases.
There was a problem hiding this comment.
The implementation for this entire function involves lots of new branching and operations that are not existed in the AccuSphGeom
There was a problem hiding this comment.
The remaining branching in _generate_lat_lon_bounds_local and _generate_lat_lon_bounds_pole is UXarray-specific post-kernel formatting, not part of the EFT computation path. These functions convert the z-extrema from _face_location_info into degree lat/lon bounds using UXarray's encoding conventions (antimeridian signaled by lon_min > lon_max, pole-enclosing faces returned as 0/360). AccuSphGeom does not have an equivalent layer since it uses different output conventions. Both functions are now documented as UXarray-specific formatting steps, clearly separated from the EFT kernel.
|
|
||
|
|
||
| @njit(cache=True) | ||
| def _generate_lat_lon_bounds_pole(face_vertices, label, z_min, z_max, snap_tol_deg): |
There was a problem hiding this comment.
Same here, consider keeping the structure here: https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L159
use masks instead of branching as much as possible for the vectorization purpose
There was a problem hiding this comment.
Done: same cleanup here. The polar bounds path now keeps the AccuSphGeom-style local/polar structure and uses mask-selection for the snap logic.
|
|
||
|
|
||
| @njit(cache=True) | ||
| def _face_location_info(face_vertices, polar_cap_z): |
There was a problem hiding this comment.
The branching here will probably slow down the vectorization, consider keeping the structure here:
https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L49
There was a problem hiding this comment.
Done: _face_location_info now follows the AccuSphGeom face-location flow. The per-edge extrema use mask-selection; only the final face classification branches remain.
| @njit(cache=True) | ||
| def _normalize_pair(x_hi, y_hi, z_hi, x_lo, y_lo, z_lo): | ||
| """Normalize an (hi, lo) compensated vector, returning the unit vector and magnitude.""" | ||
| x = x_hi + x_lo |
There was a problem hiding this comment.
This _normalize_pair is not utilizing the EFT and it will have the same precision as the direct floating point precision. And the normalization is one of the big killer for the precision.
There was a problem hiding this comment.
Also if the input are normalized, then both the gca_gca_intersection and the gca_constLat_intersection will return the normalized results already
There was a problem hiding this comment.
And just in case you want an "accurately calculated norm", it is const auto sum = numeric::sum_of_squares_c<T, 3>(v.hi, v.lo); const auto norm = numeric::acc_sqrt_re(sum.hi, sum.lo);
There was a problem hiding this comment.
Done: removed _normalize_pair. The intersection code now keeps compensated normals/intersection vectors, and the baseline near-tangent cases pass.
There was a problem hiding this comment.
Done. _normalize_pair is removed. For _accux_gca, the direction vector is normalized using plain math.sqrt on the collapsed scalars rather than sum_of_squares_c + acc_sqrt_re — we found that passing collapsed scalars as hi/lo pairs to _sum_sq_c3 misrepresents their error structure and actually degrades accuracy on the baseline suite (pair_id=8 goes from <1e-15 to ~7e-10 error). The collapsed norm is sufficient because the unit-sphere error is dominated by the accucross_pair step, not the normalization. This is documented in a comment in _accux_gca.
| @@ -301,139 +314,198 @@ def _gca_gca_intersection_cartesian(gca_a_xyz, gca_b_xyz): | |||
|
|
|||
| @njit(cache=True) | |||
There was a problem hiding this comment.
consider keeping the entire structure from here : https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_gca_intersection.hpp#L31
These EFT-fused operations are extremely sensitive for each operations and order. The accuracy can only be guaranteed iff following the exact algorithm
There was a problem hiding this comment.
Done: rewrote GCA-GCA around the AccuSphGeom structure: accucross normals, accucross_pair for the intersection direction, then candidate filtering with on_minor_arc.
|
|
||
|
|
||
| @njit(cache=True) | ||
| def gca_const_lat_intersection(gca_cart, const_z): |
There was a problem hiding this comment.
The algorithm is almost very stable around floating point precision. Within the current Uxarray Error tolerance, you almost don't have to use any other safety pre-check other than if the input are valid (The relative error bound is 3*\sqrt{1-constz^2}*machine_epsilon as long as it's constZ is at least 10^15 from the equator) . And probably the only check needed Then only check you can make is if the const latitude is at the equator . Again, make sure to follow the entire algorithm structure from below. Such precision is only guaranteed
iff it follows the exact algorithm here
https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_constlat_intersection.hpp#L33
There was a problem hiding this comment.
But you can use the same check as in https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_gca_intersection.hpp#L51
to remove any invalid points
There was a problem hiding this comment.
Done: rewrote const-lat around the AccuSphGeom accux_constlat flow and filter invalid candidates after computing them.
| nx = nx_hi + nx_lo | ||
| ny = ny_hi + ny_lo | ||
| nz = nz_hi + nz_lo | ||
| denom = nx * nx + ny * ny |
There was a problem hiding this comment.
Denome should be calculated from
const T denom = s2.hi + s2.lo;
where
const auto s2 = numeric::sum_of_squares_c<T, 2>( {normal.hi[0], normal.hi[1]}, {normal.lo[0], normal.lo[1]});
There was a problem hiding this comment.
Done: denom now comes from _sum_sq_c2(...) as s2_hi + s2_lo.
| @@ -301,139 +314,198 @@ def _gca_gca_intersection_cartesian(gca_a_xyz, gca_b_xyz): | |||
|
|
|||
| @njit(cache=True) | |||
| def gca_gca_intersection(gca_a_xyz, gca_b_xyz): | |||
There was a problem hiding this comment.
Consider keeping the entire structure from here https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_gca_intersection.hpp#L31.
These algorithms are extremely sensitive to the operations and order, the accuracy is only guaranteed iff we follow the exact same algorithm described
There was a problem hiding this comment.
Done: same GCA-GCA rewrite as above, following the AccuSphGeom operation order much more closely. Added baseline tests for the near-tangent cases too.
| x2_at_const_z = np.isclose( | ||
| x2[2], const_z, rtol=ERROR_TOLERANCE, atol=ERROR_TOLERANCE | ||
| ) | ||
| # 1. Endpoint coincidence with the latitude line. |
There was a problem hiding this comment.
This case is still valid and calculable with the new algorithm. And we can improve the vectorization by only use mask-selection after the intersection is calculated to snap the intersection point with the endpoints
There was a problem hiding this comment.
Done: removed the endpoint early return. The code computes candidates first, then snaps near-endpoint results afterward.
| z_max = extreme_gca_z(gca_cart, extreme_type="max") | ||
|
|
||
| # Check if the constant latitude is within the GCA range | ||
| if not in_between(z_min, const_z, z_max): |
There was a problem hiding this comment.
I am not sure if this early exist will counter-effect the branching it brings to the vectorization
There was a problem hiding this comment.
Done: removed the endpoint pre-exits. I kept only invalid denom / negative-discriminant exits.
| elif p2_intersects_gca: | ||
| res[0] = p2 | ||
| # 4. Solve for the two candidate points on the latitude circle. | ||
| r2 = 1.0 - const_z * const_z |
There was a problem hiding this comment.
This is not the new EFT-fused algorithmn
There was a problem hiding this comment.
Done: replaced this with the compensated s2/s3, two_prod, cdp4, and acc_sqrt_re flow from AccuSphGeom.
|
@hongyuchen1030 Thanks for the review. The point-in-polygon predicate and
|
| tiers — an EFT filter (what this module implements), Shewchuk adaptive | ||
| predicates for results that fall inside the filter threshold, and a geogram | ||
| exact-arithmetic fallback. This port implements only the EFT tier. For | ||
| non-degenerate inputs in double precision this is sufficient; callers that |
There was a problem hiding this comment.
We're technically still unable to claim "For non-degenerate inputs in double precision this is sufficient". We can claim "This is twice more accurate than the direct floating point implementation without computation overhead with vectorization and parralization"
There was a problem hiding this comment.
If we don't use any adaptive arithmetic, we cannot claim for robustness here. We can only claim our point-in-face use the new algorithm that are more accurate (if we use any EFT operations)
There was a problem hiding this comment.
Fixed in commit 002b9fe. Removed all claims of 'sufficient', 'non-degenerate inputs', and 'robustness'. The module docstring now says the compensated routines are roughly twice as accurate as direct floating-point equivalents; robustness against all degenerate inputs would require adding an adaptive predicate or exact-arithmetic fallback tier. Same language used for the point-in-face docstring.
| predicates for results that fall inside the filter threshold, and a geogram | ||
| exact-arithmetic fallback. This port implements only the EFT tier. For | ||
| non-degenerate inputs in double precision this is sufficient; callers that | ||
| need to handle geometrically degenerate inputs (coincident arcs, a query |
There was a problem hiding this comment.
These edge cases should be included in the test cases already. These scenarios don't have more risks than the "normal input" here. (They probably have the same risk since the intersection point is a newly constructed point)
There was a problem hiding this comment.
Agreed. The denom==0 and planar_sq<0 edge cases propagate as inf/NaN through the kernel and are correctly handled by the isfinite mask in the status layer — no special-casing needed. The guards were removed from the kernel in commit 56c6ce1; only the status layer classifies them.
|
|
||
| # Parameter along the arc at which z is extremal (matches C++ get_face_location_info). | ||
| denom = (z1 + z2) * (d - 1.0) | ||
| a_raw = (z1 * d - z2) / denom if denom != 0.0 else -1.0 |
There was a problem hiding this comment.
The a = min(max(a_raw, 0.0), 1.0) should be able to prevent the divide by zero here. But we can keep it just in case
There was a problem hiding this comment.
Kept as-is. The denom==0 guard sets a_raw=-1 so the clamp produces a=0 (use endpoint), which is safe. Matches your comment — kept just in case, and the guard costs nothing.
| z_max = z_max_candidate | ||
| if z_min_candidate < z_min: | ||
| z_min = z_min_candidate | ||
|
|
There was a problem hiding this comment.
Consider utilizing the boolean operation from AccuSphHGeom to reduce branching
const MaskType<T> north_pole_candidate_mask = (face_z_max >= polar_cap_z);
const MaskType<T> south_pole_candidate_mask = (face_z_min <= -polar_cap_z);
const MaskType<T> local_mask = !(north_pole_candidate_mask | south_pole_candidate_mask);
There was a problem hiding this comment.
Done in commit 002b9fe. The label computation now uses pure integer multiplication instead of boolean branching: label = north_pole_candidate * _FACE_LOC_NORTH_POLAR + (1 - north_pole_candidate) * south_pole_candidate * _FACE_LOC_SOUTH_POLAR. The unused local variable was also removed.
|
|
||
|
|
||
| @njit(cache=True) | ||
| def _lon_bounds_from_vertices(face_vertices): |
There was a problem hiding this comment.
We probably don't need this work around anymore with the current latlon bounds implementation
There was a problem hiding this comment.
This function is still needed and cannot be removed. UXarray uses a lon_min > lon_max encoding to signal antimeridian-crossing faces throughout the bounds and cross-section APIs — AccuSphGeom uses a union-of-intervals convention instead. The largest-gap algorithm translates between them. Removed the snap logic (which you flagged separately) but the antimeridian detection itself must stay. Added a docstring explaining this.
|
|
||
|
|
||
| @njit(parallel=True, nogil=True, cache=True) | ||
| def constant_lat_intersections_no_extreme(lat, edge_node_z, n_edge): |
There was a problem hiding this comment.
Why we need to seperate the extreme case here, and what does the "extreme" mean here
There was a problem hiding this comment.
constant_lat_intersections_no_extreme and constant_lon_intersections_no_extreme are pre-existing edge screeners — fast O(n) passes over all edges using only endpoint z/lon values, used by Grid.get_edges_at_constant_latitude/longitude before the expensive GCA kernel runs. 'No extreme' means arc interior extrema along the great circle are not considered. These are completely separate from the AccuSphGeom EFT stack and were not changed in this PR. Added a block comment before them to make this clear.
| n2x = n2x_hi + n2x_lo | ||
| n2y = n2y_hi + n2y_lo | ||
| n2z = n2z_hi + n2z_lo | ||
| if ( |
There was a problem hiding this comment.
The degeneracy check like this will greatly impact the vectorization performance, any check like this should be isolated from the intersection computation kernel. And the AccuXGCA itself is able to handle extremely short arcs
There was a problem hiding this comment.
Done. Degeneracy checks (denom==0, planar_sq<0) are handled by letting NaN/inf propagate through the kernel — the status layer uses isfinite masks to classify them without any branching in the hot path. The _accux_gca and _accux_constlat kernels contain no if/else guards; only the 1.0/vn if vn!=0.0 else np.inf guard remains, which produces inf so the isfinite mask rejects the candidate without branching.
| and math.isfinite(vn) | ||
| ): | ||
| # Parallel (coplanar) arcs: check whether endpoints of one lie on the other. | ||
| if on_minor_arc(v0, w0, w1): |
There was a problem hiding this comment.
Again, all these if-else branch will impact the vectorization behavior, that's why the original try_gca_gca_intersection only use mask/boolean operations. The point is, we do not try to prevent the NaN or Divide by zero in the intersection kernel, these errors will be recorded in the "status" so an outside dispatch function will know how to handle each case.
There was a problem hiding this comment.
Done in commits 51c00c2 and 002b9fe. Both _try_gca_gca_intersection and _try_gca_const_lat_intersection now use integer mask arithmetic throughout — no if/else in the hot path. pos_fin = int(isfinite(...)), validity via pos_valid = pos_fin * pos_on_a * pos_on_b, point selection via pos_mask * pos + neg_mask * neg, status via both + none * 2. The only remaining guards wrap on_minor_arc calls to prevent calling it with inf inputs, which is unavoidable.
| p1_intersects_gca = point_within_gca(p1, gca_cart[0], gca_cart[1]) | ||
| p2_intersects_gca = point_within_gca(p2, gca_cart[0], gca_cart[1]) | ||
| if planar_sq < 0.0: | ||
| return res |
There was a problem hiding this comment.
Isolate these if-else branch outside of the computation kernel
There was a problem hiding this comment.
Done. All branching is now in L3 (gca_gca_intersection / gca_const_lat_intersection) — the dispatcher layer. L1 kernels (_accux_gca, _accux_constlat) are branch-free. L2 (_try_gca_gca_intersection, _try_gca_const_lat_intersection) uses integer mask arithmetic with no if/else in the computational path.
| # deduplication in the caller works correctly. Matches Hongyu's suggestion | ||
| # of mask-selection to snap after computing rather than branching out early. | ||
| _snap_sq = 1e-14 # distance² ≈ (1e-7)² — well above algorithm error (~1e-15) | ||
| for xe in (x1, x2): |
There was a problem hiding this comment.
Snapping involving branching should be isolated outside of the computation kernels. The intersection computation kernels should not include any branching and always stay in the SIMD-packed vectorization friendly form
There was a problem hiding this comment.
Done. _snap_const_lat_endpoint lives entirely in L3 (gca_const_lat_intersection) — it is called after the kernel and status layer have completed, never inside _accux_constlat or _try_gca_const_lat_intersection. The L1 and L2 layers remain branch-free and kernel-pure.
| res[0, 1] = p2[1] | ||
| res[0, 2] = p2[2] | ||
|
|
||
| return res |
There was a problem hiding this comment.
The implementation in
include/accusphgeom/constructions/gca_constlat_intersection.hpp
was intentionally designed with three separate layers, and I think it is important to keep these layers conceptually and structurally separated.
-
Core computation kernel:
accux_constlatThis is the numerical kernel. It contains the carefully designed AccuSphere algorithm for computing the great-circle-arc and constant-latitude intersection.
This layer should stay isolated and intact. It is designed to be SIMD-packing friendly, branch-free in the hot path, and easy to reason about for accuracy, performance, and future optimization. This function should not be mixed with high-level filtering, status interpretation, or UXarray-specific logic.
-
SIMD-friendly batch API:
try_gca_constlat_intersectionThis is the API intended for heavy computation.
It is still SIMD-packing friendly and is designed to compute the full matrix of possible intersection results from the input faces or edge lists. It does not immediately discard invalid intersections. Instead, it returns the computed candidate points together with a status flag indicating whether each result is valid.
This is the layer we want to use for large-scale vectorized computation, because it keeps the computation uniform. Even invalid candidates are part of the output matrix, and validity is represented by status instead of control flow.
-
Dispatcher / convenience API:
gca_constlat_intersectionThis is the higher-level user-facing dispatcher.
This layer is allowed to branch. It reads the status returned by
try_gca_constlat_intersection, filters out invalid results, and returns only the valid intersection points. This is useful as a lightweight convenience API, especially when the caller wants clean geometry results instead of the full vectorized computation matrix.
The important point is that these three layers serve different purposes.
The core kernel should focus only on accurate numerical computation. The batch API should focus on uniform, SIMD-friendly execution. The dispatcher should handle branching, filtering, and convenience behavior.
Mixing these layers together is not ideal because it makes the heavy computation path harder to vectorize, harder to optimize, and harder to verify numerically. Once filtering, branching, and application-specific logic are pushed into the core kernel or SIMD batch layer, the implementation becomes less predictable and less suitable for large-scale computation.
For this kind of heavy numerical geometry kernel, the best practice is usually:
- keep the low-level numerical kernel pure, isolated, and branch-minimized;
- keep the batch/vectorized API uniform and status-based;
- move branching, filtering, and user-facing convenience behavior to a separate dispatcher layer;
- avoid mixing UXarray-specific data handling with the core computational algorithm.
So for UXarray integration, I think the preferred workflow should be:
- Use
try_gca_constlat_intersectionfor the main large-scale computation. - Preserve the full output matrix and status information during the vectorized computation stage.
- Apply filtering only afterward, either through
gca_constlat_intersectionor through UXarray-side post-processing logic.
That way, we preserve the original AccuSphere design: accurate computation first, SIMD-friendly batch execution second, and lightweight branching/filtering only at the outer layer.
There was a problem hiding this comment.
Understood — the key point is the design separation, not the C++ specifics. We've now split both intersection functions into the three layers you described:
_accux_constlat/_accux_gca— pure numerical kernels, no branching, no validity filtering, compensated operations in the exact AccuSphGeom order_try_gca_const_lat_intersection/_try_gca_gca_intersection— batch/status layer using integer mask arithmetic (pos_valid * (1 - neg_valid)) to select candidates without branching in the hot path; returns(point, status, pos, neg)gca_const_lat_intersection/gca_gca_intersection— outer dispatcher that reads status, applies endpoint snapping, and packages UXarray's NaN-filled result format; all UXarray-specific branching lives here
|
|
||
|
|
||
| @njit(cache=True) | ||
| def _point_in_polygon_sphere(q, polygon): |
There was a problem hiding this comment.
We should be able to implement the tier 3 SoS robustness here: Location detail::point_in_polygon_sphere_impl,https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/src/algorithms/point_in_polygon_sphere.cpp#L359, and https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/src/algorithms/point_in_polygon_sphere.cpp#L345-L357
It's better for us to use the SoS (Simulation of Simplicity) to handle the near-degenerate cases. The vertex id is the vertex index in the data-array, And we will assign the query point q and the end-point r a unique id.
There was a problem hiding this comment.
Agreed that SoS would give stronger robustness guarantees. The blocker is that AccuSphGeom's SoS implementation (point_in_polygon_sphere_impl) requires a globally unique integer ID for each vertex and for the query point — these IDs are used as a symbolic tiebreaker when orient3d returns exactly zero. UXarray's polygon representation passes vertex coordinates as arrays without stable global IDs, so we cannot implement SoS without a larger API change. We handle near-degenerate rays by perturbing R and restarting the loop (up to 4 retries), which is less elegant but works in practice. This is documented as explicit future work in the _point_in_polygon_sphere docstring.
|
This does not fully ports AccuSphGeom because we do not have:
|
I understand that UXarray is not directly porting the C++ SIMD mask behavior as-is. My point is that the mask-based workflow is not exclusive to C++. The key idea is the vectorized computation design: we keep the heavy computation path uniform, compute all candidate results in a batch, and use a status or mask array to indicate which results are valid. In C++, that may correspond to SIMD masks. In Python/NumPy/Numba, the same design can be represented by boolean masks, status arrays, or validity flags. So the important part is not the specific C++ SIMD implementation detail. The important part is preserving the separation between:
You can refer to my earlier reply here for the intended vectorized implementation workflow: This design pattern is still relevant for Python. Even if UXarray does not use the exact C++ SIMD mask behavior, it should still preserve the same vectorized workflow using Python-side masks or status arrays, rather than mixing branching and filtering into the core computation path. |
002b9fe to
3fe8f31
Compare
…rite intersections, add 241 baseline testsgit status! - most came from accusphere
…us/mask, dispatcher
3fe8f31 to
94c06fe
Compare
Closes #1509
Ports the compensated-arithmetic tier from AccuSphGeom into UXarray's Numba spherical geometry stack, replacing cancellation-prone naive cross-product paths with compensated primitives throughout arc intersection, point-in-face, and face bounds.
What changed
utils/computing.py— EFT and compensated primitives:two_sum,two_prod,diff_of_products,accucross,accucross_pair,acc_sqrt_re, and fixed-size compensated dot/sum-of-squares helpers. Breaking change: the oldcross_fmaanddot_fmafunctions (which depended on the optionalpyfmapackage) are removed. Any code importing those names directly will get anImportError; useaccucrossand the_cdp*helpers instead.grid/arcs.py— new_orient3d_on_sphere_value,orient3d_on_sphere,on_minor_arcpredicates using compensated arithmetic.grid/intersections.py—gca_gca_intersectionandgca_const_lat_intersectionrewritten as three-layer stacks: pure numerical kernel (L1), integer-mask status layer (L2), UXarray dispatcher (L3). Removed dead_gca_gca_intersection_cartesianshim.grid/point_in_face.py—_face_contains_pointdelegates to a new ray-casting SPIP implementation (_point_in_polygon_sphere) usingorient3d_on_sphereinstead of the old winding-number viaarctan2.grid/bounds.py— new_construct_face_bounds_array_gcapath for pure-GCA grids, computing interior arc z-extrema correctly via the compensated kernel. Old path retained for latlon/mixed-edge grids.What this is not
Not a full port of AccuSphGeom's robustness stack. Excluded: Shewchuk adaptive predicates, Simulation of Simplicity (requires per-vertex global IDs not available in UXarray's polygon representation), geogram exact fallback. This implements only the compensated-arithmetic tier — roughly twice as accurate as naive floating-point on near-tangent cases, with no measurable runtime overhead on intersection kernels.
Performance
gca_gca_intersectiongca_const_lat_intersectionGrid.bounds(face bounds, cached once)Accuracy
Measured on the AccuSphGeom baseline suite (31 near-tangent GCA-GCA pairs, angles down to 10⁻⁵°):
Tests
241 new baseline regression tests ported from the AccuSphGeom C++ suite covering near-tangent GCA-GCA pairs, GCA/constant-latitude cases, and spherical point-in-polygon.