Skip to content

intrinsic-test: sve support #2160

Draft
davidtwco wants to merge 27 commits into
rust-lang:mainfrom
davidtwco:intrinsic-test-sve
Draft

intrinsic-test: sve support #2160
davidtwco wants to merge 27 commits into
rust-lang:mainfrom
davidtwco:intrinsic-test-sve

Conversation

@davidtwco

@davidtwco davidtwco commented Jun 15, 2026

Copy link
Copy Markdown
Member

This needs rust-lang/rust#158088 to be in nightly for CI to pass - it is already approved and in the bors queue. Parts of this have been split out in #2169, #2170, #2171, #2172, #2172 and #2173.

This patch contains all the changes necessary for intrinsic-test to generate tests for the SVE intrinsics:

  • Some of the commits are just general refactors to the tool, motivated by the commits that follow and that I felt were hard to justify on their own
  • Some of the commits are small changes I made along the way (e.g. forwarding args from intrinsic-test.sh to cargo test)
  • Some of the commits modify the definitions of intrinsics so that they pass tests - this only happened for a handful of intrinsics and the changes required to their definitions were very minor
  • Some of the commits update arm_intrinsics.json to add some additional constraints necessary to generate the right tests (we're making sure that these addl. constraints are reflected in the source of truth we use to generate that file internally)
  • Any remaining commits are changes to make the tool generate the correct tests, primarily:
    • Adding the arm_sve.h headers and appropriate target feature flags for C and Rust
    • Using the svld1 intrinsic when loading an argument of scalable vector type
      • There is a special case for svbool_t arguments which do not have an svld1 intrinsic, we load a svint8_t and use svcmpne_n_s8 against zero to produce a svbool_t from it
    • Generating test arrays for intrinsics that take boolean values and svbool_t values (though the earlier commits use the same svptrue_$ty predicate for these arguments initially)
    • Supporting calls to enum-typed const generic arguments (such as svpattern), using simple const functions that map integers from intrinsic-test constraints to the appropriate variant
    • Generating a local containing a predicate that is used by all intrinsics which require a predicate (just svptrue_$ty() to enable every lane)
    • Using svcmpeq_$ty and svptest_any to compare the Rust and C results instead of assert_eq!
      • There is a special-case for tuples of vectors, where multiple svcmpeq_$ty and svptest_any blocks are generated for each vector in the tuple, with svgetN calls to extract the appropriate vector from the result tuples
      • There is also a special-case for intrinsics that return svbool_t, where there is no equivalent svcmpeq_$ty call, so sveor_b_z (exclusive OR) and !svptest_any is used
      • There is yet another special case for scalable vectors of floats, where the pre-existing NanEqF* type cannot be used, so a (rust == c) || (isnan(rust) && isnan(c)) check is performed instead, via a handful of SVE intrinsics
  • I haven't been been able to test this on x86 to check that I haven't broken the test generation, but I don't think I've made any changes to the generated output that weren't in the Arm-specific parts

@rustbot

This comment was marked as resolved.

@davidtwco davidtwco force-pushed the intrinsic-test-sve branch 2 times, most recently from 95f286a to ed132a7 Compare June 17, 2026 10:15
Comment thread crates/intrinsic-test/src/common/intrinsic.rs
} = **self
{
let quad = if self.num_lanes() * bl > 64 { "q" } else { "" };
fn comparison_function(&self) -> String {

@sayantn sayantn Jun 17, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way to print the diff for SVE? I would imagine it is difficult due to SVE being non-const sized, but is there some way? That would massively improve debuggability. One approach I can suggest is have some small functions that convert from SVE vectors to &[T], e.g.

#[target_feature(enable = "sve")]
pub fn svfloat32_to_slice(a: &svfloat32_t) -> &[NanEqF32] {
    unsafe {
        core::slice::from_raw_parts(core::ptr::from_ref(a).cast(), svcntw() as usize)
    }
}

this might work, with significantly less complexity

View changes since the review

@davidtwco davidtwco Jun 18, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I've needed to debug them, I've disabled the invocation of intrinsic-test in intrinsic-test.sh and added these snippets to the generated tests that I wanted to debug.

For non-bool vectors, just replace the x0_val and x1_val with the vectors that you want to inspect, e.g. __rust_return_value or __c_return_value:

{
    let num_elems = svcnth() as usize;
    let mut x0_buf = Vec::with_capacity(num_elems);
    let mut x1_buf = Vec::with_capacity(num_elems);
    svst1_u16(__pred, x0_buf.as_mut_ptr(), x0_val);
    x0_buf.set_len(num_elems);
    svst1_u16(__pred, x1_buf.as_mut_ptr(), x1_val);
    x1_buf.set_len(num_elems);
    for i in 0..num_elems {
        let x0_val = x0_buf[i];
        dbg!(i, x0_val);
    }
    for i in 0..num_elems {
        let x1_val = x1_buf[i];
        dbg!(i, x1_val);
    }
}

Similarly for bool vectors:

{
    let num_elems = svcntb() as usize;
    let mut _op_val = Vec::with_capacity(num_elems);
    svst1_u8(op_val, _op_val.as_mut_ptr(), svdup_n_u8(1));
    _op_val.set_len(num_elems);
    for i in 0..num_elems {
        let _op_val_el = if _op_val[i] == 1 { true } else { false };
        dbg!(i, _op_val_el);
    }
}

I'm not sure if what you've suggested will work, happy to try. I'm more than happy to make some improvements here for debuggability of the tests when they do fail, but could that be left to a follow-up?

"
{intrinsics}
"#,
intrinsics = intrinsics.iter().format_with("", |intrinsic, fmt| {

@sayantn sayantn Jun 17, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor nit, but you can use something like intrinsic.specializations().for_each(|imm_values| fmt(...))

View changes since the review

@sayantn

sayantn commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

also, it would be nice if this can be split into a few focused PRs (I can help if you want)

@davidtwco

Copy link
Copy Markdown
Member Author

Submitted rust-lang/rust#158088 to fix the failure in debug mode - should pass CI after that

@davidtwco

Copy link
Copy Markdown
Member Author

also, it would be nice if this can be split into a few focused PRs (I can help if you want)

I'll try and split some out tomorrow

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 18, 2026
…o-no-opt, r=lqd

codegen_ssa: no dbginfo for scalable vec local w/ `-O0`

LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed.

cc llvm/llvm-project#204585
cc rust-lang/stdarch#2160

r? @lqd
rust-timer added a commit to rust-lang/rust that referenced this pull request Jun 18, 2026
Rollup merge of #158088 - davidtwco:scalable-vector-debuginfo-no-opt, r=lqd

codegen_ssa: no dbginfo for scalable vec local w/ `-O0`

LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed.

cc llvm/llvm-project#204585
cc rust-lang/stdarch#2160

r? @lqd
github-actions Bot pushed a commit that referenced this pull request Jun 19, 2026
… r=lqd

codegen_ssa: no dbginfo for scalable vec local w/ `-O0`

LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed.

cc llvm/llvm-project#204585
cc #2160

r? @lqd
This macro isn't necessary and just makes the generated code being
written harder to read compared to multi-line strings.
Replacing `iter_specializations` (which repeatedly invokes a callback)
with an iterator implementation allows `Itertools::format_with` to be
used more broadly, which in turn allows disparate string interpolation
to be combined and hopefully provide greater context to the reader.
This isn't strictly necessary but these type names were longer than they
needed to be.
Updates `get_load_function` to return `svld{n}_{ty}` when loading a
scalable vector type. Caller of `get_load_function` will still need
updated to handle passing the predicate arguments to these load
functions.
Various SVE intrinsics are not yet implemented in stdarch, but are
present in the `arm_intrinsics.json` and so should be skipped.
Updates the headers used by generated C code and the target feature flags
passed to the C compiler to enable SVE.
davidtwco added 19 commits June 19, 2026 10:45
A small refactoring to make the type printing logic slightly cleaner and
with greater code re-use.
Predicate arguments of type `svbool_t` do not need test value arrays to
be generated as the same enable-all-lanes predicate will be passed to all
invocations of the intrinsic under test. There is no `svld1` equivalent
for `svbool_t` that could be used even if there were test values to
use.
Introduces a per-architecture abstraction over how intrinsic results are
compared, so that later commits can implement Arm-specific comparison
logic for SVE.
Refactoring enabling accessing architecture-specific behaviour that isn't
associated with either of the return or argument types.
Support defining a local variable containing the predicate that will be
used with all subsequent scalable vector intrinsics.
Implementation of `get_comparison_function` and `get_predicate_function`
for SVE which uses the relevant SVE intrinsics.
Refactoring enabling accessing architecture-specific behaviour that isn't
associated with the specific argument type.
Instead of assuming that any scalable boolean argument is a predicate,
handle predicates specifically and generate test values for `svbool_t`
values.
All of the generated output is run through rustfmt so these aren't
necessary.
Enables generation of tests for SVE intrinsics leveraging the changes
from the previous commits.
There doesn't need to be so many or other modules with the values.
SVE isn't a baseline target feature for `aarch64-unknown-linux-gnu` but
should be enabled when running tests.
SVE intrinsics aren't available on big endian
Like with non-SVE test generation, comparison of float results in
scalable vectors need special-handling of comparisons.
The output of these cannot be compared.
`sveorv` intrinsics trigger a miscompile in LLVM where the call to the
Rust intrinsic is optimised out and replaced with a zero, which is
incorrect.
These tests require that we generate test arrays with values that are
valid when cast to a pointer, which we don't currently support.
GCC quickly ICEs when asked to compile intrinsic-test's wrapper sources.
@davidtwco

Copy link
Copy Markdown
Member Author

Split out the first batch of changes in #2169

@davidtwco

Copy link
Copy Markdown
Member Author

Split out the second batch of changes in #2170

@davidtwco

Copy link
Copy Markdown
Member Author

Split out the third batch of changes in #2171

@davidtwco

Copy link
Copy Markdown
Member Author

Split out fourth batch of changes in #2172

@davidtwco

Copy link
Copy Markdown
Member Author

Split out final batch of changes in #2173. Shouldn't be much less other than the actual SVE testing implementation here after those are merged, but there will be a few merge conflicts between them as they land.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants