PowerPC: Add vec_add/vec_sub/vec_mul intrinsics support for vector_double#2161
PowerPC: Add vec_add/vec_sub/vec_mul intrinsics support for vector_double#2161lei137 wants to merge 4 commits into
Conversation
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sayantn (or someone else) some time within the next two weeks. Why was this reviewer chosen?The reviewer was selected based on:
|
|
This is my first rust patch and is the first of a series of vsx intrinsics support patches. |
| // Implement AltiVec's VectorAdd trait for vector_double to enable vec_add support | ||
| #[unstable(feature = "stdarch_powerpc", issue = "111145")] | ||
| impl crate::core_arch::powerpc::altivec::sealed::VectorAdd<vector_double> for vector_double { | ||
| type Result = vector_double; | ||
| #[inline] | ||
| #[target_feature(enable = "vsx")] | ||
| unsafe fn vec_add(self, other: vector_double) -> Self::Result { | ||
| sealed::vec_add_double_double(self, other) | ||
| } | ||
| } |
There was a problem hiding this comment.
So is the argument here that if you got your hands on a vector_double then vsx must be enabled?
Because normally adding stricter target features on a trait method is error-prone: there is now an additional safety requirement on that trait method, which is that if that particular instance is called, the target feature must be enabled.
There was a problem hiding this comment.
Yes that was the idea. On PPC, vec_add(vector_double, vector_double) should result in code gen xvadddp, which is only available if vsx feature is enabled.
There was a problem hiding this comment.
Hmm yeah the s390x implementation has the same sort of problem with float in their case. Many instructions for f32 were only added in a later target feature. Actually this is fine for vec_add because the implementation is just simd_add, which will generate xvadddp when the right target features are available, or some fallback thing (using scalars, probably) when that target feature is not available.
There was a problem hiding this comment.
We discussed this more at #t-libs/stdarch > target featurs on trait impls, and the conclusion is that this approach is unsound.
Which is to say, you cannot (soundly) require more target features on a specific impl than in the trait definition.
Perhaps we should back up a bit here: why are you looking to add this part of the API now?
There was a problem hiding this comment.
Thank-you for looking into this!
What is a better approach? I thought this was the way to do it but happy to go a different direction. I basically just want to add vector_double support for various intrinsics that currently are supported for other types. The restriction is that the ppc instruction they are bound to will be supported in different CPUs.
Eg. altivec features are pre pwr7, vsx support started in pwr7, however alot of our vsx instructions are only available starting in pwr8 so will require target_feature = "power8-vector" etc...
There was a problem hiding this comment.
Unfortunately, in current rust there is no way to work around this problem.
We have some (rather speculative) ideas, e.g. struct target features and effectful target feature, to be able to do this in the future.
Given that such a solution is probably years out, I wonder why you are looking to add this part of the API now?
| type Result = vector_double; | ||
| #[inline] | ||
| #[target_feature(enable = "vsx")] | ||
| unsafe fn vec_add(self, other: vector_double) -> Self::Result { |
There was a problem hiding this comment.
is this the only function you'd want to do this for? I'd guess not?
There was a problem hiding this comment.
Sorry, I do not understand what you mean. Can you please elaborate?
There was a problem hiding this comment.
well what about vec_sub and other vec_* methods? Should they not also follow this pattern, or is vec_add special somehow?
There was a problem hiding this comment.
Ah yes! I do have a few more other ones to add. I tried to limit this PR to just vec_add as it is my first patch so I was just trying to get a feel on how these should be done.
I have a list of intrinsics that I need to add and the plan was to follow up with sets of 3/4 intrinsics based on type/complexity afterwards. Is there a preferred PR size/breakdown?
There was a problem hiding this comment.
updated to add support also for vec_sub and vec_mul.
|
r? @folkertdev as you know more about powerpc |
e67122b to
78107a2
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
Add intrinsics support on Linux on Power for:
AI Assissted.