Add new MIR constant propagation based on dataflow analysis#101168
Conversation
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @compiler-errors (or someone else) soon. Please see the contribution instructions for more information. |
|
@rustbot label +A-mir-opt |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
648fc1a to
77d6fad
Compare
This comment has been minimized.
This comment has been minimized.
|
r? @oli-obk since you know about const prop |
JakobDegen
left a comment
There was a problem hiding this comment.
Thank you for doing this! This is impressive work, and looks to be exactly the direction that I think we should be going. For now, I've only focused on correctness in the review. Once we're confident with that, we can work on everything else.
Besides the one bug I pointed out below, there is a part of the logic here that I am very suspicious of: The place expressions that are tracked might in general be overlapping each other. This can happen either in two cases:
-
Enums. Assuming that the layouts work out, the below MIR is DB and would be miscompiled by this pass, if I am reading the code correctly:
(_2 as Foo).0 = 5; (_2 as Bar).0 = 5; _0 = (_2 as Foo).0; return
It is not possible to generate this kind of MIR from Rust (I am working on something to be able to write it in tests at least), but it is nonetheless a part of the semantics of MIR.
-
Pointer aliasing. This is the complicated one. The place values that we care about might in general be aliased by pointers, and we also const prop through pointers in the logic here. There is a lot of subtlety to this. For example, this code is "miscompiled:"
#[inline(never)] fn foo() -> usize { let mut x = 0; let p = core::ptr::addr_of_mut!(x); x = 1; unsafe { *p = 2 }; x } fn main() { dbg!(foo()); }
Miri does indeed report UB for this. Putting aside for a minute the issue of whether or not we're ok miscompiling this, there needs to be a comment in the code somewhere that states what assumptions we are making about the aliasing model that are broken by the above code.
Speaking somewhat more generally, the code that is currently responsible for handling writes needs to explain why its behavior is correct not just for the "obvious" target but for all other possibly overlapping places.
This is unexpected, but I guess it can make sense to allow such writes to inactive variants. If we don't want to deal with layouts, we could flood all other variants with
The analysis relies on some properties of Stacked Borrows. I agree that the assumptions should be clearly documented. I'm also not quite convinced yet that the way that taking a mutable reference/pointer is handled is sound (it just floods the place with |
oli-obk
left a comment
There was a problem hiding this comment.
Is if x { y = 42; } handled to not expose the 42 after the if block is left, by relying on the regular dataflow lattice merger erasing such information?
|
This is absolutely awesome. I would propose leaving anything that current const prop does not support (mut refs, dead branch handling, ...) to future PRs. Basically only add "new features" where it falls out of the current work anyway. Do you know of anything that old const prop supports that the dataflow const propagator does not yet support? |
Yes, exactly! Let's use the visualization provided by the dataflow analysis framework and the fn foo(x: bool, mut y: i32) -> i32 {
if x {
y = 42;
}
y
} |
Thank you for the kind words! :)
I agree. Unless there are low-hanging fruits (proper dead branch handling might be easy, I might want to take a look at that).
I did not really investigate that yet. However, there is algebraic simplification, also mentioned here, which is not currently done by this analysis. Although this should be an easy addition if we want it. |
|
There are currently many open questions. However, I feel like if this approach turns out to be unsound, it most likely is due to the handling of references and aliasing (this was also brought up by @JakobDegen here). I am not aware of any formal proofs for other MIR optimizations (if you are, please let me know, it might help!), but then again, their correctness is perhaps a little bit more intuitive. Also, an actual formal proof might be infeasible. However, I think with some theory from Abstract Interpretation, we might at least put this analysis on solid ground. With this commit, I began a semi-formal argument for why this analysis is sound. There is still a lot to cover there, but the current summary is basically: Whenever mutable access to a tracked place escapes the analysis, the tracked place must be assigned to top and stay that way for the duration of possible access. Since something like |
|
I resolved the issue with the overlapping storage of enum variants by rejecting the registration of places that contain downcast projections. This is of course not an optimal solution and could be improved in the future. |
This comment has been minimized.
This comment has been minimized.
|
I'll review this again hopefully tonight, otherwise over the weekend.
I just want to clarify quickly though that I did not mean to imply that I wanted to see a full proof of soundness for the optimization. You are right that those don't exist right now (as far as I know anyway). I mostly just want to move us to a point where we are reasonable comfortable saying that if someone were to try and write such a proof, they would not hit any huge surprises of the form "it turns out this optimization depends on a whole bunch of assumptions that are not mentioned anywhere." |
|
Adding the ability to track unreachability was not too hard (resolves #81605). fn f() -> usize {
1 + if true { 1 } else { 2 }
} |
|
☔ The latest upstream changes (presumably #101318) made this pull request unmergeable. Please resolve the merge conflicts. |
ce0c101 to
b4b1347
Compare
|
The job Click to see the possible cause of the failure (guessed by this bot) |
|
💔 Test failed - checks-actions |
|
Same failure as here: #104399 (comment) |
|
☀️ Test successful - checks-actions |
|
Finished benchmarking commit (357f660): comparison URL. Overall result: ❌ regressions - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. |


The current constant propagation in
rustc_mir_transform/src/const_prop.rsfails to handle many cases that would be expected from a constant propagation optimization. For example:This pull request adds a new constant propagation MIR optimization pass based on the existing dataflow analysis framework. Since most of the analysis is not unique to constant propagation, a generic framework has been extracted. It works on top of the existing framework and could be reused for other optimzations.
Closes #80038. Closes #81605.
Todo
Essential
StatementKind::CopyNonOverlapping. Resolved by flooding the destination.UnsafeCell/!Freezecorrectly.CheckedBinaryOp: Decided to not propagate if overflow flag istrue(falsewill still be propagated)-Zmir-opt-level=3for now.Extra
const_prop.rs, but not by this one.