|  | //! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments. | 
|  | //! | 
|  | //! # Motivation | 
|  | //! | 
|  | //! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move | 
|  | //! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR. | 
|  | //! MIR building for constants in particular tends to create additional locals that are only used | 
|  | //! inside a single block to shuffle a value around unnecessarily. | 
|  | //! | 
//! LLVM by itself is not good enough at eliminating these redundant copies (e.g. see
|  | //! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table | 
|  | //! that we can regain by implementing an optimization for removing these assign statements in rustc | 
|  | //! itself. When this optimization runs fast enough, it can also speed up the constant evaluation | 
|  | //! and code generation phases of rustc due to the reduced number of statements and locals. | 
|  | //! | 
|  | //! # The Optimization | 
|  | //! | 
|  | //! Conceptually, this optimization is "destination propagation". It is similar to the Named Return | 
|  | //! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return | 
|  | //! values or the return place `_0`. On a very high level, independent of the actual implementation | 
|  | //! details, it does the following: | 
|  | //! | 
|  | //! 1) Identify `dest = src;` statements with values for `dest` and `src` whose storage can soundly | 
|  | //!    be merged. | 
|  | //! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination | 
|  | //!    backwards). | 
|  | //! 3) Delete the `dest = src;` statement (by making it a `nop`). | 
|  | //! | 
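//! For example, in the following made-up, MIR-like sketch, `_2` (the `src`) can be merged into
//! `_3` (the `dest`):
//!
//! ```ignore (syntax-highlighting-only)
//! _2 = f();
//! _3 = move _2;
//! _4 = g(move _3);
//! ```
//!
//! Replacing `_2` with `_3` turns the middle statement into the self-assignment `_3 = move _3`,
//! which is then turned into a `nop`, leaving `_3 = f(); _4 = g(move _3);`.
//!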
|  | //! Step 1) is by far the hardest, so it is explained in more detail below. | 
|  | //! | 
|  | //! ## Soundness | 
|  | //! | 
|  | //! We have a pair of places `p` and `q`, whose memory we would like to merge. In order for this to | 
|  | //! be sound, we need to check a number of conditions: | 
|  | //! | 
|  | //! * `p` and `q` must both be *constant* - it does not make much sense to talk about merging them | 
|  | //!   if they do not consistently refer to the same place in memory. This is satisfied if they do | 
|  | //!   not contain any indirection through a pointer or any indexing projections. | 
|  | //! | 
|  | //! * `p` and `q` must have the **same type**. If we replace a local with a subtype or supertype, | 
|  | //!   we may end up with a different vtable for that local. See the `subtyping-impacts-selection` | 
|  | //!   tests for an example where that causes issues. | 
|  | //! | 
|  | //! * We need to make sure that the goal of "merging the memory" is actually structurally possible | 
|  | //!   in MIR. For example, even if all the other conditions are satisfied, there is no way to | 
|  | //!   "merge" `_5.foo` and `_6.bar`. For now, we ensure this by requiring that both `p` and `q` are | 
|  | //!   locals with no further projections. Future iterations of this pass should improve on this. | 
|  | //! | 
|  | //! * Finally, we want `p` and `q` to use the same memory - however, we still need to make sure that | 
|  | //!   each of them has enough "ownership" of that memory to continue "doing its job." More | 
|  | //!   precisely, what we will check is that whenever the program performs a write to `p`, then it | 
|  | //!   does not currently care about what the value in `q` is (and vice versa). We formalize the | 
//!   notion of "does not care what the value in `q` is" by checking the [liveness] of `q`.
|  | //! | 
|  | //!   Because of the difficulty of computing liveness of places that have their address taken, we do | 
//!   not even attempt to do it. Any place that is part of a local that has its address taken is
//!   excluded from the optimization.
|  | //! | 
//! The first three conditions are simple structural requirements on the `Assign` statements that
//! can be trivially checked. The last requirement however is more difficult and costly to check.
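//!
//! As an illustration of the last requirement, consider the following made-up, MIR-like sketch,
//! in which `_1` and `_2` must *not* be merged:
//!
//! ```ignore (syntax-highlighting-only)
//! _1 = foo();
//! _2 = move _1;     // candidate: merge `_1` into `_2`
//! _1 = bar();       // writes `_1` while `_2` is still live...
//! use_both(_2, _1); // ...because `_2` is read here
//! ```
//!
//! Merging them would make the write `_1 = bar();` clobber the value that the later use of `_2`
//! still needs.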
|  | //! | 
|  | //! ## Future Improvements | 
|  | //! | 
|  | //! There are a number of ways in which this pass could be improved in the future: | 
|  | //! | 
|  | //! * Merging storage liveness ranges instead of removing storage statements completely. This may | 
|  | //!   improve stack usage. | 
|  | //! | 
//! * Allow merging locals into places with projections, e.g. `_5` into `_6.foo`.
|  | //! | 
|  | //! * Liveness analysis with more precision than whole locals at a time. The smaller benefit of this | 
|  | //!   is that it would allow us to dest prop at "sub-local" levels in some cases. The bigger benefit | 
|  | //!   of this is that such liveness analysis can report more accurate results about whole locals at | 
|  | //!   a time. For example, consider: | 
|  | //! | 
|  | //!   ```ignore (syntax-highlighting-only) | 
|  | //!   _1 = u; | 
|  | //!   // unrelated code | 
|  | //!   _1.f1 = v; | 
|  | //!   _2 = _1.f1; | 
|  | //!   ``` | 
|  | //! | 
|  | //!   Because the current analysis only thinks in terms of locals, it does not have enough | 
|  | //!   information to report that `_1` is dead in the "unrelated code" section. | 
|  | //! | 
|  | //! * Liveness analysis enabled by alias analysis. This would allow us to not just bail on locals | 
|  | //!   that ever have their address taken. Of course that requires actually having alias analysis | 
|  | //!   (and a model to build it on), so this might be a bit of a ways off. | 
|  | //! | 
|  | //! * Various perf improvements. There are a bunch of comments in here marked `PERF` with ideas for | 
|  | //!   how to do things more efficiently. However, the complexity of the pass as a whole should be | 
|  | //!   kept in mind. | 
|  | //! | 
|  | //! ## Previous Work | 
|  | //! | 
//! A [previous attempt][attempt 1] at implementing an optimization like this turned out to cause a
//! significant regression in compiler performance. Fixing the regressions introduced a lot of
|  | //! undesirable complexity to the implementation. | 
|  | //! | 
|  | //! A [subsequent approach][attempt 2] tried to avoid the costly computation by limiting itself to | 
|  | //! acyclic CFGs, but still turned out to be far too costly to run due to suboptimal performance | 
|  | //! within individual basic blocks, requiring a walk across the entire block for every assignment | 
//! found within the block. For the `tuple-stress` benchmark, which has 458745 statements in a
//! single block, this proved prohibitively expensive.
|  | //! | 
|  | //! [Another approach after that][attempt 3] was much closer to correct, but had some soundness | 
|  | //! issues - it was failing to consider stores outside live ranges, and failed to uphold some of the | 
|  | //! requirements that MIR has for non-overlapping places within statements. However, it also had | 
|  | //! performance issues caused by `O(l² * s)` runtime, where `l` is the number of locals and `s` is | 
|  | //! the number of statements and terminators. | 
|  | //! | 
|  | //! Since the first attempt at this, the compiler has improved dramatically, and new analysis | 
//! frameworks have been added that should make this approach viable without restricting it to
//! certain classes of CFGs:
|  | //! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards | 
|  | //!   analyses efficiently. | 
//! - Layout optimizations for coroutines, which are very similar in spirit to what this
//!   optimization does, have been added to improve code generation for async/await.
|  | //! | 
|  | //! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that | 
|  | //! this destination propagation pass handles, proving that similar optimizations can be performed | 
|  | //! on MIR. | 
|  | //! | 
|  | //! ## Pre/Post Optimization | 
|  | //! | 
|  | //! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as | 
|  | //! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind. | 
|  | //! | 
|  | //! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis | 
|  | //! [attempt 1]: https://github.com/rust-lang/rust/pull/47954 | 
|  | //! [attempt 2]: https://github.com/rust-lang/rust/pull/71003 | 
|  | //! [attempt 3]: https://github.com/rust-lang/rust/pull/72632 | 
|  |  | 
|  | use rustc_data_structures::fx::{FxIndexMap, IndexEntry, IndexOccupiedEntry}; | 
|  | use rustc_index::bit_set::DenseBitSet; | 
|  | use rustc_index::interval::SparseIntervalMatrix; | 
|  | use rustc_middle::bug; | 
|  | use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor}; | 
|  | use rustc_middle::mir::{ | 
|  | Body, HasLocalDecls, InlineAsmOperand, Local, LocalKind, Location, Operand, PassWhere, Place, | 
|  | Rvalue, Statement, StatementKind, TerminatorKind, dump_mir, traversal, | 
|  | }; | 
|  | use rustc_middle::ty::TyCtxt; | 
|  | use rustc_mir_dataflow::Analysis; | 
|  | use rustc_mir_dataflow::impls::MaybeLiveLocals; | 
|  | use rustc_mir_dataflow::points::{DenseLocationMap, PointIndex, save_as_intervals}; | 
|  | use tracing::{debug, trace}; | 
|  |  | 
|  | pub(super) struct DestinationPropagation; | 
|  |  | 
|  | impl<'tcx> crate::MirPass<'tcx> for DestinationPropagation { | 
|  | fn is_enabled(&self, sess: &rustc_session::Session) -> bool { | 
|  | // For now, only run at MIR opt level 3. Two things need to be changed before this can be | 
|  | // turned on by default: | 
|  | //  1. Because of the overeager removal of storage statements, this can cause stack space | 
//     regressions. This opt is not the place to fix this, though; it's a more general
|  | //     problem in MIR. | 
|  | //  2. Despite being an overall perf improvement, this still causes a 30% regression in | 
|  | //     keccak. We can temporarily fix this by bounding function size, but in the long term | 
|  | //     we should fix this by being smarter about invalidating analysis results. | 
|  | sess.mir_opt_level() >= 3 | 
|  | } | 
|  |  | 
|  | fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) { | 
|  | let def_id = body.source.def_id(); | 
|  | let mut candidates = Candidates::default(); | 
|  | let mut write_info = WriteInfo::default(); | 
|  | trace!(func = ?tcx.def_path_str(def_id)); | 
|  |  | 
|  | let borrowed = rustc_mir_dataflow::impls::borrowed_locals(body); | 
|  |  | 
|  | let live = MaybeLiveLocals.iterate_to_fixpoint(tcx, body, Some("MaybeLiveLocals-DestProp")); | 
|  | let points = DenseLocationMap::new(body); | 
|  | let mut live = save_as_intervals(&points, body, live.analysis, live.results); | 
|  |  | 
|  | // In order to avoid having to collect data for every single pair of locals in the body, we | 
|  | // do not allow doing more than one merge for places that are derived from the same local at | 
|  | // once. To avoid missed opportunities, we instead iterate to a fixed point - we'll refer to | 
|  | // each of these iterations as a "round." | 
|  | // | 
|  | // Reaching a fixed point could in theory take up to `min(l, s)` rounds - however, we do not | 
|  | // expect to see MIR like that. To verify this, a test was run against `[rust-lang/regex]` - | 
// the average MIR body saw 1.32 full iterations of this loop. The most that was hit was 30
|  | // for a single function. Only 80/2801 (2.9%) of functions saw at least 5. | 
|  | // | 
|  | // [rust-lang/regex]: | 
|  | //     https://github.com/rust-lang/regex/tree/b5372864e2df6a2f5e543a556a62197f50ca3650 | 
|  | let mut round_count = 0; | 
|  | loop { | 
|  | // PERF: Can we do something smarter than recalculating the candidates and liveness | 
|  | // results? | 
|  | candidates.reset_and_find(body, &borrowed); | 
|  | trace!(?candidates); | 
|  | dest_prop_mir_dump(tcx, body, &points, &live, round_count); | 
|  |  | 
|  | FilterInformation::filter_liveness( | 
|  | &mut candidates, | 
|  | &points, | 
|  | &live, | 
|  | &mut write_info, | 
|  | body, | 
|  | ); | 
|  |  | 
|  | // Because we only filter once per round, it is unsound to use a local for more than | 
|  | // one merge operation within a single round of optimizations. We store here which ones | 
|  | // we have already used. | 
|  | let mut merged_locals: DenseBitSet<Local> = | 
|  | DenseBitSet::new_empty(body.local_decls.len()); | 
|  |  | 
|  | // This is the set of merges we will apply this round. It is a subset of the candidates. | 
|  | let mut merges = FxIndexMap::default(); | 
|  |  | 
|  | for (src, candidates) in candidates.c.iter() { | 
|  | if merged_locals.contains(*src) { | 
|  | continue; | 
|  | } | 
|  | let Some(dest) = candidates.iter().find(|dest| !merged_locals.contains(**dest)) | 
|  | else { | 
|  | continue; | 
|  | }; | 
|  |  | 
|  | // Replace `src` by `dest` everywhere. | 
|  | merges.insert(*src, *dest); | 
|  | merged_locals.insert(*src); | 
|  | merged_locals.insert(*dest); | 
|  |  | 
|  | // Update liveness information based on the merge we just performed. | 
|  | // Every location where `src` was live, `dest` will be live. | 
|  | live.union_rows(*src, *dest); | 
|  | } | 
|  | trace!(merging = ?merges); | 
|  |  | 
|  | if merges.is_empty() { | 
|  | break; | 
|  | } | 
|  | round_count += 1; | 
|  |  | 
|  | apply_merges(body, tcx, merges, merged_locals); | 
|  | } | 
|  |  | 
|  | trace!(round_count); | 
|  | } | 
|  |  | 
|  | fn is_required(&self) -> bool { | 
|  | false | 
|  | } | 
|  | } | 
|  |  | 
|  | #[derive(Debug, Default)] | 
|  | struct Candidates { | 
|  | /// The set of candidates we are considering in this optimization. | 
|  | /// | 
|  | /// We will always merge the key into at most one of its values. | 
|  | /// | 
|  | /// Whether a place ends up in the key or the value does not correspond to whether it appears as | 
/// the lhs or rhs of any assignment. As a matter of fact, the places in here might never appear
/// in an assignment at all. For example, if we see an assignment like this:
|  | /// | 
|  | /// ```ignore (syntax-highlighting-only) | 
|  | /// _1.0 = _2.0 | 
|  | /// ``` | 
|  | /// | 
/// we will still report that we would like to merge `_1` and `_2`, in an attempt to allow us to
|  | /// remove that assignment. | 
|  | c: FxIndexMap<Local, Vec<Local>>, | 
|  |  | 
/// A reverse index of the `c` set; if the `c` set contains `a => b` (i.e. `b` appears in `c[a]`),
/// then this contains `b => a`.
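/// For example (illustrative), if `c` contains `_1 => [_2, _5]` and `_3 => [_2]`, then `reverse`
/// contains `_2 => [_1, _3]` and `_5 => [_1]`.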
|  | // PERF: Possibly these should be `SmallVec`s? | 
|  | reverse: FxIndexMap<Local, Vec<Local>>, | 
|  | } | 
|  |  | 
|  | ////////////////////////////////////////////////////////// | 
|  | // Merging | 
|  | // | 
|  | // Applies the actual optimization | 
|  |  | 
|  | fn apply_merges<'tcx>( | 
|  | body: &mut Body<'tcx>, | 
|  | tcx: TyCtxt<'tcx>, | 
|  | merges: FxIndexMap<Local, Local>, | 
|  | merged_locals: DenseBitSet<Local>, | 
|  | ) { | 
|  | let mut merger = Merger { tcx, merges, merged_locals }; | 
|  | merger.visit_body_preserves_cfg(body); | 
|  | } | 
|  |  | 
|  | struct Merger<'tcx> { | 
|  | tcx: TyCtxt<'tcx>, | 
|  | merges: FxIndexMap<Local, Local>, | 
|  | merged_locals: DenseBitSet<Local>, | 
|  | } | 
|  |  | 
|  | impl<'tcx> MutVisitor<'tcx> for Merger<'tcx> { | 
|  | fn tcx(&self) -> TyCtxt<'tcx> { | 
|  | self.tcx | 
|  | } | 
|  |  | 
|  | fn visit_local(&mut self, local: &mut Local, _: PlaceContext, _location: Location) { | 
|  | if let Some(dest) = self.merges.get(local) { | 
|  | *local = *dest; | 
|  | } | 
|  | } | 
|  |  | 
|  | fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) { | 
|  | match &statement.kind { | 
|  | // FIXME: Don't delete storage statements, but "merge" the storage ranges instead. | 
|  | StatementKind::StorageDead(local) | StatementKind::StorageLive(local) | 
|  | if self.merged_locals.contains(*local) => | 
|  | { | 
|  | statement.make_nop(); | 
|  | return; | 
|  | } | 
|  | _ => (), | 
|  | }; | 
|  | self.super_statement(statement, location); | 
|  | match &statement.kind { | 
|  | StatementKind::Assign(box (dest, rvalue)) => { | 
|  | match rvalue { | 
|  | Rvalue::CopyForDeref(place) | 
|  | | Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => { | 
|  | // These might've been turned into self-assignments by the replacement | 
|  | // (this includes the original statement we wanted to eliminate). | 
|  | if dest == place { | 
|  | debug!("{:?} turned into self-assignment, deleting", location); | 
|  | statement.make_nop(); | 
|  | } | 
|  | } | 
|  | _ => {} | 
|  | } | 
|  | } | 
|  |  | 
|  | _ => {} | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | ////////////////////////////////////////////////////////// | 
|  | // Liveness filtering | 
|  | // | 
// This section enforces the final bullet point of the Soundness section above: the liveness requirement
|  |  | 
|  | struct FilterInformation<'a, 'tcx> { | 
|  | body: &'a Body<'tcx>, | 
|  | points: &'a DenseLocationMap, | 
|  | live: &'a SparseIntervalMatrix<Local, PointIndex>, | 
|  | candidates: &'a mut Candidates, | 
|  | write_info: &'a mut WriteInfo, | 
|  | at: Location, | 
|  | } | 
|  |  | 
// We first implement some utility functions for removing candidates according to
|  | // different needs. Throughout the liveness filtering, the `candidates` are only ever accessed | 
|  | // through these methods, and not directly. | 
|  | impl Candidates { | 
|  | /// Collects the candidates for merging. | 
|  | /// | 
/// This is responsible for enforcing the first three bullet points of the Soundness section.
|  | fn reset_and_find<'tcx>(&mut self, body: &Body<'tcx>, borrowed: &DenseBitSet<Local>) { | 
|  | self.c.clear(); | 
|  | self.reverse.clear(); | 
|  | let mut visitor = FindAssignments { body, candidates: &mut self.c, borrowed }; | 
|  | visitor.visit_body(body); | 
|  | // Deduplicate candidates. | 
|  | for (_, cands) in self.c.iter_mut() { | 
|  | cands.sort(); | 
|  | cands.dedup(); | 
|  | } | 
|  | // Generate the reverse map. | 
|  | for (src, cands) in self.c.iter() { | 
|  | for dest in cands.iter().copied() { | 
|  | self.reverse.entry(dest).or_default().push(*src); | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | /// Just `Vec::retain`, but the condition is inverted and we add debugging output | 
|  | fn vec_filter_candidates( | 
|  | src: Local, | 
|  | v: &mut Vec<Local>, | 
|  | mut f: impl FnMut(Local) -> CandidateFilter, | 
|  | at: Location, | 
|  | ) { | 
|  | v.retain(|dest| { | 
|  | let remove = f(*dest); | 
|  | if remove == CandidateFilter::Remove { | 
|  | trace!("eliminating {:?} => {:?} due to conflict at {:?}", src, dest, at); | 
|  | } | 
|  | remove == CandidateFilter::Keep | 
|  | }); | 
|  | } | 
|  |  | 
|  | /// `vec_filter_candidates` but for an `Entry` | 
|  | fn entry_filter_candidates( | 
|  | mut entry: IndexOccupiedEntry<'_, Local, Vec<Local>>, | 
|  | p: Local, | 
|  | f: impl FnMut(Local) -> CandidateFilter, | 
|  | at: Location, | 
|  | ) { | 
|  | let candidates = entry.get_mut(); | 
|  | Self::vec_filter_candidates(p, candidates, f, at); | 
if candidates.is_empty() {
|  | // FIXME(#120456) - is `swap_remove` correct? | 
|  | entry.swap_remove(); | 
|  | } | 
|  | } | 
|  |  | 
|  | /// For all candidates `(p, q)` or `(q, p)` removes the candidate if `f(q)` says to do so | 
|  | fn filter_candidates_by( | 
|  | &mut self, | 
|  | p: Local, | 
|  | mut f: impl FnMut(Local) -> CandidateFilter, | 
|  | at: Location, | 
|  | ) { | 
|  | // Cover the cases where `p` appears as a `src` | 
|  | if let IndexEntry::Occupied(entry) = self.c.entry(p) { | 
|  | Self::entry_filter_candidates(entry, p, &mut f, at); | 
|  | } | 
|  | // And the cases where `p` appears as a `dest` | 
|  | let Some(srcs) = self.reverse.get_mut(&p) else { | 
|  | return; | 
|  | }; | 
|  | // We use `retain` here to remove the elements from the reverse set if we've removed the | 
|  | // matching candidate in the forward set. | 
|  | srcs.retain(|src| { | 
|  | if f(*src) == CandidateFilter::Keep { | 
|  | return true; | 
|  | } | 
|  | let IndexEntry::Occupied(entry) = self.c.entry(*src) else { | 
|  | return false; | 
|  | }; | 
|  | Self::entry_filter_candidates( | 
|  | entry, | 
|  | *src, | 
|  | |dest| { | 
|  | if dest == p { CandidateFilter::Remove } else { CandidateFilter::Keep } | 
|  | }, | 
|  | at, | 
|  | ); | 
|  | false | 
|  | }); | 
|  | } | 
|  | } | 
|  |  | 
|  | #[derive(Copy, Clone, PartialEq, Eq)] | 
|  | enum CandidateFilter { | 
|  | Keep, | 
|  | Remove, | 
|  | } | 
|  |  | 
|  | impl<'a, 'tcx> FilterInformation<'a, 'tcx> { | 
|  | /// Filters the set of candidates to remove those that conflict. | 
|  | /// | 
|  | /// The steps we take are exactly those that are outlined at the top of the file. For each | 
|  | /// statement/terminator, we collect the set of locals that are written to in that | 
|  | /// statement/terminator, and then we remove all pairs of candidates that contain one such local | 
|  | /// and another one that is live. | 
|  | /// | 
|  | /// We need to be careful about the ordering of operations within each statement/terminator | 
/// here. Many statements might write to and read from more than one place, and we need to
/// consider them all. The strategy for doing this is as follows: we first gather all the places
/// that are written to within the statement/terminator via `WriteInfo`. Then, we use the liveness
/// analysis from *before* the statement/terminator (in the control flow sense) to eliminate
/// candidates - this is because we want to conservatively treat a pair of locals that is both
/// read and written in the statement/terminator as conflicting, and the liveness analysis before
/// the statement/terminator will correctly report locals that are read in the
/// statement/terminator as live. We are additionally conservative by treating all written-to
/// locals as also being read from.
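///
/// For example (illustrative): for a statement `_3 = Add(copy _1, const 5)`, only `_3` is recorded
/// as a write, but `_1` is live before the statement (it is read by it), so a candidate pair of
/// `_3` and `_1` would be conservatively removed at this point.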
|  | fn filter_liveness( | 
|  | candidates: &mut Candidates, | 
|  | points: &DenseLocationMap, | 
|  | live: &SparseIntervalMatrix<Local, PointIndex>, | 
|  | write_info: &mut WriteInfo, | 
|  | body: &Body<'tcx>, | 
|  | ) { | 
|  | let mut this = FilterInformation { | 
|  | body, | 
|  | points, | 
|  | live, | 
|  | candidates, | 
// We don't actually store anything at this scope; we just keep things here to be able
// to reuse the allocation.
|  | write_info, | 
|  | // Doesn't matter what we put here, will be overwritten before being used | 
|  | at: Location::START, | 
|  | }; | 
|  | this.internal_filter_liveness(); | 
|  | } | 
|  |  | 
|  | fn internal_filter_liveness(&mut self) { | 
|  | for (block, data) in traversal::preorder(self.body) { | 
|  | self.at = Location { block, statement_index: data.statements.len() }; | 
|  | self.write_info.for_terminator(&data.terminator().kind); | 
|  | self.apply_conflicts(); | 
|  |  | 
|  | for (i, statement) in data.statements.iter().enumerate().rev() { | 
|  | self.at = Location { block, statement_index: i }; | 
|  | self.write_info.for_statement(&statement.kind, self.body); | 
|  | self.apply_conflicts(); | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | fn apply_conflicts(&mut self) { | 
|  | let writes = &self.write_info.writes; | 
|  | for p in writes { | 
|  | let other_skip = self.write_info.skip_pair.and_then(|(a, b)| { | 
|  | if a == *p { | 
|  | Some(b) | 
|  | } else if b == *p { | 
|  | Some(a) | 
|  | } else { | 
|  | None | 
|  | } | 
|  | }); | 
|  | let at = self.points.point_from_location(self.at); | 
|  | self.candidates.filter_candidates_by( | 
|  | *p, | 
|  | |q| { | 
|  | if Some(q) == other_skip { | 
|  | return CandidateFilter::Keep; | 
|  | } | 
|  | // It is possible that a local may be live for less than the | 
// duration of a statement. This happens in the case of function
|  | // calls or inline asm. Because of this, we also mark locals as | 
|  | // conflicting when both of them are written to in the same | 
|  | // statement. | 
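// For example (illustrative): for a call terminator
// `_1 = f(move _2)`, both `_1` and `_2` end up in `writes`
// (the destination and the moved argument), so a candidate
// pair of `_1` and `_2` is removed here even if neither
// local is live after the call.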
|  | if self.live.contains(q, at) || writes.contains(&q) { | 
|  | CandidateFilter::Remove | 
|  | } else { | 
|  | CandidateFilter::Keep | 
|  | } | 
|  | }, | 
|  | self.at, | 
|  | ); | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | /// Describes where a statement/terminator writes to | 
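///
/// For example (illustrative): for `_1 = Add(move _2, move _3)`, `writes` would be
/// `[_1, _2, _3]`, since moved-from operands are conservatively treated as writes; for
/// `_1 = copy _2` (with both locals being ordinary temporaries), `writes` would be `[_1]` and
/// `skip_pair` would be `Some((_1, _2))`.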
|  | #[derive(Default, Debug)] | 
|  | struct WriteInfo { | 
|  | writes: Vec<Local>, | 
|  | /// If this pair of locals is a candidate pair, completely skip processing it during this | 
|  | /// statement. All other candidates are unaffected. | 
|  | skip_pair: Option<(Local, Local)>, | 
|  | } | 
|  |  | 
|  | impl WriteInfo { | 
|  | fn for_statement<'tcx>(&mut self, statement: &StatementKind<'tcx>, body: &Body<'tcx>) { | 
|  | self.reset(); | 
|  | match statement { | 
|  | StatementKind::Assign(box (lhs, rhs)) => { | 
|  | self.add_place(*lhs); | 
|  | match rhs { | 
|  | Rvalue::Use(op) => { | 
|  | self.add_operand(op); | 
|  | self.consider_skipping_for_assign_use(*lhs, op, body); | 
|  | } | 
|  | Rvalue::Repeat(op, _) => { | 
|  | self.add_operand(op); | 
|  | } | 
|  | Rvalue::Cast(_, op, _) | 
|  | | Rvalue::UnaryOp(_, op) | 
|  | | Rvalue::ShallowInitBox(op, _) => { | 
|  | self.add_operand(op); | 
|  | } | 
|  | Rvalue::BinaryOp(_, ops) => { | 
|  | for op in [&ops.0, &ops.1] { | 
|  | self.add_operand(op); | 
|  | } | 
|  | } | 
|  | Rvalue::Aggregate(_, ops) => { | 
|  | for op in ops { | 
|  | self.add_operand(op); | 
|  | } | 
|  | } | 
|  | Rvalue::WrapUnsafeBinder(op, _) => { | 
|  | self.add_operand(op); | 
|  | } | 
|  | Rvalue::ThreadLocalRef(_) | 
|  | | Rvalue::NullaryOp(_, _) | 
|  | | Rvalue::Ref(_, _, _) | 
|  | | Rvalue::RawPtr(_, _) | 
|  | | Rvalue::Len(_) | 
|  | | Rvalue::Discriminant(_) | 
|  | | Rvalue::CopyForDeref(_) => {} | 
|  | } | 
|  | } | 
|  | // Retags are technically also reads, but reporting them as a write suffices | 
|  | StatementKind::SetDiscriminant { place, .. } | 
|  | | StatementKind::Deinit(place) | 
|  | | StatementKind::Retag(_, place) => { | 
|  | self.add_place(**place); | 
|  | } | 
|  | StatementKind::Intrinsic(_) | 
|  | | StatementKind::ConstEvalCounter | 
|  | | StatementKind::Nop | 
|  | | StatementKind::Coverage(_) | 
|  | | StatementKind::StorageLive(_) | 
|  | | StatementKind::StorageDead(_) | 
|  | | StatementKind::BackwardIncompatibleDropHint { .. } | 
|  | | StatementKind::PlaceMention(_) => {} | 
|  | StatementKind::FakeRead(_) | StatementKind::AscribeUserType(_, _) => { | 
|  | bug!("{:?} not found in this MIR phase", statement) | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | fn consider_skipping_for_assign_use<'tcx>( | 
|  | &mut self, | 
|  | lhs: Place<'tcx>, | 
|  | rhs: &Operand<'tcx>, | 
|  | body: &Body<'tcx>, | 
|  | ) { | 
|  | let Some(rhs) = rhs.place() else { return }; | 
|  | if let Some(pair) = places_to_candidate_pair(lhs, rhs, body) { | 
|  | self.skip_pair = Some(pair); | 
|  | } | 
|  | } | 
|  |  | 
|  | fn for_terminator<'tcx>(&mut self, terminator: &TerminatorKind<'tcx>) { | 
|  | self.reset(); | 
|  | match terminator { | 
|  | TerminatorKind::SwitchInt { discr: op, .. } | 
|  | | TerminatorKind::Assert { cond: op, .. } => { | 
|  | self.add_operand(op); | 
|  | } | 
|  | TerminatorKind::Call { destination, func, args, .. } => { | 
|  | self.add_place(*destination); | 
|  | self.add_operand(func); | 
|  | for arg in args { | 
|  | self.add_operand(&arg.node); | 
|  | } | 
|  | } | 
|  | TerminatorKind::TailCall { func, args, .. } => { | 
|  | self.add_operand(func); | 
|  | for arg in args { | 
|  | self.add_operand(&arg.node); | 
|  | } | 
|  | } | 
|  | TerminatorKind::InlineAsm { operands, .. } => { | 
|  | for asm_operand in operands { | 
|  | match asm_operand { | 
|  | InlineAsmOperand::In { value, .. } => { | 
|  | self.add_operand(value); | 
|  | } | 
|  | InlineAsmOperand::Out { place, .. } => { | 
|  | if let Some(place) = place { | 
|  | self.add_place(*place); | 
|  | } | 
|  | } | 
|  | // Note that the `late` field in `InOut` is about whether the registers used | 
|  | // for these things overlap, and is of absolutely no interest to us. | 
|  | InlineAsmOperand::InOut { in_value, out_place, .. } => { | 
|  | if let Some(place) = out_place { | 
|  | self.add_place(*place); | 
|  | } | 
|  | self.add_operand(in_value); | 
|  | } | 
|  | InlineAsmOperand::Const { .. } | 
|  | | InlineAsmOperand::SymFn { .. } | 
|  | | InlineAsmOperand::SymStatic { .. } | 
|  | | InlineAsmOperand::Label { .. } => {} | 
|  | } | 
|  | } | 
|  | } | 
|  | TerminatorKind::Goto { .. } | 
|  | | TerminatorKind::UnwindResume | 
|  | | TerminatorKind::UnwindTerminate(_) | 
|  | | TerminatorKind::Return | 
|  | | TerminatorKind::Unreachable { .. } => (), | 
|  | TerminatorKind::Drop { .. } => { | 
|  | // `Drop`s create a `&mut` and so are not considered | 
|  | } | 
|  | TerminatorKind::Yield { .. } | 
|  | | TerminatorKind::CoroutineDrop | 
|  | | TerminatorKind::FalseEdge { .. } | 
|  | | TerminatorKind::FalseUnwind { .. } => { | 
|  | bug!("{:?} not found in this MIR phase", terminator) | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
|  | fn add_place(&mut self, place: Place<'_>) { | 
|  | self.writes.push(place.local); | 
|  | } | 
|  |  | 
|  | fn add_operand<'tcx>(&mut self, op: &Operand<'tcx>) { | 
|  | match op { | 
|  | // FIXME(JakobDegen): In a previous version, the `Move` case was incorrectly treated as | 
|  | // being a read only. This was unsound, however we cannot add a regression test because | 
|  | // it is not possible to set this off with current MIR. Once we have that ability, a | 
|  | // regression test should be added. | 
|  | Operand::Move(p) => self.add_place(*p), | 
|  | Operand::Copy(_) | Operand::Constant(_) => (), | 
|  | } | 
|  | } | 
|  |  | 
|  | fn reset(&mut self) { | 
|  | self.writes.clear(); | 
|  | self.skip_pair = None; | 
|  | } | 
|  | } | 
|  |  | 
|  | ///////////////////////////////////////////////////// | 
|  | // Candidate accumulation | 
|  |  | 
/// If the pair of places is being considered for merging, returns the candidate pair of locals
/// that would be merged in order to accomplish this.
|  | /// | 
|  | /// The contract here is in one direction - there is a guarantee that merging the locals that are | 
/// returned by this function would result in an assignment between the inputs becoming a
|  | /// self-assignment. However, there is no guarantee that the returned pair is actually suitable for | 
|  | /// merging - candidate collection must still check this independently. | 
|  | /// | 
|  | /// This output is unique for each unordered pair of input places. | 
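///
/// For example (illustrative): for the assignment `_0 = move _3`, where `_0` is the return place
/// and `_3` is an ordinary temporary, this returns `Some((_3, _0))` - `_3` is the local that would
/// be removed (merged into `_0`), since `_0` is required and cannot be removed.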
|  | fn places_to_candidate_pair<'tcx>( | 
|  | a: Place<'tcx>, | 
|  | b: Place<'tcx>, | 
|  | body: &Body<'tcx>, | 
|  | ) -> Option<(Local, Local)> { | 
let (mut a, mut b) = if a.projection.is_empty() && b.projection.is_empty() {
|  | (a.local, b.local) | 
|  | } else { | 
|  | return None; | 
|  | }; | 
|  |  | 
|  | // By sorting, we make sure we're input order independent | 
|  | if a > b { | 
|  | std::mem::swap(&mut a, &mut b); | 
|  | } | 
|  |  | 
|  | // We could now return `(a, b)`, but then we miss some candidates in the case where `a` can't be | 
|  | // used as a `src`. | 
|  | if is_local_required(a, body) { | 
|  | std::mem::swap(&mut a, &mut b); | 
|  | } | 
|  | // We could check `is_local_required` again here, but there's no need - after all, we make no | 
|  | // promise that the candidate pair is actually valid | 
|  | Some((a, b)) | 
|  | } | 
|  |  | 
|  | struct FindAssignments<'a, 'tcx> { | 
|  | body: &'a Body<'tcx>, | 
|  | candidates: &'a mut FxIndexMap<Local, Vec<Local>>, | 
|  | borrowed: &'a DenseBitSet<Local>, | 
|  | } | 
|  |  | 
|  | impl<'tcx> Visitor<'tcx> for FindAssignments<'_, 'tcx> { | 
|  | fn visit_statement(&mut self, statement: &Statement<'tcx>, _: Location) { | 
|  | if let StatementKind::Assign(box ( | 
|  | lhs, | 
|  | Rvalue::CopyForDeref(rhs) | Rvalue::Use(Operand::Copy(rhs) | Operand::Move(rhs)), | 
|  | )) = &statement.kind | 
|  | { | 
|  | let Some((src, dest)) = places_to_candidate_pair(*lhs, *rhs, self.body) else { | 
|  | return; | 
|  | }; | 
|  |  | 
|  | // As described at the top of the file, we do not go near things that have | 
|  | // their address taken. | 
|  | if self.borrowed.contains(src) || self.borrowed.contains(dest) { | 
|  | return; | 
|  | } | 
|  |  | 
|  | // As described at the top of this file, we do not touch locals which have | 
|  | // different types. | 
|  | let src_ty = self.body.local_decls()[src].ty; | 
|  | let dest_ty = self.body.local_decls()[dest].ty; | 
|  | if src_ty != dest_ty { | 
|  | // FIXME(#112651): This can be removed afterwards. Also update the module description. | 
|  | trace!("skipped `{src:?} = {dest:?}` due to subtyping: {src_ty} != {dest_ty}"); | 
|  | return; | 
|  | } | 
|  |  | 
|  | // Also, we need to make sure that MIR actually allows the `src` to be removed | 
|  | if is_local_required(src, self.body) { | 
|  | return; | 
|  | } | 
|  |  | 
|  | // We may insert duplicates here, but that's fine | 
|  | self.candidates.entry(src).or_default().push(dest); | 
|  | } | 
|  | } | 
|  | } | 
|  |  | 
/// Some locals are part of the function's interface and cannot be removed.
|  | /// | 
|  | /// Note that these locals *can* still be merged with non-required locals by removing that other | 
|  | /// local. | 
|  | fn is_local_required(local: Local, body: &Body<'_>) -> bool { | 
|  | match body.local_kind(local) { | 
|  | LocalKind::Arg | LocalKind::ReturnPointer => true, | 
|  | LocalKind::Temp => false, | 
|  | } | 
|  | } | 
|  |  | 
|  | ///////////////////////////////////////////////////////// | 
|  | // MIR Dump | 
|  |  | 
|  | fn dest_prop_mir_dump<'tcx>( | 
|  | tcx: TyCtxt<'tcx>, | 
|  | body: &Body<'tcx>, | 
|  | points: &DenseLocationMap, | 
|  | live: &SparseIntervalMatrix<Local, PointIndex>, | 
|  | round: usize, | 
|  | ) { | 
|  | let locals_live_at = |location| { | 
|  | let location = points.point_from_location(location); | 
|  | live.rows().filter(|&r| live.contains(r, location)).collect::<Vec<_>>() | 
|  | }; | 
|  | dump_mir(tcx, false, "DestinationPropagation-dataflow", &round, body, |pass_where, w| { | 
|  | if let PassWhere::BeforeLocation(loc) = pass_where { | 
|  | writeln!(w, "        // live: {:?}", locals_live_at(loc))?; | 
|  | } | 
|  |  | 
|  | Ok(()) | 
|  | }); | 
|  | } |