1//===- CGSCCPassManager.h - Call graph pass management ----------*- C++ -*-===// 2// 3// The LLVM Compiler Infrastructure 4// 5// This file is distributed under the University of Illinois Open Source 6// License. See LICENSE.TXT for details. 7// 8//===----------------------------------------------------------------------===// 9/// \file 10/// 11/// This header provides classes for managing passes over SCCs of the call 12/// graph. These passes form an important component of LLVM's interprocedural 13/// optimizations. Because they operate on the SCCs of the call graph, and they 14/// traverse the graph in post-order, they can effectively do pair-wise 15/// interprocedural optimizations for all call edges in the program while 16/// incrementally refining it and improving the context of these pair-wise 17/// optimizations. At each call site edge, the callee has already been 18/// optimized as much as is possible. This in turn allows very accurate 19/// analysis of it for IPO. 20/// 21/// A secondary more general goal is to be able to isolate optimization on 22/// unrelated parts of the IR module. This is useful to ensure our 23/// optimizations are principled and don't miss oportunities where refinement 24/// of one part of the module influence transformations in another part of the 25/// module. But this is also useful if we want to parallelize the optimizations 26/// across common large module graph shapes which tend to be very wide and have 27/// large regions of unrelated cliques. 28/// 29/// To satisfy these goals, we use the LazyCallGraph which provides two graphs 30/// nested inside each other (and built lazily from the bottom-up): the call 31/// graph proper, and a reference graph. The reference graph is super set of 32/// the call graph and is a conservative approximation of what could through 33/// scalar or CGSCC transforms *become* the call graph. Using this allows us to 34/// ensure we optimize functions prior to them being introduced into the call 35/// graph by devirtualization or other technique, and thus ensures that 36/// subsequent pair-wise interprocedural optimizations observe the optimized 37/// form of these functions. The (potentially transitive) reference 38/// reachability used by the reference graph is a conservative approximation 39/// that still allows us to have independent regions of the graph. 40/// 41/// FIXME: There is one major drawback of the reference graph: in its naive 42/// form it is quadratic because it contains a distinct edge for each 43/// (potentially indirect) reference, even if are all through some common 44/// global table of function pointers. This can be fixed in a number of ways 45/// that essentially preserve enough of the normalization. While it isn't 46/// expected to completely preclude the usability of this, it will need to be 47/// addressed. 48/// 49/// 50/// All of these issues are made substantially more complex in the face of 51/// mutations to the call graph while optimization passes are being run. When 52/// mutations to the call graph occur we want to achieve two different things: 53/// 54/// - We need to update the call graph in-flight and invalidate analyses 55/// cached on entities in the graph. Because of the cache-based analysis 56/// design of the pass manager, it is essential to have stable identities for 57/// the elements of the IR that passes traverse, and to invalidate any 58/// analyses cached on these elements as the mutations take place. 59/// 60/// - We want to preserve the incremental and post-order traversal of the 61/// graph even as it is refined and mutated. This means we want optimization 62/// to observe the most refined form of the call graph and to do so in 63/// post-order. 64/// 65/// To address this, the CGSCC manager uses both worklists that can be expanded 66/// by passes which transform the IR, and provides invalidation tests to skip 67/// entries that become dead. This extra data is provided to every SCC pass so 68/// that it can carefully update the manager's traversal as the call graph 69/// mutates. 70/// 71/// We also provide support for running function passes within the CGSCC walk, 72/// and there we provide automatic update of the call graph including of the 73/// pass manager to reflect call graph changes that fall out naturally as part 74/// of scalar transformations. 75/// 76/// The patterns used to ensure the goals of post-order visitation of the fully 77/// refined graph: 78/// 79/// 1) Sink toward the "bottom" as the graph is refined. This means that any 80/// iteration continues in some valid post-order sequence after the mutation 81/// has altered the structure. 82/// 83/// 2) Enqueue in post-order, including the current entity. If the current 84/// entity's shape changes, it and everything after it in post-order needs 85/// to be visited to observe that shape. 86/// 87//===----------------------------------------------------------------------===// 88 89#ifndef LLVM_ANALYSIS_CGSCCPASSMANAGER_H 90#define LLVM_ANALYSIS_CGSCCPASSMANAGER_H 91 92#include "llvm/ADT/PriorityWorklist.h" 93#include "llvm/Analysis/LazyCallGraph.h" 94#include "llvm/IR/CallSite.h" 95#include "llvm/IR/InstIterator.h" 96#include "llvm/IR/PassManager.h" 97#include "llvm/IR/ValueHandle.h" 98 99namespace llvm { 100 101struct CGSCCUpdateResult; 102 103/// Extern template declaration for the analysis set for this IR unit. 104extern template class AllAnalysesOn<LazyCallGraph::SCC>; 105 106extern template class AnalysisManager<LazyCallGraph::SCC, LazyCallGraph &>; 107/// \brief The CGSCC analysis manager. 108/// 109/// See the documentation for the AnalysisManager template for detail 110/// documentation. This typedef serves as a convenient way to refer to this 111/// construct in the adaptors and proxies used to integrate this into the larger 112/// pass manager infrastructure. 113typedef AnalysisManager<LazyCallGraph::SCC, LazyCallGraph &> 114 CGSCCAnalysisManager; 115 116// Explicit specialization and instantiation declarations for the pass manager. 117// See the comments on the definition of the specialization for details on how 118// it differs from the primary template. 119template <> 120PreservedAnalyses 121PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, LazyCallGraph &, 122 CGSCCUpdateResult &>::run(LazyCallGraph::SCC &InitialC, 123 CGSCCAnalysisManager &AM, 124 LazyCallGraph &G, CGSCCUpdateResult &UR); 125extern template class PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, 126 LazyCallGraph &, CGSCCUpdateResult &>; 127 128/// \brief The CGSCC pass manager. 129/// 130/// See the documentation for the PassManager template for details. It runs 131/// a sequence of SCC passes over each SCC that the manager is run over. This 132/// typedef serves as a convenient way to refer to this construct. 133typedef PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, LazyCallGraph &, 134 CGSCCUpdateResult &> 135 CGSCCPassManager; 136 137/// An explicit specialization of the require analysis template pass. 138template <typename AnalysisT> 139struct RequireAnalysisPass<AnalysisT, LazyCallGraph::SCC, CGSCCAnalysisManager, 140 LazyCallGraph &, CGSCCUpdateResult &> 141 : PassInfoMixin<RequireAnalysisPass<AnalysisT, LazyCallGraph::SCC, 142 CGSCCAnalysisManager, LazyCallGraph &, 143 CGSCCUpdateResult &>> { 144 PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, 145 LazyCallGraph &CG, CGSCCUpdateResult &) { 146 (void)AM.template getResult<AnalysisT>(C, CG); 147 return PreservedAnalyses::all(); 148 } 149}; 150 151/// A proxy from a \c CGSCCAnalysisManager to a \c Module. 152typedef InnerAnalysisManagerProxy<CGSCCAnalysisManager, Module> 153 CGSCCAnalysisManagerModuleProxy; 154 155/// We need a specialized result for the \c CGSCCAnalysisManagerModuleProxy so 156/// it can have access to the call graph in order to walk all the SCCs when 157/// invalidating things. 158template <> class CGSCCAnalysisManagerModuleProxy::Result { 159public: 160 explicit Result(CGSCCAnalysisManager &InnerAM, LazyCallGraph &G) 161 : InnerAM(&InnerAM), G(&G) {} 162 163 /// \brief Accessor for the analysis manager. 164 CGSCCAnalysisManager &getManager() { return *InnerAM; } 165 166 /// \brief Handler for invalidation of the Module. 167 /// 168 /// If the proxy analysis itself is preserved, then we assume that the set of 169 /// SCCs in the Module hasn't changed. Thus any pointers to SCCs in the 170 /// CGSCCAnalysisManager are still valid, and we don't need to call \c clear 171 /// on the CGSCCAnalysisManager. 172 /// 173 /// Regardless of whether this analysis is marked as preserved, all of the 174 /// analyses in the \c CGSCCAnalysisManager are potentially invalidated based 175 /// on the set of preserved analyses. 176 bool invalidate(Module &M, const PreservedAnalyses &PA, 177 ModuleAnalysisManager::Invalidator &Inv); 178 179private: 180 CGSCCAnalysisManager *InnerAM; 181 LazyCallGraph *G; 182}; 183 184/// Provide a specialized run method for the \c CGSCCAnalysisManagerModuleProxy 185/// so it can pass the lazy call graph to the result. 186template <> 187CGSCCAnalysisManagerModuleProxy::Result 188CGSCCAnalysisManagerModuleProxy::run(Module &M, ModuleAnalysisManager &AM); 189 190// Ensure the \c CGSCCAnalysisManagerModuleProxy is provided as an extern 191// template. 192extern template class InnerAnalysisManagerProxy<CGSCCAnalysisManager, Module>; 193 194extern template class OuterAnalysisManagerProxy< 195 ModuleAnalysisManager, LazyCallGraph::SCC, LazyCallGraph &>; 196/// A proxy from a \c ModuleAnalysisManager to an \c SCC. 197typedef OuterAnalysisManagerProxy<ModuleAnalysisManager, LazyCallGraph::SCC, 198 LazyCallGraph &> 199 ModuleAnalysisManagerCGSCCProxy; 200 201/// Support structure for SCC passes to communicate updates the call graph back 202/// to the CGSCC pass manager infrsatructure. 203/// 204/// The CGSCC pass manager runs SCC passes which are allowed to update the call 205/// graph and SCC structures. This means the structure the pass manager works 206/// on is mutating underneath it. In order to support that, there needs to be 207/// careful communication about the precise nature and ramifications of these 208/// updates to the pass management infrastructure. 209/// 210/// All SCC passes will have to accept a reference to the management layer's 211/// update result struct and use it to reflect the results of any CG updates 212/// performed. 213/// 214/// Passes which do not change the call graph structure in any way can just 215/// ignore this argument to their run method. 216struct CGSCCUpdateResult { 217 /// Worklist of the RefSCCs queued for processing. 218 /// 219 /// When a pass refines the graph and creates new RefSCCs or causes them to 220 /// have a different shape or set of component SCCs it should add the RefSCCs 221 /// to this worklist so that we visit them in the refined form. 222 /// 223 /// This worklist is in reverse post-order, as we pop off the back in order 224 /// to observe RefSCCs in post-order. When adding RefSCCs, clients should add 225 /// them in reverse post-order. 226 SmallPriorityWorklist<LazyCallGraph::RefSCC *, 1> &RCWorklist; 227 228 /// Worklist of the SCCs queued for processing. 229 /// 230 /// When a pass refines the graph and creates new SCCs or causes them to have 231 /// a different shape or set of component functions it should add the SCCs to 232 /// this worklist so that we visit them in the refined form. 233 /// 234 /// Note that if the SCCs are part of a RefSCC that is added to the \c 235 /// RCWorklist, they don't need to be added here as visiting the RefSCC will 236 /// be sufficient to re-visit the SCCs within it. 237 /// 238 /// This worklist is in reverse post-order, as we pop off the back in order 239 /// to observe SCCs in post-order. When adding SCCs, clients should add them 240 /// in reverse post-order. 241 SmallPriorityWorklist<LazyCallGraph::SCC *, 1> &CWorklist; 242 243 /// The set of invalidated RefSCCs which should be skipped if they are found 244 /// in \c RCWorklist. 245 /// 246 /// This is used to quickly prune out RefSCCs when they get deleted and 247 /// happen to already be on the worklist. We use this primarily to avoid 248 /// scanning the list and removing entries from it. 249 SmallPtrSetImpl<LazyCallGraph::RefSCC *> &InvalidatedRefSCCs; 250 251 /// The set of invalidated SCCs which should be skipped if they are found 252 /// in \c CWorklist. 253 /// 254 /// This is used to quickly prune out SCCs when they get deleted and happen 255 /// to already be on the worklist. We use this primarily to avoid scanning 256 /// the list and removing entries from it. 257 SmallPtrSetImpl<LazyCallGraph::SCC *> &InvalidatedSCCs; 258 259 /// If non-null, the updated current \c RefSCC being processed. 260 /// 261 /// This is set when a graph refinement takes place an the "current" point in 262 /// the graph moves "down" or earlier in the post-order walk. This will often 263 /// cause the "current" RefSCC to be a newly created RefSCC object and the 264 /// old one to be added to the above worklist. When that happens, this 265 /// pointer is non-null and can be used to continue processing the "top" of 266 /// the post-order walk. 267 LazyCallGraph::RefSCC *UpdatedRC; 268 269 /// If non-null, the updated current \c SCC being processed. 270 /// 271 /// This is set when a graph refinement takes place an the "current" point in 272 /// the graph moves "down" or earlier in the post-order walk. This will often 273 /// cause the "current" SCC to be a newly created SCC object and the old one 274 /// to be added to the above worklist. When that happens, this pointer is 275 /// non-null and can be used to continue processing the "top" of the 276 /// post-order walk. 277 LazyCallGraph::SCC *UpdatedC; 278}; 279 280/// \brief The core module pass which does a post-order walk of the SCCs and 281/// runs a CGSCC pass over each one. 282/// 283/// Designed to allow composition of a CGSCCPass(Manager) and 284/// a ModulePassManager. Note that this pass must be run with a module analysis 285/// manager as it uses the LazyCallGraph analysis. It will also run the 286/// \c CGSCCAnalysisManagerModuleProxy analysis prior to running the CGSCC 287/// pass over the module to enable a \c FunctionAnalysisManager to be used 288/// within this run safely. 289template <typename CGSCCPassT> 290class ModuleToPostOrderCGSCCPassAdaptor 291 : public PassInfoMixin<ModuleToPostOrderCGSCCPassAdaptor<CGSCCPassT>> { 292public: 293 explicit ModuleToPostOrderCGSCCPassAdaptor(CGSCCPassT Pass, bool DebugLogging = false) 294 : Pass(std::move(Pass)), DebugLogging(DebugLogging) {} 295 // We have to explicitly define all the special member functions because MSVC 296 // refuses to generate them. 297 ModuleToPostOrderCGSCCPassAdaptor( 298 const ModuleToPostOrderCGSCCPassAdaptor &Arg) 299 : Pass(Arg.Pass), DebugLogging(Arg.DebugLogging) {} 300 ModuleToPostOrderCGSCCPassAdaptor(ModuleToPostOrderCGSCCPassAdaptor &&Arg) 301 : Pass(std::move(Arg.Pass)), DebugLogging(Arg.DebugLogging) {} 302 friend void swap(ModuleToPostOrderCGSCCPassAdaptor &LHS, 303 ModuleToPostOrderCGSCCPassAdaptor &RHS) { 304 using std::swap; 305 swap(LHS.Pass, RHS.Pass); 306 swap(LHS.DebugLogging, RHS.DebugLogging); 307 } 308 ModuleToPostOrderCGSCCPassAdaptor & 309 operator=(ModuleToPostOrderCGSCCPassAdaptor RHS) { 310 swap(*this, RHS); 311 return *this; 312 } 313 314 /// \brief Runs the CGSCC pass across every SCC in the module. 315 PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM) { 316 // Setup the CGSCC analysis manager from its proxy. 317 CGSCCAnalysisManager &CGAM = 318 AM.getResult<CGSCCAnalysisManagerModuleProxy>(M).getManager(); 319 320 // Get the call graph for this module. 321 LazyCallGraph &CG = AM.getResult<LazyCallGraphAnalysis>(M); 322 323 // We keep worklists to allow us to push more work onto the pass manager as 324 // the passes are run. 325 SmallPriorityWorklist<LazyCallGraph::RefSCC *, 1> RCWorklist; 326 SmallPriorityWorklist<LazyCallGraph::SCC *, 1> CWorklist; 327 328 // Keep sets for invalidated SCCs and RefSCCs that should be skipped when 329 // iterating off the worklists. 330 SmallPtrSet<LazyCallGraph::RefSCC *, 4> InvalidRefSCCSet; 331 SmallPtrSet<LazyCallGraph::SCC *, 4> InvalidSCCSet; 332 333 CGSCCUpdateResult UR = {RCWorklist, CWorklist, InvalidRefSCCSet, 334 InvalidSCCSet, nullptr, nullptr}; 335 336 PreservedAnalyses PA = PreservedAnalyses::all(); 337 CG.buildRefSCCs(); 338 for (auto RCI = CG.postorder_ref_scc_begin(), 339 RCE = CG.postorder_ref_scc_end(); 340 RCI != RCE;) { 341 assert(RCWorklist.empty() && 342 "Should always start with an empty RefSCC worklist"); 343 // The postorder_ref_sccs range we are walking is lazily constructed, so 344 // we only push the first one onto the worklist. The worklist allows us 345 // to capture *new* RefSCCs created during transformations. 346 // 347 // We really want to form RefSCCs lazily because that makes them cheaper 348 // to update as the program is simplified and allows us to have greater 349 // cache locality as forming a RefSCC touches all the parts of all the 350 // functions within that RefSCC. 351 // 352 // We also eagerly increment the iterator to the next position because 353 // the CGSCC passes below may delete the current RefSCC. 354 RCWorklist.insert(&*RCI++); 355 356 do { 357 LazyCallGraph::RefSCC *RC = RCWorklist.pop_back_val(); 358 if (InvalidRefSCCSet.count(RC)) { 359 if (DebugLogging) 360 dbgs() << "Skipping an invalid RefSCC...\n"; 361 continue; 362 } 363 364 assert(CWorklist.empty() && 365 "Should always start with an empty SCC worklist"); 366 367 if (DebugLogging) 368 dbgs() << "Running an SCC pass across the RefSCC: " << *RC << "\n"; 369 370 // Push the initial SCCs in reverse post-order as we'll pop off the the 371 // back and so see this in post-order. 372 for (LazyCallGraph::SCC &C : reverse(*RC)) 373 CWorklist.insert(&C); 374 375 do { 376 LazyCallGraph::SCC *C = CWorklist.pop_back_val(); 377 // Due to call graph mutations, we may have invalid SCCs or SCCs from 378 // other RefSCCs in the worklist. The invalid ones are dead and the 379 // other RefSCCs should be queued above, so we just need to skip both 380 // scenarios here. 381 if (InvalidSCCSet.count(C)) { 382 if (DebugLogging) 383 dbgs() << "Skipping an invalid SCC...\n"; 384 continue; 385 } 386 if (&C->getOuterRefSCC() != RC) { 387 if (DebugLogging) 388 dbgs() << "Skipping an SCC that is now part of some other " 389 "RefSCC...\n"; 390 continue; 391 } 392 393 do { 394 // Check that we didn't miss any update scenario. 395 assert(!InvalidSCCSet.count(C) && "Processing an invalid SCC!"); 396 assert(C->begin() != C->end() && "Cannot have an empty SCC!"); 397 assert(&C->getOuterRefSCC() == RC && 398 "Processing an SCC in a different RefSCC!"); 399 400 UR.UpdatedRC = nullptr; 401 UR.UpdatedC = nullptr; 402 PreservedAnalyses PassPA = Pass.run(*C, CGAM, CG, UR); 403 404 // We handle invalidating the CGSCC analysis manager's information 405 // for the (potentially updated) SCC here. Note that any other SCCs 406 // whose structure has changed should have been invalidated by 407 // whatever was updating the call graph. This SCC gets invalidated 408 // late as it contains the nodes that were actively being 409 // processed. 410 CGAM.invalidate(*(UR.UpdatedC ? UR.UpdatedC : C), PassPA); 411 412 // Then intersect the preserved set so that invalidation of module 413 // analyses will eventually occur when the module pass completes. 414 PA.intersect(std::move(PassPA)); 415 416 // The pass may have restructured the call graph and refined the 417 // current SCC and/or RefSCC. We need to update our current SCC and 418 // RefSCC pointers to follow these. Also, when the current SCC is 419 // refined, re-run the SCC pass over the newly refined SCC in order 420 // to observe the most precise SCC model available. This inherently 421 // cannot cycle excessively as it only happens when we split SCCs 422 // apart, at most converging on a DAG of single nodes. 423 // FIXME: If we ever start having RefSCC passes, we'll want to 424 // iterate there too. 425 RC = UR.UpdatedRC ? UR.UpdatedRC : RC; 426 C = UR.UpdatedC ? UR.UpdatedC : C; 427 if (DebugLogging && UR.UpdatedC) 428 dbgs() << "Re-running SCC passes after a refinement of the " 429 "current SCC: " 430 << *UR.UpdatedC << "\n"; 431 432 // Note that both `C` and `RC` may at this point refer to deleted, 433 // invalid SCC and RefSCCs respectively. But we will short circuit 434 // the processing when we check them in the loop above. 435 } while (UR.UpdatedC); 436 437 } while (!CWorklist.empty()); 438 } while (!RCWorklist.empty()); 439 } 440 441 // By definition we preserve the call garph, all SCC analyses, and the 442 // analysis proxies by handling them above and in any nested pass managers. 443 PA.preserveSet<AllAnalysesOn<LazyCallGraph::SCC>>(); 444 PA.preserve<LazyCallGraphAnalysis>(); 445 PA.preserve<CGSCCAnalysisManagerModuleProxy>(); 446 PA.preserve<FunctionAnalysisManagerModuleProxy>(); 447 return PA; 448 } 449 450private: 451 CGSCCPassT Pass; 452 bool DebugLogging; 453}; 454 455/// \brief A function to deduce a function pass type and wrap it in the 456/// templated adaptor. 457template <typename CGSCCPassT> 458ModuleToPostOrderCGSCCPassAdaptor<CGSCCPassT> 459createModuleToPostOrderCGSCCPassAdaptor(CGSCCPassT Pass, bool DebugLogging = false) { 460 return ModuleToPostOrderCGSCCPassAdaptor<CGSCCPassT>(std::move(Pass), DebugLogging); 461} 462 463/// A proxy from a \c FunctionAnalysisManager to an \c SCC. 464/// 465/// When a module pass runs and triggers invalidation, both the CGSCC and 466/// Function analysis manager proxies on the module get an invalidation event. 467/// We don't want to fully duplicate responsibility for most of the 468/// invalidation logic. Instead, this layer is only responsible for SCC-local 469/// invalidation events. We work with the module's FunctionAnalysisManager to 470/// invalidate function analyses. 471class FunctionAnalysisManagerCGSCCProxy 472 : public AnalysisInfoMixin<FunctionAnalysisManagerCGSCCProxy> { 473public: 474 class Result { 475 public: 476 explicit Result(FunctionAnalysisManager &FAM) : FAM(&FAM) {} 477 478 /// \brief Accessor for the analysis manager. 479 FunctionAnalysisManager &getManager() { return *FAM; } 480 481 bool invalidate(LazyCallGraph::SCC &C, const PreservedAnalyses &PA, 482 CGSCCAnalysisManager::Invalidator &Inv); 483 484 private: 485 FunctionAnalysisManager *FAM; 486 }; 487 488 /// Computes the \c FunctionAnalysisManager and stores it in the result proxy. 489 Result run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &); 490 491private: 492 friend AnalysisInfoMixin<FunctionAnalysisManagerCGSCCProxy>; 493 static AnalysisKey Key; 494}; 495 496extern template class OuterAnalysisManagerProxy<CGSCCAnalysisManager, Function>; 497/// A proxy from a \c CGSCCAnalysisManager to a \c Function. 498typedef OuterAnalysisManagerProxy<CGSCCAnalysisManager, Function> 499 CGSCCAnalysisManagerFunctionProxy; 500 501/// Helper to update the call graph after running a function pass. 502/// 503/// Function passes can only mutate the call graph in specific ways. This 504/// routine provides a helper that updates the call graph in those ways 505/// including returning whether any changes were made and populating a CG 506/// update result struct for the overall CGSCC walk. 507LazyCallGraph::SCC &updateCGAndAnalysisManagerForFunctionPass( 508 LazyCallGraph &G, LazyCallGraph::SCC &C, LazyCallGraph::Node &N, 509 CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR, bool DebugLogging = false); 510 511/// \brief Adaptor that maps from a SCC to its functions. 512/// 513/// Designed to allow composition of a FunctionPass(Manager) and 514/// a CGSCCPassManager. Note that if this pass is constructed with a pointer 515/// to a \c CGSCCAnalysisManager it will run the 516/// \c FunctionAnalysisManagerCGSCCProxy analysis prior to running the function 517/// pass over the SCC to enable a \c FunctionAnalysisManager to be used 518/// within this run safely. 519template <typename FunctionPassT> 520class CGSCCToFunctionPassAdaptor 521 : public PassInfoMixin<CGSCCToFunctionPassAdaptor<FunctionPassT>> { 522public: 523 explicit CGSCCToFunctionPassAdaptor(FunctionPassT Pass, bool DebugLogging = false) 524 : Pass(std::move(Pass)), DebugLogging(DebugLogging) {} 525 // We have to explicitly define all the special member functions because MSVC 526 // refuses to generate them. 527 CGSCCToFunctionPassAdaptor(const CGSCCToFunctionPassAdaptor &Arg) 528 : Pass(Arg.Pass), DebugLogging(Arg.DebugLogging) {} 529 CGSCCToFunctionPassAdaptor(CGSCCToFunctionPassAdaptor &&Arg) 530 : Pass(std::move(Arg.Pass)), DebugLogging(Arg.DebugLogging) {} 531 friend void swap(CGSCCToFunctionPassAdaptor &LHS, 532 CGSCCToFunctionPassAdaptor &RHS) { 533 using std::swap; 534 swap(LHS.Pass, RHS.Pass); 535 swap(LHS.DebugLogging, RHS.DebugLogging); 536 } 537 CGSCCToFunctionPassAdaptor &operator=(CGSCCToFunctionPassAdaptor RHS) { 538 swap(*this, RHS); 539 return *this; 540 } 541 542 /// \brief Runs the function pass across every function in the module. 543 PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, 544 LazyCallGraph &CG, CGSCCUpdateResult &UR) { 545 // Setup the function analysis manager from its proxy. 546 FunctionAnalysisManager &FAM = 547 AM.getResult<FunctionAnalysisManagerCGSCCProxy>(C, CG).getManager(); 548 549 SmallVector<LazyCallGraph::Node *, 4> Nodes; 550 for (LazyCallGraph::Node &N : C) 551 Nodes.push_back(&N); 552 553 // The SCC may get split while we are optimizing functions due to deleting 554 // edges. If this happens, the current SCC can shift, so keep track of 555 // a pointer we can overwrite. 556 LazyCallGraph::SCC *CurrentC = &C; 557 558 if (DebugLogging) 559 dbgs() << "Running function passes across an SCC: " << C << "\n"; 560 561 PreservedAnalyses PA = PreservedAnalyses::all(); 562 for (LazyCallGraph::Node *N : Nodes) { 563 // Skip nodes from other SCCs. These may have been split out during 564 // processing. We'll eventually visit those SCCs and pick up the nodes 565 // there. 566 if (CG.lookupSCC(*N) != CurrentC) 567 continue; 568 569 PreservedAnalyses PassPA = Pass.run(N->getFunction(), FAM); 570 571 // We know that the function pass couldn't have invalidated any other 572 // function's analyses (that's the contract of a function pass), so 573 // directly handle the function analysis manager's invalidation here. 574 FAM.invalidate(N->getFunction(), PassPA); 575 576 // Then intersect the preserved set so that invalidation of module 577 // analyses will eventually occur when the module pass completes. 578 PA.intersect(std::move(PassPA)); 579 580 // Update the call graph based on this function pass. This may also 581 // update the current SCC to point to a smaller, more refined SCC. 582 CurrentC = &updateCGAndAnalysisManagerForFunctionPass( 583 CG, *CurrentC, *N, AM, UR, DebugLogging); 584 assert(CG.lookupSCC(*N) == CurrentC && 585 "Current SCC not updated to the SCC containing the current node!"); 586 } 587 588 // By definition we preserve the proxy. And we preserve all analyses on 589 // Functions. This precludes *any* invalidation of function analyses by the 590 // proxy, but that's OK because we've taken care to invalidate analyses in 591 // the function analysis manager incrementally above. 592 PA.preserveSet<AllAnalysesOn<Function>>(); 593 PA.preserve<FunctionAnalysisManagerCGSCCProxy>(); 594 595 // We've also ensured that we updated the call graph along the way. 596 PA.preserve<LazyCallGraphAnalysis>(); 597 598 return PA; 599 } 600 601private: 602 FunctionPassT Pass; 603 bool DebugLogging; 604}; 605 606/// \brief A function to deduce a function pass type and wrap it in the 607/// templated adaptor. 608template <typename FunctionPassT> 609CGSCCToFunctionPassAdaptor<FunctionPassT> 610createCGSCCToFunctionPassAdaptor(FunctionPassT Pass, bool DebugLogging = false) { 611 return CGSCCToFunctionPassAdaptor<FunctionPassT>(std::move(Pass), 612 DebugLogging); 613} 614 615/// A helper that repeats an SCC pass each time an indirect call is refined to 616/// a direct call by that pass. 617/// 618/// While the CGSCC pass manager works to re-visit SCCs and RefSCCs as they 619/// change shape, we may also want to repeat an SCC pass if it simply refines 620/// an indirect call to a direct call, even if doing so does not alter the 621/// shape of the graph. Note that this only pertains to direct calls to 622/// functions where IPO across the SCC may be able to compute more precise 623/// results. For intrinsics, we assume scalar optimizations already can fully 624/// reason about them. 625/// 626/// This repetition has the potential to be very large however, as each one 627/// might refine a single call site. As a consequence, in practice we use an 628/// upper bound on the number of repetitions to limit things. 629template <typename PassT> 630class DevirtSCCRepeatedPass 631 : public PassInfoMixin<DevirtSCCRepeatedPass<PassT>> { 632public: 633 explicit DevirtSCCRepeatedPass(PassT Pass, int MaxIterations, 634 bool DebugLogging = false) 635 : Pass(std::move(Pass)), MaxIterations(MaxIterations), 636 DebugLogging(DebugLogging) {} 637 638 /// Runs the wrapped pass up to \c MaxIterations on the SCC, iterating 639 /// whenever an indirect call is refined. 640 PreservedAnalyses run(LazyCallGraph::SCC &InitialC, CGSCCAnalysisManager &AM, 641 LazyCallGraph &CG, CGSCCUpdateResult &UR) { 642 PreservedAnalyses PA = PreservedAnalyses::all(); 643 644 // The SCC may be refined while we are running passes over it, so set up 645 // a pointer that we can update. 646 LazyCallGraph::SCC *C = &InitialC; 647 648 // Collect value handles for all of the indirect call sites. 649 SmallVector<WeakTrackingVH, 8> CallHandles; 650 651 // Struct to track the counts of direct and indirect calls in each function 652 // of the SCC. 653 struct CallCount { 654 int Direct; 655 int Indirect; 656 }; 657 658 // Put value handles on all of the indirect calls and return the number of 659 // direct calls for each function in the SCC. 660 auto ScanSCC = [](LazyCallGraph::SCC &C, 661 SmallVectorImpl<WeakTrackingVH> &CallHandles) { 662 assert(CallHandles.empty() && "Must start with a clear set of handles."); 663 664 SmallVector<CallCount, 4> CallCounts; 665 for (LazyCallGraph::Node &N : C) { 666 CallCounts.push_back({0, 0}); 667 CallCount &Count = CallCounts.back(); 668 for (Instruction &I : instructions(N.getFunction())) 669 if (auto CS = CallSite(&I)) { 670 if (CS.getCalledFunction()) { 671 ++Count.Direct; 672 } else { 673 ++Count.Indirect; 674 CallHandles.push_back(WeakTrackingVH(&I)); 675 } 676 } 677 } 678 679 return CallCounts; 680 }; 681 682 // Populate the initial call handles and get the initial call counts. 683 auto CallCounts = ScanSCC(*C, CallHandles); 684 685 for (int Iteration = 0;; ++Iteration) { 686 PreservedAnalyses PassPA = Pass.run(*C, AM, CG, UR); 687 688 // If the SCC structure has changed, bail immediately and let the outer 689 // CGSCC layer handle any iteration to reflect the refined structure. 690 if (UR.UpdatedC && UR.UpdatedC != C) { 691 PA.intersect(std::move(PassPA)); 692 break; 693 } 694 695 // Check that we didn't miss any update scenario. 696 assert(!UR.InvalidatedSCCs.count(C) && "Processing an invalid SCC!"); 697 assert(C->begin() != C->end() && "Cannot have an empty SCC!"); 698 assert((int)CallCounts.size() == C->size() && 699 "Cannot have changed the size of the SCC!"); 700 701 // Check whether any of the handles were devirtualized. 702 auto IsDevirtualizedHandle = [&](WeakTrackingVH &CallH) { 703 if (!CallH) 704 return false; 705 auto CS = CallSite(CallH); 706 if (!CS) 707 return false; 708 709 // If the call is still indirect, leave it alone. 710 Function *F = CS.getCalledFunction(); 711 if (!F) 712 return false; 713 714 if (DebugLogging) 715 dbgs() << "Found devirutalized call from " 716 << CS.getParent()->getParent()->getName() << " to " 717 << F->getName() << "\n"; 718 719 // We now have a direct call where previously we had an indirect call, 720 // so iterate to process this devirtualization site. 721 return true; 722 }; 723 bool Devirt = any_of(CallHandles, IsDevirtualizedHandle); 724 725 // Rescan to build up a new set of handles and count how many direct 726 // calls remain. If we decide to iterate, this also sets up the input to 727 // the next iteration. 728 CallHandles.clear(); 729 auto NewCallCounts = ScanSCC(*C, CallHandles); 730 731 // If we haven't found an explicit devirtualization already see if we 732 // have decreased the number of indirect calls and increased the number 733 // of direct calls for any function in the SCC. This can be fooled by all 734 // manner of transformations such as DCE and other things, but seems to 735 // work well in practice. 736 if (!Devirt) 737 for (int i = 0, Size = C->size(); i < Size; ++i) 738 if (CallCounts[i].Indirect > NewCallCounts[i].Indirect && 739 CallCounts[i].Direct < NewCallCounts[i].Direct) { 740 Devirt = true; 741 break; 742 } 743 744 if (!Devirt) { 745 PA.intersect(std::move(PassPA)); 746 break; 747 } 748 749 // Otherwise, if we've already hit our max, we're done. 750 if (Iteration >= MaxIterations) { 751 if (DebugLogging) 752 dbgs() << "Found another devirtualization after hitting the max " 753 "number of repetitions (" 754 << MaxIterations << ") on SCC: " << *C << "\n"; 755 PA.intersect(std::move(PassPA)); 756 break; 757 } 758 759 if (DebugLogging) 760 dbgs() << "Repeating an SCC pass after finding a devirtualization in: " 761 << *C << "\n"; 762 763 // Move over the new call counts in preparation for iterating. 764 CallCounts = std::move(NewCallCounts); 765 766 // Update the analysis manager with each run and intersect the total set 767 // of preserved analyses so we're ready to iterate. 768 AM.invalidate(*C, PassPA); 769 PA.intersect(std::move(PassPA)); 770 } 771 772 // Note that we don't add any preserved entries here unlike a more normal 773 // "pass manager" because we only handle invalidation *between* iterations, 774 // not after the last iteration. 775 return PA; 776 } 777 778private: 779 PassT Pass; 780 int MaxIterations; 781 bool DebugLogging; 782}; 783 784/// \brief A function to deduce a function pass type and wrap it in the 785/// templated adaptor. 786template <typename PassT> 787DevirtSCCRepeatedPass<PassT> 788createDevirtSCCRepeatedPass(PassT Pass, int MaxIterations, 789 bool DebugLogging = false) { 790 return DevirtSCCRepeatedPass<PassT>(std::move(Pass), MaxIterations, 791 DebugLogging); 792} 793} 794 795#endif 796