397fc4874efe9c17e737d4c5c50bd19dc3bf27f5 |
|
08-May-2012 |
Jakob Stoklund Olesen <stoklund@2pi.dk> |
Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass(). The getPointerRegClass() hook can return register classes that depend on the calling convention of the current function (ptr_rc_tailcall). So far, we have been able to infer the calling convention from the subtarget alone, but as we add support for multiple calling conventions per target, that no longer works. Patch by Yiannis Tsiouris! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156328 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
31d157ae1ac2cd9c787dc3c1d28e64c682803844 |
|
18-Feb-2012 |
Jia Liu <proljc@gmail.com> |
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150878 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
ddfd1377d2e4154d44dc3ad217735adc15af2e3f |
|
14-Dec-2011 |
Evan Cheng <evan.cheng@apple.com> |
- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146542 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
5a96b3dad2f634c9081c8b2b6c2575441dc5a2bd |
|
07-Dec-2011 |
Evan Cheng <evan.cheng@apple.com> |
Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if *all* of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146026 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
e837dead3c8dc3445ef6a0e2322179c57e264a13 |
|
28-Jun-2011 |
Evan Cheng <evan.cheng@apple.com> |
- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
15993f83a419950f06d2879d6701530ae6449317 |
|
27-Jun-2011 |
Evan Cheng <evan.cheng@apple.com> |
More refactoring. Move getRegClass from TargetOperandInfo to TargetInstrInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133944 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
84c5eed15baa3710d7fb8522c7a28c8e0b732c2b |
|
19-Apr-2011 |
Bob Wilson <bob.wilson@apple.com> |
This patch combines several changes from Evan Cheng for rdar://8659675. Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
6557bce3ec8d5a82b2ea299a18cb51677b299633 |
|
22-Feb-2011 |
Evan Cheng <evan.cheng@apple.com> |
VFP single precision arith instructions can go down to NEON pipeline, but on Cortex-A8 only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126238 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
04e2b639c13a73ac91686d484628325ee536e9cc |
|
06-Dec-2010 |
Evan Cheng <evan.cheng@apple.com> |
Eliminate unneeded #include's. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120971 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
f79ed109ec4ead50216bbed3d80d1ccd5ad94061 |
|
06-Dec-2010 |
Evan Cheng <evan.cheng@apple.com> |
Remove an unused variable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120964 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|
48575f6ea7d5cd21ab29ca370f58fcf9ca31400b |
|
05-Dec-2010 |
Evan Cheng <evan.cheng@apple.com> |
Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
/external/llvm/lib/Target/ARM/MLxExpansionPass.cpp
|