0ebe7563b48fab5ee2e04a9275e623506559fab2
09-Nov-2017
Sanjoy Das <sanjoy@google.com>
Explicitly disable vectorization in the LLVM IR generated for Dot. In practice this does not seem to make a difference, but I did it anyway for completeness.
PiperOrigin-RevId: 175167706
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

9249768389a22b45ee6a10930adffcc10c7f93ce
15-Sep-2017
Justin Lebar <jlebar@google.com>
Annotate loops in cpu/ir_emitter with the HLO name. This makes the IR significantly easier to parse.
PiperOrigin-RevId: 168772460
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

c82a933f449e637ee83244d2c40162e24cdde0e1
15-Sep-2017
Sanjoy Das <sanjoy@google.com>
Lower vector-matrix dot to LLVM IR if the RHS of the dot can be made column major. The naive dot lowering to LLVM IR (already present in XLA today) is cache efficient if the dot has LHS of shape [1,K]{1,0} and RHS of shape [K x N]{0,1}. This change teaches the layout assignment pass to exploit this property by converting a constant RHS matrix to a column-major layout when possible.
A couple of related things I had to touch in this change:
- In LayoutAssignmentTest.TupleLayout we used to generate a kCopy to satisfy the conflicting constraints between the result and the constant shapes, but with this change we change the layout of the constants themselves. So the EXPECT_FALSE is now an EXPECT_TRUE.
- The extra instruction layout constraints added at the end of CpuLayoutAssignment::AddBackendConstraints seemed redundant. The layout assignment pass already tries to give all unconstrained buffers the default row-major layout. Moreover, they were blocking this optimization in some cases by introducing conflicting constraints.
- The changes to literal_util.h had to be made to deal with the Literal::Relayout calls we now get on literals of various types.
PiperOrigin-RevId: 168761204
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

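A minimal sketch of the access pattern the change above exploits (illustrative C++ only, not XLA's actual emitter; the function name `VectorMatrixDot` is invented for this example). With a [1,K] LHS and a column-major [K,N] RHS, the inner loop over k reads both the LHS and column n of the RHS contiguously, which is the cache-friendly pattern the commit message describes:

```cpp
#include <cassert>
#include <vector>

// Dot of a [1,K] row vector with a [K,N] matrix stored column major.
// Column n of the RHS occupies the contiguous run rhs[n*K .. n*K+K-1],
// so the inner k-loop streams through memory sequentially.
std::vector<float> VectorMatrixDot(const std::vector<float>& lhs,  // K values
                                   const std::vector<float>& rhs,  // K*N values, column major
                                   int k_size, int n_size) {
  std::vector<float> out(n_size, 0.0f);
  for (int n = 0; n < n_size; ++n) {
    const float* column = &rhs[n * k_size];  // column n, contiguous
    float acc = 0.0f;
    for (int k = 0; k < k_size; ++k) {
      acc += lhs[k] * column[k];
    }
    out[n] = acc;
  }
  return out;
}
```

Had the RHS been row major instead, the inner loop would touch every n-th element, striding across cache lines; that is the case the layout assignment change avoids by relaying out constant RHS matrices.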
34cbf161d7b1191ad5c1b3bc02fc52d338e8b175
27-Jul-2017
Jiri Simsa <jsimsa@google.com>
Update Dataset API documentation.
PiperOrigin-RevId: 163349457
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

28d9223a55d3c02e91781e02ff8b3f6a31bdd66a
18-Jul-2017
A. Unique TensorFlower <gardener@tensorflow.org>
[XLA:CPU] Vectorize reduction operations. This change teaches XLA to generate vectorized code sequences for reduction operations. This is still a work in progress, but I wanted to get it out for discussion and early feedback.
PiperOrigin-RevId: 162305323
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

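A hedged sketch of the underlying idea (illustrative C++ only, not the XLA code sequence; the function name `ReduceSum` is invented for this example). A reduction over one accumulator forms a serial dependency chain, so vectorizing it requires splitting the work across several independent accumulators that are combined at the end:

```cpp
#include <cassert>

// Sum-reduction with four independent accumulators. Because the four
// partial sums carry no dependencies between lanes, a compiler (or an
// explicit SIMD emitter) can process four elements per step; a scalar
// tail loop handles the leftover elements.
float ReduceSum(const float* x, int n) {
  float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  int i = 0;
  for (; i + 4 <= n; i += 4) {
    for (int lane = 0; lane < 4; ++lane) {
      acc[lane] += x[i + lane];  // independent per-lane partial sums
    }
  }
  float total = acc[0] + acc[1] + acc[2] + acc[3];  // horizontal combine
  for (; i < n; ++i) {
    total += x[i];  // scalar tail
  }
  return total;
}
```

Note the design trade-off this illustrates: the multi-accumulator form reassociates the floating-point additions, so it is only legal when the compiler or emitter is allowed to change the summation order.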
abbb19bb9445ffee96ff2946083a3b5c8dadc0d0
20-May-2017
Eli Bendersky <eliben@google.com>
Clean up usage of HloModuleConfig in more places. While at it, kill some nonsensical dependencies; llvm_util shouldn't know about HloModuleConfig just for the sake of extracting a single flag, for example. Also clean up related BUILD dependencies a bit.
PiperOrigin-RevId: 156608760
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc

1e67c90e2caceeff82d09793d1ef5fa0300d219b
09-Jan-2017
Peter Hawkins <phawkins@google.com>
Initial open-source release of XLA: Accelerated Linear Algebra. XLA is a compiler-based linear algebra execution engine that targets CPUs, GPUs and custom accelerators. XLA is still experimental; we are releasing it early to get the community involved.
Change: 143990941
/external/tensorflow/tensorflow/compiler/xla/service/llvm_ir/llvm_loop.cc