473bc3580510e2da299662da200791cf4a9fb086 |
|
05-Feb-2018 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Implement GatherNd. PiperOrigin-RevId: 184584104
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
22116459b258d5753aa76410ab6f4d3cbc928a5a |
|
02-Feb-2018 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Improve/refactor the handling of resource types/shapes. Previously we used an xla::Shape to track the shape of a resource (Variable, TensorArray, Stack). The xla::Shape described how the resource was represented to XLA, e.g., as a (buffer, size) pair for a Stack resource. Instead, separate the TensorFlow abstract shape representation from the XLA shape representation and track each separately. This leads to simpler and more readable code. PiperOrigin-RevId: 184310694
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
c60e32e0ca452aec465a33529a0ea22ef88b443f |
|
06-Dec-2017 |
Asim Shankar <ashankar@google.com> |
[TF:XLA] Support for DT_INT64 in the VariableShape operation. PiperOrigin-RevId: 178084701
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
c27a90d2195545c9147ec79094d7bca3176deb44 |
|
29-Nov-2017 |
Asim Shankar <ashankar@google.com> |
[TF:XLA] VariableShape op support. PiperOrigin-RevId: 177323587
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
ccfa8f4f1492c5cf1a7db35b2dba1f7b5424f0e2 |
|
12-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA:CPU] Switch TF gather's HLO implementation to use dynamic-update-slice in a "while" loop.

Benchmark results (times in ms):
  nontrivial_gather.axis0_cpu:     0.110
  nontrivial_gather.axis0_xla_cpu: 0.139
  nontrivial_gather.axis1_cpu:     0.093
  nontrivial_gather.axis1_xla_cpu: 0.142
  nontrivial_gather.axis4_cpu:     1.183
  nontrivial_gather.axis4_xla_cpu: 2.658
  slice_gather.axis0_cpu:          0.00388
  slice_gather.axis0_xla_cpu:      0.00397
  slice_gather.axis1_cpu:          0.00421
  slice_gather.axis1_xla_cpu:      0.00427
  slice_gather.axis4_cpu:          0.252
  slice_gather.axis4_xla_cpu:      0.114

As you can see, the pure-XLA implementation is slower in all the nontrivial cases and as fast or faster in the slice-gather cases. The slice-gather cases are gathers that can be implemented as a single XLA dynamic-slice, so the speedup here is likely understated: once we can simplify the gather to a single dynamic-slice, we should be able to apply many other optimizations to it, ideally fusing it so it has zero cost.

The nontrivial gathers all gather more than one element and are implemented with an XLA while loop. The most important one is the axis-0 gather; gathering from an inner dimension is so slow no matter what you do that it's probably not worth optimizing.

It's possible to make this XLA implementation faster. One option I've considered is "unrolling" the gather into a series of dynamic-slices that are then concat'ed together; this would be totally fusable, unlike the implementation in this CL. Another option would be adding a notion of uninitialized memory to XLA; part of what makes us slow is that we have to memset our output to 0 before we overwrite it. But given that the shape we're benchmarking here is totally arbitrary, and given that we're getting decent performance, I think this is good enough to start with.

PiperOrigin-RevId: 171883273
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
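The while-loop lowering described in the entry above can be sketched in numpy. This is only an illustration of the approach (one dynamic-slice read and one dynamic-update-slice write per index, into a zero-initialized output buffer), not the actual HLO or C++ implementation; the name `gather_axis0` is hypothetical:

```python
import numpy as np

def gather_axis0(params, indices):
    # Zero-initialize the output first -- the initialization cost the
    # commit message mentions as a motivation for uninitialized memory.
    out = np.zeros((len(indices),) + params.shape[1:], dtype=params.dtype)
    for i, idx in enumerate(indices):
        row = params[idx:idx + 1]  # analogue of a dynamic-slice of one row
        out[i:i + 1] = row         # analogue of a dynamic-update-slice
    return out

params = np.arange(12).reshape(4, 3)
print(gather_axis0(params, [2, 0, 2]))
```

The "unrolling" alternative mentioned above would instead emit one slice per index and concatenate the results, trading HLO size for fusability.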
|
36649e842908d89a3dc44a840bd6305fe401123f |
|
26-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Adds XLA support for GatherV2 (gather with axis parameter). PiperOrigin-RevId: 170050380
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
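For reference, the GatherV2 semantics added above (gather with an `axis` parameter) match numpy's `take` for a simple index vector; a minimal sketch, not the bridge implementation:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
# Gather indices [2, 0] along axis 1, like GatherV2(x, [2, 0], axis=1).
g = np.take(x, [2, 0], axis=1)
print(g.shape)
```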
|
f0e8c545e0196b8b48ce0ad0f116df97d980d1f1 |
|
11-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Switch resource variables from copy-on-read to copy-on-write. RELNOTES: Change the signature of (C++) GetInputTensorFromVariable in training_op_helpers to support the new copy-on-write semantics of resource variables. PiperOrigin-RevId: 168273249
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
018530bbb673ef630d9235db436a59f66b876d91 |
|
17-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Use XLA While loop in Gather/Scatter tf2xla bridge ops, to prevent excessively large HLO output for models with large numbers of embedding lookups. PiperOrigin-RevId: 165514289
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
878e6366362612a2ffba740bde51999c72a73acf |
|
15-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Refactor XLA Gather to use a common implementation for Gather, ResourceGather, etc. PiperOrigin-RevId: 165239093
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
9526b27d7d904acc9e1a7a1990e1320235a8720c |
|
13-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Implementing ResourceGather in TF2XLA. PiperOrigin-RevId: 161730154
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
93f9caba8e371bd2f55ec789ed2f8ece9b3d976d |
|
30-Mar-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Refactor TF/XLA operator registration.

Rather than requiring an explicit registration for each (operator, backend) pair, by default register all operators for all backends, for all types supported by each backend. As we are beginning to see out-of-tree backends, and as XLA translations of operators are added to the TF/XLA bridge, per-backend explicit registration lists will become stale. Registering all operators on all backends is both less verbose and more maintainable for backend authors.

Since not all operators work on all backends, we add several constraint mechanisms:
* operators may specify type constraints that are shared across all backends.
* operators may specify a whitelist of backends on which they work. This is useful if an operator is CPU-only because of a CustomCall.
* backends may register a function that specifies operators to blacklist or whose registrations to modify. This is necessary since operator implementations cannot know the set of all out-of-tree backends.

This change also lays the groundwork for removing the list of compile-time constant inputs in const_analysis.cc. In a subsequent CL, compile-time constant inputs can be annotated on the XLA operator registration.

Change: 151724100
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
09adaff530f8c600a0b9d1d6f4e3379c9fea1def |
|
14-Mar-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Implement ResourceApplyAdagrad. Split XLA implementation of training ops into their own file. Change: 150125044
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
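ResourceApplyAdagrad implements the standard Adagrad update rule (accumulate squared gradients, then scale the step by the inverse square root of the accumulator). A minimal numpy sketch of that rule for illustration; the function name and values here are hypothetical, not the op's C++ code:

```python
import numpy as np

def resource_apply_adagrad(var, accum, lr, grad):
    # Adagrad: accum += grad^2; var -= lr * grad / sqrt(accum)
    accum += grad * grad
    var -= lr * grad / np.sqrt(accum)
    return var, accum

var = np.array([1.0, 2.0])
accum = np.array([0.1, 0.1])
grad = np.array([0.5, -0.5])
var, accum = resource_apply_adagrad(var, accum, 0.1, grad)
print(var, accum)
```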
|
e38a60797b14fb1413626c99969360e836ecae67 |
|
14-Mar-2017 |
Peter Hawkins <phawkins@google.com> |
Change the code that builds zero slots in slot_creator.py to use a zeros_initializer rather than a Tensor, to simplify Python code that works with multiple graphs. Implement ResourceApplyMomentum in the XLA bridge. Change: 150111106
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
b436f4130b54f0f422774d06f9affac417b9363e |
|
27-Feb-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Improvements to resource variables:
* enable compilation of VarIsInitializedOp.
* fix deprecated variable initializer in variable_ops_test.py.
* simplify variable logic in XlaContext; move intelligence into XlaOpKernelContext.
* add resource variable support in the contrib layers library.

Cleanups and refactorings:
* merge XlaCompiler::CompileSubComputation with XlaCompiler::CompileFunction.
* pass XlaCompiler arguments consistently via XlaCompiler::Options.
* split the two roles of XlaCompiler::CompilationResult::input_shapes into input_mapping and xla_input_shapes.
* initialize the numpy and Python seeds to a constant for XLA test cases.

Change: 148683645
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
9f12227ae9930decb915062614792aa01617264d |
|
24-Feb-2017 |
Alexandre Passos <apassos@google.com> |
Do an aliasing read of a resource variable when fetching. Change: 148396841
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|
542c3cbf711c4b89310fa4046c48150d29564008 |
|
22-Feb-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Add support for resource variables to the TensorFlow/XLA bridge. Change: 148176223
/external/tensorflow/tensorflow/compiler/tf2xla/kernels/variable_ops.cc
|