History log of /external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
Revision Date Author Comments
3590c452ea8485d063874138eec92411297a9abb 09-Feb-2018 Mingsheng Hong <hongm@google.com> Enabled XLA for TF C API.

Summary of changes:

1. Set the MarkForCompilationPassFlags::tf_xla_cpu_global_jit default to true in
the C API unit test environment when XLA execution is intended. Together with
setting the session config field
config.graph_options.optimizer_options.global_jit_level to a value > 0, this
turns on XLA for the entire graph (eligible nodes only, with _Arg and _RetVal
nodes excluded); see the sketch at the end of this summary.

We decided against defaulting MarkForCompilationPassFlags::tf_xla_cpu_global_jit
to true, due to performance concerns with the single-threaded nature of the XLA
CPU backend (see
https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation).

2. In FindCompilationCandidates() during MarkForCompilationPass, skip compiling
any '_Arg'-typed nodes. This is necessary to avoid hitting an "Invalid argument
number" error during that pass.

3. Extended the C-API-based build rules to link in the XLA libraries, and added
a unit test, "CAPI.Session_Min_XLA_CPU".

Also added some misc improvements and debugging aids.
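
As a rough illustration of the mechanism in item 1 (a hedged sketch, not code
from this change; the helper name is made up, while SessionOptions, ConfigProto,
and OptimizerOptions::ON_1 come from the public TensorFlow C++ API):

    // Sketch only: request global JIT clustering via SessionOptions.
    #include "tensorflow/core/protobuf/config.pb.h"
    #include "tensorflow/core/public/session_options.h"

    tensorflow::SessionOptions MakeGlobalJitSessionOptions() {
      tensorflow::SessionOptions options;
      // A global_jit_level > 0 asks the mark-for-compilation pass to cluster
      // eligible nodes for XLA.
      options.config.mutable_graph_options()
          ->mutable_optimizer_options()
          ->set_global_jit_level(tensorflow::OptimizerOptions::ON_1);
      return options;
    }
    // On CPU this only takes effect when the process also sets
    // TF_XLA_FLAGS=--tf_xla_cpu_global_jit, as described above.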

PiperOrigin-RevId: 185193314
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
e78ec6b40d9af515044bde7e184a9a85b0aa0a41 19-Dec-2017 A. Unique TensorFlower <gardener@tensorflow.org> Initial checkin for outside_compilation. Adds a new attribute for encapsulating XLA subgraphs that will in the future be used to mark some Ops in the subgraph as 'outside_compilation' meaning they will be run as interpreted TensorFlow via a callout from a compiled XLA subgraph.

This is the first of a sequence of checkins. It adds new types of edges entering and leaving the subgraphs, suitable for send/recv between a compiled XLA subgraph and the 'host', i.e., uncompiled TensorFlow.

For now no code sets the new 'outside_compilation' attributes, and the Ops to perform the send/recv are not present in the codebase; these will follow in subsequent checkins.

PiperOrigin-RevId: 179591853
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
47b674c938a38c6d88f27244a12ce3944c2f0464 13-Dec-2017 A. Unique TensorFlower <gardener@tensorflow.org> [XLA] Remove a source of nondeterminism in HLO clustering.

Record the HLO clusters with std::set instead of std::unordered_set to ensure
that the algorithm that assigns each cluster a sequence number during a set
traversal is deterministic.
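
As a standalone sketch of the idea (the names are illustrative, not the pass's
actual variables): iterating a std::set visits cluster names in sorted order, so
the sequence numbers assigned during the traversal are the same on every run,
unlike with std::unordered_set.

    #include <map>
    #include <set>
    #include <string>

    // Assign each cluster a sequence number by walking a sorted set, so the
    // numbering is deterministic across runs and builds.
    std::map<std::string, int> AssignClusterSequenceNumbers(
        const std::set<std::string>& cluster_names) {
      std::map<std::string, int> sequence_numbers;
      int next = 0;
      for (const std::string& name : cluster_names) {
        sequence_numbers[name] = next++;
      }
      return sequence_numbers;
    }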

PiperOrigin-RevId: 178830794
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
0472116d163eeb77d51cabdc5fc67be917048870 01-Dec-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Make tf_cnn_benchmarks run on CPU with XLA.

Adds _cpu_jit to the tf_cnn_benchmarks_xla BUILD rule and fixes an issue in the XLA bridge triggered by XLA CPU compilation of whole graphs. In particular, modifies mark_for_compilation_pass.cc to skip _Retval nodes when looking for compilation candidates in the top-level function. _Retval nodes are introduced in the input subgraph as a replacement for fetches. Including _Retval nodes in XLA clusters confuses the encapsulate-subgraph pass, which expects a graph with no pre-existing _Retval nodes.
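
A simplified sketch of the skip described above (not the actual body of
FindCompilationCandidates; it assumes the usual tensorflow::Graph/Node API, and
the helper name is made up):

    #include <string>
    #include <vector>

    #include "tensorflow/core/graph/graph.h"

    // Collect candidate nodes, leaving out _Arg and _Retval, which stand in
    // for feeds and fetches at the top level and must not join XLA clusters.
    void CollectCandidates(const tensorflow::Graph& graph,
                           std::vector<tensorflow::Node*>* candidates) {
      for (tensorflow::Node* node : graph.op_nodes()) {
        if (node->type_string() == "_Arg" ||
            node->type_string() == "_Retval") {
          continue;
        }
        candidates->push_back(node);
      }
    }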

PiperOrigin-RevId: 177518178
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
205ff0f7592c60ab09fc705f2c5501d8547e83be 15-Nov-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Added a tf_xla_cpu_global_jit flag to the TF_XLA_FLAGS environment variable to enable global JIT compilation for CPU via SessionOptions. By default, global JIT compilation
for CPU via SessionOptions is disabled. When TF_XLA_FLAGS=--tf_xla_cpu_global_jit is set,
the value of the enable_jit_by_default variable in mark_for_compilation_pass.cc is ignored, allowing XLA to use JIT compilation for the whole graph according to the SessionOptions setting.

Unless tf_xla_cpu_global_jit is explicitly set via TF_XLA_FLAGS, this code change
should have no effect on TensorFlow or XLA execution.
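
A hedged sketch of how a test or benchmark might opt a process into CPU global
JIT before TensorFlow parses its XLA flags (the helper name is illustrative and
setenv is POSIX, not part of TensorFlow):

    #include <cstdlib>

    // Illustrative only: must run before the first session executes, since
    // the XLA flags are typically parsed just once per process.
    void EnableCpuGlobalJit() {
      setenv("TF_XLA_FLAGS", "--tf_xla_cpu_global_jit", /*overwrite=*/1);
    }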

RELNOTES: n/a
PiperOrigin-RevId: 175754729
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
825a9f8d9a4cc3cce7cee2fb08dcc058b5a8e2a8 06-Oct-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Make registration of an XlaDevice for autoclustering optional.

PiperOrigin-RevId: 171281666
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
a81d10e2e753039e675d256762b6a3337342b7cd 28-Sep-2017 A. Unique TensorFlower <gardener@tensorflow.org> When constructing the error message, check for a nonexistent node before trying to get the name of that node.

PiperOrigin-RevId: 170349499
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
8bcc4151d4ea266f5f4183f7eaa51c7874ad15a1 21-Sep-2017 A. Unique TensorFlower <gardener@tensorflow.org> If a cycle is detected, mention in the error message what the cycle is.

PiperOrigin-RevId: 169575965
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
19a55725af8102d72d4e081c5139f0e4bd5a4bb7 18-Aug-2017 Rohan Jain <rohanj@google.com> Allowing functions to run across devices. This change expands the ProcessFunctionLibraryRuntime library to Instantiate and Run functions on different devices. When a FunctionLibraryRuntime encounters a function with a target that is another device, it delegates Instantiate() and Run() calls to the ProcessFunctionLibraryRuntime.

This change also moves the table_ containing all function instantiations to the PFLR instead of the FunctionLibraryRuntime.

PiperOrigin-RevId: 165651194
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
935ff49201edd7a6297b313fb9545d1299b9a28d 17-Aug-2017 Rohan Jain <rohanj@google.com> Automated g4 rollback of changelist 165521057

PiperOrigin-RevId: 165604864
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
37de1372ff43b144750c789b088f3166bcb6a27a 17-Aug-2017 Rohan Jain <rohanj@google.com> Allowing functions to run across devices. This change expands the ProcessFunctionLibraryRuntime library to Instantiate and Run functions on different devices. When a FunctionLibraryRuntime encounters a function with a target that is another device, it delegates Instantiate() and Run() calls to the ProcessFunctionLibraryRuntime.

This change also moves the table_ containing all function instantiations to the PFLR instead of the FunctionLibraryRuntime.

PiperOrigin-RevId: 165521057
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
2f1ff0e90dc3ba80f6bbc3f9850e8028875dcbbf 25-Jul-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Register a no-op kernel for ControlTrigger, but forbid the JIT marking pass from compiling ControlTrigger nodes.

CL in preparation for compiling dynamic RNN gradients via XLA.

PiperOrigin-RevId: 163073212
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
a8087b4aae40ef5c97d8b27d40795950996f86d5 27-Jun-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Reject operators with resource outputs on CPU and GPU devices.

We were checking for resource inputs but not resource outputs, which led to accidental fusion of some TensorArray ops on CPU and GPU.
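
A hedged sketch of the tightened check (it mirrors the described intent rather
than the exact code; Node::input_types() and output_types() are assumed from the
TensorFlow graph API):

    #include "tensorflow/core/framework/types.h"
    #include "tensorflow/core/graph/graph.h"

    // A node is rejected for CPU/GPU clustering if any input *or* output
    // carries a DT_RESOURCE value.
    bool HasResourceInputOrOutput(const tensorflow::Node& node) {
      for (tensorflow::DataType dt : node.input_types()) {
        if (dt == tensorflow::DT_RESOURCE) return true;
      }
      for (tensorflow::DataType dt : node.output_types()) {
        if (dt == tensorflow::DT_RESOURCE) return true;
      }
      return false;
    }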

PiperOrigin-RevId: 160294302
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
6ada43366663210beb0159b8c1a67b26ebfe6cb7 23-Jun-2017 Geoffrey Irving <geoffreyi@google.com> Prepare to not include node_def.proto.h in node_def_util.h

The goal is to make kernels mostly independent of proto headers, which will let
us lock down our .so imports. This CL makes a bunch of .cc files
either include node_def.proto.h themselves or not need the definition of
NodeDef; a second CL will make node_def_util.h not include node_def.proto.h.

RELNOTES: n/a
PiperOrigin-RevId: 159982117
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
0f2db739163809782049b2c956355506c88c77e5 02-Jun-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Split union-find implementation in mark_for_compilation_pass.cc into a separate library, make it more generic.
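
For context, a generic union-find in the spirit of the split-out library (an
independent illustration, not the actual UnionFind template in the TensorFlow
tree):

    #include <numeric>
    #include <vector>

    // Disjoint-set forest with path compression; clusters are merged by
    // uniting the sets containing their representative elements.
    class DisjointSets {
     public:
      explicit DisjointSets(int n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), 0);
      }
      int Find(int x) {
        if (parent_[x] != x) parent_[x] = Find(parent_[x]);  // path compression
        return parent_[x];
      }
      void Merge(int a, int b) { parent_[Find(a)] = Find(b); }

     private:
      std::vector<int> parent_;
    };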

PiperOrigin-RevId: 157850985
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
4e131d27354bc9be90e291f3ec4538c0e3bf06eb 22-May-2017 A. Unique TensorFlower <gardener@tensorflow.org> Many algorithms need to enumerate the set of nodes within a graph, while excluding the special Sink and Source nodes. The checks for skipping Source and Sink are duplicated in dozens of loops.

This CL adds a new Graph::op_nodes() method, which returns an enumerable range of all operation nodes, excluding Sink and Source. This allows many for loops to be simplified.

This simplification is being done mainly for readability / reliability. There may be a tiny performance difference owing to this change (as well as making the Graph::nodes() and Graph::op_nodes() methods inlineable), but the measured difference is not reliably large enough to be significant.

The changes to graph.h and graph.cc are quite minimal. I updated all of the uses of Graph::nodes() that I could reliably determine were unaffected by the change. Most uses immediately checked node->IsOp(). Some compared node->type_string() against literal strings, none of which were "_SINK" or "_SOURCE", and so using op_nodes() was more appropriate than nodes(). In some cases, it was not obvious whether an existing use of Graph::nodes() wanted to enumerate Sink / Source, so I left those uses unaffected.
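
A before/after sketch of the loop simplification this enables (the loop bodies
are placeholders):

    #include "tensorflow/core/graph/graph.h"

    // Before: every loop had to skip the special _SOURCE and _SINK nodes.
    void VisitOpNodesBefore(const tensorflow::Graph& g) {
      for (tensorflow::Node* n : g.nodes()) {
        if (!n->IsOp()) continue;  // skip _SOURCE and _SINK by hand
        // ... per-node work on n ...
      }
    }

    // After: Graph::op_nodes() already excludes _SOURCE and _SINK.
    void VisitOpNodesAfter(const tensorflow::Graph& g) {
      for (tensorflow::Node* n : g.op_nodes()) {
        (void)n;  // ... per-node work on n ...
      }
    }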

PiperOrigin-RevId: 156782112
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
73882f257ffb1bc9e1a828571c085d080b1d9266 17-May-2017 Geoffrey Irving <geoffreyi@google.com> Automated g4 rollback of changelist 156251356

PiperOrigin-RevId: 156315860
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
43db5c623f748b6f9704e9e9be5a5a11fa2a4c1a 17-May-2017 Geoffrey Irving <geoffreyi@google.com> Automated g4 rollback of changelist 156244933

PiperOrigin-RevId: 156251356
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
749e5cc18381f7a5ec174673f76e20aead8529c6 17-May-2017 Geoffrey Irving <geoffreyi@google.com> Reduce direct references to NodeDef in favor of Node and AttrSlice

This is one step towards replacing in-memory use of NodeDef with a customized
NodeInfo class. There are still quite a few Node::def() references, but far fewer than before. Those that remain require more work, for example because they are part of kernel registration (which spans a bunch of functions) or because they copy and modify the NodeDef. Follow-on CLs will remove more.

RELNOTES: n/a
PiperOrigin-RevId: 156244933
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
1d0b8c007b8bc7f77dd63c74f02d87185071f038 09-May-2017 Peter Hawkins <phawkins@google.com> Remove unnecessary copies of value parameters.

PiperOrigin-RevId: 155511618
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
535928864e296ca051fd7ceedbba915fb0e81bbe 03-Apr-2017 A. Unique TensorFlower <gardener@tensorflow.org> Increase kMaxRecursionDepth.
Change: 152042191
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
b05a83916f21becf59eff4e9db1d375eeb0fe904 16-Mar-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Don't compile functions that are marked "noinline".

The underlying function mechanism uses LocalExecutor to call the function,
which interacts poorly with the LocalExecutor used by tf2xla to translate
the TF graph into XLA.
Change: 150268961
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
91d2cc5cb4cb2d2463e3ed7ea323fc627c4a2098 23-Feb-2017 Eugene Brevdo <ebrevdo@google.com> Avoid merging adjacent XLA compilations from different scopes/functions

This is part 2 of the bugfix. It implements the XLA chain-breaking mechanism
based on a coloring of the graph (as represented by different XlaScope
strings). In part 1, we modified both experimental_jit_scope and Defun to
mark their ops as having different XlaScopes, so this is the final change that
actually enables the fusion breaking.

Also fixed a bug where xla_enabled was not True for rnn_cell_tests (the XLA benchmarks in that test were not actually being run with xla enabled).
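
A hedged sketch of the scope-based fusion break (the "_XlaScope" attribute name
and the merge rule are assumptions inferred from this description, not verified
against the change):

    #include <string>

    #include "tensorflow/core/framework/node_def_util.h"
    #include "tensorflow/core/graph/graph.h"

    // Two candidate nodes may be merged into one XLA cluster only if they do
    // not carry conflicting scope strings.
    bool ScopesAllowMerging(const tensorflow::Node& a,
                            const tensorflow::Node& b) {
      std::string scope_a, scope_b;
      bool has_a = tensorflow::GetNodeAttr(a.attrs(), "_XlaScope", &scope_a).ok();
      bool has_b = tensorflow::GetNodeAttr(b.attrs(), "_XlaScope", &scope_b).ok();
      if (!has_a || !has_b) return true;  // unscoped nodes do not break fusion
      return scope_a == scope_b;
    }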
Change: 148286731
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
542c3cbf711c4b89310fa4046c48150d29564008 22-Feb-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Add support for resource variables to the Tensorflow/XLA bridge.
Change: 148176223
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
6640d3f3de88a3f3ade8ec6e5e4540e545024f87 16-Feb-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Refactor XlaOpRegistry, moving metadata about how to compile operators on a device into a struct.
No functional changes.
Change: 147741833
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
a8c325e57c1077f1e8df540a20bd8b36d3d1f968 15-Feb-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Split XlaOpRegistry out of xla_compilation_device.{cc,h} into a separate xla_op_registry.{cc,h}.
Move XlaExpression out of xla_context.{cc,h} into xla_compilation_device.{cc,h}, since it is used to wrap computation handles on the XLA compilation device.
The change just moves code around; there are no functional changes.
Change: 147632770
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
96007205c42d591ef5cef2d7e8245b780f44f0d7 07-Feb-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Disable the XLA CPU jit by default when the JIT is requested via the OptimizerOptions.
The XLA CPU JIT is not optimized yet, and should not be enabled by default since it is usually slower than the standard TensorFlow CPU kernels.
Change: 146811646
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
c44cde12a4533571257fb30fb2e5ea1b7c6dbf7f 19-Jan-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Add support for compiling computations with no return values.
Remove check for _Send and _Recv nodes in mark_for_compilation_pass.cc.
Change: 144984665
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
c8384ed2900201f55f219e52e2fd57e2d4d48e70 13-Jan-2017 Peter Hawkins <phawkins@google.com> Add a unit test for XlaCompiler.
Add support for marking Xla computations as stateful.
Add a store for xla::ChannelHandles in XlaCompiler.
Don't mark _Send/_Recv for XLA computation.
Change: 144382814
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc
1e67c90e2caceeff82d09793d1ef5fa0300d219b 09-Jan-2017 Peter Hawkins <phawkins@google.com> Initial open-source release of XLA: Accelerated Linear Algebra.

XLA is a compiler-based linear algebra execution engine that targets CPUs, GPUs and custom accelerators.

XLA is still experimental; we are releasing it early to get the community involved.
Change: 143990941
/external/tensorflow/tensorflow/compiler/jit/mark_for_compilation_pass.cc