History log of /external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
Revision Date Author Comments
4b9ef6c8e07dea7d18f552fa4955c3176646f95d 13-Feb-2018 Jacques Pienaar <jpienaar@google.com> Rollforward switch group identification with fixes.

Fixed computing the switch depth: with the erroneous switch depth incorrect
clusters could be formed. Change the way the switch depth is determined (the
switch depth is now on the output side, so a switch always has a switch depth
one higher than all its inputs), add further checking during execution.
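
The depth rule described above can be illustrated with a toy sketch. This is not the code from functionalize_control_flow.cc: the graph representation, the function name switch_depths, and the choice that non-switch nodes inherit the maximum depth of their inputs are all assumptions made for illustration, and merge nodes are not modelled at all.

```python
from typing import Dict, List, Set


def switch_depths(inputs: Dict[str, List[str]], switches: Set[str]) -> Dict[str, int]:
    """Assign a depth to every node of an acyclic toy graph.

    Assumed rule: a node's depth is the maximum depth of its inputs,
    and a switch node adds one on top of that (depth on the output side).
    """
    depths: Dict[str, int] = {}

    def depth(node: str) -> int:
        if node not in depths:
            base = max((depth(i) for i in inputs.get(node, [])), default=0)
            depths[node] = base + 1 if node in switches else base
        return depths[node]

    for node in inputs:
        depth(node)
    return depths


# Hypothetical graph: x -> switch_a -> add -> switch_b
graph = {
    "x": [],
    "switch_a": ["x"],
    "add": ["switch_a"],
    "switch_b": ["add"],
}
result = switch_depths(graph, {"switch_a", "switch_b"})
```

Under these assumptions every switch ends up strictly deeper than each of its inputs, which is the property the commit message states.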

PiperOrigin-RevId: 185461054
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
b70a3794b6e06418b6d5b4d7142edeb78494fe7b 02-Feb-2018 Jacques Pienaar <jpienaar@google.com> Consider beyond immediate neighbors to find exit node.

Most of the exit nodes are immediate neighbors of the switch, except we do have
cases where the switch feeds into an identity that feeds into an exit.

PiperOrigin-RevId: 184297180
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
c5e02bc8fd71d73c5d05f583ce5391f26ad937d7 02-Feb-2018 Jacques Pienaar <jpienaar@google.com> Automated g4 rollback of changelist 184188816

PiperOrigin-RevId: 184213576
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
87aab43770cacbb73706ad6b11a28e9b19c1df0b 01-Feb-2018 Jacques Pienaar <jpienaar@google.com> [TFXLA] Use data flow to determine switch grouping.

* Change how switch grouping works:
- This is an intermediate step, next is combining
DetermineBranchMapAndFrontier into one traversal.
* Homogenize the naming (switch_nodes -> switches);
* Change graph dumping to be due to class member - currently still performed when vlog-level is sufficiently high;
* Pass in correct library when dumping graphs;

PiperOrigin-RevId: 184188816
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
abc543827b919e9272a742002048ebf98437ad32 12-Jan-2018 Jacques Pienaar <jpienaar@google.com> [TFXLA] Don't rely on CSE to dedup args.

Handle the case where a value is fed in via multiple switch nodes without relying on CSE to dedup the nodes as we'd only want/need to feed in the same value once per function.
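
A minimal sketch of the idea, assuming a value is identified by a (producer node, output index) pair. The names build_args, Source, and the switch names are hypothetical and do not come from the TensorFlow source; the sketch only shows how one argument can be assigned per distinct value without a prior CSE pass.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of a value: (producer node name, output index).
Source = Tuple[str, int]


def build_args(switch_inputs: List[Tuple[str, Source]]) -> Tuple[List[Source], Dict[str, int]]:
    """Assign one function argument per distinct source value.

    Returns the ordered argument list and, for each switch, the index of
    the argument that carries its input value.
    """
    args: List[Source] = []
    index_of: Dict[Source, int] = {}
    arg_for_switch: Dict[str, int] = {}
    for switch_name, source in switch_inputs:
        if source not in index_of:
            # First time this value is seen: create a new argument for it.
            index_of[source] = len(args)
            args.append(source)
        arg_for_switch[switch_name] = index_of[source]
    return args, arg_for_switch


# The value x:0 reaches the function through two different switch nodes.
args, mapping = build_args([
    ("switch_1", ("x", 0)),
    ("switch_2", ("y", 0)),
    ("switch_3", ("x", 0)),
])
```

Here switch_1 and switch_3 share a single argument, so the value is fed into the function only once.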

PiperOrigin-RevId: 181752351
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
7b52d7e72c22dba34a0391b721bc8ce808542593 05-Jan-2018 Jacques Pienaar <jpienaar@google.com> [TFXLA] Handle control edges to cond not dominated.

Graphs may have control dependency from outside the cond construct that
do not enter via a switch. If there is a control edge from outside then change
the edge to be a control edge onto the inserted XlaIf op instead and remove the
original control edge.

PiperOrigin-RevId: 180862658
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
69e5969d159fa8560eb61d82ec55b04d19bb0560 14-Dec-2017 Jacques Pienaar <jpienaar@google.com> [TFXLA] Simplify identification of cond branches.

* Remove the clustered graph part as it was difficult to keep it updated with the rest of the graph and instead operate on the graph directly;

PiperOrigin-RevId: 178980836
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
667282eb0e62bef03bbe527bef88c656532444bb 29-Nov-2017 Jacques Pienaar <jpienaar@google.com> [TFXLA] Return nullopt if no merge node found.

PiperOrigin-RevId: 177319722
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
1c7661be3337d5ab6c44300aee6a2d4001c81b27 21-Nov-2017 Jacques Pienaar <jpienaar@google.com> [TF2XLA] Flow down across switch edges separately.
* Change the way that the clustering was done by flowing down along the branches of the switch node separately;
- It was previously wrong to assume that the operands of an op are in the same control scope if they are not a switch or a merge node, as a zero-input op (such as a const) could be referenced by both "branches" of a switch without this op being exclusively in either branch.
* Change from matching a switch for a merge cluster, to matching a merge for a switch cluster:
- The new matching considers switch-merge subgraphs where all nodes within the subgraph are dominated by the switch nodes, so reversing the matching makes it easier to perform the dominance checking.
- This allows for cases where there is a cluster with a control dependency on a switch node and used by a branch of the switch.

PiperOrigin-RevId: 176446211
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
8ad5cc00f21eb9d6f1811d7ed771f6f042dba1ba 15-Nov-2017 Jacques Pienaar <jpienaar@google.com> [TFXLA] Add source node and make GetSwitchCluster more conservative.

PiperOrigin-RevId: 175758538
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
be595b28d61fe478d5441916ce1219aab6746c44 05-Nov-2017 Jacques Pienaar <jpienaar@google.com> [TF2XLA] Don't change output port for control dependency in CopySubgraph.

If the output is being squashed then we want control output 0, except where the
input is a control dependency.

PiperOrigin-RevId: 174633829
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
b2348b8e069e1efbd43452a0f5478cb09f123fbc 02-Nov-2017 A. Unique TensorFlower <gardener@tensorflow.org> Set sharding on the _Arg and _Retval nodes of a function when compiled.
In functionalize_control_flow, set the device on the Identity node for each
value that comes out of a Switch.

PiperOrigin-RevId: 174337984
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
7b3ea8e6176319467cb1a49a1a662d868a205b91 16-Oct-2017 Jacques Pienaar <jpienaar@google.com> [TF2XLA] Expand comparator and use consistently in sorting arguments.

PiperOrigin-RevId: 172376836
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
f0e3edf8b1c8de49672d78abe73dcd0b1f02620c 16-Oct-2017 Jacques Pienaar <jpienaar@google.com> [TF2XLA] Keep Switch and Merge nodes in own clusters.

* Keep Switch and Merge nodes in separate clusters to avoid creating irreducible graphs;
* Merge Switch nodes with common predicates;
* Add support for if-then structure;
* Squash trivial Switch->Merge groups;
* Merge newly Merge-free nodes with Switch- and Merge-free inputs;
* Check to see if it is a Merge node before merging to common merge node;
* Return an error if all Switches have not been replaced;
* Add test for tf.case;

PiperOrigin-RevId: 172348729
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
40d5bf33829249404f935441bac0fa1615a58c13 14-Oct-2017 Skye Wanderman-Milne <skyewm@google.com> Enable Operation._add_control_inputs() with the C API and related improvements

This change:
- Implements the C API logic for Operation._add_control_inputs()
- Adds type-checking to Operation._add_control_input()
- Makes Graph::AddControlEdge() update the node def if necessary
- Makes Graph::AddControlEdge() a no-op if the control edge already exists

The AddControlEdge() changes may have a performance impact if anything
is sensitive to AddControlEdge(), but nothing is to my knowledge. I'm
not sure what benchmarks would confirm this.

PiperOrigin-RevId: 172158589
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
0e71ecaf9512cd8a69af01ac85e5e1632171c651 06-Oct-2017 Jacques Pienaar <jpienaar@google.com> [TFXLA] Loops whose values are not consumed need no out edges.

If there is no exit node then there is no need to add output edges to it.

PiperOrigin-RevId: 171213900
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
8dc5e3718b85b72a8bc6e5a2ea8270eecfdf99a1 05-Oct-2017 Jacques Pienaar <jpienaar@google.com> [TFXLA] Functionalize tf.cond.

Convert tf.cond to functional form
output = cond ? then_branch(inputs) : else_branch(inputs)
where then_branch and else_branch are functions.
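
The functional form above can be sketched in plain Python. This is only an illustration of the shape of the rewrite, not the XLA op or the TensorFlow API; functional_cond and the two example branches are hypothetical names.

```python
from typing import Any, Callable, Sequence


def functional_cond(
    cond: bool,
    then_branch: Callable[..., Any],
    else_branch: Callable[..., Any],
    inputs: Sequence[Any],
) -> Any:
    """Evaluate exactly one branch function on the shared inputs."""
    return then_branch(*inputs) if cond else else_branch(*inputs)


# Hypothetical branch bodies; both take the same inputs.
def add(a: int, b: int) -> int:
    return a + b


def sub(a: int, b: int) -> int:
    return a - b


taken = functional_cond(True, add, sub, (5, 3))
not_taken = functional_cond(False, add, sub, (5, 3))
```

Both branches receive the same input list, so each branch becomes a self-contained function with no Switch or Merge nodes in its body.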

PiperOrigin-RevId: 171164597
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
57d17092d0e2bd6f169724beab28ec29c5e6db85 25-Jul-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Ignore control edges from Enter nodes to the graph sink during loop functionalization.

PiperOrigin-RevId: 163115904
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
90d6421c5e0898fb840197d9533c2f8ba1a7c651 11-Jul-2017 Shanqing Cai <cais@google.com> Merge changes from github.
END_PUBLIC

---
Commit d0f53f77f authored by Penghao Cen<scorpiocph@gmail.com>
Committed by Shanqing Cai<cais@google.com>:
Minor fix typo (#11323)

---
Commit 02fcf564e authored by Chris Song<sjhshy@gmail.com>
Committed by Chris Song<sjhshy@gmail.com>:
Fix misspells.

---
Commit 764c9b6b4 authored by Louis Tiao<ltiao@users.noreply.github.com>
Committed by GitHub<noreply@github.com>:
Fixed typo in docstring
---
Commit f8cd1283e authored by Shanqing Cai<cais@google.com>
Committed by Shanqing Cai<cais@google.com>:
Chaser

---
Commit 01383b946 authored by Shanqing Cai<cais@google.com>
Committed by Shanqing Cai<cais@google.com>:
Adapt TensorFlowTestCase.setUp() to new reset_default_graph() semantics

Avoid calling reset_default_graph() directly to prevent exceptions in
cases where test methods error out from within nested graph contexts,
which can leave _default_graph_stack non-empty in certain Python
versions.

---
Commit 0ffc37890 authored by Amit Patankar<amitpatankar@google.com>
Committed by Amit Patankar<amitpatankar@google.com>:
Removing second declaration of functions.

---
Commit f9c9cacb0 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Refactor ElementalIrEmitter's slice index finding code into
IrArray::Index::SourceIndexOfSlice().

PiperOrigin-RevId: 161140653

---
Commit ba297aec9 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.

PiperOrigin-RevId: 161138258

---
Commit 68d666737 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fixes a reentrant lock issue with tensors using ndarray memory which uses tensor memory.

PiperOrigin-RevId: 161137788

---
Commit a2ee8bca3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add support for int8 x int8 -> int32 matrix multiplication via cublasGemmEx to stream_executor.

PiperOrigin-RevId: 161137741

---
Commit 755fa7b50 authored by Mark Daoust<markdaoust@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Block generate_test, and docs generating from running in python3.

- Doc generation is currently unsupported in python3

- These both end in errors in python 3.5.1+

PiperOrigin-RevId: 161137467

---
Commit 97cbcac45 authored by Peter Hawkins<phawkins@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[TF:XLA] Fix failure in functionalize_control_flow rewrite for Enter nodes that are unused. Make sure we ignore such nodes without producing an error.

PiperOrigin-RevId: 161136545

---
Commit dabcb60bc authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add reasonable error messages to Builder::Build for bad parameter numbers.

PiperOrigin-RevId: 161136262

---
Commit 0cbd249e8 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add complex tensors support to `matrix_determinant`.

PiperOrigin-RevId: 161132422

---
Commit 335f1f14d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Extend static shape inference for SparseTensors with dense_shapes constructed using slicing.

PiperOrigin-RevId: 161132391

---
Commit 53604916e authored by Jianwei Xie<xiejw@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fixed the missing labels test in TPUEstimator.

PiperOrigin-RevId: 161131282

---
Commit 9f57dc8dd authored by Bruno Rosa<bruno.rosa@eldorado.org.br>
Committed by Bruno Rosa<bruno.rosa@eldorado.org.br>:
Use mcpu instead of march for ppc64le

march is not supported by gcc on ppc64le

---
Commit 7d5c74a9c authored by Skye Wanderman-Milne<skyewm@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Move duplicate detection logic from Graph to FunctionLibraryDefinition

Turns out this is more useful, since there are many function libraries
that don't belong to a graph. This will be used in a future
change. Note that this maintains the current behavior of Graph.

In addition, updates FunctionDefsEqual() to handle unset attr entries
(I ran into this when using this in said future change).

PiperOrigin-RevId: 161126628

---
Commit 2caec3af1 authored by Shanqing Cai<cais@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Disable more timeseries py tests failing in OSS PIP GPU builds

PiperOrigin-RevId: 161124799

---
Commit 0b5cce367 authored by Eugene Brevdo<ebrevdo@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Get TopK op working on GPU again. Extend using cub's radix sort.

1. Undo rollback of Andreas Kirsch's initial implementation.
2. Use cub segmented radix sort instead of Andreas' heap-based impl
for large k and small num_cols (thresholds of k=100, n=1000
determined empirically).
3. Use cub segmented radix sort if k == num_cols (this case is always faster).
4. Added benchmarks.

Benchmarks show that the GPU implementation is up to 3x slower for small k but
can be 10x faster for large num_cols and k.

Benchmarks:

Benchmark: m_128_n_10_k_5_use_gpu_False wall_time: 0.000166 s Throughput: 0.0077 GB/s
Benchmark: m_128_n_10_k_5_use_gpu_True wall_time: 0.000796 s Throughput: 0.00161 GB/s
Benchmark: m_128_n_10_k_9_use_gpu_False wall_time: 0.00017 s Throughput: 0.00751 GB/s
Benchmark: m_128_n_10_k_9_use_gpu_True wall_time: 0.000796 s Throughput: 0.00161 GB/s
Benchmark: m_128_n_10_k_10_use_gpu_False wall_time: 0.00017 s Throughput: 0.00753 GB/s
Benchmark: m_128_n_10_k_10_use_gpu_True wall_time: 0.000775 s Throughput: 0.00165 GB/s
Benchmark: m_128_n_100_k_1_use_gpu_False wall_time: 0.000155 s Throughput: 0.0826 GB/s
Benchmark: m_128_n_100_k_1_use_gpu_True wall_time: 0.000796 s Throughput: 0.0161 GB/s
Benchmark: m_128_n_100_k_50_use_gpu_False wall_time: 0.000247 s Throughput: 0.0519 GB/s
Benchmark: m_128_n_100_k_50_use_gpu_True wall_time: 0.0008 s Throughput: 0.016 GB/s
Benchmark: m_128_n_100_k_99_use_gpu_False wall_time: 0.000261 s Throughput: 0.049 GB/s
Benchmark: m_128_n_100_k_99_use_gpu_True wall_time: 0.000794 s Throughput: 0.0161 GB/s
Benchmark: m_128_n_100_k_100_use_gpu_False wall_time: 0.000239 s Throughput: 0.0536 GB/s
Benchmark: m_128_n_100_k_100_use_gpu_True wall_time: 0.000777 s Throughput: 0.0165 GB/s
Benchmark: m_128_n_1000_k_1_use_gpu_False wall_time: 0.000324 s Throughput: 0.395 GB/s
Benchmark: m_128_n_1000_k_1_use_gpu_True wall_time: 0.000916 s Throughput: 0.14 GB/s
Benchmark: m_128_n_1000_k_10_use_gpu_False wall_time: 0.00042 s Throughput: 0.305 GB/s
Benchmark: m_128_n_1000_k_10_use_gpu_True wall_time: 0.000902 s Throughput: 0.142 GB/s
Benchmark: m_128_n_1000_k_500_use_gpu_False wall_time: 0.0011 s Throughput: 0.116 GB/s
Benchmark: m_128_n_1000_k_500_use_gpu_True wall_time: 0.00097 s Throughput: 0.132 GB/s
Benchmark: m_128_n_1000_k_990_use_gpu_False wall_time: 0.00133 s Throughput: 0.0962 GB/s
Benchmark: m_128_n_1000_k_990_use_gpu_True wall_time: 0.000993 s Throughput: 0.129 GB/s
Benchmark: m_128_n_1000_k_1000_use_gpu_False wall_time: 0.00102 s Throughput: 0.126 GB/s
Benchmark: m_128_n_1000_k_1000_use_gpu_True wall_time: 0.000964 s Throughput: 0.133 GB/s
Benchmark: m_128_n_10000_k_10_use_gpu_False wall_time: 0.002 s Throughput: 0.64 GB/s
Benchmark: m_128_n_10000_k_10_use_gpu_True wall_time: 0.00288 s Throughput: 0.445 GB/s
Benchmark: m_128_n_10000_k_100_use_gpu_False wall_time: 0.00233 s Throughput: 0.549 GB/s
Benchmark: m_128_n_10000_k_100_use_gpu_True wall_time: 0.00325 s Throughput: 0.394 GB/s
Benchmark: m_128_n_10000_k_5000_use_gpu_False wall_time: 0.0127 s Throughput: 0.101 GB/s
Benchmark: m_128_n_10000_k_5000_use_gpu_True wall_time: 0.00381 s Throughput: 0.336 GB/s
Benchmark: m_128_n_10000_k_9900_use_gpu_False wall_time: 0.015 s Throughput: 0.0853 GB/s
Benchmark: m_128_n_10000_k_9900_use_gpu_True wall_time: 0.00438 s Throughput: 0.292 GB/s
Benchmark: m_128_n_10000_k_10000_use_gpu_False wall_time: 0.0104 s Throughput: 0.123 GB/s
Benchmark: m_128_n_10000_k_10000_use_gpu_True wall_time: 0.00427 s Throughput: 0.3 GB/s
Benchmark: m_128_n_100000_k_100_use_gpu_False wall_time: 0.0148 s Throughput: 0.865 GB/s
Benchmark: m_128_n_100000_k_100_use_gpu_True wall_time: 0.0262 s Throughput: 0.488 GB/s
Benchmark: m_128_n_100000_k_1000_use_gpu_False wall_time: 0.0201 s Throughput: 0.636 GB/s
Benchmark: m_128_n_100000_k_1000_use_gpu_True wall_time: 0.0263 s Throughput: 0.486 GB/s
Benchmark: m_128_n_100000_k_50000_use_gpu_False wall_time: 0.214 s Throughput: 0.0599 GB/s
Benchmark: m_128_n_100000_k_50000_use_gpu_True wall_time: 0.0322 s Throughput: 0.398 GB/s
Benchmark: m_128_n_100000_k_99000_use_gpu_False wall_time: 0.262 s Throughput: 0.0489 GB/s
Benchmark: m_128_n_100000_k_99000_use_gpu_True wall_time: 0.0377 s Throughput: 0.34 GB/s
Benchmark: m_128_n_100000_k_100000_use_gpu_False wall_time: 0.118 s Throughput: 0.108 GB/s
Benchmark: m_128_n_100000_k_100000_use_gpu_True wall_time: 0.0365 s Throughput: 0.351 GB/s

END_PUBLIC

BEGIN_PUBLIC
Automated g4 rollback of changelist 157169178

PiperOrigin-RevId: 161476569
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
97cbcac4535de361d0f04d5b30e9653e3b229c94 07-Jul-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Fix failure in functionalize_control_flow rewrite for Enter nodes that are unused. Make sure we ignore such nodes without producing an error.

PiperOrigin-RevId: 161136545
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc
3e9cd2e13da566e279396d232004d3c4ffad336e 16-Jun-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Add infrastructure in preparation for supporting tf.while_loop() in the TF/XLA bridge.

PiperOrigin-RevId: 159162832
/external/tensorflow/tensorflow/compiler/tf2xla/functionalize_control_flow.cc