ba019dc689d6393d8dba04ca57e8b01b374db14f |
|
17-Feb-2018 |
Sanjoy Das <sanjoy@google.com> |
[XLA] Add some plumbing, documentation, verification and shape inference for Gather Pretty much everything other than HLO verification and shape inference will fail for Gather with Unimplemented. Note that this CL is intentionally incomplete -- I figured it would be nicer to get some of the boiler-platey stuff out of the way early. Let me know if you want me to send in a larger but more complete CL instead. PiperOrigin-RevId: 186055521
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
13df417665f216bfb527440f1fd8f04958000ec5 |
|
16-Feb-2018 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Adds HostCompute HLO - a pseudo-op to represent host-side computation. PiperOrigin-RevId: 186047964
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
c8b884c683e260b42c15883f1c14caac4ea8d000 |
|
30-Jan-2018 |
Chris Leary <leary@google.com> |
[XLA] Plumb build options via local API. * Break build options into their own translation unit for use from local client and to mirror ExecutableRunOptions. * Add some ToString()s to aid debugging. * Add HLO graph generation regex to build options. * Add SWIG type map for ExecutableBuildOptions. Also fix a build issue occurring on some platforms with triangular_solve. PiperOrigin-RevId: 183837856
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
74b5a4cf30fc1f0fa24a41d212f4aa03dcefa990 |
|
28-Jan-2018 |
Justin Lebar <jlebar@google.com> |
[XLA] Show layouts of tuple-shaped instructions (other than kTuple) in graphs. For example the batch-norm ops return a tuple, and those values' layouts are significant. We still hide the layout on tuples, since this can be noisy. PiperOrigin-RevId: 183594622
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
5bf26acd87d3d44183fc28cb9576cda10c0255ca |
|
02-Jan-2018 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated g4 rollback of changelist 180000981 PiperOrigin-RevId: 180581912
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
711b10c280534c0ab73351bb4fd3e7ec32585236 |
|
28-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Fix hlo_graph_dumper: don't crash if the computation has a constant root instruction. PiperOrigin-RevId: 180285687
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
c0c2775ce3de682f7913d1aeaf50bbc4d1521934 |
|
23-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated g4 rollback of changelist 179983419 PiperOrigin-RevId: 180000981
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7d1072dd3374a0aa22637a0fd4a17a4ddd064110 |
|
23-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Adds FFT for XLA: CPU via Eigen, GPU via cuFFT. GPU support includes plan reuse with new scratch allocator per execution in fft_thunk. PiperOrigin-RevId: 179983419
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
e4532d20973c4c00854492362665317551661c18 |
|
22-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Merge changes from github. PiperOrigin-RevId: 179953488
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
b2aa6950db67ab980012c05d496401200ad60320 |
|
22-Dec-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Print out missing extra-info for many instructions in the HLO graph dumper. Now we use the same functionality as HloInstruction::ToString() to print instructions' extra info. This fills in a lot of previously-missing info, like reduce-windows' windows, and dots' dot-dimension-numbers. PiperOrigin-RevId: 179892469
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
ff44881e5605fe5a9a46341dbc96614356fcf7cf |
|
21-Dec-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Fix HLO graph dumper not to assume that instruction names start with "%". HLO graph dumper needs to be aware that we've gotten rid of the "%" prefix in HLO names so it doesn't print e.g. reduce reduce.42 Subcomputation: add ... but instead simply prints reduce.42 Subcomputation: add ... PiperOrigin-RevId: 179756922
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7fd2c7a7f8650a128213b19b13cb6ced65e87696 |
|
19-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add format field to layout Format will describe the method used to store array data in memory. Currently only DENSE is supported, which represents the way XLA currently stores arrays. Scalars have a DENSE format. Tuples and opaque shapes use INVALID_FORMAT. Adds checks to code that uses minor_to_major to ensure the layout is dense. PiperOrigin-RevId: 179475450
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
897ee02ee5aca8803d7a1ab217d8aeffdebd1473 |
|
17-Dec-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Shorten "custom_call_target" to "target" in XLA graph dumper. Make CustomCall nodes a bit smaller by shortening "custom_call_target=" to "target=". PiperOrigin-RevId: 179347188
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
a99b32fb149d028cd31fe638f81c6ca56c6e3b57 |
|
14-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Gather the bool parameters into one thing to control the text format. PiperOrigin-RevId: 179079727
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
ef3ee202659a2a49afcd9898451bf9b1256a2757 |
|
22-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add BitcastConvert HLO op to enable bitwise operations on floating point types. PiperOrigin-RevId: 176610007
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
4916c64836d5f51d6b8878f429bc1622c465fcdf |
|
16-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Adding kConditional opcode that represents a conditional HLO instruction. PiperOrigin-RevId: 175919301
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
2efb07ffe5d1f12a4eaef3d673f11615a8ddd6e5 |
|
16-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Fix a bug when printing fusion_kind in hlo_graph_dumper. PiperOrigin-RevId: 175915347
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
f9e3e8d8731daf338b6dc743aef84c35740ca037 |
|
14-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Hlo parser: support fusion. Also, - Add a HloInstruction::CreateFusion interface that creates a fusion instruction with given fusion computation. Add a HloComputation::SetFusionInstruction interface to help do that. - Change how we print fusion kind. Before this change we print fusion kind together with the opcode, e.g., fusion:kLoop, which is not easy to parse. Now we append fusion kind as an attribute. - Print fusion computation the same way as other computations, instead of nested in an instruction. PiperOrigin-RevId: 175621768
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
3db96abfc5432c190d3afa62ebfad3c1d82cd818 |
|
13-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Allow assigning colors based on HLO sharding information when generating Graphviz HLO graphs, via a new --xla_hlo_graph_sharding_color option. When generating TF graphs, a new --xla_hlo_tfgraph_device_scopes option allows prefixing the instruction names with a device scope. This helps the TF graph viewer better isolate the parts of the graph that are targeted to different devices, and allows rendering of graphs that would otherwise fail due to size. Changed TF/XLA broadcast lowering to propagate the request metadata into the HLO broadcast instructions. PiperOrigin-RevId: 175563052
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
685f604f63a30a8162d8762e9d8d22f171dca85e |
|
10-Nov-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Don't deemphasize nodes inside of subcomputations in dumped XLA graphs. Nodes inside of subcomputations (e.g. fusion computations) are always printed by the HLO graph dumper. Before this change, the dumper was not fully aware of this fact, leading it to mark as "deemphasized" (i.e. draw as gray with a dashed outline) nodes that had no business of being deemphasized. PiperOrigin-RevId: 175247474
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
51895becce83ef4dc8bac263377d158fc50e4d53 |
|
09-Nov-2017 |
HyoukJoong Lee <hyouklee@google.com> |
Change for asynchronous Send and Recv by splitting Send into {Send, SendDone} and Recv into {Recv, RecvDone}. See operation_semantics.md for the updated semantics. PiperOrigin-RevId: 175216012
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
4a618e411af3f808eb0f65ce4f7151450f1f16a5 |
|
08-Nov-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Print constant literals of size <= 8 elements. Previously we'd only print scalars. But if you have a constant with just a few values, what the heck, show the whole thing. PiperOrigin-RevId: 175030210
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
3b0414872f08cfabbf71a495ad661a7c892c76d8 |
|
02-Nov-2017 |
Chris Leary <leary@google.com> |
[XLA] Allow full dumps of constant values via boolean parameter. PiperOrigin-RevId: 174257660
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
0a7be5a2f58fe5470fa7526c9de1404cb16fe3dc |
|
31-Oct-2017 |
Sanjoy Das <sanjoy@google.com> |
Rename (Add|Get)ProfileResult to something more specific; NFC PiperOrigin-RevId: 174084570
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
6d1263cdf8ee8323513f984553dbeb070865fd0c |
|
31-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Remove dead opcode kIndex. PiperOrigin-RevId: 173987428
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
efcbf6e34e4519172d38be76c08c2d99792fd7be |
|
30-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Supported in this CL: * Attaching sharding descriptors to HLO ops * Partitioning the HLO graph into per-device computations based on those sharding descriptors. * All operator support for device placement and ops replicated on all devices. * Elementwise op support for tiled shardings. * 2D Convolution support for tiled shardings (no stride or dilation support). PiperOrigin-RevId: 173946036
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
e7645b629568c3ef968fa0dddeb2ff01a67e55e2 |
|
28-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] DOT dumper: Handle fusion nodes nested inside other nodes (e.g. map). PiperOrigin-RevId: 173752314
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
9f4b12bb55d102988ad9c3c064e37d85b1c4e38e |
|
28-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] DOT dumper: Print constant shape when we elide the constant's value. For example, instead of "operand 1 = %constant.42", we now print "operand 1 = %constant.42 (f32[100])". PiperOrigin-RevId: 173741373
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
4198e27be8115585ad6b5b141383fb7dc7856c24 |
|
27-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA:CPU] [XLA:GPU] Adds compiler support for C64 primitive type, including relevant elementwise unary and binary op lowering for CPU and GPU. We use a named LLVM struct "complex64", laid out the same as std::complex<float>. This named struct is accessed via the llvm::Module, which required changes to accessors of PrimitiveTypeToIrType & friends. Ops that require atan2 (in particular, angle and log) are only supported on GPU at this point. LLVM lacks a CPU intrinsic for atan or atan2, whereas libdevice provides this for GPU. PiperOrigin-RevId: 173676849
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
24105d9a83dff9b46326373a7c4fd7fd254f32f0 |
|
26-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Merge large parameter-shaped tuples into their users in DOT graphs. It's common to have a while loop whose body computation has one parameter, a giant tuple. Then we have to draw edges from that tuple to a bunch of get-tuple-element nodes, which are used throughout the while loop's body. This results in many long, difficult-to-follow edges. In practice, the big tuple really functions as N separate parameters. This patch represents it this way visually, erasing the big tuple and replacing it with the get-tuple-element users, which we style like parameters. Future work is figuring out how to do something similar for the tuple op at the bottom of while loop bodies. This will be harder, because it will require breaking the invariant that every HLO corresponds to zero or one nodes in the dot graph. PiperOrigin-RevId: 173584100
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
5965a76ea72e266fba9b78adc94ec4ee71029ece |
|
26-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] De-emphasize uninteresting nodes in the HLO graph dump. The hope is that this will make expensive / interesting ops easier to see. PiperOrigin-RevId: 173478095
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
a80b9297f330be6777a23e2e3a3b6e21097d1926 |
|
26-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add device assignment export to graph dumpers. PiperOrigin-RevId: 173472156
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
8087e67252bca4075e59ab75023826dae23dfb74 |
|
26-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Remove dead kUpdate opcode. PiperOrigin-RevId: 173462881
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
1c241e5ba7fa7068f9cf8f925638b170db57c438 |
|
13-Oct-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Add ShiftLeft, ShiftRightArithmetic, and ShiftRightLogical operators. PiperOrigin-RevId: 172091595
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
bb789adc1543684512aab1c83b13872b9ca27c63 |
|
09-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Rename HloOpcode::kLogicalX to kX PiperOrigin-RevId: 171536686
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
9b1b5d85b9ce3c812dc772da1f3f5d09581e5b49 |
|
29-Sep-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Make HloComputation::instructions() return a view of HloInstruction*s. Currently it returns a view of unique_ptr<HloInstruction>s. But the fact that these are unique_ptrs is an implementation detail, and it's ugly to leak it everywhere. PiperOrigin-RevId: 170445375
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
b9611a5fd29cf5ab34aa06e6464f178154ba202f |
|
25-Sep-2017 |
Chris Leary <leary@google.com> |
[XLA] Add support for QuantizeAndDequantizeV2. PiperOrigin-RevId: 169955636
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
83066d45ee670d77349e124eb1258cff7045d4e7 |
|
22-Sep-2017 |
Justin Lebar <jlebar@google.com> |
Revamp handling of subcomputations in HLO graph dumper. Before, we relied on a hacky heuristic -- "recurse into nested fusion nodes" -- that didn't work for the case when e.g. a fusion node was nested inside a while loop. This change also adds a (very basic) testcase for the HLO graph dumper. PiperOrigin-RevId: 169731958
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
a1a5ed233ccbc05250020b3518f8405b32d6822e |
|
22-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add show_metadata argument to hlo_graph_dumper::DumpGraph. PiperOrigin-RevId: 169705577
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
d10902f0a947da40f80479d74e9a487617759085 |
|
19-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Fix HLO-graph-dump handling of nodes near computation edges, and indicate root node. Previously, attempting to dump the neighborhood of a node that was too close to a computation edge would result in a crash when we tried to create an edge that went outside the bounds of the computation to a node that wasn't included in the graph. This CL fixes that bug. In addition, we previously did not indicate the root node of the outermost computation, so that it wasn't obvious whether a "bottom" node on the graph was actually the root node or simply didn't have its outputs drawn (or didn't have any outputs). This CL adds a "pseudonode" tag onto the root node that makes it clear that it's the root. This also adds a little verbosity to the graph title. When the graphed computation is a fused computation, we also mention which fusion instruction it's associated with. Finally, this adds some verbose-mode logging that's useful to trace the operation of the graph-generation. (I added this when debugging the crash mentioned above.) PiperOrigin-RevId: 169182331
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
f0550d0aafa81bc6361cab3aa13990c56166a197 |
|
19-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Include stride information for slice in hlo graph dump and hlo string. PiperOrigin-RevId: 169166848
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
3a98035fa8fe8d02960c605e210fbf8af2d14516 |
|
12-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Augment metadata output with source-line info, as before. PiperOrigin-RevId: 168292527
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
2494aa452b07fa5ae88f01ceb49faaa51f4a3baf |
|
07-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Optionally add metadata lines to graph neighborhood dumps. PiperOrigin-RevId: 167911962
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
91617d22fc5868948a361e04a0642a765a092544 |
|
31-Aug-2017 |
David Majnemer <majnemer@google.com> |
[XLA] Dump nested fusion nodes without crashing PiperOrigin-RevId: 167194247
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
e565d1f1fced69789feb10f1ea1241157ec95f93 |
|
30-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Refactor parent-fusion-instruction pointer into HloComputation, not HloInstruction. Presently, each instruction inside a fusion computation contains a pointer to the fusion instruction that contains the computation, which is redundant since this is common across the entire computation. This leads to lots of places where this pointer must be set when adding an instruction to the fusion computation (and bugs such as b/65177535 when one is missed), as well as code to check that it's set correctly. In addition, this is simply unnecessary data bloat. Moreover, the computation itself does not contain a pointer to the fusion instruction that references it, which leads to odd circumlocutions in the HloComputation code that retrieve the fusion instruction from the computation's root instruction. Thus, this CL moves this pointer into the HloComputation class (replacing the is_fusion_computation_ bool value), and refactor the uses as necessary. PiperOrigin-RevId: 167039280
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
6a8fab32c29b78bdf920582761780f48bee1b6d0 |
|
25-Aug-2017 |
Mark Heffernan <meheff@google.com> |
Show control edges in HLO DOT graph (as dotted lines). Also list control successors in HloInstruction::ToString dump. PiperOrigin-RevId: 166521980
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7359fec792e4efec1670a12332bb524a5608b215 |
|
18-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Implement Batchnorm Inference by expanding it into smaller ops. 1. Add batch norm inference support in batchnorm_rewriter 2. Connect xla's batchnorm inference to tf's FusedBatchNorm RELNOTES: n/a PiperOrigin-RevId: 165655351
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
8e16d674575e9defae488a53ded2764ff4d05518 |
|
03-Aug-2017 |
Justin Lebar <jlebar@google.com> |
Fix dumping of non-fusion subgraphs. Regressed as part of recent changes. PiperOrigin-RevId: 164061267
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
dfd0df11488f27a89ced83815b69d86ea4bdd44a |
|
02-Aug-2017 |
Justin Lebar <jlebar@google.com> |
Show HLO profile data for fusion nodes in HLO graphs again. Recent refactorings regressed this. PiperOrigin-RevId: 163991623
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
724884f1cac11956c64e512fe40c1f02de7f5061 |
|
01-Aug-2017 |
Justin Lebar <jlebar@google.com> |
Show layouts in HLO graph dump. Layouts are displayed as e.g. "f32[100,200]{0,1}". But constants used to be displayed as e.g. "f32[]{42}". To avoid ambiguity, constants are now displayed as e.g. "42 (f32[])". Also gets rid of the xla_hlo_graph_layout flag, which is no longer necessary since we're now showing layouts unconditionally. PiperOrigin-RevId: 163753637
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
6028c071b592330fabf107df08d880ec443d6844 |
|
27-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Highlight incoming/outgoing edges on hover in HLO graphviz dumps, and other improvements. Other improvements: - Don't show tooltips for nodes and clusters. Previously we'd show a tooltip containing a pointer value expressed as decimal. Not so useful. - Show tooltips on edges with the to/from node names. - Fix bug wherein if we had - a node at the "edge" of the graph (so its operands aren't included unless they're referenced by another node), - with all of its operands included in the graph save one or more constants, and - those constants weren't referenced by any nodes not at the edge of the graph, we would incorrectly draw the node as "grayed out", indicating that one of its operands (namely, its constant operand) wasn't present in the graph. This is wrong because constants are inlined into their users, so they should always count as "displayed" for the purposes of determining whether a node is grayed out. PiperOrigin-RevId: 163276108
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
d05bf9a30f8297fea4fa391702d17203767d0c3c |
|
25-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Show fusion nodes inline in HLO graph dumper. To make this work sanely I had to change NodeFilter so that it says to dump all nodes inside subcomputations. Previously, we passed an explicit NodeFilter down to DumpSubcomputation, and used that to control whether or not we dumped nodes in there. But this becomes unwieldy with inline fusion nodes, as sometimes you want to look at 'filter', and other times you want to look at 'filter_', and there's no good way to tell why. I also had to remove the heuristic whereby we'd pull in operands of nodes with just some operands shown. With the much bigger nodes that are generated by this change, the graph was becoming illegible. I think most of the confusion that heuristic was attempting to avoid is addressed by the fact that we "gray out" incomplete nodes. PiperOrigin-RevId: 163091423
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7349de8e967c1c943a6b0844718f3933333aa8a3 |
|
25-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Improve the HLO graph dumper's output. - Truncate long shapes. It's not uncommon to have giant tuples, and displaying the whole thing makes the graph unreadable. - Don't traverse into the users of a node with < 16 users. These are probably not interesting, and traversing into them can quickly blow up the graph, making it un-renderable. - Allow nodes which have multiple trivial subcomputations (e.g. select-and-scatter) to have those computations inlined. - Match additional patterns in MatchTrivialComputation PiperOrigin-RevId: 163079329
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
6610b3ec6bdb1a8843070a9fea6e4612681b9318 |
|
25-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Refactor HLO graph dumping. This also makes a few minor cosmetic changes, like moving the fusion type out of the fusion node and into the out-of-line computation and adjusting the arrow labels that we use to indicate operand numbers. PiperOrigin-RevId: 163038795
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
2661f6841d0ad9ec1381d177a1f9df02e73d001c |
|
24-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add support for sin(x) transcendental. PiperOrigin-RevId: 162889962
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
51cbb58ca5147218b3995dc124bd92927d93e913 |
|
21-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Some improvements to HLO graphviz dumping. - If all params are filtered out, don't show an empty "parameters" node. - If the node name contains the opcode name, omit the opcode. For example, show "%add.4" instead of "add %add.4". This greatly shrinks the width of our graphs. - (For nodes without a lot of operands), always show either none or all of the operands. - Show nodes with some or all operands elided as "grayed out" to make it clear that these are the "edges" of our neighborhood. - Don't show an out-of-line computation for e.g. "add reduce". Instead, simply show it as "%reduce.42<br>Subcomputation: add". - Split up parameter nodes. Previously all params were fused into one big node, but now each parameter is its own node. This is useful because otherwise graphviz has to route long edges from the top of the graph to nodes that use params at the bottom of the graph. - Inline constants into their users, instead of displaying them as separate nodes. This is particularly helpful when a constant (:cough: zero) is used many times, because otherwise we have to draw many long edges all over the graph. PiperOrigin-RevId: 162778619
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
78e7cffa71df4d5057b8a87f78f2d91e421a36f7 |
|
21-Jul-2017 |
Shanqing Cai <cais@google.com> |
Fix open-source build breakage related to std::deque PiperOrigin-RevId: 162701622
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
386f4aef0d05489cc3a4cdc01470533849569dba |
|
21-Jul-2017 |
Justin Lebar <jlebar@google.com> |
Add hlo_graph_dumper::GetInstructionsInNeighborhood, which lets you graph the nodes that are "near" a particular node. PiperOrigin-RevId: 162692461
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7d0f6385f8e7637e155ef9c340c19aded365a6ff |
|
07-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[BatchNorm] Skeleton code to implement BatchNormGrad This CL sets up all the boilerplate code needed to implement BatchNormGrad. None of the backends has been implemented yet. RELNOTES: n/a PiperOrigin-RevId: 161161713
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
7ab72bf2205b1775607932b6ccbcd7099368705e |
|
28-Jun-2017 |
Eli Bendersky <eliben@google.com> |
[XLA] Move remaining hlo graph dumper flags into debug_options. Also pipe debug_options through the code in hlo_graph_dumper, since the number of individual parameters was growing too large. PiperOrigin-RevId: 160446088
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
a3a7d1ac38da8fec75ae5a0eaee743b065a9b85c |
|
26-Jun-2017 |
Eli Bendersky <eliben@google.com> |
[XLA] Move HLO dumping flags from service_flags to debug_options_flags This also removes the duplication in the xla_generate_hlo_graph flag. This CL also moves the actual dumping logic from Executable to the hlo_graph_dumper namespace, where it belongs; this is in preparation for removing the hlo_dumper callback altogether, since it isn't serving any role beyond what a direct call to hlo_graph_dumper would have (b/62872831 has more details). PiperOrigin-RevId: 160154869
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
46737e4e81314f7482bfd6a710f126a27f5d7975 |
|
19-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Remove class xla::LiteralUtil. NFC (mind-numbingly so). This patch removes class xla::LiteralUtil and rewrites every call to use class xla::Literal instead. PiperOrigin-RevId: 159446373
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
9d2a432ce74eab4c439fe8c60389e4da9d6c92b2 |
|
17-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add plumbing for a ReducePrecision operation. This CL is the first part of a series that adds a ReducePrecision operation for experimenting with the effects of reduced-precision storage of intermediate values. ReducePrecision is a Unary operation parameterized on floating-point exponent and mantissa bit sizes, and rounds the input data as if it were converted to a floating-point value with the given bit sizes and then converted back to "normal" F32 data. Using arbitrary parameterized values to describe the lower-precision value type, rather than hardcoding this as a reduction to IEEE f16, allows us to do more flexible experiments -- e.g., "Is this training error due to the reduced mantissa precision, or due to the reduced exponent range?" or "Is this a smooth degradation with reduced precision or is there a sudden drop at some value?" -- which may suggest software mitigations for the effects. This version of the CL adds the kReducePrecision instruction opcode, and the overall plumbing to support the operation. To allow testing, it includes an exceptionally simple implementation of the actual operation that returns "unimplemented" except for the exponent and mantissa bit sizes where it is a complete no-op. PiperOrigin-RevId: 159295615
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
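The ReducePrecision semantics described in the commit above (round an f32 as if stored with fewer mantissa bits, then convert back) can be sketched as a round-to-nearest-even pass over the f32 bit pattern. This is a hypothetical illustration, not XLA's implementation; exponent narrowing (overflow to infinity, underflow to zero) is omitted, so the exponent_bits argument is accepted but unused here.

```python
import struct

def reduce_precision(x, exponent_bits, mantissa_bits):
    """Round an f32 value as if it kept only `mantissa_bits` of mantissa.

    Sketch only: exponent narrowing is not modeled, so `exponent_bits`
    is unused. With mantissa_bits=23 the operation is a no-op, matching
    the commit's description of the initial implementation.
    """
    assert 0 < mantissa_bits <= 23
    # Reinterpret as the 32-bit pattern: 1 sign, 8 exponent, 23 mantissa bits.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    dropped = 23 - mantissa_bits
    if dropped == 0:
        # Full f32 mantissa requested: nothing to round.
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    # Round to nearest, ties to even: add (kept LSB) + (half - 1), then
    # clear the dropped low mantissa bits.
    bits += ((bits >> dropped) & 1) + ((1 << (dropped - 1)) - 1)
    bits &= ~((1 << dropped) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

For example, with a 10-bit mantissa, 1 + 2^-11 rounds back to 1.0, while values already representable (1.0, 1.5) pass through unchanged.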
df511d09b051914cbc4fc559807a3f0d07dfee71 |
|
14-Jun-2017 |
Petros Mol <pmol@google.com> |
[XLA] Add a Cos unary operation that computes the elementwise cosine PiperOrigin-RevId: 158984883
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
b6039c875290cdd5c9a62e01393b75b928827504 |
|
14-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
We believe a fused version of batch_norm_op can speed the algorithm up. This PR implements a new op, fused_batch_norm_op, in tf-xla and HLO. This is the CPU implementation for batch norm training. This CL is big, but a lot of the code is boilerplate. PiperOrigin-RevId: 158930166
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
d9fc16f61e89b12681b24dc4206370e5a74f4c6f |
|
09-May-2017 |
Justin Lebar <jlebar@google.com> |
Unbreak HLO graph dumping for graphs that include a convolution or a C++ template (e.g. "max<float>"). These descriptions include angle brackets, which we need to sanitize out. PiperOrigin-RevId: 155521374
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
10096f75722e56eddb95512532842e3082b23cfc |
|
08-May-2017 |
Justin Lebar <jlebar@google.com> |
Beautify HLO dot output. - Show the value of effective scalar constants. Previously we only showed the values of true (i.e., R0) scalars. - Change the graph's font to Roboto. Previously it was...whatever the default is -- Times New Roman? - Place the graph's heading at the top of the graph, instead of wherever the dot renderer pleases. - Make all nodes other than "while" rectangles, and use rectangles with rounded corners for nested computations, to make them easier to distinguish from the other nodes. - Make nodes' opcodes and graph headings bold. - Tweak the graph's colors. * Now we use Material Design colors, from https://material.io/color. These are in general much lighter, which I think lets the reader focus more on the nodes' contents. * tuple and get-tuple-element are now white, to reflect that they are usually nops. This makes graphs with many GTE ops much easier to parse. (I think this is the most impactful change in this patch.) * We use fewer unique colors now, again in an effort to help readers focus on the graph itself. This necessitates making some ops that previously had different colors share colors: - send, recv, infeed, outfeed, and cross-replica-sum are all now the same color, since they represent data transfer. - dot and conv are now both the same (bold) color, since these are often the most expensive ops in models. - map and fusion are now the same color, since they both are wrappers around "lambda functions". PiperOrigin-RevId: 155411313
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
0555d089a0be69b0c7f2139706d795fdcfef6f49 |
|
26-Apr-2017 |
David Majnemer <majnemer@google.com> |
[XLA] Use "box" instead of "boxed" in the Graphviz dumper Running dot on an XLA graph sometimes shows: Warning: using box for unknown shape boxed According to the Graphviz sources, there is no such shape: https://github.com/ellson/graphviz/blob/3ccbd94343d65ca489ce8568539f39c21a6c6131/lib/common/shapes.c#L233 Change: 154234632
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
62850f51dd5e978ac243695efab753490a52ca15 |
|
12-Apr-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Support dumping HLO graphs as TF GraphDefs in hlo_graph_dumper - Added a new --xla_hlo_dump_as_graphdef TF_XLA_FLAGS - Moved hlo_tfgraph_builder from xla/tools/ to xla/service/ - Refactored GraphRendererInterface a bit to support both dot graph and tf graph. Change: 152921467
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
3088d3664a99e7cb81ee190f4d65f4bd10407f42 |
|
29-Mar-2017 |
David Majnemer <majnemer@google.com> |
[XLA] Move kPad from GpuElementalIrEmitter::MakeElementGenerator to ElementalIrEmitter::MakeElementGenerator There is nothing GPU specific in GpuElementalIrEmitter::MakeElementGenerator for kPad. Move it into the base implementation so that all subclasses have it as an implementation. Change: 151564674
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
1258b206590d9460f87f0aaab0c9f9ccba3b1bfe |
|
16-Mar-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Refactor convolution dimension numbers and windows dumping code and remove duplicate code in hlo_graph_dumper Change: 150324515
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
fc112a6b53d782eacb46eb357a8720d6b5a5d3cc |
|
11-Mar-2017 |
Mark Heffernan <meheff@google.com> |
[XLA] Replace uses of std::set with std::vector. std::set is slow and the iteration order is unstable. A couple of other opportunistic changes include consolidating all called computations of an instruction in a single vector. This facilitates fast access to all called computations. Also, replace AddControlSuccessor/Predecessor with Add/RemoveControlDependencyTo, which is less error-prone as you can't create a half-connected control edge. Change: 149810889
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
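The entry above trades std::set for std::vector to get deterministic, insertion-ordered iteration: a std::set keyed on pointers iterates in pointer-value order, which differs across runs. A minimal sketch of the vector-based pattern; the helper name is an illustrative assumption, not the actual HloInstruction API:

```cpp
#include <algorithm>
#include <vector>

// Maintain a deduplicated list in insertion order using std::vector.
// Unlike a std::set of pointers, iteration order here is stable across
// runs. Linear search is fine for the small per-instruction lists involved.
// Illustrative sketch only.
template <typename T>
void InsertUnique(std::vector<T>& vec, const T& value) {
  if (std::find(vec.begin(), vec.end(), value) == vec.end()) {
    vec.push_back(value);
  }
}
```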
|
0386a01ad3beb28364599d82199be1c0837b3fa9 |
|
10-Mar-2017 |
Dandelion Mané <dandelion@google.com> |
Merge changes from github. Change: 149800363
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
00d0347ccebc3e29ffe541703b5a2f929b89da36 |
|
10-Mar-2017 |
Brennan Saeta <saeta@google.com> |
[TF:XLA] Add debug metadata to HLO ops. In order to support end-to-end debugging and performance profiling tooling for the TensorFlow::XLA toolchain, this change adds a DebugMetadata proto to the HloInstruction class, and pipes it through the tf2xla stack. Change: 149703349
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
1f9518b6ec8adde10cb127623855cb59d4e67d9d |
|
06-Mar-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Not clobber files when dumping HLO dot graph Change: 149340560
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
af2c7253bb1f9d135ad9b0c6a271741205ab57fd |
|
02-Mar-2017 |
David Majnemer <majnemer@google.com> |
[XLA] Add support for profiling multiple computations While we are here, add support for getting the cost analysis for call HLOs. Change: 148952748
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
8120e2a270c28e0a62b9f522164b196a90f113b7 |
|
24-Feb-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Add an IsFinite operation that tests elementwise whether values are finite (i.e., not NaN or Inf). Change: 148485205
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
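The IsFinite op described above tests, elementwise, that each value is neither NaN nor +/-Inf. A sketch of that contract over a flat float buffer; this mirrors the op's semantics only, not the XLA implementation:

```cpp
#include <cmath>
#include <vector>

// Elementwise finiteness test: true iff the value is not NaN and not
// +/-infinity, matching the semantics of the IsFinite HLO op described
// above. Illustrative sketch over a flat buffer, not the real emitter.
std::vector<bool> IsFinite(const std::vector<float>& xs) {
  std::vector<bool> out;
  out.reserve(xs.size());
  for (float x : xs) {
    out.push_back(std::isfinite(x));
  }
  return out;
}
```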
|
9113e98115ecbeb1404edb7d14d2cf443f2484bf |
|
27-Jan-2017 |
Tayo Oguntebi <tayo@google.com> |
Addition of Outfeed HLO op. Change: 145772331
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
04b30700fea43a8a5f47e6d189333f5b38644116 |
|
13-Jan-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add a flag do_prefix to hlo_graph_dumper::DumpText() Change: 144402914
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|
1e67c90e2caceeff82d09793d1ef5fa0300d219b |
|
09-Jan-2017 |
Peter Hawkins <phawkins@google.com> |
Initial open-source release of XLA: Accelerated Linear Algebra. XLA is a compiler-based linear algebra execution engine that targets CPUs, GPUs and custom accelerators. XLA is still experimental; we are releasing it early to get the community involved. Change: 143990941
/external/tensorflow/tensorflow/compiler/xla/service/hlo_graph_dumper.cc
|