ba019dc689d6393d8dba04ca57e8b01b374db14f |
|
17-Feb-2018 |
Sanjoy Das <sanjoy@google.com> |
[XLA] Add some plumbing, documentation, verification and shape inference for Gather

Pretty much everything other than HLO verification and shape inference will fail for Gather with Unimplemented.

Note that this CL is intentionally incomplete -- I figured it would be nicer to get some of the boiler-platey stuff out of the way early. Let me know if you want me to send in a larger but more complete CL instead.

PiperOrigin-RevId: 186055521
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
13df417665f216bfb527440f1fd8f04958000ec5 |
|
16-Feb-2018 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Adds HostCompute HLO - a pseudo-op to represent host-side computation. PiperOrigin-RevId: 186047964
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
dd2447af6afe43bf9b10d18a6f438f47963f43dc |
|
12-Feb-2018 |
Brian Patton <bjp@google.com> |
For debugging purposes, it can be useful to know which ops are considered non-pure / non-constant. PiperOrigin-RevId: 185371882
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
09d2efecf44bf313d2e03abdc1c8884cf48e23ae |
|
03-Feb-2018 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Propagate outfeed sharding, if specified from TensorFlow. PiperOrigin-RevId: 184361221
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
a58524fa602829459aa7eb0335a33afe1f28382a |
|
19-Jan-2018 |
Chris Leary <leary@google.com> |
[XLA] Simplify trivial pad/reduce-window combos into broadcasts. PiperOrigin-RevId: 182585236
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
c2f1e2d8dba2fc568ab4c4eb42b0ae24a0c0b02b |
|
18-Jan-2018 |
Chris Leary <leary@google.com> |
[XLA] Add source mapping support to SWIG API. PiperOrigin-RevId: 182292142
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7d64e124103c8334b7d8b127cd2eff786959d185 |
|
06-Jan-2018 |
Mark Heffernan <meheff@google.com> |
Remove protobuf-compatibility methods from the Literal class.

This CL primarily does two things:

(1) Remove the protobuf-compatibility methods (e.g., mutable_f32s()) from Literal. These were added to Literal as part of the migration of Literal from a proto to a C++ class. Now that Literal is a proper class, these protobuf methods make it difficult to enforce invariants and expose too much of the class' implementation details.

(2) Make shape an immutable property of Literals, and make shape and the data members holding the Literal data coherent by construction. Previously, the shape could be set arbitrarily, and the data members such as f32_ could be arbitrarily sized irrespective of the shape of the literal.

The remainder of the CL mostly deals with the fallout. Notable other changes:

- Literal is no longer a recursive data structure. To avoid copies when passing a subliteral of a tuple-shaped Literal, a LiteralView class is added which provides a read-only view of an arbitrary subliteral.
- Tuple-shaped Literals can no longer be built up incrementally, so to avoid copying Literal values during construction, the following methods with move semantics are added: Literal::MoveFrom and Literal::MoveIntoTuple. These methods transfer ownership of the underlying buffers, enabling, for example, a literal to be moved into an element of a tuple-shaped literal with no data copying.
- Replace the internal data structure holding the actual data from a bunch of std::vectors (e.g., s32s_, f32s, etc.) to a single ShapeTree<char*>. This significantly simplifies accessors and makes improved support of tuple-shaped literals much easier (e.g., Literal::Get<>() can now access elements in arbitrary subliterals).

Also, Literal is made movable, but not copyable. Otherwise, it is all too easy to accidentally introduce expensive copies of Literals. Literal::Clone is added to handle the case where a copy is needed (Literal::CloneToUnique already exists).

PiperOrigin-RevId: 181014890
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7753ab0b4aa3ff989b61725cbf42c4c57e176999 |
|
04-Jan-2018 |
David Majnemer <majnemer@google.com> |
[XLA] Remove RNG_BERNOULLI

RNG_BERNOULLI is easy to compose out of other operations and appears to provide no real benefit. Let's remove it.

PiperOrigin-RevId: 180726889
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
09b99aa8c323f3fef7d8231237abb42b3052dd11 |
|
03-Jan-2018 |
Chris Leary <leary@google.com> |
[XLA] Document the "Operand to ComputeConstant" error better at the XLA level. PiperOrigin-RevId: 180614338
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
5bf26acd87d3d44183fc28cb9576cda10c0255ca |
|
02-Jan-2018 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated g4 rollback of changelist 180000981 PiperOrigin-RevId: 180581912
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
c0c2775ce3de682f7913d1aeaf50bbc4d1521934 |
|
23-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated g4 rollback of changelist 179983419 PiperOrigin-RevId: 180000981
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7d1072dd3374a0aa22637a0fd4a17a4ddd064110 |
|
23-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Adds FFT for XLA: CPU via Eigen, GPU via cuFFT. GPU support includes plan reuse with new scratch allocator per execution in fft_thunk. PiperOrigin-RevId: 179983419
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
6a9a9ed0e1f5eded19d793b2be125d2d845cf079 |
|
22-Dec-2017 |
Justin Lebar <jlebar@google.com> |
[XLA:GPU] Implement BatchNormThunk as a call into cudnn. Using cudnn for these calls is disabled by default, because it's not a performance win on our benchmarks. PiperOrigin-RevId: 179882911
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
e289dfd636bfab31232a511b0e96a785571ada92 |
|
19-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Fix typo in comment for GetRoot(). PiperOrigin-RevId: 179486882
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
0ea2d74f883914109eb154bcf2a7d61ae0557f2d |
|
15-Dec-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Remove the notion of a "parameter name" separate from the instruction's name. Also set the instruction's name in the HLO parser, so that after parsing, the instructions have the names they're given in the input string. PiperOrigin-RevId: 179119003
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
0e9cc7f3113ade82436729bd541f6b501d023ac0 |
|
08-Dec-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Implement Conditional in XLA service, client ComputationBuilder, and CPU backend. PiperOrigin-RevId: 178322445
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
22767d59b3c6958ed690814ff77e29ee1d458b18 |
|
06-Dec-2017 |
Bjarke Hammersholt Roune <broune@google.com> |
Allow CrossReplicaSum to take multiple operands internally. PiperOrigin-RevId: 178043362
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
4146ff1259c0b4ada8afbbad11a7b37d8373d1b9 |
|
30-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Adds Dot with DotDimensionNumbers proto for specifying arbitrary contracting and batch dimensions. PiperOrigin-RevId: 177481231
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
6cc7e387fc1b642d363b6a18877a411382a82fa5 |
|
27-Nov-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Implement StatelessRandomUniform and StatelessRandomNormal using the ThreeFry counter-based PRNG. Extend stateless ops to allow 32-bit integer seeds, with a 64-bit default. PiperOrigin-RevId: 177068747
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
ef3ee202659a2a49afcd9898451bf9b1256a2757 |
|
22-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add BitcastConvert HLO op to enable bitwise operations on floating point types. PiperOrigin-RevId: 176610007
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
c518d35b9077bd193321f8b66dfb958ce9ab61cd |
|
22-Nov-2017 |
Kay Zhu <kayzhu@google.com> |
[XLA] Enable explicit broadcast for ternary operations.

Also explicitly broadcast constant 1 in algsimp for pow(x, -1) => 1/x transformation, so that:
- we can avoid implicit broadcast, which we are trying to eliminate at HLO level.
- interpreter, which does not support implicit broadcast, now passes the PowSpecialF32 test case in array_elementwise_ops_test, which generates a divide(1.F32[], param.F[4]) instruction that requires implicit broadcast.

PiperOrigin-RevId: 176582286
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
3db96abfc5432c190d3afa62ebfad3c1d82cd818 |
|
13-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Allow assigning colors based on HLO sharding information when generating Graphviz HLO graphs, via a new --xla_hlo_graph_sharding_color option. When generating TF graphs, a new --xla_hlo_tfgraph_device_scopes option allows prefixing instruction names with a device scope. This helps the TF graph viewer better isolate the parts of the graph that are targeted to different devices, and allows rendering of graphs that would otherwise be too large to render. Changed TF/XLA broadcast lowering to propagate the request metadata into the HLO broadcast instructions. PiperOrigin-RevId: 175563052
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
51895becce83ef4dc8bac263377d158fc50e4d53 |
|
09-Nov-2017 |
HyoukJoong Lee <hyouklee@google.com> |
Change for asynchronous Send and Recv by splitting Send into {Send, SendDone} and Recv into {Recv, RecvDone}. See operation_semantics.md for the updated semantics. PiperOrigin-RevId: 175216012
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
e07ce40153871321361d7adaeabe4b83a739424f |
|
07-Nov-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Previously, if ComputeConstant saw a parameter, it failed to proceed. After this change we can pass it a list of parameters, and if enough of them are specified it will do the computation. The primary goal of this change is to make the HloEvaluator usable with ComputationBuilder from tests through ComputeConstant in cases where the input is a parameter (fed by a literal). PiperOrigin-RevId: 174845108
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
6d1263cdf8ee8323513f984553dbeb070865fd0c |
|
31-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Remove dead opcode kIndex. PiperOrigin-RevId: 173987428
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
efcbf6e34e4519172d38be76c08c2d99792fd7be |
|
30-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Supported in this CL:
* Attaching sharding descriptors to HLO ops
* Partitioning the HLO graph into per-device computations based on those sharding descriptors.
* All operator support for device placement and ops replicated on all devices.
* Elementwise op support for tiled shardings.
* 2D Convolution support for tiled shardings (no stride or dilation support).

PiperOrigin-RevId: 173946036
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
4198e27be8115585ad6b5b141383fb7dc7856c24 |
|
27-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA:CPU] [XLA:GPU] Adds compiler support for C64 primitive type, including relevant elementwise unary and binary op lowering for CPU and GPU. We use a named LLVM struct "complex64", laid out the same as std::complex<float>. This named struct is accessed via the llvm::Module, which required changes to accessors of PrimitiveTypeToIrType & friends. Ops that require atan2 (in particular, angle and log) are only supported on GPU at this point. LLVM lacks a CPU intrinsic for atan or atan2, whereas libdevice provides this for GPU. PiperOrigin-RevId: 173676849
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
8087e67252bca4075e59ab75023826dae23dfb74 |
|
26-Oct-2017 |
Justin Lebar <jlebar@google.com> |
[XLA] Remove dead kUpdate opcode. PiperOrigin-RevId: 173462881
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
b68a3f2e445cdc749f380387b910f6eac72e5dcf |
|
20-Oct-2017 |
Yunxing Dai <yunxing@google.com> |
Iterating through a map in protobuf is essentially nondeterministic. This CL enables us to traverse the map in a deterministic order by sorting the keys first. PiperOrigin-RevId: 172918084
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
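The idea described in the commit above, visiting map entries in sorted-key order so that traversal does not depend on the container's internal ordering, can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the C++ change itself: the function name `iterate_deterministically` and the sample `requests` map are hypothetical and do not appear in the commit.

```python
def iterate_deterministically(mapping):
    """Yield (key, value) pairs in ascending key order.

    Sorting the keys first makes the traversal order depend only on the
    keys themselves, not on how the container happens to store them.
    """
    for key in sorted(mapping):
        yield key, mapping[key]


# Hypothetical sample data: integer handles mapped to request names.
requests = {3: "add", 1: "parameter", 2: "constant"}
ordered = list(iterate_deterministically(requests))
```

The trade-off is one sort per traversal, in exchange for output that is reproducible from run to run.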
|
1c241e5ba7fa7068f9cf8f925638b170db57c438 |
|
13-Oct-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Add ShiftLeft, ShiftRightArithmetic, and ShiftRightLogical operators. PiperOrigin-RevId: 172091595
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
721fbda83fc0cb00c9bf9ed461c8fc3084f42fe1 |
|
10-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Rename BINOP_LOGICAL_X to BINOP_X PiperOrigin-RevId: 171716540
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
bb789adc1543684512aab1c83b13872b9ca27c63 |
|
09-Oct-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Rename HloOpcode::kLogicalX to kX PiperOrigin-RevId: 171536686
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
8964d1b1ee5170686cb0d2969047b14eccc24318 |
|
29-Sep-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Allow broadcast_dims argument to binary operations to be the identity mapping where the inputs are the same rank. Allowing the identity is well-defined and useful as a base case. PiperOrigin-RevId: 170499871
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
970bdcc47a0085b4913232dd2eec87dc0d82f61e |
|
27-Sep-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Propagate device assignment to HloInstructions created by implicit broadcast lowering in UserComputation. PiperOrigin-RevId: 170225368
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
2e5bc305ff328cbd55bc1b4301457c5a00762a05 |
|
27-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Fix broken open source build. PiperOrigin-RevId: 170136839
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
b29b839215fa9bf5a00ca97e19673cfa5f780314 |
|
26-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Map API change to enable mapping over an arbitrary set of dimensions. PiperOrigin-RevId: 170090055
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
b9611a5fd29cf5ab34aa06e6464f178154ba202f |
|
25-Sep-2017 |
Chris Leary <leary@google.com> |
[XLA] Add support for QuantizeAndDequantizeV2. PiperOrigin-RevId: 169955636
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
85c4a379985b46930ece49edc4347af628ee2928 |
|
24-Sep-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Adds an API to attach a device assignment to HLO operators. PiperOrigin-RevId: 169841868
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
57a4506da1fe74a41812a2b843c46b5fd010193d |
|
15-Sep-2017 |
Mark Heffernan <meheff@google.com> |
Verify the output shapes of (almost) all HLO opcodes in the HloVerifier. Previously, only the elementwise ones (approximately) were verified. As part of this change fix the newly identified brokenness. The only remaining unverified instruction is convolution which is being addressed in cl/166654245. PiperOrigin-RevId: 168763722
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
dc1eda8a6d06cff541be768a0c8e2b22b376651c |
|
13-Sep-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Fix CHECK-failure crash if a non-tuple was passed to GetTupleElement. PiperOrigin-RevId: 168550703
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
a1e3ada75c0ad670b7854f935e07c9630abc8794 |
|
06-Sep-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Fix conversion to explicit scalar broadcast

The dimensions field of a broadcast HLO op is meant to be populated with the dimensions that are broadcasted, which in case of a scalar is the empty vector. Generally, the rank of the operand of a broadcast op should always equal the size of the dimensions vector.

PiperOrigin-RevId: 167686946
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
96b852627307d9375b2391ef6273abc78a2db5b2 |
|
30-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Don't remove trivial dimensions if they require no broadcast. Currently, the conversion from implicit broadcasts to explicit broadcasts also removes dimensions which are the same as the output shape. This means that sometimes potentially costly (on some backends) reshapes are required. This CL changes the conversion so that it only removes trivial dimensions if they actually require a broadcast. PiperOrigin-RevId: 166970167
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7359fec792e4efec1670a12332bb524a5608b215 |
|
18-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Implement Batchnorm Inference by expanding it into smaller ops.

1. Add batch norm inference support in batchnorm_rewriter
2. Connect xla's batchnorm inference to tf's FusedBatchNorm

RELNOTES: n/a

PiperOrigin-RevId: 165655351
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
f43149444086652ab526225456ccd15ae97f66c3 |
|
07-Aug-2017 |
Kay Zhu <kayzhu@google.com> |
[XLA] Remove a VLOG ToString call on a parentless HloComputation, which triggers a CHECK failure affecting all tests. PiperOrigin-RevId: 164499310
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
42bc2aa7c704d3ab1f1a4f4df43b02b03dbf4b61 |
|
03-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Skip dot operation in implicit->explicit broadcast conversion, because the dot operation doesn't have broadcast semantics. PiperOrigin-RevId: 164068160
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
93bff4d1fd2c930999b01a82494d2fed1e6213ca |
|
02-Aug-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Name change to strides (plural) in XLA service's Slice op. PiperOrigin-RevId: 163924726
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
2661f6841d0ad9ec1381d177a1f9df02e73d001c |
|
24-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[XLA] Add support for sin(x) transcendental. PiperOrigin-RevId: 162889962
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
00331578f746797989803a22a112e2046649dfbb |
|
19-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Internal change. PiperOrigin-RevId: 162522519
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
c6ec24290259c09099de22eca7ed5351a9fde811 |
|
19-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Internal change. PiperOrigin-RevId: 162502710
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
8ad81fd88faa3facf206518064d421ad5ece4a5c |
|
13-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
This is the first of 4 CLs to implement BatchNormGrad. Various support in user computation is needed to properly have an end-to-end flow working for BatchNormGrad. RELNOTES: n/a PiperOrigin-RevId: 161856560
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7d0f6385f8e7637e155ef9c340c19aded365a6ff |
|
07-Jul-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[BatchNorm] Skeleton code to implement BatchNormGrad

This CL sets up all the boilerplate code needed to implement BatchNormGrad. None of the backends has been implemented yet.

RELNOTES: n/a

PiperOrigin-RevId: 161161713
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
e6a45475735ee8a31c7d6c8e28e9164cda7d1853 |
|
29-Jun-2017 |
Eli Bendersky <eliben@google.com> |
[XLA] Move the flag from user_computation_flags into debug_options_flags

This requires some plumbing in user_computation to pipe the debug options through a few layers.

PiperOrigin-RevId: 160459822
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
50b999a8336d19400ab75aea66fe46eca2f5fe0b |
|
28-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Merge changes from github. PiperOrigin-RevId: 160344052
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
1fa73c53ab95693f070ce70e6be0c644d83c163a |
|
26-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated g4 rollback of changelist 160182040 PiperOrigin-RevId: 160190881
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
f3c89936e97c99dead1ca3310246691c1b221adf |
|
26-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Merge changes from github.

END_PUBLIC

Note: this CL will break builds. cl/159887762 to follow to fix all the breakages.

--- Commit 2336cdf7f authored by Maxwell Paul Brickner<mbrickn@users.noreply.github.com> Committed by gunan<gunan@google.com>:
Updated link to use HTTPS (#10998)
Howdy! I just updated a link to use https instead of http. Thanks!

--- Commit ad0892df1 authored by Luke Iwanski<luke@codeplay.com> Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] Fixes run_metadata_test for SYCL
This test is designed to test CUDA specific behavior

--- Commit 6b37a0725 authored by Todd Wang<toddwang@gmail.com> Committed by GitHub<noreply@github.com>:
Update comments

--- Commit 1699d904a authored by John Lawson<john@codeplay.com> Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] Fixes CUDA specific test run on SYCL (#56)
The testBadParentValuesOnGPU should only be run on CUDA devices, as the test checks for particular CUDA behaviour. We don't actually provide a SYCL kernel for GatherTree and so it's not a problem that the tests don't target SYCL.

--- Commit 3c1946230 authored by myPrecious<Moriadry@users.noreply.github.com> Committed by Shanqing Cai<cais@google.com>:
Java API to get the size of specified input list of operations. (#10865)
* Java API to get the size of specified input list of operations
* remove unnecessary explain to avoid bring a new term to users.

--- Commit e911c7480 authored by Luke Iwanski<luke@codeplay.com> Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] REGISTER -> REGISTER6

--- Commit fbf6c4cec authored by superryanguo<superryanguo@gmail.com> Committed by superryanguo<superryanguo@gmail.com>:
Simplify the Quickstart section with the weblink is better

--- Commit 72e2918cc authored by Taehoon Lee<taehoonlee@snu.ac.kr> Committed by Taehoon Lee<taehoonlee@snu.ac.kr>:
Fix typos

--- Commit 90c4406b7 authored by Rishabh Patel<patelrishabh@users.noreply.github.com> Committed by GitHub<noreply@github.com>:
Correct the learning rate as per the code snippet

--- Commit 03da61134 authored by Todd Wang<toddwang@gmail.com> Committed by GitHub<noreply@github.com>:
Update ir_array.cc

--- Commit 2df6cd3ac authored by Todd Wang<toddwang@gmail.com> Committed by GitHub<noreply@github.com>:
Another try

--- Commit af0cbace1 authored by Luke Iwanski<luke@codeplay.com> Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Transpose to go through Eigen (#10321)

--- Commit fc7361081 authored by Luke Iwanski<luke@codeplay.com> Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Registers RGBToHSV and HSVToRGB (#91) (#10848)
* [OpenCL] Added RGBToHSV and HSVToRGB
* Aligning '\'

--- Commit 832894ef8 authored by Luke Iwanski<luke@codeplay.com> Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Registers AdjustContrastv2 (#10949)
* [OpenCL] Registers AdjustContrastv2 (#93)
* [OpenCL] Extended adjust_contrast_op_benchmark_test for OpenCL (#96)
* [OpenCL] Extended adjust_contrast_op_benchmark_test for OpenCL
* simplified to #ifndef
* Changed to "#if GOOGLE_CUDA"
* Update adjust_contrast_op_benchmark_test.cc
* Added comments

--- Commit cb4c2f8d1 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Make TransferBufferToInFeed not virtual so it compiles.

--- Commit e89f04d80 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Fix calling Literal member functions.

--- Commit 15a8df724 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Fix mac build clone from meheff's change:
[XLA] Change return type of DeviceAssignment::Deserialize to fix build breakage on mac.
The mac build had the following error:
error: incomplete type 'xla::DeviceAssignment' used in type trait expression
This was due to a static method returning a StatusOr<DeviceAssignment> inside of the definition of DeviceAssignment.

--- Commit a54d43fa4 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Replace LiteralUtil to Literal in compiler/plugin/executor

--- Commit 88a6bb80c authored by Guenther Schmuelling<guschmue@microsoft.com> Committed by Guenther Schmuelling<guschmue@microsoft.com>:
expand inline for debug builds to limit number of symbols

--- Commit 62fb49d31 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Fix visibility error for contrib/remote_fused_graph/pylib/BUILD.

--- Commit 4c75252f2 authored by Mark Neumann<markn@allenai.org> Committed by Mark Neumann<markn@allenai.org>:
fix initial test values to avoid numerical instability

--- Commit b58d98353 authored by sj6077<epik03sj@gmail.com> Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
Fixes of AutoParallel bug (#10368)
* Fix the bug that auto_parallel could replicate variable snapshot name
* Use NodeName in grappler:utils instead of substr, convert variables->variable_def of grappler item
* remove variable_def from grappler item, exclude snapshot nodes from dont_replicate_nodes in auto_parallel

--- Commit a286b7db8 authored by Yifei Feng<yifeif@google.com> Committed by Yifei Feng<yifeif@google.com>:
Make debug_test slice integer.

--- Commit 97fcfdfa6 authored by Toby Boyd<tobyboyd@google.com> Committed by GitHub<noreply@github.com>:
Fixed path to seq2seq.py and minor formatting

--- Commit 63c1befb8 authored by Anish Shah<shah.anish07@gmail.com> Committed by Anish Shah<shah.anish07@gmail.com>:
Improve docs for tf.nn.depthwise_conv2d_native

--- Commit 8d42202b2 authored by Yong Tang<yong.tang.github@outlook.com> Committed by Yong Tang<yong.tang.github@outlook.com>:
Fix mismatched delete in mkl_tfconv_op.cc
This fix fixes mismatched new[]-delete in mkl_tfconv_op.cc (the file went through clang-format so there are some additional changes)
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

--- Commit 26301bd55 authored by Danny Goodman<goodman.danny@gmail.com> Committed by Danny Goodman<goodman.danny@gmail.com>:
fix error format

--- Commit b3f33ad46 authored by Yao Zhang<yaozhang@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make changes to prepare for the fused option of batch norm to be set to None (None means using fused batch norm if possible).
PiperOrigin-RevId: 159649743

--- Commit a4a469832 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add tests for select ops and while loops that produce tuples that contain predicates.
PiperOrigin-RevId: 159645900

--- Commit 980d3f2be authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use C API to implement Operation.name property
This name property is used in many existing tests including those that already run with C API enabled (math_ops_test, framework_ops_test, session_test, session_partial_run_test, math_ops_test_gpu, etc).
PiperOrigin-RevId: 159645767

--- Commit 26239c706 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Previously we didn't have an implementation of BatchNormInference and BatchNormTraining, which gives a linker error if anyone ever tries to call that. A dummy implementation is friendlier than a linker error.
PiperOrigin-RevId: 159645612

--- Commit f671c5caa authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>:
BEGIN_PUBLIC
Automated g4 rollback of changelist 159570549

PiperOrigin-RevId: 160182040
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
46737e4e81314f7482bfd6a710f126a27f5d7975 |
|
19-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Remove class xla::LiteralUtil. NFC (mind-numbingly so). This patch removes class xla::LiteralUtil and rewrites every call to use class xla::Literal instead. PiperOrigin-RevId: 159446373
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
18890a070b94f5a1c38ad9720844e2ca0fac7d83 |
|
19-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Removed unused shape creation. PiperOrigin-RevId: 159413535
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
9d2a432ce74eab4c439fe8c60389e4da9d6c92b2 |
|
17-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add plumbing for a ReducePrecision operation.

This CL is the first part of a series that adds a ReducePrecision operation for experimenting with the effects of reduced-precision storage of intermediate values. ReducePrecision is a Unary operation parameterized on floating-point exponent and mantissa bit sizes, and rounds the input data as if it were converted to a floating-point value with the given bit sizes and then converted back to "normal" F32 data.

Using arbitrary parameterized values to describe the lower-precision value type, rather than hardcoding this as a reduction to IEEE f16, allows us to do more flexible experiments -- e.g., "Is this training error due to the reduced mantissa precision, or due to the reduced exponent range?" or "Is this a smooth degradation with reduced precision or is there a sudden drop at some value?" -- which may suggest software mitigations for the effects.

This version of the CL adds the kReducePrecision instruction opcode, and the overall plumbing to support the operation. To allow testing, it includes an exceptionally simple implementation of the actual operation that returns "unimplemented" except for the exponent and mantissa bit sizes where it is a complete no-op.

PiperOrigin-RevId: 159295615
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
df511d09b051914cbc4fc559807a3f0d07dfee71 |
|
14-Jun-2017 |
Petros Mol <pmol@google.com> |
[XLA] Add a Cos unary operation that computes the elementwise cosine PiperOrigin-RevId: 158984883
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
b6039c875290cdd5c9a62e01393b75b928827504 |
|
14-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
We believe a fused version of batch_norm_op can speed the algorithm up. This PR implements a new op: fused_batch_norm_op in tf-xla and HLO. This is the CPU implementation for batch norm training. This CL is big, but much of the code is boilerplate. PiperOrigin-RevId: 158930166
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
02ac85399d4fb35d5055ecf426632b9446a70041 |
|
01-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Introduce new class Literal to replace protobuf Literal.

This renames the existing Literal message to LiteralProto and introduces a new C++ class named Literal to replace it. The LiteralProto is only used at RPC boundaries, or when protobuf-specific functionality is required. The Literal class offers a 'ToProto' function to generate a new LiteralProto message when necessary.

Currently, all the static functions in class LiteralUtil just forward to their counterparts in class Literal. This will change in a future CL.

Class Literal implements all the buffers as std::vectors. The only exception is preds(): the std::vector<bool> representation is unusable for the semantics we require (it's not possible to get the address of the underlying vector, for instance). The CL adds a BoolVector class to work around that issue. In future CLs, the std::vector representation may be changed to something more efficient, if needed.

PiperOrigin-RevId: 157739125
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
eb10a4c494d95e7c17ddc44ef35197d08f2f6b33 |
|
01-Jun-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Preallocate vector storage when the ultimate vector size is known in advance PiperOrigin-RevId: 157724431
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
5f097217f4e7991d609828721a4b26122c7c1058 |
|
29-May-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
An initial step toward eliminating all implicit broadcasts at the HLO level. Guard the shape inference for binary ops behind a flag. PiperOrigin-RevId: 157373647
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
4e3544100876676f85ef1668fc2222a6d18a3a68 |
|
22-May-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Fix stack overflow in the UserComputation ComputationLowerer for deep computations. Use an iterative DFS instead of a recursive DFS. PiperOrigin-RevId: 156772340
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
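The change above replaces a recursive DFS with an iterative one so that deep computations do not exhaust the call stack. A minimal sketch of that technique follows; it is not the ComputationLowerer code. The function name `post_order` and the `operands` mapping are hypothetical, and the graph is assumed to be acyclic.

```python
def post_order(root, operands):
    """Return the nodes reachable from `root`, each after all of its operands.

    `operands` maps a node to the list of nodes it consumes (hypothetical
    representation). An explicit stack replaces recursion, so depth is
    limited by heap memory rather than by the call stack.
    """
    order = []
    visited = set()
    # Each stack entry is (node, expanded); expanded=True means the node's
    # operands have already been scheduled and the node can be emitted.
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)
            continue
        if node in visited:
            continue
        visited.add(node)
        stack.append((node, True))
        # Push in reverse so operands are processed left to right.
        for operand in reversed(operands.get(node, [])):
            if operand not in visited:
                stack.append((operand, False))
    return order
```

A chain of tens of thousands of nodes, which would exceed Python's default recursion limit in a recursive formulation, is handled without issue by this version.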
562136cf7fb887b5dba755319263230062d512e0 |
|
03-May-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
[TF:XLA] Set metadata of all added HLO instructions when lowering computations. Change: 154952289
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
c0e4079b8896c8e7eca79766711b1dd029cd5402 |
|
20-Apr-2017 |
Peter Hawkins <phawkins@google.com> |
[TF:XLA] Make IsConstant conservatively consider While loops as non-constant. Change: 153644901
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
dcd71f6343c086ebd5dd4875e57bc92d9465e769 |
|
15-Mar-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Give Transpose its own Request, rather than piggybacking on ReshapeRequest. Avoids building unnecessary Reshape operators when Transpose was called by the client. Also avoids building Transpose operators when Reshape has identity transpose dimensions, for example when the client called the variant of ComputationBuilder::Reshape() that does not transpose. Makes the HLO graph emitted by the TF bridge more readable. Change: 150253949
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
00d0347ccebc3e29ffe541703b5a2f929b89da36 |
|
10-Mar-2017 |
Brennan Saeta <saeta@google.com> |
[TF:XLA] Add debug metadata to HLO ops. In order to support end-to-end debugging and performance profiling tooling for the TensorFlow::XLA toolchain, this change adds a DebugMetadata proto to the HloInstruction class, and pipes it through the tf2xla stack. Change: 149703349
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
efc8f98d45df835bac2373e19f1da57e3a1ea2d0 |
|
28-Feb-2017 |
Jacques Pienaar <jpienaar@google.com> |
[XLA] Add basic outfeed support. Change: 148699787
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
8120e2a270c28e0a62b9f522164b196a90f113b7 |
|
24-Feb-2017 |
Peter Hawkins <phawkins@google.com> |
[XLA] Add an IsFinite operation that tests elementwise whether values are finite (i.e., not NaN or Inf). Change: 148485205
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
7817ac8055ca328e5bf902677f335502eb0da926 |
|
22-Feb-2017 |
Mark Heffernan <meheff@google.com> |
[XLA] Properly version outfeed and send operations in UserComputation. Previously outfeed and send operations were unconditionally emitted during UserComputation lowering even if the outfeed/send was not in the requested version (computation snapshot). This CL versions these operations. Also, opportunistically improve logging in UserComputation, Service, and ComputationTracker which was used to root cause the underlying bug. Change: 148170893
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
9113e98115ecbeb1404edb7d14d2cf443f2484bf |
|
27-Jan-2017 |
Tayo Oguntebi <tayo@google.com> |
Addition of Outfeed HLO op. Change: 145772331
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
4fe280c59a71e85b73e9947063147743adf2ff2b |
|
21-Jan-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Added optional string argument to infeed HLO op. Change: 145188452
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
2abeb0f1b68a73ce54e9c90459e37de581117d45 |
|
19-Jan-2017 |
HyoukJoong Lee <hyouklee@google.com> |
Add control successors to HloInstruction. Add Send/Recv cases for the ConstantVisitor in UserComputation. Change: 145001882
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|
1e67c90e2caceeff82d09793d1ef5fa0300d219b |
|
09-Jan-2017 |
Peter Hawkins <phawkins@google.com> |
Initial open-source release of XLA: Accelerated Linear Algebra. XLA is a compiler-based linear algebra execution engine that targets CPUs, GPUs and custom accelerators. XLA is still experimental; we are releasing it early to get the community involved. Change: 143990941
/external/tensorflow/tensorflow/compiler/xla/service/user_computation.cc
|