History log of /external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
Revision Date Author Comments
465a45cd3b717908ecbae72b824c91f44c2cdce0 13-Feb-2018 Sanjoy Das <sanjoy@google.com> [XLA:CPU] Implement vectorized Log in LLVM IR

This was the last vectorized intrinsic for which we had to call into
C++ so also remove the associated machinery.

PiperOrigin-RevId: 185482962
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
0b32f622e581e28897b91d93f6fefb47bb21b061 08-Feb-2018 Sanjoy Das <sanjoy@google.com> [XLA:CPU] Fix test case for vectorized Exp and Tanh to actually vectorize

I just noticed that the test cases for ArrayElementwiseOpTest::ExpF32sVector and
possibly ArrayElementwiseOpTest::TanhF32sVector do not actually vectorize the
intrinsic calls. This is most likely a very recent regression, because I
remember fixing at least one issue in the emitter demonstrated by the test.
Despite that, I think the current approach is better, since we now have unit
tests that check that we at least vectorize the vector-of-F32s case.

PiperOrigin-RevId: 184918373
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
9b08301f474bb1175acf6ef77e6eb2b6552339ba 07-Feb-2018 Sanjoy Das <sanjoy@google.com> [XLA:CPU] Add an LLVM IR implementation of Exp

This lets us avoid the usual set of issues that crop up when XLA generated code
has to call into C++.

PiperOrigin-RevId: 184793093
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
b3360e040602a2aa46fb9e27c7af940d651a704b 05-Feb-2018 A. Unique TensorFlower <gardener@tensorflow.org> [XLA] Add tests for Clamp of S32 and U32 vectors with broadcasted scalars.

PiperOrigin-RevId: 184579375
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
0bd78003c36dd194083ec22501c2b0b6db208f4c 31-Jan-2018 Sanjoy Das <sanjoy@google.com> [XLA:CPU] Generate correct IR for integer clamp

PiperOrigin-RevId: 184037078
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
7d64e124103c8334b7d8b127cd2eff786959d185 06-Jan-2018 Mark Heffernan <meheff@google.com> Remove protobuf-compatibility methods from the Literal class.

This CL primarily does two things:

(1) Remove the protobuf-compatibility methods (eg, mutable_f32s()) from Literal. These were added to Literal as part of the migration of Literal from a proto to a c++ class. Now that Literal is a proper class, these protobuf methods make it difficult to enforce invariants and expose too much of the class' implementation details.

(2) Make shape an immutable property of Literals, and make shape and the data members holding the Literal data coherent by construction. Previously, the shape could be set arbitrarily, and the data members such as f32_ could be arbitrarily sized irrespective of the shape of the literal.

The remainder of the CL mostly deals with the fallout. Notable other changes:

- Literal is no longer a recursive data structure. To avoid copies when passing a subliteral of a tuple-shaped Literal, a LiteralView class is added which provides a read-only view of an arbitrary subliteral.

- Tuple-shaped Literals can no longer be built up incrementally, so to avoid copying Literal values during construction, the following methods with move semantics are added: Literal::MoveFrom and Literal::MoveIntoTuple. These methods transfer ownership of the underlying buffers, enabling, for example, a literal to be moved into an element of a tuple-shaped literal with no data copying.

- Replace the internal data structure holding the actual data, previously a collection of std::vectors (eg, s32s_, f32s_, etc), with a single ShapeTree<char*>. This significantly simplifies accessors and makes improved support of tuple-shaped literals much easier (eg, Literal::Get<>() can now access elements in arbitrary subliterals).

Also, Literal is made movable, but not copyable. Otherwise, it is all too easy to accidentally introduce expensive copies of Literals. Literal::Clone is added to handle the case where a copy is needed (Literal::CloneToUnique already exists).

PiperOrigin-RevId: 181014890
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
94ee858b374ecc570eb07fa6683499b3494d5c6d 05-Jan-2018 Brian Patton <bjp@google.com> Adds a test exercising Atan2 via XLA client.

PiperOrigin-RevId: 180862094
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
58f7858601b72aa3c5854571f2152b91d1795e29 13-Nov-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Adding test coverage for more C64 operations, and ensuring they pass.

Included here:
- reduction ops (reduce_sum, reduce_prod)
- unaries: tanh, sigmoid (currently GPU only)
- binaries: pow (currently GPU only)

PiperOrigin-RevId: 175562417
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
b7e59ae74ac189df78ec2222694796cb6791d63c 02-Nov-2017 A. Unique TensorFlower <gardener@tensorflow.org> Hlo parser: support rank 0-5 literals and tuple literal.

Also,
- Get rid of the trailing commas in Literal::ToString;
- Change comments in Literal::ToString from line comment style to block comment style;
- Fix test failures caused by the literal format change;
- Print all literals.

PiperOrigin-RevId: 174392388
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
1c241e5ba7fa7068f9cf8f925638b170db57c438 13-Oct-2017 Peter Hawkins <phawkins@google.com> [XLA] Add ShiftLeft, ShiftRightArithmetic, and ShiftRightLogical operators.

PiperOrigin-RevId: 172091595
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
915a8ac568f0a67d6000ab70a665817deff7888c 13-Oct-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Implement BitwiseAnd, BitwiseOr, and Invert operators.

PiperOrigin-RevId: 172038787
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
e56628b085ffa7922e5238537f6ebd6deee0f0cc 09-Oct-2017 A. Unique TensorFlower <gardener@tensorflow.org> [TF:XLA] Rename ComputationBuilder::LogicalX to X

PiperOrigin-RevId: 171562764
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
8964d1b1ee5170686cb0d2969047b14eccc24318 29-Sep-2017 Peter Hawkins <phawkins@google.com> [XLA] Allow broadcast_dims argument to binary operations to be the identity mapping where the inputs are the same rank.

Allowing the identity is well-defined and useful as a base case.

PiperOrigin-RevId: 170499871
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
8c0853db731cf80cfeec9dfb4edab95961aaa585 18-Aug-2017 A. Unique TensorFlower <gardener@tensorflow.org> Add a test for negative and zero pow() input.

PiperOrigin-RevId: 165650096
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
28ce1d163eeffe618a6972c5245be0e660d94e85 15-Aug-2017 A. Unique TensorFlower <gardener@tensorflow.org> Merge changes from github.
END_PUBLIC

---
Commit 9f81374c3 authored by raymondxyang<zihao.yang@microsoft.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Add option to build more Python tests in CMake (#11853)

* Ignore Windows built project

* Fix deprecated methods in tf.contrib.python

* Fix regex match for Windows build in contrib.keras

* Fix Regex match for Windows build in session_bundle

* * Fix deprecated methods
* Fix regex match for Windows
* Fix compatibility issue with Python 3.x

* Add missing ops into Windows build for test

* Enabled more testcases for Windows build

* Clean code and fix typo

* Add conditional cmake mode for enabling more unit testcases

* Add Cmake mode for major Contrib packages

* Add supplementary info in README for new cmake option

* * Update tf_tests after testing with TF 1.3
* Clean code and resolve conflicts

* Fix unsafe regex matches and format code

* Update exclude list after testing with latest master branch

* Fix missing module

---
Commit 98f0e1efe authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Dynamic ksize and strides with MaxPool (#11875)

* Dynamic ksize with max_pool

This fix addresses the issue raised in 4746, where ksize
is static (an attr) with max_pool.
It changes ksize to an input tensor so that it is now dynamic.

This closes 4746.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Add dynamic ksize to MaxPoolGrad and MaxPoolGradGrad

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Add test cases for max_pool_v2

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Fix GPU Jenkins issue.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Enable MaxPoolV2 in GPU

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Hide MaxPoolV2 and other fixes.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

---
Commit 02d6bc185 authored by Bairen Yi<byronyi@users.noreply.github.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
remove useless variable (#12212)

---
Commit ed6b0d905 authored by namrata-ibm<bhavenamrata@gmail.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Adding support for s390x in calculation of cpu_frequency (#12201)

---
Commit 627dfc9dd authored by Taehoon Lee<taehoonlee@snu.ac.kr>
Committed by Taehoon Lee<taehoonlee@snu.ac.kr>:
Fix typos

---
Commit c0f9b0a91 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
In fast-math mode emit a tanh that has a faster min/max.

PiperOrigin-RevId: 164943597

---
Commit 87605f3d6 authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[TF:XLA] Use HloEvaluator for ComputeConstant, removing the need for a
dedicated compute-constant backend.

PiperOrigin-RevId: 164940970

---
Commit 881de45c2 authored by Taehoon Lee<me@taehoonlee.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Add bool type support for GPU kernels (#11927)

* Add bool type support for GPU kernels

* Add bool type test code for GPU kernels

---
Commit eeacdcdb1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add missing "CPU" suffix in registrations.

PiperOrigin-RevId: 164939527

---
Commit de01be952 authored by namrata-ibm<bhavenamrata@gmail.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Adding support for Big Endian in graph_constructor_test and wav_io (#12179)

---
Commit 26719d29f authored by QingYing Chen<pkudysj@126.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Implement CRF decode (Viterbi decode) for tensor (#12056)

* Implement CRF decoding for tensors

* add test code for tensor version's CRF decoding

* made modifications according to pylint

* add some comments for crf decode

* remove useless code

* add comments at the top comment of crf module and add more comments in crf_test

* capitalize first char of first word in comments

* replace crf_decode test code with a deterministic example

---
Commit f9a81ca2f authored by Pete Warden<pete@petewarden.com>
Committed by gunan<gunan@google.com>:
Create CI build script for Raspberry Pi (#12190)

* Create CI build script for Raspberry Pi

* Moved location of Pi build script

---
Commit e2a163a90 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Merge code from PR #11940 with internal changes from cl/164796436, and update Python tests to also run on GPU.

PiperOrigin-RevId: 164929133

---
Commit 08bbfa187 authored by Taehoon Lee<me@taehoonlee.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Fix typos (#12195)

---
Commit ab96f41fb authored by Luke Iwanski<luke@codeplay.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
[OpenCL] Extends matmul_benchmark.py to cover SYCL (#11697)

* [OpenCL] Extends matmul_benchmark.py to cover SYCL

* Fixed typo

* /gpu:0 -> /device:GPU:0

* Fixes control_flow_ops_py_test

* /gpu: -> /device:GPU:

* Fixes //tensorflow/python/profiler/internal:run_metadata_test

* gpu: -> GPU:

* Fixes tfprof_node

* [OpenCL] Fixes device path to name with many colons (#123)

The device path is constructed from a device name by replacing all
colons with underscores. Some device names contain more than one colon,
for example 'device:SYCL:0' which gives a path 'device_SYCL_0'. The
previous code would not convert this back to the original device name,
but rather to 'device:SYCL_0'.

An alternative fix would be to convert all underscores to colons in the
device name (i.e. remove the count restriction in `replace("_", ":", 1)`);
however, I'm not sure if there are any device names which contain
underscores.

* If no GPU device is available, fake one

* gpu: -> device:GPU

* Fixes profiler test

* /gpu:x -> /device:GPU:x

* Fixes debug_io_utils_test.cc test

* Fixes device_name_utils_test.cc

---
Commit 35e7a3665 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
Remove unneeded casting of int64 for reverse_sequence (#12192)

This fix removes an unneeded cast to int64 for reverse_sequence:
```
lengths = math_ops.to_int64(lengths)
```
as int32 has already been enabled for reverse_sequence.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit 9fba8c185 authored by Anna R<annarev@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add benchmark dashboard link to benchmarks doc. Also, I added a link and
description for Benchmarks page to Community index page.

PiperOrigin-RevId: 164924906

---
Commit bb6f32fa7 authored by Mark Heffernan<meheff@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make HloAliasAnalysis updatable after changes to the HLO graph.
As part of this change make HloAliasAnalysis a thinner layer which
basically only holds a map from HloValue to HloBuffer and vice versa.

PiperOrigin-RevId: 164923041

---
Commit 9103096c1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by Thomas Köppe<tkoeppe@google.com>:
Merged commit includes the following changes:
164923041 by meheff:

Make HloAliasAnalysis updatable after changes to the HLO graph.
As part of this change make HloAliasAnalysis a thinner layer which
basically only holds a map from HloValue to HloBuffer and vice versa.

--

PiperOrigin-RevId: 164923041

---
Commit 822603aed authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Merge sibling fusion instructions using multi_output_fusion

PiperOrigin-RevId: 164920220

---
Commit c035aa2a8 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Go: Update generated wrapper functions for TensorFlow ops.

PiperOrigin-RevId: 164917891

---
Commit e1e81d9ba authored by Luke Iwanski<luke@codeplay.com>
Committed by Rasmus Munk Larsen<rmlarsen@google.com>:
[OpenCL] Fixes double memcpy bug (#151) (#12173)

* [OpenCL] Fixes double memcpy bug (#151)

As the debug CopyOp is called on a Tensor without a type, we need to use
the DataType enum to get type information, and use this to pass the type
on to Eigen. This is a workaround for Eigen's need to have a type when
calling memcpy. If the Eigen memcpy could be provided without a type
requirement, then the memcpy in sycl_util would be unnecessary.

* Acts on feedback from: #12173/files/32cb12a9001b672425867b5a3110fd98e737a20b#r132496277

---
Commit d9ca2d86d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Internal change

PiperOrigin-RevId: 164916465

---
Commit b8d13d218 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove more parts of DCASGD missed in the first pass. (47949b)

PiperOrigin-RevId: 164914552

---
Commit 73b3d52c7 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
cmake fix

PiperOrigin-RevId: 164911656

---
Commit 2173b5b0a authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Allow TFE_TensorHandleCopyToDevice to have the same device as src and
destination. It will reuse the same underlying buffer in those cases.

PiperOrigin-RevId: 164909906

---
Commit 13eb3b90e authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Experimental C and Python APIs to invoke TensorFlow kernels on concrete values.

PiperOrigin-RevId: 164902588

---
Commit 7dfabcc01 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Initialize ExecutionOptions in ComputeConstant to default values.

PiperOrigin-RevId: 164894867

---
Commit c8897e9bc authored by Benoit Steiner<bsteiner@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Static required time computation

PiperOrigin-RevId: 164894645

---
Commit 076158f9b authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Enable implicit->explicit conversion by default.

PiperOrigin-RevId: 164890915

---
Commit 58c4a4cb1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Bugfix: the number of input channels is not necessarily in the last dimension after the introduction of the data_format param.

PiperOrigin-RevId: 164889729

---
Commit 8f9b1af8a authored by Igor Saprykin<isaprykin@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Recover MonitoredSession when the Coordinator is requested to stop with one of the _PREEMPTION_ERRORS.

When SyncReplicasOptimizer is used, a preemption in the Coordinator may result in two cases:
Case 1) the session gets silently marked as complete
Case 2) the session gets stuck

This CL aims to solve and verify solutions for both of these problems. Fix 1 changes the should_stop logic. Fix 2 changes the CoordinatedSession.run() logic.

SyncReplicasOptimizer runs a separate set of threads using a Coordinator instance. Those threads do FIFOQueue.enqueue; the main thread does a blocking FIFOQueue.dequeue.

`sync_token_q` FIFOQueue is on parameter-servers. When one of the PS instances gets preempted, an AbortedError causes the Coordinator to stop via request_stop(ex). That by itself changes the state of MonitoredSession.should_stop() to True (Fix 1).

Results of the blocking Dequeue operation are sent to the chief worker via Recv. What happens next depends on the amount of tokens in `sync_token_q`. If there are enough for the next call to Dequeue to return, then the low-level "tf session run() call" returns. The next iteration of the `while not MonitoredSession.should_stop()` loop decides that the training is complete (Case 1).

If there are not enough tokens in `sync_token_q`, then the blocking Dequeue is going to keep waiting for them. This results in the graph execution getting stuck and the whole session getting garbage collected after 10 minutes (Case 2).

We decided to fix that by re-creating a session after it gets garbage collected (Fix 2). An alternative was to try to cancel the pending Dequeue operation, but it's not clear that it is the right thing to do and it is also not easy.

PiperOrigin-RevId: 164888390

---
Commit 46e4de6e5 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Undo loop fusion changes for now as they seem to be altering a few results.
END_PUBLIC
RELNOTES: n/a

BEGIN_PUBLIC
Automated g4 rollback of changelist 164825735

PiperOrigin-RevId: 165340331
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
4a4b6f7b3e43a176e26ac586340f6af798df5447 02-Aug-2017 A. Unique TensorFlower <gardener@tensorflow.org> Merged commit includes the following changes:
163914294 by annarev:

Refactors build target for gradients_impl to allow code to depend on the gradient generation but not the gradients themselves.

--
163913011 by A. Unique TensorFlower:

Use an LLVM-IR version of vector hyperbolic tangent.

This lets us:

- Inline the routine where it is called, eliminating call overhead.
- Use AVX instructions in JITed code even if TensorFlow was not built with -mavx.

--
163909534 by A. Unique TensorFlower:

Add tensorflow-android to standard TF maven artifacts.

--
163908704 by A. Unique TensorFlower:

Go: Update generated wrapper functions for TensorFlow ops.

--
163907709 by A. Unique TensorFlower:

Update ops-related pbtxt files.

--
163907497 by A. Unique TensorFlower:

Remove the old TensorFlow Serving landing page in preparation for the new TF
Serving landing page. Fix bad leftnav.

--
163906225 by alive:

Refactors build target for gradients_impl to allow code to depend on the gradient generation but not the gradients themselves.

--

PiperOrigin-RevId: 163914294
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
ddd8e21b7c1d23bf80ddf0141b44e168c17647f3 27-Jul-2017 Eli Bendersky <eliben@google.com> [XLA] Consolidate all similar main()s in tests into a single target.

PiperOrigin-RevId: 163354724
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
2661f6841d0ad9ec1381d177a1f9df02e73d001c 24-Jul-2017 A. Unique TensorFlower <gardener@tensorflow.org> [XLA] Add support for sin(x) transcendental.

PiperOrigin-RevId: 162889962
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
0bce3ad96382de0506256b6cbc5dd29a8691bdbf 30-Jun-2017 Blake Hechtman <blakehechtman@google.com> [XLA] Various algebraic simplifications involving division and exponentials

PiperOrigin-RevId: 160655974
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
e6a45475735ee8a31c7d6c8e28e9164cda7d1853 29-Jun-2017 Eli Bendersky <eliben@google.com> [XLA] Move the flag from user_computation_flags into debug_options_flags

This requires some plumbing in user_computation to pipe the debug options
through a few layers.

PiperOrigin-RevId: 160459822
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
50b999a8336d19400ab75aea66fe46eca2f5fe0b 28-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Merge changes from github.

PiperOrigin-RevId: 160344052
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
1fa73c53ab95693f070ce70e6be0c644d83c163a 26-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Automated g4 rollback of changelist 160182040

PiperOrigin-RevId: 160190881
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
f3c89936e97c99dead1ca3310246691c1b221adf 26-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Merge changes from github.
END_PUBLIC

Note: this CL will break builds. cl/159887762 to follow to fix all the breakages.

---
Commit 2336cdf7f authored by Maxwell Paul Brickner<mbrickn@users.noreply.github.com>
Committed by gunan<gunan@google.com>:
Updated link to use HTTPS (#10998)

Howdy!

I just updated a link to use https instead of http.

Thanks!
---
Commit ad0892df1 authored by Luke Iwanski<luke@codeplay.com>
Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] Fixes run_metadata_test for SYCL

This test is designed to test CUDA specific behavior

---
Commit 6b37a0725 authored by Todd Wang<toddwang@gmail.com>
Committed by GitHub<noreply@github.com>:
Update comments
---
Commit 1699d904a authored by John Lawson<john@codeplay.com>
Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] Fixes CUDA specific test run on SYCL (#56)

The testBadParentValuesOnGPU should only be run on CUDA devices, as the
test checks for particular CUDA behaviour. We don't actually provide a
SYCL kernel for GatherTree and so it's not a problem that the tests
don't target SYCL.
---
Commit 3c1946230 authored by myPrecious<Moriadry@users.noreply.github.com>
Committed by Shanqing Cai<cais@google.com>:
Java API to get the size of specified input list of operations. (#10865)

* Java API to get the size of specified input list of operations

* remove unnecessary explanation to avoid introducing a new term to users.

---
Commit e911c7480 authored by Luke Iwanski<luke@codeplay.com>
Committed by Luke Iwanski<luke@codeplay.com>:
[OpenCL] REGISTER -> REGISTER6

---
Commit fbf6c4cec authored by superryanguo<superryanguo@gmail.com>
Committed by superryanguo<superryanguo@gmail.com>:
Simplify the Quickstart section, since linking to the web version is better

---
Commit 72e2918cc authored by Taehoon Lee<taehoonlee@snu.ac.kr>
Committed by Taehoon Lee<taehoonlee@snu.ac.kr>:
Fix typos

---
Commit 90c4406b7 authored by Rishabh Patel<patelrishabh@users.noreply.github.com>
Committed by GitHub<noreply@github.com>:
Correct the learning rate as per the code snippet
---
Commit 03da61134 authored by Todd Wang<toddwang@gmail.com>
Committed by GitHub<noreply@github.com>:
Update ir_array.cc
---
Commit 2df6cd3ac authored by Todd Wang<toddwang@gmail.com>
Committed by GitHub<noreply@github.com>:
Another try
---
Commit af0cbace1 authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Transpose to go through Eigen (#10321)

---
Commit fc7361081 authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Registers RGBToHSV and HSVToRGB (#91) (#10848)

* [OpenCL] Added RGBToHSV and HSVToRGB

* Aligning '\'
---
Commit 832894ef8 authored by Luke Iwanski<luke@codeplay.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
[OpenCL] Registers AdjustContrastv2 (#10949)

* [OpenCL] Registers AdjustContrastv2 (#93)

* [OpenCL] Extended adjust_contrast_op_benchmark_test for OpenCL (#96)

* [OpenCL] Extended adjust_contrast_op_benchmark_test for OpenCL

* simplified to #ifndef

* Changed to "#if GOOGLE_CUDA"

* Update adjust_contrast_op_benchmark_test.cc

* Added comments

---
Commit cb4c2f8d1 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Make TransferBufferToInFeed not virtual so it compiles.

---
Commit e89f04d80 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Fix calling Literal member functions.

---
Commit 15a8df724 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Fix Mac build;
cloned from meheff's change:
[XLA] Change return type of DeviceAssignment::Deserialize to fix build
breakage on mac.
The mac build had the following error:

error: incomplete type 'xla::DeviceAssignment' used in type trait
expression

This was due to a static method returning a StatusOr<DeviceAssignment>
inside of the definition of DeviceAssignment.

---
Commit a54d43fa4 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Replace LiteralUtil to Literal in compiler/plugin/executor

---
Commit 88a6bb80c authored by Guenther Schmuelling<guschmue@microsoft.com>
Committed by Guenther Schmuelling<guschmue@microsoft.com>:
expand inline for debug builds to limit number of symbols

---
Commit 62fb49d31 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Fix visibility error for contrib/remote_fused_graph/pylib/BUILD.

---
Commit 4c75252f2 authored by Mark Neumann<markn@allenai.org>
Committed by Mark Neumann<markn@allenai.org>:
fix initial test values to avoid numerical instability

---
Commit b58d98353 authored by sj6077<epik03sj@gmail.com>
Committed by Benoit Steiner<benoitsteiner@users.noreply.github.com>:
Fixes of AutoParallel bug (#10368)

* Fix the bug where auto_parallel could replicate the variable snapshot name

* Use NodeName in grappler:utils instead of substr, convert variables->variable_def of grappler item

* remove variable_def from grappler item, exclude snapshot nodes from dont_replicate_nodes in auto_parallel

---
Commit a286b7db8 authored by Yifei Feng<yifeif@google.com>
Committed by Yifei Feng<yifeif@google.com>:
Make debug_test slice integer.

---
Commit 97fcfdfa6 authored by Toby Boyd<tobyboyd@google.com>
Committed by GitHub<noreply@github.com>:
Fixed path to seq2seq.py and minor formatting
---
Commit 63c1befb8 authored by Anish Shah<shah.anish07@gmail.com>
Committed by Anish Shah<shah.anish07@gmail.com>:
Improve docs for tf.nn.depthwise_conv2d_native

---
Commit 8d42202b2 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Yong Tang<yong.tang.github@outlook.com>:
Fix mismatched delete in mkl_tfconv_op.cc

This fix fixes a mismatched new[]/delete in mkl_tfconv_op.cc

(the file went through clang-format so there are some additional
changes)

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

---
Commit 26301bd55 authored by Danny Goodman<goodman.danny@gmail.com>
Committed by Danny Goodman<goodman.danny@gmail.com>:
fix error format

---
Commit b3f33ad46 authored by Yao Zhang<yaozhang@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make changes to prepare for the fused option of batch norm to be set to None (None means using fused batch norm if possible).

PiperOrigin-RevId: 159649743

---
Commit a4a469832 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add tests for select ops and while loops that produce tuples that contain predicates.

PiperOrigin-RevId: 159645900

---
Commit 980d3f2be authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use C API to implement Operation.name property

This name property is used in many existing tests including those that
already run with C API enabled (math_ops_test, framework_ops_test,
session_test, session_partial_run_test, math_ops_test_gpu, etc).

PiperOrigin-RevId: 159645767

---
Commit 26239c706 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Previously we didn't have implementations of BatchNormInference and BatchNormTraining, which caused a linker error if anyone tried to call them. A dummy implementation is friendlier than a linker error.

PiperOrigin-RevId: 159645612

---
Commit f671c5caa authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
BEGIN_PUBLIC
Automated g4 rollback of changelist 159570549

PiperOrigin-RevId: 160182040
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
8604f612a1b60a81deaa6330b8614f2b710ee488 23-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> [XLA] Add general F32 implementation for ReducePrecision operation.

This only tests with parameter inputs (which is needed to ensure we actually test on GPUs as well as CPUs); there's no point in separately testing with constants.

PiperOrigin-RevId: 159961430
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
46737e4e81314f7482bfd6a710f126a27f5d7975 19-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Remove class xla::LiteralUtil. NFC (mind-numbingly so).

This patch removes class xla::LiteralUtil and rewrites every call to use class
xla::Literal instead.
PiperOrigin-RevId: 159446373
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
9d2a432ce74eab4c439fe8c60389e4da9d6c92b2 17-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Add plumbing for a ReducePrecision operation.

This CL is the first part of a series that adds a ReducePrecision operation for experimenting with the effects of reduced-precision storage of intermediate values. ReducePrecision is a Unary operation parameterized on floating-point exponent and mantissa bit sizes, and rounds the input data as if it were converted to a floating-point value with the given bit sizes and then converted back to "normal" F32 data.

Using arbitrary parameterized values to describe the lower-precision value type, rather than hardcoding this as a reduction to IEEE f16, allows us to do more flexible experiments -- e.g., "Is this training error due to the reduced mantissa precision, or due to the reduced exponent range?" or "Is this a smooth degradation with reduced precision or is there a sudden drop at some value?" -- which may suggest software mitigations for the effects.

This version of the CL adds the kReducePrecision instruction opcode, and the overall plumbing to support the operation. To allow testing, it includes an exceptionally simple implementation of the actual operation that returns "unimplemented" except for the exponent and mantissa bit sizes where it is a complete no-op.
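The rounding this CL describes can be sketched on the host. The following is a simplified illustration only; the function name, the round-to-nearest-even choice, and the pass-through handling of denormals/infinities/NaNs are assumptions, not XLA's actual implementation:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical host-side sketch of what ReducePrecision(exponent_bits,
// mantissa_bits) computes: round an f32 as if it were stored in a narrower
// floating-point format and then widened back to f32.
float ReducePrecisionF32(float x, int exponent_bits, int mantissa_bits) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));

  // Mantissa: round-to-nearest-even down to `mantissa_bits` bits. The add
  // below may carry into the exponent field, which is exactly what we want
  // (e.g. a value just under 2.0 rounds up to 2.0).
  const int drop = 23 - mantissa_bits;
  if (drop > 0) {
    const uint32_t last_kept = (bits >> drop) & 1u;
    bits += (1u << (drop - 1)) - 1u + last_kept;
    bits &= ~((1u << drop) - 1u);
  }

  // Exponent: values outside the reduced format's range overflow to
  // infinity or underflow to zero (sign preserved). Denormals, infinities
  // and NaNs are left alone in this sketch.
  const int reduced_bias = (1 << (exponent_bits - 1)) - 1;
  const int exp = static_cast<int>((bits >> 23) & 0xFF);
  if (exp != 0 && exp != 0xFF) {
    const int unbiased = exp - 127;
    if (unbiased > reduced_bias) {
      bits = (bits & 0x80000000u) | 0x7F800000u;  // +/- infinity
    } else if (unbiased < 1 - reduced_bias) {
      bits &= 0x80000000u;  // +/- zero
    }
  }
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

With 5 exponent and 10 mantissa bits this behaves like a round trip through IEEE f16 (minus denormal support); with 8 and 23 it is the identity, matching the "complete no-op" case mentioned above.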

PiperOrigin-RevId: 159295615
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
c77399d44fc2ed6912e7f301839ad3e404739b80 14-Jun-2017 Eli Bendersky <eliben@google.com> [XLA] Remove remaining flags from cpu_compiler_flags

And move them to debug_options_flags; these two flags (embed_ir_in_executable,
dump_debug_json_to) are also unified with similarly named GPU compiler flags.
This lets us completely remove the cpu_compiler_flags module.

PiperOrigin-RevId: 158989621
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
df511d09b051914cbc4fc559807a3f0d07dfee71 14-Jun-2017 Petros Mol <pmol@google.com> [XLA] Add a Cos unary operation that computes the elementwise cosine

PiperOrigin-RevId: 158984883
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
8cedce4b806639c351d45a00324fcc269704f42b 12-Jun-2017 Eli Bendersky <eliben@google.com> [XLA] Replace some XLA CPU compiler specific options by generic "debug options".

LLVM optimization level, extra LLVM flags and "cpu parallel" all turn into
debug options on the xla proto. "cpu parallel" is combined with "backend extra
options" as a map.

PiperOrigin-RevId: 158751784
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
eb10a4c494d95e7c17ddc44ef35197d08f2f6b33 01-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Preallocate vector storage when the ultimate vector size is known in advance
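The pattern this change applies is `std::vector::reserve` ahead of a known number of `push_back` calls, so the vector allocates once instead of growing geometrically. A generic illustration (hypothetical function, not code from this change):

```cpp
#include <cassert>
#include <vector>

// When the final element count is known up front, reserve() makes the
// subsequent push_back loop perform a single allocation and no copies.
std::vector<int> Squares(int n) {
  std::vector<int> out;
  out.reserve(n);  // one allocation for exactly n elements
  for (int i = 0; i < n; ++i) out.push_back(i * i);
  return out;
}
```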

PiperOrigin-RevId: 157724431
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
2d1860859a812437d5c20fa3bf75e6e989fbbb87 31-May-2017 Blake Hechtman <blakehechtman@google.com> Fix test name in array_elementwise_ops_test.

PiperOrigin-RevId: 157552402
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
5f097217f4e7991d609828721a4b26122c7c1058 29-May-2017 A. Unique TensorFlower <gardener@tensorflow.org> An initial step of eliminating all implicit broadcast at the HLO level.
Guard the shape inference for binary ops behind a flag.
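To make the direction of this change concrete: with implicit broadcasting eliminated, a rank-mismatched binary op such as add(f32[2,3], f32[3]) must spell the broadcast out, e.g. add(f32[2,3], broadcast(f32[3], dims={1})). A host-side sketch of what that materialized broadcast computes (the helper name and rank-1-along-dimension-1 special case are illustrative assumptions):

```cpp
#include <cassert>
#include <vector>

// Materialize broadcast(f32[n], dims={1}) into shape [rows, n]: every row
// of the result is a copy of the rank-1 operand.
std::vector<std::vector<float>> BroadcastDim1(const std::vector<float>& v,
                                              int rows) {
  return std::vector<std::vector<float>>(rows, v);
}
```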

PiperOrigin-RevId: 157373647
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
5b9dcd8f9b30ca60ba4ee59c7dfd660203b08c17 07-Feb-2017 Peter Hawkins <phawkins@google.com> [TF:XLA] Add more test cases for integer division.
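Tests in this area typically target the inputs where integer division is easy to get wrong. A sketch of those edge cases; the divide-by-zero and INT_MIN/-1 policies below are placeholders for illustration, not XLA's defined semantics:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// C++ integer division truncates toward zero, and two cases are undefined
// behavior if left unguarded: division by zero, and INT_MIN / -1 (whose
// mathematical result, 2^31, does not fit in int32_t).
int32_t TruncDiv(int32_t a, int32_t b) {
  if (b == 0) return 0;  // placeholder policy for this sketch
  if (a == std::numeric_limits<int32_t>::min() && b == -1) {
    return a;  // wraparound policy: the quotient overflows back to INT_MIN
  }
  return a / b;  // truncates toward zero: -7 / 2 == -3, not -4
}
```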

PiperOrigin-RevId: 156453370
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
91ed5939f8892b6522281d898709fc70c4d1d698 17-May-2017 David Majnemer <majnemer@google.com> [XLA] Add tests for pow(x, y) with x < 0
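The property these tests exercise: pow with a negative base has a real result only for integral exponents; for a non-integral exponent, C99 pow returns NaN. A small illustration (the helper is hypothetical, stating the domain rule rather than XLA's code):

```cpp
#include <cassert>
#include <cmath>

// pow(x, y) with x < 0 is real-valued only when y is an integer; a
// non-integral y has no real result, and C99 std::pow yields NaN.
bool NegBasePowIsReal(double base, double exp) {
  return !(base < 0.0 && std::floor(exp) != exp);
}
```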

PiperOrigin-RevId: 156330209
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
183e2c953a81e1a5a7a8b9df2534778b5ae48379 12-May-2017 A. Unique TensorFlower <gardener@tensorflow.org> [XLA] Emit FCMP_UNE (true if unordered) instead of FCMP_ONE (false if unordered)
for the not_equal operation, to be more consistent with TF and other languages.
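The NaN behavior behind this change can be sketched in host code; the two helpers below are illustrations of the LLVM predicates, not XLA's emitter:

```cpp
#include <cassert>
#include <cmath>

// FCMP_UNE ("unordered or not equal"): true if either operand is NaN, or
// if the operands are ordered and differ. This matches C++ operator!= and
// TF's not_equal, where NaN compares not-equal to everything.
bool NotEqualUne(float a, float b) {
  return std::isnan(a) || std::isnan(b) || a != b;
}

// FCMP_ONE ("ordered and not equal"): false whenever an operand is NaN,
// which is why lowering not_equal to it disagreed with TF.
bool NotEqualOne(float a, float b) {
  return !std::isnan(a) && !std::isnan(b) && a != b;
}
```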

PiperOrigin-RevId: 155835642
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
5c8acccfc9e90d694a8394f5522097bfe87379b2 11-Apr-2017 A. Unique TensorFlower <gardener@tensorflow.org> Using GMock matchers in XLA tests.
Change: 152823724
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
8120e2a270c28e0a62b9f522164b196a90f113b7 24-Feb-2017 Peter Hawkins <phawkins@google.com> [XLA] Add an IsFinite operation that tests elementwise whether values are finite (i.e., not NaN or Inf).
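A host-side sketch of the elementwise semantics (the helper name is hypothetical; XLA's operation works on device arrays, not std::vector):

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Elementwise IsFinite: true exactly where the value is neither NaN nor
// positive/negative infinity.
std::vector<bool> IsFinite(const std::vector<float>& xs) {
  std::vector<bool> out;
  out.reserve(xs.size());
  for (float x : xs) out.push_back(std::isfinite(x));
  return out;
}
```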
Change: 148485205
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
d45505fe0c7ab9a10f16682f54d0eb54c4776cd1 01-Feb-2017 Justin Lebar <jlebar@google.com> [XLA] Move fast-math flags into HLO module config.

Previously, XLA controlled the presence/absence of fast-math flags (FMF) via a
command-line flag. This patch changes things so we use a new CompileOptions
proto instead.

This proto lives in HloModuleConfig, and is passed to the service via
ExecuteRequest.

This change lets us entirely remove llvm_backend_flags.{h,cc}.

In addition, this change takes us from two to one fast-math flags. Previously
we tried to control "unsafe FP transformations" separately from "full fast
math". It turns out that LLVM is misleadingly inconsistent in how it handles
these. In the backend, they are indeed two separate options that can be
enabled/disabled independently. In the frontend, however, unsafe-fp-math
implies all the other FMFs.

As a result, it doesn't really make sense for XLA to attempt to split out these
two flags, at least not until LLVM changes how it handles them.
Change: 146183994
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc
1e67c90e2caceeff82d09793d1ef5fa0300d219b 09-Jan-2017 Peter Hawkins <phawkins@google.com> Initial open-source release of XLA: Accelerated Linear Algebra.

XLA is a compiler-based linear algebra execution engine that targets CPUs, GPUs and custom accelerators.

XLA is still experimental; we are releasing it early to get the community involved.
Change: 143990941
/external/tensorflow/tensorflow/compiler/xla/tests/array_elementwise_ops_test.cc