982549ea3423df4270ff154e5c764beb43d472da |
|
26-Jan-2018 |
Rasmus Munk Larsen <rmlarsen@google.com> |
Branch 183429339 (#16469)
* Change `reduce_logsumexp` to internally use `reshape` rather than `squeeze` since the latter requires the `axis` arg to be a Python `list`. PiperOrigin-RevId: 183396533
* Kernel utils to support broadcast add and mul. PiperOrigin-RevId: 183397494
* Updating sparsify_gather. PiperOrigin-RevId: 183402917
* [tf.data] Move slow-path-related code into the slow path in IteratorHandleOp::Compute(). This slightly reduces the amount of work performed when an iterator is accessed (after the first access), and potentially reduces contention if concurrent steps are accessing the same iterator. PiperOrigin-RevId: 183406221
* Cleanup: Ran clang-format on all *.{cc,h} in under grappler. PiperOrigin-RevId: 183406440
* Increase shard count of //third_party/tensorflow/python:nn_batchnorm_test to avoid timeouts. When run under asan, the test runs for about 5 minutes, and sometimes longer, causing frequent timeouts. This change increases the shard count of the test to 4, which brings the run time of the longest running shard under asan to about 2 minutes. PiperOrigin-RevId: 183414888
* Add available choices to toco flags and fix minor formatting issues. PiperOrigin-RevId: 183415713
* Performance improvements to some GPU code to use shared locks instead of unique locks for some hotspot cases. PiperOrigin-RevId: 183418559
* [XLA] Improve error message for bad slices. PiperOrigin-RevId: 183420038
* Fix py3 build rules for all py tests under py2tf. PiperOrigin-RevId: 183422144
* Fix bug with Operation._control_inputs setter. PiperOrigin-RevId: 183422192
* Make softmax_op_test.py work with C API enabled. PiperOrigin-RevId: 183422829
* Cleanup: Ran clang-format on all *.{cc,h} files in tensorflow/core/kernels. PiperOrigin-RevId: 183423961
* Fix the documentation for the dense layer for how rank > 2 inputs are handled. PiperOrigin-RevId: 183425868
* Cleanup: Ran clang-format on all *.{cc,h} in tensorflow/core/ops. PiperOrigin-RevId: 183429339
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
90dcb56df701c37a1afe77b36d1194392bd20cd4 |
|
10-Dec-2016 |
Benoit Steiner <benoit.steiner.goog@gmail.com> |
Improved the performance of broadcasting cwise ops on OpenCL devices
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
2f2d45ed8d976941045290d054dc324b9c09cf9a |
|
04-Dec-2016 |
Luke Iwanski <luke@codeplay.com> |
SYCL improvements
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
6a0263970dd05edfa080931cc0d2c202dfba1976 |
|
21-Nov-2016 |
Luke Iwanski <luke@codeplay.com> |
Added OpenCL support for more operations
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
e374977ee63e48473ba47b7959cb2e45ad60d21d |
|
19-Nov-2016 |
Benoit Steiner <benoit.steiner.goog@gmail.com> |
Use 32bit indexing for all the tensors involved in cwise operations
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
8946424a2c0bd5e7f5ea832be84553f857281f3d |
|
13-Nov-2016 |
Benoit Steiner <benoit.steiner.goog@gmail.com> |
Fixed several of the cwise_op tests
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
fa0abcfd8a5f8257624fcf0927df2ccffe0068aa |
|
09-Nov-2016 |
Luke Iwanski <luke@codeplay.com> |
Registered More Ops for SYCL device
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
30de3c1d1d3f8eb097880062967e17c7730077a7 |
|
04-Nov-2016 |
luke <luke@codeplay.com> |
Registered Add, Div, Mul, Sub Ops for SYCL device.
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
de01167a434fa3246d96b506d70f45fe933f55a3 |
|
03-Nov-2016 |
luke <luke@codeplay.com> |
Cleaned cwise_ops_sycl_common.h
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
cf33ec5e6a33fea15b719a0dd8c16d5b1a5c8b70 |
|
01-Nov-2016 |
luke iwanski <luke@codeplay.com> |
Feedback from #5267 applied.
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
397972c4903d6d9fa1f266e1256bcc6ba786809f |
|
27-Oct-2016 |
luke iwanski <luke@codeplay.com> |
Partial specialisation of UnaryFunctor for SYCLDevice has been added.
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|
78d6d1a2628ed4979f22cebdfc762ba5f88f33e3 |
|
20-Oct-2016 |
luke <luke@codeplay.com> |
Added sycl_device and sycl_device_factory.
/external/tensorflow/tensorflow/core/kernels/cwise_ops_sycl_common.h
|