4ab2d8531c461169cd6a33bc0fef1129b419e9df |
|
30-Jan-2018 |
Jean Flaherty <kobejean@me.com> |
Tensor roll op implementation (#14953) * Half migrated to manip Half migrated to tf.manip * Roll op: polymorphism & GPU attempt * Roll op: Added support for gradients Added compile script * Rebase for roll op code * Roll op: Migrated to manip namespace * Roll op: Supports CPU thread pooling fix namespace error * Remove roll from user_ops * Roll op: Optimization * Roll op: Pylint fix * Roll op: Updated documentation * Roll op: Two versions for CPU thread pooling * Roll op: Huge CPU speed up Fixed thread pooling issue that was due to a bad cost_per_unit parameter Also improved readability * Roll op: Rough draft of DoRollV2 DoRollV2 copies memory in groups instead of element by element. Not thoroughly tested yet. Polished DoRollV2 algorithm * Roll op: Restrict tensor size for GPU implementation * Roll op: Fixed clang-format and missing include * Roll op: Minor change * Roll op GPU bug fix Roll op GPU bug fix GPU bug fix Roll op GPU bug fix Roll op GPU fix Roll GPU test BUILD update * Roll op: Remove GPU code Fully remove roll op GPU code Remove compile_cpu.sh * Roll op: Fixes problems with array_ops_test.py and a size 1 dimension bug * Roll op: Migrated to manip Migrated to tf.manip Roll op registered Roll op uses InlinedVector Small improvements * Roll op: Revert array op changes * Roll op: Api def fix * Roll op: review changes * Roll op: API review changes Roll op: Docstring fix * Roll op: Review changes round 1 * Roll op: resolve conflicts * Roll op: Resolve conflicts * Roll op: clang-tidy * Roll op: Review round 2 changes Roll op: fixed BUILD file Roll op: api docs update * Roll op: failure fixes 1 - updates goldens and fixes api compatibility issue - fixes python op test issue for windows - fixes makefile issues * Roll op: Windows CMake failure fix Windows CMake checks were failing because numpy was on an older version that did not support np.roll with multiple shifts, which was used to check the correctness of tf.manip.roll. 
manip_ops_test.py now checks for numpy version 1.12.0 before testing multiple shifts; otherwise it only tests single-shift roll. * Roll op: pylint changes
/external/tensorflow/tensorflow/core/graph/testlib.h
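The version-guarded test fallback described in the entry above can be sketched as follows. This is an illustrative sketch, not the actual `manip_ops_test.py` code: a multi-shift `np.roll` (added in numpy 1.12.0) is equivalent to composing single-axis rolls, which is what a test can fall back to on older numpy.

```python
import numpy as np

def roll_multi(x, shifts, axes):
    # Compose single-axis rolls; equivalent to np.roll's tuple form.
    out = x
    for s, a in zip(shifts, axes):
        out = np.roll(out, s, axis=a)
    return out

x = np.arange(12).reshape(3, 4)
multi = np.roll(x, shift=(1, 2), axis=(0, 1))  # requires numpy >= 1.12.0
composed = roll_multi(x, (1, 2), (0, 1))       # works on older numpy too
assert np.array_equal(multi, composed)
```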
|
a528ccdbfe6e4dadad4d982099e8ea5be93fe96f |
|
20-Oct-2017 |
Jinze Bai <baijinze1994@163.com> |
Add GPU support and improve performance for tf.diag and tf.diag_part (#13666) * improve tf.diag and tf.diag_part in CPU and GPU * add comment * make changes of DiagOp according to reviews * tidy indent * remove useless comment prefix * add shard function for DiagOp * add benchmark for diag_op_test in core/kernel * change symbol order in BUILD file * remove empty line for Sanity Checks * add some comments and fix benchmark throughput ratio for DiagOp
/external/tensorflow/tensorflow/core/graph/testlib.h
|
b1f9e2c89eb007cb4b9483d08dcace1e45e84164 |
|
11-Jul-2017 |
RJ Ryan <rjryan@google.com> |
Add an axis parameter to tf.gather. Fixes GitHub issue #11223. This brings tf.gather closer to compatibility with numpy.take. To emulate gathering over an axis generally requires inefficient workarounds, e.g. transpose/gather/transpose. This technique is gaining popularity (hundreds of uses inside and outside of Google), so it is worth supporting efficiently. For an `[a_0, ..., a_i, ..., a_n]` tensor, gathering `N` elements from axis `i` requires `(a_0 * ... * a_{i-1}) * N` copies of `(a_{i+1} * ... * a_n)` elements each. The CPU kernel does this with memcpy, which is far more efficient than transpose/gather/transpose since it requires no intermediate allocations and copies. The GPU kernel does the same number of copies but in parallel across multiple hardware threads. Since this is a backwards-incompatible change, this adds a "GatherV2" op with an axis input, and simultaneously supports backwards compatibility with "Gather" ops by defaulting to axis 0 if a 3rd input is not present. PiperOrigin-RevId: 161541416
/external/tensorflow/tensorflow/core/graph/testlib.h
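The semantics and copy accounting described in the entry above can be sketched in numpy (illustration only, not the TF kernel): gathering `N` indices along axis `i` is what `numpy.take` computes, and the kernel performs `outer * N` contiguous copies of `inner` elements each, where `outer` and `inner` are the products of the dimensions before and after the gather axis.

```python
import numpy as np

params = np.arange(2 * 3 * 4).reshape(2, 3, 4)
indices = np.array([2, 0])
axis = 1

# Same semantics as the GatherV2 op with an axis input.
result = np.take(params, indices, axis=axis)  # shape (2, 2, 4)

# Copy accounting for this example: outer = 2, N = 2, inner = 4,
# i.e. 4 memcpy calls of 4 elements each, no transposes needed.
outer = int(np.prod(params.shape[:axis]))
inner = int(np.prod(params.shape[axis + 1:]))
assert (outer * len(indices), inner) == (4, 4)
assert result.shape == (2, 2, 4)
```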
|
6e7f5a232a40b975acebb059e316b71a83f7af28 |
|
27-Apr-2017 |
Peter Hawkins <phawkins@google.com> |
Change TensorFlow constant_folding_test.cc to use the C++ graph building API instead of the deprecated testlib API. Cleanup to test code only, no functional changes. Delete unused Broadcast* methods from testlib.h. Change: 154450120
/external/tensorflow/tensorflow/core/graph/testlib.h
|
8b07605e45f55c942d1436116fd5b0cc83a29e1d |
|
08-Feb-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add tf.random_poisson(shape, lam) to tf core. Fixes #6798 Change: 146861107
/external/tensorflow/tensorflow/core/graph/testlib.h
|
51e6e84fadd7b37c6e8e5c13cd0de4c7c8e2959f |
|
11-Jan-2017 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add optimized reverse kernel for reversing rows of a 3D tensor. This is used in image preprocessing pipelines. Add benchmark for reverse. Change: 144224184
/external/tensorflow/tensorflow/core/graph/testlib.h
|
fd7ff167e6f02fe0966fa70ef52a99d16e0490ec |
|
15-Dec-2016 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Add BroadcastShape op. Specifically create a broadcast_static_shape and broadcast_dynamic_shape python wrappers. broadcast_static_shape returns an inferred TensorShape from the input TensorShapes, while broadcast_dynamic_shape returns an integer Tensor representing the broadcasted shape. Change: 142147719
/external/tensorflow/tensorflow/core/graph/testlib.h
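The broadcasting rule behind the BroadcastShape op above can be sketched as follows. This is a minimal illustration of the rule, not the TF kernel: align shapes from the right, left-pad the shorter one with 1s, and take the larger of each dimension pair, which must otherwise be equal.

```python
import numpy as np

def broadcast_shape(a, b):
    a, b = list(a), list(b)
    # Left-pad the shorter shape with 1s so both ranks match.
    if len(a) < len(b):
        a = [1] * (len(b) - len(a)) + a
    else:
        b = [1] * (len(a) - len(b)) + b
    out = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError("incompatible dimensions %d and %d" % (x, y))
        out.append(max(x, y))
    return tuple(out)

assert broadcast_shape((3, 1, 5), (4, 1)) == (3, 4, 5)
# Agrees with numpy's own broadcasting:
assert np.broadcast(np.empty((3, 1, 5)), np.empty((4, 1))).shape == (3, 4, 5)
```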
|
41a48abe8da762290794e48ef11ce389674e1077 |
|
13-Dec-2016 |
Ben Lee <blee@google.com> |
Add ConcatV2 test helper method Change: 141915811
/external/tensorflow/tensorflow/core/graph/testlib.h
|
c672ae6c90f829eec7ab0dc593af8da15b90a363 |
|
29-Sep-2016 |
Patrick Nguyen <drpng@google.com> |
Fix spelling in docstrings. Change: 134622348
/external/tensorflow/tensorflow/core/graph/testlib.h
|
b06bc257f65539371f10121344e8d7317eed647b |
|
24-Sep-2016 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Parallelize inner matrix multiplications of BatchMatMul on CPU when appropriate. * Uses simple heuristics to choose between parallelizing outer (batch), inner (matmul) or both. * Adds benchmarks for BatchMatMul. * Switches matmul benchmark to use real time so GFlops reported are w.r.t. walltime and measure the effect of multi-threading. * Refactors the code to avoid a lot of extra instantiations of the Eigen contraction code, which bloats code size (the test binary shrinks from 9.3MB to 5.5MB) and compilation time (the test binary's compilation time drops from ~210 to ~75 seconds). * Fixes bug in cost_per_unit calculation. The old code calculated B*M*N instead of M*N*K. Change: 134138821
/external/tensorflow/tensorflow/core/graph/testlib.h
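The cost_per_unit fix in the entry above can be illustrated with a small sketch (names are illustrative, not the TF source): for a batch of `B` matmuls of `[M, K] x [K, N]`, the work per batch element is proportional to `M * N * K` multiply-adds, so the old `B * M * N` formula miscounts whenever `K != B`, which skews the thread pool's sharding decisions.

```python
def cost_per_unit(m, n, k):
    # Multiply-adds for one [m, k] x [k, n] matmul in the batch.
    return m * n * k

B, M, K, N = 8, 64, 128, 32
old_cost = B * M * N             # the buggy formula from the old code
new_cost = cost_per_unit(M, N, K)
assert new_cost == 262144        # 64 * 32 * 128
assert old_cost != new_cost      # the bug mispriced each unit of work
```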
|
79b549db4bc99f905217c43775f76c9f56cb3dbd |
|
23-Sep-2016 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Automated rollback of change 134025273 Change: 134037266
/external/tensorflow/tensorflow/core/graph/testlib.h
|
9ff9c1f6ce1a94e64bf764d42198998cbb969e5c |
|
23-Sep-2016 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Parallelize inner matrix multiplications of BatchMatMul on CPU when appropriate. * Uses simple heuristics to choose between parallelizing outer (batch), inner (matmul) or both. * Adds benchmarks for BatchMatMul. * Switches matmul benchmark to use real time so GFlops reported are w.r.t. walltime and measure the effect of multi-threading. * Fixes bug in cost_per_unit calculation. The old code calculated B*M*N instead of M*N*K. Change: 134025273
/external/tensorflow/tensorflow/core/graph/testlib.h
|
e0b2779f44ab7bbb82a1afb64f5b639e40a58ab0 |
|
08-Aug-2016 |
A. Unique TensorFlower <gardener@tensorflow.org> |
Update deprecation comment to point to new C++ API. Change: 129663735
/external/tensorflow/tensorflow/core/graph/testlib.h
|
6714c150df0a764b29acf8d23981162dd2f0a9a1 |
|
20-Jul-2016 |
Shanqing Cai <cais@google.com> |
Automated rollback of change 127562075 Change: 127906463
/external/tensorflow/tensorflow/core/graph/testlib.h
|
12efe48d210477bf9d9fa1a3f5e0f0ab4a24de77 |
|
18-Jul-2016 |
Shanqing Cai <cais@google.com> |
Automated rollback of change 127562075 Change: 127709092
/external/tensorflow/tensorflow/core/graph/testlib.h
|
e5ea34a104f55e9d698e50982de90d99ce99550f |
|
15-Jul-2016 |
Shanqing Cai <cais@google.com> |
tfdb: Debug nodes inserter EXPERIMENTAL: Insert special debug ops (e.g., DebugIdentity) to graph for debugging. Currently, debug ops need to take exactly one input and have the string attribute "tensor_name" to indicate which tensor they watch. For example, before the node insertion, the graph may look like: A:0 -----------1----------> B | ---------2-----------> C wherein the output slot 0 of node A feeds as the input to node B through edge 1 and to node C through edge 2. After the node insertion, assuming both B and C have non-Ref input, the graph becomes: A:0 ---3---> Copy -----------4----------> B | ---------5--------> C | ---------6--------> X If a node (e.g., B) has Ref input, the graph becomes: ----------------4---------------> B | A:0 ---3-----> Copy -----------5----------> C | -----------6--------> X In other words, we do not feed deep copies to downstream nodes that take Ref inputs. The Copy node is the inserted deep-copy node that copies the input tensor on-device (e.g., CPU-to-CPU or GPU-to-GPU deep copy), which reduces the likelihood of racy updates during debug tensor-watching. X is the newly created debug node that transforms the input (a copy of the watched tensor) into a debug signal. DebugIdentity is the simplest debugging paradigm, in which the debug signal (i.e., X:0) equals the tensor itself. More sophisticated debug ops can be used to transform the tensor into other useful debug signals. An example is the added DebugNanCounter op. If the nodes (A, B and C) are located on GPU and the edges from A to B or C are HOST_MEMORY, the CopyHost op will be used instead of the Copy op. A reserved string attribute "debug_url" is created for the debug ops to make it possible to send debug signals to files or RPC calls in the future. Other points worth noting: * The debug ops have control-edge connections to the original destination node, in order to ensure that the debug signals are deterministically generated before the destination node executes. 
* More than one debug op can be added to watch a tensor. * A new field called "DebugTensorWatch" is added to RunOptions to support debug node insertion. * A new method GPUUtil::CopyGPUTensorToSameGPU has been added to make GPU-to-GPU deep-copy of tensors possible. * The two test files (debug_gateway_test.cc and debug_gateway_gpu_test.cc) have been consolidated into the former, by using the GOOGLE_CUDA macro. Change: 127562075
/external/tensorflow/tensorflow/core/graph/testlib.h
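The edge-rewriting scheme described in the entry above can be sketched on a toy dict-based graph (node name mapped to its destination nodes). All names and the data structure here are illustrative; the real inserter operates on the TensorFlow Graph with device placement, Ref handling, and control edges.

```python
def insert_debug_nodes(edges, watched, copy_node="Copy",
                       debug_node="DebugIdentity"):
    """Reroute all non-Ref fan-out of `watched` through a deep-copy node
    and attach a debug node to the copy, as in the A -> Copy -> {B, C, X}
    picture above (Ref consumers, omitted here, would keep direct edges)."""
    dests = edges.pop(watched)
    edges[watched] = [copy_node]               # A:0 --> Copy
    edges[copy_node] = dests + [debug_node]    # Copy --> B, C, ..., X
    edges[debug_node] = []                     # X emits the debug signal
    return edges

g = {"A": ["B", "C"], "B": [], "C": []}
g = insert_debug_nodes(g, "A")
assert g["A"] == ["Copy"]
assert g["Copy"] == ["B", "C", "DebugIdentity"]
```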
|
f8bd3aa8f239df106054d70443ed39081cef7917 |
|
01-Jul-2016 |
Jianmin Chen <jmchen@google.com> |
Add conv2d to testlib. Change: 126441771
/external/tensorflow/tensorflow/core/graph/testlib.h
|
3fc8c8349a152cbfeac4f29df6a33d1a0fda390d |
|
25-Jun-2016 |
Jianmin Chen <jmchen@google.com> |
Added more ops to testlib. Change: 125829994
/external/tensorflow/tensorflow/core/graph/testlib.h
|
b62ddf7cb5224b078a4dd325640227680641485a |
|
09-Jun-2016 |
A. Unique TensorFlower <nobody@tensorflow.org> |
Adds tf.random_gamma(shape, alpha, beta) to tf core. Adds a sample method to the Gamma distribution which uses this op. Change: 124498685
/external/tensorflow/tensorflow/core/graph/testlib.h
|
a8a25d85a5c57f8cbb4b22aa5bd7e9c86e5aedd8 |
|
07-Jun-2016 |
Jianmin Chen <goog.jmchen@gmail.com> |
Rewriting training graph to simulate the precision loss for quantized inference. This finds all the matmul and conv2d ops (with the most precision loss) and converts their inputs according to their types. This rewriting uses the quantize_and_dequantize op to convert tensors with the following types. 1. Const/Variable OP: This is quantized as signed tensors with no given range. 2. Activation OP: Set the range accordingly for different types of activations. Currently we handle {Relu, Relu6, Sigmoid, Tanh}. 3. Identity OP: The quantization parameters depend on what its input is. 4. Pooling OPs: various pooling ops. Also depends on its input. 5. Reshape OP: Also depends on the first input to this op. 6. Not-Listed-Above OP: If there is only 1 such op, consider it as the model input. However, if there are >1 unknown ops, then return an error for now to avoid unexpected behavior. Note: The list above might not be a complete list. Please let us know if you see the CHECK failure so we can include your use case. Change: 124190453
/external/tensorflow/tensorflow/core/graph/testlib.h
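The precision-loss simulation in the entry above can be sketched in numpy. This is a minimal sketch of the round-trip idea only (signed symmetric range; the real quantize_and_dequantize op supports more modes and range options): quantize to an integer grid, then immediately dequantize, so the tensor stays float but carries quantization error.

```python
import numpy as np

def quantize_and_dequantize(x, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1      # 127 for signed 8-bit
    scale = np.max(np.abs(x)) / qmax    # symmetric, range taken from data
    q = np.round(x / scale)             # snap to the integer grid
    return q * scale                    # back to float, with loss

x = np.array([0.1, -0.5, 0.25, 1.0])
xq = quantize_and_dequantize(x)
# The round trip keeps values close but not exact:
assert np.allclose(x, xq, atol=1.0 / 127)
```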
|
c8b59c046895fa5b6d79f73e0b5817330fcfbfc1 |
|
02-Jun-2016 |
A. Unique TensorFlower <nobody@tensorflow.org> |
Update copyright for 3p/tf/core. Change: 123900938
/external/tensorflow/tensorflow/core/graph/testlib.h
|
73d026e588ce6deffa399ecb662d70b092127ede |
|
16-Apr-2016 |
Manjunath Kudlur <keveman@gmail.com> |
Do not add large constants (> 10M) to the graph while constant folding. Change: 120015061
/external/tensorflow/tensorflow/core/graph/testlib.h
|
098f930de4ef044021f3ef1d3cdd6848c23eddb0 |
|
10-Apr-2016 |
Yuan Yu <yuanbyu@google.com> |
This is another step to make TensorFlow more interactive and flexible to users. It allows a tensor produced by a run call to stay "in-place" so that a future run call can use it in-place. To achieve this, a run call can now return a handle of a tensor to the client, which can then be fed to a subsequent run call. This feature is complementary to partial run, though there are some overlaps. Here are a few properties of the current implementation: 1. Tensors are stored in the state of a session. The tensors are garbage collected if the client doesn't have a reference to the tensor or the session is closed. 2. There is no change to the current session API. We introduced two ops to manage the conversions between tensors and their handles. (There is a third op to garbage collect a tensor.) See the example below. 3. It fits quite well into the current feed-fetch design/implementation. It tries to reuse the graph (and caches) as much as possible so as to make things efficient. Below is a simple example. More examples can be found in session_ops_test.py. # Return a handle. a = tf.constant(10) b = tf.constant(5) c = tf.mul(a, b) h = tf.get_session_handle(c).eval() # Feed a tensor handle. f, x = tf.get_session_tensor(dtypes.int32) y = tf.mul(x, 10) result = sess.run(y, feed_dict={f: h.handle}) # result == 500 Change: 119481352
/external/tensorflow/tensorflow/core/graph/testlib.h
|
643edf3db6082170bc4f7a5cf8eeb51875890b54 |
|
03-Mar-2016 |
A. Unique TensorFlower <nobody@tensorflow.org> |
Rollback of "Adds GPU kernel for gather ops." Change: 116220950
/external/tensorflow/tensorflow/core/graph/testlib.h
|
3f8a7616452346ef4de5a0da7b79fdb1cc8d6594 |
|
02-Mar-2016 |
Vijay Vasudevan <vrv@google.com> |
Rollback of "Adds GPU kernel for gather ops." Change: 116064181
/external/tensorflow/tensorflow/core/graph/testlib.h
|
f16aabc6d9c1f11d1b2c393544999d1ac960a80b |
|
01-Mar-2016 |
A. Unique TensorFlower <nobody@tensorflow.org> |
Adds GPU kernel for gather ops. Change: 116048950
/external/tensorflow/tensorflow/core/graph/testlib.h
|
4c8a0f19ee5eae260d54250d1ed6f328a5a7831a |
|
11-Feb-2016 |
Geoffrey Irving <geoffreyi@google.com> |
Fix bit rot in random op benchmarks 1. RandomParameters is now TruncatedNormal. 2. Shapes can't be scalars anymore. Change: 114377586
/external/tensorflow/tensorflow/core/graph/testlib.h
|
2585e0a75e2802ca8b9877fd06544ecca0b95cd9 |
|
09-Feb-2016 |
Eugene Brevdo <ebrevdo@gmail.com> |
TensorFlow C++ tests: Add a HostConstant to testlib. Change: 114156939
/external/tensorflow/tensorflow/core/graph/testlib.h
|
c509c78d0f28c90f45edd65f7c3bdf080d8e0283 |
|
28-Jan-2016 |
Manjunath Kudlur <keveman@gmail.com> |
Fixed constant folding to handle nodes with multiple outputs. Change: 113220692
/external/tensorflow/tensorflow/core/graph/testlib.h
|
bcd9722be4250b8584e4fe5bc4f60b8793cf87d0 |
|
28-Jan-2016 |
A. Unique TensorFlower <nobody@tensorflow.org> |
Fixed constant folding to handle nodes with multiple outputs. Change: 113215834
/external/tensorflow/tensorflow/core/graph/testlib.h
|
33aa62872774539f2a8144db0c93a1b9ce3ed51f |
|
28-Jan-2016 |
Manjunath Kudlur <keveman@gmail.com> |
Fixed constant folding to handle nodes with multiple outputs. Change: 113212117
/external/tensorflow/tensorflow/core/graph/testlib.h
|
c10f439740396006e45059435e552e4d4ad2c1ad |
|
26-Jan-2016 |
Josh Levenberg <josh11b@tensorflow.org> |
Global search & replace to move to the new location for tensorflow/core/ files and build targets. Change: 113080052
/external/tensorflow/tensorflow/core/graph/testlib.h
|
9c3043ff3bf31a6a81810b4ce9e87ef936f1f529 |
|
20-Nov-2015 |
Manjunath Kudlur <keveman@gmail.com> |
TensorFlow: Improve performance of Alexnet Changes: * error message that refers to removed `DefaultSession` method. * -Wnull-conversion warnings * the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set. * typo in tutorial data download progress message. * a typo ("however their installing"=>"however installing"). * typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website. * a typo ("subtact"=>"subtract"). * protobuf examples in comments in tensorflow::Example.proto. * formula formatting in MNIST beginner tutorial * negative fraction-of-queue-full stats * protobuf inclusion path so that Android demo will build under Blaze. * small typo (moderatly > moderately) * Session.run() to check that tensor arguments come from the session's graph. * another six import * seq2seq typo in bazel command Base CL: 108349164
/external/tensorflow/tensorflow/core/graph/testlib.h
|
56313def004795f75ef8281a0294c958d28f1e06 |
|
16-Nov-2015 |
Vijay Vasudevan <vrv@google.com> |
TensorFlow: Doc and linter fixes, some additional tests and error handling, updates to website. Changes: - Removes redundant reshape from image models by @mrry - Default TensorBoard to localhost by @danmane - Reformatting of tensorflow/core by @josh11b - Make tutorials backwards compatible to 0.5.0 by @girving - Improve print documentation (md files not updated). - Add proper scrolling to sitemap by @martinwicke Base CL: 107956254
/external/tensorflow/tensorflow/core/graph/testlib.h
|
011e9baccd343eb943d25014c4e8aec53eac396b |
|
13-Nov-2015 |
Vijay Vasudevan <vrv@google.com> |
TensorFlow: a few small updates. Changes: - Fix softmax formula in word2vec to remove an extra exp() by @gouwsmeister - Python3 fixes to remove basestring / support for unicode by @mrry - Remove some comments by Josh - Specify exact versions of bower dependencies for TensorBoard by @danmane. Base CL: 107742361
/external/tensorflow/tensorflow/core/graph/testlib.h
|
f41959ccb2d9d4c722fe8fc3351401d53bcf4900 |
|
07-Nov-2015 |
Manjunath Kudlur <keveman@gmail.com> |
TensorFlow: Initial commit of TensorFlow library. TensorFlow is an open source software library for numerical computation using data flow graphs. Base CL: 107276108
/external/tensorflow/tensorflow/core/graph/testlib.h
|