History log of /external/tensorflow/tensorflow/core/kernels/aggregate_ops_gpu.cu.cc
Revision Date Author Comments
9e7bf403817a3acd4e8d865b041f37609564076e 10-Apr-2017 drpngx <drpngx@users.noreply.github.com> Branch 152703253 (#9112)

* Improve py_func error handling.

Automatically translate some python errors into corresponding TF errors at runtime.
Change: 152156821

* Update interaction with libpng so that we use the public API instead of
knowledge of the internal libpng data structures.
Change: 152167754

* TensorBoard plugins now contain their own name/route prefix.
Change: 152167807

* Passes trainable flag to separable_conv2d biases.
Change: 152170239

* Saving resource variables with a caching device.
Change: 152171539

* Drop loss from estimator_spec.eval_metric_ops, as required by core Estimator.
Change: 152179924

* sample_stats.percentile DOCFIX.
Change: 152182295

* Added a memory optimizer to grappler.
Change: 152184170

* Change default behavior of the tf runs selector:

- If there are fewer than 41 runs, enable them all by default
- If there are 41 runs or more, disable them all by default

This is in response to user complaints that having it enable only the first ten runs by default was confusing, because it was not obvious to users that some runs had been disabled.
However, it still solves the initial user complaint that having very many runs simultaneously enabled would lag the UI.

I also changed the "toggle all runs" button to try to turn everything off before turning everything on.
Also, I improved the logic for detecting when the runs selection is back in the default state, so that we can avoid generating long URI strings wherever possible.
Change: 152188948
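
The default rule described in this entry can be sketched as follows. This is only an illustration of the stated threshold, not the TensorBoard frontend code; the function name and constant are hypothetical.

```python
# Hypothetical sketch of the default-selection rule from the entry above.
# RUN_LIMIT and default_run_selection are illustrative names, not TensorBoard APIs.
RUN_LIMIT = 41


def default_run_selection(run_names):
    """Return a mapping of run name -> enabled-by-default flag."""
    # Fewer than 41 runs: enable all. 41 or more: disable all.
    enable_all = len(run_names) < RUN_LIMIT
    return {name: enable_all for name in run_names}


if __name__ == "__main__":
    small = default_run_selection([f"run{i}" for i in range(40)])
    large = default_run_selection([f"run{i}" for i in range(41)])
    print(all(small.values()), any(large.values()))
```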

* Autogenerated Change: Change TensorBoard TAG to 52
Change: 152189000

* Remove warning that only happens with config cuda.
Change: 152189205

* Make resource variable shared name consistent with non-resource variables.

Remove colocation constraint from resource variable cached value with the
variable itself.
Change: 152192203

* Add a way to specify the optimization order; refactor and add constant folding to meta optimizer.
Change: 152193646

* Backport fixes and improvements from external Keras.
Change: 152198296

* Merge changes from github.
Change: 152200430

* Go: Update generated wrapper functions for TensorFlow ops.
Change: 152200754

* Update ops-related pbtxt files.
Change: 152203174

* Make ImportGraphDef() work with functions.

In addition to modifying graph_constructor.cc, this patch adds some other
functionality to enable importing functions:
* Ability to add FunctionDefLibraries to Graphs and
FunctionLibraryDefinitions (in addition to existing functions)
* FunctionDefsEqual() utility function
Change: 152205258

* Expand contrib test to more than just test targets.
Change: 152206822

* Preserve graph version during optimization
Change: 152213262

* Exclude enter and exit nodes from shape refiner's constant folding.
Change: 152213637

* Allow reshape_mover and algebraic_simplifier to make multiple mutations, by avoiding the short-circuit
std::any_of.
Change: 152232810

* Fix dynamic_rnn transpose bug (can input/output non-3d tensors).

Also a few cleanups to RNN code.
Change: 152267628

* Fix flaky tests
Change: 152272801

* Add an auto parallelization grappler optimization pass.
Change: 152276787

* Change json.decode.JSONDecodeError to ValueError. JSONDecodeError seems to be
the exception used in the simplejson module, not the json module.
Change: 152278012

* Internal change.
Change: 152281471

* [XLA] Force buffer sharing of separate while instructions.
Change: 152288540

* replica_device_setter should work for resource variables
Change: 152289915

* Fix ./configure script
1. Add %workspace% in .bazelrc file when using import statement
2. Write action_env into bazelrc file for required environment variables for OpenCL support
Change: 152290700

* Pointing a number of Tensorboard graph visualization-related help links to the new locations for the corresponding API documentation.
Change: 152293459

* Restore most of pull request #8606

Pull request #8606 added str(Label(...)) for most dependencies in
tensorflow.bzl, allowing most functions to be used from repositories which
include TensorFlow as a submodule. Unfortunately, it broke when pulled into
Google and was removed in cl/152200430. This CL restores the change, except
for two Android-only functions; these were the only problematic bits.
Change: 152297413

* Removed dead code in Estimator.
Change: 152297597

* Assert rank is at least equal to new_rank for `_sparse_inner_flatten`.
Change: 152303319

* Extend quantization ranges to include 0.0f.
Change: 152304380

* Remove Keras config file saving.
Change: 152306552

* API backwards compatibility tests.
Change: 152310869

* [TF:XLA] Add a test for an R3 -> R4 broadcast.
Change: 152313967

* Fix the problem of not having enough placeholders for persistent tensor
batch delete

The deleter_key is always a device_name, hence there is only one
of them. Hence, we cannot delete >1 handles at one time.

The fix creates delete placeholders on demand; the max
number of placeholders is _DEAD_HANDLES_THRESHOLD.
Change: 152322770

* [XLA] Add several reduction tests.
Change: 152323510

* Added the memory optimizer to the meta optimizer.
Change: 152323689

* Started a set of utilities to categorize op types
Change: 152329057

* Add AudioSpectrogram op to TensorFlow for audio feature generation
Change: 152332221

* Update ops-related pbtxt files.
Change: 152332812

* Automated rollback of change 152332221
Change: 152333917

* Call Py_CLEAR on dead fields during TF_RESOURCE-to-ndarray conversion
Change: 152338333

* [TF contrib seq2seq] Initial, incomplete implementation of beam search decoder.

**DOES NOT WORK, pushed for collaboration only**
Change: 152343927

* [XLA] Change HloPassPipeline to disallow Add* calls after Run.
Change: 152345578

* Automated rollback of change 152332812
Change: 152349057

* Remove all 64/32 bit compiler warnings from core/ops.
Change: 152353506

* libtensorflow.so: Don't export private symbols.

With this change, libtensorflow.so will only export
functions defined in c_api.h. This also results in
a decreased binary size of libtensorflow.so.

On Linux the decrease was from roughly 150MB to 67MB.
On OS X it was from roughly 101MB to 82MB.

Also fixes #8923
Change: 152366053

* Add Elu ops in XLA.
Change: 152383201

* Fixed test. ('broadcast_dims' has size 1)
Change: 152383633

* Add more detailed error message for rank assertion in _sparse_inner_flatten.
Change: 152397909

* tensor_bundle: propagates errors related to directory creation.
Change: 152401909

* matrix_adjoint added to contrib/linalg/linear_operator_util
Change: 152404828

* Add an is_active method to plugins

This method determines whether a plugin is active. A plugin may be inactive if, say, it lacks data. This new is_active method allows us to add a route to TensorBoard noting which plugins are active. The frontend could then avoid querying routes of inactive plugins.
Change: 152406232

* Replace a gather op for shapes by a stack op so dilated convolutions can be
placed on GPU even with strict placing (before the gather went to CPU).
Change: 152411159

* [TF:XLA] Implement BatchToSpace, BatchToSpaceND, SpaceToBatch, SpaceToBatchND.
Fix crashes in core implementations of the same operators for zero-sized blocks.
Change: 152416903

* Estimator saves relative paths in checkpoint.
Change: 152420211

* Fix layers_test exception regex matching.
Change: 152422855

* Unhide bijectors. Correct TransformedDistribution docstring.
Change: 152424418

* Choosing a saner default for min_eval_frequency in the constructor for Experiment for the GCS file system, because the default of 1 causes performance problems.
Change: 152439984

* Inherit use_resource from scope for partitioned variables.
Change: 152442103

* Support quantized reshape in hexagon runtime
Change: 152445539

* tfdbg CLI: add command list_source (ls) + UI fixes and improvements

The new list_source (shorthand: ls) command lists Python source files responsible for constructing the nodes and tensors encountered in the run() call.

It divides the source files into two categories and lists them separately:
1) files that are not part of the TensorFlow Python library, and
2) files that are a part of it.

The list contains information about how many nodes, tensors and dumps of tensors each file is responsible for. The file paths contain clickable links to the existing print_source/ps command.

The list_source/ls command supports filtering by file-path and node-name regex patterns.

UI fixes:
* Fixed inconsistent black vs. transparent background color that made the layout look messy on some terminal types. Now using the transparent color for default font color consistently.
* In the print_source command output, add clickable links to expand source lines and graph elements.
Change: 152446002

* tfcompile: Be a little more verbose about missing required flags.

Fixes #9014
Change: 152446338

* Disable failing test cases in pooling_ops_test.
Change: 152447322

* Register more types for tf.image_crop_and_resize(). Resolves #9020.
Change: 152448160

* Automated rollback of change 152439984
Change: 152450929

* Add a route to TensorBoard for fetching plugin names

Specifically, we add a /data/plugins_listing route to the TensorBoard application. This route responds with an object mapping the name of each initialized plugin to whether it is active.

This route could help the frontend avoid issuing requests to inactive plugins.

Ordered the listing of routes within application.py so there is a little more organization.

Refactored the test for application to use a fake plugin.
Change: 152451390

* Added the ability to retrieve the amount of usable gpu memory
Change: 152453470

* Allow setting the session ConfigProto in RunConfig and using it in Estimator.
Change: 152454548

* Colocate ResourceVariable reads with their handles.
Change: 152455939

* tfdbg: update doc for new command list_source/ls
Change: 152456128

* Make rnn directions slightly easier to follow.
Change: 152456296

* Internal change
Change: 152458104

* Adds batch renormalization.

NOTE: if you use renormalization, you might want to use faster moving average updates, i.e. lower `decay` values.
Change: 152458872

* When using ImportGraphDef with a passed in ShapeRefiner, use the
producer version of the GraphDef when importing; the ShapeRefiner
may be initialized with a different graph_def_version, so we need
to be able to override it.

The test failed without the change to graph_constructor and passes with it.
The test uses a legacy graph that is supported (reduction shape).
Change: 152459169

* Allow any iterable for `export_strategies` arg.
Change: 152461826

* Log steps/sec every 100 steps in MonitoredSession, as before.
Change: 152465320

* Fixes documentation to note that, in case of ties, the identity of the return value of ArgMin and ArgMax is not guaranteed.
Change: 152465346

* Automated rollback of change 152465346
Change: 152465844

* Fix shape inference fn on _ParallelConcatStart.
Change: 152466076

* Fix getting started guide

Explain numerical differences in loss
fix one example to print
Change: 152466119

* Remove superfluous mode argument.
Change: 152467334

* Add a tool that converts HLO computations to tensorflow GraphDef which can be visualized on Tensorboard.

This CL defines basic tensorflow::OpDef for each HLO instruction/node. More attributes (e.g. shapes, colors) will be added in the future.
Change: 152477918

* [TF:XLA] Increase shard count of //third_party/tensorflow/compiler/tests:spacetobatch_test to reduce flakiness when built under ASAN.
Change: 152496244

* Make projector plugin backend read assets saved via the PluginAssets API.

At the same time, keep backwards compatibility with the old way of looking up assets.
Change: 152504793

* Move MNIST pointers to mirror hosted by the CVDF on Google Cloud.
Fixes: #9031
Change: 152504901

* Merge changes from github.
Change: 152508170

* Update API after changing default step counter frequency before.
Change: 152517535

* Move a few random op helper functions to header files

1. shape_inference::RandomShape
2. OpKernel::MakeShape(Tensor, TensorShape*)
Change: 152522156

* addresses the divide by zero bug
Change: 152522488

* Clarify doc on tf.assign.
Change: 152523909

* Sparse adam for resource variables.
Change: 152525327

* Automated rollback of change 152310869
Change: 152528732

* Add an env_var tf_sync_on_finish_bool that, if true, blocks until the device has finished all queued operations in a step.
Change: 152533676

* Add more node attributes for HloInstruction on Tensorboard e.g. shape and layout etc.
Change: 152534472

* Add tf.complex64 GPU support to tf.gather.

Also add ldg specializations for std::complex.
Change: 152537848

* Formatting changes
Change: 152544842

* Upgrade TensorBoard TypeScript to 2.2.1

See also: #8326
Change: 152545950

* TEST: Getting reasonable test sizes on linalg library, removing need for
sharding.
Change: 152546409

* Disabling _testSourceUtilModuleReturnsTrue as it's causing opensource issues.
Change: 152548721

* Fix race due to unsafe buffer forwarding in maxpooling second order gradients added in #6664.
Re-enable previously flaky tests.
Clean up a few minor things in maxpooling_op_gpu.cu.cc
Change: 152550050

* LinearOperator: adjoint_arg kwarg added to all operators. Now,
operator.apply(x, adjoint_arg=True) means that the adjoint of 'x' is taken
before application of operator. Sometimes this is done more efficiently than
simply taking adjoint.
Change: 152560471

* Adds weighted_average_loss metric key.
Change: 152560999

* Documentation: Fix bug in manual device placement example
Change: 152563392

* Change for internal compatibility.

* Use std::vector for storage instead of map.
Do the sorting inplace and return the same vector to avoid any copies.
On larger streams it is about 50% faster.
Change: 152576112

* Add tf.add_n GPU support for complex64/complex128.

Also adds a unit test for tf.add_n.
Change: 152577190

* - Adds support for nested types in tf.case and tf.cond.
- Adds a "strict" mode which disables silent unpacking of singleton lists.
- Adds shape inference to tf.case.
- Adds a lot of unit tests.
Change: 152581097

* [XLA] Add support for folding transpose into convolution
Change: 152581336

* Add a smoke test to ensure that the doc generator runs.
Change: 152592164

* Add tensorboard to the _do_not_descend_map of the PublicAPIVisitor.
Change: 152592268

* Add auto parallelization to meta optimizer. Enable MetaOptimizer if any one of the optimizers is on.
Change: 152598517

* Update ops-related pbtxt files.
Change: 152629248

* Prevent the renorm_weight from being updated too early.
Change: 152631776

* Automated rollback of change 152528732
Change: 152652473

* Construct TensorBoard dashboards in a JS list

Previously, adding a dashboard to TensorBoard involved changing logic in several places.

As part of this effort, added constructors to dashboards. Tweaked logic in various dashboards to preserve original behavior. For instance, the graph dashboard can only perform fitting after the dashboard is attached to the DOM.
Change: 152658532

* Make CheckpointSaverListener visible next to CheckpointSaverHook.
Change: 152662945

* tfdbg CLI: minor bug fixes

1: The calculation of the scroll command in the scroll bar didn't take into account that the y-coordinate of the scroll block is in the ScrollBar coordinate system, while the mouse click y-coordinate is in the screen coordinate system.

2: The y position of the ScrollBar was off by one.

3: The command box is not re-created after mouse-triggered commands, leading to strange-looking cursor position.
Change: 152684294

* Remove obsolete use of validate_indices from embedding_ops.py

validate_indices is ignored, so it shouldn't appear in new code.
Change: 152691948

* Preparation of using GMock matchers in XLA tests.
Change: 152691970

* Replace RuntimeException by RuntimeError in coordinator documentation.
Change: 152697758

* Move the TensorBoard debugger plugin to be internal.

This feature is currently not open-source anyway.
Change: 152700267

* Add a single-machine tf.learn Estimator implementation for the WALS solver.
Change: 152700915

* Add tf.contrib.training.python_input -- making it easy to feed data into
TensorFlow from python coroutines.
Change: 152701623

* Show that QuantizeToFloat consistently introduces a small error. The
error is equal to
range_min - round(range_min / range_scale) * range_scale
Change: 152702015
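
The error expression quoted in this entry can be evaluated directly. The sketch below is an illustration only, not the TensorFlow kernel: the function name is hypothetical, and it uses Python's built-in round (which rounds halves to even), which may differ from the rounding used in the actual op.

```python
# Hypothetical helper evaluating the error term quoted in the entry above:
#   range_min - round(range_min / range_scale) * range_scale
# Assumption: Python's round() stands in for whatever rounding the op uses.
def quantize_offset_error(range_min, range_scale):
    """Return the residual left when range_min is snapped to a multiple of range_scale."""
    return range_min - round(range_min / range_scale) * range_scale


if __name__ == "__main__":
    # range_min is an exact multiple of the scale: no residual.
    print(quantize_offset_error(-1.0, 0.25))
    # range_min is not a multiple of the scale: a nonzero residual remains.
    print(quantize_offset_error(-1.0, 0.75))
```

The residual is zero only when range_min is an exact multiple of range_scale, which is consistent with the entry's point that the error is small but systematic.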

* Internal Changes
Change: 152703253

* Remove tensorflow/tensorboard/plugins/debugger, as part of merge resolution.
c8b59c046895fa5b6d79f73e0b5817330fcfbfc1 02-Jun-2016 A. Unique TensorFlower <nobody@tensorflow.org> Update copyright for 3p/tf/core.
Change: 123900938
3ede5506acf6a026f09eda33277d46e34ac7ed10 26-Jan-2016 Josh Levenberg <josh11b@tensorflow.org> Global search & replace to move to the new location for
tensorflow/core/ files and build targets.
Change: 113075177
2712ed6de036e16b2599fcab2071acd7bbf8b17a 19-Jan-2016 Eugene Brevdo <ebrevdo@gmail.com> Implement TensorArray forward ops.

Allows dynamic writing to- and reading from- an array of Tensors (size of the array determined at run time).

This is useful for, e.g., While loops. Each while iteration can write to the Array; and the final handle can be used with Concat to get all the outputs in one Tensor.

No gradient support yet, this will be implemented in a future CL.
Change: 112493043
9c3043ff3bf31a6a81810b4ce9e87ef936f1f529 20-Nov-2015 Manjunath Kudlur <keveman@gmail.com> TensorFlow: Improve performance of Alexnet

Changes:

* error message that refers to removed `DefaultSession` method.
* -Wnull-conversion warnings
* the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
* typo in tutorial data download progress message.
* a typo ("however their installing"=>"however installing").
* typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
* a typo ("subtact"=>"subtract").
* protobuf examples in comments in tensorflow::Example.proto.
* formula formatting in MNIST beginner tutorial
* negative fraction-of-queue-full stats
* protobuf inclusion path so that Android demo will build under Blaze.
* small typo (moderatly > moderately)
* Session.run() to check that tensor arguments come from the session's graph.
* another six import
* seq2seq typo in bazel command

Base CL: 108349164
56313def004795f75ef8281a0294c958d28f1e06 16-Nov-2015 Vijay Vasudevan <vrv@google.com> TensorFlow: Doc and linter fixes, some additional tests and
error handling, updates to website.

Changes:
- Removes redundant reshape from image models by @mrry
- Default TensorBoard to localhost by @danmane
- Reformatting of tensorflow/core by @josh11b
- Make tutorials backwards compatible to 0.5.0 by @girving
- Improve print documentation (md files not updated).
- Add proper scrolling to sitemap by @martinwicke

Base CL: 107956254
f41959ccb2d9d4c722fe8fc3351401d53bcf4900 07-Nov-2015 Manjunath Kudlur <keveman@gmail.com> TensorFlow: Initial commit of TensorFlow library.
TensorFlow is an open source software library for numerical computation
using data flow graphs.

Base CL: 107276108