History log of /external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
Revision Date Author Comments
5840b3fa4efef708f586a77be5490b99554b9188 20-Sep-2017 A. Unique TensorFlower <gardener@tensorflow.org> Update the debug_ops unittest

PiperOrigin-RevId: 169341341
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
1196fbc824b45679cba9fae8daa35cc6c02d3599 01-Sep-2017 A. Unique TensorFlower <gardener@tensorflow.org> Use boolean literals where appropriate instead of narrowing ints

PiperOrigin-RevId: 167314054
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
3c482c66b5a1f74875969e96834ff7564e829668 11-Aug-2017 Shanqing Cai <cais@google.com> tfdbg: extend grpc_debug_server protocol for interactive debugging

Previously, a grpc-gated debug op had two modes: DISABLED and ENABLED.
This CL splits the ENABLED state into two states: READ_ONLY and READ_WRITE.

* READ_ONLY is equivalent to the previous ENABLED state, wherein a debug op
publishes the debug tensor to the grpc debug server and proceeds. It can be
regarded as a "watchpoint" that doesn't block execution.
* READ_WRITE is a "breakpoint". In addition to publishing the debug tensor,
it blocks and awaits an EventReply proto response from the grpc debug server
before proceeding.
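
The three states above can be sketched as follows. This is a minimal illustration, not the TensorFlow implementation: the GateState enum, run_gated_debug_op, and the publish/await_reply callables are hypothetical stand-ins for the debug op and its gRPC stream.

```python
from enum import Enum


class GateState(Enum):
    """Hypothetical names mirroring the three states in the commit message."""
    DISABLED = 0
    READ_ONLY = 1
    READ_WRITE = 2


def run_gated_debug_op(state, tensor, publish, await_reply):
    """Return the list of actions taken for one debug-op execution.

    `publish` and `await_reply` are caller-supplied callables standing in
    for the gRPC stream; they are assumptions, not TensorFlow APIs.
    """
    actions = []
    if state is GateState.DISABLED:
        return actions  # gated off: nothing is sent
    publish(tensor)  # watchpoint behavior: send the tensor and continue
    actions.append("published")
    if state is GateState.READ_WRITE:
        await_reply()  # breakpoint behavior: block until the server replies
        actions.append("awaited_reply")
    return actions
```

With a list's append method as `publish`, READ_ONLY records the tensor and returns immediately, while READ_WRITE additionally invokes `await_reply` before returning.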

PiperOrigin-RevId: 164987725
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
0fe0bfcc3cf6930edc096998b1445cead92de8c3 06-Jun-2017 A. Unique TensorFlower <gardener@tensorflow.org> Remove unused protobuf header inclusions

PiperOrigin-RevId: 158120864
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
cc2dd4ac8538045e94e3f8fe4fb1c532f67c1844 05-Jun-2017 Shanqing Cai <cais@google.com> tfdbg: dump debug data from different devices in separate directories

Fixes: #7051
wherein TFDBG failed to load the data dump from a Session.run() involving multiple GPUs.

The root cause of the bug was that TFDBG previously assumed that node names are unique across all partition graphs. This is, however, not the case when multiple GPUs exist: the Send/Recv nodes in the partition graphs of the GPUs can have duplicate names. Similar cases may arise in the future for other reasons (e.g., distributed sessions and/or graph optimization).

This CL relaxes this assumption, by dumping the GraphDef and tensor data from different devices into different sub-directories under the dump root directory.
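
A minimal sketch of the idea, assuming a hypothetical device_dump_dir helper and a made-up sanitization scheme (the actual sub-directory naming used by TFDBG is not stated here): giving each device its own sub-directory keeps identically named nodes on different GPUs from colliding.

```python
import posixpath
import re


def device_dump_dir(dump_root, device_name):
    """Map a device name to its own sub-directory under dump_root.

    The sanitization scheme is hypothetical. It only replaces characters
    that are awkward in file names; device names that differ solely in
    replaced characters would still collide under this simple scheme.
    """
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", device_name)
    return posixpath.join(dump_root, safe)


# Two GPUs that both contain a node with the same name (illustrative names).
gpu0 = device_dump_dir("/tmp/dump", "/job:localhost/replica:0/task:0/gpu:0")
gpu1 = device_dump_dir("/tmp/dump", "/job:localhost/replica:0/task:0/gpu:1")
same_node = "send_node"
paths = {posixpath.join(gpu0, same_node), posixpath.join(gpu1, same_node)}
```

Here `paths` holds two distinct entries even though the node name is identical on both devices.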

PiperOrigin-RevId: 158029814
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
d83074847ebfe8871188f1f9f1e84ab0451f59e6 30-May-2017 A. Unique TensorFlower <gardener@tensorflow.org> Use "nullptr" for null pointer values

PiperOrigin-RevId: 157468186
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
b4466279a63d072c581e9eb5b79480d8482128b5 27-May-2017 Shanqing Cai <cais@google.com> tfdbg: add runtime shape and dtype info to DebugNumericSummary

PiperOrigin-RevId: 157291215
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
1d0b8c007b8bc7f77dd63c74f02d87185071f038 09-May-2017 Peter Hawkins <phawkins@google.com> Remove unnecessary copies of value parameters.

PiperOrigin-RevId: 155511618
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
ecb5266e4791639781e4789a91cca8d3e00c4da7 10-Apr-2017 Shanqing Cai <cais@google.com> tfdbg core: allow gRPC debug server to remotely disable/enable debug ops

Synopsis of changes:
* The EventReply protobuf is expanded: a new field called "debug_op_state_change" is added to allow the debug server to remotely enable and disable debug ops.

* At the end of every debug gRPC stream, the server sends all the queued EventReply protos to the client. The client (i.e., the debugged TF runtime) receives them and toggles the enabled status of the debug ops accordingly.

* Added gated_grpc attribute to existing debug ops. This new boolean attribute is set to False by default, ensuring backward compatibility in behavior. If set to True, the debug ops will send the output tensors through grpc:// streams if and only if they are currently enabled. Otherwise we say that the debug op is "gated off" at the grpc:// URL.

* If a debug op is gated off at all URLs it possesses, it will perform no expensive computation and instead just emit an empty (size {0}) output tensor.
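
The gating rules in the synopsis above can be sketched as follows. The function gated_debug_output and the enabled_urls set are hypothetical, and treating non-grpc URLs (e.g., file://) as never gated is an assumption made for this illustration.

```python
def gated_debug_output(tensor, debug_urls, enabled_urls, gated_grpc=False):
    """Return (urls_to_publish_to, output_tensor) for one debug op.

    `enabled_urls` is a hypothetical set of grpc:// URLs at which the op
    is currently enabled. Non-grpc URLs are assumed to be always open.
    """
    if not gated_grpc:
        # Default (gated_grpc=False): behave as before, publish everywhere.
        return list(debug_urls), tensor
    targets = [
        url for url in debug_urls
        if not url.startswith("grpc://") or url in enabled_urls
    ]
    if not targets:
        # Gated off at every URL: skip the work, emit an empty output.
        return [], []
    return targets, tensor
```

For example, with gated_grpc=True and a single grpc:// URL that is not in `enabled_urls`, the function returns no targets and an empty output, mirroring the "gated off at all URLs" case.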

Other detailed change descriptions:
* All debug ops now share the same base class "BaseDebugOp" to reduce the amount of boilerplate, which has grown in size due to the new gRPC gating logic.
Change: 152733779
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
3288f2eee7140e4a97c5976417fcbab5fe28a05c 20-Mar-2017 Shanqing Cai <cais@google.com> tfdbg core: add configurable attributes to debug ops, DebugNumericSummary

Added three attributes to the debug op "DebugNumericSummary" used in tfdbg-based TensorBoard health pills:
1) lower_bound (type: float)
2) upper_bound (type: float)
3) mute_if_healthy (type: bool)

lower_bound and upper_bound make it possible to customize thresholds beyond which tensor elements are counted as -inf or inf. mute_if_healthy makes it possible to mute a DebugNumericSummary op unless there are nan, -inf or inf elements in the watched tensor, which is useful for reducing the amount of health pill data.
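
A rough Python sketch of the counting behavior described above. The summarize function and its dictionary output are hypothetical (the real op emits a tensor whose layout is not reproduced here), and whether the bound comparisons are strict is an assumption.

```python
import math


def summarize(values, lower_bound=-math.inf, upper_bound=math.inf,
              mute_if_healthy=False):
    """Count nan, -inf and inf elements, honoring custom thresholds.

    Elements below lower_bound are counted as -inf and elements above
    upper_bound as inf. Strict comparison is an assumption of this sketch.
    Returns None when muted, i.e. mute_if_healthy is set and nothing is bad.
    """
    nan_count = sum(1 for v in values if math.isnan(v))
    neg_inf_count = sum(
        1 for v in values if v == -math.inf or v < lower_bound)
    pos_inf_count = sum(
        1 for v in values if v == math.inf or v > upper_bound)
    if mute_if_healthy and not (nan_count or neg_inf_count or pos_inf_count):
        return None
    return {"nan": nan_count, "neg_inf": neg_inf_count,
            "pos_inf": pos_inf_count}
```

With lower_bound=-100.0, a finite value such as -200.0 is counted alongside true -inf elements, and a tensor of ordinary finite values produces no summary when mute_if_healthy is set.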

Changes are made in the C++ DebugNodeInserter class, so that these attributes can be directly set from Python methods such as tf_debug.watch_graph() using the following syntax in the debug_ops argument:
debug_ops=["DebugNumericSummary(attribute_name=attribute_value)"]

e.g.,
debug_ops=["DebugNumericSummary(lower_bound=-100.0; mute_if_healthy=true)"]

Currently, string, float, int, and bool attribute value types are supported.
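
To illustrate the syntax, here is a small standalone parser for such spec strings. It is a sketch, not the C++ DebugNodeInserter logic: parse_debug_op_spec and _coerce are hypothetical names, and inferring the value type from the text alone is an assumption of this sketch.

```python
def _coerce(raw):
    """Guess a bool, int, float or string value from its text form."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_debug_op_spec(spec):
    """Split 'OpName(a=1; b=true)' into (op_name, {attribute: value})."""
    name, paren, rest = spec.partition("(")
    if not paren:
        return spec.strip(), {}  # plain op name, no attributes
    rest = rest.strip()
    if not rest.endswith(")"):
        raise ValueError("missing closing parenthesis: %r" % spec)
    attrs = {}
    for item in rest[:-1].split(";"):
        item = item.strip()
        if not item:
            continue
        key, eq, raw = item.partition("=")
        if not eq:
            raise ValueError("expected name=value, got %r" % item)
        attrs[key.strip()] = _coerce(raw.strip())
    return name.strip(), attrs
```

Applied to the example above, the parser yields the op name "DebugNumericSummary" with lower_bound as the float -100.0 and mute_if_healthy as the boolean True.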
Change: 150665493
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
affb6697d7b80085d131fd7ffb839496cd882e4b 21-Feb-2017 Shanqing Cai <cais@google.com> tfdbg ops: Expand DebugNumericSummary to bool and integer inputs
Change: 148102707
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
a6421c4dda1a83ea975bae545df1de16d38726b0 18-Feb-2017 A. Unique TensorFlower <gardener@tensorflow.org> Swap NaN count from index 7 to 2 within DebugNumericSummary ops.
Change: 147888410
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
ed814ca9682d043d408e8789e0d6e5c2dc94cd89 12-Dec-2016 Shanqing Cai <cais@google.com> tfdbg: add debug op DebugNumericSummary
Change: 141721771
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
879e0accd1c833771c8058d3eb5f2d4f06f895d4 04-Nov-2016 Jonathan Hseu <jhseu@google.com> Change FileExists to return tensorflow::Status.

Also done separately by @llhe at github.com/tensorflow/tensorflow/pull/5370. We needed to make this change internally to fix all callers.

Motivation: The existing FileExists interface doesn't allow callers to distinguish between file not found vs. filesystem errors.

Semantics changes:
- gfile.Exists in Python now throws an exception for filesystem errors. It continues to return true/false if it can accurately determine whether a file exists.
- RecursivelyCreateDir now returns errors for filesystem errors when calling FileExists.
Change: 138224013
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
9ccbae51231fcf6cfc8c4ef727790c21f7fed85c 30-Jul-2016 Shanqing Cai <cais@google.com> Debugger: File IO utils + File dumping from DebugOps

* Add debug_urls to existing Debug Ops to allow specification of debug URLs.

* Support file debug URLs, such as "file:///tmp/tfdbg_dump_1", where the path specifies a directory in which the dump files will be generated for intermediate tensors during a Session::Run() call.
For example, given that the node name is "foo/bar/node_a", the output slot index of the dumped tensor is 0, and the debug Op is "DebugIdentity", the full path of the dump file will be: "/tmp/tfdbg_dump_1/foo/bar/node_a_0_DebugIdentity_${WALL_TIME_US}", where WALL_TIME_US is the wall timestamp for when the dumped tensor is generated, in microseconds.

* The debug_urls list of strings can contain multiple individual URLs, e.g., {"file:///tmp/dump_1", "file:///tmp/dump_2"}, in which case each debug signal tensor will be dumped to all specified paths.

* Add C++ and Python unit tests.
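
The dump-path scheme described in the list above can be sketched in Python. The helper dump_file_paths is hypothetical and only mirrors the naming pattern quoted in the commit message; it does not write any files.

```python
import posixpath
import time

FILE_SCHEME = "file://"


def dump_file_paths(debug_urls, node_name, output_slot, debug_op,
                    wall_time_us=None):
    """Build one dump path per file:// URL for a single debug tensor.

    Pattern: <dir>/<node_name>_<slot>_<debug_op>_<wall_time_us>
    """
    if wall_time_us is None:
        wall_time_us = int(time.time() * 1e6)  # wall time in microseconds
    leaf = "%s_%d_%s_%d" % (node_name, output_slot, debug_op, wall_time_us)
    paths = []
    for url in debug_urls:
        if not url.startswith(FILE_SCHEME):
            continue  # only file URLs are handled in this sketch
        dump_dir = url[len(FILE_SCHEME):]
        paths.append(posixpath.join(dump_dir, leaf))
    return paths
```

Calling it with the URL "file:///tmp/tfdbg_dump_1", node "foo/bar/node_a", slot 0, and op "DebugIdentity" reproduces the path shape given above, and passing two URLs yields one path under each directory.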

Other related changes:

* Move debugger- (tfdbg-)related build targets from core/BUILD to core/debug/BUILD

Future change lists will implement GRPC debug URL targets.
Change: 128873549
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
6714c150df0a764b29acf8d23981162dd2f0a9a1 20-Jul-2016 Shanqing Cai <cais@google.com> Automated rollback of change 127562075
Change: 127906463
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
12efe48d210477bf9d9fa1a3f5e0f0ab4a24de77 18-Jul-2016 Shanqing Cai <cais@google.com> Automated rollback of change 127562075
Change: 127709092
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc
e5ea34a104f55e9d698e50982de90d99ce99550f 15-Jul-2016 Shanqing Cai <cais@google.com> tfdb: Debug nodes inserter

EXPERIMENTAL: Insert special debug ops (e.g., DebugIdentity) to graph for debugging. Currently, debug ops need to take exactly one input and have the string attribute "tensor_name" to indicate what tensor they watch.

For example, before the node insertion, the graph may look like:

A:0 -----------1----------> B
      |
      ---------2-----------> C

wherein the output slot 0 of node A feeds as the input to node B through
edge 1 and to node C through edge 2.

After the node insertion, assuming both B and C have non-Ref input, the graph becomes:

A:0 ---3---> Copy -----------4----------> B
                       |
                       ---------5--------> C
                       |
                       ---------6--------> X

If a node (e.g., B) has Ref input, the graph becomes:

          ----------------4---------------> B
          |
A:0 ---3-----> Copy -----------5----------> C
                          |
                          -----------6--------> X

In other words, nodes with Ref inputs keep reading the original tensor; the deep copy is not fed to them.

The Copy node is the inserted deep-copy node that copies the input tensor on-device (e.g., CPU-to-CPU or GPU-to-GPU deep copy), which reduces the likelihood of racy updates during debug tensor-watching. X is the newly created debug node that transforms the input (copy of the watched tensor) into a debug signal.
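
The rewiring described above can be sketched on a plain edge list. This is an illustration only: insert_debug_nodes is a hypothetical function, nodes are represented as strings, and control edges are omitted.

```python
def insert_debug_nodes(src, consumers, ref_consumers=(), debug_ops=("X",)):
    """Rewire the out-edges of one watched tensor.

    src: the watched tensor, e.g. "A:0".
    consumers: nodes that originally read src, e.g. ["B", "C"].
    ref_consumers: the subset taking Ref input; these keep reading src.
    debug_ops: names of the debug nodes to attach to the copy.
    Returns the (source, destination) data edges after insertion.
    """
    edges = [(src, "Copy")]  # deep copy of the watched tensor
    for node in consumers:
        if node in ref_consumers:
            edges.append((src, node))  # Ref input: bypass the copy
        else:
            edges.append(("Copy", node))  # non-Ref input: read the copy
    for op in debug_ops:
        edges.append(("Copy", op))  # debug nodes consume the copy
    return edges
```

With consumers B and C and no Ref inputs, every downstream edge originates at Copy; marking B as a Ref consumer leaves its edge attached to A:0, matching the two cases described above.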

DebugIdentity is the simplest debugging paradigm, in which the debug signal (i.e., X:0) equals the tensor itself. More sophisticated debug ops can be used to transform the tensor into other useful debug signals. An example is the added DebugNanCounter op.

If the nodes (A, B and C) are located on GPU and an edge from A to B or C is HOST_MEMORY, the CopyHost op will be used instead of the Copy op.

A reserved string attribute "debug_url" is created for the debug ops to make it possible to send debug signals to files or RPC calls in the future.

Other points worth noting:
* The debug ops have control-edge connections to the original destination node, in order to ensure that the debug signals are deterministically generated before the destination node executes.
* More than one debug op can be added to watch a tensor.
* A new field called "DebugTensorWatch" is added to RunOptions to support debug node insertion.
* A new method GPUUtil::CopyGPUTensorToSameGPU has been added to make GPU-to-GPU deep-copy of tensors possible.
* The two test files (debug_gateway_test.cc and debug_gateway_gpu_test.cc) have been consolidated into the former, by using the GOOGLE_CUDA macro.
Change: 127562075
/external/tensorflow/tensorflow/core/kernels/debug_ops_test.cc