904edee4456a61d50d5b1ffe9858a7772acc423e |
|
28-Mar-2017 |
Alexander Matyasko <aam-at@users.noreply.github.com> |
Max pool grad grad (#6664)

* Adds cpu and gpu kernels for max pooling grad grad
* Adds operations to tensorflow C++ registry
* Adds python gradient definition for average and max pooling
* Adds maxpool grad grad to the list of hidden ops
* Update shape inference function for MaxPoolGradGrad
* Fixes max pool grad grad cpu kernel
* Adds grad grad grad for max pool and avg pool
* Updates gpu kernel to correctly accumulate gradient
* Adds tests for max pool grad grad (python bindings)
* Fixes max pool grad grad op error message
* Updates cpu and gpu kernels (accumulate only single value)
  Current implementation propagates only one maximum value in top_diff. To be compatible with cudnn and the rest of the tensorflow api, propagate only a single value (specifically, a should_stop flag is added to the inner loop of the kernel).
* Adds gpu grad grad kernel for HCWC data layout
* Defines pooling ops for all real number types
* Registers kernels properly
  Group kernels and register them using compiler macros. This makes it easy to add new types for kernels.
* Declares functors for max pooling kernels with support for double
  This allows using templates and conveniently registering kernels. Further refactoring may combine cpu and gpu ops. However, cpu currently does not support all the variants that are available for gpu.
* Forward declaration of max pool 3d grad grad and registers gradients
* Updates cudnn pooling bindings to support double
  Changes to core stream_executor merely duplicate the float method with a different type.
* Adds test for maxpooling with double
* Registers 3d pooling with half precision floats
* Adds kernels for max pool3d grad grad
* Increase margin for second order gradient to 1.5e-2
  testMaxPoolGradValidPadding2_1_7_3d on gpu fails with 1e-2 margin (error is 0.102). Increases margin so tests can pass. On cpu the tests pass with 1e-2 margin. The differences are related to different numerical precision on gpu and cpu.
* Adds 3rd-order gradient for max pooling
* Remove redundant comment (eigen pooling works only with floats)
* Update shape function for max pool grad grad
  More comprehensive check for input and output shapes for op.
* Fix confusion with planes/rows/cols in kernel call
* Update tests for 3d pooling kernels
* Respect clang google format
* Update cudnn pooling loading for double
  Reference commit: 191658d54f90ac03c15b339326129cd52d1f56a3
* Fix function name typo in python
* Buildifier auto-fix for build file
* Fix typo
* Remove redundant context checks
* Replace allocate_output calls with forward_input_or_...
* Use double for alpha and beta with cudnn pooling for doubles
* Updated doc for MaxPoolingGradGrad
* Update doc for MaxPoolingGradGrad kernels
* Rename s/margin/tolerance/ and update doc string
* Remove unnecessary zero initialization and launch with cuda config
  For MaxPoolGradGrad kernels zero initialization is unnecessary. Therefore, removing it might save some computation time.
* Use assertAllCloseAccordingToType
* Fix MaxPool3dGrad op definition and default type for maxpool
  Previous registration uses orig_input: float and grad: T. To pass the compatibility check, two types for the input and output are necessary.
* Fix pooling_ops_3d test for xla compiler
  Grad operations for pooling 3d were hidden like pooling 2d grads.
/external/tensorflow/tensorflow/compiler/tests/pooling_ops_3d_test.py
|
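The commit message above says the max pool grad grad kernels propagate only a single value per pooling window and stop once it is found. The pure-Python sketch below illustrates that idea for a 1-D input with VALID padding. It is not the TensorFlow kernel: the function names, the 1-D simplification, and the choice of the first maximum on ties are assumptions made for illustration.

```python
def max_pool_1d(x, ksize, stride):
    """Forward max pooling over a 1-D list (VALID padding)."""
    return [max(x[i:i + ksize]) for i in range(0, len(x) - ksize + 1, stride)]


def max_pool_grad_grad_1d(x, grad, ksize, stride):
    """Hypothetical 1-D analogue of max pool grad grad.

    `grad` has the same length as `x`; the result has the same length
    as the pooled output. For each window, the entry of `grad` at the
    position of the window's maximum in `x` is selected.
    """
    out = []
    for start in range(0, len(x) - ksize + 1, stride):
        window = x[start:start + ksize]
        best = max(window)
        for offset, value in enumerate(window):
            if value == best:
                out.append(grad[start + offset])
                # Assumption: on ties only the first maximum is used,
                # mirroring the role of the should_stop flag.
                break
    return out


x = [1.0, 3.0, 3.0, 2.0]
grad = [10.0, 20.0, 30.0, 40.0]
pooled = max_pool_1d(x, ksize=2, stride=1)
selected = max_pool_grad_grad_1d(x, grad, ksize=2, stride=1)
```

In the second window both entries of `x` equal 3.0, so only the first one contributes, which is the single-value behaviour the commit message describes.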