profile_model_architecture.md revision 9d12c629c0e3cb2767ff02ebe5e886f51c608024
##Profile Model Architecture

* [Profile Model Parameters](#profile-model-parameters)
* [Profile Model Float Operations](#profile-model-float-operations)

###Profile Model Parameters

<b>Notes:</b>
The `VariableV2` operation type might include variables created implicitly by
TensorFlow. Users normally don't want to count them as "model capacity".
You can use a customized operation type to select a subset of variables.
For example, `_trainable_variables` is created automatically by the tfprof
Python API. Users can also define their own customized operation types.

```
# Parameters are created by operation type 'VariableV2' (for older models,
# it's 'Variable'). The scope view is usually suitable in this case.
tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params
_TFProfRoot (--/930.58k params)
  global_step (1/1 params)
  init/init_conv/DW (3x3x3x16, 432/864 params)
  pool_logit/DW (64x10, 640/1.28k params)
    pool_logit/DW/Momentum (64x10, 640/640 params)
  pool_logit/biases (10, 10/20 params)
    pool_logit/biases/Momentum (10, 10/10 params)
  unit_last/final_bn/beta (64, 64/128 params)
  unit_last/final_bn/gamma (64, 64/128 params)
  unit_last/final_bn/moving_mean (64, 64/64 params)
  unit_last/final_bn/moving_variance (64, 64/64 params)

# The Python API profiles tf.trainable_variables() instead of VariableV2.
#
# By default, the result is printed to stdout. Update tfprof_options['output']
# to write to a file instead. The result is always returned as a protocol
# buffer.
import sys
import tensorflow as tf

param_stats = tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    tfprof_options=tf.contrib.tfprof.model_analyzer.
        TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
sys.stdout.write('total_params: %d\n' % param_stats.total_parameters)
```

###Profile Model Float Operations

####Caveats

For an operation to have float operation statistics:

* It must have `RegisterStatistics('flops')` defined in TensorFlow. tfprof
uses the definition to calculate float operations. Contributions are welcome.

* It must have known "shape" information for `RegisterStatistics('flops')`
to calculate the statistics. It is suggested to pass in `-run_meta_path` if
the shape is only known at runtime. tfprof can fill in the missing shape with
the runtime shape information from RunMetadata.

Because not every operation meets both conditions, it is suggested to use the
`-account_displayed_op_only` option so that you know the statistics cover only
the operations printed out.


```python
# To profile float operations on the command line, you need to pass
# --graph_path and --op_log_path.
tfprof> scope -min_float_ops 1 -select float_ops -account_displayed_op_only
node name | # float_ops
_TFProfRoot (--/17.63b flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul (163.84k/163.84k flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1 (163.84k/163.84k flops)
  init/init_conv/Conv2D (113.25m/113.25m flops)
  pool_logit/xw_plus_b (1.28k/165.12k flops)
    pool_logit/xw_plus_b/MatMul (163.84k/163.84k flops)
  unit_1_0/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_0/sub2/conv2/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub2/conv2/Conv2D (603.98m/603.98m flops)

# Some might prefer the op view, which aggregates by operation type.
tfprof> op -min_float_ops 1 -select float_ops -account_displayed_op_only -order_by float_ops
node name | # float_ops
Conv2D                   17.63b float_ops (100.00%, 100.00%)
MatMul                   491.52k float_ops (0.00%, 0.00%)
BiasAdd                  1.28k float_ops (0.00%, 0.00%)

# You can also do that in the Python API.
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)
```
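
As a sanity check, a few of the numbers printed above can be reproduced by hand. The sketch below is plain Python and does not call tfprof. It assumes a batch size of 128 and 32x32 convolution outputs; neither value is stated in this document, but both are consistent with the printed flops. The helper names (`param_count`, `matmul_flops`, `conv2d_flops`) are hypothetical and exist only for illustration.

```python
# Sketch only: reproduces a few tfprof numbers by hand.
# BATCH and IMAGE_SIZE are assumptions, not values taken from the profiled model.
from functools import reduce
from operator import mul

BATCH = 128      # assumed batch size
IMAGE_SIZE = 32  # assumed height/width of the convolution output


def param_count(shape):
    """Number of parameters in a variable with the given shape."""
    return reduce(mul, shape, 1)


def matmul_flops(m, k, n):
    """Float ops for an [m, k] x [k, n] MatMul: one multiply and one add per term."""
    return 2 * m * k * n


def conv2d_flops(batch, out_h, out_w, filter_shape):
    """Float ops for a Conv2D whose filter is [height, width, in_depth, out_depth]."""
    f_h, f_w, in_depth, out_depth = filter_shape
    output_elements = batch * out_h * out_w * out_depth
    return 2 * output_elements * f_h * f_w * in_depth


init_conv_params = param_count((3, 3, 3, 16))
pool_logit_params = param_count((64, 10))
pool_logit_matmul = matmul_flops(BATCH, 64, 10)
init_conv = conv2d_flops(BATCH, IMAGE_SIZE, IMAGE_SIZE, (3, 3, 3, 16))
unit_conv = conv2d_flops(BATCH, IMAGE_SIZE, IMAGE_SIZE, (3, 3, 16, 16))

print('init/init_conv/DW params:   %d' % init_conv_params)
print('pool_logit/DW params:       %d' % pool_logit_params)
print('pool_logit MatMul flops:    %d' % pool_logit_matmul)
print('init/init_conv Conv2D flops: %d' % init_conv)
print('unit_1_0 Conv2D flops:      %d' % unit_conv)
```

Under these assumptions the results line up with the `432`, `640`, `163.84k`, `113.25m` and `603.98m` entries in the output above. The second number in each pair printed by the scope view appears to include child nodes, which would explain why `pool_logit/DW` reports `640/1.28k` once its `Momentum` slot is counted.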