## Profile Model Architecture

* [Profile Model Parameters](#profile-model-parameters)
* [Profile Model Float Operations](#profile-model-float-operations)

### Profile Model Parameters

<b>Notes:</b>
The `VariableV2` operation type might include variables created implicitly by
TensorFlow. Users normally don't want to count them as "model capacity".
A customized operation type can be used to select a subset of variables.
For example, `_trainable_variables` is created automatically by the tfprof
Python API. Users can also define their own customized operation types.

```
# Parameters are created by operation type 'VariableV2' (for older models,
# it's 'Variable'). The scope view is usually suitable in this case.
tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params
_TFProfRoot (--/930.58k params)
  global_step (1/1 params)
  init/init_conv/DW (3x3x3x16, 432/864 params)
  pool_logit/DW (64x10, 640/1.28k params)
    pool_logit/DW/Momentum (64x10, 640/640 params)
  pool_logit/biases (10, 10/20 params)
    pool_logit/biases/Momentum (10, 10/10 params)
  unit_last/final_bn/beta (64, 64/128 params)
  unit_last/final_bn/gamma (64, 64/128 params)
  unit_last/final_bn/moving_mean (64, 64/64 params)
  unit_last/final_bn/moving_variance (64, 64/64 params)

# The Python API profiles tf.trainable_variables() instead of VariableV2.
#
# By default, the result is printed to stdout. Users can update
# options['output'] to write to a file. The result is always returned as a
# protocol buffer.
import sys
import tensorflow as tf

param_stats = tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder
        .trainable_variables_parameter())
sys.stdout.write('total_params: %d\n' % param_stats.total_parameters)
```

### Profile Model Float Operations

#### Caveats

For an operation to have float operation statistics:

* It must have `RegisterStatistics('flops')` defined in TensorFlow. tfprof
uses the definition to calculate float operations. Contributions are welcome.

* It must have known "shape" information for `RegisterStatistics('flops')`
to calculate the statistics. It is suggested to pass in `-run_meta_path` if
the shape is only known at runtime. tfprof can fill in the missing shape with
the runtime shape information from RunMetadata.
Because not every operation has statistics, it is also suggested to use the
`-account_displayed_op_only` option so that you know the statistics cover only
the operations printed out.

* If no RunMetadata is provided, tfprof counts the float_ops of each graph
node once, even if the node is defined inside a `tf.while_loop`. This is
because tfprof cannot determine statically how many times a node runs. If
RunMetadata is provided, tfprof calculates float_ops as
float_ops * run_count.

```python
# To profile float operations on the command line, you need to pass
# --graph_path and --op_log_path.
tfprof> scope -min_float_ops 1 -select float_ops -account_displayed_op_only
node name | # float_ops
_TFProfRoot (--/17.63b flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul (163.84k/163.84k flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1 (163.84k/163.84k flops)
  init/init_conv/Conv2D (113.25m/113.25m flops)
  pool_logit/xw_plus_b (1.28k/165.12k flops)
    pool_logit/xw_plus_b/MatMul (163.84k/163.84k flops)
  unit_1_0/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_0/sub2/conv2/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub2/conv2/Conv2D (603.98m/603.98m flops)

# Some might prefer the op view, which aggregates by operation type.
tfprof> op -min_float_ops 1 -select float_ops -account_displayed_op_only -order_by float_ops
node name | # float_ops
Conv2D                   17.63b float_ops (100.00%, 100.00%)
MatMul                   491.52k float_ops (0.00%, 0.00%)
BiasAdd                  1.28k float_ops (0.00%, 0.00%)

# You can also do this with the Python API.
tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder.float_operation())
```
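
The last caveat above (each node counted once without RunMetadata, float_ops * run_count with it) can be illustrated with a small, TensorFlow-free sketch. This is not tfprof's implementation: `total_float_ops`, the node name `loop/body/MatMul`, and the run count of 10 are all hypothetical and chosen only for illustration. The per-node value 163,840 is taken from the `MatMul` rows in the output above.

```python
from typing import Dict, Optional


def total_float_ops(static_float_ops: Dict[str, int],
                    run_counts: Optional[Dict[str, int]] = None) -> int:
    """Sum float ops over graph nodes (hypothetical helper, not a tfprof API).

    Without run counts, every node contributes its static float_ops once.
    With run counts, each node contributes float_ops * run_count.
    """
    total = 0
    for node, flops in static_float_ops.items():
        if run_counts is None:
            count = 1  # static analysis: each node is counted exactly once
        else:
            # Assumption of this sketch: a node missing from run_counts
            # is treated as having run once.
            count = run_counts.get(node, 1)
        total += flops * count
    return total


# 'loop/body/MatMul' is a made-up node imagined to live inside a tf.while_loop.
static = {
    'pool_logit/xw_plus_b/MatMul': 163840,
    'loop/body/MatMul': 163840,
}
# Hypothetical run counts, as RunMetadata might reveal after a real run.
runs = {
    'pool_logit/xw_plus_b/MatMul': 1,
    'loop/body/MatMul': 10,
}

static_total = total_float_ops(static)         # loop body counted once
runtime_total = total_float_ops(static, runs)  # loop body counted 10 times
print(static_total, runtime_total)
```

In this sketch the static total undercounts the loop body by a factor of ten, which is why passing RunMetadata matters for models that contain loops.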