## Profile Model Architecture

* [Profile Model Parameters](#profile-model-parameters)
* [Profile Model Float Operations](#profile-model-float-operations)

### Profile Model Parameters

<b>Notes:</b>
The `VariableV2` operation type might include variables that TensorFlow
creates implicitly. Users normally don't want to count them as "model capacity".
A customized operation type can be used to select a subset of variables.
For example, `_trainable_variables` is created automatically by the tfprof
Python API. Users can also define their own customized operation types.

```
# Parameters are created by the operation type 'VariableV2' (for older models,
# it's 'Variable'). The scope view is usually suitable in this case.
tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params
_TFProfRoot (--/930.58k params)
  global_step (1/1 params)
  init/init_conv/DW (3x3x3x16, 432/864 params)
  pool_logit/DW (64x10, 640/1.28k params)
    pool_logit/DW/Momentum (64x10, 640/640 params)
  pool_logit/biases (10, 10/20 params)
    pool_logit/biases/Momentum (10, 10/10 params)
  unit_last/final_bn/beta (64, 64/128 params)
  unit_last/final_bn/gamma (64, 64/128 params)
  unit_last/final_bn/moving_mean (64, 64/64 params)
  unit_last/final_bn/moving_variance (64, 64/64 params)

# The Python API profiles tf.trainable_variables() instead of VariableV2.
#
# By default, the result is printed to stdout. Users can update
# options['output'] to write it to a file. The result is always returned as a
# protocol buffer.
import sys
import tensorflow as tf

param_stats = tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder
        .trainable_variables_parameter())
sys.stdout.write('total_params: %d\n' % param_stats.total_parameters)
```
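Each entry in the `params` output shows the variable's shape followed by two counts. The following is a minimal sketch, not tfprof code, of the arithmetic that appears to produce those numbers. `param_count` is a hypothetical helper, and reading the second number as "own count plus children" is an assumption inferred from the sample output (for example `pool_logit/DW` and its `Momentum` child).

```python
from functools import reduce
import operator


def param_count(shape):
    """Number of elements in a variable of the given shape (hypothetical helper)."""
    return reduce(operator.mul, shape, 1)


# Shapes copied from the sample output above.
conv_dw = param_count((3, 3, 3, 16))        # init/init_conv/DW
logit_dw = param_count((64, 10))            # pool_logit/DW
logit_dw_momentum = param_count((64, 10))   # pool_logit/DW/Momentum

# Assumption: the second number is the node's own count plus its children's.
logit_dw_total = logit_dw + logit_dw_momentum

print(conv_dw, logit_dw, logit_dw_total)
```

Under these assumptions the values match the `432`, `640`, and `1.28k` shown for the corresponding nodes.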

### Profile Model Float Operations

#### Caveats

For an operation to have float operation statistics:

* It must have `RegisterStatistics('flops')` defined in TensorFlow. tfprof
uses the definition to calculate float operations. Contributions are welcome.

* It must have known "shape" information for `RegisterStatistics('flops')`
to calculate the statistics. It is suggested to pass in `-run_meta_path` if
the shape is only known at runtime. tfprof can fill in the missing shape with
the runtime shape information from RunMetadata.
Because not every operation meets these requirements, it is suggested to use
the `-account_displayed_op_only` option so that you know the statistics are
only for the operations printed out.

* If no RunMetadata is provided, tfprof counts the float_ops of each graph node
once, even if the node is defined inside `tf.while_loop`. This is because tfprof
cannot determine statically how many times the node runs. If RunMetadata is
provided, tfprof calculates float_ops as float_ops * run_count.

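The last caveat can be illustrated with a small sketch. It is not tfprof's implementation: `total_float_ops`, the node names, and the numbers are hypothetical, and only the `float_ops * run_count` rule comes from the text above.

```python
def total_float_ops(static_float_ops, run_counts=None):
    """Sum float_ops over graph nodes.

    static_float_ops: dict of node name -> float_ops for a single execution.
    run_counts: optional dict of node name -> number of times the node ran,
        standing in for what RunMetadata would supply. When it is None,
        every node is counted exactly once.
    """
    total = 0
    for name, flops in static_float_ops.items():
        count = 1 if run_counts is None else run_counts[name]
        total += flops * count
    return total


# Hypothetical graph: one MatMul inside a loop body, one outside.
static = {"loop/MatMul": 1000, "logits/MatMul": 500}
observed = {"loop/MatMul": 10, "logits/MatMul": 1}

without_run_meta = total_float_ops(static)
with_run_meta = total_float_ops(static, observed)
print(without_run_meta, with_run_meta)
```

In this made-up case the static count understates the work by the loop's iteration count, which is why supplying RunMetadata matters for graphs that contain loops.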
```python
# To profile float operations from the command line, you need to pass
# --graph_path and --op_log_path.
tfprof> scope -min_float_ops 1 -select float_ops -account_displayed_op_only
node name | # float_ops
_TFProfRoot (--/17.63b flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul (163.84k/163.84k flops)
  gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1 (163.84k/163.84k flops)
  init/init_conv/Conv2D (113.25m/113.25m flops)
  pool_logit/xw_plus_b (1.28k/165.12k flops)
    pool_logit/xw_plus_b/MatMul (163.84k/163.84k flops)
  unit_1_0/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_0/sub2/conv2/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub1/conv1/Conv2D (603.98m/603.98m flops)
  unit_1_1/sub2/conv2/Conv2D (603.98m/603.98m flops)

# Some might prefer the op view, which aggregates by operation type.
tfprof> op -min_float_ops 1 -select float_ops -account_displayed_op_only -order_by float_ops
node name | # float_ops
Conv2D                   17.63b float_ops (100.00%, 100.00%)
MatMul                   491.52k float_ops (0.00%, 0.00%)
BiasAdd                  1.28k float_ops (0.00%, 0.00%)

# You can also do that with the Python API.
import tensorflow as tf

tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder.float_operation())
```
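The numbers in the sample output can be sanity-checked by hand. The sketch below assumes the common convention of two float operations (one multiply, one add) per multiply-accumulate. It also assumes a batch size of 128 and a 32x32 output feature map for `init/init_conv/Conv2D`. Neither value is stated in this document; they are chosen because they reproduce the reported figures. The helper names are hypothetical.

```python
def matmul_flops(m, k, n):
    """Float ops for an (m x k) by (k x n) matrix product, 2 per multiply-add."""
    return 2 * m * k * n


def conv2d_flops(batch, out_h, out_w, filter_h, filter_w, in_depth, out_depth):
    """Float ops for a 2-D convolution, 2 per multiply-add per output element."""
    output_count = batch * out_h * out_w * out_depth
    return 2 * output_count * filter_h * filter_w * in_depth


BATCH = 128  # assumption, not stated in the document

# pool_logit/xw_plus_b/MatMul with the 64x10 weight matrix.
logit_matmul = matmul_flops(BATCH, 64, 10)

# init/init_conv/Conv2D with the 3x3x3x16 filter; 32x32 output is an assumption.
init_conv = conv2d_flops(BATCH, 32, 32, 3, 3, 3, 16)

# Forward MatMul plus the two gradient MatMuls listed in the scope view.
all_matmul = 3 * logit_matmul

print(logit_matmul, init_conv, all_matmul)
```

Under these assumptions the results are 163840, 113246208, and 491520, which round to the `163.84k`, `113.25m`, and `491.52k` figures in the output above.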