===================================================
  Embedded Stack Trace Profiler (ESTP) User Guide
===================================================

.. default-role:: code
.. include:: rstcommon.rst

:Author: Andreas Rumpf
:Version: |nimversion|


Nim comes with a platform-independent profiler,
the Embedded Stack Trace Profiler (ESTP). The profiler
is *embedded* into your executable. To activate the profiler you need to:

* compile your program with the `--profiler:on --stackTrace:on`:option: command
  line options
* import the `nimprof` module
* run your program as usual.

You can in fact look at `nimprof`'s source code to see how to implement
your own profiler.

The setting `--profiler:on`:option: defines the conditional symbol `profiler`.
You can use `when compileOption("profiler")` to make the switch seamless.
If `profiler`:option: is `off`:option:, your program runs normally.
Otherwise your program is profiled.


```nim
when compileOption("profiler"):
  import nimprof
```

After your program has finished, the profiler creates a
file ``profile_results.txt`` containing the profiling results.

Since the profiler works by examining stack traces, it's essential that
the option `--stackTrace:on`:option: is active! Unfortunately, this means that a
profiling build is much slower than a release build.

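
As a minimal sketch of the steps above, the following small program can be
built either with or without the profiler. The file name `fib_demo.nim` and
the procs `fib` and `main` are hypothetical and exist only for illustration;
the slow recursive `fib` is there so that the sampled stack traces have
something to point at.

```nim
# Hypothetical file: fib_demo.nim
# Profiling build:  nim c --profiler:on --stackTrace:on fib_demo.nim
# Normal build:     nim c fib_demo.nim

when compileOption("profiler"):
  # Only pulled in when --profiler:on is given on the command line.
  import nimprof

proc fib(n: int): int =
  ## Deliberately naive recursion so that most samples land in `fib`.
  if n < 2:
    result = n
  else:
    result = fib(n - 1) + fib(n - 2)

proc main() =
  var total = 0
  for i in 0 ..< 30:
    total += fib(i)
  echo total

main()
```

When built with the profiling options and run, this should leave a
``profile_results.txt`` behind, as described above; without them the
`when` branch is skipped and the program runs normally.
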

Memory profiler
===============

You can also use ESTP as a memory profiler to see which stack traces allocate
the most memory and thus create the most GC pressure. It may also help to
find memory leaks. To activate the memory profiler you need to:

* compile your program with the
  `--profiler:off --stackTrace:on -d:memProfiler`:option:
  command line options. Yes, it's `--profiler:off`:option:.
* import the `nimprof` module
* run your program as usual.

Define the symbol `ignoreAllocationSize` so that only the number of
allocations is counted and the sizes of the memory allocations do not matter.

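
The memory profiler setup can be sketched in the same way. The file name
`alloc_demo.nim`, the `Node` type and the procs `buildList` and `length` are
hypothetical. Guarding the import with `when defined(memProfiler)` is an
assumption of this sketch, chosen because `-d:memProfiler`:option: defines
that symbol; whether allocations are actually reported may depend on the Nim
version and memory management mode in use.

```nim
# Hypothetical file: alloc_demo.nim
# Memory profiling build:
#   nim c --profiler:off --stackTrace:on -d:memProfiler alloc_demo.nim
# Count allocations only (ignore their sizes):
#   add -d:ignoreAllocationSize to the command above

when defined(memProfiler):
  # Assumption of this sketch: import nimprof only for memory profiling builds.
  import nimprof

type
  Node = ref object
    next: Node
    payload: seq[int]

proc buildList(n: int): Node =
  ## Allocates `n` nodes, each with its own seq, to create GC pressure.
  for i in 0 ..< n:
    result = Node(next: result, payload: newSeq[int](64))

proc length(head: Node): int =
  ## Walks the list so the allocations are actually used.
  var cur = head
  while cur != nil:
    inc result
    cur = cur.next

echo length(buildList(10_000))
```
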

Example results file
====================

The results file lists stack traces ordered by significance.

The following example file has been generated by profiling the Nim compiler
itself. It shows that in total 5.4% of the runtime has been spent
in `crcFromRope` or its children.

In general, the stack traces show you immediately where the problem is because
the trace acts like an explanation; in traditional profilers you can only find
expensive leaf functions easily, but the *reason* why they are invoked
often remains mysterious.

::

  total executions of each stack trace:
  Entry: 0/3391 Calls: 84/4160 = 2.0% [sum: 84; 84/4160 = 2.0%]
    newCrcFromRopeAux
    crcFromRope
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 1/3391 Calls: 46/4160 = 1.1% [sum: 130; 130/4160 = 3.1%]
    updateCrc32
    newCrcFromRopeAux
    crcFromRope
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 2/3391 Calls: 41/4160 = 0.99% [sum: 171; 171/4160 = 4.1%]
    updateCrc32
    updateCrc32
    newCrcFromRopeAux
    crcFromRope
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 3/3391 Calls: 41/4160 = 0.99% [sum: 212; 212/4160 = 5.1%]
    crcFromFile
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 4/3391 Calls: 41/4160 = 0.99% [sum: 253; 253/4160 = 6.1%]
    updateCrc32
    crcFromFile
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 5/3391 Calls: 32/4160 = 0.77% [sum: 285; 285/4160 = 6.9%]
    pop
    newCrcFromRopeAux
    crcFromRope
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    nim
  Entry: 6/3391 Calls: 17/4160 = 0.41% [sum: 302; 302/4160 = 7.3%]
    doOperation
    forAllChildrenAux
    pop
    newCrcFromRopeAux
    crcFromRope
    writeRopeIfNotEqual
    shouldRecompile
    writeModule
    myClose
    closePasses
    processModule
    CompileModule
    CompileProject
    CommandCompileToC
    MainCommand
    HandleCmdLine
    ...
    nim
  Entry: 7/3391 Calls: 14/4160 = 0.34% [sum: 316; 316/4160 = 7.6%]
    Contains
    isAccessible
    interiorAllocatedPtr
    gcMark
    markStackAndRegisters
    collectCTBody
    collectCT
    rawNewObj
    newObj
    newNode
    copyTree
    matchesAux
    matches
    resolveOverloads
    semOverloadedCall
    semOverloadedCallAnalyseEffects
    ...
    CommandCompileToC
    MainCommand
    HandleCmdLine
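
A per-procedure figure such as "time spent in `crcFromRope` or its children"
can be obtained by adding up the `Calls` counts of every entry whose stack
trace contains that procedure. The following is only a sketch of that
arithmetic: the proc name `callsPerProc` is hypothetical, and the parsing
assumes the layout shown in the example above (an `Entry:` header line
followed by one procedure name per line), which is inferred from the example
rather than taken from a format specification.

```nim
import std/[strutils, tables]

proc callsPerProc(report: string): CountTable[string] =
  ## For each procedure name, sums the `Calls` count of every entry
  ## whose stack trace mentions it (counted once per entry).
  result = initCountTable[string]()
  var calls = 0              # sample count of the entry being read
  var seen: seq[string] = @[] # names already counted for this entry
  for raw in report.splitLines():
    let line = raw.strip()
    if line.startsWith("Entry:"):
      # Header looks like: "Entry: 0/3391 Calls: 84/4160 = 2.0% [...]"
      let afterCalls = line.split("Calls:")[1].strip()
      calls = parseInt(afterCalls.split('/')[0])
      seen.setLen(0)
    elif calls > 0 and line.len > 0 and line != "...":
      # A stack frame line; recursive frames repeat, so count each name once.
      if line notin seen:
        seen.add(line)
        result.inc(line, calls)

const sample = """
total executions of each stack trace:
Entry: 0/3391 Calls: 84/4160 = 2.0% [sum: 84; 84/4160 = 2.0%]
  newCrcFromRopeAux
  crcFromRope
Entry: 1/3391 Calls: 46/4160 = 1.1% [sum: 130; 130/4160 = 3.1%]
  updateCrc32
  newCrcFromRopeAux
  crcFromRope
"""

when isMainModule:
  let counts = callsPerProc(sample)
  echo counts["crcFromRope"]   # 84 + 46 = 130
  echo counts["updateCrc32"]   # 46
```

Dividing such a count by the total number of samples (4160 in the example)
gives the share of samples in which the procedure was somewhere on the stack.
Names are counted once per entry so that recursive frames, like the repeated
`updateCrc32` in entry 2, do not inflate the result.
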