intern.txt 18 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569
  1. =========================================
  2. Internals of the Nim Compiler
  3. =========================================
  4. :Author: Andreas Rumpf
  5. :Version: |nimversion|
  6. .. contents::
  7. "Abstraction is layering ignorance on top of reality." -- unknown
  8. Directory structure
  9. ===================
  10. The Nim project's directory structure is:
  11. ============ ==============================================
  12. Path Purpose
  13. ============ ==============================================
  14. ``bin`` generated binary files
  15. ``build`` generated C code for the installation
  16. ``compiler`` the Nim compiler itself; note that this
  17. code has been translated from a bootstrapping
  18. version written in Pascal, so the code is **not**
  19. a poster child of good Nim code
  20. ``config`` configuration files for Nim
  21. ``dist`` additional packages for the distribution
  22. ``doc`` the documentation; it is a bunch of
  23. reStructuredText files
  24. ``lib`` the Nim library
  25. ``web`` website of Nim; generated by ``nimweb``
  26. from the ``*.txt`` and ``*.tmpl`` files
  27. ============ ==============================================
  28. Bootstrapping the compiler
  29. ==========================
  30. As of version 0.8.5 the compiler is maintained in Nim. (The first versions
  31. have been implemented in Object Pascal.) The Python-based build system has
  32. been rewritten in Nim too.
  33. Compiling the compiler is a simple matter of running::
  34. nim c koch.nim
  35. ./koch boot
  36. For a release version use::
  37. nim c koch.nim
  38. ./koch boot -d:release
  39. And for a debug version compatible with GDB::
  40. nim c koch.nim
  41. ./koch boot --debuginfo --linedir:on
  42. The ``koch`` program is Nim's maintenance script. It is a replacement for
  43. make and shell scripting with the advantage that it is much more portable.
  44. More information about its options can be found in the `koch <koch.html>`_
  45. documentation.
  46. Coding Guidelines
  47. =================
  48. * Use CamelCase, not underscored_identifiers.
  49. * Indent with two spaces.
  50. * Max line length is 80 characters.
  51. * Provide spaces around binary operators if that enhances readability.
  52. * Use a space after a colon, but not before it.
  53. * Start types with a capital ``T``, unless they are pointers/references which
  54. start with ``P``.
  55. See also the `API naming design <apis.html>`_ document.
  56. Porting to new platforms
  57. ========================
  58. Porting Nim to a new architecture is pretty easy, since C is the most
  59. portable programming language (within certain limits) and Nim generates
  60. C code, porting the code generator is not necessary.
  61. POSIX-compliant systems on conventional hardware are usually pretty easy to
  62. port: Add the platform to ``platform`` (if it is not already listed there),
  63. check that the OS, System modules work and recompile Nim.
  64. The only case where things aren't as easy is when the garbage
  65. collector needs some assembler tweaking to work. The standard
  66. version of the GC uses C's ``setjmp`` function to store all registers
  67. on the hardware stack. It may be necessary that the new platform needs to
  68. replace this generic code by some assembler code.
  69. Runtime type information
  70. ========================
  71. *Runtime type information* (RTTI) is needed for several aspects of the Nim
  72. programming language:
  73. Garbage collection
  74. The most important reason for RTTI. Generating
  75. traversal procedures produces bigger code and is likely to be slower on
  76. modern hardware as dynamic procedure binding is hard to predict.
  77. Complex assignments
  78. Sequences and strings are implemented as
  79. pointers to resizeable buffers, but Nim requires copying for
  80. assignments. Apart from RTTI the compiler could generate copy procedures
  81. for any type that needs one. However, this would make the code bigger and
  82. the RTTI is likely already there for the GC.
  83. We already know the type information as a graph in the compiler.
  84. Thus we need to serialize this graph as RTTI for C code generation.
  85. Look at the file ``lib/system/hti.nim`` for more information.
  86. Debugging the compiler
  87. ======================
  88. You can of course use GDB or Visual Studio to debug the
  89. compiler (via ``--debuginfo --lineDir:on``). However, there
  90. are also lots of procs that aid in debugging:
  91. .. code-block:: nim
  92. # pretty prints the Nim AST
  93. echo renderTree(someNode)
  94. # outputs some JSON representation
  95. debug(someNode)
  96. # pretty prints some type
  97. echo typeToString(someType)
  98. debug(someType)
  99. echo symbol.name.s
  100. debug(symbol)
  101. # pretty prints the Nim ast, but annotates symbol IDs:
  102. echo renderTree(someNode, {renderIds})
  103. if n.info ?? "temp.nim":
  104. # only output when it comes from "temp.nim"
  105. echo renderTree(n)
  106. if n.info ?? "temp.nim":
  107. # why does it process temp.nim here?
  108. writeStackTrace()
  109. To create a new compiler for each run, use ``koch temp``::
  110. ./koch temp c /tmp/test.nim
  111. ``koch temp`` creates a debug build of the compiler, which is useful
  112. to create stacktraces for compiler debugging.
  113. ``koch temp`` returns 125 as the exit code in case the compiler
  114. compilation fails. This exit code tells ``git bisect`` to skip the
  115. current commit.::
  116. git bisect start bad-commit good-commit
  117. git bisect run ./koch temp -r c test-source.nim
  118. The compiler's architecture
  119. ===========================
  120. Nim uses the classic compiler architecture: A lexer/scanner feds tokens to a
  121. parser. The parser builds a syntax tree that is used by the code generator.
  122. This syntax tree is the interface between the parser and the code generator.
  123. It is essential to understand most of the compiler's code.
  124. In order to compile Nim correctly, type-checking has to be separated from
  125. parsing. Otherwise generics cannot work.
  126. .. include:: filelist.txt
  127. The syntax tree
  128. ---------------
  129. The syntax tree consists of nodes which may have an arbitrary number of
  130. children. Types and symbols are represented by other nodes, because they
  131. may contain cycles. The AST changes its shape after semantic checking. This
  132. is needed to make life easier for the code generators. See the "ast" module
  133. for the type definitions. The `macros <macros.html>`_ module contains many
  134. examples how the AST represents each syntactic structure.
  135. How the RTL is compiled
  136. =======================
  137. The ``system`` module contains the part of the RTL which needs support by
  138. compiler magic (and the stuff that needs to be in it because the spec
  139. says so). The C code generator generates the C code for it just like any other
  140. module. However, calls to some procedures like ``addInt`` are inserted by
  141. the CCG. Therefore the module ``magicsys`` contains a table (``compilerprocs``)
  142. with all symbols that are marked as ``compilerproc``. ``compilerprocs`` are
  143. needed by the code generator. A ``magic`` proc is not the same as a
  144. ``compilerproc``: A ``magic`` is a proc that needs compiler magic for its
  145. semantic checking, a ``compilerproc`` is a proc that is used by the code
  146. generator.
  147. Compilation cache
  148. =================
  149. The implementation of the compilation cache is tricky: There are lots
  150. of issues to be solved for the front- and backend. In the following
  151. sections *global* means *shared between modules* or *property of the whole
  152. program*.
  153. Frontend issues
  154. ---------------
  155. Methods and type converters
  156. ~~~~~~~~~~~~~~~~~~~~~~~~~~~
  157. Nim contains language features that are *global*. The best example for that
  158. are multi methods: Introducing a new method with the same name and some
  159. compatible object parameter means that the method's dispatcher needs to take
  160. the new method into account. So the dispatching logic is only completely known
  161. after the whole program has been translated!
  162. Other features that are *implicitly* triggered cause problems for modularity
  163. too. Type converters fall into this category:
  164. .. code-block:: nim
  165. # module A
  166. converter toBool(x: int): bool =
  167. result = x != 0
  168. .. code-block:: nim
  169. # module B
  170. import A
  171. if 1:
  172. echo "ugly, but should work"
  173. If in the above example module ``B`` is re-compiled, but ``A`` is not then
  174. ``B`` needs to be aware of ``toBool`` even though ``toBool`` is not referenced
  175. in ``B`` *explicitly*.
  176. Both the multi method and the type converter problems are solved by storing
  177. them in special sections in the ROD file that are loaded *unconditionally*
  178. when the ROD file is read.
  179. Generics
  180. ~~~~~~~~
  181. If we generate an instance of a generic, we'd like to re-use that
  182. instance if possible across module boundaries. However, this is not
  183. possible if the compilation cache is enabled. So we give up then and use
  184. the caching of generics only per module, not per project. This means that
  185. ``--symbolFiles:on`` hurts a bit for efficiency. A better solution would
  186. be to persist the instantiations in a global cache per project. This might be
  187. implemented in later versions.
  188. Backend issues
  189. --------------
  190. - Init procs must not be "forgotten" to be called.
  191. - Files must not be "forgotten" to be linked.
  192. - Anything that is contained in ``nim__dat.c`` is shared between modules
  193. implicitly.
  194. - Method dispatchers are global.
  195. - DLL loading via ``dlsym`` is global.
  196. - Emulated thread vars are global.
  197. However the biggest problem is that dead code elimination breaks modularity!
  198. To see why, consider this scenario: The module ``G`` (for example the huge
  199. Gtk2 module...) is compiled with dead code elimination turned on. So none
  200. of ``G``'s procs is generated at all.
  201. Then module ``B`` is compiled that requires ``G.P1``. Ok, no problem,
  202. ``G.P1`` is loaded from the symbol file and ``G.c`` now contains ``G.P1``.
  203. Then module ``A`` (that depends onto ``B`` and ``G``) is compiled and ``B``
  204. and ``G`` are left unchanged. ``A`` requires ``G.P2``.
  205. So now ``G.c`` MUST contain both ``P1`` and ``P2``, but we haven't even
  206. loaded ``P1`` from the symbol file, nor do we want to because we then quickly
  207. would restore large parts of the whole program. But we also don't want to
  208. store ``P1`` in ``B.c`` because that would mean to store every symbol where
  209. it is referred from which ultimately means the main module and putting
  210. everything in a single C file.
  211. There is however another solution: The old file ``G.c`` containing ``P1`` is
  212. **merged** with the new file ``G.c`` containing ``P2``. This is the solution
  213. that is implemented in the C code generator (have a look at the ``ccgmerge``
  214. module). The merging may lead to *cruft* (aka dead code) in generated C code
  215. which can only be removed by recompiling a project with the compilation cache
  216. turned off. Nevertheless the merge solution is way superior to the
  217. cheap solution "turn off dead code elimination if the compilation cache is
  218. turned on".
  219. Debugging Nim's memory management
  220. =================================
  221. The following paragraphs are mostly a reminder for myself. Things to keep
  222. in mind:
  223. * If an assertion in Nim's memory manager or GC fails, the stack trace
  224. keeps allocating memory! Thus a stack overflow may happen, hiding the
  225. real issue.
  226. * What seem to be C code generation problems is often a bug resulting from
  227. not producing prototypes, so that some types default to ``cint``. Testing
  228. without the ``-w`` option helps!
  229. The Garbage Collector
  230. =====================
  231. Introduction
  232. ------------
  233. I use the term *cell* here to refer to everything that is traced
  234. (sequences, refs, strings).
  235. This section describes how the new GC works.
  236. The basic algorithm is *Deferrent Reference Counting* with cycle detection.
  237. References on the stack are not counted for better performance and easier C
  238. code generation.
  239. Each cell has a header consisting of a RC and a pointer to its type
  240. descriptor. However the program does not know about these, so they are placed at
  241. negative offsets. In the GC code the type ``PCell`` denotes a pointer
  242. decremented by the right offset, so that the header can be accessed easily. It
  243. is extremely important that ``pointer`` is not confused with a ``PCell``
  244. as this would lead to a memory corruption.
  245. The CellSet data structure
  246. --------------------------
  247. The GC depends on an extremely efficient datastructure for storing a
  248. set of pointers - this is called a ``TCellSet`` in the source code.
  249. Inserting, deleting and searching are done in constant time. However,
  250. modifying a ``TCellSet`` during traversation leads to undefined behaviour.
  251. .. code-block:: Nim
  252. type
  253. TCellSet # hidden
  254. proc cellSetInit(s: var TCellSet) # initialize a new set
  255. proc cellSetDeinit(s: var TCellSet) # empty the set and free its memory
  256. proc incl(s: var TCellSet, elem: PCell) # include an element
  257. proc excl(s: var TCellSet, elem: PCell) # exclude an element
  258. proc `in`(elem: PCell, s: TCellSet): bool # tests membership
  259. iterator elements(s: TCellSet): (elem: PCell)
  260. All the operations have to perform efficiently. Because a Cellset can
  261. become huge a hash table alone is not suitable for this.
  262. We use a mixture of bitset and hash table for this. The hash table maps *pages*
  263. to a page descriptor. The page descriptor contains a bit for any possible cell
  264. address within this page. So including a cell is done as follows:
  265. - Find the page descriptor for the page the cell belongs to.
  266. - Set the appropriate bit in the page descriptor indicating that the
  267. cell points to the start of a memory block.
  268. Removing a cell is analogous - the bit has to be set to zero.
  269. Single page descriptors are never deleted from the hash table. This is not
  270. needed as the data structures needs to be rebuilt periodically anyway.
  271. Complete traversal is done in this way::
  272. for each page descriptor d:
  273. for each bit in d:
  274. if bit == 1:
  275. traverse the pointer belonging to this bit
  276. Further complications
  277. ---------------------
  278. In Nim the compiler cannot always know if a reference
  279. is stored on the stack or not. This is caused by var parameters.
  280. Consider this example:
  281. .. code-block:: Nim
  282. proc setRef(r: var ref TNode) =
  283. new(r)
  284. proc usage =
  285. var
  286. r: ref TNode
  287. setRef(r) # here we should not update the reference counts, because
  288. # r is on the stack
  289. setRef(r.left) # here we should update the refcounts!
  290. We have to decide at runtime whether the reference is on the stack or not.
  291. The generated code looks roughly like this:
  292. .. code-block:: C
  293. void setref(TNode** ref) {
  294. unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode)))
  295. }
  296. void usage(void) {
  297. setRef(&r)
  298. setRef(&r->left)
  299. }
  300. Note that for systems with a continuous stack (which most systems have)
  301. the check whether the ref is on the stack is very cheap (only two
  302. comparisons).
  303. Code generation for closures
  304. ============================
  305. Code generation for closures is implemented by `lambda lifting`:idx:.
  306. Design
  307. ------
  308. A ``closure`` proc var can call ordinary procs of the default Nim calling
  309. convention. But not the other way round! A closure is implemented as a
  310. ``tuple[prc, env]``. ``env`` can be nil implying a call without a closure.
  311. This means that a call through a closure generates an ``if`` but the
  312. interoperability is worth the cost of the ``if``. Thunk generation would be
  313. possible too, but it's slightly more effort to implement.
  314. Tests with GCC on Amd64 showed that it's really beneficical if the
  315. 'environment' pointer is passed as the last argument, not as the first argument.
  316. Proper thunk generation is harder because the proc that is to wrap
  317. could stem from a complex expression:
  318. .. code-block:: nim
  319. receivesClosure(returnsDefaultCC[i])
  320. A thunk would need to call 'returnsDefaultCC[i]' somehow and that would require
  321. an *additional* closure generation... Ok, not really, but it requires to pass
  322. the function to call. So we'd end up with 2 indirect calls instead of one.
  323. Another much more severe problem which this solution is that it's not GC-safe
  324. to pass a proc pointer around via a generic ``ref`` type.
  325. Example code:
  326. .. code-block:: nim
  327. proc add(x: int): proc (y: int): int {.closure.} =
  328. return proc (y: int): int =
  329. return x + y
  330. var add2 = add(2)
  331. echo add2(5) #OUT 7
  332. This should produce roughly this code:
  333. .. code-block:: nim
  334. type
  335. PEnv = ref object
  336. x: int # data
  337. proc anon(y: int, c: PEnv): int =
  338. return y + c.x
  339. proc add(x: int): tuple[prc, data] =
  340. var env: PEnv
  341. new env
  342. env.x = x
  343. result = (anon, env)
  344. var add2 = add(2)
  345. let tmp = if add2.data == nil: add2.prc(5) else: add2.prc(5, add2.data)
  346. echo tmp
  347. Beware of nesting:
  348. .. code-block:: nim
  349. proc add(x: int): proc (y: int): proc (z: int): int {.closure.} {.closure.} =
  350. return lamba (y: int): proc (z: int): int {.closure.} =
  351. return lambda (z: int): int =
  352. return x + y + z
  353. var add24 = add(2)(4)
  354. echo add24(5) #OUT 11
  355. This should produce roughly this code:
  356. .. code-block:: nim
  357. type
  358. PEnvX = ref object
  359. x: int # data
  360. PEnvY = ref object
  361. y: int
  362. ex: PEnvX
  363. proc lambdaZ(z: int, ey: PEnvY): int =
  364. return ey.ex.x + ey.y + z
  365. proc lambdaY(y: int, ex: PEnvX): tuple[prc, data: PEnvY] =
  366. var ey: PEnvY
  367. new ey
  368. ey.y = y
  369. ey.ex = ex
  370. result = (lambdaZ, ey)
  371. proc add(x: int): tuple[prc, data: PEnvX] =
  372. var ex: PEnvX
  373. ex.x = x
  374. result = (labmdaY, ex)
  375. var tmp = add(2)
  376. var tmp2 = tmp.fn(4, tmp.data)
  377. var add24 = tmp2.fn(4, tmp2.data)
  378. echo add24(5)
  379. We could get rid of nesting environments by always inlining inner anon procs.
  380. More useful is escape analysis and stack allocation of the environment,
  381. however.
  382. Alternative
  383. -----------
  384. Process the closure of all inner procs in one pass and accumulate the
  385. environments. This is however not always possible.
  386. Accumulator
  387. -----------
  388. .. code-block:: nim
  389. proc getAccumulator(start: int): proc (): int {.closure} =
  390. var i = start
  391. return lambda: int =
  392. inc i
  393. return i
  394. proc p =
  395. var delta = 7
  396. proc accumulator(start: int): proc(): int =
  397. var x = start-1
  398. result = proc (): int =
  399. x = x + delta
  400. inc delta
  401. return x
  402. var a = accumulator(3)
  403. var b = accumulator(4)
  404. echo a() + b()
  405. Internals
  406. ---------
  407. Lambda lifting is implemented as part of the ``transf`` pass. The ``transf``
  408. pass generates code to setup the environment and to pass it around. However,
  409. this pass does not change the types! So we have some kind of mismatch here; on
  410. the one hand the proc expression becomes an explicit tuple, on the other hand
  411. the tyProc(ccClosure) type is not changed. For C code generation it's also
  412. important the hidden formal param is ``void*`` and not something more
  413. specialized. However the more specialized env type needs to passed to the
  414. backend somehow. We deal with this by modifying ``s.ast[paramPos]`` to contain
  415. the formal hidden parameter, but not ``s.typ``!