  1. ======================
  2. Nim Tutorial (Part II)
  3. ======================
  4. :Author: Andreas Rumpf
  5. :Version: |nimversion|
  6. .. contents::
  7. Introduction
  8. ============
  9. "Repetition renders the ridiculous reasonable." -- Norman Wildberger
  10. This document is a tutorial for the advanced constructs of the *Nim*
  11. programming language. **Note that this document is somewhat obsolete as the**
  12. `manual <manual.html>`_ **contains many more examples of the advanced language
  13. features.**
  14. Pragmas
  15. =======
  16. Pragmas are Nim's method to give the compiler additional information/
  17. commands without introducing a massive number of new keywords. Pragmas are
  18. enclosed in the special ``{.`` and ``.}`` curly dot brackets. This tutorial
  19. does not cover pragmas. See the `manual <manual.html#pragmas>`_ or `user guide
  20. <nimc.html#additional-features>`_ for a description of the available
  21. pragmas.
  22. Object Oriented Programming
  23. ===========================
  24. While Nim's support for object oriented programming (OOP) is minimalistic,
  25. powerful OOP techniques can be used. OOP is seen as *one* way to design a
  26. program, not *the only* way. Often a procedural approach leads to simpler
  27. and more efficient code. In particular, preferring composition over inheritance
  28. is often the better design.
  29. Objects
  30. -------
  31. Like tuples, objects are a means to pack different values together in a
  32. structured way. However, objects provide many features that tuples do not:
  33. They provide inheritance and information hiding. Because objects encapsulate
  34. data, the ``T()`` object constructor should only be used internally and the
  35. programmer should provide a proc to initialize the object (this is called
  36. a *constructor*).
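As a minimal sketch of this convention, the following example hides a field and offers a constructor proc instead. The ``Account`` type and the ``newAccount`` and ``balance`` procs are hypothetical names chosen for illustration; they do not appear elsewhere in this tutorial:

.. code-block:: nim

  type
    Account* = ref object
      owner*: string   # exported field
      balance: int     # hidden from other modules

  proc newAccount*(owner: string, initialBalance = 0): Account =
    ## Constructor proc: the only place that calls ``Account()`` directly.
    doAssert initialBalance >= 0, "initial balance must not be negative"
    Account(owner: owner, balance: initialBalance)

  proc balance*(a: Account): int =
    ## Read-only access to the hidden field.
    a.balance

  let acc = newAccount("Anton", 10)
  echo acc.owner, ": ", acc.balance

Other modules can read the balance through the exported proc, but they can neither set the field nor bypass the check in ``newAccount``.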
  37. Objects have access to their type at runtime. There is an
  38. ``of`` operator that can be used to check the object's type:
  39. .. code-block:: nim
  40. type
  41. Person = ref object of RootObj
  42. name*: string # the * means that `name` is accessible from other modules
  43. age: int # no * means that the field is hidden from other modules
  44. Student = ref object of Person # Student inherits from Person
  45. id: int # with an id field
  46. var
  47. student: Student
  48. person: Person
  49. assert(student of Student) # is true
  50. # object construction:
  51. student = Student(name: "Anton", age: 5, id: 2)
  52. echo student[]
  53. Object fields that should be visible from outside the defining module have to
  54. be marked by ``*``. In contrast to tuples, different object types are
  55. never *equivalent*. New object types can only be defined within a type
  56. section.
  57. Inheritance is done with the ``object of`` syntax. Multiple inheritance is
  58. currently not supported. If an object type has no suitable ancestor, ``RootObj``
  59. can be used as its ancestor, but this is only a convention. Objects that have
  60. no ancestor are implicitly ``final``. You can use the ``inheritable`` pragma
  61. to introduce new object roots apart from ``system.RootObj``. (This is used
  62. in the GTK wrapper for instance.)
  63. Ref objects should be used whenever inheritance is used. It isn't strictly
  64. necessary, but with non-ref objects assignments such as ``let person: Person =
  65. Student(id: 123)`` will truncate subclass fields.
  66. **Note**: Composition (*has-a* relation) is often preferable to inheritance
  67. (*is-a* relation) for simple code reuse. Since objects are value types in
  68. Nim, composition is as efficient as inheritance.
  69. Mutually recursive types
  70. ------------------------
  71. Objects, tuples and references can model quite complex data structures which
  72. depend on each other; they are *mutually recursive*. In Nim
  73. these types can only be declared within a single type section. (Anything else
  74. would require arbitrary symbol lookahead which slows down compilation.)
  75. Example:
  76. .. code-block:: nim
  77. type
  78. Node = ref NodeObj # a traced reference to a NodeObj
  79. NodeObj = object
  80. le, ri: Node # left and right subtrees
  81. sym: ref Sym # leaves contain a reference to a Sym
  82. Sym = object # a symbol
  83. name: string # the symbol's name
  84. line: int # the line the symbol was declared in
  85. code: Node # the symbol's abstract syntax tree
  86. Type conversions
  87. ----------------
  88. Nim distinguishes between `type casts`:idx: and `type conversions`:idx:.
  89. Casts are done with the ``cast`` operator and force the compiler to
  90. interpret a bit pattern to be of another type.
  91. Type conversions are a much more polite way to convert a type into another:
  92. They preserve the abstract *value*, not necessarily the *bit-pattern*. If a
  93. type conversion is not possible, the compiler complains or an exception is
  94. raised.
  95. The syntax for type conversions is ``destination_type(expression_to_convert)``
  96. (like an ordinary call):
  97. .. code-block:: nim
  98. proc getID(x: Person): int =
  99. Student(x).id
  100. The ``InvalidObjectConversionError`` exception is raised if ``x`` is not a
  101. ``Student``.
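The following sketch shows one way to avoid the exception by checking the runtime type with ``of`` before converting, and contrasts a conversion with a cast. The helper ``getIdOr`` and the sample values are assumptions made for this illustration:

.. code-block:: nim

  type
    Person = ref object of RootObj
      name: string
    Student = ref object of Person
      id: int

  proc getIdOr(x: Person, fallback: int): int =
    ## Hypothetical helper: convert only after checking the runtime type.
    if x of Student: Student(x).id
    else: fallback

  let
    anton: Person = Student(name: "Anton", id: 2)
    berta = Person(name: "Berta")

  echo getIdOr(anton, -1)   # 2
  echo getIdOr(berta, -1)   # -1

  # A conversion preserves the abstract value, a cast reinterprets the bits:
  let
    f = 3.7
    one = 1.0'f32
  echo int(f)               # 3 (the fractional part is dropped)
  echo cast[uint32](one)    # 1065353216 (the IEEE 754 bit pattern of 1.0)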
  102. Object variants
  103. ---------------
An object hierarchy is often overkill in situations where simple variant
types would suffice.
  106. An example:
.. code-block:: nim

  # This is an example of how an abstract syntax tree could be modelled in Nim
  type
    NodeKind = enum  # the different node types
      nkInt,          # a leaf with an integer value
      nkFloat,        # a leaf with a float value
      nkString,       # a leaf with a string value
      nkAdd,          # an addition
      nkSub,          # a subtraction
      nkIf            # an if statement
    Node = ref NodeObj
    NodeObj = object
      case kind: NodeKind  # the ``kind`` field is the discriminator
      of nkInt: intVal: int
      of nkFloat: floatVal: float
      of nkString: strVal: string
      of nkAdd, nkSub:
        leftOp, rightOp: Node
      of nkIf:
        condition, thenPart, elsePart: Node

  var n = Node(kind: nkFloat, floatVal: 1.0)
  # the following statement raises a `FieldError` exception, because
  # n.kind's value does not fit:
  n.strVal = ""
As can be seen from the example, an advantage over an object hierarchy is that
no conversion between different object types is needed. However, access to
invalid object fields raises an exception.
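A ``case`` statement over the discriminator is the usual way to process such a variant safely. The sketch below uses a reduced version of the node type (only three kinds) and a hypothetical ``eval`` proc to show the idea:

.. code-block:: nim

  type
    NodeKind = enum nkInt, nkAdd, nkSub
    Node = ref object
      case kind: NodeKind
      of nkInt: intVal: int
      of nkAdd, nkSub: leftOp, rightOp: Node

  proc eval(n: Node): int =
    # each branch only touches the fields that are valid for that kind
    case n.kind
    of nkInt: n.intVal
    of nkAdd: eval(n.leftOp) + eval(n.rightOp)
    of nkSub: eval(n.leftOp) - eval(n.rightOp)

  # (5 + 3) - 2
  let tree = Node(kind: nkSub,
                  leftOp: Node(kind: nkAdd,
                               leftOp: Node(kind: nkInt, intVal: 5),
                               rightOp: Node(kind: nkInt, intVal: 3)),
                  rightOp: Node(kind: nkInt, intVal: 2))
  echo eval(tree)   # 6

Because the ``case`` must cover every enum value, adding a new node kind later makes the compiler point out every proc that still needs a branch for it.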
  134. Methods
  135. -------
  136. In ordinary object oriented languages, procedures (also called *methods*) are
  137. bound to a class. This has disadvantages:
* Adding a method to a class the programmer has no control over is
  impossible or needs ugly workarounds.
* Often it is unclear where the method should belong: is
  ``join`` a string method or an array method?
  142. Nim avoids these problems by not assigning methods to a class. All methods
  143. in Nim are multi-methods. As we will see later, multi-methods are
  144. distinguished from procs only for dynamic binding purposes.
  145. Method call syntax
  146. ------------------
There is syntactic sugar for calling routines:
  148. The syntax ``obj.method(args)`` can be used instead of ``method(obj, args)``.
  149. If there are no remaining arguments, the parentheses can be omitted:
  150. ``obj.len`` (instead of ``len(obj)``).
This method call syntax is not restricted to objects; it can be used
for any type:
.. code-block:: nim

  import strutils  # provides toUpperAscii

  echo "abc".len # is the same as echo len("abc")
  echo "abc".toUpperAscii()
  echo({'a', 'b', 'c'}.card)
  stdout.writeLine("Hallo") # the same as writeLine(stdout, "Hallo")
  158. (Another way to look at the method call syntax is that it provides the missing
  159. postfix notation.)
  160. So "pure object oriented" code is easy to write:
  161. .. code-block:: nim
  162. import strutils, sequtils
  163. stdout.writeLine("Give a list of numbers (separated by spaces): ")
  164. stdout.write(stdin.readLine.splitWhitespace.map(parseInt).max.`$`)
  165. stdout.writeLine(" is the maximum!")
  166. Properties
  167. ----------
  168. As the above example shows, Nim has no need for *get-properties*:
  169. Ordinary get-procedures that are called with the *method call syntax* achieve
  170. the same. But setting a value is different; for this a special setter syntax
  171. is needed:
  172. .. code-block:: nim
  173. type
  174. Socket* = ref object of RootObj
  175. h: int # cannot be accessed from the outside of the module due to missing star
  176. proc `host=`*(s: var Socket, value: int) {.inline.} =
  177. ## setter of host address
  178. s.h = value
  179. proc host*(s: Socket): int {.inline.} =
  180. ## getter of host address
  181. s.h
  182. var s: Socket
  183. new s
  184. s.host = 34 # same as `host=`(s, 34)
  185. (The example also shows ``inline`` procedures.)
  186. The ``[]`` array access operator can be overloaded to provide
  187. `array properties`:idx:\ :
  188. .. code-block:: nim
  189. type
  190. Vector* = object
  191. x, y, z: float
  192. proc `[]=`* (v: var Vector, i: int, value: float) =
  193. # setter
  194. case i
  195. of 0: v.x = value
  196. of 1: v.y = value
  197. of 2: v.z = value
  198. else: assert(false)
  199. proc `[]`* (v: Vector, i: int): float =
  200. # getter
  201. case i
  202. of 0: result = v.x
  203. of 1: result = v.y
  204. of 2: result = v.z
  205. else: assert(false)
  206. The example is silly, since a vector is better modelled by a tuple which
  207. already provides ``v[]`` access.
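For completeness, here is a sketch of how such overloaded operators are used, next to the tuple alternative mentioned above. The accessors are repeated so that the snippet is self-contained; raising ``ValueError`` for a bad index is a choice made for this sketch and replaces the ``assert(false)`` of the original example:

.. code-block:: nim

  type
    Vector = object
      x, y, z: float

  proc `[]=`(v: var Vector, i: int, value: float) =
    case i
    of 0: v.x = value
    of 1: v.y = value
    of 2: v.z = value
    else: raise newException(ValueError, "index out of range: " & $i)

  proc `[]`(v: Vector, i: int): float =
    case i
    of 0: v.x
    of 1: v.y
    of 2: v.z
    else: raise newException(ValueError, "index out of range: " & $i)

  var v = Vector(x: 1.0, y: 2.0, z: 3.0)
  v[1] = 5.0      # calls `[]=`(v, 1, 5.0)
  echo v[1]       # calls `[]`(v, 1)

  # The tuple alternative needs no overloads at all:
  var t = (x: 1.0, y: 2.0, z: 3.0)
  t[1] = 5.0
  echo t[1]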
  208. Dynamic dispatch
  209. ----------------
  210. Procedures always use static dispatch. For dynamic dispatch replace the
  211. ``proc`` keyword by ``method``:
  212. .. code-block:: nim
  213. type
  214. PExpr = ref object of RootObj ## abstract base class for an expression
  215. PLiteral = ref object of PExpr
  216. x: int
  217. PPlusExpr = ref object of PExpr
  218. a, b: PExpr
  219. # watch out: 'eval' relies on dynamic binding
  220. method eval(e: PExpr): int =
  221. # override this base method
  222. quit "to override!"
  223. method eval(e: PLiteral): int = e.x
  224. method eval(e: PPlusExpr): int = eval(e.a) + eval(e.b)
  225. proc newLit(x: int): PLiteral = PLiteral(x: x)
  226. proc newPlus(a, b: PExpr): PPlusExpr = PPlusExpr(a: a, b: b)
  227. echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))
  228. Note that in the example the constructors ``newLit`` and ``newPlus`` are procs
  229. because it makes more sense for them to use static binding, but ``eval`` is a
  230. method because it requires dynamic binding.
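The difference between the two kinds of binding can be made visible with a small sketch. The ``Animal`` and ``Dog`` types and both routines are hypothetical. The ``{.base.}`` pragma on the root method is an assumption about newer compilers, which expect it there; the examples in this tutorial omit it:

.. code-block:: nim

  type
    Animal = ref object of RootObj
    Dog = ref object of Animal

  # procs: the overload is chosen from the static (declared) type
  proc describe(a: Animal): string = "animal"
  proc describe(d: Dog): string = "dog"

  # methods: the implementation is chosen from the runtime type
  method speak(a: Animal): string {.base.} = "..."
  method speak(d: Dog): string = "Woof"

  let pet: Animal = Dog()   # static type Animal, runtime type Dog
  echo describe(pet)        # animal
  echo speak(pet)           # Woof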
  231. In a multi-method all parameters that have an object type are used for the
  232. dispatching:
  233. .. code-block:: nim
  234. type
  235. Thing = ref object of RootObj
  236. Unit = ref object of Thing
  237. x: int
  238. method collide(a, b: Thing) {.inline.} =
  239. quit "to override!"
  240. method collide(a: Thing, b: Unit) {.inline.} =
  241. echo "1"
  242. method collide(a: Unit, b: Thing) {.inline.} =
  243. echo "2"
  244. var a, b: Unit
  245. new a
  246. new b
  247. collide(a, b) # output: 2
  248. As the example demonstrates, invocation of a multi-method cannot be ambiguous:
  249. Collide 2 is preferred over collide 1 because the resolution works from left to
  250. right. Thus ``Unit, Thing`` is preferred over ``Thing, Unit``.
  251. **Performance note**: Nim does not produce a virtual method table, but
  252. generates dispatch trees. This avoids the expensive indirect branch for method
  253. calls and enables inlining. However, other optimizations like compile time
  254. evaluation or dead code elimination do not work with methods.
  255. Exceptions
  256. ==========
  257. In Nim exceptions are objects. By convention, exception types are
  258. suffixed with 'Error'. The `system <system.html>`_ module defines an
  259. exception hierarchy that you might want to stick to. Exceptions derive from
  260. ``system.Exception``, which provides the common interface.
  261. Exceptions have to be allocated on the heap because their lifetime is unknown.
  262. The compiler will prevent you from raising an exception created on the stack.
  263. All raised exceptions should at least specify the reason for being raised in
  264. the ``msg`` field.
  265. A convention is that exceptions should be raised in *exceptional* cases:
  266. For example, if a file cannot be opened, this should not raise an
  267. exception since this is quite common (the file may not exist).
  268. Raise statement
  269. ---------------
  270. Raising an exception is done with the ``raise`` statement:
  271. .. code-block:: nim
  272. var
  273. e: ref OSError
  274. new(e)
  275. e.msg = "the request to the OS failed"
  276. raise e
If the ``raise`` keyword is not followed by an expression, the last exception
is *re-raised*. To avoid repeating the allocate-and-initialize pattern shown
above, the ``newException`` template in the ``system`` module can be used:
  280. .. code-block:: nim
  281. raise newException(OSError, "the request to the OS failed")
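``newException`` also works with exception types of your own. The sketch below assumes a hypothetical ``ConfigError`` type derived from ``ValueError`` and a hypothetical ``requirePositive`` helper:

.. code-block:: nim

  type
    ConfigError = object of ValueError   # hypothetical exception type

  proc requirePositive(key: string, value: int): int =
    if value <= 0:
      raise newException(ConfigError, key & " must be positive, got " & $value)
    value

  try:
    discard requirePositive("timeout", -1)
  except ConfigError:
    echo getCurrentExceptionMsg()   # timeout must be positive, got -1

Deriving from an existing type in the hierarchy means that handlers written for ``ValueError`` also catch the new exception.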
  282. Try statement
  283. -------------
  284. The ``try`` statement handles exceptions:
.. code-block:: nim

  from strutils import parseInt

  # read the first two lines of a text file that should contain numbers
  # and try to add them
  var
    f: File
  if open(f, "numbers.txt"):
    try:
      let a = readLine(f)
      let b = readLine(f)
      echo "sum: ", parseInt(a) + parseInt(b)
    except OverflowError:
      echo "overflow!"
    except ValueError:
      echo "could not convert string to integer"
    except IOError:
      echo "IO error!"
    except:
      echo "Unknown exception!"
      # reraise the unknown exception:
      raise
    finally:
      close(f)
The statements after ``try`` are executed in order until an exception is
raised; execution then jumps to the matching ``except`` part.

The empty ``except`` part is executed if there is an exception that is
not explicitly listed. It is similar to an ``else`` part in ``if``
statements.

If there is a ``finally`` part, it is always executed after the
exception handlers.

The exception is *consumed* in an ``except`` part. If an exception is not
handled, it is propagated through the call stack. This means that when an
exception occurs, the rest of the procedure that is not within a ``finally``
clause is usually not executed.
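The order of execution can be traced with a small sketch. The ``parseOrZero`` proc and the ``trace`` sequence are hypothetical and exist only to record which parts run:

.. code-block:: nim

  from strutils import parseInt

  var trace: seq[string] = @[]

  proc parseOrZero(s: string): int =
    try:
      result = parseInt(s)
      trace.add("parsed " & s)        # skipped if parseInt raises
    except ValueError:
      trace.add("handler for " & s)   # the exception is consumed here
      result = 0
    finally:
      trace.add("finally for " & s)   # runs in both cases

  echo parseOrZero("12")
  echo parseOrZero("abc")
  for step in trace:
    echo step

For ``"12"`` the trace contains the ``parsed`` and ``finally`` entries; for ``"abc"`` it contains the ``handler`` and ``finally`` entries, and the statement after the failing ``parseInt`` call never runs.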
  318. If you need to *access* the actual exception object or message inside an
  319. ``except`` branch you can use the `getCurrentException()
  320. <system.html#getCurrentException>`_ and `getCurrentExceptionMsg()
  321. <system.html#getCurrentExceptionMsg>`_ procs from the `system <system.html>`_
  322. module. Example:
  323. .. code-block:: nim
  324. try:
  325. doSomethingHere()
  326. except:
  327. let
  328. e = getCurrentException()
  329. msg = getCurrentExceptionMsg()
  330. echo "Got exception ", repr(e), " with message ", msg
  331. Annotating procs with raised exceptions
  332. ---------------------------------------
  333. Through the use of the optional ``{.raises.}`` pragma you can specify that a
  334. proc is meant to raise a specific set of exceptions, or none at all. If the
  335. ``{.raises.}`` pragma is used, the compiler will verify that this is true. For
  336. instance, if you specify that a proc raises ``IOError``, and at some point it
  337. (or one of the procs it calls) starts raising a new exception the compiler will
  338. prevent that proc from compiling. Usage example:
  339. .. code-block:: nim
  340. proc complexProc() {.raises: [IOError, ArithmeticError].} =
  341. ...
  342. proc simpleProc() {.raises: [].} =
  343. ...
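A sketch with complete bodies may make the check more tangible. The procs ``parsePositive`` and ``safeParse`` are hypothetical, and the snippet assumes that ``strutils.parseInt`` is only tracked as raising ``ValueError``, which is the case for recent compilers but may differ for older ones:

.. code-block:: nim

  from strutils import parseInt

  proc parsePositive(s: string): int {.raises: [ValueError].} =
    result = parseInt(s)              # may raise ValueError
    if result <= 0:
      raise newException(ValueError, "expected a positive number: " & s)

  proc safeParse(s: string): int {.raises: [].} =
    # every ValueError is handled here, so nothing escapes this proc
    try:
      result = parsePositive(s)
    except ValueError:
      result = 0

  echo safeParse("42")
  echo safeParse("-1")

If the ``try`` statement in ``safeParse`` were removed, the compiler would reject the proc because ``ValueError`` could then escape despite the empty ``raises`` list.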
Once you have code like this in place, if the list of raised exceptions
changes, the compiler will stop with an error. The error names the line of the
proc that no longer satisfies the pragma and the exception that is not caught,
along with the file and line where the uncaught exception is raised, which may
help you locate the offending code.
  349. If you want to add the ``{.raises.}`` pragma to existing code, the compiler can
  350. also help you. You can add the ``{.effects.}`` pragma statement to your proc and
  351. the compiler will output all inferred effects up to that point (exception
  352. tracking is part of Nim's effect system). Another more roundabout way to
  353. find out the list of exceptions raised by a proc is to use the Nim ``doc2``
  354. command which generates documentation for a whole module and decorates all
  355. procs with the list of raised exceptions. You can read more about Nim's
  356. `effect system and related pragmas in the manual <manual.html#effect-system>`_.
  357. Generics
  358. ========
  359. Generics are Nim's means to parametrize procs, iterators or types
  360. with `type parameters`:idx:. They are most useful for efficient type safe
  361. containers:
.. code-block:: nim

  type
    BinaryTreeObj[T] = object # BinaryTreeObj is a generic type
                              # with generic param ``T``
      le, ri: BinaryTree[T]   # left and right subtrees; may be nil
      data: T                 # the data stored in a node
    BinaryTree*[T] = ref BinaryTreeObj[T] # type that is exported

  proc newNode*[T](data: T): BinaryTree[T] =
    # constructor for a node
    new(result)
    result.data = data

  proc add*[T](root: var BinaryTree[T], n: BinaryTree[T]) =
    # insert a node into the tree
    if root == nil:
      root = n
    else:
      var it = root
      while it != nil:
        # compare the data items; uses the generic ``cmp`` proc
        # that works for any type that has a ``==`` and ``<`` operator
        var c = cmp(it.data, n.data)
        if c < 0:
          if it.le == nil:
            it.le = n
            return
          it = it.le
        else:
          if it.ri == nil:
            it.ri = n
            return
          it = it.ri

  proc add*[T](root: var BinaryTree[T], data: T) =
    # convenience proc:
    add(root, newNode(data))

  iterator preorder*[T](root: BinaryTree[T]): T =
    # Preorder traversal of a binary tree.
    # Since recursive iterators are not yet implemented,
    # this uses an explicit stack (which is more efficient anyway):
    var stack: seq[BinaryTree[T]] = @[root]
    while stack.len > 0:
      var n = stack.pop()
      while n != nil:
        yield n.data
        add(stack, n.ri)  # push right subtree onto the stack
        n = n.le          # and follow the left pointer

  var
    root: BinaryTree[string] # instantiate a BinaryTree with ``string``
  add(root, newNode("hello")) # instantiates ``newNode`` and ``add``
  add(root, "world")          # instantiates the second ``add`` proc
  for str in preorder(root):
    stdout.writeLine(str)
  413. The example shows a generic binary tree. Depending on context, the brackets are
  414. used either to introduce type parameters or to instantiate a generic proc,
  415. iterator or type. As the example shows, generics work with overloading: the
  416. best match of ``add`` is used. The built-in ``add`` procedure for sequences
  417. is not hidden and is used in the ``preorder`` iterator.
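Generic procs do not need a generic type to be useful. As a smaller sketch, the hypothetical ``largest`` proc below works for any element type that provides a ``<`` operator, and it shows both implicit and explicit instantiation:

.. code-block:: nim

  proc largest[T](values: openArray[T]): T =
    ## Returns the largest element; assumes ``values`` is not empty.
    result = values[0]
    for v in values:
      if result < v: result = v

  echo largest([3, 9, 4])           # T is inferred as int
  echo largest(["pear", "apple"])   # T is inferred as string
  echo largest[float]([1.5, 0.5])   # T is given explicitly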
  418. Templates
  419. =========
Templates are a simple substitution mechanism that operates on Nim's
abstract syntax trees. Templates are processed in the semantic pass of the
compiler. They integrate well with the rest of the language and share none
of the flaws of C's preprocessor macros.
  424. To *invoke* a template, call it like a procedure.
  425. Example:
  426. .. code-block:: nim
  427. template `!=` (a, b: untyped): untyped =
  428. # this definition exists in the System module
  429. not (a == b)
  430. assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))
  431. The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact
  432. templates: this has the benefit that if you overload the ``==`` operator,
  433. the ``!=`` operator is available automatically and does the right thing. (Except
  434. for IEEE floating point numbers - NaN breaks basic boolean logic.)
  435. ``a > b`` is transformed into ``b < a``.
  436. ``a in b`` is transformed into ``contains(b, a)``.
  437. ``notin`` and ``isnot`` have the obvious meanings.
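The effect can be sketched with a distinct type, which starts out without any comparison operators. The ``Name`` type and its case-insensitive comparison are assumptions made for this illustration; only ``==`` and ``<`` are defined, yet ``!=`` and ``>`` work as well:

.. code-block:: nim

  from strutils import cmpIgnoreCase

  type
    Name = distinct string   # hypothetical type without built-in operators

  proc `==`(a, b: Name): bool = cmpIgnoreCase(string(a), string(b)) == 0
  proc `<`(a, b: Name): bool = cmpIgnoreCase(string(a), string(b)) < 0

  doAssert Name("Anton") == Name("ANTON")
  doAssert Name("Anton") != Name("Berta")   # rewritten to: not (a == b)
  doAssert Name("berta") > Name("Anton")    # rewritten to: Name("Anton") < Name("berta")
  echo "all comparisons hold"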
  438. Templates are especially useful for lazy evaluation purposes. Consider a
  439. simple proc for logging:
  440. .. code-block:: nim
  441. const
  442. debug = true
  443. proc log(msg: string) {.inline.} =
  444. if debug: stdout.writeLine(msg)
  445. var
  446. x = 4
  447. log("x has the value: " & $x)
  448. This code has a shortcoming: if ``debug`` is set to false someday, the quite
  449. expensive ``$`` and ``&`` operations are still performed! (The argument
  450. evaluation for procedures is *eager*).
  451. Turning the ``log`` proc into a template solves this problem:
  452. .. code-block:: nim
  453. const
  454. debug = true
  455. template log(msg: string) =
  456. if debug: stdout.writeLine(msg)
  457. var
  458. x = 4
  459. log("x has the value: " & $x)
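That the argument really is not evaluated can be checked by counting calls. In this sketch ``debug`` is set to ``false``; the ``expensive`` proc, the counter, and the names ``logProc`` and ``logTmpl`` are hypothetical:

.. code-block:: nim

  const
    debug = false

  var evaluations = 0

  proc expensive(): string =
    inc evaluations
    "expensive result"

  proc logProc(msg: string) =
    if debug: stdout.writeLine(msg)

  template logTmpl(msg: string) =
    if debug: stdout.writeLine(msg)

  logProc(expensive())   # the argument is evaluated before the call
  logTmpl(expensive())   # the argument is substituted into the dead branch
  echo evaluations       # 1

Only the call through the proc increments the counter, because the template places the argument expression inside the ``if`` branch that is never taken.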
The parameters' types can be ordinary types or the meta types ``untyped``,
``typed``, or ``typedesc``.
``typedesc`` stands for *type description*, and ``untyped`` means that symbol
lookups and type resolution are not performed before the expression is passed
to the template.

If the template has no explicit return type,
``void`` is used for consistency with procs and methods.

To pass a block of statements to a template, use ``untyped`` for the last
parameter:
  467. .. code-block:: nim
  468. template withFile(f: untyped, filename: string, mode: FileMode,
  469. body: untyped): typed =
  470. let fn = filename
  471. var f: File
  472. if open(f, fn, mode):
  473. try:
  474. body
  475. finally:
  476. close(f)
  477. else:
  478. quit("cannot open: " & fn)
  479. withFile(txt, "ttempl3.txt", fmWrite):
  480. txt.writeLine("line 1")
  481. txt.writeLine("line 2")
In the example the two ``writeLine`` statements are bound to the ``body``
parameter. The ``withFile`` template contains boilerplate code and helps to
avoid a common bug: forgetting to close the file. Note how the
``let fn = filename`` statement ensures that ``filename`` is evaluated only
once.
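Why the extra ``let`` matters can be seen in a sketch with a side-effecting argument. The two templates and the ``next`` proc are hypothetical names used only for this illustration:

.. code-block:: nim

  var calls = 0

  proc next(): int =
    inc calls
    calls

  template squareTwice(x: int): int =
    x * x            # ``x`` is substituted twice, so it is evaluated twice

  template squareOnce(x: int): int =
    let tmp = x      # evaluate the argument exactly once
    tmp * tmp

  echo squareTwice(next())   # expands to next() * next(): 1 * 2 = 2
  echo squareOnce(next())    # third call returns 3: 3 * 3 = 9
  echo calls                 # 3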
  487. Macros
  488. ======
  489. Macros enable advanced compile-time code transformations, but they cannot
  490. change Nim's syntax. However, this is no real restriction because Nim's
  491. syntax is flexible enough anyway. Macros have to be implemented in pure Nim
  492. code if the `foreign function interface (FFI)
  493. <manual.html#foreign-function-interface>`_ is not enabled in the compiler, but
  494. other than that restriction (which at some point in the future will go away)
  495. you can write any kind of Nim code and the compiler will run it at compile
  496. time.
There are two ways to write a macro: either *generating* Nim source code and
letting the compiler parse it, or manually creating an abstract syntax tree
(AST) which you feed to the compiler. To build the AST, one needs to know how
Nim's concrete syntax is converted to an AST. The AST is documented in the
`macros <macros.html>`_ module.

Once your macro is finished, there are two ways to invoke it:

(1) invoking a macro like a procedure call (expression macros)
(2) invoking a macro with the special ``macrostmt``
    syntax (statement macros)
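The two ways of *writing* a macro can be compared in a short sketch. Both hypothetical macros below produce the expression ``1 + 2``, one from source text and one from a hand-built AST. The ``untyped`` return type is an assumption about newer compilers; the examples in this tutorial use ``typed``:

.. code-block:: nim

  import macros

  macro sumFromSource(): untyped =
    # way 1: produce source text and let the compiler parse it
    result = parseExpr("1 + 2")

  macro sumFromAst(): untyped =
    # way 2: build the same expression as an AST by hand
    result = infix(newLit(1), "+", newLit(2))

  echo sumFromSource()   # 3
  echo sumFromAst()      # 3

  # ``dumpTree`` prints the AST of its body at compile time, which helps
  # to find out which nodes have to be constructed:
  dumpTree:
    1 + 2

Generating source text is usually quicker to write, while building the AST avoids string escaping problems and keeps line information for error messages.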
  506. Expression Macros
  507. -----------------
  508. The following example implements a powerful ``debug`` command that accepts a
  509. variable number of arguments:
.. code-block:: nim

  # to work with Nim syntax trees, we need an API that is defined in the
  # ``macros`` module:
  import macros

  macro debug(n: varargs[untyped]): typed =
    # `n` is a Nim AST that contains a list of expressions;
    # this macro returns a list of statements (n is passed for proper line
    # information):
    result = newNimNode(nnkStmtList, n)
    # iterate over any argument that is passed to this macro:
    for x in n:
      # add a call to the statement list that writes the expression;
      # `toStrLit` converts an AST to its string representation:
      result.add(newCall("write", newIdentNode("stdout"), toStrLit(x)))
      # add a call to the statement list that writes ": "
      result.add(newCall("write", newIdentNode("stdout"), newStrLitNode(": ")))
      # add a call to the statement list that writes the expression's value:
      result.add(newCall("writeLine", newIdentNode("stdout"), x))

  var
    a: array[0..10, int]
    x = "some string"
  a[0] = 42
  a[1] = 45

  debug(a[0], a[1], x)
  534. The macro call expands to:
  535. .. code-block:: nim
  536. write(stdout, "a[0]")
  537. write(stdout, ": ")
  538. writeLine(stdout, a[0])
  539. write(stdout, "a[1]")
  540. write(stdout, ": ")
  541. writeLine(stdout, a[1])
  542. write(stdout, "x")
  543. write(stdout, ": ")
  544. writeLine(stdout, x)
  545. Statement Macros
  546. ----------------
Statement macros are defined just like expression macros. However, they are
invoked by an expression following a colon.
  549. The following example outlines a macro that generates a lexical analyzer from
  550. regular expressions:
.. code-block:: nim

  macro case_token(n: varargs[untyped]): typed =
    # creates a lexical analyzer from regular expressions
    # ... (implementation is an exercise for the reader :-)
    discard

  case_token: # this colon tells the parser it is a macro statement
  of r"[A-Za-z_]+[A-Za-z_0-9]*":
    return tkIdentifier
  of r"[0-9]+":
    return tkInteger
  of r"[\+\-\*\?]+":
    return tkOperator
  else:
    return tkUnknown
  565. Building your first macro
  566. -------------------------
To give you a head start in writing macros, we will now show how to turn
typical dynamic code into something that compiles statically. For the exercise
we will use the following snippet of code as the starting point:
.. code-block:: nim

  import strutils, tables

  proc readCfgAtRuntime(cfgFilename: string): Table[string, string] =
    let
      inputString = readFile(cfgFilename)

    result = initTable[string, string]()
    for line in inputString.splitLines:
      # Ignore empty lines
      if line.len < 1: continue
      var chunks = split(line, ',')
      if chunks.len != 2:
        quit("Input needs comma split values, got: " & line)
      result[chunks[0]] = chunks[1]

    if result.len < 1: quit("Input file empty!")

  let info = readCfgAtRuntime("data.cfg")

  when isMainModule:
    echo info["licenseOwner"]
    echo info["licenseKey"]
    echo info["version"]
Presumably this snippet of code could be used in commercial software, reading
a configuration file to display information about the person who bought the
software. This external file, containing the license information, would be
generated by an online web shopping cart and included along with the program::
  595. version,1.1
  596. licenseOwner,Hyori Lee
  597. licenseKey,M1Tl3PjBWO2CC48m
The ``readCfgAtRuntime`` proc will open the given filename and return a
``Table`` from the `tables module <tables.html>`_. The parsing of the file is
done (without much care for handling invalid data or corner cases) using the
`splitLines proc from the strutils module <strutils.html#splitLines>`_. There
are many things which can fail; keep in mind that the purpose is to explain
how to make this run at compile time, not how to properly implement a DRM
scheme.
Reimplementing this code as a compile time proc will allow us to get rid of
the ``data.cfg`` file we would otherwise need to distribute along with the
binary. Also, if the information is really constant, it doesn't make sense
from a logical point of view to have it *mutable* in a global variable; it
would be better if it were a constant. Finally, and likely the most valuable
feature, we can implement some verification at compile time. You could think
of this as a *better unit testing*, since it is impossible to obtain a binary
unless everything is correct. This prevents you from shipping to users a
broken program which won't start because a small critical file is missing or
its contents were changed by mistake to something invalid.

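As a minimal sketch of this idea (not the macro developed below), the same
parsing can be run by the compiler simply by assigning its result to a
``const``. The ``parseCfg`` proc, the inline configuration text and the use of a
``seq`` of tuples instead of a ``Table`` are assumptions made for illustration:

.. code-block:: nim

  import strutils

  # Hypothetical helper: parses "key,value" lines into a seq of pairs.
  proc parseCfg(content: string): seq[(string, string)] =
    result = @[]
    for line in content.splitLines:
      # Ignore empty lines
      if line.len < 1: continue
      let chunks = line.split(',')
      # A failed assertion during constant evaluation aborts compilation.
      doAssert chunks.len == 2, "Input needs comma split values, got: " & line
      result.add((chunks[0], chunks[1]))

  # Because cfg is a const, parseCfg runs inside the compiler.
  const cfg = parseCfg("version,1.1\nlicenseOwner,Hyori Lee\n")

  static:
    # Extra compile time verification: no binary unless this holds.
    doAssert cfg.len == 2

  echo cfg[0][1]   # prints 1.1

Changing one of the lines in the inline text to something without a comma
should make compilation fail instead of producing a binary that breaks at
startup.
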
Generating source code
++++++++++++++++++++++

Our first attempt will start by modifying the program to generate a compile
time string with the *generated source code*, which we then pass to the
``parseStmt`` proc from the `macros module <macros.html>`_. Here is the
modified source code implementing the macro:

.. code-block:: nim
   :number-lines:

   import macros, strutils

   macro readCfgAndBuildSource(cfgFilename: string): typed =
     let
       inputString = slurp(cfgFilename.strVal)

     var
       source = ""
     for line in inputString.splitLines:
       # Ignore empty lines
       if line.len < 1: continue
       var chunks = split(line, ',')
       if chunks.len != 2:
         error("Input needs comma split values, got: " & line)
       source &= "const cfg" & chunks[0] & "= \"" & chunks[1] & "\"\n"

     if source.len < 1: error("Input file empty!")
     result = parseStmt(source)

   readCfgAndBuildSource("data.cfg")

   when isMainModule:
     echo cfglicenseOwner
     echo cfglicenseKey
     echo cfgversion

The good news is not much has changed! First, we need to change the handling
of the input parameter (line 3). In the dynamic version the
``readCfgAtRuntime`` proc receives a string parameter. In the macro version
the parameter is also declared as ``string``, but this is only the *outside*
interface of the macro. When the macro is run, it actually gets a ``NimNode``
object instead of a string, and we have to call the `strVal proc
<macros.html#strVal>`_ (line 5) from the `macros module <macros.html>`_ to
obtain the string being passed in to the macro.

Second, we cannot use the `readFile proc <system.html#readFile>`_ from the
`system module <system.html>`_ due to FFI restrictions at compile time. If we
try to use this proc, or any other which depends on FFI, the compiler will
error with the message ``cannot evaluate`` and a dump of the macro's source
code, along with a stack trace showing how far the compiler got before bailing
out. We can get around this limitation by using the `slurp proc
<system.html#slurp>`_ from the `system module <system.html>`_, which was
made precisely for use at compile time (just like `gorge
<system.html#gorge>`_, which executes an external program and captures its
output).

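A minimal sketch of ``slurp`` on its own, assuming a ``data.cfg`` file sits
next to the source file as in the examples of this section (``cfgText`` is a
name chosen here for illustration):

.. code-block:: nim

  const cfgText = slurp("data.cfg")   # contents are read while compiling

  static:
    # Runs inside the compiler; the message shows up in the build output.
    echo "embedded ", cfgText.len, " bytes from data.cfg"

  # At runtime the text is already part of the binary; no file is opened.
  echo cfgText

If ``data.cfg`` is missing, compilation should fail rather than the program
failing when it is started.
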
The interesting thing is that our macro does not return a runtime `Table
<tables.html#Table>`_ object. Instead, it builds up Nim source code into
the ``source`` variable. For each line of the configuration file a ``const``
variable will be generated (line 15). To avoid conflicts we prefix these
variables with ``cfg``. In essence, what the compiler is doing is replacing
the line calling the macro with the following snippet of code:

.. code-block:: nim

  const cfgversion= "1.1"
  const cfglicenseOwner= "Hyori Lee"
  const cfglicenseKey= "M1Tl3PjBWO2CC48m"

You can verify this yourself by adding the line ``echo source`` somewhere at
the end of the macro and compiling the program. Another difference is that
instead of calling the usual `quit proc <system.html#quit>`_ to abort (which
we could still call) this version calls the `error proc <macros.html#error>`_
(line 14). The ``error`` proc has the same behavior as ``quit`` but will also
dump the source file and line information where the error happened, making it
easier for the programmer to find where compilation failed. In this situation
it would point to the line invoking the macro, but **not** the line of
``data.cfg`` we are processing; that's something the macro itself would need
to control.

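One way the macro could keep track of this itself is to count the input lines
and put the number into the message. The following is only a sketch: the
``checkCfg`` macro and the ``lineNo`` counter are hypothetical names, and the
configuration text is passed inline instead of being read with ``slurp`` so
that the example is self-contained:

.. code-block:: nim

  import macros, strutils

  macro checkCfg(cfgText: string): untyped =
    var lineNo = 0
    for line in cfgText.strVal.splitLines:
      inc lineNo
      # Ignore empty lines
      if line.len < 1: continue
      if line.split(',').len != 2:
        # Report the position inside the configuration text as well.
        error("cfg line " & $lineNo &
              ": input needs comma split values, got: " & line)
    # Nothing to generate here, the macro only validates its input.
    result = newNimNode(nnkStmtList)

  checkCfg("version,1.1\nlicenseOwner,Hyori Lee")
  # checkCfg("version,1.1\nbroken line")   # would stop compilation at line 2
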
Generating AST by hand
++++++++++++++++++++++

To generate an AST we would need to intimately know the structures used by the
Nim compiler exposed in the `macros module <macros.html>`_, which at first
glance seems a daunting task. But we can use the `dumpTree macro
<macros.html#dumpTree>`_ as a shortcut; it is used as a statement macro
instead of an expression macro. Since we know that we want to generate a bunch
of ``const`` symbols we can create the following source file and compile it to
see what the compiler *expects* from us:

.. code-block:: nim

  import macros

  dumpTree:
    const cfgversion: string = "1.1"
    const cfglicenseOwner= "Hyori Lee"
    const cfglicenseKey= "M1Tl3PjBWO2CC48m"

During compilation of the source code we should see the following lines in the
output (again, since this is a macro, compilation is enough, you don't have to
run any binary)::

  StmtList
    ConstSection
      ConstDef
        Ident !"cfgversion"
        Ident !"string"
        StrLit 1.1
    ConstSection
      ConstDef
        Ident !"cfglicenseOwner"
        Empty
        StrLit Hyori Lee
    ConstSection
      ConstDef
        Ident !"cfglicenseKey"
        Empty
        StrLit M1Tl3PjBWO2CC48m

With this output we have a better idea of what kind of input the compiler
expects. We need to generate a list of statements. For each constant the source
code generates a ``ConstSection`` and a ``ConstDef``. If we were to move all
the constants to a single ``const`` block we would see only a single
``ConstSection`` with three children.

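To check the single section claim yourself, a variant of the input could look
like the following sketch, which groups the same three constants in one
``const`` block (compiling it is enough, as before):

.. code-block:: nim

  import macros

  dumpTree:
    const
      cfgversion = "1.1"
      cfglicenseOwner = "Hyori Lee"
      cfglicenseKey = "M1Tl3PjBWO2CC48m"
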
Maybe you didn't notice, but in the ``dumpTree`` input with three separate
``const`` statements the first constant explicitly specifies its type. That's
why in the tree output the last two constants have ``Empty`` as their second
child but the first has a string identifier. So basically a ``const``
definition is made up of an identifier, optionally a type (which can be an
*empty* node) and the value. Armed with this knowledge, let's look at the
finished version of the AST building macro:

.. code-block:: nim
   :number-lines:

   import macros, strutils

   macro readCfgAndBuildAST(cfgFilename: string): typed =
     let
       inputString = slurp(cfgFilename.strVal)

     result = newNimNode(nnkStmtList)
     for line in inputString.splitLines:
       # Ignore empty lines
       if line.len < 1: continue
       var chunks = split(line, ',')
       if chunks.len != 2:
         error("Input needs comma split values, got: " & line)
       var
         section = newNimNode(nnkConstSection)
         constDef = newNimNode(nnkConstDef)
       constDef.add(newIdentNode("cfg" & chunks[0]))
       constDef.add(newEmptyNode())
       constDef.add(newStrLitNode(chunks[1]))
       section.add(constDef)
       result.add(section)

     if result.len < 1: error("Input file empty!")

   readCfgAndBuildAST("data.cfg")

   when isMainModule:
     echo cfglicenseOwner
     echo cfglicenseKey
     echo cfgversion

Since we are building on the previous example generating source code, we will
only mention the differences from it. Instead of creating a temporary
``string`` variable and writing source code into it as if it were written *by
hand*, we use the ``result`` variable directly and create a statement list
node (``nnkStmtList``) which will hold our children (line 7).

For each input line we have to create a constant definition (``nnkConstDef``)
and wrap it inside a constant section (``nnkConstSection``). Once these
variables are created, we fill them hierarchically (line 17) as the
previous AST dump tree showed: the constant definition is a child of the
section definition, and the constant definition has an identifier node, an
empty node (we let the compiler figure out the type), and a string literal
with the value.

A last tip when writing a macro: if you are not sure the AST you are building
looks ok, you may be tempted to use the ``dumpTree`` macro. But you can't use
it *inside* the macro you are writing/debugging. Instead ``echo`` the string
generated by `treeRepr <macros.html#treeRepr>`_. If you add
``echo treeRepr(result)`` at the end of this example you should get the same
output as with the ``dumpTree`` macro, but of course you can call it at any
point of the macro where you might be having trouble.

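A reduced sketch of this tip, using a hypothetical ``makeConst`` macro that
builds a single hard-coded constant so that it does not depend on
``data.cfg``:

.. code-block:: nim

  import macros

  macro makeConst(): untyped =
    result = newNimNode(nnkStmtList)
    var
      section = newNimNode(nnkConstSection)
      constDef = newNimNode(nnkConstDef)
    constDef.add(newIdentNode("cfgversion"))
    constDef.add(newEmptyNode())
    constDef.add(newStrLitNode("1.1"))
    section.add(constDef)
    result.add(section)
    # Printed by the compiler while the macro is being expanded.
    echo treeRepr(result)

  makeConst()

  echo cfgversion   # prints 1.1 when the program runs

The tree appears in the compiler output, while the final ``echo`` only runs in
the compiled program.
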
Example Templates and Macros
============================

Lifting Procs
+++++++++++++

.. code-block:: nim

  import math

  template liftScalarProc(fname) =
    ## Lift a proc taking one scalar parameter and returning a
    ## scalar value (eg ``proc sssss[T](x: T): float``),
    ## to provide templated procs that can handle a single
    ## parameter of seq[T] or nested seq[seq[]] or the same type
    ##
    ## .. code-block:: Nim
    ##  liftScalarProc(abs)
    ##  # now abs(@[@[1,-2], @[-2,-3]]) == @[@[1,2], @[2,3]]
    proc fname[T](x: openarray[T]): auto =
      var temp: T
      type outType = type(fname(temp))
      result = newSeq[outType](x.len)
      for i in 0..<x.len:
        result[i] = fname(x[i])

  liftScalarProc(sqrt)   # make sqrt() work for sequences

  echo sqrt(@[4.0, 16.0, 25.0, 36.0])   # => @[2.0, 4.0, 5.0, 6.0]

Identifier Mangling
+++++++++++++++++++

.. code-block:: nim

  import macros

  proc echoHW() =
    echo "Hello world"
  proc echoHW0() =
    echo "Hello world 0"
  proc echoHW1() =
    echo "Hello world 1"

  template joinSymbols(a, b: untyped): untyped =
    `a b`()

  joinSymbols(echo, HW)

  macro str2Call(s1, s2): typed =
    result = newNimNode(nnkStmtList)
    for i in 0..1:
      # combines s1, s2 and an integer into a proc identifier
      # that is called in a statement list
      result.add(newCall(!($s1 & $s2 & $i)))

  str2Call("echo", "HW")

  # Output:
  #   Hello world
  #   Hello world 0
  #   Hello world 1

Compilation to JavaScript
=========================

Nim code can be compiled to JavaScript. However, in order to write
JavaScript-compatible code you should remember the following:

- ``addr`` and ``ptr`` have slightly different semantic meaning in JavaScript.
  It is recommended to avoid those if you're not sure how they are translated
  to JavaScript.
- ``cast[T](x)`` in JavaScript is translated to ``(x)``, except for casting
  between signed/unsigned ints, in which case it behaves as a static cast in
  the C language.
- ``cstring`` in JavaScript means JavaScript string. It is a good practice to
  use ``cstring`` only when it is semantically appropriate. E.g. don't use
  ``cstring`` as a binary data buffer.