trmacros.txt 11 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368
  1. Term rewriting macros
  2. =====================
  3. Term rewriting macros are macros or templates that have not only
  4. a *name* but also a *pattern* that is searched for after the semantic checking
  5. phase of the compiler: This means they provide an easy way to enhance the
  6. compilation pipeline with user defined optimizations:
  7. .. code-block:: nim
  8. template optMul{`*`(a, 2)}(a: int): int = a+a
  9. let x = 3
  10. echo x * 2
  11. The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the
  12. curlies is the pattern to match against. The operators ``*``, ``**``,
  13. ``|``, ``~`` have a special meaning in patterns if they are written in infix
  14. notation, so to match verbatim against ``*`` the ordinary function call syntax
  15. needs to be used.
  16. Unfortunately optimizations are hard to get right and even the tiny example
  17. is **wrong**:
  18. .. code-block:: nim
  19. template optMul{`*`(a, 2)}(a: int): int = a+a
  20. proc f(): int =
  21. echo "side effect!"
  22. result = 55
  23. echo f() * 2
  24. We cannot duplicate 'a' if it denotes an expression that has a side effect!
  25. Fortunately Nim supports side effect analysis:
  26. .. code-block:: nim
  27. template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a
  28. proc f(): int =
  29. echo "side effect!"
  30. result = 55
  31. echo f() * 2 # not optimized ;-)
  32. You can make one overload matching with a constraint and one without, and the
  33. one with a constraint will have precedence, and so you can handle both cases
  34. differently.
  35. So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We
  36. cannot really do that however as the following code only swaps arguments
  37. blindly:
  38. .. code-block:: nim
  39. template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a
  40. What optimizers really need to do is a *canonicalization*:
  41. .. code-block:: nim
  42. template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a
  43. The ``int{lit}`` parameter pattern matches against an expression of
  44. type ``int``, but only if it's a literal.
  45. Parameter constraints
  46. ---------------------
  47. The `parameter constraint`:idx: expression can use the operators ``|`` (or),
  48. ``&`` (and) and ``~`` (not) and the following predicates:
  49. =================== =====================================================
  50. Predicate Meaning
  51. =================== =====================================================
  52. ``atom`` The matching node has no children.
  53. ``lit`` The matching node is a literal like "abc", 12.
  54. ``sym`` The matching node must be a symbol (a bound
  55. identifier).
  56. ``ident`` The matching node must be an identifier (an unbound
  57. identifier).
  58. ``call`` The matching AST must be a call/apply expression.
  59. ``lvalue`` The matching AST must be an lvalue.
  60. ``sideeffect`` The matching AST must have a side effect.
  61. ``nosideeffect`` The matching AST must have no side effect.
  62. ``param`` A symbol which is a parameter.
  63. ``genericparam`` A symbol which is a generic parameter.
  64. ``module`` A symbol which is a module.
  65. ``type`` A symbol which is a type.
  66. ``var`` A symbol which is a variable.
  67. ``let`` A symbol which is a ``let`` variable.
  68. ``const`` A symbol which is a constant.
  69. ``result`` The special ``result`` variable.
  70. ``proc`` A symbol which is a proc.
  71. ``method`` A symbol which is a method.
  72. ``iterator`` A symbol which is an iterator.
  73. ``converter`` A symbol which is a converter.
  74. ``macro`` A symbol which is a macro.
  75. ``template`` A symbol which is a template.
  76. ``field`` A symbol which is a field in a tuple or an object.
  77. ``enumfield`` A symbol which is a field in an enumeration.
  78. ``forvar`` A for loop variable.
  79. ``label`` A label (used in ``block`` statements).
  80. ``nk*`` The matching AST must have the specified kind.
  81. (Example: ``nkIfStmt`` denotes an ``if`` statement.)
  82. ``alias`` States that the marked parameter needs to alias
  83. with *some* other parameter.
  84. ``noalias`` States that *every* other parameter must not alias
  85. with the marked parameter.
  86. =================== =====================================================
  87. Predicates that share their name with a keyword have to be escaped with
  88. backticks: `` `const` ``.
  89. The ``alias`` and ``noalias`` predicates refer not only to the matching AST,
  90. but also to every other bound parameter; syntactically they need to occur after
  91. the ordinary AST predicates:
  92. .. code-block:: nim
  93. template ex{a = b + c}(a: int{noalias}, b, c: int) =
  94. # this transformation is only valid if 'b' and 'c' do not alias 'a':
  95. a = b
  96. inc a, c
  97. Pattern operators
  98. -----------------
  99. The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns
  100. if they are written in infix notation.
  101. The ``|`` operator
  102. ~~~~~~~~~~~~~~~~~~
  103. The ``|`` operator if used as infix operator creates an ordered choice:
  104. .. code-block:: nim
  105. template t{0|1}(): untyped = 3
  106. let a = 1
  107. # outputs 3:
  108. echo a
  109. The matching is performed after the compiler performed some optimizations like
  110. constant folding, so the following does not work:
  111. .. code-block:: nim
  112. template t{0|1}(): untyped = 3
  113. # outputs 1:
  114. echo 1
  115. The reason is that the compiler already transformed the 1 into "1" for
  116. the ``echo`` statement. However, a term rewriting macro should not change the
  117. semantics anyway. In fact they can be deactivated with the ``--patterns:off``
  118. command line option or temporarily with the ``patterns`` pragma.
  119. The ``{}`` operator
  120. ~~~~~~~~~~~~~~~~~~~
  121. A pattern expression can be bound to a pattern parameter via the ``expr{param}``
  122. notation:
  123. .. code-block:: nim
  124. template t{(0|1|2){x}}(x: untyped): untyped = x+1
  125. let a = 1
  126. # outputs 2:
  127. echo a
  128. The ``~`` operator
  129. ~~~~~~~~~~~~~~~~~~
  130. The ``~`` operator is the **not** operator in patterns:
  131. .. code-block:: nim
  132. template t{x = (~x){y} and (~x){z}}(x, y, z: bool) =
  133. x = y
  134. if x: x = z
  135. var
  136. a = false
  137. b = true
  138. c = false
  139. a = b and c
  140. echo a
  141. The ``*`` operator
  142. ~~~~~~~~~~~~~~~~~~
  143. The ``*`` operator can *flatten* a nested binary expression like ``a & b & c``
  144. to ``&(a, b, c)``:
  145. .. code-block:: nim
  146. var
  147. calls = 0
  148. proc `&&`(s: varargs[string]): string =
  149. result = s[0]
  150. for i in 1..len(s)-1: result.add s[i]
  151. inc calls
  152. template optConc{ `&&` * a }(a: string): untyped = &&a
  153. let space = " "
  154. echo "my" && (space & "awe" && "some " ) && "concat"
  155. # check that it's been optimized properly:
  156. doAssert calls == 1
  157. The second operator of `*` must be a parameter; it is used to gather all the
  158. arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"``
  159. is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``)
  160. which is flattened into a call expression; thus the invocation of ``optConc``
  161. produces:
  162. .. code-block:: nim
  163. `&&`("my", space & "awe", "some ", "concat")
  164. The ``**`` operator
  165. ~~~~~~~~~~~~~~~~~~~
  166. The ``**`` is much like the ``*`` operator, except that it gathers not only
  167. all the arguments, but also the matched operators in reverse polish notation:
  168. .. code-block:: nim
  169. import macros
  170. type
  171. Matrix = object
  172. dummy: int
  173. proc `*`(a, b: Matrix): Matrix = discard
  174. proc `+`(a, b: Matrix): Matrix = discard
  175. proc `-`(a, b: Matrix): Matrix = discard
  176. proc `$`(a: Matrix): string = result = $a.dummy
  177. proc mat21(): Matrix =
  178. result.dummy = 21
  179. macro optM{ (`+`|`-`|`*`) ** a }(a: Matrix): untyped =
  180. echo treeRepr(a)
  181. result = newCall(bindSym"mat21")
  182. var x, y, z: Matrix
  183. echo x + y * z - x
  184. This passes the expression ``x + y * z - x`` to the ``optM`` macro as
  185. an ``nnkArgList`` node containing::
  186. Arglist
  187. Sym "x"
  188. Sym "y"
  189. Sym "z"
  190. Sym "*"
  191. Sym "+"
  192. Sym "x"
  193. Sym "-"
  194. (Which is the reverse polish notation of ``x + y * z - x``.)
  195. Parameters
  196. ----------
  197. Parameters in a pattern are type checked in the matching process. If a
  198. parameter is of the type ``varargs`` it is treated specially and it can match
  199. 0 or more arguments in the AST to be matched against:
  200. .. code-block:: nim
  201. template optWrite{
  202. write(f, x)
  203. ((write|writeLine){w})(f, y)
  204. }(x, y: varargs[untyped], f: File, w: untyped) =
  205. w(f, x, y)
  206. Example: Partial evaluation
  207. ---------------------------
  208. The following example shows how some simple partial evaluation can be
  209. implemented with term rewriting:
  210. .. code-block:: nim
  211. proc p(x, y: int; cond: bool): int =
  212. result = if cond: x + y else: x - y
  213. template optP1{p(x, y, true)}(x, y: untyped): untyped = x + y
  214. template optP2{p(x, y, false)}(x, y: untyped): untyped = x - y
  215. Example: Hoisting
  216. -----------------
  217. The following example shows how some form of hoisting can be implemented:
  218. .. code-block:: nim
  219. import pegs
  220. template optPeg{peg(pattern)}(pattern: string{lit}): Peg =
  221. var gl {.global, gensym.} = peg(pattern)
  222. gl
  223. for i in 0 .. 3:
  224. echo match("(a b c)", peg"'(' @ ')'")
  225. echo match("W_HI_Le", peg"\y 'while'")
  226. The ``optPeg`` template optimizes the case of a peg constructor with a string
  227. literal, so that the pattern will only be parsed once at program startup and
  228. stored in a global ``gl`` which is then re-used. This optimization is called
  229. hoisting because it is comparable to classical loop hoisting.
  230. AST based overloading
  231. =====================
  232. Parameter constraints can also be used for ordinary routine parameters; these
  233. constraints affect ordinary overloading resolution then:
  234. .. code-block:: nim
  235. proc optLit(a: string{lit|`const`}) =
  236. echo "string literal"
  237. proc optLit(a: string) =
  238. echo "no string literal"
  239. const
  240. constant = "abc"
  241. var
  242. variable = "xyz"
  243. optLit("literal")
  244. optLit(constant)
  245. optLit(variable)
  246. However, the constraints ``alias`` and ``noalias`` are not available in
  247. ordinary routines.
  248. Move optimization
  249. -----------------
  250. The ``call`` constraint is particularly useful to implement a move
  251. optimization for types that have copying semantics:
  252. .. code-block:: nim
  253. proc `[]=`*(t: var Table, key: string, val: string) =
  254. ## puts a (key, value)-pair into `t`. The semantics of string require
  255. ## a copy here:
  256. let idx = findInsertionPosition(key)
  257. t[idx].key = key
  258. t[idx].val = val
  259. proc `[]=`*(t: var Table, key: string{call}, val: string{call}) =
  260. ## puts a (key, value)-pair into `t`. Optimized version that knows that
  261. ## the strings are unique and thus don't need to be copied:
  262. let idx = findInsertionPosition(key)
  263. shallowCopy t[idx].key, key
  264. shallowCopy t[idx].val, val
  265. var t: Table
  266. # overloading resolution ensures that the optimized []= is called here:
  267. t[f()] = g()