123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368 |
- Term rewriting macros
- =====================
- Term rewriting macros are macros or templates that have not only
- a *name* but also a *pattern* that is searched for after the semantic checking
- phase of the compiler: This means they provide an easy way to enhance the
- compilation pipeline with user defined optimizations:
- .. code-block:: nim
- template optMul{`*`(a, 2)}(a: int): int = a+a
- let x = 3
- echo x * 2
- The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the
- curlies is the pattern to match against. The operators ``*``, ``**``,
- ``|``, ``~`` have a special meaning in patterns if they are written in infix
- notation, so to match verbatim against ``*`` the ordinary function call syntax
- needs to be used.
- Unfortunately optimizations are hard to get right and even the tiny example
- is **wrong**:
- .. code-block:: nim
- template optMul{`*`(a, 2)}(a: int): int = a+a
- proc f(): int =
- echo "side effect!"
- result = 55
- echo f() * 2
- We cannot duplicate 'a' if it denotes an expression that has a side effect!
- Fortunately Nim supports side effect analysis:
- .. code-block:: nim
- template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a
- proc f(): int =
- echo "side effect!"
- result = 55
- echo f() * 2 # not optimized ;-)
- You can make one overload matching with a constraint and one without, and the
- one with a constraint will have precedence, and so you can handle both cases
- differently.
- So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We
- cannot really do that however as the following code only swaps arguments
- blindly:
- .. code-block:: nim
- template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a
- What optimizers really need to do is a *canonicalization*:
- .. code-block:: nim
- template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a
- The ``int{lit}`` parameter pattern matches against an expression of
- type ``int``, but only if it's a literal.
- Parameter constraints
- ---------------------
- The `parameter constraint`:idx: expression can use the operators ``|`` (or),
- ``&`` (and) and ``~`` (not) and the following predicates:
- =================== =====================================================
- Predicate Meaning
- =================== =====================================================
- ``atom`` The matching node has no children.
- ``lit`` The matching node is a literal like "abc", 12.
- ``sym`` The matching node must be a symbol (a bound
- identifier).
- ``ident`` The matching node must be an identifier (an unbound
- identifier).
- ``call`` The matching AST must be a call/apply expression.
- ``lvalue`` The matching AST must be an lvalue.
- ``sideeffect`` The matching AST must have a side effect.
- ``nosideeffect`` The matching AST must have no side effect.
- ``param`` A symbol which is a parameter.
- ``genericparam`` A symbol which is a generic parameter.
- ``module`` A symbol which is a module.
- ``type`` A symbol which is a type.
- ``var`` A symbol which is a variable.
- ``let`` A symbol which is a ``let`` variable.
- ``const`` A symbol which is a constant.
- ``result`` The special ``result`` variable.
- ``proc`` A symbol which is a proc.
- ``method`` A symbol which is a method.
- ``iterator`` A symbol which is an iterator.
- ``converter`` A symbol which is a converter.
- ``macro`` A symbol which is a macro.
- ``template`` A symbol which is a template.
- ``field`` A symbol which is a field in a tuple or an object.
- ``enumfield`` A symbol which is a field in an enumeration.
- ``forvar`` A for loop variable.
- ``label`` A label (used in ``block`` statements).
- ``nk*`` The matching AST must have the specified kind.
- (Example: ``nkIfStmt`` denotes an ``if`` statement.)
- ``alias`` States that the marked parameter needs to alias
- with *some* other parameter.
- ``noalias`` States that *every* other parameter must not alias
- with the marked parameter.
- =================== =====================================================
- Predicates that share their name with a keyword have to be escaped with
- backticks: `` `const` ``.
- The ``alias`` and ``noalias`` predicates refer not only to the matching AST,
- but also to every other bound parameter; syntactically they need to occur after
- the ordinary AST predicates:
- .. code-block:: nim
- template ex{a = b + c}(a: int{noalias}, b, c: int) =
- # this transformation is only valid if 'b' and 'c' do not alias 'a':
- a = b
- inc a, c
- Pattern operators
- -----------------
- The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns
- if they are written in infix notation.
- The ``|`` operator
- ~~~~~~~~~~~~~~~~~~
- The ``|`` operator if used as infix operator creates an ordered choice:
- .. code-block:: nim
- template t{0|1}(): untyped = 3
- let a = 1
- # outputs 3:
- echo a
- The matching is performed after the compiler performed some optimizations like
- constant folding, so the following does not work:
- .. code-block:: nim
- template t{0|1}(): untyped = 3
- # outputs 1:
- echo 1
- The reason is that the compiler already transformed the 1 into "1" for
- the ``echo`` statement. However, a term rewriting macro should not change the
- semantics anyway. In fact they can be deactivated with the ``--patterns:off``
- command line option or temporarily with the ``patterns`` pragma.
- The ``{}`` operator
- ~~~~~~~~~~~~~~~~~~~
- A pattern expression can be bound to a pattern parameter via the ``expr{param}``
- notation:
- .. code-block:: nim
- template t{(0|1|2){x}}(x: untyped): untyped = x+1
- let a = 1
- # outputs 2:
- echo a
- The ``~`` operator
- ~~~~~~~~~~~~~~~~~~
- The ``~`` operator is the **not** operator in patterns:
- .. code-block:: nim
- template t{x = (~x){y} and (~x){z}}(x, y, z: bool) =
- x = y
- if x: x = z
- var
- a = false
- b = true
- c = false
- a = b and c
- echo a
- The ``*`` operator
- ~~~~~~~~~~~~~~~~~~
- The ``*`` operator can *flatten* a nested binary expression like ``a & b & c``
- to ``&(a, b, c)``:
- .. code-block:: nim
- var
- calls = 0
- proc `&&`(s: varargs[string]): string =
- result = s[0]
- for i in 1..len(s)-1: result.add s[i]
- inc calls
- template optConc{ `&&` * a }(a: string): untyped = &&a
- let space = " "
- echo "my" && (space & "awe" && "some " ) && "concat"
- # check that it's been optimized properly:
- doAssert calls == 1
- The second operator of `*` must be a parameter; it is used to gather all the
- arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"``
- is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``)
- which is flattened into a call expression; thus the invocation of ``optConc``
- produces:
- .. code-block:: nim
- `&&`("my", space & "awe", "some ", "concat")
- The ``**`` operator
- ~~~~~~~~~~~~~~~~~~~
- The ``**`` is much like the ``*`` operator, except that it gathers not only
- all the arguments, but also the matched operators in reverse polish notation:
- .. code-block:: nim
- import macros
- type
- Matrix = object
- dummy: int
- proc `*`(a, b: Matrix): Matrix = discard
- proc `+`(a, b: Matrix): Matrix = discard
- proc `-`(a, b: Matrix): Matrix = discard
- proc `$`(a: Matrix): string = result = $a.dummy
- proc mat21(): Matrix =
- result.dummy = 21
- macro optM{ (`+`|`-`|`*`) ** a }(a: Matrix): untyped =
- echo treeRepr(a)
- result = newCall(bindSym"mat21")
- var x, y, z: Matrix
- echo x + y * z - x
- This passes the expression ``x + y * z - x`` to the ``optM`` macro as
- an ``nnkArgList`` node containing::
- Arglist
- Sym "x"
- Sym "y"
- Sym "z"
- Sym "*"
- Sym "+"
- Sym "x"
- Sym "-"
- (Which is the reverse polish notation of ``x + y * z - x``.)
- Parameters
- ----------
- Parameters in a pattern are type checked in the matching process. If a
- parameter is of the type ``varargs`` it is treated specially and it can match
- 0 or more arguments in the AST to be matched against:
- .. code-block:: nim
- template optWrite{
- write(f, x)
- ((write|writeLine){w})(f, y)
- }(x, y: varargs[untyped], f: File, w: untyped) =
- w(f, x, y)
- Example: Partial evaluation
- ---------------------------
- The following example shows how some simple partial evaluation can be
- implemented with term rewriting:
- .. code-block:: nim
- proc p(x, y: int; cond: bool): int =
- result = if cond: x + y else: x - y
- template optP1{p(x, y, true)}(x, y: untyped): untyped = x + y
- template optP2{p(x, y, false)}(x, y: untyped): untyped = x - y
- Example: Hoisting
- -----------------
- The following example shows how some form of hoisting can be implemented:
- .. code-block:: nim
- import pegs
- template optPeg{peg(pattern)}(pattern: string{lit}): Peg =
- var gl {.global, gensym.} = peg(pattern)
- gl
- for i in 0 .. 3:
- echo match("(a b c)", peg"'(' @ ')'")
- echo match("W_HI_Le", peg"\y 'while'")
- The ``optPeg`` template optimizes the case of a peg constructor with a string
- literal, so that the pattern will only be parsed once at program startup and
- stored in a global ``gl`` which is then re-used. This optimization is called
- hoisting because it is comparable to classical loop hoisting.
- AST based overloading
- =====================
- Parameter constraints can also be used for ordinary routine parameters; these
- constraints affect ordinary overloading resolution then:
- .. code-block:: nim
- proc optLit(a: string{lit|`const`}) =
- echo "string literal"
- proc optLit(a: string) =
- echo "no string literal"
- const
- constant = "abc"
- var
- variable = "xyz"
- optLit("literal")
- optLit(constant)
- optLit(variable)
- However, the constraints ``alias`` and ``noalias`` are not available in
- ordinary routines.
- Move optimization
- -----------------
- The ``call`` constraint is particularly useful to implement a move
- optimization for types that have copying semantics:
- .. code-block:: nim
- proc `[]=`*(t: var Table, key: string, val: string) =
- ## puts a (key, value)-pair into `t`. The semantics of string require
- ## a copy here:
- let idx = findInsertionPosition(key)
- t[idx].key = key
- t[idx].val = val
- proc `[]=`*(t: var Table, key: string{call}, val: string{call}) =
- ## puts a (key, value)-pair into `t`. Optimized version that knows that
- ## the strings are unique and thus don't need to be copied:
- let idx = findInsertionPosition(key)
- shallowCopy t[idx].key, key
- shallowCopy t[idx].val, val
- var t: Table
- # overloading resolution ensures that the optimized []= is called here:
- t[f()] = g()
|