Threads
=======

To enable thread support the ``--threads:on`` command line switch needs to
be used. The ``system`` module then contains several threading primitives.
See the `threads <threads.html>`_ and `channels <channels.html>`_ modules
for the low level thread API. There are also high level parallelism constructs
available. See `spawn`_ for further details.

Nim's memory model for threads is quite different from that of other common
programming languages (C, Pascal, Java): Each thread has its own (garbage
collected) heap and sharing of memory is restricted to global variables. This
helps to prevent race conditions. GC efficiency is improved quite a lot,
because the GC never has to stop other threads and see what they reference.
Memory allocation requires no lock at all! This design easily scales to massive
multicore processors that are becoming the norm.

Thread pragma
-------------

A proc that is executed as a new thread of execution should be marked by the
``thread`` pragma for reasons of readability. The compiler checks for
violations of the `no heap sharing restriction`:idx:\: This restriction implies
that it is invalid to construct a data structure that consists of memory
allocated from different (thread local) heaps.

A thread proc is passed to ``createThread`` or ``spawn`` and invoked
indirectly; so the ``thread`` pragma implies ``procvar``.
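
As a rough illustration of a ``thread`` proc handed to ``createThread``, the
sketch below starts four threads that each fill one slot of a global array.
The names ``worker`` and ``squares`` are made up for this example, and it
assumes compilation with ``--threads:on``. The global array holds only ``int``
values, so it contains no GC'ed memory and may be shared:

.. code-block:: nim

  # Hypothetical example; compile with: nim c --threads:on example.nim
  var
    workers: array[4, Thread[int]]
    squares: array[4, int]      # plain global without GC'ed memory

  proc worker(i: int) {.thread.} =
    # Each thread writes only its own slot, so the writes never overlap.
    squares[i] = i * i

  for i in 0 .. workers.high:
    createThread(workers[i], worker, i)
  joinThreads(workers)          # wait for all four threads to finish

  echo squares

Each thread receives its argument by value, and the main thread reads
``squares`` only after ``joinThreads`` has returned.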

GC safety
---------

We call a proc ``p`` `GC safe`:idx: when it doesn't access any global variable
that contains GC'ed memory (``string``, ``seq``, ``ref`` or a closure) either
directly or indirectly through a call to a GC unsafe proc.

The `gcsafe`:idx: annotation can be used to mark a proc as gcsafe;
otherwise this property is inferred by the compiler. Note that ``noSideEffect``
implies ``gcsafe``. The only way to create a thread is via ``spawn`` or
``createThread``. ``spawn`` is usually the preferable method. Either way,
the invoked proc must not use ``var`` parameters, nor must any of its
parameters contain a ``ref`` or ``closure`` type. This enforces
the *no heap sharing restriction*.

Routines that are imported from C are always assumed to be ``gcsafe``.
To disable the GC-safety checking the ``--threadAnalysis:off`` command line
switch can be used. This is a temporary workaround to ease the porting effort
from old code to the new threading model.

To override the compiler's gcsafety analysis a ``{.gcsafe.}`` pragma block can
be used:

.. code-block:: nim

  var
    someGlobal: string = "some string here"
    perThread {.threadvar.}: string

  proc setPerThread() =
    {.gcsafe.}:
      deepCopy(perThread, someGlobal)


Future directions:

- A shared GC'ed heap might be provided.

Threadvar pragma
----------------

A global variable can be marked with the ``threadvar`` pragma; it is
a `thread-local`:idx: variable then:

.. code-block:: nim

  var checkpoints* {.threadvar.}: seq[string]

Due to implementation restrictions thread local variables cannot be
initialized within the ``var`` section. (Every thread local variable needs to
be replicated at thread creation.)
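
The practical consequence can be sketched as follows: a thread-local variable
is assigned after its declaration, and each thread works on its own copy. The
names ``counter``, ``seenByWorker`` and ``bump`` are invented for this
example, which assumes ``--threads:on``:

.. code-block:: nim

  # Hypothetical example; compile with: nim c --threads:on example.nim
  var
    counter {.threadvar.}: int  # every thread gets its own copy, starting at 0
    seenByWorker: int           # ordinary global, visible to all threads

  proc bump(times: int) {.thread.} =
    for i in 1 .. times:
      inc counter               # modifies the worker thread's copy only
    seenByWorker = counter

  counter = 10                  # initialized by assignment, not in the var section
  var t: Thread[int]
  createThread(t, bump, 5)
  joinThread(t)

  echo counter, " ", seenByWorker   # prints "10 5"

The main thread's ``counter`` keeps the value 10, while the worker counted from
zero in its own copy.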

Threads and exceptions
----------------------

The interaction between threads and exceptions is simple: A *handled* exception
in one thread cannot affect any other thread. However, an *unhandled* exception
in one thread terminates the whole *process*!
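
A minimal sketch of the *handled* case, assuming ``--threads:on``: one of two
threads fails to parse its input, catches the ``ValueError`` itself, and the
other thread and the process carry on. The names ``parsed`` and ``parseSlot``
are illustrative only:

.. code-block:: nim

  # Hypothetical example; compile with: nim c --threads:on example.nim
  import strutils

  var parsed: array[2, int]     # shared global without GC'ed memory

  proc parseSlot(i: int) {.thread.} =
    let input = if i == 0: "42" else: "oops"
    try:
      parsed[i] = parseInt(input)
    except ValueError:
      # Handled inside the thread, so nothing else is affected.
      parsed[i] = -1

  var threads: array[2, Thread[int]]
  for i in 0 .. threads.high:
    createThread(threads[i], parseSlot, i)
  joinThreads(threads)

  echo parsed

Without the ``try`` statement the ``ValueError`` would be unhandled and, as
described above, would terminate the whole process.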

Parallel & Spawn
================

Nim has two flavors of parallelism:

1) `Structured`:idx: parallelism via the ``parallel`` statement.
2) `Unstructured`:idx: parallelism via the standalone ``spawn`` statement.

Nim has a builtin thread pool that can be used for CPU intensive tasks. For
IO intensive tasks the ``async`` and ``await`` features should be
used instead. Both ``parallel`` and ``spawn`` need the
`threadpool <threadpool.html>`_ module to work.

Somewhat confusingly, ``spawn`` is also used in the ``parallel`` statement
with slightly different semantics. ``spawn`` always takes a call expression of
the form ``f(a, ...)``. Let ``T`` be ``f``'s return type. If ``T`` is ``void``
then ``spawn``'s return type is also ``void``; otherwise it is ``FlowVar[T]``.

Within a ``parallel`` section sometimes the ``FlowVar[T]`` is eliminated
to ``T``. This happens when ``T`` does not contain any GC'ed memory.
The compiler can ensure the location in ``location = spawn f(...)`` is not
read prematurely within a ``parallel`` section and so there is no need for
the overhead of an indirection via ``FlowVar[T]`` to ensure correctness.

**Note**: Currently exceptions are not propagated between ``spawn``'ed tasks!

Spawn statement
---------------

`spawn`:idx: can be used to pass a task to the thread pool:

.. code-block:: nim

  import threadpool

  proc processLine(line: string) =
    discard "do some heavy lifting here"

  for x in lines("myinput.txt"):
    spawn processLine(x)
  sync()

For reasons of type safety and implementation simplicity the expression
that ``spawn`` takes is restricted:

* It must be a call expression ``f(a, ...)``.
* ``f`` must be ``gcsafe``.
* ``f`` must not have the calling convention ``closure``.
* ``f``'s parameters may not be of type ``var``.
  This means one has to use raw ``ptr`` values for data passing, which
  reminds the programmer to be careful.
* ``ref`` parameters are deeply copied, which is a subtle semantic change and
  can cause performance problems but ensures memory safety. This deep copy
  is performed via ``system.deepCopy`` and so can be overridden.
* For *safe* data exchange between ``f`` and the caller a global ``Channel``
  (formerly ``TChannel``) needs to be used. However, since ``spawn`` can
  return a result, often no further communication is required.
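
The last point can be sketched roughly as follows, assuming ``--threads:on``
and the ``Channel`` type from the ``channels`` module. The names ``results``
and ``work`` are hypothetical. Each spawned task sends one value over a global
channel and the caller collects them:

.. code-block:: nim

  # Hypothetical example; compile with: nim c --threads:on example.nim
  import threadpool

  var results: Channel[int]     # global channel shared with the spawned tasks

  proc work(x: int) =
    results.send(x * 2)

  results.open()
  for i in 1 .. 3:
    spawn work(i)

  var sum = 0
  for i in 1 .. 3:
    sum += results.recv()       # blocks until a task has sent its value
  sync()                        # wait for all spawned tasks to finish
  results.close()

  echo sum

The order in which the three values arrive is not defined, which is why the
sketch only adds them up.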

``spawn`` executes the passed expression on the thread pool and returns
a `data flow variable`:idx: ``FlowVar[T]`` that can be read from. The reading
with the ``^`` operator is **blocking**. However, one can use ``awaitAny`` to
wait on multiple flow variables at the same time:

.. code-block:: nim

  import threadpool, ...

  # wait until 2 out of 3 servers received the update:
  proc main =
    var responses = newSeq[FlowVarBase](3)
    for i in 0..2:
      responses[i] = spawn tellServer(Update, "key", "value")
    var index = awaitAny(responses)
    assert index >= 0
    responses.del(index)
    discard awaitAny(responses)

Data flow variables ensure that no data races
are possible. Due to technical limitations not every type ``T`` is possible in
a data flow variable: ``T`` has to be of the type ``ref``, ``string``, ``seq``
or of a type that doesn't contain a type that is garbage collected. This
restriction is not hard to work around in practice.
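
For the simplest case, a ``FlowVar[int]``, reading results back with the
blocking ``^`` operator might look like the sketch below. The proc ``square``
and the variable names are invented for illustration, and ``--threads:on`` is
assumed:

.. code-block:: nim

  # Hypothetical example; compile with: nim c --threads:on example.nim
  import threadpool

  proc square(x: int): int = x * x

  var pending: seq[FlowVar[int]] = @[]
  for i in 1 .. 4:
    let fv = spawn square(i)    # returns immediately with a FlowVar[int]
    pending.add(fv)

  var total = 0
  for fv in pending:
    total += ^fv                # blocks until this task has produced its result

  echo total                    # 1 + 4 + 9 + 16 = 30

Since ``int`` contains no garbage collected memory, it satisfies the
restriction on ``T`` described above.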

Parallel statement
------------------

Example:

.. code-block:: nim

  # Compute PI in an inefficient way
  import strutils, math, threadpool

  proc term(k: float): float = 4 * math.pow(-1, k) / (2*k + 1)

  proc pi(n: int): float =
    var ch = newSeq[float](n+1)
    parallel:
      for k in 0..ch.high:
        ch[k] = spawn term(float(k))
    for k in 0..ch.high:
      result += ch[k]

  echo formatFloat(pi(5000))

The parallel statement is the preferred mechanism to introduce parallelism
in a Nim program. A subset of the Nim language is valid within a
``parallel`` section. This subset is checked to be free of data races at
compile time. A sophisticated `disjoint checker`:idx: ensures that no data
races are possible even though shared memory is extensively supported!

The subset is in fact the full language with the following
restrictions / changes:

* ``spawn`` within a ``parallel`` section has special semantics.
* Every location of the form ``a[i]`` and ``a[i..j]`` and ``dest`` where
  ``dest`` is part of the pattern ``dest = spawn f(...)`` has to be
  provably disjoint. This is called the *disjoint check*.
* Every other complex location ``loc`` that is used in a spawned
  proc (``spawn f(loc)``) has to be immutable for the duration of
  the ``parallel`` section. This is called the *immutability check*. Currently
  it is not specified what exactly "complex location" means. We need to make
  this an optimization!
* Every array access has to be provably within bounds. This is called
  the *bounds check*.
* Slices are optimized so that no copy is performed. This optimization is not
  yet performed for ordinary slices outside of a ``parallel`` section.