  1. @c -*-texinfo-*-
  2. @c This is part of the GNU Guile Reference Manual.
  3. @c Copyright (C) 2008-2011, 2013, 2015, 2018, 2019
  4. @c Free Software Foundation, Inc.
  5. @c See the file guile.texi for copying conditions.
  6. @node A Virtual Machine for Guile
  7. @section A Virtual Machine for Guile
  8. Enough about data---how does Guile run code?
  9. Code is a grammatical production of a language. Sometimes these
  10. languages are implemented using interpreters: programs that run
  11. alongside the program being interpreted, dynamically translating the
  12. high-level code to low-level code. Sometimes these languages are
  13. implemented using compilers: programs that translate high-level
  14. programs to equivalent low-level code, and pass on that low-level code
  15. to some other language implementation. Each of these languages can be
  16. thought of as a virtual machine: each offers programs an abstract machine
  17. on which to run.
  18. Guile implements a number of interpreters and compilers on different
  19. language levels. For example, there is an interpreter for the Scheme
  20. language that is itself implemented as a Scheme program compiled to a
  21. bytecode for a low-level virtual machine shipped with Guile. That
  22. virtual machine is implemented by both an interpreter---a C program that
  23. interprets the bytecodes---and a compiler---a C program that dynamically
  24. translates bytecode programs to native machine code@footnote{Even the
  25. lowest-level machine code can be thought to be interpreted by the CPU,
  26. and indeed is often implemented by compiling machine instructions to
  27. ``micro-operations''.}.
  28. This section describes the language implemented by Guile's bytecode
  29. virtual machine, as well as some examples of translations of Scheme
  30. programs to Guile's VM.
  31. @menu
  32. * Why a VM?::
  33. * VM Concepts::
  34. * Stack Layout::
  35. * Variables and the VM::
  36. * VM Programs::
  37. * Object File Format::
  38. * Instruction Set::
  39. * Just-In-Time Native Code::
  40. @end menu
  41. @node Why a VM?
  42. @subsection Why a VM?
  43. @cindex interpreter
  44. For a long time, Guile only had a Scheme interpreter, implemented in C.
  45. Guile's interpreter operated directly on the S-expression representation
  46. of Scheme source code.
  47. But while the interpreter was highly optimized and hand-tuned, it still
  48. performed many needless computations during the course of evaluating a
  49. Scheme expression. For example, application of a function to arguments
  50. needlessly consed up the arguments in a list. Evaluation of an
  51. expression like @code{(f x y)} always had to figure out whether @var{f}
  52. was a procedure, or a special form like @code{if}, or something else.
  53. The interpreter represented the lexical environment as a heap data
  54. structure, so every evaluation caused allocation, which was of course
  55. slow. Et cetera.
  56. The solution to the slow-interpreter problem was to compile the
  57. higher-level language, Scheme, into a lower-level language for which all
  58. of the checks and dispatching have already been done---the code is
  59. instead stripped to the bare minimum needed to ``do the job''.
  60. The question becomes then, what low-level language to choose? There are
  61. many options. We could compile to native code directly, but that poses
  62. portability problems for Guile, as it is a highly cross-platform
  63. project.
  64. So we want the performance gains that compilation provides, but we
  65. also want to maintain the portability benefits of a single code path.
  66. The obvious solution is to compile to a virtual machine that is
  67. present on all Guile installations.
  68. The easiest (and most fun) way to depend on a virtual machine is to
  69. implement the virtual machine within Guile itself. Guile contains a
  70. bytecode interpreter (written in C) and a Scheme to bytecode compiler
  71. (written in Scheme). This way the virtual machine provides what Scheme
  72. needs (tail calls, multiple values, @code{call/cc}) and can provide
  73. optimized inline instructions for Guile as well (GC-managed allocations,
  74. type checks, etc.).
  75. Guile also includes a just-in-time (JIT) compiler to translate bytecode
  76. to native code. Because Guile embeds a portable code generation library
  77. (@url{https://gitlab.com/wingo/lightening}), we keep the benefits of
  78. portability while also benefitting from fast native code. To avoid too
  79. much time spent in the JIT compiler itself, Guile is tuned to only emit
  80. machine code for bytecode that is called often.
  81. The rest of this section describes the VM that Guile implements, and
  82. the compiled procedures that run on it.
  83. Before moving on, though, we should note that though we spoke of the
  84. interpreter in the past tense, Guile still has an interpreter. The
  85. difference is that before, it was Guile's main Scheme implementation,
  86. and so was implemented in highly optimized C; now, it is actually
  87. implemented in Scheme, and compiled down to VM bytecode, just like any
  88. other program. (There is still a C interpreter around, used to
  89. bootstrap the compiler, but it is not normally used at runtime.)
  90. The upside of implementing the interpreter in Scheme is that we preserve
  91. tail calls and multiple-value handling between interpreted and compiled
  92. code, and with the advent of the JIT compiler in Guile 3.0 we reach the
  93. speed of the old hand-tuned C implementation; it's the best of both
  94. worlds.
  95. Also note that this decision to implement a bytecode compiler does not
  96. preclude ahead-of-time native compilation. More possibilities are
  97. discussed in @ref{Extending the Compiler}.
  98. @node VM Concepts
  99. @subsection VM Concepts
  100. The bytecode in a Scheme procedure is interpreted by a virtual machine
  101. (VM). Each thread has its own instantiation of the VM. The virtual
  102. machine executes the sequence of instructions in a procedure.
  103. Each VM instruction starts by indicating which operation it is, and then
  104. follows by encoding its source and destination operands. Each procedure
  105. declares that it has some number of local variables, including the
  106. function arguments. These local variables form the available operands
  107. of the procedure, and are accessed by index.
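
As a rough illustration of this operand model, here is a minimal sketch
of a register-style interpreter written in Scheme. The instruction
names (@code{load-const}, @code{add}, @code{return}) and the procedure
@code{run-toy-program} are hypothetical and exist only in this example;
Guile's real instructions are encoded as binary words
(@pxref{Instruction Set}). Unlike the real VM, the toy places the
arguments starting at slot 0.

@example
;; A toy register-style interpreter; not Guile's actual VM.
;; CODE is a vector of instructions.  Each instruction is a list
;; whose head names the operation and whose tail holds slot
;; indices or constants.
(define (run-toy-program code nlocals args)
  (let ((locals (make-vector nlocals #f)))
    ;; Arguments occupy the first local slots.
    (let fill ((i 0) (rest args))
      (unless (null? rest)
        (vector-set! locals i (car rest))
        (fill (+ i 1) (cdr rest))))
    ;; IP indexes the instruction about to run.
    (let loop ((ip 0))
      (let ((insn (vector-ref code ip)))
        (case (car insn)
          ((load-const)                 ; (load-const dst value)
           (vector-set! locals (cadr insn) (caddr insn))
           (loop (+ ip 1)))
          ((add)                        ; (add dst a b)
           (vector-set! locals (cadr insn)
                        (+ (vector-ref locals (caddr insn))
                           (vector-ref locals (cadddr insn))))
           (loop (+ ip 1)))
          ((return)                     ; (return src)
           (vector-ref locals (cadr insn)))
          (else (error "unknown instruction" insn)))))))

;; Computes x + y + 10 using two argument slots and one extra local.
(define toy-code
  (vector '(load-const 2 10)
          '(add 0 0 1)
          '(add 0 0 2)
          '(return 0)))
@end example

With these definitions, @code{(run-toy-program toy-code 3 '(1 2))}
evaluates to 13. Every operand is a small integer naming a slot, which
is the property the paragraph above describes.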
  108. The local variables for a procedure are stored on a stack. Calling a
  109. procedure typically enlarges the stack, and returning from a procedure
  110. shrinks it. Stack memory is exclusive to the virtual machine that owns
  111. it.
  112. In addition to their stacks, virtual machines also have access to the
  113. global memory (modules, global bindings, etc.) that is shared among other
  114. parts of Guile, including other VMs.
  115. The registers that a VM has are as follows:
  116. @itemize
  117. @item ip - Instruction pointer
  118. @item sp - Stack pointer
  119. @item fp - Frame pointer
  120. @end itemize
  121. In other architectures, the instruction pointer is sometimes called the
  122. ``program counter'' (pc). This set of registers is pretty typical for
  123. virtual machines; their exact meanings in the context of Guile's VM are
  124. described in the next section.
  125. @node Stack Layout
  126. @subsection Stack Layout
  127. The stack of Guile's virtual machine is composed of @dfn{frames}. Each
  128. frame corresponds to the application of one compiled procedure, and
  129. contains storage space for arguments, local variables, and some
  130. bookkeeping information (such as what to do after the frame is
  131. finished).
  132. While the compiler is free to do whatever it wants to, as long as the
  133. semantics of a computation are preserved, in practice every time you
  134. call a function, a new frame is created. (The notable exception of
  135. course is the tail call case, @pxref{Tail Calls}.)
  136. The structure of the top stack frame is as follows:
  137. @example
  138. | ...previous frame locals...  |
  139. +==============================+ <- fp + 3
  140. |         Dynamic link         |
  141. +------------------------------+
  142. | Virtual return address (vRA) |
  143. +------------------------------+
  144. | Machine return address (mRA) |
  145. +==============================+ <- fp
  146. |           Local 0            |
  147. +------------------------------+
  148. |           Local 1            |
  149. +------------------------------+
  150. |             ...              |
  151. +------------------------------+
  152. |          Local N-1           |
  153. \------------------------------/ <- sp
  154. @end example
  155. In the above drawing, the stack grows downward. At the beginning of a
  156. function call, the procedure being applied is in local 0, followed by
  157. the arguments from local 1. After the procedure checks that it is being
  158. passed a compatible set of arguments, the procedure allocates some
  159. additional space in the frame to hold variables local to the function.
  160. Note that once a value in a local variable slot is no longer needed,
  161. Guile is free to re-use that slot. This applies to the slots that were
  162. initially used for the callee and arguments, too. For this reason,
  163. backtraces in Guile aren't always able to show all of the arguments: it
  164. could be that the slot corresponding to that argument was re-used by
  165. some other variable.
  166. The @dfn{virtual return address} is the @code{ip} that was in effect
  167. before this program was applied. When we return from this activation
  168. frame, we will jump back to this @code{ip}. Likewise, the @dfn{dynamic
  169. link} is the offset of the @code{fp} that was in effect before this
  170. program was applied, relative to the current @code{fp}.
  171. There are two return addresses: the virtual return address (vRA), and
  172. the machine return address (mRA). The vRA is always present and
  173. indicates a bytecode address. The mRA is only present when a call is
  174. made from a function with machine code (e.g. a function that has been
  175. JIT-compiled).
  176. To prepare for a non-tail application, Guile's compiler will emit code that
  177. shuffles the function to apply and its arguments into appropriate stack
  178. slots, with three free slots below them. The call then initializes
  179. those free slots to hold the machine return address (or NULL), the
  180. virtual return address, and the offset to the previous frame pointer
  181. (@code{fp}). It then gets the @code{ip} for the function being called
  182. and adjusts @code{fp} to point to the new call frame.
  183. In this way, the dynamic link links the current frame to the previous
  184. frame. Computing a stack trace involves traversing these frames.
  185. Each stack local in Guile is 64 bits wide, even on 32-bit architectures.
  186. This allows Guile to preserve its uniform treatment of stack locals
  187. while allowing for unboxed arithmetic on 64-bit integers and
  188. floating-point numbers. @xref{Instruction Set}, for more on unboxed
  189. arithmetic.
  190. As an implementation detail, we actually store the dynamic link as an
  191. offset and not an absolute value because the stack can move at runtime
  192. as it expands or during partial continuation calls. If it were an
  193. absolute value, we would have to walk the frames, relocating frame
  194. pointers.
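
The following sketch models this layout in Scheme to show why storing
the dynamic link as an offset survives a stack move. It is an
illustration under stated assumptions, not Guile's implementation: the
stack is an ordinary vector, the procedure names
(@code{push-frame!}, @code{frame-return-addresses},
@code{relocate-stack}) are hypothetical, and marking the outermost
frame with @code{#f} is a convention of the sketch.

@example
;; A model of the frame layout drawn above; not Guile's actual code.
;; High indices play the role of high addresses, so the stack grows
;; toward index 0.  For a frame pointer FP:
;;   stack[FP]          machine return address (#f when absent)
;;   stack[FP + 1]      virtual return address
;;   stack[FP + 2]      dynamic link, as an offset to the previous FP
;;   stack[FP - 1 - i]  local i
(define (push-frame! stack prev-fp prev-sp vra)
  ;; PREV-SP is the caller's stack pointer; the three bookkeeping
  ;; slots go just below it.  Returns the new frame pointer.
  (let ((fp (- prev-sp 3)))
    (vector-set! stack fp #f)
    (vector-set! stack (+ fp 1) vra)
    (vector-set! stack (+ fp 2) (and prev-fp (- prev-fp fp)))
    fp))

(define (frame-return-addresses stack fp)
  ;; Walk the dynamic links from the newest frame outward.
  (let loop ((fp fp) (vras '()))
    (let ((link (vector-ref stack (+ fp 2)))
          (vras (cons (vector-ref stack (+ fp 1)) vras)))
      (if link
          (loop (+ fp link) vras)
          (reverse vras)))))

(define (relocate-stack stack new-size)
  ;; Copy STACK into the top of a larger vector, as when the stack
  ;; expands.  Every index shifts, but no stored link is rewritten.
  (let* ((old-size (vector-length stack))
         (new (make-vector new-size #f))
         (shift (- new-size old-size)))
    (let loop ((i 0))
      (when (< i old-size)
        (vector-set! new (+ i shift) (vector-ref stack i))
        (loop (+ i 1))))
    new))

(define stack (make-vector 16 #f))
(define fp0 (push-frame! stack #f 16 'toplevel))
;; Suppose the outer frame holds 2 locals, so its sp is fp0 - 2.
(define fp1 (push-frame! stack fp0 (- fp0 2) 'caller-ip))
(define trace-before (frame-return-addresses stack fp1))

;; Move everything 16 slots up without touching the stored links.
(define moved (relocate-stack stack 32))
(define trace-after (frame-return-addresses moved (+ fp1 16)))
@end example

Both @code{trace-before} and @code{trace-after} are the list
@code{(caller-ip toplevel)}. Only the frame pointer held outside the
stack had to be adjusted after the move; with absolute links, each
frame would have needed patching.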
  195. @node Variables and the VM
  196. @subsection Variables and the VM
  197. Consider the following Scheme code as an example:
  198. @example
  199. (define (foo a)
  200. (lambda (b) (vector foo a b)))
  201. @end example
  202. Within the lambda expression, @code{foo} is a top-level variable,
  203. @code{a} is a lexically captured variable, and @code{b} is a local
  204. variable.
  205. Another way to refer to @code{a} and @code{b} is to say that @code{a} is
  206. a ``free'' variable, since it is not defined within the lambda, and
  207. @code{b} is a ``bound'' variable. These are the terms used in the
  208. @dfn{lambda calculus}, a mathematical notation for describing functions.
  209. The lambda calculus is useful because it is a language in which to
  210. reason precisely about functions and variables. It is especially good
  211. at describing scope relations, and it is for that reason that we mention
  212. it here.
  213. Guile allocates all variables on the stack. When a lexically enclosed
  214. procedure with free variables---a @dfn{closure}---is created, it copies
  215. those variables into its free variable vector. References to free
  216. variables are then redirected through the free variable vector.
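
To make the free variable vector concrete, here is a hand-written
version of the @code{foo} example in which the closure is an explicit
pair of code and captured values. This is only a model: the helper
names (@code{make-closure}, @code{closure-free-ref},
@code{closure-apply}, @code{foo/converted}) are hypothetical, and
Guile's actual closure objects are laid out differently.

@example
;; Hand-written closure conversion; a model, not Guile's representation.
;; A closure is a pair of a code procedure and a vector of captured
;; values.  The code receives the closure itself as its first
;; argument, much as local 0 holds the callee in the VM.
(define (make-closure code . free-values)
  (cons code (list->vector free-values)))

(define (closure-free-ref closure i)
  (vector-ref (cdr closure) i))

(define (closure-apply closure . args)
  (apply (car closure) closure args))

;; Corresponds to (define (foo a) (lambda (b) (vector foo a b))).
(define (foo/converted a)
  (make-closure
   (lambda (self b)
     ;; A is free in the lambda: fetch the copy made at creation time.
     ;; B is an ordinary local.  FOO/CONVERTED is a top-level reference.
     (vector foo/converted (closure-free-ref self 0) b))
   a))
@end example

Calling @code{(closure-apply (foo/converted 1) 2)} yields a
three-element vector whose second and third elements are 1 and 2. The
value of @code{a} was copied when the closure was made, so the inner
code never looks at the frame of @code{foo/converted}.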
  217. If a variable is ever @code{set!}, however, it will need to be
  218. heap-allocated instead of stack-allocated, so that different closures
  219. that capture the same variable can see the same value. Also, this
  220. allows continuations to capture a reference to the variable, instead
  221. of to its value at one point in time. For these reasons, @code{set!}
  222. variables are allocated in ``boxes''---actually, in variable cells.
  223. @xref{Variables}, for more information. References to @code{set!}
  224. variables are indirected through the boxes.
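
The effect of boxing can be written out by hand. In the sketch below,
@code{make-counter} is ordinary Scheme with a mutated, captured
variable, and @code{make-counter/boxed} is a manual rewrite that makes
the box explicit using Guile's variable objects. Both procedure names
are made up for this example, and the rewrite only approximates what
the compiler does.

@example
;; Two closures that share one mutable variable.
(define (make-counter)
  (let ((n 0))
    (values (lambda () (set! n (+ n 1)) n)   ; increment
            (lambda () n))))                 ; read

;; The same procedure with the box made explicit.  Each closure
;; captures the box, not the number inside it.
(define (make-counter/boxed)
  (let ((n-box (make-variable 0)))
    (values (lambda ()
              (variable-set! n-box (+ (variable-ref n-box) 1))
              (variable-ref n-box))
            (lambda () (variable-ref n-box)))))

(define (count-twice make)
  ;; Increment twice through one closure, then read through the other.
  (call-with-values make
    (lambda (increment! current)
      (increment!)
      (increment!)
      (current))))
@end example

Both @code{(count-twice make-counter)} and
@code{(count-twice make-counter/boxed)} evaluate to 2. If each closure
had copied the value of @code{n} instead of sharing a box, the reader
closure would still see 0.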
  225. Thus perhaps counterintuitively, what would seem ``closer to the
  226. metal'', viz @code{set!}, actually forces an extra memory allocation and
  227. indirection. Sometimes Guile's optimizer can remove this allocation,
  228. but not always.
  229. Going back to our example, @code{b} may be allocated on the stack, as
  230. it is never mutated.
  231. @code{a} may also be allocated on the stack, as it too is never
  232. mutated. Within the enclosed lambda, its value will be copied into
  233. (and referenced from) the free variables vector.
  234. @code{foo} is a top-level variable, because @code{foo} is not
  235. lexically bound in this example.
  236. @node VM Programs
  237. @subsection Compiled Procedures are VM Programs
  238. By default, when you enter expressions at Guile's REPL, they are
  239. first compiled to bytecode. Then that bytecode is executed to produce a
  240. value. If the expression evaluates to a procedure, the result of this
  241. process is a compiled procedure.
  242. A compiled procedure is a compound object consisting of its bytecode and
  243. a reference to any captured lexical variables. In addition, when a
  244. procedure is compiled, it has associated metadata written to side
  245. tables, for instance a line number mapping, or its docstring. You can
  246. pick apart these pieces with the accessors in @code{(system vm
  247. program)}. @xref{Compiled Procedures}, for a full API reference.
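
For instance, the following sketch inspects a closure with a few of
those accessors. The variable name @code{closure} is arbitrary. What
the loop prints depends on how the code was compiled or evaluated, so
no output is shown here.

@example
(use-modules (system vm program))

(define (foo a)
  (lambda (b) (vector foo a b)))

(define closure (foo 42))

(when (program? closure)
  ;; The address of the procedure's bytecode.
  (format #t "code at ~a~%" (program-code closure))
  ;; The captured lexical variables, one per index.
  (let loop ((i 0))
    (when (< i (program-num-free-variables closure))
      (format #t "free variable ~a: ~s~%"
              i (program-free-variable-ref closure i))
      (loop (+ i 1)))))
@end example

The @code{program?} guard keeps the sketch safe to run on any object.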
  248. A procedure may reference data that was statically allocated when the
  249. procedure was compiled. For example, a pair of immediate objects
  250. (@pxref{Immediate Objects}) can be allocated directly in the memory
  251. segment that contains the compiled bytecode, and accessed directly by
  252. the bytecode.
  253. Another use for statically allocated data is to serve as a cache for a
  254. bytecode instruction. Top-level variable lookups are handled in this way; the first
  255. time a top-level binding is referenced, the resolved variable will be
  256. stored in a cache. Thereafter all access to the variable goes through
  257. the cache cell. The variable's value may change in the future, but the
  258. variable itself will not.
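
The caching scheme can be modelled in Scheme as follows. This is a
sketch of the idea only: the VM performs these steps in bytecode with a
statically allocated cell, while here a closure variable stands in for
the cell, and @code{make-cached-toplevel-ref} is a hypothetical helper.

@example
;; A model of the cached lookup; the VM does this in bytecode.
(define (make-cached-toplevel-ref module name)
  (let ((cache #f))                 ; stands in for the static cell
    (lambda ()
      (unless cache
        ;; First reference: resolve the binding and check it.
        (let ((var (module-variable module name)))
          (unless (and var (variable-bound? var))
            (error "Unbound variable:" name))
          (set! cache var)))
      ;; Every later reference goes straight through the cache.
      (variable-ref cache))))

(define counter 1)
(define ref-counter
  (make-cached-toplevel-ref (current-module) 'counter))

(define first-value (ref-counter))   ; resolves and caches
(set! counter 2)
(define second-value (ref-counter))  ; same variable, new value
@end example

Here @code{first-value} is 1 and @code{second-value} is 2: the cache
holds the variable object, not the value it contained when it was first
resolved.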
  259. We can see how these concepts tie together by disassembling the
  260. @code{foo} function we defined earlier to see what is going on:
  261. @smallexample
  262. scheme@@(guile-user)> (define (foo a) (lambda (b) (vector foo a b)))
  263. scheme@@(guile-user)> ,x foo
  264. Disassembly of #<procedure foo (a)> at #xf1da30:
  265. 0 (instrument-entry 164) at (unknown file):5:0
  266. 2 (assert-nargs-ee/locals 2 1) ;; 3 slots (1 arg)
  267. 3 (allocate-words/immediate 2 3) at (unknown file):5:16
  268. 4 (load-u64 0 0 65605)
  269. 7 (word-set!/immediate 2 0 0)
  270. 8 (load-label 0 7) ;; anonymous procedure at #xf1da6c
  271. 10 (word-set!/immediate 2 1 0)
  272. 11 (scm-set!/immediate 2 2 1)
  273. 12 (reset-frame 1) ;; 1 slot
  274. 13 (handle-interrupts)
  275. 14 (return-values)
  276. ----------------------------------------
  277. Disassembly of anonymous procedure at #xf1da6c:
  278. 0 (instrument-entry 183) at (unknown file):5:16
  279. 2 (assert-nargs-ee/locals 2 3) ;; 5 slots (1 arg)
  280. 3 (static-ref 2 152) ;; #<variable 112e530 value: #<procedure foo (a)>>
  281. 5 (immediate-tag=? 2 7 0) ;; heap-object?
  282. 7 (je 19) ;; -> L2
  283. 8 (static-ref 2 119) ;; #<directory (guile-user) ca9750>
  284. 10 (static-ref 1 127) ;; foo
  285. 12 (call-scm<-scm-scm 2 2 1 40)
  286. 14 (immediate-tag=? 2 7 0) ;; heap-object?
  287. 16 (jne 8) ;; -> L1
  288. 17 (scm-ref/immediate 0 2 1)
  289. 18 (immediate-tag=? 0 4095 2308) ;; undefined?
  290. 20 (je 4) ;; -> L1
  291. 21 (static-set! 2 134) ;; #<variable 112e530 value: #<procedure foo (a)>>
  292. 23 (j 3) ;; -> L2
  293. L1:
  294. 24 (throw/value 1 151) ;; #(unbound-variable #f "Unbound variable: ~S")
  295. L2:
  296. 26 (scm-ref/immediate 2 2 1)
  297. 27 (allocate-words/immediate 1 4) at (unknown file):5:28
  298. 28 (load-u64 0 0 781)
  299. 31 (word-set!/immediate 1 0 0)
  300. 32 (scm-set!/immediate 1 1 2)
  301. 33 (scm-ref/immediate 4 4 2)
  302. 34 (scm-set!/immediate 1 2 4)
  303. 35 (scm-set!/immediate 1 3 3)
  304. 36 (mov 4 1)
  305. 37 (reset-frame 1) ;; 1 slot
  306. 38 (handle-interrupts)
  307. 39 (return-values)
  308. @end smallexample
  309. The first thing to notice is that the bytecode is at a fairly low level.
  310. When a program is compiled from Scheme to bytecode, it is expressed in
  311. terms of more primitive operations. As such, there can be more
  312. instructions than you might expect.
  313. The first chunk of instructions is the outer @code{foo} procedure. It
  314. is followed by the code for the contained closure. The code can look
  315. daunting at first glance, but with practice it quickly becomes
  316. comprehensible, and indeed being able to read bytecode is an important
  317. step to understanding the low-level performance of Guile programs.
  318. The @code{foo} function begins with a prelude. The
  319. @code{instrument-entry} bytecode increments a counter associated with
  320. the function. If the counter reaches a certain threshold, Guile will
  321. emit machine code (``JIT-compile'') for @code{foo}. Emitting machine
  322. code is fairly cheap but it does take time, so it's not something you
  323. want to do for every function. Using a per-function counter and a
  324. global threshold allows Guile to spend time JIT-compiling only the
  325. ``hot'' functions.
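
The counting policy can be sketched in a few lines of Scheme. All names
here (@code{hot-threshold}, @code{make-instrumented}) and the threshold
value are invented for the example, and the ``compiler'' is just a
procedure that returns another procedure; the real mechanism lives in
the VM and is described in @ref{Just-In-Time Native Code}.

@example
;; A model of call counting; the names and the threshold are made up.
(define hot-threshold 3)            ; one global threshold

(define (make-instrumented name slow-path compile)
  ;; SLOW-PATH stands in for interpreting bytecode.  COMPILE stands
  ;; in for the JIT: it is called once and returns the fast procedure.
  (let ((counter 0)                 ; one counter per function
        (fast-path #f))
    (lambda args
      (cond
       (fast-path (apply fast-path args))
       (else
        (set! counter (+ counter 1))
        (when (>= counter hot-threshold)
          (set! fast-path (compile name)))
        (apply slow-path args))))))

(define compiled-names '())

(define square
  (make-instrumented 'square
                     (lambda (x) (* x x))
                     (lambda (name)
                       (set! compiled-names (cons name compiled-names))
                       (lambda (x) (* x x)))))

(define squares (map square '(1 2 3 4 5)))
@end example

After the five calls, @code{compiled-names} is @code{(square)}: the
stand-in compiler ran exactly once, on the third call, and later calls
took the fast path.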
  326. Next in the prelude is an argument-checking instruction, which checks
  327. that it was called with only 1 argument (plus the callee function itself
  328. makes 2) and then reserves stack space for an additional 1 local.
  329. Then from @code{ip} 3 to 11, we allocate a new closure by allocating a
  330. three-word object, initializing its first word to store a type tag,
  331. setting its second word to its code pointer, and finally at @code{ip}
  332. 11, storing local value 1 (the @code{a} argument) into the third word
  333. (the first free variable).
  334. Before returning, @code{foo} ``resets the frame'' to hold only one local
  335. (the return value), runs any pending interrupts (@pxref{Asyncs}) and
  336. then returns.
  337. Note that local variables in Guile's virtual machine are usually
  338. addressed relative to the stack pointer, which leads to a pleasantly
  339. efficient @code{sp[@var{n}]} access. However, it can make the
  340. disassembly hard to read, because the @code{sp} can change during the
  341. function, and because incoming arguments are relative to the @code{fp},
  342. not the @code{sp}.
  343. To know what @code{fp}-relative slot corresponds to an
  344. @code{sp}-relative reference, scan up in the disassembly until you get
  345. to a ``@var{n} slots'' annotation; in our case, 3, indicating that the
  346. frame has space for 3 slots. Thus a zero-indexed @code{sp}-relative
  347. slot of 2 corresponds to the @code{fp}-relative slot of 0, which
  348. initially held the value of the closure being called. This means that
  349. Guile doesn't need the value of the closure to compute its result, and
  350. so slot 0 was free for re-use, in this case for the result of making a
  351. new closure.
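
The conversion is simple arithmetic, written out here as a small helper
(the name @code{sp-slot->fp-slot} is ours, not part of Guile):

@example
;; Convert an sp-relative slot to an fp-relative one, given the
;; frame size from the nearest "N slots" annotation above it.
(define (sp-slot->fp-slot n-slots sp-slot)
  (- n-slots 1 sp-slot))

(sp-slot->fp-slot 3 2)   ; 0, the slot that held the closure
(sp-slot->fp-slot 3 1)   ; 1, the a argument
@end example

The result is only valid while the frame has the given size; after an
instruction that changes the frame size, use the next annotation.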
  352. A closure is code with data. As you can see, making the closure
  353. involved making an object (@code{ip} 3), putting a code pointer in it
  354. (@code{ip} 8 and 10), and putting in the closure's free variable
  355. (@code{ip} 11).
  356. The second stanza disassembles the code for the closure. After the
  357. prelude, all of the code between @code{ip} 3 and 24 is related to
  358. loading the toplevel variable @code{foo} into slot 2. This lookup
  359. happens only once, and is associated with a cache; after the first run,
  360. the value in the cache will be a bound variable, and the code will jump
  361. from @code{ip} 7 to 26. On the first run, Guile gets the module
  362. associated with the function, calls out to a run-time routine to look up
  363. the variable, and checks that the variable is bound before initializing
  364. the cache. Either way, @code{ip} 26 dereferences the variable into
  365. local 2.
  366. What follows is the allocation and initialization of the vector return
  367. value. The instruction at @code{ip} 27 does the allocation, and the
  368. following two instructions initialize the type-and-length tag for the
  369. object's first word. At @code{ip} 32, word 1 of the object (the first
  370. vector slot) is set to the value of @code{foo}; @code{ip} 33 fetches the
  371. closure variable for @code{a}, then @code{ip} 34 stores it in the second
  372. vector slot; and finally, at @code{ip} 35, local @code{b} is stored to
  373. the third vector slot. This is followed by the return sequence.
  374. @node Object File Format
  375. @subsection Object File Format
  376. To compile a file to disk, we need a format in which to write the
  377. compiled code to disk, and later load it into Guile. A good @dfn{object
  378. file format} has a number of characteristics:
  379. @itemize
  380. @item Above all else, it should be very cheap to load a compiled file.
  381. @item It should be possible to statically allocate constants in the
  382. file. For example, a bytevector literal in source code can be emitted
  383. directly into the object file.
  384. @item The compiled file should enable maximum code and data sharing
  385. between different processes.
  386. @item The compiled file should contain debugging information, such as
  387. line numbers, but that information should be separated from the code
  388. itself. It should be possible to strip debugging information if space
  389. is tight.
  390. @end itemize
  391. These characteristics are not specific to Scheme. Indeed, mainstream
  392. languages like C and C++ have solved this issue many times in the past.
  393. Guile builds on their work by adopting ELF, the object file format of
  394. GNU and other Unix-like systems, as its object file format. Although
  395. Guile uses ELF on all platforms, we do not use platform support for ELF.
  396. Guile implements its own linker and loader. The advantage of using ELF
  397. is not sharing code, but sharing ideas. ELF is simply a well-designed
  398. object file format.
  399. An ELF file has two meta-tables describing its contents. The first
  400. meta-table is for the loader, and is called the @dfn{program table} or
  401. sometimes the @dfn{segment table}. The program table divides the file
  402. into big chunks that should be treated differently by the loader.
  403. Mostly the difference between these @dfn{segments} is their
  404. permissions.
  405. Typically all segments of an ELF file are marked as read-only, except
  406. the part that represents modifiable static data or static data that
  407. needs load-time initialization. Loading an ELF file is as simple as
  408. mmapping the thing into memory with read-only permissions, then using
  409. the segment table to mark a small sub-region of the file as writable.
  410. This writable section is typically added to the root set of the garbage
  411. collector as well.
  412. One ELF segment is marked as ``dynamic'', meaning that it has data of
  413. interest to the loader. Guile uses this segment to record the Guile
  414. version corresponding to this file. There is also an entry in the
  415. dynamic segment that points to the address of an initialization thunk
  416. that is run to perform any needed link-time initialization. (This is
  417. like dynamic relocations for normal ELF shared objects, except that we
  418. compile the relocations as a procedure instead of having the loader
  419. interpret a table of relocations.) Finally, the dynamic segment marks
  420. the location of the ``entry thunk'' of the object file. This thunk is
  421. returned to the caller of @code{load-thunk-from-memory} or
  422. @code{load-thunk-from-file}. When called, it will execute the ``body''
  423. of the compiled expression.
  424. The other meta-table in an ELF file is the @dfn{section table}. Whereas
  425. the program table divides an ELF file into big chunks for the loader,
the section table specifies small sections for use by introspective
tools such as debuggers. One segment (program table entry)
  428. typically contains many sections. There may be sections outside of any
  429. segment, as well.
  430. Typical sections in a Guile @code{.go} file include:
  431. @table @code
  432. @item .rtl-text
  433. Bytecode.
  434. @item .data
  435. Data that needs initialization, or which may be modified at runtime.
  436. @item .rodata
  437. Statically allocated data that needs no run-time initialization, and
  438. which therefore can be shared between processes.
  439. @item .dynamic
  440. The dynamic section, discussed above.
  441. @item .symtab
  442. @itemx .strtab
  443. A table mapping addresses in the @code{.rtl-text} to procedure names.
  444. @code{.strtab} is used by @code{.symtab}.
  445. @item .guile.procprops
  446. @itemx .guile.arities
  447. @itemx .guile.arities.strtab
  448. @itemx .guile.docstrs
  449. @itemx .guile.docstrs.strtab
  450. Side tables of procedure properties, arities, and docstrings.
@item .guile.frame-maps
Side table of frame maps, describing the set of live slots for every
return point in the program text, and whether those slots are pointers
or not. Used by the garbage collector.
  455. @item .debug_info
  456. @itemx .debug_abbrev
  457. @itemx .debug_str
  458. @itemx .debug_loc
  459. @itemx .debug_line
Debugging information, in DWARF format. See the DWARF specification
for more information.
  462. @item .shstrtab
  463. Section name string table.
  464. @end table
  465. For more information, see @uref{http://linux.die.net/man/5/elf,,the
  466. elf(5) man page}. See @uref{http://dwarfstd.org/,the DWARF
  467. specification} for more on the DWARF debugging format. Or if you are an
  468. adventurous explorer, try running @code{readelf} or @code{objdump} on
  469. compiled @code{.go} files. It's good times!
  470. @node Instruction Set
  471. @subsection Instruction Set
  472. There are currently about 150 instructions in Guile's virtual machine.
  473. These instructions represent atomic units of a program's execution.
  474. Ideally, they perform one task without conditional branches, then
  475. dispatch to the next instruction in the stream.
  476. Instructions themselves are composed of 1 or more 32-bit units. The low
8 bits of the first word indicate the opcode, and the rest of the
instruction describes the operands. There are a number of different ways
  479. operands can be encoded.
  480. @table @code
  481. @item s@var{n}
  482. An unsigned @var{n}-bit integer, indicating the @code{sp}-relative index
  483. of a local variable.
  484. @item f@var{n}
  485. An unsigned @var{n}-bit integer, indicating the @code{fp}-relative index
  486. of a local variable. Used when a continuation accepts a variable number
  487. of values, to shuffle received values into known locations in the
  488. frame.
  489. @item c@var{n}
  490. An unsigned @var{n}-bit integer, indicating a constant value.
  491. @item l24
  492. An offset from the current @code{ip}, in 32-bit units, as a signed
  493. 24-bit value. Indicates a bytecode address, for a relative jump.
  494. @item i16
  495. @itemx i32
  496. An immediate Scheme value (@pxref{Immediate Objects}), encoded directly
  497. in 16 or 32 bits.
  498. @item a32
  499. @itemx b32
  500. An immediate Scheme value, encoded as a pair of 32-bit words.
  501. @code{a32} and @code{b32} values always go together on the same opcode,
  502. and indicate the high and low bits, respectively. Normally only used on
  503. 64-bit systems.
  504. @item n32
  505. A statically allocated non-immediate. The address of the non-immediate
  506. is encoded as a signed 32-bit integer, and indicates a relative offset
  507. in 32-bit units. Think of it as @code{SCM x = ip + offset}.
  508. @item r32
Indirect Scheme value, like @code{n32} but indirected. Think of it as
  510. @code{SCM *x = ip + offset}.
  511. @item l32
@itemx lo32
  513. An ip-relative address, as a signed 32-bit integer. Could indicate a
  514. bytecode address, as in @code{make-closure}, or a non-immediate address,
  515. as with @code{static-patch!}.
  516. @code{l32} and @code{lo32} are the same from the perspective of the
  517. virtual machine. The difference is that an assembler might want to
  518. allow an @code{lo32} address to be specified as a label and then some
  519. number of words offset from that label, for example when patching a
  520. field of a statically allocated object.
  521. @item b1
  522. A boolean value: 1 for true, otherwise 0.
  523. @item x@var{n}
  524. An ignored sequence of @var{n} bits.
  525. @end table
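As a rough illustration of the @code{l24} encoding, the following C
sketch extracts a signed 24-bit offset from the upper 24 bits of an
instruction word and applies it to an @code{ip} that points at 32-bit
units. The helper names @code{decode_l24} and @code{jump_target} are
hypothetical and do not come from Guile's sources.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the l24 operand from the upper 24 bits of an instruction
   word and sign-extend it to 32 bits.  (Hypothetical helper.)  */
static int32_t
decode_l24 (uint32_t word)
{
  int32_t offset = (int32_t) (word >> 8);   /* 0 .. 0xffffff */
  if (offset & 0x800000)                    /* sign bit of the 24-bit field */
    offset -= 0x1000000;
  return offset;
}

/* Compute a relative jump target.  Both ip and the result point at
   32-bit units, so the offset is applied with plain pointer
   arithmetic.  */
static const uint32_t *
jump_target (const uint32_t *ip, uint32_t word)
{
  assert (ip);
  return ip + decode_l24 (word);
}
```

The sign extension is written out explicitly rather than relying on a
right shift of a negative value, whose result is implementation-defined
in C.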
  526. An instruction is specified by giving its name, then describing its
  527. operands. The operands are packed by 32-bit words, with earlier
  528. operands occupying the lower bits.
  529. For example, consider the following instruction specification:
  530. @deftypefn Instruction {} call f24:@var{proc} x8:@var{_} c24:@var{nlocals}
  531. @end deftypefn
The first word in the instruction will start with the 8-bit value
corresponding to the @code{call} opcode in the low bits, followed by
@var{proc} as a 24-bit value. The second word starts with 8 dead bits,
followed by @var{nlocals} as a 24-bit immediate value.
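To make the packing concrete, here is a minimal C sketch that assembles
and decodes the two words of a @code{call} instruction. The opcode
value @code{EXAMPLE_OP_CALL} and the helper names are assumptions made
for this example only; the real opcode numbers are assigned by Guile's
VM sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode number, for illustration only.  */
#define EXAMPLE_OP_CALL 0x01u

/* Pack "call f24:proc x8:_ c24:nlocals" into two 32-bit words.
   Earlier operands occupy the lower bits of each word.  */
static void
encode_call (uint32_t proc, uint32_t nlocals, uint32_t out[2])
{
  assert (proc < (1u << 24) && nlocals < (1u << 24));
  out[0] = EXAMPLE_OP_CALL | (proc << 8);  /* opcode, then f24:proc */
  out[1] = nlocals << 8;                   /* x8:_ (zero), then c24:nlocals */
}

/* The opcode lives in the low 8 bits of the first word.  */
static uint32_t
decode_opcode (uint32_t word)
{
  return word & 0xffu;
}

/* A 24-bit operand that follows an 8-bit field.  */
static uint32_t
decode_high24 (uint32_t word)
{
  return word >> 8;
}
```

With these helpers, encoding @var{proc} 5 and @var{nlocals} 7 yields
the words @code{0x00000501} and @code{0x00000700}.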
  536. For instructions with operands that encode references to the stack, the
  537. interpretation of those stack values is up to the instruction itself.
  538. Most instructions expect their operands to be tagged SCM values
  539. (@code{scm} representation), but some instructions expect unboxed
  540. integers (@code{u64} and @code{s64} representations) or floating-point
  541. numbers (@code{f64} representation). It is assumed that the bits for a
  542. @code{u64} value are the same as those for an @code{s64} value, and that
  543. @code{s64} values are stored in two's complement.
  544. Instructions have static types: they must receive their operands in the
  545. format they expect. It's up to the compiler to ensure this is the case.
  546. Unless otherwise mentioned, all operands and results are in the
  547. @code{scm} representation.
  548. @menu
  549. * Call and Return Instructions::
  550. * Function Prologue Instructions::
  551. * Shuffling Instructions::
  552. * Trampoline Instructions::
  553. * Non-Local Control Flow Instructions::
  554. * Instrumentation Instructions::
  555. * Intrinsic Call Instructions::
  556. * Constant Instructions::
  557. * Memory Access Instructions::
  558. * Atomic Memory Access Instructions::
  559. * Tagging and Untagging Instructions::
  560. * Integer Arithmetic Instructions::
  561. * Floating-Point Arithmetic Instructions::
  562. * Comparison Instructions::
  563. * Branch Instructions::
  564. * Raw Memory Access Instructions::
  565. @end menu
  566. @node Call and Return Instructions
  567. @subsubsection Call and Return Instructions
  568. As described earlier (@pxref{Stack Layout}), Guile's calling convention
  569. is that arguments are passed and values returned on the stack.
  570. For calls, both in tail position and in non-tail position, we require
  571. that the procedure and the arguments already be shuffled into place
  572. before the call instruction. ``Into place'' for a tail call means that
  573. the procedure should be in slot 0, relative to the @code{fp}, and the
  574. arguments should follow. For a non-tail call, if the procedure is in
  575. @code{fp}-relative slot @var{n}, the arguments should follow from slot
  576. @var{n}+1, and there should be three free slots between @var{n}-1 and
  577. @var{n}-3 in which to save the mRA, vRA, and @code{fp}.
  578. Returning values is similar. Multiple-value returns should have values
  579. already shuffled down to start from @code{fp}-relative slot 0 before
  580. emitting @code{return-values}.
  581. In both calls and returns, the @code{sp} is used to indicate to the
  582. callee or caller the number of arguments or return values, respectively.
  583. After receiving return values, it is the caller's responsibility to
  584. @dfn{restore the frame} by resetting the @code{sp} to its former value.
  585. @deftypefn Instruction {} call f24:@var{proc} x8:@var{_} c24:@var{nlocals}
  586. Call a procedure. @var{proc} is the local corresponding to a procedure.
  587. The three values below @var{proc} will be overwritten by the saved call
  588. frame data. The new frame will have space for @var{nlocals} locals: one
  589. for the procedure, and the rest for the arguments which should already
  590. have been pushed on.
  591. When the call returns, execution proceeds with the next instruction.
  592. There may be any number of values on the return stack; the precise
  593. number can be had by subtracting the address of @var{proc}-1 from the
  594. post-call @code{sp}.
  595. @end deftypefn
  596. @deftypefn Instruction {} call-label f24:@var{proc} x8:@var{_} c24:@var{nlocals} l32:@var{label}
  597. Call a procedure in the same compilation unit.
  598. This instruction is just like @code{call}, except that instead of
  599. dereferencing @var{proc} to find the call target, the call target is
  600. known to be at @var{label}, a signed 32-bit offset in 32-bit units from
  601. the current @code{ip}. Since @var{proc} is not dereferenced, it may be
  602. some other representation of the closure.
  603. @end deftypefn
  604. @deftypefn Instruction {} tail-call x24:@var{_}
  605. Tail-call a procedure. Requires that the procedure and all of the
  606. arguments have already been shuffled into position, and that the frame
  607. has already been reset to the number of arguments to the call.
  608. @end deftypefn
  609. @deftypefn Instruction {} tail-call-label x24:@var{_} l32:@var{label}
  610. Tail-call a known procedure. As @code{call} is to @code{call-label},
  611. @code{tail-call} is to @code{tail-call-label}.
  612. @end deftypefn
  613. @deftypefn Instruction {} return-values x24:@var{_}
  614. Return a number of values from a call frame. The return values should
  615. have already been shuffled down to a contiguous array starting at slot
  616. 0, and the frame already reset.
  617. @end deftypefn
  618. @deftypefn Instruction {} receive f12:@var{dst} f12:@var{proc} x8:@var{_} c24:@var{nlocals}
  619. Receive a single return value from a call whose procedure was in
  620. @var{proc}, asserting that the call actually returned at least one
  621. value. Afterwards, resets the frame to @var{nlocals} locals.
  622. @end deftypefn
  623. @deftypefn Instruction {} receive-values f24:@var{proc} b1:@var{allow-extra?} x7:@var{_} c24:@var{nvalues}
  624. Receive a return of multiple values from a call whose procedure was in
  625. @var{proc}. If fewer than @var{nvalues} values were returned, signal an
  626. error. Unless @var{allow-extra?} is true, require that the number of
  627. return values equals @var{nvalues} exactly. After @code{receive-values}
  628. has run, the values can be copied down via @code{mov}, or used in place.
  629. @end deftypefn
  630. @node Function Prologue Instructions
  631. @subsubsection Function Prologue Instructions
  632. A function call in Guile is very cheap: the VM simply hands control to
  633. the procedure. The procedure itself is responsible for asserting that it
  634. has been passed an appropriate number of arguments. This strategy allows
  635. arbitrarily complex argument parsing idioms to be developed, without
  636. harming the common case.
  637. For example, only calls to keyword-argument procedures ``pay'' for the
  638. cost of parsing keyword arguments. (At the time of this writing, calling
  639. procedures with keyword arguments is typically two to four times as
  640. costly as calling procedures with a fixed set of arguments.)
  641. @deftypefn Instruction {} assert-nargs-ee c24:@var{expected}
  642. @deftypefnx Instruction {} assert-nargs-ge c24:@var{expected}
  643. @deftypefnx Instruction {} assert-nargs-le c24:@var{expected}
  644. If the number of actual arguments is not @code{==}, @code{>=}, or
  645. @code{<=} @var{expected}, respectively, signal an error.
  646. The number of arguments is determined by subtracting the stack pointer
  647. from the frame pointer (@code{fp - sp}). @xref{Stack Layout}, for more
  648. details on stack frames. Note that @var{expected} includes the
  649. procedure itself.
  650. @end deftypefn
  651. @deftypefn Instruction {} arguments<=? c24:@var{expected}
  652. Set the @code{LESS_THAN}, @code{EQUAL}, or @code{NONE} comparison result
  653. values if the number of arguments is respectively less than, equal to,
  654. or greater than @var{expected}.
  655. @end deftypefn
  656. @deftypefn Instruction {} positional-arguments<=? c24:@var{nreq} x8:@var{_} c24:@var{expected}
  657. Set the @code{LESS_THAN}, @code{EQUAL}, or @code{NONE} comparison result
  658. values if the number of positional arguments is respectively less than,
  659. equal to, or greater than @var{expected}. The first @var{nreq}
  660. arguments are positional arguments, as are the subsequent arguments that
  661. are not keywords.
  662. @end deftypefn
  663. The @code{arguments<=?} and @code{positional-arguments<=?} instructions
  664. are used to implement multiple arities, as in @code{case-lambda}.
  665. @xref{Case-lambda}, for more information. @xref{Branch Instructions},
  666. for more on comparison results.
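The comparison performed by @code{arguments<=?} can be modelled roughly
as follows. This is only a sketch: the enum, the function names, and
the way a dispatcher might consume the result are assumptions for
illustration, not Guile's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names mirroring the comparison results in the text.  */
enum compare_result { CMP_NONE, CMP_EQUAL, CMP_LESS_THAN };

/* Model of arguments<=?: classify the argument count relative to
   EXPECTED, which is a c24 operand.  */
static enum compare_result
compare_nargs (uint32_t nargs, uint32_t expected)
{
  assert (expected < (1u << 24));
  if (nargs < expected)
    return CMP_LESS_THAN;
  if (nargs == expected)
    return CMP_EQUAL;
  return CMP_NONE;   /* more arguments than expected */
}

/* Assumed dispatcher logic: a clause that takes at most EXPECTED
   arguments accepts the call when the result is LESS_THAN or EQUAL.  */
static int
clause_accepts (uint32_t nargs, uint32_t expected)
{
  enum compare_result r = compare_nargs (nargs, expected);
  return r == CMP_LESS_THAN || r == CMP_EQUAL;
}
```

A @code{case-lambda} dispatcher could run one such test per clause and
branch to the next clause whenever the test fails.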
  667. @deftypefn Instruction {} bind-kwargs c24:@var{nreq} c8:@var{flags} c24:@var{nreq-and-opt} x8:@var{_} c24:@var{ntotal} n32:@var{kw-offset}
  668. @var{flags} is a bitfield, whose lowest bit is @var{allow-other-keys},
  669. second bit is @var{has-rest}, and whose following six bits are unused.
  670. Find the last positional argument, and shuffle all the rest above
  671. @var{ntotal}. Initialize the intervening locals to
  672. @code{SCM_UNDEFINED}. Then load the constant at @var{kw-offset} words
  673. from the current @var{ip}, and use it and the @var{allow-other-keys}
  674. flag to bind keyword arguments. If @var{has-rest}, collect all shuffled
  675. arguments into a list, and store it in @var{nreq-and-opt}. Finally,
  676. clear the arguments that we shuffled up.
  677. The parsing is driven by a keyword arguments association list, looked up
  678. using @var{kw-offset}. The alist is a list of pairs of the form
  679. @code{(@var{kw} . @var{index})}, mapping keyword arguments to their
  680. local slot indices. Unless @code{allow-other-keys} is set, the parser
  681. will signal an error if an unknown key is found.
  682. A macro-mega-instruction.
  683. @end deftypefn
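As an illustration of the @var{flags} bitfield, the following C sketch
decodes the second word of a @code{bind-kwargs} instruction, which
holds @code{c8:@var{flags}} in its low 8 bits and
@code{c24:@var{nreq-and-opt}} above them. The macro, struct, and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two flag bits described above.  */
#define KW_ALLOW_OTHER_KEYS 0x1u   /* lowest bit */
#define KW_HAS_REST         0x2u   /* second bit */

struct kwargs_word
{
  int allow_other_keys;
  int has_rest;
  uint32_t nreq_and_opt;
};

/* Decode the second word of bind-kwargs: c8:flags in the low 8 bits,
   c24:nreq-and-opt in the upper 24 bits.  The six unused flag bits
   are ignored.  */
static void
decode_kwargs_word (uint32_t word, struct kwargs_word *out)
{
  uint32_t flags = word & 0xffu;
  assert (out);
  out->allow_other_keys = (flags & KW_ALLOW_OTHER_KEYS) != 0;
  out->has_rest = (flags & KW_HAS_REST) != 0;
  out->nreq_and_opt = word >> 8;
}
```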
  684. @deftypefn Instruction {} bind-optionals f24:@var{nlocals}
  685. Expand the current frame to have at least @var{nlocals} locals, filling
  686. in any fresh values with @code{SCM_UNDEFINED}. If the frame has more
  687. than @var{nlocals} locals, it is left as it is.
  688. @end deftypefn
  689. @deftypefn Instruction {} bind-rest f24:@var{dst}
  690. Collect any arguments at or above @var{dst} into a list, and store that
  691. list at @var{dst}.
  692. @end deftypefn
  693. @deftypefn Instruction {} alloc-frame c24:@var{nlocals}
  694. Ensure that there is space on the stack for @var{nlocals} local
  695. variables. The value of any new local is undefined.
  696. @end deftypefn
  697. @deftypefn Instruction {} reset-frame c24:@var{nlocals}
  698. Like @code{alloc-frame}, but doesn't check that the stack is big enough,
  699. and doesn't initialize values to @code{SCM_UNDEFINED}. Used to reset
  700. the frame size to something less than the size that was previously set
via @code{alloc-frame}.
  702. @end deftypefn
  703. @deftypefn Instruction {} assert-nargs-ee/locals c12:@var{expected} c12:@var{nlocals}
  704. Equivalent to a sequence of @code{assert-nargs-ee} and
@code{alloc-frame}. The number of locals reserved is @var{expected}
  706. + @var{nlocals}.
  707. @end deftypefn
  708. @node Shuffling Instructions
  709. @subsubsection Shuffling Instructions
  710. These instructions are used to move around values on the stack.
  711. @deftypefn Instruction {} mov s12:@var{dst} s12:@var{src}
  712. @deftypefnx Instruction {} long-mov s24:@var{dst} x8:@var{_} s24:@var{src}
  713. Copy a value from one local slot to another.
  714. As discussed previously, procedure arguments and local variables are
  715. allocated to local slots. Guile's compiler tries to avoid shuffling
  716. variables around to different slots, which often makes @code{mov}
instructions redundant. However, there are some cases in which shuffling
  718. is necessary, and in those cases, @code{mov} is the thing to use.
  719. @end deftypefn
  720. @deftypefn Instruction {} long-fmov f24:@var{dst} x8:@var{_} f24:@var{src}
  721. Copy a value from one local slot to another, but addressing slots
  722. relative to the @code{fp} instead of the @code{sp}. This is used when
  723. shuffling values into place after multiple-value returns.
  724. @end deftypefn
  725. @deftypefn Instruction {} push s24:@var{src}
  726. Bump the stack pointer by one word, and fill it with the value from slot
  727. @var{src}. The offset to @var{src} is calculated before the stack
  728. pointer is adjusted.
  729. @end deftypefn
  730. The @code{push} instruction is used when another instruction is unable
  731. to address an operand because the operand is encoded with fewer than 24
  732. bits. In that case, Guile's assembler will transparently emit code that
  733. temporarily pushes any needed operands onto the stack, emits the
  734. original instruction to address those now-near variables, then shuffles
  735. the result (if any) back into place.
  736. @deftypefn Instruction {} pop s24:@var{dst}
  737. Pop the stack pointer, storing the value that was there in slot
  738. @var{dst}. The offset to @var{dst} is calculated after the stack
  739. pointer is adjusted.
  740. @end deftypefn
  741. @deftypefn Instruction {} drop c24:@var{count}
  742. Pop the stack pointer by @var{count} words, discarding any values that
  743. were stored there.
  744. @end deftypefn
  745. @deftypefn Instruction {} shuffle-down f12:@var{from} f12:@var{to}
  746. Shuffle down values from @var{from} to @var{to}, reducing the frame size
by @var{from}-@var{to} slots. Part of the internal implementation of
  748. @code{call-with-values}, @code{values}, and @code{apply}.
  749. @end deftypefn
  750. @deftypefn Instruction {} expand-apply-argument x24:@var{_}
  751. Take the last local in a frame and expand it out onto the stack, as for
  752. the last argument to @code{apply}.
  753. @end deftypefn
  754. @node Trampoline Instructions
  755. @subsubsection Trampoline Instructions
  756. Though most applicable objects in Guile are procedures implemented in
  757. bytecode, not all are. There are primitives, continuations, and other
  758. procedure-like objects that have their own calling convention. Instead
  759. of adding special cases to the @code{call} instruction, Guile wraps
  760. these other applicable objects in VM trampoline procedures, then
  761. provides special support for these objects in bytecode.
  762. Trampoline procedures are typically generated by Guile at runtime, for
  763. example in response to a call to @code{scm_c_make_gsubr}. As such, a
  764. compiler probably shouldn't emit code with these instructions. However,
  765. it's still interesting to know how these things work, so we document
  766. these trampoline instructions here.
  767. @deftypefn Instruction {} subr-call c24:@var{idx}
  768. Call a subr, passing all locals in this frame as arguments, and storing
  769. the results on the stack, ready to be returned.
  770. @end deftypefn
  771. @deftypefn Instruction {} foreign-call c12:@var{cif-idx} c12:@var{ptr-idx}
  772. Call a foreign function. Fetch the @var{cif} and foreign pointer from
  773. @var{cif-idx} and @var{ptr-idx} closure slots of the callee. Arguments
  774. are taken from the stack, and results placed on the stack, ready to be
  775. returned.
  776. @end deftypefn
  777. @deftypefn Instruction {} builtin-ref s12:@var{dst} c12:@var{idx}
  778. Load a builtin stub by index into @var{dst}.
  779. @end deftypefn
  780. @node Non-Local Control Flow Instructions
  781. @subsubsection Non-Local Control Flow Instructions
  782. @deftypefn Instruction {} capture-continuation s24:@var{dst}
  783. Capture the current continuation, and write it to @var{dst}. Part of
  784. the implementation of @code{call/cc}.
  785. @end deftypefn
  786. @deftypefn Instruction {} continuation-call c24:@var{contregs}
  787. Return to a continuation, nonlocally. The arguments to the continuation
  788. are taken from the stack. @var{contregs} is a free variable containing
  789. the reified continuation.
  790. @end deftypefn
  791. @deftypefn Instruction {} abort x24:@var{_}
  792. Abort to a prompt handler. The tag is expected in slot 1, and the rest
  793. of the values in the frame are returned to the prompt handler. This
  794. corresponds to a tail application of @code{abort-to-prompt}.
  795. If no prompt can be found in the dynamic environment with the given tag,
  796. an error is signalled. Otherwise all arguments are passed to the
  797. prompt's handler, along with the captured continuation, if necessary.
  798. If the prompt's handler can be proven to not reference the captured
  799. continuation, no continuation is allocated. This decision happens
  800. dynamically, at run-time; the general case is that the continuation may
  801. be captured, and thus resumed. A reinstated continuation will have its
  802. arguments pushed on the stack from slot 0, as if from a multiple-value
  803. return, and control resumes in the caller. Thus to the calling
  804. function, a call to @code{abort-to-prompt} looks like any other function
  805. call.
  806. @end deftypefn
  807. @deftypefn Instruction {} compose-continuation c24:@var{cont}
  808. Compose a partial continuation with the current continuation. The
  809. arguments to the continuation are taken from the stack. @var{cont} is a
  810. free variable containing the reified continuation.
  811. @end deftypefn
  812. @deftypefn Instruction {} prompt s24:@var{tag} b1:@var{escape-only?} x7:@var{_} f24:@var{proc-slot} x8:@var{_} l24:@var{handler-offset}
  813. Push a new prompt on the dynamic stack, with a tag from @var{tag} and a
  814. handler at @var{handler-offset} words from the current @var{ip}.
  815. If an abort is made to this prompt, control will jump to the handler.
  816. The handler will expect a multiple-value return as if from a call with
  817. the procedure at @var{proc-slot}, with the reified partial continuation
  818. as the first argument, followed by the values returned to the handler.
  819. If control returns to the handler, the prompt is already popped off by
  820. the abort mechanism. (Guile's @code{prompt} implements Felleisen's
  821. @dfn{--F--} operator.)
  822. If @var{escape-only?} is nonzero, the prompt will be marked as
  823. escape-only, which allows an abort to this prompt to avoid reifying the
  824. continuation.
  825. @xref{Prompts}, for more information on prompts.
  826. @end deftypefn
  827. @deftypefn Instruction {} throw s12:@var{key} s12:@var{args}
  828. Raise an error by throwing to @var{key} and @var{args}. @var{args}
  829. should be a list.
  830. @end deftypefn
  831. @deftypefn Instruction {} throw/value s24:@var{value} n32:@var{key-subr-and-message}
  832. @deftypefnx Instruction {} throw/value+data s24:@var{value} n32:@var{key-subr-and-message}
Raise an error, indicating @var{value} as the bad value.
  834. @var{key-subr-and-message} should be a vector, where the first element
  835. is the symbol to which to throw, the second is the procedure in which to
  836. signal the error (a string) or @code{#f}, and the third is a format
  837. string for the message, with one template. These instructions do not
  838. fall through.
Both of these instructions throw to a key with four arguments: the
procedure that indicates the error (or @code{#f}), the format string, a
list with @var{value}, and either @code{#f} or the list with @var{value}
as the last argument, respectively.
  843. @end deftypefn
  844. @node Instrumentation Instructions
  845. @subsubsection Instrumentation Instructions
@deftypefn Instruction {} instrument-entry x24:@var{_} n32:@var{data}
@deftypefnx Instruction {} instrument-loop x24:@var{_} n32:@var{data}
  848. Increase execution counter for this function and potentially tier up to
  849. the next JIT level. @var{data} is an offset to a structure recording
  850. execution counts and the next-level JIT code corresponding to this
  851. function. The increment values are currently 30 for
  852. @code{instrument-entry} and 2 for @code{instrument-loop}.
  853. @code{instrument-entry} will also run the apply hook, if VM hooks are
  854. enabled.
  855. @end deftypefn
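The counting scheme can be sketched in C as follows. Only the two
increment values come from the text above; the structure layout, the
function name, and the threshold parameter are assumptions for
illustration and do not reflect Guile's actual data structures or
tuning.

```c
#include <assert.h>
#include <stdint.h>

/* Increment values as given in the text.  */
enum { ENTRY_INCREMENT = 30, LOOP_INCREMENT = 2 };

/* Hypothetical stand-in for the per-function data that the
   instruction's data operand refers to.  */
struct function_data
{
  uint32_t counter;       /* execution counter */
  const void *jit_code;   /* next-level JIT code, if any */
};

/* Bump the counter and report whether it has reached THRESHOLD, at
   which point a VM might tier up.  THRESHOLD is an assumed
   parameter.  */
static int
bump_counter (struct function_data *data, uint32_t increment,
              uint32_t threshold)
{
  assert (data);
  data->counter += increment;
  return data->counter >= threshold;
}
```

Weighting entries more heavily than loop iterations means that both
frequently called functions and long-running loops eventually reach
the threshold.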
  856. @deftypefn Instruction {} handle-interrupts x24:@var{_}
  857. Handle pending asynchronous interrupts (asyncs). @xref{Asyncs}. The
  858. compiler inserts @code{handle-interrupts} instructions before any call,
  859. return, or loop back-edge.
  860. @end deftypefn
  861. @deftypefn Instruction {} return-from-interrupt x24:@var{_}
  862. A special instruction to return from a call and also pop off the stack
  863. frame from the call. Used when returning from asynchronous interrupts.
  864. @end deftypefn
  865. @node Intrinsic Call Instructions
  866. @subsubsection Intrinsic Call Instructions
  867. Guile's instruction set is low-level. This is good because the separate
  868. components of, say, a @code{vector-ref} operation might be able to be
  869. optimized out, leaving only the operations that need to be performed at
  870. run-time.
However, some macro-operations may need to perform large amounts of
computation at run-time to handle all the edge cases, and their
micro-operation components aren't amenable to optimization.
  874. Residualizing code for the entire macro-operation would lead to code
  875. bloat with no benefit.
  876. In this kind of a case, Guile's VM calls out to @dfn{intrinsics}:
  877. run-time routines written in the host language (currently C, possibly
  878. more in the future if Guile gains more run-time targets like
WebAssembly). There is one instruction for each intrinsic prototype;
  880. the intrinsic is specified by index in the instruction.
  881. @deftypefn Instruction {} call-thread x24:@var{_} c32:@var{idx}
Call the @code{void}-returning intrinsic with index @var{idx}, passing
  883. the current @code{scm_thread*} as the argument.
  884. @end deftypefn
  885. @deftypefn Instruction {} call-thread-scm s24:@var{a} c32:@var{idx}
Call the @code{void}-returning intrinsic with index @var{idx}, passing
  887. the current @code{scm_thread*} and the @code{scm} local @var{a} as
  888. arguments.
  889. @end deftypefn
  890. @deftypefn Instruction {} call-thread-scm-scm s12:@var{a} s12:@var{b} c32:@var{idx}
Call the @code{void}-returning intrinsic with index @var{idx}, passing
  892. the current @code{scm_thread*} and the @code{scm} locals @var{a} and
  893. @var{b} as arguments.
  894. @end deftypefn
@deftypefn Instruction {} call-scm-sz-u32 s8:@var{a} s8:@var{b} s8:@var{c} c32:@var{idx}
Call the @code{void}-returning intrinsic with index @var{idx}, passing
  897. the locals @var{a}, @var{b}, and @var{c} as arguments. @var{a} is a
  898. @code{scm} value, while @var{b} and @var{c} are raw @code{u64} values
  899. which fit into @code{size_t} and @code{uint32_t} types, respectively.
  900. @end deftypefn
@deftypefn Instruction {} call-scm<-thread s24:@var{dst} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  903. the current @code{scm_thread*} as the argument. Place the result in
  904. @var{dst}.
  905. @end deftypefn
  906. @deftypefn Instruction {} call-scm<-u64 s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  908. @code{u64} local @var{a} as the argument. Place the result in
  909. @var{dst}.
  910. @end deftypefn
  911. @deftypefn Instruction {} call-scm<-s64 s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  913. @code{s64} local @var{a} as the argument. Place the result in
  914. @var{dst}.
  915. @end deftypefn
  916. @deftypefn Instruction {} call-scm<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  918. @code{scm} local @var{a} as the argument. Place the result in
  919. @var{dst}.
  920. @end deftypefn
  921. @deftypefn Instruction {} call-u64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{uint64_t}-returning intrinsic with index @var{idx},
  923. passing @code{scm} local @var{a} as the argument. Place the @code{u64}
  924. result in @var{dst}.
  925. @end deftypefn
  926. @deftypefn Instruction {} call-s64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{int64_t}-returning intrinsic with index @var{idx},
  928. passing @code{scm} local @var{a} as the argument. Place the @code{s64}
  929. result in @var{dst}.
  930. @end deftypefn
  931. @deftypefn Instruction {} call-f64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{double}-returning intrinsic with index @var{idx},
  933. passing @code{scm} local @var{a} as the argument. Place the @code{f64}
  934. result in @var{dst}.
  935. @end deftypefn
  936. @deftypefn Instruction {} call-scm<-scm-scm s8:@var{dst} s8:@var{a} s8:@var{b} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  938. @code{scm} locals @var{a} and @var{b} as arguments. Place the
  939. @code{scm} result in @var{dst}.
  940. @end deftypefn
  941. @deftypefn Instruction {} call-scm<-scm-uimm s8:@var{dst} s8:@var{a} c8:@var{b} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  943. @code{scm} local @var{a} and @code{uint8_t} immediate @var{b} as
  944. arguments. Place the @code{scm} result in @var{dst}.
  945. @end deftypefn
  946. @deftypefn Instruction {} call-scm<-thread-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  948. the current @code{scm_thread*} and @code{scm} local @var{a} as
  949. arguments. Place the @code{scm} result in @var{dst}.
  950. @end deftypefn
  951. @deftypefn Instruction {} call-scm<-scm-u64 s8:@var{dst} s8:@var{a} s8:@var{b} c32:@var{idx}
Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
  953. @code{scm} local @var{a} and @code{u64} local @var{b} as arguments.
  954. Place the @code{scm} result in @var{dst}.
  955. @end deftypefn
  956. There are corresponding macro-instructions for specific intrinsics.
These are equivalent to @code{call-@var{intrinsic-kind}} instructions
  958. with the appropriate intrinsic @var{idx} arguments.
  959. @deffn {Macro Instruction} add dst a b
  960. @deffnx {Macro Instruction} add/immediate dst a b/imm
  961. Add @code{SCM} values @var{a} and @var{b} and place the result in
  962. @var{dst}.
  963. @end deffn
  964. @deffn {Macro Instruction} sub dst a b
  965. @deffnx {Macro Instruction} sub/immediate dst a b/imm
  966. Subtract @code{SCM} value @var{b} from @var{a} and place the result in
  967. @var{dst}.
  968. @end deffn
  969. @deffn {Macro Instruction} mul dst a b
  970. Multiply @code{SCM} values @var{a} and @var{b} and place the result in
  971. @var{dst}.
  972. @end deffn
  973. @deffn {Macro Instruction} div dst a b
  974. Divide @code{SCM} value @var{a} by @var{b} and place the result in
  975. @var{dst}.
  976. @end deffn
  977. @deffn {Macro Instruction} quo dst a b
  978. Compute the quotient of @code{SCM} values @var{a} and @var{b} and place
  979. the result in @var{dst}.
  980. @end deffn
  981. @deffn {Macro Instruction} rem dst a b
  982. Compute the remainder of @code{SCM} values @var{a} and @var{b} and place
  983. the result in @var{dst}.
  984. @end deffn
  985. @deffn {Macro Instruction} mod dst a b
  986. Compute the modulo of @code{SCM} value @var{a} by @var{b} and place the
  987. result in @var{dst}.
  988. @end deffn
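The difference between @code{quo}, @code{rem} and @code{mod} only shows
up with negative operands. The following Python sketch assumes that
these macro-instructions follow Scheme's @code{quotient},
@code{remainder} and @code{modulo}: a quotient truncated toward zero, a
remainder with the sign of @var{a}, and a modulo with the sign of
@var{b}. The function names are illustrative only and are not part of
Guile.

```python
def quo(a, b):
    # Truncate toward zero, unlike Python's floor division.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def rem(a, b):
    # The remainder takes the sign of the dividend a.
    return a - b * quo(a, b)


def mod(a, b):
    # Python's % already takes the sign of the divisor b.
    return a % b


print(quo(-7, 2), rem(-7, 2), mod(-7, 2))
```

For positive operands all three definitions agree with ordinary integer
division, so the choice only matters when signs differ.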
  989. @deffn {Macro Instruction} logand dst a b
  990. Compute the bitwise @code{and} of @code{SCM} values @var{a} and @var{b}
  991. and place the result in @var{dst}.
  992. @end deffn
  993. @deffn {Macro Instruction} logior dst a b
  994. Compute the bitwise inclusive @code{or} of @code{SCM} values @var{a} and
  995. @var{b} and place the result in @var{dst}.
  996. @end deffn
  997. @deffn {Macro Instruction} logxor dst a b
  998. Compute the bitwise exclusive @code{or} of @code{SCM} values @var{a} and
  999. @var{b} and place the result in @var{dst}.
  1000. @end deffn
  1001. @deffn {Macro Instruction} logsub dst a b
  1002. Compute the bitwise @code{and} of @code{SCM} value @var{a} and the
  1003. bitwise @code{not} of @var{b} and place the result in @var{dst}.
  1004. @end deffn
  1005. @deffn {Macro Instruction} lsh dst a b
  1006. @deffnx {Macro Instruction} lsh/immediate dst a b/imm
  1007. Shift @code{SCM} value @var{a} left by @code{u64} value @var{b} bits and
  1008. place the result in @var{dst}.
  1009. @end deffn
  1010. @deffn {Macro Instruction} rsh dst a b
  1011. @deffnx {Macro Instruction} rsh/immediate dst a b/imm
  1012. Shift @code{SCM} value @var{a} right by @code{u64} value @var{b} bits
  1013. and place the result in @var{dst}.
  1014. @end deffn
  1015. @deffn {Macro Instruction} scm->f64 dst src
  1016. Convert @var{src} to an unboxed @code{f64} and place the result in
  1017. @var{dst}, or raise an error if @var{src} is not a real number.
  1018. @end deffn
  1019. @deffn {Macro Instruction} scm->u64 dst src
  1020. Convert @var{src} to an unboxed @code{u64} and place the result in
  1021. @var{dst}, or raise an error if @var{src} is not an integer within
  1022. range.
  1023. @end deffn
  1024. @deffn {Macro Instruction} scm->u64/truncate dst src
  1025. Convert @var{src} to an unboxed @code{u64} and place the result in
  1026. @var{dst}, truncating to the low 64 bits, or raise an error if
  1027. @var{src} is not an integer.
  1028. @end deffn
  1029. @deffn {Macro Instruction} scm->s64 dst src
  1030. Convert @var{src} to an unboxed @code{s64} and place the result in
  1031. @var{dst}, or raise an error if @var{src} is not an integer within
  1032. range.
  1033. @end deffn
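The checked and truncating conversions differ only in how they treat
integers outside the target range. The Python sketch below models that
difference with arbitrary-precision integers standing in for Scheme
integers. The function names and the choice of exception types are
assumptions made for illustration.

```python
U64_MAX = 2**64 - 1
S64_MIN, S64_MAX = -(2**63), 2**63 - 1


def scm_to_u64(n):
    # Checked conversion: reject anything outside [0, 2^64 - 1].
    if not isinstance(n, int) or not 0 <= n <= U64_MAX:
        raise ValueError("not an integer within u64 range")
    return n


def scm_to_u64_truncate(n):
    # Truncating conversion: keep only the low 64 bits.
    if not isinstance(n, int):
        raise TypeError("not an integer")
    return n & U64_MAX


def scm_to_s64(n):
    # Checked conversion: reject anything outside [-2^63, 2^63 - 1].
    if not isinstance(n, int) or not S64_MIN <= n <= S64_MAX:
        raise ValueError("not an integer within s64 range")
    return n


print(scm_to_u64_truncate(2**64 + 5))
```

In this model @code{scm_to_u64_truncate(-1)} yields the all-ones
@code{u64}, whereas @code{scm_to_u64(-1)} signals an error.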
  1034. @deffn {Macro Instruction} u64->scm dst src
  1035. Convert @code{u64} value @var{src} to a Scheme integer in @var{dst}.
  1036. @end deffn
  1037. @deffn {Macro Instruction} s64->scm dst src
  1038. Convert @code{s64} value @var{src} to a Scheme integer in @var{dst}.
  1039. @end deffn
  1040. @deffn {Macro Instruction} string-set! str idx ch
  1041. Set the character at index @var{idx} (a @code{u64}) of string @var{str} to
  1042. @var{ch} (a @code{u64} that is a valid character value).
  1043. @end deffn
  1044. @deffn {Macro Instruction} string->number dst src
  1045. Call @code{string->number} on @var{src} and place the result in
  1046. @var{dst}.
  1047. @end deffn
  1048. @deffn {Macro Instruction} string->symbol dst src
  1049. Call @code{string->symbol} on @var{src} and place the result in
  1050. @var{dst}.
  1051. @end deffn
  1052. @deffn {Macro Instruction} symbol->keyword dst src
  1053. Call @code{symbol->keyword} on @var{src} and place the result in
  1054. @var{dst}.
  1055. @end deffn
  1056. @deffn {Macro Instruction} class-of dst src
  1057. Set @var{dst} to the GOOPS class of @var{src}.
  1058. @end deffn
  1059. @deffn {Macro Instruction} wind winder unwinder
  1060. Push wind and unwind procedures onto the dynamic stack. Note that
  1061. neither is actually called; the compiler should emit calls to
  1062. @var{winder} and @var{unwinder} for the normal dynamic-wind control
  1063. flow. Also note that the compiler should have inserted checks that
  1064. @var{winder} and @var{unwinder} are thunks, if it could not prove that
  1065. to be the case. @xref{Dynamic Wind}.
  1066. @end deffn
  1067. @deffn {Macro Instruction} unwind
  1068. Exit from the dynamic extent of an expression, popping the top entry off
  1069. of the dynamic stack.
  1070. @end deffn
  1071. @deffn {Macro Instruction} push-fluid fluid value
  1072. Dynamically bind @var{value} to @var{fluid} by creating a with-fluids
  1073. object, pushing that object on the dynamic stack. @xref{Fluids and
  1074. Dynamic States}.
  1075. @end deffn
  1076. @deffn {Macro Instruction} pop-fluid
  1077. Leave the dynamic extent of a @code{with-fluid*} expression, restoring
  1078. the fluid to its previous value. @code{push-fluid} should always be
  1079. balanced with @code{pop-fluid}.
  1080. @end deffn
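The balancing requirement between @code{push-fluid} and
@code{pop-fluid} can be pictured with a small Python model. This is a
sketch only: the @code{Fluid} class and the tuple saved on the stack are
stand-ins for Guile's fluids and with-fluids objects, whose real
representation is not described here.

```python
class Fluid:
    def __init__(self, default):
        self.value = default


dynamic_stack = []


def push_fluid(fluid, value):
    # Save the old binding on the dynamic stack, then install the new one.
    dynamic_stack.append((fluid, fluid.value))
    fluid.value = value


def pop_fluid():
    # Restore the binding saved by the matching push_fluid.
    fluid, saved = dynamic_stack.pop()
    fluid.value = saved


f = Fluid("outer")
push_fluid(f, "inner")
seen_inside = f.value
pop_fluid()
print(seen_inside, f.value)
```

An unmatched @code{pop_fluid} in this model would restore the wrong
entry or fail on an empty stack, which is why the two must always be
paired.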
  1081. @deffn {Macro Instruction} fluid-ref dst fluid
  1082. Place the value associated with the fluid @var{fluid} in @var{dst}.
  1083. @end deffn
  1084. @deffn {Macro Instruction} fluid-set! fluid value
  1085. Set the value of the fluid @var{fluid} to @var{value}.
  1086. @end deffn
  1087. @deffn {Macro Instruction} push-dynamic-state state
  1088. Save the current set of fluid bindings on the dynamic stack and instate
  1089. the bindings from @var{state} instead. @xref{Fluids and Dynamic
  1090. States}.
  1091. @end deffn
  1092. @deffn {Macro Instruction} pop-dynamic-state
  1093. Restore a saved set of fluid bindings from the dynamic stack.
  1094. @code{push-dynamic-state} should always be balanced with
  1095. @code{pop-dynamic-state}.
  1096. @end deffn
  1097. @deffn {Macro Instruction} resolve-module dst name public?
  1098. Look up the module named @var{name}, resolve its public interface if the
  1099. immediate operand @var{public?} is true, then place the result in
  1100. @var{dst}.
  1101. @end deffn
  1102. @deffn {Macro Instruction} lookup dst mod sym
  1103. Look up @var{sym} in module @var{mod}, placing the resulting variable
  1104. (or @code{#f} if not found) in @var{dst}.
  1105. @end deffn
  1106. @deffn {Macro Instruction} define! dst mod sym
  1107. Look up @var{sym} in module @var{mod}, placing the resulting variable in
  1108. @var{dst}, creating the variable if needed.
  1109. @end deffn
  1110. @deffn {Macro Instruction} current-module dst
  1111. Set @var{dst} to the current module.
  1112. @end deffn
  1113. @node Constant Instructions
  1114. @subsubsection Constant Instructions
  1115. The following instructions load literal data into a program. There are
  1116. two kinds.
  1117. The first set of instructions loads immediate values. These
  1118. instructions encode the immediate directly into the instruction stream.
  1119. @deftypefn Instruction {} make-short-immediate s8:@var{dst} i16:@var{low-bits}
  1120. Make an immediate whose low bits are @var{low-bits}, and whose top bits are
  1121. 0.
  1122. @end deftypefn
  1123. @deftypefn Instruction {} make-long-immediate s24:@var{dst} i32:@var{low-bits}
  1124. Make an immediate whose low bits are @var{low-bits}, and whose top bits are
  1125. 0.
  1126. @end deftypefn
  1127. @deftypefn Instruction {} make-long-long-immediate s24:@var{dst} a32:@var{high-bits} b32:@var{low-bits}
  1128. Make an immediate with @var{high-bits} and @var{low-bits}.
  1129. @end deftypefn
  1130. Non-immediate constant literals are referenced either directly or
  1131. indirectly. For example, Guile knows at compile time what the layout of
  1132. a string will be, and arranges to embed that object directly in the
  1133. compiled image. A reference to a string will use
  1134. @code{make-non-immediate} to treat a pointer into the compilation unit
  1135. as a @code{scm} value directly.
  1136. @deftypefn Instruction {} make-non-immediate s24:@var{dst} n32:@var{offset}
  1137. Load a pointer to statically allocated memory into @var{dst}. The
  1138. object's memory will be found @var{offset} 32-bit words away from the
  1139. current instruction pointer. Whether the object is mutable or immutable
  1140. depends on where it was allocated by the compiler, and loaded by the
  1141. loader.
  1142. @end deftypefn
  1143. Sometimes you need to load up a code pointer into a register; for this,
  1144. use @code{load-label}.
  1145. @deftypefn Instruction {} load-label s24:@var{dst} l32:@var{offset}
  1146. Load a label @var{offset} words away from the current @code{ip} and
  1147. write it to @var{dst}. @var{offset} is a signed 32-bit integer.
  1148. @end deftypefn
  1149. Finally, Guile supports a number of unboxed data types, with their
  1150. associated constant loaders.
  1151. @deftypefn Instruction {} load-f64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
  1152. Load a double-precision floating-point value formed by joining
  1153. @var{high-bits} and @var{low-bits}, and write it to @var{dst}.
  1154. @end deftypefn
  1155. @deftypefn Instruction {} load-u64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
  1156. Load an unsigned 64-bit integer formed by joining @var{high-bits} and
  1157. @var{low-bits}, and write it to @var{dst}.
  1158. @end deftypefn
  1159. @deftypefn Instruction {} load-s64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
  1160. Load a signed 64-bit integer formed by joining @var{high-bits} and
  1161. @var{low-bits}, and write it to @var{dst}.
  1162. @end deftypefn
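Joining @var{high-bits} and @var{low-bits} can be sketched in Python as
follows. The sketch assumes that @var{high-bits} supplies the upper 32
bits of the 64-bit pattern and that the @code{f64} case reinterprets
that pattern as an IEEE double. The helper names are hypothetical.

```python
import struct


def join_u64(high_bits, low_bits):
    # Concatenate two 32-bit halves into one 64-bit pattern.
    return ((high_bits & 0xFFFFFFFF) << 32) | (low_bits & 0xFFFFFFFF)


def join_s64(high_bits, low_bits):
    # Reinterpret the 64-bit pattern as a two's-complement integer.
    bits = join_u64(high_bits, low_bits)
    return bits - 2**64 if bits >= 2**63 else bits


def join_f64(high_bits, low_bits):
    # Reinterpret the 64-bit pattern as an IEEE double.
    raw = struct.pack("<Q", join_u64(high_bits, low_bits))
    return struct.unpack("<d", raw)[0]


print(join_u64(1, 0), join_s64(0xFFFFFFFF, 0xFFFFFFFF), join_f64(0x3FF00000, 0))
```

The same two operands therefore produce different values depending on
which loader interprets them.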
  1163. Some objects must be unique across the whole system. This is the case
  1164. for symbols and keywords. For these objects, Guile arranges to
  1165. initialize them when the compilation unit is loaded, storing them into a
  1166. slot in the image. References go indirectly through that slot.
  1167. @code{static-ref} is used in this case.
  1168. @deftypefn Instruction {} static-ref s24:@var{dst} r32:@var{offset}
  1169. Load a @code{scm} value into @var{dst}. The @code{scm} value will be fetched from
  1170. memory, @var{offset} 32-bit words away from the current instruction
  1171. pointer. @var{offset} is a signed value.
  1172. @end deftypefn
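The address arithmetic shared by these instruction-pointer-relative
operands can be sketched in Python. The sketch assumes that the
instruction pointer is a byte address and that a 32-bit word is four
bytes. @code{sign_extend} and @code{target_address} are illustrative
names.

```python
def sign_extend(value, bits):
    # Interpret the low `bits` bits of value as a two's-complement number.
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def target_address(ip, raw_offset):
    # raw_offset is the 32-bit operand as stored in the instruction stream;
    # it counts 32-bit words, so scale by 4 to get a byte distance.
    return ip + 4 * sign_extend(raw_offset, 32)


print(hex(target_address(0x1000, 3)), hex(target_address(0x1000, 0xFFFFFFFE)))
```

A negative offset thus refers to data placed before the instruction in
the image, and a positive one to data placed after it.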
  1173. Fields of non-immediates may need to be fixed up at load time, because
  1174. we do not know in advance at what address they will be loaded. This is
  1175. the case, for example, for a pair containing a non-immediate in one of
  1176. its fields. @code{static-set!} and @code{static-patch!} are used in
  1177. these situations.
  1178. @deftypefn Instruction {} static-set! s24:@var{src} lo32:@var{offset}
  1179. Store the @code{scm} value in @var{src} into memory, @var{offset} 32-bit words
  1180. away from the current instruction pointer. @var{offset} is a signed value.
  1181. @end deftypefn
  1182. @deftypefn Instruction {} static-patch! x24:@var{_} lo32:@var{dst-offset} l32:@var{src-offset}
  1183. Patch a pointer at @var{dst-offset} to point to @var{src-offset}. Both offsets
  1184. are signed 32-bit values, indicating a memory address as a number
  1185. of 32-bit words away from the current instruction pointer.
  1186. @end deftypefn
  1187. @node Memory Access Instructions
  1188. @subsubsection Memory Access Instructions
  1189. In these instructions, the @code{/immediate} variants represent their
  1190. indexes or counts as immediates; otherwise these values are unboxed u64
  1191. locals.
  1192. @deftypefn Instruction {} allocate-words s12:@var{dst} s12:@var{count}
  1193. @deftypefnx Instruction {} allocate-words/immediate s12:@var{dst} c12:@var{count}
  1194. Allocate a fresh GC-traced object consisting of @var{count} words and
  1195. store it into @var{dst}.
  1196. @end deftypefn
  1197. @deftypefn Instruction {} scm-ref s8:@var{dst} s8:@var{obj} s8:@var{idx}
  1198. @deftypefnx Instruction {} scm-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
  1199. Load the @code{SCM} object at word offset @var{idx} from local
  1200. @var{obj}, and store it to @var{dst}.
  1201. @end deftypefn
  1202. @deftypefn Instruction {} scm-set! s8:@var{obj} s8:@var{idx} s8:@var{val}
  1203. @deftypefnx Instruction {} scm-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
  1204. Store the @code{scm} local @var{val} into object @var{obj} at word
  1205. offset @var{idx}.
  1206. @end deftypefn
  1207. @deftypefn Instruction {} scm-ref/tag s8:@var{dst} s8:@var{obj} c8:@var{tag}
  1208. Load the first word of @var{obj}, subtract the immediate @var{tag}, and store the
  1209. resulting @code{SCM} to @var{dst}.
  1210. @end deftypefn
  1211. @deftypefn Instruction {} scm-set!/tag s8:@var{obj} c8:@var{tag} s8:@var{val}
  1212. Set the first word of @var{obj} to the unpacked bits of the @code{scm}
  1213. value @var{val} plus the immediate value @var{tag}.
  1214. @end deftypefn
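The two tagged accessors are inverses of each other, as the following
Python sketch shows. It models the heap as a list of integer words and
an object as an index into that list. Both are simplifications chosen
for illustration, and the function names are hypothetical.

```python
def scm_set_tag(heap, obj, tag, val_bits):
    # Store the value's bits plus the tag in the object's first word.
    heap[obj] = val_bits + tag


def scm_ref_tag(heap, obj, tag):
    # Recover the value's bits by subtracting the same tag.
    return heap[obj] - tag


heap = [0, 0, 0, 0]
scm_set_tag(heap, 1, 0x7, 0x5000)
print(hex(heap[1]), hex(scm_ref_tag(heap, 1, 0x7)))
```

Reading with a different tag than the one used when writing would
produce a value that is off by the difference between the two tags.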
  1215. @deftypefn Instruction {} word-ref s8:@var{dst} s8:@var{obj} s8:@var{idx}
  1216. @deftypefnx Instruction {} word-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
  1217. Load the word at offset @var{idx} from local @var{obj}, and store it to
  1218. the @code{u64} local @var{dst}.
  1219. @end deftypefn
  1220. @deftypefn Instruction {} word-set! s8:@var{obj} s8:@var{idx} s8:@var{val}
  1221. @deftypefnx Instruction {} word-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
  1222. Store the @code{u64} local @var{val} into object @var{obj} at word
  1223. offset @var{idx}.
  1224. @end deftypefn
  1225. @deftypefn Instruction {} pointer-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
  1226. Load the pointer at offset @var{idx} from local @var{obj}, and store it
  1227. to the unboxed pointer local @var{dst}.
  1228. @end deftypefn
  1229. @deftypefn Instruction {} pointer-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
  1230. Store the unboxed pointer local @var{val} into object @var{obj} at word
  1231. offset @var{idx}.
  1232. @end deftypefn
  1233. @deftypefn Instruction {} tail-pointer-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
  1234. Compute the address of word offset @var{idx} from local @var{obj}, and store it
  1235. to @var{dst}.
  1236. @end deftypefn
  1237. @node Atomic Memory Access Instructions
  1238. @subsubsection Atomic Memory Access Instructions
  1239. @deftypefn Instruction {} current-thread s24:@var{dst}
  1240. Write the current thread into @var{dst}.
  1241. @end deftypefn
  1242. @deftypefn Instruction {} atomic-scm-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
  1243. Atomically load the @code{SCM} object at word offset @var{idx} from
  1244. local @var{obj}, using the sequential consistency memory model. Store
  1245. the result to @var{dst}.
  1246. @end deftypefn
  1247. @deftypefn Instruction {} atomic-scm-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
  1248. Atomically set the @code{SCM} object at word offset @var{idx} from local
  1249. @var{obj} to @var{val}, using the sequential consistency memory model.
  1250. @end deftypefn
  1251. @deftypefn Instruction {} atomic-scm-swap!/immediate s24:@var{dst} x8:@var{_} s24:@var{obj} c8:@var{idx} s24:@var{val}
  1252. Atomically swap the @code{SCM} value stored in object @var{obj} at word
  1253. offset @var{idx} with @var{val}, using the sequentially consistent
  1254. memory model. Store the previous value to @var{dst}.
  1255. @end deftypefn
  1256. @deftypefn Instruction {} atomic-scm-compare-and-swap!/immediate s24:@var{dst} x8:@var{_} s24:@var{obj} c8:@var{idx} s24:@var{expected} x8:@var{_} s24:@var{desired}
  1257. Atomically swap the @code{SCM} value stored in object @var{obj} at word
  1258. offset @var{idx} with @var{desired}, if and only if the value that was
  1259. there was @var{expected}, using the sequentially consistent memory
  1260. model. Store the value that was previously at @var{idx} from @var{obj}
  1261. in @var{dst}.
  1262. @end deftypefn
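The contract of the swap and compare-and-swap instructions, in
particular that @var{dst} always receives the previous value whether or
not the store happened, can be modelled in Python. This sketch uses a
lock to stand in for hardware atomicity and plain integers for the
stored words. It illustrates the semantics and does not reflect how the
VM implements them.

```python
import threading


class Cell:
    def __init__(self, words):
        self.words = list(words)
        self.lock = threading.Lock()


def atomic_swap(obj, idx, val):
    # Unconditionally store val and return what was there before.
    with obj.lock:
        previous = obj.words[idx]
        obj.words[idx] = val
        return previous


def atomic_compare_and_swap(obj, idx, expected, desired):
    # Store desired only if the current word equals expected;
    # return the previous word in either case.
    with obj.lock:
        previous = obj.words[idx]
        if previous == expected:
            obj.words[idx] = desired
        return previous


cell = Cell([1, 2])
print(atomic_compare_and_swap(cell, 0, 1, 9), cell.words[0])
```

A caller can tell whether the exchange succeeded by comparing the
returned value with @var{expected}.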
  1263. @node Tagging and Untagging Instructions
  1264. @subsubsection Tagging and Untagging Instructions
  1265. @deftypefn Instruction {} tag-char s12:@var{dst} s12:@var{src}
  1266. Make a @code{SCM} character whose integer value is the @code{u64} in
  1267. @var{src}, and store it in @var{dst}.
  1268. @end deftypefn
  1269. @deftypefn Instruction {} untag-char s12:@var{dst} s12:@var{src}
  1270. Extract the integer value from the @code{SCM} character @var{src}, and
  1271. store the resulting @code{u64} in @var{dst}.
  1272. @end deftypefn
  1273. @deftypefn Instruction {} tag-fixnum s12:@var{dst} s12:@var{src}
  1274. Make a @code{SCM} integer whose value is the @code{s64} in @var{src},
  1275. and store it in @var{dst}.
  1276. @end deftypefn
  1277. @deftypefn Instruction {} untag-fixnum s12:@var{dst} s12:@var{src}
  1278. Extract the integer value from the @code{SCM} integer @var{src}, and
  1279. store the resulting @code{s64} in @var{dst}.
  1280. @end deftypefn
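Tagging and untagging are shift-and-mask operations. The Python sketch
below assumes a fixnum layout with a two-bit tag of value 2 and a
character layout with an eight-bit tag of value @code{0x0c}. These
constants are assumptions for illustration and should be checked against
the description in @ref{The SCM Type in Guile}. Python integers are
unbounded, so the sketch does not model the 64-bit word size.

```python
FIXNUM_TAG, FIXNUM_SHIFT = 2, 2    # assumed layout, see lead-in
CHAR_TAG, CHAR_SHIFT = 0x0C, 8     # assumed layout, see lead-in


def tag_fixnum(n):
    # Shift the s64 payload up and put the tag in the freed low bits.
    return (n << FIXNUM_SHIFT) | FIXNUM_TAG


def untag_fixnum(bits):
    # An arithmetic right shift drops the tag and keeps the sign.
    return bits >> FIXNUM_SHIFT


def tag_char(code_point):
    return (code_point << CHAR_SHIFT) | CHAR_TAG


def untag_char(bits):
    return bits >> CHAR_SHIFT


print(untag_fixnum(tag_fixnum(-5)), untag_char(tag_char(0x3BB)))
```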
  1281. @node Integer Arithmetic Instructions
  1282. @subsubsection Integer Arithmetic Instructions
  1283. @deftypefn Instruction {} uadd s8:@var{dst} s8:@var{a} s8:@var{b}
  1284. @deftypefnx Instruction {} uadd/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1285. Add the @code{u64} values @var{a} and @var{b}, and store the @code{u64}
  1286. result to @var{dst}. Overflow will wrap.
  1287. @end deftypefn
  1288. @deftypefn Instruction {} usub s8:@var{dst} s8:@var{a} s8:@var{b}
  1289. @deftypefnx Instruction {} usub/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1290. Subtract the @code{u64} value @var{b} from @var{a}, and store the
  1291. @code{u64} result to @var{dst}. Underflow will wrap.
  1292. @end deftypefn
  1293. @deftypefn Instruction {} umul s8:@var{dst} s8:@var{a} s8:@var{b}
  1294. @deftypefnx Instruction {} umul/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1295. Multiply the @code{u64} values @var{a} and @var{b}, and store the
  1296. @code{u64} result to @var{dst}. Overflow will wrap.
  1297. @end deftypefn
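Wrapping on overflow and underflow means that results are reduced
modulo 2^64. A short Python sketch makes this concrete; the function
names mirror the instructions but are otherwise illustrative.

```python
MASK64 = (1 << 64) - 1


def uadd(a, b):
    # Keep only the low 64 bits of the sum.
    return (a + b) & MASK64


def usub(a, b):
    # A negative difference wraps around to a large u64.
    return (a - b) & MASK64


def umul(a, b):
    # Keep only the low 64 bits of the product.
    return (a * b) & MASK64


print(uadd(MASK64, 1), usub(0, 1) == MASK64, umul(2**63, 2))
```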
  1298. @deftypefn Instruction {} ulogand s8:@var{dst} s8:@var{a} s8:@var{b}
  1299. Place the bitwise @code{and} of the @code{u64} values @var{a} and
  1300. @var{b} into the @code{u64} local @var{dst}.
  1301. @end deftypefn
  1302. @deftypefn Instruction {} ulogior s8:@var{dst} s8:@var{a} s8:@var{b}
  1303. Place the bitwise inclusive @code{or} of the @code{u64} values @var{a}
  1304. and @var{b} into the @code{u64} local @var{dst}.
  1305. @end deftypefn
  1306. @deftypefn Instruction {} ulogxor s8:@var{dst} s8:@var{a} s8:@var{b}
  1307. Place the bitwise exclusive @code{or} of the @code{u64} values @var{a}
  1308. and @var{b} into the @code{u64} local @var{dst}.
  1309. @end deftypefn
  1310. @deftypefn Instruction {} ulogsub s8:@var{dst} s8:@var{a} s8:@var{b}
  1311. Place the bitwise @code{and} of the @code{u64} values @var{a} and the
  1312. bitwise @code{not} of @var{b} into the @code{u64} local @var{dst}.
  1313. @end deftypefn
  1314. @deftypefn Instruction {} ulsh s8:@var{dst} s8:@var{a} s8:@var{b}
  1315. @deftypefnx Instruction {} ulsh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1316. Shift the unboxed unsigned 64-bit integer in @var{a} left by @var{b}
  1317. bits, also an unboxed unsigned 64-bit integer. Truncate to 64 bits and
  1318. write to @var{dst} as an unboxed value. Only the lower 6 bits of
  1319. @var{b} are used.
  1320. @end deftypefn
  1321. @deftypefn Instruction {} ursh s8:@var{dst} s8:@var{a} s8:@var{b}
  1322. @deftypefnx Instruction {} ursh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1323. Shift the unboxed unsigned 64-bit integer in @var{a} right by @var{b}
  1324. bits, also an unboxed unsigned 64-bit integer. Truncate to 64 bits and
  1325. write to @var{dst} as an unboxed value. Only the lower 6 bits of
  1326. @var{b} are used.
  1327. @end deftypefn
  1328. @deftypefn Instruction {} srsh s8:@var{dst} s8:@var{a} s8:@var{b}
  1329. @deftypefnx Instruction {} srsh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
  1330. Shift the unboxed signed 64-bit integer in @var{a} right by @var{b}
  1331. bits, also an unboxed signed 64-bit integer. Truncate to 64 bits and
  1332. write to @var{dst} as an unboxed value. Only the lower 6 bits of
  1333. @var{b} are used.
  1334. @end deftypefn
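The rule that only the lower 6 bits of @var{b} are used means the shift
count is taken modulo 64. The Python sketch below models the three
shifts under that rule. It represents @code{u64} values as non-negative
integers and @code{s64} values as signed Python integers; the function
names are illustrative.

```python
MASK64 = (1 << 64) - 1


def ulsh(a, b):
    # Shift left by b mod 64 and discard bits above bit 63.
    return (a << (b & 63)) & MASK64


def ursh(a, b):
    # Logical shift right: zeros are shifted in from the top.
    return (a & MASK64) >> (b & 63)


def srsh(a, b):
    # Arithmetic shift right: Python's >> on a signed int keeps the sign.
    return a >> (b & 63)


print(ulsh(1, 64), ulsh(1, 65), ursh(MASK64, 63), srsh(-8, 1))
```

A shift count of 64 therefore behaves like a count of 0 rather than
clearing the value.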
  1335. @node Floating-Point Arithmetic Instructions
  1336. @subsubsection Floating-Point Arithmetic Instructions
  1337. @deftypefn Instruction {} fadd s8:@var{dst} s8:@var{a} s8:@var{b}
  1338. Add the @code{f64} values @var{a} and @var{b}, and store the @code{f64}
  1339. result to @var{dst}.
  1340. @end deftypefn
  1341. @deftypefn Instruction {} fsub s8:@var{dst} s8:@var{a} s8:@var{b}
  1342. Subtract the @code{f64} value @var{b} from @var{a}, and store the
  1343. @code{f64} result to @var{dst}.
  1344. @end deftypefn
  1345. @deftypefn Instruction {} fmul s8:@var{dst} s8:@var{a} s8:@var{b}
  1346. Multiply the @code{f64} values @var{a} and @var{b}, and store the
  1347. @code{f64} result to @var{dst}.
  1348. @end deftypefn
  1349. @deftypefn Instruction {} fdiv s8:@var{dst} s8:@var{a} s8:@var{b}
  1350. Divide the @code{f64} value @var{a} by @var{b}, and store the
  1351. @code{f64} result to @var{dst}.
  1352. @end deftypefn
  1353. @node Comparison Instructions
  1354. @subsubsection Comparison Instructions
  1355. @deftypefn Instruction {} u64=? s12:@var{a} s12:@var{b}
  1356. Set the comparison result to @code{EQUAL} if the @code{u64} values
  1357. @var{a} and @var{b} are the same, or @code{NONE} otherwise.
  1358. @end deftypefn
  1359. @deftypefn Instruction {} u64<? s12:@var{a} s12:@var{b}
  1360. Set the comparison result to @code{LESS_THAN} if the @code{u64} value
  1361. @var{a} is less than the @code{u64} value @var{b}, or
  1362. @code{NONE} otherwise.
  1363. @end deftypefn
  1364. @deftypefn Instruction {} s64<? s12:@var{a} s12:@var{b}
  1365. Set the comparison result to @code{LESS_THAN} if the @code{s64} value
  1366. @var{a} is less than the @code{s64} value @var{b}, or
  1367. @code{NONE} otherwise.
  1368. @end deftypefn
  1369. @deftypefn Instruction {} s64-imm=? s12:@var{a} z12:@var{b}
  1370. Set the comparison result to @code{EQUAL} if the @code{s64} value @var{a}
  1371. is equal to the immediate @code{s64} value @var{b}, or @code{NONE}
  1372. otherwise.
  1373. @end deftypefn
  1374. @deftypefn Instruction {} u64-imm<? s12:@var{a} c12:@var{b}
  1375. Set the comparison result to @code{LESS_THAN} if the @code{u64} value
  1376. @var{a} is less than the immediate @code{u64} value @var{b}, or
  1377. @code{NONE} otherwise.
  1378. @end deftypefn
  1379. @deftypefn Instruction {} imm-u64<? s12:@var{a} c12:@var{b}
  1380. Set the comparison result to @code{LESS_THAN} if the @code{u64}
  1381. immediate @var{b} is less than the @code{u64} value @var{a}, or
  1382. @code{NONE} otherwise.
  1383. @end deftypefn
  1384. @deftypefn Instruction {} s64-imm<? s12:@var{a} z12:@var{b}
  1385. Set the comparison result to @code{LESS_THAN} if the @code{s64} value
  1386. @var{a} is less than the immediate @code{s64} value @var{b}, or
  1387. @code{NONE} otherwise.
  1388. @end deftypefn
  1389. @deftypefn Instruction {} imm-s64<? s12:@var{a} z12:@var{b}
  1390. Set the comparison result to @code{LESS_THAN} if the @code{s64}
  1391. immediate @var{b} is less than the @code{s64} value @var{a}, or
  1392. @code{NONE} otherwise.
  1393. @end deftypefn
  1394. @deftypefn Instruction {} f64=? s12:@var{a} s12:@var{b}
  1395. Set the comparison result to @code{EQUAL} if the f64 value @var{a} is
  1396. equal to the f64 value @var{b}, or @code{NONE} otherwise.
  1397. @end deftypefn
  1398. @deftypefn Instruction {} f64<? s12:@var{a} s12:@var{b}
  1399. Set the comparison result to @code{LESS_THAN} if the f64 value @var{a}
  1400. is less than the f64 value @var{b}, @code{NONE} if @var{a} is greater
  1401. than or equal to @var{b}, or @code{INVALID} otherwise.
  1402. @end deftypefn
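The three possible outcomes of @code{f64<?} can be written out as a
small Python function. The result names are modelled as strings here,
which is a choice made for this sketch only.

```python
import math

LESS_THAN, NONE, INVALID = "LESS_THAN", "NONE", "INVALID"


def f64_less(a, b):
    # A NaN on either side makes the comparison unordered.
    if math.isnan(a) or math.isnan(b):
        return INVALID
    return LESS_THAN if a < b else NONE


print(f64_less(1.0, 2.0), f64_less(2.0, 2.0), f64_less(float("nan"), 2.0))
```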
  1403. @deftypefn Instruction {} =? s12:@var{a} s12:@var{b}
  1404. Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
  1405. @var{b} are numerically equal, in the sense of the Scheme @code{=}
  1406. operator. Set to @code{NONE} otherwise.
  1407. @end deftypefn
  1408. @deftypefn Instruction {} heap-numbers-equal? s12:@var{a} s12:@var{b}
  1409. Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
  1410. @var{b} are numerically equal, in the sense of Scheme @code{=}. Set to
  1411. @code{NONE} otherwise. It is known that both @var{a} and @var{b} are
  1412. heap numbers.
  1413. @end deftypefn
  1414. @deftypefn Instruction {} <? s12:@var{a} s12:@var{b}
  1415. Set the comparison result to @code{LESS_THAN} if the SCM value @var{a}
  1416. is less than the SCM value @var{b}, @code{NONE} if @var{a} is greater
  1417. than or equal to @var{b}, or @code{INVALID} otherwise.
  1418. @end deftypefn
  1419. @deftypefn Instruction {} immediate-tag=? s24:@var{obj} c16:@var{mask} c16:@var{tag}
  1420. Set the comparison result to @code{EQUAL} if the result of a bitwise
  1421. @code{and} between the bits of @code{scm} value @var{obj} and the
  1422. immediate @var{mask} is @var{tag}, or @code{NONE} otherwise.
  1423. @end deftypefn
  1424. @deftypefn Instruction {} heap-tag=? s24:@var{obj} c16:@var{mask} c16:@var{tag}
  1425. Set the comparison result to @code{EQUAL} if the result of a bitwise
  1426. @code{and} between the first word of @code{scm} value @var{obj} and the
  1427. immediate @var{mask} is @var{tag}, or @code{NONE} otherwise.
  1428. @end deftypefn
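Both tag tests apply the same mask-and-compare step and differ only in
which word they inspect. The Python sketch below models the heap as a
list of words and a heap object as an index into it. The mask and tag
values in the example call are made up for illustration and are not
Guile's actual type tags.

```python
def immediate_tag_eq(bits, mask, tag):
    # Test the bits of the scm value itself.
    return "EQUAL" if (bits & mask) == tag else "NONE"


def heap_tag_eq(heap, obj, mask, tag):
    # Test the first word of the heap object the value refers to.
    return "EQUAL" if (heap[obj] & mask) == tag else "NONE"


heap = [0x0000, 0x1235, 0x0000]
print(immediate_tag_eq(0b1010, 0b11, 0b10), heap_tag_eq(heap, 1, 0xFF, 0x35))
```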
  1429. @deftypefn Instruction {} eq? s12:@var{a} s12:@var{b}
  1430. Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
  1431. @var{b} are @code{eq?}, or @code{NONE} otherwise.
  1432. @end deftypefn
  1433. There is a set of macro-instructions for @code{immediate-tag=?} and
  1434. @code{heap-tag=?} as well that abstract away the precise type tag
  1435. values. @xref{The SCM Type in Guile}.
  1436. @deffn {Macro Instruction} fixnum? x
  1437. @deffnx {Macro Instruction} heap-object? x
  1438. @deffnx {Macro Instruction} char? x
  1439. @deffnx {Macro Instruction} eq-false? x
  1440. @deffnx {Macro Instruction} eq-nil? x
  1441. @deffnx {Macro Instruction} eq-null? x
  1442. @deffnx {Macro Instruction} eq-true? x
  1443. @deffnx {Macro Instruction} unspecified? x
  1444. @deffnx {Macro Instruction} undefined? x
  1445. @deffnx {Macro Instruction} eof-object? x
  1446. @deffnx {Macro Instruction} null? x
  1447. @deffnx {Macro Instruction} false? x
  1448. @deffnx {Macro Instruction} nil? x
  1449. Emit an @code{immediate-tag=?} instruction that will set the comparison
  1450. result to @code{EQUAL} if @var{x} would pass the corresponding predicate
  1451. (e.g. @code{null?}), or @code{NONE} otherwise.
  1452. @end deffn
  1453. @deffn {Macro Instruction} pair? x
  1454. @deffnx {Macro Instruction} struct? x
  1455. @deffnx {Macro Instruction} symbol? x
  1456. @deffnx {Macro Instruction} variable? x
  1457. @deffnx {Macro Instruction} vector? x
  1458. @deffnx {Macro Instruction} immutable-vector? x
  1459. @deffnx {Macro Instruction} mutable-vector? x
  1460. @deffnx {Macro Instruction} weak-vector? x
  1461. @deffnx {Macro Instruction} string? x
  1462. @deffnx {Macro Instruction} heap-number? x
  1463. @deffnx {Macro Instruction} hash-table? x
  1464. @deffnx {Macro Instruction} pointer? x
  1465. @deffnx {Macro Instruction} fluid? x
  1466. @deffnx {Macro Instruction} stringbuf? x
  1467. @deffnx {Macro Instruction} dynamic-state? x
  1468. @deffnx {Macro Instruction} frame? x
  1469. @deffnx {Macro Instruction} keyword? x
  1470. @deffnx {Macro Instruction} atomic-box? x
  1471. @deffnx {Macro Instruction} syntax? x
  1472. @deffnx {Macro Instruction} program? x
  1473. @deffnx {Macro Instruction} vm-continuation? x
  1474. @deffnx {Macro Instruction} bytevector? x
  1475. @deffnx {Macro Instruction} weak-set? x
  1476. @deffnx {Macro Instruction} weak-table? x
  1477. @deffnx {Macro Instruction} array? x
  1478. @deffnx {Macro Instruction} bitvector? x
  1479. @deffnx {Macro Instruction} smob? x
  1480. @deffnx {Macro Instruction} port? x
  1481. @deffnx {Macro Instruction} bignum? x
  1482. @deffnx {Macro Instruction} flonum? x
  1483. @deffnx {Macro Instruction} compnum? x
  1484. @deffnx {Macro Instruction} fracnum? x
  1485. Emit a @code{heap-tag=?} instruction that will set the comparison result
  1486. to @code{EQUAL} if @var{x} would pass the corresponding predicate
  1487. (e.g. @code{pair?}), or @code{NONE} otherwise.
  1488. @end deffn
  1489. @node Branch Instructions
  1490. @subsubsection Branch Instructions
  1491. All offsets to branch instructions are 24-bit signed numbers, which
  1492. count 32-bit units. This gives Guile effectively a 26-bit address range
  1493. for relative jumps.
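The 26-bit figure follows from scaling a signed 24-bit count by the
4-byte unit size. The Python sketch below decodes such an operand and
computes the reachable byte range. It assumes the instruction pointer is
a byte address, and the helper names are illustrative.

```python
def decode_l24(raw):
    # Sign-extend the 24-bit operand.
    return (raw & 0x7FFFFF) - (raw & 0x800000)


def branch_target(ip, raw):
    # Offsets count 32-bit units, so scale by 4 to get bytes.
    return ip + 4 * decode_l24(raw)


MIN_BYTES = 4 * decode_l24(0x800000)   # most negative displacement
MAX_BYTES = 4 * decode_l24(0x7FFFFF)   # most positive displacement
print(MIN_BYTES, MAX_BYTES)
```

The two extremes are @minus{}2^25 and 2^25 @minus{} 4 bytes, which
together span a 26-bit byte range.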
  1494. @deftypefn Instruction {} j l24:@var{offset}
  1495. Add @var{offset} to the current instruction pointer.
  1496. @end deftypefn
  1497. @deftypefn Instruction {} jl l24:@var{offset}
  1498. If the last comparison result is @code{LESS_THAN}, add @var{offset}, a
  1499. signed 24-bit number, to the current instruction pointer.
  1500. @end deftypefn
  1501. @deftypefn Instruction {} je l24:@var{offset}
  1502. If the last comparison result is @code{EQUAL}, add @var{offset}, a
  1503. signed 24-bit number, to the current instruction pointer.
  1504. @end deftypefn
  1505. @deftypefn Instruction {} jnl l24:@var{offset}
  1506. If the last comparison result is not @code{LESS_THAN}, add @var{offset},
  1507. a signed 24-bit number, to the current instruction pointer.
  1508. @end deftypefn
  1509. @deftypefn Instruction {} jne l24:@var{offset}
  1510. If the last comparison result is not @code{EQUAL}, add @var{offset}, a
  1511. signed 24-bit number, to the current instruction pointer.
  1512. @end deftypefn
  1513. @deftypefn Instruction {} jge l24:@var{offset}
  1514. If the last comparison result is @code{NONE}, add @var{offset}, a
  1515. signed 24-bit number, to the current instruction pointer.
  1516. This is intended for use after a @code{<?} comparison, and is different
  1517. from @code{jnl} in the way it handles not-a-number (NaN) values:
  1518. @code{<?} sets @code{INVALID} instead of @code{NONE} if either value is
  1519. a NaN. For exact numbers, @code{jge} is the same as @code{jnl}.
  1520. @end deftypefn
  1521. @deftypefn Instruction {} jnge l24:@var{offset}
  1522. If the last comparison result is not @code{NONE}, add @var{offset}, a
  1523. signed 24-bit number, to the current instruction pointer.
  1524. This is intended for use after a @code{<?} comparison, and is different
  1525. from @code{jl} in the way it handles not-a-number (NaN) values:
  1526. @code{<?} sets @code{INVALID} instead of @code{NONE} if either value is
  1527. a NaN. For exact numbers, @code{jnge} is the same as @code{jl}.
  1528. @end deftypefn
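The NaN distinction between @code{jnl} and @code{jge}, and between
@code{jl} and @code{jnge}, can be seen by writing each branch condition
as a predicate on the last comparison result. In this Python sketch the
results are modelled as strings and each function returns whether the
branch would be taken.

```python
LESS_THAN, EQUAL, NONE, INVALID = "LESS_THAN", "EQUAL", "NONE", "INVALID"


def jl(result):
    return result == LESS_THAN


def jnl(result):
    return result != LESS_THAN


def jge(result):
    return result == NONE


def jnge(result):
    return result != NONE


# After a <? comparison involving a NaN, the result is INVALID.
print(jnl(INVALID), jge(INVALID), jl(INVALID), jnge(INVALID))
```

With an @code{INVALID} result, @code{jnl} and @code{jnge} branch while
@code{jge} and @code{jl} fall through; with @code{LESS_THAN} or
@code{NONE} each pair agrees.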
  1529. @node Raw Memory Access Instructions
  1530. @subsubsection Raw Memory Access Instructions
  1531. Bytevector operations correspond closely to what the current hardware
  1532. can do, so it makes sense to inline them as VM instructions, providing
  1533. a clear path for eventual native compilation. Without this, Scheme
  1534. programs would need other primitives for accessing raw bytes -- but
  1535. these primitives are as good as any.
  1536. @deftypefn Instruction {} u8-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1537. @deftypefnx Instruction {} s8-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1538. @deftypefnx Instruction {} u16-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1539. @deftypefnx Instruction {} s16-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1540. @deftypefnx Instruction {} u32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1541. @deftypefnx Instruction {} s32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1542. @deftypefnx Instruction {} u64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1543. @deftypefnx Instruction {} s64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1544. @deftypefnx Instruction {} f32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1545. @deftypefnx Instruction {} f64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
  1546. Fetch the item at byte offset @var{idx} from the raw pointer local
  1547. @var{ptr}, and store it in @var{dst}. All accesses use native
  1548. endianness.
  1549. The @var{idx} value should be an unboxed unsigned 64-bit integer.
  1550. The results are all written to the stack as unboxed values, either as
  1551. signed 64-bit integers, unsigned 64-bit integers, or IEEE double
  1552. floating point numbers.
  1553. @end deftypefn
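
As a rough model of the load semantics described above, the following C
sketch reads a narrow item at a byte offset in native byte order and
widens it to the 64-bit or double representation. The function names
and signatures are hypothetical, chosen for illustration, and are not
taken from Guile's VM sources.

```c
#include <stdint.h>
#include <string.h>

/* Model of u16-ref: load 2 bytes at PTR + IDX in native byte order and
   zero-extend to an unsigned 64-bit value.  memcpy keeps the access
   valid for unaligned offsets.  */
static uint64_t
u16_ref (const uint8_t *ptr, uint64_t idx)
{
  uint16_t raw;
  memcpy (&raw, ptr + idx, sizeof raw);
  return raw;
}

/* Model of s16-ref: same load, but sign-extend to a signed 64-bit value.  */
static int64_t
s16_ref (const uint8_t *ptr, uint64_t idx)
{
  int16_t raw;
  memcpy (&raw, ptr + idx, sizeof raw);
  return raw;
}

/* Model of f32-ref: load a single-precision float and widen to double.  */
static double
f32_ref (const uint8_t *ptr, uint64_t idx)
{
  float raw;
  memcpy (&raw, ptr + idx, sizeof raw);
  return raw;
}
```

The same two bytes read as 65535 through @code{u16_ref} and as -1
through @code{s16_ref} when both are @code{0xff}, which is the only
difference between the signed and unsigned variants in this model.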

@deftypefn Instruction {} u8-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} s8-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} u16-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} s16-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} u32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} s32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} u64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} s64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} f32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
@deftypefnx Instruction {} f64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
Store @var{val} into memory pointed to by raw pointer local @var{ptr},
at byte offset @var{idx}. Multibyte values are written using native
endianness.

The @var{idx} value should be an unboxed unsigned 64-bit integer.

The @var{val} values are all unboxed, either as signed 64-bit integers,
unsigned 64-bit integers, or IEEE double floating point numbers.
@end deftypefn
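
The store direction can be modelled the same way. In this C sketch the
unboxed 64-bit or double value is narrowed to the item width and written
at the byte offset in native byte order. The function names are
hypothetical, and the sketch assumes that @var{val} is already in range
for the narrower type.

```c
#include <stdint.h>
#include <string.h>

/* Model of s16-set!: narrow a signed 64-bit value to 16 bits and store
   it at PTR + IDX in native byte order.  Assumes VAL fits in int16_t.  */
static void
s16_set (uint8_t *ptr, uint64_t idx, int64_t val)
{
  int16_t raw = (int16_t) val;
  memcpy (ptr + idx, &raw, sizeof raw);
}

/* Model of f32-set!: narrow a double to single precision and store it.  */
static void
f32_set (uint8_t *ptr, uint64_t idx, double val)
{
  float raw = (float) val;
  memcpy (ptr + idx, &raw, sizeof raw);
}
```

Only the bytes covered by the item are touched: a 16-bit store at
offset 2 leaves offsets 0, 1, and 4 onwards unchanged.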

@node Just-In-Time Native Code
@subsection Just-In-Time Native Code

@cindex just-in-time compiler
@cindex jit compiler
@cindex template jit
@cindex compiler, just-in-time
The final piece of Guile's virtual machine is a just-in-time (JIT)
compiler from bytecode instructions to native code. A function runs
faster once its bytecode instructions have been compiled to native code
than when the VM interprets them.

The JIT compiler runs automatically, triggered by a counter associated
with each function. A function's counter is incremented each time the
function is called and on each loop iteration. Once the counter passes
a certain value, the function gets JIT-compiled. @xref{Instrumentation
Instructions}, for full details.
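
The counter mechanism can be pictured with a small C sketch. The
structure, the threshold value, and the increment of one per event are
all illustrative assumptions; Guile's actual bookkeeping is done by the
instrumentation instructions referenced above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-function bookkeeping.  */
struct function_data
{
  uint32_t counter;
  bool compiled;
};

/* Illustrative threshold, an assumption made for this sketch.  */
#define JIT_THRESHOLD 1000u

/* Called on function entry and on each loop iteration.  Returns true
   once the function counts as JIT-compiled.  */
static bool
bump_counter (struct function_data *fn)
{
  if (fn->compiled)
    return true;
  fn->counter++;
  if (fn->counter > JIT_THRESHOLD)
    fn->compiled = true;   /* stand-in for invoking the JIT compiler */
  return fn->compiled;
}
```

Because calls and loop iterations both feed the same counter, a function
that is called rarely but loops for a long time still crosses the
threshold.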

Guile's JIT compiler is what is known as a @dfn{template JIT}. This
kind of JIT is very simple: for each instruction in a function, the JIT
compiler will emit a generic sequence of machine code corresponding to
the instruction kind, specializing that generic template to reference
the specific operands of the instruction being compiled.
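
To make the idea concrete, here is a toy C sketch of template
specialization. It is not Guile's JIT: the two-instruction bytecode,
the textual pseudo-assembly, and all names are invented for
illustration. Each instruction kind has one fixed template, and
compilation is a single pass that fills in the operands.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical two-instruction bytecode; operands are stack slot indices.  */
enum opcode { OP_MOV, OP_ADD };

struct instruction
{
  enum opcode op;
  unsigned dst, a, b;
};

/* Write the fixed template for one instruction kind, filling in the
   slot offsets (8 bytes per slot) of this particular instruction.
   Returns what snprintf returns, or -1 for an unknown opcode.  */
static int
emit_template (const struct instruction *insn, char *out, size_t size)
{
  switch (insn->op)
    {
    case OP_MOV:
      return snprintf (out, size,
                       "load r0, [sp+%u]\n"
                       "store [sp+%u], r0\n",
                       insn->a * 8, insn->dst * 8);
    case OP_ADD:
      return snprintf (out, size,
                       "load r0, [sp+%u]\n"
                       "load r1, [sp+%u]\n"
                       "add r0, r1\n"
                       "store [sp+%u], r0\n",
                       insn->a * 8, insn->b * 8, insn->dst * 8);
    }
  return -1;
}

/* One linear pass with no analysis: concatenate the templates.
   Returns the length of the listing, or -1 if OUT is too small.  */
static int
compile_function (const struct instruction *code, size_t n,
                  char *out, size_t size)
{
  size_t used = 0;
  if (size == 0)
    return -1;
  out[0] = '\0';
  for (size_t i = 0; i < n; i++)
    {
      int len = emit_template (&code[i], out + used, size - used);
      if (len < 0 || (size_t) len >= size - used)
        return -1;
      used += (size_t) len;
    }
  return (int) used;
}
```

Note that every template in this toy reloads its operands from the
stack, even when the previous template just stored the same slot. A
compiler that keeps templates independent of one another produces this
kind of redundancy.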

The strength of a template JIT is principally that it is very fast at
emitting code. It doesn't need to do any time-consuming analysis on the
bytecode that it is compiling to do its job.

A template JIT is also very predictable: the native code emitted by a
template JIT has the same performance characteristics as the
corresponding bytecode, except that it runs faster. In theory you could
even generate the template-JIT machine code ahead of time, as it doesn't
depend on any value seen at run-time.

This predictability makes it possible to reason about the performance of
a system in terms of bytecode, knowing that the conclusions apply to
native code emitted by a template JIT.

Because the machine code corresponding to an instruction always performs
the same tasks that the interpreter would do for that instruction, the
combination of bytecode and a template JIT also allows Guile programmers
to debug their programs in terms of the bytecode model. When a Guile
programmer sets a breakpoint, Guile will disable the JIT for the thread
being debugged, falling back to the interpreter (which has the
corresponding code to run the hooks). @xref{VM Hooks}.

To emit native code, Guile uses a forked version of GNU Lightning. This
"Lightening" effort, spun out as a separate project, aims to build on
the back-end support from GNU Lightning while adapting the API and
behavior of the library to match Guile's needs. This code is included
in the Guile source distribution. For more information, see
@url{https://gitlab.com/wingo/lightening}. As of mid-2019, Lightening
supports code generation for the x86-64, ia32, ARMv7, and AArch64
architectures.

The weaknesses of a template JIT are two-fold. Firstly, as a simple
back-end that has to run fast, a template JIT doesn't have time to do
analysis that could help it generate better code, notably global
register allocation and instruction selection.

However, this is a minor weakness compared to the inability to perform
significant, speculative program transformations. For example, Guile
could see that, in an expression @code{(f x)}, @var{f} in practice
always refers to the same function. An advanced JIT compiler would
speculatively inline @var{f} into the call-site, along with a dynamic
check to make sure that the assumption still holds. But as a template
JIT doesn't pay attention to values only known at run-time, it can't
make this transformation.
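
As a rough illustration of the guarded speculation described above,
which an advanced JIT might perform and which Guile's template JIT does
not, consider this C sketch. The functions and the choice of
@code{add1} as the observed callee are hypothetical.

```c
typedef long (*unary_fn) (long);

static long add1 (long x)  { return x + 1; }
static long twice (long x) { return x * 2; }

/* Generic call site: always an indirect call through F.  */
static long
call_generic (unary_fn f, long x)
{
  return f (x);
}

/* Speculatively specialized call site.  It assumes F is add1, as
   observed at run-time, and guards that assumption with a check.  */
static long
call_speculative (unary_fn f, long x)
{
  if (f == add1)               /* dynamic check: does the assumption hold? */
    return x + 1;              /* inlined body of add1 */
  return call_generic (f, x);  /* assumption failed: take the slow path */
}
```

Both paths compute the same result as the generic call. The
specialized version only pays off when the guard succeeds most of the
time, which is information available only at run-time.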

This limitation is mitigated in part by Guile's robust ahead-of-time
compiler, which can already perform significant optimizations when it
can prove they will always be valid, and by its low-level bytecode,
which is able to represent the effect of those optimizations (e.g. elided
type-checks). @xref{Compiling to the Virtual Machine}, for more on
Guile's compiler.

An ahead-of-time Scheme-to-bytecode strategy, complemented by a template
JIT, also particularly suits the somewhat static nature of Scheme.
Scheme programmers often write code in a way that makes the identity of
free variable references lexically apparent. For example, the @code{(f
x)} expression could appear within a @code{(let ((f (lambda (x) (1+
x)))) ...)} expression, or we could see that @code{f} was imported from
a particular module where we know its binding. Ahead-of-time
compilation techniques can work well for a language like Scheme where
there is little polymorphism and much first-order programming. They do
not work so well for a language like JavaScript, which is highly mutable
at run-time and difficult to analyze due to method calls (which are
effectively higher-order calls).

All that said, a template JIT works well for Guile at this point. It's
only a few thousand lines of maintainable code, it speeds up Scheme
programs, and it keeps the bulk of the Guile Scheme implementation
written in Scheme itself. The next step is probably to add
ahead-of-time native code emission to the back-end of the compiler
written in Scheme, to take advantage of the opportunity to do global
register allocation and instruction selection. Once this is working, it
can allow Guile to experiment with speculative optimizations in Scheme
as well. @xref{Extending the Compiler}, for more on future directions.

Finally, note that there are a few environment variables that can be
tweaked to make JIT compilation happen sooner, later, or never.
@xref{Environment Variables}, for more.