@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2010, 2015, 2018
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Data Representation
@section Data Representation

Scheme is a latently-typed language; this means that the system cannot,
in general, determine the type of a given expression at compile time.
Types only become apparent at run time. Variables do not have fixed
types; a variable may hold a pair at one point, an integer at the next,
and a thousand-element vector later. Instead, values, not variables,
have fixed types.

In order to implement standard Scheme functions like @code{pair?} and
@code{string?} and provide garbage collection, the representation of
every value must contain enough information to accurately determine its
type at run time. Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the @code{car} of a string).

Because variables, pairs, and vectors may hold values of any type,
Scheme implementations use a uniform representation for values: a
single type large enough to hold either a complete value or a pointer
to a complete value, along with the necessary typing information.

The following sections will present a simple typing system, and then
make some refinements to correct its major weaknesses. We then conclude
with a discussion of specific choices that Guile has made regarding
garbage collection and data representation.

@menu
* A Simple Representation::
* Faster Integers::
* Cheaper Pairs::
* Conservative GC::
* The SCM Type in Guile::
@end menu

@node A Simple Representation
@subsection A Simple Representation

The simplest way to represent Scheme values in C would be to represent
each value as a pointer to a structure containing a type indicator,
followed by a union carrying the real value. Assuming that @code{SCM} is
the name of our universal type, we can write:

@example
enum type @{ integer, pair, string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    int integer;
    struct @{ SCM car, cdr; @} pair;
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};
@end example
with the ellipses replaced with code for the remaining Scheme types.

This representation is sufficient to implement all of Scheme's
semantics. If @var{x} is an @code{SCM} value:
@itemize @bullet
@item
To test if @var{x} is an integer, we can write @code{@var{x}->type == integer}.
@item
To find its value, we can write @code{@var{x}->value.integer}.
@item
To test if @var{x} is a vector, we can write @code{@var{x}->type == vector}.
@item
If we know @var{x} is a vector, we can write
@code{@var{x}->value.vector.elts[0]} to refer to its first element.
@item
If we know @var{x} is a pair, we can write
@code{@var{x}->value.pair.car} to extract its car.
@end itemize
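
To make the cost of this scheme concrete, here is a minimal sketch of
constructors and a checked accessor built on the declarations above.
The names @code{make_integer}, @code{make_pair} and @code{checked_car}
are invented for this illustration, and only two of the types are
included to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

enum type { integer, pair };   /* only two types, for brevity */

typedef struct value *SCM;

struct value {
  enum type type;
  union {
    int integer;
    struct { SCM car, cdr; } pair;
  } value;
};

/* Hypothetical constructor: every integer costs one heap allocation. */
static SCM
make_integer (int n)
{
  SCM x = malloc (sizeof *x);
  assert (x != NULL);
  x->type = integer;
  x->value.integer = n;
  return x;
}

/* Hypothetical constructor for pairs. */
static SCM
make_pair (SCM car, SCM cdr)
{
  SCM x = malloc (sizeof *x);
  assert (x != NULL);
  x->type = pair;
  x->value.pair.car = car;
  x->value.pair.cdr = cdr;
  return x;
}

/* Checked accessor: refuses to take the car of a non-pair. */
static SCM
checked_car (SCM x)
{
  assert (x->type == pair);
  return x->value.pair.car;
}
```

Note that even a small integer goes through @code{malloc} here, which is
the weakness the next section addresses.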

@node Faster Integers
@subsection Faster Integers

Unfortunately, the above representation has a serious disadvantage. In
order to return an integer, an expression must allocate a @code{struct
value}, initialize it to represent that integer, and return a pointer to
it. Furthermore, fetching an integer's value requires a memory
reference, which is much slower than a register reference on most
processors. Since integers are extremely common, this representation is
too costly, in both time and space. Integers should be very cheap to
create and manipulate.

One possible solution comes from the observation that, on many
architectures, heap-allocated data (i.e., what you get when you call
@code{malloc}) must be aligned on an eight-byte boundary. (Whether or
not the machine actually requires it, we can write our own allocator for
@code{struct value} objects that assures this is true.) In this case,
the lower three bits of the structure's address are known to be zero.

This gives us the room we need to provide an improved representation
for integers. We make the following rules:
@itemize @bullet
@item
If the lower three bits of an @code{SCM} value are zero, then the
@code{SCM} value is a pointer to a @code{struct value}, and everything
proceeds as before.
@item
Otherwise, the @code{SCM} value represents an integer, whose value
appears in its upper bits.
@end itemize

Here is C code implementing this convention:
@example
enum type @{ pair, string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    struct @{ SCM car, cdr; @} pair;
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};

#define POINTER_P(x) (((int) (x) & 7) == 0)
#define INTEGER_P(x) (! POINTER_P (x))

#define GET_INTEGER(x)  ((int) (x) >> 3)
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
@end example

Notice that @code{integer} no longer appears as an element of @code{enum
type}, and the union has lost its @code{integer} member. Instead, we
use the @code{POINTER_P} and @code{INTEGER_P} macros to make a coarse
classification of values into integers and non-integers, and do further
type testing as before.

Here's how we would answer the questions posed above (again, assume
@var{x} is an @code{SCM} value):
@itemize @bullet
@item
To test if @var{x} is an integer, we can write @code{INTEGER_P (@var{x})}.
@item
To find its value, we can write @code{GET_INTEGER (@var{x})}.
@item
To test if @var{x} is a vector, we can write:
@example
@code{POINTER_P (@var{x}) && @var{x}->type == vector}
@end example
Given the new representation, we must make sure @var{x} is truly a
pointer before we dereference it to determine its complete type.
@item
If we know @var{x} is a vector, we can write
@code{@var{x}->value.vector.elts[0]} to refer to its first element, as
before.
@item
If we know @var{x} is a pair, we can write
@code{@var{x}->value.pair.car} to extract its car, just as before.
@end itemize

This representation allows us to operate more efficiently on integers
than the first. For example, if @var{x} and @var{y} are known to be
integers, we can compute their sum as follows:
@example
MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y}))
@end example
Now, integer math requires no allocation or memory references. Most real
Scheme systems actually implement addition and other operations using an
even more efficient algorithm, but this essay isn't about
bit-twiddling. (Hint: how do you decide when to overflow to a bignum?
How would you do it in assembly?)
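
As one possible answer to the first half of that hint, the following
sketch adds two tagged integers and reports when the result no longer
fits. It is an illustration only: here @code{SCM} is assumed to be a
@code{uintptr_t} so that tag arithmetic is plain integer arithmetic,
@code{FIXNUM_MAX}, @code{FIXNUM_MIN} and @code{fixnum_add} are invented
names, and right-shifting a negative value is assumed to be an
arithmetic shift, as it is on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: SCM is an integer wide enough for a pointer. */
typedef uintptr_t SCM;

/* Three bits go to the tag, so the payload is three bits narrower. */
#define FIXNUM_MAX (INTPTR_MAX >> 3)
#define FIXNUM_MIN (-FIXNUM_MAX - 1)

#define POINTER_P(x)    (((x) & 7) == 0)
#define INTEGER_P(x)    (! POINTER_P (x))
/* Assumes >> on a negative intptr_t is an arithmetic shift. */
#define GET_INTEGER(x)  ((intptr_t) (x) >> 3)
/* Shift as unsigned so that negative values do not invoke
   undefined behavior.  */
#define MAKE_INTEGER(n) ((((SCM) (intptr_t) (n)) << 3) | 1)

/* Add two tagged integers.  Return 1 and store the tagged sum in *SUM,
   or return 0 if the sum is out of range (where a real system would
   fall back to a bignum).  */
static int
fixnum_add (SCM x, SCM y, SCM *sum)
{
  assert (INTEGER_P (x) && INTEGER_P (y));
  /* Both operands are within the fixnum range, so the untagged sum
     cannot overflow intptr_t.  */
  intptr_t n = GET_INTEGER (x) + GET_INTEGER (y);
  if (n < FIXNUM_MIN || n > FIXNUM_MAX)
    return 0;
  *sum = MAKE_INTEGER (n);
  return 1;
}
```

The range check costs two comparisons; the more efficient algorithms
alluded to above typically detect overflow without untagging first.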

@node Cheaper Pairs
@subsection Cheaper Pairs

However, there is yet another issue to confront. Most Scheme heaps
contain more pairs than any other type of object; Jonathan Rees said at
one point that pairs occupy 45% of the heap in his Scheme
implementation, Scheme 48. However, our representation above spends
three @code{SCM}-sized words per pair: one for the type, and two for
the @sc{car} and @sc{cdr}. Is there any way to represent pairs using
only two words?

Let us refine the convention we established earlier. Let us assert
that:
@itemize @bullet
@item
If the bottom three bits of an @code{SCM} value are @code{#b000}, then
it is a pointer, as before.
@item
If the bottom three bits are @code{#b001}, then the upper bits are an
integer. This is a bit more restrictive than before.
@item
If the bottom three bits are @code{#b010}, then the value, with the
bottom three bits masked out, is the address of a pair.
@end itemize

Here is the new C code:
@example
enum type @{ string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};

struct pair @{
  SCM car, cdr;
@};

#define POINTER_P(x) (((int) (x) & 7) == 0)

#define INTEGER_P(x)    (((int) (x) & 7) == 1)
#define GET_INTEGER(x)  ((int) (x) >> 3)
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))

#define PAIR_P(x)   (((int) (x) & 7) == 2)
#define GET_PAIR(x) ((struct pair *) ((int) (x) & ~7))
@end example

Notice that @code{enum type} and @code{struct value} now only contain
provisions for vectors and strings; both integers and pairs have become
special cases. The code above also assumes that an @code{int} is large
enough to hold a pointer, which isn't generally true.

Our list of examples is now as follows:
@itemize @bullet
@item
To test if @var{x} is an integer, we can write @code{INTEGER_P
(@var{x})}; this is as before.
@item
To find its value, we can write @code{GET_INTEGER (@var{x})}, as
before.
@item
To test if @var{x} is a vector, we can write:
@example
@code{POINTER_P (@var{x}) && @var{x}->type == vector}
@end example
We must still make sure that @var{x} is a pointer to a @code{struct
value} before dereferencing it to find its type.
@item
If we know @var{x} is a vector, we can write
@code{@var{x}->value.vector.elts[0]} to refer to its first element, as
before.
@item
We can write @code{PAIR_P (@var{x})} to determine if @var{x} is a
pair, and then write @code{GET_PAIR (@var{x})->car} to refer to its
car.
@end itemize

If pairs occupy 45% of the heap and each pair shrinks from three words
to two, this change in representation reduces our heap size by about
15%. It also makes it cheaper to decide if a value is a pair, because
no memory references are necessary; it suffices to check the bottom
three bits of the @code{SCM} value. This may be significant when
traversing lists, a common activity in a Scheme system.

Again, most real Scheme systems use a slightly different implementation;
for example, if @code{GET_PAIR} subtracts off the low bits of @code{x},
instead of masking them off, the optimizer will often be able to combine
that subtraction with the addition of the offset of the structure member
we are referencing, making a modified pointer as fast to use as an
unmodified pointer.
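
The subtracting variant of @code{GET_PAIR} can be sketched as follows,
together with a list traversal that only inspects tag bits to find the
end of the list. This is an illustration under stated assumptions:
@code{SCM} is taken to be a @code{uintptr_t}, @code{malloc} is assumed
to return eight-byte-aligned memory (the code checks this), and
@code{cons}, @code{list_length}, @code{PAIR_TAG} and @code{EMPTY_LIST}
are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption of this sketch: SCM is an integer wide enough for a pointer. */
typedef uintptr_t SCM;

struct pair { SCM car, cdr; };

#define MAKE_INTEGER(n) ((((SCM) (intptr_t) (n)) << 3) | 1)
#define GET_INTEGER(x)  ((intptr_t) (x) >> 3)

#define PAIR_TAG    2
#define PAIR_P(x)   (((x) & 7) == PAIR_TAG)
/* Subtract the tag rather than masking it off. */
#define GET_PAIR(x) ((struct pair *) ((x) - PAIR_TAG))

/* Hypothetical stand-in for the empty list: a tag used by nothing else. */
#define EMPTY_LIST  ((SCM) 4)

/* Allocate a two-word pair and return it with the pair tag added. */
static SCM
cons (SCM car, SCM cdr)
{
  struct pair *p = malloc (sizeof *p);
  assert (p != NULL);
  /* The tagging scheme relies on eight-byte alignment. */
  assert (((uintptr_t) p & 7) == 0);
  p->car = car;
  p->cdr = cdr;
  return (SCM) p + PAIR_TAG;
}

/* Count the pairs in a list; no memory is read to test for a pair. */
static size_t
list_length (SCM list)
{
  size_t n = 0;
  while (PAIR_P (list))
    {
      n++;
      list = GET_PAIR (list)->cdr;
    }
  return n;
}
```

Because @code{GET_PAIR (x)->cdr} is a constant subtraction followed by a
constant member offset, a compiler is free to fold the two into a single
displacement.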

@node Conservative GC
@subsection Conservative Garbage Collection

Aside from the latent typing, the major source of constraints on a
Scheme implementation's data representation is the garbage collector.
The collector must be able to traverse every live object in the heap, to
determine which objects are not live, and thus collectable.

There are many ways to implement this. Guile's garbage collection is
built on a library, the Boehm-Demers-Weiser conservative garbage
collector (BDW-GC). The BDW-GC ``just works'', for the most part. But
since it is interesting to know how these things work, we include here a
high-level description of what the BDW-GC does.

Garbage collection has two logical phases: a @dfn{mark} phase, in which
the set of live objects is enumerated, and a @dfn{sweep} phase, in which
objects not traversed in the mark phase are collected. Correct
functioning of the collector depends on being able to traverse the
entire set of live objects.

In the mark phase, the collector scans the system's global variables and
the local variables on the stack to determine which objects are
immediately accessible by the C code. It then scans those objects to
find the objects they point to, and so on. The collector logically sets
a @dfn{mark bit} on each object it finds, so each object is traversed
only once.

When the collector can find no unmarked objects pointed to by marked
objects, it assumes that any objects that are still unmarked will never
be used by the program (since there is no path of dereferences from any
global or local variable that reaches them) and deallocates them.
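
The two phases can be sketched on a toy heap as follows. This is not
how the BDW-GC is implemented; it is a minimal illustration in which the
heap is a plain array, every object has at most two outgoing pointers,
the roots are handed to the collector explicitly, and ``deallocation''
just sets a flag. All names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 2

struct object {
  int marked;                             /* the mark bit */
  int freed;                              /* stands in for deallocation */
  struct object *children[MAX_CHILDREN];  /* outgoing pointers, or NULL */
};

/* Mark phase: visit everything reachable from OBJ.  The mark bit
   ensures each object is traversed only once, even in a cycle.  */
static void
mark (struct object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  for (size_t i = 0; i < MAX_CHILDREN; i++)
    mark (obj->children[i]);
}

/* Mark from the roots, then sweep the heap.  Return the number of
   objects freed by this collection.  */
static size_t
collect (struct object **roots, size_t n_roots,
         struct object *heap, size_t n_objects)
{
  size_t n_freed = 0;

  for (size_t i = 0; i < n_roots; i++)
    mark (roots[i]);

  for (size_t i = 0; i < n_objects; i++)
    {
      if (heap[i].marked)
        heap[i].marked = 0;        /* live: reset for the next cycle */
      else if (!heap[i].freed)
        {
          heap[i].freed = 1;       /* unreachable: collect it */
          n_freed++;
        }
    }
  return n_freed;
}
```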

In the above paragraphs, we did not specify how the garbage collector
finds the global and local variables; as usual, there are many different
approaches. Frequently, the programmer must maintain a list of pointers
to all global variables that refer to the heap, and another list
(adjusted upon entry to and exit from each function) of local variables,
for the collector's benefit.

The list of global variables is usually not too difficult to maintain,
since global variables are relatively rare. However, an explicitly
maintained list of local variables (in the author's personal experience)
is a nightmare to maintain. Thus, the BDW-GC uses a technique called
@dfn{conservative garbage collection}, to make the local variable list
unnecessary.

The trick to conservative collection is to treat the C stack as an
ordinary range of memory, and assume that @emph{every} word on the C
stack is a pointer into the heap. Thus, the collector marks all objects
whose addresses appear anywhere in the C stack, without knowing for sure
how that word is meant to be interpreted.

In addition to the stack, the BDW-GC will also scan static data
sections. This means that global variables are also scanned when looking
for live Scheme objects.
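
The following sketch shows the idea on a simulated stack: every word is
compared against the addresses of the heap objects, and any match marks
the object, whether or not the word was meant as a pointer. It is a toy
model with invented names; in particular, the linear search stands in
for whatever address lookup a real collector uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct object { int marked; };

/* Return the heap object whose address equals WORD, or NULL if WORD
   does not look like a pointer to any object.  */
static struct object *
find_object (uintptr_t word, struct object *heap, size_t n_objects)
{
  for (size_t i = 0; i < n_objects; i++)
    if (word == (uintptr_t) &heap[i])
      return &heap[i];
  return NULL;
}

/* Treat WORDS as an untyped range of memory and mark every object
   whose address appears in it.  Return the number of objects newly
   marked.  */
static size_t
scan_conservatively (const uintptr_t *words, size_t n_words,
                     struct object *heap, size_t n_objects)
{
  size_t n_marked = 0;
  for (size_t i = 0; i < n_words; i++)
    {
      struct object *obj = find_object (words[i], heap, n_objects);
      if (obj != NULL && !obj->marked)
        {
          obj->marked = 1;
          n_marked++;
        }
    }
  return n_marked;
}
```

An integer that happens to equal an object's address keeps that object
alive, which is exactly the imprecision discussed next.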

Obviously, such a system will occasionally retain objects that are
actually garbage, and should be freed. In practice, this is not a
problem, as the set of conservatively-scanned locations is fixed; the
Scheme stack is maintained apart from the C stack, and is scanned
precisely (as opposed to conservatively). The GC-managed heap is also
partitioned into parts that can contain pointers (such as vectors) and
parts that can't (such as bytevectors), limiting the potential for
confusing a raw integer with a pointer to a live object.

Interested readers should see the BDW-GC web page at
@uref{http://www.hboehm.info/gc/}, for more information on conservative
GC in general and the BDW-GC implementation in particular.

@node The SCM Type in Guile
@subsection The SCM Type in Guile

Guile classifies Scheme objects into two kinds: those that fit entirely
within an @code{SCM}, and those that require heap storage.

The former class are called @dfn{immediates}. The class of immediates
includes small integers, characters, boolean values, the empty list, the
mysterious end-of-file object, and some others.

The remaining types are called, not surprisingly, @dfn{non-immediates}.
They include pairs, procedures, strings, vectors, and all other data
types in Guile. For non-immediates, the @code{SCM} word contains a
pointer to data on the heap, and further information about the object
in question is stored in that data.

This section describes how the @code{SCM} type is actually represented
and used at the C level. Interested readers should see
@code{libguile/scm.h} for an exposition of how Guile stores type
information.

In fact, there are two basic C data types to represent objects in
Guile: @code{SCM} and @code{scm_t_bits}.

@menu
* Relationship Between SCM and scm_t_bits::
* Immediate Objects::
* Non-Immediate Objects::
* Allocating Heap Objects::
* Heap Object Type Information::
* Accessing Heap Object Fields::
@end menu

@node Relationship Between SCM and scm_t_bits
@subsubsection Relationship Between @code{SCM} and @code{scm_t_bits}

A variable of type @code{SCM} is guaranteed to hold a valid Scheme
object. A variable of type @code{scm_t_bits}, on the other hand, may
hold a representation of a @code{SCM} value as a C integral type, but
may also hold any C value, even if it does not correspond to a valid
Scheme object.

For a variable @var{x} of type @code{SCM}, the Scheme object's type
information is stored in a form that is not directly usable. To be able
to work on the type encoding of the Scheme value, the @code{SCM}
variable has to be transformed into the corresponding representation as
a @code{scm_t_bits} variable @var{y} by using the @code{SCM_UNPACK}
macro. Once this has been done, the type of the Scheme object @var{x}
can be derived from the content of the bits of the @code{scm_t_bits}
value @var{y}, in the way illustrated by the example earlier in this
chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a
Scheme value as a @code{scm_t_bits} variable can be transformed into the
corresponding @code{SCM} value using the @code{SCM_PACK} macro.
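
The separation between an opaque object type and an integral bits type
can be modelled in a few lines of C. The sketch below is not Guile's
definition of @code{SCM}; it is a toy in which @code{toy_scm},
@code{toy_bits}, @code{TOY_PACK} and @code{TOY_UNPACK} are invented
stand-ins, and the tag test reuses the convention from the earlier
example rather than Guile's real encoding.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;                 /* plays the role of scm_t_bits */
typedef struct { toy_bits bits; } toy_scm;  /* plays the role of SCM */

/* Wrapping the word in a struct makes the compiler reject arithmetic
   or bit tests on a toy_scm until it has been unpacked.  */
#define TOY_UNPACK(x) ((x).bits)
#define TOY_PACK(b)   ((toy_scm) { (toy_bits) (b) })

/* Build a tagged small integer (low bits #b001). */
static toy_scm
toy_make_integer (intptr_t n)
{
  return TOY_PACK (((toy_bits) n << 3) | 1);
}

/* Type tests operate on the unpacked bits, never on the object. */
static int
toy_integer_p (toy_scm x)
{
  return (TOY_UNPACK (x) & 7) == 1;
}
```

With these definitions an expression such as @code{x & 7} on a
@code{toy_scm} does not compile, which is the kind of mistake the
two-type discipline is meant to catch.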

@node Immediate Objects
@subsubsection Immediate Objects

A Scheme object may either be an immediate, i.e.@: carrying all
necessary information by itself, or it may contain a reference to a
@dfn{heap object} which is, as the name implies, data on the heap.
Although in general it should be irrelevant for user code whether an
object is an immediate or not, within Guile's own code the distinction
is sometimes of importance. Thus, the following low level macro is
provided:

@deftypefn Macro int SCM_IMP (SCM @var{x})
A Scheme object is an immediate if it fulfills the @code{SCM_IMP}
predicate, otherwise it holds an encoded reference to a heap object. The
result of the predicate is delivered as a C style boolean value. User
code and code that extends Guile should normally not be required to use
this macro.
@end deftypefn

@noindent
Summary:
@itemize @bullet
@item
Given a Scheme object @var{x} of unknown type, check first
with @code{SCM_IMP (@var{x})} if it is an immediate object.
@item
If so, all of the type and value information can be determined from the
@code{scm_t_bits} value that is delivered by @code{SCM_UNPACK
(@var{x})}.
@end itemize

There are a number of special values in Scheme, most of them documented
elsewhere in this manual. It's not quite the right place to put them,
but for now, here's a list of the C names given to some of these values:

@deftypefn Macro SCM SCM_EOL
The Scheme empty list object, or ``End Of List'' object, usually written
in Scheme as @code{'()}.
@end deftypefn

@deftypefn Macro SCM SCM_EOF_VAL
The Scheme end-of-file value. It has no standard written
representation, for obvious reasons.
@end deftypefn

@deftypefn Macro SCM SCM_UNSPECIFIED
The value returned by some (but not all) expressions that the Scheme
standard says return an ``unspecified'' value.

This is sort of a weirdly literal way to take things, but the standard
read-eval-print loop prints nothing when the expression returns this
value, so it's not a bad idea to return this when you can't think of
anything else helpful.
@end deftypefn

@deftypefn Macro SCM SCM_UNDEFINED
The ``undefined'' value. Its most important property is that it is not
equal to any valid Scheme value. This is put to various internal uses
by C code interacting with Guile.

For example, when you write a C function that is callable from Scheme
and which takes optional arguments, the interpreter passes
@code{SCM_UNDEFINED} for any arguments you did not receive.

We also use this to mark unbound variables.
@end deftypefn

@deftypefn Macro int SCM_UNBNDP (SCM @var{x})
Return true if @var{x} is @code{SCM_UNDEFINED}. Note that this is not a
check to see if @var{x} is @code{SCM_UNBOUND}. History will not be kind
to us.
@end deftypefn

@node Non-Immediate Objects
@subsubsection Non-Immediate Objects

A Scheme object of type @code{SCM} that does not fulfill the
@code{SCM_IMP} predicate holds an encoded reference to a heap object.
This reference can be decoded to a C pointer to a heap object using the
@code{SCM_UNPACK_POINTER} macro. The encoding of a pointer to a heap
object into a @code{SCM} value is done using the @code{SCM_PACK_POINTER}
macro.

@cindex cells, deprecated concept
Before Guile 2.0, Guile had a custom garbage collector that allocated
heap objects in units of 2-word @dfn{cells}. With the move to the
BDW-GC collector in Guile 2.0, Guile can allocate heap objects of any
size, and the concept of a cell is now obsolete. Still, we mention
it here as the name still appears in various low-level interfaces.

@deftypefn Macro {scm_t_bits *} SCM_UNPACK_POINTER (SCM @var{x})
@deftypefnx Macro {scm_t_cell *} SCM2PTR (SCM @var{x})
Extract and return the heap object pointer from a non-immediate
@code{SCM} object @var{x}. The name @code{SCM2PTR} is deprecated but
still common.
@end deftypefn

@deftypefn Macro SCM SCM_PACK_POINTER (scm_t_bits * @var{x})
@deftypefnx Macro SCM PTR2SCM (scm_t_cell * @var{x})
Return a @code{SCM} value that encodes a reference to the heap object
pointer @var{x}. The name @code{PTR2SCM} is deprecated but still
common.
@end deftypefn

Note that it is also possible to transform a non-immediate @code{SCM}
value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable.
However, the result of @code{SCM_UNPACK} may not be used as a pointer to
a heap object: only @code{SCM_UNPACK_POINTER} is guaranteed to transform
a @code{SCM} object into a valid pointer to a heap object. Also, it is
not allowed to apply @code{SCM_PACK_POINTER} to anything that is not a
valid pointer to a heap object.

@noindent
Summary:
@itemize @bullet
@item
Only use @code{SCM_UNPACK_POINTER} on @code{SCM} values for which
@code{SCM_IMP} is false!
@item
Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use
@code{SCM_UNPACK_POINTER (@var{x})} instead!
@item
Don't use @code{SCM_PACK_POINTER} for anything but a heap object pointer!
@end itemize

@node Allocating Heap Objects
@subsubsection Allocating Heap Objects

Heap objects are heap-allocated data pointed to by non-immediate
@code{SCM} values. The first word of the heap object should contain a
type code. The object may be any number of words in length, and is
generally scanned by the garbage collector for additional pointers
unless the object was allocated using a ``pointerless'' allocation
function.

You should generally not need these functions, unless you are
implementing a new data type, and thoroughly understand the code in
@code{<libguile/scm.h>}.

If you just want to allocate pairs, use @code{scm_cons}.

@deftypefn Function SCM scm_words (scm_t_bits word_0, uint32_t n_words)
Allocate a new heap object containing @var{n_words} words, initialize
the first slot to @var{word_0}, and return a non-immediate @code{SCM}
value encoding a pointer to the object. Typically @var{word_0} will
contain the type tag.
@end deftypefn

There are also deprecated but common variants of @code{scm_words} that
use the term ``cell'' to indicate 2-word objects.

@deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1)
Allocate a new 2-word heap object, initialize the two slots with
@var{word_0} and @var{word_1}, and return it. Just like calling
@code{scm_words (@var{word_0}, 2)}, then initializing the second slot to
@var{word_1}.

Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}.
If you want to pass a @code{SCM} object, you need to use
@code{SCM_UNPACK}.
@end deftypefn

@deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3)
Like @code{scm_cell}, but allocates a 4-word heap object.
@end deftypefn
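
The relationship between the three allocators can be mimicked with
ordinary @code{calloc}. The sketch below is a toy model, not Guile's
implementation: @code{toy_words}, @code{toy_cell} and
@code{toy_double_cell} are invented names, they return a raw word
pointer instead of a @code{SCM}, the memory is not managed by a garbage
collector, and zeroing the remaining slots is a choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t toy_bits;   /* plays the role of scm_t_bits */

/* Allocate an object of N_WORDS words whose first word is WORD_0.
   The remaining words are zeroed by calloc.  */
static toy_bits *
toy_words (toy_bits word_0, uint32_t n_words)
{
  assert (n_words >= 1);
  toy_bits *obj = calloc (n_words, sizeof *obj);
  assert (obj != NULL);
  obj[0] = word_0;
  return obj;
}

/* A 2-word object: allocate with toy_words, then fill the second slot. */
static toy_bits *
toy_cell (toy_bits word_0, toy_bits word_1)
{
  toy_bits *obj = toy_words (word_0, 2);
  obj[1] = word_1;
  return obj;
}

/* A 4-word object, built the same way. */
static toy_bits *
toy_double_cell (toy_bits word_0, toy_bits word_1,
                 toy_bits word_2, toy_bits word_3)
{
  toy_bits *obj = toy_words (word_0, 4);
  obj[1] = word_1;
  obj[2] = word_2;
  obj[3] = word_3;
  return obj;
}
```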

@node Heap Object Type Information
@subsubsection Heap Object Type Information

Heap objects contain a type tag and are followed by a number of
word-sized slots. The interpretation of the object contents depends on
the type of the object.

@deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x})
Extract the first word of the heap object pointed to by @var{x}. This
value holds the information about the cell type.
@end deftypefn

@deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t})
For a non-immediate Scheme object @var{x}, write the value @var{t} into
the first word of the heap object referenced by @var{x}. The value
@var{t} must hold a valid cell type.
@end deftypefn

@node Accessing Heap Object Fields
@subsubsection Accessing Heap Object Fields

For a non-immediate Scheme object @var{x}, the object type can be
determined by using the @code{SCM_CELL_TYPE} macro described in the
previous section. For each different type of heap object it is known
which fields hold tagged Scheme objects and which fields hold untagged
raw data. To access the different fields appropriately, the following
macros are provided.

@deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n})
@deftypefnx Macro scm_t_bits SCM_CELL_WORD_0 (SCM @var{x})
@deftypefnx Macro scm_t_bits SCM_CELL_WORD_1 (SCM @var{x})
@deftypefnx Macro scm_t_bits SCM_CELL_WORD_2 (SCM @var{x})
@deftypefnx Macro scm_t_bits SCM_CELL_WORD_3 (SCM @var{x})
Deliver the field @var{n} of the heap object referenced by the
non-immediate Scheme object @var{x} as raw untagged data. Only use this
macro for fields containing untagged data; don't use it for fields
containing tagged @code{SCM} objects.
@end deftypefn

@deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n})
@deftypefnx Macro SCM SCM_CELL_OBJECT_0 (SCM @var{x})
@deftypefnx Macro SCM SCM_CELL_OBJECT_1 (SCM @var{x})
@deftypefnx Macro SCM SCM_CELL_OBJECT_2 (SCM @var{x})
@deftypefnx Macro SCM SCM_CELL_OBJECT_3 (SCM @var{x})
Deliver the field @var{n} of the heap object referenced by the
non-immediate Scheme object @var{x} as a Scheme object. Only use this
macro for fields containing tagged @code{SCM} objects; don't use it for
fields containing untagged data.
@end deftypefn

@deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_0 (SCM @var{x}, scm_t_bits @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_1 (SCM @var{x}, scm_t_bits @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_2 (SCM @var{x}, scm_t_bits @var{w})
@deftypefnx Macro void SCM_SET_CELL_WORD_3 (SCM @var{x}, scm_t_bits @var{w})
Write the raw value @var{w} into field number @var{n} of the heap object
referenced by the non-immediate Scheme value @var{x}. Values that are
written into heap objects as raw values should only be read later using
the @code{SCM_CELL_WORD} macros.
@end deftypefn

@deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o})
@deftypefnx Macro void SCM_SET_CELL_OBJECT_0 (SCM @var{x}, SCM @var{o})
@deftypefnx Macro void SCM_SET_CELL_OBJECT_1 (SCM @var{x}, SCM @var{o})
@deftypefnx Macro void SCM_SET_CELL_OBJECT_2 (SCM @var{x}, SCM @var{o})
@deftypefnx Macro void SCM_SET_CELL_OBJECT_3 (SCM @var{x}, SCM @var{o})
Write the Scheme object @var{o} into field number @var{n} of the heap
object referenced by the non-immediate Scheme value @var{x}. Values
that are written into heap objects as objects should only be read using
the @code{SCM_CELL_OBJECT} macros.
@end deftypefn

@noindent
Summary:
@itemize @bullet
@item
For a non-immediate Scheme object @var{x} of unknown type, get the type
information by using @code{SCM_CELL_TYPE (@var{x})}.
@item
As soon as the type information is available, only use the appropriate
access methods to read and write data to the different heap object
fields.
@item
Note that field 0 stores the cell type information. Generally speaking,
other data associated with a heap object is stored starting from field
1.
@end itemize
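
The discipline summarized above can be illustrated with a toy model in
which field 0 holds a type code and field 1 holds either raw data or a
tagged object, depending on that type. Everything here is invented for
the example: the @code{TOY_} macros only imitate the shape of the
@code{SCM_CELL_} accessors, the two type codes are hypothetical, and the
tagged-integer encoding is the one from the earlier sections rather
than Guile's own.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;                 /* plays the role of scm_t_bits */
typedef struct { toy_bits bits; } toy_scm;  /* plays the role of SCM */

/* Raw accessors, for fields holding untagged data. */
#define TOY_CELL_WORD(obj, n)          ((obj)[n])
#define TOY_SET_CELL_WORD(obj, n, w)   ((obj)[n] = (toy_bits) (w))

/* Object accessors, for fields holding tagged values. */
#define TOY_CELL_OBJECT(obj, n)        ((toy_scm) { (obj)[n] })
#define TOY_SET_CELL_OBJECT(obj, n, o) ((obj)[n] = (o).bits)

/* Field 0 always holds the type code. */
#define TOY_CELL_TYPE(obj)             TOY_CELL_WORD (obj, 0)

/* Hypothetical type codes: a counter stores a raw word in field 1,
   a box stores a tagged object in field 1.  */
enum { TOY_TYPE_COUNTER = 1, TOY_TYPE_BOX = 2 };

/* Read field 1 with the accessor that matches the object's type. */
static toy_bits
toy_payload (toy_bits *obj)
{
  switch (TOY_CELL_TYPE (obj))
    {
    case TOY_TYPE_COUNTER:
      return TOY_CELL_WORD (obj, 1);              /* raw data */
    case TOY_TYPE_BOX:
      return TOY_CELL_OBJECT (obj, 1).bits >> 3;  /* untag the integer */
    default:
      return 0;                                   /* unknown type */
    }
}
```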

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: