  1. @c -*-texinfo-*-
  2. @c This is part of the GNU Guile Reference Manual.
  3. @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
  4. @c Free Software Foundation, Inc.
  5. @c See the file guile.texi for copying conditions.
  6. @c essay \input texinfo
  7. @c essay @c -*-texinfo-*-
  8. @c essay @c %**start of header
  9. @c essay @setfilename data-rep.info
  10. @c essay @settitle Data Representation in Guile
  11. @c essay @c %**end of header
  12. @c essay @include version.texi
  13. @c essay @dircategory The Algorithmic Language Scheme
  14. @c essay @direntry
  15. @c essay * data-rep: (data-rep). Data Representation in Guile --- how to use
  16. @c essay Guile objects in your C code.
  17. @c essay @end direntry
  18. @c essay @setchapternewpage off
  19. @c essay @ifinfo
  20. @c essay Data Representation in Guile
  21. @c essay Copyright (C) 1998, 1999, 2000, 2003, 2006 Free Software Foundation
  22. @c essay Permission is granted to make and distribute verbatim copies of
  23. @c essay this manual provided the copyright notice and this permission notice
  24. @c essay are preserved on all copies.
  25. @c essay @ignore
  26. @c essay Permission is granted to process this file through TeX and print the
  27. @c essay results, provided the printed document carries copying permission
  28. @c essay notice identical to this one except for the removal of this paragraph
  29. @c essay (this paragraph not being relevant to the printed manual).
  30. @c essay @end ignore
  31. @c essay Permission is granted to copy and distribute modified versions of this
  32. @c essay manual under the conditions for verbatim copying, provided that the entire
  33. @c essay resulting derived work is distributed under the terms of a permission
  34. @c essay notice identical to this one.
  35. @c essay Permission is granted to copy and distribute translations of this manual
  36. @c essay into another language, under the above conditions for modified versions,
  37. @c essay except that this permission notice may be stated in a translation approved
  38. @c essay by the Free Software Foundation.
  39. @c essay @end ifinfo
  40. @c essay @titlepage
  41. @c essay @sp 10
  42. @c essay @comment The title is printed in a large font.
  43. @c essay @title Data Representation in Guile
  44. @c essay @subtitle $Id: data-rep.texi,v 1.19.2.1 2006-02-12 13:42:50 mvo Exp $
  45. @c essay @subtitle For use with Guile @value{VERSION}
  46. @c essay @author Jim Blandy
  47. @c essay @author Free Software Foundation
  48. @c essay @author @email{jimb@@red-bean.com}
  49. @c essay @c The following two commands start the copyright page.
  50. @c essay @page
  51. @c essay @vskip 0pt plus 1filll
  52. @c essay @vskip 0pt plus 1filll
  53. @c essay Copyright @copyright{} 1998, 2006 Free Software Foundation
  54. @c essay Permission is granted to make and distribute verbatim copies of
  55. @c essay this manual provided the copyright notice and this permission notice
  56. @c essay are preserved on all copies.
  57. @c essay Permission is granted to copy and distribute modified versions of this
  58. @c essay manual under the conditions for verbatim copying, provided that the entire
  59. @c essay resulting derived work is distributed under the terms of a permission
  60. @c essay notice identical to this one.
  61. @c essay Permission is granted to copy and distribute translations of this manual
  62. @c essay into another language, under the above conditions for modified versions,
  63. @c essay except that this permission notice may be stated in a translation approved
  64. @c essay by Free Software Foundation.
  65. @c essay @end titlepage
  66. @c essay @c @smallbook
  67. @c essay @c @finalout
  68. @c essay @headings double
  69. @c essay @node Top, Data Representation in Scheme, (dir), (dir)
  70. @c essay @top Data Representation in Guile
  71. @c essay @ifinfo
  72. @c essay This essay is meant to provide the background necessary to read and
  73. @c essay write C code that manipulates Scheme values in a way that conforms to
  74. @c essay libguile's interface. If you would like to write or maintain a
  75. @c essay Guile-based application in C or C++, this is the first information you
  76. @c essay need.
  77. @c essay In order to make sense of Guile's @code{SCM_} functions, or read
  78. @c essay libguile's source code, it's essential to have a good grasp of how Guile
  79. @c essay actually represents Scheme values. Otherwise, a lot of the code, and
  80. @c essay the conventions it follows, won't make very much sense.
  81. @c essay We assume you know both C and Scheme, but we do not assume you are
  82. @c essay familiar with Guile's C interface.
  83. @c essay @end ifinfo
  84. @node Data Representation
  85. @appendix Data Representation in Guile
  86. @strong{by Jim Blandy}
  87. [Due to the rather non-orthogonal and performance-oriented nature of the
  88. SCM interface, you need to understand SCM internals @emph{before} you can use
  89. the SCM API. That's why this chapter comes first.]
  90. [NOTE: this is Jim Blandy's essay almost entirely unmodified. It has to
  91. be adapted to fit this manual smoothly.]
  92. In order to make sense of Guile's @code{SCM_} functions, or read libguile's
  93. source code, it's essential to have a good grasp of how Guile actually
  94. represents Scheme values. Otherwise, a lot of the code, and the
  95. conventions it follows, won't make very much sense. This essay is meant
  96. to provide the background necessary to read and write C code that
  97. manipulates Scheme values in a way that is compatible with libguile.
  98. We assume you know both C and Scheme, but we do not assume you are
  99. familiar with Guile's implementation.
  100. @menu
  101. * Data Representation in Scheme:: Why things aren't just totally
  102. straightforward, in general terms.
  103. * How Guile does it:: How to write C code that manipulates
  104. Guile values, with an explanation
  105. of Guile's garbage collector.
  106. @end menu
  107. @node Data Representation in Scheme
  108. @section Data Representation in Scheme
  109. Scheme is a latently-typed language; this means that the system cannot,
  110. in general, determine the type of a given expression at compile time.
  111. Types only become apparent at run time. Variables do not have fixed
  112. types; a variable may hold a pair at one point, an integer at the next,
  113. and a thousand-element vector later. Instead, values, not variables,
  114. have fixed types.
  115. In order to implement standard Scheme functions like @code{pair?} and
  116. @code{string?} and provide garbage collection, the representation of
  117. every value must contain enough information to accurately determine its
  118. type at run time. Often, Scheme systems also use this information to
  119. determine whether a program has attempted to apply an operation to an
  120. inappropriately typed value (such as taking the @code{car} of a string).
  121. Because variables, pairs, and vectors may hold values of any type,
  122. Scheme implementations use a uniform representation for values --- a
  123. single type large enough to hold either a complete value or a pointer
  124. to a complete value, along with the necessary typing information.
  125. The following sections will present a simple typing system, and then
  126. make some refinements to correct its major weaknesses. However, this is
  127. not a description of the system Guile actually uses. It is only an
  128. illustration of the issues Guile's system must address. We provide all
  129. the information one needs to work with Guile's data in @ref{How Guile
  130. does it}.
  131. @menu
  132. * A Simple Representation::
  133. * Faster Integers::
  134. * Cheaper Pairs::
  135. * Guile Is Hairier::
  136. @end menu
  137. @node A Simple Representation
  138. @subsection A Simple Representation
  139. The simplest way to meet the above requirements in C would be to
  140. represent each value as a pointer to a structure containing a type
  141. indicator, followed by a union carrying the real value. Assuming that
  142. @code{SCM} is the name of our universal type, we can write:
  143. @example
  144. enum type @{ integer, pair, string, vector, ... @};
  145. typedef struct value *SCM;
  146. struct value @{
  147. enum type type;
  148. union @{
  149. int integer;
  150. struct @{ SCM car, cdr; @} pair;
  151. struct @{ int length; char *elts; @} string;
  152. struct @{ int length; SCM *elts; @} vector;
  153. ...
  154. @} value;
  155. @};
  156. @end example
  157. with the ellipses replaced with code for the remaining Scheme types.
  158. This representation is sufficient to implement all of Scheme's
  159. semantics. If @var{x} is an @code{SCM} value:
  160. @itemize @bullet
  161. @item
  162. To test if @var{x} is an integer, we can write @code{@var{x}->type == integer}.
  163. @item
  164. To find its value, we can write @code{@var{x}->value.integer}.
  165. @item
  166. To test if @var{x} is a vector, we can write @code{@var{x}->type == vector}.
  167. @item
  168. If we know @var{x} is a vector, we can write
  169. @code{@var{x}->value.vector.elts[0]} to refer to its first element.
  170. @item
  171. If we know @var{x} is a pair, we can write
  172. @code{@var{x}->value.pair.car} to extract its car.
  173. @end itemize
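As a minimal sketch of how code built on this representation might look, the following cuts the structure down to integers and pairs and adds constructors and a type-checked accessor. The names @code{make_integer}, @code{make_pair} and @code{checked_car} are hypothetical helpers introduced only for illustration; they are not part of Guile.

```c
#include <assert.h>
#include <stdlib.h>

/* A cut-down version of the representation above: integers and pairs only. */
enum type { integer, pair };

typedef struct value *SCM;

struct value {
  enum type type;
  union {
    int integer;
    struct { SCM car, cdr; } pair;
  } value;
};

/* Hypothetical constructor: note that every integer costs a heap allocation. */
static SCM make_integer(int n)
{
  SCM x = malloc(sizeof *x);
  assert(x != NULL);
  x->type = integer;
  x->value.integer = n;
  return x;
}

/* Hypothetical constructor for pairs. */
static SCM make_pair(SCM car, SCM cdr)
{
  SCM x = malloc(sizeof *x);
  assert(x != NULL);
  x->type = pair;
  x->value.pair.car = car;
  x->value.pair.cdr = cdr;
  return x;
}

/* Type-checked accessor: refuses to take the car of a non-pair. */
static SCM checked_car(SCM x)
{
  assert(x->type == pair);
  return x->value.pair.car;
}
```

The run-time check in @code{checked_car} is the kind of test a Scheme system can use to reject taking the @code{car} of a string.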
  174. @node Faster Integers
  175. @subsection Faster Integers
  176. Unfortunately, the above representation has a serious disadvantage. In
  177. order to return an integer, an expression must allocate a @code{struct
  178. value}, initialize it to represent that integer, and return a pointer to
  179. it. Furthermore, fetching an integer's value requires a memory
  180. reference, which is much slower than a register reference on most
  181. processors. Since integers are extremely common, this representation is
  182. too costly, in both time and space. Integers should be very cheap to
  183. create and manipulate.
  184. One possible solution comes from the observation that, on many
  185. architectures, structures must be aligned on a four-byte boundary.
  186. (Whether or not the machine actually requires it, we can write our own
  187. allocator for @code{struct value} objects that assures this is true.)
  188. In this case, the lower two bits of the structure's address are known to
  189. be zero.
  190. This gives us the room we need to provide an improved representation
  191. for integers. We make the following rules:
  192. @itemize @bullet
  193. @item
  194. If the lower two bits of an @code{SCM} value are zero, then the @code{SCM}
  195. value is a pointer to a @code{struct value}, and everything proceeds as
  196. before.
  197. @item
  198. Otherwise, the @code{SCM} value represents an integer, whose value
  199. appears in its upper bits.
  200. @end itemize
  201. Here is C code implementing this convention:
  202. @example
  203. enum type @{ pair, string, vector, ... @};
  204. typedef struct value *SCM;
  205. struct value @{
  206. enum type type;
  207. union @{
  208. struct @{ SCM car, cdr; @} pair;
  209. struct @{ int length; char *elts; @} string;
  210. struct @{ int length; SCM *elts; @} vector;
  211. ...
  212. @} value;
  213. @};
  214. #define POINTER_P(x) (((int) (x) & 3) == 0)
  215. #define INTEGER_P(x) (! POINTER_P (x))
  216. #define GET_INTEGER(x) ((int) (x) >> 2)
  217. #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1))
  218. @end example
  219. Notice that @code{integer} no longer appears as an element of @code{enum
  220. type}, and the union has lost its @code{integer} member. Instead, we
  221. use the @code{POINTER_P} and @code{INTEGER_P} macros to make a coarse
  222. classification of values into integers and non-integers, and do further
  223. type testing as before.
  224. Here's how we would answer the questions posed above (again, assume
  225. @var{x} is an @code{SCM} value):
  226. @itemize @bullet
  227. @item
  228. To test if @var{x} is an integer, we can write @code{INTEGER_P (@var{x})}.
  229. @item
  230. To find its value, we can write @code{GET_INTEGER (@var{x})}.
  231. @item
  232. To test if @var{x} is a vector, we can write:
  233. @example
  234. @code{POINTER_P (@var{x}) && @var{x}->type == vector}
  235. @end example
  236. Given the new representation, we must make sure @var{x} is truly a
  237. pointer before we dereference it to determine its complete type.
  238. @item
  239. If we know @var{x} is a vector, we can write
  240. @code{@var{x}->value.vector.elts[0]} to refer to its first element, as
  241. before.
  242. @item
  243. If we know @var{x} is a pair, we can write
  244. @code{@var{x}->value.pair.car} to extract its car, just as before.
  245. @end itemize
  246. This representation allows us to operate more efficiently on integers
  247. than the first. For example, if @var{x} and @var{y} are known to be
  248. integers, we can compute their sum as follows:
  249. @example
  250. MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y}))
  251. @end example
  252. Now, integer math requires no allocation or memory references. Most
  253. real Scheme systems actually use an even more efficient representation,
  254. but this essay isn't about bit-twiddling. (Hint: what if pointers had
  255. @code{01} in their least significant bits, and integers had @code{00}?)
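One possible reading of the hint, sketched under assumptions: if integers carry the tag @code{00}, a tagged integer is simply the value times four, so two tagged integers can be added without unpacking them. The type @code{scm_word} and the function @code{add_integers} are hypothetical names, and multiplication and division stand in for shifts so that negative values behave portably.

```c
#include <assert.h>
#include <stdint.h>

/* Model an SCM word as a signed integer wide enough to hold a pointer. */
typedef intptr_t scm_word;

/* Integers carry tag 00: the word is the value times four. */
#define INTEGER_P(x)    ((((uintptr_t) (x)) & 3) == 0)
#define MAKE_INTEGER(n) ((scm_word) (n) * 4)
#define GET_INTEGER(x)  ((x) / 4)

/* Pointers carry tag 01; subtracting the tag recovers the address. */
#define POINTER_P(x)    ((((uintptr_t) (x)) & 3) == 1)
#define MAKE_POINTER(p) ((scm_word) (p) + 1)
#define GET_POINTER(x)  ((void *) ((x) - 1))

/* With a zero tag, adding two tagged integers yields the tagged sum
   directly: 4a + 4b == 4(a + b).  Overflow is not checked here. */
static scm_word add_integers(scm_word x, scm_word y)
{
  assert(INTEGER_P(x) && INTEGER_P(y));
  return x + y;
}
```

The cost moves to pointers, which now need the tag removed before each dereference.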
  256. @node Cheaper Pairs
  257. @subsection Cheaper Pairs
  258. However, there is yet another issue to confront. Most Scheme heaps
  259. contain more pairs than any other type of object; Jonathan Rees says
  260. that pairs occupy 45% of the heap in his Scheme implementation, Scheme
  261. 48. However, our representation above spends three @code{SCM}-sized
  262. words per pair --- one for the type, and two for the @sc{car} and
  263. @sc{cdr}. Is there any way to represent pairs using only two words?
  264. Let us refine the convention we established earlier. Let us assert
  265. that:
  266. @itemize @bullet
  267. @item
  268. If the bottom two bits of an @code{SCM} value are @code{#b00}, then
  269. it is a pointer, as before.
  270. @item
  271. If the bottom two bits are @code{#b01}, then the upper bits are an
  272. integer. This is a bit more restrictive than before.
  273. @item
  274. If the bottom two bits are @code{#b10}, then the value, with the bottom
  275. two bits masked out, is the address of a pair.
  276. @end itemize
  277. Here is the new C code:
  278. @example
  279. enum type @{ string, vector, ... @};
  280. typedef struct value *SCM;
  281. struct value @{
  282. enum type type;
  283. union @{
  284. struct @{ int length; char *elts; @} string;
  285. struct @{ int length; SCM *elts; @} vector;
  286. ...
  287. @} value;
  288. @};
  289. struct pair @{
  290. SCM car, cdr;
  291. @};
  292. #define POINTER_P(x) (((int) (x) & 3) == 0)
  293. #define INTEGER_P(x) (((int) (x) & 3) == 1)
  294. #define GET_INTEGER(x) ((int) (x) >> 2)
  295. #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1))
  296. #define PAIR_P(x) (((int) (x) & 3) == 2)
  297. #define GET_PAIR(x) ((struct pair *) ((int) (x) & ~3))
  298. @end example
  299. Notice that @code{enum type} and @code{struct value} now only contain
  300. provisions for vectors and strings; both integers and pairs have become
  301. special cases. The code above also assumes that an @code{int} is large
  302. enough to hold a pointer, which is not true in general (for example, on systems with 32-bit @code{int} and 64-bit pointers).
  303. Our list of examples is now as follows:
  304. @itemize @bullet
  305. @item
  306. To test if @var{x} is an integer, we can write @code{INTEGER_P
  307. (@var{x})}; this is as before.
  308. @item
  309. To find its value, we can write @code{GET_INTEGER (@var{x})}, as
  310. before.
  311. @item
  312. To test if @var{x} is a vector, we can write:
  313. @example
  314. @code{POINTER_P (@var{x}) && @var{x}->type == vector}
  315. @end example
  316. We must still make sure that @var{x} is a pointer to a @code{struct
  317. value} before dereferencing it to find its type.
  318. @item
  319. If we know @var{x} is a vector, we can write
  320. @code{@var{x}->value.vector.elts[0]} to refer to its first element, as
  321. before.
  322. @item
  323. We can write @code{PAIR_P (@var{x})} to determine if @var{x} is a
  324. pair, and then write @code{GET_PAIR (@var{x})->car} to refer to its
  325. car.
  326. @end itemize
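The same three-way convention can be sketched without assuming that an @code{int} can hold a pointer. This version is an illustration only: it redefines @code{SCM} as the C99 type @code{uintptr_t}, and @code{MAKE_PAIR} and @code{get_integer} are hypothetical helpers not present in the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: SCM is modelled as an unsigned integer that can hold a pointer. */
typedef uintptr_t SCM;

struct pair { SCM car, cdr; };

#define TAG(x)          ((x) & 3)
#define POINTER_P(x)    (TAG (x) == 0)
#define INTEGER_P(x)    (TAG (x) == 1)
#define PAIR_P(x)       (TAG (x) == 2)

/* Unsigned shifts are well defined, so negative values pack safely. */
#define MAKE_INTEGER(n) ((((SCM) (intptr_t) (n)) << 2) | 1)

/* Pair addresses are at least four-byte aligned, so the low bits are free. */
#define MAKE_PAIR(p)    ((SCM) (p) | 2)
#define GET_PAIR(x)     ((struct pair *) ((x) & ~(SCM) 3))

/* Strip the tag, reinterpret as signed, and divide exactly by four;
   this avoids right-shifting a negative value. */
static intptr_t get_integer(SCM x)
{
  assert(INTEGER_P(x));
  return (intptr_t) (x - 1) / 4;
}
```

All tag tests work on the word itself, so deciding whether a value is a pair never touches memory.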
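  327. This change in representation reduces our heap size by 15%: if pairs make up 45% of the heap, as quoted above, dropping one of their three words saves a third of that share. It also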
  328. makes it cheaper to decide if a value is a pair, because no memory
  329. references are necessary; it suffices to check the bottom two bits of
  330. the @code{SCM} value. This may be significant when traversing lists, a
  331. common activity in a Scheme system.
  332. Again, most real Scheme systems use a slightly different implementation;
  333. for example, if @code{GET_PAIR} subtracts off the low bits of @code{x}, instead
  334. of masking them off, the optimizer will often be able to combine that
  335. subtraction with the addition of the offset of the structure member we
  336. are referencing, making a modified pointer as fast to use as an
  337. unmodified pointer.
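The difference between masking and subtracting can be sketched as follows. This is an illustration under assumptions: @code{SCM} is modelled as @code{uintptr_t}, and @code{GET_PAIR_MASK}, @code{GET_PAIR_SUB} and @code{pair_cdr} are hypothetical names. Whether a compiler actually folds the two constants depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t SCM;          /* assumption: a pointer-sized unsigned word */

struct pair { SCM car, cdr; };

#define PAIR_TAG 2

/* Masking clears the low bits whatever they are. */
#define GET_PAIR_MASK(x) ((struct pair *) ((x) & ~(SCM) 3))

/* Subtracting is valid only when the tag is known to be PAIR_TAG. */
#define GET_PAIR_SUB(x)  ((struct pair *) ((x) - PAIR_TAG))

/* The address fetched here is x - PAIR_TAG + offsetof (struct pair, cdr).
   Both adjustments are constants, so they can be combined into a single
   displacement on the load. */
static SCM pair_cdr(SCM x)
{
  assert((x & 3) == PAIR_TAG);
  return GET_PAIR_SUB(x)->cdr;
}
```

With masking, the AND must be performed before the member offset is added, so the two steps cannot be merged in the same way.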
  338. @node Guile Is Hairier
  339. @subsection Guile Is Hairier
  340. We originally started with a very simple typing system --- each object
  341. has a field that indicates its type. Then, for the sake of efficiency
  342. in both time and space, we moved some of the typing information directly
  343. into the @code{SCM} value, and left the rest in the @code{struct value}.
  344. Guile itself employs a more complex hierarchy, storing finer and finer
  345. gradations of type information in different places, depending on the
  346. object's coarser type.
  347. In the author's opinion, Guile could be simplified greatly without
  348. significant loss of efficiency, but the simplified system would still be
  349. more complex than what we've presented above.
  350. @node How Guile does it
  351. @section How Guile does it
  352. Here we present the specifics of how Guile represents its data. We
  353. don't go into complete detail; an exhaustive description of Guile's
  354. system would be boring, and we do not wish to encourage people to write
  355. code which depends on its details anyway. We do, however, present
  356. everything one needs to know to use Guile's data.
  357. This section is in limbo. It used to document the ``low-level'' C API
  358. of Guile that was used both by clients of libguile and by libguile
  359. itself.
  360. In the future, clients should only need to look into the sections
  361. @ref{Programming in C} and @ref{API Reference}. This section will in
  362. the end only contain stuff about the internals of Guile.
  363. @menu
  364. * General Rules::
  365. * Conservative GC::
  366. * Immediates vs Non-immediates::
  367. * Immediate Datatypes::
  368. * Non-immediate Datatypes::
  369. * Signalling Type Errors::
  370. * Unpacking the SCM type::
  371. @end menu
  372. @node General Rules
  373. @subsection General Rules
  374. Any code which operates on Guile datatypes must @code{#include} the
  375. header file @code{<libguile.h>}. This file contains a definition for
  376. the @code{SCM} typedef (Guile's universal type, as in the examples
  377. above), and definitions and declarations for a host of macros and
  378. functions that operate on @code{SCM} values.
  379. All identifiers declared by @code{<libguile.h>} begin with @code{scm_}
  380. or @code{SCM_}.
  381. @c [[I wish this were true, but I don't think it is at the moment. -JimB]]
  382. @c Macros do not evaluate their arguments more than once, unless documented
  383. @c to do so.
  384. The functions described here generally check the types of their
  385. @code{SCM} arguments, and signal an error if their arguments are of an
  386. inappropriate type. Macros generally do not, unless that is their
  387. specified purpose. You must verify their argument types beforehand, as
  388. necessary.
  389. Macros and functions that return a boolean value have names ending in
  390. @code{P} or @code{_p} (for ``predicate''). Those that return a negated
  391. boolean value have names starting with @code{SCM_N}. For example,
  392. @code{SCM_IMP (@var{x})} is a predicate which returns non-zero iff
  393. @var{x} is an immediate value (an @code{IM}). @code{SCM_NCONSP
  394. (@var{x})} is a predicate which returns non-zero iff @var{x} is
  395. @emph{not} a pair object (a @code{CONS}).
  396. @node Conservative GC
  397. @subsection Conservative Garbage Collection
  398. Aside from the latent typing, the major source of constraints on a
  399. Scheme implementation's data representation is the garbage collector.
  400. The collector must be able to traverse every live object in the heap, to
  401. determine which objects are not live.
  402. There are many ways to implement this, but Guile uses an algorithm
  403. called @dfn{mark and sweep}. The collector scans the system's global
  404. variables and the local variables on the stack to determine which
  405. objects are immediately accessible by the C code. It then scans those
  406. objects to find the objects they point to, @i{et cetera}. The collector
  407. sets a @dfn{mark bit} on each object it finds, so each object is
  408. traversed only once. This process is called @dfn{tracing}.
  409. When the collector can find no unmarked objects pointed to by marked
  410. objects, it assumes that any objects that are still unmarked will never
  411. be used by the program (since there is no path of dereferences from any
  412. global or local variable that reaches them) and deallocates them.
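The marking and sweeping phases described above can be sketched with a toy heap. This is a simplified model and not Guile's collector: the fixed-size @code{pool}, the explicit root array and the names @code{mark}, @code{sweep} and @code{collect} are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

/* A toy heap object with two outgoing references, like a pair. */
struct object {
  int in_use;               /* allocated and not yet reclaimed */
  int marked;               /* the mark bit */
  struct object *ref[2];    /* objects this one points to, or NULL */
};

static struct object pool[POOL_SIZE];

/* Trace everything reachable from obj.  The mark bit ensures each
   object is traversed only once, so cycles terminate. */
static void mark(struct object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  mark(obj->ref[0]);
  mark(obj->ref[1]);
}

/* Reclaim every unmarked object and clear the marks for the next cycle.
   Returns the number of objects reclaimed. */
static int sweep(void)
{
  int freed = 0;
  for (size_t i = 0; i < POOL_SIZE; i++) {
    if (pool[i].in_use && !pool[i].marked) {
      pool[i].in_use = 0;
      pool[i].ref[0] = pool[i].ref[1] = NULL;
      freed++;
    }
    pool[i].marked = 0;
  }
  return freed;
}

/* One collection: mark from each root, then sweep. */
static int collect(struct object **roots, size_t nroots)
{
  assert(roots != NULL || nroots == 0);
  for (size_t i = 0; i < nroots; i++)
    mark(roots[i]);
  return sweep();
}
```

Objects that reference each other in a cycle are still reclaimed when no root reaches them, because reachability is computed from the roots rather than from reference counts.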
  413. In the above paragraphs, we did not specify how the garbage collector
  414. finds the global and local variables; as usual, there are many different
  415. approaches. Frequently, the programmer must maintain a list of pointers
  416. to all global variables that refer to the heap, and another list
  417. (adjusted upon entry to and exit from each function) of local variables,
  418. for the collector's benefit.
  419. The list of global variables is usually not too difficult to maintain,
  420. since global variables are relatively rare. However, an explicitly
  421. maintained list of local variables (in the author's personal experience)
  422. is a nightmare to maintain. Thus, Guile uses a technique called
  423. @dfn{conservative garbage collection}, to make the local variable list
  424. unnecessary.
  425. The trick to conservative collection is to treat the stack as an
  426. ordinary range of memory, and assume that @emph{every} word on the stack
  427. may be a pointer into the heap. Thus, the collector marks all objects whose
  428. addresses appear anywhere in the stack, without knowing for sure how
  429. each such word is meant to be interpreted.
  430. Obviously, such a system will occasionally retain objects that are
  431. actually garbage, and should be freed. In practice, this is not a
  432. problem. The alternative, an explicitly maintained list of local
  433. variable addresses, is effectively much less reliable, due to programmer
  434. error.
  435. To accommodate this technique, data must be represented so that the
  436. collector can accurately determine whether a given stack word is a
  437. pointer or not. Guile does this as follows:
  438. @itemize @bullet
  439. @item
  440. Every heap object has a two-word header, called a @dfn{cell}. Some
  441. objects, like pairs, fit entirely in a cell's two words; others may
  442. store pointers to additional memory in either of the words. For
  443. example, strings and vectors store their length in the first word, and a
  444. pointer to their elements in the second.
  445. @item
  446. Guile allocates whole arrays of cells at a time, called @dfn{heap
  447. segments}. These segments are always allocated so that the cells they
  448. contain fall on eight-byte boundaries, or whatever is appropriate for
  449. the machine's word size. Guile keeps all cells in a heap segment
  450. initialized, whether or not they are currently in use.
  451. @item
  452. Guile maintains a sorted table of heap segments.
  453. @end itemize
  454. Thus, given any random word @var{w} fetched from the stack, Guile's
  455. garbage collector can consult the table to see if @var{w} falls within a
  456. known heap segment, and check @var{w}'s alignment. If both tests pass,
  457. the collector knows that @var{w} is a valid pointer to a cell,
  458. intentional or not, and proceeds to trace the cell.
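The two tests applied to a stack word can be sketched as a binary search over the sorted segment table followed by an alignment check. This is a model under assumptions, not Guile's code: @code{struct cell}, @code{struct segment} and @code{possible_cell_p} are hypothetical names, and alignment is checked relative to the segment start on the assumption that segments begin on a cell boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a cell: two pointer-sized words. */
struct cell { uintptr_t word0, word1; };

/* A heap segment is a contiguous array of cells occupying [start, end). */
struct segment { uintptr_t start, end; };

/* Return nonzero if w lies inside some segment of the table (sorted by
   start address, non-overlapping) and falls on a cell boundary. */
static int possible_cell_p(const struct segment *table, size_t n, uintptr_t w)
{
  size_t lo = 0, hi = n;
  assert(table != NULL || n == 0);
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (w < table[mid].start)
      hi = mid;                     /* look in the lower half */
    else if (w >= table[mid].end)
      lo = mid + 1;                 /* look in the upper half */
    else
      return (w - table[mid].start) % sizeof (struct cell) == 0;
  }
  return 0;                         /* not inside any known segment */
}
```

A word that passes both tests is treated as a cell pointer even if it is really an integer that happens to look like one, which is why the approach is called conservative.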
  459. Note that heap segments do not contain all the data Guile uses; cells
  460. for objects like vectors and strings contain pointers to other memory
  461. areas. However, since those pointers are internal, and not shared among
  462. many pieces of code, it is enough for the collector to find the cell,
  463. and then use the cell's type to find more pointers to trace.
  464. @node Immediates vs Non-immediates
  465. @subsection Immediates vs Non-immediates
  466. Guile classifies Scheme objects into two kinds: those that fit entirely
  467. within an @code{SCM}, and those that require heap storage.
  468. The former class are called @dfn{immediates}. The class of immediates
  469. includes small integers, characters, boolean values, the empty list, the
  470. mysterious end-of-file object, and some others.
  471. The remaining types are called, not surprisingly, @dfn{non-immediates}.
  472. They include pairs, procedures, strings, vectors, and all other data
  473. types in Guile.
  474. @deftypefn Macro int SCM_IMP (SCM @var{x})
  475. Return non-zero iff @var{x} is an immediate object.
  476. @end deftypefn
  477. @deftypefn Macro int SCM_NIMP (SCM @var{x})
  478. Return non-zero iff @var{x} is a non-immediate object. This is the
  479. exact complement of @code{SCM_IMP}, above.
  480. @end deftypefn
  481. Note that for versions of Guile prior to 1.4 it was necessary to use the
  482. @code{SCM_NIMP} macro before calling a finer-grained predicate to
  483. determine @var{x}'s type, such as @code{SCM_CONSP} or
  484. @code{SCM_VECTORP}. This is no longer required: the definitions of all
  485. Guile type predicates now include a call to @code{SCM_NIMP} where
  486. necessary.
  487. @node Immediate Datatypes
  488. @subsection Immediate Datatypes
  489. The following datatypes are immediate values; that is, they fit entirely
  490. within an @code{SCM} value. The @code{SCM_IMP} and @code{SCM_NIMP}
  491. macros will distinguish these from non-immediates; see @ref{Immediates
  492. vs Non-immediates} for an explanation of the distinction.
  493. Note that the type predicates for immediate values work correctly on any
  494. @code{SCM} value; you do not need to call @code{SCM_IMP} first, to
  495. establish that a value is immediate.
  496. @menu
  497. * Integer Data::
  498. * Character Data::
  499. * Boolean Data::
  500. * Unique Values::
  501. @end menu
  502. @node Integer Data
  503. @subsubsection Integers
  504. Here are macros for operating on small integers that fit within an
  505. @code{SCM}. Such integers are called @dfn{immediate numbers}, or
  506. @dfn{INUMs}. In general, INUMs occupy all but two bits of an
  507. @code{SCM}.
  508. Bignums and floating-point numbers are non-immediate objects, and have
  509. their own separate accessors. The macros here will not work on
  510. them. This is not as much of a problem as you might think, however,
  511. because the system never constructs bignums that could fit in an INUM,
  512. and never uses floating point values for exact integers.
  513. @deftypefn Macro int SCM_INUMP (SCM @var{x})
  514. Return non-zero iff @var{x} is a small integer value.
  515. @end deftypefn
  516. @deftypefn Macro int SCM_NINUMP (SCM @var{x})
The complement of @code{SCM_INUMP}.
  518. @end deftypefn
  519. @deftypefn Macro int SCM_INUM (SCM @var{x})
  520. Return the value of @var{x} as an ordinary, C integer. If @var{x}
  521. is not an INUM, the result is undefined.
  522. @end deftypefn
  523. @deftypefn Macro SCM SCM_MAKINUM (int @var{i})
  524. Given a C integer @var{i}, return its representation as an @code{SCM}.
  525. This function does not check for overflow.
  526. @end deftypefn
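As a rough illustration of how a small integer can share a machine word with two tag bits, here is a self-contained sketch. It does not use Guile's headers: the @code{toy_} names, the tag value 2, and the choice of @code{intptr_t} are all assumptions made for this example, not Guile's actual definitions. It also shows the kind of range check a caller needs, since @code{SCM_MAKINUM} does not check for overflow.

```c
#include <stdint.h>

/* Hypothetical stand-in for SCM: one signed machine word. */
typedef intptr_t toy_scm;

/* Assumed tag for small integers, kept in the low two bits. */
#define TOY_INUM_TAG 2

/* Range of C integers that still fit once two bits go to the tag. */
#define TOY_MOST_POSITIVE_INUM (INTPTR_MAX / 4)
#define TOY_MOST_NEGATIVE_INUM (INTPTR_MIN / 4)

/* Non-zero iff i can be represented as a toy INUM. */
static int
toy_fixable (intptr_t i)
{
  return i >= TOY_MOST_NEGATIVE_INUM && i <= TOY_MOST_POSITIVE_INUM;
}

/* Like SCM_MAKINUM in spirit: no overflow check, so call
   toy_fixable first when the value might be large. */
static toy_scm
toy_makinum (intptr_t i)
{
  return i * 4 + TOY_INUM_TAG;
}

/* Non-zero iff x carries the small-integer tag. */
static int
toy_inump (toy_scm x)
{
  return ((uintptr_t) x & 3) == TOY_INUM_TAG;
}

/* Recover the C integer; meaningless unless toy_inump (x) holds. */
static intptr_t
toy_inum (toy_scm x)
{
  return (x - TOY_INUM_TAG) / 4;
}
```

Under these assumptions @code{toy_makinum (42)} is the word 170 (42 * 4 + 2). A caller that cannot rule out large values would test @code{toy_fixable} first and use some other representation when it fails.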
  527. @node Character Data
  528. @subsubsection Characters
  529. Here are functions for operating on characters.
  530. @deftypefn Macro int SCM_CHARP (SCM @var{x})
  531. Return non-zero iff @var{x} is a character value.
  532. @end deftypefn
  533. @deftypefn Macro {unsigned int} SCM_CHAR (SCM @var{x})
Return the value of @var{x} as a C character. If @var{x} is not a
  535. Scheme character, the result is undefined.
  536. @end deftypefn
  537. @deftypefn Macro SCM SCM_MAKE_CHAR (int @var{c})
  538. Given a C character @var{c}, return its representation as a Scheme
  539. character value.
  540. @end deftypefn
  541. @node Boolean Data
  542. @subsubsection Booleans
  543. Booleans are represented as two specific immediate SCM values,
  544. @code{SCM_BOOL_T} and @code{SCM_BOOL_F}. @xref{Booleans}, for more
  545. information.
  546. @node Unique Values
  547. @subsubsection Unique Values
  548. The immediate values that are neither small integers, characters, nor
  549. booleans are all unique values --- that is, datatypes with only one
  550. instance.
  551. @deftypefn Macro SCM SCM_EOL
  552. The Scheme empty list object, or ``End Of List'' object, usually written
  553. in Scheme as @code{'()}.
  554. @end deftypefn
  555. @deftypefn Macro SCM SCM_EOF_VAL
  556. The Scheme end-of-file value. It has no standard written
  557. representation, for obvious reasons.
  558. @end deftypefn
  559. @deftypefn Macro SCM SCM_UNSPECIFIED
  560. The value returned by expressions which the Scheme standard says return
  561. an ``unspecified'' value.
  562. This is sort of a weirdly literal way to take things, but the standard
  563. read-eval-print loop prints nothing when the expression returns this
  564. value, so it's not a bad idea to return this when you can't think of
  565. anything else helpful.
  566. @end deftypefn
  567. @deftypefn Macro SCM SCM_UNDEFINED
The ``undefined'' value. Its most important property is that it is not
  569. equal to any valid Scheme value. This is put to various internal uses
  570. by C code interacting with Guile.
For example, when you write a C function that is callable from Scheme
and which takes optional arguments, the interpreter passes
@code{SCM_UNDEFINED} for any optional arguments that the caller did not supply.
  574. We also use this to mark unbound variables.
  575. @end deftypefn
  576. @deftypefn Macro int SCM_UNBNDP (SCM @var{x})
  577. Return true if @var{x} is @code{SCM_UNDEFINED}. Apply this to a
  578. symbol's value to see if it has a binding as a global variable.
  579. @end deftypefn
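The role of @code{SCM_UNDEFINED} as a marker for omitted optional arguments can be pictured with the following self-contained sketch. Everything prefixed @code{toy_} is hypothetical and exists only for this example; in real code the interpreter supplies @code{SCM_UNDEFINED} and the C function tests for it with @code{SCM_UNBNDP}.

```c
/* Hypothetical object representation: a pointer to some heap record. */
typedef struct toy_object { long value; } *toy_scm;

/* One distinguished record that no ordinary value ever points to. */
static struct toy_object toy_undefined_record;
#define TOY_UNDEFINED (&toy_undefined_record)
#define TOY_UNBNDP(x) ((x) == TOY_UNDEFINED)

/* A procedure with one required and one optional argument.  The
   caller passes TOY_UNDEFINED for STEP when it was omitted. */
static long
toy_advance (toy_scm start, toy_scm step)
{
  long delta = TOY_UNBNDP (step) ? 1 : step->value;  /* default step is 1 */
  return start->value + delta;
}
```

Because the marker is compared by identity and never dereferenced, it cannot be confused with a legitimate argument, whatever that argument's value is.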
  580. @node Non-immediate Datatypes
  581. @subsection Non-immediate Datatypes
  582. A non-immediate datatype is one which lives in the heap, either because
  583. it cannot fit entirely within a @code{SCM} word, or because it denotes a
  584. specific storage location (in the nomenclature of the Revised^5 Report
  585. on Scheme).
  586. The @code{SCM_IMP} and @code{SCM_NIMP} macros will distinguish these
  587. from immediates; see @ref{Immediates vs Non-immediates}.
  588. Given a cell, Guile distinguishes between pairs and other non-immediate
  589. types by storing special @dfn{tag} values in a non-pair cell's car, that
  590. cannot appear in normal pairs. A cell with a non-tag value in its car
  591. is an ordinary pair. The type of a cell with a tag in its car depends
  592. on the tag; the non-immediate type predicates test this value. If a tag
  593. value appears elsewhere (in a vector, for example), the heap may become
  594. corrupted.
  595. Note how the type information for a non-immediate object is split
  596. between the @code{SCM} word and the cell that the @code{SCM} word points
  597. to. The @code{SCM} word itself only indicates that the object is
  598. non-immediate --- in other words stored in a heap cell. The tag stored
  599. in the first word of the heap cell indicates more precisely the type of
  600. that object.
  601. The type predicates for non-immediate values work correctly on any
  602. @code{SCM} value; you do not need to call @code{SCM_NIMP} first, to
  603. establish that a value is non-immediate.
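The split of type information between the object word and the car of the cell can be sketched as follows. This is a toy model under stated assumptions, not Guile's real encoding: the @code{toy_} names, the bit assignments, and the tag value are all invented for the example.

```c
#include <stdint.h>

/* Hypothetical object word and two-word heap cell. */
typedef uintptr_t toy_scm;
typedef struct toy_cell { toy_scm car, cdr; } toy_cell;

/* Assumptions of this sketch: cells are at least 4-byte aligned, so
   a reference to a cell has its low two bits clear; immediates have
   bit 1 set; header tags have bit 0 set, which no valid object word
   has. */
#define TOY_IMP(x)      (((x) & 2) != 0)
#define TOY_TAG_VECTOR  ((toy_scm) 0x0d)   /* made-up header tag */

/* Encode a cell pointer as an object word. */
static toy_scm
toy_ref (toy_cell *c)
{
  return (toy_scm) c;
}

/* Non-zero iff x refers to a cell whose car holds an ordinary
   object, that is, iff x is a pair.  Safe on any word: immediates
   are rejected before the cell is touched. */
static int
toy_consp (toy_scm x)
{
  return !TOY_IMP (x) && (((toy_cell *) x)->car & 1) == 0;
}

/* Non-zero iff x refers to a cell whose car holds the vector tag. */
static int
toy_vectorp (toy_scm x)
{
  return !TOY_IMP (x) && ((toy_cell *) x)->car == TOY_TAG_VECTOR;
}
```

Both predicates test for an immediate before looking at the cell, which is why they can be applied to any value without a separate check by the caller.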
  604. @menu
  605. * Pair Data::
  606. * Vector Data::
  607. * Procedures::
  608. * Closures::
  609. * Subrs::
  610. * Port Data::
  611. @end menu
  612. @node Pair Data
  613. @subsubsection Pairs
  614. Pairs are the essential building block of list structure in Scheme. A
  615. pair object has two fields, called the @dfn{car} and the @dfn{cdr}.
  616. It is conventional for a pair's @sc{car} to contain an element of a
  617. list, and the @sc{cdr} to point to the next pair in the list, or to
  618. contain @code{SCM_EOL}, indicating the end of the list. Thus, a set of
  619. pairs chained through their @sc{cdr}s constitutes a singly-linked list.
  620. Scheme and libguile define many functions which operate on lists
  621. constructed in this fashion, so although lists chained through the
  622. @sc{car}s of pairs will work fine too, they may be less convenient to
  623. manipulate, and receive less support from the community.
  624. Guile implements pairs by mapping the @sc{car} and @sc{cdr} of a pair
  625. directly into the two words of the cell.
  626. @deftypefn Macro int SCM_CONSP (SCM @var{x})
  627. Return non-zero iff @var{x} is a Scheme pair object.
  628. @end deftypefn
  629. @deftypefn Macro int SCM_NCONSP (SCM @var{x})
The complement of @code{SCM_CONSP}.
  631. @end deftypefn
  632. @deftypefun SCM scm_cons (SCM @var{car}, SCM @var{cdr})
  633. Allocate (``CONStruct'') a new pair, with @var{car} and @var{cdr} as its
  634. contents.
  635. @end deftypefun
The macros below perform no type checking. The results are undefined if
@var{cell} is an immediate. However, since all non-immediate Guile
objects are constructed from cells, and these macros simply access the
first or second word of a cell, they actually can be useful on datatypes other
than pairs. (Of course, it is not very modular to use them outside of
the code which implements that datatype.)
  642. @deftypefn Macro SCM SCM_CAR (SCM @var{cell})
  643. Return the @sc{car}, or first field, of @var{cell}.
  644. @end deftypefn
  645. @deftypefn Macro SCM SCM_CDR (SCM @var{cell})
  646. Return the @sc{cdr}, or second field, of @var{cell}.
  647. @end deftypefn
  648. @deftypefn Macro void SCM_SETCAR (SCM @var{cell}, SCM @var{x})
  649. Set the @sc{car} of @var{cell} to @var{x}.
  650. @end deftypefn
  651. @deftypefn Macro void SCM_SETCDR (SCM @var{cell}, SCM @var{x})
  652. Set the @sc{cdr} of @var{cell} to @var{x}.
  653. @end deftypefn
  654. @deftypefn Macro SCM SCM_CAAR (SCM @var{cell})
  655. @deftypefnx Macro SCM SCM_CADR (SCM @var{cell})
  656. @deftypefnx Macro SCM SCM_CDAR (SCM @var{cell}) @dots{}
  657. @deftypefnx Macro SCM SCM_CDDDDR (SCM @var{cell})
  658. Return the @sc{car} of the @sc{car} of @var{cell}, the @sc{car} of the
  659. @sc{cdr} of @var{cell}, @i{et cetera}.
  660. @end deftypefn
  661. @node Vector Data
  662. @subsubsection Vectors, Strings, and Symbols
  663. Vectors, strings, and symbols have some properties in common. They all
  664. have a length, and they all have an array of elements. In the case of a
  665. vector, the elements are @code{SCM} values; in the case of a string or
  666. symbol, the elements are characters.
  667. All these types store their length (along with some tagging bits) in the
  668. @sc{car} of their header cell, and store a pointer to the elements in
  669. their @sc{cdr}. Thus, the @code{SCM_CAR} and @code{SCM_CDR} macros
  670. are (somewhat) meaningful when applied to these datatypes.
  671. @deftypefn Macro int SCM_VECTORP (SCM @var{x})
  672. Return non-zero iff @var{x} is a vector.
  673. @end deftypefn
  674. @deftypefn Macro int SCM_STRINGP (SCM @var{x})
  675. Return non-zero iff @var{x} is a string.
  676. @end deftypefn
  677. @deftypefn Macro int SCM_SYMBOLP (SCM @var{x})
  678. Return non-zero iff @var{x} is a symbol.
  679. @end deftypefn
  680. @deftypefn Macro int SCM_VECTOR_LENGTH (SCM @var{x})
  681. @deftypefnx Macro int SCM_STRING_LENGTH (SCM @var{x})
  682. @deftypefnx Macro int SCM_SYMBOL_LENGTH (SCM @var{x})
  683. Return the length of the object @var{x}. The result is undefined if
  684. @var{x} is not a vector, string, or symbol, respectively.
  685. @end deftypefn
  686. @deftypefn Macro {SCM *} SCM_VECTOR_BASE (SCM @var{x})
  687. Return a pointer to the array of elements of the vector @var{x}.
  688. The result is undefined if @var{x} is not a vector.
  689. @end deftypefn
  690. @deftypefn Macro {char *} SCM_STRING_CHARS (SCM @var{x})
  691. @deftypefnx Macro {char *} SCM_SYMBOL_CHARS (SCM @var{x})
  692. Return a pointer to the characters of @var{x}. The result is undefined
  693. if @var{x} is not a symbol or string, respectively.
  694. @end deftypefn
  695. There are also a few magic values stuffed into memory before a symbol's
  696. characters, but you don't want to know about those. What cruft!
  697. Note that @code{SCM_VECTOR_BASE}, @code{SCM_STRING_CHARS} and
  698. @code{SCM_SYMBOL_CHARS} return pointers to data within the respective
  699. object. Care must be taken that the object is not garbage collected
while that data is still being accessed. This is the same as for a
smob; see @ref{Remembering During Operations}.
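The header layout described above, with the length and some tag bits in the car and a pointer to the elements in the cdr, can be modelled by the following self-contained sketch. The @code{toy_} names, the 8-bit tag field, and the tag value are assumptions chosen for the example; Guile's real layout is defined in its own headers.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t toy_bits;
typedef struct toy_cell { toy_bits car, cdr; } toy_cell;

/* Assumed layout: the low 8 bits of the car hold a type tag and the
   remaining bits hold the length.  Both numbers are invented. */
#define TOY_TAG_BITS    8
#define TOY_TAG_MASK    (((toy_bits) 1 << TOY_TAG_BITS) - 1)
#define TOY_TAG_STRING  ((toy_bits) 0x15)

/* Fill in HEADER so that it describes the LEN characters at CHARS. */
static void
toy_init_string (toy_cell *header, char *chars, size_t len)
{
  header->car = ((toy_bits) len << TOY_TAG_BITS) | TOY_TAG_STRING;
  header->cdr = (toy_bits) chars;
}

/* Non-zero iff HEADER carries the string tag. */
static int
toy_stringp (const toy_cell *header)
{
  return (header->car & TOY_TAG_MASK) == TOY_TAG_STRING;
}

/* Length stored above the tag bits. */
static size_t
toy_string_length (const toy_cell *header)
{
  return (size_t) (header->car >> TOY_TAG_BITS);
}

/* Pointer into the object's own storage: only valid while the
   object itself is alive. */
static char *
toy_string_chars (const toy_cell *header)
{
  return (char *) header->cdr;
}
```

The last accessor hands out a pointer to storage owned by the object, which is the reason for the warning about garbage collection above.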
  702. @node Procedures
  703. @subsubsection Procedures
  704. Guile provides two kinds of procedures: @dfn{closures}, which are the
  705. result of evaluating a @code{lambda} expression, and @dfn{subrs}, which
  706. are C functions packaged up as Scheme objects, to make them available to
  707. Scheme programmers.
  708. (There are actually other sorts of procedures: compiled closures, and
  709. continuations; see the source code for details about them.)
  710. @deftypefun SCM scm_procedure_p (SCM @var{x})
  711. Return @code{SCM_BOOL_T} iff @var{x} is a Scheme procedure object, of
  712. any sort. Otherwise, return @code{SCM_BOOL_F}.
  713. @end deftypefun
  714. @node Closures
  715. @subsubsection Closures
  716. [FIXME: this needs to be further subbed, but texinfo has no subsubsub]
  717. A closure is a procedure object, generated as the value of a
  718. @code{lambda} expression in Scheme. The representation of a closure is
  719. straightforward --- it contains a pointer to the code of the lambda
  720. expression from which it was created, and a pointer to the environment
  721. it closes over.
  722. In Guile, each closure also has a property list, allowing the system to
  723. store information about the closure. I'm not sure what this is used for
  724. at the moment --- the debugger, maybe?
  725. @deftypefn Macro int SCM_CLOSUREP (SCM @var{x})
  726. Return non-zero iff @var{x} is a closure.
  727. @end deftypefn
  728. @deftypefn Macro SCM SCM_PROCPROPS (SCM @var{x})
  729. Return the property list of the closure @var{x}. The results are
  730. undefined if @var{x} is not a closure.
  731. @end deftypefn
  732. @deftypefn Macro void SCM_SETPROCPROPS (SCM @var{x}, SCM @var{p})
  733. Set the property list of the closure @var{x} to @var{p}. The results
  734. are undefined if @var{x} is not a closure.
  735. @end deftypefn
  736. @deftypefn Macro SCM SCM_CODE (SCM @var{x})
  737. Return the code of the closure @var{x}. The result is undefined if
  738. @var{x} is not a closure.
  739. This function should probably only be used internally by the
  740. interpreter, since the representation of the code is intimately
  741. connected with the interpreter's implementation.
  742. @end deftypefn
  743. @deftypefn Macro SCM SCM_ENV (SCM @var{x})
  744. Return the environment enclosed by @var{x}.
  745. The result is undefined if @var{x} is not a closure.
  746. This function should probably only be used internally by the
  747. interpreter, since the representation of the environment is intimately
  748. connected with the interpreter's implementation.
  749. @end deftypefn
  750. @node Subrs
  751. @subsubsection Subrs
  752. [FIXME: this needs to be further subbed, but texinfo has no subsubsub]
  753. A subr is a pointer to a C function, packaged up as a Scheme object to
  754. make it callable by Scheme code. In addition to the function pointer,
  755. the subr also contains a pointer to the name of the function, and
  756. information about the number of arguments accepted by the C function, for
  757. the sake of error checking.
  758. There is no single type predicate macro that recognizes subrs, as
  759. distinct from other kinds of procedures. The closest thing is
  760. @code{scm_procedure_p}; see @ref{Procedures}.
@deftypefn Macro {char *} SCM_SNAME (SCM @var{x})
  762. Return the name of the subr @var{x}. The result is undefined if
  763. @var{x} is not a subr.
  764. @end deftypefn
  765. @deftypefun SCM scm_c_define_gsubr (char *@var{name}, int @var{req}, int @var{opt}, int @var{rest}, SCM (*@var{function})())
Create a new subr object named @var{name}, based on the C function
@var{function}, make it visible to Scheme as the value of a global
variable named @var{name}, and return the subr object.
  769. The subr object accepts @var{req} required arguments, @var{opt} optional
  770. arguments, and a @var{rest} argument iff @var{rest} is non-zero. The C
  771. function @var{function} should accept @code{@var{req} + @var{opt}}
arguments, or @code{@var{req} + @var{opt} + 1} arguments if @var{rest}
  773. is non-zero.
  774. When a subr object is applied, it must be applied to at least @var{req}
  775. arguments, or else Guile signals an error. @var{function} receives the
  776. subr's first @var{req} arguments as its first @var{req} arguments. If
  777. there are fewer than @var{opt} arguments remaining, then @var{function}
  778. receives the value @code{SCM_UNDEFINED} for any missing optional
  779. arguments.
If @var{rest} is non-zero, then any arguments after the first
@code{@var{req} + @var{opt}} are packaged up as a list and passed as
@var{function}'s last argument. @var{function} must not modify that
list. (When the subr is called through @code{apply}, the list comes
directly from the @code{apply} argument, which the caller will expect
to be unchanged.)
  786. Note that subrs can actually only accept a predefined set of
  787. combinations of required, optional, and rest arguments. For example, a
  788. subr can take one required argument, or one required and one optional
  789. argument, but a subr can't take one required and two optional arguments.
  790. It's bizarre, but that's the way the interpreter was written. If the
  791. arguments to @code{scm_c_define_gsubr} do not fit one of the predefined
  792. patterns, then @code{scm_c_define_gsubr} will return a compiled closure
  793. object instead of a subr object.
  794. @end deftypefun
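The way actual arguments are distributed over required, optional, and rest parameters can be worked through with a small stand-alone model. The type and function names below are hypothetical and nothing here calls into Guile. The sketch also rejects surplus arguments when there is no rest parameter, which the description above does not spell out.

```c
/* Outcome of applying a subr declared with (req, opt, rest) to argc
   actual arguments, following the rules described above. */
typedef struct toy_arg_plan
{
  int ok;         /* zero if the call is an arity error */
  int supplied;   /* optional parameters that receive a real argument */
  int undefined;  /* optional parameters that receive SCM_UNDEFINED */
  int in_rest;    /* arguments collected into the rest list */
} toy_arg_plan;

static toy_arg_plan
toy_plan_args (int req, int opt, int rest, int argc)
{
  toy_arg_plan plan = { 0, 0, 0, 0 };
  int remaining = argc - req;   /* arguments left after the required ones */

  if (remaining < 0)
    return plan;                /* too few arguments */
  if (remaining > opt && !rest)
    return plan;                /* too many, and no rest list to hold them */

  plan.ok = 1;
  plan.supplied = remaining < opt ? remaining : opt;
  plan.undefined = opt - plan.supplied;
  plan.in_rest = rest ? remaining - plan.supplied : 0;
  return plan;
}
```

For a subr declared with one required argument, two optional arguments, and a rest argument, a call with two actual arguments yields one supplied optional, one @code{SCM_UNDEFINED}, and an empty rest list.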
  795. @node Port Data
  796. @subsubsection Ports
  797. Haven't written this yet, 'cos I don't understand ports yet.
  798. @node Signalling Type Errors
  799. @subsection Signalling Type Errors
  800. Every function visible at the Scheme level should aggressively check the
  801. types of its arguments, to avoid misinterpreting a value, and perhaps
  802. causing a segmentation fault. Guile provides some macros to make this
  803. easier.
  804. @deftypefn Macro void SCM_ASSERT (int @var{test}, SCM @var{obj}, unsigned int @var{position}, const char *@var{subr})
  805. If @var{test} is zero, signal a ``wrong type argument'' error,
  806. attributed to the subroutine named @var{subr}, operating on the value
  807. @var{obj}, which is the @var{position}'th argument of @var{subr}.
  808. @end deftypefn
  809. @deftypefn Macro int SCM_ARG1
  810. @deftypefnx Macro int SCM_ARG2
  811. @deftypefnx Macro int SCM_ARG3
  812. @deftypefnx Macro int SCM_ARG4
  813. @deftypefnx Macro int SCM_ARG5
  814. @deftypefnx Macro int SCM_ARG6
  815. @deftypefnx Macro int SCM_ARG7
One of the above values can be used for @var{position} to indicate the
number of the argument of @var{subr} which is being checked.
Alternatively, a positive integer can be used, which makes it possible to
check arguments after the seventh. However, for argument numbers up to
seven it is preferable to use the corresponding @code{SCM_ARG1} @dots{}
@code{SCM_ARG7} macro instead of the raw number, since it makes the code
easier to understand.
  823. @end deftypefn
  824. @deftypefn Macro int SCM_ARGn
Passing a value of zero or @code{SCM_ARGn} for @var{position} leaves
it unspecified which argument's type is incorrect. Again,
@code{SCM_ARGn} should be preferred over a raw zero constant.
  828. @end deftypefn
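The pattern of checking each argument and reporting its position can be sketched without Guile as follows. All @code{toy_} names are hypothetical. The important difference from the real macro is flagged in the comments: @code{SCM_ASSERT} signals an error, while this model merely records one and returns.

```c
/* Hypothetical value type with an explicit kind field. */
typedef enum { TOY_INTEGER, TOY_STRING } toy_kind;
typedef struct { toy_kind kind; long integer; } toy_value;

/* Position codes, modelled on SCM_ARGn, SCM_ARG1, SCM_ARG2. */
enum { TOY_ARGn = 0, TOY_ARG1 = 1, TOY_ARG2 = 2 };

/* What went wrong, and where. */
typedef struct { const char *subr; unsigned position; } toy_type_error;

/* If TEST is zero, record a wrong-type error in ERR and return zero.
   Guile's SCM_ASSERT would signal the error instead of returning. */
static int
toy_assert (int test, unsigned position, const char *subr,
            toy_type_error *err)
{
  if (test)
    return 1;
  err->subr = subr;
  err->position = position;
  return 0;
}

/* Add two integers, checking each argument before it is used. */
static int
toy_add (toy_value a, toy_value b, long *sum, toy_type_error *err)
{
  if (!toy_assert (a.kind == TOY_INTEGER, TOY_ARG1, "toy-add", err))
    return 0;
  if (!toy_assert (b.kind == TOY_INTEGER, TOY_ARG2, "toy-add", err))
    return 0;
  *sum = a.integer + b.integer;
  return 1;
}
```

Checking every argument before touching its contents is what keeps a badly typed call from being misinterpreted as valid data.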
  829. @node Unpacking the SCM type
  830. @subsection Unpacking the SCM Type
  831. The previous sections have explained how @code{SCM} values can refer to
  832. immediate and non-immediate Scheme objects. For immediate objects, the
  833. complete object value is stored in the @code{SCM} word itself, while for
  834. non-immediates, the @code{SCM} word contains a pointer to a heap cell,
  835. and further information about the object in question is stored in that
  836. cell. This section describes how the @code{SCM} type is actually
  837. represented and used at the C level.
  838. In fact, there are two basic C data types to represent objects in
  839. Guile: @code{SCM} and @code{scm_t_bits}.
  840. @menu
  841. * Relationship between SCM and scm_t_bits::
  842. * Immediate objects::
  843. * Non-immediate objects::
  844. * Allocating Cells::
  845. * Heap Cell Type Information::
  846. * Accessing Cell Entries::
  847. * Basic Rules for Accessing Cell Entries::
  848. @end menu
  849. @node Relationship between SCM and scm_t_bits
  850. @subsubsection Relationship between @code{SCM} and @code{scm_t_bits}
  851. A variable of type @code{SCM} is guaranteed to hold a valid Scheme
  852. object. A variable of type @code{scm_t_bits}, on the other hand, may
  853. hold a representation of a @code{SCM} value as a C integral type, but
  854. may also hold any C value, even if it does not correspond to a valid
  855. Scheme object.
  856. For a variable @var{x} of type @code{SCM}, the Scheme object's type
  857. information is stored in a form that is not directly usable. To be able
to work on the type encoding of the Scheme value, the @code{SCM}
variable has to be transformed into the corresponding representation as
a @code{scm_t_bits} variable @var{y} by using the @code{SCM_UNPACK}
macro. Once this has been done, the type of the Scheme object @var{x}
  862. can be derived from the content of the bits of the @code{scm_t_bits}
  863. value @var{y}, in the way illustrated by the example earlier in this
  864. chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a
  865. Scheme value as a @code{scm_t_bits} variable can be transformed into the
  866. corresponding @code{SCM} value using the @code{SCM_PACK} macro.
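One way to picture the separation between the two types is to wrap the object word in a struct, so that the compiler refuses to treat it as a number. The sketch below is only an illustration of that idea: the @code{toy_} names and the struct representation are assumptions of the example and make no claim about how Guile defines @code{SCM}.

```c
#include <stdint.h>

/* Raw bits: any integer value at all. */
typedef uintptr_t toy_bits;

/* Object handle, wrapped in a struct so that it cannot be mixed up
   with plain integers by accident. */
typedef struct toy_scm { toy_bits n; } toy_scm;

/* Object handle to raw bits, in the manner of SCM_UNPACK. */
static toy_bits
toy_unpack (toy_scm x)
{
  return x.n;
}

/* Raw bits to object handle, in the manner of SCM_PACK.  The caller
   is responsible for B being a valid encoding. */
static toy_scm
toy_pack (toy_bits b)
{
  toy_scm x = { b };
  return x;
}

/* Type inspection has to go through the raw bits. */
static int
toy_low_tag (toy_scm x)
{
  return (int) (toy_unpack (x) & 7);
}
```

With these definitions an expression such as @code{x & 7} on a @code{toy_scm} is rejected by the compiler and has to be written @code{toy_unpack (x) & 7}.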
  867. @node Immediate objects
  868. @subsubsection Immediate objects
  869. A Scheme object may either be an immediate, i.e. carrying all necessary
  870. information by itself, or it may contain a reference to a @dfn{cell}
  871. with additional information on the heap. Although in general it should
  872. be irrelevant for user code whether an object is an immediate or not,
  873. within Guile's own code the distinction is sometimes of importance.
  874. Thus, the following low level macro is provided:
  875. @deftypefn Macro int SCM_IMP (SCM @var{x})
  876. A Scheme object is an immediate if it fulfills the @code{SCM_IMP}
  877. predicate, otherwise it holds an encoded reference to a heap cell. The
  878. result of the predicate is delivered as a C style boolean value. User
  879. code and code that extends Guile should normally not be required to use
  880. this macro.
  881. @end deftypefn
  882. @noindent
  883. Summary:
  884. @itemize @bullet
  885. @item
  886. Given a Scheme object @var{x} of unknown type, check first
  887. with @code{SCM_IMP (@var{x})} if it is an immediate object.
  888. @item
  889. If so, all of the type and value information can be determined from the
  890. @code{scm_t_bits} value that is delivered by @code{SCM_UNPACK
  891. (@var{x})}.
  892. @end itemize
  893. @node Non-immediate objects
  894. @subsubsection Non-immediate objects
  895. A Scheme object of type @code{SCM} that does not fulfill the
  896. @code{SCM_IMP} predicate holds an encoded reference to a heap cell.
  897. This reference can be decoded to a C pointer to a heap cell using the
  898. @code{SCM2PTR} macro. The encoding of a pointer to a heap cell into a
  899. @code{SCM} value is done using the @code{PTR2SCM} macro.
  900. @c (FIXME:: this name should be changed)
@deftypefn Macro {scm_t_cell *} SCM2PTR (SCM @var{x})
  902. Extract and return the heap cell pointer from a non-immediate @code{SCM}
  903. object @var{x}.
  904. @end deftypefn
  905. @c (FIXME:: this name should be changed)
  906. @deftypefn Macro SCM PTR2SCM (scm_t_cell * @var{x})
  907. Return a @code{SCM} value that encodes a reference to the heap cell
  908. pointer @var{x}.
  909. @end deftypefn
  910. Note that it is also possible to transform a non-immediate @code{SCM}
  911. value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable.
  912. However, the result of @code{SCM_UNPACK} may not be used as a pointer to
  913. a @code{scm_t_cell}: only @code{SCM2PTR} is guaranteed to transform a
  914. @code{SCM} object into a valid pointer to a heap cell. Also, it is not
  915. allowed to apply @code{PTR2SCM} to anything that is not a valid pointer
  916. to a heap cell.
  917. @noindent
  918. Summary:
  919. @itemize @bullet
  920. @item
  921. Only use @code{SCM2PTR} on @code{SCM} values for which @code{SCM_IMP} is
  922. false!
  923. @item
  924. Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use @code{SCM2PTR
  925. (@var{x})} instead!
  926. @item
  927. Don't use @code{PTR2SCM} for anything but a cell pointer!
  928. @end itemize
  929. @node Allocating Cells
  930. @subsubsection Allocating Cells
Guile provides both ordinary cells with two slots, and double cells
with four slots. The following two functions are the most primitive
way to allocate such cells.
If the caller intends to use a cell as a header for some other type, she
must pass an appropriate magic value in @var{word_0}, to mark it as a
member of that type, and pass whatever values that type expects in
@var{word_1} and any further slots. You should generally not need these functions,
  938. unless you are implementing a new datatype, and thoroughly understand
  939. the code in @code{<libguile/tags.h>}.
  940. If you just want to allocate pairs, use @code{scm_cons}.
@deftypefn Function SCM scm_cell (scm_t_bits @var{word_0}, scm_t_bits @var{word_1})
  942. Allocate a new cell, initialize the two slots with @var{word_0} and
  943. @var{word_1}, and return it.
  944. Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}.
  945. If you want to pass a @code{SCM} object, you need to use
  946. @code{SCM_UNPACK}.
  947. @end deftypefn
@deftypefn Function SCM scm_double_cell (scm_t_bits @var{word_0}, scm_t_bits @var{word_1}, scm_t_bits @var{word_2}, scm_t_bits @var{word_3})
  949. Like @code{scm_cell}, but allocates a double cell with four
  950. slots.
  951. @end deftypefn
  952. @node Heap Cell Type Information
  953. @subsubsection Heap Cell Type Information
Heap cells contain a number of entries, each of which is either a Scheme
  955. object of type @code{SCM} or a raw C value of type @code{scm_t_bits}.
  956. Which of the cell entries contain Scheme objects and which contain raw C
  957. values is determined by the first entry of the cell, which holds the
  958. cell type information.
  959. @deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x})
  960. For a non-immediate Scheme object @var{x}, deliver the content of the
  961. first entry of the heap cell referenced by @var{x}. This value holds
  962. the information about the cell type.
  963. @end deftypefn
  964. @deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t})
  965. For a non-immediate Scheme object @var{x}, write the value @var{t} into
  966. the first entry of the heap cell referenced by @var{x}. The value
  967. @var{t} must hold a valid cell type.
  968. @end deftypefn
  969. @node Accessing Cell Entries
  970. @subsubsection Accessing Cell Entries
  971. For a non-immediate Scheme object @var{x}, the object type can be
  972. determined by reading the cell type entry using the @code{SCM_CELL_TYPE}
  973. macro. For each different type of cell it is known which cell entries
  974. hold Scheme objects and which cell entries hold raw C data. To access
  975. the different cell entries appropriately, the following macros are
  976. provided.
  977. @deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n})
  978. Deliver the cell entry @var{n} of the heap cell referenced by the
non-immediate Scheme object @var{x} as raw data. It is illegal to
  980. access cell entries that hold Scheme objects by using these macros. For
  981. convenience, the following macros are also provided.
  982. @itemize @bullet
  983. @item
  984. SCM_CELL_WORD_0 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 0)
  985. @item
  986. SCM_CELL_WORD_1 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 1)
  987. @item
  988. @dots{}
  989. @item
  990. SCM_CELL_WORD_@var{n} (@var{x}) @result{} SCM_CELL_WORD (@var{x}, @var{n})
  991. @end itemize
  992. @end deftypefn
  993. @deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n})
  994. Deliver the cell entry @var{n} of the heap cell referenced by the
non-immediate Scheme object @var{x} as a Scheme object. It is illegal
  996. to access cell entries that do not hold Scheme objects by using these
  997. macros. For convenience, the following macros are also provided.
  998. @itemize @bullet
  999. @item
  1000. SCM_CELL_OBJECT_0 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 0)
  1001. @item
  1002. SCM_CELL_OBJECT_1 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 1)
  1003. @item
  1004. @dots{}
  1005. @item
  1006. SCM_CELL_OBJECT_@var{n} (@var{x}) @result{} SCM_CELL_OBJECT (@var{x},
  1007. @var{n})
  1008. @end itemize
  1009. @end deftypefn
  1010. @deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w})
  1011. Write the raw C value @var{w} into entry number @var{n} of the heap cell
  1012. referenced by the non-immediate Scheme value @var{x}. Values that are
  1013. written into cells this way may only be read from the cells using the
  1014. @code{SCM_CELL_WORD} macros or, in case cell entry 0 is written, using
the @code{SCM_CELL_TYPE} macro. For the special case of cell entry 0,
@var{w} must contain cell type information that
does not describe a Scheme object. For convenience, the following
  1018. macros are also provided.
  1019. @itemize @bullet
  1020. @item
  1021. SCM_SET_CELL_WORD_0 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
  1022. (@var{x}, 0, @var{w})
  1023. @item
  1024. SCM_SET_CELL_WORD_1 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
  1025. (@var{x}, 1, @var{w})
  1026. @item
  1027. @dots{}
  1028. @item
  1029. SCM_SET_CELL_WORD_@var{n} (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD
  1030. (@var{x}, @var{n}, @var{w})
  1031. @end itemize
  1032. @end deftypefn
  1033. @deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o})
  1034. Write the Scheme object @var{o} into entry number @var{n} of the heap
  1035. cell referenced by the non-immediate Scheme value @var{x}. Values that
  1036. are written into cells this way may only be read from the cells using
  1037. the @code{SCM_CELL_OBJECT} macros or, in case cell entry 0 is written,
  1038. using the @code{SCM_CELL_TYPE} macro. For the special case of cell
  1039. entry 0 the writing of a Scheme object into this cell is only allowed
  1040. if the cell forms a Scheme pair. For convenience, the following macros
  1041. are also provided.
  1042. @itemize @bullet
  1043. @item
  1044. SCM_SET_CELL_OBJECT_0 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT
  1045. (@var{x}, 0, @var{o})
  1046. @item
  1047. SCM_SET_CELL_OBJECT_1 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT
  1048. (@var{x}, 1, @var{o})
  1049. @item
  1050. @dots{}
  1051. @item
  1052. SCM_SET_CELL_OBJECT_@var{n} (@var{x}, @var{o}) @result{}
  1053. SCM_SET_CELL_OBJECT (@var{x}, @var{n}, @var{o})
  1054. @end itemize
  1055. @end deftypefn
  1056. @noindent
  1057. Summary:
  1058. @itemize @bullet
  1059. @item
  1060. For a non-immediate Scheme object @var{x} of unknown type, get the type
  1061. information by using @code{SCM_CELL_TYPE (@var{x})}.
  1062. @item
  1063. As soon as the cell type information is available, only use the
  1064. appropriate access methods to read and write data to the different cell
  1065. entries.
  1066. @end itemize
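The discipline summarized above (read the cell type first, then use the accessor family that matches each entry) can be sketched with a stand-alone model. The @code{toy_} names, the struct-wrapped object handle, and the ``box'' type with its tag value are all invented for this example and are not part of Guile.

```c
#include <stdint.h>

typedef uintptr_t toy_bits;
typedef struct toy_scm { toy_bits n; } toy_scm;          /* object handle */
typedef struct toy_cell { toy_bits word[2]; } toy_cell;  /* two-entry cell */

/* Made-up type tag for a cell whose entry 1 holds a raw C long. */
#define TOY_TAG_BOX ((toy_bits) 0x7f)

static toy_scm
toy_ref (toy_cell *c)
{
  toy_scm x = { (toy_bits) c };
  return x;
}

static toy_cell *
toy_cellptr (toy_scm x)
{
  return (toy_cell *) x.n;
}

/* Raw access: for entries that hold C data. */
static toy_bits
toy_cell_word (toy_scm x, unsigned n)
{
  return toy_cellptr (x)->word[n];
}

static void
toy_set_cell_word (toy_scm x, unsigned n, toy_bits w)
{
  toy_cellptr (x)->word[n] = w;
}

/* Object access: for entries that hold object handles. */
static toy_scm
toy_cell_object (toy_scm x, unsigned n)
{
  toy_scm o = { toy_cellptr (x)->word[n] };
  return o;
}

static void
toy_set_cell_object (toy_scm x, unsigned n, toy_scm o)
{
  toy_cellptr (x)->word[n] = o.n;
}

/* Entry 0 always carries the type information. */
static toy_bits
toy_cell_type (toy_scm x)
{
  return toy_cell_word (x, 0);
}

/* Read the C long stored in a box: check the type in entry 0, then
   read entry 1 as a raw word.  Returns zero if X is not a box. */
static int
toy_box_value (toy_scm x, long *out)
{
  if (toy_cell_type (x) != TOY_TAG_BOX)
    return 0;
  *out = (long) toy_cell_word (x, 1);
  return 1;
}
```

Because the word and object accessors have different C types in this model, reading an entry through the wrong family shows up as a compile-time type error rather than as a silent misinterpretation.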
  1067. @node Basic Rules for Accessing Cell Entries
  1068. @subsubsection Basic Rules for Accessing Cell Entries
  1069. For each cell type it is generally up to the implementation of that type
  1070. which of the corresponding cell entries hold Scheme objects and which
  1071. hold raw C values. However, there is one basic rule that has to be
  1072. followed: Scheme pairs consist of exactly two cell entries, which both
  1073. contain Scheme objects. Further, a cell which contains a Scheme object
in its first entry has to be a Scheme pair. In other words, it is not
allowed to store a Scheme object in the first cell entry and a
non-Scheme object in the second cell entry.
  1077. @c Fixme:shouldn't this rather be SCM_PAIRP / SCM_PAIR_P ?
  1078. @deftypefn Macro int SCM_CONSP (SCM @var{x})
Determine whether the Scheme object @var{x} is a Scheme pair,
i.e. whether @var{x} references a heap cell consisting of exactly two
entries, where both entries contain a Scheme object. In this case, both
entries will have to be accessed using the @code{SCM_CELL_OBJECT}
macros. Conversely, if the @code{SCM_CONSP} predicate is not
  1084. fulfilled, the first entry of the Scheme cell is guaranteed not to be a
  1085. Scheme value and thus the first cell entry must be accessed using the
  1086. @code{SCM_CELL_WORD_0} macro.
  1087. @end deftypefn
  1088. @c Local Variables:
  1089. @c TeX-master: "guile.texi"
  1090. @c End: