  1. @c -*-texinfo-*-
  2. @c This is part of the GNU Guile Reference Manual.
  3. @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
  4. @c 2010, 2011, 2013 Free Software Foundation, Inc.
  5. @c See the file guile.texi for copying conditions.
  6. @node Input and Output
  7. @section Input and Output
  8. @menu
  9. * Ports:: The idea of the port abstraction.
  10. * Reading:: Procedures for reading from a port.
  11. * Writing:: Procedures for writing to a port.
  12. * Closing:: Procedures to close a port.
  13. * Random Access:: Moving around a random access port.
  14. * Line/Delimited:: Read and write lines or delimited text.
  15. * Block Reading and Writing:: Reading and writing blocks of text.
  16. * Default Ports:: Defaults for input, output and errors.
  17. * Port Types:: Types of port and how to make them.
  18. * R6RS I/O Ports:: The R6RS port API.
  19. * I/O Extensions:: Using and extending ports in C.
  20. * BOM Handling:: Handling of Unicode byte order marks.
  21. @end menu
  22. @node Ports
  23. @subsection Ports
  24. @cindex Port
  25. Sequential input/output in Scheme is represented by operations on a
  26. @dfn{port}. This chapter explains the operations that Guile provides
  27. for working with ports.
Ports are created by opening them, for instance with @code{open-file} for a file
(@pxref{File Ports}). Characters can be read from an input port and
  30. written to an output port, or both on an input/output port. A port
  31. can be closed (@pxref{Closing}) when no longer required, after which
  32. any attempt to read or write is an error.
  33. The formal definition of a port is very generic: an input port is
  34. simply ``an object which can deliver characters on demand,'' and an
  35. output port is ``an object which can accept characters.'' Because
  36. this definition is so loose, it is easy to write functions that
  37. simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
  38. are two interesting and powerful examples of this technique.
  39. (@pxref{Soft Ports}, and @ref{String Ports}.)
  40. Ports are garbage collected in the usual way (@pxref{Memory
  41. Management}), and will be closed at that time if not already closed.
  42. In this case any errors occurring in the close will not be reported.
  43. Usually a program will want to explicitly close so as to be sure all
  44. its operations have been successful. Of course if a program has
  45. abandoned something due to an error or other condition then closing
  46. problems are probably not of interest.
  47. It is strongly recommended that file ports be closed explicitly when
  48. no longer required. Most systems have limits on how many files can be
  49. open, both on a per-process and a system-wide basis. A program that
  50. uses many files should take care not to hit those limits. The same
  51. applies to similar system resources such as pipes and sockets.
  52. Note that automatic garbage collection is triggered only by memory
  53. consumption, not by file or other resource usage, so a program cannot
  54. rely on that to keep it away from system limits. An explicit call to
  55. @code{gc} can of course be relied on to pick up unreferenced ports.
  56. If program flow makes it hard to be certain when to close then this
  57. may be an acceptable way to control resource usage.
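As a minimal sketch of explicit closing, the helper below closes a port
whenever control leaves the procedure that uses it. The name
@code{call-with-closing-port} is hypothetical and not part of Guile; it
only combines @code{dynamic-wind} with @code{close-port}.
@lisp
;; Hypothetical helper, not part of Guile: apply PROC to PORT and
;; close PORT when PROC exits, normally or through a non-local exit.
(define (call-with-closing-port port proc)
  (dynamic-wind
    (lambda () #t)
    (lambda () (proc port))
    (lambda () (close-port port))))

(define p (open-input-string "data"))
(call-with-closing-port p read-char)   ; @result{} #\d
(port-closed? p)                       ; @result{} #t
@end lisp
Because the port is closed on every exit, re-entering @var{proc} through
a captured continuation would find the port already closed, so this
sketch suits code that does not re-enter.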
  58. All file access uses the ``LFS'' large file support functions when
  59. available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
  60. read and written on a 32-bit system.
Each port has an associated character encoding that controls how bytes
read from the port are converted to characters and strings, and how
characters and strings written to the port are converted to bytes.
When ports are created, they inherit their character encoding from the
current locale, but that can be modified after the port is created.
  66. Currently, the ports only work with @emph{non-modal} encodings. Most
  67. encodings are non-modal, meaning that the conversion of bytes to a
  68. string doesn't depend on its context: the same byte sequence will always
  69. return the same string. A couple of modal encodings are in common use,
  70. like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
  71. Each port also has an associated conversion strategy: what to do when
  72. a Guile character can't be converted to the port's encoded character
  73. representation for output. There are three possible strategies: to
  74. raise an error, to replace the character with a hex escape, or to
  75. replace the character with a substitute character.
  76. @rnindex input-port?
  77. @deffn {Scheme Procedure} input-port? x
  78. @deffnx {C Function} scm_input_port_p (x)
  79. Return @code{#t} if @var{x} is an input port, otherwise return
  80. @code{#f}. Any object satisfying this predicate also satisfies
  81. @code{port?}.
  82. @end deffn
  83. @rnindex output-port?
  84. @deffn {Scheme Procedure} output-port? x
  85. @deffnx {C Function} scm_output_port_p (x)
  86. Return @code{#t} if @var{x} is an output port, otherwise return
  87. @code{#f}. Any object satisfying this predicate also satisfies
  88. @code{port?}.
  89. @end deffn
  90. @deffn {Scheme Procedure} port? x
  91. @deffnx {C Function} scm_port_p (x)
  92. Return a boolean indicating whether @var{x} is a port.
  93. Equivalent to @code{(or (input-port? @var{x}) (output-port?
  94. @var{x}))}.
  95. @end deffn
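The predicates above can be tried on string ports (@pxref{String
Ports}). This is only an illustrative sketch; the variable names
@code{in} and @code{out} are arbitrary.
@lisp
(define in  (open-input-string "x"))
(define out (open-output-string))

(input-port? in)      ; @result{} #t
(output-port? in)     ; @result{} #f
(output-port? out)    ; @result{} #t
(port? out)           ; @result{} #t
(port? "not a port")  ; @result{} #f
@end lisp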
  96. @deffn {Scheme Procedure} set-port-encoding! port enc
  97. @deffnx {C Function} scm_set_port_encoding_x (port, enc)
  98. Sets the character encoding that will be used to interpret all port I/O.
  99. @var{enc} is a string containing the name of an encoding. Valid
  100. encoding names are those
  101. @url{http://www.iana.org/assignments/character-sets, defined by IANA}.
  102. @end deffn
  103. @defvr {Scheme Variable} %default-port-encoding
  104. A fluid containing @code{#f} or the name of the encoding to
  105. be used by default for newly created ports (@pxref{Fluids and Dynamic
  106. States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
  107. New ports are created with the encoding appropriate for the current
  108. locale if @code{setlocale} has been called or the value specified by
  109. this fluid otherwise.
  110. @end defvr
  111. @deffn {Scheme Procedure} port-encoding port
  112. @deffnx {C Function} scm_port_encoding (port)
  113. Returns, as a string, the character encoding that @var{port} uses to interpret
  114. its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
  115. @end deffn
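A short sketch of changing and querying a port's encoding, using a
string port purely for convenience:
@lisp
(define p (open-input-string "abc"))
(set-port-encoding! p "UTF-8")
(port-encoding p)                ; @result{} "UTF-8"
@end lisp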
  116. @deffn {Scheme Procedure} set-port-conversion-strategy! port sym
  117. @deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
  118. Sets the behavior of the interpreter when outputting a character that
  119. is not representable in the port's current encoding. @var{sym} can be
either @code{'error}, @code{'substitute}, or @code{'escape}. If it is
@code{'error}, an error will be thrown when a nonconvertible character
is encountered. If it is @code{'substitute}, then nonconvertible
characters will be replaced with approximate characters, or with
question marks if no approximately correct character is available. If
it is @code{'escape}, nonconvertible characters will appear as hex
escapes when output.
  126. If @var{port} is an open port, the conversion error behavior
  127. is set for that port. If it is @code{#f}, it is set as the
  128. default behavior for any future ports that get created in
  129. this thread.
  130. @end deffn
  131. @deffn {Scheme Procedure} port-conversion-strategy port
  132. @deffnx {C Function} scm_port_conversion_strategy (port)
  133. Returns the behavior of the port when outputting a character that is
  134. not representable in the port's current encoding. It returns the
  135. symbol @code{error} if unrepresentable characters should cause
  136. exceptions, @code{substitute} if the port should try to replace
  137. unrepresentable characters with question marks or approximate
  138. characters, or @code{escape} if unrepresentable characters should be
  139. converted to string escapes.
  140. If @var{port} is @code{#f}, then the current default behavior will be
  141. returned. New ports will have this default behavior when they are
  142. created.
  143. @end deffn
  144. @deffn {Scheme Variable} %default-port-conversion-strategy
  145. The fluid that defines the conversion strategy for newly created ports,
  146. and for other conversion routines such as @code{scm_to_stringn},
  147. @code{scm_from_stringn}, @code{string->pointer}, and
  148. @code{pointer->string}.
  149. Its value must be one of the symbols described above, with the same
  150. semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
  151. When Guile starts, its value is @code{'substitute}.
  152. Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
  153. equivalent to @code{(fluid-set! %default-port-conversion-strategy
  154. @var{sym})}.
  155. @end deffn
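The following sketch sets the strategy on one port and then inspects the
default through the fluid. It only reads the settings back and does not
show the resulting output, which depends on the port's encoding.
@lisp
(define p (open-output-string))
(set-port-conversion-strategy! p 'escape)
(port-conversion-strategy p)             ; @result{} escape

;; Passing #f queries the default used for newly created ports,
;; which is the current value of the fluid.
(with-fluids ((%default-port-conversion-strategy 'error))
  (port-conversion-strategy #f))         ; @result{} error
@end lisp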
  156. @node Reading
  157. @subsection Reading
  158. @cindex Reading
  159. [Generic procedures for reading from ports.]
These procedures pertain to reading characters and strings from
ports. To read general S-expressions from ports, see @ref{Scheme Read}.
  162. @rnindex eof-object?
  163. @cindex End of file object
  164. @deffn {Scheme Procedure} eof-object? x
  165. @deffnx {C Function} scm_eof_object_p (x)
  166. Return @code{#t} if @var{x} is an end-of-file object; otherwise
  167. return @code{#f}.
  168. @end deffn
  169. @rnindex char-ready?
  170. @deffn {Scheme Procedure} char-ready? [port]
  171. @deffnx {C Function} scm_char_ready_p (port)
  172. Return @code{#t} if a character is ready on input @var{port}
  173. and return @code{#f} otherwise. If @code{char-ready?} returns
  174. @code{#t} then the next @code{read-char} operation on
  175. @var{port} is guaranteed not to hang. If @var{port} is a file
  176. port at end of file then @code{char-ready?} returns @code{#t}.
  177. @code{char-ready?} exists to make it possible for a
  178. program to accept characters from interactive ports without
  179. getting stuck waiting for input. Any input editors associated
  180. with such ports must make sure that characters whose existence
  181. has been asserted by @code{char-ready?} cannot be rubbed out.
  182. If @code{char-ready?} were to return @code{#f} at end of file,
  183. a port at end of file would be indistinguishable from an
  184. interactive port that has no ready characters.
  185. @end deffn
  186. @rnindex read-char
  187. @deffn {Scheme Procedure} read-char [port]
  188. @deffnx {C Function} scm_read_char (port)
  189. Return the next character available from @var{port}, updating
  190. @var{port} to point to the following character. If no more
  191. characters are available, the end-of-file object is returned.
  192. When @var{port}'s data cannot be decoded according to its
  193. character encoding, a @code{decoding-error} is raised and
  194. @var{port} points past the erroneous byte sequence.
  195. @end deffn
  196. @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
  197. Read up to @var{size} bytes from @var{port} and store them in
  198. @var{buffer}. The return value is the number of bytes actually read,
  199. which can be less than @var{size} if end-of-file has been reached.
  200. Note that this function does not update @code{port-line} and
  201. @code{port-column} below.
  202. @end deftypefn
  203. @rnindex peek-char
  204. @deffn {Scheme Procedure} peek-char [port]
  205. @deffnx {C Function} scm_peek_char (port)
  206. Return the next character available from @var{port},
  207. @emph{without} updating @var{port} to point to the following
  208. character. If no more characters are available, the
  209. end-of-file object is returned.
  210. The value returned by
  211. a call to @code{peek-char} is the same as the value that would
  212. have been returned by a call to @code{read-char} on the same
  213. port. The only difference is that the very next call to
  214. @code{read-char} or @code{peek-char} on that @var{port} will
  215. return the value returned by the preceding call to
  216. @code{peek-char}. In particular, a call to @code{peek-char} on
  217. an interactive port will hang waiting for input whenever a call
  218. to @code{read-char} would have hung.
  219. As for @code{read-char}, a @code{decoding-error} may be raised
  220. if such a situation occurs. However, unlike with @code{read-char},
  221. @var{port} still points at the beginning of the erroneous byte
  222. sequence when the error is raised.
  223. @end deffn
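The difference between @code{peek-char} and @code{read-char} can be seen
on a small string port:
@lisp
(define p (open-input-string "ab"))
(peek-char p)                 ; @result{} #\a, position unchanged
(read-char p)                 ; @result{} #\a
(read-char p)                 ; @result{} #\b
(eof-object? (peek-char p))   ; @result{} #t
(eof-object? (read-char p))   ; @result{} #t
@end lisp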
  224. @deffn {Scheme Procedure} unread-char cobj [port]
  225. @deffnx {C Function} scm_unread_char (cobj, port)
  226. Place character @var{cobj} in @var{port} so that it will be read by the
  227. next read operation. If called multiple times, the unread characters
  228. will be read again in last-in first-out order. If @var{port} is
  229. not supplied, the current input port is used.
  230. @end deffn
  231. @deffn {Scheme Procedure} unread-string str port
  232. @deffnx {C Function} scm_unread_string (str, port)
  233. Place the string @var{str} in @var{port} so that its characters will
  234. be read from left-to-right as the next characters from @var{port}
  235. during subsequent read operations. If called multiple times, the
  236. unread characters will be read again in last-in first-out order. If
  237. @var{port} is not supplied, the @code{current-input-port} is used.
  238. @end deffn
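A sketch of the last-in first-out behavior described above: the string
is unread after the character, so its characters come back first.
@lisp
(define p (open-input-string "d"))
(unread-char #\c p)
(unread-string "ab" p)
(read-char p)   ; @result{} #\a
(read-char p)   ; @result{} #\b
(read-char p)   ; @result{} #\c
(read-char p)   ; @result{} #\d
@end lisp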
  239. @deffn {Scheme Procedure} drain-input port
  240. @deffnx {C Function} scm_drain_input (port)
This procedure clears a port's input buffers, similar
to the way that @code{force-output} clears the output buffer. The
contents of the buffers are returned as a single string, e.g.,
  244. @lisp
(define p (open-input-file ...))
(drain-input p)                ; @result{} empty string, nothing buffered yet
(unread-char (read-char p) p)
(drain-input p)                ; @result{} initial chars from p, up to the buffer size
  249. @end lisp
  250. Draining the buffers may be useful for cleanly finishing
  251. buffered I/O so that the file descriptor can be used directly
  252. for further input.
  253. @end deffn
  254. @deffn {Scheme Procedure} port-column port
  255. @deffnx {Scheme Procedure} port-line port
  256. @deffnx {C Function} scm_port_column (port)
  257. @deffnx {C Function} scm_port_line (port)
  258. Return the current column number or line number of @var{port}.
If the number is
unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer,
i.e.@: the first character of the first line is line 0, column 0.
  262. (However, when you display a file position, for example in an error
  263. message, we recommend you add 1 to get 1-origin integers. This is
  264. because lines and column numbers traditionally start with 1, and that is
  265. what non-programmers will find most natural.)
  266. @end deffn
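A small sketch of the 0-origin counters, including the recommended
conversion to 1-origin numbers for display:
@lisp
(define p (open-input-string "ab\ncd"))
(read-char p)      ; @result{} #\a
(read-char p)      ; @result{} #\b
(read-char p)      ; @result{} #\newline
(port-line p)      ; @result{} 1
(port-column p)    ; @result{} 0
(read-char p)      ; @result{} #\c
(port-column p)    ; @result{} 1

;; Add 1 to each value when showing a position to users.
(simple-format #f "~A:~A"
               (+ 1 (port-line p))
               (+ 1 (port-column p)))   ; @result{} "2:2"
@end lisp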
  267. @deffn {Scheme Procedure} set-port-column! port column
  268. @deffnx {Scheme Procedure} set-port-line! port line
  269. @deffnx {C Function} scm_set_port_column_x (port, column)
  270. @deffnx {C Function} scm_set_port_line_x (port, line)
  271. Set the current column or line number of @var{port}.
  272. @end deffn
  273. @node Writing
  274. @subsection Writing
  275. @cindex Writing
  276. [Generic procedures for writing to ports.]
These procedures are for writing characters and strings to
ports. For more information on writing arbitrary Scheme objects to
ports, see @ref{Scheme Write}.
  280. @deffn {Scheme Procedure} get-print-state port
  281. @deffnx {C Function} scm_get_print_state (port)
  282. Return the print state of the port @var{port}. If @var{port}
  283. has no associated print state, @code{#f} is returned.
  284. @end deffn
  285. @rnindex newline
  286. @deffn {Scheme Procedure} newline [port]
  287. @deffnx {C Function} scm_newline (port)
  288. Send a newline to @var{port}.
  289. If @var{port} is omitted, send to the current output port.
  290. @end deffn
  291. @deffn {Scheme Procedure} port-with-print-state port [pstate]
  292. @deffnx {C Function} scm_port_with_print_state (port, pstate)
  293. Create a new port which behaves like @var{port}, but with an
  294. included print state @var{pstate}. @var{pstate} is optional.
  295. If @var{pstate} isn't supplied and @var{port} already has
  296. a print state, the old print state is reused.
  297. @end deffn
  298. @deffn {Scheme Procedure} simple-format destination message . args
  299. @deffnx {C Function} scm_simple_format (destination, message, args)
  300. Write @var{message} to @var{destination}, defaulting to
  301. the current output port.
  302. @var{message} can contain @code{~A} (was @code{%s}) and
  303. @code{~S} (was @code{%S}) escapes. When printed,
  304. the escapes are replaced with corresponding members of
  305. @var{args}:
  306. @code{~A} formats using @code{display} and @code{~S} formats
  307. using @code{write}.
If @var{destination} is @code{#t}, then use the current output
port; if @var{destination} is @code{#f}, then return a string
containing the formatted text. Does not add a trailing newline.
  311. @end deffn
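For instance, the two escapes differ in how a string argument is
rendered:
@lisp
;; ~A uses display, ~S uses write (so the string is quoted).
(simple-format #f "~A vs ~S" "str" "str")
;; @result{} "str vs \"str\""

;; With #t the text goes to the current output port.
(simple-format #t "count: ~A\n" 42)   ; prints "count: 42"
@end lisp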
  312. @rnindex write-char
  313. @deffn {Scheme Procedure} write-char chr [port]
  314. @deffnx {C Function} scm_write_char (chr, port)
  315. Send character @var{chr} to @var{port}.
  316. @end deffn
  317. @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
  318. Write @var{size} bytes at @var{buffer} to @var{port}.
  319. Note that this function does not update @code{port-line} and
  320. @code{port-column} (@pxref{Reading}).
  321. @end deftypefn
  322. @findex fflush
  323. @deffn {Scheme Procedure} force-output [port]
  324. @deffnx {C Function} scm_force_output (port)
  325. Flush the specified output port, or the current output port if @var{port}
  326. is omitted. The current output buffer contents are passed to the
underlying port implementation (e.g., in the case of fports, the
data will be written to the file and the output buffer will be cleared).
  329. It has no effect on an unbuffered port.
  330. The return value is unspecified.
  331. @end deffn
  332. @deffn {Scheme Procedure} flush-all-ports
  333. @deffnx {C Function} scm_flush_all_ports ()
  334. Equivalent to calling @code{force-output} on
  335. all open output ports. The return value is unspecified.
  336. @end deffn
  337. @node Closing
  338. @subsection Closing
  339. @cindex Closing ports
  340. @cindex Port, close
  341. @deffn {Scheme Procedure} close-port port
  342. @deffnx {C Function} scm_close_port (port)
  343. Close the specified port object. Return @code{#t} if it
  344. successfully closes a port or @code{#f} if it was already
  345. closed. An exception may be raised if an error occurs, for
  346. example when flushing buffered output. See also @ref{Ports and
  347. File Descriptors, close}, for a procedure which can close file
  348. descriptors.
  349. @end deffn
  350. @deffn {Scheme Procedure} close-input-port port
  351. @deffnx {Scheme Procedure} close-output-port port
  352. @deffnx {C Function} scm_close_input_port (port)
  353. @deffnx {C Function} scm_close_output_port (port)
  354. @rnindex close-input-port
  355. @rnindex close-output-port
  356. Close the specified input or output @var{port}. An exception may be
  357. raised if an error occurs while closing. If @var{port} is already
  358. closed, nothing is done. The return value is unspecified.
  359. See also @ref{Ports and File Descriptors, close}, for a procedure
  360. which can close file descriptors.
  361. @end deffn
  362. @deffn {Scheme Procedure} port-closed? port
  363. @deffnx {C Function} scm_port_closed_p (port)
  364. Return @code{#t} if @var{port} is closed or @code{#f} if it is
  365. open.
  366. @end deffn
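A brief sketch of the return values described above for
@code{close-port}, using a string port:
@lisp
(define p (open-output-string))
(port-closed? p)   ; @result{} #f
(close-port p)     ; @result{} #t, the port was open
(port-closed? p)   ; @result{} #t
(close-port p)     ; @result{} #f, already closed
@end lisp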
  367. @node Random Access
  368. @subsection Random Access
  369. @cindex Random access, ports
  370. @cindex Port, random access
  371. @deffn {Scheme Procedure} seek fd_port offset whence
  372. @deffnx {C Function} scm_seek (fd_port, offset, whence)
  373. Sets the current position of @var{fd_port} to the integer
  374. @var{offset}, which is interpreted according to the value of
  375. @var{whence}.
  376. One of the following variables should be supplied for
  377. @var{whence}:
  378. @defvar SEEK_SET
  379. Seek from the beginning of the file.
  380. @end defvar
  381. @defvar SEEK_CUR
  382. Seek from the current position.
  383. @end defvar
  384. @defvar SEEK_END
  385. Seek from the end of the file.
  386. @end defvar
If @var{fd_port} is a file descriptor, the underlying system
call is @code{lseek}. @var{fd_port} may also be a string port.
  389. The value returned is the new position in the file. This means
  390. that the current position of a port can be obtained using:
  391. @lisp
  392. (seek port 0 SEEK_CUR)
  393. @end lisp
  394. @end deffn
  395. @deffn {Scheme Procedure} ftell fd_port
  396. @deffnx {C Function} scm_ftell (fd_port)
  397. Return an integer representing the current position of
  398. @var{fd_port}, measured from the beginning. Equivalent to:
  399. @lisp
(seek fd_port 0 SEEK_CUR)
  401. @end lisp
  402. @end deffn
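The sketch below seeks within a string port. It assumes an encoding in
which each ASCII character occupies one byte, so that byte positions and
character positions coincide.
@lisp
(define p (open-input-string "hello world"))
(seek p 6 SEEK_SET)    ; @result{} 6
(read-char p)          ; @result{} #\w
(ftell p)              ; @result{} 7
(seek p 0 SEEK_CUR)    ; @result{} 7, same as ftell
@end lisp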
  403. @findex truncate
  404. @findex ftruncate
  405. @deffn {Scheme Procedure} truncate-file file [length]
  406. @deffnx {C Function} scm_truncate_file (file, length)
  407. Truncate @var{file} to @var{length} bytes. @var{file} can be a
  408. filename string, a port object, or an integer file descriptor. The
  409. return value is unspecified.
  410. For a port or file descriptor @var{length} can be omitted, in which
  411. case the file is truncated at the current position (per @code{ftell}
  412. above).
  413. On most systems a file can be extended by giving a length greater than
  414. the current size, but this is not mandatory in the POSIX standard.
  415. @end deffn
  416. @node Line/Delimited
  417. @subsection Line Oriented and Delimited Text
  418. @cindex Line input/output
  419. @cindex Port, line input/output
  420. The delimited-I/O module can be accessed with:
  421. @lisp
  422. (use-modules (ice-9 rdelim))
  423. @end lisp
  424. It can be used to read or write lines of text, or read text delimited by
  425. a specified set of characters. It's similar to the @code{(scsh rdelim)}
  426. module from guile-scsh, but does not use multiple values or character
  427. sets and has an extra procedure @code{write-line}.
  428. @c begin (scm-doc-string "rdelim.scm" "read-line")
  429. @deffn {Scheme Procedure} read-line [port] [handle-delim]
  430. Return a line of text from @var{port} if specified, otherwise from the
  431. value returned by @code{(current-input-port)}. Under Unix, a line of text
  432. is terminated by the first end-of-line character or by end-of-file.
  433. If @var{handle-delim} is specified, it should be one of the following
  434. symbols:
  435. @table @code
  436. @item trim
  437. Discard the terminating delimiter. This is the default, but it will
  438. be impossible to tell whether the read terminated with a delimiter or
  439. end-of-file.
  440. @item concat
  441. Append the terminating delimiter (if any) to the returned string.
  442. @item peek
  443. Push the terminating delimiter (if any) back on to the port.
  444. @item split
  445. Return a pair containing the string read from the port and the
  446. terminating delimiter or end-of-file object.
  447. @end table
  448. Like @code{read-char}, this procedure can throw to @code{decoding-error}
  449. (@pxref{Reading, @code{read-char}}).
  450. @end deffn
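The effect of @var{handle-delim} can be sketched on a string port whose
last line has no trailing newline:
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "one\ntwo"))
(read-line p 'concat)         ; @result{} "one\n"
(read-line p 'split)          ; @result{} ("two" . #<eof>)
(eof-object? (read-line p))   ; @result{} #t
@end lisp
With @code{split}, the second element of the pair shows whether the line
ended at a delimiter or at end-of-file, which the default @code{trim}
cannot reveal.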
  451. @c begin (scm-doc-string "rdelim.scm" "read-line!")
  452. @deffn {Scheme Procedure} read-line! buf [port]
  453. Read a line of text into the supplied string @var{buf} and return the
  454. number of characters added to @var{buf}. If @var{buf} is filled, then
  455. @code{#f} is returned.
  456. Read from @var{port} if
  457. specified, otherwise from the value returned by @code{(current-input-port)}.
  458. @end deffn
  459. @c begin (scm-doc-string "rdelim.scm" "read-delimited")
  460. @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
  461. Read text until one of the characters in the string @var{delims} is found
  462. or end-of-file is reached. Read from @var{port} if supplied, otherwise
  463. from the value returned by @code{(current-input-port)}.
  464. @var{handle-delim} takes the same values as described for @code{read-line}.
  465. @end deffn
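As an illustrative sketch, @code{read-delimited} can split a simple
@samp{key=value} record; the input string is made up for the example.
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "key=value;rest"))
(read-delimited "=" p)          ; @result{} "key"
(read-delimited ";" p 'split)   ; @result{} ("value" . #\;)
(read-delimited ";" p)          ; @result{} "rest"
@end lisp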
  466. @c begin (scm-doc-string "rdelim.scm" "read-delimited!")
  467. @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
  468. Read text into the supplied string @var{buf}.
  469. If a delimiter was found, return the number of characters written,
  470. except if @var{handle-delim} is @code{split}, in which case the return
  471. value is a pair, as noted above.
  472. As a special case, if @var{port} was already at end-of-stream, the EOF
  473. object is returned. Also, if no characters were written because the
  474. buffer was full, @code{#f} is returned.
  475. It's something of a wacky interface, to be honest.
  476. @end deffn
  477. @deffn {Scheme Procedure} write-line obj [port]
  478. @deffnx {C Function} scm_write_line (obj, port)
  479. Display @var{obj} and a newline character to @var{port}. If
  480. @var{port} is not specified, @code{(current-output-port)} is
  481. used. This function is equivalent to:
  482. @lisp
  483. (display obj [port])
  484. (newline [port])
  485. @end lisp
  486. @end deffn
  487. In the past, Guile did not have a procedure that would just read out all
  488. of the characters from a port. As a workaround, many people just called
  489. @code{read-delimited} with no delimiters, knowing that would produce the
  490. behavior they wanted. This prompted Guile developers to add some
  491. routines that would read all characters from a port. So it is that
@code{(ice-9 rdelim)} is also the home for procedures that can read
undelimited text:
  494. @deffn {Scheme Procedure} read-string [port] [count]
  495. Read all of the characters out of @var{port} and return them as a
  496. string. If the @var{count} is present, treat it as a limit to the
  497. number of characters to read.
  498. By default, read from the current input port, with no size limit on the
  499. result. This procedure always returns a string, even if no characters
  500. were read.
  501. @end deffn
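A short sketch of @code{read-string} with and without a @var{count},
following the argument order given above:
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "hello world"))
(read-string p 5)   ; @result{} "hello"
(read-string p)     ; @result{} " world"
(read-string p)     ; @result{} "", nothing left to read
@end lisp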
  502. @deffn {Scheme Procedure} read-string! buf [port] [start] [end]
  503. Fill @var{buf} with characters read from @var{port}, defaulting to the
  504. current input port. Return the number of characters read.
If @var{start} or @var{end} are specified, store data only into the
substring of @var{buf} bounded by @var{start} and @var{end} (which
default to the beginning and end of the string, respectively).
  508. @end deffn
  509. Some of the aforementioned I/O functions rely on the following C
  510. primitives. These will mainly be of interest to people hacking Guile
  511. internals.
  512. @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
  513. @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
  514. Read characters from @var{port} into @var{str} until one of the
  515. characters in the @var{delims} string is encountered. If
  516. @var{gobble} is true, discard the delimiter character;
  517. otherwise, leave it in the input stream for the next read. If
  518. @var{port} is not specified, use the value of
  519. @code{(current-input-port)}. If @var{start} or @var{end} are
  520. specified, store data only into the substring of @var{str}
  521. bounded by @var{start} and @var{end} (which default to the
  522. beginning and end of the string, respectively).
  523. Return a pair consisting of the delimiter that terminated the
  524. string and the number of characters read. If reading stopped
  525. at the end of file, the delimiter returned is the
  526. @var{eof-object}; if the string was filled without encountering
  527. a delimiter, this value is @code{#f}.
  528. @end deffn
  529. @deffn {Scheme Procedure} %read-line [port]
  530. @deffnx {C Function} scm_read_line (port)
  531. Read a newline-terminated line from @var{port}, allocating storage as
  532. necessary. The newline terminator (if any) is removed from the string,
  533. and a pair consisting of the line and its delimiter is returned. The
  534. delimiter may be either a newline or the @var{eof-object}; if
  535. @code{%read-line} is called at the end of file, it returns the pair
  536. @code{(#<eof> . #<eof>)}.
  537. @end deffn
  538. @node Block Reading and Writing
  539. @subsection Block reading and writing
  540. @cindex Block read/write
  541. @cindex Port, block read/write
  542. The Block-string-I/O module can be accessed with:
  543. @lisp
  544. (use-modules (ice-9 rw))
  545. @end lisp
  546. It currently contains procedures that help to implement the
  547. @code{(scsh rw)} module in guile-scsh.
  548. @deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
  549. @deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
  550. Read characters from a port or file descriptor into a
  551. string @var{str}. A port must have an underlying file
  552. descriptor --- a so-called fport. This procedure is
  553. scsh-compatible and can efficiently read large strings.
  554. It will:
  555. @itemize
  556. @item
attempt to fill the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
  561. @item
  562. use the current input port if @var{port_or_fdes} is not
  563. supplied.
  564. @item
  565. return fewer than the requested number of characters in some
  566. cases, e.g., on end of file, if interrupted by a signal, or if
  567. not all the characters are immediately available.
  568. @item
  569. wait indefinitely for some input if no characters are
  570. currently available,
  571. unless the port is in non-blocking mode.
  572. @item
read characters from the port's input buffers if available,
instead of from the underlying file descriptor.
  575. @item
  576. return @code{#f} if end-of-file is encountered before reading
  577. any characters, otherwise return the number of characters
  578. read.
  579. @item
  580. return 0 if the port is in non-blocking mode and no characters
  581. are immediately available.
  582. @item
  583. return 0 if the request is for 0 bytes, with no
  584. end-of-file check.
  585. @end itemize
  586. @end deffn
  587. @deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
  588. @deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
  589. Write characters from a string @var{str} to a port or file
  590. descriptor. A port must have an underlying file descriptor
  591. --- a so-called fport. This procedure is
  592. scsh-compatible and can efficiently write large strings.
  593. It will:
  594. @itemize
  595. @item
attempt to write the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
  600. @item
use the current output port if @var{port_or_fdes} is not
supplied.
  603. @item
in the case of a buffered port, store the characters in the
port's output buffer, if all will fit. If they will not fit
then any existing buffered characters will be flushed
before attempting
to write the new characters directly to the underlying file
descriptor. If the port is in non-blocking mode and
buffered characters cannot be flushed immediately, then an
@code{EAGAIN} system-error exception will be raised. (Note:
scsh does not support the use of non-blocking buffered ports.)
  613. @item
  614. write fewer than the requested number of
  615. characters in some cases, e.g., if interrupted by a signal or
  616. if not all of the output can be accepted immediately.
  617. @item
  618. wait indefinitely for at least one character
  619. from @var{str} to be accepted by the port, unless the port is
  620. in non-blocking mode.
  621. @item
  622. return the number of characters accepted by the port.
  623. @item
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
  626. @item
  627. return 0 immediately if the request size is 0 bytes.
  628. @end itemize
  629. @end deffn
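Because a single call may accept only part of the string, callers
commonly loop until everything has been written. The following is a
minimal sketch of such a loop, assuming @code{(use-modules (ice-9 rw))}
and a blocking file port. The helper @code{write-all} and the file name
@file{rw-write-demo.txt} are hypothetical.

@lisp
(use-modules (ice-9 rw))

;; Write all of STR to PORT, retrying until every character has been
;; accepted.  PORT is assumed to be a blocking fport; on a
;; non-blocking port this loop would spin while 0 is returned.
(define (write-all str port)
  (let loop ((start 0))
    (when (< start (string-length str))
      (loop (+ start (write-string/partial str port start))))))

(define demo-file "rw-write-demo.txt")   ; hypothetical scratch file
(define demo-port (open-output-file demo-file))
(write-all "hello, world\n" demo-port)
(close-port demo-port)
@end lisp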
  630. @node Default Ports
  631. @subsection Default Ports for Input, Output and Errors
  632. @cindex Default ports
  633. @cindex Port, default
  634. @rnindex current-input-port
  635. @deffn {Scheme Procedure} current-input-port
  636. @deffnx {C Function} scm_current_input_port ()
  637. @cindex standard input
  638. Return the current input port. This is the default port used
  639. by many input procedures.
  640. Initially this is the @dfn{standard input} in Unix and C terminology.
  641. When the standard input is a tty the port is unbuffered, otherwise
  642. it's fully buffered.
  643. Unbuffered input is good if an application runs an interactive
  644. subprocess, since any type-ahead input won't go into Guile's buffer
  645. and be unavailable to the subprocess.
  646. Note that Guile buffering is completely separate from the tty ``line
  647. discipline''. In the usual cooked mode on a tty Guile only sees a
  648. line of input once the user presses @key{Return}.
  649. @end deffn
  650. @rnindex current-output-port
  651. @deffn {Scheme Procedure} current-output-port
  652. @deffnx {C Function} scm_current_output_port ()
  653. @cindex standard output
  654. Return the current output port. This is the default port used
  655. by many output procedures.
  656. Initially this is the @dfn{standard output} in Unix and C terminology.
  657. When the standard output is a tty this port is unbuffered, otherwise
  658. it's fully buffered.
  659. Unbuffered output to a tty is good for ensuring progress output or a
  660. prompt is seen. But an application which always prints whole lines
  661. could change to line buffered, or an application with a lot of output
  662. could go fully buffered and perhaps make explicit @code{force-output}
  663. calls (@pxref{Writing}) at selected points.
  664. @end deffn
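As a small illustration of explicit flushing, the following sketch
prints progress dots and calls @code{force-output} after each one, so
that the output is pushed out even if the port has been made buffered.
The procedure name @code{report-step} is hypothetical.

@lisp
;; Print one dot per completed step and flush the current output
;; port, so the dot is not held back in a buffer.
(define (report-step)
  (display ".")
  (force-output))

(for-each (lambda (step) (report-step)) '(1 2 3))
(newline)
@end lisp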
  665. @deffn {Scheme Procedure} current-error-port
  666. @deffnx {C Function} scm_current_error_port ()
  667. @cindex standard error output
  668. Return the port to which errors and warnings should be sent.
  669. Initially this is the @dfn{standard error} in Unix and C terminology.
  670. When the standard error is a tty this port is unbuffered, otherwise
  671. it's fully buffered.
  672. @end deffn
  673. @deffn {Scheme Procedure} set-current-input-port port
  674. @deffnx {Scheme Procedure} set-current-output-port port
  675. @deffnx {Scheme Procedure} set-current-error-port port
  676. @deffnx {C Function} scm_set_current_input_port (port)
  677. @deffnx {C Function} scm_set_current_output_port (port)
  678. @deffnx {C Function} scm_set_current_error_port (port)
  679. Change the ports returned by @code{current-input-port},
  680. @code{current-output-port} and @code{current-error-port}, respectively,
  681. so that they use the supplied @var{port} for input or output.
  682. @end deffn
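These procedures change the setting permanently. When a temporary
change is wanted, they can be combined with @code{dynamic-wind}, as in
the following minimal sketch. The helper
@code{call-with-current-output} is a hypothetical name, not a Guile
procedure. It swaps the supplied port with the current one on every
entry and exit.

@lisp
;; Call THUNK with the current output port temporarily set to PORT.
;; The two ports are swapped on every entry to and exit from THUNK.
(define (call-with-current-output port thunk)
  (define (swap!)
    (let ((previous (current-output-port)))
      (set-current-output-port port)
      (set! port previous)))
  (dynamic-wind swap! thunk swap!))

(define sink (open-output-string))
(call-with-current-output sink
  (lambda () (display "captured")))

(get-output-string sink) @result{} "captured"
@end lisp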
  683. @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
  684. @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
  685. @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
  686. These functions must be used inside a pair of calls to
  687. @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
  688. Wind}). During the dynwind context, the indicated port is set to
  689. @var{port}.
More precisely, the current port is swapped with a ``backup'' value
whenever the dynwind context is entered or left. The backup value is
initialized with the @var{port} argument.
  693. @end deftypefn
  694. @node Port Types
  695. @subsection Types of Port
  696. @cindex Types of ports
  697. @cindex Port, types
Guile provides several types of port. The following sections describe
each type and how to create ports of that type.
  699. @menu
  700. * File Ports:: Ports on an operating system file.
  701. * String Ports:: Ports on a Scheme string.
  702. * Soft Ports:: Ports on arbitrary Scheme procedures.
  703. * Void Ports:: Ports on nothing at all.
  704. @end menu
  705. @node File Ports
  706. @subsubsection File Ports
  707. @cindex File port
  708. @cindex Port, file
  709. The following procedures are used to open file ports.
  710. See also @ref{Ports and File Descriptors, open}, for an interface
  711. to the Unix @code{open} system call.
  712. Most systems have limits on how many files can be open, so it's
  713. strongly recommended that file ports be closed explicitly when no
  714. longer required (@pxref{Ports}).
  715. @deffn {Scheme Procedure} open-file filename mode @
  716. [#:guess-encoding=#f] [#:encoding=#f]
  717. @deffnx {C Function} scm_open_file_with_encoding @
  718. (filename, mode, guess_encoding, encoding)
  719. @deffnx {C Function} scm_open_file (filename, mode)
  720. Open the file whose name is @var{filename}, and return a port
  721. representing that file. The attributes of the port are
  722. determined by the @var{mode} string. The way in which this is
  723. interpreted is similar to C stdio. The first character must be
  724. one of the following:
  725. @table @samp
  726. @item r
  727. Open an existing file for input.
  728. @item w
  729. Open a file for output, creating it if it doesn't already exist
  730. or removing its contents if it does.
  731. @item a
  732. Open a file for output, creating it if it doesn't already
  733. exist. All writes to the port will go to the end of the file.
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
  736. @end table
  737. The following additional characters can be appended:
  738. @table @samp
  739. @item +
  740. Open the port for both input and output. E.g., @code{r+}: open
  741. an existing file for both input and output.
  742. @item 0
Create an ``unbuffered'' port. In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering. This is likely to
slow down I/O operations. The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
  749. @item l
  750. Add line-buffering to the port. The port output buffer will be
  751. automatically flushed whenever a newline character is written.
  752. @item b
  753. Use binary mode, ensuring that each byte in the file will be read as one
  754. Scheme character.
  755. To provide this property, the file will be opened with the 8-bit
  756. character encoding "ISO-8859-1", ignoring the default port encoding.
  757. @xref{Ports}, for more information on port encodings.
  758. Note that while it is possible to read and write binary data as
  759. characters or strings, it is usually better to treat bytes as octets,
  760. and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
  761. @ref{R6RS Binary Output}, for more.
  762. This option had another historical meaning, for DOS compatibility: in
  763. the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
  764. The @code{b} flag prevents this from happening, adding @code{O_BINARY}
  765. to the underlying @code{open} call. Still, the flag is generally useful
  766. because of its port encoding ramifications.
  767. @end table
  768. Unless binary mode is requested, the character encoding of the new port
  769. is determined as follows: First, if @var{guess-encoding} is true, the
  770. @code{file-encoding} procedure is used to guess the encoding of the file
  771. (@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
  772. is false or if @code{file-encoding} fails, @var{encoding} is used unless
  773. it is also false. As a last resort, the default port encoding is used.
  774. @xref{Ports}, for more information on port encodings. It is an error to
  775. pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
  776. is requested.
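
The difference between a textual encoding and binary mode can be seen in
the following minimal sketch. The file name @file{encoding-demo.txt}
and the helper @code{count-chars} are hypothetical. The sketch writes
the single character U+03BB as UTF-8, which occupies two bytes, and then
counts the characters delivered by ports opened in different ways.

@lisp
(define demo-file "encoding-demo.txt")   ; hypothetical scratch file

;; Write a single non-ASCII character, U+03BB, encoded as UTF-8.
(let ((port (open-file demo-file "w" #:encoding "UTF-8")))
  (write-char (integer->char #x3bb) port)
  (close-port port))

;; Count the characters delivered by a port opened on DEMO-FILE
;; with MODE and any extra keyword arguments ARGS.
(define (count-chars mode . args)
  (let ((port (apply open-file demo-file mode args)))
    (let loop ((n 0))
      (if (eof-object? (read-char port))
          (begin (close-port port) n)
          (loop (+ n 1))))))

(count-chars "r" #:encoding "UTF-8")   @result{} 1
(count-chars "rb")                     @result{} 2
@end lisp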
  777. If a file cannot be opened with the access requested, @code{open-file}
  778. throws an exception.
  779. When the file is opened, its encoding is set to the current
  780. @code{%default-port-encoding}, unless the @code{b} flag was supplied.
  781. Sometimes it is desirable to honor Emacs-style coding declarations in
  782. files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This
  783. behavior was deemed inappropriate and disabled starting from Guile
  784. 2.0.8.}. When that is the case, the @code{file-encoding} procedure can
  785. be used as follows (@pxref{Character Encoding of Source Files,
  786. @code{file-encoding}}):
@example
(let* ((port     (open-input-file file))
       (encoding (file-encoding port)))
  (set-port-encoding! port (or encoding (port-encoding port))))
@end example
  792. In theory we could create read/write ports which were buffered
  793. in one direction only. However this isn't included in the
  794. current interfaces.
  795. @end deffn
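The basic mode characters described for @code{open-file} can be tried
out with the following minimal sketch. The file name
@file{open-file-demo.txt} is a hypothetical scratch file in the current
directory.

@lisp
(define demo-file "open-file-demo.txt")  ; hypothetical scratch file

;; "w" creates the file, or truncates it if it already exists.
(let ((port (open-file demo-file "w")))
  (display "first\n" port)
  (close-port port))

;; "a" keeps the existing contents and writes at the end.
(let ((port (open-file demo-file "a")))
  (display "second\n" port)
  (close-port port))

;; "r" opens the existing file for input.
(define contents
  (let ((port (open-file demo-file "r")))
    (let loop ((chars '()))
      (let ((c (read-char port)))
        (if (eof-object? c)
            (begin
              (close-port port)
              (list->string (reverse chars)))
            (loop (cons c chars)))))))

contents @result{} "first\nsecond\n"
@end lisp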
  796. @rnindex open-input-file
  797. @deffn {Scheme Procedure} open-input-file filename @
  798. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  799. Open @var{filename} for input. If @var{binary} is true, open the port
  800. in binary mode, otherwise use text mode. @var{encoding} and
  801. @var{guess-encoding} determine the character encoding as described above
  802. for @code{open-file}. Equivalent to
@lisp
(open-file @var{filename}
           (if @var{binary} "rb" "r")
           #:guess-encoding @var{guess-encoding}
           #:encoding @var{encoding})
@end lisp
  809. @end deffn
  810. @rnindex open-output-file
  811. @deffn {Scheme Procedure} open-output-file filename @
  812. [#:encoding=#f] [#:binary=#f]
  813. Open @var{filename} for output. If @var{binary} is true, open the port
  814. in binary mode, otherwise use text mode. @var{encoding} specifies the
  815. character encoding as described above for @code{open-file}. Equivalent
  816. to
@lisp
(open-file @var{filename}
           (if @var{binary} "wb" "w")
           #:encoding @var{encoding})
@end lisp
  822. @end deffn
  823. @deffn {Scheme Procedure} call-with-input-file filename proc @
  824. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  825. @deffnx {Scheme Procedure} call-with-output-file filename proc @
  826. [#:encoding=#f] [#:binary=#f]
  827. @rnindex call-with-input-file
  828. @rnindex call-with-output-file
  829. Open @var{filename} for input or output, and call @code{(@var{proc}
  830. port)} with the resulting port. Return the value returned by
  831. @var{proc}. @var{filename} is opened as per @code{open-input-file} or
  832. @code{open-output-file} respectively, and an error is signaled if it
  833. cannot be opened.
  834. When @var{proc} returns, the port is closed. If @var{proc} does not
  835. return (e.g.@: if it throws an error), then the port might not be
  836. closed automatically, though it will be garbage collected in the usual
  837. way if not otherwise referenced.
  838. @end deffn
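For instance, a datum can be saved and read back as in this minimal
sketch. The file name @file{call-with-demo.scm} is hypothetical.

@lisp
(define demo-file "call-with-demo.scm")  ; hypothetical scratch file

;; The port is passed to the procedure and closed when it returns.
(call-with-output-file demo-file
  (lambda (port)
    (write '(1 2 3) port)))

;; `read' accepts the port as its argument, so it can be used
;; directly as PROC.
(call-with-input-file demo-file read) @result{} (1 2 3)
@end lisp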
  839. @deffn {Scheme Procedure} with-input-from-file filename thunk @
  840. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  841. @deffnx {Scheme Procedure} with-output-to-file filename thunk @
  842. [#:encoding=#f] [#:binary=#f]
  843. @deffnx {Scheme Procedure} with-error-to-file filename thunk @
  844. [#:encoding=#f] [#:binary=#f]
  845. @rnindex with-input-from-file
  846. @rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as, respectively, the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}. Return the
  850. value returned by @var{thunk}. @var{filename} is opened as per
  851. @code{open-input-file} or @code{open-output-file} respectively, and an
  852. error is signaled if it cannot be opened.
  853. When @var{thunk} returns, the port is closed and the previous setting
  854. of the respective current port is restored.
  855. The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: an
  857. exception), and if @var{thunk} is re-entered (via a captured
  858. continuation) then it's set again to the @var{filename} port.
  859. The port is closed when @var{thunk} returns normally, but not when
  860. exited via an exception or new continuation. This ensures it's still
  861. ready for use if @var{thunk} is re-entered by a captured continuation.
  862. Of course the port is always garbage collected and closed in the usual
  863. way when no longer referenced anywhere.
  864. @end deffn
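The following minimal sketch shows the redirection in both directions.
It assumes the @code{(ice-9 rdelim)} module for @code{read-line}, and
the file name @file{with-demo.txt} is hypothetical.

@lisp
(use-modules (ice-9 rdelim))

(define demo-file "with-demo.txt")       ; hypothetical scratch file

;; Inside the thunk, `display' and `newline' write to the file
;; because it is the current output port.
(with-output-to-file demo-file
  (lambda ()
    (display "hello")
    (newline)))

;; `read-line' called with no arguments reads from the current
;; input port, which is the file here.
(with-input-from-file demo-file read-line) @result{} "hello"
@end lisp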
  865. @deffn {Scheme Procedure} port-mode port
  866. @deffnx {C Function} scm_port_mode (port)
  867. Return the port modes associated with the open port @var{port}.
  868. These will not necessarily be identical to the modes used when
  869. the port was opened, since modes such as "append" which are
  870. used only during port creation are not retained.
  871. @end deffn
  872. @deffn {Scheme Procedure} port-filename port
  873. @deffnx {C Function} scm_port_filename (port)
  874. Return the filename associated with @var{port}, or @code{#f} if no
  875. filename is associated with the port.
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
  878. @end deffn
  879. @deffn {Scheme Procedure} set-port-filename! port filename
  880. @deffnx {C Function} scm_set_port_filename_x (port, filename)
Change the filename associated with @var{port}. Note that this does
not change the port's source of data, but only the value that is
returned by @code{port-filename} and reported in diagnostic output.
  885. @end deffn
  886. @deffn {Scheme Procedure} file-port? obj
  887. @deffnx {C Function} scm_file_port_p (obj)
  888. Determine whether @var{obj} is a port that is related to a file.
  889. @end deffn
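A minimal sketch of these accessors, using a string port so that no file
is needed. The name @file{snippet.scm} is a hypothetical label chosen
for the example.

@lisp
(define port (open-input-string "(+ 1 2)"))

(port-filename port)   @result{} #f
(file-port? port)      @result{} #f

;; Attach a name for use in diagnostics; the data source is unchanged.
(set-port-filename! port "snippet.scm")
(port-filename port)   @result{} "snippet.scm"
(read port)            @result{} (+ 1 2)
@end lisp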
  890. @node String Ports
  891. @subsubsection String Ports
  892. @cindex String port
  893. @cindex Port, string
The following procedures allow string ports to be opened by analogy
with the R4RS file port facilities.
The character encoding of a string port is treated differently from
that of other types of port. When string ports are created, they do not
inherit a character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
Typically one should not modify a string port's character encoding
away from its default.
  902. @deffn {Scheme Procedure} call-with-output-string proc
  903. @deffnx {C Function} scm_call_with_output_string (proc)
  904. Calls the one-argument procedure @var{proc} with a newly created output
  905. port. When the function returns, the string composed of the characters
  906. written into the port is returned. @var{proc} should not close the port.
  907. @end deffn
  908. @deffn {Scheme Procedure} call-with-input-string string proc
  909. @deffnx {C Function} scm_call_with_input_string (string, proc)
  910. Calls the one-argument procedure @var{proc} with a newly
  911. created input port from which @var{string}'s contents may be
  912. read. The value yielded by the @var{proc} is returned.
  913. @end deffn
  914. @deffn {Scheme Procedure} with-output-to-string thunk
  915. Calls the zero-argument procedure @var{thunk} with the current output
  916. port set temporarily to a new string port. It returns a string
  917. composed of the characters written to the current output.
  918. @end deffn
  919. @deffn {Scheme Procedure} with-input-from-string string thunk
  920. Calls the zero-argument procedure @var{thunk} with the current input
  921. port set temporarily to a string port opened on the specified
  922. @var{string}. The value yielded by @var{thunk} is returned.
  923. @end deffn
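The four procedures above can be compared side by side in this short
sketch, which uses only the procedures just described together with
@code{read}, @code{write} and @code{display}.

@lisp
;; The procedure receives the string port as an argument.
(call-with-output-string
  (lambda (port) (write '(a b) port)))    @result{} "(a b)"

(call-with-input-string "42 rest" read)   @result{} 42

;; The thunk uses the current output or input port implicitly.
(with-output-to-string
  (lambda () (display "hi")))             @result{} "hi"

(with-input-from-string "(1 2)" read)     @result{} (1 2)
@end lisp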
  924. @deffn {Scheme Procedure} open-input-string str
  925. @deffnx {C Function} scm_open_input_string (str)
  926. Take a string and return an input port that delivers characters
  927. from the string. The port can be closed by
  928. @code{close-input-port}, though its storage will be reclaimed
  929. by the garbage collector if it becomes inaccessible.
  930. @end deffn
  931. @deffn {Scheme Procedure} open-output-string
  932. @deffnx {C Function} scm_open_output_string ()
  933. Return an output port that will accumulate characters for
  934. retrieval by @code{get-output-string}. The port can be closed
  935. by the procedure @code{close-output-port}, though its storage
  936. will be reclaimed by the garbage collector if it becomes
  937. inaccessible.
  938. @end deffn
  939. @deffn {Scheme Procedure} get-output-string port
  940. @deffnx {C Function} scm_get_output_string (port)
  941. Given an output port created by @code{open-output-string},
  942. return a string consisting of the characters that have been
  943. output to the port so far.
@code{get-output-string} must be used before closing @var{port}; once
closed, the string cannot be obtained.
  946. @end deffn
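A minimal sketch of explicitly opened string ports. The variable names
@code{in} and @code{out} are chosen for the example.

@lisp
(define in (open-input-string "ab"))
(read-char in)                 @result{} #\a
(read-char in)                 @result{} #\b
(eof-object? (read-char in))   @result{} #t

(define out (open-output-string))
(display "abc" out)
(get-output-string out)        @result{} "abc"

;; The accumulated string keeps growing with further output.
(write 42 out)
(get-output-string out)        @result{} "abc42"
(close-output-port out)
@end lisp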
  947. A string port can be used in many procedures which accept a port
  948. but which are not dependent on implementation details of fports.
  949. E.g., seeking and truncating will work on a string port,
  950. but trying to extract the file descriptor number will fail.
  951. @node Soft Ports
  952. @subsubsection Soft Ports
  953. @cindex Soft port
  954. @cindex Port, soft
A @dfn{soft port} is a port based on a vector of procedures capable of
accepting or delivering characters. It allows emulation of I/O ports.
  957. @deffn {Scheme Procedure} make-soft-port pv modes
  958. @deffnx {C Function} scm_make_soft_port (pv, modes)
  959. Return a port capable of receiving or delivering characters as
  960. specified by the @var{modes} string (@pxref{File Ports,
  961. open-file}). @var{pv} must be a vector of length 5 or 6. Its
  962. components are as follows:
  963. @enumerate 0
  964. @item
  965. procedure accepting one character for output
  966. @item
  967. procedure accepting a string for output
  968. @item
  969. thunk for flushing output
  970. @item
  971. thunk for getting one character
  972. @item
  973. thunk for closing port (not by garbage collection)
  974. @item
  975. (if present and not @code{#f}) thunk for computing the number of
  976. characters that can be read from the port without blocking.
  977. @end enumerate
  978. For an output-only port only elements 0, 1, 2, and 4 need be
  979. procedures. For an input-only port only elements 3 and 4 need
  980. be procedures. Thunks 2 and 4 can instead be @code{#f} if
  981. there is no useful operation for them to perform.
  982. If thunk 3 returns @code{#f} or an @code{eof-object}
  983. (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
  984. Scheme}) it indicates that the port has reached end-of-file.
  985. For example:
@lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))            ; 0: write a character
            (lambda (s) (display s stdout))          ; 1: write a string
            (lambda () (display "." stdout))         ; 2: flush output
            (lambda () (char-upcase (read-char)))    ; 3: read a character
            (lambda () (display "@@" stdout)))       ; 4: close the port
           "rw"))

(write p p) @result{} #<input-output: soft 8081e20>
@end lisp
  998. @end deffn
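As a further illustration, the following minimal sketch builds an
output-only soft port that collects everything written to it. The names
@code{pieces}, @code{collector} and @code{collected-output} are
hypothetical. Elements 2 and 4 are @code{#f} because no flushing or
closing action is needed, and element 3 is @code{#f} because the port is
not used for input.

@lisp
(define pieces '())                     ; collected output, newest first

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! pieces (cons (string c) pieces)))  ; one character
    (lambda (s) (set! pieces (cons s pieces)))           ; a string
    #f                                                   ; no flush action
    #f                                                   ; no input
    #f)                                                  ; no close action
   "w"))

(display "abc" collector)
(write-char #\d collector)
(force-output collector)

;; Join the collected pieces in the order they were written.
(define (collected-output)
  (apply string-append (reverse pieces)))
@end lisp

After the call to @code{force-output}, @code{(collected-output)} is
expected to return @code{"abcd"}. How the output is split between the
character and string procedures may differ between Guile versions.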
  999. @node Void Ports
  1000. @subsubsection Void Ports
  1001. @cindex Void port
  1002. @cindex Port, void
  1003. This kind of port causes any data to be discarded when written to, and
  1004. always returns the end-of-file object when read from.
  1005. @deffn {Scheme Procedure} %make-void-port mode
  1006. @deffnx {C Function} scm_sys_make_void_port (mode)
  1007. Create and return a new void port. A void port acts like
  1008. @file{/dev/null}. The @var{mode} argument
  1009. specifies the input/output modes for this port: see the
  1010. documentation for @code{open-file} in @ref{File Ports}.
  1011. @end deffn
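A minimal sketch of a void port opened for both input and output,
assuming the mode string @code{"r+"} as described for @code{open-file}.
The variable name @code{void} is chosen for the example.

@lisp
(define void (%make-void-port "r+"))

;; Output is silently discarded.
(display "this text is discarded" void)

;; Input is always at end of file.
(eof-object? (read-char void))   @result{} #t
@end lisp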
  1012. @node R6RS I/O Ports
  1013. @subsection R6RS I/O Ports
  1014. @cindex R6RS
  1015. @cindex R6RS ports
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
  1017. the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
  1018. io ports)} module. It provides features, such as binary I/O and Unicode
  1019. string I/O, that complement or refine Guile's historical port API
  1020. presented above (@pxref{Input and Output}). Note that R6RS ports are not
  1021. disjoint from Guile's native ports, so Guile-specific procedures will
  1022. work on ports created using the R6RS API, and vice versa.
  1023. The text in this section is taken from the R6RS standard libraries
document, with only minor adaptations for inclusion in this manual. The
  1025. Guile developers offer their thanks to the R6RS editors for having
  1026. provided the report's text under permissive conditions making this
  1027. possible.
  1028. @c FIXME: Update description when implemented.
  1029. @emph{Note}: The implementation of this R6RS API is not complete yet.
  1030. @menu
  1031. * R6RS File Names:: File names.
  1032. * R6RS File Options:: Options for opening files.
  1033. * R6RS Buffer Modes:: Influencing buffering behavior.
  1034. * R6RS Transcoders:: Influencing port encoding.
  1035. * R6RS End-of-File:: The end-of-file object.
  1036. * R6RS Port Manipulation:: Manipulating R6RS ports.
  1037. * R6RS Input Ports:: Input Ports.
  1038. * R6RS Binary Input:: Binary input.
  1039. * R6RS Textual Input:: Textual input.
  1040. * R6RS Output Ports:: Output Ports.
  1041. * R6RS Binary Output:: Binary output.
  1042. * R6RS Textual Output:: Textual output.
  1043. @end menu
  1044. A subset of the @code{(rnrs io ports)} module, plus one non-standard
  1045. procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
  1046. provided by the @code{(ice-9 binary-ports)} module. It contains binary
  1047. input/output procedures and does not rely on R6RS support.
  1048. @node R6RS File Names
  1049. @subsubsection File Names
  1050. Some of the procedures described in this chapter accept a file name as an
  1051. argument. Valid values for such a file name include strings that name a file
  1052. using the native notation of file system paths on an implementation's
  1053. underlying operating system, and may include implementation-dependent
  1054. values as well.
  1055. A @var{filename} parameter name means that the
  1056. corresponding argument must be a file name.
  1057. @node R6RS File Options
  1058. @subsubsection File Options
  1059. @cindex file options
  1060. When opening a file, the various procedures in this library accept a
  1061. @code{file-options} object that encapsulates flags to specify how the
  1062. file is to be opened. A @code{file-options} object is an enum-set
  1063. (@pxref{rnrs enums}) over the symbols constituting valid file options.
  1064. A @var{file-options} parameter name means that the corresponding
  1065. argument must be a file-options object.
  1066. @deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
  1067. Each @var{file-options-symbol} must be a symbol.
  1068. The @code{file-options} syntax returns a file-options object that
  1069. encapsulates the specified options.
  1070. When supplied to an operation that opens a file for output, the
  1071. file-options object returned by @code{(file-options)} specifies that the
  1072. file is created if it does not exist and an exception with condition
  1073. type @code{&i/o-file-already-exists} is raised if it does exist. The
  1074. following standard options can be included to modify the default
  1075. behavior.
  1076. @table @code
  1077. @item no-create
  1078. If the file does not already exist, it is not created;
  1079. instead, an exception with condition type @code{&i/o-file-does-not-exist}
  1080. is raised.
  1081. If the file already exists, the exception with condition type
  1082. @code{&i/o-file-already-exists} is not raised
  1083. and the file is truncated to zero length.
  1084. @item no-fail
  1085. If the file already exists, the exception with condition type
  1086. @code{&i/o-file-already-exists} is not raised,
  1087. even if @code{no-create} is not included,
  1088. and the file is truncated to zero length.
  1089. @item no-truncate
  1090. If the file already exists and the exception with condition type
  1091. @code{&i/o-file-already-exists} has been inhibited by inclusion of
  1092. @code{no-create} or @code{no-fail}, the file is not truncated, but
  1093. the port's current position is still set to the beginning of the
  1094. file.
  1095. @end table
  1096. These options have no effect when a file is opened only for input.
  1097. Symbols other than those listed above may be used as
  1098. @var{file-options-symbol}s; they have implementation-specific meaning,
  1099. if any.
  1100. @quotation Note
  1101. Only the name of @var{file-options-symbol} is significant.
  1102. @end quotation
  1103. @end deffn
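The effect of @code{no-fail} can be sketched as follows. This assumes a
Guile version in which @code{open-file-output-port} honors file options,
and it uses @code{open-file-input-port}, @code{put-bytevector},
@code{get-bytevector-all} and @code{call-with-port} from @code{(rnrs io
ports)}. The file name @file{file-options-demo.bin} and the helper
@code{write-bytes} are hypothetical.

@lisp
(use-modules (rnrs io ports)
             (rnrs bytevectors))

(define demo-file "file-options-demo.bin")   ; hypothetical scratch file

;; Write the list of octets BYTES to DEMO-FILE.  With `no-fail', an
;; existing file is truncated instead of causing an exception.
(define (write-bytes bytes)
  (let ((port (open-file-output-port demo-file
                                     (file-options no-fail))))
    (put-bytevector port (u8-list->bytevector bytes))
    (close-port port)))

(write-bytes '(1 2 3))
(write-bytes '(4 5))            ; replaces the previous contents

(define stored
  (call-with-port (open-file-input-port demo-file)
    get-bytevector-all))
@end lisp

Under the stated assumption, @code{stored} should be a bytevector
holding the octets 4 and 5 only.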
  1104. @node R6RS Buffer Modes
  1105. @subsubsection Buffer Modes
  1106. Each port has an associated buffer mode. For an output port, the
  1107. buffer mode defines when an output operation flushes the buffer
  1108. associated with the output port. For an input port, the buffer mode
  1109. defines how much data will be read to satisfy read operations. The
  1110. possible buffer modes are the symbols @code{none} for no buffering,
  1111. @code{line} for flushing upon line endings and reading up to line
  1112. endings, or other implementation-dependent behavior,
  1113. and @code{block} for arbitrary buffering. This section uses
  1114. the parameter name @var{buffer-mode} for arguments that must be
  1115. buffer-mode symbols.
  1116. If two ports are connected to the same mutable source, both ports
  1117. are unbuffered, and reading a byte or character from that shared
  1118. source via one of the two ports would change the bytes or characters
  1119. seen via the other port, a lookahead operation on one port will
  1120. render the peeked byte or character inaccessible via the other port,
  1121. while a subsequent read operation on the peeked port will see the
  1122. peeked byte or character even though the port is otherwise unbuffered.
  1123. In other words, the semantics of buffering is defined in terms of side
  1124. effects on shared mutable sources, and a lookahead operation has the
  1125. same side effect on the shared source as a read operation.
  1126. @deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
  1127. @var{buffer-mode-symbol} must be a symbol whose name is one of
  1128. @code{none}, @code{line}, and @code{block}. The result is the
  1129. corresponding symbol, and specifies the associated buffer mode.
  1130. @quotation Note
  1131. Only the name of @var{buffer-mode-symbol} is significant.
  1132. @end quotation
  1133. @end deffn
  1134. @deffn {Scheme Procedure} buffer-mode? obj
  1135. Returns @code{#t} if the argument is a valid buffer-mode symbol, and
  1136. returns @code{#f} otherwise.
  1137. @end deffn
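A brief sketch of the two forms above, assuming the @code{(rnrs io
ports)} module has been loaded:

@lisp
(use-modules (rnrs io ports))

;; The syntax evaluates to the named symbol.
(buffer-mode line)      @result{} line

;; The predicate accepts only the three buffer-mode symbols.
(buffer-mode? 'block)   @result{} #t
(buffer-mode? 'full)    @result{} #f
@end lisp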
  1138. @node R6RS Transcoders
  1139. @subsubsection Transcoders
  1140. @cindex codec
  1141. @cindex end-of-line style
  1142. @cindex transcoder
  1143. @cindex binary port
  1144. @cindex textual port
  1145. Several different Unicode encoding schemes describe standard ways to
  1146. encode characters and strings as byte sequences and to decode those
  1147. sequences. Within this document, a @dfn{codec} is an immutable Scheme
  1148. object that represents a Unicode or similar encoding scheme.
  1149. An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
  1150. describes how a textual port transcodes representations of line endings.
  1151. A @dfn{transcoder} is an immutable Scheme object that combines a codec
  1152. with an end-of-line style and a method for handling decoding errors.
  1153. Each transcoder represents some specific bidirectional (but not
  1154. necessarily lossless), possibly stateful translation between byte
  1155. sequences and Unicode characters and strings. Every transcoder can
  1156. operate in the input direction (bytes to characters) or in the output
  1157. direction (characters to bytes). A @var{transcoder} parameter name
  1158. means that the corresponding argument must be a transcoder.
  1159. A @dfn{binary port} is a port that supports binary I/O, does not have an
  1160. associated transcoder and does not support textual I/O. A @dfn{textual
  1161. port} is a port that supports textual I/O, and does not support binary
  1162. I/O. A textual port may or may not have an associated transcoder.
  1163. @deffn {Scheme Procedure} latin-1-codec
  1164. @deffnx {Scheme Procedure} utf-8-codec
  1165. @deffnx {Scheme Procedure} utf-16-codec
  1166. These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
  1167. encoding schemes.
  1168. A call to any of these procedures returns a value that is equal in the
  1169. sense of @code{eqv?} to the result of any other call to the same
  1170. procedure.
  1171. @end deffn
  1172. @deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
  1173. @var{eol-style-symbol} should be a symbol whose name is one of
  1174. @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
  1175. and @code{none}.
  1176. The form evaluates to the corresponding symbol. If the name of
  1177. @var{eol-style-symbol} is not one of these symbols, the effect and
  1178. result are implementation-dependent; in particular, the result may be an
  1179. eol-style symbol acceptable as an @var{eol-style} argument to
  1180. @code{make-transcoder}. Otherwise, an exception is raised.
  1181. All eol-style symbols except @code{none} describe a specific
  1182. line-ending encoding:
  1183. @table @code
  1184. @item lf
  1185. linefeed
  1186. @item cr
  1187. carriage return
  1188. @item crlf
  1189. carriage return, linefeed
  1190. @item nel
  1191. next line
  1192. @item crnel
  1193. carriage return, next line
  1194. @item ls
  1195. line separator
  1196. @end table
  1197. For a textual port with a transcoder, and whose transcoder has an
  1198. eol-style symbol @code{none}, no conversion occurs. For a textual input
  1199. port, any eol-style symbol other than @code{none} means that all of the
  1200. above line-ending encodings are recognized and are translated into a
  1201. single linefeed. For a textual output port, @code{none} and @code{lf}
  1202. are equivalent. Linefeed characters are encoded according to the
  1203. specified eol-style symbol, and all other characters that participate in
  1204. possible line endings are encoded as is.
  1205. @quotation Note
  1206. Only the name of @var{eol-style-symbol} is significant.
  1207. @end quotation
  1208. @end deffn
  1209. @deffn {Scheme Procedure} native-eol-style
  1210. Returns the default end-of-line style of the underlying platform, e.g.,
  1211. @code{lf} on Unix and @code{crlf} on Windows.
  1212. @end deffn
  1213. @deffn {Condition Type} &i/o-decoding
  1214. @deffnx {Scheme Procedure} make-i/o-decoding-error port
  1215. @deffnx {Scheme Procedure} i/o-decoding-error? obj
  1216. This condition type could be defined by
  1217. @lisp
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
  1220. @end lisp
  1221. An exception with this type is raised when one of the operations for
  1222. textual input from a port encounters a sequence of bytes that cannot be
  1223. translated into a character or string by the input direction of the
  1224. port's transcoder.
  1225. When such an exception is raised, the port's position is past the
  1226. invalid encoding.
  1227. @end deffn
  1228. @deffn {Condition Type} &i/o-encoding
  1229. @deffnx {Scheme Procedure} make-i/o-encoding-error port char
  1230. @deffnx {Scheme Procedure} i/o-encoding-error? obj
  1231. @deffnx {Scheme Procedure} i/o-encoding-error-char condition
  1232. This condition type could be defined by
  1233. @lisp
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
  1237. @end lisp
  1238. An exception with this type is raised when one of the operations for
  1239. textual output to a port encounters a character that cannot be
  1240. translated into bytes by the output direction of the port's transcoder.
  1241. @var{char} is the character that could not be encoded.
  1242. @end deffn
  1243. @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
  1244. @var{error-handling-mode-symbol} should be a symbol whose name is one of
  1245. @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
  1246. the corresponding symbol. If @var{error-handling-mode-symbol} is not
  1247. one of these identifiers, effect and result are
  1248. implementation-dependent: The result may be an error-handling-mode
  1249. symbol acceptable as a @var{handling-mode} argument to
  1250. @code{make-transcoder}. If it is not acceptable as a
  1251. @var{handling-mode} argument to @code{make-transcoder}, an exception is
  1252. raised.
  1253. @quotation Note
  1254. Only the name of @var{error-handling-mode-symbol} is significant.
  1255. @end quotation
  1256. The error-handling mode of a transcoder specifies the behavior
  1257. of textual I/O operations in the presence of encoding or decoding
  1258. errors.
  1259. If a textual input operation encounters an invalid or incomplete
  1260. character encoding, and the error-handling mode is @code{ignore}, an
  1261. appropriate number of bytes of the invalid encoding are ignored and
  1262. decoding continues with the following bytes.
  1263. If the error-handling mode is @code{replace}, the replacement
  1264. character U+FFFD is injected into the data stream, an appropriate
  1265. number of bytes are ignored, and decoding
  1266. continues with the following bytes.
  1267. If the error-handling mode is @code{raise}, an exception with condition
  1268. type @code{&i/o-decoding} is raised.
  1269. If a textual output operation encounters a character it cannot encode,
  1270. and the error-handling mode is @code{ignore}, the character is ignored
  1271. and encoding continues with the next character. If the error-handling
  1272. mode is @code{replace}, a codec-specific replacement character is
  1273. emitted by the transcoder, and encoding continues with the next
  1274. character. The replacement character is U+FFFD for transcoders whose
  1275. codec is one of the Unicode encodings, but is the @code{?} character
  1276. for the Latin-1 encoding. If the error-handling mode is @code{raise},
  1277. an exception with condition type @code{&i/o-encoding} is raised.
  1278. @end deffn
  1279. @deffn {Scheme Procedure} make-transcoder codec
  1280. @deffnx {Scheme Procedure} make-transcoder codec eol-style
  1281. @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
  1282. @var{codec} must be a codec; @var{eol-style}, if present, an eol-style
  1283. symbol; and @var{handling-mode}, if present, an error-handling-mode
  1284. symbol.
  1285. @var{eol-style} may be omitted, in which case it defaults to the native
  1286. end-of-line style of the underlying platform. @var{handling-mode} may
  1287. be omitted, in which case it defaults to @code{replace}. The result is
  1288. a transcoder with the behavior specified by its arguments.
  1289. @end deffn
@deffn {Scheme Procedure} native-transcoder
  1291. Returns an implementation-dependent transcoder that represents a
  1292. possibly locale-dependent ``native'' transcoding.
  1293. @end deffn
  1294. @deffn {Scheme Procedure} transcoder-codec transcoder
  1295. @deffnx {Scheme Procedure} transcoder-eol-style transcoder
  1296. @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
  1297. These are accessors for transcoder objects; when applied to a
  1298. transcoder returned by @code{make-transcoder}, they return the
  1299. @var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
  1300. respectively.
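
As a minimal sketch, assuming the @code{(rnrs io ports)} module has been
imported, a transcoder can be built and its fields read back as follows
(the name @code{tc} is purely illustrative):

@lisp
(use-modules (rnrs io ports))

;; Build a transcoder with explicit end-of-line and error handling.
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style tc)
@result{} lf
(transcoder-error-handling-mode tc)
@result{} raise
@end lisp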
  1301. @end deffn
  1302. @deffn {Scheme Procedure} bytevector->string bytevector transcoder
  1303. Returns the string that results from transcoding the
  1304. @var{bytevector} according to the input direction of the transcoder.
  1305. @end deffn
  1306. @deffn {Scheme Procedure} string->bytevector string transcoder
  1307. Returns the bytevector that results from transcoding the
  1308. @var{string} according to the output direction of the transcoder.
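
The two conversions are inverses of each other for text that the codec
can represent.  The sketch below assumes a UTF-8 transcoder with default
end-of-line style and handling mode; @code{tc} is an illustrative name,
and the results shown are what one should expect for plain ASCII input:

@lisp
(use-modules (rnrs io ports))

(define tc (make-transcoder (utf-8-codec)))

(string->bytevector "hello" tc)
@result{} #vu8(104 101 108 108 111)

(bytevector->string #vu8(104 101 108 108 111) tc)
@result{} "hello"
@end lisp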
  1309. @end deffn
  1310. @node R6RS End-of-File
  1311. @subsubsection The End-of-File Object
  1312. @cindex EOF
  1313. @cindex end-of-file
  1314. R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
  1315. ports)} module:
  1316. @deffn {Scheme Procedure} eof-object? obj
  1317. @deffnx {C Function} scm_eof_object_p (obj)
  1318. Return true if @var{obj} is the end-of-file (EOF) object.
  1319. @end deffn
  1320. In addition, the following procedure is provided:
  1321. @deffn {Scheme Procedure} eof-object
  1322. @deffnx {C Function} scm_eof_object ()
  1323. Return the end-of-file (EOF) object.
  1324. @lisp
  1325. (eof-object? (eof-object))
  1326. @result{} #t
  1327. @end lisp
  1328. @end deffn
  1329. @node R6RS Port Manipulation
  1330. @subsubsection Port Manipulation
  1331. The procedures listed below operate on any kind of R6RS I/O port.
  1332. @deffn {Scheme Procedure} port? obj
  1333. Returns @code{#t} if the argument is a port, and returns @code{#f}
  1334. otherwise.
  1335. @end deffn
  1336. @deffn {Scheme Procedure} port-transcoder port
  1337. Returns the transcoder associated with @var{port} if @var{port} is
  1338. textual and has an associated transcoder, and returns @code{#f} if
  1339. @var{port} is binary or does not have an associated transcoder.
  1340. @end deffn
  1341. @deffn {Scheme Procedure} binary-port? port
  1342. Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
  1343. binary data input/output.
  1344. Note that internally Guile does not differentiate between binary and
  1345. textual ports, unlike the R6RS. Thus, this procedure returns true when
  1346. @var{port} does not have an associated encoding---i.e., when
  1347. @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
  1348. port-encoding}). This is the case for ports returned by R6RS procedures
  1349. such as @code{open-bytevector-input-port} and
  1350. @code{make-custom-binary-output-port}.
  1351. However, Guile currently does not prevent use of textual I/O procedures
  1352. such as @code{display} or @code{read-char} with binary ports. Doing so
  1353. ``upgrades'' the port from binary to textual, under the ISO-8859-1
  1354. encoding. Likewise, Guile does not prevent use of
  1355. @code{set-port-encoding!} on a binary port, which also turns it into a
  1356. ``textual'' port.
  1357. @end deffn
  1358. @deffn {Scheme Procedure} textual-port? port
  1359. Always return @code{#t}, as all ports can be used for textual I/O in
  1360. Guile.
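
The following sketch illustrates the two predicates on a port that was
created without an encoding; @code{p} is an illustrative name:

@lisp
(use-modules (rnrs io ports))

(define p (open-bytevector-input-port #vu8(1 2 3)))

(binary-port? p)
@result{} #t
(textual-port? p)
@result{} #t
@end lisp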
  1361. @end deffn
  1362. @deffn {Scheme Procedure} transcoded-port binary-port transcoder
  1363. The @code{transcoded-port} procedure
  1364. returns a new textual port with the specified @var{transcoder}.
  1365. Otherwise the new textual port's state is largely the same as
  1366. that of @var{binary-port}.
  1367. If @var{binary-port} is an input port, the new textual
  1368. port will be an input port and
  1369. will transcode the bytes that have not yet been read from
  1370. @var{binary-port}.
  1371. If @var{binary-port} is an output port, the new textual
  1372. port will be an output port and
  1373. will transcode output characters into bytes that are
  1374. written to the byte sink represented by @var{binary-port}.
  1375. As a side effect, however, @code{transcoded-port}
  1376. closes @var{binary-port} in
  1377. a special way that allows the new textual port to continue to
  1378. use the byte source or sink represented by @var{binary-port},
  1379. even though @var{binary-port} itself is closed and cannot
  1380. be used by the input and output operations described in this
  1381. chapter.
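
As a hedged illustration, the sketch below wraps a bytevector input port
in a UTF-8 transcoder and reads the decoded text; the names
@code{binary} and @code{textual} are illustrative only:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define binary
  (open-bytevector-input-port (string->utf8 "hello")))

;; After this call, only `textual' should be used for I/O.
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

(get-string-all textual)
@result{} "hello"
@end lisp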
  1382. @end deffn
  1383. @deffn {Scheme Procedure} port-position port
  1384. If @var{port} supports it (see below), return the offset (an integer)
  1385. indicating where the next octet will be read from/written to in
  1386. @var{port}. If @var{port} does not support this operation, an error
  1387. condition is raised.
  1388. This is similar to Guile's @code{seek} procedure with the
  1389. @code{SEEK_CUR} argument (@pxref{Random Access}).
  1390. @end deffn
  1391. @deffn {Scheme Procedure} port-has-port-position? port
Return @code{#t} if @var{port} supports @code{port-position}.
  1393. @end deffn
  1394. @deffn {Scheme Procedure} set-port-position! port offset
  1395. If @var{port} supports it (see below), set the position where the next
  1396. octet will be read from/written to @var{port} to @var{offset} (an
  1397. integer). If @var{port} does not support this operation, an error
  1398. condition is raised.
  1399. This is similar to Guile's @code{seek} procedure with the
  1400. @code{SEEK_SET} argument (@pxref{Random Access}).
  1401. @end deffn
  1402. @deffn {Scheme Procedure} port-has-set-port-position!? port
Return @code{#t} if @var{port} supports @code{set-port-position!}.
  1404. @end deffn
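
The positioning procedures above can be sketched on a bytevector input
port, which is assumed here to support both operations; @code{port} is
an illustrative name:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(10 20 30 40)))

(port-has-port-position? port)
@result{} #t
(get-u8 port)
@result{} 10
(port-position port)          ; one octet consumed so far
@result{} 1
(set-port-position! port 3)   ; jump to the last octet
(get-u8 port)
@result{} 40
@end lisp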
  1405. @deffn {Scheme Procedure} call-with-port port proc
  1406. Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
  1407. of @var{proc}. Return the return values of @var{proc}.
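
For example, the following sketch reads a whole bytevector port and
leaves it closed afterwards:

@lisp
(use-modules (rnrs io ports))

(call-with-port (open-bytevector-input-port #vu8(1 2 3))
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp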
  1408. @end deffn
  1409. @node R6RS Input Ports
  1410. @subsubsection Input Ports
  1411. @deffn {Scheme Procedure} input-port? obj
  1412. Returns @code{#t} if the argument is an input port (or a combined input
  1413. and output port), and returns @code{#f} otherwise.
  1414. @end deffn
  1415. @deffn {Scheme Procedure} port-eof? input-port
  1416. Returns @code{#t}
  1417. if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
  1418. or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
  1419. would return
  1420. the end-of-file object, and @code{#f} otherwise.
  1421. The operation may block indefinitely if no data is available
  1422. but the port cannot be determined to be at end of file.
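
A short sketch, using illustrative port names, shows that
@code{port-eof?} only looks ahead and does not consume input:

@lisp
(use-modules (rnrs io ports))

(define empty (open-bytevector-input-port #vu8()))
(define full  (open-bytevector-input-port #vu8(1)))

(port-eof? empty)
@result{} #t
(port-eof? full)
@result{} #f
(get-u8 full)                 ; the octet is still there
@result{} 1
@end lisp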
  1423. @end deffn
  1424. @deffn {Scheme Procedure} open-file-input-port filename
  1425. @deffnx {Scheme Procedure} open-file-input-port filename file-options
  1426. @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
  1427. @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
  1428. @var{maybe-transcoder} must be either a transcoder or @code{#f}.
The @code{open-file-input-port} procedure returns an
input port for the named file.  The @var{file-options},
@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
  1432. The @var{file-options} argument, which may determine
  1433. various aspects of the returned port (@pxref{R6RS File Options}),
  1434. defaults to the value of @code{(file-options)}.
  1435. The @var{buffer-mode} argument, if supplied,
  1436. must be one of the symbols that name a buffer mode.
  1437. The @var{buffer-mode} argument defaults to @code{block}.
  1438. If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
  1439. with the returned port.
  1440. If @var{maybe-transcoder} is @code{#f} or absent,
  1441. the port will be a binary port and will support the
  1442. @code{port-position} and @code{set-port-position!} operations.
  1443. Otherwise the port will be a textual port, and whether it supports
  1444. the @code{port-position} and @code{set-port-position!} operations
  1445. is implementation-dependent (and possibly transcoder-dependent).
  1446. @end deffn
  1447. @deffn {Scheme Procedure} standard-input-port
  1448. Returns a fresh binary input port connected to standard input. Whether
  1449. the port supports the @code{port-position} and @code{set-port-position!}
  1450. operations is implementation-dependent.
  1451. @end deffn
  1452. @deffn {Scheme Procedure} current-input-port
  1453. This returns a default textual port for input. Normally, this default
  1454. port is associated with standard input, but can be dynamically
  1455. re-assigned using the @code{with-input-from-file} procedure from the
  1456. @code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
  1457. may not have an associated transcoder; if it does, the transcoder is
  1458. implementation-dependent.
  1459. @end deffn
  1460. @node R6RS Binary Input
  1461. @subsubsection Binary Input
  1462. @cindex binary input
  1463. R6RS binary input ports can be created with the procedures described
  1464. below.
  1465. @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
  1466. @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
  1467. Return an input port whose contents are drawn from bytevector @var{bv}
  1468. (@pxref{Bytevectors}).
  1469. @c FIXME: Update description when implemented.
  1470. The @var{transcoder} argument is currently not supported.
  1471. @end deffn
  1472. @cindex custom binary input ports
  1473. @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
  1474. @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
  1475. Return a new custom binary input port@footnote{This is similar in spirit
  1476. to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
  1477. string) whose input is drained by invoking @var{read!} and passing it a
  1478. bytevector, an index where bytes should be written, and the number of
  1479. bytes to read. The @code{read!} procedure must return an integer
  1480. indicating the number of bytes read, or @code{0} to indicate the
  1481. end-of-file.
Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
that will be called when @code{port-position} is invoked on the custom
binary port and should return an integer indicating the position within
the underlying data stream; if @var{get-position} is @code{#f}, the
returned port does not support @code{port-position}.
  1487. Likewise, if @var{set-position!} is not @code{#f}, it should be a
  1488. one-argument procedure. When @code{set-port-position!} is invoked on the
  1489. custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
  1491. Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
  1492. invoked when the custom binary input port is closed.
  1493. Using a custom binary input port, the @code{open-bytevector-input-port}
  1494. procedure could be implemented as follows:
  1495. @lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy at most COUNT bytes from SOURCE into BV; 0 means end-of-file.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; Pass #f as the close argument: there is nothing to release.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))
  1511. (read (open-bytevector-input-port (string->utf8 "hello")))
  1512. @result{} hello
  1513. @end lisp
  1514. @end deffn
  1515. @cindex binary input
  1516. Binary input is achieved using the procedures below:
  1517. @deffn {Scheme Procedure} get-u8 port
  1518. @deffnx {C Function} scm_get_u8 (port)
  1519. Return an octet read from @var{port}, a binary input port, blocking as
  1520. necessary, or the end-of-file object.
  1521. @end deffn
  1522. @deffn {Scheme Procedure} lookahead-u8 port
  1523. @deffnx {C Function} scm_lookahead_u8 (port)
  1524. Like @code{get-u8} but does not update @var{port}'s position to point
  1525. past the octet.
  1526. @end deffn
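
The difference between the two procedures can be sketched as follows,
with @code{p} as an illustrative name:

@lisp
(use-modules (rnrs io ports))

(define p (open-bytevector-input-port #vu8(7 8)))

(lookahead-u8 p)              ; peek, position unchanged
@result{} 7
(get-u8 p)
@result{} 7
(get-u8 p)
@result{} 8
(eof-object? (get-u8 p))      ; nothing left
@result{} #t
@end lisp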
  1527. @deffn {Scheme Procedure} get-bytevector-n port count
  1528. @deffnx {C Function} scm_get_bytevector_n (port, count)
  1529. Read @var{count} octets from @var{port}, blocking as necessary and
  1530. return a bytevector containing the octets read. If fewer bytes are
  1531. available, a bytevector smaller than @var{count} is returned.
  1532. @end deffn
  1533. @deffn {Scheme Procedure} get-bytevector-n! port bv start count
  1534. @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
  1535. Read @var{count} bytes from @var{port} and store them in @var{bv}
  1536. starting at index @var{start}. Return either the number of bytes
  1537. actually read or the end-of-file object.
  1538. @end deffn
  1539. @deffn {Scheme Procedure} get-bytevector-some port
  1540. @deffnx {C Function} scm_get_bytevector_some (port)
  1541. Read from @var{port}, blocking as necessary, until bytes are available
  1542. or an end-of-file is reached. Return either the end-of-file object or a
  1543. new bytevector containing some of the available bytes (at least one),
  1544. and update the port position to point just past these bytes.
  1545. @end deffn
  1546. @deffn {Scheme Procedure} get-bytevector-all port
  1547. @deffnx {C Function} scm_get_bytevector_all (port)
  1548. Read from @var{port}, blocking as necessary, until the end-of-file is
  1549. reached. Return either a new bytevector containing the data read or the
  1550. end-of-file object (if no data were available).
  1551. @end deffn
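
The following sketch exercises the block-reading procedures above on a
five-octet port; the names @code{p} and @code{buf} are illustrative:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define p (open-bytevector-input-port #vu8(1 2 3 4 5)))
(define buf (make-bytevector 4 0))

(get-bytevector-n p 2)
@result{} #vu8(1 2)
(get-bytevector-n! p buf 1 2)  ; store two octets at index 1
@result{} 2
buf
@result{} #vu8(0 3 4 0)
(get-bytevector-all p)
@result{} #vu8(5)
(eof-object? (get-bytevector-all p))
@result{} #t
@end lisp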
  1552. The @code{(ice-9 binary-ports)} module provides the following procedure
  1553. as an extension to @code{(rnrs io ports)}:
  1554. @deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
  1555. @deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
  1556. Place the contents of @var{bv} in @var{port}, optionally starting at
  1557. index @var{start} and limiting to @var{count} octets, so that its bytes
  1558. will be read from left-to-right as the next bytes from @var{port} during
  1559. subsequent read operations. If called multiple times, the unread bytes
  1560. will be read again in last-in first-out order.
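
As a minimal sketch, assuming the @code{(ice-9 binary-ports)} module is
available, bytes pushed back with @code{unget-bytevector} are read
before the remaining contents of the port:

@lisp
(use-modules (ice-9 binary-ports))

(define p (open-bytevector-input-port #vu8(3 4)))

(unget-bytevector p #vu8(1 2))
(get-bytevector-all p)
@result{} #vu8(1 2 3 4)
@end lisp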
  1561. @end deffn
  1562. @node R6RS Textual Input
  1563. @subsubsection Textual Input
  1564. @deffn {Scheme Procedure} get-char textual-input-port
  1565. Reads from @var{textual-input-port}, blocking as necessary, until a
  1566. complete character is available from @var{textual-input-port},
  1567. or until an end of file is reached.
  1568. If a complete character is available before the next end of file,
  1569. @code{get-char} returns that character and updates the input port to
  1570. point past the character. If an end of file is reached before any
  1571. character is read, @code{get-char} returns the end-of-file object.
  1572. @end deffn
  1573. @deffn {Scheme Procedure} lookahead-char textual-input-port
  1574. The @code{lookahead-char} procedure is like @code{get-char}, but it does
  1575. not update @var{textual-input-port} to point past the character.
  1576. @end deffn
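
The two procedures can be sketched on a string port, created here with
Guile's core @code{open-input-string}; @code{p} is an illustrative name:

@lisp
(use-modules (rnrs io ports))

(define p (open-input-string "ab"))

(lookahead-char p)            ; peek, position unchanged
@result{} #\a
(get-char p)
@result{} #\a
(get-char p)
@result{} #\b
(eof-object? (get-char p))
@result{} #t
@end lisp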
  1577. @deffn {Scheme Procedure} get-string-n textual-input-port count
  1578. @var{count} must be an exact, non-negative integer object, representing
  1579. the number of characters to be read.
  1580. The @code{get-string-n} procedure reads from @var{textual-input-port},
  1581. blocking as necessary, until @var{count} characters are available, or
  1582. until an end of file is reached.
  1583. If @var{count} characters are available before end of file,
  1584. @code{get-string-n} returns a string consisting of those @var{count}
  1585. characters. If fewer characters are available before an end of file, but
  1586. one or more characters can be read, @code{get-string-n} returns a string
  1587. containing those characters. In either case, the input port is updated
  1588. to point just past the characters read. If no characters can be read
  1589. before an end of file, the end-of-file object is returned.
  1590. @end deffn
  1591. @deffn {Scheme Procedure} get-string-n! textual-input-port string start count
  1592. @var{start} and @var{count} must be exact, non-negative integer objects,
  1593. with @var{count} representing the number of characters to be read.
@var{string} must be a string with at least @math{@var{start} + @var{count}}
characters.
  1596. The @code{get-string-n!} procedure reads from @var{textual-input-port}
  1597. in the same manner as @code{get-string-n}. If @var{count} characters
  1598. are available before an end of file, they are written into @var{string}
  1599. starting at index @var{start}, and @var{count} is returned. If fewer
  1600. characters are available before an end of file, but one or more can be
  1601. read, those characters are written into @var{string} starting at index
  1602. @var{start} and the number of characters actually read is returned as an
  1603. exact integer object. If no characters can be read before an end of
  1604. file, the end-of-file object is returned.
  1605. @end deffn
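
The behavior of @code{get-string-n} and @code{get-string-n!} near the
end of input can be sketched as follows; @code{p} and @code{s} are
illustrative names:

@lisp
(use-modules (rnrs io ports))

(define p (open-input-string "abcdef"))
(define s (make-string 5 #\-))

(get-string-n p 2)
@result{} "ab"
(get-string-n! p s 1 3)       ; store three characters at index 1
@result{} 3
s
@result{} "-cde-"
(get-string-n p 10)           ; fewer than 10 characters remain
@result{} "f"
(eof-object? (get-string-n p 1))
@result{} #t
@end lisp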
  1606. @deffn {Scheme Procedure} get-string-all textual-input-port
  1607. Reads from @var{textual-input-port} until an end of file, decoding
  1608. characters in the same manner as @code{get-string-n} and
  1609. @code{get-string-n!}.
If characters are available before the end of file, a string containing
all the characters decoded from that data is returned.  If no character
  1612. precedes the end of file, the end-of-file object is returned.
  1613. @end deffn
  1614. @deffn {Scheme Procedure} get-line textual-input-port
  1615. Reads from @var{textual-input-port} up to and including the linefeed
  1616. character or end of file, decoding characters in the same manner as
  1617. @code{get-string-n} and @code{get-string-n!}.
  1618. If a linefeed character is read, a string containing all of the text up
  1619. to (but not including) the linefeed character is returned, and the port
  1620. is updated to point just past the linefeed character. If an end of file
  1621. is encountered before any linefeed character is read, but some
  1622. characters have been read and decoded as characters, a string containing
  1623. those characters is returned. If an end of file is encountered before
  1624. any characters are read, the end-of-file object is returned.
  1625. @quotation Note
  1626. The end-of-line style, if not @code{none}, will cause all line endings
  1627. to be read as linefeed characters. @xref{R6RS Transcoders}.
  1628. @end quotation
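
For instance, reading two lines from a string port, the second of which
is not terminated by a linefeed, might look like this (@code{p} is an
illustrative name):

@lisp
(use-modules (rnrs io ports))

(define p (open-input-string "first\nsecond"))

(get-line p)
@result{} "first"
(get-line p)                  ; ends at end of file, no linefeed
@result{} "second"
(eof-object? (get-line p))
@result{} #t
@end lisp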
  1629. @end deffn
@deffn {Scheme Procedure} get-datum textual-input-port
  1631. Reads an external representation from @var{textual-input-port} and returns the
  1632. datum it represents. The @code{get-datum} procedure returns the next
  1633. datum that can be parsed from the given @var{textual-input-port}, updating
  1634. @var{textual-input-port} to point exactly past the end of the external
  1635. representation of the object.
  1636. Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
  1637. Syntax}) in the input is first skipped. If an end of file occurs after
  1638. the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
  1639. is returned.
  1640. If a character inconsistent with an external representation is
  1641. encountered in the input, an exception with condition types
  1642. @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
  1643. file is encountered after the beginning of an external representation,
  1644. but the external representation is incomplete and therefore cannot be
  1645. parsed, an exception with condition types @code{&lexical} and
  1646. @code{&i/o-read} is raised.
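
As a sketch of the normal case, successive calls return successive data
and skip the interlexeme space between them; @code{p} is an illustrative
name:

@lisp
(use-modules (rnrs io ports))

(define p (open-input-string "(a b) 42 ; trailing comment"))

(get-datum p)
@result{} (a b)
(get-datum p)
@result{} 42
(eof-object? (get-datum p))   ; only a comment remains
@result{} #t
@end lisp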
  1647. @end deffn
  1648. @node R6RS Output Ports
  1649. @subsubsection Output Ports
  1650. @deffn {Scheme Procedure} output-port? obj
  1651. Returns @code{#t} if the argument is an output port (or a
  1652. combined input and output port), @code{#f} otherwise.
  1653. @end deffn
  1654. @deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{port} to the
underlying file, device, or object.  The @code{flush-output-port}
procedure returns an unspecified value.
  1658. @end deffn
  1659. @deffn {Scheme Procedure} open-file-output-port filename
  1660. @deffnx {Scheme Procedure} open-file-output-port filename file-options
  1661. @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
  1662. @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
  1663. @var{maybe-transcoder} must be either a transcoder or @code{#f}.
  1664. The @code{open-file-output-port} procedure returns an output port for the named file.
  1665. The @var{file-options} argument, which may determine various aspects of
  1666. the returned port (@pxref{R6RS File Options}), defaults to the value of
  1667. @code{(file-options)}.
  1668. The @var{buffer-mode} argument, if supplied,
  1669. must be one of the symbols that name a buffer mode.
  1670. The @var{buffer-mode} argument defaults to @code{block}.
  1671. If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
  1672. associated with the port.
  1673. If @var{maybe-transcoder} is @code{#f} or absent,
  1674. the port will be a binary port and will support the
  1675. @code{port-position} and @code{set-port-position!} operations.
  1676. Otherwise the port will be a textual port, and whether it supports
  1677. the @code{port-position} and @code{set-port-position!} operations
  1678. is implementation-dependent (and possibly transcoder-dependent).
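
The following sketch writes a few octets to a file and reads them back
with @code{open-file-input-port}.  The file name is hypothetical, and
@code{(file-options no-fail)} is used so that an existing file is
overwritten rather than causing an error:

@lisp
(use-modules (rnrs io ports))

;; Hypothetical scratch file, chosen only for illustration.
(define filename "/tmp/guile-r6rs-example.bin")

(call-with-port
    (open-file-output-port filename (file-options no-fail))
  (lambda (out)
    (put-bytevector out #vu8(1 2 3))))

(call-with-port (open-file-input-port filename)
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp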
  1679. @end deffn
  1680. @deffn {Scheme Procedure} standard-output-port
  1681. @deffnx {Scheme Procedure} standard-error-port
  1682. Returns a fresh binary output port connected to the standard output or
  1683. standard error respectively. Whether the port supports the
  1684. @code{port-position} and @code{set-port-position!} operations is
  1685. implementation-dependent.
  1686. @end deffn
  1687. @deffn {Scheme Procedure} current-output-port
  1688. @deffnx {Scheme Procedure} current-error-port
  1689. These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output and
standard error, respectively.  The return value of
  1692. @code{current-output-port} can be dynamically re-assigned using the
  1693. @code{with-output-to-file} procedure from the @code{io simple (6)}
  1694. library (@pxref{rnrs io simple}). A port returned by one of these
  1695. procedures may or may not have an associated transcoder; if it does, the
  1696. transcoder is implementation-dependent.
  1697. @end deffn
  1698. @node R6RS Binary Output
  1699. @subsubsection Binary Output
  1700. Binary output ports can be created with the procedures below.
  1701. @deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
  1702. @deffnx {C Function} scm_open_bytevector_output_port (transcoder)
  1703. Return two values: a binary output port and a procedure. The latter
  1704. should be called with zero arguments to obtain a bytevector containing
  1705. the data accumulated by the port, as illustrated below.
  1706. @lisp
(call-with-values
    (lambda ()
      (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))
  1713. @result{} #vu8(104 101 108 108 111)
  1714. @end lisp
  1715. @c FIXME: Update description when implemented.
  1716. The @var{transcoder} argument is currently not supported.
  1717. @end deffn
  1718. @cindex custom binary output ports
  1719. @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
  1720. @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
  1721. Return a new custom binary output port named @var{id} (a string) whose
  1722. output is sunk by invoking @var{write!} and passing it a bytevector, an
  1723. index where bytes should be read from this bytevector, and the number of
  1724. bytes to be ``written''. The @code{write!} procedure must return an
  1725. integer indicating the number of bytes actually written; when it is
  1726. passed @code{0} as the number of bytes to write, it should behave as
  1727. though an end-of-file was sent to the byte sink.
  1728. The other arguments are as for @code{make-custom-binary-input-port}
  1729. (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
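
As a minimal sketch, the custom port below records every octet it is
asked to write in a list.  The names @code{received} and
@code{collector} are illustrative, and the port is closed before the
list is inspected so that any buffered output is delivered first:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define received '())

;; Record COUNT octets of BV, starting at START, most recent first.
(define (write! bv start count)
  (let loop ((i 0))
    (when (< i count)
      (set! received
            (cons (bytevector-u8-ref bv (+ start i)) received))
      (loop (+ i 1))))
  count)

(define collector
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector collector #vu8(1 2 3))
(close-port collector)

(reverse received)
@result{} (1 2 3)
@end lisp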
  1730. @end deffn
  1731. @cindex binary output
  1732. Writing to a binary output port can be done using the following
  1733. procedures:
  1734. @deffn {Scheme Procedure} put-u8 port octet
  1735. @deffnx {C Function} scm_put_u8 (port, octet)
  1736. Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
  1737. binary output port.
  1738. @end deffn
  1739. @deffn {Scheme Procedure} put-bytevector port bv [start [count]]
  1740. @deffnx {C Function} scm_put_bytevector (port, bv, start, count)
  1741. Write the contents of @var{bv} to @var{port}, optionally starting at
  1742. index @var{start} and limiting to @var{count} octets.
  1743. @end deffn
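
Both procedures can be sketched with a bytevector output port; the
parameter names in the consumer are illustrative:

@lisp
(use-modules (rnrs io ports))

(call-with-values open-bytevector-output-port
  (lambda (port get-bytes)
    (put-u8 port 255)
    ;; Write two octets of the bytevector, starting at index 1.
    (put-bytevector port #vu8(1 2 3 4) 1 2)
    (get-bytes)))
@result{} #vu8(255 2 3)
@end lisp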
  1744. @node R6RS Textual Output
  1745. @subsubsection Textual Output
  1746. @deffn {Scheme Procedure} put-char port char
Writes @var{char} to the port.  The @code{put-char} procedure returns
an unspecified value.
  1748. @end deffn
  1749. @deffn {Scheme Procedure} put-string port string
  1750. @deffnx {Scheme Procedure} put-string port string start
  1751. @deffnx {Scheme Procedure} put-string port string start count
  1752. @var{start} and @var{count} must be non-negative exact integer objects.
  1753. @var{string} must have a length of at least @math{@var{start} +
  1754. @var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}.  The
  1756. @code{put-string} procedure writes the @var{count} characters of
  1757. @var{string} starting at index @var{start} to the port. The
  1758. @code{put-string} procedure returns an unspecified value.
  1759. @end deffn
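
The optional @var{start} and @var{count} arguments can be sketched with
a string port, created here with Guile's core
@code{call-with-output-string}:

@lisp
(use-modules (rnrs io ports))

(call-with-output-string
  (lambda (port)
    (put-char port #\>)
    (put-string port "hello world" 6)   ; from index 6 to the end
    (put-string port "abcdef" 1 3)))    ; three characters from index 1
@result{} ">worldbcd"
@end lisp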
  1760. @deffn {Scheme Procedure} put-datum textual-output-port datum
  1761. @var{datum} should be a datum value. The @code{put-datum} procedure
  1762. writes an external representation of @var{datum} to
  1763. @var{textual-output-port}. The specific external representation is
  1764. implementation-dependent. However, whenever possible, an implementation
  1765. should produce a representation for which @code{get-datum}, when reading
  1766. the representation, will return an object equal (in the sense of
  1767. @code{equal?}) to @var{datum}.
  1768. @quotation Note
Not every datum has an external representation from which
@code{get-datum} can reconstruct an object that is equal to the
original. Specifically, NaNs contained in @var{datum} may make
this impossible.
  1773. @end quotation
  1774. @quotation Note
  1775. The @code{put-datum} procedure merely writes the external
  1776. representation, but no trailing delimiter. If @code{put-datum} is
used to write several consecutive external representations to an
  1778. output port, care should be taken to delimit them properly so they can
  1779. be read back in by subsequent calls to @code{get-datum}.
  1780. @end quotation
  1781. @end deffn
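The need for a delimiter can be seen in the following sketch, which
writes several datums separated by a space and reads them back with
@code{get-datum}. The choice of a space as delimiter, the use of string
ports, and the sample data are assumptions made for the example.

@example
(use-modules (rnrs io ports))

;; Without a delimiter, the two numbers run together.
(define undelimited
  (call-with-output-string
    (lambda (port)
      (put-datum port 1)
      (put-datum port 2))))
;; undelimited is "12", which reads back as the single number 12

;; Write each datum followed by an explicit delimiter.
(define written
  (call-with-output-string
    (lambda (port)
      (for-each (lambda (datum)
                  (put-datum port datum)
                  (put-char port #\space))
                '(42 foo "bar" (1 2 3))))))

;; Read datums until the end of the port is reached.
(define read-back
  (call-with-input-string written
    (lambda (port)
      (let loop ((datum (get-datum port)) (acc '()))
        (if (eof-object? datum)
            (reverse acc)
            (loop (get-datum port) (cons datum acc)))))))

;; read-back is (42 foo "bar" (1 2 3))
@end example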
  1782. @node I/O Extensions
  1783. @subsection Using and Extending Ports in C
  1784. @menu
  1785. * C Port Interface:: Using ports from C.
  1786. * Port Implementation:: How to implement a new port type in C.
  1787. @end menu
  1788. @node C Port Interface
  1789. @subsubsection C Port Interface
  1790. @cindex C port interface
  1791. @cindex Port, C interface
  1792. This section describes how to use Scheme ports from C.
  1793. @subsubheading Port basics
  1794. @cindex ptob
  1795. @tindex scm_ptob_descriptor
  1796. @tindex scm_port
  1797. @findex SCM_PTAB_ENTRY
  1798. @findex SCM_PTOBNUM
  1799. @vindex scm_ptobs
  1800. There are two main data structures. A port type object (ptob) is of
  1801. type @code{scm_ptob_descriptor}. A port instance is of type
  1802. @code{scm_port}. Given an @code{SCM} variable which points to a port,
  1803. the corresponding C port object can be obtained using the
  1804. @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
  1805. @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
  1806. global array.
  1807. @subsubheading Port buffers
  1808. An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with
@code{malloc} when the port is created, but in the case of an strport the
underlying string is used as the buffer.
  1816. @subsubheading The @code{rw_random} flag
Special treatment is required for ports that support random access,
that is, ports on which seeking is possible.
  1818. Before various operations, such as seeking the port or changing from
  1819. input to output on a bidirectional port or vice versa, the port
  1820. implementation must be given a chance to update its state. The write
  1821. buffer is updated by calling the @code{flush} ptob procedure and the
  1822. input buffer is updated by calling the @code{end_input} ptob procedure.
  1823. In the case of an fport, @code{flush} causes buffered output to be
  1824. written to the file descriptor, while @code{end_input} causes the
  1825. descriptor position to be adjusted to account for buffered input which
  1826. was never read.
  1827. The special treatment must be performed if the @code{rw_random} flag in
  1828. the port is non-zero.
  1829. @subsubheading The @code{rw_active} variable
  1830. The @code{rw_active} variable in the port is only used if
  1831. @code{rw_random} is set. It's defined as an enum with the following
  1832. values:
  1833. @table @code
  1834. @item SCM_PORT_READ
  1835. the read buffer may have unread data.
  1836. @item SCM_PORT_WRITE
  1837. the write buffer may have unwritten data.
  1838. @item SCM_PORT_NEITHER
  1839. neither the write nor the read buffer has data.
  1840. @end table
@subsubheading Reading from a port
  1842. To read from a port, it's possible to either call existing libguile
  1843. procedures such as @code{scm_getc} and @code{scm_read_line} or to read
  1844. data from the read buffer directly. Reading from the buffer involves
  1845. the following steps:
  1846. @enumerate
  1847. @item
  1848. Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
  1849. @item
  1850. Fill the read buffer, if it's empty, using @code{scm_fill_input}.
@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.
@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.
@item
Update the port's line and column counts.
  1855. @end enumerate
@subsubheading Writing to a port
  1857. To write data to a port, calling @code{scm_lfwrite} should be sufficient for
  1858. most purposes. This takes care of the following steps:
  1859. @enumerate
  1860. @item
  1861. End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
  1862. @item
  1863. Pass the data to the ptob implementation using the @code{write} ptob
  1864. procedure. The advantage of using the ptob @code{write} instead of
  1865. manipulating the write buffer directly is that it allows the data to be
  1866. written in one operation even if the port is using the single-byte
  1867. @code{shortbuf}.
  1868. @item
  1869. Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
  1870. is set.
  1871. @end enumerate
  1872. @node Port Implementation
  1873. @subsubsection Port Implementation
  1874. @cindex Port implementation
  1875. This section describes how to implement a new port type in C.
  1876. As described in the previous section, a port type object (ptob) is
  1877. a structure of type @code{scm_ptob_descriptor}. A ptob is created by
  1878. calling @code{scm_make_port_type}.
  1879. @deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
  1880. Return a new port type object. The @var{name}, @var{fill_input} and
  1881. @var{write} parameters are initial values for those port type fields,
  1882. as described below. The other fields are initialized with default
  1883. values and can be changed later.
  1884. @end deftypefun
  1885. All of the elements of the ptob, apart from @code{name}, are procedures
  1886. which collectively implement the port behaviour. Creating a new port
  1887. type mostly involves writing these procedures.
  1888. @table @code
  1889. @item name
  1890. A pointer to a NUL terminated string: the name of the port type. This
  1891. is the only element of @code{scm_ptob_descriptor} which is not
  1892. a procedure. Set via the first argument to @code{scm_make_port_type}.
  1893. @item mark
  1894. Called during garbage collection to mark any SCM objects that a port
  1895. object may contain. It doesn't need to be set unless the port has
  1896. @code{SCM} components. Set using
  1897. @deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
  1898. @end deftypefun
  1899. @item free
  1900. Called when the port is collected during gc. It
  1901. should free any resources used by the port.
  1902. Set using
  1903. @deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
  1904. @end deftypefun
  1905. @item print
  1906. Called when @code{write} is called on the port object, to print a
  1907. port description. E.g., for an fport it may produce something like:
  1908. @code{#<input: /etc/passwd 3>}. Set using
  1909. @deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
  1910. The first argument @var{port} is the object being printed, the second
  1911. argument @var{dest_port} is where its description should go.
  1912. @end deftypefun
  1913. @item equalp
  1914. Not used at present. Set using
  1915. @deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
  1916. @end deftypefun
  1917. @item close
  1918. Called when the port is closed, unless it was collected during gc. It
  1919. should free any resources used by the port.
  1920. Set using
  1921. @deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
  1922. @end deftypefun
  1923. @item write
  1924. Accept data which is to be written using the port. The port implementation
  1925. may choose to buffer the data instead of processing it directly.
  1926. Set via the third argument to @code{scm_make_port_type}.
  1927. @item flush
  1928. Complete the processing of buffered output data. Reset the value of
  1929. @code{rw_active} to @code{SCM_PORT_NEITHER}.
  1930. Set using
  1931. @deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
  1932. @end deftypefun
  1933. @item end_input
  1934. Perform any synchronization required when switching from input to output
  1935. on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
  1936. Set using
  1937. @deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
  1938. @end deftypefun
  1939. @item fill_input
  1940. Read new data into the read buffer and return the first character. It
  1941. can be assumed that the read buffer is empty when this procedure is called.
  1942. Set via the second argument to @code{scm_make_port_type}.
  1943. @item input_waiting
  1944. Return a lower bound on the number of bytes that could be read from the
  1945. port without blocking. It can be assumed that the current state of
  1946. @code{rw_active} is @code{SCM_PORT_NEITHER}.
  1947. Set using
  1948. @deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
  1949. @end deftypefun
  1950. @item seek
Set the current position of the port. The procedure cannot make
  1952. any assumptions about the value of @code{rw_active} when it's
  1953. called. It can reset the buffers first if desired by using something
  1954. like:
  1955. @example
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (port);
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (port);
  1960. @end example
Note, however, that this will have the side effect of discarding any data
  1962. in the unread-char buffer, in addition to any side effects from the
  1963. @code{end_input} and @code{flush} ptob procedures. This is undesirable
  1964. when seek is called to measure the current position of the port, i.e.,
  1965. @code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
  1966. implementations take care to avoid this problem.
  1967. The procedure is set using
  1968. @deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
  1969. @end deftypefun
  1970. @item truncate
Truncate the port data to the specified length. It can be assumed that the
  1972. current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
  1973. Set using
  1974. @deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
  1975. @end deftypefun
  1976. @end table
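At the Scheme level, the position query mentioned in the @code{seek}
entry above looks as follows. This is only a sketch: it assumes a
string input port, created here with @code{call-with-input-string}, as
a convenient seekable port, and the input text is arbitrary.

@example
(define position
  (call-with-input-string "abcdef"
    (lambda (port)
      (read-char port)
      (read-char port)
      ;; An offset of 0 relative to SEEK_CUR moves nowhere and
      ;; simply reports the current position.
      (seek port 0 SEEK_CUR))))

;; Two single-byte characters have been read, so position is 2.
@end example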
  1977. @node BOM Handling
@subsection Handling of Unicode Byte Order Marks
  1979. @cindex BOM
  1980. @cindex byte order mark
  1981. This section documents the finer points of Guile's handling of Unicode
  1982. byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
  1983. at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
  1984. determine the byte order. Occasionally, a BOM is found at the start of
  1985. a UTF-8 stream, but this is much less common and not generally
  1986. recommended.
  1987. Guile attempts to handle BOMs automatically, and in accordance with the
  1988. recommendations of the Unicode Standard, when the port encoding is set
  1989. to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
  1990. automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
  1991. and automatically consumes one from the start of a UTF-8, UTF-16, or
  1992. UTF-32 stream.
  1993. As specified in the Unicode Standard, a BOM is only handled specially at
  1994. the start of a stream, and only if the port encoding is set to
  1995. @code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
  1996. set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
  1997. @code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
  1998. the special handling described in this section applies.
  1999. @itemize @bullet
  2000. @item
  2001. To ensure that Guile will properly detect the byte order of a UTF-16 or
  2002. UTF-32 stream, you must perform a textual read before any writes, seeks,
  2003. or binary I/O. Guile will not attempt to read a BOM unless a read is
  2004. explicitly requested at the start of the stream.
  2005. @item
  2006. If a textual write is performed before the first read, then an arbitrary
  2007. byte order will be chosen. Currently, big endian is the default on all
  2008. platforms, but that may change in the future. If you wish to explicitly
  2009. control the byte order of an output stream, set the port encoding to
  2010. @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
  2011. and explicitly write a BOM (@code{#\xFEFF}) if desired.
  2012. @item
  2013. If @code{set-port-encoding!} is called in the middle of a stream, Guile
  2014. treats this as a new logical ``start of stream'' for purposes of BOM
  2015. handling, and will forget about any BOMs that had previously been seen.
  2016. Therefore, it may choose a different byte order than had been used
  2017. previously. This is intended to support multiple logical text streams
  2018. embedded within a larger binary stream.
  2019. @item
  2020. Binary I/O operations are not guaranteed to update Guile's notion of
  2021. whether the port is at the ``start of the stream'', nor are they
  2022. guaranteed to produce or consume BOMs.
  2023. @item
For ports that support seeking (e.g., normal files), the input and output
  2025. streams are considered linked: if the user reads first, then a BOM will
  2026. be consumed (if appropriate), but later writes will @emph{not} produce a
  2027. BOM. Similarly, if the user writes first, then later reads will
  2028. @emph{not} consume a BOM.
  2029. @item
For ports that do not support seeking (e.g., pipes, sockets, and
  2031. terminals), the input and output streams are considered
  2032. @emph{independent} for purposes of BOM handling: the first read will
  2033. consume a BOM (if appropriate), and the first write will @emph{also}
  2034. produce a BOM (if appropriate). However, the input and output streams
  2035. will always use the same byte order.
  2036. @item
  2037. Seeks to the beginning of a file will set the ``start of stream'' flags.
  2038. Therefore, a subsequent textual read or write will consume or produce a
  2039. BOM. However, unlike @code{set-port-encoding!}, if a byte order had
  2040. already been chosen for the port, it will remain in effect after a seek,
  2041. and cannot be changed by the presence of a BOM. Seeks anywhere other
  2042. than the beginning of a file clear the ``start of stream'' flags.
  2043. @end itemize
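Some of the rules above can be tried out with bytevector ports. The
following sketch assumes the @code{(rnrs io ports)} module;
@code{decode} is a hypothetical helper defined only for this example,
and the byte sequences are arbitrary.

@example
(use-modules (rnrs io ports))

;; Hypothetical helper: decode the bytevector BV as text in ENCODING.
(define (decode encoding bv)
  (let ((port (open-bytevector-input-port bv)))
    (set-port-encoding! port encoding)
    (get-string-all port)))

;; With "UTF-16", a leading BOM is consumed and selects the byte
;; order, so both of these should yield "a".
(decode "UTF-16" #vu8(#xFE #xFF 0 97))   ; big endian BOM
(decode "UTF-16" #vu8(#xFF #xFE 97 0))   ; little endian BOM

;; With an explicit byte order, the BOM is not handled specially
;; and should remain in the result as the character U+FEFF.
(decode "UTF-16BE" #vu8(#xFE #xFF 0 97))

;; Explicit control of the output byte order, with the BOM written
;; by hand as an ordinary character.
(define le-bytes
  (call-with-bytevector-output-port
    (lambda (port)
      (set-port-encoding! port "UTF-16LE")
      (put-char port #\xFEFF)
      (put-string port "a"))))

;; le-bytes is #vu8(#xFF #xFE 97 0)
@end example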
  2044. @c Local Variables:
  2045. @c TeX-master: "guile.texi"
  2046. @c End: