@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
@c 2010, 2011, 2013, 2016, 2019, 2021, 2023 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Input and Output
@section Input and Output

@menu
* Ports::                       What's a port?
* Binary I/O::                  Reading and writing bytes.
* Encoding::                    Characters as bytes.
* Textual I/O::                 Reading and writing characters.
* Simple Output::               Simple syntactic sugar solution.
* Buffering::                   Controlling when data is written to ports.
* Random Access::               Moving around a random access port.
* Line/Delimited::              Read and write lines or delimited text.
* Default Ports::               Defaults for input, output and errors.
* Port Types::                  Types of port and how to make them.
* Venerable Port Interfaces::   Procedures from the last millennium.
* Using Ports from C::          Nice interfaces for C.
* Non-Blocking I/O::            How Guile deals with EWOULDBLOCK.
* BOM Handling::                Handling of Unicode byte order marks.
@end menu

@node Ports
@subsection Ports

@cindex Port
Ports are the way that Guile performs input and output. Guile can read
in characters or bytes from an @dfn{input port}, or write them out to an
@dfn{output port}. Some ports support both interfaces.

There are a number of different port types implemented in Guile. File
ports provide input and output over files, as you might imagine. For
example, we might display a string to a file like this:

@example
(let ((port (open-output-file "foo.txt")))
  (display "Hello, world!\n" port)
  (close-port port))
@end example

There are also string ports, for taking input from a string, or
collecting output to a string; bytevector ports, for doing the same but
using a bytevector as a source or sink of data; and custom ports, for
arranging to call Scheme functions to provide input or handle output.
@xref{Port Types}.

Ports should be @dfn{closed} when they are not needed by calling
@code{close-port} on them, as in the example above. This will make sure
that any pending output is successfully written out to disk, in the case
of a file port, or otherwise to whatever mutable store is backed by the
port. Any error that occurs while writing out that buffered data would
also be raised promptly at the @code{close-port}, and not later when the
port is closed by the garbage collector. @xref{Buffering}, for more on
buffered output.

Closing a port also releases any precious resource the file might have.
Usually in Scheme a programmer doesn't have to clean up after their data
structures (@pxref{Memory Management}), but most systems have strict
limits on how many files can be open, both on a per-process and a
system-wide basis. A program that uses many files should take care not
to hit those limits. The same applies to similar system resources such
as pipes and sockets.

Indeed for these reasons the above example is not the most idiomatic way
to use ports. It is more common to acquire ports via procedures like
@code{call-with-output-file}, which handle the @code{close-port}
automatically:

@example
(call-with-output-file "foo.txt"
  (lambda (port)
    (display "Hello, world!\n" port)))
@end example

Finally, all ports have associated input and output buffers, as
appropriate. Buffering is a common strategy to limit the overhead of
small reads and writes: without buffering, each character fetched from a
file would involve at least one call into the kernel, and maybe more
depending on the character and the encoding. Instead, Guile will batch
reads and writes into internal buffers. However, sometimes you want to
make output on a port show up immediately. @xref{Buffering}, for more
on interfaces to control port buffering.
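
As a minimal sketch of making output show up immediately, the following
flushes a port right after writing to it. It assumes
@code{force-output}, one of the buffering interfaces
(@pxref{Buffering}); the helper name @code{say-now} is hypothetical and
only serves the illustration.

@example
;; Hypothetical helper: write MESSAGE to PORT and flush it right away,
;; rather than waiting for the output buffer to fill or the port to
;; be closed.
(define (say-now message port)
  (display message port)
  (force-output port))

(say-now "Working...\n" (current-output-port))
@end example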

@deffn {Scheme Procedure} port? x
@deffnx {C Function} scm_port_p (x)
Return a boolean indicating whether @var{x} is a port.
Equivalent to @code{(or (input-port? @var{x}) (output-port? @var{x}))}.
@end deffn

@rnindex input-port?
@deffn {Scheme Procedure} input-port? x
@deffnx {C Function} scm_input_port_p (x)
Return @code{#t} if @var{x} is an input port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@rnindex output-port?
@deffn {Scheme Procedure} output-port? x
@deffnx {C Function} scm_output_port_p (x)
Return @code{#t} if @var{x} is an output port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@cindex Closing ports
@cindex Port, close
@deffn {Scheme Procedure} close-port port
@deffnx {C Function} scm_close_port (port)
Close the specified port object. Return @code{#t} if it successfully
closes a port or @code{#f} if it was already closed. An exception may
be raised if an error occurs, for example when flushing buffered output.
@xref{Buffering}, for more on buffered output. @xref{Ports and File
Descriptors, close}, for a procedure which can close file descriptors.
@end deffn

@deffn {Scheme Procedure} port-closed? port
@deffnx {C Function} scm_port_closed_p (port)
Return @code{#t} if @var{port} is closed or @code{#f} if it is
open.
@end deffn

@deffn {Scheme Procedure} call-with-port port proc
Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
of @var{proc}. Return the return values of @var{proc}.
@end deffn
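
For instance, the sketch below uses a string input port
(@pxref{Port Types}) as a stand-in for a port that holds a system
resource, and shows that @code{call-with-port} leaves the port closed
once the procedure has returned. The names @code{in} and
@code{first-char} are illustrative only.

@example
(define in (open-input-string "hello"))

;; Read one character; the port is closed when the lambda returns.
(define first-char
  (call-with-port in
    (lambda (port) (read-char port))))

first-char          ; @result{} #\h
(port-closed? in)   ; @result{} #t
@end example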

@node Binary I/O
@subsection Binary I/O

Guile's ports are fundamentally binary in nature: at the lowest level,
they work on bytes. This section describes Guile's core binary I/O
operations. @xref{Textual I/O}, for input and output of strings and
characters.

To use these routines, first include the binary I/O module:

@example
(use-modules (ice-9 binary-ports))
@end example

Note that although this module's name suggests that binary ports are
some different kind of port, that's not the case: all ports in Guile are
both binary and textual ports.

@cindex binary input
@anchor{x-get-u8}
@deffn {Scheme Procedure} get-u8 port
@deffnx {C Function} scm_get_u8 (port)
Return an octet read from @var{port}, an input port, blocking as
necessary, or the end-of-file object.
@end deffn

@anchor{x-lookahead-u8}
@deffn {Scheme Procedure} lookahead-u8 port
@deffnx {C Function} scm_lookahead_u8 (port)
Like @code{get-u8} but does not update @var{port}'s position to point
past the octet.
@end deffn

The end-of-file object is unlike any other kind of object: it's not a
pair, a symbol, or anything else. To check if a value is the
end-of-file object, use the @code{eof-object?} predicate.

@rnindex eof-object?
@cindex End of file object
@deffn {Scheme Procedure} eof-object? x
@deffnx {C Function} scm_eof_object_p (x)
Return @code{#t} if @var{x} is an end-of-file object, or @code{#f}
otherwise.
@end deffn

Note that unlike other procedures in this module, @code{eof-object?} is
defined in the default environment.

@deffn {Scheme Procedure} get-bytevector-n port count
@deffnx {C Function} scm_get_bytevector_n (port, count)
Read @var{count} octets from @var{port}, blocking as necessary, and
return a bytevector containing the octets read. If fewer bytes are
available, a bytevector smaller than @var{count} is returned.
@end deffn

@deffn {Scheme Procedure} get-bytevector-n! port bv start count
@deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
Read @var{count} bytes from @var{port} and store them in @var{bv}
starting at index @var{start}. Return either the number of bytes
actually read or the end-of-file object.
@end deffn

@deffn {Scheme Procedure} get-bytevector-some port
@deffnx {C Function} scm_get_bytevector_some (port)
Read from @var{port}, blocking as necessary, until bytes are available
or an end-of-file is reached. Return either the end-of-file object or a
new bytevector containing some of the available bytes (at least one),
and update the port position to point just past these bytes.
@end deffn

@deffn {Scheme Procedure} get-bytevector-some! port bv start count
@deffnx {C Function} scm_get_bytevector_some_x (port, bv, start, count)
Read up to @var{count} bytes from @var{port}, blocking as necessary
until at least one byte is available or an end-of-file is reached.
Store them in @var{bv} starting at index @var{start}. Return the number
of bytes actually read, or an end-of-file object.
@end deffn

@deffn {Scheme Procedure} get-bytevector-all port
@deffnx {C Function} scm_get_bytevector_all (port)
Read from @var{port}, blocking as necessary, until the end-of-file is
reached. Return either a new bytevector containing the data read or the
end-of-file object (if no data were available).
@end deffn

@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
Place the contents of @var{bv} in @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets, so that its bytes
will be read from left-to-right as the next bytes from @var{port} during
subsequent read operations. If called multiple times, the unread bytes
will be read again in last-in first-out order.
@end deffn
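
The following sketch exercises several of the input procedures above on
a small in-memory source. It assumes @code{open-bytevector-input-port},
which is also exported by @code{(ice-9 binary-ports)}
(@pxref{Port Types}); the byte values are arbitrary.

@example
(use-modules (ice-9 binary-ports))

(define in (open-bytevector-input-port #vu8(10 20 30 40)))

(lookahead-u8 in)                  ; @result{} 10, position unchanged
(get-u8 in)                        ; @result{} 10
(get-bytevector-n in 2)            ; @result{} #vu8(20 30)
(unget-bytevector in #vu8(20 30))  ; push the two bytes back
(get-bytevector-all in)            ; @result{} #vu8(20 30 40)
(eof-object? (get-u8 in))          ; @result{} #t
@end example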

@cindex binary output
To perform binary output on a port, use @code{put-u8} or
@code{put-bytevector}.

@anchor{x-put-u8}
@deffn {Scheme Procedure} put-u8 port octet
@deffnx {C Function} scm_put_u8 (port, octet)
Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
binary output port.
@end deffn

@deffn {Scheme Procedure} put-bytevector port bv [start [count]]
@deffnx {C Function} scm_put_bytevector (port, bv, start, count)
Write the contents of @var{bv} to @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets.
@end deffn
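
As a sketch of the two output procedures, the example below collects
the written bytes in memory. It assumes
@code{open-bytevector-output-port} from @code{(ice-9 binary-ports)}
(@pxref{Port Types}); the name @code{written} is illustrative only.

@example
(use-modules (ice-9 binary-ports))

;; open-bytevector-output-port returns two values: the port, and a
;; procedure that returns the bytes accumulated so far.
(define written
  (call-with-values open-bytevector-output-port
    (lambda (out get-bytes)
      (put-u8 out 255)
      ;; Write three octets of the bytevector, starting at index 1.
      (put-bytevector out #vu8(1 2 3 4 5) 1 3)
      (get-bytes))))

written   ; @result{} #vu8(255 2 3 4)
@end example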

@subsubheading Binary I/O in R7RS

@ref{R7RS Standard Libraries,R7RS} defines the following binary I/O
procedures. Access them with

@example
(use-modules (scheme base))
@end example

@anchor{x-open-output-bytevector}
@deffn {Scheme Procedure} open-output-bytevector
Returns a binary output port that will accumulate bytes
for retrieval by @ref{x-get-output-bytevector,@code{get-output-bytevector}}.
@end deffn

@deffn {Scheme Procedure} write-u8 byte [out]
Writes @var{byte} to the given binary output port @var{out} and returns
an unspecified value. @var{out} defaults to @code{(current-output-port)}.

See also @ref{x-put-u8,@code{put-u8}}.
@end deffn

@deffn {Scheme Procedure} read-u8 [in]
Returns the next byte available from the binary input port @var{in},
updating the port to point to the following byte. If no more bytes are
available, an end-of-file object is returned. @var{in} defaults to
@code{(current-input-port)}.

See also @ref{x-get-u8,@code{get-u8}}.
@end deffn

@deffn {Scheme Procedure} peek-u8 [in]
Returns the next byte available from the binary input port @var{in},
but without updating the port to point to the following
byte. If no more bytes are available, an end-of-file object
is returned. @var{in} defaults to @code{(current-input-port)}.

See also @ref{x-lookahead-u8,@code{lookahead-u8}}.
@end deffn

@anchor{x-get-output-bytevector}
@deffn {Scheme Procedure} get-output-bytevector port
Returns a bytevector consisting of the bytes that have been output to
@var{port} so far in the order they were output. It is an error if
@var{port} was not created with
@ref{x-open-output-bytevector,@code{open-output-bytevector}}.

@example
(define out (open-output-bytevector))
(write-u8 1 out)
(write-u8 2 out)
(write-u8 3 out)
(get-output-bytevector out) @result{} #vu8(1 2 3)
@end example
@end deffn

@deffn {Scheme Procedure} open-input-bytevector bv
Takes a bytevector @var{bv} and returns a binary input port that
delivers bytes from @var{bv}.

@example
(define in (open-input-bytevector #vu8(1 2 3)))
(read-u8 in) @result{} 1
(peek-u8 in) @result{} 2
(read-u8 in) @result{} 2
(read-u8 in) @result{} 3
(read-u8 in) @result{} #<eof>
@end example
@end deffn

@deffn {Scheme Procedure} read-bytevector! bv [port [start [end]]]
Reads the next @var{end} - @var{start} bytes, or as many as are
available before the end of file, from the binary input port into the
bytevector @var{bv} in left-to-right order beginning at the @var{start}
position. If @var{end} is not supplied, reads until the end of @var{bv}
has been reached. If @var{start} is not supplied, reads beginning at
position 0.

Returns the number of bytes read. If no bytes are available, an
end-of-file object is returned.

@example
(define in (open-input-bytevector #vu8(1 2 3)))
(define bv (make-bytevector 5 0))
(read-bytevector! bv in 1 3) @result{} 2
bv @result{} #vu8(0 1 2 0 0)
@end example
@end deffn

@deffn {Scheme Procedure} read-bytevector k in
Reads the next @var{k} bytes, or as many as are available before the end
of file if that is less than @var{k}, from the binary input port
@var{in} into a newly allocated bytevector in left-to-right order, and
returns the bytevector. If no bytes are available before the end of
file, an end-of-file object is returned.

@example
(define bv #vu8(1 2 3))
(read-bytevector 2 (open-input-bytevector bv)) @result{} #vu8(1 2)
(read-bytevector 10 (open-input-bytevector bv)) @result{} #vu8(1 2 3)
@end example
@end deffn

@deffn {Scheme Procedure} write-bytevector bv [port [start [end]]]
Writes the bytes of bytevector @var{bv} from @var{start} to @var{end} in
left-to-right order to the binary output @var{port}. @var{start}
defaults to 0 and @var{end} defaults to the length of @var{bv}.

@example
(define out (open-output-bytevector))
(write-bytevector #vu8(0 1 2 3 4) out 2 4)
(get-output-bytevector out) @result{} #vu8(2 3)
@end example
@end deffn

@node Encoding
@subsection Encoding

Textual input and output on Guile ports is layered on top of binary
operations. To this end, each port has an associated character encoding
that controls how bytes read from the port are converted to characters,
and how characters written to the port are converted to bytes.

@deffn {Scheme Procedure} port-encoding port
@deffnx {C Function} scm_port_encoding (port)
Returns, as a string, the character encoding that @var{port} uses to
interpret its input and output.
@end deffn

@deffn {Scheme Procedure} set-port-encoding! port enc
@deffnx {C Function} scm_set_port_encoding_x (port, enc)
Sets the character encoding that will be used to interpret I/O to
@var{port}. @var{enc} is a string containing the name of an encoding.
Valid encoding names are those
@url{http://www.iana.org/assignments/character-sets, defined by IANA},
for example @code{"UTF-8"} or @code{"ISO-8859-1"}.
@end deffn
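
To make the effect of a port's encoding concrete, the sketch below
writes the same one-character string to two in-memory ports that differ
only in their encoding. It assumes @code{open-bytevector-output-port}
from @code{(ice-9 binary-ports)} (@pxref{Port Types}); the helper
@code{encode-with} and the variable @code{e-acute} are hypothetical
names introduced for the illustration.

@example
(use-modules (ice-9 binary-ports))

;; Hypothetical helper: return the bytes produced by writing STR to
;; a port whose encoding is ENC.
(define (encode-with str enc)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port enc)
      (display str port)
      (get-bytes))))

(define e-acute (string (integer->char #xE9)))  ; U+00E9

(encode-with e-acute "UTF-8")       ; @result{} #vu8(195 169)
(encode-with e-acute "ISO-8859-1")  ; @result{} #vu8(233)
@end example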

When ports are created, they are assigned an encoding. The usual
process to determine the initial encoding for a port is to take the
value of the @code{%default-port-encoding} fluid.

@defvr {Scheme Variable} %default-port-encoding
A fluid containing the name of the encoding to be used by default for
newly created ports (@pxref{Fluids and Dynamic States}). As a special
case, the value @code{#f} is equivalent to @code{"ISO-8859-1"}.
@end defvr

The @code{%default-port-encoding} itself defaults to the encoding
appropriate for the current locale, if @code{setlocale} has been called.
@xref{Locales}, for more on locales and when you might need to call
@code{setlocale} explicitly.

Some port types have other ways of determining their initial encodings.
String ports, for example, default to the UTF-8 encoding, in order to be
able to represent all characters regardless of the current locale. File
ports can optionally sniff their file for a @code{coding:} declaration
(@pxref{File Ports}). Binary ports might be initialized to the
ISO-8859-1 encoding in which each codepoint between 0 and 255
corresponds to a byte with that value.

Currently, the ports only work with @emph{non-modal} encodings. Most
encodings are non-modal, meaning that the conversion of bytes to a
string doesn't depend on its context: the same byte sequence will always
return the same string. A couple of modal encodings are in common use,
like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.

@cindex port conversion strategy
@cindex conversion strategy, port
@cindex decoding error
@cindex encoding error
Each port also has an associated conversion strategy, which determines
what to do when a Guile character can't be converted to the port's
encoded character representation for output. There are three possible
strategies: to raise an error, to replace the character with a hex
escape, or to replace the character with a substitute character. Port
conversion strategies are also used when decoding characters from an
input port.

@deffn {Scheme Procedure} port-conversion-strategy port
@deffnx {C Function} scm_port_conversion_strategy (port)
Returns the behavior of the port when outputting a character that is not
representable in the port's current encoding.

If @var{port} is @code{#f}, then the current default behavior will be
returned. New ports will have this default behavior when they are
created.
@end deffn

@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
Sets the behavior of Guile when outputting a character that is not
representable in the port's current encoding, or when Guile encounters a
decoding error when trying to read a character. @var{sym} can be either
@code{error}, @code{substitute}, or @code{escape}.

If @var{port} is an open port, the conversion error behavior is set for
that port. If it is @code{#f}, it is set as the default behavior for
any future ports that get created in this thread.
@end deffn

As with port encodings, there is a fluid which determines the initial
conversion strategy for a port.

@deffn {Scheme Variable} %default-port-conversion-strategy
The fluid that defines the conversion strategy for newly created ports,
and also for other conversion routines such as @code{scm_to_stringn},
@code{scm_from_stringn}, @code{string->pointer}, and
@code{pointer->string}.

Its value must be one of the symbols described above, with the same
semantics: @code{error}, @code{substitute}, or @code{escape}.

When Guile starts, its value is @code{substitute}.

Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
equivalent to @code{(fluid-set! %default-port-conversion-strategy
@var{sym})}.
@end deffn

As mentioned above, for an output port there are three possible port
conversion strategies. The @code{error} strategy will throw an error
when a nonconvertible character is encountered. The @code{substitute}
strategy will replace nonconvertible characters with a question mark
(@samp{?}). Finally the @code{escape} strategy will print
nonconvertible characters as a hex escape, using the escaping that is
recognized by Guile's string syntax. Note that if the port's encoding
is a Unicode encoding, like @code{UTF-8}, then encoding errors are
impossible.

For an input port, the @code{error} strategy will cause Guile to throw
an error if it encounters an invalid encoding, such as might happen if
you tried to read @code{ISO-8859-1} as @code{UTF-8}. The error is
thrown before advancing the read position. The @code{substitute}
strategy will replace the bad bytes with a U+FFFD replacement character,
in accordance with Unicode recommendations. When reading from an input
port, the @code{escape} strategy is treated as if it were @code{error}.
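
The sketch below illustrates the @code{substitute} strategy in both
directions, under the assumption that in-memory bytevector ports from
@code{(ice-9 binary-ports)} (@pxref{Port Types}) are an acceptable
stand-in for real devices. The helper @code{latin1-bytes} and the
variables @code{lam} and @code{bad-in} are hypothetical names.

@example
(use-modules (ice-9 binary-ports))

;; Hypothetical helper: encode STR as ISO-8859-1 using STRATEGY.
(define (latin1-bytes str strategy)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port "ISO-8859-1")
      (set-port-conversion-strategy! port strategy)
      (display str port)
      (get-bytes))))

(define lam (string (integer->char #x3BB)))  ; U+03BB, not in Latin-1

;; Output: the nonconvertible character becomes a question mark.
(latin1-bytes lam 'substitute)       ; @result{} #vu8(63)

;; Input: the byte 255 is never valid in UTF-8.
(define bad-in (open-bytevector-input-port #vu8(255)))
(set-port-encoding! bad-in "UTF-8")
(set-port-conversion-strategy! bad-in 'substitute)
(char->integer (read-char bad-in))   ; @result{} #xFFFD
@end example

Passing @code{'error} to the helper instead would make the same
@code{display} call raise an error rather than return bytes.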

@node Textual I/O
@subsection Textual I/O

@cindex textual input
@cindex textual output
This section describes Guile's core textual I/O operations on characters
and strings. @xref{Binary I/O}, for input and output of bytes and
bytevectors. @xref{Encoding}, for more on how characters relate to
bytes. To read general S-expressions from ports, see @ref{Scheme Read}.
@xref{Scheme Write}, for interfaces that write generic Scheme datums.
  410. To use these routines, first include the textual I/O module:
  411. @example
  412. (use-modules (ice-9 textual-ports))
  413. @end example
  414. Note that although this module's name suggests that textual ports are
  415. some different kind of port, that's not the case: all ports in Guile are
  416. both binary and textual ports.
  417. @deffn {Scheme Procedure} get-char input-port
  418. Reads from @var{input-port}, blocking as necessary, until a
  419. complete character is available from @var{input-port},
  420. or until an end of file is reached.
  421. If a complete character is available before the next end of file,
  422. @code{get-char} returns that character and updates the input port to
  423. point past the character. If an end of file is reached before any
  424. character is read, @code{get-char} returns the end-of-file object.
  425. @end deffn
  426. @deffn {Scheme Procedure} lookahead-char input-port
  427. The @code{lookahead-char} procedure is like @code{get-char}, but it does
  428. not update @var{input-port} to point past the character.
  429. @end deffn
  430. In the same way that it's possible to "unget" a byte or bytes, it's
  431. possible to "unget" the bytes corresponding to an encoded character.
  432. @deffn {Scheme Procedure} unget-char port char
  433. Place character @var{char} in @var{port} so that it will be read by the
  434. next read operation. If called multiple times, the unread characters
  435. will be read again in last-in first-out order.
  436. @end deffn
  437. @deffn {Scheme Procedure} unget-string port str
  438. Place the string @var{str} in @var{port} so that its characters will
  439. be read from left-to-right as the next characters from @var{port}
  440. during subsequent read operations. If called multiple times, the
  441. unread characters will be read again in last-in first-out order.
  442. @end deffn
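As a rough illustration of how these procedures interact, the sketch
below reads from a string port, which stands in for any textual input
port. The port contents and variable names are arbitrary choices for
the example.

@lisp
(use-modules (ice-9 textual-ports))

(define p (open-input-string "abc"))

(define peeked (lookahead-char p))     ; #\a, position unchanged
(define taken (get-char p))            ; #\a again, now consumed
(unget-char p taken)                   ; push #\a back onto the port
(unget-string p "xy")                  ; push "xy" in front of it
(define remaining (get-string-all p))  ; "xyabc"
@end lisp

The most recently ungotten text is read first, which is why @code{"xy"}
precedes the original contents.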
  443. Reading in a character at a time can be inefficient. If it's possible
  444. to perform I/O over multiple characters at a time, via strings, that
  445. might be faster.
  446. @deffn {Scheme Procedure} get-string-n input-port count
  447. The @code{get-string-n} procedure reads from @var{input-port}, blocking
  448. as necessary, until @var{count} characters are available, or until an
  449. end of file is reached. @var{count} must be an exact, non-negative
  450. integer, representing the number of characters to be read.
  451. If @var{count} characters are available before end of file,
  452. @code{get-string-n} returns a string consisting of those @var{count}
  453. characters. If fewer characters are available before an end of file, but
  454. one or more characters can be read, @code{get-string-n} returns a string
  455. containing those characters. In either case, the input port is updated
  456. to point just past the characters read. If no characters can be read
  457. before an end of file, the end-of-file object is returned.
  458. @end deffn
  459. @deffn {Scheme Procedure} get-string-n! input-port string start count
  460. The @code{get-string-n!} procedure reads from @var{input-port} in the
  461. same manner as @code{get-string-n}. @var{start} and @var{count} must be
  462. exact, non-negative integer objects, with @var{count} representing the
  463. number of characters to be read. @var{string} must be a string with at
least @math{@var{start} + @var{count}} characters.
  465. If @var{count} characters are available before an end of file, they are
  466. written into @var{string} starting at index @var{start}, and @var{count}
  467. is returned. If fewer characters are available before an end of file,
  468. but one or more can be read, those characters are written into
  469. @var{string} starting at index @var{start} and the number of characters
  470. actually read is returned as an exact integer object. If no characters
  471. can be read before an end of file, the end-of-file object is returned.
  472. @end deffn
  473. @deffn {Scheme Procedure} get-string-all input-port
  474. Reads from @var{input-port} until an end of file, decoding characters in
  475. the same manner as @code{get-string-n} and @code{get-string-n!}.
If characters are available before the end of file, a string containing
all the characters decoded from that data is returned. If no character
  478. precedes the end of file, the end-of-file object is returned.
  479. @end deffn
  480. @deffn {Scheme Procedure} get-line input-port
  481. Reads from @var{input-port} up to and including the linefeed
  482. character or end of file, decoding characters in the same manner as
  483. @code{get-string-n} and @code{get-string-n!}.
  484. If a linefeed character is read, a string containing all of the text up
  485. to (but not including) the linefeed character is returned, and the port
  486. is updated to point just past the linefeed character. If an end of file
  487. is encountered before any linefeed character is read, but some
  488. characters have been read and decoded as characters, a string containing
  489. those characters is returned. If an end of file is encountered before
  490. any characters are read, the end-of-file object is returned.
  491. @end deffn
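The string-oriented readers can be mixed freely on one port. The
following minimal sketch assumes a string port with made-up contents,
and shows what each call consumes.

@lisp
(use-modules (ice-9 textual-ports))

(define p (open-input-string "first line\nsecond line\nrest"))

(define line1 (get-line p))        ; "first line", linefeed consumed
(define chunk (get-string-n p 6))  ; "second"
(define tail (get-string-all p))   ; " line\nrest"
(define at-eof (get-line p))       ; the end-of-file object
@end lisp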
  492. Finally, there are just two core procedures to write characters to a
  493. port.
  494. @deffn {Scheme Procedure} put-char port char
  495. Writes @var{char} to the port. The @code{put-char} procedure returns
  496. an unspecified value.
  497. @end deffn
  498. @deffn {Scheme Procedure} put-string port string
  499. @deffnx {Scheme Procedure} put-string port string start
  500. @deffnx {Scheme Procedure} put-string port string start count
  501. Write the @var{count} characters of @var{string} starting at index
  502. @var{start} to the port.
  503. @var{start} and @var{count} must be non-negative exact integer objects.
  504. @var{string} must have a length of at least @math{@var{start} +
  505. @var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}.
  507. Calling @code{put-string} is equivalent in all respects to calling
  508. @code{put-char} on the relevant sequence of characters, except that it
  509. will attempt to write multiple characters to the port at a time, even if
  510. the port is unbuffered.
  511. The @code{put-string} procedure returns an unspecified value.
  512. @end deffn
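For instance, the optional @var{start} and @var{count} arguments select
a substring without allocating one. This sketch collects the output in
a string port. The text being written is arbitrary.

@lisp
(use-modules (ice-9 textual-ports))

(define out
  (call-with-output-string
    (lambda (port)
      (put-string port "hello, world" 0 5)   ; writes "hello"
      (put-char port #\space)
      (put-string port "hello, world" 7))))  ; writes "world"
;; out is "hello world"
@end lisp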
  513. Textual ports have a textual position associated with them: a line and a
  514. column. Reading in characters or writing them out advances the line and
  515. the column appropriately.
  516. @deffn {Scheme Procedure} port-column port
  517. @deffnx {Scheme Procedure} port-line port
  518. @deffnx {C Function} scm_port_column (port)
  519. @deffnx {C Function} scm_port_line (port)
  520. Return the current column number or line number of @var{port}.
  521. @end deffn
Port lines and columns are represented as 0-origin integers, which is
to say that the first character of the first line is line 0, column
0. However, when you display a line number, for example in an error
message, we recommend you add 1 to get 1-origin integers. This is
because line numbers traditionally start with 1, and that is what
non-programmers will find most natural.
  528. @deffn {Scheme Procedure} set-port-column! port column
  529. @deffnx {Scheme Procedure} set-port-line! port line
  530. @deffnx {C Function} scm_set_port_column_x (port, column)
  531. @deffnx {C Function} scm_set_port_line_x (port, line)
  532. Set the current column or line number of @var{port}.
  533. @end deffn
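A short sketch of position tracking, assuming a string port with
arbitrary contents. It reads past one newline, queries the position,
and builds a message that uses a 1-origin line number as recommended
above.

@lisp
(use-modules (ice-9 textual-ports))

(define p (open-input-string "ab\ncd"))
(get-string-n p 4)               ; consume "ab", the newline, and "c"

(define line (port-line p))      ; 1: the second line, counting from 0
(define column (port-column p))  ; 1: one character into that line
(define label
  (simple-format #f "line ~A, column ~A" (+ line 1) column))
;; label is "line 2, column 1"

(set-port-line! p 41)            ; the position can also be overridden
(define overridden (port-line p))  ; 41
@end lisp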
  534. @node Simple Output
  535. @subsection Simple Textual Output
  536. Guile exports a simple formatted output function, @code{simple-format}.
For a more capable formatted output facility, see @ref{Formatted Output}.
  538. @deffn {Scheme Procedure} simple-format destination message . args
  539. @deffnx {C Function} scm_simple_format (destination, message, args)
  540. Write @var{message} to @var{destination}, defaulting to the current
  541. output port. @var{message} can contain @code{~A} and @code{~S} escapes.
  542. When printed, the escapes are replaced with corresponding members of
  543. @var{args}: @code{~A} formats using @code{display} and @code{~S} formats
using @code{write}. If @var{destination} is @code{#t}, then use the
current output port; if @var{destination} is @code{#f}, then return a
string containing the formatted text. Does not add a trailing newline.
  547. @end deffn
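The difference between the two escapes is easiest to see with a string
argument. In this small sketch the format strings and arguments are
arbitrary.

@lisp
(define shown (simple-format #f "~A vs. ~S" "text" "text"))
;; shown is the string: text vs. "text"

(simple-format #t "count: ~A\n" 3)   ; prints to the current output port
@end lisp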
  548. Somewhat confusingly, Guile binds the @code{format} identifier to
  549. @code{simple-format} at startup. Once @code{(ice-9 format)} loads, it
  550. actually replaces the core @code{format} binding, so depending on
  551. whether you or a module you use has loaded @code{(ice-9 format)}, you
  552. may be using the simple or the more capable version.
  553. @node Buffering
  554. @subsection Buffering
  555. @cindex Port, buffering
  556. Every port has associated input and output buffers. You can think of
  557. ports as being backed by some mutable store, and that store might be far
  558. away. For example, ports backed by file descriptors have to go all the
  559. way to the kernel to read and write their data. To avoid this
  560. round-trip cost, Guile usually reads in data from the mutable store in
  561. chunks, and then services small requests like @code{get-char} out of
  562. that intermediate buffer. Similarly, small writes like
  563. @code{write-char} first go to a buffer, and are sent to the store when
the buffer is full (or when the port is flushed). Buffered ports speed up
  565. your program by reducing the number of round-trips to the mutable store,
  566. and they do so in a way that is mostly transparent to the user.
  567. There are two major ways, however, in which buffering affects program
  568. semantics. Building correct, performant programs requires understanding
  569. these situations.
  570. The first case is in random-access read/write ports (@pxref{Random
  571. Access}). These ports, usually backed by a file, logically operate over
  572. the same mutable store when both reading and writing. So, if you read a
  573. character, causing the buffer to fill, then write a character, the bytes
  574. you filled in your read buffer are now invalid. Every time you switch
  575. between reading and writing, Guile has to flush any pending buffer. If
  576. this happens frequently, the cost can be high. In that case you should
  577. reduce the amount that you buffer, in both directions. Similarly, Guile
  578. has to flush buffers before seeking. None of these considerations apply
  579. to sockets, which don't logically read from and write to the same
  580. mutable store, and are not seekable. Note also that sockets are
  581. unbuffered by default. @xref{Network Sockets and Communication}.
  582. The second case is the more pernicious one. If you write data to a
  583. buffered port, it probably doesn't go out to the mutable store directly.
(This ``probably'' introduces some nondeterminism in your program: what
  585. goes to the store, and when, depends on how full the buffer is. It is
  586. something that the user needs to explicitly be aware of.) The data is
  587. written to the store later -- when the buffer fills up due to another
  588. write, or when @code{force-output} is called, or when @code{close-port}
  589. is called, or when the program exits, or even when the garbage collector
  590. runs. The salient point is, @emph{the errors are signaled then too}.
  591. Buffered writes defer error detection (and defer the side effects to the
  592. mutable store), perhaps indefinitely if the port type does not need to
  593. be closed at GC.
  594. One common heuristic that works well for textual ports is to flush
  595. output when a newline (@code{\n}) is written. This @dfn{line buffering}
  596. mode is on by default for TTY ports. Most other ports are @dfn{block
  597. buffered}, meaning that once the output buffer reaches the block size,
  598. which depends on the port and its configuration, the output is flushed
  599. as a block, without regard to what is in the block. Likewise reads are
  600. read in at the block size, though if there are fewer bytes available to
  601. read, the buffer may not be entirely filled.
  602. Note that binary reads or writes that are larger than the buffer size go
  603. directly to the mutable store without passing through the buffers. If
  604. your access pattern involves many big reads or writes, buffering might
  605. not matter so much to you.
  606. To control the buffering behavior of a port, use @code{setvbuf}.
  607. @deffn {Scheme Procedure} setvbuf port mode [size]
  608. @deffnx {C Function} scm_setvbuf (port, mode, size)
  609. @cindex port buffering
  610. Set the buffering mode for @var{port}. @var{mode} can be one of the
  611. following symbols:
  612. @table @code
  613. @item none
  614. non-buffered
  615. @item line
  616. line buffered
  617. @item block
  618. block buffered, using a newly allocated buffer of @var{size} bytes.
  619. If @var{size} is omitted, a default size will be used.
  620. @end table
  621. @end deffn
  622. Another way to set the buffering, for file ports, is to open the file
  623. with @code{0} or @code{l} as part of the mode string, for unbuffered or
  624. line-buffered ports, respectively. @xref{File Ports}, for more.
  625. Any buffered output data will be written out when the port is closed.
  626. To make sure to flush it at specific points in your program, use
  627. @code{force-output}.
  628. @findex fflush
  629. @deffn {Scheme Procedure} force-output [port]
  630. @deffnx {C Function} scm_force_output (port)
  631. Flush the specified output port, or the current output port if
  632. @var{port} is omitted. The current output buffer contents, if any, are
  633. passed to the underlying port implementation.
  634. The return value is unspecified.
  635. @end deffn
  636. @deffn {Scheme Procedure} flush-all-ports
  637. @deffnx {C Function} scm_flush_all_ports ()
  638. Equivalent to calling @code{force-output} on all open output ports. The
  639. return value is unspecified.
  640. @end deffn
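The deferred nature of buffered writes can be observed by checking the
size of the underlying file before and after a flush. The sketch below
is only illustrative. It assumes a writable @file{/tmp} directory, and
the file name template and buffer size are arbitrary choices.

@lisp
;; mkstemp! modifies its argument in place, so pass a fresh copy.
(define port (mkstemp! (string-copy "/tmp/guile-buffering-XXXXXX")))
(define path (port-filename port))

(setvbuf port 'block 1024)    ; block buffering with a 1024-byte buffer
(display "pending" port)      ; 7 bytes, well under the buffer size
(define size-before (stat:size (stat path)))  ; expected 0: still buffered
(force-output port)           ; hand the buffer contents to the file
(define size-after (stat:size (stat path)))   ; 7

(close-port port)
(delete-file path)
@end lisp

Any error in writing those bytes to the file would be signaled by
@code{force-output}, not by the earlier call to @code{display}.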
  641. Similarly, sometimes you might want to switch from using Guile's ports
  642. to working directly on file descriptors. In that case, for input ports
  643. use @code{drain-input} to get any buffered input from that port.
  644. @deffn {Scheme Procedure} drain-input port
  645. @deffnx {C Function} scm_drain_input (port)
This procedure clears a port's input buffers, similar
to the way that @code{force-output} clears the output buffer. The
contents of the buffers are returned as a single string, e.g.,
  649. @lisp
(define p (open-input-file ...))
(drain-input p) @result{} empty string, nothing buffered yet.
(unread-char (read-char p) p)
(drain-input p) @result{} initial chars from p, up to the buffer size.
  654. @end lisp
  655. @end deffn
  656. All of these considerations are very similar to those of streams in the
  657. C library, although Guile's ports are not built on top of C streams.
  658. Still, it is useful to read what other systems do.
  659. @xref{Streams,,,libc,The GNU C Library Reference Manual}, for more
  660. discussion on C streams.
  661. @node Random Access
  662. @subsection Random Access
  663. @cindex Random access, ports
  664. @cindex Port, random access
  665. @deffn {Scheme Procedure} seek fd_port offset whence
  666. @deffnx {C Function} scm_seek (fd_port, offset, whence)
  667. Sets the current position of @var{fd_port} to the integer
  668. @var{offset}. For a file port, @var{offset} is expressed
  669. as a number of bytes; for other types of ports, such as string
  670. ports, @var{offset} is an abstract representation of the
  671. position within the port's data, not necessarily expressed
  672. as a number of bytes. @var{offset} is interpreted according to
  673. the value of @var{whence}.
  674. One of the following variables should be supplied for
  675. @var{whence}:
  676. @defvar SEEK_SET
  677. Seek from the beginning of the file.
  678. @end defvar
  679. @defvar SEEK_CUR
  680. Seek from the current position.
  681. @end defvar
  682. @defvar SEEK_END
  683. Seek from the end of the file.
  684. @end defvar
  685. On systems that support it, such as GNU/Linux, the following
  686. constants can be used for @var{whence} to navigate ``holes'' in
  687. sparse files:
  688. @defvar SEEK_DATA
  689. Seek to the next location in the file greater than or equal to
  690. @var{offset} containing data. If @var{offset} points to data,
  691. then the file offset is set to @var{offset}.
  692. @end defvar
  693. @defvar SEEK_HOLE
  694. Seek to the next hole in the file greater than or equal to the
  695. @var{offset}. If @var{offset} points into the middle of a hole,
  696. then the file offset is set to @var{offset}. If there is no hole
  697. past @var{offset}, then the file offset is adjusted to the end of
  698. the file---i.e., there is an implicit hole at the end of any file.
  699. @end defvar
If @var{fd_port} is a file descriptor, the underlying system call
is @code{lseek} (@pxref{File Position Primitive,,, libc, The GNU C
Library Reference Manual}). @var{fd_port} may also be a string port.
  703. The value returned is the new position in @var{fd_port}. This means
  704. that the current position of a port can be obtained using:
  705. @lisp
  706. (seek port 0 SEEK_CUR)
  707. @end lisp
  708. @end deffn
  709. @deffn {Scheme Procedure} ftell fd_port
  710. @deffnx {C Function} scm_ftell (fd_port)
  711. Return an integer representing the current position of
  712. @var{fd_port}, measured from the beginning. Equivalent to:
  713. @lisp
  714. (seek port 0 SEEK_CUR)
  715. @end lisp
  716. @end deffn
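As a sketch of random access on a read/write file port, the following
writes to a temporary file, seeks back, and reads part of what was
written. It assumes a writable @file{/tmp} directory, and the file name
template and contents are arbitrary.

@lisp
(use-modules (ice-9 textual-ports))

;; mkstemp! returns a port open for both reading and writing.
(define port (mkstemp! (string-copy "/tmp/guile-seek-XXXXXX")))
(define path (port-filename port))

(put-string port "hello world")       ; 11 ASCII characters, 11 bytes
(seek port 6 SEEK_SET)                ; move to byte offset 6
(define word (get-string-n port 5))   ; "world"
(define position (ftell port))        ; 11
(define size (seek port 0 SEEK_END))  ; 11, the offset of the end of file

(close-port port)
(delete-file path)
@end lisp

Byte offsets and character counts coincide here only because the text
is ASCII.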
  717. @findex truncate
  718. @findex ftruncate
  719. @deffn {Scheme Procedure} truncate-file file [length]
  720. @deffnx {C Function} scm_truncate_file (file, length)
  721. Truncate @var{file} to @var{length} bytes. @var{file} can be a
  722. filename string, a port object, or an integer file descriptor. The
  723. return value is unspecified.
  724. For a port or file descriptor @var{length} can be omitted, in which
  725. case the file is truncated at the current position (per @code{ftell}
  726. above).
  727. On most systems a file can be extended by giving a length greater than
  728. the current size, but this is not mandatory in the POSIX standard.
  729. @end deffn
  730. @node Line/Delimited
  731. @subsection Line Oriented and Delimited Text
  732. @cindex Line input/output
  733. @cindex Port, line input/output
  734. The delimited-I/O module can be accessed with:
  735. @lisp
  736. (use-modules (ice-9 rdelim))
  737. @end lisp
  738. It can be used to read or write lines of text, or read text delimited by
  739. a specified set of characters.
  740. @deffn {Scheme Procedure} read-line [port] [handle-delim]
  741. Return a line of text from @var{port} if specified, otherwise from the
  742. value returned by @code{(current-input-port)}. Under Unix, a line of text
  743. is terminated by the first end-of-line character or by end-of-file.
  744. If @var{handle-delim} is specified, it should be one of the following
  745. symbols:
  746. @table @code
  747. @item trim
  748. Discard the terminating delimiter. This is the default, but it will
  749. be impossible to tell whether the read terminated with a delimiter or
  750. end-of-file.
  751. @item concat
  752. Append the terminating delimiter (if any) to the returned string.
  753. @item peek
  754. Push the terminating delimiter (if any) back on to the port.
  755. @item split
  756. Return a pair containing the string read from the port and the
  757. terminating delimiter or end-of-file object.
  758. @end table
  759. @end deffn
  760. @deffn {Scheme Procedure} read-line! buf [port]
  761. Read a line of text into the supplied string @var{buf} and return the
  762. number of characters added to @var{buf}. If @var{buf} is filled, then
  763. @code{#f} is returned. Read from @var{port} if specified, otherwise
  764. from the value returned by @code{(current-input-port)}.
  765. @end deffn
  766. @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
  767. Read text until one of the characters in the string @var{delims} is
  768. found or end-of-file is reached. Read from @var{port} if supplied,
  769. otherwise from the value returned by @code{(current-input-port)}.
  770. @var{handle-delim} takes the same values as described for
  771. @code{read-line}.
  772. @end deffn
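The following sketch, using a string port with made-up contents, shows
@code{read-delimited} with the default @code{trim} behavior and
@code{read-line} with @code{split}, which makes it possible to tell a
newline-terminated line from one ended by end-of-file.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "key=value\nlast"))

(define key (read-delimited "=" p))   ; "key", the "=" is discarded
(define value (read-line p 'split))   ; ("value" . #\newline)
(define final (read-line p 'split))   ; ("last" . end-of-file object)
@end lisp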
  773. @c begin (scm-doc-string "rdelim.scm" "read-delimited!")
  774. @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
  775. Read text into the supplied string @var{buf}.
  776. If a delimiter was found, return the number of characters written,
  777. except if @var{handle-delim} is @code{split}, in which case the return
  778. value is a pair, as noted above.
  779. As a special case, if @var{port} was already at end-of-stream, the EOF
  780. object is returned. Also, if no characters were written because the
  781. buffer was full, @code{#f} is returned.
  782. It's something of a wacky interface, to be honest.
  783. @end deffn
  784. @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
  785. @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
  786. Read characters from @var{port} into @var{str} until one of the
  787. characters in the @var{delims} string is encountered. If
  788. @var{gobble} is true, discard the delimiter character;
  789. otherwise, leave it in the input stream for the next read. If
  790. @var{port} is not specified, use the value of
  791. @code{(current-input-port)}. If @var{start} or @var{end} are
  792. specified, store data only into the substring of @var{str}
  793. bounded by @var{start} and @var{end} (which default to the
  794. beginning and end of the string, respectively).
  795. Return a pair consisting of the delimiter that terminated the
  796. string and the number of characters read. If reading stopped
  797. at the end of file, the delimiter returned is the
  798. @var{eof-object}; if the string was filled without encountering
  799. a delimiter, this value is @code{#f}.
  800. @end deffn
  801. @deffn {Scheme Procedure} %read-line [port]
  802. @deffnx {C Function} scm_read_line (port)
  803. Read a newline-terminated line from @var{port}, allocating storage as
  804. necessary. The newline terminator (if any) is removed from the string,
  805. and a pair consisting of the line and its delimiter is returned. The
  806. delimiter may be either a newline or the @var{eof-object}; if
  807. @code{%read-line} is called at the end of file, it returns the pair
  808. @code{(#<eof> . #<eof>)}.
  809. @end deffn
  810. @deffn {Scheme Procedure} write-line obj [port]
  811. @deffnx {C Function} scm_write_line (obj, port)
  812. Display @var{obj} and a newline character to @var{port}. If
  813. @var{port} is not specified, @code{(current-output-port)} is
  814. used. This procedure is equivalent to:
  815. @lisp
  816. (display obj [port])
  817. (newline [port])
  818. @end lisp
  819. @end deffn
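A brief sketch of @code{write-line}, collecting the output in a string
port. Because the object is displayed, not written, the string appears
without quotes.

@lisp
(use-modules (ice-9 rdelim))

(define text
  (call-with-output-string
    (lambda (port)
      (write-line "one" port)
      (write-line 2 port))))
;; text is "one\n2\n"
@end lisp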
  820. @node Default Ports
  821. @subsection Default Ports for Input, Output and Errors
  822. @cindex Default ports
  823. @cindex Port, default
  824. @rnindex current-input-port
  825. @deffn {Scheme Procedure} current-input-port
  826. @deffnx {C Function} scm_current_input_port ()
  827. @cindex standard input
  828. Return the current input port. This is the default port used
  829. by many input procedures.
  830. Initially this is the @dfn{standard input} in Unix and C terminology.
  831. When the standard input is a TTY the port is unbuffered, otherwise
  832. it's fully buffered.
  833. Unbuffered input is good if an application runs an interactive
  834. subprocess, since any type-ahead input won't go into Guile's buffer
  835. and be unavailable to the subprocess.
  836. Note that Guile buffering is completely separate from the TTY ``line
  837. discipline''. In the usual cooked mode on a TTY Guile only sees a
  838. line of input once the user presses @key{Return}.
  839. @end deffn
  840. @rnindex current-output-port
  841. @deffn {Scheme Procedure} current-output-port
  842. @deffnx {C Function} scm_current_output_port ()
  843. @cindex standard output
  844. Return the current output port. This is the default port used
  845. by many output procedures.
  846. Initially this is the @dfn{standard output} in Unix and C terminology.
  847. When the standard output is a TTY this port is unbuffered, otherwise
  848. it's fully buffered.
  849. Unbuffered output to a TTY is good for ensuring progress output or a
  850. prompt is seen. But an application which always prints whole lines
  851. could change to line buffered, or an application with a lot of output
  852. could go fully buffered and perhaps make explicit @code{force-output}
  853. calls (@pxref{Buffering}) at selected points.
  854. @end deffn
  855. @deffn {Scheme Procedure} current-error-port
  856. @deffnx {C Function} scm_current_error_port ()
  857. @cindex standard error output
  858. Return the port to which errors and warnings should be sent.
  859. Initially this is the @dfn{standard error} in Unix and C terminology.
  860. When the standard error is a TTY this port is unbuffered, otherwise
  861. it's fully buffered.
  862. @end deffn
  863. @deffn {Scheme Procedure} set-current-input-port port
  864. @deffnx {Scheme Procedure} set-current-output-port port
  865. @deffnx {Scheme Procedure} set-current-error-port port
  866. @deffnx {C Function} scm_set_current_input_port (port)
  867. @deffnx {C Function} scm_set_current_output_port (port)
  868. @deffnx {C Function} scm_set_current_error_port (port)
  869. Change the ports returned by @code{current-input-port},
  870. @code{current-output-port} and @code{current-error-port}, respectively,
  871. so that they use the supplied @var{port} for input or output.
  872. @end deffn
  873. @deffn {Scheme Procedure} with-input-from-port port thunk
  874. @deffnx {Scheme Procedure} with-output-to-port port thunk
  875. @deffnx {Scheme Procedure} with-error-to-port port thunk
  876. Call @var{thunk} in a dynamic environment in which
  877. @code{current-input-port}, @code{current-output-port} or
  878. @code{current-error-port} is rebound to the given @var{port}.
  879. @end deffn
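For example, output from code that writes to the current output port
can be captured by rebinding that port for the duration of a thunk.
This is a minimal sketch, with a string port as the arbitrary
destination.

@lisp
(define captured
  (call-with-output-string
    (lambda (port)
      (with-output-to-port port
        (lambda ()
          ;; display uses the current output port, which is now port.
          (display "inside"))))))
;; captured is "inside"
@end lisp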
  880. @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
  881. @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
  882. @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
  883. These functions must be used inside a pair of calls to
  884. @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
  885. Wind}). During the dynwind context, the indicated port is set to
  886. @var{port}.
More precisely, the current port is swapped with a ``backup'' value
  888. whenever the dynwind context is entered or left. The backup value is
  889. initialized with the @var{port} argument.
  890. @end deftypefn
  891. @node Port Types
  892. @subsection Types of Port
  893. @cindex Types of ports
  894. @cindex Port, types
  895. @menu
  896. * File Ports:: Ports on an operating system file.
  897. * Bytevector Ports:: Ports on a bytevector.
  898. * String Ports:: Ports on a Scheme string.
  899. * Custom Ports:: Ports whose implementation you control.
  900. * Soft Ports:: A Guile-specific version of custom ports.
  901. * Void Ports:: Ports on nothing at all.
  902. * Low-Level Custom Ports:: Implementing new kinds of port.
  903. * Low-Level Custom Ports in C:: A C counterpart to make-custom-port.
  904. @end menu
  905. @node File Ports
  906. @subsubsection File Ports
  907. @cindex File port
  908. @cindex Port, file
  909. The following procedures are used to open file ports.
  910. See also @ref{Ports and File Descriptors, open}, for an interface
  911. to the Unix @code{open} system call.
  912. All file access uses the ``LFS'' large file support functions when
  913. available, so files bigger than 2 gibibytes (@math{2^31} bytes) can be
  914. read and written on a 32-bit system.
  915. Most systems have limits on how many files can be open, so it's
  916. strongly recommended that file ports be closed explicitly when no
  917. longer required (@pxref{Ports}).
  918. @deffn {Scheme Procedure} open-file filename mode @
  919. [#:guess-encoding=#f] [#:encoding=#f]
  920. @deffnx {C Function} scm_open_file_with_encoding @
  921. (filename, mode, guess_encoding, encoding)
  922. @deffnx {C Function} scm_open_file (filename, mode)
  923. Open the file whose name is @var{filename}, and return a port
  924. representing that file. The attributes of the port are
  925. determined by the @var{mode} string. The way in which this is
  926. interpreted is similar to C stdio. The first character must be
  927. one of the following:
  928. @table @samp
  929. @item r
  930. Open an existing file for input.
  931. @item w
  932. Open a file for output, creating it if it doesn't already exist
  933. or removing its contents if it does.
  934. @item a
  935. Open a file for output, creating it if it doesn't already
  936. exist. All writes to the port will go to the end of the file.
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
  939. @end table
  940. The following additional characters can be appended:
  941. @table @samp
  946. @item +
  947. Open the port for both input and output. E.g., @code{r+}: open
  948. an existing file for both input and output.
  949. @item e
  950. Mark the underlying file descriptor as close-on-exec, as per the
  951. @code{O_CLOEXEC} flag.
  952. @item 0
  953. Create an "unbuffered" port. In this case input and output
  954. operations are passed directly to the underlying port
  955. implementation without additional buffering. This is likely to
  956. slow down I/O operations. The buffering mode can be changed
  957. while a port is in use (@pxref{Buffering}).
  958. @item l
  959. Add line-buffering to the port. The port output buffer will be
  960. automatically flushed whenever a newline character is written.
  961. @item b
  962. Use binary mode, ensuring that each byte in the file will be read as one
  963. Scheme character.
  964. To provide this property, the file will be opened with the 8-bit
  965. character encoding "ISO-8859-1", ignoring the default port encoding.
  966. @xref{Ports}, for more information on port encodings.
  967. Note that while it is possible to read and write binary data as
  968. characters or strings, it is usually better to treat bytes as octets,
  969. and byte sequences as bytevectors. @xref{Binary I/O}, for more.
  970. This option had another historical meaning, for DOS compatibility: in
  971. the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
  972. The @code{b} flag prevents this from happening, adding @code{O_BINARY}
  973. to the underlying @code{open} call. Still, the flag is generally useful
  974. because of its port encoding ramifications.
  975. @end table
  976. Unless binary mode is requested, the character encoding of the new port
  977. is determined as follows: First, if @var{guess-encoding} is true, the
  978. @code{file-encoding} procedure is used to guess the encoding of the file
  979. (@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
  980. is false or if @code{file-encoding} fails, @var{encoding} is used unless
  981. it is also false. As a last resort, the default port encoding is used.
  982. @xref{Ports}, for more information on port encodings. It is an error to
  983. pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
  984. is requested.
  985. If a file cannot be opened with the access requested, @code{open-file}
  986. throws an exception.
  987. @end deffn
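The sketch below exercises the @samp{w}, @samp{a} and @samp{r} modes in
turn. The path is a hypothetical scratch file chosen for the example,
and a writable @file{/tmp} directory is assumed.

@lisp
(use-modules (ice-9 textual-ports))

(define path "/tmp/guile-open-file-example.txt")  ; hypothetical path

(let ((port (open-file path "w")))   ; create or truncate
  (put-string port "first\n")
  (close-port port))

(let ((port (open-file path "a")))   ; writes go to the end of the file
  (put-string port "second\n")
  (close-port port))

(define contents
  (let* ((port (open-file path "r" #:encoding "UTF-8"))
         (text (get-string-all port)))
    (close-port port)
    text))
;; contents is "first\nsecond\n"

(delete-file path)
@end lisp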
  988. @rnindex open-input-file
  989. @deffn {Scheme Procedure} open-input-file filename @
  990. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  991. Open @var{filename} for input. If @var{binary} is true, open the port
  992. in binary mode, otherwise use text mode. @var{encoding} and
  993. @var{guess-encoding} determine the character encoding as described above
  994. for @code{open-file}. Equivalent to
  995. @lisp
  996. (open-file @var{filename}
  997. (if @var{binary} "rb" "r")
  998. #:guess-encoding @var{guess-encoding}
  999. #:encoding @var{encoding})
  1000. @end lisp
  1001. @end deffn
  1002. @rnindex open-output-file
  1003. @deffn {Scheme Procedure} open-output-file filename @
  1004. [#:encoding=#f] [#:binary=#f]
  1005. Open @var{filename} for output. If @var{binary} is true, open the port
  1006. in binary mode, otherwise use text mode. @var{encoding} specifies the
  1007. character encoding as described above for @code{open-file}. Equivalent
  1008. to
  1009. @lisp
  1010. (open-file @var{filename}
  1011. (if @var{binary} "wb" "w")
  1012. #:encoding @var{encoding})
  1013. @end lisp
  1014. @end deffn
  1015. @deffn {Scheme Procedure} call-with-input-file filename proc @
  1016. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  1017. @deffnx {Scheme Procedure} call-with-output-file filename proc @
  1018. [#:encoding=#f] [#:binary=#f]
  1019. @rnindex call-with-input-file
  1020. @rnindex call-with-output-file
  1021. Open @var{filename} for input or output, and call @code{(@var{proc}
  1022. port)} with the resulting port. Return the value returned by
  1023. @var{proc}. @var{filename} is opened as per @code{open-input-file} or
  1024. @code{open-output-file} respectively, and an error is signaled if it
  1025. cannot be opened.
  1026. When @var{proc} returns, the port is closed. If @var{proc} does not
  1027. return (e.g.@: if it throws an error), then the port might not be
  1028. closed automatically, though it will be garbage collected in the usual
  1029. way if not otherwise referenced.
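
For illustration, here is a small round trip through both procedures.
The file name is a hypothetical scratch path, and @code{get-line} is
assumed to come from @code{(ice-9 textual-ports)}.

@lisp
(use-modules (ice-9 textual-ports))

;; Hypothetical scratch file; any writable path would do.
(define file "/tmp/guile-call-with-file-example.txt")

;; The port is closed once the lambda returns.
(call-with-output-file file
  (lambda (port)
    (display "first line\nsecond line\n" port))
  #:encoding "UTF-8")

;; PROC can be any one-argument procedure; here it reads one line.
(define first-line
  (call-with-input-file file get-line #:encoding "UTF-8"))
;; first-line @result{} "first line"
@end lisp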
  1030. @end deffn
  1031. @deffn {Scheme Procedure} with-input-from-file filename thunk @
  1032. [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
  1033. @deffnx {Scheme Procedure} with-output-to-file filename thunk @
  1034. [#:encoding=#f] [#:binary=#f]
  1035. @deffnx {Scheme Procedure} with-error-to-file filename thunk @
  1036. [#:encoding=#f] [#:binary=#f]
  1037. @rnindex with-input-from-file
  1038. @rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}, respectively. Return the
  1042. value returned by @var{thunk}. @var{filename} is opened as per
  1043. @code{open-input-file} or @code{open-output-file} respectively, and an
  1044. error is signaled if it cannot be opened.
  1045. When @var{thunk} returns, the port is closed and the previous setting
  1046. of the respective current port is restored.
The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: via an
exception), and if @var{thunk} is re-entered (via a captured
continuation) then it's set again to the @var{filename} port.
  1051. The port is closed when @var{thunk} returns normally, but not when
  1052. exited via an exception or new continuation. This ensures it's still
  1053. ready for use if @var{thunk} is re-entered by a captured continuation.
  1054. Of course the port is always garbage collected and closed in the usual
  1055. way when no longer referenced anywhere.
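
A short sketch of the redirecting variants follows. The file name is a
hypothetical scratch path; note that the thunks never mention a port,
because @code{write} and @code{read} use the current ports.

@lisp
;; Hypothetical scratch file; any writable path would do.
(define file "/tmp/guile-with-file-example.scm")

;; WRITE goes to the current output port, which is the file here.
(with-output-to-file file
  (lambda ()
    (write '(1 2 3))
    (newline)))

;; READ takes its datum from the current input port, again the file.
(define datum (with-input-from-file file read))
;; datum @result{} (1 2 3)
@end lisp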
  1056. @end deffn
  1057. @deffn {Scheme Procedure} port-mode port
  1058. @deffnx {C Function} scm_port_mode (port)
  1059. Return the port modes associated with the open port @var{port}.
  1060. These will not necessarily be identical to the modes used when
  1061. the port was opened, since modes such as "append" which are
  1062. used only during port creation are not retained.
  1063. @end deffn
  1064. @deffn {Scheme Procedure} port-filename port
  1065. @deffnx {C Function} scm_port_filename (port)
  1066. Return the filename associated with @var{port}, or @code{#f} if no
  1067. filename is associated with the port.
  1068. @var{port} must be open; @code{port-filename} cannot be used once the
  1069. port is closed.
  1070. @end deffn
  1071. @deffn {Scheme Procedure} set-port-filename! port filename
  1072. @deffnx {C Function} scm_set_port_filename_x (port, filename)
  1073. Change the filename associated with @var{port}, using the current input
  1074. port if none is specified. Note that this does not change the port's
  1075. source of data, but only the value that is returned by
  1076. @code{port-filename} and reported in diagnostic output.
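
As a brief sketch, a string port starts out with no associated file
name, and one can be attached for diagnostic purposes. The name
@code{"virtual.scm"} is purely illustrative.

@lisp
(define port (open-input-string "(display 1)"))

(define name-before (port-filename port))
;; name-before @result{} #f

;; Only the reported name changes; the data still comes from the string.
(set-port-filename! port "virtual.scm")

(define name-after (port-filename port))
;; name-after @result{} "virtual.scm"
@end lisp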
  1077. @end deffn
  1078. @deffn {Scheme Procedure} file-port? obj
  1079. @deffnx {C Function} scm_file_port_p (obj)
  1080. Determine whether @var{obj} is a port that is related to a file.
  1081. @end deffn
  1082. @node Bytevector Ports
  1083. @subsubsection Bytevector Ports
  1084. @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
  1085. @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
  1086. Return an input port whose contents are drawn from bytevector @var{bv}
  1087. (@pxref{Bytevectors}).
  1088. @c FIXME: Update description when implemented.
  1089. The @var{transcoder} argument is currently not supported.
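
For example, the following sketch reads a three-byte bytevector back
through a port. It assumes @code{get-u8} and @code{get-bytevector-n}
from @code{(ice-9 binary-ports)}.

@lisp
(use-modules (ice-9 binary-ports))

(define port (open-bytevector-input-port #vu8(1 2 3)))

(define first-byte (get-u8 port))
;; first-byte @result{} 1

(define rest-bytes (get-bytevector-n port 2))
;; rest-bytes @result{} #vu8(2 3)

;; Once the bytevector is exhausted, reads yield the end-of-file object.
(define at-end? (eof-object? (get-u8 port)))
;; at-end? @result{} #t
@end lisp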
  1090. @end deffn
  1091. @deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
  1092. @deffnx {C Function} scm_open_bytevector_output_port (transcoder)
  1093. Return two values: a binary output port and a procedure. The latter
  1094. should be called with zero arguments to obtain a bytevector containing
  1095. the data accumulated by the port, as illustrated below.
  1096. @lisp
  1097. (call-with-values
  1098. (lambda ()
  1099. (open-bytevector-output-port))
  1100. (lambda (port get-bytevector)
  1101. (display "hello" port)
  1102. (get-bytevector)))
  1103. @result{} #vu8(104 101 108 108 111)
  1104. @end lisp
  1105. @c FIXME: Update description when implemented.
  1106. The @var{transcoder} argument is currently not supported.
  1107. @end deffn
  1108. @deffn {Scheme Procedure} call-with-output-bytevector proc
  1109. Call the one-argument procedure @var{proc} with a newly created
bytevector output port. When @var{proc} returns, the bytevector
composed of the bytes written into the port is returned.
  1112. @var{proc} should not close the port.
  1113. @end deffn
  1114. @deffn {Scheme Procedure} call-with-input-bytevector bytevector proc
  1115. Call the one-argument procedure @var{proc} with a newly created input
port from which @var{bytevector}'s contents may be read. The values
yielded by @var{proc} are returned.
  1118. @end deffn
  1119. @node String Ports
  1120. @subsubsection String Ports
  1121. @cindex String port
  1122. @cindex Port, string
  1123. @deffn {Scheme Procedure} call-with-output-string proc
  1124. @deffnx {C Function} scm_call_with_output_string (proc)
  1125. Calls the one-argument procedure @var{proc} with a newly created output
  1126. port. When the function returns, the string composed of the characters
  1127. written into the port is returned. @var{proc} should not close the port.
  1128. @end deffn
  1129. @deffn {Scheme Procedure} call-with-input-string string proc
  1130. @deffnx {C Function} scm_call_with_input_string (string, proc)
  1131. Calls the one-argument procedure @var{proc} with a newly
  1132. created input port from which @var{string}'s contents may be
read. The value yielded by @var{proc} is returned.
  1134. @end deffn
  1135. @deffn {Scheme Procedure} with-output-to-string thunk
  1136. Calls the zero-argument procedure @var{thunk} with the current output
  1137. port set temporarily to a new string port. It returns a string
  1138. composed of the characters written to the current output.
  1139. @end deffn
  1140. @deffn {Scheme Procedure} with-input-from-string string thunk
  1141. Calls the zero-argument procedure @var{thunk} with the current input
  1142. port set temporarily to a string port opened on the specified
  1143. @var{string}. The value yielded by @var{thunk} is returned.
  1144. @end deffn
  1145. @deffn {Scheme Procedure} open-input-string str
  1146. @deffnx {C Function} scm_open_input_string (str)
  1147. Take a string and return an input port that delivers characters
  1148. from the string. The port can be closed by
  1149. @code{close-input-port}, though its storage will be reclaimed
  1150. by the garbage collector if it becomes inaccessible.
  1151. @end deffn
  1152. @deffn {Scheme Procedure} open-output-string
  1153. @deffnx {C Function} scm_open_output_string ()
  1154. Return an output port that will accumulate characters for
  1155. retrieval by @code{get-output-string}. The port can be closed
  1156. by the procedure @code{close-output-port}, though its storage
  1157. will be reclaimed by the garbage collector if it becomes
  1158. inaccessible.
  1159. @end deffn
  1160. @deffn {Scheme Procedure} get-output-string port
  1161. @deffnx {C Function} scm_get_output_string (port)
  1162. Given an output port created by @code{open-output-string},
  1163. return a string consisting of the characters that have been
  1164. output to the port so far.
@code{get-output-string} must be used before closing @var{port}; once
closed, the string cannot be obtained.
  1167. @end deffn
With string ports, the port encoding is treated differently than for other
types of ports. When string ports are created, they do not inherit a
character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
  1172. Typically one should not modify a string port's character encoding
  1173. away from its default. @xref{Encoding}.
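
The string port procedures above can be compared side by side in a
short sketch. The variable names are illustrative only.

@lisp
;; The port is passed to the procedure explicitly.
(define written
  (call-with-output-string
    (lambda (port)
      (display "abc" port)
      (write 42 port))))
;; written @result{} "abc42"

;; The port becomes the current output port instead.
(define captured
  (with-output-to-string
    (lambda () (display "hi"))))
;; captured @result{} "hi"

;; Reading from a string, with an explicit and an implicit port.
(define number (call-with-input-string "42 rest" read))
;; number @result{} 42
(define datum (with-input-from-string "(a b)" read))
;; datum @result{} (a b)

;; Managing the port by hand: fetch the text before closing it.
(define accumulated
  (let ((port (open-output-string)))
    (display "x" port)
    (write 'y port)
    (let ((text (get-output-string port)))
      (close-output-port port)
      text)))
;; accumulated @result{} "xy"
@end lisp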
  1174. @node Custom Ports
  1175. @subsubsection Custom Ports
  1176. Custom ports allow the user to provide input and handle output via
  1177. user-supplied procedures. The most basic of these operates on the level
  1178. of bytes, calling user-supplied functions to supply bytes for input and
  1179. accept bytes for output. In Guile, textual ports are built on top of
  1180. binary ports, encoding and decoding their codepoint sequences from the
  1181. bytes; the higher-level textual layer for custom ports allows users to
  1182. deal in characters instead of bytes.
  1183. Before using these procedures, import the appropriate module:
  1184. @example
  1185. (use-modules (ice-9 binary-ports))
  1186. (use-modules (ice-9 textual-ports))
  1187. @end example
  1188. @cindex custom binary input ports
  1189. @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
  1190. Return a new custom binary input port named @var{id} (a string) whose
  1191. input is drained by invoking @var{read!} and passing it a bytevector, an
  1192. index where bytes should be written, and the number of bytes to read.
  1193. The @code{read!} procedure must return an integer indicating the number
  1194. of bytes read, or @code{0} to indicate the end-of-file.
  1195. Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
  1196. that will be called when @code{port-position} is invoked on the custom
  1197. binary port and should return an integer indicating the position within
  1198. the underlying data stream; if @var{get-position} was not supplied, the
  1199. returned port does not support @code{port-position}.
  1200. Likewise, if @var{set-position!} is not @code{#f}, it should be a
  1201. one-argument procedure. When @code{set-port-position!} is invoked on the
  1202. custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
  1204. Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
  1205. invoked when the custom binary input port is closed.
  1206. The returned port is fully buffered by default, but its buffering mode
  1207. can be changed using @code{setvbuf} (@pxref{Buffering}).
  1208. Using a custom binary input port, the @code{open-bytevector-input-port}
  1209. procedure (@pxref{Bytevector Ports}) could be implemented as follows:
  1210. @lisp
  1211. (define (open-bytevector-input-port source)
  1212. (define position 0)
  1213. (define length (bytevector-length source))
  1214. (define (read! bv start count)
  1215. (let ((count (min count (- length position))))
  1216. (bytevector-copy! source position
  1217. bv start count)
  1218. (set! position (+ position count))
  1219. count))
  1220. (define (get-position) position)
  1221. (define (set-position! new-position)
  1222. (set! position new-position))
  1223. (make-custom-binary-input-port "the port" read!
  1224. get-position set-position!
  1225. #f))
  1226. (read (open-bytevector-input-port (string->utf8 "hello")))
  1227. @result{} hello
  1228. @end lisp
  1229. @end deffn
  1230. @cindex custom binary output ports
  1231. @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
  1232. Return a new custom binary output port named @var{id} (a string) whose
  1233. output is sunk by invoking @var{write!} and passing it a bytevector, an
  1234. index where bytes should be read from this bytevector, and the number of
  1235. bytes to be ``written''. The @code{write!} procedure must return an
  1236. integer indicating the number of bytes actually written; when it is
  1237. passed @code{0} as the number of bytes to write, it should behave as
  1238. though an end-of-file was sent to the byte sink.
  1239. The other arguments are as for @code{make-custom-binary-input-port}.
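
As a rough sketch of the output side, the port below records every byte
it is handed. The names @code{received} and @code{"collector"} are
illustrative, and the example assumes that @code{force-output} is what
pushes the buffered bytes through @var{write!}.

@lisp
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

(define received '())   ; bytes seen so far, most recent first

(define (write! bv start count)
  ;; Record COUNT bytes of BV starting at START and claim all of them.
  (let loop ((i 0))
    (when (< i count)
      (set! received
            (cons (bytevector-u8-ref bv (+ start i)) received))
      (loop (+ i 1))))
  count)

(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port (string->utf8 "hello"))
(force-output port)     ; the port is buffered by default

(define result
  (utf8->string (u8-list->bytevector (reverse received))))
;; result @result{} "hello"
@end lisp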
  1240. @end deffn
  1241. @cindex custom binary input/output ports
  1242. @deffn {Scheme Procedure} make-custom-binary-input/output-port id read! write! get-position set-position! close
  1243. Return a new custom binary input/output port named @var{id} (a string).
The arguments are the same as for
@code{make-custom-binary-input-port} and
@code{make-custom-binary-output-port}. If buffering is enabled on the
port, as is the case by default, data will be buffered in both
directions (@pxref{Buffering}). If the @var{set-position!} function is
  1249. provided and not @code{#f}, then the port will also be marked as
  1250. random-access, causing the buffer to be flushed between reads and
  1251. writes.
  1252. @end deffn
  1253. @cindex custom textual ports
  1254. @cindex custom textual input ports
  1255. @cindex custom textual output ports
  1256. @cindex custom textual input/output ports
  1257. @deffn {Scheme Procedure} make-custom-textual-input-port id read! get-position set-position! close
  1258. @deffnx {Scheme Procedure} make-custom-textual-output-port id write! get-position set-position! close
  1259. @deffnx {Scheme Procedure} make-custom-textual-input/output-port id read! write! get-position set-position! close
  1260. Like their custom binary port counterparts, but for textual ports.
Concretely this means that instead of being passed a bytevector, the
@var{read!} function is passed a mutable string to fill, and likewise for
the buffer supplied to @var{write!}. Port positions are still expressed
in bytes, however.
If string ports were not supplied with Guile, we could implement them
with custom textual ports:
  1267. @example
  1268. (define (open-string-input-port source)
  1269. (define position 0)
  1270. (define length (string-length source))
  1271. (define (read! dst start count)
  1272. (let ((count (min count (- length position))))
  1273. (string-copy! dst start source position (+ position count))
  1274. (set! position (+ position count))
  1275. count))
  1276. (make-custom-textual-input-port "strport" read! #f #f #f))
  1277. (read (open-string-input-port "hello"))
  1278. @end example
  1279. @end deffn
  1280. @node Soft Ports
  1281. @subsubsection Soft Ports
  1282. @cindex Soft port
  1283. @cindex Port, soft
  1284. Soft ports are what Guile had before it had custom binary and textual
  1285. ports, and allow for customizable textual input and output.
  1286. We recommend soft ports over R6RS custom textual ports because they are
  1287. easier to use while also being more expressive. R6RS custom textual
  1288. ports operate under the principle that a port has a mutable string
buffer, and this is reflected in the @code{read!} and @code{write!}
procedures, which take a buffer, offset, and length. However, because
all Guile ports have a byte buffer rather than some having a string buffer,
the R6RS interface imposes overhead and complexity.
  1293. Additionally, and unlike the R6RS interfaces, @code{make-soft-port} from
  1294. the @code{(ice-9 soft-ports)} module accepts keyword arguments, allowing
  1295. for its functionality to be extended over time.
If you find yourself needing more power, notably the ability to seek,
you probably want to use low-level custom ports. @xref{Low-Level Custom
Ports}.
  1299. @example
  1300. (use-modules (ice-9 soft-ports))
  1301. @end example
  1302. @deffn {Scheme Procedure} make-soft-port @
  1303. [#:id] [#:read-string] [#:write-string] [#:input-waiting?] @
  1304. [#:close] [#:close-on-gc?]
  1305. Return a new port. If the @var{read-string} keyword argument is
  1306. present, the port will be an input port. If @var{write-string} is
  1307. present, the port will be an output port. If both are supplied, the
  1308. port will be open for input and output.
  1309. When the port's internal buffers are empty, @var{read-string} will be
  1310. called with no arguments, and should return a string, or @code{#f} to
  1311. indicate end-of-stream. Similarly when a port flushes its write buffer,
  1312. the characters in that buffer will be passed to the @var{write-string}
  1313. procedure as its single argument. @var{write-string} returns
  1314. unspecified values.
  1315. If supplied, @var{input-waiting?} should return @code{#t} if the soft
  1316. port has input which would be returned directly by @var{read-string}.
  1317. If supplied, @var{close} will be called when the port is closed, with no
  1318. arguments. If @var{close-on-gc?} is @code{#t}, @var{close} will
  1319. additionally be called when the port becomes unreachable, after flushing
  1320. any pending write buffers.
  1321. @end deffn
  1322. With soft ports, the @code{open-string-input-port} example from the
previous section is simpler:
  1324. @example
  1325. (define (open-string-input-port source)
  1326. (define already-read? #f)
  1327. (define (read-string)
  1328. (cond
  1329. (already-read? "")
  1330. (else
  1331. (set! already-read? #t)
  1332. source)))
  1333. (make-soft-port #:id "strport" #:read-string read-string))
  1334. @end example
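
The output direction can be sketched in the same spirit. The port below
simply remembers every string it is given; @code{collected} and
@code{"collector"} are illustrative names, and the example assumes that
@code{force-output} is what triggers the call to @var{write-string}.

@example
(use-modules (ice-9 soft-ports))

(define collected '())   ; strings received, most recent first

(define port
  (make-soft-port
   #:id "collector"
   #:write-string (lambda (str)
                    (set! collected (cons str collected)))))

(display "hello" port)
(force-output port)      ; flush the write buffer to write-string

(define text (string-concatenate (reverse collected)))
;; text @result{} "hello"
@end example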
  1335. Note that there was an earlier form of @code{make-soft-port} which was
  1336. exposed in Guile's default environment, and which is still there. Its
  1337. interface is more clumsy and its users historically expect unbuffered
  1338. input. This interface will be deprecated, but we document it here.
  1339. @deffn {Scheme Procedure} deprecated-make-soft-port pv modes
  1340. Return a port capable of receiving or delivering characters as
  1341. specified by the @var{modes} string (@pxref{File Ports,
  1342. open-file}). @var{pv} must be a vector of length 5 or 6. Its
  1343. components are as follows:
  1344. @enumerate 0
  1345. @item
  1346. procedure accepting one character for output
  1347. @item
  1348. procedure accepting a string for output
  1349. @item
  1350. thunk for flushing output
  1351. @item
  1352. thunk for getting one character
  1353. @item
  1354. thunk for closing port (not by garbage collection)
  1355. @item
  1356. (if present and not @code{#f}) thunk for computing the number of
  1357. characters that can be read from the port without blocking.
  1358. @end enumerate
  1359. For an output-only port only elements 0, 1, 2, and 4 need be
  1360. procedures. For an input-only port only elements 3 and 4 need
  1361. be procedures. Thunks 2 and 4 can instead be @code{#f} if
  1362. there is no useful operation for them to perform.
  1363. If thunk 3 returns @code{#f} or an @code{eof-object}
  1364. (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
  1365. Scheme}) it indicates that the port has reached end-of-file.
  1366. For example:
  1367. @lisp
  1368. (define stdout (current-output-port))
  1369. (define p (deprecated-make-soft-port
  1370. (vector
  1371. (lambda (c) (write c stdout))
  1372. (lambda (s) (display s stdout))
  1373. (lambda () (display "." stdout))
  1374. (lambda () (char-upcase (read-char)))
  1375. (lambda () (display "@@" stdout)))
  1376. "rw"))
  1377. (write p p) @result{} #<input-output: soft 8081e20>
  1378. @end lisp
  1379. @end deffn
  1380. @node Void Ports
  1381. @subsubsection Void Ports
  1382. @cindex Void port
  1383. @cindex Port, void
  1384. This kind of port causes any data to be discarded when written to, and
  1385. always returns the end-of-file object when read from.
  1386. @deffn {Scheme Procedure} %make-void-port mode
  1387. @deffnx {C Function} scm_sys_make_void_port (mode)
  1388. Create and return a new void port. A void port acts like
  1389. @file{/dev/null}. The @var{mode} argument
  1390. specifies the input/output modes for this port: see the
  1391. documentation for @code{open-file} in @ref{File Ports}.
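
A minimal sketch of both directions, using the mode string @code{"rw"}:

@lisp
(define port (%make-void-port "rw"))

;; Output is accepted and silently dropped.
(display "this text goes nowhere" port)

;; Input is always at end-of-file.
(define at-end? (eof-object? (read-char port)))
;; at-end? @result{} #t
@end lisp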
  1392. @end deffn
  1393. @node Low-Level Custom Ports
  1394. @subsubsection Low-Level Custom Ports
  1395. This section describes how to implement a new kind of port using Guile's
  1396. lowest-level, most primitive interfaces. First, load the @code{(ice-9
  1397. custom-ports)} module:
  1398. @example
  1399. (use-modules (ice-9 custom-ports))
  1400. @end example
  1401. Then to make a new port, call @code{make-custom-port}:
  1402. @deffn {Scheme Procedure} make-custom-port @
  1403. [#:read] [#:write] @
  1404. [#:read-wait-fd] [#:write-wait-fd] [#:input-waiting?] @
  1405. [#:seek] [#:random-access?] [#:get-natural-buffer-sizes] @
  1406. [#:id] [#:print] @
  1407. [#:close] [#:close-on-gc?] @
  1408. [#:truncate] @
  1409. [#:encoding] [#:conversion-strategy]
  1410. Make a new custom port.
  1411. @xref{Encoding}, for more on @code{#:encoding} and
  1412. @code{#:conversion-strategy}.
  1413. @end deffn
  1414. A port has a number of associated procedures and properties which
  1415. collectively implement its behavior. Creating a new custom port mostly
  1416. involves writing these procedures, which are passed as keyword arguments
  1417. to @code{make-custom-port}.
  1418. @deffn {Scheme Port Method} #:read port dst start count
  1419. A port's @code{#:read} implementation fills read buffers. It should
  1420. copy bytes to the supplied bytevector @var{dst}, starting at offset
  1421. @var{start} and continuing for @var{count} bytes, and return the number
  1422. of bytes that were read, or @code{#f} to indicate that reading any bytes
  1423. would block.
  1424. @end deffn
  1425. @deffn {Scheme Port Method} #:write port src start count
  1426. A port's @code{#:write} implementation flushes write buffers to the
  1427. mutable store. It should write out bytes from the supplied bytevector
  1428. @var{src}, starting at offset @var{start} and continuing for @var{count}
  1429. bytes, and return the number of bytes that were written, or @code{#f} to
  1430. indicate writing any bytes would block.
  1431. @end deffn
  1432. If @code{make-custom-port} is passed a @code{#:read} argument, the port
  1433. will be an input port. Passing a @code{#:write} argument will make an
  1434. output port, and passing both will make an input-output port.
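
As a minimal sketch, here is an input-only port that draws its bytes
from a bytevector. The names @code{open-bytevector-source} and
@code{read-bytes} are hypothetical, and the example assumes that, as
with @code{make-custom-binary-input-port}, returning 0 from
@code{#:read} signals end-of-file.

@lisp
(use-modules (ice-9 custom-ports)
             (ice-9 binary-ports)
             (rnrs bytevectors))

(define (open-bytevector-source source)
  (define position 0)
  (define (read-bytes port dst start count)
    ;; Copy at most COUNT of the remaining bytes into DST at START.
    (let ((n (min count (- (bytevector-length source) position))))
      (bytevector-copy! source position dst start n)
      (set! position (+ position n))
      n))                ; 0 once the source is exhausted
  (make-custom-port #:id "bytevector-source" #:read read-bytes))

(define all-bytes
  (get-bytevector-all
   (open-bytevector-source (string->utf8 "hello"))))
;; all-bytes @result{} #vu8(104 101 108 108 111)
@end lisp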
  1435. @deffn {Scheme Port Method} #:read-wait-fd port
  1436. @deffnx {Scheme Port Method} #:write-wait-fd port
  1437. If a port's @code{#:read} or @code{#:write} method returns @code{#f},
  1438. that indicates that reading or writing would block, and that Guile
  1439. should instead @code{poll} on the file descriptor returned by the port's
  1440. @code{#:read-wait-fd} or @code{#:write-wait-fd} method, respectively,
  1441. until the operation can complete. @xref{Non-Blocking I/O}, for a more
  1442. in-depth discussion.
  1443. These methods must be implemented if the @code{#:read} or @code{#:write}
  1444. method can return @code{#f}, and should return a non-negative integer
  1445. file descriptor. However they may be called explicitly by a user, for
  1446. example to determine if a port may eventually be readable or writable.
  1447. If there is no associated file descriptor with the port, they should
  1448. return @code{#f}. The default implementation returns @code{#f}.
  1449. @end deffn
  1450. @deffn {Scheme Port Method} #:input-waiting? port
  1451. In rare cases it is useful to be able to know whether data can be read
  1452. from a port. For example, if the user inputs @code{1 2 3} at the
  1453. interactive console, after reading and evaluating @code{1} the console
  1454. shouldn't then print another prompt before reading and evaluating
  1455. @code{2} because there is input already waiting. If the port can look
  1456. ahead, then it should implement the @code{#:input-waiting?} method,
which returns @code{#t} if input is available, or @code{#f} if reading the
  1458. next byte would block. The default implementation returns @code{#t}.
  1459. @end deffn
  1460. @deffn {Scheme Port Method} #:seek port offset whence
  1461. Set or get the current byte position of the port. Guile will flush read
  1462. and/or write buffers before seeking, as appropriate. The @var{offset}
and @var{whence} parameters are as for the @code{seek} procedure
(@pxref{Random Access}).
  1465. The @code{#:seek} method returns the byte position after seeking. To
  1466. query the current position, @code{#:seek} will be called with an
  1467. @var{offset} of 0 and @code{SEEK_CUR} for @var{whence}. Other values of
  1468. @var{offset} and/or @var{whence} will actually perform the seek. The
  1469. @code{#:seek} method should throw an error if the port is not seekable,
  1470. which is what the default implementation does.
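
The following sketch shows one way a @code{#:seek} method might look
for a port backed by a bytevector. The names
@code{open-seekable-source}, @code{read-bytes}, and @code{seek-to} are
hypothetical. The method handles the position query (an offset of 0
with @code{SEEK_CUR}) and real seeks with the same code, since both
return the resulting byte position.

@lisp
(use-modules (ice-9 custom-ports)
             (ice-9 binary-ports)
             (rnrs bytevectors))

(define (open-seekable-source source)
  (define size (bytevector-length source))
  (define position 0)
  (define (read-bytes port dst start count)
    (let ((n (min count (- size position))))
      (bytevector-copy! source position dst start n)
      (set! position (+ position n))
      n))
  (define (seek-to port offset whence)
    ;; Compute the absolute target, validate it, and report it back.
    (let ((target
           (cond ((eqv? whence SEEK_SET) offset)
                 ((eqv? whence SEEK_CUR) (+ position offset))
                 ((eqv? whence SEEK_END) (+ size offset))
                 (else (error "unknown whence" whence)))))
      (unless (<= 0 target size)
        (error "seek out of range" target))
      (set! position target)
      target))
  (make-custom-port #:id "seekable-source"
                    #:read read-bytes
                    #:seek seek-to))

(define port (open-seekable-source (string->utf8 "hello")))
(define first-byte (get-u8 port))
;; first-byte @result{} 104
(seek port 0 SEEK_SET)           ; rewind to the start
(define all-bytes (get-bytevector-all port))
;; all-bytes @result{} #vu8(104 101 108 108 111)
@end lisp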
  1471. @end deffn
  1472. @deffn {Scheme Port Method} #:truncate port
Truncate the port data to the specified length. Guile will flush buffers
  1474. beforehand, as appropriate. The default implementation throws an error,
  1475. indicating that truncation is not supported for this port.
  1476. @end deffn
  1477. @deffn {Scheme Port Method} #:random-access? port
  1478. Return @code{#t} if @var{port} is open for random access, or @code{#f}
  1479. otherwise.
  1480. @cindex random access
  1481. Seeking on a random-access port with buffered input, or switching to
  1482. writing after reading, will cause the buffered input to be discarded and
  1483. Guile will seek the port back the buffered number of bytes. Likewise
  1484. seeking on a random-access port with buffered output, or switching to
  1485. reading after writing, will flush pending bytes with a call to the
@code{#:write} method. @xref{Buffering}.
  1487. Indicate to Guile that your port needs this behavior by returning true
  1488. from your @code{#:random-access?} method. The default implementation of
  1489. this function returns @code{#t} if the port has a @code{#:seek}
  1490. implementation.
  1491. @end deffn
  1492. @deffn {Scheme Port Method} #:get-natural-buffer-sizes read-buf-size write-buf-size
  1493. Guile will internally attach buffers to ports. An input port always has
  1494. a read buffer, and an output port always has a write buffer.
  1495. @xref{Buffering}. A port buffer consists of a bytevector, along with
  1496. some cursors into that bytevector denoting where to get and put data.
  1497. Port implementations generally don't have to be concerned with
  1498. buffering: a port's @code{#:read} or @code{#:write} method will receive
  1499. the buffer's bytevector as an argument, along with an offset and a
  1500. length into that bytevector, and should then either fill or empty that
  1501. bytevector. However in some cases, port implementations may be able to
  1502. provide an appropriate default buffer size to Guile. For example file
  1503. ports implement @code{#:get-natural-buffer-sizes} to let the operating
  1504. system inform Guile about the appropriate buffer sizes for the
  1505. particular file opened by the port.
  1506. This method returns two values, corresponding to the natural read and
write buffer sizes for the port. The two parameters
  1508. @var{read-buf-size} and @var{write-buf-size} are Guile's guesses for
  1509. what sizes might be good. A custom @code{#:get-natural-buffer-sizes}
  1510. method could override Guile's choices, or just pass them on, as the
  1511. default implementation does.
  1512. @end deffn
  1513. @deffn {Scheme Port Method} #:print port out
Called when the port @var{port} is written to @var{out}, e.g.@: via
  1515. @code{(write port out)}.
  1516. If @code{#:print} is not explicitly supplied, the default implementation
  1517. prints something like @code{#<@var{mode}:@var{id} @var{address}>}, where
  1518. @var{mode} is either @code{input}, @code{output}, or
  1519. @code{input-output}, @var{id} comes from the @code{#:id} keyword
  1520. argument (defaulting to @code{"custom-port"}), and @var{address} is a
  1521. unique integer associated with the port.
  1522. @end deffn
  1523. @deffn {Scheme Port Method} #:close port
  1524. Called when @var{port} is closed. It should release any
  1525. explicitly-managed resources used by the port.
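
As a small sketch, the port below discards its output and records when
it has been closed. The flag @code{closed?} stands in for whatever
resource a real port would release.

@lisp
(use-modules (ice-9 custom-ports))

(define closed? #f)

(define port
  (make-custom-port
   #:id "closable-sink"
   ;; Accept every byte and drop it.
   #:write (lambda (port src start count) count)
   ;; Release resources here; this sketch only sets a flag.
   #:close (lambda (port) (set! closed? #t))))

(display "ignored" port)
(close-port port)
;; closed? @result{} #t
@end lisp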
  1526. @end deffn
  1527. By default, ports that are garbage collected just go away without
  1528. closing or flushing any buffered output. If your port needs to release
  1529. some external resource like a file descriptor, or needs to make sure
  1530. that its internal buffers are flushed even if the port is collected
  1531. while it was open, then pass @code{#:close-on-gc? #t} to
  1532. @code{make-custom-port}. Note that in that case, the @code{#:close}
  1533. method will probably be called on a separate thread.
  1534. Note that calls to all of these methods can proceed in parallel and
  1535. concurrently and from any thread up until the point that the port is
closed. The call to @code{#:close} will happen when no other method is
running, and no method will be called after the @code{#:close} method is
  1538. called. If your port implementation needs mutual exclusion to prevent
  1539. concurrency, it is responsible for locking appropriately.
  1540. @node Low-Level Custom Ports in C
  1541. @subsubsection Low-Level Custom Ports in C
  1542. The @code{make-custom-port} procedure described in the previous section
  1543. has similar functionality on the C level, though it is organized a bit
  1544. differently.
  1545. In C, the mechanism is that one creates a new @dfn{port type object}.
  1546. The methods are then associated with the port type object instead of the
  1547. port itself. The port type object is an opaque pointer allocated when
  1548. defining the port type, which serves as a key into the port API.
  1549. Ports themselves have associated @dfn{stream} values. The stream is a
  1550. pointer controlled by the user, which is set when the port is created.
  1551. Given a port, the @code{SCM_STREAM} macro returns its associated stream
  1552. value, as a @code{scm_t_bits}. Note that your port methods are only
  1553. ever called with ports of your type, so port methods can safely cast
  1554. this value to the expected type. Contrast this to Scheme, which doesn't
  1555. need access to the stream because the @code{make-custom-port} methods
  1556. can be closures that share port-specific data directly.
  1557. A port type is created by calling @code{scm_make_port_type}.
  1558. @deftypefun scm_t_port_type* scm_make_port_type (char *name, size_t (*read) (SCM port, SCM dst, size_t start, size_t count), size_t (*write) (SCM port, SCM src, size_t start, size_t count))
  1559. Define a new port type. The @var{name} parameter is like the
  1560. @code{#:id} parameter to @code{make-custom-port}; and @var{read} and
  1561. @var{write} are like @code{make-custom-port}'s @code{#:read} and
  1562. @code{#:write}, except that they should return @code{(size_t)-1} if the
  1563. read or write operation would block, instead of @code{#f}.
  1564. @end deftypefun
  1565. @deftypefun void scm_set_port_read_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port))
  1566. @deftypefunx void scm_set_port_write_wait_fd (scm_t_port_type *type, int (*wait_fd) (SCM port))
  1567. @deftypefunx void scm_set_port_print (scm_t_port_type *type, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
  1568. @deftypefunx void scm_set_port_close (scm_t_port_type *type, void (*close) (SCM port))
  1569. @deftypefunx void scm_set_port_needs_close_on_gc (scm_t_port_type *type, int needs_close_p)
  1570. @deftypefunx void scm_set_port_seek (scm_t_port_type *type, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
  1571. @deftypefunx void scm_set_port_truncate (scm_t_port_type *type, void (*truncate) (SCM port, scm_t_off length))
@deftypefunx void scm_set_port_random_access_p (scm_t_port_type *type, int (*random_access_p) (SCM port))
@deftypefunx void scm_set_port_input_waiting (scm_t_port_type *type, int (*input_waiting) (SCM port))
  1574. @deftypefunx void scm_set_port_get_natural_buffer_sizes @
  1575. (scm_t_port_type *type, void (*get_natural_buffer_sizes) (SCM, size_t *read_buf_size, size_t *write_buf_size))
  1576. Port method definitions. @xref{Low-Level Custom Ports}, for more
  1577. details on each of these methods.
  1578. @end deftypefun
  1579. Once you have your port type, you can create ports with
  1580. @code{scm_c_make_port}, or @code{scm_c_make_port_with_encoding}.
  1581. @deftypefun SCM scm_c_make_port_with_encoding (scm_t_port_type *type, unsigned long mode_bits, SCM encoding, SCM conversion_strategy, scm_t_bits stream)
  1582. @deftypefunx SCM scm_c_make_port (scm_t_port_type *type, unsigned long mode_bits, scm_t_bits stream)
  1583. Make a port with the given @var{type}. The @var{stream} indicates the
  1584. private data associated with the port, which your port implementation
  1585. may later retrieve with @code{SCM_STREAM}. The mode bits should include
one or both of the flags @code{SCM_RDNG} and @code{SCM_WRTNG}, indicating
  1587. that the port is an input and/or an output port, respectively. The mode
  1588. bits may also include @code{SCM_BUF0} or @code{SCM_BUFLINE}, indicating
  1589. that the port should be unbuffered or line-buffered, respectively. The
  1590. default is that the port will be block-buffered. @xref{Buffering}.
  1591. As you would imagine, @var{encoding} and @var{conversion_strategy}
  1592. specify the port's initial textual encoding and conversion strategy.
  1593. Both are symbols. @code{scm_c_make_port} is the same as
  1594. @code{scm_c_make_port_with_encoding}, except it uses the default port
  1595. encoding and conversion strategy.
  1596. @end deftypefun
  1597. At this point you may be wondering whether to implement your custom port
type in C or Scheme. The answer is that you probably want to use
  1599. Scheme's @code{make-custom-port}. The speed is similar between C and
  1600. Scheme, and ports implemented in C have the disadvantage of not being
  1601. suspendable. @xref{Non-Blocking I/O}.
  1602. @node Venerable Port Interfaces
  1603. @subsection Venerable Port Interfaces
  1604. Over the 25 years or so that Guile has been around, its port system has
  1605. evolved, adding many useful features. At the same time there have been
  1606. four major Scheme standards released in those 25 years, which also
  1607. evolve the common Scheme understanding of what a port interface should
  1608. be. Alas, it would be too much to ask for all of these evolutionary
  1609. branches to be consistent. Some of Guile's original interfaces don't
  1610. mesh with the later Scheme standards, and yet Guile can't just drop old
interfaces. Sadly as well, the R6RS and R7RS standards both start from a
  1612. base of R5RS, but end up in different and somewhat incompatible designs.
  1613. Guile's approach is to pick a set of port primitives that make sense
  1614. together. We document that set of primitives, design our internal
  1615. interfaces around them, and recommend them to users. As the R6RS I/O
  1616. system is the most capable standard that Scheme has yet produced in this
  1617. domain, we mostly recommend that; @code{(ice-9 binary-ports)} and
  1618. @code{(ice-9 textual-ports)} are wholly modeled on @code{(rnrs io
ports)}. Guile does not wholly copy R6RS, however. @xref{R6RS
  1620. Incompatibilities}.
  1621. At the same time, we have many venerable port interfaces, lore handed
  1622. down to us from our hacker ancestors. Most of these interfaces even
  1623. predate the expectation that Scheme should have modules, so they are
  1624. present in the default environment. In Guile we support them as well
  1625. and we have no plans to remove them, but again we don't recommend them
  1626. for new users.
  1627. @rnindex char-ready?
  1628. @deffn {Scheme Procedure} char-ready? [port]
  1629. Return @code{#t} if a character is ready on input @var{port}
  1630. and return @code{#f} otherwise. If @code{char-ready?} returns
  1631. @code{#t} then the next @code{read-char} operation on
  1632. @var{port} is guaranteed not to hang. If @var{port} is a file
  1633. port at end of file then @code{char-ready?} returns @code{#t}.
  1634. @code{char-ready?} exists to make it possible for a
  1635. program to accept characters from interactive ports without
  1636. getting stuck waiting for input. Any input editors associated
  1637. with such ports must make sure that characters whose existence
  1638. has been asserted by @code{char-ready?} cannot be rubbed out.
  1639. If @code{char-ready?} were to return @code{#f} at end of file,
  1640. a port at end of file would be indistinguishable from an
  1641. interactive port that has no ready characters.
  1642. Note that @code{char-ready?} only works reliably for terminals and
  1643. sockets with one-byte encodings. Under the hood it will return
  1644. @code{#t} if the port has any input buffered, or if the file descriptor
  1645. that backs the port polls as readable, indicating that Guile can fetch
  1646. more bytes from the kernel. However being able to fetch one byte
doesn't mean that a full character is available. @xref{Encoding}. Also,
  1648. on many systems it's possible for a file descriptor to poll as readable,
but then block when it comes time to read bytes. Note also that on
Linux, file ports backed by regular files always poll as readable.
  1651. For non-file ports, this procedure always returns @code{#t}, except for
  1652. soft ports, which have a @code{char-ready?} handler. @xref{Soft Ports}.
  1653. In short, this is a legacy procedure whose semantics are hard to
  1654. provide. However it is a useful check to see if any input is buffered.
  1655. @xref{Non-Blocking I/O}.
  1656. @end deffn
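The file-descriptor half of the check described for @code{char-ready?} can be pictured with the following C sketch. It is not Guile's implementation, and the function names are hypothetical; it only shows a zero-timeout @code{poll}, and why a positive answer promises a byte, not a whole character.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Return 1 if FD polls as readable right now, 0 if not, -1 on error.
   The zero timeout makes poll return immediately.  A hang-up counts
   as readable, because a read at end of stream returns at once.  */
int
fd_polls_readable (int fd)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

  assert (fd >= 0);
  if (poll (&pfd, 1, 0) < 0)
    return -1;
  return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : 0;
}

/* If FD polls as readable, read a single byte into *OUT and return 1.
   Return 0 if nothing is ready or the stream has ended, -1 on error.
   One byte need not be a whole character: a UTF-8 character, for
   instance, can take up to four bytes.  */
int
read_byte_if_ready (int fd, unsigned char *out)
{
  int ready = fd_polls_readable (fd);
  ssize_t n;

  if (ready <= 0)
    return ready;
  n = read (fd, out, 1);
  return n < 0 ? -1 : (int) n;
}
```

With a two-byte UTF-8 character waiting on a pipe, @code{read_byte_if_ready} returns only its first byte, which is the situation the caveat about one-byte encodings warns about.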
  1657. @rnindex read-char
  1658. @deffn {Scheme Procedure} read-char [port]
  1659. The same as @code{get-char}, except that @var{port} defaults to the
  1660. current input port. @xref{Textual I/O}.
  1661. @end deffn
  1662. @rnindex peek-char
  1663. @deffn {Scheme Procedure} peek-char [port]
  1664. The same as @code{lookahead-char}, except that @var{port} defaults to
  1665. the current input port. @xref{Textual I/O}.
  1666. @end deffn
  1667. @deffn {Scheme Procedure} unread-char cobj [port]
  1668. The same as @code{unget-char}, except that @var{port} defaults to the
  1669. current input port, and the arguments are swapped. @xref{Textual I/O}.
  1670. @end deffn
  1671. @deffn {Scheme Procedure} unread-string str [port]
  1672. @deffnx {C Function} scm_unread_string (str, port)
  1673. The same as @code{unget-string}, except that @var{port} defaults to the
  1674. current input port, and the arguments are swapped. @xref{Textual I/O}.
  1675. @end deffn
  1676. @rnindex newline
  1677. @deffn {Scheme Procedure} newline [port]
  1678. Send a newline to @var{port}. If @var{port} is omitted, send to the
  1679. current output port. Equivalent to @code{(put-char port #\newline)}.
  1680. @end deffn
  1681. @rnindex write-char
  1682. @deffn {Scheme Procedure} write-char chr [port]
  1683. The same as @code{put-char}, except that @var{port} defaults to the
current output port, and the arguments are swapped. @xref{Textual I/O}.
  1685. @end deffn
  1686. @node Using Ports from C
  1687. @subsection Using Ports from C
Guile's C interfaces provide some niceties for sending and receiving
  1689. bytes and characters in a way that works better with C.
  1690. @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
  1691. Read up to @var{size} bytes from @var{port} and store them in
  1692. @var{buffer}. The return value is the number of bytes actually read,
  1693. which can be less than @var{size} if end-of-file has been reached.
  1694. Note that as this is a binary input procedure, this function does not
  1695. update @code{port-line} and @code{port-column} (@pxref{Textual I/O}).
  1696. @end deftypefn
  1697. @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
  1698. Write @var{size} bytes at @var{buffer} to @var{port}.
  1699. Note that as this is a binary output procedure, this function does not
  1700. update @code{port-line} and @code{port-column} (@pxref{Textual I/O}).
  1701. @end deftypefn
  1702. @deftypefn {C Function} size_t scm_c_read_bytes (SCM port, SCM bv, size_t start, size_t count)
  1703. @deftypefnx {C Function} void scm_c_write_bytes (SCM port, SCM bv, size_t start, size_t count)
  1704. Like @code{scm_c_read} and @code{scm_c_write}, but reading into or
writing from the bytevector @var{bv}. @var{start} indicates the byte
  1706. index at which to start in the bytevector, and the read or write will
  1707. continue for @var{count} bytes.
  1708. @end deftypefn
  1709. @deftypefn {C Function} void scm_unget_bytes (const unsigned char *buf, size_t len, SCM port)
  1710. @deftypefnx {C Function} void scm_unget_byte (int c, SCM port)
  1711. @deftypefnx {C Function} void scm_ungetc (scm_t_wchar c, SCM port)
  1712. Like @code{unget-bytevector}, @code{unget-byte}, and @code{unget-char},
  1713. respectively. @xref{Textual I/O}.
  1714. @end deftypefn
  1715. @deftypefn {C Function} void scm_c_put_latin1_chars (SCM port, const scm_t_uint8 *buf, size_t len)
@deftypefnx {C Function} void scm_c_put_utf32_chars (SCM port, const scm_t_uint32 *buf, size_t len)
  1717. Write a string to @var{port}. In the first case, the
  1718. @code{scm_t_uint8*} buffer is a string in the latin-1 encoding. In the
  1719. second, the @code{scm_t_uint32*} buffer is a string in the UTF-32
  1720. encoding. These routines will update the port's line and column.
  1721. @end deftypefn
  1722. @node Non-Blocking I/O
  1723. @subsection Non-Blocking I/O
  1724. Most ports in Guile are @dfn{blocking}: when you try to read a character
  1725. from a port, Guile will block on the read until a character is ready, or
  1726. end-of-stream is detected. Likewise whenever Guile goes to write
  1727. (possibly buffered) data to an output port, Guile will block until all
  1728. the data is written.
  1729. Interacting with ports in blocking mode is very convenient: you can
  1730. write straightforward, sequential algorithms whose code flow reflects
  1731. the flow of data. However, blocking I/O has two main limitations.
  1732. The first is that it's easy to get into a situation where code is
  1733. waiting on data. Time spent waiting on data when code could be doing
  1734. something else is wasteful and prevents your program from reaching its
  1735. peak throughput. If you implement a web server that sequentially
  1736. handles requests from clients, it's very easy for the server to end up
  1737. waiting on a client to finish its HTTP request, or waiting on it to
  1738. consume the response. The end result is that you are able to serve
  1739. fewer requests per second than you'd like to serve.
  1740. The second limitation is related: a blocking parser over user-controlled
  1741. input is a denial-of-service vulnerability. Indeed the so-called ``slow
  1742. loris'' attack of the early 2010s was just that: an attack on common web
  1743. servers that drip-fed HTTP requests, one character at a time. All it
  1744. took was a handful of slow loris connections to occupy an entire web
  1745. server.
  1746. In Guile we would like to preserve the ability to write straightforward
  1747. blocking networking processes of all kinds, but under the hood to allow
  1748. those processes to suspend their requests if they would block.
  1749. To do this, the first piece is to allow Guile ports to declare
  1750. themselves as being nonblocking. This is currently supported only for
file ports, which include sockets, terminals, and any other port
  1752. that is backed by a file descriptor. To do that, we use an arcane UNIX
  1753. incantation:
  1754. @example
(let ((flags (fcntl socket F_GETFL)))
  (fcntl socket F_SETFL (logior O_NONBLOCK flags)))
  1757. @end example
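For comparison, the same incantation written directly against the POSIX C interface might look like the following sketch. The helper name @code{set_nonblocking} is hypothetical; @code{fcntl}, @code{F_GETFL}, @code{F_SETFL}, and @code{O_NONBLOCK} are the standard POSIX names that the Scheme bindings above mirror.

```c
#include <assert.h>
#include <fcntl.h>

/* Add O_NONBLOCK to the status flags of FD, keeping the other flags.
   Return 0 on success, or -1 on failure with errno set by fcntl.  */
int
set_nonblocking (int fd)
{
  int flags;

  assert (fd >= 0);
  flags = fcntl (fd, F_GETFL);
  if (flags == -1)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```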
  1758. Now the file descriptor is open in non-blocking mode. If Guile tries to
  1759. read or write from this file and the read or write returns a result
  1760. indicating that more data can only be had by doing a blocking read or
  1761. write, Guile will block by polling on the socket's @code{read-wait-fd}
  1762. or @code{write-wait-fd}, to preserve the illusion of a blocking read or
write. @xref{Low-Level Custom Ports}, for more on those internal
  1764. interfaces.
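The ``poll, then retry'' behavior just described can be sketched in plain C as follows. This is an illustration under stated assumptions, not Guile's code: the function name is hypothetical, and the descriptor being read doubles as the one that is waited on.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Read up to LEN bytes from FD into BUF.  If FD is in non-blocking
   mode and no data is ready, wait in poll until FD becomes readable
   and try again, so the caller sees ordinary blocking behavior.
   Return the number of bytes read, 0 at end of stream, -1 on error.  */
ssize_t
read_as_if_blocking (int fd, void *buf, size_t len)
{
  assert (buf != NULL);

  for (;;)
    {
      ssize_t n = read (fd, buf, len);

      if (n >= 0)
        return n;
      if (errno == EINTR)
        continue;
      if (errno != EAGAIN && errno != EWOULDBLOCK)
        return -1;

      /* Nothing available yet: sleep until the descriptor is readable.  */
      struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
      if (poll (&pfd, 1, -1) < 0 && errno != EINTR)
        return -1;
    }
}
```

The whole loop runs on the C stack, which is why this style of waiting cannot simply be swapped for a suspension.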
  1765. So far we have just reproduced the status quo: the file descriptor is
non-blocking, but the operations on the port do block. To go further,
  1767. it would be nice if we could suspend the ``thread'' using delimited
  1768. continuations, and only resume the thread once the file descriptor is
readable or writable. @xref{Prompts}.
  1770. But here we run into a difficulty. The ports code is implemented in C,
  1771. which means that although we can suspend the computation to some outer
  1772. prompt, we can't resume it because Guile can't resume delimited
  1773. continuations that capture the C stack.
  1774. To overcome this difficulty we have created a compatible but entirely
  1775. parallel implementation of port operations. To use this implementation,
  1776. do the following:
  1777. @example
(use-modules (ice-9 suspendable-ports))
(install-suspendable-ports!)
  1780. @end example
  1781. This will replace the core I/O primitives like @code{get-char} and
  1782. @code{put-bytevector} with new versions that are exactly the same as the
  1783. ones in the standard library, but with two differences. One is that
when a read or a write would block, the suspendable port operations call
the procedure stored in the @code{current-read-waiter} or
  1786. @code{current-write-waiter} parameter, as appropriate.
  1787. @xref{Parameters}. The default read and write waiters do the same thing
  1788. that the C read and write waiters do, which is to poll. User code can
  1789. parameterize the waiters, though, enabling the computation to suspend
  1790. and allow the program to process other I/O operations. Because the new
  1791. suspendable ports implementation is written in Scheme, that suspended
  1792. computation can resume again later when it is able to make progress.
  1793. Success!
  1794. The other main difference is that because the new ports implementation
  1795. is written in Scheme, it is slower than C, currently by a factor of 3 or
  1796. 4, though it depends on many factors. For this reason we have to keep
  1797. the C implementations as the default ones. One day when Guile's
  1798. compiler is better, we can close this gap and have only one port
  1799. operation implementation again.
  1800. Note that Guile does not currently include an implementation of the
  1801. facility to suspend the current thread and schedule other threads in the
  1802. meantime. Before adding such a thing, we want to make sure that we're
  1803. providing the right primitives that can be used to build schedulers and
  1804. other user-space concurrency patterns, and that the patterns that we
  1805. settle on are the right patterns. In the meantime, have a look at 8sync
  1806. (@url{https://gnu.org/software/8sync}) for a prototype of an
  1807. asynchronous I/O and concurrency facility.
  1808. @deffn {Scheme Procedure} install-suspendable-ports!
  1809. Replace the core ports implementation with suspendable ports, as
  1810. described above. This will mutate the values of the bindings like
  1811. @code{get-char}, @code{put-u8}, and so on in place.
  1812. @end deffn
  1813. @deffn {Scheme Procedure} uninstall-suspendable-ports!
  1814. Restore the original core ports implementation, un-doing the effect of
  1815. @code{install-suspendable-ports!}.
  1816. @end deffn
  1817. @deffn {Scheme Parameter} current-read-waiter
  1818. @deffnx {Scheme Parameter} current-write-waiter
  1819. Parameters whose values are procedures of one argument, called when a
  1820. suspendable port operation would block on a port while reading or
  1821. writing, respectively. The default values of these parameters do a
  1822. blocking @code{poll} on the port's file descriptor. The procedures are
  1823. passed the port in question as their one argument.
  1824. @end deffn
  1825. @node BOM Handling
  1826. @subsection Handling of Unicode Byte Order Marks
  1827. @cindex BOM
  1828. @cindex byte order mark
  1829. This section documents the finer points of Guile's handling of Unicode
  1830. byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
  1831. at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
  1832. determine the byte order. Occasionally, a BOM is found at the start of
  1833. a UTF-8 stream, but this is much less common and not generally
  1834. recommended.
  1835. Guile attempts to handle BOMs automatically, and in accordance with the
  1836. recommendations of the Unicode Standard, when the port encoding is set
  1837. to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
  1838. automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
  1839. and automatically consumes one from the start of a UTF-8, UTF-16, or
  1840. UTF-32 stream.
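At the byte level, consuming a BOM means recognizing the encoded form of U+FEFF at the start of the input. The following C sketch shows one way to do that. It is not Guile's implementation: the function and enum names are hypothetical, and the caller is assumed to know the port encoding already, expressed here as the size of a code unit.

```c
#include <assert.h>
#include <stddef.h>

enum byte_order { ORDER_UNKNOWN, ORDER_BIG, ORDER_LITTLE };

/* Inspect the first LEN bytes of a stream whose encoding has
   UNIT_SIZE-byte code units (1 for UTF-8, 2 for UTF-16, 4 for UTF-32).
   If they begin with a BOM in that encoding, return the number of
   bytes to skip and, for UTF-16 and UTF-32, store the byte order in
   *ORDER.  Otherwise return 0 and leave *ORDER unchanged.  */
size_t
consume_bom (const unsigned char *buf, size_t len, int unit_size,
             enum byte_order *order)
{
  assert (buf != NULL && order != NULL);

  switch (unit_size)
    {
    case 1:
      /* UTF-8 has no byte order to record.  */
      if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return 3;
      break;

    case 2:
      if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        {
          *order = ORDER_BIG;
          return 2;
        }
      if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        {
          *order = ORDER_LITTLE;
          return 2;
        }
      break;

    case 4:
      if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00
          && buf[2] == 0xFE && buf[3] == 0xFF)
        {
          *order = ORDER_BIG;
          return 4;
        }
      if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE
          && buf[2] == 0x00 && buf[3] == 0x00)
        {
          *order = ORDER_LITTLE;
          return 4;
        }
      break;
    }
  return 0;
}
```

Passing the code unit size matters because the little-endian UTF-32 BOM begins with the same two bytes as the little-endian UTF-16 BOM.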
  1841. As specified in the Unicode Standard, a BOM is only handled specially at
  1842. the start of a stream, and only if the port encoding is set to
  1843. @code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
  1844. set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
  1845. @code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
  1846. the special handling described in this section applies.
  1847. @itemize @bullet
  1848. @item
  1849. To ensure that Guile will properly detect the byte order of a UTF-16 or
  1850. UTF-32 stream, you must perform a textual read before any writes, seeks,
  1851. or binary I/O. Guile will not attempt to read a BOM unless a read is
  1852. explicitly requested at the start of the stream.
  1853. @item
  1854. If a textual write is performed before the first read, then an arbitrary
  1855. byte order will be chosen. Currently, big endian is the default on all
  1856. platforms, but that may change in the future. If you wish to explicitly
  1857. control the byte order of an output stream, set the port encoding to
  1858. @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
  1859. and explicitly write a BOM (@code{#\xFEFF}) if desired.
  1860. @item
  1861. If @code{set-port-encoding!} is called in the middle of a stream, Guile
  1862. treats this as a new logical ``start of stream'' for purposes of BOM
  1863. handling, and will forget about any BOMs that had previously been seen.
  1864. Therefore, it may choose a different byte order than had been used
  1865. previously. This is intended to support multiple logical text streams
  1866. embedded within a larger binary stream.
  1867. @item
  1868. Binary I/O operations are not guaranteed to update Guile's notion of
  1869. whether the port is at the ``start of the stream'', nor are they
  1870. guaranteed to produce or consume BOMs.
  1871. @item
  1872. For ports that support seeking (e.g. normal files), the input and output
  1873. streams are considered linked: if the user reads first, then a BOM will
  1874. be consumed (if appropriate), but later writes will @emph{not} produce a
  1875. BOM. Similarly, if the user writes first, then later reads will
  1876. @emph{not} consume a BOM.
  1877. @item
  1878. For ports that are not random access (e.g. pipes, sockets, and
  1879. terminals), the input and output streams are considered
  1880. @emph{independent} for purposes of BOM handling: the first read will
  1881. consume a BOM (if appropriate), and the first write will @emph{also}
  1882. produce a BOM (if appropriate). However, the input and output streams
  1883. will always use the same byte order.
  1884. @item
  1885. Seeks to the beginning of a file will set the ``start of stream'' flags.
  1886. Therefore, a subsequent textual read or write will consume or produce a
  1887. BOM. However, unlike @code{set-port-encoding!}, if a byte order had
  1888. already been chosen for the port, it will remain in effect after a seek,
  1889. and cannot be changed by the presence of a BOM. Seeks anywhere other
  1890. than the beginning of a file clear the ``start of stream'' flags.
  1891. @end itemize
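The rules in the list above can be summarized as a small piece of per-port bookkeeping. The following C sketch is a toy model of those rules for a port whose encoding is @code{UTF-16} or @code{UTF-32}, not a description of Guile's internals; every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum byte_order { ORDER_UNSET, ORDER_BIG, ORDER_LITTLE };

/* Hypothetical BOM bookkeeping for one port.  */
struct bom_state
{
  bool random_access;       /* true for seekable ports such as files */
  bool at_start_for_read;   /* next textual read may consume a BOM */
  bool at_start_for_write;  /* next textual write may produce a BOM */
  enum byte_order order;    /* byte order chosen so far, if any */
};

void
bom_state_init (struct bom_state *s, bool random_access)
{
  assert (s != NULL);
  s->random_access = random_access;
  s->at_start_for_read = true;
  s->at_start_for_write = true;
  s->order = ORDER_UNSET;
}

/* Before a textual read: return true if a BOM should be looked for.  */
bool
bom_before_read (struct bom_state *s)
{
  bool check = s->at_start_for_read;
  s->at_start_for_read = false;
  if (s->random_access)
    s->at_start_for_write = false;   /* input and output are linked */
  return check;
}

/* Record the byte order announced by a BOM that was just consumed.
   An order that was already chosen stays in effect.  */
void
bom_record_order (struct bom_state *s, enum byte_order seen)
{
  if (s->order == ORDER_UNSET)
    s->order = seen;
}

/* Before a textual write: return true if a BOM should be emitted.  */
bool
bom_before_write (struct bom_state *s)
{
  bool emit = s->at_start_for_write;
  s->at_start_for_write = false;
  if (s->random_access)
    s->at_start_for_read = false;    /* input and output are linked */
  if (s->order == ORDER_UNSET)
    s->order = ORDER_BIG;            /* the current default */
  return emit;
}

/* After a seek to absolute OFFSET.  The byte order is kept.  */
void
bom_after_seek (struct bom_state *s, long offset)
{
  s->at_start_for_read = (offset == 0);
  s->at_start_for_write = (offset == 0);
}

/* After a change of port encoding: a new logical start of stream.  */
void
bom_after_set_encoding (struct bom_state *s)
{
  s->at_start_for_read = true;
  s->at_start_for_write = true;
  s->order = ORDER_UNSET;
}
```

In this model a file port that reads first never emits a BOM on a later write, while a socket port handles its first read and its first write independently.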
  1892. @c Local Variables:
  1893. @c TeX-master: "guile.texi"
  1894. @c End: