<!doctype debiandoc system>
<!-- -*- mode: sgml; mode: fold -*- -->
<book>
<title>DSync File List Format</title>
<author>Jason Gunthorpe <email>jgg@debian.org</email></author>
<version>$Id: filelist.sgml,v 1.4 1999/11/15 07:59:49 jgg Exp $</version>
<abstract>
</abstract>
<copyright>
Copyright &copy; Jason Gunthorpe, 1998-1999.
<p>
DSync and this document are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at your
option) any later version.
<p>
For more details, on Debian GNU/Linux systems, see the file
/usr/doc/copyright/GPL for the full license.
</copyright>
<toc sect>
<chapt>Introduction
<!-- Purpose {{{ -->
<!-- ===================================================================== -->
<sect>Purpose
<p>
The DSync file list is a crucial part of the DSync system; it provides the
client with access to a list of files and file attributes for all the files
in a directory tree. Much information is compacted into the per-file structure
that may be used by the client in reconstructing the directory tree. In spirit
it is like the common ls-lR files that mirrors have, but in practice it is
radically different: most strikingly, it is stored in a compacted binary
format and may optionally contain MD5 hashes.
<p>
The file list for a directory tree may be either dynamically generated by the
server or generated only once like the ls-lR files. In fact, with a static
file list it is possible to use the <em>rsync method</> to transfer only the
differences in the list, which is a huge boon for sites with over 50000 files
in their directory trees.
<p>
Internally the file list is stored as a series of directory blocks in no set
order. Each block has a relative path from the base to the directory itself
and a list of all files in that directory. Things are not stored recursively
so that the client can have fixed memory usage when managing the list.
Depending on how the generator is configured the order of the directories
may be breadth first or depth first, or perhaps completely random. The client
should make no assumptions about the ordering of anything in the file.
<p>
Since the list may be generated on the fly by the server it is necessary for
it to be streamable. To this effect there will be no counts or sizes that
refer to anything outside of the current record. This assures that the
generator will be able to build a file list with negligible server-side
overhead. Furthermore a focus is placed on making things as small as possible;
to this end useful items like record length indicators are omitted. This
does necessarily limit the ability to handle format changes.
<!-- }}} -->
<chapt>Structure
<!-- Data Stream {{{ -->
<!-- ===================================================================== -->
<sect>Data Stream
<p>
The data stream is encoded as a series of variable length numbers, fixed
length numbers and strings. The use of variable length number encoding
was chosen to accommodate sites with over 100000 files, mostly below 16k;
using variable length encoding will save approximately 400k of data and still
allow some items that are very large.
<p>
Numbers are coded as a series of bytes of non-fixed length; the highest bit
of each byte is 1 if the next byte is part of this number. Bytes are ordered
from the least significant to the most significant in order to
simplify decoding; any omitted bits can be assumed to be 0. Clients should
decode into their largest type and fatally error if a number expands to
larger than that. All numbers are non-negative.
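<p>
A minimal Python sketch of this coding follows. It assumes that each byte
carries 7 payload bits below the continuation bit, which the text implies but
does not state outright; the names encode_number and decode_number and the
max_bits parameter are illustrative only and are not part of DSync.

```python
def encode_number(value):
    """Encode a non-negative integer, least significant 7 bits first."""
    if value < 0:
        raise ValueError("numbers in the file list are never negative")
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # high bit set: another byte follows
        else:
            out.append(low)         # high bit clear: last byte
            return bytes(out)


def decode_number(data, pos=0, max_bits=64):
    """Return (value, next_pos); fail if the value outgrows max_bits."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if value >> max_bits:
            raise OverflowError("number too large for this client")
        if not byte & 0x80:
            return value, pos
        shift += 7
```

Under these assumptions values below 128 occupy one byte and values below
16384 occupy two, which is where the saving for lists dominated by small
files would come from.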
<p>
Strings are coded in pascal form, with a length number preceding a series
of 8 bit characters making up the string. The strings are coded in UTF.
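<p>
The string coding can be sketched in the same way. This sketch makes two
assumptions that the text does not confirm: that "UTF" means UTF-8, and that
the length prefix counts bytes rather than characters. The helper names are
hypothetical.

```python
def write_number(value):
    """Variable length number, least significant 7 bits first."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def read_number(buf, pos):
    """Return (value, next_pos) for the number starting at pos."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def write_string(text):
    raw = text.encode("utf-8")          # assumption: UTF means UTF-8
    return write_number(len(raw)) + raw  # assumption: length is in bytes


def read_string(buf, pos):
    length, pos = read_number(buf, pos)
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[pos:end].decode("utf-8"), end
```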
<p>
The first records in the file should be a header record followed by any
include/exclude records to indicate how the list was generated. Following
that is the actual file list data.
<p>
The records all have the same form: they start with an 8 bit tag value and
then have the raw record data after. The main header has a set of flags for
all of the record types; these flags are used to designate optional portions
of the record. For instance a file record may not have an MD5 hash or uid/gid
values; those would be marked off in the flags. Generally every non-critical
value is optional. The records and their tags are as follows:
<list>
<item> 0 - Header
<item> 1 - Directory Marker
<item> 2 - Directory Start
<item> 3 - Directory End
<item> 4 - Normal File
<item> 5 - Symlink
<item> 6 - Device Special
<item> 7 - Directory
<item> 8 - Include/Exclude
<item> 9 - User Map
<item> 10 - Group Map
<item> 11 - Hard Link
<item> 12 - End Marker
<item> 13 - RSync Checksums
<item> 14 - Aggregate File
<item> 15 - RSync End
</list>
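<p>
For a client written in Python the tag values above could be captured as an
enumeration, as in the sketch below. The member names and the FILE_TYPES set
are illustrative choices; treating Hard Link as a file type record is an
assumption based on its structure matching Normal File.

```python
from enum import IntEnum


class Tag(IntEnum):
    HEADER = 0
    DIRECTORY_MARKER = 1
    DIRECTORY_START = 2
    DIRECTORY_END = 3
    NORMAL_FILE = 4
    SYMLINK = 5
    DEVICE_SPECIAL = 6
    DIRECTORY = 7
    INCLUDE_EXCLUDE = 8
    USER_MAP = 9
    GROUP_MAP = 10
    HARD_LINK = 11
    END_MARKER = 12
    RSYNC_CHECKSUMS = 13
    AGGREGATE_FILE = 14
    RSYNC_END = 15


# Records that describe an entry inside the current directory block.
FILE_TYPES = frozenset({
    Tag.NORMAL_FILE, Tag.SYMLINK, Tag.DEVICE_SPECIAL,
    Tag.DIRECTORY, Tag.HARD_LINK,
})


def read_tag(byte):
    """Map a raw tag byte to a Tag, rejecting values outside the table."""
    return Tag(byte)  # raises ValueError for an unknown tag
```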
<p>
The header record is placed first in the file followed by Directory records
and then by a number of file type records. The Directory Start/End are used
to indicate which directory the file records are in. The approach is to
create a bundle of file type records for each directory that are stored
non-recursively. The directory marker records are used with depth-first
traversal to create unseen directories with the proper permissions.
<!-- }}} -->
<!-- Header {{{ -->
<!-- ===================================================================== -->
<sect>Header
<p>
The header is the first record in the file and contains some information about
what will follow.
<example>
struct Header
{
   uint8 Tag; // 0 for the header
   uint32 Signature;
   uint16 MajorVersion;
   uint16 MinorVersion;
   number Epoch;
   uint8 FlagCount;
   uint32 Flags[12];
};
</example>
<taglist>
<tag>Signature<item>
This field should contain the hex value 0x97E78AB which designates the file
as a DSync file list. Like all fixed length numbers it should be stored in
network byte order.
<tag>MajorVersion
<tag>MinorVersion<item>
These two fields designate the revision of the format. The major version
should be increased if an incompatible change is made to the structure of
the file, otherwise the minor version should reflect any changes. The current
major/minor is 0 and 0. Compatibility issues are discussed later on.
<tag>Epoch<item>
In order to encode time in a single 32 bit signed integer the format uses a
shifting epoch. Epoch is set to a time in seconds from the unix
epoch. All other times are relative to this time.
In this way we can specify any date 68 years in either direction from any
possible time. Doing so allows us to encode time using only 32 bits. The
generator should either error or truncate if a time value exceeds this
representation. This does impose the limitation that the difference between
the lowest stored date and highest stored date must be no more than 136 years.
<tag>FlagCount<item>
This designates the number of items in the flag array.
<tag>Flags<item>
Each possible record type has a flag value that is used to indicate what
items the generator emitted. There is no per-record flag in order to save
space. The flag array is indexed by the record ID.
</taglist>
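<p>
A hedged sketch of reading this record in Python follows. It assumes that
exactly FlagCount flag words are present in the stream (the structure shows a
fixed array of 12, so this is a guess) and that the fixed length fields are
packed without padding. The function names are hypothetical.

```python
import struct

SIGNATURE = 0x97E78AB  # value given in the Signature description


def read_number(buf, pos):
    """Decode one variable length number; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def parse_header(buf):
    """Return (header_dict, position of the first record after the header)."""
    # ">" selects network byte order with no alignment padding.
    tag, signature, major, minor = struct.unpack_from(">BIHH", buf, 0)
    if tag != 0 or signature != SIGNATURE:
        raise ValueError("not a DSync file list")
    pos = struct.calcsize(">BIHH")
    epoch, pos = read_number(buf, pos)
    flag_count = buf[pos]
    pos += 1
    # Assumption: only FlagCount entries of the Flags array are stored.
    flags = struct.unpack_from(">%dI" % flag_count, buf, pos)
    pos += 4 * flag_count
    header = {"major": major, "minor": minor, "epoch": epoch, "flags": flags}
    return header, pos
```

The returned flags tuple is indexed by record tag, so flags[4] would hold the
optional-field bits for Normal File records.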
<!-- }}} -->
<!-- Directory Marker {{{ -->
<!-- ===================================================================== -->
<sect>Directory Marker, Directory Start and Directory
<p>
The purpose of the directory marker record is to specify directories that
must be created before a directory start record can be processed. It is needed
to ensure the correct permissions and ownership are generated while the
contents are in transfer.
<p>
A Directory Start record serves to indicate a change of directory. All further
file type records will refer to the named directory until a Directory End
record is processed marking the final modification for this directory. It is
not possible to nest Directory Start directives; in fact a Directory Start
record implies a Directory End record for the previously started directory.
<p>
The plain directory record is a file type record that refers to a directory
file type. All of these record types describe the same thing used in different
contexts, so they share the same structure.
<example>
struct DirMarker
{
   uint8 Tag; // 1, 2 or 7 depending on the record type
   uint32 ModTime;
   uint16 Permissions;
   number User;
   number Group;
   string Path;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
Optional portions of the structure are Permissions (1&lt;&lt;0) and user/group
(1&lt;&lt;1). The bit is set to 1 if they are present.
<tag>ModTime<item>
This is the number of seconds since the file list epoch; it is the modification
date of the directory.
<tag>Permissions<item>
This is the standard unix permissions in the usual format.
<tag>User
<tag>Group<item>
These are the standard unix user/group for the directory. They are indirected
through the user/group maps described later on.
<tag>Path<item>
The path from the base of the file list to the directory this record describes.
However ordinary directory types have a single name relative to the last
Directory Start record.
</taglist>
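<p>
The way the header flags select optional members can be sketched as follows.
This is an illustration only: it assumes ModTime is a signed, network byte
order offset from the header Epoch (the Epoch description speaks of a signed
32 bit value while the structure declares uint32) and that Path is UTF-8 with
a byte length prefix. The function and constant names are hypothetical.

```python
import struct

PERMISSIONS = 1 << 0   # flag bits from the description above
USER_GROUP = 1 << 1


def read_number(buf, pos):
    """Decode one variable length number; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def read_dir_record(buf, pos, flags, epoch):
    """Parse a tag 1, 2 or 7 record at pos; return (record, next_pos)."""
    tag = buf[pos]
    if tag not in (1, 2, 7):
        raise ValueError("not a directory record")
    # Assumption: ModTime is a signed big-endian offset from Epoch.
    (offset,) = struct.unpack_from(">i", buf, pos + 1)
    pos += 5
    record = {"tag": tag, "mtime": epoch + offset}
    if flags & PERMISSIONS:
        (record["permissions"],) = struct.unpack_from(">H", buf, pos)
        pos += 2
    if flags & USER_GROUP:
        record["user"], pos = read_number(buf, pos)
        record["group"], pos = read_number(buf, pos)
    length, pos = read_number(buf, pos)
    record["path"] = buf[pos:pos + length].decode("utf-8")
    return record, pos + length
```

The flags argument would be the entry of the header flag array for the tag
being read, and mtime comes out as seconds since the unix epoch.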
<!-- }}} -->
<!-- Directory End {{{ -->
<!-- ===================================================================== -->
<sect>Directory End
<p>
The purpose of the directory end marker is to signify that there will be no
more file type records from this directory. Directory Start and Directory
End records must be paired. The intent of this record is to allow future
expansion, NOT to allow recursive directory blocks. A Directory Start
record will imply a Directory End record if the previous was not terminated.
<p>
There are no data members; it is the basic 1 item record. If the data stream
terminates with an open directory block it is assumed to be truncated and
an error is issued.
<!-- }}} -->
<!-- Normal File {{{ -->
<!-- ===================================================================== -->
<sect>Normal File
<p>
A normal file is a simple, regular file. It has the standard set of unix
attributes and an optional MD5 hash for integrity checking.
<example>
struct NormalFile
{
   uint8 Tag; // 4
   uint32 ModTime;
   uint16 Permissions;
   number User;
   number Group;
   string Name;
   number Size;
   uint128 MD5;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
Optional portions of the structure are Permissions (1&lt;&lt;0), user/group
(1&lt;&lt;1), and MD5 (1&lt;&lt;2). The bit is set to 1 if they are present.
<tag>ModTime<item>
This is the number of seconds since the file list epoch; it is the modification
date of the file.
<tag>Permissions<item>
This is the standard unix permissions in the usual format.
<tag>User
<tag>Group<item>
These are the standard unix user/group for the file. They are indirected
through the user/group maps described later on.
<tag>Name<item>
The name of the item. It should have no pathname components and is relative
to the last Directory Start record.
<tag>MD5<item>
This is an MD5 hash of the file.
<tag>Size<item>
This is the size of the file in bytes.
</taglist>
<!-- }}} -->
<!-- Symlink {{{ -->
<!-- ===================================================================== -->
<sect>Symlink
<p>
This encodes a normal unix symbolic link. Symlinks do not have permissions
or size, but do have optional ownership.
<example>
struct Symlink
{
   uint8 Tag; // 5
   uint32 ModTime;
   number User;
   number Group;
   string Name;
   uint8 Compression;
   string To;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
The only optional portion of the structure is user/group
(1&lt;&lt;0). The bit is set to 1 if they are present.
<tag>ModTime<item>
This is the number of seconds since the file list epoch; it is the modification
date of the symlink.
<tag>User
<tag>Group<item>
These are the standard unix user/group for the symlink. They are indirected
through the user/group maps described later on.
<tag>Name<item>
The name of the item. It should have no pathname components and is relative
to the last Directory Start record.
<tag>Compression<item>
Common use of symlinks makes them very easy to compress; the compression
byte allows this. It is an 8 bit byte with the first 7 bits representing an
unsigned number and the 8th bit being a flag. The first 7 bits describe
how many bytes of the last symlink should be prepended to To, and if the 8th
bit is set then Name is appended to To.
<tag>To<item>
This is the file the symlink is pointing to. It is an absolute string taken
as is. The client may perform checking on it if desired. The string is
compressed as described in the Compression field.
</taglist>
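<p>
The expansion of a compressed symlink target might look like the sketch below.
It assumes that the "first 7 bits" are the low-order bits (mask 0x7F), that
the flag is the high bit (0x80), and that "the last symlink" means the fully
expanded target of the previous Symlink record. None of these readings is
confirmed by the text, and the function name is hypothetical.

```python
def expand_symlink(compression, to, name, last_target):
    """Rebuild the full link target from a Symlink record.

    compression -- the raw Compression byte
    to          -- the stored To string (bytes)
    name        -- the stored Name string (bytes)
    last_target -- expanded target of the previous symlink (bytes)
    """
    keep = compression & 0x7F          # assumption: count is in the low 7 bits
    target = last_target[:keep] + to   # reuse a prefix of the previous target
    if compression & 0x80:             # assumption: flag is the high bit
        target += name                 # target ends with the link's own name
    return target
```

For example, with a previous target of b"../pool/a/apt.deb", a Compression byte
of 0x88, To set to b"b/" and Name set to b"bash.deb", this sketch yields
b"../pool/b/bash.deb".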
<!-- }}} -->
<!-- Device Special {{{ -->
<!-- ===================================================================== -->
<sect>Device Special
<p>
Device Special records encode unix device special files, which have a major
and a minor number corresponding to some OS specific attribute. These also
encode fifo files, anything that can be created by mknod.
<example>
struct DeviceSpecial
{
   uint8 Tag; // 6
   uint32 ModTime;
   uint16 Permissions;
   number User;
   number Group;
   number Dev;
   string Name;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
The only optional portion of the structure is user/group
(1&lt;&lt;0). The bit is set to 1 if they are present.
<tag>ModTime<item>
This is the number of seconds since the file list epoch; it is the modification
date of the file.
<tag>Permissions<item>
This non-optional field is used to encode the type of device and the
creation permissions.
<tag>Dev<item>
This is the OS specific 'dev_t' field for mknod.
<tag>Major
<tag>Minor<item>
These are the OS dependent device numbers. They are not separate members of
the structure; they are carried in the Dev field.
<tag>Name<item>
The name of the item. It should have no pathname components and is relative
to the last Directory Start record.
</taglist>
<!-- }}} -->
<!-- Include/Exclude {{{ -->
<!-- ===================================================================== -->
<sect>Include and Exclude
<p>
The include/exclude list used to generate the file list is encoded after
the header record. It is stored as an ordered set of include/exclude records
acting as a filter. If no record matches then the pathname is assumed to
be included; otherwise the first matching record decides.
<example>
struct IncludeExclude
{
   uint8 Tag; // 8
   uint8 Type;
   string Pattern;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
None defined.
<tag>Type<item>
This is the sort of rule; presently 1 is an include rule and 2 is an exclude
rule.
<tag>Pattern<item>
This is the textual pattern used for matching.
</taglist>
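<p>
The first-match rule can be sketched as below. The pattern syntax is not
defined in this document, so the sketch assumes shell-style globs as
implemented by Python's fnmatch module purely for illustration; the function
name is hypothetical.

```python
from fnmatch import fnmatchcase

INCLUDE = 1  # Type values from the description above
EXCLUDE = 2


def is_included(path, rules):
    """Apply an ordered list of (Type, Pattern) rules to a pathname."""
    for rule_type, pattern in rules:
        # Assumption: patterns are shell-style globs.
        if fnmatchcase(path, pattern):
            return rule_type == INCLUDE  # the first matching rule decides
    return True                          # no rule matched: included
```

With the rules [(INCLUDE, "dists/*"), (EXCLUDE, "*.orig")] a path under dists/
is kept even if it ends in .orig, because the include rule is seen first.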
<!-- }}} -->
<!-- User/Group Map {{{ -->
<!-- ===================================================================== -->
<sect>User/Group Map
<p>
In order to properly transfer users and groups the names are converted from
a local number into a file list number and a number to name mapping. When
the remote side reads the file list it directs all UID/GID translations
through the mapping to create the real names and then does a local lookup.
This also provides some compression in the file list as large UIDs are
converted into smaller values through the mapping.
<p>
The generator is expected to emit these records at any place before the IDs
are actually used.
<example>
struct NameMap
{
   uint8 Tag; // 9,10
   number FileID;
   number RealID;
   string Name;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
The only optional portion of the structure is RealID (1&lt;&lt;0).
<tag>FileID<item>
This is the ID used internally in the file list; it should be monotonically
increasing each time a Map record is created so that it is small and unique.
<tag>RealID<item>
This is the ID used in the filesystem on the generating end. This information
may be used if the user selected to regenerate IDs without translation.
</taglist>
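<p>
One way to picture the mapping is the sketch below, which shows a generator
handing out FileIDs and a client turning a FileID back into a local ID. The
class and function names are hypothetical, starting FileIDs at 0 is an
assumption, and the local lookup is modelled as a plain dictionary rather
than a real password database query.

```python
class IdMap:
    """Generator side: assign small file list IDs to local IDs."""

    def __init__(self):
        self._by_real = {}
        self.records = []  # (FileID, RealID, Name) in emission order

    def file_id(self, real_id, name):
        if real_id not in self._by_real:
            new_id = len(self._by_real)  # grows by one per new Map record
            self._by_real[real_id] = new_id
            # A Map record must reach the stream before the ID is used.
            self.records.append((new_id, real_id, name))
        return self._by_real[real_id]


def resolve(file_id, map_records, local_ids):
    """Client side: FileID -> name -> local numeric ID (None if unknown)."""
    names = {fid: name for fid, _real, name in map_records}
    return local_ids.get(names[file_id])
```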
<!-- }}} -->
<!-- Hard Link {{{ -->
<!-- ===================================================================== -->
<sect>Hard Link
<p>
A hard link record is used to record a file that is participating in a hard
link. The only information we know about the link is the inode and device
on the local machine, so we store this information. The client will have to
reconstruct the linkages if possible.
<example>
struct HardLink
{
   uint8 Tag; // 11
   uint32 ModTime;
   number Serial;
   uint16 Permissions;
   number User;
   number Group;
   string Name;
   number Size;
   uint128 MD5;
};
</example>
<taglist>
<tag>Flags [from the header]<item>
Optional portions of the structure are Permissions (1&lt;&lt;0), user/group
(1&lt;&lt;1), and MD5 (1&lt;&lt;2). The bit is set to 1 if they are present.
<tag>ModTime<item>
This is the number of seconds since the file list epoch; it is the modification
date of the file.
<tag>Serial<item>
This is the unique ID number for the hardlink. It is composed from the
device inode pair in a generator dependent way. The exact nature of the
value is unimportant, only that two hard link records with the same serial
should be linked together. It is recommended that the generator compress
hard link serial numbers into small monotonically increasing IDs.
<tag>Permissions<item>
This is the standard unix permissions in the usual format.
<tag>User
<tag>Group<item>
These are the standard unix user/group for the file. They are indirected
through the user/group maps described later on.
<tag>Name<item>
The name of the item. It should have no pathname components and is relative
to the last Directory Start record.
<tag>MD5<item>
This is an MD5 hash of the file.
<tag>Size<item>
This is the size of the file in bytes.
</taglist>
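<p>
A client could group Hard Link records by Serial along the lines of the sketch
below. It is a simplification: it works on full paths that the caller has
already built from the current Directory Start, and it only plans actions
instead of touching the filesystem. The function name and the action tuples
are hypothetical.

```python
def plan_hard_links(records):
    """Turn (Serial, path) pairs into fetch/link actions.

    The first record seen for a Serial is treated as the file to obtain;
    every later record with the same Serial becomes a link to it.
    """
    first_seen = {}  # Serial -> path of the first file with that serial
    actions = []
    for serial, path in records:
        if serial in first_seen:
            actions.append(("link", first_seen[serial], path))
        else:
            first_seen[serial] = path
            actions.append(("fetch", path))
    return actions
```

The first_seen dictionary is the hard link map the client has to keep in
memory for the duration of the run.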
<!-- }}} -->
<!-- End Marker {{{ -->
<!-- ===================================================================== -->
<sect>End Marker
<p>
The End Marker is the final record in the stream; if it is missing the stream
is assumed to be incomplete.
<example>
struct Trailer
{
   uint8 Tag; // 12 for the end marker
   uint32 Signature;
};
</example>
<taglist>
<tag>Signature<item>
This field should contain the hex value 0xBA87E79 which is designed to
prevent a corrupted stream from being mistaken for a legitimate end marker.
</taglist>
<!-- }}} -->
<!-- RSync Checksums {{{ -->
<!-- ===================================================================== -->
<sect>RSync Checksums
<p>
The checksum record contains the list of checksums for a file and represents
the start of an RSync description block which may contain RSync Checksums,
a Normal File entry or Aggregate File records.
<example>
struct RSyncChecksums
{
   uint8 Tag; // 13
   number BlockSize;
   number FileSize;
   uint160 Sums[ceil(FileSize/BlockSize)];
};
</example>
<taglist>
<tag>BlockSize<item>
The size of each block in the stream in bytes.
<tag>FileSize<item>
The total size of the file in bytes.
<tag>Sums<item>
The actual checksum data. The format has the lower 32 bits as the weak
checksum and the upper 128 bits as the strong checksum.
</taglist>
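<p>
The size of the Sums array and the split of one entry can be sketched as
follows. The sketch assumes each 160 bit entry is stored as 20 bytes in
network byte order, so that the weak checksum occupies the last four bytes;
the document does not spell out the byte layout, and the function names are
hypothetical.

```python
def sum_count(file_size, block_size):
    """Number of entries in Sums: ceil(FileSize / BlockSize)."""
    return (file_size + block_size - 1) // block_size


def split_sum(entry):
    """Split one 20 byte entry into (weak, strong)."""
    if len(entry) != 20:
        raise ValueError("a checksum entry is 160 bits")
    value = int.from_bytes(entry, "big")   # assumption: network byte order
    weak = value & 0xFFFFFFFF              # lower 32 bits
    strong = (value >> 32).to_bytes(16, "big")  # upper 128 bits
    return weak, strong
```

Under this layout a file of 10 bytes with a block size of 4 has three entries,
the last covering a short block of 2 bytes.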
<!-- }}} -->
<!-- Aggregate File {{{ -->
<!-- ===================================================================== -->
<sect>Aggregate File
<p>
If the generator was given a list of included files this record will be
emitted after the rsync checksum record, once for each file. The given
paths are files that are likely to contain fragments of the larger file.
<example>
struct AggregateFile
{
   uint8 Tag; // 14 for this record
   string File;
};
</example>
<taglist>
<tag>File<item>
The stored filename.
</taglist>
<!-- }}} -->
<!-- RSync End {{{ -->
<!-- ===================================================================== -->
<sect>RSync End
<p>
The purpose of the RSync End record is to signify that the RSync data
is finished. RSync blocks begin with the RSync checksum record, then are
typically followed by a Normal File record describing the name and attributes
of the file and then optionally followed by a set of Aggregate File records.
<p>
There are no data members; it is the basic 1 item record. If the data stream
terminates with an open block it is assumed to be truncated and an error
is issued.
<!-- }}} -->
<chapt>The Client
<!-- Handling Compatibility {{{ -->
<!-- ===================================================================== -->
<sect>Handling Compatibility
<p>
The format has no provision for making backwards compatible changes, even
minor ones. What was provided is a way to make a generator that is both
forwards and backwards compatible with clients; this is done by disabling
generation of unsupported items and masking them off in the flags.
<p>
To deal with this a client should examine the header and determine if it has
a suitable major version; the minor version should largely be ignored. The
client should then examine the flags values and, for all records it
understands, ensure that no bits are set that it does not understand. Records
that it cannot handle should be ignored at this point. When the client is
parsing it should abort if it hits a record it does not support.
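<p>
The checks described above might be arranged as in the following sketch. The
table of known flag bits is taken from the record descriptions in this
document; the choice of which tags the client supports, and all names, are
hypothetical.

```python
SUPPORTED_MAJOR = 0

# Flag bits this hypothetical client understands, indexed by record tag.
KNOWN_FLAGS = {
    1: 0b011, 2: 0b011, 7: 0b011,  # directory records
    4: 0b111, 11: 0b111,           # normal file, hard link
    5: 0b001, 6: 0b001,            # symlink, device special
    8: 0b000,                      # include/exclude
    9: 0b001, 10: 0b001,           # user and group maps
}
# Records without optional parts that the client also handles.
KNOWN_TAGS = set(KNOWN_FLAGS) | {0, 3, 12}


def check_header(major, flags):
    """Reject a list whose header asks for more than the client knows."""
    if major != SUPPORTED_MAJOR:
        raise ValueError("unsupported major version %d" % major)
    for tag, known in KNOWN_FLAGS.items():
        value = flags[tag] if tag < len(flags) else 0
        if value & ~known:
            raise ValueError("record %d uses unknown optional fields" % tag)
    # Flags of tags outside KNOWN_FLAGS are deliberately not inspected.


def check_tag(tag):
    """Called for every record while parsing; abort on unsupported ones."""
    if tag not in KNOWN_TAGS:
        raise ValueError("unsupported record tag %d" % tag)
```

Because records carry no length, the client cannot skip an unknown record,
which is why check_tag has to abort rather than continue.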
<!-- }}} -->
<!-- Client Requirements {{{ -->
<!-- ===================================================================== -->
<sect>Client Requirements
<p>
The client attempting to verify synchronicity of a local file tree and a
tree described in a file list must do three things: look for extra local files,
manage the UID/GID mappings and maintain a map of hardlinks. These items
correspond to the only necessary memory usage on the client.
<p>
It is expected that the client will use the timestamp, size and possibly
MD5 hash to match the local file against the remote one to decide if it
should be retrieved.
<p>
Hardlinks are difficult to handle, but represent a very useful feature. The
client should track all hard links until they are associated with a local
file+inode; then all future links to that remote inode can be recreated
locally.
<!-- }}} -->
<chapt>RSync Method
<!-- Overview {{{ -->
<!-- ===================================================================== -->
<sect>Overview
<p>
The <em>rsync method</> was invented by Andrew Tridgell and originally
implemented in the rsync program. DSync has a provision to make use of the
<em>rsync method</> for transferring differences between files efficiently;
however the implementation is not as bandwidth efficient as what the rsync
program uses, because emphasis is placed on generator efficiency.
<p>
Primarily the <em>rsync method</> makes use of a series of weak and strong
block checksums for each block in a file. Blocks are a uniform size and
are uniformly distributed about the source file. In order to minimize server
loading the checksum data is generated for the file on the server and then
sent to the client - this might optionally be done from a cached file. The
client is responsible for performing the checksumming and searching on its
end.
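<p>
The client-side search can be illustrated with the sketch below. This document
does not define the weak or strong checksum algorithms, so the sketch borrows
an rsync-style 32 bit sum for the weak checksum and MD5 for the 128 bit strong
checksum as stand-ins; both are assumptions, as are all names. For clarity the
weak sum is recomputed at every offset instead of being rolled.

```python
import hashlib


def weak_sum(block):
    """Assumed weak checksum: two 16 bit sums packed into 32 bits."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def build_table(remote, block_size):
    """Server side: weak sum -> {strong digest: block index}."""
    table = {}
    for start in range(0, len(remote), block_size):
        block = remote[start:start + block_size]
        strong = hashlib.md5(block).digest()
        table.setdefault(weak_sum(block), {})[strong] = start // block_size
    return table


def find_blocks(local, table, block_size):
    """Client side: map remote block index -> offset in the local data."""
    found = {}
    pos = 0
    while pos + block_size <= len(local):
        window = local[pos:pos + block_size]
        candidates = table.get(weak_sum(window))
        if candidates:
            # Only pay for the strong checksum when the weak one matches.
            index = candidates.get(hashlib.md5(window).digest())
            if index is not None:
                found[index] = pos
                pos += block_size
                continue
        pos += 1
    return found
```

Blocks missing from the result are the ones the client would still have to
fetch from the server.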
<p>
In contrast rsync has the client send its checksums to the server and the
server sends back commands to reconstruct the file. This is more bandwidth
efficient because only one round trip is required and there is a higher chance
that more blocks will be matched and not need to be sent to the client.
<p>
Furthermore a feature designed for use by CD images is provided where a file
can be specified as the aggregation of many smaller files. The aggregated
files are specified only by giving the file name. The client is expected to
read the file (probably from the network) and perform checksum searching
against the provided table.
<!-- }}} -->
<!-- CD Images {{{ -->
<!-- ===================================================================== -->
<sect>CD Images
<p>
The primary and most complex use of the rsync data is for forming CD images
on the fly from a mirror and a CD source. This is extremely useful because
CD images take up a lot of space and bandwidth to mirror, while they are
merely aggregates of (possibly) already mirrored data. Using checksums
and a file listing allows the CD image to be reconstructed from any mirror
and reduces the loading on primary CD image servers.
<p>
The next use of checksums is to 'freshen' a CD image during development. If
an image is already present that contains a subset of the required data the
checksums generally allow a large percentage of that data to be reused.
<p>
Since the client is responsible for reconstruction and checksum searching it
is possible to perform in place reconstruction and in place initial generation
that does not require a (large!) temporary file.
<!-- }}} -->
</book>