  1. obfs4 (The obfourscator)
  2. 0. Introduction
  3. This is a protocol obfuscation layer for TCP protocols. Its purpose is to
  4. keep a third party from telling what protocol is in use based on message
  5. contents.
  6. Unlike obfs3, obfs4 attempts to provide authentication and data integrity,
  7. though it is still designed primarily around providing a layer of
  8. obfuscation for an existing authenticated protocol like SSH or TLS.
  9. Like obfs3 and ScrambleSuit, the protocol has 2 phases: in the first phase
  10. both parties establish keys. In the second, the parties exchange
  11. super-enciphered traffic.
  12. 1. Motivation
  13. ScrambleSuit [0] has been developed with the aim of improving the obfs3 [1]
  14. protocol to provide resilience against active attackers and to disguise
  15. flow signatures.
  16. ScrambleSuit, like the existing obfs3 protocol, uses UniformDH for the
  17. cryptographic handshake, which has severe performance implications because
  18. modular exponentiation is an expensive operation. Additionally, the key
  19. exchange is not authenticated, so active attackers can mount a
  20. man-in-the-middle attack, assuming they know the client/bridge
  21. shared secret (k_B).
  22. obfs4 attempts to address these shortcomings by using an authenticated key
  23. exchange mechanism based around the Tor Project's ntor handshake [2].
  24. Obfuscation of the Curve25519 public keys transmitted over the wire is
  25. accomplished via the Elligator 2 mapping [3].
  26. 2. Threat Model
  27. The threat model of obfs4 is the threat model of obfs2 [4] with added
  28. goals/modifications:
  29. obfs4 offers protection against passive Deep Packet Inspection machines
  30. that expect the obfs4 protocol. Such machines should not be able to verify
  31. the existence of the obfs4 protocol without obtaining the server's Node ID
  32. and identity public key.
  33. obfs4 offers protection against active attackers attempting to probe for
  34. obfs4 servers. Such machines should not be able to verify the existence
  35. of an obfs4 server without obtaining the server's Node ID and identity
  36. public key.
  37. obfs4 offers protection against active attackers that have obtained the
  38. server's Node ID and identity public key. Such machines should not be
  39. able to impersonate the server without obtaining the server's identity
  40. private key.
  41. obfs4 offers protection against some non-content protocol fingerprints,
  42. specifically the packet size, and optionally packet timing.
  43. obfs4 provides integrity and confidentiality of the underlying traffic,
  44. and authentication of the server.
  45. 3. Notation and Terminology
  46. All Curve25519 keys and Elligator 2 representatives are transmitted in the
  47. Little Endian representation, for ease of integration with current
  48. Curve25519 and Elligator 2 implementations. All other numeric fields are
  49. transmitted as Big Endian (Network byte order) values.
  50. HMAC-SHA256-128(k, s) is the HMAC-SHA256 digest of s with k as the key,
  51. truncated to 128 bits.
  52. x | y is the concatenation of x and y.
  53. A "byte" is an 8-bit octet.
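As a minimal sketch of the HMAC-SHA256-128 notation above, the following Python uses only the standard library. The function name hmac_sha256_128 is an illustrative choice, and keeping the leftmost 16 bytes of the digest is an assumption (the usual HMAC truncation convention); this specification does not spell out which bits are kept.

```python
import hashlib
import hmac


def hmac_sha256_128(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 of msg under key, truncated to 128 bits.

    Assumption: truncation keeps the leftmost 16 bytes of the digest.
    """
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


# Concatenation "x | y" in the spec corresponds to bytes concatenation.
tag = hmac_sha256_128(b"key material", b"x" + b"y")
```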
  54. 4. Key Establishment Phase
  55. As part of the configuration, all obfs4 servers have a 20 byte Node ID
  56. (NODEID) and Curve25519 keypair (B,b) that is used to establish that the
  57. client knows about a given server and to authenticate the server.
  58. The server distributes the public component of the identity key (B) and
  59. NODEID to the client via an out-of-band mechanism.
  60. Data sent as part of the handshake are padded to random lengths to attempt to
  61. obfuscate the initial flow signature. The constants used are as follows:
  62. MaximumHandshakeLength = 8192
  63. Maximum size of a handshake request or response, including padding.
  64. MarkLength = 16
  65. Length of M_C/M_S (an HMAC-SHA256-128 digest).
  66. MACLength = 16
  67. Length of MAC_C/MAC_S (an HMAC-SHA256-128 digest).
  68. RepresentativeLength = 32
  69. Length of an Elligator 2 representative of a Curve25519 public key.
  70. AuthLength = 32
  71. Length of the ntor AUTH tag (an HMAC-SHA256 digest).
  72. InlineSeedFrameLength = 45
  73. Length of an unpadded TYPE_PRNG_SEED frame.
  74. ServerHandshakeLength = 96
  75. The length of the non-padding data in a handshake response.
  76. RepresentativeLength + AuthLength + MarkLength + MACLength
  77. ServerMaxPadLength = 8096
  78. The maximum amount of padding in a handshake response.
  79. MaximumHandshakeLength - ServerHandshakeLength
  80. ServerMinPadLength = InlineSeedFrameLength
  81. The minimum amount of padding in a handshake response.
  82. ClientHandshakeLength = 64
  83. The length of the non-padding data in a handshake request.
  84. RepresentativeLength + MarkLength + MACLength
  85. ClientMinPadLength = 77
  86. The minimum amount of padding in a handshake request.
  87. (ServerHandshakeLength + ServerMinPadLength) - ClientHandshakeLength
  88. ClientMaxPadLength = 8128
  89. The maximum amount of padding in a handshake request.
  90. MaximumHandshakeLength - ClientHandshakeLength
  91. The amount of padding is chosen such that the smallest possible request and
  92. response (requests and responses with the minimum amount of padding) are
  93. equal in size. For details on the InlineSeedFrameLength, see section 6.
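The derived constants can be checked by evaluating the formulas exactly as they are written above. This is only an arithmetic sketch; the upper-case Python names are illustrative spellings of the constants in this section.

```python
# Base constants, copied from the list above.
MAXIMUM_HANDSHAKE_LENGTH = 8192
MARK_LENGTH = 16
MAC_LENGTH = 16
REPRESENTATIVE_LENGTH = 32
AUTH_LENGTH = 32
INLINE_SEED_FRAME_LENGTH = 45

# Derived constants, evaluated from the stated formulas.
SERVER_HANDSHAKE_LENGTH = (
    REPRESENTATIVE_LENGTH + AUTH_LENGTH + MARK_LENGTH + MAC_LENGTH
)
SERVER_MAX_PAD_LENGTH = MAXIMUM_HANDSHAKE_LENGTH - SERVER_HANDSHAKE_LENGTH
SERVER_MIN_PAD_LENGTH = INLINE_SEED_FRAME_LENGTH

CLIENT_HANDSHAKE_LENGTH = REPRESENTATIVE_LENGTH + MARK_LENGTH + MAC_LENGTH
CLIENT_MIN_PAD_LENGTH = (
    SERVER_HANDSHAKE_LENGTH + SERVER_MIN_PAD_LENGTH
) - CLIENT_HANDSHAKE_LENGTH
CLIENT_MAX_PAD_LENGTH = MAXIMUM_HANDSHAKE_LENGTH - CLIENT_HANDSHAKE_LENGTH

# The smallest request and the smallest response should be the same size.
smallest_request = CLIENT_HANDSHAKE_LENGTH + CLIENT_MIN_PAD_LENGTH
smallest_response = SERVER_HANDSHAKE_LENGTH + SERVER_MIN_PAD_LENGTH
```

With these formulas the equal-size property holds by construction, since ClientMinPadLength is defined as the difference between the smallest response and the unpadded request.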
  94. The client handshake process is as follows.
  95. 1. The client generates an ephemeral Curve25519 keypair X,x and an
  96. Elligator 2 representative of the public component X'.
  97. 2. The client sends a handshake request to the server where:
  98. X' = Elligator 2 representative of X (32 bytes)
  99. P_C = Random padding [ClientMinPadLength, ClientMaxPadLength] bytes
  100. M_C = HMAC-SHA256-128(B | NODEID, X')
  101. E = String representation of the number of hours since the UNIX
  102. epoch
  103. MAC_C = HMAC-SHA256-128(B | NODEID, X' | P_C | M_C | E)
  104. clientRequest = X' | P_C | M_C | MAC_C
  105. 3. The client receives the serverResponse from the server.
  106. 4. The client derives M_S from the serverResponse and uses it to locate
  107. MAC_S in the serverResponse. It then calculates MAC_S and compares it
  108. with the value received from the server. If M_S cannot be found or the
  109. MAC_S values do not match, the client MUST drop the connection.
  110. 5. The client derives Y from Y' via the Elligator 2 map in the reverse
  111. direction.
  112. 6. The client completes the client side of the ntor handshake, deriving
  113. the 256 bit shared secret (KEY_SEED), and the authentication tag
  114. (AUTH). The client then compares the derived value of AUTH with that
  115. contained in the serverResponse. If the AUTH values do not match, the
  116. client MUST drop the connection.
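Step 2 of the client process can be sketched as follows. This is a hedged illustration, not a reference implementation: the function names are hypothetical, the Elligator 2 representative X' is taken as an opaque 32-byte input (computing it needs a Curve25519/Elligator 2 library that is out of scope here), and E is assumed to be the decimal ASCII string of the hour count.

```python
import hashlib
import hmac
import os
import time


def hmac_sha256_128(key: bytes, msg: bytes) -> bytes:
    # Assumption: truncation keeps the leftmost 16 bytes.
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def epoch_hours(now=None) -> bytes:
    """E: whole hours since the UNIX epoch (decimal ASCII is an assumption)."""
    seconds = time.time() if now is None else now
    return str(int(seconds) // 3600).encode("ascii")


def build_client_request(B: bytes, node_id: bytes, x_repr: bytes,
                         pad_len: int, e: bytes) -> bytes:
    """Assemble clientRequest = X' | P_C | M_C | MAC_C.

    The caller picks pad_len in [ClientMinPadLength, ClientMaxPadLength].
    """
    if len(x_repr) != 32:
        raise ValueError("X' must be a 32-byte representative")
    key = B + node_id                       # HMAC key is B | NODEID
    padding = os.urandom(pad_len)           # P_C
    mark = hmac_sha256_128(key, x_repr)     # M_C
    # E is covered by the MAC but is not itself transmitted.
    mac = hmac_sha256_128(key, x_repr + padding + mark + e)
    return x_repr + padding + mark + mac
```

Note that E only enters the MAC input, so the server has to reconstruct it from its own clock.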
  117. The server handshake process is as follows.
  118. 1. The server receives the clientRequest from the client.
  119. 2. The server derives M_C from the clientRequest and uses it to locate
  120. MAC_C in the clientRequest. It then calculates MAC_C and compares it
  121. with the value received from the client. If M_C cannot be found or the
  122. MAC_C values do not match, the server MUST stop processing data from
  123. the client.
  124. Implementations MUST derive and compare multiple values of MAC_C with
  125. "E = {E - 1, E, E + 1}" to account for clock skew between the client
  126. and server.
  127. In the event of a failure at this point, implementations SHOULD delay
  128. dropping the TCP connection from the client by a random interval to
  129. make active probing more difficult.
  130. 3. The server derives X from X' via the Elligator 2 map in the reverse
  131. direction.
  132. 4. The server generates an ephemeral Curve25519 keypair Y, y and an
  133. Elligator 2 representative of the public component Y'.
  134. 5. The server completes the server side of the ntor handshake, deriving
  135. the 256 bit shared secret (KEY_SEED), and the authentication tag
  136. (AUTH).
  137. 6. The server sends a handshake response to the client where:
  138. Y' = Elligator 2 Representative of Y (32 bytes)
  139. AUTH = The ntor authentication tag (32 bytes)
  140. P_S = Random padding [ServerMinPadLength, ServerMaxPadLength] bytes
  141. M_S = HMAC-SHA256-128(B | NODEID, Y')
  142. E' = E from the client request
  143. MAC_S = HMAC-SHA256-128(B | NODEID, Y' | AUTH | P_S | M_S | E')
  144. serverResponse = Y' | AUTH | P_S | M_S | MAC_S
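Step 2 of the server process (locating M_C, then checking MAC_C for E - 1, E and E + 1) might look like the sketch below. The function name and return convention are hypothetical, E is assumed to be a decimal ASCII string, and the search simply starts right after X'; a real implementation would also bound the search using the minimum and maximum padding lengths.

```python
import hashlib
import hmac


def hmac_sha256_128(key: bytes, msg: bytes) -> bytes:
    # Assumption: truncation keeps the leftmost 16 bytes.
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def verify_client_request(B: bytes, node_id: bytes, data: bytes,
                          now_hours: int):
    """Return the E value that authenticated the request, or None."""
    if len(data) < 32:
        return None
    key = B + node_id
    x_repr = data[:32]
    mark = hmac_sha256_128(key, x_repr)      # expected M_C
    pos = data.find(mark, 32)                # locate M_C after X'
    if pos == -1:
        return None
    mac_start = pos + 16
    received = data[mac_start:mac_start + 16]
    if len(received) != 16:
        return None
    # Try E - 1, E, E + 1 to tolerate clock skew.
    for hours in (now_hours - 1, now_hours, now_hours + 1):
        e = str(hours).encode("ascii")
        expected = hmac_sha256_128(key, data[:mac_start] + e)
        if hmac.compare_digest(expected, received):
            return e                         # reused as E' in the response
    return None
```

The comparison uses hmac.compare_digest so that the check runs in time independent of where the first mismatching byte is.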
  145. At the point that each side finishes the handshake, they have a 256 bit
  146. shared secret KEY_SEED that is then extracted/expanded via the ntor KDF to
  147. produce the 144 bytes of keying material used to encrypt/authenticate the
  148. data.
  149. The keying material is used as follows:
  150. Bytes 000:031 - Server to Client 256 bit NaCl secretbox key.
  151. Bytes 032:047 - Server to Client 128 bit NaCl secretbox nonce prefix.
  152. Bytes 048:063 - Server to Client 128 bit SipHash-2-4 key.
  153. Bytes 064:071 - Server to Client 64 bit SipHash-2-4 OFB IV.
  154. Bytes 072:103 - Client to Server 256 bit NaCl secretbox key.
  155. Bytes 104:119 - Client to Server 128 bit NaCl secretbox nonce prefix.
  156. Bytes 120:135 - Client to Server 128 bit SipHash-2-4 key.
  157. Bytes 136:143 - Client to Server 64 bit SipHash-2-4 OFB IV.
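The byte ranges above translate directly into slices of the 144-byte KDF output. The sketch below assumes the keying material has already been derived; the DirectionKeys type and split_keying_material function are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Tuple


class DirectionKeys(NamedTuple):
    secretbox_key: bytes    # 256 bit NaCl secretbox key
    nonce_prefix: bytes     # 128 bit NaCl secretbox nonce prefix
    siphash_key: bytes      # 128 bit SipHash-2-4 key
    siphash_ofb_iv: bytes   # 64 bit SipHash-2-4 OFB IV


def _split_direction(block: bytes) -> DirectionKeys:
    # Each direction uses 72 bytes: 32 + 16 + 16 + 8.
    return DirectionKeys(block[0:32], block[32:48], block[48:64], block[64:72])


def split_keying_material(okm: bytes) -> Tuple[DirectionKeys, DirectionKeys]:
    """Return (server_to_client, client_to_server) keys."""
    if len(okm) != 144:
        raise ValueError("expected 144 bytes of keying material")
    return _split_direction(okm[:72]), _split_direction(okm[72:])
```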
  158. 5. Data Transfer Phase
  159. Once both sides have completed the handshake, they transfer application
  160. data broken up into "packets", that are then encrypted and authenticated in
  161. NaCl crypto_secretbox_xsalsa20poly1305 [5] "frames".
  162. +------------+----------+--------+--------------+------------+------------+
  163. |  2 bytes   | 16 bytes | 1 byte |   2 bytes    | (optional) | (optional) |
  164. | Frame len. |   Tag    |  Type  | Payload len. |  Payload   |  Padding   |
  165. +------------+----------+--------+--------------+------------+------------+
  166.  \_ Obfs.  _/ \___________ NaCl secretbox (Poly1305/XSalsa20) ___________/
  167. The frame length refers to the length of the succeeding secretbox. To
  168. avoid transmitting identifiable length fields in the stream, the frame length
  169. is obfuscated by XORing it with a mask derived from SipHash-2-4 [6] in OFB mode.
  170. K = The SipHash-2-4 key from the KDF.
  171. IV[0] = The SipHash-2-4 OFB IV from the KDF.
  172. For each packet:
  173. IV[n] = SipHash-2-4(K, IV[n-1])
  174. Mask[n] = First 2 bytes of IV[n]
  175. obfuscatedLength = length ^ Mask[n]
  176. As the receiver has the SipHash-2-4 key and IV, it decodes the length by
  177. deriving the same mask that was used to obfuscate it and XORing that mask
  178. with the obfuscated value to obtain the length of the secretbox.
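The length obfuscation can be sketched in pure Python as below. SipHash-2-4 is implemented inline because the Python standard library does not expose it. Two points are assumptions, since this document does not specify them: the 64-bit SipHash output is serialized little-endian to form IV[n], and the 2-byte mask is read as a big-endian integer to match the big-endian length field. The class and method names are hypothetical.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def _rotl(x: int, b: int) -> int:
    return ((x << b) | (x >> (64 - b))) & MASK64


def siphash24(key: bytes, msg: bytes) -> int:
    """SipHash-2-4 of msg under a 16-byte key, as a 64-bit integer."""
    if len(key) != 16:
        raise ValueError("SipHash key must be 16 bytes")
    k0, k1 = struct.unpack("<QQ", key)
    v = [k0 ^ 0x736F6D6570736575, k1 ^ 0x646F72616E646F6D,
         k0 ^ 0x6C7967656E657261, k1 ^ 0x7465646279746573]

    def sipround() -> None:
        v[0] = (v[0] + v[1]) & MASK64
        v[1] = _rotl(v[1], 13) ^ v[0]
        v[0] = _rotl(v[0], 32)
        v[2] = (v[2] + v[3]) & MASK64
        v[3] = _rotl(v[3], 16) ^ v[2]
        v[0] = (v[0] + v[3]) & MASK64
        v[3] = _rotl(v[3], 21) ^ v[0]
        v[2] = (v[2] + v[1]) & MASK64
        v[1] = _rotl(v[1], 17) ^ v[2]
        v[2] = _rotl(v[2], 32)

    # Zero-pad to a multiple of 8 bytes; the last byte is len(msg) mod 256.
    padded = msg + b"\x00" * (7 - len(msg) % 8) + bytes([len(msg) & 0xFF])
    for (m,) in struct.iter_unpack("<Q", padded):
        v[3] ^= m
        sipround()
        sipround()
        v[0] ^= m
    v[2] ^= 0xFF
    for _ in range(4):
        sipround()
    return v[0] ^ v[1] ^ v[2] ^ v[3]


class LengthObfuscator:
    """Per-direction state: IV[n] = SipHash-2-4(K, IV[n-1])."""

    def __init__(self, key: bytes, iv: bytes) -> None:
        if len(iv) != 8:
            raise ValueError("OFB IV must be 8 bytes")
        self._key = key
        self._iv = iv

    def _next_mask(self) -> int:
        # Assumption: little-endian serialization of the SipHash output.
        self._iv = struct.pack("<Q", siphash24(self._key, self._iv))
        return int.from_bytes(self._iv[:2], "big")

    def obfuscate(self, length: int) -> bytes:
        return struct.pack(">H", length ^ self._next_mask())

    def deobfuscate(self, wire: bytes) -> int:
        return int.from_bytes(wire[:2], "big") ^ self._next_mask()
```

Sender and receiver each keep one LengthObfuscator per direction, seeded with the same key and IV, and must advance it exactly once per frame so that the masks stay in step.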
  179. The payload length refers to the length of the payload portion of the frame
  180. and does not include the padding. It is possible for the payload length to
  181. be 0 in which case all the remaining data is authenticated and decrypted,
  182. but ignored.
  183. The maximum allowed frame length is 1448 bytes, which allows up to 1427
  184. bytes of useful payload to be transmitted per "frame".
  185. The NaCl secretbox (Poly1305/XSalsa20) nonce format is:
  186. uint8_t[16] prefix (Fixed)
  187. uint64_t counter (Big endian)
  188. The counter is initialized to 1, and is incremented on each frame. Since
  189. the protocol is designed to be used over a reliable medium, the nonce is not
  190. transmitted over the wire as both sides of the conversation know the prefix
  191. and the initial counter value. It is imperative that the counter does not
  192. wrap, and sessions MUST terminate before 2^64 frames are sent.
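A per-direction nonce generator could be sketched as follows. It assumes the 16-byte (128 bit) nonce prefix from the keying material layout, which together with the 8-byte counter gives the 24-byte nonce that XSalsa20 expects; the NonceCounter class is a hypothetical helper, not part of this specification.

```python
import struct

MAX_COUNTER = 0xFFFFFFFFFFFFFFFF  # largest value a uint64_t can hold


class NonceCounter:
    """Produces prefix | counter nonces; the counter starts at 1."""

    def __init__(self, prefix: bytes) -> None:
        if len(prefix) != 16:
            raise ValueError("nonce prefix must be 16 bytes")
        self._prefix = prefix
        self._counter = 1

    def next(self) -> bytes:
        # Refuse to wrap: the session must end before the counter overflows.
        if self._counter > MAX_COUNTER:
            raise OverflowError("nonce counter exhausted; terminate session")
        nonce = self._prefix + struct.pack(">Q", self._counter)  # big endian
        self._counter += 1
        return nonce
```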
  193. If unsealing a secretbox ever fails (due to a Tag mismatch), implementations
  194. MUST drop the connection.
  195. The type field is used to denote the type of payload (if any) contained in
  196. each packet.
  197. TYPE_PAYLOAD (0x00):
  198. The entire payload is to be treated as application data.
  199. TYPE_PRNG_SEED (0x01):
  200. The entire payload is to be treated as seeding material for the
  201. protocol polymorphism PRNG. The format is 24 bytes of seeding
  202. material.
  203. Implementations SHOULD ignore unknown packet types for the purposes of
  204. forward compatibility, though each frame MUST still be authenticated and
  205. decrypted.
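The plaintext that goes inside each secretbox (type, payload length, payload, padding) can be encoded and decoded as sketched below. The function names are hypothetical, and filling the padding with zero bytes is an assumption; this document does not say what the padding contains. Encryption with the secretbox and the obfuscated frame length are deliberately left out.

```python
import struct
from typing import Tuple

TYPE_PAYLOAD = 0x00
TYPE_PRNG_SEED = 0x01
MAX_PAYLOAD_LENGTH = 1427           # 1448 - 2 (length) - 16 (tag) - 3 (header)
_HEADER = struct.Struct(">BH")      # 1 byte type, 2 byte big-endian length


def encode_packet(pkt_type: int, payload: bytes = b"",
                  pad_len: int = 0) -> bytes:
    """Build type | payload length | payload | padding."""
    if len(payload) + pad_len > MAX_PAYLOAD_LENGTH:
        raise ValueError("payload plus padding exceeds the frame limit")
    # Assumption: padding is zero-filled; it is encrypted along with the rest.
    return _HEADER.pack(pkt_type, len(payload)) + payload + bytes(pad_len)


def decode_packet(plaintext: bytes) -> Tuple[int, bytes]:
    """Return (type, payload); padding after the payload is discarded."""
    if len(plaintext) < _HEADER.size:
        raise ValueError("packet shorter than its header")
    pkt_type, payload_len = _HEADER.unpack_from(plaintext)
    end = _HEADER.size + payload_len
    if end > len(plaintext):
        raise ValueError("payload length exceeds packet size")
    return pkt_type, plaintext[_HEADER.size:end]
```

A receiver would call decode_packet only after the secretbox has been opened successfully, and would then dispatch on the returned type, skipping types it does not recognize.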
  206. 6. Protocol Polymorphism
  207. Implementations MUST implement protocol polymorphism to obfuscate the obfs4
  208. flow signature. The implementation should follow that of ScrambleSuit (See
  209. "ScrambleSuit Protocol Specification", section 4). Like with ScrambleSuit,
  210. implementations MAY omit inter-arrival time obfuscation as a performance
  211. trade-off.
  212. As an optimization, implementations MAY treat the TYPE_PRNG_SEED frame as
  213. part of the serverResponse if they always send the frame immediately
  214. following the serverResponse body. If implementations choose to do this,
  215. the TYPE_PRNG_SEED frame MUST have 0 bytes of padding, and P_S MUST
  216. be generated with a ServerMinPadLength of 0 (P_S consists of [0,8096]
  217. bytes of random data). The calculation of ClientMinPadLength however is
  218. unchanged (P_C still consists of [ClientMinPadLength, ClientMaxPadLength]
  219. bytes of random data).
  219. 7. References
  220. [0]: https://gitweb.torproject.org/user/phw/scramblesuit.git/blob/HEAD:/doc/scramblesuit-spec.txt
  221. [1]: https://gitweb.torproject.org/pluggable-transports/obfsproxy.git/blob/HEAD:/doc/obfs3/obfs3-protocol-spec.txt
  222. [2]: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/216-ntor-handshake.txt
  223. [3]: http://elligator.cr.yp.to/elligator-20130828.pdf
  224. [4]: https://gitweb.torproject.org/pluggable-transports/obfsproxy.git/blob/HEAD:/doc/obfs2/obfs2-threat-model.txt
  225. [5]: http://nacl.cr.yp.to/secretbox.html
  226. [6]: https://131002.net/siphash/
  227. 8. Acknowledgments
  228. Much of the protocol and this specification document is derived from the
  229. ScrambleSuit protocol and specification by Philipp Winter.