#
#
#            Nim's Runtime Library
#        (c) Copyright 2016 Andreas Rumpf
#
#    See the file "copying.txt", included in this
#    distribution, for details about the copyright.
#

##[
This module contains a `scanf`:idx: macro that can be used for extracting
substrings from an input string. This is often easier than regular expressions.
Some examples as an appetizer:

.. code-block:: nim

  # check if input string matches a triple of integers:
  const input = "(1,2,4)"
  var x, y, z: int
  if scanf(input, "($i,$i,$i)", x, y, z):
    echo "matches and x is ", x, " y is ", y, " z is ", z

  # check if input string matches an ISO date followed by an identifier followed
  # by whitespace and a floating point number:
  var year, month, day: int
  var identifier: string
  var myfloat: float
  if scanf(input, "$i-$i-$i $w$s$f", year, month, day, identifier, myfloat):
    echo "yes, we have a match!"

As can be seen from the examples, strings are matched verbatim except for
substrings starting with ``$``. These constructions are available:

=================   ========================================================
``$i``              Matches an integer. This uses ``parseutils.parseInt``.
``$f``              Matches a floating point number. Uses ``parseFloat``.
``$w``              Matches an ASCII identifier: ``[A-Za-z_][A-Za-z_0-9]*``.
``$s``              Skips optional whitespace.
``$$``              Matches a single dollar sign.
``$.``              Matches if the end of the input string has been reached.
``$*``              Matches until the token following the ``$*`` was found.
                    The match is allowed to be of 0 length.
``$+``              Matches until the token following the ``$+`` was found.
                    The match must consist of at least one char.
``${foo}``          User defined matcher. Uses the proc ``foo`` to perform
                    the match. See below for more details.
``$[foo]``          Call user defined proc ``foo`` to **skip** some optional
                    parts in the input string. See below for more details.
=================   ========================================================

Even though ``$*`` and ``$+`` look similar to the regular expressions ``.*``
and ``.+``, they work quite differently: there is no non-deterministic
state machine involved and the matches are non-greedy. ``[$*]``
matches ``[xyz]`` via ``parseutils.parseUntil``.

Furthermore, no backtracking is performed. If parsing fails after a value
has already been bound to a matched subexpression, this value is not restored
to its original value. This rarely causes problems in practice and if it does
for you, it's easy enough to bind to a temporary variable first.
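
The behaviour of ``$*`` and ``$+`` and the temporary-variable workaround can be
sketched as follows. This is only an illustration: the inputs and the names
``inner``, ``port`` and ``tmp`` are made up for the example.

```nim
import strscans

var inner: string
# `$*` stops at the first "]" and is allowed to match zero characters.
doAssert scanf("[xyz]", "[$*]", inner)
doAssert inner == "xyz"
doAssert scanf("[]", "[$*]", inner)
# `$+` needs at least one character before the "]".
doAssert not scanf("[]", "[$+]", inner)

# No backtracking: scan into a temporary and copy only on success,
# so a failed match cannot clobber `port`.
var port = 8080
var tmp: int
if scanf("port=abc", "port=$i", tmp):
  port = tmp
doAssert port == 8080
```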

Startswith vs full match
========================

``scanf`` returns true if the input string **starts with** the specified
pattern. If instead it should only return true if there is also nothing
left in the input, append ``$.`` to your pattern.
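
A small sketch of the difference, using a made-up input string and a variable
``n`` introduced only for this example:

```nim
import strscans

var n: int
# Prefix match: the trailing " apples" is simply ignored.
doAssert scanf("42 apples", "$i", n)
doAssert n == 42
# With `$.` the whole input has to be consumed, so this fails.
doAssert not scanf("42 apples", "$i$.", n)
doAssert scanf("42", "$i$.", n)
```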

User definable matchers
=======================

One very nice advantage over regular expressions is that ``scanf`` is
extensible with ordinary Nim procs. The proc is either enclosed in ``${}``
or in ``$[]``. ``${}`` matches and binds the result
to a variable (that was passed to the ``scanf`` macro) while ``$[]`` merely
skips optional tokens without binding a result.

In this example, we define a helper proc ``someSep`` that skips some separators
which we then use in our scanf pattern to help us in the matching process:

.. code-block:: nim

  proc someSep(input: string; start: int; seps: set[char] = {':','-','.'}): int =
    # Note: The parameters and return value must match what ``scanf`` requires
    result = 0
    # stop at the end of the input instead of indexing past it
    while start+result < input.len and input[start+result] in seps: inc result

  if scanf(input, "$w$[someSep]$w", key, value):
    ...

It is also possible to pass arguments to a user definable matcher:

.. code-block:: nim

  proc ndigits(input: string; intVal: var int; start: int; n: int): int =
    # matches exactly ``n`` digits. Matchers need to return 0 if nothing
    # matched or otherwise the number of processed chars.
    var x = 0
    var i = 0
    while i < n and i+start < input.len and input[i+start] in {'0'..'9'}:
      x = x * 10 + input[i+start].ord - '0'.ord
      inc i
    # only overwrite if we had a match
    if i == n:
      result = n
      intVal = x

  # match an ISO date extracting year, month, day at the same time.
  # Also ensure the input ends after the ISO date:
  var year, month, day: int
  if scanf("2013-01-03", "${ndigits(4)}-${ndigits(2)}-${ndigits(2)}$.", year, month, day):
    ...

The scanp macro
===============

This module also implements a ``scanp`` macro, whose syntax somewhat resembles
an EBNF or PEG grammar, except that it uses Nim's expression syntax and so has
to use prefix instead of postfix operators.

==============   ===============================================================
``(E)``          Grouping
``*E``           Zero or more
``+E``           One or more
``?E``           Zero or One
``E{n,m}``       From ``n`` up to ``m`` times ``E``
``~E``           Not predicate
``a ^* b``       Shortcut for ``?(a *(b a))``. Usually used for separators.
``a ^+ b``       Shortcut for ``?(a +(b a))``. Usually used for separators.
``'a'``          Matches a single character
``{'a'..'b'}``   Matches a character set
``"s"``          Matches a string
``E -> a``       Bind matching to some action
``$_``           Access the currently matched character
==============   ===============================================================

Note that unordered or ordered choice operators (``/``, ``|``) are
not implemented.
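
A few of these operators in combination, as a minimal sketch. The input
``"width=640"`` and the variables ``word`` and ``number`` are invented for
this illustration:

```nim
import strscans

var idx = 0
var word = ""
var number = 0
# One or more letters collected into `word`, a literal '=', then one or
# more digits folded into `number` through the `$_` placeholder.
let ok = scanp("width=640", idx,
               +{'a'..'z'} -> word.add($_),
               '=',
               +{'0'..'9'} -> (number = number * 10 + ord($_) - ord('0')))
doAssert ok
doAssert word == "width"
doAssert number == 640
doAssert idx == 9
```

Note that ``idx`` is an ordinary variable owned by the caller, so after a
successful match it holds the position right behind the matched text.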

Simple example that parses the ``/etc/passwd`` file line by line:

.. code-block:: nim

  const
    etc_passwd = """root:x:0:0:root:/root:/bin/bash
  daemon:x:1:1:daemon:/usr/sbin:/bin/sh
  bin:x:2:2:bin:/bin:/bin/sh
  sys:x:3:3:sys:/dev:/bin/sh
  nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
  messagebus:x:103:107::/var/run/dbus:/bin/false
  """

  proc parsePasswd(content: string): seq[string] =
    result = @[]
    var idx = 0
    while true:
      var entry = ""
      if scanp(content, idx, +(~{'\L', '\0'} -> entry.add($_)), '\L'):
        result.add entry
      else:
        break

The ``scanp`` macro maps the grammar code into Nim code that performs the parsing.
The parsing is performed with the help of three helper templates that can be
implemented for a custom type.

These templates need to be named ``atom`` and ``nxt``. ``atom`` should be
overloaded to handle both single characters and sets of characters, which
gives three templates in total.

.. code-block:: nim

  import streams

  template atom(input: Stream; idx: int; c: char): bool =
    ## Used in scanp for the matching of atoms (usually chars).
    peekChar(input) == c

  template atom(input: Stream; idx: int; s: set[char]): bool =
    peekChar(input) in s

  template nxt(input: Stream; idx, step: int = 1) =
    inc(idx, step)
    setPosition(input, idx)

  if scanp(content, idx, +( ~{'\L', '\0'} -> entry.add(peekChar($input))), '\L'):
    result.add entry

Calling ordinary Nim procs inside the macro is possible:

.. code-block:: nim

  proc digits(s: string; intVal: var int; start: int): int =
    var x = 0
    while result+start < s.len and s[result+start] in {'0'..'9'} and s[result+start] != ':':
      x = x * 10 + s[result+start].ord - '0'.ord
      inc result
    intVal = x

  proc extractUsers(content: string): seq[string] =
    # Extracts the username and home directory
    # of each entry (with a UID of at least 1000)
    const
      digits = {'0'..'9'}
    result = @[]
    var idx = 0
    while true:
      var login = ""
      var uid = 0
      var homedir = ""
      if scanp(content, idx, *(~ {':', '\0'}) -> login.add($_), ':', * ~ ':', ':',
              digits($input, uid, $index), ':', *`digits`, ':', * ~ ':', ':',
              *('/', * ~{':', '/'}) -> homedir.add($_), ':', *('/', * ~{'\L', '/'}), '\L'):
        if uid >= 1000:
          result.add login & " " & homedir
      else:
        break

When used for matching, keep in mind that, as with ``scanf``, no backtracking
is performed.

.. code-block:: nim

  proc skipUntil(s: string; until: string; unless = '\0'; start: int): int =
    # Skips all characters until the string `until` is found. Returns 0
    # if the char `unless` is found first or the end is reached.
    var i = start
    var u = 0
    while true:
      # check the length first so the end of the input is never indexed
      if i >= s.len or s[i] == unless:
        return 0
      elif s[i] == until[0]:
        u = 1
        while i+u < s.len and u < until.len and s[i+u] == until[u]:
          inc u
        if u >= until.len: break
      inc(i)
    result = i+u-start

  iterator collectLinks(s: string): string =
    const quote = {'\'', '"'}
    var idx, old = 0
    var res = ""
    while idx < s.len:
      old = idx
      if scanp(s, idx, "<a", skipUntil($input, "href=", '>', $index),
              `quote`, *( ~`quote`) -> res.add($_)):
        yield res
        res = ""
      idx = old + 1

  for r in collectLinks(body):
    echo r
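
The reason ``collectLinks`` saves the old position is that a failed ``scanp``
leaves ``idx`` wherever the scan stopped. A minimal sketch of that effect,
with a made-up three character input:

```nim
import strscans

var idx = 0
# 'a' and 'b' match and advance `idx`; 'c' then fails, but `idx` is
# not rewound to where the scan started.
doAssert not scanp("abX", idx, 'a', 'b', 'c')
doAssert idx == 2
# Restore the position by hand when a retry from the start is wanted.
idx = 0
doAssert scanp("abX", idx, 'a', 'b', 'X')
doAssert idx == 3
```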

In this example both macros are combined seamlessly in order to maximise
efficiency and perform different checks.

.. code-block:: nim

  iterator parseIps*(soup: string): string =
    ## ipv4 only!
    const digits = {'0'..'9'}
    var a, b, c, d: int
    var buf = ""
    var idx = 0
    while idx < soup.len:
      if scanp(soup, idx, (`digits`{1,3}, '.', `digits`{1,3}, '.',
               `digits`{1,3}, '.', `digits`{1,3}) -> buf.add($_)):
        discard buf.scanf("$i.$i.$i.$i", a, b, c, d)
        if (a >= 0 and a <= 254) and
           (b >= 0 and b <= 254) and
           (c >= 0 and c <= 254) and
           (d >= 0 and d <= 254):
          yield buf
      buf.setLen(0) # need to clear `buf` each time, because it might contain garbage
      idx.inc

]##

import macros, parseutils

proc conditionsToIfChain(n, idx, res: NimNode; start: int): NimNode =
  assert n.kind == nnkStmtList
  if start >= n.len: return newAssignment(res, newLit true)
  var ifs: NimNode = nil
  if n[start+1].kind == nnkEmpty:
    ifs = conditionsToIfChain(n, idx, res, start+3)
  else:
    ifs = newIfStmt((n[start+1],
                    newTree(nnkStmtList, newCall(bindSym"inc", idx, n[start+2]),
                                     conditionsToIfChain(n, idx, res, start+3))))
  result = newTree(nnkStmtList, n[start], ifs)

proc notZero(x: NimNode): NimNode = newCall(bindSym"!=", x, newLit 0)

proc buildUserCall(x: string; args: varargs[NimNode]): NimNode =
  let y = parseExpr(x)
  result = newTree(nnkCall)
  if y.kind in nnkCallKinds: result.add y[0]
  else: result.add y
  for a in args: result.add a
  if y.kind in nnkCallKinds:
    for i in 1..<y.len: result.add y[i]

macro scanf*(input: string; pattern: static[string]; results: varargs[typed]): bool =
  ## See top level documentation of this module for how ``scanf`` works.
  template matchBind(parser) {.dirty.} =
    var resLen = genSym(nskLet, "resLen")
    conds.add newLetStmt(resLen, newCall(bindSym(parser), input, results[i], idx))
    conds.add resLen.notZero
    conds.add resLen

  var i = 0
  var p = 0
  var idx = genSym(nskVar, "idx")
  var res = genSym(nskVar, "res")
  result = newTree(nnkStmtListExpr, newVarStmt(idx, newLit 0), newVarStmt(res, newLit false))
  var conds = newTree(nnkStmtList)
  var fullMatch = false
  while p < pattern.len:
    if pattern[p] == '$':
      inc p
      case pattern[p]
      of '$':
        var resLen = genSym(nskLet, "resLen")
        conds.add newLetStmt(resLen, newCall(bindSym"skip", input, newLit($pattern[p]), idx))
        conds.add resLen.notZero
        conds.add resLen
      of 'w':
        # a result variable must exist and must be a string
        if i < results.len and getType(results[i]).typeKind == ntyString:
          matchBind "parseIdent"
        else:
          error("no string var given for $w")
        inc i
      of 'i':
        if i < results.len and getType(results[i]).typeKind == ntyInt:
          matchBind "parseInt"
        else:
          error("no int var given for $i")
        inc i
      of 'f':
        if i < results.len and getType(results[i]).typeKind == ntyFloat:
          matchBind "parseFloat"
        else:
          error("no float var given for $f")
        inc i
      of 's':
        conds.add newCall(bindSym"inc", idx, newCall(bindSym"skipWhitespace", input, idx))
        conds.add newEmptyNode()
        conds.add newEmptyNode()
      of '.':
        if p == pattern.len-1:
          fullMatch = true
        else:
          error("invalid format string")
      of '*', '+':
        if i < results.len and getType(results[i]).typeKind == ntyString:
          # minimum match length: 0 for `$*`, 1 for `$+`
          var min = ord(pattern[p] == '+')
          var q = p+1
          var token = ""
          while q < pattern.len and pattern[q] != '$':
            token.add pattern[q]
            inc q
          var resLen = genSym(nskLet, "resLen")
          conds.add newLetStmt(resLen, newCall(bindSym"parseUntil", input, results[i], newLit(token), idx))
          conds.add newCall(bindSym">=", resLen, newLit min)
          conds.add resLen
        else:
          error("no string var given for $" & pattern[p])
        inc i
      of '{':
        inc p
        var nesting = 0
        let start = p
        while true:
          case pattern[p]
          of '{': inc nesting
          of '}':
            if nesting == 0: break
            dec nesting
          of '\0': error("expected closing '}'")
          else: discard
          inc p
        let expr = pattern.substr(start, p-1)
        if i < results.len:
          var resLen = genSym(nskLet, "resLen")
          conds.add newLetStmt(resLen, buildUserCall(expr, input, results[i], idx))
          conds.add newCall(bindSym"!=", resLen, newLit 0)
          conds.add resLen
        else:
          error("no var given for $" & expr)
        inc i
      of '[':
        inc p
        var nesting = 0
        let start = p
        while true:
          case pattern[p]
          of '[': inc nesting
          of ']':
            if nesting == 0: break
            dec nesting
          of '\0': error("expected closing ']'")
          else: discard
          inc p
        let expr = pattern.substr(start, p-1)
        conds.add newCall(bindSym"inc", idx, buildUserCall(expr, input, idx))
        conds.add newEmptyNode()
        conds.add newEmptyNode()
      else: error("invalid format string")
      inc p
    else:
      var token = ""
      while p < pattern.len and pattern[p] != '$':
        token.add pattern[p]
        inc p
      var resLen = genSym(nskLet, "resLen")
      conds.add newLetStmt(resLen, newCall(bindSym"skip", input, newLit(token), idx))
      conds.add resLen.notZero
      conds.add resLen
  result.add conditionsToIfChain(conds, idx, res, 0)
  if fullMatch:
    result.add newCall(bindSym"and", res,
      newCall(bindSym">=", idx, newCall(bindSym"len", input)))
  else:
    result.add res

template atom*(input: string; idx: int; c: char): bool =
  ## Used in scanp for the matching of atoms (usually chars).
  input[idx] == c

template atom*(input: string; idx: int; s: set[char]): bool =
  input[idx] in s

#template prepare*(input: string): int = 0
template success*(x: int): bool = x != 0

template nxt*(input: string; idx, step: int = 1) = inc(idx, step)

macro scanp*(input, idx: typed; pattern: varargs[untyped]): bool =
  ## See top level documentation of this module for how ``scanp`` works.
  type StmtTriple = tuple[init, cond, action: NimNode]

  template interf(x): untyped = bindSym(x, brForceOpen)

  proc toIfChain(n: seq[StmtTriple]; idx, res: NimNode; start: int): NimNode =
    if start >= n.len: return newAssignment(res, newLit true)
    var ifs: NimNode = nil
    if n[start].cond.kind == nnkEmpty:
      ifs = toIfChain(n, idx, res, start+1)
    else:
      ifs = newIfStmt((n[start].cond,
                      newTree(nnkStmtList, n[start].action,
                              toIfChain(n, idx, res, start+1))))
    result = newTree(nnkStmtList, n[start].init, ifs)

  proc attach(x, attached: NimNode): NimNode =
    if attached == nil: x
    else: newStmtList(attached, x)

  proc placeholder(n, x, j: NimNode): NimNode =
    if n.kind == nnkPrefix and n[0].eqIdent("$"):
      let n1 = n[1]
      if n1.eqIdent"_" or n1.eqIdent"current":
        result = newTree(nnkBracketExpr, x, j)
      elif n1.eqIdent"input":
        result = x
      elif n1.eqIdent"i" or n1.eqIdent"index":
        result = j
      else:
        error("unknown pattern " & repr(n))
    else:
      result = copyNimNode(n)
      for i in 0 ..< n.len:
        result.add placeholder(n[i], x, j)

  proc atm(it, input, idx, attached: NimNode): StmtTriple =
    template `!!`(x): untyped = attach(x, attached)
    case it.kind
    of nnkIdent:
      var resLen = genSym(nskLet, "resLen")
      result = (newLetStmt(resLen, newCall(it, input, idx)),
                newCall(interf"success", resLen),
                !!newCall(interf"nxt", input, idx, resLen))
    of nnkCallKinds:
      # *{'A'..'Z'} !! s.add(!_)
      template buildWhile(init, cond, action): untyped =
        while true:
          init
          if not cond: break
          action

      # (x) a  # bind action a to (x)
      if it[0].kind == nnkPar and it.len == 2:
        result = atm(it[0], input, idx, placeholder(it[1], input, idx))
      elif it.kind == nnkInfix and it[0].eqIdent"->":
        # bind matching to some action:
        result = atm(it[1], input, idx, placeholder(it[2], input, idx))
      elif it.kind == nnkInfix and it[0].eqIdent"as":
        let cond = if it[1].kind in nnkCallKinds: placeholder(it[1], input, idx)
                   else: newCall(it[1], input, idx)
        result = (newLetStmt(it[2], cond),
                  newCall(interf"success", it[2]),
                  !!newCall(interf"nxt", input, idx, it[2]))
      elif it.kind == nnkPrefix and it[0].eqIdent"*":
        let (init, cond, action) = atm(it[1], input, idx, attached)
        result = (getAst(buildWhile(init, cond, action)),
                  newEmptyNode(), newEmptyNode())
      elif it.kind == nnkPrefix and it[0].eqIdent"+":
        # x+ is the same as xx*
        result = atm(newTree(nnkPar, it[1], newTree(nnkPrefix, ident"*", it[1])),
                      input, idx, attached)
      elif it.kind == nnkPrefix and it[0].eqIdent"?":
        # optional.
        let (init, cond, action) = atm(it[1], input, idx, attached)
        if cond.kind == nnkEmpty:
          error("'?' operator applied to a non-condition")
        else:
          result = (newTree(nnkStmtList, init, newIfStmt((cond, action))),
                    newEmptyNode(), newEmptyNode())
      elif it.kind == nnkPrefix and it[0].eqIdent"~":
        # not operator
        let (init, cond, action) = atm(it[1], input, idx, attached)
        if cond.kind == nnkEmpty:
          error("'~' operator applied to a non-condition")
        else:
          result = (init, newCall(bindSym"not", cond), action)
      elif it.kind == nnkInfix and it[0].eqIdent"|":
        let a = atm(it[1], input, idx, attached)
        let b = atm(it[2], input, idx, attached)
        if a.cond.kind == nnkEmpty or b.cond.kind == nnkEmpty:
          error("'|' operator applied to a non-condition")
        else:
          result = (newStmtList(a.init,
                newIfStmt((a.cond, a.action), (newTree(nnkStmtListExpr, b.init, b.cond), b.action))),
              newEmptyNode(), newEmptyNode())
      elif it.kind == nnkInfix and it[0].eqIdent"^*":
        # a ^* b is rewritten to: (a *(b a))?
        #exprList = expr ^+ comma
        template tmp(a, b): untyped = ?(a, *(b, a))
        result = atm(getAst(tmp(it[1], it[2])), input, idx, attached)
      elif it.kind == nnkInfix and it[0].eqIdent"^+":
        # a ^+ b is rewritten to: (a +(b a))?
        template tmp(a, b): untyped = (a, *(b, a))
        result = atm(getAst(tmp(it[1], it[2])), input, idx, attached)
      elif it.kind == nnkCommand and it.len == 2 and it[0].eqIdent"pred":
        # enforce that the wrapped call is interpreted as a predicate, not a non-terminal:
        result = (newEmptyNode(), placeholder(it[1], input, idx), newEmptyNode())
      else:
        var resLen = genSym(nskLet, "resLen")
        result = (newLetStmt(resLen, placeholder(it, input, idx)),
                  newCall(interf"success", resLen), !!newCall(interf"nxt", input, idx, resLen))
    of nnkStrLit..nnkTripleStrLit:
      var resLen = genSym(nskLet, "resLen")
      result = (newLetStmt(resLen, newCall(interf"skip", input, it, idx)),
                newCall(interf"success", resLen), !!newCall(interf"nxt", input, idx, resLen))
    of nnkCurly, nnkAccQuoted, nnkCharLit:
      result = (newEmptyNode(), newCall(interf"atom", input, idx, it), !!newCall(interf"nxt", input, idx))
    of nnkCurlyExpr:
      if it.len == 3 and it[1].kind == nnkIntLit and it[2].kind == nnkIntLit:
        var h = newTree(nnkPar, it[0])
        for count in 2..it[1].intVal: h.add(it[0])
        for count in it[1].intVal .. it[2].intVal-1: h.add(newTree(nnkPrefix, ident"?", it[0]))
        result = atm(h, input, idx, attached)
      elif it.len == 2 and it[1].kind == nnkIntLit:
        var h = newTree(nnkPar, it[0])
        for count in 2..it[1].intVal: h.add(it[0])
        result = atm(h, input, idx, attached)
      else:
        error("invalid pattern")
    of nnkPar:
      if it.len == 1:
        result = atm(it[0], input, idx, attached)
      else:
        # concatenation:
        var conds: seq[StmtTriple] = @[]
        for x in it: conds.add atm(x, input, idx, attached)
        var res = genSym(nskVar, "res")
        result = (newStmtList(newVarStmt(res, newLit false),
            toIfChain(conds, idx, res, 0)), res, newEmptyNode())
    else:
      error("invalid pattern")

  #var idx = genSym(nskVar, "idx")
  var res = genSym(nskVar, "res")
  result = newTree(nnkStmtListExpr, #newVarStmt(idx, newCall(interf"prepare", input)),
                                    newVarStmt(res, newLit false))
  var conds: seq[StmtTriple] = @[]
  for it in pattern:
    conds.add atm(it, input, idx, nil)
  result.add toIfChain(conds, idx, res, 0)
  result.add res
  when defined(debugScanp):
    echo repr result

when isMainModule:
  proc twoDigits(input: string; x: var int; start: int): int =
    if input[start] == '0' and input[start+1] == '0':
      result = 2
      x = 13
    else:
      result = 0

  proc someSep(input: string; start: int; seps: set[char] = {';',',','-','.'}): int =
    result = 0
    while input[start+result] in seps: inc result

  proc demangle(s: string; res: var string; start: int): int =
    while s[result+start] in {'_', '@'}: inc result
    res = ""
    while result+start < s.len and s[result+start] > ' ' and s[result+start] != '_':
      res.add s[result+start]
      inc result
    while result+start < s.len and s[result+start] > ' ':
      inc result

  proc parseGDB(resp: string): seq[string] =
    const
      digits = {'0'..'9'}
      hexdigits = digits + {'a'..'f', 'A'..'F'}
      whites = {' ', '\t', '\C', '\L'}
    result = @[]
    var idx = 0
    while true:
      var prc = ""
      var info = ""
      if scanp(resp, idx, *`whites`, '#', *`digits`, +`whites`, ?("0x", *`hexdigits`, " in "),
               demangle($input, prc, $index), *`whites`, '(', * ~ ')', ')',
                *`whites`, "at ", +(~{'\C', '\L', '\0'} -> info.add($_)) ):
        result.add prc & " " & info
      else:
        break

  var key, val: string
  var intval: int
  var floatval: float
  doAssert scanf("abc:: xyz 89  33.25", "$w$s::$s$w$s$i  $f", key, val, intval, floatVal)
  doAssert key == "abc"
  doAssert val == "xyz"
  doAssert intval == 89
  doAssert floatVal == 33.25

  let xx = scanf("$abc", "$$$i", intval)
  doAssert xx == false

  let xx2 = scanf("$1234", "$$$i", intval)
  doAssert xx2

  let yy = scanf(";.--Breakpoint00 [output]", "$[someSep]Breakpoint${twoDigits}$[someSep({';','.','-'})] [$+]$.", intVal, key)
  doAssert yy
  doAssert key == "output"
  doAssert intVal == 13

  var ident = ""
  var idx = 0
  let zz = scanp("foobar x x  x   xWZ", idx, +{'a'..'z'} -> add(ident, $_), *(*{' ', '\t'}, "x"), ~'U', "Z")
  doAssert zz
  doAssert ident == "foobar"

  const digits = {'0'..'9'}
  var year = 0
  var idx2 = 0
  if scanp("201655-8-9", idx2, `digits`{4,6} -> (year = year * 10 + ord($_) - ord('0')), "-8", "-9"):
    doAssert year == 201655

  const gdbOut = """
      #0  @foo_96013_1208911747@8 (x0=...)
          at c:/users/anwender/projects/nim/temp.nim:11
      #1  0x00417754 in tempInit000 () at c:/users/anwender/projects/nim/temp.nim:13
      #2  0x0041768d in NimMainInner ()
          at c:/users/anwender/projects/nim/lib/system.nim:2605
      #3  0x004176b1 in NimMain ()
          at c:/users/anwender/projects/nim/lib/system.nim:2613
      #4  0x004176db in main (argc=1, args=0x712cc8, env=0x711ca8)
          at c:/users/anwender/projects/nim/lib/system.nim:2620"""
  const result = @["foo c:/users/anwender/projects/nim/temp.nim:11",
          "tempInit000 c:/users/anwender/projects/nim/temp.nim:13",
          "NimMainInner c:/users/anwender/projects/nim/lib/system.nim:2605",
          "NimMain c:/users/anwender/projects/nim/lib/system.nim:2613",
          "main c:/users/anwender/projects/nim/lib/system.nim:2620"]
  doAssert parseGDB(gdbOut) == result