- Types
- =====
- All expressions have a type which is known at compile time. Nim
- is statically typed. One can declare new types, which is in essence defining
- an identifier that can be used to denote this custom type.
- These are the major type classes:
- * ordinal types (consist of integer, bool, character, enumeration
- (and subranges thereof) types)
- * floating point types
- * string type
- * structured types
- * reference (pointer) type
- * procedural type
- * generic type
- Ordinal types
- -------------
- Ordinal types have the following characteristics:
- - Ordinal types are countable and ordered. This property allows
- functions such as ``inc``, ``ord`` and ``dec`` to be defined on ordinal
- types.
- - Ordinal values have a smallest possible value. Trying to count further
- down than the smallest value gives a checked runtime or static error.
- - Ordinal values have a largest possible value. Trying to count further
- than the largest value gives a checked runtime or static error.
- Integers, bool, characters and enumeration types (and subranges of these
- types) belong to ordinal types. For reasons of simplicity of implementation
- the types ``uint`` and ``uint64`` are not ordinal types.
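- The characteristics listed above can be illustrated with a minimal sketch.
- It only uses the built-in ``ord``, ``inc``, ``succ``, ``pred``, ``low`` and
- ``high`` procs; the variable name ``c`` is chosen for illustration:
-
- .. code-block:: nim
-
-   var c = 'a'
-   inc(c)                        # step to the next ordinal value
-   doAssert c == 'b'
-   doAssert ord(c) == 98         # position of 'b' in the char type
-   doAssert succ(false) == true  # bool is an ordinal type too
-   doAssert pred(10) == 9
-   doAssert low(int8) == -128 and high(int8) == 127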
- Pre-defined integer types
- -------------------------
- These integer types are pre-defined:
- ``int``
- the generic signed integer type; its size is platform dependent and has the
- same size as a pointer. This type should be used in general. An integer
- literal that has no type suffix is of this type.
- intXX
- additional signed integer types of XX bits use this naming scheme
- (example: int16 is a 16 bit wide integer).
- The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``.
- Literals of these types have the suffix 'iXX.
- ``uint``
- the generic `unsigned integer`:idx: type; its size is platform dependent and
- has the same size as a pointer. An integer literal with the type
- suffix ``'u`` is of this type.
- uintXX
- additional unsigned integer types of XX bits use this naming scheme
- (example: uint16 is a 16 bit wide unsigned integer).
- The current implementation supports ``uint8``, ``uint16``, ``uint32``,
- ``uint64``. Literals of these types have the suffix 'uXX.
- Unsigned operations all wrap around; they cannot lead to over- or
- underflow errors.
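- A small sketch of the wrap-around behaviour, assuming a compiler version in
- which the unsigned operators are available without an extra import; the
- variable names are illustrative:
-
- .. code-block:: nim
-
-   var u = 255'u8
-   u = u + 1'u8          # wraps around instead of raising an error
-   doAssert u == 0'u8
-
-   var z = 0'u8
-   z = z - 1'u8          # wraps to the largest uint8 value
-   doAssert z == 255'u8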
- In addition to the usual arithmetic operators for signed and unsigned integers
- (``+ - *`` etc.) there are also operators that formally work on *signed*
- integers but treat their arguments as *unsigned*: They are mostly provided
- for backwards compatibility with older versions of the language that lacked
- unsigned integer types. These unsigned operations for signed integers use
- the ``%`` suffix as convention:
- ======================   ======================================================
- operation                meaning
- ======================   ======================================================
- ``a +% b``               unsigned integer addition
- ``a -% b``               unsigned integer subtraction
- ``a *% b``               unsigned integer multiplication
- ``a /% b``               unsigned integer division
- ``a %% b``               unsigned integer modulo operation
- ``a <% b``               treat ``a`` and ``b`` as unsigned and compare
- ``a <=% b``              treat ``a`` and ``b`` as unsigned and compare
- ``ze(a)``                extends the bits of ``a`` with zeros until it has the
-                          width of the ``int`` type
- ``toU8(a)``              treats ``a`` as unsigned and converts it to an
-                          unsigned integer of 8 bits (but still the
-                          ``int8`` type)
- ``toU16(a)``             treats ``a`` as unsigned and converts it to an
-                          unsigned integer of 16 bits (but still the
-                          ``int16`` type)
- ``toU32(a)``             treats ``a`` as unsigned and converts it to an
-                          unsigned integer of 32 bits (but still the
-                          ``int32`` type)
- ======================   ======================================================
- `Automatic type conversion`:idx: is performed in expressions where different
- kinds of integer types are used: the smaller type is converted to the larger.
- A `narrowing type conversion`:idx: converts a larger to a smaller type (for
- example ``int32 -> int16``). A `widening type conversion`:idx: converts a
- smaller type to a larger type (for example ``int16 -> int32``). In Nim only
- widening type conversions are *implicit*:
- .. code-block:: nim
- var myInt16 = 5i16
- var myInt: int
- myInt16 + 34 # of type ``int16``
- myInt16 + myInt # of type ``int``
- myInt16 + 2i32 # of type ``int32``
- However, ``int`` literals are implicitly convertible to a smaller integer type
- if the literal's value fits this smaller type and such a conversion is less
- expensive than other implicit conversions, so ``myInt16 + 34`` produces
- an ``int16`` result.
- For further details, see `Convertible relation`_.
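- As a hedged illustration of the difference between the two directions, the
- sketch below performs a narrowing conversion explicitly and a widening
- conversion implicitly. The variable names are hypothetical:
-
- .. code-block:: nim
-
-   var wide: int32 = 1000
-   # narrowing is never implicit; an explicit conversion is required:
-   var narrow = int16(wide)
-   # widening is implicit:
-   var widened: int32 = narrow
-   doAssert widened == 1000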
- Subrange types
- --------------
- A subrange type is a range of values from an ordinal type (the base
- type). To define a subrange type, one must specify its limiting values: the
- lowest and highest value of the type:
- .. code-block:: nim
-
-   type
-     Subrange = range[0..5]
- ``Subrange`` is a subrange of an integer which can only hold the values 0
- to 5. Assigning any other value to a variable of type ``Subrange`` is a
- checked runtime error (or static error if it can be statically
- determined). Assignments from the base type to one of its subrange types
- (and vice versa) are allowed.
- A subrange type has the same size as its base type (``int`` in the example).
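- The assignment rules and the size guarantee can be sketched as follows; the
- variables ``s`` and ``i`` are illustrative only:
-
- .. code-block:: nim
-
-   type
-     Subrange = range[0..5]
-
-   var s: Subrange = 3
-   var i: int = s        # subrange to base type
-   i = 5
-   s = i                 # base type to subrange; checked at runtime
-   doAssert s == 5
-   doAssert sizeof(Subrange) == sizeof(int)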
- Nim requires `interval arithmetic`:idx: for subrange types over a set
- of built-in operators that involve constants: ``x %% 3`` is of
- type ``range[0..2]``. The following built-in operators for integers are
- affected by this rule: ``-``, ``+``, ``*``, ``min``, ``max``, ``succ``,
- ``pred``, ``mod``, ``div``, ``%%``, ``and`` (bitwise ``and``).
- Bitwise ``and`` only produces a ``range`` if one of its operands is a
- constant *x* so that (x+1) is a power of two.
- (Bitwise ``and`` is then a ``%%`` operation.)
- This means that the following code is accepted:
- .. code-block:: nim
- case (x and 3) + 7
- of 7: echo "A"
- of 8: echo "B"
- of 9: echo "C"
- of 10: echo "D"
- # note: no ``else`` required as (x and 3) + 7 has the type: range[7..10]
- Pre-defined floating point types
- --------------------------------
- The following floating point types are pre-defined:
- ``float``
- the generic floating point type; its size is platform dependent
- (the compiler chooses the processor's fastest floating point type).
- This type should be used in general.
- floatXX
- an implementation may define additional floating point types of XX bits using
- this naming scheme (example: float64 is a 64 bit wide float). The current
- implementation supports ``float32`` and ``float64``. Literals of these types
- have the suffix 'fXX.
- Automatic type conversion in expressions with different kinds
- of floating point types is performed: See `Convertible relation`_ for further
- details. Arithmetic performed on floating point types follows the IEEE
- standard. Integer types are not converted to floating point types automatically
- and vice versa.
- The IEEE standard defines five types of floating-point exceptions:
- * Invalid: operations with mathematically invalid operands,
- for example 0.0/0.0, sqrt(-1.0), and log(-37.8).
- * Division by zero: divisor is zero and dividend is a finite nonzero number,
- for example 1.0/0.0.
- * Overflow: operation produces a result that exceeds the range of the exponent,
- for example MAXDOUBLE+0.0000000000001e308.
- * Underflow: operation produces a result that is too small to be represented
- as a normal number, for example, MINDOUBLE * MINDOUBLE.
- * Inexact: operation produces a result that cannot be represented with infinite
- precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input.
- The IEEE exceptions are either ignored at runtime or mapped to the
- Nim exceptions: `FloatInvalidOpError`:idx:, `FloatDivByZeroError`:idx:,
- `FloatOverflowError`:idx:, `FloatUnderflowError`:idx:,
- and `FloatInexactError`:idx:.
- These exceptions inherit from the `FloatingPointError`:idx: base class.
- Nim provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx: to control
- whether the IEEE exceptions are ignored or trap a Nim exception:
- .. code-block:: nim
- {.NanChecks: on, InfChecks: on.}
- var a = 1.0
- var b = 0.0
- echo b / b # raises FloatInvalidOpError
- echo a / b # raises FloatOverflowError
- In the current implementation ``FloatDivByZeroError`` and ``FloatInexactError``
- are never raised. ``FloatOverflowError`` is raised instead of
- ``FloatDivByZeroError``.
- There is also a `floatChecks`:idx: pragma that is a short-cut for the
- combination of ``NaNChecks`` and ``InfChecks`` pragmas. ``floatChecks`` are
- turned off as default.
- The only operations that are affected by the ``floatChecks`` pragma are
- the ``+``, ``-``, ``*``, ``/`` operators for floating point types.
- An implementation should always use the maximum precision available to evaluate
- floating point values at compile time; this means expressions like
- ``0.09'f32 + 0.01'f32 == 0.09'f64 + 0.01'f64`` are true.
- Boolean type
- ------------
- The boolean type is named `bool`:idx: in Nim and can be one of the two
- pre-defined values ``true`` and ``false``. Conditions in ``while``,
- ``if``, ``elif``, ``when``-statements need to be of type ``bool``.
- This condition holds::
- ord(false) == 0 and ord(true) == 1
- The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined
- for the bool type. The ``and`` and ``or`` operators perform short-cut
- evaluation. Example:
- .. code-block:: nim
-
-   while p != nil and p.name != "xyz":
-     # p.name is not evaluated if p == nil
-     p = p.next
- The size of the bool type is one byte.
- Character type
- --------------
- The character type is named ``char`` in Nim. Its size is one byte.
- Thus it cannot represent a UTF-8 character, only a part of one.
- The reason for this is efficiency: for the overwhelming majority of use-cases,
- the resulting programs will still handle UTF-8 properly as UTF-8 was specially
- designed for this.
- Another reason is that Nim can support ``array[char, int]`` or
- ``set[char]`` efficiently as many algorithms rely on this feature. The
- `Rune` type is used for Unicode characters; it can represent any Unicode
- character. ``Rune`` is declared in the `unicode module <unicode.html>`_.
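- The difference between ``char`` and ``Rune`` can be sketched with a string
- that contains one two-byte UTF-8 sequence. The example assumes the
- ``runeLen`` proc and the ``runes`` iterator of the ``unicode`` module; the
- bytes ``\xC3\xA9`` encode the letter "e" with an acute accent:
-
- .. code-block:: nim
-
-   import unicode
-
-   let s = "h\xC3\xA9llo"
-   doAssert s.len == 6          # six bytes (chars)
-   doAssert s.runeLen == 5      # five Unicode characters
-   doAssert s[1] == '\xC3'      # indexing yields a single byte
-
-   var count = 0
-   for r in runes(s):
-     inc count
-   doAssert count == 5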
- Enumeration types
- -----------------
- Enumeration types define a new type whose values consist of the ones
- specified. The values are ordered. Example:
- .. code-block:: nim
-
-   type
-     Direction = enum
-       north, east, south, west
- Now the following holds::
- ord(north) == 0
- ord(east) == 1
- ord(south) == 2
- ord(west) == 3
- Thus, north < east < south < west. The comparison operators can be used
- with enumeration types.
- For better interfacing to other programming languages, the fields of enum
- types can be assigned an explicit ordinal value. However, the ordinal values
- have to be in ascending order. A field whose ordinal value is not
- explicitly given is assigned the value of the previous field + 1.
- An explicit ordered enum can have *holes*:
- .. code-block:: nim
-
-   type
-     TokenType = enum
-       a = 2, b = 4, c = 89 # holes are valid
- However, it is then not an ordinal anymore, so it is not possible to use these
- enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ``
- and ``pred`` are not available for them either.
- The compiler supports the built-in stringify operator ``$`` for enumerations.
- The stringify's result can be controlled by explicitly giving the string
- values to use:
- .. code-block:: nim
-
-   type
-     MyEnum = enum
-       valueA = (0, "my value A"),
-       valueB = "value B",
-       valueC = 2,
-       valueD = (3, "abc")
- As can be seen from the example, it is possible to both specify a field's
- ordinal value and its string value by using a tuple. It is also
- possible to only specify one of them.
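- A short sketch of how these declarations behave, repeating the type so the
- block is self-contained. A field without an explicit string value is
- stringified to its identifier, and a field without an explicit ordinal value
- gets the previous field's value plus one:
-
- .. code-block:: nim
-
-   type
-     MyEnum = enum
-       valueA = (0, "my value A"),
-       valueB = "value B",
-       valueC = 2,
-       valueD = (3, "abc")
-
-   doAssert $valueA == "my value A"
-   doAssert $valueB == "value B"
-   doAssert $valueC == "valueC"   # no string given: the identifier is used
-   doAssert ord(valueB) == 1      # no ordinal given: previous field + 1
-   doAssert ord(valueD) == 3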
- An enum can be marked with the ``pure`` pragma so that its fields are not
- added to the current scope, so they always need to be accessed
- via ``MyEnum.value``:
- .. code-block:: nim
-
-   type
-     MyEnum {.pure.} = enum
-       valueA, valueB, valueC, valueD
-
-   echo valueA # error: Unknown identifier
-   echo MyEnum.valueA # works
- String type
- -----------
- All string literals are of the type ``string``. A string in Nim is very
- similar to a sequence of characters. However, strings in Nim are both
- zero-terminated and have a length field. One can retrieve the length with the
- builtin ``len`` procedure; the length never counts the terminating zero.
- The assignment operator for strings always copies the string.
- The ``&`` operator concatenates strings.
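- The copy semantics of assignment and the behaviour of ``len`` and ``&`` can
- be sketched as follows (variable names are illustrative):
-
- .. code-block:: nim
-
-   var a = "abc"
-   var b = a              # assignment copies the string
-   b.add("d")             # modifying the copy leaves ``a`` untouched
-   doAssert a == "abc"
-   doAssert b == "abcd"
-   doAssert len(a) == 3   # the terminating zero is not counted
-   doAssert a & b == "abcabcd"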
- Most native Nim types support conversion to strings with the special ``$`` proc.
- When calling the ``echo`` proc, for example, the built-in stringify operation
- for the parameter is called:
- .. code-block:: nim
- echo 3 # calls `$` for `int`
- When a user defines a custom object type, implementing this procedure for
- it provides the type's ``string`` representation.
- .. code-block:: nim
-
-   type
-     Person = object
-       name: string
-       age: int
-
-   proc `$`(p: Person): string = # `$` always returns a string
-     result = p.name & " is " &
-             $p.age & # we *need* the `$` in front of p.age, which
-                      # is natively an integer, to convert it to
-                      # a string
-             " years old."
- While ``$p.name`` can also be used, the ``$`` operation on a string does
- nothing. Note that we cannot rely on automatic conversion from an ``int`` to
- a ``string`` like we can for the ``echo`` proc.
- Strings are compared by their lexicographical order. All comparison operators
- are available. Strings can be indexed like arrays (lower bound is 0). Unlike
- arrays, they can be used in case statements:
- .. code-block:: nim
- case paramStr(i)
- of "-v": incl(options, optVerbose)
- of "-h", "-?": incl(options, optHelp)
- else: write(stdout, "invalid command line option!\n")
- Per convention, all strings are UTF-8 strings, but this is not enforced. For
- example, when reading strings from binary files, they are merely a sequence of
- bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the
- i-th *unichar*. The iterator ``runes`` from the `unicode module
- <unicode.html>`_ can be used for iteration over all Unicode characters.
- cstring type
- ------------
- The ``cstring`` type, meaning `compatible string`, is the native representation
- of a string for the compilation backend. For the C backend the ``cstring`` type
- represents a pointer to a zero-terminated char array
- compatible with the type ``char*`` in ANSI C. Its primary purpose lies in easy
- interfacing with C. The index operation ``s[i]`` means the i-th *char* of
- ``s``; however, no bounds checking for ``cstring`` is performed, making the
- index operation unsafe.
- A Nim ``string`` is implicitly convertible
- to ``cstring`` for convenience. If a Nim string is passed to a C-style
- variadic proc, it is implicitly converted to ``cstring`` too:
- .. code-block:: nim
- proc printf(formatstr: cstring) {.importc: "printf", varargs,
- header: "<stdio.h>".}
- printf("This works %s", "as expected")
- Even though the conversion is implicit, it is not *safe*: The garbage collector
- does not consider a ``cstring`` to be a root and may collect the underlying
- memory. However in practice this almost never happens as the GC considers
- stack roots conservatively. One can use the builtin procs ``GC_ref`` and
- ``GC_unref`` to keep the string data alive for the rare cases where it does
- not work.
- A `$` proc is defined for cstrings that returns a string. Thus to get a Nim
- string from a cstring:
- .. code-block:: nim
- var str: string = "Hello!"
- var cstr: cstring = str
- var newstr: string = $cstr
- Structured types
- ----------------
- A variable of a structured type can hold multiple values at the same
- time. Structured types can be nested to unlimited levels. Arrays, sequences,
- tuples, objects and sets belong to the structured types.
- Array and sequence types
- ------------------------
- Arrays are a homogeneous type, meaning that each element in the array
- has the same type. Arrays always have a fixed length which is specified at
- compile time (except for open arrays). They can be indexed by any ordinal type.
- A parameter ``A`` may be an *open array*, in which case it is indexed by
- integers from 0 to ``len(A)-1``. An array expression may be constructed by the
- array constructor ``[]``. The element type of this array expression is
- inferred from the type of the first element. All other elements need to be
- implicitly convertible to this type.
- Sequences are similar to arrays but of dynamic length which may change
- during runtime (like strings). Sequences are implemented as growable arrays,
- allocating pieces of memory as items are added. A sequence ``S`` is always
- indexed by integers from 0 to ``len(S)-1`` and its bounds are checked.
- Sequences can be constructed by the array constructor ``[]`` in conjunction
- with the array to sequence operator ``@``. Another way to allocate space for a
- sequence is to call the built-in ``newSeq`` procedure.
- A sequence may be passed to a parameter that is of type *open array*.
- Example:
- .. code-block:: nim
-
-   type
-     IntArray = array[0..5, int] # an array that is indexed with 0..5
-     IntSeq = seq[int] # a sequence of integers
-   var
-     x: IntArray
-     y: IntSeq
-   x = [1, 2, 3, 4, 5, 6]  # [] is the array constructor
-   y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence
-   let z = [1.0, 2, 3, 4] # the type of z is array[0..3, float]
- The lower bound of an array or sequence can be obtained with the built-in proc
- ``low()``, the higher bound with ``high()``. The length can be
- obtained with ``len()``. ``low()`` for a sequence or an open array always returns
- 0, as this is the first valid index.
- One can append elements to a sequence with the ``add()`` proc or the ``&``
- operator, and remove (and get) the last element of a sequence with the
- ``pop()`` proc.
- The notation ``x[i]`` can be used to access the i-th element of ``x``.
- Arrays are always bounds checked (at compile-time or at runtime). These
- checks can be disabled via pragmas or invoking the compiler with the
- ``--boundChecks:off`` command line switch.
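- The operations described above can be sketched as follows. The sequence
- ``s`` and the array ``arr`` are illustrative; note that the array's index
- range does not have to start at 0:
-
- .. code-block:: nim
-
-   var s = @[1, 2, 3]
-   s.add(4)                         # append an element
-   doAssert s.len == 4 and high(s) == 3
-   let last = s.pop()               # remove and return the last element
-   doAssert last == 4
-   doAssert s == @[1, 2, 3]
-   doAssert low(s) == 0             # sequences always start at index 0
-
-   var arr: array[3..5, int]
-   arr[3] = 10
-   doAssert low(arr) == 3 and high(arr) == 5 and len(arr) == 3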
- Open arrays
- -----------
- Often fixed size arrays turn out to be too inflexible; procedures should
- be able to deal with arrays of different sizes. The `openarray`:idx: type
- allows this; it can only be used for parameters. Openarrays are always
- indexed with an ``int`` starting at position 0. The ``len``, ``low``
- and ``high`` operations are available for open arrays too. Any array with
- a compatible base type can be passed to an openarray parameter; the index
- type does not matter. In addition to arrays, sequences can also be passed
- to an open array parameter.
- The openarray type cannot be nested: multidimensional openarrays are not
- supported because this is seldom needed and cannot be done efficiently.
- .. code-block:: nim
- proc testOpenArray(x: openArray[int]) = echo repr(x)
- testOpenArray([1,2,3]) # array[]
- testOpenArray(@[1,2,3]) # seq[]
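- A further sketch shows that the index type of the argument does not matter:
- inside the procedure the parameter is always indexed from 0. The proc name
- ``total`` and the array ``arr`` are hypothetical:
-
- .. code-block:: nim
-
-   proc total(xs: openArray[int]): int =
-     # inside the proc the indices run from 0 to len(xs)-1
-     for i in low(xs)..high(xs):
-       result += xs[i]
-
-   var arr: array[10..12, int]      # indexed with 10..12
-   arr[10] = 1
-   arr[11] = 2
-   arr[12] = 3
-   doAssert total(arr) == 6
-   doAssert total(@[4, 5]) == 9     # sequences are accepted as well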
- Varargs
- -------
- A ``varargs`` parameter is an openarray parameter that additionally
- allows a variable number of arguments to be passed to a procedure. The compiler
- converts the list of arguments to an array implicitly:
- .. code-block:: nim
-
-   proc myWriteln(f: File, a: varargs[string]) =
-     for s in items(a):
-       write(f, s)
-     write(f, "\n")
-
-   myWriteln(stdout, "abc", "def", "xyz")
-   # is transformed to:
-   myWriteln(stdout, ["abc", "def", "xyz"])
- This transformation is only done if the varargs parameter is the
- last parameter in the procedure header. It is also possible to perform
- type conversions in this context:
- .. code-block:: nim
-
-   proc myWriteln(f: File, a: varargs[string, `$`]) =
-     for s in items(a):
-       write(f, s)
-     write(f, "\n")
-
-   myWriteln(stdout, 123, "abc", 4.0)
-   # is transformed to:
-   myWriteln(stdout, [$123, $"abc", $4.0])
- In this example ``$`` is applied to any argument that is passed to the
- parameter ``a``. (Note that ``$`` applied to strings is a nop.)
- Note that an explicit array constructor passed to a ``varargs`` parameter is
- not wrapped in another implicit array construction:
- .. code-block:: nim
- proc takeV[T](a: varargs[T]) = discard
- takeV([123, 2, 1]) # takeV's T is "int", not "array of int"
- ``varargs[typed]`` is treated specially: It matches a variable list of arguments
- of arbitrary type but *always* constructs an implicit array. This is required
- so that the builtin ``echo`` proc does what is expected:
- .. code-block:: nim
- proc echo*(x: varargs[typed, `$`]) {...}
- echo @[1, 2, 3]
- # prints "@[1, 2, 3]" and not "123"
- Tuples and object types
- -----------------------
- A variable of a tuple or object type is a heterogeneous storage
- container.
- A tuple or object defines various named *fields* of a type. A tuple also
- defines an *order* of the fields. Tuples are meant for heterogeneous storage
- types with no overhead and few abstraction possibilities. The constructor ``()``
- can be used to construct tuples. The order of the fields in the constructor
- must match the order of the tuple's definition. Different tuple-types are
- *equivalent* if they specify the same fields of the same type in the same
- order. The *names* of the fields also have to be identical.
- The assignment operator for tuples copies each component.
- The default assignment operator for objects copies each component. Overloading
- of the assignment operator is described in `type-bound-operations-operator`_.
- .. code-block:: nim
-
-   type
-     Person = tuple[name: string, age: int] # type representing a person:
-                                            # a person consists of a name
-                                            # and an age
-   var
-     person: Person
-   person = (name: "Peter", age: 30)
-   # the same, but less readable:
-   person = ("Peter", 30)
- The implementation aligns the fields for best access performance. The alignment
- is compatible with the way the C compiler does it.
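- The tuple rules described above (field access, ordering and type
- equivalence) can be sketched as follows. The variables ``p`` and ``q`` are
- illustrative:
-
- .. code-block:: nim
-
-   type
-     Person = tuple[name: string, age: int]
-
-   var p: Person = (name: "Peter", age: 30)
-   doAssert p.name == "Peter"       # access by field name
-   doAssert p[1] == 30              # access by position
-
-   let (name, age) = p              # unpacking follows the field order
-   doAssert name == "Peter" and age == 30
-
-   # same fields, same types, same order: an equivalent tuple type
-   var q: tuple[name: string, age: int] = p
-   doAssert q == p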
- For consistency with ``object`` declarations, tuples in a ``type`` section
- can also be defined with indentation instead of ``[]``:
- .. code-block:: nim
-
-   type
-     Person = tuple   # type representing a person
-       name: string   # a person consists of a name
-       age: Natural   # and an age
- Objects provide many features that tuples do not. Objects provide inheritance
- and information hiding. Objects have access to their type at runtime, so that
- the ``of`` operator can be used to determine the object's type. The ``of`` operator
- is similar to the ``instanceof`` operator in Java.
- .. code-block:: nim
-
-   type
-     Person = object of RootObj
-       name*: string  # the * means that `name` is accessible from other modules
-       age: int       # no * means that the field is hidden
-
-     Student = ref object of Person # a student is a person
-       id: int                      # with an id field
-
-   var
-     student: Student
-     person: Person
-   assert(student of Student) # is true
-   assert(student of Person) # also true
- Object fields that should be visible from outside the defining module have to
- be marked by ``*``. In contrast to tuples, different object types are
- never *equivalent*. Objects that have no ancestor are implicitly ``final``
- and thus have no hidden type field. One can use the ``inheritable`` pragma to
- introduce new object roots apart from ``system.RootObj``.
- Object construction
- -------------------
- Objects can also be created with an `object construction expression`:idx: that
- has the syntax ``T(fieldA: valueA, fieldB: valueB, ...)`` where ``T`` is
- an ``object`` type or a ``ref object`` type:
- .. code-block:: nim
- var student = Student(name: "Anton", age: 5, id: 3)
- Note that, unlike tuples, objects require the field names along with their values.
- For a ``ref object`` type ``system.new`` is invoked implicitly.
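- A self-contained sketch of object construction for a ``ref object`` type,
- repeating the ``Person`` and ``Student`` declarations used earlier in this
- section:
-
- .. code-block:: nim
-
-   type
-     Person = object of RootObj
-       name*: string
-       age: int
-     Student = ref object of Person
-       id: int
-
-   # no explicit call to ``new`` is needed for the ref object type
-   var student = Student(name: "Anton", age: 5, id: 3)
-   doAssert student != nil
-   doAssert student.name == "Anton"
-   doAssert student.id == 3
-   doAssert(student of Person)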
- Object variants
- ---------------
- An object hierarchy is often overkill in situations where simple
- variant types are all that is needed.
- An example:
- .. code-block:: nim
-
-   # This is an example how an abstract syntax tree could be modelled in Nim
-   type
-     NodeKind = enum  # the different node types
-       nkInt,          # a leaf with an integer value
-       nkFloat,        # a leaf with a float value
-       nkString,       # a leaf with a string value
-       nkAdd,          # an addition
-       nkSub,          # a subtraction
-       nkIf            # an if statement
-     Node = ref NodeObj
-     NodeObj = object
-       case kind: NodeKind  # the ``kind`` field is the discriminator
-       of nkInt: intVal: int
-       of nkFloat: floatVal: float
-       of nkString: strVal: string
-       of nkAdd, nkSub:
-         leftOp, rightOp: Node
-       of nkIf:
-         condition, thenPart, elsePart: Node
-
-   # create a new case object:
-   var n = Node(kind: nkIf, condition: nil)
-   # accessing n.thenPart is valid because the ``nkIf`` branch is active:
-   n.thenPart = Node(kind: nkFloat, floatVal: 2.0)
-
-   # the following statement raises a `FieldError` exception, because
-   # n.kind's value does not fit and the ``nkString`` branch is not active:
-   n.strVal = ""
-
-   # invalid: would change the active object branch:
-   n.kind = nkInt
-
-   var x = Node(kind: nkAdd, leftOp: Node(kind: nkInt, intVal: 4),
-                             rightOp: Node(kind: nkInt, intVal: 2))
-   # valid: does not change the active object branch:
-   x.kind = nkSub
- As can be seen from the example, an advantage over an object hierarchy is that
- no casting between different object types is needed. Yet, access to invalid
- object fields raises an exception.
- The syntax of ``case`` in an object declaration follows closely the syntax of
- the ``case`` statement: The branches in a ``case`` section may be indented too.
- In the example the ``kind`` field is called the `discriminator`:idx:\: For
- safety its address cannot be taken and assignments to it are restricted: The
- new value must not lead to a change of the active object branch. For an object
- branch switch ``system.reset`` has to be used. Also, when the fields of a
- particular branch are specified during object construction, the correct value
- for the discriminator must be supplied at compile-time.
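- As a hedged sketch of how code typically consumes such a variant object, the
- reduced tree below is evaluated with a ``case`` statement over the
- discriminator. The ``eval`` proc and the two-kind ``NodeKind`` are
- simplifications made up for this illustration:
-
- .. code-block:: nim
-
-   type
-     NodeKind = enum nkInt, nkAdd
-     Node = ref object
-       case kind: NodeKind
-       of nkInt: intVal: int
-       of nkAdd: leftOp, rightOp: Node
-
-   proc eval(n: Node): int =
-     # only the fields of the active branch are accessed
-     case n.kind
-     of nkInt: result = n.intVal
-     of nkAdd: result = eval(n.leftOp) + eval(n.rightOp)
-
-   let tree = Node(kind: nkAdd,
-                   leftOp: Node(kind: nkInt, intVal: 4),
-                   rightOp: Node(kind: nkInt, intVal: 2))
-   doAssert eval(tree) == 6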
- Set type
- --------
- .. include:: ../sets_fragment.txt
- Reference and pointer types
- ---------------------------
- References (similar to pointers in other programming languages) are a
- way to introduce many-to-one relationships. This means different references can
- point to and modify the same location in memory (also called `aliasing`:idx:).
- Nim distinguishes between `traced`:idx: and `untraced`:idx: references.
- Untraced references are also called *pointers*. Traced references point to
- objects of a garbage collected heap, untraced references point to
- manually allocated objects or to objects somewhere else in memory. Thus
- untraced references are *unsafe*. However for certain low-level operations
- (accessing the hardware) untraced references are unavoidable.
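- Aliasing through traced references can be sketched in a few lines. The
- ``Box`` type is hypothetical and exists only for this illustration:
-
- .. code-block:: nim
-
-   type
-     Box = ref object
-       data: int
-
-   var a = Box(data: 1)
-   var b = a            # copies the reference, not the object
-   b.data = 2           # modifies the shared location
-   doAssert a.data == 2
-   doAssert a == b      # both references point to the same object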
- Traced references are declared with the **ref** keyword, untraced references
- are declared with the **ptr** keyword. In general, a `ptr T` is implicitly
- convertible to the `pointer` type.
- An empty subscript ``[]`` notation can be used to dereference a reference;
- the ``addr`` procedure returns the address of an item. An address is always
- an untraced reference.
- Thus the usage of ``addr`` is an *unsafe* feature.
- The ``.`` (access a tuple/object field operator)
- and ``[]`` (array/string/sequence index operator) operators perform implicit
- dereferencing operations for reference types:
- .. code-block:: nim
-
-   type
-     Node = ref NodeObj
-     NodeObj = object
-       le, ri: Node
-       data: int
-
-   var
-     n: Node
-   new(n)
-   n.data = 9
-   # no need to write n[].data; in fact n[].data is highly discouraged!
- Automatic dereferencing is also performed for the first argument of a routine
- call. Currently this feature has to be enabled explicitly
- via ``{.experimental.}``:
- .. code-block:: nim
-
-   {.experimental.}
-
-   proc depth(x: NodeObj): int = ...
-
-   var
-     n: Node
-   new(n)
-   echo n.depth
-   # no need to write n[].depth either
- In order to simplify structural type checking, recursive tuples are not valid:
- .. code-block:: nim
- # invalid recursion
- type MyTuple = tuple[a: ref MyTuple]
- Likewise ``T = ref T`` is an invalid type.
- As a syntactical extension, ``object`` types can be anonymous if
- declared in a type section via the ``ref object`` or ``ptr object`` notations.
- This feature is useful if an object should only gain reference semantics:
- .. code-block:: nim
-   type
-     Node = ref object
-       le, ri: Node
-       data: int
- To allocate a new traced object, the built-in procedure ``new`` has to be used.
- To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and
- ``realloc`` can be used. The documentation of the system module contains
- further information.
- If a reference points to *nothing*, it has the value ``nil``.
- Special care has to be taken if an untraced object contains traced objects like
- traced references, strings or sequences: in order to free everything properly,
- the built-in procedure ``GCunref`` has to be called before freeing the untraced
- memory manually:
- .. code-block:: nim
-   type
-     Data = tuple[x, y: int, s: string]
-   # allocate memory for Data on the heap:
-   var d = cast[ptr Data](alloc0(sizeof(Data)))
-   # create a new string on the garbage collected heap:
-   d.s = "abc"
-   # tell the GC that the string is not needed anymore:
-   GCunref(d.s)
-   # free the memory:
-   dealloc(d)
- Without the ``GCunref`` call the memory allocated for the ``d.s`` string would
- never be freed. The example also demonstrates two important features for
- low-level programming. The ``sizeof`` proc returns the size of a type or value
- in bytes. The ``cast`` operator can circumvent the type system: the compiler
- is forced to treat the result of the ``alloc0`` call (which returns an untyped
- pointer) as if it had the type ``ptr Data``. Casting should only be
- done if it is unavoidable: it breaks type safety and bugs can lead to
- mysterious crashes.
- **Note**: The example only works because the memory is initialized to zero
- (``alloc0`` instead of ``alloc`` does this): ``d.s`` is thus initialized to
- ``nil``, which the string assignment can handle. One needs to know low-level
- details like this when mixing garbage collected data with unmanaged memory.
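- When the untraced object holds no garbage collected fields, no ``GCunref``
- call is involved. A minimal sketch, assuming a hypothetical ``Point`` type
- that contains only ``int`` fields:

```nim
type
  Point = object
    x, y: int

# alloc0 returns zeroed memory as an untyped pointer; cast gives it a type.
var p = cast[ptr Point](alloc0(sizeof(Point)))
doAssert p.x == 0 and p.y == 0   # zero-initialized by alloc0

p.x = 3        # implicit dereference through `.`
p[].y = 4      # explicit dereference
let sum = p.x + p.y
doAssert sum == 7

dealloc(p)     # untraced memory must be freed manually
```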
- .. XXX finalizers for traced objects
- Not nil annotation
- ------------------
- All types for which ``nil`` is a valid value can be annotated with
- ``not nil`` to exclude ``nil`` as a valid value:
- .. code-block:: nim
-   type
-     PObject = ref TObj not nil
-     TProc = (proc (x, y: int)) not nil
-   proc p(x: PObject) =
-     echo "not nil"
-   # compiler catches this:
-   p(nil)
-   # and also this:
-   var x: PObject
-   p(x)
- The compiler ensures that every code path initializes variables which contain
- non-nilable pointers. The details of this analysis are still to be specified
- here.
- Memory regions
- --------------
- The types ``ref`` and ``ptr`` can get an optional ``region`` annotation.
- A region has to be an object type.
- Regions are very useful to separate user space and kernel memory in the
- development of OS kernels:
- .. code-block:: nim
-   type
-     Kernel = object
-     Userspace = object
-   var a: Kernel ptr Stat
-   var b: Userspace ptr Stat
-   # the following does not compile as the pointer types are incompatible:
-   a = b
- As the example shows, ``ptr`` can also be used as a binary
- operator: ``region ptr T`` is a shortcut for ``ptr[region, T]``.
- In order to make generic code easier to write, ``ptr T`` is a subtype
- of ``ptr[R, T]`` for any ``R``.
- Furthermore, the subtype relation of the region object types is lifted to
- the pointer types: If ``A <: B`` then ``ptr[A, T] <: ptr[B, T]``. This can be
- used to model subregions of memory. As a special typing rule, ``ptr[R, T]`` is
- not compatible with ``pointer``, which prevents the following from compiling:
- .. code-block:: nim
-   # from system
-   proc dealloc(p: pointer)
-   # wrap some scripting language
-   type
-     PythonsHeap = object
-     PyObjectHeader = object
-       rc: int
-       typ: pointer
-     PyObject = ptr[PythonsHeap, PyObjectHeader]
-   proc createPyObject(): PyObject {.importc: "...".}
-   proc destroyPyObject(x: PyObject) {.importc: "...".}
-   var foo = createPyObject()
-   # type error here, how convenient:
-   dealloc(foo)
- Future directions:
- * Memory regions might become available for ``string`` and ``seq`` too.
- * Builtin regions like ``private``, ``global`` and ``local`` might be
- useful for an OpenCL target.
- * Builtin "regions" can model ``lent`` and ``unique`` pointers.
- * An assignment operator can be attached to a region so that proper write
- barriers can be generated. This would imply that the GC can be implemented
- completely in user-space.
- Procedural type
- ---------------
- A procedural type is internally a pointer to a procedure. ``nil`` is
- an allowed value for variables of a procedural type. Nim uses procedural
- types to achieve `functional`:idx: programming techniques.
- Examples:
- .. code-block:: nim
-   proc printItem(x: int) = ...
-   proc forEach(c: proc (x: int) {.cdecl.}) =
-     ...
-   forEach(printItem)  # this will NOT compile because calling conventions differ
- .. code-block:: nim
-   type
-     OnMouseMove = proc (x, y: int) {.closure.}
-   proc onMouseMove(mouseX, mouseY: int) =
-     # has default calling convention
-     echo "x: ", mouseX, " y: ", mouseY
-   proc setOnMouseMove(mouseMoveEvent: OnMouseMove) = discard
-   # ok, 'onMouseMove' has the default calling convention, which is compatible
-   # with 'closure':
-   setOnMouseMove(onMouseMove)
- A subtle issue with procedural types is that the calling convention of the
- procedure influences the type compatibility: procedural types are only
- compatible if they have the same calling convention. As a special extension,
- a procedure of the calling convention ``nimcall`` can be passed to a parameter
- that expects a proc of the calling convention ``closure``.
- Nim supports these `calling conventions`:idx:\:
- `nimcall`:idx:
-     is the default convention used for a Nim **proc**. It is the
-     same as ``fastcall``, but only for C compilers that support ``fastcall``.
- `closure`:idx:
-     is the default calling convention for a **procedural type** that lacks
-     any pragma annotations. It indicates that the procedure has a hidden
-     implicit parameter (an *environment*). Proc vars that have the calling
-     convention ``closure`` take up two machine words: one for the proc pointer
-     and another one for the pointer to the implicitly passed environment.
- `stdcall`:idx:
-     This is the stdcall convention as specified by Microsoft. The generated C
-     procedure is declared with the ``__stdcall`` keyword.
- `cdecl`:idx:
-     The cdecl convention means that a procedure shall use the same convention
-     as the C compiler. Under Windows the generated C procedure is declared with
-     the ``__cdecl`` keyword.
- `safecall`:idx:
-     This is the safecall convention as specified by Microsoft. The generated C
-     procedure is declared with the ``__safecall`` keyword. The word *safe*
-     refers to the fact that all hardware registers shall be pushed to the
-     hardware stack.
- `inline`:idx:
-     The inline convention means that the caller should not call the procedure,
-     but inline its code directly. Note that Nim does not inline, but leaves
-     this to the C compiler; it generates ``__inline`` procedures. This is
-     only a hint for the compiler: it may completely ignore it and
-     it may inline procedures that are not marked as ``inline``.
- `fastcall`:idx:
-     Fastcall means different things to different C compilers. One gets whatever
-     the C ``__fastcall`` means.
- `syscall`:idx:
-     The syscall convention is the same as ``__syscall`` in C. It is used for
-     interrupts.
- `noconv`:idx:
-     The generated C code will not have any explicit calling convention and thus
-     use the C compiler's default calling convention. This is needed because
-     Nim's default calling convention for procedures is ``fastcall`` to
-     improve speed.
- Most calling conventions exist only for the Windows 32-bit platform.
- The default calling convention is ``nimcall``, unless it is an inner proc (a
- proc inside of a proc). For an inner proc an analysis is performed to determine
- whether it accesses its environment. If it does, it has the calling convention
- ``closure``; otherwise it has the calling convention ``nimcall``.
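- The following sketch illustrates both rules: a ``nimcall`` proc passed where
- a ``closure`` is expected, and an inner proc that becomes a ``closure``
- because it accesses its environment. The names ``apply``, ``twice`` and
- ``makeCounter`` are invented for this example.

```nim
# A parameter of procedural type without pragmas expects a `closure`.
proc apply(f: proc (x: int): int, x: int): int =
  f(x)

# A top-level proc has the default `nimcall` convention ...
proc twice(x: int): int = x * 2

# ... and can still be passed where a `closure` is expected.
doAssert apply(twice, 21) == 42

proc makeCounter(): proc (): int =
  var count = 0
  # The anonymous proc accesses `count`, so it is a `closure`.
  result = proc (): int =
    inc count
    return count

let next = makeCounter()
doAssert next() == 1
doAssert next() == 2
```

- Each call to ``makeCounter`` creates a fresh environment, so two counters
- obtained from it do not share state.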
- Distinct type
- -------------
- A ``distinct`` type is a new type derived from a `base type`:idx: that is
- incompatible with its base type. In particular, it is an essential property
- of a distinct type that it **does not** imply a subtype relation between it
- and its base type. Explicit type conversions from a distinct type to its
- base type and vice versa are allowed.
- Modelling currencies
- ~~~~~~~~~~~~~~~~~~~~
- A distinct type can be used to model different physical `units`:idx: with a
- numerical base type, for example. The following example models currencies.
- Different currencies should not be mixed in monetary calculations. Distinct
- types are a perfect tool to model different currencies:
- .. code-block:: nim
-   type
-     Dollar = distinct int
-     Euro = distinct int
-   var
-     d: Dollar
-     e: Euro
-   echo d + 12
-   # Error: cannot add a number with no unit and a ``Dollar``
- Unfortunately, ``d + 12.Dollar`` is not allowed either,
- because ``+`` is defined for ``int`` (among others), not for ``Dollar``. So
- a ``+`` for dollars needs to be defined:
- .. code-block:: nim
-   proc `+` (x, y: Dollar): Dollar =
-     result = Dollar(int(x) + int(y))
- It does not make sense to multiply a dollar by a dollar, only by a
- number without a unit; the same holds for division:
- .. code-block:: nim
-   proc `*` (x: Dollar, y: int): Dollar =
-     result = Dollar(int(x) * y)
-   proc `*` (x: int, y: Dollar): Dollar =
-     result = Dollar(x * int(y))
-   proc `div` ...
- This quickly gets tedious. The implementations are trivial and the compiler
- should not generate all this code only to optimize it away later; after all,
- ``+`` for dollars should produce the same binary code as ``+`` for ints.
- The pragma `borrow`:idx: has been designed to solve this problem; in principle
- it generates the above trivial implementations:
- .. code-block:: nim
-   proc `*` (x: Dollar, y: int): Dollar {.borrow.}
-   proc `*` (x: int, y: Dollar): Dollar {.borrow.}
-   proc `div` (x: Dollar, y: int): Dollar {.borrow.}
- The ``borrow`` pragma makes the compiler use the same implementation as
- the proc that deals with the distinct type's base type, so no code is
- generated.
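- Put together, a small self-contained sketch of a borrowed operator set might
- look as follows. The selection of operators is only illustrative, and the
- values are arbitrary.

```nim
type
  Dollar = distinct int

# Borrow the implementations from the base type `int`.
proc `+`(x, y: Dollar): Dollar {.borrow.}
proc `*`(x: Dollar, y: int): Dollar {.borrow.}
proc `==`(x, y: Dollar): bool {.borrow.}

let total = Dollar(5) + Dollar(7) * 2
doAssert total == Dollar(19)
doAssert int(total) == 19   # explicit conversion back to the base type
```

- An expression such as ``total + 1`` is still rejected at compile time,
- because no ``+`` accepting a ``Dollar`` and an ``int`` has been defined.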
- But it seems all this boilerplate code needs to be repeated for the ``Euro``
- currency. This can be solved with templates_.
- .. code-block:: nim
-   template additive(typ: typedesc) =
-     proc `+` *(x, y: typ): typ {.borrow.}
-     proc `-` *(x, y: typ): typ {.borrow.}
-     # unary operators:
-     proc `+` *(x: typ): typ {.borrow.}
-     proc `-` *(x: typ): typ {.borrow.}
-   template multiplicative(typ, base: typedesc) =
-     proc `*` *(x: typ, y: base): typ {.borrow.}
-     proc `*` *(x: base, y: typ): typ {.borrow.}
-     proc `div` *(x: typ, y: base): typ {.borrow.}
-     proc `mod` *(x: typ, y: base): typ {.borrow.}
-   template comparable(typ: typedesc) =
-     proc `<` * (x, y: typ): bool {.borrow.}
-     proc `<=` * (x, y: typ): bool {.borrow.}
-     proc `==` * (x, y: typ): bool {.borrow.}
-   template defineCurrency(typ, base: untyped) =
-     type
-       typ* = distinct base
-     additive(typ)
-     multiplicative(typ, base)
-     comparable(typ)
-   defineCurrency(Dollar, int)
-   defineCurrency(Euro, int)
- The borrow pragma can also be used to annotate the distinct type to allow
- certain builtin operations to be lifted:
- .. code-block:: nim
-   type
-     Foo = object
-       a, b: int
-       s: string
-     Bar {.borrow: `.`.} = distinct Foo
-   var bb: ref Bar
-   new bb
-   # field access now valid
-   bb.a = 90
-   bb.s = "abc"
- Currently only the dot accessor can be borrowed in this way.
- Avoiding SQL injection attacks
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- An SQL statement that is passed from Nim to an SQL database might be
- modelled as a string. However, using string templates and filling in the
- values is vulnerable to the famous `SQL injection attack`:idx:\:
- .. code-block:: nim
-   import strutils
-   proc query(db: DbHandle, statement: string) = ...
-   var
-     username: string
-   db.query("SELECT * FROM users WHERE name = '$1'" % username)
-   # Horrible security hole, but the compiler does not mind!
- This can be avoided by distinguishing strings that contain SQL from strings
- that don't. Distinct types provide a means to introduce a new string type
- ``SQL`` that is incompatible with ``string``:
- .. code-block:: nim
-   type
-     SQL = distinct string
-   proc query(db: DbHandle, statement: SQL) = ...
-   var
-     username: string
-   db.query("SELECT * FROM users WHERE name = '$1'" % username)
-   # Error at compile time: `query` expects an SQL string!
- It is an essential property of distinct types that they **do not** imply a
- subtype relation between the distinct type and its base type. Explicit type
- conversions from ``string`` to ``SQL`` are allowed:
- .. code-block:: nim
-   import strutils, sequtils
-   proc properQuote(s: string): SQL =
-     # quotes a string properly for an SQL statement
-     return SQL(s)
-   proc `%` (frmt: SQL, values: openarray[string]): SQL =
-     # quote each argument:
-     let v = values.mapIt(SQL, properQuote(it))
-     # we need a temporary type for the type conversion :-(
-     type StrSeq = seq[string]
-     # call strutils.`%`:
-     result = SQL(string(frmt) % StrSeq(v))
-   db.query("SELECT * FROM users WHERE name = '$1'".SQL % [username])
- Now we have compile-time checking against SQL injection attacks. Since
- ``"".SQL`` is transformed to ``SQL("")`` no new syntax is needed for nice
- looking ``SQL`` string literals. The hypothetical ``SQL`` type actually
- exists in the library as the `TSqlQuery type <db_sqlite.html#TSqlQuery>`_ of
- modules like `db_sqlite <db_sqlite.html>`_.
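- A self-contained sketch of the same idea is given below. It assumes a
- hypothetical quoting rule (wrapping the value in single quotes and doubling
- embedded single quotes) that a real database driver may not share; the helper
- name ``quoteArg`` is invented for this illustration.

```nim
import strutils

type
  SQL = distinct string

proc quoteArg(s: string): string =
  # Assumed quoting rule, for illustration only: wrap the value in single
  # quotes and double any embedded single quote.
  "'" & s.replace("'", "''") & "'"

proc `%`(frmt: SQL, values: openArray[string]): SQL =
  # Quote every argument, then delegate to strutils.`%` on the base type.
  var quoted: seq[string] = @[]
  for v in values:
    quoted.add quoteArg(v)
  SQL(string(frmt) % quoted)

let q = "SELECT * FROM users WHERE name = $1".SQL % ["O'Brien"]
doAssert string(q) == "SELECT * FROM users WHERE name = 'O''Brien'"
```

- Because the quoting happens inside the ``%`` overload for ``SQL``, the
- placeholder in the format string is written without surrounding quotes.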
- Void type
- ---------
- The ``void`` type denotes the absence of any type. Parameters of
- type ``void`` are treated as non-existent; ``void`` as a return type means that
- the procedure does not return a value:
- .. code-block:: nim
-   proc nothing(x, y: void): void =
-     echo "ha"
-   nothing() # writes "ha" to stdout
- The ``void`` type is particularly useful for generic code:
- .. code-block:: nim
-   proc callProc[T](p: proc (x: T), x: T) =
-     when T is void:
-       p()
-     else:
-       p(x)
-   proc intProc(x: int) = discard
-   proc emptyProc() = discard
-   callProc[int](intProc, 12)
-   callProc[void](emptyProc)
- However, a ``void`` type cannot be inferred in generic code:
- .. code-block:: nim
-   callProc(emptyProc)
-   # Error: type mismatch: got (proc ())
-   # but expected one of:
-   # callProc(p: proc (T), x: T)
- The ``void`` type is only valid for parameters and return types; other symbols
- cannot have the type ``void``.
- Auto type
- ---------
- The ``auto`` type can only be used for return types and parameters. For return
- types it causes the compiler to infer the type from the routine body:
- .. code-block:: nim
-   proc returnsInt(): auto = 1984
- For parameters it currently creates implicitly generic routines:
- .. code-block:: nim
-   proc foo(a, b: auto) = discard
- This is the same as:
- .. code-block:: nim
-   proc foo[T1, T2](a: T1, b: T2) = discard
- However, later versions of the language might change this to mean "infer the
- parameters' types from the body". Then the above ``foo`` would be rejected as
- the parameters' types cannot be inferred from an empty ``discard`` statement.
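- A short sketch of both current uses of ``auto``; the proc ``describe`` is a
- made-up name used only for this illustration:

```nim
proc returnsInt(): auto = 1984   # return type inferred as int

# Implicitly generic: instantiated separately for each pair of argument types.
proc describe(a, b: auto): string =
  $a & " / " & $b

doAssert returnsInt() is int
doAssert describe(1, "two") == "1 / two"
doAssert describe(3.5, true) == "3.5 / true"
```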
|