Single header reliable datagram library written in ANSI C

Networking Queue

Same old story all over again: I tried to use existing libraries like everybody else. I checked UDT, ENet, nng and several of Plan9's RUDP implementations, but found all of them overcomplicated and their APIs hard to integrate. So I wrote my own.

NetQ is a single header ANSI C library that provides reliable datagrams with the simplest possible API and no dependencies. It does not provide any networking functions; all it cares about is small data blocks with a header, thus it can be used with a large variety of transport protocols (works with IPv4 and IPv6 UDP and with any number of peers). It provides a small circular FIFO on both the sender and the receiver side, and takes care of re-transmitting lost packets, handling duplicated packets and re-ordering packets. It does not allocate memory at all, meaning it's robust, thread-safe and super-duper fast. Perfectly suitable for microcontrollers with limited resources. I've also chosen algorithms that keep the additional network traffic as low as possible. (And it is also extremely small, about 400 SLoC, ca. 16k only. Being super small and dependency-free, audit and correctness checking is trivial.)

If keeping track of only the last 16 (or 32 or 64) packets isn't enough for your use case, then choose a different library or use a connection-aware stream protocol.

Even though its API is extremely simple (1 function to send, 2 functions to receive, that's all), it is also highly flexible and configurable with defines. Depending on the configuration, the size of the header that prefixes your message in raw packets varies from 3 bytes to 16 bytes. NetQ was influenced by RFC 1982 and Plan9's RUDP, and I took ideas from Glenn Fiedler. Big kudos to all of those people!

[[TOC]]

Usage

See client.c and server.c for a fully working example that supports IPv4 and IPv6 UDP datagrams as transport.

Overview

Include netq.h in your source files. In exactly one of your source files, define NETQ_IMPLEMENTATION. When you do that, you also have to define a raw packet sender callback function in NETQ_SEND (see its prototype and a minimal example below):

#define NETQ_IMPLEMENTATION
#define NETQ_SEND mypacketsender
#include "netq.h"

There's no need to initialize this library; just make sure the netq_t structure is zeroed out on startup. You do have to initialize your own my_net struct, of course.

Sending a message is as simple as calling netq_send() with a buffer.

char *buf = "This is a message";

netq_send(&netq_ctx, buf, strlen(buf) + 1, &my_net);

On the other hand, instead of a single receiver function like in other libraries, NetQ has two functions here, netq_push() and netq_pop(). This is the simplest approach to provide in-order datagram delivery and multiple peers at once. With these two functions, a typical single-threaded receiver code looks like this:

char buf[1280];
netq_t *nq;
int ret, i;

while(1) {
    /* receive raw packets; addrlen is an in/out argument, so reset it before every call */
    my_net.addrlen = sizeof(my_net.peer_addr);
    if((ret = recvfrom(sock, buf, sizeof(buf), MSG_DONTWAIT,
      (struct sockaddr *)&my_net.peer_addr, &my_net.addrlen)) > 0) {
        /* get the netq context for this message's peer, for example use
         * the source ip and port or use a session cookie in the packet, etc. */
        nq = my_get_netq_instance(&my_net);
        /* add it to the peer's queue */
        netq_push(nq, buf, ret, &my_net);
    }

    /* iterate on all network queues, and see if any has a message to be processed */
    for(i = 0; i < number_of_peers; i++)
        if((ret = netq_pop(&netq_contexts[i], buf, sizeof(buf))) > 0) {
            /* process your message in buf here */
            printf("peer #%d has sent %d bytes: '%s'\n", i, ret, buf);
        }
}

It is important to call netq_pop() even if no raw packet was received, because it might still return a message from its queue. For a performance boost, set a flag when a raw packet was received, and clear it when all netq_pop() calls returned 0. Use a blocking recvfrom if this flag is clear, and a non-blocking one otherwise. This way you can avoid checking the network socket and iterating over the peer list all the time.

See also the "Thread-safety" section below about a particular threaded configuration that provides full performance.
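The receiver loop above leaves my_get_netq_instance() up to the application. As a minimal sketch of the "source ip and port" approach, assuming a small fixed peer table where slot i belongs to netq_contexts[i], the lookup could look like the code below. All names here (MY_MAX_PEERS, my_peers, my_same_peer, my_peer_index) are made up for this example and are not part of netq.h.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define MY_MAX_PEERS 16   /* assumption: a small, fixed number of peers */

/* Hypothetical peer table: slot i is meant to pair with netq_contexts[i]. */
static struct sockaddr_storage my_peers[MY_MAX_PEERS];
static int my_num_peers = 0;

/* Return 1 if the two addresses have the same family, IP and port. */
static int my_same_peer(const struct sockaddr_storage *a, const struct sockaddr_storage *b)
{
    if(a->ss_family != b->ss_family) return 0;
    if(a->ss_family == AF_INET) {
        const struct sockaddr_in *x = (const struct sockaddr_in *)a;
        const struct sockaddr_in *y = (const struct sockaddr_in *)b;
        return x->sin_port == y->sin_port && x->sin_addr.s_addr == y->sin_addr.s_addr;
    }
    if(a->ss_family == AF_INET6) {
        const struct sockaddr_in6 *x = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *y = (const struct sockaddr_in6 *)b;
        return x->sin6_port == y->sin6_port &&
               !memcmp(&x->sin6_addr, &y->sin6_addr, sizeof(x->sin6_addr));
    }
    return 0;   /* unknown address family never matches */
}

/* Find the slot of a peer, registering it on first sight.
 * Returns the slot index, or -1 if the table is full. */
static int my_peer_index(const struct sockaddr_storage *addr)
{
    int i;
    for(i = 0; i < my_num_peers; i++)
        if(my_same_peer(&my_peers[i], addr)) return i;
    if(my_num_peers >= MY_MAX_PEERS) return -1;
    my_peers[my_num_peers] = *addr;
    return my_num_peers++;
}
```

With something like this, my_get_netq_instance() could return the context at the index that my_peer_index() gives for my_net.peer_addr (after checking for -1). A linear search is fine for a handful of peers; a hash table would be the obvious replacement for many.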

Configuration

IMPORTANT!!! You must configure the server and the clients the SAME way!!!

To configure the library, you'll use defines in the source file where you have the implementation.

Define	Description
`NETQ_SEND`	Mandatory. Name of the low level raw packet sender function
`NETQ_SEQ_BITS`	Sequence size in bits (8, 16 or 32, defaults to 16)
`NETQ_ACK_BITS`	The ack mask size (16, 32 or 64, defaults to 16)
`NETQ_QUEUE_SIZE`	The queue's size (at least `NETQ_ACK_BITS`)
`NETQ_MTU`	Number of bytes (including header) that fit into a raw packet
`NETQ_NACK_TRES`	NACK threshold, how many lost packets we tolerate before sending a NACK
`NETQ_MUTEX_TYPE`	Locking mutex type (uses `pthread` by default, but see below)
`NETQ_MUTEX_LOCK`	Locking mutex acquire
`NETQ_MUTEX_UNLOCK`	Locking mutex release
`NETQ_NO_PTHREAD`	Explicitly disable locking even if `pthread` found

The header size is (2 * NETQ_SEQ_BITS + NETQ_ACK_BITS) / 8 bytes. Your biggest message can be NETQ_MTU minus the header size. The IPv6 RFC guarantees that the MTU is at least 1280 bytes, and usually it's no more than 1500 bytes (but keep in mind that the IP and UDP layers have their own headers too, so probably 1400 bytes in practice).
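As a quick sanity check of the header size formula, here is a small sketch that computes the header size and the largest message for a given configuration. The helper names netq_header_size and netq_max_message are made up for this example; they are not part of netq.h and only restate the arithmetic.

```c
#include <assert.h>

/* Hypothetical helpers, not part of netq.h. */

/* Header size in bytes: two sequence numbers plus the ack bitmask. */
static int netq_header_size(int seq_bits, int ack_bits)
{
    /* both sizes are expected to be whole bytes */
    assert(seq_bits % 8 == 0 && ack_bits % 8 == 0);
    return (2 * seq_bits + ack_bits) / 8;
}

/* Largest message that still fits into one raw packet of mtu bytes. */
static int netq_max_message(int mtu, int seq_bits, int ack_bits)
{
    return mtu - netq_header_size(seq_bits, ack_bits);
}
```

For 16 bit sequence numbers and a 16 bit ack mask this gives a 6 byte header, so an MTU of 1280 leaves 1274 bytes for the message.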

The raw packet sender function's prototype looks like:

int NETQ_SEND(void *net, const void *message, int length);

The transport layer's context net is fully opaque to NetQ; it just passes this pointer around. The simplest possible implementation for UDP packets could be a plain wrapper around sendto:

typedef struct {
    int sock;
    struct sockaddr_storage peer_addr;
    socklen_t addrlen;
} my_net_t;

int my_sender(void *net, const void *message, int length)
{
    my_net_t *my_net = (my_net_t *)net;
    /* MSG_CONFIRM is Linux specific, pass 0 on other systems */
    return sendto(my_net->sock, message, length, MSG_CONFIRM,
        (const struct sockaddr *)&my_net->peer_addr, my_net->addrlen);
}

Thread-safety

Normally you don't have to care. However, if you decide to call these functions from different threads on the same queue (you shouldn't), then there could be a problem if the calls happen at the same time. To avoid issues, you have three options:

First, the simplest and easiest: just include pthread.h before you include the NetQ implementation, and thread-safety will be automagically taken care of for you (unless you explicitly define NETQ_NO_PTHREAD, see below).

#include <pthread.h>
#define NETQ_IMPLEMENTATION
#include "netq.h"

Second, if pthread doesn't suit your needs for whatever reason, then you can use any other locking mechanism instead. For this, define NETQ_MUTEX_TYPE (the type of the mutex), NETQ_MUTEX_LOCK(m) (a function to acquire the lock on the mutex) and NETQ_MUTEX_UNLOCK(m) (to release it). For example, if you're using the SDL library, then this would be:

#define NETQ_MUTEX_TYPE      SDL_mutex
#define NETQ_MUTEX_LOCK(m)   SDL_LockMutex(m)
#define NETQ_MUTEX_UNLOCK(m) SDL_UnlockMutex(m)
#define NETQ_IMPLEMENTATION
#include "netq.h"

With NETQ_MUTEX_TYPE defined, NetQ will protect all queue access against concurrency. This takes place on a per-queue basis, so you can use different queues from different threads at the same time without any performance penalty. (Using only one queue from multiple threads is a really, really bad idea, never do that, kids.)

Third, the NETQ_NO_PTHREAD define exists because there's one particular threaded configuration which does not need locking.

One receiver thread     Processing threads
                        +---------------+
                        | thread 2      |
                        +---------------+
                        | netq_pop()    |
                        | netq_send()   |
                        +---------------+
                        +---------------+
+-----------------+     | thread 3      |
| thread 1        |     +---------------+    x   as many as connected peers
+-----------------+     | netq_pop()    |
| netq_push()     |     | netq_send()   |
+-----------------+     +---------------+
                            ...
                        +---------------+
                        | thread N      |
                        +---------------+
                        | netq_pop()    |
                        | netq_send()   |
                        +---------------+

In this setup there's one thread that receives packets from the network for all peers, and all the other threads just pop from their corresponding peer's queue, do the processing and send the results back. Here the receiver thread never calls pop or send, and the processing threads never call push. Furthermore, each processing thread operates on its own queue exclusively.

NetQ uses circular buffers and was written carefully so that heads and tails are never increased from different functions. netq_push() only increases seq_in and seq_ack, but never touches seq_out or seq_pop; netq_send() never touches seq_pop or seq_in; and finally netq_pop() never touches seq_in or seq_out. This means that in this one particular setup you can omit locking entirely, which results in full throttle throughput and performance while still being thread-safe.

(Not so accidentally, this setup happens to be the best practice for writing high-performance servers.)
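To illustrate why "each index has only one writer" makes the lock-free setup possible, here is a generic single-producer, single-consumer ring buffer. This is not NetQ's code and does not use its seq_in / seq_pop fields; ring_t, ring_push and ring_pop are names made up for this sketch, and it relies on C11 atomics (so it is not ANSI C) to make the ordering between the two threads explicit.

```c
#include <stdatomic.h>

#define RING_SIZE 16u   /* must be a power of two */

typedef struct {
    int slots[RING_SIZE];
    atomic_uint head;   /* written by the producer only */
    atomic_uint tail;   /* written by the consumer only */
} ring_t;               /* a zero-initialized static ring_t is an empty ring */

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int ring_push(ring_t *r, int value)
{
    unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if(head - tail == RING_SIZE) return 0;
    r->slots[head % RING_SIZE] = value;
    /* publish the slot's content before the new head becomes visible */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer side: returns 1 and stores the value, or 0 if the ring is empty. */
static int ring_pop(ring_t *r, int *value)
{
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);
    if(head == tail) return 0;
    *value = r->slots[tail % RING_SIZE];
    /* hand the slot back only after the value was copied out */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}
```

ring_push() reads tail but only ever writes head, and ring_pop() reads head but only ever writes tail, so as long as there is exactly one pushing thread and one popping thread per ring, the two never race on the same index.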

API

The nq argument is of netq_t type, but totally opaque to the application, as the netq_t struct is only available in the source file where you have defined NETQ_IMPLEMENTATION. There's no need to initialize this library; it is enough if this struct is zeroed out before use. If you have configured custom mutexes, and your preferred mutex implementation requires initialization, then you have to do that on your own (on netq_t.mutex) when you create the netq_t instance.

Similarly, the networking layer context net is opaque too, but this time to NetQ; it can be whatever your application wants it to be. Likewise, all the messages are void *, so that you can pass whatever struct or array you want.

Sending Messages

int netq_send(void *nq, const void *msg, int len, void *net);

Send out a message and store it in the queue (in case re-transmission is needed later).

Argument	Description
`nq`	The NetQ instance (opaque to the application)
`msg`	The message buffer
`len`	Length of message
`net`	Your networking layer instance (opaque to NetQ)

Returns -2 if the message is too big, otherwise the same as your raw packet sender's return value (-1 on communication error or the number of bytes sent).

Just for the completeness of this documentation, here are the three non-message variants (no need to know):

int netq_rst(void *nq, void *net);      /* no need to know, */
int netq_ack(void *nq, void *net);      /* no need to use */
int netq_nack(void *nq, void *net);     /* these, just for completeness */

These can be used to send technical, non-reliable, non-message packets (reset, ack, and negative ack, which is a request for re-transmission). There should be no need for these, they are here just in case.

Receiving Messages

int netq_push(void *nq, const void *raw, int len, void *net);

Push a raw packet (message with header) to the queue and send ACK or NACK automatically if needed.

Argument	Description
`nq`	The NetQ instance (opaque to the application)
`raw`	The raw packet (message with header) buffer
`len`	Length of raw packet
`net`	Your networking layer instance (opaque to NetQ)

Returns 1 on success, 0 on error (more packets were lost or have arrived than the queue size).

int netq_pop(void *nq, void *msg, int len);

Receive the next message in correct order.

Argument	Description
`nq`	The NetQ instance (opaque to the application)
`msg`	The output buffer (should be at least NETQ_MTU big)
`len`	Length of output buffer

Returns -1 on error (buffer too small or queue overflowed), 0 if there was no new message, otherwise the number of bytes received and message in msg.

Just for the completeness of this documentation, there's a misc function too (no need to know):

int netq_pend(void *nq);            /* no need to know, no need to use */

Returns 1 if there's a message pending in the receiver queue. No need to call this, just check if netq_pop() returns 0 instead.

Debugging

void netq_dump(void *nq);

This function is only available if NETQ_NODEBUG isn't defined. It simply dumps the queues to stdout using printf.

Memory Requirements

The netq_t struct (in bytes):

Configuration	Sequence Bits	Ack Bits	Queue Size	MTU Size	sizeof(netq_t)
smallest	8	8	8	576	9,252
default	16	16	16	1280	41,034
biggest	32	64	64	1500	192,272

The size is mostly determined by NETQ_QUEUE_SIZE and NETQ_MTU. Also note that there are lots of other configurations in between; use the one that suits your use case best.

Packet Header Format

The header starts with the outgoing packet's sequence number (NETQ_SEQ_BITS bits), followed by another number of the same size, the largest received packet's sequence number (Ack Seq). These are in network byte order, i.e. big endian. The third and last part is a bitmask, NETQ_ACK_BITS bits long, and always little endian (i.e. the first bit is in the first byte, the last bit is in the last byte).

Sequence Bits	Ack Bits	Header Size
8	8	3
16	8	5
32	8	9
8	16	4
(default) 16	16	6
32	16	10
8	32	6
16	32	8
32	32	12
8	64	10
16	64	12
32	64	16
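To make the layout concrete, here is a minimal sketch that decodes a header with 16 sequence bits and 16 ack bits (6 bytes). It is not NetQ's own parser: my_header_t, MY_HEADER_SIZE and my_parse_header are hypothetical names for this example, and the code only follows the byte order described in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoded view of a 16 / 16 bit header. */
typedef struct {
    unsigned int out_seq;   /* Out Seq, including its most significant bit */
    unsigned int ack_seq;   /* Ack Seq, including its most significant bit */
    unsigned int ack_mask;  /* ack bitmask, bit 0 comes from the first mask byte */
} my_header_t;

#define MY_HEADER_SIZE 6    /* (2 * 16 + 16) / 8 */

/* Decode the header of a raw packet. Returns the offset where the message
 * starts, or -1 if the packet is too short to hold a header. */
static int my_parse_header(const unsigned char *raw, int len, my_header_t *hdr)
{
    assert(raw != NULL && hdr != NULL);
    if(len < MY_HEADER_SIZE) return -1;
    hdr->out_seq  = ((unsigned int)raw[0] << 8) | raw[1];   /* big endian */
    hdr->ack_seq  = ((unsigned int)raw[2] << 8) | raw[3];   /* big endian */
    hdr->ack_mask = raw[4] | ((unsigned int)raw[5] << 8);   /* little endian */
    return MY_HEADER_SIZE;
}
```

For other configurations only the field widths change; the order (Out Seq, Ack Seq, bitmask) stays the same.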

The most significant bits in the sequence numbers encode the message type (considering big endianness, that's first byte & 0x80).

Out Seq MSB	Ack Seq MSB	Meaning
0	0	Normal, reliable message
1	0	Reset
0	1	Acknowledge
1	1	Negative Acknowledge
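The table above can be turned into a small classifier. This is only a sketch of the rule as documented here; my_type_t and my_packet_type are made-up names, not part of netq.h, and seq_bytes is assumed to be NETQ_SEQ_BITS / 8.

```c
#include <assert.h>

typedef enum { MY_MSG_NORMAL, MY_MSG_RESET, MY_MSG_ACK, MY_MSG_NACK } my_type_t;

/* Classify a raw packet by the top bits of its two sequence numbers.
 * raw must hold at least 2 * seq_bytes bytes. */
static my_type_t my_packet_type(const unsigned char *raw, int seq_bytes)
{
    int out_msb, ack_msb;
    assert(seq_bytes == 1 || seq_bytes == 2 || seq_bytes == 4);
    out_msb = (raw[0] & 0x80) != 0;          /* first byte of Out Seq */
    ack_msb = (raw[seq_bytes] & 0x80) != 0;  /* first byte of Ack Seq */
    if(out_msb && ack_msb) return MY_MSG_NACK;
    if(out_msb) return MY_MSG_RESET;
    if(ack_msb) return MY_MSG_ACK;
    return MY_MSG_NORMAL;
}
```

Because both sequence numbers are big endian, the type bits always sit in the first byte of each field, regardless of the configured sequence size.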

The ack bitmask is as follows: the first bit (the LSB, bit 0 of the first byte) corresponds to the given Ack Seq. The next bit (bit 1 of the first byte) to Ack Seq - 1. The 9th bit (bit 0 of the second byte) to Ack Seq - 8, etc. Normally these bits indicate which packets were received. However, in a Negative Acknowledge packet these bits are negated, meaning they mark the packets that need re-transmission.
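Reading a single bit of the mask can be sketched as below, assuming the pattern continues so that bit n (counting from 0) refers to the packet Ack Seq - n. my_mask_bit is a hypothetical helper for this example, not a NetQ function; mask points at the bitmask part of the header and mask_bits is NETQ_ACK_BITS.

```c
#include <assert.h>
#include <stddef.h>

/* Return bit n of the ack bitmask (the bit that refers to Ack Seq - n),
 * or 0 if n lies outside the tracked window. */
static int my_mask_bit(const unsigned char *mask, int mask_bits, int n)
{
    assert(mask != NULL);
    if(n < 0 || n >= mask_bits) return 0;
    /* little endian: bits 0..7 are in the first byte, 8..15 in the second, ... */
    return (mask[n / 8] >> (n % 8)) & 1;
}
```

In an Acknowledge packet a set bit then reads as "this packet was received", while in a Negative Acknowledge packet a set bit reads as "this packet needs re-transmission".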

This simple protocol does not negotiate a connection. If you want that, you can easily implement a simple TCP-like SYN, SYN+ACK, ACK exchange on top of it using normal NetQ messages.
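One possible shape for such a handshake is a tiny state machine driven by one-byte tags carried in ordinary NetQ messages. Everything below is an assumption made for illustration: the tag values, the state names and the my_connect / my_handshake functions are invented here, and NetQ itself knows nothing about them.

```c
/* Hypothetical one-byte message tags, sent as normal NetQ messages. */
enum { MY_SYN = 1, MY_SYN_ACK = 2, MY_ACK = 3 };

typedef enum { MY_CLOSED, MY_SYN_SENT, MY_SYN_RCVD, MY_ESTABLISHED } my_state_t;

/* Client side: start connecting. Returns the tag to send. */
static int my_connect(my_state_t *st)
{
    *st = MY_SYN_SENT;
    return MY_SYN;
}

/* Feed a received tag into the state machine.
 * Returns the tag to send back, or 0 if nothing has to be sent. */
static int my_handshake(my_state_t *st, int tag)
{
    switch(*st) {
        case MY_CLOSED:     /* server: waiting for a client */
            if(tag == MY_SYN) { *st = MY_SYN_RCVD; return MY_SYN_ACK; }
            break;
        case MY_SYN_SENT:   /* client: waiting for the server's answer */
            if(tag == MY_SYN_ACK) { *st = MY_ESTABLISHED; return MY_ACK; }
            break;
        case MY_SYN_RCVD:   /* server: waiting for the final ack */
            if(tag == MY_ACK) *st = MY_ESTABLISHED;
            break;
        default:            /* established: handshake tags are ignored */
            break;
    }
    return 0;
}
```

Each returned tag would go out with netq_send() and each received one would come from netq_pop(). Since NetQ already delivers messages reliably and in order, the state machine itself needs no timers or retries.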

License

NetQ is licensed under the terms of the MIT license. It is written entirely from scratch, no other sources were used, and it is totally dependency-free (save libc of course).

Have fun with it, bzt
