Quick and Dirty Brainfuck Interpreter and Compiler

Why? Because why not? Stupid example on how to generate machine independent bytecode, run it through various optimizer techniques, then interpret it in a VM, or compile it into C or Assembly source or directly into a native ELF executable.

Compilation

Just run make. No library dependencies of any kind, just binutils and gcc.

Usage

BrainFuck by bzt Copyright (C) 2021 MIT license

./bf [-p] [-O[01234]] [-m<size>] [-w|-d] <bf source> [-b|-c|-s|-o] [out]

  -p   turn on profiling
  -O   optimization level
  -m   set program memory size
  -w   use word cells (16 bit)
  -d   use dword cells (32 bit)
  -b   output bytecode
  -c   output C source
  -s   output x86_64 assembly
  -o   output ELF binary

Interpreter

Specify the BF script as single parameter.

$ ./bf tests/hello.bf
Hello World!

With -p it will do profiling too:

$ ./bf -p tests/hello.bf
Optimizer:      0.000004 sec
Hello World!
Execution:      0.000052 sec

Syntax and run-time checks are made on the script. Error messages look like this:

<source file>:<line>:<col>: <message>

Compiler

With the -o flag, BrainFuck compiler outputs ELF executable (x86_64 only, requires the binutils package to be installed):

$ ./bf tests/hello.bf -o hello
$ ./hello
Hello World!

You can also create ANSI C and x86_64 sources (which in turn can be compiled into native code) with this compiler. For example:

$ ./bf tests/hello.bf -c hello.c
$ gcc hello.c -o hello
$ ./hello
Hello World!

The -c flag outputs ANSI C source. You can compile that with gcc -O3 but that's still not the real thing. Instead, let Brainfuck do the optimizations and use the -s flag to output x86_64 Assembly source (this way the result is about 4 times faster than the gcc compiled C source).

$ ./bf tests/hello.bf -s hello.s
$ gcc hello.s -o hello
$ ./hello
Hello World!

Common Flags

You can specify the cell size (which defaults to byte) with -w (word) and -d (dword), like in these examples:

$ ./bf tests/bitwidth.bf
Hello World! 255
$ ./bf -w tests/bitwidth.bf
Hello world! 65535
$ ./bf -d tests/bitwidth.bf
Hello, world!
$

To set the program's memory limit (in cells, defaults to 32768 cells), use -m like

$ ./bf -m8192 tests/hello.bf
Hello World!

Optimization can be set with -O0 through -O4 (level 4 only makes sense with the compiler, not with the interpreter).

Flag	Optimization
-O0	no optimization at all (sloooooow)
-O1	run-length encode tokens
-O2	like -O1 plus dead loop elimination and zero out cell
-O3	like -O2 plus fast find and cell move mnemonics
-O4	like -O3 but without run-time boundary checks

Bytecode

You can save the optimized, intermediate bytecode into a file, like this:

$ ./bf tests/hello.bf -b hello.bc
$ hexdump -C hello.bc
00000000  42 43 21 c0 00 19 01 c0  00 16 11 31 01 21 c0 00  |BC!........1.!..|
00000010  15 01 23 01 4f 2b 01 5c  f1 00 01 31 f2 00 01 01  |..#.O+.\...1....|
00000020  31 58 53 2a 11 51 02 04  34 60 12 23 60 11 31 60  |1XS*.Q..4..#..1.|
00000030  60 23 60 11 31 60 03 60  12 60 23 60 36 60 01 31  |.#..1.....#.6..1|
00000040  60 12 21 60 11 60                                 |..!...|

Then the bytecode can be read back as if it were a plain text BF script (but this time no optimizations will take place):

$ ./bf hello.bc
Hello World!
$

The bytecode file starts with a 2 bytes magic, BC, followed by encoded instructions:

OpCode	Mnemonic	Source	Description
i000xxxx	INC	>	increment pointer by x
i001xxxx	DEC	<	decrement pointer by x
i010xxxx	ADD	+	add x to the pointed cell
i011xxxx	SUB	-	substract x from the pointed cell
i100xxxx	JZ	[	jump to x if pointed cell is zero
i101xxxx	JNZ	]	jump to x if pointed cell is non-zero
0110????	PUT	.	output pointed cell
0111????	GET	,	get input into pointed cell
11110000	ZRO	[-]	zero out cell
1111i001	FFZ x	[>]	find first zero cell in x big steps
1111i010	FLZ x	[<]	find last zero cell in x big steps
1111i011	MVD x	[-<+>]	move cell value to down cell
1111i100	MVU x	[->+<]	move cell value to up cell

The highest bit, i indicates the existance of an immedate operand. That is 16 bit wide, and together with the command byte's lower tetrad forms a big endian 0 .. 0xFFFFF value.

For the additional mnemonics (upper tetrad 0xF0), i is the highest bit of the lower tetrad, and immediate is 3 bytes long if its set, otherwise 2 bytes. The only exception is ZRO, which has no argument at all.

The Fun

First, measure Hello World:

$ ./bf -p tests/hello.bf
Optimizer:      0.000004 sec
Hello World!
Execution:      0.000052 sec

Then let's add profiling and save as C source, then compile it as optimized native code and execute :-)

$ ./bf -p tests/hello.bf -c hello.c
Optimizer:      0.000003 sec
$ gcc -O3 hello.c -o hello
$ ./hello
Hello World!
Execution:      0.000069 sec
$

Haha, the interpreter was actually faster than the gcc optimized code! Let's try with Assembly directly compiled to ELF:

$ ./bf -p -O3 hello.bc -o hello
Optimizer:      0.000003 sec
$ ./hello
Hello World!
Execution:      0.000067 sec
$

Obviously if you do this with a much larger script that computes a lot more, then the Assembly is going to be faster:

$ ./bf -p tests/mandelbrot.bf >/dev/null
Optimizer:      0.000121 sec
Execution:      9.085870 sec
$ ./bf -p tests/mandelbrot.bf -o mandelbrot
Optimizer:      0.000121 sec
$ ./mandelbrot >/dev/null
Execution:      1.393856 sec
$

Here the speed up was little less than 7 times over the interpreter.

License

Free and Open Source, licensed under the MIT license.

Cheers, bzt

README.md 6.4 KB Permalink History Raw