Flpc internals: model and bootstrapping

The Forth Lisp Python Continuum (Flpc) is a small high dynamic programming language I'm making. In these series of posts, I will describe and discuss its inner workings, more or less in the order that they are run when the program starts.

This will hopefully help with others making programming languages and serve as some kind of documentation. Though Flpc was designed to be relatively easy to change, there is a surprising amount that stayed unchanged since the start of this project, perhaps unfortunately. I'm sure some of this post will still become outdatated and the intended final format is an interactive tutorial to teach you how to recreate Flpc so different design decisions can be made (similar to How to make these slides) but we're still very far from that.

Aside: Flpc current status and optimization

I've recently ported Flpc's C source to Nim, which now runs faster than C (when all Nim/underlying C compiler optimization flags are turned on) thanks to folks from the Nim forum.

I did this so that I can (hopefully) develop and try out different optimizations more quickly. Since Flpc became self-hosted, I've used the language in its intended way to add syntactic sugar at runtime. For example, I've added

triple quotes: """This is a long stirng with "quotes" in it"""
list syntax: { 3 1 True None }
list comprehension syntax: { x*x for x in l }

in this way. It's great to see my envisioned workflow pan out, but the feedback loop for editing a grammar and seeing the results is too slow. And this is no good interactively using and modifying the language at runtime. Even without grammar modification, reruning the same script becomes slow after 4-5 functions are added to it.

However, optimization clashes with the dynamic nature of the language and the clarity and simplicity of its internal state at runtime. This combined with the changes needed across multiple layers has made progress on optimization slow. So I'm procrastinating by writing these posts.

If you know anything about optimizing compilers or bytecode and understand the trade-offs stated here, please get in touch by email or otherwise. Because of the long cycles, some intuition about what will be effective could really speed things up.

Flpc computation model

With that out of the way, let's get to the description of Flpc internals. The model is very similar to other languages I wrote about before so I will concisely state the description and invite you to read those for a slightly more detailed exploration in a simpler setting.

memory: an array containing function bodies and data
data_stack: a stack values, thought of as local variables, function parameters and return values.
call_stack: pointers (indices) to memory where instructions should be read
primitives: an array of functions in the host language (Nim or C).

Alongside that, it has a number of other information in its global state

memory_hint: An array of strings for each element in memory, used to stroe info about what is at that slot.
data_stack again: Each value on the data_stack can have an associated (variable) name which is a string. But while the implementation is a (name, value) pair, future references to elements on the data stack refer to the value, unless its name is explicitly mentioned.
strings: An array of strings. Strings in Flpc are represented by integers in this array (in other words, strings are interned).

All "global" variables above are fields of g (g.memory, g.call_stack, etc).

Elements in memory and the data_stack are integers where

the last TL=3 bits represents the type of the element and
the rest the value of the element.

There are 5 types, which in order (represented by 0, 1, 2, 3, 4) are

Int: integer
Str: string represented by an index in the strings array containing the actual string,
Pointer: a pointer to memory represented by an index of memory
Prim: a primitive represented by an index of primitives array,
Special: special singleton values such as None

For example,

3456 = 54 * (2**3) + 1

has type 1, which is Str and value 54. Since strings[54] is +, a value of 3456 in memory represents the string "+".

There's more to say about the design and functionality of some of those, but lets skip that for now and jump directly to the bootstrapping steps, which is what this is about.

Bootstrapping

The core of the main loop for running FlpcForth bytecode is [simplification]

proc main_loop() =
  while true:
    g.call_stack[g.call_stack.len - 1] += 1
    func_call(g.memory[g.call_stack[g.call_stack.len - 1] - 1])

That is, it

increments the program counter which is the top of the call_stack,
look up the entry in memory at that positions and
calls it with func_call.

func_call is also fairly straightforward:

If the argument is a primitive (represented as an index in g.primitives), call it.
If the argument is a pointer (that is, an index in g.memory), then add it to the top of the call_stack.

proc func_call(elem: Elem) =
  case elem.ftype
    of Pointer:
      g.call_stack.add(elem.value)
    of Prim:
      let res = primitives[elem.value]()
      if res != NoValue:
        data_stack.add(res)
    else:
      error("Cannot call " & $elem)

For this main_loop to work, our initial state needs some value on the call_stack. The initial call_stack is set to [0], meaning the first instruction we run is at g.memory[0] and so we just need to make sure memory is initialized with the right value there.

g.memory[0] is a pointer to the function input_loop2 whose body is

  repeat:
    call(names.get(input.next_token()))

That is, it repeats these three steps indefinitely

Get the next whitespace delimited group of characters (token) from input (stdin or a file) as a string.
Looks up this token to find the primitive or pointer associated with this string.
func_call this primitive or pointer.

Notice that we're calling func_call from Flpc rather than Nim (like we did in main_loop). To do that, we put a thin wrapper around func_call and add that wrapper to the primitives array.

names lookup

input.next_token is a primitive but names.get is not. names.get is a Flpc function whose body looks something like

return_if(name == "string1" value1)
return_if(name == "string2" value3)
return_if(name == "string3" value3)
...
lookup_error()

The body of this function is modified by appending to it when new entries are added (to what we think of as the names dictionary though there is no explicit object other than this names.get function body).

Initial memory

Initial values written to memory are loaded from a file init_memory.dat are loaded the integer representation of all these elements. 7 functions, including input_loop2 and names.get are loaded this way. Another pre-loaded function is write_loop.

Writing new functions

With main_loop, we can run any sequence of primitives or predefined Flpc function. Now we want some way to define new functions.

names.get associates the (function) name [ with the (pointer to the) function write_loop which lets us write new functions. Its body is essentially

write_loop <- fun[]:
  pointer = Pointer(memory.len())
  repeat:
    token = input.next_token()
    memory.append(names.get(token))
    if string_endswith(token ":"):
      memory.append(input.next_token())
    if string_equal(`token "]"):
      return(`pointer)

memory.append is a primitive that writes is argument to the end of memory and memory.len returns the current number of cells in memory.

memory.get and memory.set (not seen yet) are other primitives for reading and writing directly to memory.

As a result if we write [ foo bar baz ], it will write a new function whose body is foo bar baz (run foo() then bar() then baz()) and return a pointer to (the beginning of) that function. We can call the result using call.

Extending names.get

For easier reuse, we'd like to name these new functions. To do that, we add them to the body of names.get. A helper function functions.add pre-loaded via init_memory.dat takes a string and pointer and adds them the the names dictionary by essentially adding the line

return_if(name == "new_string" new_pointer)

to the body of memory.get (and moves the line lookup_error() down by rewriting it). The full definition is

functions.append <- fun[value]:
  memory.set(functions.end() s21()) # pos value
  functions.end.increase()

functions.add <- inline[]:
  F' push: pick1 functions.append
     push: push: functions.append functions.append
     push: string_equal functions.append
     push: push: functions.append functions.append
     push: return_if functions.append
     push: lookup_error functions.append
     push: ] functions.append
     functions.end.decrease functions.end.decrease 'F

functions.end is a primitive that returns a pointer to the end of the body of names.get, functions.end.increase and functions.end.decrease are primitives to increase and decrease these values.

The pointer from functions_end is a global variable in Nim initialized by a value in init_memory.dat (init_memory.dat contains initial values for more than just memory). In retrospect, functions.end could be a value stored in memory instead of a global in Nim (or C).

Lookup error

lookup_error is the final pre-loaded function from init_memory.dat, which just calls

error("lookup_error")

where error is a premitive for an unrecoverable error.

proc error(msg: string) =
  echo msg
  raise newException(Exception, msg)

Alternate design

An alternate design (not tried) is to only input_loop2 in init_memory.dat and move everything else to a later stage stage0. To do that, we'd have to replace the initial function bodies

[ instr1 instr2 instr3 ]

with

pushf: instr1 memory.append pushf: instr2 memory.append pushf: instr3 memory.append

In other words, we write out by hand the instructions that write_loop would execute.

But then we'd need a program to generate these sequences (or write them by hand and make them harder to edit). So I think adding a few extra function to init_memory.dat/boot.flpc was worth it.

Update: As I'm writing this, I added a new add_primitive primitive to add new primitives to the names dictionary. I will remove most primitives not needs for bootstrapping until add_primitive can be called.

Generating init_memory.dat

This begs the question of how init_memory.dat is generated from boot.flpc and other information. The process is a bit ugly. I still have an old Flpc interpreter written in Python lying around that is not part of the Github repo.

It puts the list of primitives in an order, builds an internal names dictionary from primitives names to integer representation (the index in the order = the type Prim). Then it writes down the body of each function in boot.flpc using its own reimplementation of write_loop. Then writes the names.get function described earlier, to memory. Finally, this interpreter dumps its state to init_memory.dat.

Exploration tip: use bp

I want to end this post (before the appendix with init_memory.dat layout) with one way to explore a bit.

THere's a function bp (which is also a primitive) that can be called from either Nim or Flpc, which I think of as a breakpoint. It drops you to a prompt and lets you into a prompt bp> and lets you examine. In the C version, this was an empty function that you'd attach to with gdb. In the Nim version, there a few commands (and you could expand these yourself):

bt - prints the current Flpc state, including stack
cs - prints just the call stack as integers (for when bt itself is broken)
ds - prints just the data stack
st - prints the Nim stack trace (needs --stacktrace:on when compiling).
mem <pos> - print memory around the position pos (an integer).

Similar to bp, there's a debugger and debuggerv2 function that can be called from Flpc once they are up.

Appendix: init_memory.dat format

The current of init_memory.dat contains the following lines in order.

#memory <init memory length> <init memory entries>
<space separated memory entries>

where #memory is a constant string to help navigate the init_memory.dat file. The first <init memory entries> entries of memory are initialized to what's on the second line and the rest (up to <init memory length>) are all 0.

Entries are integers representing (typed) elements. However, the last 2 bits are used to represent the type (instead of the last 3 bits) and so initial memory contains no value of type 5 = Special (as they cannot be represented).

#positions <init position entries>
<space separated pairs of entries>

I've not yet described the positions globals yet, which maps memory indices to FlpcForth source ranges (start and end character). -1 represent no source positions for things like data.

#strings <init strings length>
<space separated strings entries>

is the initial strings. In particular, this contains the names of primitive and pre-loaded functions.

#indices <pointer to functions_end> <pointer to names_get> <pointer to run>

Pointer to (initial) memory, essentially allowing a fast names.get from Nim. The last one is no longer used and could be removed from this file.

#names <init primitive names length>
<space separated primitive names entries>

The names of primitives. This is so we know when initial memory contains Prim 12, we know what primitive that should be. It is however redundant as we could have made strings start with all primitive names.

Footnotes

[simplification] A simplified version of the actual source code is shown here, to focus on the part discussed. Hopefully, they are recognizable in the full source code. The extra line will possibly be described later.

Posted on Jul 3, 2021

Blog index RSS feed Contact