Enough with the fossil fuel combustion metaphors. That little Forth has already begun crawling and standing, it’s now time to add that crucial ingredient that will allow it to walk by itself, that is, creating new words from its base set, unlocking a whole world of possibilities.
Doing so requires us to perform several little tasks, which we’ll do in this article:
dictionary
label mutable at runtime.So far, we’ve been dealing with three memory areas:
If we want to add new words to the dictionary, they have to go somewhere. In Forth, there’s a variable called “here” which is a pointer to the system’s permanent1 heap. That heap is where all the permanent results generated at runtime goes, where it can be referenced by further processing, which can be written to “here”, and so forth, until the moon is reached.
The principle is simple: the heap starts at some address in memory and whenever we write something to the heap, we increase the pointer accordingly. The code writing this new data to the heap has the responsibility of referencing2 this data properly so that it can be reused where it must.
In Forth speak, adding something to that heap is called “write” and the associated symbol for its action is “,” (the comma).
If you look at the implementation I’ve written for you, you’ll see
that I’ve added a new here
variable which is initialized with the value of
the herestart
label, which is simply placed at the very end of the code3.
Then, I’ve added three write routines, writechar
, writeword
and
writestr
4 which operate like I’ve described above. Those routines are used
by the word compilation routines we're covering below.
Code-wise, there’s not much new. There’s another of our fancy string
instructions, movsb
, which is used to copy code from one area to the other.
This time, instead of repz
, we use rep
which repeats unconditionally CX
times.
The dictfind
routine so far has been beginning its search from a hardcoded
label called dictionary
. It worked well, but it didn’t allow us to add new
words at runtime. To this end, we need to add a new variable, which will be the
new starting point for dictfind
.
Adding a word then becomes as simple as writing a new entry in “here” and
update the dictionary
variable so it points to it.
Code-wise, this operation is trivial. I kept the dictionary
name for the
variable, so I renamed the label for the last entry to lastentry
, which is
the initial value for the new variable. Then, all we need to do is to, in
dictfind
, put dictionary
between []
brackets to add an indirection.
In Forth speak, creating a new word is “: wordname ... ;”
. For example, if we
wanted to simply combine hello
and goodbye
in a single word, we would do:
: wellthatwasquick hello goodbye ;
After that definition is made, calling wellthatwasquick
calls hello
, then
goodbye
, then returns.
How that word is compiled varies from Forth to Forth. Brad Rodriguez makes an excellent presentation of the different execution models encountered in the wild. Before implementing word compilation in a Forth, one needs to choose a model. The model I chose for our own little Forth is the Subroutine Threaded Code (STC) model.
Brad mentions that speed might be an advantage to STC and it’s true, but he fails to mention the awesomest advantages of all: with STC, Forth code and native code can freely intertwine, which is not the case with ITC and DTC. In the C compiler implementation I’m going to show you in this story arc, this is a crucial capability.
It’s not just the C compiler. So many doors open with STC! Dusk OS, also a STC Forth, takes advantage of this intertwinability to crazy cool levels. I can’t show them to you now because we have a baby Forth to finish, but we’ll see some of this craziness soon enough. In my opinion, when compactness isn’t a primary concern5, STC is king.
So, we want to compile a word at runtime using the STC model. How to proceed?
ret
i386 opcode and stop.call
instruction targeting that word.If you look at the “:” word I’ve implemented for you (at the bottom of
kernel.asm
), you’ll see that the code there follows this logic above using
semantics you should already understand. The juicy parts are in the
createentry
, writecall
and .stop
routines, so let’s look at them.
createentry
writes the name of the entry, followed by the link to the
previous entry, followed by the length of the name, as per our dictionary
structure. The code should look straightforward to you. The only funny part is
the weird exchange we make between [here]
and [dictionary]
. At that very
moment, [here]
points to the very place where [dictionary]
will have to
point to when the entry is finished being written, but at the same time,
[dictionary]
holds the value we want to write at [here]
. A little mind
bending, but think about it for a while, it will end up making sense.
You remember when I told you in my previous article that we use the hardware
stack to preserve registers? That’s what we do with the name length in CX
.
The funny part is when we pop it: we pop it in another register. We can do
that? Of course we can! But as regular developers, our brain is not used to
thinking of these kinds of little tricks. It’s a pleasant habit to acquire.
Then comes writecall
, where the magic happens for real. But even though this
operation opens the door to an infinite and magical world, the actual operation
is rather mundane: we simply encode a 16-bit i386 call, that is, the 0xe8
opcode followed by a displacement leading to the target to call. Yes,
displacement. Calls under i386 don’t target absolute addresses, but relative
ones. The displacement is relative to the end of the call operation. For
example, the call E8 00 00
is a noop. The call E8 FD FF
(-3) is an infinite
loop until it smashes the stack.
By the way, if you want to play with the concept, I encourage you to build
yourself a test assembler file with fake call
instructions and then look at
the encoded result, or better yet, use ndisasm
to examine the disassembled
binary. Doing this helps understanding i386 instruction encoding logic a lot.
So, back to that magical part: all we need to do is to transform our absolute target address to a displacement by subtracting “here+3” from it, because “here” is where we’re about to write that call. If you look at the code, you’ll see that this is what we do.
The easiest part is .stop
. All we need to do at that point is to encode a
ret
and that ret
has a simple opcode, 0xc3. We’re done!
And just like that, our little baby can walk by itself, creating new words from its base set. The problem now is to design a base set of words that is sufficient to take us to the moon, something we’ll tackle in the next article, but we can have a taster through the addition of two trivial words: “*”, a simple multiplication operation and “dup”, a word that duplicates the top element of the stack. I wrote them already. With this, we can easily complete the exercise that I asked in the previous article:
: square dup * ;
5 square .
over
which fetches the second item from PS
and
pushes it, resulting in a “a b -- a b a”
signature. How would you go about
implementing it?Next: The Unbearable Immediateness of Compiling
It’s in fact semi-permanent because there’s also the concept of “forget” which allows to reclaim some of that heap, but that’s an exception to its regular modus operandi. ↩
This is often synonymous to “adding it to the dictionary”, but not always. ↩
By the way, have you noticed that this whole code results in a 742 bytes binary? How big is a GCC hello world again? ↩
“writecall” is special, I talk about it below. ↩
If we have more than, say, one megabyte of memory. ↩
It should be noted that aborting at that point leaves the word
unfinished, and thus broken. With no ret
to stop the call chain, calling the
broken word will result in executing uninitialized memory, which means
“fireworks!”. Our baby Forth isn’t mistake-friendly! ↩