Enough with the fossil fuel combustion metaphors. That little Forth has already begun crawling and standing, it’s now time to add that crucial ingredient that will allow it to walk by itself, that is, creating new words from its base set, unlocking a whole world of possibilities.
Doing so requires us to perform several little tasks, which we’ll do in this article:
dictionarylabel mutable at runtime.
So far, we’ve been dealing with three memory areas:
If we want to add new words to the dictionary, they have to go somewhere. In Forth, there’s a variable called “here” which is a pointer to the system’s permanent1 heap. That heap is where all the permanent results generated at runtime goes, where it can be referenced by further processing, which can be written to “here”, and so forth, until the moon is reached.
The principle is simple: the heap starts at some address in memory and whenever we write something to the heap, we increase the pointer accordingly. The code writing this new data to the heap has the responsibility of referencing2 this data properly so that it can be reused where it must.
In Forth speak, adding something to that heap is called “write” and the associated symbol for its action is “,” (the comma).
If you look at the implementation I’ve written for you, you’ll see
that I’ve added a new
here variable which is initialized with the value of
herestart label, which is simply placed at the very end of the code3.
Then, I’ve added three write routines,
writestr4 which operate like I’ve described above. Those routines are used
by the word compilation routines we're covering below.
Code-wise, there’s not much new. There’s another of our fancy string
movsb, which is used to copy code from one area to the other.
This time, instead of
repz, we use
rep which repeats unconditionally
dictfind routine so far has been beginning its search from a hardcoded
dictionary. It worked well, but it didn’t allow us to add new
words at runtime. To this end, we need to add a new variable, which will be the
new starting point for
Adding a word then becomes as simple as writing a new entry in “here” and
dictionary variable so it points to it.
Code-wise, this operation is trivial. I kept the
dictionary name for the
variable, so I renamed the label for the last entry to
lastentry, which is
the initial value for the new variable. Then, all we need to do is to, in
 brackets to add an indirection.
In Forth speak, creating a new word is
“: wordname ... ;”. For example, if we
wanted to simply combine
goodbye in a single word, we would do:
: wellthatwasquick hello goodbye ;
After that definition is made, calling
goodbye, then returns.
How that word is compiled varies from Forth to Forth. Brad Rodriguez makes an excellent presentation of the different execution models encountered in the wild. Before implementing word compilation in a Forth, one needs to choose a model. The model I chose for our own little Forth is the Subroutine Threaded Code (STC) model.
Brad mentions that speed might be an advantage to STC and it’s true, but he fails to mention the awesomest advantages of all: with STC, Forth code and native code can freely intertwine, which is not the case with ITC and DTC. In the C compiler implementation I’m going to show you in this story arc, this is a crucial capability.
It’s not just the C compiler. So many doors open with STC! Dusk OS, also a STC Forth, takes advantage of this intertwinability to crazy cool levels. I can’t show them to you now because we have a baby Forth to finish, but we’ll see some of this craziness soon enough. In my opinion, when compactness isn’t a primary concern5, STC is king.
So, we want to compile a word at runtime using the STC model. How to proceed?
reti386 opcode and stop.
callinstruction targeting that word.
If you look at the “:” word I’ve implemented for you (at the bottom of
kernel.asm), you’ll see that the code there follows this logic above using
semantics you should already understand. The juicy parts are in the
.stop routines, so let’s look at them.
createentry writes the name of the entry, followed by the link to the
previous entry, followed by the length of the name, as per our dictionary
structure. The code should look straightforward to you. The only funny part is
the weird exchange we make between
[dictionary]. At that very
[here] points to the very place where
[dictionary] will have to
point to when the entry is finished being written, but at the same time,
[dictionary] holds the value we want to write at
[here]. A little mind
bending, but think about it for a while, it will end up making sense.
You remember when I told you in my previous article that we use the hardware
stack to preserve registers? That’s what we do with the name length in
The funny part is when we pop it: we pop it in another register. We can do
that? Of course we can! But as regular developers, our brain is not used to
thinking of these kinds of little tricks. It’s a pleasant habit to acquire.
writecall, where the magic happens for real. But even though this
operation opens the door to an infinite and magical world, the actual operation
is rather mundane: we simply encode a 16-bit i386 call, that is, the 0xe8
opcode followed by a displacement leading to the target to call. Yes,
displacement. Calls under i386 don’t target absolute addresses, but relative
ones. The displacement is relative to the end of the call operation. For
example, the call
E8 00 00 is a noop. The call
E8 FD FF (-3) is an infinite
loop until it smashes the stack.
By the way, if you want to play with the concept, I encourage you to build
yourself a test assembler file with fake
call instructions and then look at
the encoded result, or better yet, use
ndisasm to examine the disassembled
binary. Doing this helps understanding i386 instruction encoding logic a lot.
So, back to that magical part: all we need to do is to transform our absolute target address to a displacement by subtracting “here+3” from it, because “here” is where we’re about to write that call. If you look at the code, you’ll see that this is what we do.
The easiest part is
.stop. All we need to do at that point is to encode a
ret and that
ret has a simple opcode, 0xc3. We’re done!
And just like that, our little baby can walk by itself, creating new words from its base set. The problem now is to design a base set of words that is sufficient to take us to the moon, something we’ll tackle in the next article, but we can have a taster through the addition of two trivial words: “*”, a simple multiplication operation and “dup”, a word that duplicates the top element of the stack. I wrote them already. With this, we can easily complete the exercise that I asked in the previous article:
: square dup * ; 5 square .
overwhich fetches the second item from
PSand pushes it, resulting in a
“a b -- a b a”signature. How would you go about implementing it?
Next: The Unbearable Immediateness of Compiling
It’s in fact semi-permanent because there’s also the concept of “forget” which allows to reclaim some of that heap, but that’s an exception to its regular modus operandi. ↩
This is often synonymous to “adding it to the dictionary”, but not always. ↩
By the way, have you noticed that this whole code results in a 742 bytes binary? How big is a GCC hello world again? ↩
“writecall” is special, I talk about it below. ↩
If we have more than, say, one megabyte of memory. ↩
It should be noted that aborting at that point leaves the word
unfinished, and thus broken. With no
ret to stop the call chain, calling the
broken word will result in executing uninitialized memory, which means
“fireworks!”. Our baby Forth isn’t mistake-friendly! ↩