It’s done! We have that crucial ingredient giving life to that little Forth, allowing it to walk by itself! Unfortunately, you’ll soon see that after a few steps, this little creature stumbles and falls. What is it lacking, a little balance maybe?
As I’ve hinted in the previous article, it foremost lacks a solid set of base words to build from. It also lacks that little something that would allow it to compile things like numeric literals in a word definition. Last but not least, it lacks one final speck of pixie dust: immediateness.
This is the moment in the series when I throw you into the gentle hands of Leo Brodie and his excellent introduction to Forth, Starting Forth. It introduces you to Forth from a “user” perspective and covers all the words that a regular, well-formed Forth contains, which should help us design our own little Forth.
Why now rather than sooner? Because of the “user” perspective. I think that being introduced to Forth from this perspective makes it underwhelming to the newcomer. Forth’s strongest selling point is precisely that it allows you to approach it from an “operator” perspective[1], something other systems/languages rarely allow because of their overwhelming complexity.
Had I thrown you directly to Leo, you’d have whipped up GForth, a Forth implementation that is too complex for its own good, and played with Forth as a mere user. Now, as you read Leo, your first reflex won’t be “let’s find an implementation I can try this concept in”, but rather “let’s whip up an implementation of this concept in my little Forth to try it on”.
I’d thus suggest that you interrupt the reading of this article to read Starting Forth. It’s not a requirement for this story arc to implement the words you read about in our little Forth, but doing so will bring you joy, so why not? When some words give you trouble[2], postpone them. When you reach chapter 4 about conditionals, you can read it, but you are likely to need help with its implementation, at which point you can resume reading this article.
Maybe you’ve given it some thought already. What’s that little something that would allow us to compile number literals in a word definition? We somehow need to pass a parameter to a word at compile time, so we can’t just do as in our main loop and push that number to PS: that’s runtime! We need to parametrize the word at compile time.
There aren’t a million ways to parametrize something at compile time. We have to store that parameter somewhere, and that parameter will be read back every time the word is called. The orthodox way to store such a parameter is to write it to here right after a call to the associated “handler” word. In the case that occupies our mind, that “handler” would have the responsibility of reading its compile-time parameter and pushing it to PS at runtime.
I don’t know of any name for that kind of routine. Me, I catch myself calling them “parens words”, because the code associated with them is often linked to a dictionary entry with a name inside parentheses, such as (does), etc. Those words can’t be called directly because they require one or more compiled parameters next to them. But provided that we follow their argument structure, they can be written in a word definition.
If you look at the implementation I’ve written for you, you’ll see that I’ve added a new litn routine. Although it’s not wrapped in a word[3], it’s what we could call a “parens word”. Its goal is to read the parameter that has been written next to its call and push it to PS. How do we access this parameter? Through the Return Stack!
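As with any call, the address where we should return has been pushed to the Return Stack before we jumped to litn. If the writer of the call has followed convention[4], it has written the number parameter right after that call, so the return address points straight at it. Therefore, all we need to do is to read the number at that address, push it to PS, and move the return address past the parameter so that execution resumes after it.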
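To make the mechanism concrete, here’s a minimal sketch in Python rather than assembler. It is a toy model, not the article’s implementation: memory is a list of cells, a “call” is just a Python function stored in a cell, and the parameter occupies one cell instead of a fixed number of bytes. The names mem, ps, rs and run are assumptions made for the illustration; only litn comes from the article.

```python
# Toy model of a call followed by an inline parameter.
# Everything except the name "litn" is illustrative.

mem = []   # code memory, one cell per list element
ps = []    # Parameter Stack
rs = []    # Return Stack

def litn():
    """Push the cell that follows the call, then resume just past it."""
    ret = rs.pop()        # return address: it points at our parameter
    ps.append(mem[ret])   # read the compiled number and push it to PS
    rs.append(ret + 1)    # corrected return address skips the parameter

def run(addr):
    """Execute cells starting at addr until a "ret" cell is reached."""
    pc = addr
    while True:
        cell = mem[pc]
        pc += 1
        if cell == "ret":
            return
        rs.append(pc)     # a call pushes the address of the next cell
        cell()            # jump to the routine
        pc = rs.pop()     # return to wherever the routine left RS pointing

# Compiled body of a word that pushes 42: call litn, parameter, return.
mem.extend([litn, 42, "ret"])
run(0)
print(ps)
```

The important line is the last one in litn: without the adjusted return address, execution would resume on the parameter itself and try to run 42 as code.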
Then, all we need to do is to amend our : word so that it tries parsing input as a number and, when it’s a number, writes a call to litn followed by the number, as per its argument convention.
You can now define words like:
: answertouniverse 42 . ;
and have this new word behave as expected. This also means that you could scrap native words like blue, because they can now be trivially defined in Forth instead.
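Here’s a rough Python sketch of what such an amended compile loop could look like. It is an illustration under assumptions, not a transcription of the assembler: compile_definition and parse_number are hypothetical names, cells are Python values rather than bytes, and looking up the dictionary before trying the number parse is a choice made for the example.

```python
# Sketch of a compile loop that understands number literals.
# The list "code" stands in for memory being written at "here".

def parse_number(token):
    """Return the integer value of token, or None if it isn't a number."""
    try:
        return int(token)
    except ValueError:
        return None

def compile_definition(tokens, dictionary):
    """Compile tokens up to ';' into a list of cells."""
    code = []
    for token in tokens:
        if token == ";":
            code.append("ret")
            return code
        if token in dictionary:
            code.append(("call", token))   # straight call to a known word
            continue
        number = parse_number(token)
        if number is None:
            raise ValueError(f"unknown word: {token}")
        # Number literal: a call to litn, then its parameter right after.
        code.append(("call", "litn"))
        code.append(number)
    raise ValueError("definition is missing ';'")

dictionary = {"."}
body = compile_definition("42 . ;".split(), dictionary)
print(body)
```

The compiled body ends up as a call to litn, the number 42, a call to the . word and a return, which is the layout litn expects to find at runtime.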
So far so good. However, it does raise a question: is it possible to reference a “parens word” within the realm of Forth semantics? That would unlock quite an interesting set of possibilities, wouldn’t it? The answer is: for now, no. Our loop in the : word is too dumb to compile anything but straight calls to words and number literals. Does it mean that we’ll need to augment that routine whenever we want to add a new kind of “parens word”? Fortunately no, and this brings us to the last magical concept we’ll need to add to be able to reach the moon: immediate words.
Forth dictionary entries contain a flag, typically the most significant bit of the length field, to indicate whether the word is an immediate. In the interpret loop, this changes nothing. When the compiler loop encounters such a word, however, it executes it instead of compiling a call to it. And that’s all there is to it.
Code-wise, it’s simple. First, we have to modify dictfind to tell it to ignore the 8th bit of the length field. Then, we need to add a check on this bit in the : loop, which we do with the test instruction in front of the conditional jump to .immediate. What do we do when that condition arises? Call and carry on, what else!
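The two changes can be modelled in a few lines of Python. This is a sketch under assumptions: Entry, compile_tokens and the mark word are hypothetical, and the dictionary is a plain list instead of a linked list in memory. What it does keep from the article is the flag living in the 8th bit of the length field, a dictfind that masks it out, and a compile loop that executes immediate words instead of compiling a call to them.

```python
IMMEDIATE = 0x80     # 8th bit of the length field
LENGTH_MASK = 0x7F   # the remaining bits hold the real name length

class Entry:
    def __init__(self, name, action=None, immediate=False):
        self.name = name
        self.action = action
        self.length = len(name) | (IMMEDIATE if immediate else 0)

def dictfind(entries, name):
    """Return the most recent entry called name, ignoring the flag bit."""
    for entry in reversed(entries):
        if (entry.length & LENGTH_MASK) == len(name) and entry.name == name:
            return entry
    return None

def compile_tokens(entries, tokens):
    """Compile calls to regular words; run immediate words on the spot."""
    code = []
    for token in tokens:
        entry = dictfind(entries, token)
        if entry is None:
            raise ValueError(f"unknown word: {token}")
        if entry.length & IMMEDIATE:
            entry.action(code)                   # call and carry on
        else:
            code.append(("call", entry.name))    # regular word
    return code

entries = [
    Entry("hello"),
    # Hypothetical immediate word: it writes a marker cell at compile time.
    Entry("mark", lambda code: code.append("marker"), immediate=True),
]
code = compile_tokens(entries, ["hello", "mark", "hello"])
print(code)
```

Notice that mark never shows up as a call in the compiled code. It ran while the definition was being compiled and left behind whatever it chose to write.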
This very simple mechanism opens a very important door giving us access to the exponential function, that is: tools that build tools that build tools that build tools…
This is the most mind-bending, yet the most powerful concept there is in Forth. All the really cool things you’re going to do with it are likely to involve immediate words, maybe several levels of them. Going too deep in this logic leads to insanity, but it’s such a small price to pay for such power!
We try this mechanism with two new words, if and then, which I’ve already added. As you can see, their length fields have the 8th bit set, indicating that they’re immediate. This means that their behavior is a compile-time behavior. The if and then words work around the condbr[5] “parens word”, which behaves a bit like litn, having the same number of parameters (one), but instead of pushing that parameter, it takes a flag from PS and, when the flag is zero, jumps to the address stored in the parameter.
The job of the if word is to compile a parametrized call to condbr. Unfortunately, we don’t know yet where we want to jump to! Therefore, what we do is allocate 2 bytes for this parameter, but don’t set it. Instead, we push[6] the address of that parameter to PS and let the compile loop do its thing.
Then comes the then immediate word, whose job is to take that address from PS[7], see where we’re at, here-wise, and write that current address into our condbr parameter. Voilà! This allows things like:
: foo if hello then ;
1 foo --> prints "hello"
0 foo --> does nothing
I know it’s a lot to take in. Take a while to think about it, experiment a little bit. By the way, have you noticed how this system works with nested branching?
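If you want to convince yourself about nesting without firing up the assembler, here’s a toy Python model of the if/then dance. It’s a sketch under assumptions: memory is a list of cells (so the parameter takes one cell rather than 2 bytes), len(mem) plays the role of here, the same ps list is used at compile time and at runtime, and condbr is assumed to take a flag from PS and branch when it is zero, which matches the foo example above. The word bar and the helper names are hypothetical.

```python
mem = []   # compiled cells; len(mem) plays the role of "here"
ps = []    # Parameter Stack, used at compile time and at runtime
out = []   # what "hello" prints, captured so it can be inspected

def compile_if():
    mem.append("condbr")    # parametrized call to condbr...
    ps.append(len(mem))     # ...remember where its parameter lives
    mem.append(None)        # ...and reserve room for it, unset for now

def compile_then():
    mem[ps.pop()] = len(mem)   # patch the parameter with the current "here"

IMMEDIATES = {"if": compile_if, "then": compile_then}

def compile_word(source):
    """Compile a definition body and return its start address."""
    start = len(mem)
    for token in source.split():
        if token in IMMEDIATES:
            IMMEDIATES[token]()   # immediate: run it now
        else:
            mem.append(token)     # regular word: compile a "call"
    mem.append("ret")
    return start

def run(pc):
    while True:
        cell = mem[pc]
        pc += 1
        if cell == "ret":
            return
        if cell == "condbr":
            target = mem[pc]
            pc += 1               # skip over the parameter
            if ps.pop() == 0:
                pc = target       # flag is zero: branch
        elif cell == "hello":
            out.append("hello")
        else:
            raise ValueError(f"unknown cell: {cell}")

bar = compile_word("if if hello then then")
ps.extend([1, 1])
run(bar)      # both flags are non-zero, so hello runs
ps.append(0)
run(bar)      # outer flag is zero, so everything is skipped
print(out)
```

Nesting works for free because PS is a stack: the inner if pushes its parameter address last, so the first then to come along patches the inner branch and the second then patches the outer one.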
The good news is that we’re done for real-real-real this time! These are all the important concepts that make a Forth. All the rest is building on these foundations, and the best part of it is that it’s built to your liking because you’re the one who’s likely to have built it.
create in Starting Forth. How would you go about it?
does> if you want, but it’s a tough one to tackle for a newcomer, although there’s nothing fundamentally new about it except that you have to hijack the compile loop to prematurely stop it. Hint: this “parens word” needs two parameters.
We’re at the end of the “build a Forth”[8] part of this story arc. It’s possible that you’ve grown attached to this little Forth we’ve been building and that you intend to make it into a real Forth. Unfortunately, this baby Forth is not a solid foundation for a complete Forth. I’ve optimized for the quickest path to “wow moments”, not for viability. Too much of its code lives in assembler routines, where more of it should be in Forth words referencing each other. Too much of the logic has grown around null-terminated strings when, in the end, counted strings are easier to work with. The “;” detection logic is hackish.
If you want to build a Forth[9], you’re better off starting from scratch. And while you’re at it, maybe target an architecture that is more elegant than i386?
By the way, I can’t let you go without mentioning jonesforth. It’s an i386 Forth with loads and loads of comments, and people generally rave about it. Me, I don’t think it’s such a good way to be introduced to the subject, which is why I wrote this little series of articles, but it contains a lot of details I may have glossed over but that could be of use to you.
The goal of this story arc is to fully understand how to compile a piece of C code, from scratch. My intent is to show that to you through the compiler I’ve developed in Dusk OS, which is a Forth. Now that we know how to build a Forth, we’ll enter Dusk’s world with a pretty good understanding of how it works underneath. No more secrets.
So, let’s leave that little Forth there, don’t listen to it whimpering, calling for your mercy, and let’s discover the brave new world of Dusk OS!
Next: From Dusk Till C
[1] Remember, you’re Tank! You ain’t some cog in the machine, machines bend to your will!
[2] For example, the string literal.
[3] Why? I talk about it at the end of the article.
[4] If it hasn’t, fireworks!
[5] “Branch conditionally”
[6] At compile time, remember! Yes, I know, mind-bending.
[7] Still at compile time!
[8] 8 articles, 8 bits. Coincidence? I don’t think so!
[9] And I’d be very happy if you were! Let me know if you’re stuck at some point.