Tumble Forth: Do Look Up

We have a bare bone I/O loop that sure is simple, but that really doesn’t do much. What about triggering a first explosion of possibilities by combining it with another concept? This time, it’s going to be the Dictionary.

The Dictionary

In Forth, the word “dictionary” refers to a linked list going backwards where each element is executable and has a name. By “going backwards”, I mean that each new element is added on top of the previously added element and that iteration is done starting from the last element. By “executable”, I mean that the payload is meant to be executed, the meaning of which depending on our execution model1.

The structure of the dictionary differs from Forth to Forth, but here’s the one we’ll use:

  A-length    A+0 (2b)       A+2 (1b)      A+3
+--------------------------------------------------------
| Word name | Link to prev | Name length | Payload ...
+--------------------------------------------------------

“A” is the address of the dictionary entry. Such a structure has the following properties which we’ll rely on:

  1. Such a structure is easy to assemble “by hand” in NASM.
  2. We can find an element by name.
  3. We can add elements at runtime.
  4. List iteration is done by simple de-referencing without offset until 0 is met.

A regular Forth has only one dictionary, so when we say “the dictionary”, we mean the One.

In Forth nomenclature, the combination of an entry in the dictionary and its payload is called a word, and “word” points specifically to the “payload” part. For example, you’ll see later that there is a Forth word called (single quote) which yields the address of a word. When we say that, we mean the address of the payload of the entry2. In other words, a word is executable.

I know, I know, it’s not the best name ever because its meaning is ambiguous. There is no way around it though, all Forth literature use this nomenclature, so you might as well get used to it.

Writing a dictionary

All Forths start with a system dictionary which is often assembled in their predefined structure directly in the code. If you look at the implementation I’ve written for you, you’ll see that this is what I’ve done too, starting at the “; Dictionary” comment. Let’s look at the first entry (hellomsg isn’t part of the dictionary entry):

db 'hello'
hello: dw 0
db 5
    mov di, hellomsg
    call printmsg
    ret

You can see that the first 3 lines correspond exactly to the structure described above. It’s the first time we see the dw directive. It does the same thing as db, but writes a little-endian word3 instead of a byte. hello is the name, which is of course 5 bytes in length. Because this is the first entry, the “link” field is zero. This way, we know that this is the end of the line when we iterate the dictionary.

Then, the payload follows immediately. It’s a bit weird to see code organized like this because we’re used to seeing code directly follow a label. If we were to directly call this piece of code from assembly, we would indeed add a label right after db 5, but this word is only ever called through the interpret loop, so we don’t bother.

The hello label serves as a reference for the following entry:

db 'goodbye'
goodbye: dw hello
db 7

The way NASM resolves dw hello is exactly the same as if it resolved call hello, but instead of encoding a call, it simply writes down the resolved address as a literal. This is what we need in our dictionary structure.

Then, we continue to add a bunch of entries in the same fashion until we reach the last one, which we bind to the dictionary label, which we’ll need for the lookup routine.

I recommend looking at kernel.img in a hexadecimal editor, using literals like hello and goodbye as orientation marks. By subtracting 0x500 to each link field (because that executable lives at 0x500, remember?), you should be able to follow the chain by yourself. It’s a good exercise to visualize what is going on.

Looking up a word

The main addition to the code in this article is the dictfind routine which can be used to look up an entry by name in the dictionary. It takes a name/length input and returns either an entry address if a corresponding one was found, or 0 if nothing was found. The logic of the routine is as follow:

  1. BP is the iterator register, pointing to the currently selected dictionary entry. It begins at dictionary.
  2. Is the name length of the current entry the same as the one we’re looking for?
  3. No? Go to 6.
  4. Yes? Compare the contents of the names.
  5. They match? we can stop right there, we have a result.
  6. Iterate to the next entry.
  7. If zero, stop iteration: entry not found. If non-zero, go to 2.

Assembler-wise, a few things to mention. We use the [register+offset] form for the first time. Not only can we indirectly address a register, but we can also do so with an offset applied to it. For example, if BP is 0x1234, “mov al, [bp+2]” copies the byte at address 0x1236 in register AL. It’s very useful to access structured data in memory, which is what we do here.

Then, there’s the cmpsb instruction which is a nice little creature. If our task is to compare every character between two addresses in memory across X bytes, we could write a simple loop to do so. But this is a task that is done so often that Intel thought it wise to add a “shortcut” instruction for it, which executes faster. The code “repz cmpsb” is the rough equivalent of:

loop:
    mov al, [si]
    cmp al, [di]
    jnz loopend
    inc si
    inc di
    dec cx
    jnz loop
loopend:

And you know what? This little friend there can go both ways (increase or decrease SI/DI at each step)! Thank you Intel for being so kind, I’m sure it will come handy some day, but most of the time, this feature is there to hit me in the face: the status of the flag4 that controls it doesn't have a specific state at initialization. To make sure that we go forward, we have to clear the D flag with cld, which we do at the beginning of the code, never to touch it again.

Other changes to the code

Other notable changes to the code that were made is to switch, in spitchar, from INT10H AH=0E to INT10H AH=09 so that we can have colors. This comes at the cost of having to maintain a cursor position5.

I also added a new strlen routine to compute the length of a null-terminated string. It also uses a fancy instruction, scasb, which operates a bit like cmpsb, but instead of comparing two memory areas, it compares one memory area to a constant, 0 in this instance (we’re looking for the zero). We take advantage of the CX decreasing logic so that we can reuse that value to extract a final length from it.

Finally, the main loop stitches it all together, transforming the resulting entry into a word (if found) so that it can call it, for a final result that is functionally much richer than our previous iteration: we have a pool of routines that we can arbitrarily call at runtime. Those routines operate around a richer shared state, as we can see with a fancier “Variables” section. With the color variable, we can change the behavior of our printing words. An explosion of possibilities!

Exercises

  1. Add a new word in the middle of the dictionary.
  2. Move it at the end. What do you have to do?
  3. Create a new word to control cursor position. Rewind to beginning maybe?
  4. Create a word that calls two other words. What does it look like? Think we could figure out a way to create such a word at runtime? (spoiler alert: you bet we will!)

Up next

Interesting little explosion right? That’s only the beginning! In the next article, we’ll introduce the Parameter Stack along with number parsing routines, making our interpret loop complete. This will make our system operator6 much more powerful.

Next: A tale of two stacks


  1. More on that subject later, but for now, because all we have are words that contain a payload in the form of native code, “executing” means “calling”. 

  2. With the “entry” being the address to the linked list element. 

  3. Oh no! another meaning for “word”! This time, we mean a 16-bit number. 

  4. The D flag, for “direction”. 

  5. That wasn’t planned. I hadn’t noticed, while writing the other article, that my “BL” parameter was ignored by “INT10H AH=0E”. But my example is centered around colors! 

  6. That’s what I like to call the “user” of a system as cool as Forth. The word “user” really doesn’t convey how intensely the creativity of the person at the keyboard is being solicited. You’re like Tank in the Matrix!