Tumble Forth: Liberation through bare metal

They say that liberty is not secured at sword’s point. Maybe, but you can liberate yourself from mediocre software conventions through bare metal. In my previous post, we began tumbling down the rabbit hole, always unable to have a clear understanding of our system because of its overwhelming complexity.

In this post, we hit bare metal[1] on the PC platform. This hardware has a lot of weird corners of its own, but unlike the software side, there’s nothing you can do to change the situation, so you can be stoical about it. From that point on, we can begin amassing real, solid knowledge about computing.

Entering QEMU

The binaries we’re about to assemble can theoretically run on any PC[2] with the “legacy BIOS[3]” option enabled, but it’s a lot more convenient to use an emulator for two reasons:

  1. Getting the binary on the machine can get tedious after the hundredth time.
  2. In the real world, most BIOSes have bugs and it can be difficult, especially for a beginner, to tell the difference between a problem in your code (likely) and one in the BIOS (possible).

For these reasons, you want to begin your development on an emulator[4], which is a convenient and predictable platform. The best one out there is QEMU[5].

QEMU has an executable for each architecture it targets, and that executable has a plethora of emulated machines and options to choose from. The executable we’re interested in is qemu-system-i386 and the default machine it emulates is called “pc”[6], an alias to the “i440fx + PIIX” machine. That’s the “old school” PC, with old school floppy drive and IDE controllers. The more modern PC platform, with Intel’s “ICH” family of chipsets, is called “q35”.

In our adventure, we’ll begin by targeting the BIOS, so the machine we target doesn’t matter[7]. Soon enough, however, we’ll grow out of the BIOS and begin poking the hardware directly. At that point, the choice of machine will matter.

I suggest[8] that we use the default machine because old school hardware is easier to interface with. The downside of this is that your drivers will only run on old PCs, but this is something you can improve later.

Booting up

Alright, eager to begin? Yeah? Too bad, you’ll have to wait further as I babble some more, this time about the BIOS booting sequence.

When powering up, the BIOS selects a drive to boot from and reads its first sector, which on the PC platform is always 512 bytes. It verifies that this sector is actually a boot sector by checking that its last two bytes are the magic values 0x55 and 0xaa. It then copies the sector to memory address 0x7c00 and jumps to that address.

Therefore, if you want to write code that will run on a bare metal PC, what you have to do is assemble a 512-byte file and place it on the first sector of your boot media. When running in an emulator, the file is the media, so it’s easy.
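The check the BIOS performs on that sector can be modelled in a few lines. This is only a rough sketch: Python is used here as a calculator rather than as part of our toolchain, and the function name is made up for illustration.

```python
SECTOR_SIZE = 512             # size of a PC boot sector, in bytes
BOOT_SIGNATURE = b"\x55\xaa"  # magic bytes expected at offsets 510 and 511


def looks_like_boot_sector(sector: bytes) -> bool:
    """Return True if `sector` has the right size and ends with the magic bytes."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


# An all-zero sector is rejected; the same sector with the signature passes.
blank = bytes(SECTOR_SIZE)
signed = bytes(510) + BOOT_SIGNATURE
print(looks_like_boot_sector(blank), looks_like_boot_sector(signed))  # False True
```

A real BIOS may look at more than this before deciding to boot, so take it as a model of the rule described above and nothing more.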

An important aspect of the PC booting sequence is that it always boots up in what we call “real mode”, that is, in a mode that is 100% compatible with the original 8086. This means 16-bit registers and instructions. It’s also a requirement to be in this mode to call BIOS functions. You remember when you heard about the legendary backward compatibility of the Intel x86 family? That’s what we’re talking about and it’s going to make you swear.

Enough babbling, let’s flex our fingers and get going.

Hello World

Code in this article is also available in Tumble Forth's source tarball.

We’re going to print “Hello World!” on a bare metal PC using BIOS functions through a boot sector we’ll assemble using NASM. The listing is so simple that I’m just going to give it straight to you and then explain it:

BITS 16             ; assemble 16-bit code for real mode
org 0x7c00          ; the BIOS loads the boot sector at this address
    mov ax, 0
    mov es, ax      ; ES = 0: segment for the string pointer in BP
    mov ah, 0x13    ; Function: Write string
    mov al, 0       ; write mode: don't update cursor
    mov bh, 0       ; page number
    mov bl, 0xf     ; color: white
    mov cx, 12      ; Number of characters to write
    mov dh, 0       ; Row
    mov dl, 0       ; Column
    mov bp, msg     ; Pointer to string
    int 0x10
loop:   jmp loop
msg:
    db 'Hello World!'
times 510 -( $ - $$ ) db 0
db 0x55, 0xaa

If you save this to hello.asm, you can assemble it with nasm -o hello.img hello.asm and then run it under QEMU with qemu-system-i386 -hda hello.img[9]. You’ll see your “Hello World!” printed in the top-left corner of the screen, mingled with BIOS boot messages because we haven’t cleared the screen.

I’ll briefly explain what’s going on here. Full explanations are out of the scope of this article, but you can get those answers from good i386 assembler tutorials and the OSDev wiki. It’s ok if you don’t because as we go along, I’m going to explain things on a “need to know” basis.

First, the BITS 16 line tells NASM that we want it to generate 16-bit code, which is what real mode expects. The code below runs not only on an i386, but on any CPU of the x86 family, including the original 8086.

The org 0x7c00 directive tells NASM that this code will run at offset 0x7c00 in memory. Without this, the msg reference below will be wrong[10] because NASM will think that msg lives somewhere near address zero.
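As a sketch of what org changes: in a flat binary, the value of a label is the org value plus the label's offset in the file. The Python below is just that arithmetic, not NASM's actual implementation, and the offset of 27 is my own count of the instruction bytes that come before the string in the listing. Treat it as an assumption and check it against a hex dump of hello.img.

```python
def label_address(org: int, offset_in_file: int) -> int:
    """Value substituted for a label: the org value plus its offset in the file."""
    return org + offset_in_file


MSG_OFFSET = 27  # assumed offset of msg in hello.img; verify with a hex dump

print(hex(label_address(0x7C00, MSG_OFFSET)))  # with org 0x7c00 -> 0x7c1b
print(hex(label_address(0, MSG_OFFSET)))       # without org     -> 0x1b
```

The second value points into low memory, nowhere near where the BIOS copied our string, which is why the message would not show up without the org line.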

Most of the following lines are argument setup for the BIOS function INT10h. We’ll talk about interrupts in more detail later, but for now, what you should know is that BIOS functions are called through the int instruction, which triggers a software interrupt of the specified index. Arguments to BIOS functions are passed through specific registers. The most important argument is AH[11] because it specifies the subfunction to call and thus the nature of the call. In this instance, “INT10h AH=0x13” means, as you can see in the documentation, “Write string”. Most of the other arguments are self-explanatory. As you can see, the msg label is used as a source argument for mov. When assembling, NASM will calculate the address where msg ends up (its offset in the binary plus the org value) and replace all references to this label with that number. When the mov instruction is encoded, msg is the equivalent of a constant number.
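The relationship between AH, AL and AX can be sketched in a few lines of Python, used purely as a calculator. The helper names are made up for illustration. AH is the high byte of AX and AL is the low byte, so the mov ah, 0x13 and mov al, 0 lines of the listing together leave AX holding 0x1300.

```python
def join_ax(ah: int, al: int) -> int:
    """Combine the 8-bit halves AH and AL into the 16-bit AX value."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


def split_ax(ax: int) -> tuple:
    """Split a 16-bit AX value into its (AH, AL) halves."""
    return (ax >> 8) & 0xFF, ax & 0xFF


# mov ah, 0x13 followed by mov al, 0 leaves AX = 0x1300
print(hex(join_ax(0x13, 0x00)))  # 0x1300
print(split_ax(0x1300))          # (19, 0)
```

This is also why BIOS documentation sometimes writes a call as “AX=0x1300” instead of listing AH and AL separately: it is the same setup.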

The int 0x10 line triggers the interrupt handler 0x10, which lives in the BIOS. This is the equivalent of a call and will return once it’s finished doing its thing.

Let's go back to mov ax, 0; mov es, ax which warrant a special mention. They aren't precisely part of the int10h argument setup, but rather the system memory setup.

These lines go together and have the effect of setting the ES register to zero. ES is a special “segment” register allowing the original 8086 to address more than 64 kilobytes of memory (the most a 16-bit address can reach). The idea is that a “segment register” such as ES[12] shifts the address window by 16 times its value. For example, if ES is 1, then calling int10h below with BP=0 will effectively reference absolute address 0x10. A bit confusing? I told you you’d swear. We won’t be doing anything fancy with segments in 16-bit mode, so there’s no need to think too much about it.
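If the arithmetic helps, here it is as a small Python sketch (Python as a calculator only, with a made-up function name). It also shows why the alternative mentioned in the footnotes, org 0 with ES set to 0x7c0, reaches the same byte as org 0x7c00 with ES set to 0.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address: 16 times the segment value, plus the offset."""
    return segment * 16 + offset


print(hex(physical_address(1, 0)))       # ES=1, BP=0          -> 0x10
print(hex(physical_address(0, 0x7C00)))  # ES=0, offset 0x7c00 -> 0x7c00
print(hex(physical_address(0x7C0, 0)))   # ES=0x7c0, offset 0  -> 0x7c00
```

The sketch ignores what happens when the sum goes past the top of addressable memory, which we won't run into here.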

Anyways, for the msg reference below to work, we need ES to be zero[13]. There is no form of the mov instruction that accepts an immediate[14] source argument with a “segment register” destination argument, so we need to set another register to zero, AX in this case, and then set ES with the value of AX.

The line following int 0x10 is an infinite loop because we don’t have anything to do after having written the string. It defines a new label and jumps to it, which means it’s stuck there forever.
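This jump keeps working wherever the code is loaded, and the reason can be sketched in Python. A short jmp is encoded as the opcode 0xEB followed by a signed 8-bit displacement, counted from the end of the 2-byte instruction. The helper name is made up, the addresses are arbitrary, and only the short form is modelled, which is the form I would expect NASM to pick here.

```python
def encode_short_jmp(jmp_address: int, target: int) -> bytes:
    """Encode an x86 short jump (EB rel8) located at jmp_address, going to target."""
    displacement = target - (jmp_address + 2)  # relative to the next instruction
    if not -128 <= displacement <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([0xEB, displacement & 0xFF])


# A jump to itself has displacement -2, so it encodes the same at any address.
print(encode_short_jmp(0x0019, 0x0019).hex())  # ebfe
print(encode_short_jmp(0x7C19, 0x7C19).hex())  # ebfe
```

Since the encoded bytes never mention an absolute address, the org value has no influence on them.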

The next two lines are data and are never executed. db is a special NASM directive to write literal data into the executable and this directive supports string literals. No need for a terminating null because we specify our string length in the int10h call.

We’re finished with our boot sector, but we still need to add the magic 0x55 and 0xaa at the end of the sector. To do this, we use special NASM voodoo to fill the exact number of zeroes we need to get to byte 510, and then spit 0x55 and 0xaa with a regular db directive.
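The voodoo is less mysterious once you see the arithmetic. In NASM, $ is the current position and $$ is the start of the section, so $ - $$ is the number of bytes emitted so far, and 510 minus that is the number of zeroes still needed. Here is the same padding as a Python sketch, with a made-up function name:

```python
def build_boot_sector(code: bytes) -> bytes:
    """Pad `code` with zeroes up to byte 510, then append the boot signature."""
    if len(code) > 510:
        raise ValueError("code does not fit in a 512-byte boot sector")
    padding = bytes(510 - len(code))     # same count as: times 510-($-$$) db 0
    return code + padding + b"\x55\xaa"  # same bytes as: db 0x55, 0xaa


sector = build_boot_sector(b"\xeb\xfe")  # a sector containing only "jmp $"
print(len(sector), sector[:2].hex(), sector[-2:].hex())  # 512 ebfe 55aa
```

If the code ever grows past 510 bytes, the count goes negative: the sketch raises an error, and NASM refuses to assemble the file in the same situation.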

That’s it, the whole machine is yours now! If you write this to a USB key with something like “dd if=hello.img of=/dev/sdX”, it should make that USB key bootable on most PCs. Feels nice right? You still have a long way to go, but at least you have a foot in the door.

Up next

The whole world is open to us now. We could go anywhere from here, but paths are mostly dictated by constraints: 512 bytes is really tight, real mode is really ugly, Forth is really intriguing. Will we build a bootloader, go into protected mode, hack a Forth right away? Stay tuned for the answer, same bat-hour, same bat-channel!

Next: One sector to rule them all

OSDev wiki

To the initiated, the contents of this article will look similar to what the OSDev wiki offers. It’s true, and it’s going to be mostly true for the next few articles[15]. This wiki is an awesome resource for PC development, I can't recommend it enough. However, they insist, when you read them, that you grow into an OS developer. They want you to sweat a little bit rather than spoon-feed everything to you. This series of articles I'm writing doesn't aim to make you an operating system developer[16], but to broaden your understanding of computing.

I thus believe that the path I’m offering you is quicker, easier, more seductive, even though it leads you to the Dark Side. But it’s worth it, right?

  1. On modern hardware, what we call “bare metal” is far from it because that “metal” is wrapped by layers upon layers of very complicated firmware that make you believe you’re actually flipping hardware switches. That illusion will suffice for now and the knowledge you’re gathering is real, not fake. Later, you can become a vintage hardware enthusiast and flip switches for real. 

  2. Why the PC? Because it’s ubiquitous. Its innards are ugly and full of pitfalls, but resources about it are plentiful. Moreover, it’s the only hardware that Dusk OS supports at the moment. In other story arcs, we’ll explore other architectures, some of them beautiful. 

  3. The BIOS is a piece of software living in a ROM chip in the computer. Its entry points and behavior are (supposed to be) standard across all PCs and serve as a generic way to access hardware.

  4. If you’re like my former self, you treat emulators like some kind of magical black box. In reality, the way they fundamentally work is simple and straightforward. One day, we’ll explore this, but for now, we can continue to treat it as a black box.

  5. It is packaged on the vast majority of Linux distros, but there’s sometimes one package per target architecture. You’re looking for the “x86” target. For example, on Debian, the package to install is “qemu-system-x86”.

  6. Run “qemu-system-i386 -machine help” for a list. 

  7. … much. Consistency across BIOS implementations in the PC world is far from perfect. 

  8. Did I say “suggest”? I meant “force you to” because that’s the only machine I’m going to cover for now. 

  9. “hda” being a reference to the first IDE drive of the system, so it’s as if your machine was booting up from the HDD.

  10. The “loop” one will be fine though because jumps like this one use relative addressing. More on this later.

  11. Each of the 8086 “general” registers, AX, BX, CX and DX, is divided into two 8-bit sub-registers, “H” (high) and “L” (low).

  12. There are many such registers that serve in different contexts. 

  13. Another way to proceed would be to keep “org” to zero and then set “ES” to 0x7c0. 

  14. “constant” in layman’s terms.

  15. But not for the whole story arc, as we’ll soon dive into the wonderful world of Forth, which OSDev doesn’t cover.

  16. Although I’d be happy if it did!