(Updated March 13, 2023)
Overview
Writing an operating system (OS) is no small feat. Ask Linus Torvalds. First, you must be able to store the OS on some medium that is durable and persistent, then you must write code that makes the OS start upon powering on the hardware on which it is intended to run.
This tutorial is based on the MikeOS tutorial on how to write your own OS.
This tutorial uses MBR and BIOS. Although GPT and UEFI are widely available, this is intended to be a primer on the process, not a treatise on complexity.
If you are looking for the UEFI/GPT way you can go to Write Your Own OS! (UEFI/GPT).
Preparing the Bootloader
Everything begins somewhere. With an operating system on any BIOS platform, it begins with the bootloader: a small piece of software that gets the OS started. It has traditionally been limited to 512 bytes, the size of a disk sector. It does not matter whether the medium is a floppy, CD-ROM, hard drive, or solid-state disk. It can even be a USB stick!
We will start with installing anything we need right away.
sudo apt update && sudo apt install build-essential qemu-system-x86 nasm
Understanding the MBR
In the days of floppy disks, the 512 bytes located at absolute sector 0 (sometimes called Cylinder 0, Head 0, Sector 1) were where everything began.
The MBR or Master Boot Record is one of the oldest methods of booting media for the Intel-based personal computer. It was designed to store and manage limited knowledge about the connected device and how its contents may be accessed using the very old BIOS (Basic Input/Output System) of the Intel personal computer. Of course, BIOS is a more general term for any read-only primitive I/O subsystem. These limited operations generally do not include graphical user interface (GUI) support.
There is enough code in the MBR to relocate itself to another memory location and then search the known disk partitions for the remainder of the code necessary to start the next phase of loading the OS. This all has to fit inside 512 bytes!
The basics of this are well described in OSDev’s MBR documentation. Of course, the MBR has long since been superseded by the newer and more versatile GUID Partition Table (GPT), which is used in conjunction with the BIOS successor, the Unified Extensible Firmware Interface (UEFI).
Understanding the BIOS
The BIOS is a very primitive I/O system to communicate with devices attached to the computer. It is traditionally read-only and part of the base system hardware. There are no protections or security, and writes are performed with impunity. It is expected that the programmer knows what they are doing.
There are some basic rules about bootstrapping the system. These are detailed in the MBR link above, but the most important one is:
The first 512 bytes of the disk (cylinder 0, head 0, sector 1) are loaded into memory at physical memory location 0x7c00. This may seem arbitrary, but it is a long-standing BIOS convention. The boot sector below is written with that address in mind.
BITS 16 ; Tell nasm this is 16-bit mode!
start:
mov ax, 07C0h ; Set up 4K stack space after this bootloader
mov ds, ax ; Set data, extra segment to where we're loaded
mov es, ax
add ax, 288 ; (4096 + 512) / 16 bytes per paragraph
mov ss, ax ; Set the stack segment to 07C0h + 288 paragraphs = 08E0h
mov sp, 4096 ; Set SP to 4096 bytes beyond SS:0000 (the stack grows down)
cld ; never assume direction (DF) is clear!
mov si, text_string ; Put string position into SI
call print_string ; Call our string-printing routine
jmp $ ; Jump here - infinite loop!
text_string times 12 db `\n`
db 'Hello from the boot sector!', 0
print_string: ; Routine: output string in SI to screen
mov ah, 0Eh ; int 10h 'print char' function
.repeat:
lodsb ; Get character from string
cmp al, 0
je .done ; If char is zero, end of string
int 10h ; Otherwise, print it
jmp .repeat
.done:
ret
times 510-($-$$) db 0 ; Pad remainder of boot sector with 0s
dw 0xAA55 ; The standard PC boot signature
How Does This Work?
This section attempts to explain the details of the code based on the labeled section.
start:
WE ARE BOOTING THE OS!
At this point, we have been loaded into memory at location 7c00h.
Starting at the beginning, we need to create a stack. Please note that 16-bit real-mode Intel memory is messy. We lived in a time when 64KB was not enough, but a 16-bit register could not address more than that, because 2^16 = 65536 bytes, or 64KB. So, to address this (pun intended), Intel added segment registers that allow multiple 64KB chunks to be in play at once: CS (code segment), DS (data segment), SS (stack segment), and ES (extra segment).
To make matters worse, Intel had the novel idea that everything was on paragraph boundaries. A paragraph is 16 bytes, and a segment register holds a paragraph number, so every segment starts on a paragraph boundary and a physical address is segment × 16 + offset. This confused some programmers, since the same memory location can have multiple addresses. Location 7c40h in segment 0, or 0000:7c40, is also:
0001:7c30
0002:7c20
0003:7c10
...
07c2:0020
07c3:0010
07c4:0000
For every 16 bytes you remove on the right, add one on the left.
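The aliasing above follows from how real mode forms a physical address: the segment is shifted left by four bits (multiplied by 16) and added to the offset. As a rough sketch, the shell function below redoes that arithmetic so you can check a few of the pairs yourself. The name phys is a made-up helper for this illustration, not part of any tool.

```shell
# phys SEGMENT OFFSET: print the real-mode physical address as 5 hex digits.
# 'phys' is a hypothetical helper name used only for this illustration.
phys() {
    printf '%05x\n' $(( ($1 << 4) + $2 ))
}

phys 0x0000 0x7c40
phys 0x0001 0x7c30
phys 0x07c3 0x0010
phys 0x07c4 0x0000
```

All four calls should print 07c40, which is the point: four different segment:offset pairs, one physical byte.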
Getting back to the assembly code…
- The first three mov instructions load 07c0h into AX and copy it to the data (DS) and extra (ES) segment registers, so that memory references using the SI register (source index) and the lodsb (load string byte) instruction work correctly.
- The add ax, 288 and mov ss, ax instructions add 288 paragraphs to the 07c0h still in AX and move the result, 08e0h, to the stack segment. Wait, you said that the boot sector was loaded at 7c00h! Yes, but paragraphs: segment 07c0h times 16 is 7c00h. The 288 paragraphs are (4096 + 512) / 16, so the stack segment begins 4096 bytes past the end of the 512-byte boot sector. Then mov sp, 4096 sets the stack pointer 4096 bytes above that, giving us a 4096-byte stack. Remember that stacks grow from high to low memory.
- The cld instruction clears the direction flag (DF) so that lodsb moves forward through the string. We have NO IDEA how the BIOS left this flag, so we clear it to make sure.
- The next three instructions put the address of our welcoming message into SI, call print_string, and then spin in an infinite loop (jmp $), which effectively halts the OS.
- The two lines at text_string are the message bytes to be displayed: 12 newlines, the hello message, and a terminating zero byte.
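The stack arithmetic is easy to get wrong, so here is a small sketch that redoes it in the shell. The variable names are made up for this illustration; the constants are the ones used in the boot sector code.

```shell
# Recompute where the boot sector code places the stack.
# Variable names are illustrative; the constants come from the listing.
load_seg=$(( 0x07c0 ))                  # segment the boot sector runs from
paragraphs=$(( (4096 + 512) / 16 ))     # the 288 added to AX
stack_seg=$(( load_seg + paragraphs ))  # value moved into SS
stack_base=$(( stack_seg * 16 ))        # physical address of SS:0000
stack_top=$(( stack_base + 4096 ))      # physical address of SS:SP at start

printf 'SS         = %04xh\n' "$stack_seg"
printf 'stack base = %05xh\n' "$stack_base"
printf 'stack top  = %05xh\n' "$stack_top"
```

This should print an SS of 08e0h, a stack base of 08e00h, and a stack top of 09e00h. The boot sector itself occupies 07c00h through 07dffh, so there is a 4096-byte gap between the end of the code and the bottom of the stack.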
print_string:
This section uses the INT 10h software interrupt to print characters to the screen. This is a raw BIOS write. The value of 0Eh in the AH (high order portion of AX) register means we are using “teletype output.”
This code section also uses a snazzy loop construct of the Intel processor, built around lodsb and the SI register. We assume the direction flag (DF) is clear (zero), which means forward direction. (The cld instruction clears the direction flag while std sets it.)
From the start section, we set the data segment to our boot sector (the mov ds, ax instruction), and the string address was moved into SI (the mov si, text_string instruction). Here are some vital details surrounding the printing.
- The lodsb instruction uses DS:SI as the string pointer to get characters into the AL (low order portion of AX) register.
- There is an automatic advancing of the SI register by using lodsb, so no increment is necessary.
- The loop stops when AL contains zero, which indicates the end of the string, and then we return.
Preparing To Start the OS!
All the tools we need were installed earlier, so we can now assemble the program above. Save the file as boot-sect.asm. Then prepare the program and the disk with the following commands.
nasm -f bin -o boot-sect.bin boot-sect.asm
dd status=noxfer conv=notrunc if=boot-sect.bin of=boot-sect.flp
The first command assembles the code with the -f bin option to produce a bin file. This is the flat-form binary output format: a “plain binary” with no relocation data, generally used for OS images and boot loaders.
The second command copies the bin file into a floppy disk image using the dd command. The status=noxfer option suppresses the transfer statistics line that dd normally prints when it finishes. The conv=notrunc option tells dd not to truncate the output file: if boot-sect.flp already exists as a larger disk image, only its first 512 bytes are overwritten and the rest is left in place.
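Before booting the image, it can be reassuring to confirm that the assembled file really is a boot sector. The padding to 510 bytes and the dw 0xAA55 line in the listing should produce a file of exactly 512 bytes whose last two bytes are 55 and aa (the word is stored low byte first). The sketch below checks both; check_boot_sector is a hypothetical helper name, and it assumes the common wc, tail, od, and tr utilities are available.

```shell
# check_boot_sector FILE: succeed if FILE is 512 bytes and ends in 55 aa.
# 'check_boot_sector' is a hypothetical helper, not a standard tool.
check_boot_sector() {
    size=$(wc -c < "$1" | tr -d ' ')
    sig=$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$size" -eq 512 ] && [ "$sig" = "55aa" ]
}

# Only run the check if the assembled file is present.
if [ -f boot-sect.bin ]; then
    if check_boot_sector boot-sect.bin; then
        echo "boot-sect.bin looks like a valid boot sector"
    else
        echo "boot-sect.bin is not a valid boot sector"
    fi
fi
```

If the check fails, the usual suspects are a missing times 510-($-$$) db 0 line or code that has grown past 510 bytes.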
Now we are ready to launch our OS!
qemu-system-i386 -drive file=boot-sect.flp,format=raw
From here, we could make a bigger OS, but that would require arranging memory and loading more code from the disk.
An excellent place to start would be to look at what MikeOS does. Its design keeps the OS small by sticking to a 64KB model with 32KB reserved for the kernel and 32KB reserved for user applications. Since it is operating in real mode on the x86 processor, you also have complete access to the BIOS’ ability to perform I/O. Remember that MikeOS is not intended to be Microsoft Windows or macOS. It’s a very simple OS for people who want to learn how to make one.
References and Further Reading
http://mikeos.sourceforge.net/write-your-own-os.html
https://wiki.osdev.org/BIOS
https://wiki.osdev.org/UEFI
https://wiki.osdev.org/MBR_(x86)
https://wiki.osdev.org/GPT
https://wiki.osdev.org/Rolling_Your_Own_Bootloader