Writing a simple Operating System from scratch - Learn + improve + enjoy !

Writing a simple Operating system from scratch

Here are my notes / work in progress while reading and trying things with Writing a simple Operating system from scratch.pdf (originally found here, looks like there are some nice docs there).

Create a blank disk image :

  1. dd if=/dev/zero of=./01_boot.img bs=1 count=1024
  2. Boot it : qemu -curses -k fr -hda 01_boot.img (recent QEMU packages name the binary qemu-system-i386 or qemu-system-x86_64)
  3. It tries all devices, and ends on : No bootable device.
  4. To leave : killall -1 qemu

Edit the image file with Emacs :

  1. within Emacs : Alt-x hexl-find-file, then (source) :
    • Ctrl-Alt-d : insert a byte with a code typed in decimal
    • Ctrl-Alt-o : insert a byte with a code typed in octal
    • Ctrl-Alt-x : insert a byte with a code typed in hex
  2. Start with (hex values): e9 fd ff
  3. End (512 bytes chunk, MBR) : 55 aa
  4. The final result should look like :
    e9 fd ff 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa
  5. Then boot : Booting from Hard Disk...
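
The same 512-byte sector can also be generated by a script instead of typed into a hex editor. This is only a sketch: the function names and the output path are made up for illustration, and the byte values are the ones shown above.

```python
SECTOR_SIZE = 512  # a boot sector is exactly one 512-byte sector


def make_boot_sector():
    """Return an endless jump, zero padding, then the 55 aa signature."""
    code = bytes([0xE9, 0xFD, 0xFF])   # same three bytes as typed in Emacs
    signature = bytes([0x55, 0xAA])    # must end up at offsets 510 and 511
    padding = bytes(SECTOR_SIZE - len(code) - len(signature))
    return code + padding + signature


def write_image(path):
    """Write the sector to `path` (hypothetical helper, not called here)."""
    with open(path, "wb") as image:
        image.write(make_boot_sector())
```

Calling write_image("01_boot.img") should produce an image equivalent to the hand-edited one.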

Boot sector re-visited :

  1. apt install nasm
  2. edit 02_boot_sect.asm :
    ;
    ; A simple boot sector program that loops forever.
    ;
    
    loop:			; Define a label, "loop", that will allow
    			; us to jump back to it, forever.
    
    jmp loop		; Use a simple CPU instruction that jumps
    			; to a new memory address to continue execution.
    			; In our case, jump to the address of the current
    			; instruction.
    
    times 510 -($-$$) db 0	; When compiled, our program must fit into 512 bytes,
    			; with the last two bytes being the magic number,
    			; so here, tell our assembly compiler to pad out our
    			; program with enough zero bytes ( db 0) to bring us to the
    			; 510th byte.
    
    dw 0xaa55		; Last two bytes ( one word ) form the magic number,
    			; so BIOS knows we are a boot sector.
  3. compile : nasm 02_boot_sect.asm -f bin -o 02_boot_sect.bin
  4. then boot : qemu -curses -k fr 02_boot_sect.bin
  5. result is the same as above : Booting from Hard Disk...
  6. let's have a look INSIDE 02_boot_sect.bin : od -t x1 -A n 02_boot_sect.bin
    eb fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa
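    e9 fd ff became eb fe : both encode a jump to itself. e9 is the near jmp with a 16-bit relative displacement (0xfffd = -3, which steps back over the 3-byte instruction), while eb is the short jmp with an 8-bit displacement (0xfe = -2, for a 2-byte instruction). nasm simply picked the shorter encoding.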
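
To check that both byte sequences really jump to themselves, here is a small sketch that decodes the two relative jmp encodings. The function name is an assumption for illustration, and only opcodes eb and e9 in 16-bit code are handled.

```python
def jump_target(address, code):
    """Return the address reached by a relative jmp located at `address`."""
    opcode = code[0]
    if opcode == 0xEB:      # jmp short: 1-byte signed displacement
        length = 2
    elif opcode == 0xE9:    # jmp near: 2-byte signed displacement (16-bit code)
        length = 3
    else:
        raise ValueError("not a relative jmp")
    displacement = int.from_bytes(code[1:length], "little", signed=True)
    # The displacement is counted from the address of the NEXT instruction.
    return (address + length + displacement) & 0xFFFF
```

Both jump_target(0, bytes.fromhex("e9 fd ff")) and jump_target(0, bytes.fromhex("eb fe")) give back the address they started from.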

16-bit real mode :

16-bit real mode contrasts with 32-bit protected mode (and 64-bit long mode), where the OS can prevent a user process from accessing kernel memory. In real mode there is no such protection.

CPU registers :

x86 processors have four general-purpose 16-bit registers : ax, bx, cx, dx (details).

To store a value in a register, we "move" it : mov register, value
It's also possible to address either the high byte or the low byte of a register : ah, al, ...
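
The relation between a 16-bit register and its two halves is plain bit arithmetic. A minimal sketch (the helper names are invented for illustration):

```python
def split_register(ax):
    """Return (ah, al): the high and low bytes of a 16-bit value."""
    return (ax >> 8) & 0xFF, ax & 0xFF


def join_register(ah, al):
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)
```

For instance, with ah = 0x0e and al = 'H' (0x48), ax as a whole holds 0x0e48.
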
  1. Let's edit 03_boot_sect_hello.asm :
    ;
    ; A simple boot sector that prints a message to the screen using a BIOS routine.
    ;
    mov ah, 0x0e		; int 10/ah = 0eh -> scrolling teletype BIOS routine
    mov al, 'H'		; char to display into al
    int 0x10		; call the display routine
    mov al, 'e'
    int 0x10
    mov al, 'l'
    int 0x10
    mov al, 'l'
    int 0x10
    mov al, 'o'
    int 0x10
    
    jmp $			; Jump to the current address ( i.e. forever ).
    
    ;
    ; Padding and magic BIOS number.
    ;
    times 510 -($-$$ ) db 0	; Pad the boot sector out with zeros
    
    dw 0xaa55		; Last two bytes form the magic number,
    			; so BIOS knows we are a boot sector.
  2. compile : nasm 03_boot_sect_hello.asm -f bin -o 03_boot_sect_hello.bin
  3. boot :
    qemu -curses -k fr 03_boot_sect_hello.bin
    qemu 03_boot_sect_hello.bin
  4. it says :
    Booting from Hard Disk...
    Hello
  5. let's have a look into 03_boot_sect_hello.bin : hd 03_boot_sect_hello.bin
    00000000	b4 0e b0 48 cd 10 b0 65		cd 10 b0 6c cd 10 b0 6c		|...H...e...l...l|
    00000010	cd 10 b0 6f cd 10 eb fe		00 00 00 00 00 00 00 00		|...o............|
    00000020	00 00 00 00 00 00 00 00		00 00 00 00 00 00 00 00		|................|
    
    000001f0	00 00 00 00 00 00 00 00		00 00 00 00 00 00 55 aa		|..............U.|
    00000200
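
The dump can be read back almost by eye: every instruction in this program is one opcode byte followed by one operand byte. The sketch below only knows that much (it is not a disassembler) and collects the operands of the b0 opcode, i.e. mov al, imm8. It also shows why dw 0xaa55 appears as 55 aa: x86 stores words with the low byte first.

```python
import struct

# First 24 bytes of 03_boot_sect_hello.bin, copied from the dump above.
CODE = bytes.fromhex(
    "b4 0e b0 48 cd 10 b0 65 cd 10 b0 6c cd 10 b0 6c"
    "cd 10 b0 6f cd 10 eb fe"
)


def al_immediates(code):
    """Collect the characters loaded into al by each 'b0 xx' pair."""
    chars = []
    for i in range(0, len(code), 2):   # all instructions here are 2 bytes long
        opcode, operand = code[i], code[i + 1]
        if opcode == 0xB0:             # mov al, imm8
            chars.append(chr(operand))
    return "".join(chars)


# 'dw 0xaa55' stored little-endian: low byte 0x55 first, then 0xaa.
MAGIC = struct.pack("<H", 0xAA55)
```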

Memory, addresses and labels :

BIOS loads the boot sector to the address 0x7c00.

X marks the spot :

  1. let's edit 04_boot_sect_X.asm :
    ;
    ; A simple boot sector program that demonstrates addressing.
    ;
    mov ah, 0x0e		; int 10/ah = 0eh -> scrolling teletype BIOS routine
    
    ; First attempt
    mov al, the_secret	; copy the value of 'the_secret' into al
    int 0x10		; Does this print an X?
    
    ; Second attempt
    mov al, [the_secret]	; copy the contents of address 'the_secret' into al
    int 0x10		; Does this print an X?
    
    ; Third attempt
    mov bx, the_secret
    add bx, 0x7c00
    mov al, [bx]
    int 0x10		; Does this print an X?
    
    ; Fourth attempt
    ;mov al, [0x7c1e]
    mov al, [0x7c1d]
    int 0x10		; Does this print an X?
    
    jmp $			; Jump forever.
    
    the_secret:		; this is just a label to mark the offset of 'X' since the beginning of this program
    			; this label will be removed by nasm, and every reference to it will be replaced by its actual value
    db "X"			; 'db' : 'declare byte(s) of data'. The quotes instruct the assembler to replace each character with its ASCII value
    			; if the data has several characters, the label will point to the first of them
    
    
    times 510-($-$$ ) db 0	; Padding
    dw 0xaa55		; magic BIOS number
  2. nasm 04_boot_sect_X.asm -f bin -o 04_boot_sect_X.bin
  3. hd 04_boot_sect_X.bin outputs :
    00000000	b4 0e b0 1d cd 10 a0 1d		00 cd 10 bb 1d 00 81 c3		|................|
    00000010	00 7c 8a 07 cd 10 a0 1e		7c cd 10 eb fe 58 00 00		|.|......|....X..|
    00000020	00 00 00 00 00 00 00 00		00 00 00 00 00 00 00 00		|................|
    
    000001f0	00 00 00 00 00 00 00 00		00 00 00 00 00 00 55 aa		|..............U.|
    00000200
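    In my case, the offset of 'X' is 0x1d (the byte 58). Note that this dump was taken with the commented-out variant of the fourth attempt : it contains a0 1e 7c, i.e. mov al, [0x7c1e], whereas the listing above reads from 0x7c1d.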
  4. Then boot : qemu -curses -k fr 04_boot_sect_X.bin
    Booting from Hard Disk...
    ?XX
    "?" is actually a non-ASCII character
  5. With [org 0x7c00] at the very beginning of the program, the display becomes :
    Booting from Hard Disk...
    X X
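
Which attempts find the X comes down to address arithmetic. The sketch below computes the address each attempt reads, assuming ds is 0 at boot (an assumption, not something the listing sets) and taking the offset 0x1d from the dump above. The function names are invented for illustration.

```python
LOAD_ADDRESS = 0x7C00   # where BIOS copies the boot sector
X_OFFSET = 0x1D         # offset of 'X' inside the sector, from the hex dump


def addresses_read(org):
    """Address read by each attempt; None means no memory is read at all."""
    the_secret = org + X_OFFSET   # value nasm substitutes for the label
    return [
        None,                             # 1: mov al, the_secret (label value itself)
        the_secret,                       # 2: mov al, [the_secret]
        (the_secret + 0x7C00) & 0xFFFF,   # 3: mov al, [bx] after add bx, 0x7c00
        0x7C1D,                           # 4: mov al, [0x7c1d]
    ]


def hits_x(org):
    """For each attempt, tell whether it reads the byte where 'X' was loaded."""
    x_address = LOAD_ADDRESS + X_OFFSET
    return [address == x_address for address in addresses_read(org)]
```

Without org (org = 0) attempts 3 and 4 reach the X. With [org 0x7c00], attempts 2 and 4 do, and attempt 3 overshoots because 0x7c00 gets added twice.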

Using the stack :

push / pop : in 16-bit mode, the stack works only on 16-bit boundaries, we cannot work with single bytes.
The stack is implemented by two special CPU registers (remember the x86 registers ?) :
  • bp : address of the stack base (i.e. bottom, actually : base pointer)
  • sp : address of the stack top (i.e. summit, actually : stack pointer)

The stack grows downwards from the base pointer. So when we issue a push, the value is stored below the address held in bp (not above it), and sp is decremented by the value's size, i.e. 2 bytes in 16-bit mode.

  1. Let's edit 05_boot_sect_stack.asm :
    ;
    ; A simple boot sector program that demonstrates the stack.
    ;
    mov ah, 0x0e		; int 10/ah = 0eh -> scrolling teletype BIOS routine
    
    mov bp, 0x8000		; Set the base of the stack a little above where BIOS
    mov sp, bp		; loads our boot sector - so it won't overwrite us.
    
    push 'A'		; Push some characters on the stack for later retrieval.
    push 'B'		; Note, these are pushed on as 16-bit values, so the
    push 'C'		; most significant byte will be added by our assembler as 0x00.
    
    pop bx			; Note, we can only pop 16-bits, so pop to bx
    mov al, bl		; then copy bl (i.e. 8-bit char) to al
    int 0x10		; print (al)
    
    pop bx			; Pop the next value
    mov al, bl
    int 0x10		; print (al)
    
    mov al, [0x7ffe]	; To prove our stack grows downwards from bp,
    			; fetch the char at 0x8000 - 0x2 (i.e. 16-bits)
    int 0x10		; print (al)
    
    jmp $			; Jump forever : in nasm, '$' is the address of the start of the current line
    
    times 510-($-$$) db 0	; Padding
    dw 0xaa55		; magic BIOS number
  2. compile : nasm 05_boot_sect_stack.asm -f bin -o 05_boot_sect_stack.bin
  3. then boot : qemu -curses -k fr 05_boot_sect_stack.bin
  4. which outputs :
    Booting from Hard Disk...
    CBA
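
The CBA output can be reproduced with a toy model of the stack : a 64 KB byte array, sp starting at 0x8000, 2 bytes per push. The class below is a hypothetical sketch for checking the arithmetic, not an emulator.

```python
class Stack16:
    """Toy model of a 16-bit stack growing downwards from `base`."""

    def __init__(self, base=0x8000):
        self.memory = bytearray(0x10000)
        self.bp = base
        self.sp = base

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF          # sp moves down first
        self.memory[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self):
        value = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF          # sp moves back up
        return value


stack = Stack16()
for char in "ABC":
    stack.push(ord(char))
first = chr(stack.pop() & 0xFF)        # like: pop bx / mov al, bl
second = chr(stack.pop() & 0xFF)
third = chr(stack.memory[0x7FFE])      # like: mov al, [0x7ffe]
```

'A' was pushed first, so it sits right below the base at 0x7ffe, which is why the direct memory read returns it.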

Question 3 :

  1. ;
    ; EXERCISE : code this
    ;
    mov bx, 30
    if (bx <= 4) {
    	mov al, 'A'
    	}
    else if (bx < 40) {
    	mov al, 'B'
    	}
    else {
    	mov al, 'C'
    	}
    
    ...
  2. 06_boot_sect_jump.asm :
    ;
    ; MY ANSWER
    ;
    mov bx, 30	; should display 'B'
    ;mov bx, 3	; should display 'A'
    ;mov bx, 50	; should display 'C'
    
    cmp bx, 4
    jle lessOrEqualThan4
    
    cmp bx, 40
    jl lessThan40
    
    mov al, 'C'
    jmp theEnd
    
    lessOrEqualThan4:
    mov al, 'A'
    jmp theEnd
    
    lessThan40:
    mov al, 'B'
    jmp theEnd
    
    theEnd:
    mov ah, 0x0e	; int=10/ah=0x0e -> BIOS tele-type output
    int 0x10		; print the character in al
    
    jmp $
    
    ; Padding and magic number.
    times 510-($-$$) db 0
    dw 0xaa55
  3. compile then boot : nasm 06_boot_sect_jump.asm -f bin -o 06_boot_sect_jump.bin && qemu -curses -k fr 06_boot_sect_jump.bin
  4. try it with different values in bx
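
When trying different values, a reference for the expected letter helps, especially at the boundaries 4 and 40. The sketch below also models one point that is easy to miss : jle and jl compare bx as a signed 16-bit number, whereas the unsigned equivalents are jbe and jb. The function names are made up for illustration.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a signed number, as jl / jle do."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def expected_letter(bx, signed=True):
    """Letter the exercise should print for a given bx."""
    value = to_signed16(bx) if signed else bx & 0xFFFF
    if value <= 4:
        return "A"
    if value < 40:
        return "B"
    return "C"
```

With the signed jumps used in the answer, bx = 0xffff counts as -1 and prints 'A'. An unsigned comparison would treat it as 65535 and print 'C'.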

Question 4 :

  1. Given :
    ;
    ; EXERCISE : A boot sector that prints a string using our function.
    ;
    [org 0x7c00]		; Tell the assembler where this code will be loaded
    mov bx, HELLO_MSG	; Use bx as a parameter to our function, so
    call print_string	; we can specify the address of a string.
    
    mov bx, GOODBYE_MSG
    call print_string
    jmp $			; Hang
    
    %include "print_string.asm"
    
    ; Data
    HELLO_MSG:
    db 'Hello, World !', 0	; <-- The zero on the end tells our routine
    			; when to stop printing characters.
    
    GOODBYE_MSG:
    db 'Goodbye !', 0
    
    ; Padding and magic number.
    times 510-($-$$) db 0
    dw 0xaa55
    Write print_string.asm
  2. ;
    ; MY ANSWER : print_string.asm
    ;
    
    ; arg :		BX, start of the string to print
    ; return :	nothing
    print_string:
    	pusha
    
    	print_string_start:
    
    	cmp byte[bx], 0
    	je leave_print_string_function
    
    	mov al, [bx]
    	call print_character
    
    ;	add bx, 0x01		; jump to the next character of the string to display
    	inc bx			; same as above, different syntax
    
    	jmp print_string_start
    
    	leave_print_string_function:
    
    	mov al, 0x0d		; bonus : add a carriage return
    	call print_character
    	mov al, 0x0a		; ... and a line feed
    	call print_character
    
    	popa
    	ret
    
    
    ; arg :		AL, character to print
    ; return :	nothing
    print_character:
    	pusha
    	mov ah, 0x0e		; int=10/ah=0x0e -> BIOS tele-type output
    	int 0x10		; print the character in al
    	popa
    	ret
  3. compile and boot (with the given listing saved as 07_boot_sect_print_string.asm) : nasm 07_boot_sect_print_string.asm -f bin -o 07_boot_sect_print_string.bin && qemu -curses -k fr 07_boot_sect_print_string.bin
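
The logic of print_string is a walk over memory until a zero byte. A Python sketch of the same loop, with the string placed at a made-up address (0x7c20 is purely illustrative, the real address depends on the assembled code) :

```python
def print_string(memory, bx):
    """Collect characters from memory[bx] onwards until a 0 byte."""
    output = []
    while memory[bx] != 0:        # cmp byte[bx], 0 / je leave
        output.append(chr(memory[bx]))
        bx += 1                   # inc bx
    output.append("\r\n")         # the bonus carriage return + line feed
    return "".join(output)


memory = bytearray(0x10000)
HELLO_MSG = 0x7C20                # hypothetical address of the label
message = b"Hello, World !\x00"
memory[HELLO_MSG:HELLO_MSG + len(message)] = message
```

Without the trailing 0, the loop would keep going through whatever bytes follow the string.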

For reference, x86 assembly guides : 1, 2.

Question 5 :

  1. Code this :
    mov dx, 0x1fb6	; store the value to print in dx
    call print_hex	; call the function
    
    ; prints the value of DX as hex.
    print_hex:
    	; TO DO: manipulate chars at HEX_OUT to reflect DX
    	mov bx, HEX_OUT		; print the string pointed to
    	call print_string	; by BX
    	ret
    
    ; global variables
    HEX_OUT: db '0x0000', 0
  2. some intermediate notes :
    • mov [HEX_OUT+5], 3 : square brackets mean that the operand points to a memory location (source)
    • mov [HEX_OUT+5], 3 : nasm will complain operation size not specified because it has no idea whether this value should be stored as a single-byte (8 bits) value ("byte"), 2-byte (16 bits) value ("word"), 4-byte (32 bits) value ("double word"), or 8-byte (64 bits) value ("quad word").
      To fix it, be more explicit : mov byte[HEX_OUT+5], 3
    • mov byte[HEX_OUT+5], 42 : will store the decimal value 42 (the ASCII character *) into the specified address.
      To store a B (42h), be explicit : mov byte[HEX_OUT+5], 42h, or mov byte[HEX_OUT+5], 0x42
    • Don't systematically wrap every function in pusha / popa : popa restores all the general-purpose registers, so any result the function left in a register is overwritten and the caller only gets back the values it passed in.
    • To shift by a variable number of bits (shr register, n), n must be stored in cl : the x86 shift instructions only encode the count as an immediate value or as the cl register.
  3. 08_boot_sect_print_hex.asm :
    ;
    ; EXERCISE : Print a hexadecimal value.
    ;
    [org 0x7c00]		; Tell the assembler where this code will be loaded
    
    
    mov dx, 0x1fb6		; store the value to print in dx
    call print_hex		; call the function
    
    mov dx, 0x1234		; store the value to print in dx
    call print_hex		; call the function
    
    mov dx, 0xcaca		; store the value to print in dx
    call print_hex		; call the function
    
    
    jmp $			; Hang
    
    
    ; arg :		DX, the hex value to print
    ; return :	nothing
    ; the idea is to convert the value to print into characters, then insert those into HEX_OUT
    print_hex:
    	pusha
    
    	; 1st digit from the right
    	mov cx, dx
    	; nothing to shift here
    
    	and cx, 0x000f		; select 1 digit with a logical AND, store result in cx
    	call changeHexDigitIntoAscii
    	mov byte[HEX_OUT+5], cl	; store the converted char as the rightmost char of HEX_OUT :
    				; HEX_OUT : '0x....'
    				; offset : 012345
    
    	; 2nd digit from the right
    	mov cx, dx
    	shr cx, 4
    
    	and cx, 0x000f
    	call changeHexDigitIntoAscii
    	mov byte[HEX_OUT+4], cl	; store the converted char as the 2nd char from the right of HEX_OUT
    
    
    	; 3rd digit from the right
    	mov cx, dx
    	shr cx, 8
    
    	and cx, 0x000f
    	call changeHexDigitIntoAscii
    	mov byte[HEX_OUT+3], cl	; store the converted char as the 3rd char from the right of HEX_OUT
    
    
    	; 4th digit from the right
    	mov cx, dx
    	shr cx, 12
    
    	and cx, 0x000f
    	call changeHexDigitIntoAscii
    	mov byte[HEX_OUT+2], cl	; store the converted char as the 4th char from the right of HEX_OUT
    
    
    	; Done converting. Let's print the result
    	mov bx, HEX_OUT		; print the string pointed to
    	call print_string	; by BX
    
    	popa
    	ret
    
    
    ; arg :	CX, single digit
    ; return :	CX, corresponding ASCII value
    changeHexDigitIntoAscii:
    	cmp cl, 0xa
    	jl digitValueIsLessThanA
    
    	; the current hex digit is within [a-f]
    	add cx, 0x57		; convert hex value 0xa into ASCII value 'a'
    	ret
    
    	digitValueIsLessThanA:	; the current hex digit is within [0-9]
    				; let's add 0x30 to change it into its ASCII value
    	add cx, 0x30
    	ret
    
    
    %include "print_string.asm"
    
    
    ; global variables
    HEX_OUT: db '0x1234', 0
    
    
    ; Padding and magic number.
    times 510-($-$$) db 0
    dw 0xaa55
    I'm not completely satisfied with this solution : too much copy-pasted code. I've tried (but not succeeded yet) to write a function taking dx and the right shift offset as parameters... (I'm open to suggestions)
  4. compile and boot : nasm 08_boot_sect_print_hex.asm -f bin -o 08_boot_sect_print_hex.bin && qemu -curses -k fr 08_boot_sect_print_hex.bin
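
One way to remove the copy-pasted blocks, sketched here in Python rather than as tested assembly : instead of shifting by 0, 4, 8 and 12, shift the working copy right by 4 on every pass and fill HEX_OUT from right to left. The shift count is then constant, so no variable count in cl is needed. The function names are invented for illustration.

```python
def hex_digit_to_ascii(digit):
    """Same rule as changeHexDigitIntoAscii : 0-9 -> '0'-'9', 10-15 -> 'a'-'f'."""
    if digit < 0xA:
        return digit + 0x30
    return digit + 0x57


def format_hex(dx):
    """Fill a copy of HEX_OUT with the four hex digits of a 16-bit value."""
    hex_out = bytearray(b"0x0000")
    index = 5                                  # rightmost character of HEX_OUT
    for _ in range(4):
        hex_out[index] = hex_digit_to_ascii(dx & 0x000F)
        dx >>= 4                               # next digit moves into the low nibble
        index -= 1
    return hex_out.decode("ascii")
```

In assembly the same structure would presumably be a counter, a pointer walking down from HEX_OUT+5, and a single shr by 4 per iteration.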

Extended Memory Access Using Segments :

  1. Let's edit 09_boot_sect_segment_offsetting.asm :

    0x7c0 refers to 0x7c00 : to calculate the absolute address, the CPU multiplies the value in the segment register by 16 (one hex digit to the left) and then adds the offset, so 0x7c0 * 16 + the_secret = 0x7c00 + the_secret

    ;
    ; A simple boot sector program that demonstrates segment offsetting
    ;
    mov ah, 0x0e			; int 10/ah = 0eh -> scrolling teletype BIOS routine
    
    mov al, [the_secret]
    int 0x10			; Does this print an X ?
    
    mov bx, 0x7c0			; Can't set ds directly, so set bx
    mov ds, bx			; then copy bx to ds.
    mov al, [the_secret]
    int 0x10			; Does this print an X ?
    
    mov al, [es:the_secret]		; Tell the CPU to use the 'es' (not 'ds') segment.
    int 0x10			; Does this print an X ?
    
    mov bx, 0x7c0
    mov es, bx
    mov al, [es:the_secret]
    int 0x10			; Does this print an X ?
    
    jmp $				; Jump forever.
    
    the_secret:
    db "X"
    
    times 510-($-$$) db 0		; Padding.
    dw 0xaa55			; magic BIOS number.
  2. compile and boot : programName='09_boot_sect_segment_offsetting'; nasm "$programName.asm" -f bin -o "$programName.bin" && qemu -curses -k fr "$programName.bin"
  3. Booting from Hard Disk...
    XX
    Two X's for four attempts : this doesn't tell which attempts succeeded.
  4. Let's edit 09_boot_sect_segment_offsetting.asm again :
    ;
    ; A simple boot sector program that demonstrates segment offsetting
    ;
    mov ah, 0x0e			; int 10/ah = 0eh -> scrolling teletype BIOS routine
    
    mov al, [the_secret_a]
    int 0x10			; Does this print an 'A' ?
    
    mov bx, 0x7c0			; Can't set ds directly, so set bx
    mov ds, bx			; then copy bx to ds.
    mov al, [the_secret_b]
    int 0x10			; Does this print a 'B' ?
    
    mov al, [es:the_secret_c]	; Tell the CPU to use the 'es' (not 'ds') segment.
    int 0x10			; Does this print a 'C' ?
    
    mov bx, 0x7c0
    mov es, bx
    mov al, [es:the_secret_d]
    int 0x10			; Does this print a 'D' ?
    
    jmp $				; Jump forever.
    
    the_secret_a:
    db "A"
    
    the_secret_b:
    db "B"
    
    the_secret_c:
    db "C"
    
    the_secret_d:
    db "D"
    
    times 510-($-$$) db 0		; Padding.
    dw 0xaa55			; magic BIOS number.
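  5. Booting from Hard Disk...
    B D
    Only the 2nd and 4th attempts print their letter. There is no [org 0x7c00] here, so each label is just an offset from the start of the sector, and it only reaches the loaded code when the segment register in use holds 0x7c0. Attempts 1 and 3 run while ds and es still hold their boot-time value (presumably 0), so they read low memory instead.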
  6. (to be continued ...)
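
The segment arithmetic can be checked with a short sketch. The label offsets used in the usage line are hypothetical (the real ones depend on the assembled code), and the initial values of ds and es are assumed to be 0, which the listing itself never guarantees.

```python
LOAD_SEGMENT = 0x7C0   # segment value loaded into ds, then es, in the listing


def linear_address(segment, offset):
    """Real-mode address translation : segment * 16 + offset."""
    return (segment << 4) + offset


def which_attempts_hit(label_offsets, initial_ds=0, initial_es=0):
    """For the four attempts, tell which ones read their own label's byte."""
    a, b, c, d = label_offsets
    reads = [
        linear_address(initial_ds, a),     # attempt 1 : ds not set yet
        linear_address(LOAD_SEGMENT, b),   # attempt 2 : ds = 0x7c0
        linear_address(initial_es, c),     # attempt 3 : es not set yet
        linear_address(LOAD_SEGMENT, d),   # attempt 4 : es = 0x7c0
    ]
    targets = [0x7C00 + offset for offset in label_offsets]
    return [read == target for read, target in zip(reads, targets)]
```

Under those assumptions, which_attempts_hit((0x20, 0x21, 0x22, 0x23)) flags only the 2nd and 4th attempts, which matches the B D output.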