What happens when a typical computer is powered on?
Normally, when a computer is turned on, the power button signals the power supply to deliver the proper voltage to the computer and its components, such as the CPU, monitor, keyboard and mouse. The CPU then starts executing the program stored in the Basic Input Output System (BIOS) read-only memory chip. This program is itself called the BIOS, and its functionality is described below.
- The BIOS is a special program that is embedded in the BIOS chip.
- When the BIOS program is executed, it performs the following tasks.
- Runs the Power-On Self-Test (POST).
- Checks the clocks and the various buses available.
- Checks the system clock and hardware information in CMOS RAM.
- Verifies the pre-configured system and hardware settings.
- Tests the attached hardware, starting with devices like RAM, disk drives, optical drives, hard drives and so on.
- Searches for a boot drive according to the boot device order pre-configured in the BIOS settings, and starts initializing that drive to proceed further.
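The boot drive search in the last step can be pictured with a small C sketch. This is only an illustration under stated assumptions: the `boot_device` structure, its fields and the function `find_boot_device` are hypothetical names invented here, and a real BIOS probes hardware rather than reading flags from an array.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical description of one entry in the configured boot order. */
struct boot_device {
    const char *name;      /* label used only for illustration */
    bool present;          /* the device answered during the hardware tests */
    bool has_boot_sector;  /* the first sector of the device holds boot code */
};

/* Walk the configured order and return the index of the first device that
 * is both present and bootable, or -1 if no such device exists. */
int find_boot_device(const struct boot_device *order, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (order[i].present && order[i].has_boot_sector)
            return (int)i;
    }
    return -1;
}
```

The sketch stops at the first match, which mirrors the idea that the position of a device in the settings decides which one boots.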
Note:
All x86 compatible CPUs start in an operating mode called Real Mode when booting.
What is a bootable device?
A device is bootable if it contains a boot sector (or boot block). The BIOS reads such a device by first loading the boot sector into memory (RAM) for execution, and then proceeds further.
What is a sector?
A sector is a fixed-size division of a disk, usually 512 bytes in size. I will explain more about how computer memory is measured, and the terminology associated with it, in the coming sections.
What is a boot sector?
A boot sector or a boot block is a region on a bootable device that contains machine code to be loaded into RAM by a computer system’s built-in firmware during its initialization. It is 512 bytes long on a floppy disk. You will come to know more about bytes in the coming sections.
How does a bootable device work?
Whenever a bootable device is initialized, the BIOS finds the first sector, known as the boot sector or boot block, loads it into RAM and starts executing it. The code that resides inside the boot sector is the first program you can edit to define how the computer behaves from then on. What I mean is that you can write your own code and copy it to the boot sector to make the computer work according to your requirements. The program you write to the boot sector of a device is also called a boot loader.
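The loading step can be sketched in C against a simulated memory image. Two details below are not stated in this article and are added as assumptions about conventional PC BIOS behaviour: the sector is copied to address 0x7C00, and the last two bytes of the sector are expected to be 0x55 and 0xAA. The function name `load_boot_sector` is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE   512u
#define LOAD_ADDRESS  0x7C00u           /* assumed load address of the boot sector */
#define RAM_SIZE      (1024u * 1024u)   /* 1 MiB of simulated real-mode memory */

/* Copy the first sector of a disk image into simulated RAM.
 * `ram` must point to at least RAM_SIZE bytes.
 * Returns the address where execution would begin, or 0 on failure. */
uint32_t load_boot_sector(const uint8_t *disk, size_t disk_len, uint8_t *ram)
{
    if (disk == NULL || ram == NULL || disk_len < SECTOR_SIZE)
        return 0;
    /* Assumed signature check: bytes 510 and 511 must be 0x55 and 0xAA. */
    if (disk[510] != 0x55 || disk[511] != 0xAA)
        return 0;
    memcpy(ram + LOAD_ADDRESS, disk, SECTOR_SIZE);
    return LOAD_ADDRESS;
}
```

A disk image without the signature is rejected, which is one way to model a device that is not bootable.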
What is a Boot Loader?
In computing, a boot loader is a special program that is executed each time a bootable device is initialized by the computer during power on or reset. It is executable machine code that is very specific to the hardware architecture of the CPU or microprocessor.
How many types of microprocessor are available?
The main types are listed below.
- 16 bit
- 32 bit
- 64 bit
Normally, the more bits a microprocessor works with, the more memory its programs can address and the wider the registers they can use for temporary storage, which improves performance. There are two major manufacturers of microprocessors in business today: Intel and AMD. Throughout the rest of this article I will refer only to the Intel-based (x86) family of microprocessors.
What is the difference between Intel based microprocessors and AMD based microprocessors?
Each company has its own way of designing microprocessors in terms of hardware, although AMD's x86 processors implement the same core x86 instruction set as Intel's.
Introduction to the development environment
What is Real Mode?
Earlier, in the section “What happens when a typical computer is powered on?”, I mentioned that all x86 CPUs start in real mode when booting from a device. It is very important to keep this in mind while writing boot code for any device. Real mode works with 16-bit instructions by default, so the code you write for the boot record or boot sector of a device should be compiled to 16-bit compatible code. In real mode, instructions work with at most 16 bits at once. For example, a 16-bit CPU has an instruction that adds two 16-bit numbers in a single step; if a program needs to add two 32-bit numbers, it takes more steps, each of which uses 16-bit addition.
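The 32-bit addition mentioned above can be worked through in C. This is a minimal sketch of the idea, with the hypothetical function name `add32_using_16bit`: the low halves are added first, then the high halves together with the carry. On x86 this pattern corresponds to an ADD followed by an ADC instruction.

```c
#include <stdint.h>

/* Add two 32-bit values using only 16-bit additions, the way a 16-bit CPU
 * would: low halves first, then high halves plus the carry. */
uint32_t add32_using_16bit(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu);
    uint16_t a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu);
    uint16_t b_hi = (uint16_t)(b >> 16);

    uint16_t lo = (uint16_t)(a_lo + b_lo);          /* first 16-bit addition */
    uint16_t carry = (lo < a_lo) ? 1u : 0u;         /* a wrapped result means carry out */
    uint16_t hi = (uint16_t)(a_hi + b_hi + carry);  /* second addition, with carry */

    return ((uint32_t)hi << 16) | lo;
}
```

For example, adding 0x0000FFFF and 1 produces a carry out of the low half, so the result is 0x00010000.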
What is an instruction set?
An instruction set is a heterogeneous collection of entities, very specific to the architecture (in terms of design) of a microprocessor, that a user can use to interact with that microprocessor. By a collection of entities I mean the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. Usually a common group of instructions is made available across a family of microprocessors. The 8086 microprocessor belongs to the family of 8086, 80286, 80386, 80486, Pentium, Pentium I, II, III …, also referred to as the x86 family. Throughout this article, the instruction set I refer to is that of the x86 family of microprocessors.
How to write your own code to the boot sector of a device?
To successfully achieve this task, we need to know about the following.
- Operating system (GNU Linux)
- Assembler (GNU Assembler)
- Instruction set (x86 family)
- Writing x86 Instructions on GNU Assembler for x86 Microprocessor.
- Compiler (C programming language - optional)
- Linker (GNU linker ld)
- An x86 emulator like bochs used for our testing purposes.
What is an Operating System?
I will explain this in a very simple way. An operating system is a large collection of programs, written by hundreds or thousands of professionals, that includes applications and utilities to help people across the globe. Setting the technical standpoint aside, an operating system is written mainly to provide applications that help people in their daily activities, such as connecting to the internet, chatting, browsing the web, creating and saving files, processing data and a lot more. Still not clear? What I mean is that you may want to chat with your friends, watch news online, write some personal information to a file, watch movies, calculate mathematical equations, play games, write programs and more. All these tasks can be achieved by means of an operating system. The job of an operating system is to provide enough tools to help and serve you. You may also want to do several of these activities at the same time, and it is the job of the operating system to manage the hardware and give you the best experience it can.
Also, please note that all modern operating systems operate in protected mode (or, on 64-bit processors, in long mode).
What are the different types of Operating System?
- Windows
- Linux
- Mac OS
And more…
What is protected mode?
Unlike real mode, protected mode supports 32-bit instructions. Do not worry much about it for now, as we are not concerned here with how an operating system works.
What is an Assembler?
An assembler converts the instructions given by a user into machine code.
Even a compiler does the same...doesn't it?
At a higher level, yes, but it is actually the assembler, invoked by the compiler, that does this.
Then why can't a compiler generate machine code directly?
The primary job of a compiler is to convert the instructions written by a user into an intermediate set of instructions called assembly language instructions. The assembler then consumes these instructions and converts them into the respective machine code.
Why do I need an operating system to write code for a boot sector?
Right now, I do not want to get into a very detailed explanation, so let me explain within the scope of this article. Earlier I mentioned that in order to write instructions that a microprocessor can understand, we need a compiler, and this compiler is provided as a utility of an operating system. As I said, operating systems are designed to help people by providing various utilities, and compilers are one of those utilities.
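To see the two stages separately, a tiny C file can be fed through the toolchain. The file name `add.c` and the function `add` are hypothetical, chosen only for this illustration. With GCC, running `gcc -S add.c` stops after compilation and writes the assembly language instructions to `add.s`, while `gcc -c add.c` also runs the assembler and writes machine code to the object file `add.o`.

```c
/* add.c: a tiny translation unit to feed through the compiler and assembler. */
int add(int a, int b)
{
    return a + b;
}
```

Opening the generated `add.s` in a text editor shows the intermediate instructions that the assembler later converts into machine code; the exact output depends on the GCC version and target.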
Which Operating System may I use?
I have written the programs on the Ubuntu operating system to boot from a floppy device, so I recommend Ubuntu for this article.
Which compiler should I use?
I have written the programs using the GNU GCC compiler, and I will show how to compile the code using the same.
How do I test hand-written code for the boot sector of a device?
I will introduce you to an x86 emulator, which helps a great deal because we do not have to restart the computer each time we edit the boot sector of the device.
Introduction to microprocessor
In order to learn to program a microprocessor, we first need to learn how to use registers.
What are registers?
Registers are like utilities of a microprocessor that store data temporarily and let us manipulate it as per our requirements. Suppose the user wants to add 3 and 2: the user asks the computer to store the number 3 in one register and the number 2 in another register, and then to add the contents of these two registers. The CPU places the result in a register, and that is the output we want to see. There are four types of registers, listed below.
- General purpose registers
- Segment registers
- Stack registers
- Index registers
Let me briefly describe each type.
General purpose registers: These are used to store temporary data required by the program during its lifecycle. Each of these registers is 16 bits wide, or 2 bytes long.
- AX - the accumulator register
- BX - the base address register
- CX - the count register
- DX - the data register
Segment Registers: To represent a memory address to a microprocessor, there are two terms we need to be aware of:
Segment: It identifies the beginning of a block of memory. In real mode, the block begins at the physical address given by the segment value multiplied by 16.
Offset: It is the index of a byte within that block.
Example: Suppose there is a byte whose value is 'X' in a block of memory that starts at address 0x7c00, and the byte is located at offset 10 from the beginning of the block. In this situation, the segment is 0x07c0 (because 0x07c0 * 16 = 0x7c00) and the offset is 10 (0x0a).
The absolute address is 0x7c00 + 10 = 0x7c0a.
There are four segment registers that I want to list.
- CS - code segment
- SS - stack segment
- DS - data segment
- ES - extra segment
But there is a limitation with these registers: you cannot load an address value directly into them. What we can do is copy the value into a general purpose register and then copy it from that register into the segment register. Example: To solve the problem of locating byte 'X', we proceed as follows.
- movw $0x07c0, %ax
- movw %ax , %ds
- movw (0x0A) , %ax
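In our case, what happens is:
set AX = 0x07c0
set DS = AX = 0x07c0, so the data segment begins at physical address 0x07c0 * 16 = 0x7c00
load the word stored at address 0x7c00 + 0x0a into AX; its low byte, AL, is the byte 'X'
I will describe the various addressing modes that we need to understand while writing programs.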
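The address arithmetic in this example can be checked with a short C function. This is a minimal sketch of the real-mode rule that a physical address is the segment value multiplied by 16 plus the offset; the function name `physical_address` is made up for this illustration.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * Multiplying by 16 is the same as shifting left by 4 bits. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With the values used above, `physical_address(0x07c0, 0x000a)` gives 0x7c0a, the location of byte 'X'. Note that different segment and offset pairs can name the same byte: segment 0x0000 with offset 0x7c0a gives the same result.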
Stack Registers:
- BP - base pointer
- SP - stack pointer
Index Registers:
- SI - source index register.
- DI - destination index register.
The purpose of each register is summarized below:
- AX: The CPU uses it for arithmetic operations.
- BX: It can hold the address of a procedure or variable (SI, DI, and BP can also). It can also be used for arithmetic and data movement.
- CX: It acts as a counter for repeating or looping instructions.
- DX: It holds the high 16 bits of the product in multiply (also handles divide operations).
- CS: It holds base location for all executable instructions in a program.
- SS: It holds the base location of the stack.
- DS: It holds the default base location for variables.
- ES: It holds additional base location for memory variables.
- BP: It contains an assumed offset from the SS register. Often used by a subroutine to locate variables that were passed on the stack by a calling program.
- SP: Contains the offset of the top of the stack.
- SI: Used in string movement instructions. The source string is pointed to by the SI register.
- DI: Acts as the destination for string movement instructions.
What is a bit?
In computing, a bit is the smallest unit of data that can be stored. A bit holds a binary value: either 1 (on) or 0 (off).
More about registers:
Each general purpose register is further divided into two 8-bit halves:
- AX: The low 8 bits of AX are identified as AL and the high 8 bits as AH
- BX: The low 8 bits of BX are identified as BL and the high 8 bits as BH
- CX: The low 8 bits of CX are identified as CL and the high 8 bits as CH
- DX: The low 8 bits of DX are identified as DL and the high 8 bits as DH
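The relationship between a 16-bit register and its two 8-bit halves can be expressed with shifts and masks in C. This is only a sketch on plain integers, not code that touches the real registers, and the helper names are hypothetical.

```c
#include <stdint.h>

/* High byte of a 16-bit register value (bits 8..15), e.g. AH for AX. */
uint8_t high_byte(uint16_t reg)
{
    return (uint8_t)(reg >> 8);
}

/* Low byte of a 16-bit register value (bits 0..7), e.g. AL for AX. */
uint8_t low_byte(uint16_t reg)
{
    return (uint8_t)(reg & 0xFFu);
}

/* Rebuild the 16-bit value from its two halves. */
uint16_t make_reg(uint8_t high, uint8_t low)
{
    return (uint16_t)(((uint16_t)high << 8) | low);
}
```

If AX holds 0x0E41, then AH is 0x0E and AL is 0x41, and writing to either half changes the corresponding part of AX.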
How to access BIOS functions?
The BIOS provides a set of functions that we can call from our own code. We access these BIOS functions through interrupts.
What are interrupts?
Interrupts are used to interrupt the ordinary flow of a program in order to process events that require a prompt response. The hardware of a computer provides this mechanism to handle events. For example, when a mouse is moved, the mouse hardware interrupts the current program to handle the mouse movement (to move the mouse cursor, etc.). Interrupts cause control to be passed to an interrupt handler, a routine that processes the interrupt. Each type of interrupt is assigned an integer number. At the beginning of physical memory resides a table of interrupt vectors that contains the segmented addresses of the interrupt handlers. The number of an interrupt is essentially an index into this table. We can also think of an interrupt as a service offered by the BIOS.
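The table lookup described above can be sketched in C against a simulated copy of low memory. The sketch assumes the real-mode layout in which each table entry occupies 4 bytes, the offset first and the segment second, both stored low byte first. The structure `far_pointer` and the function `read_interrupt_vector` are hypothetical names.

```c
#include <stdint.h>

/* Segmented address of an interrupt handler. */
struct far_pointer {
    uint16_t segment;
    uint16_t offset;
};

/* Read entry `number` of the interrupt vector table from a simulated memory
 * image. `memory` must cover at least the first 1024 bytes (256 entries). */
struct far_pointer read_interrupt_vector(const uint8_t *memory, uint8_t number)
{
    uint32_t entry = (uint32_t)number * 4u;   /* the number is an index into the table */
    struct far_pointer handler;

    handler.offset  = (uint16_t)(memory[entry]     | (memory[entry + 1] << 8));
    handler.segment = (uint16_t)(memory[entry + 2] | (memory[entry + 3] << 8));
    return handler;
}
```

For interrupt 0x10 the entry starts at byte 0x10 * 4 = 0x40 of the table. The handler address stored there differs from machine to machine, so any value used to try out this function is made up.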
Which interrupt service are we going to use in our programs?
BIOS interrupt 0x10, which provides the video services, such as printing characters on the screen.