Have you ever asked yourself how many steps your computer performs from the moment you press the power switch until your favorite desktop wallpaper shines in its full glory in front of you? There are many of them, and all must succeed before you can enjoy your powerful hardware and your slick software. When a step or two fails to execute and you can’t access the contents of your computer, you become much more interested in what exactly happens during boot and what could have gone wrong to turn your ultra high-tech computer into a useless piece of metal (and some plastic, to be more precise).
Besides being interesting in its own right, familiarity with the basic steps executed during system boot can help you troubleshoot problems on your own, rather than depend on Technical Support for minor issues such as an unplugged power supply cable.
If you work in Technical Support, knowing the boot sequence of the type of machines you support is a must. In that case you definitely need to know more about booting than what is presented in this article, but I believe that even experienced Technical Support maniacs might have something to learn from the sections that follow.
Although there are slight differences in the way different systems boot, the process can generally be divided into two parts: booting the hardware and starting the operating system. Basically, all PCs follow the same routine, and where there are differences (mainly in the BIOS-related steps), you may want to check your vendor’s documentation before you troubleshoot further. Differences in the way the operating system boots are also possible and I mention them occasionally. Also, though it is hardly possible to make an exhaustive list of all possible problems and their solutions for each of the stages, I have tried to suggest some of the reasons for things to go wrong and ways to fix them.
Turning the Machinery On
Briefly, the hardware part of the boot sequence can be described as follows. The boot process starts when you press the power button to turn the computer on. After a short self-test of the power supply, a signal is sent to the processor and it starts executing the ROM BIOS code. The ROM BIOS then performs a short test of the available hardware and, if everything is OK, reads the configuration information stored in the CMOS (Complementary Metal-Oxide Semiconductor) memory, i.e. where to start the operating system from (a floppy, a CD or the hard disk). If the operating system is to be loaded from a hard disk, the BIOS locates its Master Boot Record (MBR) and loads it into memory. Then the partition loader (also called the boot loader) takes charge and reads the partition table to find the active partition and the boot record there. After that the operating system starts booting. With that brief overview of the hardware part of the boot process in place, let’s look at each of the steps in more detail:
- Switching on the power. When the power supply is switched on, its first task is to perform a self-test to ensure that the power is stable, i.e. that all voltages and current levels are normal. The self-test takes less than half a second and if you didn’t know about it, you would never notice it, unless the computer freezes at this point. If the power supply does not pass the self-test, either the power supply unit is faulty or the voltage and current levels are not normal. So, when you turn on your computer and it makes no noise at all, one possible reason is a problem with the power supply. However, this does not necessarily mean that the power supply has died; a more prosaic reason can be an unplugged power supply cable.
- Here comes the CPU. Until the power is stable, the processor receives a continuous reset signal and just waits. Once the power supply has passed its self-test, it signals the processor that the power is OK. The CPU starts operating and the first thing it does is look in the BIOS ROM for the start of the BIOS boot program. Generally, this entry point sits right at the end of the memory area the ROM is mapped to, usually only 16 bytes from the top of it. Of course, 16 bytes are not enough for the program itself, but they are plenty for a JMP (jump) instruction, which tells the processor the actual address of the ROM BIOS code.
- The BIOS POST is next. One of the first operations performed by the BIOS is the power-on self-test (POST). The purpose of the POST is to determine whether there are any fatal errors that prevent the computer from booting or operating properly. Since the video adapter has not been started yet at this point, all alarms about fatal errors are communicated in beeps. The beep codes vary from manufacturer to manufacturer and their meanings are listed in the vendor’s documentation, so if you hear your computer scream, try to distinguish the signals, look up their meaning in the docs and see if you can troubleshoot the problem on your own.
- Looking for the video adapter. If the power-on self-test has passed without errors up to here, the BIOS starts looking for adapters that need to load their own ROM BIOS program in order to be initialized. After the video adapter has been initialized, all further messages about failed hardware appear on screen. Strictly speaking, the video adapter is initialized after the video test, which is also part of the POST, has completed successfully, so it is more precise to say that the video adapter is initialized in the middle of the POST.
- POST continued. Besides checking the central hardware and the video adapter, the POST reads the BIOS identification and displays the data on screen. Another portion of the POST is the memory test, which is skipped on a warm startup (a warm startup is when you restart the computer, while a cold startup is when you switch it on). The output of the memory test, i.e. how much installed memory you have, is displayed on screen. If the reported amount is less than what you physically have inside the computer case, some of your memory might have stopped functioning. For example, if you have 2 modules of 512MB each (1GB in total) but the memory test reports only 512MB, one of the two modules is not working. Depending on how many memory modules you have, you might or might not be able to continue booting. If you have 2 or more modules and at least one of them is working, you will be able to go further, though your computer will be slower because of the reduced memory. This memory scenario is an example of a non-critical error.
- More tests. This stage could easily be grouped with the previous one because essentially it is still checking the system, but for the sake of clarity I have separated it. This stage can be called “system inventory” because the BIOS checks for disks, drives and all kinds of peripherals. If the BIOS supports the Plug and Play standard, Plug and Play devices are also discovered and resources are assigned to them. The hard disk(s), the optical drives and the floppy disk drive are located as well. When the hardware testing is over, the BIOS displays a summary screen about your system’s configuration. You might not be able to read it because it flashes on screen for a split second, but if you don’t see any error messages, which tend to stay on screen, the tests have passed successfully. Any error messages here will direct you to the cause of failure. For instance, if a hard disk is not found, the disk may not be physically present, it may not be connected properly, it may have died completely and so on. You might not learn the exact cause of failure, but once you know that something is wrong with the hard disk, additional tests (or at least a look inside the computer case) will help you troubleshoot the problem.
- The BIOS reads the CMOS. After the POST has been passed, either successfully or with non-critical errors, the next step the BIOS performs is to read the configuration in the CMOS. The CMOS is a 64-byte memory area that is persistent (i.e. the information there is not lost when you turn off the computer, as is the case with RAM) because it is powered by a small battery on the motherboard. Due to its small size, the CMOS can’t hold much information, but it has a vital role in the boot sequence because it tells the ROM BIOS where to look for the operating system. The BIOS will try all possible drives (hard disk, floppy, external disks) to boot from, and if no boot device is found at all, you will get an error message saying that there is no boot device available. This error message is BIOS-specific and sometimes reads “NO ROM BASIC – SYSTEM HALTED”, which in plain English means that the BIOS could not find a drive to boot the operating system from.
- Reading the MBR. After the CMOS has directed the BIOS to the drive from which to boot the operating system, the BIOS starts reading the very first sector of the specified device. For hard disks, this is the Master Boot Record and it is 512 bytes in size. Compared to the 64 bytes of the CMOS, 512 bytes is a lot of space, enough for a partition loader, a partition table and a signature. The signature is the last two bytes (the values 55h and AAh), and if it is missing or invalid, the boot sequence stops with a fatal error message because the MBR can’t be loaded into memory and it is not possible to read the partition information.
- Reading the partition table. The partition table, as the name implies, contains information about the partitions on the hard disk: which one is active, the type of each partition, where it starts and ends, and how many sectors it has. (Details such as the number of bytes per sector and sectors per cluster are kept in the partition’s own boot record.) When the active partition is found, its very first sector, the boot record, is read, and depending on what is written there, the boot proceeds further to load the operating system. In case you dual-boot (i.e. you have Windows and Linux installed on your computer), most likely (if everything is in order with the MBR) you will be presented with a screen where you can choose which operating system to load. If you don’t make a choice within the set timeout, the default operating system will be loaded.
- Loading the operating system. After you have chosen which operating system to load, the procedure differs depending on whether it is Windows or Linux, as described next.
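To make the MBR and partition table steps more tangible, here is a minimal Python sketch of the checks described above: verify the two-byte signature, walk the four partition table entries and pick the active one. It assumes the classic MBR layout (446 bytes of loader code, four 16-byte entries, then the signature). The sample sector is built in memory and is entirely made up, and the function names are mine, not part of any BIOS or tool.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # the loader code occupies bytes 0-445
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
ENTRY_FORMAT = "<B3sB3sII"     # little-endian, 16 bytes in total


def parse_mbr(sector):
    """Return the four partition table entries of a 512-byte MBR as dicts."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes long")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing or invalid boot signature")

    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        raw = sector[start:start + PARTITION_ENTRY_SIZE]
        # Entry layout: status (1 byte), CHS start (3), type (1), CHS end (3),
        # first sector as LBA (4), number of sectors (4).
        status, _chs_start, ptype, _chs_end, first_lba, count = struct.unpack(
            ENTRY_FORMAT, raw
        )
        entries.append({
            "index": index,
            "active": status == 0x80,   # 0x80 marks the active partition
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })
    return entries


def find_active_partition(entries):
    """Return the first entry flagged active, or None if there is none."""
    for entry in entries:
        if entry["active"]:
            return entry
    return None


def build_sample_mbr():
    """Build a made-up MBR image, used only for this demonstration."""
    sector = bytearray(SECTOR_SIZE)
    # Hypothetical values: the second entry is active, type 0x83,
    # starts at sector 2048 and spans 409600 sectors.
    entry = struct.pack(ENTRY_FORMAT, 0x80, b"\x00" * 3, 0x83, b"\x00" * 3, 2048, 409600)
    offset = PARTITION_TABLE_OFFSET + 1 * PARTITION_ENTRY_SIZE
    sector[offset:offset + PARTITION_ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(find_active_partition(parse_mbr(build_sample_mbr())))
```

On a real machine you would read the first 512 bytes of the disk device instead of calling build_sample_mbr(), which normally requires administrator rights. The sketch only reads the table; the real loader goes on to load the first sector of the active partition and jump to it.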
The Operating System Takes Charge
After the hardware part of the boot sequence has finished, the operating system is loaded next. If you have only one operating system, it is easy, because the computer knows immediately how to proceed. If you have more than one operating system (e.g. Windows and Linux, or two or more Windows or Linux varieties), all of them are listed in the boot configuration, and it takes one more step to load the Windows (or, respectively, Linux) boot loader and then to choose which of the available varieties of Windows (or Linux) to load.
Basically, the process of loading the operating system involves loading the kernel into memory, loading device drivers, starting services and finally presenting the user with a login screen, or a login prompt if there is no GUI. It is useful to know which services are loaded at startup because you can then disable some of them (not the core ones, of course) and make your computer boot faster. The process differs under Linux and Windows, which is why the boot sequence for each operating system is described separately in the next two sections.
There are several stages in the Windows boot sequence. NTLDR is the boot loader for Windows 2000, XP and 2003 (under Vista its functionality is divided between winload.exe and the Windows Boot Manager). NTLDR is located in the root directory of the Windows system partition and requires the boot.ini file, where the configuration options for the boot process are written. NTLDR goes through the following four phases before the user is presented with the login screen:
- Initial Boot Loader Phase. The tasks that NTLDR performs at this stage are memory initialization to enable full memory addressing, as well as initialization of the file system on the primary boot drive. On the primary boot drive NTLDR looks for boot.ini.
- Selecting which operating system to load. The boot.ini file contains boot settings, such as the list of available operating systems with their load options, the timeout before the default operating system is loaded and so on. If you have only one Windows variety installed, it will be booted automatically, but if you have two or more Windows varieties coexisting on your computer, you will be presented with a screen where you can select which of them to load.
If you are running Windows 2000, XP or 2003, you can press F8 to interrupt the boot sequence and display a list of options for special boot cases, like Safe Mode or Last Known Good Configuration. Safe Mode loads only the most essential drivers and services, while Last Known Good Configuration loads the latest working configuration. Both options are very useful after an unsuccessful driver or application installation that leaves you unable to log in to Windows properly.
- Hardware initialization. The next step in the boot sequence for Windows 2000, XP and 2003 is detection of the installed device drivers and initialization of the respective pieces of hardware. If you have more than one hardware profile, you will be presented with a screen to choose which one to load. If you have only one hardware profile, it will be loaded automatically.
- Configuration loading. After the hardware has been initialized and the appropriate hardware configuration is loaded, some additional drivers, namely those of boot devices, are loaded. The next step is loading the kernel. After that, core subsystems such as the Object Manager, the I/O Manager and the Process Manager are started. Then all services marked for autostart are started and, if everything goes smoothly, the user is presented with a login screen and this boot sequence is saved as the Last Known Good Configuration. If there are problems at this stage, Windows may halt or reboot, especially if a device driver fails.
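As an illustration of the selection phase, the following Python sketch reads a boot.ini file and decides whether a selection screen is needed. The file contents below are a made-up sample in the usual boot.ini layout (a [boot loader] section with timeout and default, and an [operating systems] section with one line per entry); the function names are hypothetical, and this is of course not how NTLDR itself is implemented.

```python
import configparser

# A made-up boot.ini for illustration; real files differ from machine to machine.
SAMPLE_BOOT_INI = r"""
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS

[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP Professional" /fastdetect
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Professional" /fastdetect
"""


def read_boot_ini(text):
    """Return (timeout_seconds, default_entry, {entry: description and options})."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str   # keep the original case of the entry paths
    parser.read_string(text)
    loader = parser["boot loader"]
    systems = dict(parser["operating systems"])
    return int(loader["timeout"]), loader["default"], systems


def needs_menu(systems):
    """A selection screen only makes sense with two or more entries."""
    return len(systems) > 1


if __name__ == "__main__":
    timeout, default, systems = read_boot_ini(SAMPLE_BOOT_INI)
    if needs_menu(systems):
        print(f"Showing menu for {timeout} seconds, default is {systems[default]}")
    else:
        print(f"Booting {systems[default]} directly")
```

With a single entry under [operating systems], needs_menu() returns False, which mirrors the case where the only installed Windows variety is booted automatically.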
The above presentation of the boot sequence in Windows was brief and covered the main points only. If you need more information, for instance a list of which services are started and in what order, you can find it on Microsoft’s site. Now that we have briefly examined the boot sequence for Windows, let’s see how it works for Linux.
The boot sequence under Linux is very different from the boot sequence under Windows. Besides, under Linux there are two boot loaders – GRUB (GNU GRand Unified Bootloader) and LILO (LInux LOader). LILO is the older boot loader but it is still used by some distros, while GRUB is newer and more and more distros use it. There are some slight differences in the boot procedure between the two boot loaders and also among distros but the major steps, which are described next, are the same for all Linux distros.
Basically, the procedure involves starting the OS loader, which locates the kernel image on disk. Then the boot loader loads the kernel into memory and starts it. After that the kernel initializes the devices and their drivers, mounts the file system and starts the init program. The init program, in turn, starts the rest of the processes, including the process that allows the user to log in. In many distros on-screen messages inform you what is going on at any moment of the boot sequence. In more detail, here is what happens:
- OS Loader. As already mentioned, the MBR is responsible for telling the BIOS which operating system to load. But the tiny size of the MBR (512 bytes) does not allow the OS loader itself to reside there. Instead, the primary loader is in the MBR but its function is only to provide information about the location of the secondary loader, which is located on the specified disk partition. Thus, LILO and GRUB can sit behind the primary (DOS) loader, which points to them, and everything works just fine: when you select them from the boot screen, they start, locate the kernel and load it. Very often it is possible to configure the boot loader to load an alternative kernel or to pass additional parameters to it.
- Kernel startup. After the OS loader has loaded the kernel into memory, the kernel initializes the devices (i.e. loads their drivers) and you will see the messages about that on screen. Then the kernel starts the swapper and mounts the root file system. It is possible to pass parameters that override the default file system and mount another one. Right after the root file system has been mounted, the kernel starts the init process, which in turn starts the system services.
- init and inittab. The init process is the first process that the kernel starts and because of this it always has PID 1. After the init process is started, it reads /etc/inittab for further instructions about what to load. inittab defines what should be run in the different runlevels. The runlevel is a key concept in the Linux boot process because each of the 7 runlevels (from 0 to 6) defines a different set of services that need to be started. Each runlevel has a separate directory (under the /etc/rc.d/ directory) with many boot scripts that are executed when that runlevel is selected.
- Boot scripts. The location of the boot scripts directory and its structure differ between distros, so you’d better consult the documentation of your distro if you can’t find the scripts directory right away. Runlevels and boot scripts provide incredible flexibility in customizing the Linux boot process. Boot scripts define at least the start and stop options for a particular service, though very often it is possible to pass the service additional commands and parameters and to define the sequence in which services will be started. The last service that is started is the login process, which presents the user with a login screen (or a prompt, if you are using a non-graphical environment).
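To show how inittab ties runlevels to what gets started, here is a small Python sketch that parses inittab-style lines in the classic id:runlevels:action:process format and looks up the default runlevel. The sample content, including the script paths and the getty line, is made up for illustration and will not match your distro exactly; the function names are my own.

```python
# A made-up /etc/inittab excerpt; each entry has the form
# id:runlevels:action:process
SAMPLE_INITTAB = """\
# Default runlevel
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
l5:5:wait:/etc/rc.d/rc 5
1:2345:respawn:/sbin/mingetty tty1
"""


def parse_inittab(text):
    """Return the inittab entries as a list of dicts, skipping comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only; the process field is kept whole.
        ident, runlevels, action, process = line.split(":", 3)
        entries.append({
            "id": ident,
            "runlevels": runlevels,
            "action": action,
            "process": process,
        })
    return entries


def default_runlevel(entries):
    """Return the runlevel named by the initdefault entry, or None."""
    for entry in entries:
        if entry["action"] == "initdefault":
            return int(entry["runlevels"])
    return None


def processes_for_runlevel(entries, level):
    """Return the processes whose runlevel field lists the given level."""
    wanted = str(level)
    return [
        entry["process"]
        for entry in entries
        if wanted in entry["runlevels"] and entry["process"]
    ]


if __name__ == "__main__":
    entries = parse_inittab(SAMPLE_INITTAB)
    level = default_runlevel(entries)
    print("default runlevel:", level)
    for process in processes_for_runlevel(entries, level):
        print("would run:", process)
```

Note that the sketch simply skips entries with an empty runlevel field, such as the sysinit line in the sample, whereas the real init treats such entries specially. The rc script on the wait lines is what then walks through the boot scripts of the chosen runlevel.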
Describing the Linux boot sequence in more detail is outside the scope of this article, so if you need more information about the process in general or about the process for a particular distro, see the Linux Documentation Project or the documentation of your distro. There you can read how to configure GRUB and LILO, what options can be passed, what each of the runlevels means and how to configure them, what kinds of modifications are possible for the boot scripts, etc.