The Linux Kernel

You can use the Linux kernel as a black box; in other words, you can use it without knowing very much about how it works. I don’t know all that much about it, and I’ve been using Linux almost exclusively for more than 15 years now.

I just want to be able to install, operate, maintain and repair my own Linux operating systems. Here are some of the notes I’ve been taking, as I study and learn how Linux works.

They are probably not very thorough. I’m still figuring out what’s important to know and which questions I should be seeking answers to. This story will be updated whenever I learn new things about the Linux kernel. It’s just a start. This is an unfinished story. Pay attention to the patterns.

One of the most important things I’m concentrating on learning right now is how to configure Linux: where the configuration files are, how they should be organized and what the best way is to configure Linux and the applications I’m running on it.

Kernel space is not a directory in the traditional filesystem hierarchy. Instead, it refers to the area of memory where the kernel runs, separate from user space where applications run.

In terms of memory layout, kernel space typically occupies the upper portion of virtual address space, while user space occupies the lower portion. The exact boundary between kernel space and user space varies depending on the architecture and operating system.

While kernel space is not a directory, the kernel exposes information and interfaces through special filesystems, such as:

  • /proc is a pseudo-filesystem that provides information about running processes, kernel parameters and system statistics.
  • /sys is a filesystem that exports kernel information and configuration, allowing user-space applications to interact with the kernel.
  • /dev is a filesystem that provides device files, which are used to interact with hardware devices managed by the kernel.

These filesystems provide a way to access kernel information and interfaces, but they do not represent the kernel space itself.
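
Here is a minimal sketch of reading kernel information through /proc from Python. The helper names parse_meminfo and read_kernel_file are my own, and the sketch assumes /proc/version and /proc/meminfo exist, which is normal on Linux but not guaranteed everywhere, so missing files are simply skipped.

```python
from pathlib import Path


def parse_meminfo(text):
    """Turn the text of /proc/meminfo into a dict of {field: integer value}.

    Most values are reported in kB; a few (such as HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def read_kernel_file(path):
    """Return the contents of a kernel-provided file, or None if it is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    version = read_kernel_file("/proc/version")
    meminfo = read_kernel_file("/proc/meminfo")
    if version is not None:
        print(version.strip())
    if meminfo is not None:
        print("MemTotal (kB):", parse_meminfo(meminfo).get("MemTotal"))
```

Nothing here touches kernel memory directly. The program only reads text that the kernel generates on demand when the file is opened.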

The Operating System

An operating system (OS) is software that controls the hardware of a computer. It consists of:

  • A bootloader is software that starts your operating system. After the firmware (BIOS or UEFI) checks your hardware, the bootloader loads your kernel into memory and starts it.
  • The kernel is the core of the system. It manages the CPU, memory and peripheral devices.
  • Daemons are background services that run in user space, such as logging, networking and printing services. Device drivers are different: they are kernel code, which allows the kernel to communicate with and control your hardware.
  • D-Bus (Desktop Bus) is an inter-process communication (IPC) system, which enables your applications and system services to communicate with each other. Applications talk to the kernel itself through system calls, not through D-Bus.
  • The system daemon of D-Bus is launched at start up and is primarily used for managing hardware.
  • The session daemon of D-Bus is launched when you log into your desktop and primarily manages your desktop and the applications running on your desktop.
  • The console is the text-mode terminal provided by the kernel. The text you see flashing on your screen while the computer is starting is written to the console.
  • The shell is a program through which you operate your computer using text commands in a command-line interface. Zsh is my favorite shell. The console and a terminal emulator do the same job: each gives a shell somewhere to run. The console usually takes up the whole screen, while a terminal emulator is an application that runs in a window on your desktop. You can have as many shells open as you want. Modern terminal emulators have tabs and split windows, so there is usually no need to have more than one terminal open at a time. If you have several different projects going on in different workspaces, you may have a terminal open in each workspace.
  • The graphical server is the subsystem that displays the graphics on your screen. Wayland is the more modern replacement for the X Window System.
  • A desktop environment. KDE is a very Kool Desktop Environment. GNOME, Xfce, MATE and many more are popular alternatives.
  • Tiling window managers are a popular alternative to a full desktop environment.
  • Applications are programs that perform a user’s tasks, such as Konsole, Kontact, LibreOffice and Brave.

The Linux operating system is one of the most widely used and versatile operating systems in the world. Known for its stability, security and open-source nature, Linux powers everything from personal computers to smartphones, servers, supercomputers and embedded systems. It has become a cornerstone of the tech industry and the backbone of many critical systems worldwide.

Linux is a Unix-like, open-source operating system based on the Linux kernel, which was first developed by Linus Torvalds in 1991. It provides a platform for running software and managing hardware resources like CPU, memory, storage and peripherals.

Linux source code is freely available, allowing anyone to study, modify and distribute it. It supports multiple users simultaneously without their interfering with each other. It can efficiently manage and execute multiple tasks at once. It runs on a wide range of hardware platforms, from embedded devices to mainframes. It is designed with built-in security features like user permissions, encryption and firewalls.

A Linux operating system consists of several key components that work together to deliver functionality:

The kernel is the core of Linux, responsible for managing hardware resources and facilitating communication between hardware and software. It allocates and tracks memory for processes. It manages process creation, scheduling and termination. It interfaces with hardware devices like disks and network cards. It handles file system management, including data storage and retrieval.

System libraries contain reusable code that developers and applications use to perform standard tasks (e.g., managing files or handling input/output). Utilities are small programs that perform specific system tasks, such as disk management, file copying and system monitoring.

The shell is a command-line interface (CLI) that allows users to interact with the operating system by entering commands. Examples include Bash, Zsh and Fish. User space includes all applications, programs and processes that run outside the kernel. Examples include desktop environments (e.g., GNOME, KDE), web browsers, media players and office suites.

Linux operates based on a layered architecture, with the kernel at the center and other components built on top of it. When a Linux system is powered on, a bootloader (e.g., GRUB) loads the Linux kernel into memory. The kernel initializes hardware, mounts the root file system and starts essential system processes.

The kernel directly communicates with the computer’s hardware using device drivers. It abstracts hardware details, providing a unified interface to applications.

The kernel schedules processes (programs in execution) to ensure efficient use of the CPU. It isolates processes to prevent them from interfering with each other. Linux organizes data using a hierarchical file system, where everything is treated as a file (e.g., hardware devices, directories and regular files). Common file systems include ext4, XFS, and Btrfs.
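
To see “everything is treated as a file” in practice, a short sketch can ask the kernel what kind of file sits behind a path. The helper name file_kind is my own, and the example paths are assumptions that are skipped if they don’t exist on your system.

```python
import os
import stat


def file_kind(path):
    """Describe what kind of file the kernel reports for a path (without following symlinks)."""
    mode = os.lstat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "regular file"
    return "unknown"


if __name__ == "__main__":
    # Assumed example paths; any that are missing are skipped.
    for path in ("/", "/dev/null", "/etc/hostname"):
        if os.path.lexists(path):
            print(path, "->", file_kind(path))
```

On a typical Linux system /dev/null shows up as a character device, which is the kernel presenting a driver through the same interface as an ordinary file.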

Users interact with the system via a command-line interface (CLI). The CLI is typically a shell, running inside a terminal emulator. Entering text commands in a terminal is one way you can exercise precise control over your operating system. You can also interact with your computer using graphical user interface (GUI) applications, running on a desktop environment like GNOME, Xfce or KDE or on a tiling window manager.

Linux enforces strict user and group permissions to protect files and processes. Features like SELinux (Security-Enhanced Linux) enhance system security by implementing mandatory access controls.
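
As a small illustration of user and group permissions, this sketch sets the mode of a temporary file and reads it back the way `ls -l` would show it. The helper name describe_permissions is hypothetical, and the mode 0o640 is just an example value.

```python
import os
import stat
import tempfile


def describe_permissions(path):
    """Return the permission string (as `ls -l` shows it) plus the owner and group IDs."""
    info = os.stat(path)
    return stat.filemode(info.st_mode), info.st_uid, info.st_gid


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        # Example mode: owner may read and write, group may read, others get nothing.
        os.chmod(tmp.name, 0o640)
        print(describe_permissions(tmp.name))
```

The kernel checks these bits on every open, so a process running as a different user outside the file’s group would be refused access to this file.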

Linux includes robust networking capabilities, supporting protocols like TCP/IP, FTP, and SSH. It is widely used in servers for hosting websites, managing networks and running cloud services.

The Kernel

The kernel runs in a privileged processor mode, with its own protected memory space and full access to your device’s hardware. That mode and memory space are known as kernel space. The Linux kernel can be found in the following places:

  • The kernel resides in memory, specifically in the kernel space, which is a protected area of memory that is inaccessible to user-space processes. The kernel allocates memory for running programs and loads executable code into memory from the file system.
  • In most Linux distributions, including Ubuntu and Red Hat, the kernel is typically found in the /boot directory (e.g., /boot/vmlinuz-). This directory contains the kernel image file (e.g., vmlinuz) and other boot-related files.
  • When the system boots, the kernel is loaded into system RAM (Random Access Memory) from the /boot directory. The kernel then takes control of the system, initializing hardware and services and managing processes.
  • The kernel runs on the CPU (Central Processing Unit), managing tasks, scheduling processes and handling interrupts. The kernel’s code is executed in supervisor mode, giving it unrestricted access to system resources.
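
A quick way to connect the running kernel to the files in /boot is to compare the kernel’s release string with the image files on disk. This is only a sketch: the helper name kernel_image_candidates and the “vmlinuz*” naming pattern are assumptions, and some distributions keep their kernel images elsewhere or under other names.

```python
import glob
import os
import platform


def kernel_image_candidates(boot_dir="/boot"):
    """List files in boot_dir whose names start with 'vmlinuz' (assumed naming pattern)."""
    pattern = os.path.join(glob.escape(boot_dir), "vmlinuz*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # platform.release() reports the release of the kernel currently running in RAM.
    print("running kernel release:", platform.release())
    for image in kernel_image_candidates():
        print("kernel image on disk:", image)
```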

Sources
Brave Leo
superuser.com : Where can I find the Linux kernel file? – Super User
redhat.com : What is the Linux kernel?
en.wikipedia.org : Kernel (operating system) – Wikipedia
makeuseof.com : How Does the Linux Kernel Work? The Linux Kernel Anatomy Explained
iq.thc.org : How does the Linux Kernel start

The Linux kernel is the core component of the Linux operating system, responsible for managing the system’s resources and providing the lowest-level abstraction between the hardware and software. Developed by Linus Torvalds in 1991, it has evolved into a robust, scalable and highly customizable system that powers everything from smartphones to supercomputers.

The kernel is a piece of software that sits between the hardware and user applications. It’s not an operating system in itself but part of one, serving as the foundation upon which the rest of the system is built.

The kernel controls hardware resources like CPU, memory and I/O devices. It decides which processes get to use which resources and when. It manages the execution of processes, scheduling them to run on the CPU, handling context switches and providing mechanisms for inter-process communication (IPC).

The kernel allocates and manages memory for processes, including virtual memory management where it swaps memory to and from disk when physical memory is full. It provides an abstraction layer for file system management, allowing different types of storage devices to be used uniformly.
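
Memory is handed out in fixed-size pages, so a request for a number of bytes is rounded up to whole pages. A minimal sketch of that arithmetic, using the page size Python reports for the current system (pages_needed is my own helper name):

```python
import mmap


def pages_needed(num_bytes, page_size=mmap.PAGESIZE):
    """Number of whole pages required to hold num_bytes (ceiling division)."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    return (num_bytes + page_size - 1) // page_size


if __name__ == "__main__":
    print("page size on this system:", mmap.PAGESIZE, "bytes")
    print("pages needed for 10000 bytes:", pages_needed(10_000))
```

With 4096-byte pages, a 4097-byte request takes two pages, which is why many small allocations are usually served from a shared pool instead of one page each.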

Linux includes a networking stack that manages network interfaces, routing tables and protocols, enabling communication over networks. The kernel implements basic security mechanisms like user permissions, access controls and more advanced features like SELinux or AppArmor for policy enforcement.

System Calls are the API that applications use to interact with the kernel, allowing programs to perform privileged operations like reading from or writing to hardware devices.
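
Python’s os module exposes thin wrappers around several system calls, which makes it easy to watch the pattern in action. The sketch below sends a short message through a kernel pipe and reads it back. The function name echo_through_pipe is my own, and it assumes the message is small enough to fit in the pipe’s buffer, since a large write would block with nobody reading.

```python
import os


def echo_through_pipe(message):
    """Send a short bytes message through a kernel pipe and read it back."""
    read_fd, write_fd = os.pipe()           # pipe(2): the kernel creates the buffer
    try:
        os.write(write_fd, message)         # write(2): copy bytes into kernel space
        os.close(write_fd)                  # close(2): tells the reader no more data is coming
        write_fd = -1
        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)  # read(2): copy bytes back to user space
            if not chunk:                   # an empty read means end of data
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(read_fd)
        if write_fd != -1:
            os.close(write_fd)


if __name__ == "__main__":
    print(echo_through_pipe(b"hello kernel"))
```

Every line marked with a system call name crosses from user space into the kernel and back.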

When a computer boots, the BIOS or UEFI firmware loads a boot loader (like GRUB), which then starts the kernel. UEFI, the more modern replacement for BIOS, can also start a suitably built kernel by itself. Most distributions still install GRUB, which provides a boot menu and works on both older BIOS computers and UEFI computers.

The kernel initializes hardware, mounts the root file system and starts the first user-space process, usually init or systemd. It sets up system memory, initializes the virtual file system and prepares for user-space interactions.

The kernel schedules processes using algorithms like the Completely Fair Scheduler (CFS) to ensure equitable CPU time distribution. It manages process states (running, sleeping, waiting, etc.), forks new processes and handles signals. It uses paging to manage memory, translating virtual addresses to physical ones and managing page tables. The kernel also handles swapping, moving pages of memory to disk when necessary.
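
Forking and process states can be observed from user space. In this sketch the parent asks the kernel to fork a child, the child exits with a chosen code, and the parent sleeps in waitpid until the kernel reports the result. The function name run_child is my own.

```python
import os


def run_child(exit_code):
    """Fork a child process, let it exit, and return the exit code the parent observes."""
    pid = os.fork()                 # the kernel duplicates the calling process
    if pid == 0:
        os._exit(exit_code)         # child: terminate immediately, skipping cleanup handlers
    _, status = os.waitpid(pid, 0)  # parent: sleep until the child changes state
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None                     # the child was ended by a signal instead


if __name__ == "__main__":
    print("child exited with code", run_child(7))
```

While the parent is blocked in waitpid it is in a sleeping state, and the scheduler is free to give the CPU to something else.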

Applications request services from the kernel through system calls. These calls switch the CPU into what’s known as kernel mode, allowing privileged operations. The kernel handles hardware interrupts (like those from a keyboard or a timer) as well as software-triggered events (like exceptions raised by a running program), ensuring that the system responds to events efficiently. Linux’s modular design allows for dynamic kernel modules. These can be loaded or unloaded at runtime to add or remove functionality without rebooting.

Linux’s development is community-driven, with thousands of contributors worldwide. The code is open, allowing for transparency and rapid evolution. The kernel uses Git for version control, which facilitates collaborative development. Various Linux distributions (like Ubuntu, Fedora, Debian) tailor the kernel to specific needs, adding user-space tools, different file systems and drivers.

The Linux kernel is a marvel of software engineering, offering a balance between performance, stability and customization. Its modular architecture and community support make it adaptable to a wide array of computing environments. For developers, system administrators or anyone curious about how their computer works, understanding the Linux kernel provides deep insights into the mechanics of modern computing systems.

Device drivers connect the kernel to the hardware of your system. Knowing how to install and configure device drivers can be very useful for connecting your computer to other devices, like printers and monitors. Once you understand how Linux works, learning how to port Linux to hardware, such as a smartphone, might be an interesting subject to investigate.

The system call interface connects your kernel to user space, where all your applications operate. D-Bus is something different: it is an inter-process communication (IPC) “manager” that runs in user space. It manages the sockets that applications use to talk to each other, so you don’t have to do it manually.

The system call interface is the mechanism your applications use to ask the kernel to do things that make the applications function, including setting up communication with each other.
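
D-Bus messages on Linux typically travel over Unix domain sockets, which processes create through ordinary system calls. This sketch is not D-Bus itself; it only shows the kind of kernel-provided channel that such IPC is built on. The function name exchange and the “ack:” reply format are made up for the example.

```python
import socket


def exchange(request):
    """Pass a short request and a reply between two connected Unix domain sockets."""
    client, server = socket.socketpair()    # socketpair(2): the kernel links two endpoints
    try:
        client.sendall(request)             # the client sends its request
        received = server.recv(1024)        # the server side reads it
        server.sendall(b"ack:" + received)  # made-up reply format for this example
        return client.recv(1024)            # the client reads the reply
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(exchange(b"ping"))
```

In a real program the two ends would live in different processes, and a manager like D-Bus would take care of finding and connecting them.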

The communications between the kernel and the device drivers, and between the kernel and the system call interface, go both ways. The kernel uses device drivers to control devices, and devices use interrupts to get the kernel’s attention. Applications “call” the kernel through the system call interface to ask for kernel resources, and the kernel returns results or sends signals back to them.

Devices are usually pieces of hardware, such as the mouse, keyboard, CPU and memory in your laptop, your printer, a USB drive that you plug into your computer and any other hardware you want to control. Applications are the programs you run on the computer.

Your kernel starts systemd. systemd starts and controls your user space. Your applications run in user space, where they can access a subset of your computer’s resources via kernel system calls. User-level applications, such as Konsole, Kontact, LibreOffice and Firefox, are built on the core services provided by the kernel.

The kernel was first released on September 17th, 1991, by Linus Torvalds, then a student at the University of Helsinki. It later adopted the GNU General Public License, version 2 (GPLv2). GNU stands for “GNU’s Not Unix.”

With many academic and corporate supercomputers running Linux, and the Linux-kernel-based Android operating system working on millions of smartphones all over the world, Linux is the most common general-purpose operating system kernel in the world. Android is built on the Linux kernel, and OS X descends from BSD Unix. Linux itself is not a fork of Unix: it is a Unix-like system written from scratch.

AT&T held a monopoly on telephone service and was very stingy about its intellectual property rights in telephone technology. Under a 1956 antitrust settlement with the U.S. government, it was restricted to the telephone business and had to license its patents. It could not sell Unix as a commercial product, so it licensed Unix to universities for a nominal fee.

Some AT&T computer scientists, who had been involved in developing the C programming language and the Unix operating system, brought Unix to the University of California at Berkeley. Developers there created the Berkeley Software Distribution (BSD), an academic version of Unix that was eventually released under a permissive license. Apple and Sun Microsystems developed proprietary versions of Unix. Richard Stallman, who had worked at MIT, started the GNU project to build a free Unix-like system, and Linus Torvalds, at the University of Helsinki, wrote the Linux kernel. Working separately, they produced the pieces that were combined into the original Linux operating system.

The Linux kernel, originally developed by Linus Torvalds, is a monolithic kernel, rather than a micro-kernel or a hybrid. OS X and Windows are usually described as hybrids. There are many different distributions of Linux based on the Linux kernel. A monolithic kernel runs CPU scheduling, memory management and IPC in kernel space. It also handles device drivers, system calls and file system management there.

The monolithic design is faster, but a bug in any kernel component can affect the whole kernel, whereas micro-kernels keep most services separate from the kernel. Ubuntu has started sandboxing applications in Snap packages, which isolates applications from each other and from the rest of the system. This is a user-space measure and does not change the design of the kernel. Pop!_OS and many other operating systems use Flatpak.

The Linux kernel is written in the C and Assembly programming languages. Many Linux distributions also shipped a lot of Python 2.7 in their system tools. One time, while upgrading Python, I removed Python 2.7 from a Linux installation. That was the end of that operating system. The Python was in the distribution’s user-space tools, not in the kernel or its drivers.

  • Operating System Operations
    • Process Management
    • Memory Management
    • File Management
    • Device Management

Hardware

The Linux kernel also manages the system’s hardware using interrupts, much as applications and services like systemd use system calls to ask for kernel resources.

When the hardware wants to interface with the system, it issues an interrupt signal that interrupts whatever the processor is currently running. In order to synchronize various processes, the kernel can disable a single interrupt or all of them.

Interrupt handlers do not run in a process. They run in an interrupt context, which is not associated with any process. This enables interrupt handlers to respond quickly to an interrupt and then exit.

Linux supports dynamic loading of kernel modules. systemd is not part of the kernel. It is a user-space system and service manager that gathers system configuration and service management into one platform, and it can load kernel modules at boot.

The kernel is preemptive. It has symmetric multiprocessing (SMP) support. Linux provides an object-oriented device model, with device classes, hot-pluggable events and a user-space device file system.

System calls and Interrupts

Applications pass information to the kernel through system calls. Libraries contain functions that applications work with. Those functions use the system call interface to ask the kernel to carry out the operations the application needs.

Interrupts are the signals hardware uses to get the kernel’s attention. When a device raises an interrupt, the processor suspends what it is doing and runs the kernel’s handler for that interrupt. The kernel, in turn, commands devices through their drivers. This is how the operating system communicates with the hardware.

An IRQ is an Interrupt ReQuest. IRQs are signals from hardware devices, asking a CPU to stop and service the device. An IRQ line is a channel that carries those signals from a device, through the interrupt controller, to a CPU. Exceptions are different: they are raised by the CPU itself while it executes an instruction, usually because of some kind of error.
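
The kernel keeps a running count of interrupts per IRQ and per CPU in /proc/interrupts. Here is a minimal sketch that totals those counts across CPUs. The function name parse_interrupts is my own, and the parsing assumes the usual layout of a header row of CPU names followed by one “label: counts… description” row per interrupt source.

```python
def parse_interrupts(text):
    """Map each IRQ label in /proc/interrupts-style text to its total count across CPUs."""
    lines = text.splitlines()
    if not lines:
        return {}
    num_cpus = len(lines[0].split())        # the header row lists CPU0, CPU1, ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:num_cpus]:
            if not field.isdigit():         # reached the description columns early
                break
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as handle:
            totals = parse_interrupts(handle.read())
    except OSError:
        totals = {}                         # not available on this system
    # Print the five busiest interrupt sources.
    for label, count in sorted(totals.items(), key=lambda item: -item[1])[:5]:
        print(label, count)
```

Running it twice a few seconds apart shows which devices are interrupting the CPUs most often.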

IDT is the Interrupt Descriptor Table. APIC is the Advanced Programmable Interrupt Controller. LAPIC is a Local APIC. The interrupt handler’s first task is saving the CPU’s registers.

When a CPU receives an interrupt, the interrupt handler uses assembly language code to save the registers of the interrupted program and C functions to do the actual work. The address of the handler for IRQ n is stored in the interrupt[n] entry and then copied into the interrupt gate included in the proper IDT entry.

Each CPU in a computer has a hard IRQ stack and a soft IRQ stack. Hard IRQs have a higher priority than soft IRQs. The hard IRQ stack and the soft IRQ stack each occupy a single page frame of memory, which the kernel uses while handling the interrupt. Hard IRQ stacks are contained in a hardirq_stack array. Soft IRQ stacks are contained in a softirq_stack array.

Interrupt Service Routines handle interrupts by executing an operation specific to one type of device. Interprocessor Interrupts are handled as direct messages on the bus (the computer’s communications system) that connects the Local APIC of all the processors.

Two distinct Application Programming Interfaces (APIs) exist: the kernel-user space API and the kernel internal API. The kernel-user space API is the Linux API. It consists of the System Call Interface and the subroutines from the GNU C Library. It gives user space programs access to system resources and kernel services.

The Linux Application Binary Interface (ABI) is the kernel-user space interface at the binary level, between compiled programs and the kernel or libraries. ABIs describe how code that has already been compiled is called, while APIs describe how source code is written. Linux distributions, rather than the Linux kernel, define important ABIs.

An ABI is the binary-level contract between compiled pieces of code, covering things like calling conventions, data layout and system call numbers. An API is the source-level interface an application’s code uses to interact with the operating system.

ACPI is the Advanced Configuration and Power Interface specification. It is an open industry standard that transfers control of power management from the BIOS to the operating system.

Source: Interrupt Handling

The C Standard Library

Most system calls have wrapper functions in the GNU C Library. The Linux API includes the system call interface and the GNU C Library, named glibc.
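
To see the C library acting as the doorway to the kernel, this sketch calls the library’s getpid() wrapper directly through ctypes and compares it with Python’s own os.getpid(). The function name getpid_via_libc is my own, and the sketch assumes the running Python process is linked against a C library that exports getpid, which is normal on Linux.

```python
import ctypes
import os


def getpid_via_libc():
    """Call the C library's getpid() wrapper directly instead of Python's os.getpid()."""
    # Assumption: passing None opens the symbols already loaded into this process,
    # which include the C library on a typical Linux system.
    libc = ctypes.CDLL(None)
    libc.getpid.restype = ctypes.c_int
    return libc.getpid()


if __name__ == "__main__":
    print("via libc wrapper:", getpid_via_libc())
    print("via os.getpid(): ", os.getpid())
```

Both routes end in the same system call, so both report the same process ID.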

Qt and GTK are libraries that KDE and GNOME use to build their applications. The libraries provide objects and functions that programs can call on to do things, without having to create those objects and functions from scratch.

Your computer’s processors (my computer has eight processors on one chip, each one operating at 2 or 3 gigahertz) are performing billions of these operations every second.

POSIX

The Portable Operating System Interface (POSIX) is the standard for maintaining compatibility among operating systems. It declares the API, together with utility interfaces and command line shells. The Linux API includes the usable features defined by POSIX and additional features, including:

  • Cgroup subsystem
  • The Direct Rendering Manager’s system calls
  • A readahead feature
  • getrandom call
  • System calls such as futex, epoll, splice, dnotify, fanotify and inotify
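
Two of these Linux-specific calls are easy to try from Python, which wraps them as os.getrandom and select.epoll. This is a sketch under assumptions: the helper names random_bytes and pipe_is_readable are my own, the request sizes are kept small, and the code falls back to os.urandom where os.getrandom is not available.

```python
import os
import select


def random_bytes(count):
    """Ask the kernel for a small number of random bytes, preferring getrandom()."""
    if hasattr(os, "getrandom"):
        # os.getrandom may return fewer bytes for very large requests; keep count small.
        return os.getrandom(count)
    return os.urandom(count)                # fallback on platforms without os.getrandom


def pipe_is_readable(timeout=1.0):
    """Use epoll (Linux only) to wait until data written to a pipe can be read."""
    read_fd, write_fd = os.pipe()
    poller = select.epoll()
    try:
        poller.register(read_fd, select.EPOLLIN)  # ask to be told when reading is possible
        os.write(write_fd, b"x")
        events = poller.poll(timeout)             # blocks until an event or the timeout
        return any(fd == read_fd and mask & select.EPOLLIN for fd, mask in events)
    finally:
        poller.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(random_bytes(16).hex())
    if hasattr(select, "epoll"):
        print("pipe readable:", pipe_is_readable())
```

Programs that rely on calls like these will not run unchanged on other POSIX systems, which is the trade-off of going beyond the standard.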

The Linux Loadable Kernel Module (LKM)

You can add code to your Linux kernel by adding source files to the kernel source tree and rebuilding it. You can also add code as a loadable kernel module while the kernel is running. Common kinds of LKMs include device drivers, file system drivers and system calls.

The advantages of installing LKMs, instead of building code into the base kernel, are:

  • You don’t have to rebuild your kernel, saving time and avoiding errors.
  • LKMs help you investigate system problems, such as bugs.
  • LKMs save space, because you only load them when you use them.
  • Faster maintenance and debugging.

You can use LKMs as:

  • Device drivers, which are how the kernel exchanges information with hardware. A kernel must have a device’s driver before using it.
  • Filesystem drivers, which translate the contents of a filesystem.
  • System calls, which enable programs in user space to acquire services from the kernel.
  • Network drivers, which interpret a network protocol.
  • Executable interpreters, which load and manage executables.
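
The modules currently loaded into a running kernel are listed in /proc/modules. Here is a minimal sketch that parses that listing. The function name parse_modules is my own, and the parsing assumes the usual column order of name, size, reference count and the list of modules that use this one.

```python
def parse_modules(text):
    """Parse /proc/modules-style text into a list of dicts, one per loaded module."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue                        # skip blank or malformed lines
        name, size, refcount, users = fields[:4]
        modules.append({
            "name": name,
            "size": int(size),
            # The count is shown as "-" on kernels built without module unloading.
            "refcount": int(refcount) if refcount.isdigit() else None,
            # "-" means no other module uses this one; otherwise a comma-separated list.
            "used_by": [] if users == "-" else [u for u in users.split(",") if u],
        })
    return modules


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            loaded = parse_modules(handle.read())
    except OSError:
        loaded = []                         # not available on this system
    print(len(loaded), "modules loaded")
    for module in loaded[:5]:
        print(module["name"], module["size"], module["used_by"])
```

A module with an empty used_by list and a zero reference count is a candidate for unloading, which is the “only load them when you use them” advantage in action.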

Compiling the Linux Kernel

Compiling the Linux kernel is actually fairly simple. Download the source code from https://kernel.org. Read the manual. Follow directions. …

Architecture

A kernel is simply a resource manager. The resources being managed may be processes, memory or hardware devices. It manages and arbitrates access to resources between multiple competing users.

The GNU C Library provides the functions through which user-space programs reach the system call interface, allowing communication back and forth between the kernel and user space. It’s a library of prewritten code that applications can use to talk to the kernel, instead of writing that code from scratch.

The kernel is organized into three primary levels. The system call interface is the topmost level and implements basic actions like read and write. The kernel code is located below the system call interface. It is common to all processor architectures supported by Linux and is sometimes defined as architecture-independent kernel code.

Architecture-dependent code is below the architecture-independent code. It forms a Board Support Package (BSP), the processor-specific and platform-specific code for a given architecture. UEFI and GRUB are not part of this layer. They run before the kernel, place the kernel image into memory and start it.

The architectural perspective of the Linux kernel consists of:

  • The system call interface is a thin layer used to connect function calls from user space into the kernel. The system call interface may be architecture dependent.
  • Process management is focused on the execution of processes. In the kernel, these are called threads, and each represents an individual virtualization of the processor.
  • Linux includes methods for managing the available memory, as well as for interfacing with the hardware mechanisms for physical and virtual mappings. For efficiency, memory is managed as pages. Swap space is also provided.
  • The virtual file system provides a standard interface for the file system. It provides a switching layer between the system call interface and the file systems supported by the kernel.
  • The network stack is designed as a layered architecture, modeled after certain protocols.
  • Device drivers, which make hardware devices usable, are a significant part of the source code in the Linux kernel. They enable the kernel to communicate with and manage devices.
  • Architecture-dependent code consists of the elements that depend on the architecture of the hardware on which they run. They must take the architectural design into account for normal operation and efficiency.

Cgroups

Control groups, usually referred to as cgroups, are a Linux kernel feature which allows processes to be organized into hierarchical groups, whose usage of various resources can be monitored, prioritized and limited.

A cgroup is a collection of processes, which are bound to a set of limits or parameters defined via the cgroup filesystem.

The kernel’s cgroup interface is provided through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code. Resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).

A subsystem is a kernel component that modifies the behavior of the processes in a cgroup. Various subsystems have been implemented, making it possible to do things such as limiting the amount of CPU time and memory available to a cgroup, accounting for the CPU time used by a cgroup, and stopping and starting execution of the processes in a cgroup. Subsystems are sometimes called resource controllers, or simply, controllers.

The cgroups for a controller are arranged in a hierarchy. This hierarchy is defined by creating, removing, and renaming subdirectories within the cgroup filesystem. At each level of the hierarchy, attributes (e.g., limits) can be defined. The limits, control and accounting provided by cgroups generally have effect throughout the subhierarchy underneath the cgroup where the attributes are defined. So, the limits placed on a cgroup at a higher level in the hierarchy, cannot be exceeded by descendant cgroups.
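
The way a limit applies throughout the subhierarchy can be modeled in a few lines. This is a toy model, not the kernel’s interface: the dictionary of paths and limits, the example numbers and the function name effective_limit are all made up to illustrate that a descendant can never get more than its tightest ancestor allows.

```python
def effective_limit(limits, path):
    """Smallest limit set on a cgroup or any of its ancestors (None means unlimited).

    `limits` is a hypothetical mapping of cgroup path -> limit, with None for "no limit".
    """
    parts = [part for part in path.strip("/").split("/") if part]
    # Build the chain of ancestors: "/", "/apps", "/apps/web", ...
    ancestors = ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    values = [limits[a] for a in ancestors if limits.get(a) is not None]
    return min(values) if values else None


if __name__ == "__main__":
    # Made-up example hierarchy with memory limits in megabytes.
    limits = {"/": None, "/apps": 1000, "/apps/web": 4000, "/apps/db": 500}
    for cgroup in ("/apps/web", "/apps/db", "/other"):
        print(cgroup, "->", effective_limit(limits, cgroup))
```

Here /apps/web asks for 4000 but is capped at 1000 by its parent, while /apps/db keeps its own tighter limit of 500.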

Nodes

The no-processes-in-inner-nodes rule means that it is not permitted to have processes directly attached to a cgroup, which also has child cgroups and vice versa. In other words, a cgroup is either an inner node or a leaf node of the tree, and if it’s an inner node, it may not contain processes directly; and if it’s a leaf node, then it may not have child cgroups.

(There are some minor exceptions to this rule. For example, the root cgroup, which maintains kernel threads, is special and allows both processes and children.)

The single-writer rule means that each cgroup has only a single writer, i.e. a single process managing it. It’s OK if different cgroups have different processes managing them. However, a single process should own a particular cgroup, and that ownership is exclusive, and nothing else should manipulate it at the same time. This rule ensures that various pieces of software are not constantly stepping on each other’s toes.

Tree Setups

Unified — this is the simplest mode, and exposes a pure cgroup v2 logic. In this mode /sys/fs/cgroup is the only mounted cgroup API file system and all available controllers are exclusively exposed through it.

Legacy — this is the traditional cgroup v1 mode. In this mode the various controllers each get their own cgroup file system mounted to /sys/fs/cgroup/. On top of that, systemd manages its own cgroup hierarchy for management purposes as /sys/fs/cgroup/systemd/.

Hybrid — this is a hybrid between the unified and legacy mode. It’s set up mostly like legacy, except that there’s also an additional hierarchy /sys/fs/cgroup/unified/ that contains the cgroup v2 hierarchy. This mode is a stopgap solution.
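
A rough way to guess which of these three setups a system is using is to look at what exists under /sys/fs/cgroup. This is only a heuristic based on the directory names described above, and it assumes a systemd-managed system. The function name cgroup_mode is my own, and the cgroup.controllers file is the marker I’m assuming for a cgroup v2 mount.

```python
import os


def cgroup_mode(root="/sys/fs/cgroup"):
    """Guess the cgroup tree setup from the layout under root (heuristic only)."""
    # Assumption: a cgroup v2 filesystem has a cgroup.controllers file at its top level.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return "unified"
    # Hybrid keeps the v1 layout but adds a separate "unified" hierarchy.
    if os.path.isdir(os.path.join(root, "unified")):
        return "hybrid"
    # Legacy: per-controller mounts plus systemd's own hierarchy.
    if os.path.isdir(os.path.join(root, "systemd")):
        return "legacy"
    return "unknown"


if __name__ == "__main__":
    print("cgroup setup:", cgroup_mode())
```

The order of the checks matters, because a hybrid system also contains the directories that a legacy system has.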

Conclusion

The Linux kernel serves as a resource manager for your hardware and your applications. Your applications use the system call interface to communicate with the kernel. The kernel uses interrupts to communicate with the hardware.

Linux is a multi-user, multi-tasking system, allowing many people to use it at the same time and allowing many projects to run at the same time.

The modular nature of the Linux kernel enables you to add significant modifications without rebooting the system.

The monolithic structure of the kernel makes it faster. The main weakness is that if any of its services fail, the whole system can fail. Loadable modules make the kernel easier to extend, but they still run in kernel space. Snap packages and Flatpak install applications within sandboxes to protect the rest of the system from the packages and to protect the packages from each other.

Sources:
LinuxHint
Wayland
Linux Man Pages / cgroups
Oreilly / Interrupt Handling
ChatGPT
Grok
Brave Leo