Linux

Starting Linux

Electricity flows from the electrical grid, through the power cord, to the power supply. The power supply converts the 120 volt alternating current from the grid into the low-voltage direct current the computer can use. Once those voltages are stable, the power supply sends a roughly 5 volt “power good” signal to the motherboard, which allows the clock to start.

The clock is the timer that keeps all the different pieces of the puzzle synchronized. The power-on self-test makes sure all the hardware is working properly. The bus, the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU) are energized, and UEFI starts the boot loader, which loads the kernel into memory. The kernel starts systemd. systemd starts your display manager, which starts a Wayland compositor. The compositor draws your desktop environment on your monitors.

The Linux Kernel is the mysterious black box that does all the calculating and processing to get your hardware and operating system to work together to produce your desktop and all the applications running on your computer.

systemd is Ubuntu Linux’s all-in-one, user-space system and service manager: it manages your desktop and all the applications that run on it. systemd is process #1, and it creates and manages all other processes.

Your desktop displays the output of your computer on your monitor. It is where the action is. It is your work station, your art gallery. It is one of the most important tools in your studio.

Networking connects your computers to each other and to the Internet. Once your connection is set up, Linux joins it automatically whenever it is available.

UEFI

UEFI, or Unified Extensible Firmware Interface, represents a significant advancement over the traditional BIOS, providing a more robust framework for booting and managing hardware at the start-up phase of a computer.

One of the primary functions of UEFI is to initialize hardware components. When you turn your PC on, UEFI runs a series of tests and checks to ensure all hardware is functioning correctly before booting the operating system. This process is known as POST (Power-On Self-Test).

Beyond basic initialization, UEFI offers a more user-friendly interface. Instead of the text-based menu of BIOS, UEFI typically provides a graphical interface where users can navigate through menus to configure settings like boot order, time, security options and hardware parameters using a mouse and keyboard.

Security is another area where UEFI excels. It includes features like Secure Boot, which ensures that only trusted software can run at boot time, preventing malicious code from executing. This is particularly important for protecting against bootkits and rootkits.

UEFI also supports larger hard drives through its use of the GUID Partition Table (GPT) rather than the older MBR (Master Boot Record). This allows for drives larger than 2TB and supports more partitions, improving disk management capabilities.

Networking and device connectivity are enhanced under UEFI. It can handle network boot protocols natively, allowing for PXE (Preboot Execution Environment) booting which is essential for network installations or maintenance in enterprise environments.

Modularity is a key aspect of UEFI. It can load drivers and applications before the OS boots, which means firmware updates or diagnostics can be run directly from UEFI without needing an OS. This modularity is facilitated through the EFI System Partition (ESP), where boot loaders and other UEFI applications reside.
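
If you want to see this on your own machine, a couple of quick checks confirm a UEFI boot and show what lives on the ESP. These assume Ubuntu’s default mount point of /boot/efi; the last command needs administrator rights.

❯ ls /sys/firmware/efi        # this directory only exists when the system booted through UEFI
❯ ls /boot/efi/EFI            # boot loaders and other applications stored on the EFI System Partition
❯ sudo efibootmgr             # list the firmware's boot entries and their boot order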

The architecture of UEFI is also designed to be future-proof. It runs in a 64-bit environment, so it can address far more memory than the 16-bit BIOS could, preparing systems for future hardware advancements.

In terms of how UEFI works with the operating system, once the hardware is initialized and settings are confirmed, UEFI hands over control to the boot loader, which loads the OS. This transition is smoother and faster than with BIOS, partly because UEFI can initialize hardware components in parallel rather than strictly in sequence.

Finally, UEFI supports runtime services, allowing the OS to interact with firmware even after the system has booted. This can include accessing UEFI variables for system information or executing firmware functions like updating the firmware.

UEFI provides a modern, secure and flexible interface for managing hardware and booting systems, significantly enhancing the capabilities available at startup compared to the older BIOS systems.

The Kernel

The Linux kernel is the core component of the Linux operating system, serving as the foundation that manages the system’s hardware resources and facilitates interactions between the software and hardware. It acts as a bridge, ensuring that applications can run efficiently by providing services like memory management, process scheduling and system calls.

Starting with its architecture, the kernel employs a monolithic design where all the essential services run in a single kernel space. This design allows for better performance since there’s less overhead in context switching between different parts of the system. However, this approach also means that a failure within the kernel could potentially crash the entire system.

Memory management is another crucial aspect where the kernel shines. It handles how memory is allocated and deallocated, using mechanisms like paging and swapping. Paging lets the kernel map each process’s virtual memory onto physical memory, so a process’s address space can be larger than the RAM actually installed; this enables multitasking by giving each process its own virtual address space.

When it comes to process scheduling, the kernel decides how CPU time is distributed among running processes. It employs various scheduling algorithms, with the Completely Fair Scheduler (CFS) being a popular choice in recent versions. This scheduler aims to provide a fair share of CPU time to each process, balancing between responsiveness and throughput.

System calls form the interface between user space applications and the kernel. Whenever an application needs to perform a task that requires access to hardware or system resources, it makes a system call. The kernel then interprets this call, performs the necessary operations and returns control to the user space, ensuring security and stability by not allowing direct hardware manipulation from applications.
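
If strace is installed, you can watch this boundary in action: it prints the system calls a program makes as it runs. The example below traces just two of the many calls a simple command produces.

❯ strace -e trace=openat,write cat /etc/hostname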

Drivers within the kernel manage the communication with hardware devices. These drivers are either part of the kernel or loaded as modules, which can be dynamically inserted or removed from the running kernel. This modularity allows for flexibility and easier updates without needing to reboot the system.
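
A few everyday commands illustrate this modularity. The module name is left as a placeholder because the useful ones differ from machine to machine.

❯ lsmod | head                  # modules currently loaded into the running kernel
❯ modinfo <module>              # description, author and parameters of a module
❯ sudo modprobe <module>        # load a module without rebooting
❯ sudo modprobe -r <module>     # unload it again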

Security is also a paramount focus. Features like SELinux (Security-Enhanced Linux) or AppArmor provide mandatory access control mechanisms, enforcing policies that limit what processes can do, thereby enhancing system security.

Lastly, the kernel’s development is community-driven, with Linus Torvalds overseeing the mainline development. This open-source nature means constant updates, patches for security issues and new features being integrated, ensuring that the Linux kernel remains robust, secure and adaptable to new hardware and software demands.

I recommend that free and open source software adopt the “spiritual assembly” model of consultation, collective decision making and leadership. At least nine people meet regularly to consult and decide on any updates. The community will still contribute to software development; the assembly will maintain the software and lead the community.

The Linux kernel is a sophisticated piece of software that not only manages hardware but also defines how software interacts with this hardware, making Linux a powerful platform for both servers and personal computing.

Linux File System

The Linux file system is a structured way of organizing data on storage devices, providing a hierarchical arrangement of directories and files. It’s designed to be both efficient and flexible, supporting various file systems like ext4, XFS or Btrfs, each with specific features for different use cases.

At the top of this hierarchy sits the root directory, denoted by a single forward slash /. This root directory serves as the starting point for all paths in the system. From here, everything branches out.

Key directories include:

  • /bin for essential binary executables needed by the system during boot or in single-user mode.
  • /etc for system-wide configuration files.
  • /home where each user has a personal directory for their files.
  • /usr for read-only user data, containing many subdirectories like /usr/bin for user commands and /usr/lib for libraries.
  • /var for variable data files like logs, spool directories and temporary files that change size while programs are running.

Linux uses a permission system where each file or directory has attributes indicating who can read, write or execute it. These permissions are set for the owner, group and others, managed via commands like chmod and chown.
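
For example, assuming a file called notes.txt in your home directory and hypothetical user and group names, you could inspect and adjust its permissions like this:

❯ ls -l notes.txt                      # show the owner, group and permission bits
❯ chmod 640 notes.txt                  # owner may read and write, group may read, others get nothing
❯ sudo chown alice:studio notes.txt    # hand the file to user alice and group studio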

Different file systems can be mounted at various points in the directory tree. For instance, /dev/sda1 might be mounted at /mnt/data. This process allows for the integration of various storage devices into a single namespace.
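
A sketch of mounting a partition by hand, assuming the device really is /dev/sda1 and that the directory /mnt/data already exists:

❯ sudo mount /dev/sda1 /mnt/data    # attach the filesystem on sda1 at /mnt/data
❯ findmnt /mnt/data                 # confirm what is mounted there and with which options
❯ sudo umount /mnt/data             # detach it again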

Every file or directory is associated with an inode, which contains metadata like ownership, permissions, timestamp and pointers to the actual data blocks on the disk. The relationship between inodes and file names is managed by directory entries.

Linux supports hard links, where multiple names point to the same inode, and symbolic links (symlinks), which are files that point to another file or directory. Symlinks can span across file systems but hard links cannot.
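
You can see inodes and both kinds of links with a few commands; original.txt is just a stand-in name for any file you have lying around.

❯ ls -li original.txt                      # the first column is the inode number
❯ ln original.txt hard.txt                 # hard link: a second name for the same inode
❯ ln -s original.txt soft.txt              # symbolic link: a small file that points at the name
❯ ls -li original.txt hard.txt soft.txt    # the hard link shares the inode, the symlink has its own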

Linux supports numerous file systems, each with different optimizations.

  • ext4 is widely used for its reliability and performance, with features like journaling for data integrity.
  • Btrfs offers advanced features like snapshots, subvolumes and RAID support.
  • XFS excels at handling large files and scalability.

Pseudo file systems like /proc provide an interface to kernel data structures, allowing users to view or tweak system parameters at runtime without needing to dive into kernel code. Similarly, /sys gives access to kernel modules and hardware states.
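
A few reads from these pseudo file systems show the idea; nothing here is stored on disk, the kernel generates the contents on demand. The battery path is only an example and varies by machine.

❯ cat /proc/uptime                               # seconds since boot, plus idle time
❯ head /proc/cpuinfo                             # details about your processors
❯ cat /sys/class/power_supply/BAT0/capacity      # battery percentage on many laptops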

Basic operations like creating, copying, moving or deleting files are managed through system calls, which the shell or applications use. These operations are abstracted so that users don’t need to know the underlying storage specifics.

Linux uses caching mechanisms to improve performance. Data read from disk is kept in memory (RAM) for quicker access on subsequent reads, managed by the kernel’s buffer cache.

The Linux file system works by abstracting the physical storage into a logical structure that’s both human-readable and programmatically manageable. This abstraction allows for a versatile environment where different types of storage can be integrated and managed uniformly.

XDG

XDG, short for X Desktop Group (the project now known as freedesktop.org), refers to a set of standards and specifications aimed at enhancing interoperability among desktop environments on Unix-like systems. These specifications are designed to ensure that applications and environments can work together seamlessly, regardless of the underlying desktop system.

The primary focus of XDG is on setting standards for user directories, application data and environment variables. One of the key contributions of XDG is defining where different types of data should be stored.

  • User-specific data should be in $XDG_DATA_HOME (defaulting to $HOME/.local/share if not set).
  • Config files go in $XDG_CONFIG_HOME (defaulting to $HOME/.config).
  • Runtime files are located in $XDG_RUNTIME_DIR.
  • Cache files are saved in $XDG_CACHE_HOME (defaulting to $HOME/.cache).

This structure helps in organizing user data and configuration more uniformly across different applications and desktop environments.

XDG specifications include environment variables that applications can use or set to conform to these standards. For instance, an application looking for its configuration file would check $XDG_CONFIG_HOME/appname/config instead of a hardcoded path like ~/.appname/config.
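
In a shell script that lookup, fallback included, is a one-liner; appname is hypothetical.

❯ config_file="${XDG_CONFIG_HOME:-$HOME/.config}/appname/config"
❯ echo "$config_file"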

The XDG Menu Specification helps in creating a consistent desktop menu across different environments. Applications provide .desktop files which describe how they should appear in menus, including icons, names and categories. This allows for a standardized approach to app integration into various desktop environments like GNOME, KDE or XFCE.
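
A minimal .desktop file, saved for example to ~/.local/share/applications/sketchpad.desktop, might look like the sketch below. The application name and command are made up.

[Desktop Entry]
Type=Application
Name=Sketchpad
Comment=A hypothetical drawing program
Exec=sketchpad %U
Icon=sketchpad
Categories=Graphics;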

XDG also deals with MIME type associations. The XDG Shared MIME-info Database ensures that file types are recognized and handled consistently across different applications. This means if you set a file association in one application, it’s likely to be respected system-wide.

By adhering to XDG standards, developers can write applications that are more portable across different Unix-like systems without needing to write environment-specific code. This also aids in the development of cross-desktop tools and plugins.

XDG specifications also include fallbacks for legacy applications that do not follow these standards, ensuring compatibility with older software while encouraging adoption of newer standards.

In practice, when an application starts, it checks for these XDG variables to determine where to read from or write to. If they are not set, it falls back to traditional Unix paths. This system benefits both users, by providing a cleaner and more organized home directory, and developers, by reducing the complexity of supporting multiple environments.

XDG specifications work by providing a framework for consistency, which in turn promotes a more unified user experience across different Linux desktop environments.

systemd

systemd is an init system and system manager used in many Linux distributions to initialize and manage services, control how the system boots up and handle dependencies between services. It has become quite controversial due to its monolithic nature and the significant changes it brings to traditional Unix system management practices.

systemd replaces older init systems like SysVinit or Upstart. It begins its work at boot time, where it takes over from the bootloader to start the system.

When the system boots, systemd is the first process (PID 1) to run. It reads its configuration files from /etc/systemd/system and /lib/systemd/system, which define units (services, sockets, targets, etc.) and their dependencies.

systemd organizes everything it manages into “units”. These units can be services (.service) that manage daemons or applications. systemd can start, stop, restart or check the status of these services.

Similar to runlevels in SysVinit, target units (.target) are synchronization points where groups of units are activated. For example, multi-user.target is akin to runlevel 3, where the system is up but has no graphical interface.

Socket units (.socket) define network or inter-process communication sockets. Services can be activated on demand when a connection to the socket is made; this is systemd’s socket activation feature.

Other unit types include mounts (.mount), automounts (.automount) and timers (.timer), which handle filesystem mounts, on-demand mounts and scheduled jobs respectively.
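
Day to day you interact with these units through systemctl. The examples below use ssh.service and the default target; the unit names on your system may differ.

❯ systemctl status ssh.service          # is the service running, and what has it logged recently?
❯ sudo systemctl restart ssh.service    # stop the service and start it again
❯ systemctl get-default                 # the target the system boots into
❯ systemctl list-units --type=target    # targets that are currently active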

One of systemd’s key features is its ability to manage dependencies. If a service A requires service B to be running, systemd will ensure B starts before A. This dependency management can be quite complex, allowing for sophisticated system configurations.
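
Dependencies are declared right in the unit file. A minimal, hypothetical service that waits for the network could look like this, saved as /etc/systemd/system/myapp.service:

[Unit]
Description=My example application
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/myapp
Restart=on-failure

[Install]
WantedBy=multi-user.target

After creating or editing a unit, sudo systemctl daemon-reload tells systemd to re-read its configuration, and sudo systemctl enable --now myapp.service starts the service and enables it at boot.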

systemd includes journald, which provides centralized logging. Logs are stored in a binary format in /var/log/journal/, which can be queried using journalctl. This is different from traditional logging where each service might write to its own log file.
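
journalctl gives you several views into that binary journal; ssh.service is again just an example unit name.

❯ journalctl -b                 # everything logged since the current boot
❯ journalctl -u ssh.service     # entries from one unit only
❯ journalctl -f                 # follow new entries as they arrive, like tail -f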

Unlike older init systems, systemd can start services in parallel, potentially speeding up the boot process by reducing idle waiting times.

systemd uses cgroups (control groups) for resource management, allowing it to allocate resources like CPU, memory or I/O to different processes or services, which aids in security and performance tuning.

systemd also manages user sessions with systemd --user, providing each user with their own systemd instance for managing user services.

systemd can also save system state, for example by suspending or hibernating the machine so that its state is restored at the next boot and you can pick up where you left off.

However, systemd’s extensive integration into the system has led to debates within the Linux community about its complexity, the speed of adoption and its impact on system administration practices. Critics argue it centralizes too much functionality, potentially making system troubleshooting and maintenance more complex.

In summary, systemd is designed to make service management more dynamic, integrated and efficient, but its adoption has also prompted discussions about the philosophy of system administration in Unix-like environments.

Control Groups

Cgroups, or Control Groups, are a Linux kernel feature that allow for resource allocation and management among groups of processes. They provide a mechanism to limit, account for and isolate the resource usage of a collection of processes.

The primary function of cgroups is to manage system resources like CPU, memory, I/O and network bandwidth. By grouping processes together, cgroups enable administrators or applications to set constraints on how much of these resources each group can use, ensuring fair distribution or prioritizing certain tasks over others.

Cgroups operate through a hierarchy of controllers and groups. Each controller handles a specific type of resource like cpu, memory or blkio for block I/O. Processes are then added to groups managed by these controllers, where policies are applied to control resource access.

One practical application of cgroups is in containerization technologies like Docker, where each container runs in its own cgroup, thus isolating its resource consumption from others. This allows for effective sandboxing where one container’s resource hogging won’t impact others on the same system.

The implementation of cgroups is exposed through a cgroup filesystem mounted at /sys/fs/cgroup; on the older cgroup v1 hierarchy each controller has its own directory there, while the newer unified cgroup v2 hierarchy uses a single tree. Under these directories you can create subdirectories (subgroups) and adjust parameters via files within them. For instance, setting a memory limit for a group involves writing to a file like memory.limit_in_bytes (cgroup v1) or memory.max (cgroup v2).
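
As a rough sketch on a current system using the unified cgroup v2 hierarchy, you could cap a group's memory like this as root; demo is an arbitrary group name, and if memory.max is missing the memory controller may need to be enabled in the parent's cgroup.subtree_control first.

❯ sudo mkdir /sys/fs/cgroup/demo                               # create a new subgroup
❯ echo 268435456 | sudo tee /sys/fs/cgroup/demo/memory.max     # cap it at 256 MiB of memory
❯ echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs          # move the current shell into the group

On systemd systems you rarely touch these files directly; something like systemd-run -p MemoryMax=256M <command> asks systemd to set up an equivalent, temporary cgroup for you.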

Cgroups also support nesting, meaning a group can have subgroups, each with potentially different resource limits or controls. This hierarchical structure allows for fine-grained control, where you can define broad policies at higher levels and more specific ones at lower levels.

Beyond resource control, cgroups are used for system accounting, where they track resource usage for billing or monitoring purposes. This can be particularly useful in cloud environments or shared hosting scenarios where usage needs to be precisely measured.

Cgroups work in conjunction with namespaces, another Linux feature, to provide complete isolation for containers or virtual environments. While namespaces manage what processes see, cgroups manage what processes can use, together forming a powerful framework for resource management and isolation.

In essence, cgroups provide a flexible and powerful way to manage system resources at a process or group level, allowing for better performance, security and resource utilization in multi-tenant or dynamic workload environments.

Namespaces

Linux namespaces are a kernel feature that provide process isolation by creating separate instances of global system resources, making each process or group of processes believe they have their own, isolated view of the system. They are fundamental to technologies like containers, allowing for lightweight virtualization.

Namespaces work by encapsulating different aspects of the system’s global resources into separate spaces. There are several types of namespaces, each managing a specific resource:

  • PID namespaces allow processes within them to have their own set of process IDs, independent from those outside the namespace.
  • Mount namespaces provide each process with its own filesystem view, where mounts and unmounts do not affect the global system or other namespaces.
  • Network namespaces isolate network interfaces, routing tables and other network resources, so each namespace appears to have its own network stack.
  • UTS namespaces manage the hostname and NIS domain name, allowing different namespaces to have different identities.
  • IPC namespaces isolate System V IPC objects and POSIX message queues, ensuring inter-process communication is confined within the namespace.
  • User namespaces allow a process to have different user and group IDs inside and outside the namespace, enhancing security by mapping root inside the namespace to a non-privileged user outside.

When a new process is created (e.g., via clone() with specific flags), it can be placed in one or more new or existing namespaces. This process then sees only the resources within its namespaces, effectively isolating it from the rest of the system.

The kernel manages these namespaces, ensuring that changes within one do not affect others unless explicitly shared. For example, if a process in a network namespace creates a network interface, that interface is only visible and usable within that specific namespace.

Namespaces can be nested, allowing for complex isolation scenarios where a namespace might contain other namespaces. This hierarchical structure supports fine-grained control over resource visibility and access.

One practical application of namespaces is in container technologies. Docker, for instance, uses namespaces to create containers that run in isolation from each other and the host system. Each container gets its own network stack, file system view, process ID space, etc., which makes containers feel like they are running on their own virtual machine without the overhead.

Management of namespaces is often done through system calls like unshare() to create new namespaces or setns() to join existing ones. Tools like ip netns for network namespaces, or the unshare command for general namespace manipulation, let you work with these features from user space.
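
Two quick experiments, both run as root, show the effect; demo is just a name for a throwaway namespace.

❯ sudo unshare --pid --fork --mount-proc ps -e    # in a fresh PID namespace, ps sees almost nothing
❯ sudo ip netns add demo                          # create a named network namespace
❯ sudo ip netns exec demo ip link                 # inside it, only the loopback interface exists
❯ sudo ip netns delete demo                       # clean up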

In essence, Linux namespaces provide a way to partition kernel resources so that processes can be sandboxed or virtualized, enhancing security, privacy and system organization without the need for full virtualization.

Processes

Linux processes are instances of programs in execution, managed by the kernel. Each process has its own virtual memory space, system resources and a unique identifier known as a Process ID (PID). When you start an application or execute a command, the kernel creates a new process, which can then spawn other processes, leading to a tree-like structure where each process has a parent-child relationship.

Processes in Linux are managed through several mechanisms. The kernel schedules processes, determining when each process runs based on priority and scheduling algorithms like Completely Fair Scheduler (CFS). This scheduling ensures that CPU time is distributed fairly among processes, allowing for multitasking where multiple applications can seem to run simultaneously.

Memory management is another critical aspect for processes. Each process gets its own virtual address space, where memory is mapped. The kernel uses paging and swapping to manage this, allowing processes to use more memory than physically available by moving pages between RAM and disk.

Processes interact with the system and other processes through system calls, which are requests for kernel services. These can include file operations, network communication or process control commands like fork() to create a new process, or exec() to replace the current process image with a new program.

Inter-process communication (IPC) is facilitated by several methods in Linux, including pipes, signals, shared memory, message queues and sockets. These mechanisms allow processes to communicate or synchronize their activities, which is crucial for cooperative multitasking and complex applications.

Linux also employs process states to manage lifecycle and execution. A process can be in states like running, sleeping (waiting for an event or resource), stopped (paused by a signal) or zombie (a terminated process that still holds a slot in the process table until its parent collects its exit status).
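
ps can show these states directly; in the STAT column, R means running, S sleeping, T stopped and Z zombie. pstree (from the psmisc package) draws the parent-child tree mentioned earlier.

❯ ps -eo pid,ppid,stat,comm | head    # PID, parent PID, state and command for the first few processes
❯ pstree -p | head                    # the process tree, with PIDs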

Security and isolation are maintained through mechanisms like user IDs (UIDs) and group IDs (GIDs), which define what resources a process can access. Additionally, Linux uses namespaces to isolate process views of system resources, which is fundamental for containerization.

Process termination happens when a process completes its execution, is killed by a signal or crashes due to an error. Upon termination, resources like memory and open files are released back to the system and the process’s exit status is communicated to its parent, which can then decide whether to wait for the process or ignore its termination.

Linux processes are dynamic entities that encapsulate program execution, managed by the kernel for scheduling, resource allocation and security, allowing for efficient, secure and concurrent operation of multiple programs on a single system.

The Bus

In Linux, the term “bus” generally refers to the communication pathways within the system, both physical and logical, that allow different components to interact. This includes hardware buses like PCI, USB, and I2C, but also extends to software abstractions like the Linux kernel’s device model.

The Linux kernel uses a device model to manage hardware resources. At the core of this model is the concept of a bus, which organizes devices into a hierarchical structure. Each bus type in Linux is represented by a struct bus_type, which includes methods for managing devices on that bus. For instance, when new hardware is detected, the appropriate bus driver will be notified to handle the device.

Bus drivers in Linux manage the communication between devices and the rest of the system. When a device is plugged into a bus, the bus driver scans for new devices, using techniques like bus enumeration to detect and initialize them. This process involves registering the device with the kernel and possibly loading the appropriate driver if it’s not already in memory.
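
sysfs exposes this device model, so you can browse the buses the kernel has registered and the devices attached to them.

❯ ls /sys/bus                  # every bus type the kernel knows about
❯ ls /sys/bus/usb/devices      # devices currently enumerated on the USB bus
❯ lspci | head                 # a friendlier view of devices on the PCI bus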

The device driver model in Linux allows for a clean separation between the hardware-specific code and the rest of the kernel. When a device is added to a bus, it not only gets a physical address but also a logical path within the device tree. This structure helps in managing dependencies and power management, where operations like suspend or resume are coordinated bus-wide.

One key aspect of the Linux bus system is its ability to handle hotplugging. Modern buses like USB support adding or removing devices while the system is running. The kernel’s hotplug infrastructure will notify the user space about these events, potentially triggering scripts or services to react accordingly, like mounting a newly inserted USB drive.

Buses also facilitate interrupt handling. When a device on a bus needs the CPU’s attention, it can send an interrupt signal through the bus. The Linux kernel maps these interrupts to handlers that manage the specific device’s needs, ensuring efficient communication between hardware and software.

Lastly, the Linux bus system supports a variety of protocols for device communication. Each bus type has its protocol, like USB’s request-response model or I2C’s master-slave communication. The kernel abstracts these differences, providing a uniform interface for device drivers to interact with the hardware, simplifying driver development across different bus types.

In essence, the Linux bus system acts as an intermediary, ensuring that hardware can communicate effectively with the software, managing device lifecycle, interrupts and providing a framework for driver development and device management.

Application Programming Interface

An API, or Application Programming Interface, serves as a set of rules and protocols that allow different software applications to communicate with each other. It defines how software components should interact, specifying the syntax, data formats and conventions to be followed.

At its core, an API acts like a contract between the provider and the consumer of the service. When a developer wants to use an API, they interact with it by making requests. These requests can be for data, functionality or to perform actions within another application or service. For instance, if you’re developing a weather app, you might use a weather data API to fetch current conditions or forecasts.

In a typical web API, the exchange begins when the client (your application) sends an HTTP request to the API’s endpoint. This request includes the necessary parameters, such as authentication tokens, the type of data requested or specific actions to perform.

Upon receiving the request, the API server processes it. This might involve querying a database, running algorithms or interfacing with other services. The server interprets the request according to the API’s specifications.

Once the server has processed the request, it sends back a response. This response is formatted according to the API’s documentation, typically in JSON or XML, containing the data or confirmation of the action performed. The response might include status codes (like 200 for OK, 404 for Not Found) that inform the client of the outcome.
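
With curl you can play the role of the client yourself. The URL, token and JSON response below are placeholders, not a real service; they simply illustrate the request and response described above.

❯ curl -s -H "Authorization: Bearer YOUR_TOKEN" "https://api.example.com/v1/weather?city=Halifax"
{"city": "Halifax", "temperature": 12, "conditions": "overcast"}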

APIs can be designed in various architectural styles.

  • REST (Representational State Transfer) uses standard HTTP methods like GET, POST, PUT, DELETE. It’s stateless, meaning each request from a client must contain all the information needed to complete the request.
  • SOAP (Simple Object Access Protocol) is an older, more rigid standard that uses XML for message format and can work over various protocols, not just HTTP. It’s known for its strict standards but can be more complex to implement.
  • GraphQL allows clients to request exactly the data they need, reducing over-fetching or under-fetching of data, which can be more efficient for complex queries.

APIs often include authentication mechanisms to ensure that only authorized users or applications access the service. This could involve API keys, OAuth tokens or other forms of secure authentication.

Error handling is another critical aspect of how APIs work. Good APIs provide clear, informative error messages that help developers diagnose issues when things go wrong, like invalid parameters or server errors.

An API works by providing a structured way for applications to interact, allowing for modular, scalable software design where different systems can work together seamlessly, each handling its part of the overall functionality or data management.

systemd-networkd

systemd-networkd is a system service that manages network configurations on Linux systems using systemd. It provides network configuration and management capabilities, handling network interfaces, IP addresses, routes and more, all in a declarative manner.

One of the primary ways systemd-networkd works is through the use of configuration files stored in /etc/systemd/network/. These files define how each network interface should be configured, including static IP addresses, DHCP settings or bridge configurations. The service reads these files at boot or when triggered by network changes.

Networkd supports various network types.

  • Ethernet: For wired connections, where you can specify static IPs or use DHCP.
  • VLANs: Virtual LANs can be created and managed, allowing network segmentation.
  • Bridges: For creating network bridges to connect multiple network segments or devices.
  • Tunnels: Like VXLAN or GRE, for encapsulating one network protocol within another.

When network devices come online, systemd-networkd detects them and applies the configurations from the matching .network files. If no configuration matches, it might fall back to generic settings or use DHCP if available.
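
A simple .network file for a wired interface using DHCP might look like the sketch below, saved for example as /etc/systemd/network/20-wired.network; the file name and interface name are illustrative.

[Match]
Name=enp3s0

[Network]
DHCP=yes

After saving the file, sudo networkctl reload asks systemd-networkd to pick up the change.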

The service also interacts with other systemd components, such as systemd-resolved for DNS management and systemd-udevd, which applies .link files that set low-level interface properties like names and MAC addresses. This integration allows for a unified approach to network management within the systemd ecosystem.

Link state monitoring is another feature, where systemd-networkd can react to changes in link status (like a cable being plugged in or out), automatically adjusting network configurations as needed.

For dynamic configurations, systemd-networkd can work with DHCP clients or servers, managing lease times, and even supporting DHCPv6 for IPv6 networks.

Administrators can control and query networkd via command-line tools like networkctl, which provides status information, allows interface management and can show which configurations are applied to which interfaces.

In summary, systemd-networkd offers a modern, integrated way to configure and manage network interfaces, aiming for simplicity and consistency across different network scenarios, all managed within the broader systemd framework.

❯ networkctl --help
networkctl [OPTIONS...] COMMAND

Query and control the networking subsystem.

Commands:
  list [PATTERN...]       List links
  status [PATTERN...]     Show link status
  lldp [PATTERN...]       Show LLDP neighbors
  label                   Show current address label entries in the kernel
  delete DEVICES...       Delete virtual netdevs
  up DEVICES...           Bring devices up
  down DEVICES...         Bring devices down
  renew DEVICES...        Renew dynamic configurations
  forcerenew DEVICES...   Trigger DHCP reconfiguration of all connected clients
  reconfigure DEVICES...  Reconfigure interfaces
  reload                  Reload .network and .netdev files

Options:
  -h --help               Show this help
     --version            Show package version
     --no-pager           Do not pipe output into a pager
     --no-legend          Do not show the headers and footers
  -a --all                Show status for all links
  -s --stats              Show detailed link statistics
  -l --full               Do not ellipsize output
  -n --lines=INTEGER      Number of journal entries to show
     --json=pretty|short|off
                          Generate JSON output

See the networkctl(1) man page for details.