Linux Server Planning and Configuration Essentials

A Linux server is just Linux, with the same kernel, the same commands, and the same file hierarchy, but configured deliberately: right-sized for its workload, secured for network exposure, and arranged so that services start reliably and can be debugged when they do not. This lecture walks through the decisions and tools that matter from the moment you choose a distribution to the moment a service is running and monitored. Every section builds on the last; by the end, you will have the vocabulary and the commands to set up and maintain a Linux server from scratch.

Where Does a Linux Server Run?

Linux servers run on bare metal (a physical machine you can touch), inside a virtual machine (VirtualBox, Proxmox, or a cloud hypervisor), or as a cloud instance (an EC2 instance on AWS, a Droplet on DigitalOcean, etc.). The operating system does not care which of these it is running on; once you have a shell prompt, the commands in this lecture work the same way everywhere. The differences show up at the edges: device names (sda on physical SATA, vda on virtio in a VM, nvme0n1 on NVMe or some cloud instance types), how you access the console (a monitor vs. a VNC window vs. SSH), and who manages the firmware and hardware underneath you.
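As a rough illustration of the device-name differences described above, the sketch below maps kernel disk names to the bus or driver family that usually produces them and then lists the disks the running kernel exposes. The function name disk_kind is made up for this example, and the mapping covers only the three prefixes mentioned in the text.

```shell
# Map a kernel disk name to the bus/driver family that typically produces it.
disk_kind() {
  case "$1" in
    nvme*) echo "NVMe" ;;
    vd*)   echo "virtio (VM)" ;;
    sd*)   echo "SCSI/SATA" ;;
    *)     echo "other" ;;
  esac
}

# Walk the block devices the kernel currently exposes under /sys/block.
for dev in /sys/block/*; do
  [ -e "$dev" ] || continue
  name=${dev##*/}
  printf '%-12s %s\n' "$name" "$(disk_kind "$name")"
done
```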

To follow along with this lecture, any of the following environments will work:

Cloud instance (easiest to start). Launch an EC2 (Elastic Compute Cloud, AWS’s virtual machine service) instance using a Linux LTS AMI (Amazon Machine Image, a pre-built OS snapshot used to launch instances). You get a running server in under a minute and can tear it down when you are done.
Local VM. Create a virtual machine in VirtualBox or UTM (macOS). Download the ISO for your chosen distribution and install from it. This gives you the full installation experience including the boot process, partitioning, and first-boot configuration.
Bare metal. If you have a spare physical machine, install your chosen Linux distribution directly. This is the most complete experience (you will see real firmware, real disk detection, real network interfaces), but it requires dedicated hardware.

Choosing a Linux Distribution

The first decision you face when building a server is which Linux distribution to run. This choice affects your package ecosystem, release cadence, support window, default tooling, vendor support, and the community you turn to when things break. It also affects how often change arrives. Some distributions deliberately change slowly so operations teams can standardize on them for years. Others move quickly so developers get newer kernels, drivers, and language runtimes sooner.

Linux distributions fall into a small number of family trees that share a common ancestry, package format, and tooling. Knowing the lineage helps you transfer skills from one distro to another.

Debian family: Debian is the upstream root. Ubuntu is built on Debian and is one of the most widely used server distributions. Mint, Kali, and Knoppix all derive from Ubuntu or Debian. Proxmox is also built on top of Debian. Package format: .deb; package manager: apt.
Red Hat / Fedora family: Red Hat Enterprise Linux (RHEL) is the commercial flagship. Fedora is the upstream community project. AlmaLinux and Rocky Linux are free rebuilds of RHEL. Amazon Linux is based on this family as well. Package format: .rpm; package manager: dnf (formerly yum).
Arch family: Arch Linux follows a rolling-release model. Manjaro is a more beginner-friendly derivative. Package manager: pacman.
Other notable distributions: openSUSE has both the stable Leap line and the rolling Tumbleweed line. Gentoo is source-based. Slackware is one of the oldest maintained distros. ChromeOS is Linux-based. FreeBSD and other BSDs are Unix-like but are not Linux.
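One way to see which family a running system belongs to is to read the ID and ID_LIKE fields of /etc/os-release. The sketch below is a minimal example under that assumption; the function name pkg_manager_for is hypothetical, and the mapping covers only the three families with a package manager named above.

```shell
# Print the package manager implied by an os-release file (default: /etc/os-release).
pkg_manager_for() {
  file=${1:-/etc/os-release}
  # Run in a subshell so the sourced variables do not leak into the caller.
  (
    ID= ID_LIKE=
    . "$file"
    case " $ID $ID_LIKE " in
      *" debian "*|*" ubuntu "*)            echo apt ;;
      *" rhel "*|*" fedora "*|*" centos "*) echo dnf ;;
      *" arch "*)                           echo pacman ;;
      *)                                    echo unknown ;;
    esac
  )
}

if [ -r /etc/os-release ]; then
  pkg_manager_for
fi
```

Derivatives usually list their parent in ID_LIKE, which is why the check looks at both fields rather than at ID alone.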

The biggest practical difference between distributions is often the release model. A stable or LTS (Long Term Support) distribution freezes package versions for a defined release and then backports security patches. That means you keep the same major version of many packages for a long time, but receive security fixes without chasing constant feature churn. Debian Stable, Ubuntu LTS, and RHEL all follow this model. The operational advantage is predictability: documentation stays accurate longer, automation is less likely to break, and maintenance windows are easier to plan. The tradeoff is that package versions can look old compared to blog posts or tutorials written for newer releases.

A rolling-release distribution continuously publishes new package versions instead of bundling them into large, infrequent releases. Arch is the canonical example. The advantage is freshness: new kernels, drivers, compilers, and language runtimes appear quickly. The tradeoff is operational volatility. A server on a rolling distribution asks you to absorb change continuously, which is useful for learning and desktop experimentation but usually a poor default for production infrastructure unless your team is intentionally staffed and tooled for that pace.

Some distributions split the difference. Fedora has a regular release cadence and relatively short support window, so it moves much faster than RHEL while still shipping in distinct releases. openSUSE offers both Leap and Tumbleweed for this reason: one track for stability, one for constant updates.

Ubuntu Server (LTS) is one of the most popular choices for new deployments. Canonical publishes Long Term Support releases every two years, each backed by five years of security patches (ten with Ubuntu Pro). Ubuntu has an enormous ecosystem of community packages. Ubuntu is a strong default when you want current-enough software, broad documentation, and lots of third-party instructions that match what you are running.

Debian is the upstream project that Ubuntu is built on. Debian Stable prioritizes rock-solid reliability over cutting-edge software. Debian also maintains Testing and Unstable branches, but when administrators say “Debian on a server” they usually mean Stable. Debian is a strong choice when you want maximum stability and minimal surprises, but you may need backports or third-party repositories if your application stack requires newer software.

RHEL and its rebuilds (AlmaLinux, Rocky Linux) dominate enterprise environments. Red Hat Enterprise Linux follows a long support lifecycle measured in many years, not months. AlmaLinux and Rocky Linux are community rebuilds that track the RHEL ecosystem without the subscription cost. If your organization already runs RHEL-family systems, staying in that ecosystem reduces the number of things your team needs to know and keeps you aligned with enterprise vendor support matrices.

Arch Linux is excellent for understanding Linux because it exposes more of the system directly and assumes less. That same quality makes it less forgiving operationally. On a production server, the question is not “Can Arch do this?” but “Do you want this server to require constant careful attention?” Usually the answer is no.

In practice, you should choose a distribution using three filters. First, does the software you need officially support it? Database vendors, security agents, and monitoring tools often document only a subset of distributions. Second, does your team already know it? A familiar distribution with boring tooling is often safer than a theoretically superior one nobody can operate confidently at 2 AM. Third, does its release model match the workload? Production usually wants slow change; experimentation often benefits from fast change.

Most distributions offer both a desktop and a server edition. A server edition omits the graphical desktop environment, reducing the installed package count, memory footprint, and attack surface. Server administration is then done almost entirely over SSH.

Pre-Install Planning

Resist the urge to boot the installer immediately. A few minutes of planning will save hours of rework later.

Purpose and Sizing

Start by writing down what the server will do, what software it must run, and who will operate it. A server running a web application might need CPU for request handling, enough RAM for the database buffer pool and the application runtime, and enough disk for the OS, application code, database files, and logs. But sizing is not just about hardware. It is also about software requirements and operational maturity. PostgreSQL, Docker, Kubernetes agents, antivirus tools, backup agents, and observability collectors all consume resources. Some software also imposes hard support requirements: a vendor may support only certain kernels, distributions, or CPU architectures.

A reasonable starting point for a small web application might be 2 vCPUs (virtual CPUs, the share of a physical processor allocated to a VM or cloud instance), 4 GB of RAM, and a 40 GB root disk. That is only a starting hypothesis. If the application stack includes Java, Elasticsearch, large caches, or a local database, that baseline may be far too small. If the team is inexperienced with Linux, a more conservative choice can also be wise: use a familiar LTS distribution, leave extra disk space for logs and mistakes, and avoid exotic layouts until you have a reason for them. You can usually resize later, especially in the cloud, but having a baseline prevents both over-provisioning (wasting money) and under-provisioning (dropping requests under load).
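To compare an existing machine against a baseline like this, a quick sketch such as the following reports the three numbers discussed above. It assumes a Linux system with GNU coreutils; nothing here is specific to any one distribution.

```shell
# Number of CPUs (or vCPUs) visible to this system.
echo "vCPUs:     $(nproc)"
# Total RAM; /proc/meminfo reports kB, so divide by 1024*1024 for GiB.
awk '/^MemTotal:/ { printf "RAM:       %.1f GiB\n", $2 / 1048576 }' /proc/meminfo
# Size and free space of the filesystem mounted at /.
df -hP / | awk 'NR == 2 { print "Root disk: " $2 " total, " $4 " available" }'
```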

Naming Conventions

Every server needs a hostname. In a small environment, a simple convention like purpose-environment-number works well: web-prod-01, db-staging-01. Avoid cute names (“gandalf”, “mordor”) in production; they are fun until you have forty servers and cannot remember which one runs the billing database. Set the hostname during installation or immediately after with sudo hostnamectl set-hostname <name>; verify the change took effect by running hostnamectl with no arguments.
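A naming convention is easier to keep when it is checked mechanically. The sketch below validates a name against the purpose-environment-number pattern; the function name valid_hostname and the allowed environment list (prod, staging, dev) are assumptions made for this example, not a standard.

```shell
# Return success if the name matches purpose-environment-number, e.g. web-prod-01.
valid_hostname() {
  printf '%s\n' "$1" | grep -Eq '^[a-z]+-(prod|staging|dev)-[0-9]{2}$'
}

name="web-prod-01"
if valid_hostname "$name"; then
  echo "ok: $name"
  # To apply and verify on a systemd-based host (requires root):
  #   sudo hostnamectl set-hostname "$name"
  #   hostnamectl
else
  echo "rejected: $name" >&2
fi
```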

The Boot Process

Understanding how a Linux server starts up helps you diagnose problems when it does not. Boot failures often look mysterious because the machine fails before you ever get a shell. In reality the process is a chain of handoffs: firmware hands off to a boot manager or bootloader, that component hands off to the kernel, the kernel hands off to the init system, and only then do the normal user-space services begin. If you know which stage failed, your debugging becomes much narrower.

The boot sequence has four major stages.

1. Firmware (UEFI or BIOS)

Firmware is the low-level software that starts before the operating system. On older PCs this was the BIOS (Basic Input/Output System). On modern systems it is UEFI (Unified Extensible Firmware Interface), which is more flexible, understands filesystems, and stores boot entries in non-volatile memory. UEFI does not load Linux directly. Instead, it looks in a special partition called the EFI System Partition (ESP) for a bootloader executable. The ESP is formatted as FAT32 because the UEFI specification requires a simple, universally readable filesystem that firmware vendors can implement consistently. FAT32 is not chosen because it is modern or secure; it is chosen because firmware can read it reliably before any Linux driver exists.

On a cloud VM, the same basic stages still exist, but they are partially hidden from you. An AWS EC2 instance is still a virtual machine with virtual firmware and a virtual disk image. You usually do not watch GRUB appear or manually partition an EFI System Partition because the cloud image already includes those pieces. The handoff chain is still there; the provider simply gives you a prepared disk image that has already been installed.
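From a running system you can usually tell which firmware path was taken: the kernel creates /sys/firmware/efi only when it was booted through UEFI. A minimal sketch follows; the function name boot_mode and its optional path argument are made up for illustration and testing.

```shell
# Report the boot mode based on the presence of the EFI directory in sysfs.
boot_mode() {
  if [ -d "${1:-/sys/firmware/efi}" ]; then
    echo "UEFI"
  else
    echo "BIOS (legacy)"
  fi
}

boot_mode
```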

2. Boot manager or bootloader

The firmware hands control to the next stage, but the exact component depends on the boot design. On many Linux systems that component is GRUB2 (GRand Unified Bootloader, version 2). Embedded systems frequently use U-Boot. On UEFI systems, though, you may instead encounter a lighter boot manager such as systemd-boot or rEFInd. Those do not provide the same kind of scripting and filesystem logic as GRUB. They primarily select and launch EFI executables, which may then load the kernel directly or hand off to another loader.

For this course, the important operational idea is that this stage chooses which kernel entry to start and supplies boot parameters. GRUB is popular because it is flexible and works across many hardware and filesystem combinations. Its configuration lives in /boot/grub/grub.cfg, but you should edit /etc/default/grub and regenerate the real config rather than modifying grub.cfg directly. New kernels are usually installed alongside older versions so that you can roll back if a new kernel causes problems.
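The regeneration step differs by family. The sketch below only prints the command that would apply on the current machine; it does not run it. The output paths are the usual defaults and are an assumption here: some systems, such as older RHEL releases booted through UEFI, keep grub.cfg elsewhere.

```shell
# Print the command that regenerates grub.cfg on this system, without running it.
grub_regen_cmd() {
  if command -v update-grub >/dev/null 2>&1; then
    echo "update-grub"                               # Debian/Ubuntu wrapper
  elif command -v grub2-mkconfig >/dev/null 2>&1; then
    echo "grub2-mkconfig -o /boot/grub2/grub.cfg"    # RHEL family
  elif command -v grub-mkconfig >/dev/null 2>&1; then
    echo "grub-mkconfig -o /boot/grub/grub.cfg"
  else
    echo "no GRUB tooling found" >&2
    return 1
  fi
}

grub_regen_cmd || true
```

The typical workflow is to edit /etc/default/grub, run the printed command with sudo, and reboot to confirm the change.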

3. Kernel initialization and the initramfs

The Linux kernel decompresses itself into memory, detects hardware, and mounts an initramfs (initial RAM filesystem), historically also called an initial ramdisk. This is a tiny temporary root filesystem stored in memory. Its job is to provide just enough user-space tooling and just enough drivers to let the kernel reach the real root filesystem on disk. This matters because many real root filesystems live behind layers that the bare kernel may not yet understand: NVMe controllers, RAID, encryption, Logical Volume Manager (LVM), network boot, or unusual storage drivers. Once the kernel can see and mount the real root filesystem, it discards the temporary one and switches over.
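You can look inside the initramfs of the running kernel to see the drivers and tools it carries. This sketch assumes one of two common toolchains: lsinitramfs (Debian and Ubuntu, from initramfs-tools) or lsinitrd (RHEL family, from dracut), with their usual image paths under /boot.

```shell
kver=$(uname -r)
if command -v lsinitramfs >/dev/null 2>&1 && [ -r "/boot/initrd.img-$kver" ]; then
  # Debian/Ubuntu naming: /boot/initrd.img-<kernel version>
  lsinitramfs "/boot/initrd.img-$kver" | head -n 20
elif command -v lsinitrd >/dev/null 2>&1 && [ -r "/boot/initramfs-$kver.img" ]; then
  # RHEL-family naming: /boot/initramfs-<kernel version>.img
  lsinitrd "/boot/initramfs-$kver.img" | head -n 20
else
  echo "no readable initramfs found for kernel $kver (it may require root to read)"
fi
```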

The root filesystem is the filesystem mounted at /. It is called “root” not because it belongs to the root user, but because it is the root of the entire directory tree. /etc, /var, /home, /usr, and every other path either live on that filesystem or are mounted somewhere beneath it. A machine may have many filesystems on many disks, but there is always exactly one filesystem mounted at / at boot.

4. Init system

Once the kernel has mounted the root filesystem, it starts PID 1, the first normal user-space process. On virtually all modern Linux distributions this process is systemd. The init system’s job is to bring the machine from “kernel is alive” to “multi-user server is ready”: mount remaining filesystems, start logging, bring up networking, start daemons, and supervise services after boot.

At this point it helps to define PID (Process ID) explicitly. Every running process in Linux has a numeric identifier assigned by the kernel. PID 1 is special because other user-space processes ultimately descend from it. If PID 1 fails, the system cannot operate normally. PID 0 is a kernel-internal scheduler context that you do not see as a regular process. PID 2 is typically kthreadd, which creates kernel threads on demand. If you run ps -p 1 -o pid,comm=, you can see which init system your machine is actually using.
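To make the ancestry idea concrete, the following sketch walks from the current shell up through its parents using the PPid field in /proc/<PID>/status. On a normal host the walk ends at PID 1; inside some containers it may stop earlier because the parent lives outside the container's view.

```shell
# Walk from the current shell toward PID 1, printing each PID and command name.
pid=$$
while [ "$pid" -ge 1 ]; do
  printf '%s\t%s\n' "$pid" "$(cat "/proc/$pid/comm")"
  if [ "$pid" -eq 1 ]; then
    break
  fi
  # PPid is the parent process ID; a value of 0 means there is no visible parent.
  pid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")
done
```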

Historical Note: MBR, GPT, and Why Old Booting Was So Constrained

Before UEFI and GPT (GUID Partition Table) became normal, PCs commonly booted using legacy BIOS and the MBR (Master Boot Record). The MBR is the first 512-byte sector of a disk. It holds both a tiny piece of boot code and the partition table. That design was historically important but severely constrained: 512 bytes is barely enough for a stub loader, and the classic MBR partition table supports only four primary partitions and, because it stores sector addresses as 32-bit values, cannot address more than 2 TiB on disks with 512-byte sectors.
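The two limits can be checked with shell arithmetic. The figures used here (446 bytes of boot code, four 16-byte partition entries, a 2-byte signature, and 32-bit sector addresses) are the conventional MBR layout and are not spelled out in the text above.

```shell
# The 512-byte sector: boot code + 4 partition entries + boot signature.
echo "$(( 446 + 4 * 16 + 2 )) bytes"      # 512 bytes

# Addressable capacity: 2^32 sectors of 512 bytes each.
sector_bytes=512
max_sectors=$(( 1 << 32 ))
limit_bytes=$(( max_sectors * sector_bytes ))
echo "$limit_bytes bytes"                 # 2199023255552 bytes
echo "$(( limit_bytes >> 40 )) TiB"       # 2 TiB
```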

Legacy BIOS disk layout:

+----------------------+  sector 0
| MBR boot code        |
| partition table      |
+----------------------+
| partition 1          |
| partition 2          |
| ...                  |
+----------------------+

UEFI with GPT separates these concerns. The partition table can be much larger and more robust, and the firmware loads a normal file from the EFI System Partition instead of squeezing logic into the first sector of the disk.

Linux Filesystems

Before partitioning a disk or mounting a root filesystem, you need to choose what type of filesystem to use. Linux offers more filesystem options than most operating systems, and they differ significantly in features and performance characteristics, so the choice has lasting consequences for performance, reliability, and day-to-day administration.

ext4 is the default filesystem for most Debian/Ubuntu installations and the workhorse of the Linux world. It is stable, well-understood, and works well for general-purpose servers. It supports large files and volumes and includes journaling to recover gracefully from crashes.

xfs is the default on RHEL-family distributions. It was designed for high-throughput workloads and scales well to multi-terabyte filesystems, very large files, and workloads that stream lots of data continuously. That is what administrators usually mean when they say XFS handles “large filesystems” well. They mean not just capacity in the abstract, but sustained performance and predictable behavior when the filesystem contains huge amounts of data.

ZFS began development at Sun Microsystems in 2001 and was released as open source with OpenSolaris in 2005; the OpenZFS project, founded in 2013, coordinates its open-source development today. It acts as both a filesystem and a volume manager, so it understands the storage stack end to end rather than treating the underlying disks as an opaque block device. Its major appeal is data integrity: checksums, snapshots, pooled storage, and strong administrative tooling. The tradeoff is complexity and higher resource appetite, especially RAM. ZFS is important and widely used, but the main point here is that it represents a different design philosophy from ext4 and XFS, not just “another format option.”

Btrfs (B-tree filesystem) was designed as a modern alternative to ext4 and shares several goals with ZFS. It is lighter on resources than ZFS and is the default filesystem for Fedora Workstation and openSUSE (Fedora Server defaults to XFS instead). Key features include copy-on-write semantics, snapshots, transparent automatic compression, and subvolumes (virtual partitions within a single Btrfs volume that can be mounted independently). Btrfs is generally considered less production-ready than ZFS for mission-critical storage but is a common choice for desktop machines and workstations.

For high-performance computing and distributed storage, Linux supports specialized clustered filesystems such as Lustre, BeeGFS, GPFS, and Ceph, but these are outside the scope of a typical server deployment.
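To see which of these filesystems a running machine actually uses, you can read the kernel's mount table. The sketch below is one way to do it; the function name fstype_of is made up for this example.

```shell
# Print the filesystem type mounted at a given mount point, read from /proc/mounts.
# If several entries share the mount point, the last (topmost) one wins.
fstype_of() {
  awk -v mp="$1" '$2 == mp { t = $3 } END { if (t) print t; else exit 1 }' /proc/mounts
}

fstype_of /        # for example: ext4, xfs, btrfs, or zfs
```

Inside a container the root filesystem often reports as overlay rather than one of the on-disk formats described above.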

Filesystem Layout and Disk Management

Linux organizes files according to the Filesystem Hierarchy Standard (FHS). Understanding that layout helps you decide where to put things and how to size your partitions. Before that, though, it helps to separate four concepts: disks, partitions, volumes, and filesystems.

Disks, Partitions, Volumes, and Filesystems

A block device is any device that reads and writes data in fixed-size chunks called blocks and supports random access rather than strictly sequential access. In Linux, block devices appear as device files such as /dev/sda, /dev/vda, or /dev/nvme0n1.

A storage device in this context is a block device intended to hold data. It can be a physical hard disk, SSD, USB flash drive, hardware RAID array, or a network-based block device presented from elsewhere. As long as Linux exposes it as a block device file, the operating system can treat it as storage.

A partition is a fixed-size slice of a storage device. Each partition gets its own device file and behaves like a smaller, independent disk. The partition table, whether MBR or GPT, is stored in a small reserved area on the device and tells the system where each partition begins and ends.

An LVM (Logical Volume Manager) layer sits above physical devices or partitions and lets administrators pool them into a volume group, then carve out flexible logical volumes from that pool. For example, you could combine a 6 TB disk and a 2 TB disk into one 8 TB volume group, then create two separate 4 TB logical volumes from it. The practical advantage is flexibility: logical volumes can often be resized or snapshotted much more easily than traditional fixed partitions.
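The two-disk example above might look like the following sketch. The device names (/dev/sdb, /dev/sdc) and the volume names are placeholders, and the commands are destructive on real disks, so the helper run prints each command instead of executing it unless DRY_RUN=0 is set. Percentages are used instead of exact sizes because LVM metadata leaves slightly less than the nominal 8 TB available.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set explicitly.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run pvcreate /dev/sdb /dev/sdc                  # mark both disks as LVM physical volumes
run vgcreate vg_data /dev/sdb /dev/sdc          # pool them into one volume group
run lvcreate -l 50%VG -n lv_first vg_data       # first logical volume: half the pool
run lvcreate -l 100%FREE -n lv_second vg_data   # second: all remaining space
run lvs vg_data                                 # list the resulting logical volumes
```

Each logical volume then appears as a block device (for example /dev/vg_data/lv_first) that can be formatted and mounted like a partition.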

RAID (Redundant Array of Independent Disks) combines multiple storage devices into a single virtual device. Depending on the RAID level, the goal may be better performance through parallelism, better reliability through mirroring or parity, or some combination of both. RAID can be implemented in hardware or software and works with block devices generally, not just with raw physical drives.

A filesystem is the top software layer that organizes the raw blocks provided by a partition, logical volume, RAID device, or plain disk into a usable structure. The filesystem is what gives you filenames, directories, permissions, metadata, and recovery behavior after a crash. Most user data lives inside filesystems, but not all storage does. A swap partition, for example, is used by the kernel directly with no filesystem on it, and some specialized applications also manage raw blocks directly for performance or control reasons.

Figure: Storage Management Layers. The relationship between block devices, partitions, volumes, filesystems, and mount points.

Those layers matter because the commands and the failure modes differ. Partitioning changes the layout of a device. RAID or LVM may create new virtual block devices on top of lower ones. Formatting writes a filesystem onto whichever layer will store files. Mounting attaches that formatted filesystem to some directory in the live directory tree. If you confuse those stages, it becomes hard to reason about what a command is actually doing.

Partitioning Strategy and Swap

Before a fresh Linux install can store files, the disk needs a layout. On a simple UEFI server, a common beginner-friendly scheme is three partitions: a small EFI System Partition for boot files, a swap area, and one large root partition for everything else. That layout is popular because it is easy to understand and hard to mismanage.

Swap is disk space the kernel can use as an overflow area for memory pages that are not currently active. Swap is much slower than RAM, so it is not a performance feature. It is a pressure-relief valve. A small amount of swap can keep a machine alive long enough to recover from a memory spike instead of immediately killing processes. Some systems use a dedicated swap partition; others use a swap file created inside an existing filesystem. Both are valid. A swap partition is simple and traditional. A swap file is easier to resize later.
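The first two commands below show what swap a machine currently has. The commented steps sketch how a swap file is commonly created; the 2 GiB size and the /swapfile path are arbitrary choices for illustration, and the steps are left as comments because they require root and modify the system.

```shell
# Active swap areas (partitions and files); only a header line appears if none are enabled.
cat /proc/swaps
# RAM and swap totals as the kernel reports them, in kB.
grep -E '^(MemTotal|SwapTotal|SwapFree):' /proc/meminfo

# Creating and enabling a swap file (run as root):
#   dd if=/dev/zero of=/swapfile bs=1M count=2048   # allocate 2 GiB
#   chmod 600 /swapfile                             # swap contents must not be world-readable
#   mkswap /swapfile                                # write the swap signature
#   swapon /swapfile                                # start using it
```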

You can absolutely divide the disk more aggressively, with separate filesystems for /var, /home, /tmp, or database data. That can be useful when you want isolation: if logs in /var/log explode, they cannot fill the entire root filesystem. The tradeoff is planning overhead. For a first installation or a small general-purpose server, one EFI partition, one swap area, and one large root filesystem is a sensible starting point.

Key Directories

Once the root filesystem is mounted, Linux expects the directory tree to follow the FHS conventions. The table below covers the directories you will encounter most often. Remember that these are paths in one unified tree. Some may live on the root filesystem and others may be separate mounted filesystems, but they all appear under the same / root.

Path	Purpose
`/`	The root of the entire filesystem tree
`/boot`	Kernel images (`vmlinuz*`), initial ramdisk (`initramfs`), bootloader files; the EFI System Partition is typically mounted at `/boot/efi`
`/bin`	Essential binaries available to all users (`cat`, `kill`, `ping`, `mount`, `passwd`)
`/sbin`	System binaries for booting, restoring, and repairing (`fdisk`, `fsck`, `useradd`)
`/usr`	Unix System Resources: installed programs, libraries, documentation, source code
`/usr/bin`	Most general-purpose user binaries (`grep`, `ls`, `curl`, `chmod`)
`/usr/sbin`	System-administration binaries typically run by root (`chroot`, `shutdown`)
`/usr/local/bin`	Locally compiled or manually installed binaries
`/etc`	System-wide configuration files (see below for notable examples)
`/home`	User home directories
`/root`	Home directory for the root user
`/lib`	Shared libraries and kernel modules
`/dev`	Device files (see below)
`/proc`	Virtual filesystem exposing kernel and process state
`/var`	Variable data: logs (`/var/log`), databases, mail, caches
`/tmp`	Temporary files, often cleared on reboot
`/opt`	Optional third-party software packages
`/media`	Mount points for removable media (USB drives, optical discs)

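On modern distributions that have adopted the merged-/usr layout (usr-merge), /bin, /sbin, and /lib are symlinks into /usr (for example, /bin -> /usr/bin), so the split between the /bin and /usr/bin rows above is largely historical.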
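A short loop shows whether the machine in front of you uses the merged layout:

```shell
# Report, for each legacy top-level directory, whether it is a symlink into /usr.
for d in /bin /sbin /lib; do
  if [ -L "$d" ]; then
    echo "$d -> $(readlink "$d")"
  else
    echo "$d is a real directory"
  fi
done
```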

Application code typically lives in /opt/<appname>, database files are managed by their respective services under /var/lib/, and logs accumulate in /var/log.

`/etc`

/etc is where almost all system configuration lives. Some frequently referenced files include:

File	Purpose
`/etc/passwd`	User account information (username, UID, home dir, shell)
`/etc/shadow`	Password hashes and aging information (root-readable only)
`/etc/group`	Group definitions
`/etc/sudoers`	Defines who may use `sudo` (edit with `visudo`)
`/etc/hosts`	Static hostname-to-IP mappings
`/etc/fstab`	Filesystem mount table, read at boot
`/etc/shells`	List of permitted login shells
`/etc/os-release`	Distribution identification (name, version, ID)
`/etc/apt/sources.list`	APT repository definitions (Debian/Ubuntu)
`/etc/yum.repos.d/`	YUM/DNF repository definitions (RHEL family)

`/dev`

In Linux, hardware devices are represented as files in /dev. Hard drives appear as /dev/sda, /dev/sdb, and so on (or /dev/nvme0n1 for NVMe drives). In addition to real hardware, a few special pseudo-devices are useful in scripting:

Path	Behavior
`/dev/null`	Discards everything written to it; reads return EOF
`/dev/zero`	Returns an endless stream of null bytes on read
`/dev/random`	Returns cryptographically random bytes
`/dev/tty`	Refers to the current terminal; writing to it outputs to screen
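The behavior of the first three pseudo-devices in the table is easy to observe directly. This is a small sketch using standard tools (head, wc, od):

```shell
# /dev/null: written output vanishes, and reading yields immediate end-of-file.
echo "discarded" > /dev/null
wc -c < /dev/null                      # 0

# /dev/zero: an endless source of zero bytes; take exactly 4 and show them in hex.
head -c 4 /dev/zero | od -An -tx1      # 00 00 00 00

# /dev/random: 8 random bytes in hex, different on every run.
head -c 8 /dev/random | od -An -tx1
```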

`/proc`

/proc (the process information virtual filesystem) is not stored on disk; the kernel generates its contents on demand. Each running process has a subdirectory /proc/<PID>/ containing virtual files such as cmdline and status, plus file descriptor links under /proc/<PID>/fd/ (for example, file descriptors 0, 1, and 2 correspond to stdin, stdout, and stderr). The /proc/sys/ subtree exposes many kernel tuning parameters. Because much of the raw data is terse and hard to read directly, higher-level tools like top, ps, and htop parse it for you. For example, /proc/1/stat contains the state information for PID 1 (systemd) as a single line of space-separated fields.
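The files described above can be read with ordinary tools. The sketch below inspects the current shell's own entry ($$ expands to the shell's PID); pid_max is used as one example of a tunable under /proc/sys.

```shell
# Command line of the current shell; arguments are NUL-separated, so translate for display.
tr '\0' ' ' < "/proc/$$/cmdline"; echo
# Name, state, and parent PID from the human-readable status file.
grep -E '^(Name|State|PPid):' "/proc/$$/status"
# Open file descriptors: 0, 1, and 2 are stdin, stdout, and stderr.
ls -l "/proc/$$/fd"
# A kernel tunable exposed under /proc/sys: the largest PID the kernel will assign.
cat /proc/sys/kernel/pid_max
```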

Inspecting Disks and Usage

Three commands form the core of daily storage inspection. lsblk lists all block devices and their partitions in a tree view, showing sizes, device names, and current mount points. df -h reports how much space each mounted filesystem has consumed and how much remains, which is a quick health check for mounted storage. du -sh <path> drills into a specific directory to measure its total footprint, which is useful when /var/log or a database directory is growing faster than expected.
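In practice the three commands are often used together, moving from devices to filesystems to directories. The helper top_dirs below is a hypothetical convenience wrapper around du that relies on GNU du and sort options.

```shell
# Block devices and partitions as a tree (may be unavailable in minimal containers).
lsblk 2>/dev/null || echo "lsblk unavailable"
# Space used and free on the root filesystem.
df -hP /

# Show the N largest immediate subdirectories of a path (defaults: /var, 5 entries).
top_dirs() {
  du -h --max-depth=1 "${1:-/var}" 2>/dev/null | sort -h | tail -n "${2:-5}"
}

top_dirs /var 5
```

Without root, du skips directories it cannot read, so totals for /var may be understated.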

Mount Points

In Linux, a filesystem becomes usable only when it is mounted at some directory in the tree. Mounting does not copy data into that directory. It attaches a filesystem so that the directory becomes the entry point into it. The /etc/fstab file defines which filesystems are mounted automatically at boot.

This is where the earlier distinction between formatting and mounting matters. A command like mkfs.ext4 /dev/vdb1 writes an ext4 filesystem onto a partition. A command like mount /dev/vdb1 /srv/data then attaches that filesystem at /srv/data. Until you mount it, the filesystem exists on disk but is not part of the live directory tree.

Separating data-intensive directories onto their own volume isolates their growth from the root filesystem and allows independent backup policies. Use the device’s UUID (retrieved with blkid) rather than the device-name path like /dev/vdb1 in /etc/fstab, because device names can shift when disks are added or removed.
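A UUID-based entry might be assembled as in the sketch below. The function name fstab_line is made up, the UUID is a placeholder rather than a real device, and the mount options (defaults, 0, 2) are common choices for a data volume, not the only valid ones.

```shell
# Build an /etc/fstab line from a UUID, mount point, and filesystem type.
fstab_line() {
  printf 'UUID=%s  %s  %s  defaults  0  2\n' "$1" "$2" "$3"
}

# On a real system the UUID would come from: blkid -s UUID -o value /dev/vdb1
fstab_line "0a1b2c3d-1111-2222-3333-444455556666" /srv/data ext4
```

After adding a line to /etc/fstab, running sudo mount -a mounts everything listed and surfaces mistakes immediately rather than at the next reboot.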

Monitoring disk usage is an ongoing responsibility. A full /var partition (common when logs grow unchecked) can cause services to crash or refuse to start. Set up log rotation and keep an eye on df -h output regularly.
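A simple threshold check can be built on df. The sketch below reads df -P style output on standard input so the filtering logic can be exercised with canned data; the function name over_threshold and the 80 percent limit are illustrative assumptions.

```shell
# Print every filesystem whose usage is at or above a threshold percentage.
# Expects "df -P" output on stdin: column 5 is capacity, column 6 is the mount point.
over_threshold() {
  awk -v limit="$1" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= limit) print $6, $5 }'
}

df -P | over_threshold 80
```

No output means nothing is above the limit. A script like this could be run from a scheduled job, though a real monitoring system is the better long-term answer.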

With storage laid out and mount points established, the next question is who (and what processes) should be allowed to access it.

User and Group Management

A freshly installed Linux server has a root account and the admin user you created during installation. Before deploying any application, you need to think about who (and what) needs access to this machine.

Why Users and Groups Matter

Linux enforces access control through ownership: every file, directory, process, and socket belongs to a user and a group. The kernel checks these identities for every operation. On a server, this matters in two ways.

First, service accounts are unprivileged system users that own the processes running your services. For example, nginx is a widely used web server and reverse proxy. Its master process starts as root so it can bind privileged ports, but the worker processes that handle requests run as an unprivileged user: www-data on Debian and Ubuntu, nginx on RHEL-family systems. If an attacker exploits a vulnerability in a worker, they land in a shell as www-data, a user with very limited capabilities, rather than as root. System accounts like www-data typically have UIDs below 1000 on many distributions (the exact range is distro-defined), have no login shell (set to /sbin/nologin or /bin/false), and often have no home directory. Creating them with useradd -r signals this intent.

Second, groups manage shared access. If both your application process and your nginx process need to read the same set of files, you add both service accounts to a common group and set group read permissions on those files. This avoids giving more permissions than necessary to either account.

The principle behind all of this is least privilege: every user and process should have exactly the access it needs and nothing more.
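Put together, a service account plus a shared group might be set up as in this sketch. The account name appsvc, the group webcontent, and the path /srv/site are hypothetical, and the helper run prints the commands rather than executing them unless DRY_RUN=0 is set, since they require root.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set explicitly.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run useradd -r -s /sbin/nologin appsvc     # system account with no login shell
run groupadd webcontent                    # group for the shared files
run usermod -aG webcontent appsvc          # add the application account
run usermod -aG webcontent www-data        # add the web server account
run chgrp -R webcontent /srv/site          # give the group ownership of the files
run chmod -R g+rX,o-rwx /srv/site          # group may read and traverse; others get nothing
```

The capital X in g+rX grants execute only on directories (and on files that are already executable), which keeps plain data files non-executable.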

File Permissions

Linux controls file access through a permission model built into every file and directory. Each object has an owner (a single user), an owning group, and a set of permission bits that specify what the owner, the group, and everyone else can do.

The three permission classes are owner (u), owning group (g), and others (o). Each class gets three bits: read (r, value 4), write (w, value 2), and execute (x, value 1). For a regular file, execute means the file can be run as a program; for a directory, execute means the directory can be traversed (entered with cd, and its entries accessed by name), while listing its contents requires read permission. Permissions are commonly written in octal: 755 means the owner can read, write, and execute (4+2+1=7) while the group and others can only read and execute (4+0+1=5). Common values to internalize: 644 for files readable by everyone but writable only by the owner; 600 for owner-only readable files such as private keys; 700 for owner-only directories.

The chmod command changes permissions (chmod 640 /etc/app.conf grants owner read/write, group read, no access for others). The chown command transfers ownership: chown deploy:www-data /opt/webapp sets both the owning user and group in one step. A common server pattern is 640 with deploy:www-data ownership on application files, giving the deploy user read/write access and the web server read-only access while excluding all other users.
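The octal and symbolic notations can be compared safely on a scratch file. This sketch uses GNU stat, where %a prints the octal mode and %A the symbolic form; chown is omitted because changing ownership requires root.

```shell
f=$(mktemp)                  # scratch file, removed at the end

chmod 640 "$f"
stat -c '%a %A' "$f"         # 640 -rw-r-----

chmod 755 "$f"
stat -c '%a %A' "$f"         # 755 -rwxr-xr-x

chmod u=rw,go=r "$f"         # symbolic equivalent of 644
stat -c '%a %A' "$f"         # 644 -rw-r--r--

rm -f "$f"
```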

Creating Users

The useradd command creates a new user account. On Debian-based distributions, the friendlier adduser wrapper is also available, but understanding the lower-level command matters:

# Create a user with a home directory and bash as the default shell
sudo useradd -m -s /bin/bash deploy

# Set the user's password
sudo passwd deploy

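The -m flag creates the home directory (/home/deploy), and -s sets the login shell. Without -m, whether a home directory is created depends on the CREATE_HOME setting in /etc/login.defs; on Debian and Ubuntu it is not created by default, which is a common source of confusion.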

Key Files

User account information is stored in two files:

/etc/passwd contains one line per user with fields separated by colons: username, a placeholder for the password, UID, GID, comment (full name), home directory, and shell.
/etc/shadow contains the actual password hashes and password aging information. This file is readable only by root.

You can inspect a user’s entry with:

getent passwd deploy # deploy:x:1001:1001::/home/deploy:/bin/bash
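To make the seven colon-separated fields concrete, the sketch below splits the sample entry shown above into named shell variables. The entry is the illustrative one from this section, not output from a real system; on a live server you would feed it from getent passwd deploy instead.

```shell
# Sample entry copied from the example above (UID and GID are illustrative)
entry='deploy:x:1001:1001::/home/deploy:/bin/bash'

# Split on colons; the empty comment field is preserved as an empty string
IFS=: read -r name placeholder uid gid gecos home shell <<EOF
$entry
EOF

printf 'user=%s uid=%s gid=%s home=%s shell=%s\n' \
    "$name" "$uid" "$gid" "$home" "$shell"
```

The x in the second field is the placeholder indicating that the real hash lives in /etc/shadow.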

Modifying Users and Groups

The usermod command changes an existing account. One of its most common uses is adding a user to supplementary groups:

# Add the deploy user to the www-data group
sudo usermod -aG www-data deploy

The -aG flags mean “append to the supplementary group list.” Forgetting the -a replaces all supplementary groups, which can lock a user out of resources they need.
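After changing group membership it is worth verifying the result. Note that a process keeps the groups it started with, so the user must log in again before a new group takes effect in their session; id -nG deploy reads the account database and shows the new membership immediately. The helper below is a small sketch (has_group is a made-up name) that checks whether a group appears in such a list.

```shell
# has_group GROUP NAME...: succeed if GROUP is one of the listed group names
has_group() {
    wanted=$1
    shift
    for group in "$@"; do
        [ "$group" = "$wanted" ] && return 0
    done
    return 1
}

# On a real server the list would come from the account database:
#   has_group www-data $(id -nG deploy)
# Here the list is hard-coded so the example runs anywhere.
if has_group www-data deploy www-data; then
    echo "deploy is in www-data"
fi
```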

To create a group explicitly:

sudo groupadd <group_name>

Privilege Escalation with su and sudo

Administrative work raises an unavoidable question: how do you become root when you need to? Historically the answer was su, short for substitute user. Running su starts a shell as another user, usually root, after you enter that user’s password. This works, but it centralizes power in the root password and makes accountability weaker because multiple administrators may share the same credential.

Modern Linux systems usually prefer sudo, originally short for superuser do. sudo lets a permitted user run one command as root, or as another user, using their own password and a policy defined in /etc/sudoers. That policy can be broad or narrow. You can allow full root access, or only allow a specific operator to restart one specific service. This is why sudo is so common in multi-user administration: it gives you auditing and finer-grained delegation.

One caveat matters in practice: not every distribution enables sudo by default. Ubuntu does. Minimal Debian installs often do not unless you selected it during installation. On such systems, you may need to become root with su first and then install and configure sudo.

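The /etc/sudoers file is the policy database that tells sudo who may do what. Each rule has the form user host=(target users) commands. A rule like deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx means the user deploy, on any host where this sudoers file applies, may run exactly that command as any target user without being prompted for a password. Always edit the file through visudo, which locks it and checks the syntax before saving, because a syntax error in sudoers can lock every administrator out of sudo: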

# Open the sudoers file safely
sudo visudo

Add a line that lets the deploy user restart nginx without a password prompt:

deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx
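Many administrators keep per-user rules in a separate file rather than editing /etc/sudoers directly. The sketch below assumes the distribution’s sudoers file includes /etc/sudoers.d/ (Debian and Ubuntu do by default); the file name deploy is an arbitrary choice. It writes the rule to a temporary file and syntax-checks it with visudo -cf before anything is installed.

```shell
rule='deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx'

# Write the candidate rule to a scratch file, not to /etc
candidate=$(mktemp)
printf '%s\n' "$rule" > "$candidate"

# visudo -c checks syntax only; -f points it at our scratch file
if command -v visudo >/dev/null 2>&1; then
    visudo -cf "$candidate" || echo "syntax check failed; do not install this file"
else
    echo "visudo not found; skipping syntax check"
fi

# Install step, to be run as root once the check passes:
#   install -m 0440 -o root -g root "$candidate" /etc/sudoers.d/deploy
```

File names inside /etc/sudoers.d/ must not contain a dot or end in a tilde, or sudo skips them.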

SSH Key Authentication

SSH (Secure Shell) is the standard protocol for remote administration on Linux. It gives you an encrypted terminal session over the network, and it can also transfer files and forward ports. When people say they “SSH into a server,” they mean they authenticate to the SSH daemon on that server and receive a shell over an encrypted channel.

Password authentication over SSH is convenient but weak because Internet-exposed servers are constantly probed for password logins. SSH key authentication uses asymmetric cryptography instead. A key pair consists of a private key that the client keeps secret and a public key that is placed on the server. During authentication the client proves it holds the private key through a cryptographic challenge without ever transmitting it over the network. That is the core conceptual improvement: the server verifies possession of a secret without learning the secret itself.

Key pairs are generated with ssh-keygen. Ed25519 is the recommended algorithm: ssh-keygen -t ed25519 produces ~/.ssh/id_ed25519 (private, never shared) and ~/.ssh/id_ed25519.pub (public, safe to distribute). The public key is appended to ~/.ssh/authorized_keys on the server, either with ssh-copy-id or by pasting it in manually. File permissions on the server side are enforced by the SSH daemon itself: with StrictModes enabled (the default), sshd ignores authorized_keys if the home directory, ~/.ssh, or the file itself is writable by anyone other than the owner. The conventional safe values are mode 700 for ~/.ssh and mode 600 for authorized_keys. When the check fails, the client simply sees key authentication fail; the reason appears only in the server’s authentication log.
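The whole key setup can be rehearsed in a scratch directory before touching a real account. The sketch below is a demonstration under stated assumptions: it uses a temporary directory in place of a home directory, an empty passphrase (-N "") purely so it runs unattended, and GNU stat. For a real key, choose a passphrase and let ssh-keygen write to ~/.ssh.

```shell
# Stand-in for a home directory; nothing under the real ~/.ssh is touched
demo_home=$(mktemp -d)
mkdir -m 700 "$demo_home/.ssh"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -q quiet, -N "" empty passphrase (demo only), -C comment, -f output path
    ssh-keygen -q -t ed25519 -N "" -C "demo@example" -f "$demo_home/.ssh/id_ed25519"
    # What ssh-copy-id does on the server side: append the public key
    cat "$demo_home/.ssh/id_ed25519.pub" >> "$demo_home/.ssh/authorized_keys"
else
    echo "ssh-keygen not found; creating an empty authorized_keys for the demo"
    : > "$demo_home/.ssh/authorized_keys"
fi

chmod 600 "$demo_home/.ssh/authorized_keys"

ssh_dir_mode=$(stat -c '%a' "$demo_home/.ssh")
auth_mode=$(stat -c '%a' "$demo_home/.ssh/authorized_keys")
echo ".ssh is $ssh_dir_mode, authorized_keys is $auth_mode"
```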

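Once key authentication is confirmed to work, password authentication should be disabled by setting PasswordAuthentication no in /etc/ssh/sshd_config and reloading the daemon (service name ssh on Debian/Ubuntu, sshd on RHEL-family). If the main file includes drop-ins from /etc/ssh/sshd_config.d/, check those as well, because sshd uses the first value it reads for each keyword and an included file can therefore override your setting. Keep your current session open and confirm that a new key-based login works before you log out.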
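The edit itself is a one-line change, and it can be scripted. The sketch below rehearses it on a sample file rather than the real /etc/ssh/sshd_config; the three sample lines are illustrative, not copied from any particular distribution.

```shell
# A small stand-in for /etc/ssh/sshd_config
sample=$(mktemp)
cat > "$sample" <<'EOF'
#PermitRootLogin prohibit-password
#PasswordAuthentication yes
PubkeyAuthentication yes
EOF

# Replace the PasswordAuthentication line, whether commented out or not
updated=$(mktemp)
sed -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]].*/PasswordAuthentication no/' \
    "$sample" > "$updated"

cat "$updated"
```

On a real server, sudo sshd -t validates the configuration before the reload, and sudo sshd -T prints the effective settings so you can confirm that passwordauthentication is reported as no.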

Network Configuration Basics

A server that cannot reach package repositories or accept remote administration is not very useful. At this stage, though, you only need the minimum networking concepts required to get a Linux server installed and reachable. The deeper topics, including subnetting, routing logic, DNS behavior, and cloud firewalls, will be discussed later.

Viewing the Current Configuration

The ip command is the modern tool for inspecting network interfaces:

# Show all interfaces and their IP addresses
ip addr show

# Show just the routing table
ip route show

You will see at least two interfaces: lo (the loopback interface, always 127.0.0.1) and one or more physical or virtual interfaces (commonly named eth0, ens3, enp0s3, or similar).
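The output of ip route show is regular enough to script against. The sketch below extracts the default gateway and outgoing interface from a sample default-route line; the addresses come from the 192.0.2.0/24 documentation range and the line is an illustration, not real output from your machine.

```shell
# Illustrative line in the format printed by: ip route show default
sample_route='default via 192.0.2.1 dev eth0 proto dhcp src 192.0.2.10 metric 100'

# The gateway is the word after "via" (third field on a default route line)
gateway=$(printf '%s\n' "$sample_route" | awk '$1 == "default" { print $3; exit }')

# The interface is the word after "dev"
device=$(printf '%s\n' "$sample_route" |
    awk '$1 == "default" { for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }')

echo "default gateway: $gateway via interface: $device"
```

On a live system, replace the printf with ip route show default. The more compact ip -brief addr show is also handy for a one-line-per-interface overview.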

Hostname, DHCP, and Basic Reachability

During an installation or first boot, you typically need three things to work: the machine needs a hostname, the primary interface needs an IP configuration, and DNS needs to resolve package mirrors or remote hosts. Many server installs start with DHCP (Dynamic Host Configuration Protocol) because it is the fastest way to get an address, default route, and DNS server automatically. Production servers often move to static addressing or DHCP reservations later so other systems can depend on a stable address.

The exact configuration tool varies by distribution. Ubuntu Server uses Netplan, which typically renders its configuration to systemd-networkd; Debian has traditionally used ifupdown with /etc/network/interfaces, though cloud images may ship Netplan or systemd-networkd instead; RHEL-family systems use NetworkManager. The important concept at this point is not the exact syntax. It is that Linux needs an interface configuration, a route to the outside world, and a resolver configuration before remote administration and package installation feel normal.

Hostname and /etc/hosts

We already set the hostname with hostnamectl. That command writes a short label (for example web-prod-01) to /etc/hostname, which is the name the kernel uses to identify the machine locally. Many programs and protocols also expect a Fully Qualified Domain Name (FQDN) that includes the domain, for example web-prod-01.example.com. The FQDN is the machine’s globally unambiguous network identity: the name DNS has a record for, and what TLS certificates, mail servers, and other services use to refer to this machine. The short hostname alone is not externally resolvable; it carries no domain context. Use /etc/hosts to map the machine’s own address to both names so they resolve locally before DNS is ever consulted.

Each non-comment line has the format:

IP_address  canonical_hostname  [alias ...]

Fields are separated by any amount of whitespace (spaces or tabs). Lines starting with # are comments. A minimal server entry looks like this:

# /etc/hosts
127.0.0.1       localhost
127.0.1.1       web-prod-01.example.com web-prod-01
::1             localhost ip6-localhost ip6-loopback

The first hostname on a line is the canonical name: the one authoritative name the system returns when a program resolves the machine’s full identity. Any additional hostnames on the same line are aliases that reach the same address but are not returned as the canonical name. Ordering matters because of how resolution works: when a program resolves the machine’s own name, it calls gethostname() to get the short label (web-prod-01), and the C library’s resolver looks that name up in /etc/hosts, finds the matching line, and returns the first hostname on that line as the canonical name. Placing the FQDN first means programs that perform canonical-name lookups (mail software, TLS libraries, Java’s InetAddress.getCanonicalHostName()) receive web-prod-01.example.com rather than the bare short label.

The 127.0.1.1 line is the conventional place for a server’s own FQDN on Debian/Ubuntu systems where the machine does not have a permanent static IP. On RHEL-family systems or machines with a static IP, the real IP address is typically used instead of 127.0.1.1. The ::1 line provides the equivalent loopback mapping for IPv6.

Together, these entries ensure the server can resolve its own hostname and FQDN locally even if DNS is temporarily unavailable.
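The canonical-name rule is easy to demonstrate on a copy of the sample file. The sketch below is a simplified imitation of the lookup, not the resolver’s real implementation: the function canonical_name is invented for this example, and it ignores trailing comments on entry lines.

```shell
# A scratch copy of the sample /etc/hosts shown above
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
# sample /etc/hosts
127.0.0.1       localhost
127.0.1.1       web-prod-01.example.com web-prod-01
::1             localhost ip6-localhost ip6-loopback
EOF

# canonical_name NAME FILE: find the first line listing NAME and
# print the first hostname on that line (field 2, after the address)
canonical_name() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $2; exit } }
    ' "$2"
}

canonical_name web-prod-01 "$hosts_file"    # web-prod-01.example.com
canonical_name ip6-loopback "$hosts_file"   # localhost
```

On the server itself, getent hosts web-prod-01 and hostname -f exercise the real resolver and should agree with this result.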

If you can resolve your own hostname, acquire an IP address, and reach a package mirror or another known host, that is enough networking to continue with basic server setup.

Package Management

Once users and networking are in place, you need to install software. Linux distributions use package managers to install, update, and remove software from curated repositories.

apt (Debian/Ubuntu)

Debian and Ubuntu use apt, which downloads .deb packages from repositories defined in /etc/apt/sources.list and /etc/apt/sources.list.d/ (Ubuntu 24.04 and newer use the deb822 format, with sources in .sources files inside that directory). The standard workflow is three commands used in sequence: sudo apt update refreshes the local package index from all configured repositories; sudo apt install <name> resolves and installs a package along with its dependencies; and sudo apt upgrade applies available security and feature updates to all installed packages. For exploration, apt search <term> queries the index by keyword and apt show <name> prints full metadata about a package, including its version, description, and dependencies.

dnf (RHEL/AlmaLinux/Rocky)

On Red Hat-family distributions, the equivalent commands use dnf:

sudo dnf check-update        # refresh metadata and list pending updates (closest to apt update)
sudo dnf install nginx       # install a package
sudo dnf upgrade             # upgrade all packages
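Scripts that must run on both families usually start by detecting which package manager is present. The following is a minimal sketch; the function name detect_pkg_manager is invented here, and it only distinguishes the two managers covered in this lecture.

```shell
# Print "apt", "dnf", or "unknown" depending on which tool is installed
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo apt
    elif command -v dnf >/dev/null 2>&1; then
        echo dnf
    else
        echo unknown
    fi
}

pm=$(detect_pkg_manager)
case "$pm" in
    apt) echo "refresh: sudo apt update; upgrade: sudo apt upgrade" ;;
    dnf) echo "refresh and upgrade: sudo dnf upgrade --refresh" ;;
    *)   echo "no supported package manager found" ;;
esac
```

The sketch tests for apt-get rather than apt because apt-get is the interface intended for scripts; the apt command is aimed at interactive use.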

Universal Package Formats

Distribution-specific package managers require the package to be built for that distribution. Three cross-distribution formats address this limitation:

Flatpak: packages run in sandboxed containers with their own dependencies bundled. Widely used for desktop applications and available on most distributions.
Snap: developed by Canonical. Packages (called snaps) bundle all dependencies and run with strict confinement. Integrated with Ubuntu and available on other distributions.
AppImage: a single self-contained executable that runs on any Linux distribution without installation. Users just download, mark executable, and run.

These formats trade some efficiency for portability. On a server, distribution packages (apt, dnf) are almost always preferable because they integrate with the system’s security update mechanisms.

Repositories and Pinning

Sometimes the version of a package in the default repositories is not the one you need. Third-party repositories supply newer or specialized builds; on Ubuntu these include PPAs (Personal Package Archives), which are hosted on Launchpad. Vendors typically provide a setup script that registers their signing key and repository definition; the pattern is common but the script should always be reviewed before running it with elevated privileges.

If you need to prevent a package from being upgraded automatically (to keep a specific PostgreSQL version, for example), sudo apt-mark hold <package> pins it at its current version; sudo apt-mark unhold <package> releases the pin later.

Systemd Services

With packages installed, you need a way to start services, stop them, restart them on failure, order them correctly at boot, and inspect their logs. On modern Linux, that job is handled by systemd. systemd is both the init system that owns PID 1 and the service manager that supervises long-running background processes, traditionally called daemons.

This matters because a server is not just “a machine with software installed.” It is a machine whose services come up predictably after reboot and stay supervised while running. systemctl is the administrative interface to that system.

Basic Service Management

The operations you will reach for constantly are checking a service’s current state, starting or stopping it, restarting it after a configuration change, and controlling whether it comes up automatically at boot.

# Check the status of nginx
sudo systemctl status nginx

# Start, stop, and restart
sudo systemctl start nginx
sudo systemctl stop nginx
sudo systemctl restart nginx

# Reload configuration without dropping connections
sudo systemctl reload nginx

To make a service start automatically at boot:

sudo systemctl enable nginx

To prevent it from starting at boot:

sudo systemctl disable nginx

Combining enable and start in one command:

sudo systemctl enable --now nginx
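systemctl status is meant for people; scripts are better served by systemctl is-active and systemctl is-enabled, which print a single word and set an exit code. The sketch below wraps is-active in a helper (service_state is a made-up name) that falls back to "unknown" on machines without a running systemd, such as many containers.

```shell
# Print the unit's state ("active", "inactive", "failed", ...) or "unknown"
service_state() {
    state=""
    if command -v systemctl >/dev/null 2>&1; then
        state=$(systemctl is-active "$1" 2>/dev/null)
    fi
    printf '%s\n' "${state:-unknown}"
}

for unit in nginx postgresql; do
    echo "$unit: $(service_state "$unit")"
done
```

Because is-active exits with status 0 only when the unit is active, a script can also write: systemctl is-active --quiet nginx || echo "nginx is down".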

Unit Files

Systemd services are defined by unit files, typically stored in /usr/lib/systemd/system/ (distribution-provided) or /etc/systemd/system/ (administrator overrides). A systemd unit file typically has three sections: [Unit] declares what this service is and what it depends on; [Service] defines how to run it (which user, which binary, restart policy); and [Install] tells systemd which target (boot state) should pull this service in when enabled. Strictly speaking, only the type-specific section ([Service] here) is required; [Unit] is included in practice for the description and dependencies, and [Install] is needed only when you want to enable or disable the unit with systemctl. A unit file for a Node.js application might look like this:

# /etc/systemd/system/webapp.service
[Unit]
Description=Node.js Web Application
Wants=network-online.target
Requires=postgresql.service
After=network-online.target postgresql.service

[Service]
Type=simple
User=deploy
Group=deploy
WorkingDirectory=/opt/webapp
ExecStart=/usr/bin/node server.js
Restart=on-failure
RestartSec=5
Environment=NODE_ENV=production
Environment=PORT=3000

[Install]
WantedBy=multi-user.target

The [Unit] section declares ordering and dependency behavior. After= is ordering only: this app starts after the network is considered online and after PostgreSQL. Requires=postgresql.service is the dependency: starting this unit also starts PostgreSQL, this unit is not started if PostgreSQL fails to start, and it is stopped if PostgreSQL is explicitly stopped. The [Service] section defines how to run it: as the deploy user, from the /opt/webapp directory, restarting on failure after a five-second delay. The [Install] section tells systemd that, once enabled, this service is pulled in by multi-user.target, so it starts on a normal boot.

After creating or modifying a unit file, systemctl daemon-reload reloads the on-disk unit definitions without interrupting running services. The service can then be enabled and started in one step with systemctl enable --now. Logs for any unit are accessible through journalctl -u <service-name>, with -f to follow output in real time.
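Before installing a unit file it can be checked for mistakes with systemd-analyze verify. The sketch below writes a deliberately tiny unit to a scratch directory and verifies it if the tool is available; demo-webapp.service and its ExecStart line are placeholders for this illustration, not the webapp unit above. The install sequence is shown as comments because it requires root.

```shell
unit_dir=$(mktemp -d)
unit_file="$unit_dir/demo-webapp.service"

# A minimal placeholder unit; a real one would run the application binary
cat > "$unit_file" <<'EOF'
[Unit]
Description=Demo unit for syntax checking

[Service]
Type=simple
ExecStart=/bin/sleep 60
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

if command -v systemd-analyze >/dev/null 2>&1; then
    systemd-analyze verify "$unit_file" ||
        echo "verify reported problems (see messages above)"
else
    echo "systemd-analyze not found; skipping verification"
fi

# Install sequence on a real server:
#   sudo cp "$unit_file" /etc/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now demo-webapp.service
#   journalctl -u demo-webapp.service -f
```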

Systemd Timers

A .timer unit schedules a .service unit to run at defined times or intervals, replacing cron for many tasks on modern Linux. By default a timer activates the service with the same base name, so webapp-health.timer triggers webapp-health.service; the Unit= setting in the [Timer] section can point it at a different unit.

The service unit is written as a oneshot type (runs once and exits, rather than staying alive):

# /etc/systemd/system/webapp-health.service
[Unit]
Description=Webapp health check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/webapp-health.sh

The timer unit controls when it fires:

# /etc/systemd/system/webapp-health.timer
[Unit]
Description=Run webapp health check every minute

[Timer]
OnCalendar=*-*-* *:*:00
Persistent=true

[Install]
WantedBy=timers.target

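OnCalendar=*-*-* *:*:00 means “every minute, on the zero-second mark.” Because timers default to AccuracySec=1min, the actual start can drift by up to a minute; set AccuracySec=1s in the [Timer] section if the exact second matters. Persistent=true records the last run on disk, so if the system was off when a trigger was due, the service runs once as soon as the timer is activated on the next boot.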

After creating both files, systemctl daemon-reload picks up the new units and systemctl enable --now webapp-health.timer activates the schedule.
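The service unit above points at /usr/local/bin/webapp-health.sh, which this lecture does not show. The following is one possible sketch of such a script, under stated assumptions: the application answers HTTP on 127.0.0.1 port 3000 (matching PORT=3000 in the webapp unit), curl is installed, and the HEALTH_URL variable and check_health function are names invented for this example.

```shell
#!/bin/sh
# Succeeds only if the URL answers with a non-error HTTP status within 2 seconds
check_health() {
    # -f fail on HTTP errors, -sS silent but show errors, -o discard the body
    curl -fsS --max-time 2 -o /dev/null "$1"
}

# Assumed default endpoint; override by setting HEALTH_URL in the environment
url=${HEALTH_URL:-http://127.0.0.1:3000/}

if check_health "$url" 2>/dev/null; then
    echo "webapp healthy: $url"
    health_status=0
else
    echo "webapp health check failed: $url" >&2
    health_status=1
fi

# As a standalone script, finish with: exit "$health_status"
# so that systemd marks the oneshot run as failed when the check fails.
```

Anything the script prints lands in the journal under webapp-health.service. systemd-analyze calendar '*-*-* *:*:00' shows how systemd interprets the schedule, and systemctl list-timers shows when the timer last ran and when it fires next.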

Historical Note: The Init Wars, SysV to systemd (2010-2015)

For most of Linux’s first two decades, PID 1 was SysV init, which ran a collection of shell scripts that started services one at a time in a fixed sequence. It was simple but slow; each service had to finish starting before the next could begin, and dependency ordering was fragile and largely manual. In 2005 Apple shipped launchd with Mac OS X 10.4, which parallelized service startup. Lennart Poettering proposed systemd for Linux in 2010, drawing on launchd’s ideas. systemd parallelized startup using socket activation, managed logging through a unified journal, and eventually absorbed login sessions, device management, and dozens of other system functions that had previously required separate daemons. The adoption provoked significant debate. Critics argued it violated Unix’s “do one thing well” philosophy and that its centralized scope made debugging harder. Despite the controversy, most major Linux distributions had made systemd their default by 2015.

Takeaways

Consider what provisioning a web application server actually looks like, from a blank machine to a running service. You choose Ubuntu 26.04 LTS because the application vendor officially supports it and your team already knows it. You size the instance at 2 vCPUs and 4 GB of RAM with a 40 GB root filesystem and a 1 GB swap area. You create a deploy user for the application and rely on the www-data system account (already present on Ubuntu) for nginx to run as. You add your public key to ~/.ssh/authorized_keys, confirm that key-based login works, and only then disable SSH password authentication. You set the hostname with hostnamectl and add the FQDN to /etc/hosts. You install nginx and PostgreSQL with apt, verify both are running with systemctl status, and enable them at boot with systemctl enable. You write a unit file for the application process that declares Requires=postgresql.service and Restart=on-failure, run systemctl daemon-reload, and enable and start the service with systemctl enable --now.

At that point the server comes up cleanly after a reboot, restarts services after a crash, and logs everything to the journal. The Linux layer is in good shape. Making it reliably reachable from the outside requires a separate body of knowledge: how IP addressing and subnetting work, how routing decisions are made, how DNS resolves names, how firewalls filter traffic, and which diagnostic tools tell you why a packet is or is not arriving. Those are networking fundamentals, and they sit alongside everything in this lecture as a parallel layer of understanding rather than a continuation of it.