YAML
YAML appears throughout this course: in Docker Compose files, GitHub Actions workflows, Ansible playbooks, and Kubernetes manifests. Each of those tools uses YAML for the same reason: it is a human-readable way to describe structured data. Understanding the syntax once means you can read and write any of them.
What YAML Is
YAML (YAML Ain’t Markup Language) is a data serialization format. Its purpose is to represent structured data in a way that is easy for people to read and write and easy for programs to parse.
The core idea is simple: YAML is usually a hierarchy of keys and values, where indentation defines structure in the block style you will write most often. YAML also has a flow style that uses {} and [], but Docker Compose, Ansible, and most Kubernetes examples in this course use the indentation-based form.
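As a rough illustration of the two styles, the sketch below writes the same data twice. The key names (block_style, flow_style, and the fields under them) are made up for this example and do not belong to any particular tool.

```yaml
# Block style: indentation defines the structure
block_style:
  name: nginx
  ports:
    - 8080
    - 8443

# Flow style: the same data written with {} and []
flow_style: {name: nginx, ports: [8080, 8443]}
```

Once parsed, both top-level keys hold identical data.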
Basic Syntax
Mappings (Key-Value Pairs)
A mapping is a set of key-value pairs, usually written with a colon followed by a space:

```yaml
name: nginx
port: 8080
enabled: true
timeout: null
```

In this course, keys are almost always plain strings. Values can be strings, numbers, booleans (true/false), or null. In practice, write mappings as key: value with a space after the colon. Without that separating space, text like port:8080 is usually parsed as a plain string rather than as a mapping entry.
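The value types and the missing-space behavior can be seen side by side in a small sketch. The keys ratio and address are hypothetical and only serve the illustration.

```yaml
name: nginx          # string
port: 8080           # integer
ratio: 0.75          # float
enabled: true        # boolean
timeout: null        # null
address: port:8080   # no space after the inner colon, so the value is the string "port:8080"
```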
A list uses a dash and a space before each item:
```yaml
ports:
  - "8080:80"
  - "8443:443"

depends_on:
  - database
  - cache
```

Lists can also be written inline with square brackets: ports: ["8080:80", "8443:443"]. The block form (one item per line) is more readable for more than two or three items.
Nested Structures
Indentation creates nesting. Consistent indentation within a block is required; two spaces per level is the convention:

```yaml
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    environment:
      APP_ENV: production
      LOG_LEVEL: info
```

Here services is a mapping with one key (web), which maps to another mapping with three keys (image, ports, environment). ports is a list; environment is a mapping.
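Lists and mappings can nest in either direction. The sketch below, using hypothetical keys that are not tied to any specific tool, shows a list whose items are mappings, each of which contains another list.

```yaml
# Hypothetical data, for illustration only
users:
  - name: alice        # first list item: a mapping with two keys
    roles:
      - admin
      - deploy
  - name: bob          # second list item: same shape
    roles:
      - deploy
```

The dash starts a new list item, and the keys aligned under it belong to that item.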
Strings
Plain strings do not need quotes:

```yaml
message: hello from the container
```

Use quotes when the string contains characters that YAML would otherwise misinterpret: colons, hash signs, leading dashes, or values that look like other types:

```yaml
version: "3.8"              # without quotes, this is parsed as a float
greeting: "yes"             # without quotes, older parsers read this as boolean true
label: "key: value"         # unquoted, the colon would break parsing
comment: "# not a comment"
```

Both single and double quotes work. Double quotes support escape sequences (\n, \t, \"). Single quotes do not; a literal single quote inside single-quoted text is written as two consecutive single quotes ('').
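The difference between the two quote styles is easiest to see when the same characters are written both ways. A minimal sketch, with key names invented for the example:

```yaml
double: "line one\nline two"    # \n becomes a real newline
single: 'line one\nline two'    # backslash and n stay as two literal characters
apostrophe: 'it''s quoted'      # '' inside single quotes is one literal '
windows_path: 'C:\temp\new'     # single quotes keep the backslashes as written
```

Single quotes are a reasonable default for paths and regular expressions, where backslashes should not be interpreted.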
Multi-Line Strings
Two special characters control how multi-line strings are handled.
The literal block scalar (|) preserves line breaks:
```yaml
script: |
  #!/bin/bash
  echo "starting..."
  systemctl restart nginx
```

Each line in the block becomes a line in the string, including the newline at the end. This is the form you will see in GitHub Actions run: steps and Ansible shell: tasks.
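To make the resulting string concrete, the sketch below pairs a literal block with the double-quoted string it should be equivalent to. The key names as_block and as_quoted are hypothetical.

```yaml
as_block: |
  #!/bin/bash
  echo "starting..."

# The same value written as a double-quoted string with explicit newlines
as_quoted: "#!/bin/bash\necho \"starting...\"\n"
```

The block form avoids the escaping that makes the quoted form hard to read.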
The folded block scalar (>) folds line breaks into spaces, turning the block into a single paragraph:
```yaml
description: >
  This is a long description
  that spans multiple lines
  but will be joined into one.
```

Comments
A # starts a comment to the end of the line:

```yaml
# This is a standalone comment
image: nginx:alpine   # inline comment
```

Comments are one of the places YAML has a clear advantage over JSON, which does not support them at all.
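One detail worth knowing: an inline # only starts a comment when it is preceded by whitespace. A short sketch with made-up keys and a placeholder URL:

```yaml
url: http://example.com/#section   # the first # has no space before it, so it stays in the value
tag: "#release"                    # quoted, so this # is part of the string
```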
Anchors and Aliases
YAML lets you define a value once and reuse it with anchors (&name) and aliases (*name). You may see this in GitHub Actions workflows or in example configurations online when the same block needs to appear more than once:

```yaml
common-env: &common_env
  APP_ENV: production
  LOG_LEVEL: info

service-a:
  environment: *common_env

service-b:
  environment: *common_env
```

A related feature called the merge key uses <<: to copy keys out of an anchored mapping. It is widely implemented as a YAML extension, but it is not part of YAML 1.2 and is not supported everywhere. GitHub Actions, for example, supports anchors and aliases but not merge keys. Treat <<: as tool-specific rather than universal YAML.
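For tools that do implement the merge key, a typical use looks roughly like the sketch below. The key names are hypothetical, and this only works where the parser supports the extension.

```yaml
defaults: &defaults
  APP_ENV: production
  LOG_LEVEL: info

service-a:
  environment:
    <<: *defaults        # merge key: copies APP_ENV and LOG_LEVEL from the anchor
    LOG_LEVEL: debug     # a key written explicitly takes precedence over the merged one
```

Unlike a plain alias, which reuses the anchored value unchanged, the merge key lets one block start from shared defaults and adjust individual keys.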
Common Pitfalls
The Norway problem. Older YAML 1.1-style resolvers treat values such as yes, no, on, off, NO, and OFF as booleans instead of strings. YAML 1.2 narrowed this behavior, but real tools do not all resolve plain scalars the same way, and older expectations are still common in the ecosystem. Ansible is a frequent source of these surprises, and Docker Compose documentation still recommends quoting ambiguous values. The safe habit is to quote any string value that could be mistaken for another type:

```yaml
# Risky: older tooling may read "no" as false
restart: no

# Safe
restart: "no"
```

Version numbers and numeric-looking strings. Values such as 1.10 are numbers in YAML, not strings. If a tool expects a string, or if you want to preserve the formatting exactly, quote it:

```yaml
app_version: "1.10"
release_label: "2026.04"
```

Older Compose examples often used version: "3.8", but modern Compose typically omits the top-level version key entirely. The broader rule is simpler: whenever a value looks numeric but should behave like text, quote it.
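A few more values that benefit from quoting are sketched below. The keys are invented for the example, and the exact behavior of the unquoted forms depends on which YAML version and resolver a tool uses.

```yaml
countries:
  - "NO"            # Norway; unquoted NO may become false under YAML 1.1 rules
  - SE
  - DK
image_tag: "1.10"   # unquoted 1.10 is the float 1.1, so the trailing zero is lost
file_mode: "0755"   # unquoted 0755 is read as an integer (octal under YAML 1.1 rules)
```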
Indentation consistency. All items at the same level of a block must be indented by exactly the same number of spaces, and tabs are not allowed for indentation. Mixing two-space and four-space indentation among sibling keys, or accidentally over-indenting a single key, causes parse errors or silently moves a key to the wrong level of the hierarchy.
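The silent case is the more dangerous one. In the sketch below (hypothetical top-level keys), both blocks are valid YAML, but one misplaced indent changes which mapping owns LOG_LEVEL.

```yaml
# Intended: LOG_LEVEL is a key of environment
web:
  image: nginx:alpine
  environment:
    APP_ENV: production
    LOG_LEVEL: info

# One level too shallow: still parses, but LOG_LEVEL is now a key of web_misindented
web_misindented:
  image: nginx:alpine
  environment:
    APP_ENV: production
  LOG_LEVEL: info
```

No parser will flag the second block; only the tool reading the data might, if it rejects unknown keys.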
Duplicate keys. YAML mappings are supposed to have unique keys. If the same key appears twice, the document is invalid, but parsers differ in how they react: some error, some keep the first value, and some keep the last. In practice: never repeat a key in the same block.
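A deliberately broken sketch of the problem, not something to copy:

```yaml
web:
  image: nginx:alpine
  ports:
    - "8080:80"
  image: nginx:latest   # duplicate key: depending on the parser, an error, or one of the two values wins silently
```

Duplicates usually creep in when a long block is edited in two places, so a linter or strict parser mode is worth enabling where the tool offers one.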
Where YAML Appears in This Course
YAML shows up in several of the course’s automation and orchestration tools. Not every later tool uses YAML. Terraform, for example, uses HCL. The table below shows where YAML appears and what those files describe.
| Tool | File(s) | What it describes | Lecture intro |
|---|---|---|---|
| Docker Compose | compose.yml | Services, networks, volumes, port mappings | Week 3 |
| GitHub Actions | .github/workflows/*.yml | CI/CD pipelines: triggers, jobs, steps | Week 5 |
| Ansible | site.yml, playbook.yml, group_vars/*.yml, host_vars/*.yml | Desired server state, variables, roles | Week 5 |
| Kubernetes | deployment.yaml, service.yaml, configmap.yaml, secret.yaml | Cluster workloads, networking, config, secrets | Week 7 |
The YAML you write for Docker Compose is usually the most forgiving: a small number of concepts, shallow nesting, and comparatively friendly error messages. Kubernetes manifests are the most verbose: every resource has fixed top-level fields such as apiVersion, kind, and metadata, and many resource types also have a substantial spec section. The exact required fields depend on the resource type.
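As a sketch of that fixed skeleton, here are two small Kubernetes manifests in one file, separated by ---. The resource names (app-config, web) are placeholders chosen for this example.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config        # placeholder name
data:                     # a ConfigMap carries data rather than spec
  APP_ENV: production
  LOG_LEVEL: info
---
apiVersion: v1
kind: Pod
metadata:
  name: web               # placeholder name
spec:                     # a Pod describes its containers under spec
  containers:
    - name: web
      image: nginx:alpine
```

Both documents share apiVersion, kind, and metadata, while the remaining fields differ by resource type.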
YAML vs. Other Formats
YAML is not the only way to serialize configuration data. Understanding its tradeoffs helps you choose the right format when you have a choice, and helps you understand why tools made the choices they did.
JSON (JavaScript Object Notation) and YAML are closely related. YAML 1.2 is actually a superset of JSON: any valid JSON file is valid YAML. The practical differences are:
- JSON requires quotes around all keys and string values; YAML block style often does not.
- JSON uses braces and brackets for structure; YAML block style usually uses indentation.
- JSON has no comment syntax; YAML uses #.
- JSON is more explicit and less ambiguous; YAML’s flexibility (the Norway problem, implicit type coercion) introduces edge cases.
- JSON is the default format for APIs and most machine-to-machine communication. YAML is preferred when humans write and read the file frequently.
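The differences listed above can be seen by writing one small structure both ways. The sketch uses a hypothetical service key and places the two versions in one file as separate documents.

```yaml
# Block-style YAML: no braces, few quotes, comments allowed
service:
  name: web
  ports:
    - "8080:80"
---
# The same data in JSON syntax, which a YAML 1.2 parser also accepts
# (these comment lines are YAML, not JSON)
{
  "service": {
    "name": "web",
    "ports": ["8080:80"]
  }
}
```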
GitHub Actions, Ansible, and Kubernetes all have JSON equivalents of their YAML formats. You will almost never use them, because the verbosity makes large files painful to maintain.
TOML (Tom’s Obvious Minimal Language) is common in Rust, Python, and Go project configuration (Cargo.toml, pyproject.toml). It is simpler than YAML for flat key-value data and strongly typed (no implicit coercions), but its nested table syntax becomes awkward for deeply hierarchical data like Kubernetes manifests. You will not write TOML in this course, but you will encounter it in project configuration files.
INI and .env Files
INI files ([section] headers with key = value pairs) are the oldest format in common use: systemd unit files and many legacy Linux daemons use them. .env files (bare KEY=VALUE with no sections) are used by Docker Compose and shell scripts for environment variable injection. Both formats are simpler than YAML but have no support for nesting or lists.
HCL (HashiCorp Configuration Language) is Terraform’s format. It looks superficially similar to JSON but is designed specifically for infrastructure configuration: it supports comments, expressions, function calls, and references between resources. Terraform chose HCL over YAML specifically to give operators a language expressive enough for conditional logic and interpolation without requiring a full programming language. The Infrastructure as Code lecture covers HCL in detail.