Shell Scripting

This activity puts into practice the concepts from the Shell Scripting and Automation Basics lecture. You will write a Bash script that installs nginx on your EC2 instance, deploys a custom page with your ONID, and adds a cron-based health check. By the end, you will have a working imperative setup script you can rerun predictably and compare against the declarative Ansible workflow in the next activity.

What You Will Need

Your AWS Academy Learner Lab session started
An EC2 instance with 20 GiB of storage, running Ubuntu, with SSH and HTTP (port 80) access configured on the security group
Your instance’s SSH private key in ~/.ssh/
Your instance’s public IP from the EC2 console
A terminal with ssh available

Connect and Explore the Environment

Your script will run on the EC2 instance, so start by connecting to it. Before writing any code, examine the shell environment your script will run in.

SSH into your instance. Replace <YOUR-EC2-PUBLIC-IP> with the value from the EC2 console:
```
ssh -i ~/.ssh/cs312-key.pem ubuntu@<YOUR-EC2-PUBLIC-IP>
```
You will land at a prompt like ubuntu@ip-10-0-1-42:~$. Every step from here runs inside this SSH session unless noted otherwise.
Print your interactive shell’s PATH:
```
echo $PATH
```
You will see something like:
```
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
```
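This PATH is assembled at login from system and per-user startup files (on Ubuntu, for example, /etc/environment, /etc/profile with the scripts under /etc/profile.d/, and ~/.profile). The shell searches these directories left to right each time you type a command name without a full path.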
Simulate a minimal cron-like environment:
```
env -i PATH=/usr/bin:/bin bash -c 'echo $PATH'
```
The output is:
```
/usr/bin:/bin
```
env -i clears every environment variable, then sets PATH to a typical minimal cron value before launching the inner bash. Notice what is gone, including /usr/local/bin, /usr/sbin, and /snap/bin. Any command installed only in one of the missing directories would fail in that stripped-down environment with “command not found,” even though it works at your interactive prompt.
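To see that failure mode without installing anything, the following sketch creates a throwaway tool in a temporary directory and looks it up under two different PATH values. The name demo-tool is hypothetical, and the snippet is an illustration only, not part of setup.sh.

```shell
# Hypothetical demo: demo-tool stands in for any command installed
# outside /usr/bin and /bin (for example, in /usr/local/bin).
demo_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "demo tool ran"\n' > "$demo_dir/demo-tool"
chmod +x "$demo_dir/demo-tool"

# Lookup succeeds when the tool's directory is on PATH...
with_dir="$(env -i PATH="$demo_dir:/usr/bin:/bin" /bin/sh -c 'command -v demo-tool || echo "not found"')"

# ...and fails under the minimal cron-like PATH.
without_dir="$(env -i PATH=/usr/bin:/bin /bin/sh -c 'command -v demo-tool || echo "not found"')"

printf 'with extra dir:  %s\nminimal PATH:    %s\n' "$with_dir" "$without_dir"
rm -rf "$demo_dir"
```

The second lookup prints not found because the only directory that contained the tool is no longer searched.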
Check where curl lives on this system:
```
which curl
```
curl is at /usr/bin/curl, which is inside the minimal PATH you just simulated. Keep this in mind for the health-check script you will write later.
Observe the difference between exported and unexported variables:
```
export GREETING="hello"
INTERNAL="world"
bash -c 'echo "exported: $GREETING"'
bash -c 'echo "unexported: $INTERNAL"'
```
GREETING appears in the child shell because export marks it for the environment that every child process inherits. INTERNAL disappears: bash -c starts a new process, and that process received no value for it.
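One nuance is worth a quick sketch: bash -c starts a separate child process, whereas a parenthesized ( ... ) subshell is a copy of the current shell and still sees unexported variables. The variable name DEMO_UNEXPORTED below is hypothetical.

```shell
DEMO_UNEXPORTED="world"   # hypothetical name; set but never exported

# A ( ... ) subshell is a copy of the current shell, so it still sees the value.
in_subshell="$( ( printf '%s' "$DEMO_UNEXPORTED" ) )"

# bash -c starts a new process that receives only exported variables.
in_child="$(bash -c 'printf "%s" "${DEMO_UNEXPORTED:-<unset>}"')"

printf 'subshell: %s\nchild:    %s\n' "$in_subshell" "$in_child"
```

The subshell line prints world, while the child line prints the fallback text. Cron jobs and scripts you launch with ./setup.sh behave like the second case.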

Create your working directory:

```
mkdir -p ~/cs312-scripts
cd ~/cs312-scripts
```

Your First Script

You will build setup.sh piece by piece, running the script after each change so you can see the effect of each new concept before the next one is added.

Create the file and make it executable:
```
touch setup.sh
chmod +x setup.sh
```
Without chmod +x, the kernel refuses to run the file directly. You will call it with ./setup.sh throughout.
Open setup.sh in any editor and add a shebang and a single print statement:
```
#!/usr/bin/env bash
printf "Hello from %s\n" "$(hostname)"
```
Run it:
```
./setup.sh
```
```
Hello from ip-10-0-1-42
```
The shebang tells the kernel which interpreter to use. #!/usr/bin/env bash searches your PATH for bash rather than assuming a fixed location like /bin/bash, which makes the script more portable across systems where Bash lives elsewhere.
Add variables for the paths the script will manage. Update setup.sh to:
```
#!/usr/bin/env bash
ONID="ulbrical"
WEBROOT="/var/www/html"
LOG_FILE="/tmp/setup-${ONID}.log"
printf "=== Setup: %s on %s ===\n" "$ONID" "$(hostname)"
printf "Log will be written to: %s\n" "$LOG_FILE"
```
Run it:
```
./setup.sh
```
```
=== Setup: ulbrical on ip-10-0-1-42 ===
Log will be written to: /tmp/setup-ulbrical.log
```
Notice the double quotes around every $VARIABLE. If a path ever contained a space, an unquoted expansion would split into two separate arguments and silently break commands like cp or rm. The ${ONID} form with braces is used in the log path to separate the variable name from the literal .log suffix that follows it.
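The word-splitting risk can be made visible with a small helper that reports how many arguments it received. The helper count_args and the path below are hypothetical and exist only for this demonstration.

```shell
count_args() { printf '%s\n' "$#"; }

demo_path="/tmp/my report.txt"   # hypothetical path containing a space

unquoted_count="$(count_args $demo_path)"     # split on the space
quoted_count="$(count_args "$demo_path")"     # kept as one argument

printf 'unquoted: %s argument(s)\nquoted:   %s argument(s)\n' "$unquoted_count" "$quoted_count"
```

The unquoted call reports 2 arguments and the quoted call reports 1, which is the difference between operating on one file and operating on two names that may not exist.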
Add set -euo pipefail immediately after the shebang line. The file now starts:
```
#!/usr/bin/env bash
set -euo pipefail
ONID="ulbrical"
...
```
Run the script again: the output is the same. Now see what set -u catches. Run these two commands directly in your shell session:
```
bash -c 'echo "no set -u:   [$TYPO_VAR]"'
bash -c 'set -u; echo "with set -u: [$TYPO_VAR]"'
```
```
no set -u:   []
bash: TYPO_VAR: unbound variable
```
The first silently substitutes an empty string. The second aborts at the point of the mistake. With set -euo pipefail in setup.sh, a typo in a variable name will fail loudly there rather than passing an empty value into a later cp, rm, or apt-get call.
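The -e and -o pipefail parts of that line can be observed the same way. This sketch runs each case in a throwaway child shell, so your SSH session is never affected; the variable names are illustrative.

```shell
# Without pipefail, a pipeline reports only the last command's status.
status_default="$(bash -c 'false | true; echo "$?"')"

# With pipefail, a failure anywhere in the pipeline is reported.
status_pipefail="$(bash -c 'set -o pipefail; false | true; echo "$?"')"

# With -e, the shell stops at the first failing command.
output_errexit="$(bash -c 'set -e; echo "before"; false; echo "after"' || true)"

printf 'default: %s\npipefail: %s\nerrexit printed: %s\n' \
    "$status_default" "$status_pipefail" "$output_errexit"
```

The default pipeline reports 0, the pipefail pipeline reports 1, and the errexit run prints only before because the shell exits at false.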
Replace the hardcoded ONID with argument validation. The script will need to run as root to install packages, so add that check too. Update setup.sh to:
```
#!/usr/bin/env bash
set -euo pipefail

if [[ "$EUID" -ne 0 ]]; then
    printf "Run as root: sudo %s <your-onid>\n" "$(basename "$0")" >&2
    exit 1
fi

if [[ "$#" -ne 1 ]]; then
    printf "Usage: sudo %s <your-onid>\n" "$(basename "$0")" >&2
    exit 1
fi

ONID="$1"
WEBROOT="/var/www/html"
NGINX_DEFAULT="/etc/nginx/sites-available/default"
LOG_FILE="/tmp/setup-${ONID}.log"

printf "=== Server Setup: %s on %s ===\n" "$ONID" "$(hostname)"
```
Run without arguments to see the usage message:
```
sudo ./setup.sh
```
```
Usage: sudo setup.sh <your-onid>
```
Run without sudo to trigger the root check:
```
./setup.sh myonid
```
```
Run as root: sudo setup.sh <your-onid>
```
A few of Bash’s built-in variables are doing work here. $EUID is the effective user ID of the running process; root is always 0, so -ne 0 means “not equal to zero, i.e., not root.” $# is the count of arguments the caller passed; the script expects exactly one, so -ne 1 catches both zero and two-or-more. $1 is the first argument, which becomes ONID. $0 is the script’s own name as invoked; $(basename "$0") strips any leading path so the usage line prints setup.sh rather than ./setup.sh or /home/ubuntu/cs312-scripts/setup.sh. Both error messages go to stderr with >&2 so they do not corrupt any pipeline that might consume the script’s normal output.
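If you want to poke at these variables in isolation, a function is a convenient sandbox, because inside a function $# and $1 refer to the function's own arguments. The function name describe_call is hypothetical.

```shell
describe_call() {
    # ${1:-<none>} supplies a fallback so set -u would not trip on a missing argument.
    printf 'count=%s first=%s\n' "$#" "${1:-<none>}"
}

no_args="$(describe_call)"
two_args="$(describe_call alice bob)"
script_name="$(basename "/home/ubuntu/cs312-scripts/setup.sh")"

printf '%s\n%s\n%s\n' "$no_args" "$two_args" "$script_name"
```

The three lines printed are count=0 first=<none>, count=2 first=alice, and setup.sh.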
Run the script correctly with your actual ONID:
```
sudo ./setup.sh ulbrical
```
```
=== Server Setup: ulbrical on ip-10-0-1-42 ===
```
Now add a trap to the script, right after the variable block:
```
trap 'printf "[trap] Exiting. Log: %s\n" "$LOG_FILE"' EXIT
```
Run it again:
```
sudo ./setup.sh your-onid
```
```
=== Server Setup: your-onid on ip-10-0-1-42 ===
[trap] Exiting. Log: /tmp/setup-your-onid.log
```
The trap fires on every exit, successful or not. You can use it to clean up temporary files or release locks even if the script fails midway. Later you will call trap - EXIT at the end of the script to clear the handler so the message does not appear on a normal successful exit.

Check Preconditions

Before installing anything, the script should examine the system. Add a logging function, a disk-space check using awk, and a package-status helper using grep.

Add a log function to setup.sh, right after the trap line:
```
log() {
    printf "[%s] %s\n" "$(date '+%H:%M:%S')" "$1" | tee -a "$LOG_FILE"
}
```
tee -a writes each message to stdout (so you see it live) and appends it to $LOG_FILE for later inspection. Every status line from here on calls log.
Add a disk check below the log function:
```
DISK_PCT=$(df -h / | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')
if [[ "$DISK_PCT" -gt 80 ]]; then
    log "WARNING: root filesystem at ${DISK_PCT}% capacity"
else
    log "Disk check passed: ${DISK_PCT}% used"
fi
```
The awk command processes df -h / as a table. NR==2 skips the header row. $5 is the “Use%” column. gsub(/%/, "", $5) removes the percent sign so Bash can compare the value as a number inside [[ ]].
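To watch the awk program work on known input, you can feed it a captured table instead of live df output. The sample values below are hypothetical and chosen to match the output shown in this activity.

```shell
# Sample df -h output (hypothetical values) so the result is predictable.
sample_df='Filesystem      Size  Used Avail Use% Mounted on
/dev/root        20G  4.3G   15G  22% /'

# Same awk program as in setup.sh: take row 2, strip % from column 5.
disk_pct="$(printf '%s\n' "$sample_df" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')"

if [[ "$disk_pct" -gt 80 ]]; then
    echo "WARNING: ${disk_pct}% used"
else
    echo "OK: ${disk_pct}% used"
fi
```

With this sample the extracted value is 22, so the else branch runs. Changing 22% to 91% in the sample exercises the warning branch.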

Run the script to test the disk check:

```
sudo ./setup.sh your-onid
```

You should see:

```
=== Server Setup: your-onid on ip-10-0-1-42 ===
[10:15:03] Disk check passed: 22% used
[trap] Exiting. Log: /tmp/setup-your-onid.log
```

Add an is_installed helper function below log:
```
is_installed() {
    dpkg -l "$1" 2>/dev/null | grep -q "^ii"
}
```
dpkg -l <package> shows the package’s status. Lines beginning with ii mean “installed and configured.” grep -q returns exit code 0 on a match (package is installed) and exit code 1 otherwise. No output is produced either way: the exit code is the result.

Test is_installed directly in the shell before relying on it in the script:

```
is_installed() { dpkg -l "$1" 2>/dev/null | grep -q "^ii"; }
is_installed bash && echo "bash: installed" || echo "bash: not installed"
is_installed doesnotexist && echo "doesnotexist: installed" || echo "doesnotexist: not installed"
```

You should see bash: installed and doesnotexist: not installed.

Run grep -E directly against a package that is already installed so you can see the status codes the pattern is matching:
```
dpkg -l bash | grep -E "^ii|^rc"
```
You should see a line starting with ii, which means the package is installed and configured. If you later remove a package without purging its configuration files, its line would begin with rc instead. The is_installed function checks for ^ii specifically because an rc package is not usable even though its name still appears in the dpkg database. The ^ anchor ensures you are matching the status column at the start of the line, not the string ii anywhere in the package name or description.
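The effect of the ^ anchor is easy to check against canned rows. The rows below are hypothetical, simplified dpkg -l style lines (real output has header lines and wider columns), and row_is_installed is a stand-in helper for this demonstration.

```shell
# Hypothetical dpkg -l style rows; package names and versions are made up.
installed_row='ii  bash       5.2  amd64  GNU Bourne Again SHell'
removed_row='rc  oldtool    1.0  amd64  removed, config files remain'
tricky_row='rc  wiiremote  1.0  amd64  name contains ii but is not installed'

# Anchored match: only the status column at the start of the line counts.
row_is_installed() { printf '%s\n' "$1" | grep -q '^ii'; }

for row in "$installed_row" "$removed_row" "$tricky_row"; do
    if row_is_installed "$row"; then
        echo "installed:     $row"
    else
        echo "not installed: $row"
    fi
done
```

Only the first row is reported as installed. Without the anchor, the third row would match too, because ii appears inside its package name.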

Install Packages

Add an ensure_installed function that checks before acting, then loop over a list of required packages. For package installation, the script will reach the same final state whether it runs once or ten times.

Add ensure_installed to setup.sh, right below is_installed:

```
ensure_installed() {
    local pkg="$1"
    if is_installed "$pkg"; then
        log "OK (already installed): $pkg"
    else
        log "Installing: $pkg"
        apt-get install -y "$pkg" >> "$LOG_FILE" 2>&1
        log "Done: $pkg"
    fi
}
```

Apt output goes to the log file to keep the terminal readable. If apt-get fails, set -e stops the script immediately rather than letting it continue in a broken state.

Add a package list and loop below the disk check block:

```
log "Refreshing apt package metadata..."
apt-get update >> "$LOG_FILE" 2>&1

PACKAGES=("nginx" "curl")

log "Checking required packages..."
for pkg in "${PACKAGES[@]}"; do
    ensure_installed "$pkg"
done
```

Refreshing package metadata first makes the install step more reliable on a fresh or older image.

Run the script:

```
sudo ./setup.sh your-onid
```

On the first run, nginx will be downloaded and installed. You should see:

```
[10:17:01] Refreshing apt package metadata...
[10:17:03] Checking required packages...
[10:17:03] Installing: nginx
[10:17:07] Done: nginx
[10:17:07] OK (already installed): curl
```

Run the script a second time:
```
sudo ./setup.sh your-onid
```
Both packages now show OK (already installed). The check before the install is what makes this safe to repeat.
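The check-then-act pattern can also be exercised without touching apt. In this sketch the installed set is a plain shell variable, and the fake_ functions are hypothetical stand-ins for is_installed and ensure_installed; two passes over the package list perform only one install action.

```shell
installed_state=" curl "   # pretend curl is already present
install_actions=0

# Stand-in for is_installed: look the name up in the pretend state.
fake_is_installed() { [[ "$installed_state" == *" $1 "* ]]; }

# Stand-in for ensure_installed: act only when the check fails.
fake_ensure_installed() {
    if fake_is_installed "$1"; then
        echo "OK (already installed): $1"
    else
        echo "Installing: $1"
        installed_state+="$1 "
        install_actions=$((install_actions + 1))
    fi
}

for run in 1 2; do
    echo "--- run $run ---"
    for pkg in nginx curl; do
        fake_ensure_installed "$pkg"
    done
done
echo "install actions performed: $install_actions"
```

The first pass installs nginx, and the second pass reports both packages as already installed, so the action counter ends at 1.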

Configure nginx and Deploy Your Page

With nginx installed, configure it and deploy a page that identifies the server. This section uses sed to update the server_name directive in the nginx configuration, and a heredoc to write the HTML. Because the page includes a deployment timestamp and the restart is unconditional, rerunning the script will update the page and restart nginx again.

Confirm that the default nginx configuration is valid before modifying it:

```
sudo nginx -t
```

```
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
```

Add the configuration block to setup.sh, below the package loop:
```
log "Configuring nginx..."
CURRENT_HOST="$(hostname)"
sed -i "s|server_name [^;]*;|server_name ${CURRENT_HOST};|" "$NGINX_DEFAULT"
```
The server_name directive tells nginx which Host headers this server block matches. In Ubuntu’s default site, the value _ is not a special wildcard or catch-all token; it is just an invalid hostname commonly used as a placeholder. This server block is still the default for port 80 because the file’s listen 80 default_server line makes it the default, not because of _. Replacing _ with the actual EC2 hostname gives you a concrete line to edit and makes one expected host value explicit, but this block will still answer unmatched requests as the default server unless you add other server blocks.

That is why the later verification with curl http://localhost/ still works: localhost does not match the EC2 hostname, but this server block remains the default server on port 80.

sed -i edits the file in place. The default delimiter for a sed substitution is /: s/pattern/replacement/. Using | here swaps that delimiter so you do not have to escape any slashes that might appear in a hostname or path. The pattern [^;]* is a negated character class: it matches any run of characters that are not a semicolon, stopping at the first ;. Using .* instead would be greedy and could overshoot on a line like server_name _ ; # default;, matching past the intended semicolon and into the comment.
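The difference between the two patterns shows up clearly when both run against the same sample line. This sketch pipes the line through sed rather than editing a file; the hostname is the example value used throughout this activity.

```shell
sample_line='    server_name _ ; # default;'
new_host='ip-10-0-1-42'   # example hostname from this activity

# Negated class: stops at the first semicolon.
negated="$(printf '%s\n' "$sample_line" | sed "s|server_name [^;]*;|server_name ${new_host};|")"

# Greedy .*: runs to the last semicolon on the line and swallows the comment.
greedy="$(printf '%s\n' "$sample_line" | sed "s|server_name .*;|server_name ${new_host};|")"

printf 'negated class: %s\n' "$negated"
printf 'greedy .*:     %s\n' "$greedy"
```

The negated-class version keeps the trailing comment intact, while the greedy version deletes it along with the original value.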
Add the index page right after the sed line:
```
log "Deploying index.html..."
cat > "$WEBROOT/index.html" << EOF
<!DOCTYPE html>
<html>
<head><title>CS 312: ${CURRENT_HOST}</title></head>
<body>
<h1>Configured by ${ONID}</h1>
<p>Host: ${CURRENT_HOST}</p>
<p>Deployed: $(date)</p>
</body>
</html>
EOF
```
The << EOF syntax is a heredoc: it feeds everything between the opening EOF and the closing EOF (which must appear alone on its own line) into the command on the left as stdin. cat > "$WEBROOT/index.html" writes that stdin to a file, so the heredoc becomes a multi-line write in a single statement. The word EOF is a convention; any word works as long as the opening and closing markers match exactly. Because the delimiter is unquoted (<< EOF), the shell expands variables and command substitutions inside the block as it reads it: ${ONID}, ${CURRENT_HOST}, and $(date) all evaluate when the script runs. In the health-check script later you will see << 'HEALTHEOF' with a quoted delimiter, which suppresses all expansion so the variables are written literally into the installed script rather than resolved now.
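A side-by-side comparison makes the quoting rule concrete. The variable DEMO_NAME is hypothetical, and both heredocs are captured into variables instead of files so nothing is written to disk.

```shell
DEMO_NAME="cs312"   # hypothetical variable for this demo

# Unquoted delimiter: the shell expands ${DEMO_NAME} while reading the block.
expanded="$(cat << EOF
Hello ${DEMO_NAME}
EOF
)"

# Quoted delimiter: the text is passed through literally.
literal="$(cat << 'EOF'
Hello ${DEMO_NAME}
EOF
)"

printf 'unquoted delimiter: %s\nquoted delimiter:   %s\n' "$expanded" "$literal"
```

The first line ends in cs312, and the second still contains the characters ${DEMO_NAME} unchanged.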
Add the config test, restart, and verification right after the heredoc:
```
nginx -t 2>>"$LOG_FILE"
systemctl restart nginx
log "nginx restarted"

if /usr/bin/curl -s http://localhost/ | grep -q "$ONID"; then
    log "Verification passed: page contains ONID"
else
    log "Verification FAILED: page does not contain ONID" >&2
    exit 1
fi
```
Notice /usr/bin/curl rather than curl. This follows the absolute-path habit from the Connect and Explore the Environment section, and it matters again in the cron context coming up next. The unconditional systemctl restart nginx is deliberately imperative: unlike an Ansible handler, it runs every time, whether or not the configuration changed.

Run the script:

```
sudo ./setup.sh your-onid
```

You should see the nginx restart and verification lines:

```
[10:20:01] Configuring nginx...
[10:20:01] Deploying index.html...
[10:20:02] nginx restarted
[10:20:02] Verification passed: page contains ONID
```

Confirm the page from inside the server:
```
curl http://localhost/
```
The response should be your HTML page with your ONID and the EC2 hostname.

Schedule a Health Check

The last piece is a health-check script that cron runs every minute, recording the HTTP status code from nginx. Before relying on that schedule, confirm that the cron daemon is actually installed and running on your instance.

Confirm that cron is present and active:
```
systemctl status cron --no-pager
```
On Ubuntu Server images, cron is usually already present. You should see Active: active (running). If the unit is missing or inactive, install and start it now:
```
sudo apt-get install -y cron
sudo systemctl enable --now cron
```
Continue once systemctl status cron --no-pager shows the service running.
Add the health-check installation block to setup.sh, below the nginx verification:
```
log "Installing health check..."
cat > /usr/local/bin/nginx-health.sh << 'HEALTHEOF'
#!/usr/bin/env bash
set -euo pipefail
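# curl exits non-zero when nginx is unreachable; "|| true" keeps set -e from
# aborting, so the 000 status curl prints in that case still gets logged.
STATUS=$(/usr/bin/curl -s -o /dev/null -w "%{http_code}" http://localhost/) || true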
printf "[%s] nginx: %s\n" "$(date '+%Y-%m-%d %H:%M:%S')" "$STATUS" >> /tmp/nginx-health.log
HEALTHEOF
chmod +x /usr/local/bin/nginx-health.sh
```
The << 'HEALTHEOF' uses a single-quoted delimiter: this prevents the outer script from expanding $STATUS and $(...) while writing the inner script. Those variables must expand when the health check runs, not when setup.sh installs it.

The health check uses /usr/bin/curl with an absolute path. Run which curl to confirm that is its actual location. Cron will usually run this script with a minimal PATH such as /usr/bin:/bin. Using absolute paths removes any dependency on the calling environment.
Add the cron job registration right after:
```
log "Scheduling health check..."
cat > /etc/cron.d/nginx-health << 'CRONEOF'
* * * * * root /usr/local/bin/nginx-health.sh
CRONEOF
chmod 644 /etc/cron.d/nginx-health
log "Health check scheduled: every minute"
```
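/etc/cron.d/ is the system-wide cron directory. Each job line contains five schedule fields (minute, hour, day of month, month, day of week), then the user to run as, then the command. Here the five asterisks mean every minute, and root is the user that runs the job. Overwriting this file with the same content is idempotent: the cron job does not duplicate.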
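To see how that line breaks into fields, you can split it with read, which assigns one whitespace-separated word per variable and gives the remainder to the last one. The variable names are illustrative; cron itself does its own parsing.

```shell
cron_line='* * * * * root /usr/local/bin/nginx-health.sh'

# read -r splits on whitespace; the last variable receives the rest of the line.
read -r minute hour day_of_month month day_of_week run_as job_command <<< "$cron_line"

printf 'schedule: %s %s %s %s %s\n' "$minute" "$hour" "$day_of_month" "$month" "$day_of_week"
printf 'user:     %s\n' "$run_as"
printf 'command:  %s\n' "$job_command"
```

The user field is what distinguishes files in /etc/cron.d/ from a per-user crontab, where the command follows the schedule directly.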
Add a final completion line at the very end of setup.sh:
```
trap - EXIT
log "Setup complete"
```
trap - EXIT clears the handler you registered at the start, so the “[trap] Exiting” message no longer fires on a clean run. That is reasonable here because this trap only prints a status line. For a real cleanup trap that removes temporary files or releases locks, you would usually leave the trap installed and branch on $? inside the handler instead.
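As a minimal sketch of that alternative, the handler below always removes a temporary file and reports only when the exit status is non-zero. The script body is hypothetical and runs in a child shell, so the deliberate failure cannot end your session.

```shell
# Hypothetical script body, run in a child shell for safety.
demo_script='
set -euo pipefail
tmp_file="$(mktemp)"
cleanup() {
    local status=$?          # exit status that triggered the trap
    rm -f "$tmp_file"        # cleanup runs on success and on failure
    if [[ "$status" -ne 0 ]]; then
        printf "failed with exit %s; temp file removed\n" "$status"
    fi
}
trap cleanup EXIT
false                        # set -e aborts here
echo "never reached"
'

failure_output="$(bash -c "$demo_script" || true)"
printf '%s\n' "$failure_output"
```

The handler stays installed for the whole run, so cleanup is guaranteed, and the message appears only on the failure path.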
Run the complete setup script:
```
sudo ./setup.sh your-onid
```
You should see the health-check installation and scheduling lines, followed by Setup complete with no trap message.
Confirm the cron job is registered:
```
cat /etc/cron.d/nginx-health
```
Wait about 60 seconds, then check the health log:
```
cat /tmp/nginx-health.log
```
You should see one or more lines like:
```
[2026-04-17 10:25:01] nginx: 200
```
Status 200 means nginx responded correctly.

Your Configured Server

Run the complete script one final time and notice which parts detect state versus which parts simply run again, then verify the result from your own machine.

Run the script again without changing anything:
```
sudo ./setup.sh your-onid
```
Every package line should show OK (already installed). The cron file and health-check script are rewritten with the same content, the page gets a new deployment timestamp, and nginx restarts again. That contrast is the point: some parts of this script check state first, while other parts are intentionally imperative.
Exit the SSH session:
```
exit
```
From your own machine, verify the page over the public internet:
```
curl http://<YOUR-EC2-PUBLIC-IP>/
```
The response should include:
```
<h1>Configured by your-onid</h1>
<p>Host: ip-10-0-1-42</p>
<p>Deployed: ...</p>
```
Your ONID and the EC2 hostname are visible in the response.

Going Further

Your script configures a server correctly and is reasonable to re-run, but it is not fully idempotent: it rewrites timestamped content and restarts nginx on every run. The natural next upgrade is replacing the cron job with a systemd timer, which can catch up on a run that was missed while the instance was stopped.

On your EC2 instance, remove the cron file first (sudo rm /etc/cron.d/nginx-health) so you do not run both schedulers at once. Then create two unit files: nginx-health.service (a Type=oneshot service that calls /usr/local/bin/nginx-health.sh) and nginx-health.timer (a timer with OnCalendar=minutely and Persistent=true). After writing them, run sudo systemctl daemon-reload, then enable and start the timer with sudo systemctl enable --now nginx-health.timer. Verify the timer is active with systemctl list-timers. Then stop the instance, restart it, and confirm that Persistent=true caused the timer to fire at boot rather than waiting for the next scheduled minute.
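One possible shape for the two unit files is sketched below. It writes them to a scratch directory by default; UNIT_DIR and the Description lines are assumptions for illustration, and on the instance you would write to /etc/systemd/system as root before running the systemctl commands above.

```shell
# UNIT_DIR is an assumption for this sketch: a scratch directory by default.
# On the instance, set it to /etc/systemd/system and run as root.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

cat > "$UNIT_DIR/nginx-health.service" << 'UNITEOF'
[Unit]
Description=Record nginx HTTP status

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nginx-health.sh
UNITEOF

cat > "$UNIT_DIR/nginx-health.timer" << 'UNITEOF'
[Unit]
Description=Run nginx-health.service every minute

[Timer]
OnCalendar=minutely
Persistent=true

[Install]
WantedBy=timers.target
UNITEOF

echo "Wrote unit files to $UNIT_DIR"
```

A timer activates the service with the same base name unless told otherwise, which is why the timer file does not need to name nginx-health.service explicitly.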

Once you work through the next activity, compare the shared web-server pieces of setup.sh to the Ansible playbook task by task. The ensure_installed function maps to the apt task, the heredoc maps to the copy task, the sed call maps to a lineinfile task, and the systemctl restart maps loosely to a handler. Seeing the correspondence concretely shows why Ansible’s idempotence matters at scale: what your script does explicitly, Ansible’s modules do automatically, and Ansible’s handler runs only when a task actually changed.