Compare commits

...

4 Commits

Author SHA1 Message Date
nearxos
5238d457e8 Update boot order configuration for eMMC first, then network
Modify the first-boot script and documentation to set the EEPROM boot order to 0xf21, prioritizing eMMC boot followed by network boot. Adjust network boot settings for faster failure on DHCP timeouts and update related scripts and documentation to reflect these changes. Enhance the rescue script to directly modify EEPROM settings without requiring a chroot into eMMC, streamlining the recovery process for devices stuck in network-only boot. Update relevant documentation to ensure clarity on the new boot order and its implications.
2026-02-21 15:05:17 +02:00
nearxos
ff6258c2af Add status tracking for network boot actions in dashboard
Implement a new function to write status updates during device deployment and backup actions, enhancing user feedback. Update the API to call this function with appropriate messages based on the action taken. Modify the network boot toggle script to improve clarity and functionality, ensuring proper management of DHCP options. Update permissions for the toggle script to ensure it is executable. Additionally, update the initrd.img to reflect recent changes in the network boot process.
2026-02-21 13:05:52 +02:00
nearxos
ea6f846021 Improve network boot troubleshooting documentation and initramfs scripts
Update NETWORK-BOOT-TROUBLESHOOTING.md to clarify the boot process and emphasize the need to disable PXE before rebooting to ensure EEPROM updates are applied. Enhance initramfs scripts to improve DHCP lease acquisition and capture the device's IP address more reliably. Add a revision tracking feature to the initramfs build process for better version control. Modify provisioning-client.sh to ensure proper reboot handling after deployment and backup actions.
2026-02-21 12:57:26 +02:00
nearxos
a6e27219f4 Enhance network boot troubleshooting documentation and scripts
Update NETWORK-BOOT-TROUBLESHOOTING.md to clarify the boot process after start4.elf, emphasizing the importance of config.txt settings for kernel and initramfs. Introduce checks for GPU logging and ensure proper configuration for UART. Modify initramfs scripts to improve DHCP lease acquisition and ensure shell output is directed to the serial console. Update ensure-tftpboot-config-kernel-initrd.sh to enforce necessary config settings and link DTB files in serial-prefix directories for better device compatibility.
2026-02-21 02:27:48 +02:00
19 changed files with 588 additions and 276 deletions

View File

@@ -18,7 +18,7 @@ This script runs once on first boot via cloud-init (see `user-data-remote-gnss.e
10. **Re-apply splash** — Set `disable_splash=0`, Plymouth theme to `custom` only, `update-initramfs`.
11. **Dark theme** — Set GTK dark theme for user `pi`: `~/.config/gtk-3.0/settings.ini` with `gtk-application-prefer-dark-theme=1` and `gtk-theme-name=PiXnoir` (Raspberry Pi OS dark theme).
12. **CM4 EEPROM enable** — On CM4, `rpi-eeprom-update` is disabled by default. First-boot enables it by adding `RPI_EEPROM_USE_FLASHROM=1` and `CM4_ENABLE_RPI_EEPROM_UPDATE=1` to `/etc/default/rpi-eeprom-update`. **No config.txt changes are needed**: `dtoverlay=audremap`/`dtoverlay=spi-gpio40-45` are for the flashrom method only and **must not be added** as they conflict with the reTerminal DM display backlight (GPIO13 PWM). The bootloader method (`pieeprom.upd`) is used instead.
13. **Boot order** — If `rpi-eeprom-config` is available, set `BOOT_ORDER=0x21` (network first, then eMMC/SD). On CM4 first boot this may be skipped (EEPROM not yet enabled); a one-shot systemd service runs after reboot to set boot order once.
13. **Boot order** — If `rpi-eeprom-config` is available, set `BOOT_ORDER=0xf21` (eMMC first, then network, then restart). Also sets `NET_BOOT_MAX_RETRIES=3`, `DHCP_TIMEOUT=1500`, `DHCP_REQ_TIMEOUT=500`, `NET_INSTALL_AT_POWER_ON=0` so network boot fails fast when no TFTP server is available. On CM4 first boot this may be skipped (EEPROM not yet enabled); a one-shot systemd service runs after reboot to set boot order once.
14. **One-shots** — Download `set-rotation-once.sh` + `.desktop` from file server (wlr-randr for labwc). Wallpaper is set once via pcmanfm config during first-boot.
15. **Reboot.**
@@ -137,9 +137,9 @@ First-boot sets a dark GTK theme for user **pi** via **`~/.config/gtk-3.0/settin
On **CM4**, first-boot enables `rpi-eeprom-update` by setting **`RPI_EEPROM_USE_FLASHROM=1`** and **`CM4_ENABLE_RPI_EEPROM_UPDATE=1`** in **`/etc/default/rpi-eeprom-update`**. **No dtparams are added to config.txt.** `dtoverlay=audremap` and `dtoverlay=spi-gpio40-45` are only needed for the *flashrom* (direct SPI) update method — they **must not** be added because `audremap` remaps audio to GPIO12/13, which conflicts with the reTerminal DM display backlight PWM on GPIO13, causing a blank screen. The bootloader file method (`pieeprom.upd`) works without these overlays. See: [usbboot](https://github.com/raspberrypi/usbboot/blob/master/Readme.md).
## Boot order (network first, then eMMC/SD)
## Boot order (eMMC first, then network)
If **`rpi-eeprom-config`** and **`rpi-eeprom-update`** are present (Pi 4/CM4), the script sets the EEPROM **`BOOT_ORDER=0x21`**: try **network** first (0x2), then **SD/eMMC** (0x1). **Pi 4:** applied on first-boot; EEPROM update scheduled for next reboot. **CM4:** a one-shot service (**`set-cm4-boot-order-once.service`**) runs after the next boot and sets BOOT_ORDER=0x21, then removes itself (two reboots for network-first). If “Could not read current EEPROM config” appears, run `sudo rpi-eeprom-update -l` on the device to see if a firmware file is listed; you can set boot order manually with `rpi-eeprom-config` if needed. If the tools are not available, the step is skipped.
If **`rpi-eeprom-config`** and **`rpi-eeprom-update`** are present (Pi 4/CM4), the script sets the EEPROM **`BOOT_ORDER=0xf21`**: try **SD/eMMC** first (0x1), then **network** (0x2), then **restart** (0xf). Network boot settings are tuned for fast fallback: `NET_BOOT_MAX_RETRIES=3`, `DHCP_TIMEOUT=1500` (1.5s), `DHCP_REQ_TIMEOUT=500` (0.5s), `NET_INSTALL_AT_POWER_ON=0`. The device boots from eMMC normally; if eMMC is blank it tries network boot for re-provisioning but gives up quickly when no TFTP server is available. **Pi 4:** applied on first-boot; EEPROM update scheduled for next reboot. **CM4:** a one-shot service (**`set-cm4-boot-order-once.service`**) runs after the next boot and sets the boot order, then removes itself. If “Could not read current EEPROM config” appears, run `sudo rpi-eeprom-update -l` on the device to see if a firmware file is listed; you can set boot order manually with `rpi-eeprom-config` if needed. If the tools are not available, the step is skipped.
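The right-to-left nibble reading described above can be illustrated with a short Python sketch. It is not part of the first-boot script; the function name `decode_boot_order` and the `BOOT_MODES` table are hypothetical, and only the three nibble values named in this section are mapped.

```python
# Nibble meanings as stated in this section; other values are deliberately left unnamed.
BOOT_MODES = {0x1: "SD/eMMC", 0x2: "network", 0xF: "restart"}


def decode_boot_order(value: int) -> list[str]:
    """Return boot modes in the order they are tried (lowest nibble first)."""
    modes = []
    while value:
        nibble = value & 0xF  # rightmost nibble is tried first
        modes.append(BOOT_MODES.get(nibble, f"unknown(0x{nibble:x})"))
        value >>= 4  # move on to the next nibble
    return modes


print(decode_boot_order(0xF21))  # ['SD/eMMC', 'network', 'restart']
print(decode_boot_order(0x2))    # ['network']
```

Reading the value this way makes it easy to confirm that a configured `BOOT_ORDER` tries local storage before the network.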
## Reboot

View File

@@ -214,10 +214,10 @@ grep -q '^RPI_EEPROM_USE_FLASHROM=' "$EEPROM_DEFAULT" && sed -i 's/^RPI_EEPROM_U
grep -q '^CM4_ENABLE_RPI_EEPROM_UPDATE=' "$EEPROM_DEFAULT" && sed -i 's/^CM4_ENABLE_RPI_EEPROM_UPDATE=.*/CM4_ENABLE_RPI_EEPROM_UPDATE=1/' "$EEPROM_DEFAULT" || echo 'CM4_ENABLE_RPI_EEPROM_UPDATE=1' >> "$EEPROM_DEFAULT"
log "Set RPI_EEPROM_USE_FLASHROM=1 and CM4_ENABLE_RPI_EEPROM_UPDATE=1 in $EEPROM_DEFAULT"
# --- 6d. Boot order: network first, then eMMC/SD (for future network boot / re-provisioning) ---
# BOOT_ORDER: 0x2 = network, 0x1 = SD/eMMC. 0x21 = try network first, then local storage.
# --- 6d. Boot order: eMMC/SD first, then network, then restart (0xf21) ---
# BOOT_ORDER nibbles (right-to-left): 1=SD/eMMC, 2=network (TFTP), f=restart loop.
# On CM4, rpi-eeprom-update -l only works after reboot (once 6c is applied). So we try now; if it fails, a one-shot runs after next boot.
log "--- Boot order (network first, then eMMC/SD) ---"
log "--- Boot order (0xf21: eMMC/SD first, then network, restart) ---"
BOOTCONF="/tmp/first-boot-eeprom-conf.txt"
BOOT_ORDER_SET=0
if command -v rpi-eeprom-config >/dev/null 2>&1 && command -v rpi-eeprom-update >/dev/null 2>&1; then
@@ -225,11 +225,17 @@ if command -v rpi-eeprom-config >/dev/null 2>&1 && command -v rpi-eeprom-update
rpi-eeprom-config "$PEE" > "$BOOTCONF" 2>/dev/null || true
fi
if [[ -s "$BOOTCONF" ]]; then
sed -i 's/^BOOT_ORDER=.*/BOOT_ORDER=0x21/' "$BOOTCONF"
grep -q '^BOOT_ORDER=' "$BOOTCONF" || echo 'BOOT_ORDER=0x21' >> "$BOOTCONF"
sed -i 's/^BOOT_ORDER=.*/BOOT_ORDER=0xf21/' "$BOOTCONF"
grep -q '^BOOT_ORDER=' "$BOOTCONF" || echo 'BOOT_ORDER=0xf21' >> "$BOOTCONF"
# Limit network boot: 3 retries, 1500ms DHCP timeout (fail fast to eMMC)
sed -i '/^NET_BOOT_MAX_RETRIES=/d; /^DHCP_TIMEOUT=/d; /^DHCP_REQ_TIMEOUT=/d; /^TFTP_IP=/d; /^NET_INSTALL_AT_POWER_ON=/d' "$BOOTCONF"
echo 'NET_BOOT_MAX_RETRIES=3' >> "$BOOTCONF"
echo 'DHCP_TIMEOUT=1500' >> "$BOOTCONF"
echo 'DHCP_REQ_TIMEOUT=500' >> "$BOOTCONF"
echo 'NET_INSTALL_AT_POWER_ON=0' >> "$BOOTCONF"
if rpi-eeprom-config --apply "$BOOTCONF" 2>/dev/null; then
log "Boot order set to 0x21 (network first, then eMMC/SD); EEPROM update scheduled for next reboot"
BOOT_ORDER_SET=1
log "Boot order set to 0xf21 (eMMC first, then network, restart); EEPROM update scheduled for next reboot"
BOOT_ORDER_SET=1
else
log "WARNING: rpi-eeprom-config --apply failed; boot order unchanged"
fi
@@ -247,15 +253,20 @@ if [[ "$BOOT_ORDER_SET" -eq 0 ]] && command -v rpi-eeprom-config >/dev/null 2>&1
ONCE_SVC="/etc/systemd/system/set-cm4-boot-order-once.service"
cat > "$ONCE_SCRIPT" << 'SETBOOTEOF'
#!/bin/bash
# One-shot: set BOOT_ORDER=0x21 (network first) when rpi-eeprom-update becomes available (e.g. after CM4 enable and reboot).
# One-shot: set BOOT_ORDER=0xf21 (eMMC first, then network) when rpi-eeprom-update becomes available (e.g. after CM4 enable and reboot).
BOOTCONF="/tmp/eeprom-boot-order-once.txt"
if PEE="$(rpi-eeprom-update -l 2>/dev/null)" && [[ -n "$PEE" ]] && [[ -f "$PEE" ]]; then
rpi-eeprom-config "$PEE" > "$BOOTCONF" 2>/dev/null
if [[ -s "$BOOTCONF" ]]; then
sed -i 's/^BOOT_ORDER=.*/BOOT_ORDER=0x21/' "$BOOTCONF"
grep -q '^BOOT_ORDER=' "$BOOTCONF" || echo 'BOOT_ORDER=0x21' >> "$BOOTCONF"
sed -i 's/^BOOT_ORDER=.*/BOOT_ORDER=0xf21/' "$BOOTCONF"
grep -q '^BOOT_ORDER=' "$BOOTCONF" || echo 'BOOT_ORDER=0xf21' >> "$BOOTCONF"
sed -i '/^NET_BOOT_MAX_RETRIES=/d; /^DHCP_TIMEOUT=/d; /^DHCP_REQ_TIMEOUT=/d; /^TFTP_IP=/d; /^NET_INSTALL_AT_POWER_ON=/d' "$BOOTCONF"
echo 'NET_BOOT_MAX_RETRIES=3' >> "$BOOTCONF"
echo 'DHCP_TIMEOUT=1500' >> "$BOOTCONF"
echo 'DHCP_REQ_TIMEOUT=500' >> "$BOOTCONF"
echo 'NET_INSTALL_AT_POWER_ON=0' >> "$BOOTCONF"
if rpi-eeprom-config --apply "$BOOTCONF" 2>/dev/null; then
echo "Boot order set to 0x21 (network first, then eMMC/SD)"
echo "Boot order set to 0xf21 (eMMC first, then network)"
fi
fi
rm -f "$BOOTCONF"
@@ -267,7 +278,7 @@ SETBOOTEOF
chmod 755 "$ONCE_SCRIPT"
cat > "$ONCE_SVC" << 'SVCEOF'
[Unit]
Description=Set CM4 boot order once (network first)
Description=Set CM4 boot order once (eMMC first, then network)
After=multi-user.target
[Service]

View File

@@ -220,6 +220,15 @@ DEFAULT_STATUS = {
}
def _write_status(phase, message, progress=None):
try:
os.makedirs(os.path.dirname(STATUS_FILE) or ".", exist_ok=True)
with open(STATUS_FILE, "w") as f:
json.dump({"phase": phase, "message": message, "progress": progress, "updated": time.time()}, f)
except (PermissionError, OSError):
pass
def read_status():
try:
with open(STATUS_FILE, "r") as f:
@@ -555,6 +564,11 @@ def api_device_action():
d["action"] = action
d["action_at"] = time.time()
_save_network_devices(data)
ip = d.get("ip") or mac
if action == "deploy":
_write_status("flashing", f"Deploying to {ip} (network)...")
elif action == "backup":
_write_status("backup", f"Backing up {ip} (network)...")
return jsonify({"ok": True})
return jsonify({"ok": False, "error": "Device not found"}), 404
return jsonify({"ok": False, "error": "source must be 'usb' or 'network'"}), 400
@@ -630,7 +644,13 @@ def api_dhcp_network_boot_post():
def api_action_done():
"""Called by a device when deploy or backup has completed. Disables DHCP network-boot so the device boots from eMMC next time."""
mac = request.args.get("mac") or ((request.get_json(silent=True) or {}).get("mac") or "")
# Remove device from network-devices list so it doesn't keep showing
if mac:
data = _load_network_devices()
data["devices"] = [d for d in data.get("devices", []) if (d.get("mac") or "").lower() != mac.lower()]
_save_network_devices(data)
ok, _ = _dhcp_network_boot_run("disable")
_write_status("done", f"Done ({mac or 'network device'}). Network boot disabled; device will boot from eMMC on next boot.")
if not ok:
return jsonify({"ok": False, "error": "Could not disable DHCP network boot"}), 500
return jsonify({"ok": True, "message": "Network boot disabled; device will boot from eMMC on next boot"})
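The status writer added in this change can be tried on its own. The sketch below is a standalone variant, not the dashboard's code: the name `write_status` and its `path` parameter are hypothetical, and the temp-file-plus-`os.replace` step is an added assumption so that a reader polling the file does not see a half-written JSON document.

```python
import contextlib
import json
import os
import tempfile
import time


def write_status(path, phase, message, progress=None):
    """Best-effort status write; returns True on success, False on OS errors."""
    payload = {"phase": phase, "message": message,
               "progress": progress, "updated": time.time()}
    directory = os.path.dirname(path) or "."
    tmp = None
    try:
        os.makedirs(directory, exist_ok=True)
        # Write to a temp file in the same directory, then rename over the target.
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
        os.replace(tmp, path)
        return True
    except OSError:  # PermissionError is a subclass of OSError
        if tmp:
            with contextlib.suppress(OSError):
                os.unlink(tmp)
        return False


with tempfile.TemporaryDirectory() as workdir:
    status_path = os.path.join(workdir, "status.json")
    write_status(status_path, "flashing", "Deploying to 10.20.50.23 (network)...")
    with open(status_path) as f:
        print(json.load(f)["phase"])  # flashing
```

The IP address in the usage example is illustrative only. Whether the extra rename is worth it depends on how often the dashboard front end reads the file.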

View File

@@ -470,7 +470,7 @@
</ol>
<p class="help-sub">Network boot</p>
<ol class="steps-list">
<li><span class="num">1</span> Enable network boot (e.g. <code style="background:var(--bg-tertiary);padding:0.15rem 0.35rem;border-radius:4px;">BOOT_ORDER=0x21</code>) and ensure the device can reach this server.</li>
<li><span class="num">1</span> Enable network boot (e.g. <code style="background:var(--bg-tertiary);padding:0.15rem 0.35rem;border-radius:4px;">BOOT_ORDER=0xf21</code>) and ensure the device can reach this server.</li>
<li><span class="num">2</span> Boot with the provisioning client; it will show above. Choose <strong>Backup</strong> or <strong>Deploy</strong>.</li>
</ol>
</div>

View File

@@ -6,7 +6,7 @@ This describes the full flow from power-on to eMMC deploy/backup when using **ne
## Overview
1. **reTerminal** is set to try **network boot first** (EEPROM `BOOT_ORDER=0x21`).
1. **reTerminal** is set to try **eMMC first, then network** (EEPROM `BOOT_ORDER=0xf21`).
2. It is connected to the **same LAN as the LXC's eth1** (e.g. 10.20.50.0/24).
3. On power-on it gets an IP via **DHCP** and loads **boot files via TFTP** from the LXC.
4. The **netboot environment** (kernel + rootfs) runs **provisioning-client.sh**, which registers with the **dashboard** and polls for an action.
@@ -29,7 +29,7 @@ The **dashboard** (Flask) runs in the LXC and is reachable at e.g. `http://10.20
### 2. reTerminal (device)
- **EEPROM**: `BOOT_ORDER=0x21` (network first, then SD/eMMC). Can be set by cloud-init first-boot on an already-flashed device.
- **EEPROM**: `BOOT_ORDER=0xf21` (eMMC first, then network). Can be set by cloud-init first-boot on an already-flashed device.
- **Network**: Ethernet connected to the same segment as the LXC's **eth1** (e.g. same switch/VLAN as 10.20.50.0/24).
- On **power-on**:
1. Pi 4/CM4 firmware does **DHCP** on the wired interface.
@@ -93,7 +93,7 @@ So the “netboot environment” is either:
- **LXC**: eth1 = 10.20.50.1/24, dnsmasq (DHCP + TFTP on eth1; netboot options 66/67 in a separate snippet so they can be toggled), `/srv/tftpboot` with RPi 4 boot files, NAT for 10.20.50.0/24 via eth0. Toggle script **/opt/cm4-provisioning/toggle-network-boot-dhcp.sh** (enable/disable/status). Dashboard running, `golden.img` present for Deploy.
See **NETWORK-BOOT-LXC.md** and **setup-network-boot-on-lxc.sh**.
- **reTerminal**: EEPROM boot order = network first; Ethernet on 10.20.50.0/24; netboot environment that runs **provisioning-client.sh** with `PROVISIONING_SERVER=http://10.20.50.1:5000`.
- **reTerminal**: EEPROM boot order = eMMC first, then network; Ethernet on 10.20.50.0/24; netboot environment that runs **provisioning-client.sh** with `PROVISIONING_SERVER=http://10.20.50.1:5000`.
- **Netboot root**: Must provide network, curl, and the client script (NFS, initramfs, or custom root).
The **TFTP** setup only gets the Pi to boot a kernel (and optional root). The **provisioning** (Deploy/Backup) is done by that kernel's environment running the **network-client** against the dashboard on the LXC.

View File

@@ -67,7 +67,7 @@ Your current LXC already has eth0 (10.130.60.141) and eth1 (10.20.50.1); the set
## After setup: reTerminal network boot
1. Set the reTerminal **boot order** to try network first (e.g. `BOOT_ORDER=0x21`; see cloud-init/first-boot).
1. Set the reTerminal **boot order** to try eMMC first, then network (e.g. `BOOT_ORDER=0xf21`; see cloud-init/first-boot).
2. Connect the reTerminal to the **same network as the LXC's eth1** (e.g. 10.20.50.0/24).
3. Power on; it will get an IP via DHCP and load boot files via TFTP from the LXC.
4. For **provisioning** (Backup/Deploy), the netboot environment must run **network-client/provisioning-client.sh** with `PROVISIONING_SERVER=http://10.20.50.1:5000` so it talks to the dashboard on the LXC.
@@ -102,7 +102,7 @@ Each line is: *expiry_epoch MAC IP hostname client_id*. Example: `1734567890 aa:
## Testing network boot
1. **Prerequisites**
- reTerminal has **BOOT_ORDER=0x21** (network first). Check on the device:
- reTerminal has **BOOT_ORDER=0xf21** (eMMC first, then network). Check on the device:
`ssh pi@<device-ip> 'bash -s' < emmc-provisioning/scripts/check-network-boot-priority.sh`
- LXC network-boot options are **enabled**: on the LXC run
`/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status` → should print `enabled`. If not: `toggle-network-boot-dhcp.sh enable`

View File

@@ -48,7 +48,7 @@ Then **power off** the reTerminal and **power it on**. Watch where DHCP (and TFT
|--------|---------------|--------|
| No DHCP/TFTP on eth1 during boot; traffic only after OS | reTerminal on different segment than eth1 | Plug reTerminal into same VLAN/bridge as LXC eth1 (provisioning LAN) |
| DHCP on eth0 during boot, none on eth1 | reTerminal on same segment as eth0 | Move reTerminal to provisioning segment (same as eth1) |
| No DHCP on any interface during boot | Cable unplugged, BOOT_ORDER not 0x21, or device not attempting netboot | Check cable, confirm BOOT_ORDER=0x21, power cycle with cable in before power |
| No DHCP on any interface during boot | Cable unplugged, BOOT_ORDER not 0xf21, or device not attempting netboot | Check cable, confirm BOOT_ORDER=0xf21, power cycle with cable in before power |
---
@@ -75,7 +75,7 @@ So the device **is** on the right segment (eth1, 10.20.50.x). The problem is tha
- **Discover** (client 0.0.0.0 → broadcast) at the very start → that's the bootloader.
- **TFTP (port 69)** right after DHCP Ack → bootloader loading files.
2. If you **never** see Discover or TFTP, only Request/Reply after the OS is up, then the bootloader is either not attempting network boot or is giving up (e.g. link not ready, timeout) and booting from eMMC. Try a full power-off (mains or PSU), wait 10 s, then power on with tcpdump already running.
3. Confirm **BOOT_ORDER=0x21** on the device (network first) and that Ethernet is connected before power-on.
3. Confirm **BOOT_ORDER=0xf21** on the device (eMMC first, then network) and that Ethernet is connected before power-on.
---
@@ -112,7 +112,7 @@ and you **never** see a line about network (e.g. "Trying DHCP", "TFTP", or "Boot
1. **BOOT_ORDER not applied or not read**
From the running OS, confirm:
`sudo vcgencmd bootloader_config`
and check that `BOOT_ORDER=0x21` (and optionally `NET_BOOT_MAX_RETRIES`, `DHCP_TIMEOUT`, `TFTP_IP`). If you see different or missing values, the EEPROM config in use at boot may be different (e.g. old EEPROM, or update not applied on cold boot).
and check that `BOOT_ORDER=0xf21` (and optionally `NET_BOOT_MAX_RETRIES`, `DHCP_TIMEOUT`, `TFTP_IP`). If you see different or missing values, the EEPROM config in use at boot may be different (e.g. old EEPROM, or update not applied on cold boot).
2. **Network tried but failed before any DHCP**
The bootloader may try network, fail very early (e.g. no link, or timeout before sending DHCP), then fall back to SD without printing a “Trying network” line. Slower link-up (switch, cable) can cause this. Increasing `DHCP_TIMEOUT` and `NET_BOOT_MAX_RETRIES` (and setting `TFTP_IP`) gives the best chance.
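The comparison in check 1 can also be done programmatically. The sketch below assumes the bootloader config has already been captured as text (for example from `sudo vcgencmd bootloader_config`) and consists of `KEY=VALUE` lines; the function names and the sample text are hypothetical.

```python
def parse_bootloader_config(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments and [section] headers."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "[")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def config_mismatches(text: str, expected: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (actual, expected)} for every expected key that differs."""
    actual = parse_bootloader_config(text)
    return {
        key: (actual.get(key), want)
        for key, want in expected.items()
        if (actual.get(key) or "").lower() != want.lower()
    }


# Illustrative captured output, not taken from a real device.
sample = "[all]\nBOOT_UART=0\nBOOT_ORDER=0x21\n"
print(config_mismatches(sample, {"BOOT_ORDER": "0xf21"}))
# {'BOOT_ORDER': ('0x21', '0xf21')}
```

An empty result means every expected key matched. Additional keys such as `NET_BOOT_MAX_RETRIES` can be passed in `expected` when they matter for the test.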
@@ -123,21 +123,33 @@ and you **never** see a line about network (e.g. "Trying DHCP", "TFTP", or "Boot
**What to try:**
- Re-apply EEPROM config with network first and timeouts (as in NETWORK-BOOT-TROUBLESHOOTING), then **full power cycle** (unplug power 10+ s, then power on) with serial connected. Watch from the first character for any “NET”, “DHCP”, “TFTP” or “order” line.
- For a one-off test you can set `BOOT_ORDER=0x2` (network only). If network fails, the device won't boot (no fallback to SD). Use only to confirm whether the bootloader tries network and what it prints; then set back to `0x21`. If the full serial log never shows "NET", "DHCP", or "TFTP" and goes straight to "Boot mode: SD (01) order 2", trying `BOOT_ORDER=0x2` (network only) once will force a network attempt and should produce DHCP/TFTP messages on serial.
- For a one-off test you can set `BOOT_ORDER=0x2` (network only). If network fails, the device won't boot (no fallback to SD). Use only to confirm whether the bootloader tries network and what it prints; then set back to `0xf21`. If the full serial log never shows "NET", "DHCP", or "TFTP" and goes straight to "Boot mode: SD (01) order 2", trying `BOOT_ORDER=0x2` (network only) once will force a network attempt and should produce DHCP/TFTP messages on serial.
---
## Boot stops after start4.elf ("PCI0 reset" then nothing)
If the serial log shows **TFTP** for config.txt, start4.elf, fixup4.dat, then **"Starting start4.elf"**, **"Stopping network"**, **"PCI0 reset"**, and **no** TFTP requests for **kernel8.img** or **initrd.img**, the bootloader is not loading the kernel. That usually means **config.txt** in the TFTP root does not have the **kernel** and **initramfs** lines.
### What's actually going on
**Fix on the LXC:** ensure `/srv/tftpboot/config.txt` contains (and that `0d1ddbda/config.txt` is a symlink to it or has the same content):
The **EEPROM bootloader** only does TFTP for config.txt, start4.elf, and fixup4.dat. It then **starts the GPU firmware (start4.elf)** and **stops the network**. The **kernel and initrd are loaded by the GPU firmware**, not by the EEPROM: after “Starting start4.elf”, the GPU is supposed to bring the network back up and TFTP kernel8.img, cmdline.txt, and initrd.img. If you never see TFTP for kernel8.img/initrd.img and the log stops at “PCI0 reset”, the GPU stage is not doing that. Common causes:
1. **Config not seen by the GPU** — The config the EEPROM fetched (e.g. from `0d1ddbda/config.txt`) must contain `kernel=kernel8.img` and `initramfs initrd.img followkernel`. If that file was a symlink or truncated, the GPU may not see those lines. Use a **real copy** of the full config in the serial dir (see ensure script below).
2. **No visibility into the GPU** — The EEPROM logs stop at “PCI0 reset”; the next step is inside the GPU firmware. To see GPU messages (e.g. network bring-up, TFTP, or errors), add **`uart_2ndstage=1`** to config.txt so the GPU logs to the UART. Then power-cycle and watch for lines like `MESS:... genet: LINK STATUS` or TFTP activity.
3. **Firmware/board quirk** — On some boards or firmware versions the GPU netboot path can fail silently. Ensuring the latest Pi 4/CM4 boot files in the TFTP root and trying **start4cd.elf** + **fixup4cd.dat** (or leaving defaults) is worth a try.
If the serial log shows **TFTP** for config.txt, start4.elf, fixup4.dat, then **"Starting start4.elf"**, **"Stopping network"**, **"PCI0 reset"**, and **no** TFTP requests for **kernel8.img** or **initrd.img**, use the checks below.
**Fix on the LXC:** ensure `/srv/tftpboot/config.txt` contains (and that `0d1ddbda/config.txt` is a real copy with the same content):
```ini
enable_uart=1
kernel=kernel8.img
initramfs initrd.img followkernel
uart_2ndstage=1
```
`enable_uart=1` is required for the kernel serial console when netbooting (otherwise the firmware can set 8250.nr_uarts=0). `uart_2ndstage=1` makes the GPU firmware log to the UART so you see **MESS:** lines after "PCI0 reset" (e.g. network bring-up, TFTP, or errors).
You can run:
```bash
@@ -147,6 +159,30 @@ ssh root@<LXC-IP> 'bash -s' < emmc-provisioning/scripts/ensure-tftpboot-config-k
Also ensure the TFTP root has **kernel8.img** and **initrd.img** (and the serial subdir has symlinks or copies). Then power-cycle the device; you should see TFTP_GET for kernel8.img and initrd.img, then the kernel and initramfs (e.g. rescue shell or provisioning client) run.
**If it still stops after “PCI0 reset”:**
- Add **`uart_2ndstage=1`** to the TFTP config.txt (root and serial copy). Re-run the ensure script so the serial dir gets the updated config, then power-cycle. Watch the serial log for **MESS:** lines from the GPU (e.g. `genet: LINK STATUS`, TFTP, or errors). That shows whether the GPU is bringing the network up and trying to load the kernel.
- On the LXC, confirm the config the device gets has the right size and content:
`ssh root@<LXC-IP> 'wc -c /srv/tftpboot/0d1ddbda/config.txt && grep -E "kernel|initramfs|uart_2ndstage" /srv/tftpboot/0d1ddbda/config.txt'`
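The content check can be sketched in Python as well. This assumes the text of the config the device actually fetches (for example `/srv/tftpboot/0d1ddbda/config.txt`) has already been read into a string; `REQUIRED_LINES` and `missing_config_lines` are hypothetical names, and the required lines are the four listed in this section.

```python
# Lines this section says must be present in the TFTP config.txt.
REQUIRED_LINES = (
    "enable_uart=1",
    "kernel=kernel8.img",
    "initramfs initrd.img followkernel",
    "uart_2ndstage=1",
)


def missing_config_lines(config_text: str) -> list[str]:
    """Return the required lines that do not appear as whole lines in the config."""
    present = {line.strip() for line in config_text.splitlines()}
    return [required for required in REQUIRED_LINES if required not in present]


sample = "enable_uart=1\nkernel=kernel8.img\n"
print(missing_config_lines(sample))
# ['initramfs initrd.img followkernel', 'uart_2ndstage=1']
```

Matching whole lines rather than substrings avoids treating a commented-out entry such as `#kernel=kernel8.img` as present.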
---
## Kernel loads but serial stops at "Baud rate change done" (no rescue shell)
If you see the GPU load kernel8.img and initrd.img, then **"Baud rate change done..."** and nothing else (no rescue shell, no kernel messages), the kernel is likely hanging very early because of a **missing or invalid Device Tree**. The GPU log may show **`dterror: Failed to load Device Tree file '?'`**.
The GPU loads files from the **serial-prefix** dir (e.g. `0d1ddbda/`). If the **.dtb** files (e.g. `bcm2711-rpi-cm4.dtb`, `bcm2711-rpi-cm4-io.dtb`) are only in the TFTP root and not in that dir, the firmware can fail to load the right DTB and the kernel gets no valid device tree.
**Fix:** Ensure the TFTP root has the Pi 4/CM4 DTB files (from the [Raspberry Pi firmware](https://github.com/raspberrypi/firmware) `boot/` folder) and that each **serial-prefix** dir has symlinks to them. Re-run the ensure script (it now links `*.dtb` into each serial dir):
```bash
ssh root@<LXC-IP> 'bash -s' < emmc-provisioning/scripts/ensure-tftpboot-config-kernel-initrd.sh
```
If the TFTP root has no `*.dtb` files, populate it from the Pi firmware (e.g. run `populate-tftpboot-from-git.sh` or copy `bcm2711-rpi-cm4.dtb`, `bcm2711-rpi-cm4-io.dtb`, and other `bcm2711*.dtb` from the firmware repo into `/srv/tftpboot`), then run the ensure script again and power-cycle the device.
**Serial stops at "Baud rate change done" (no kernel/initramfs output):** On Pi 4/CM4 netboot, the firmware can force **8250.nr_uarts=0**, which disables the kernel serial driver so you get no console after the GPU handoff ([raspberrypi/firmware#1575](https://github.com/raspberrypi/firmware/issues/1575)). The workaround is **`enable_uart=1`** in config.txt (within the first 4KB). The ensure script adds it; re-run the script so the root and serial-prefix configs have it, then power-cycle. Keep serial at **115200** baud.
---
## TFTP "file .../SERIAL/start4.elf not found" — serial-number prefix
@@ -168,16 +204,17 @@ After that, the bootloader's first TFTP requests succeed. The device already h
## Stuck in network-only boot (BOOT_ORDER=0x2): get back to Raspbian and change boot order
If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never try eMMC. To get back to Raspbian and set **BOOT_ORDER=0x1** or **0x21**, use **rescue mode**: the network boot chain loads the provisioning initramfs; with a special kernel cmdline it drops to a shell so you can mount eMMC and run **rpi-eeprom-config** from the eMMC install.
If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never try eMMC. To fix the EEPROM config, use **rescue mode**: the network boot chain loads the Alpine-based provisioning initramfs which includes Python and `rpi-eeprom-config`; with a special kernel cmdline it drops to a shell so you can run `rpi-eeprom-config` directly from the initramfs (no chroot into eMMC needed).
### Prerequisites
- **Initramfs with rescue support** — Build the initramfs (it includes `/rescue-eeprom.sh`) and copy it to the LXC TFTP root and into the serial dir:
- **Initramfs with rescue support** — Build the Alpine-based initramfs (it includes `/rescue-eeprom.sh`, `rpi-eeprom-config`, and EEPROM firmware) and copy it to the LXC TFTP root and into the serial dir:
```bash
cd emmc-provisioning/network-boot-initramfs && ./build.sh
scp initrd.img root@<LXC-IP>:/srv/tftpboot/
ssh root@<LXC-IP> 'cp /srv/tftpboot/initrd.img /srv/tftpboot/0d1ddbda/ 2>/dev/null || true'
```
Building requires Docker or Podman with arm64 emulation (`qemu-user-static`).
- **TFTP config** — Ensure `/srv/tftpboot/config.txt` (and thus `0d1ddbda/config.txt` if it's a symlink) has `kernel=kernel8.img` and `initramfs initrd.img followkernel` so the full kernel+initrd chain runs.
### Steps
@@ -196,22 +233,23 @@ If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never
2. **Power on the reTerminal** (or reboot). It will network boot, load kernel + initramfs, and **rescue mode** will start a shell (serial or console). You should see:
`=== RESCUE MODE (provisioning_rescue=1) ===`
3. **In the rescue shell**, run the helper to mount eMMC and run the EEPROM config from the eMMC install:
3. **In the rescue shell**, run the rescue script. It automatically sets `BOOT_ORDER=0xf21` and writes the EEPROM update to the eMMC boot partition:
```bash
/rescue-eeprom.sh
```
In the editor that opens, set **BOOT_ORDER=0x1** (eMMC only) or **0x21** (network first, then eMMC). Save and exit the editor.
The script runs `rpi-eeprom-config` directly from the initramfs (no chroot, no dependency on the eMMC OS). It creates a `pieeprom.upd` file on the eMMC boot partition with the updated config. For manual editing instead, use `/rescue-eeprom.sh --edit`.
4. **Reboot** from the rescue shell:
```bash
reboot
```
The bootloader will apply the EEPROM update and on the next boot use the new order (eMMC only with 0x1, or network then eMMC with 0x21).
4. **Disable network boot and reboot** — The EEPROM update is only applied when the bootloader **boots from the same storage** where the update file was written. You wrote it to **eMMC**, so the bootloader must **boot from eMMC** once to apply it. With **BOOT_ORDER=0x2** (network only) the next reboot netboots again, so the bootloader never reads eMMC and the update is never applied. Do this **before** rebooting from the rescue shell:
- **On the LXC**, disable PXE so the next boot does not advertise TFTP:
`ssh root@<LXC-IP> '/opt/cm4-provisioning/toggle-network-boot-dhcp.sh disable'`
- Then **power cycle** the reTerminal (or run `reboot -f` / `echo b > /proc/sysrq-trigger` in the rescue shell). The bootloader will get DHCP without option 66/67; it may then try eMMC (depending on firmware) and apply the update. If it still netboots (e.g. cached TFTP), unplug the Ethernet cable and power cycle so it has no choice but eMMC.
5. **On the LXC**, restore normal cmdline for the device so the next network boot runs the provisioning client, not rescue:
5. **After you are back in Raspbian**, restore normal cmdline for the device so the next network boot runs the provisioning client, not rescue:
```bash
rm -f /srv/tftpboot/0d1ddbda/cmdline.txt
ln -s ../cmdline.txt /srv/tftpboot/0d1ddbda/cmdline.txt
./emmc-provisioning/scripts/disable-rescue-cmdline-on-lxc.sh root@<LXC-IP> 0d1ddbda
```
Or on the LXC: `rm -f /srv/tftpboot/0d1ddbda/cmdline.txt && ln -s ../cmdline.txt /srv/tftpboot/0d1ddbda/cmdline.txt`
**Why did my boot order not change?** The update file was written to the **eMMC** boot partition. The bootloader applies it only when it **boots from that partition**. When you rebooted, the device netbooted again (TFTP), so the bootloader read the “boot” files from the network, not from eMMC, and never saw or applied the update. Disable PXE (and optionally unplug Ethernet) before rebooting so the next boot is from eMMC and the update is applied.
See also **NETWORK-BOOT-LXC.md** for setup and monitoring.

44
emmc-provisioning/lxc/toggle-network-boot-dhcp.sh Normal file → Executable file
View File

@@ -1,29 +1,45 @@
#!/usr/bin/env bash
# Enable or disable DHCP network-boot options (option 66/67) on the provisioning LXC.
# Does not stop the DHCP server or TFTP; only stops advertising netboot so devices boot from local storage.
# Usage: toggle-network-boot-dhcp.sh enable | disable
# Run as root (or with sudo). Install to /opt/cm4-provisioning/toggle-network-boot-dhcp.sh
# Enable or disable network boot (PXE + TFTP) on the provisioning LXC.
# When disabled, TFTP is stopped and no boot server is advertised; DHCP still runs.
# Usage: toggle-network-boot-dhcp.sh enable | disable | status
# Run as root. Install to /opt/cm4-provisioning/toggle-network-boot-dhcp.sh
set -e
PXE_CONF="/etc/dnsmasq.d/network-boot-pxe.conf"
SNIPPET_CONTENT="# PXE options - do not edit; managed by toggle-network-boot-dhcp.sh
dhcp-option=66,10.20.50.1
dhcp-option=67,start4cd.elf
"
MAIN_CONF="/etc/dnsmasq.d/network-boot.conf"
# Remove enable-tftp / tftp-root from main config if present (legacy; these belong in PXE conf)
cleanup_main_conf() {
if [ -f "$MAIN_CONF" ] && grep -q 'enable-tftp\|tftp-root' "$MAIN_CONF" 2>/dev/null; then
sed -i '/^enable-tftp/d; /^tftp-root/d' "$MAIN_CONF"
fi
}
case "${1:-}" in
enable)
echo "$SNIPPET_CONTENT" > "$PXE_CONF"
systemctl reload dnsmasq 2>/dev/null || service dnsmasq reload 2>/dev/null || true
echo "Network boot (DHCP options) enabled."
cleanup_main_conf
cat > "$PXE_CONF" << 'EOF'
# PXE/network boot ENABLED - managed by toggle-network-boot-dhcp.sh
# TFTP server (only active when network boot is enabled)
enable-tftp
tftp-root=/srv/tftpboot
# BOOTP fields (siaddr = TFTP server, filename = boot file)
dhcp-boot=start4cd.elf,,10.20.50.1
# DHCP options 66/67 (some PXE clients prefer these)
dhcp-option=66,10.20.50.1
dhcp-option=67,start4cd.elf
EOF
systemctl restart dnsmasq 2>/dev/null || service dnsmasq restart 2>/dev/null || true
echo "Network boot enabled."
;;
disable)
cleanup_main_conf
rm -f "$PXE_CONF"
systemctl reload dnsmasq 2>/dev/null || service dnsmasq reload 2>/dev/null || true
echo "Network boot (DHCP options) disabled. Devices will get DHCP but boot from local storage."
systemctl restart dnsmasq 2>/dev/null || service dnsmasq restart 2>/dev/null || true
echo "Network boot disabled. DHCP still running but no TFTP or boot options."
;;
status)
if [ -f "$PXE_CONF" ]; then
if [ -f "$PXE_CONF" ] && grep -q 'enable-tftp' "$PXE_CONF" 2>/dev/null; then
echo "enabled"
else
echo "disabled"


@@ -1,33 +1,36 @@
# Provisioning initramfs for network boot
Minimal initramfs that runs **provisioning-client.sh** after bringing up the network. Used with Raspberry Pi 4 / CM4 (reTerminal) when booting via TFTP from the provisioning LXC.
Alpine Linux-based initramfs that runs **provisioning-client.sh** after bringing up the network. Used with Raspberry Pi 4 / CM4 (reTerminal) when booting via TFTP from the provisioning LXC.
Includes Python 3 and `rpi-eeprom-config` so EEPROM configuration can be modified directly from the initramfs without chrooting into eMMC.
## What it does
1. Mounts `/proc`, `/sys`, `/dev`, `/dev/pts`.
2. Ensures an IP (reuses kernel DHCP or runs `udhcpc` on eth0).
2. Brings up eth0 and obtains a DHCP lease via `udhcpc`.
3. Runs the provisioning client with `PROVISIONING_SERVER` (default `http://10.20.50.1:5000`, overridable via kernel cmdline).
4. The client registers with the dashboard and polls for **Deploy** or **Backup**; on action it performs the dd + curl and exits.
4. The client registers with the dashboard and polls for **Deploy** or **Backup**; on action it performs the dd + curl and reboots.
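
The cmdline override from step 3 can be sketched as follows. This is an illustration only: `parse_provisioning_server` is a hypothetical name, the real `/init` may parse the cmdline differently, and `192.0.2.10` is a documentation address used as a placeholder.

```bash
#!/bin/sh
# Hypothetical helper: pick provisioning_server=... out of a kernel cmdline
# string, falling back to the default documented above.
parse_provisioning_server() {
    server="http://10.20.50.1:5000"
    # Unquoted $1 is split on whitespace, one kernel argument per iteration.
    for arg in $1; do
        case "$arg" in
            provisioning_server=*) server="${arg#provisioning_server=}" ;;
        esac
    done
    echo "$server"
}

parse_provisioning_server "console=serial0,115200 provisioning_server=http://192.0.2.10:5000"
# prints http://192.0.2.10:5000
parse_provisioning_server "console=serial0,115200"
# prints http://10.20.50.1:5000
```

In the initramfs the argument would be the content of `/proc/cmdline`, for example `parse_provisioning_server "$(cat /proc/cmdline)"`.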
## Build
**On x86_64 (e.g. your laptop):** the script uses **Podman** or **Docker** with `--platform linux/arm64` to run an arm64 container and copy busybox + curl into the initramfs. Your host must be able to *run* arm64 containers (via QEMU emulation).
The build script uses Docker or Podman with `--platform linux/arm64` to create an Alpine aarch64 rootfs with Python 3, curl, and `rpi-eeprom-config`. Your host must support arm64 containers via QEMU emulation.
- **Fedora:** one-time setup to enable arm64 containers:
### Prerequisites
- **Docker** or **Podman** installed
- **arm64 emulation** (QEMU user-static):
```bash
# Fedora
sudo dnf install -y qemu-user-static
```
Then run the build (Podman will use QEMU automatically):
```bash
cd emmc-provisioning/network-boot-initramfs
./build.sh
```
- If you don't install `qemu-user-static`, the script will fail with an error and print the same instructions and an alternative (build on a Pi).
**On a Raspberry Pi 4 or other aarch64 host:** no Docker. Install deps and run:
# Debian/Ubuntu
sudo apt install -y qemu-user-static
```
### Build the initramfs
```bash
sudo apt install -y busybox curl
cd emmc-provisioning/network-boot-initramfs
./build.sh
```
@@ -37,6 +40,8 @@ Optional: pass an output path:
./build.sh /path/to/initrd.img
```
The resulting `initrd.img` is approximately 25-35 MB compressed (Alpine base + Python + EEPROM firmware).
## Deploy to TFTP root
1. Copy **initrd.img** to the LXC TFTP root (e.g. `/srv/tftpboot`):
@@ -51,7 +56,7 @@ Optional: pass an output path:
initramfs initrd.img followkernel
```
So the firmware loads the kernel and then the initrd that follows it. The Pi will boot the kernel and run `/init` from the initrd.
So the firmware loads the kernel and then the initrd that "follows" it. The Pi will boot the kernel and run `/init` from the initrd.
3. If your DHCP already points the Pi to this TFTP server and `start4cd.elf`, the Pi will load kernel + initrd from the same root. No NFS or extra server needed.
@@ -67,15 +72,43 @@ The init script reads `provisioning_server=` from `/proc/cmdline` and exports `P
### Rescue mode (stuck in network-only boot)
If the device has **BOOT_ORDER=0x2** (network only), it never boots from eMMC. To get a shell and change boot order using the eMMC's **rpi-eeprom-config**, add **provisioning_rescue=1** to the kernel cmdline (e.g. in the TFTP-served `cmdline.txt` for that device). The initramfs will then start an interactive shell instead of the provisioning client. Run **/rescue-eeprom.sh** to mount eMMC and chroot to run `rpi-eeprom-config --edit`; set `BOOT_ORDER=0x1` or `0x21`, save, then `reboot`. See **docs/NETWORK-BOOT-TROUBLESHOOTING.md** (“Stuck in network-only boot”) for full steps.
If the device has **BOOT_ORDER=0x2** (network only), it never boots from eMMC. To fix the EEPROM config, add **provisioning_rescue=1** to the kernel cmdline (e.g. in the TFTP-served `cmdline.txt` for that device). The initramfs will start an interactive shell instead of the provisioning client.
Run **/rescue-eeprom.sh** to set `BOOT_ORDER=0xf21` directly from the initramfs. The script:
1. Reads the current EEPROM config using `rpi-eeprom-config` (included in the initramfs)
2. Creates a modified config with `BOOT_ORDER=0xf21` and tuned network timeouts
3. Embeds the config into a new EEPROM image using the bundled `pieeprom.bin` firmware
4. Copies the update (`pieeprom.upd` + `pieeprom.sig`) to the eMMC boot partition
No chroot, no EDITOR hack, no dependency on the eMMC OS.
After running the script, disable network boot on the LXC and reboot so the bootloader boots from eMMC and applies the update.
See **docs/NETWORK-BOOT-TROUBLESHOOTING.md** ("Stuck in network-only boot") for full steps.
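
To see why `0xf21` means "eMMC first, then network, then restart", it helps to decode the value: the bootloader reads `BOOT_ORDER` one hex digit at a time, from right to left. The sketch below is a hypothetical helper (`decode_boot_order` is not part of this repository) and the mode labels are informal names chosen for this example.

```bash
#!/bin/sh
# Hypothetical helper: decode a BOOT_ORDER value into the sequence of boot
# modes the bootloader tries. Digits are consumed from the right-hand end.
decode_boot_order() {
    digits=${1#0x}
    out=""
    while [ -n "$digits" ]; do
        rest=${digits%?}          # everything except the last digit
        last=${digits#"$rest"}    # the last digit
        digits=$rest
        case "$last" in
            1) mode="sd/emmc" ;;
            2) mode="network" ;;
            3) mode="rpiboot" ;;
            4) mode="usb-msd" ;;
            5) mode="bcm-usb-msd" ;;
            6) mode="nvme" ;;
            7) mode="http" ;;
            e|E) mode="stop" ;;
            f|F) mode="restart" ;;
            *) mode="unknown($last)" ;;
        esac
        out="${out:+$out -> }$mode"
    done
    echo "$out"
}

decode_boot_order 0xf21   # prints: sd/emmc -> network -> restart
decode_boot_order 0x2     # prints: network
```

The second call shows the stuck case: with `0x2` there is no digit for SD/eMMC at all, so the bootloader never looks at the eMMC boot partition.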
## What's included in the initramfs
| Component | Purpose |
|-----------|---------|
| Alpine Linux base | Minimal rootfs with `apk` package manager |
| BusyBox | Core Unix utilities (sh, mount, ip, udhcpc, dd, etc.) |
| Python 3 | Required by `rpi-eeprom-config` |
| curl | HTTP client for provisioning dashboard API |
| rpi-eeprom-config | EEPROM configuration tool (from rpi-eeprom repo) |
| pieeprom.bin | EEPROM firmware image (for creating update files) |
| init | Boot script: mounts fs, DHCP, rescue or provision |
| provisioning-client.sh | Registers with dashboard, executes deploy/backup |
| rescue-eeprom.sh | Sets EEPROM boot order directly |
| udhcpc.script | Applies DHCP lease (IP, route, DNS) |
## Flow summary
1. Pi does DHCP → gets IP and TFTP server (e.g. 10.20.50.1).
1. Pi does DHCP -> gets IP and TFTP server (e.g. 10.20.50.1).
2. Pi loads via TFTP: start4cd.elf, fixup4cd.dat, config.txt, cmdline.txt, kernel8.img, **initrd.img**.
3. Kernel boots with initrd as root; runs `/init`.
4. Init mounts minimal fs, ensures network, runs `/provisioning-client.sh`.
5. Client registers and polls; you choose Deploy or Backup in the dashboard; client runs dd + curl and exits.
6. After deploy, power cycle the Pi so it boots from eMMC.
5. Client registers and polls; you choose Deploy or Backup in the dashboard; client runs dd + curl and reboots.
6. After deploy, device boots from eMMC.
See **docs/NETWORK-BOOT-DEPLOYMENT-FLOW.md** for the full deployment flow.


@@ -2,163 +2,132 @@
# Build provisioning initramfs for Raspberry Pi 4 / CM4 (aarch64).
# Produces initrd.img (gzip cpio) for TFTP boot (config.txt: initramfs initrd.img followkernel).
#
# On x86_64: tries Docker/Podman with --platform linux/arm64; if that fails (no
# arm64 emulation), downloads prebuilt static aarch64 busybox and curl. No sudo needed.
# On aarch64 (e.g. Raspberry Pi): uses local busybox and curl if installed.
# Uses an Alpine Linux aarch64 base with Python 3 so rpi-eeprom-config can run
# directly in the initramfs (no chroot into eMMC needed for rescue).
#
# Requires Docker or Podman with arm64 emulation:
# Fedora: sudo dnf install -y qemu-user-static
# Debian/Ubuntu: sudo apt install -y qemu-user-static
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
OUTPUT="${1:-$SCRIPT_DIR/initrd.img}"
BUILD_DIR=$(mktemp -d)
trap "rm -rf $BUILD_DIR" EXIT
ALPINE_VERSION="3.22"
RPI_EEPROM_REPO="https://github.com/raspberrypi/rpi-eeprom"
RPI_EEPROM_RAW="https://raw.githubusercontent.com/raspberrypi/rpi-eeprom"
echo "Build dir: $BUILD_DIR"
# Layout: /init, /provisioning-client.sh, /bin/busybox, /bin/sh, /usr/bin/curl, /lib/*.so
mkdir -p "$BUILD_DIR"/{bin,usr/bin,proc,sys,dev,dev/pts,lib,mnt}
cp "$SCRIPT_DIR/init" "$BUILD_DIR/init"
cp "$SCRIPT_DIR/provisioning-client.sh" "$BUILD_DIR/provisioning-client.sh"
cp "$SCRIPT_DIR/rescue-eeprom.sh" "$BUILD_DIR/rescue-eeprom.sh"
chmod +x "$BUILD_DIR/init" "$BUILD_DIR/provisioning-client.sh" "$BUILD_DIR/rescue-eeprom.sh"
ARCH=$(uname -m 2>/dev/null)
if [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ] || [ "$ARCH" = "armv8l" ]; then
if ! command -v busybox >/dev/null 2>&1 || ! command -v curl >/dev/null 2>&1; then
echo "On aarch64: install busybox and curl (apt install busybox curl) then re-run."
exit 1
# Find container runtime
CONTAINER_RUNTIME=""
for cmd in podman docker; do
if command -v "$cmd" >/dev/null 2>&1; then
CONTAINER_RUNTIME="$cmd"
break
fi
echo "Copying busybox and curl from host (aarch64)..."
cp "$(command -v busybox)" "$BUILD_DIR/bin/busybox"
cp "$(command -v curl)" "$BUILD_DIR/usr/bin/curl"
chmod +x "$BUILD_DIR/bin/busybox" "$BUILD_DIR/usr/bin/curl"
for f in $(ldd "$(command -v curl)" 2>/dev/null | awk '/=>/{print $3}'); do
[ -f "$f" ] && cp "$f" "$BUILD_DIR/lib/"
done
cp /lib/ld-linux-aarch64.so.1 "$BUILD_DIR/lib/" 2>/dev/null || true
else
# x86_64: try container first; on "Exec format error" (no arm64 emulation) use static downloads
CONTAINER_RUNTIME=""
for cmd in docker podman; do
if command -v "$cmd" >/dev/null 2>&1; then
CONTAINER_RUNTIME="$cmd"
break
fi
done
CONTAINER_OK=0
if [ -n "$CONTAINER_RUNTIME" ]; then
echo "Trying $CONTAINER_RUNTIME (linux/arm64)..."
CNT_NAME="cm4-initramfs-build-$$"
# Use a container-internal dir (no bind mount); copy out with podman cp (works with rootless)
$CONTAINER_RUNTIME run --name "$CNT_NAME" --platform linux/arm64 debian:bookworm-slim bash -c '
apt-get update -qq && apt-get install -y -qq busybox curl
mkdir -p /out/bin /out/usr/bin /out/lib
cp /bin/busybox /out/bin/busybox
cp /usr/bin/curl /out/usr/bin/curl
chmod +x /out/bin/busybox /out/usr/bin/curl
ldd /usr/bin/curl 2>/dev/null | awk "/=>/{print \$3}" | while read f; do [ -n "$f" ] && [ -f "$f" ] && cp "$f" /out/lib/; done
cp /lib/ld-linux-aarch64.so.1 /out/lib/ 2>/dev/null || true
' 2>/dev/null || true
# Always copy from container (avoids rootless bind-mount issues)
$CONTAINER_RUNTIME cp "$CNT_NAME:/out/bin/busybox" "$BUILD_DIR/bin/" 2>/dev/null && \
$CONTAINER_RUNTIME cp "$CNT_NAME:/out/usr/bin/curl" "$BUILD_DIR/usr/bin/" 2>/dev/null && \
$CONTAINER_RUNTIME cp "$CNT_NAME:/out/lib/." "$BUILD_DIR/lib/" 2>/dev/null || true
$CONTAINER_RUNTIME rm -f "$CNT_NAME" >/dev/null 2>&1
if [ -f "$BUILD_DIR/bin/busybox" ] && [ -f "$BUILD_DIR/usr/bin/curl" ]; then
CONTAINER_OK=1
echo "Container build succeeded."
fi
fi
if [ "$CONTAINER_OK" -ne 1 ]; then
echo "Using prebuilt static aarch64 binaries (no container/emulation needed)..."
DOWNLOAD_DIR=$(mktemp -d)
trap "rm -rf $BUILD_DIR $DOWNLOAD_DIR" EXIT
if command -v curl >/dev/null 2>&1; then
GET="curl -sL"
GET_O="curl -sL -o"
else
GET="wget -q -O -"
GET_O="wget -q -O"
fi
# Static busybox aarch64: try busybox.net, then Alpine busybox-static package
BB_OK=0
$GET_O "$DOWNLOAD_DIR/busybox" "https://busybox.net/downloads/binaries/1.35.0-defconfig-multiarch-musl/busybox-armv8l" 2>/dev/null || true
if [ -f "$DOWNLOAD_DIR/busybox" ] && [ -s "$DOWNLOAD_DIR/busybox" ]; then
BB_OK=1
fi
if [ "$BB_OK" -ne 1 ]; then
echo "Trying Alpine busybox-static aarch64..."
$GET_O "$DOWNLOAD_DIR/bb.apk" "https://dl-cdn.alpinelinux.org/alpine/v3.19/main/aarch64/busybox-static-1.36.1-r11.apk" 2>/dev/null || true
if [ -f "$DOWNLOAD_DIR/bb.apk" ] && [ -s "$DOWNLOAD_DIR/bb.apk" ]; then
case "$(head -c 4 "$DOWNLOAD_DIR/bb.apk" | od -An -tx1 2>/dev/null | tr -d ' ')" in
3c21444f|3c68746d) ;; # HTML response, skip extract
*)
(cd "$DOWNLOAD_DIR" && (tar xf bb.apk 2>/dev/null || tar xzf bb.apk 2>/dev/null) && [ -f data.tar.gz ] && tar xzf data.tar.gz 2>/dev/null)
;;
esac
fi
if [ -f "$DOWNLOAD_DIR/bin/busybox" ] && [ -s "$DOWNLOAD_DIR/bin/busybox" ]; then
cp "$DOWNLOAD_DIR/bin/busybox" "$DOWNLOAD_DIR/busybox"
BB_OK=1
fi
fi
if [ "$BB_OK" -ne 1 ]; then
echo "Failed to download busybox (x86 host cannot run arm64 container without emulation)."
echo ""
echo "Option A - Enable arm64 containers (one-time, needs sudo):"
echo " Fedora: sudo dnf install -y qemu-user-static"
echo " Then re-run this script (Podman will use QEMU to run the arm64 build)."
echo ""
echo "Option B - Build on a Raspberry Pi 4 (aarch64):"
echo " scp -r $(dirname "$SCRIPT_DIR") pi@<pi-ip>:~/ && ssh pi@<pi-ip> 'cd ~/emmc-provisioning/network-boot-initramfs && sudo apt install -y busybox curl && ./build.sh'"
echo " Then scp pi@<pi-ip>:~/emmc-provisioning/network-boot-initramfs/initrd.img ."
exit 1
fi
chmod +x "$DOWNLOAD_DIR/busybox"
cp "$DOWNLOAD_DIR/busybox" "$BUILD_DIR/bin/busybox"
# Static curl aarch64 glibc (Raspberry Pi OS uses glibc)
$GET "https://github.com/stunnel/static-curl/releases/download/8.18.0/curl-linux-aarch64-glibc-8.18.0.tar.xz" -o "$DOWNLOAD_DIR/curl.tar.xz" || true
if [ ! -f "$DOWNLOAD_DIR/curl.tar.xz" ] || [ ! -s "$DOWNLOAD_DIR/curl.tar.xz" ]; then
echo "Failed to download static curl."
exit 1
fi
(cd "$DOWNLOAD_DIR" && tar xf curl.tar.xz)
CURL_BIN=$(find "$DOWNLOAD_DIR" -maxdepth 3 -name "curl" -type f 2>/dev/null | head -1)
if [ -n "$CURL_BIN" ] && [ -x "$CURL_BIN" ]; then
cp "$CURL_BIN" "$BUILD_DIR/usr/bin/curl"
chmod +x "$BUILD_DIR/usr/bin/curl"
else
echo "Could not find curl binary in tarball."
exit 1
fi
rm -rf "$DOWNLOAD_DIR"
trap "rm -rf $BUILD_DIR" EXIT
fi
fi
# Verify we have busybox (container or fallback must have left it)
if [ ! -f "$BUILD_DIR/bin/busybox" ] || [ ! -s "$BUILD_DIR/bin/busybox" ]; then
echo "Error: busybox not found in $BUILD_DIR/bin. If the container ran, check Podman volume mount."
done
if [ -z "$CONTAINER_RUNTIME" ]; then
echo "Error: Docker or Podman is required to build the Alpine-based initramfs."
echo "Install one of them and ensure arm64 emulation is available:"
echo " Fedora: sudo dnf install -y podman qemu-user-static"
echo " Debian/Ubuntu: sudo apt install -y podman qemu-user-static"
exit 1
fi
chmod +x "$BUILD_DIR/bin/busybox" 2>/dev/null || true
# Busybox applets we need (sh, mount, udhcpc, etc.)
cd "$BUILD_DIR/bin"
./busybox --list 2>/dev/null | while read applet; do
case "$applet" in
sh|ash|mount|umount|mkdir|cat|ip|udhcpc|sleep|echo|grep|cut|awk|hostname|dd|reboot) ln -sf busybox "$applet"; ;;
esac
done
[ -e sh ] || ln -sf busybox sh
echo "Using $CONTAINER_RUNTIME to build Alpine aarch64 initramfs..."
# Revision shown on serial so you can confirm the device is running the latest initrd
REV=$(date +%Y%m%d-%H%M 2>/dev/null || echo "unknown")
if [ -d "$SCRIPT_DIR/../.git" ] || [ -d "$SCRIPT_DIR/../../.git" ]; then
REV="${REV}-$(git -C "$SCRIPT_DIR" rev-parse --short HEAD 2>/dev/null || echo 'nogit')"
fi
CNT_NAME="cm4-initramfs-build-$$"
trap "$CONTAINER_RUNTIME rm -f $CNT_NAME >/dev/null 2>&1; true" EXIT
# Build the rootfs inside an arm64 Alpine container
$CONTAINER_RUNTIME run --name "$CNT_NAME" --platform linux/arm64 \
"alpine:${ALPINE_VERSION}" /bin/sh -c "
set -e
# Install packages: Python for rpi-eeprom-config, curl for provisioning client, coreutils for dd
apk add --no-cache python3 curl busybox coreutils
# Download rpi-eeprom tools from GitHub (use curl -sL which follows redirects; busybox wget does not)
curl -sL -o /usr/bin/rpi-eeprom-config \
'${RPI_EEPROM_RAW}/master/rpi-eeprom-config'
curl -sL -o /usr/bin/rpi-eeprom-update \
'${RPI_EEPROM_RAW}/master/rpi-eeprom-update'
curl -sL -o /usr/bin/rpi-eeprom-digest \
'${RPI_EEPROM_RAW}/master/rpi-eeprom-digest'
chmod +x /usr/bin/rpi-eeprom-config /usr/bin/rpi-eeprom-update /usr/bin/rpi-eeprom-digest
# Download latest stable CM4 (BCM2711) EEPROM firmware
mkdir -p /lib/firmware/raspberrypi/bootloader/default
# Use GitHub API to find the latest pieeprom-*.bin in firmware-2711/default/
LATEST_FW=\$(curl -sL 'https://api.github.com/repos/raspberrypi/rpi-eeprom/contents/firmware-2711/default' \
| grep -o '\"name\" *: *\"pieeprom-[^\"]*\\.bin\"' | sed 's/\"name\" *: *\"//;s/\"//' | sort | tail -1)
if [ -n \"\$LATEST_FW\" ]; then
echo \"Downloading EEPROM firmware: \$LATEST_FW\"
curl -sL -o /lib/firmware/raspberrypi/bootloader/default/pieeprom.bin \
\"${RPI_EEPROM_RAW}/master/firmware-2711/default/\$LATEST_FW\"
else
echo 'WARNING: Could not determine latest firmware; rescue-eeprom.sh may not work'
fi
# Create required directories
mkdir -p /proc /sys /dev/pts /mnt /tmp /run /usr/share/udhcpc /etc/default
# Clean up unnecessary files to reduce size
rm -rf /var/cache/apk/* /usr/share/man /usr/share/doc /usr/include \
/usr/lib/python*/test /usr/lib/python*/unittest /usr/lib/python*/idlelib \
/usr/lib/python*/tkinter /usr/lib/python*/ensurepip /usr/lib/python*/__pycache__/test* \
/usr/lib/python*/lib2to3 /usr/lib/python*/distutils \
/usr/share/terminfo/[a-s]* /usr/share/terminfo/[u-z]*
echo 'Alpine rootfs ready'
"
echo "Container build done. Copying scripts into container..."
# Copy our scripts into the container
$CONTAINER_RUNTIME cp "$SCRIPT_DIR/init" "$CNT_NAME:/init"
$CONTAINER_RUNTIME cp "$SCRIPT_DIR/provisioning-client.sh" "$CNT_NAME:/provisioning-client.sh"
$CONTAINER_RUNTIME cp "$SCRIPT_DIR/rescue-eeprom.sh" "$CNT_NAME:/rescue-eeprom.sh"
$CONTAINER_RUNTIME cp "$SCRIPT_DIR/udhcpc.script" "$CNT_NAME:/usr/share/udhcpc/default.script"
# Write revision.txt
echo "$REV" | $CONTAINER_RUNTIME cp /dev/stdin "$CNT_NAME:/revision.txt" 2>/dev/null || \
(echo "$REV" > /tmp/cm4-rev-$$.txt && $CONTAINER_RUNTIME cp /tmp/cm4-rev-$$.txt "$CNT_NAME:/revision.txt" && rm -f /tmp/cm4-rev-$$.txt)
# Set permissions
$CONTAINER_RUNTIME start "$CNT_NAME" >/dev/null 2>&1 || true
$CONTAINER_RUNTIME exec "$CNT_NAME" chmod +x /init /provisioning-client.sh /rescue-eeprom.sh /usr/share/udhcpc/default.script 2>/dev/null || \
echo "Note: could not chmod in stopped container; permissions set by cp"
# Export container filesystem and create cpio archive
echo "Exporting filesystem and building initrd.img..."
$CONTAINER_RUNTIME export "$CNT_NAME" | gzip -1 > /tmp/cm4-rootfs-$$.tar.gz
BUILD_DIR=$(mktemp -d)
trap "$CONTAINER_RUNTIME rm -f $CNT_NAME >/dev/null 2>&1; rm -rf $BUILD_DIR /tmp/cm4-rootfs-$$.tar.gz; true" EXIT
cd "$BUILD_DIR"
tar xzf /tmp/cm4-rootfs-$$.tar.gz
rm -f /tmp/cm4-rootfs-$$.tar.gz
# Remove container metadata that shouldn't be in initramfs
rm -rf .dockerenv .containerenv
# Ensure init is executable and at the root
chmod +x init provisioning-client.sh rescue-eeprom.sh usr/share/udhcpc/default.script 2>/dev/null || true
# Build cpio (gzip)
echo "Building cpio..."
( cd "$BUILD_DIR"; find . -print0 | cpio -o -H newc -0 2>/dev/null ) | gzip -9 > "$OUTPUT"
echo "Written: $OUTPUT ($(stat -c%s "$OUTPUT" 2>/dev/null || stat -f%z "$OUTPUT" 2>/dev/null) bytes)"
find . -print0 | cpio -o -H newc -0 2>/dev/null | gzip -9 > "$OUTPUT"
SIZE=$(stat -c%s "$OUTPUT" 2>/dev/null || stat -f%z "$OUTPUT" 2>/dev/null || echo "?")
echo ""
echo "Written: $OUTPUT ($SIZE bytes, $(( ${SIZE:-0} / 1048576 )) MB)"
echo ""
echo "Next: copy initrd.img to TFTP root (e.g. /srv/tftpboot on LXC) and in config.txt add:"
echo " initramfs initrd.img followkernel"
echo "Then ensure the kernel line loads the initrd (followkernel does that)."
echo "Default provisioning server: http://10.20.50.1:5000 (override with kernel cmdline: provisioning_server=http://...)"
echo "Default provisioning server: http://10.20.50.1:5000 (override with kernel cmdline: provisioning_server=http://...)"


@@ -1,12 +1,15 @@
#!/bin/sh
# Init for provisioning initramfs: bring up minimal env and run provisioning-client.sh.
# PROVISIONING_SERVER can be set via kernel cmdline: provisioning_server=http://10.20.50.1:5000
# Based on Alpine Linux aarch64 with Python 3 and rpi-eeprom-config.
set -e
export PATH=/bin:/usr/bin
export LD_LIBRARY_PATH=/lib
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export LD_LIBRARY_PATH=/lib:/usr/lib
echo "=== CM4 provisioning initramfs ==="
# Revision is set at build time; cat /revision.txt to confirm you have the latest initrd on TFTP
[ -f /revision.txt ] && echo "Revision: $(cat /revision.txt)" || echo "Revision: (none)"
# Minimal filesystem
mount -t proc none /proc
@@ -15,12 +18,28 @@ mount -t devtmpfs none /dev
mkdir -p /dev/pts
mount -t devpts none /dev/pts
# Kernel might have brought up eth0 via ip=dhcp; ensure we have an IP
if ! ip addr show | grep -q 'inet .* scope global'; then
# Bring up eth0 (bootloader used it for TFTP but kernel starts with it down)
echo "Bringing up eth0..."
ip link set lo up 2>/dev/null || true
ip link set eth0 up 2>/dev/null || true
# Wait for link (PHY negotiation takes a few seconds after ip link set up)
echo "Waiting for link..."
for _ in 1 2 3 4 5 6 7 8 9 10; do
ip link show eth0 2>/dev/null | grep -q 'LOWER_UP' && break
sleep 1
done
# Get DHCP lease (foreground with retries; -q exits after obtaining lease)
if ! ip addr show eth0 2>/dev/null | grep -q 'inet [0-9]'; then
echo "Getting DHCP lease..."
udhcpc -f -q -i eth0 -n 2>/dev/null || true
udhcpc -i eth0 -q -T 5 -t 5 -n -s /usr/share/udhcpc/default.script 2>&1 || echo "udhcpc failed (will retry)"
fi
# /tmp for client_ip (so client can read IP without running ip/awk)
mkdir -p /tmp
mount -t tmpfs none /tmp 2>/dev/null || true
# Allow kernel cmdline to override: provisioning_server=... and rescue mode
RESCUE=0
for arg in $(cat /proc/cmdline); do
@@ -34,11 +53,20 @@ export PROVISIONING_SERVER
if [ "$RESCUE" -eq 1 ]; then
echo "=== RESCUE MODE (provisioning_rescue=1) ==="
echo "Run /rescue-eeprom.sh to mount eMMC and change boot order (rpi-eeprom-config), then reboot."
echo "Or run /bin/sh for a shell."
echo "Run /rescue-eeprom.sh to set EEPROM boot order (runs rpi-eeprom-config directly), then reboot."
# Ensure shell I/O goes to serial console (some setups drop output otherwise)
[ -c /dev/console ] && exec </dev/console >/dev/console 2>&1
exec /bin/sh -i
fi
echo "Provisioning server: $PROVISIONING_SERVER"
# Capture eth0 IP; retry in case DHCP is still completing
for _ in 1 2 3 4 5; do
ip addr show dev eth0 2>/dev/null | awk '/inet [0-9]/ { print $2; exit }' | cut -d/ -f1 > /tmp/client_ip 2>/dev/null || true
[ -s /tmp/client_ip ] && break
echo "Waiting for IP on eth0..."
sleep 2
done
echo "Client IP: $(cat /tmp/client_ip 2>/dev/null || echo '(none)')"
echo "Running provisioning client..."
exec /bin/sh /provisioning-client.sh


@@ -13,7 +13,12 @@ get_mac() {
}
get_ip() {
hostname -I 2>/dev/null | awk '{print $1}' || echo ""
# Prefer IP captured by init; fallback to ip (match "inet 1.2.3.4/..." to skip inet6)
if [ -f /tmp/client_ip ] && [ -s /tmp/client_ip ]; then
cat /tmp/client_ip
return
fi
ip addr show dev eth0 2>/dev/null | awk '/inet [0-9]/ { print $2; exit }' | cut -d/ -f1
}
MAC=$(get_mac)
@@ -37,10 +42,14 @@ while true; do
sleep 10
continue
fi
curl -sL "$url" | dd of="$EMMC_DEV" bs=4M status=progress conv=fsync
curl -sL "$url" | dd of="$EMMC_DEV" bs=4M conv=fsync 2>&1
sync
echo "Deploy done. Disabling network boot on server so device boots from eMMC next time."
curl -s -X POST "$BASE_URL/api/action-done?mac=$MAC" || true
exit 0
echo "Rebooting in 3 seconds..."
sleep 3
reboot -f 2>/dev/null || echo b > /proc/sysrq-trigger
sleep 60
fi
if [ "$action" = "backup" ] && [ -n "$upload_url" ]; then
@@ -50,16 +59,20 @@ while true; do
sleep 10
continue
fi
dd if="$EMMC_DEV" bs=4M status=progress 2>/dev/null | curl -s -X POST -T - "$upload_url"
dd if="$EMMC_DEV" bs=4M 2>/dev/null | curl -s -X POST -T - "$upload_url"
sync
echo "Backup done. Disabling network boot on server."
curl -s -X POST "$BASE_URL/api/action-done?mac=$MAC" || true
exit 0
echo "Rebooting in 3 seconds..."
sleep 3
reboot -f 2>/dev/null || echo b > /proc/sysrq-trigger
sleep 60
fi
if [ "$action" = "reboot" ]; then
echo "Boot normally: rebooting..."
reboot -f 2>/dev/null || exec reboot 2>/dev/null || true
exit 0
echo "Rebooting..."
reboot -f 2>/dev/null || echo b > /proc/sysrq-trigger
sleep 60
fi
sleep 5

emmc-provisioning/network-boot-initramfs/rescue-eeprom.sh Normal file → Executable file

@@ -1,36 +1,138 @@
#!/bin/sh
# Rescue script: mount eMMC root and chroot to run rpi-eeprom-config.
# Use this when stuck in network-only boot (BOOT_ORDER=0x2) to set BOOT_ORDER=0x1 or 0x21.
# Rescue script: set EEPROM boot order directly from the initramfs (no chroot needed).
# Uses rpi-eeprom-config + pieeprom.bin bundled in the Alpine-based initramfs.
# Sets BOOT_ORDER=0xf21 (eMMC first, then network, restart) with fast network timeouts.
# Run from the initramfs rescue shell (after booting with provisioning_rescue=1 in cmdline).
# Pass --edit to open the editor manually instead of applying automatically.
set -e
ROOT="/mnt/emmc"
BOOT="$ROOT/boot/firmware"
[ -d "$ROOT/boot" ] && [ ! -d "$BOOT" ] && BOOT="$ROOT/boot"
EEPROM_FW="/lib/firmware/raspberrypi/bootloader/default/pieeprom.bin"
BOOT_MNT="/mnt/boot"
MANUAL=0
[ "$1" = "--edit" ] && MANUAL=1
echo "=== Mounting eMMC for EEPROM config ==="
# CM4 / reTerminal: eMMC is usually mmcblk0, p1=boot (FAT), p2=root (ext4)
if [ ! -b /dev/mmcblk0p2 ]; then
echo "No /dev/mmcblk0p2 found. Try: ls /dev/mmcblk*"
# Clean up any previous mounts
umount "$BOOT_MNT" 2>/dev/null || true
umount /mnt/emmc 2>/dev/null || true
# --- Read current EEPROM config ---
echo "=== EEPROM rescue (Alpine initramfs) ==="
if ! command -v rpi-eeprom-config >/dev/null 2>&1; then
echo "ERROR: rpi-eeprom-config not found in initramfs."
echo "This initramfs was not built with the Alpine build script."
exit 1
fi
mkdir -p "$ROOT"
mount /dev/mmcblk0p2 "$ROOT" || { echo "Mount root failed"; exit 1; }
if [ -b /dev/mmcblk0p1 ]; then
mkdir -p "$BOOT"
mount /dev/mmcblk0p1 "$BOOT" 2>/dev/null || true
if [ ! -f "$EEPROM_FW" ]; then
echo "ERROR: EEPROM firmware not found at $EEPROM_FW"
echo "Rebuild the initramfs with build.sh to include it."
exit 1
fi
mount -t proc none "$ROOT/proc"
mount -t sysfs none "$ROOT/sys"
mount --bind /dev "$ROOT/dev"
mount --bind /dev/pts "$ROOT/dev/pts" 2>/dev/null || true
if [ -x "$ROOT/usr/bin/rpi-eeprom-config" ]; then
echo "Chroot to eMMC and run: rpi-eeprom-config --edit"
echo "Set BOOT_ORDER=0x1 (eMMC only) or 0x21 (eMMC first, then network), save, then exit and run: reboot"
chroot "$ROOT" /usr/bin/rpi-eeprom-config --edit
else
echo "rpi-eeprom-config not found in eMMC. Chrooting anyway; run: apt install rpi-eeprom && rpi-eeprom-config --edit"
chroot "$ROOT" /bin/sh -i
echo "Reading current EEPROM config from running bootloader..."
CURRENT_CONF="/tmp/eeprom-current.conf"
rpi-eeprom-config 2>/dev/null > "$CURRENT_CONF" || true
if [ ! -s "$CURRENT_CONF" ]; then
echo "Could not read current EEPROM config via vcgencmd."
echo "Extracting config from firmware image instead..."
rpi-eeprom-config "$EEPROM_FW" > "$CURRENT_CONF" 2>/dev/null || true
fi
if [ ! -s "$CURRENT_CONF" ]; then
echo "ERROR: Could not read EEPROM config from either source."
exit 1
fi
echo "Current config:"
cat "$CURRENT_CONF"
echo ""
# --- Manual mode: mount eMMC boot, chroot, edit ---
if [ "$MANUAL" -eq 1 ]; then
echo "Manual mode: mounting eMMC for interactive editing..."
mkdir -p /mnt/emmc
mount /dev/mmcblk0p2 /mnt/emmc 2>/dev/null || true
BOOT="/mnt/emmc/boot/firmware"
[ ! -d "$BOOT" ] && BOOT="/mnt/emmc/boot"
if [ -b /dev/mmcblk0p1 ]; then
mkdir -p "$BOOT"
mount /dev/mmcblk0p1 "$BOOT" 2>/dev/null || true
fi
mount -t proc none /mnt/emmc/proc 2>/dev/null || true
mount -t sysfs none /mnt/emmc/sys 2>/dev/null || true
mount --bind /dev /mnt/emmc/dev 2>/dev/null || true
if [ -x /mnt/emmc/usr/bin/rpi-eeprom-config ]; then
chroot /mnt/emmc /usr/bin/rpi-eeprom-config --edit
else
echo "rpi-eeprom-config not found on eMMC. Dropping to shell."
chroot /mnt/emmc /bin/sh -i
fi
exit 0
fi
# --- Automatic mode: build new config and apply ---
NEW_CONF="/tmp/eeprom-new.conf"
# Keep settings we don't modify, strip the ones we replace
grep -v '^BOOT_ORDER=' "$CURRENT_CONF" \
| grep -v '^NET_BOOT_MAX_RETRIES=' \
| grep -v '^DHCP_TIMEOUT=' \
| grep -v '^DHCP_REQ_TIMEOUT=' \
| grep -v '^TFTP_IP=' \
| grep -v '^NET_INSTALL_AT_POWER_ON=' \
> "$NEW_CONF" || true
echo 'BOOT_ORDER=0xf21' >> "$NEW_CONF"
echo 'NET_BOOT_MAX_RETRIES=3' >> "$NEW_CONF"
echo 'DHCP_TIMEOUT=1500' >> "$NEW_CONF"
echo 'DHCP_REQ_TIMEOUT=500' >> "$NEW_CONF"
echo 'NET_INSTALL_AT_POWER_ON=0' >> "$NEW_CONF"
echo "New config to apply:"
cat "$NEW_CONF"
echo ""
# Create the modified EEPROM image with the new config embedded
EEPROM_OUT="/tmp/pieeprom.upd"
echo "Embedding config into EEPROM firmware image..."
rpi-eeprom-config --config "$NEW_CONF" --out "$EEPROM_OUT" "$EEPROM_FW"
if [ ! -f "$EEPROM_OUT" ] || [ ! -s "$EEPROM_OUT" ]; then
echo "ERROR: Failed to create modified EEPROM image."
exit 1
fi
# Generate the signature file (sha256 of the .upd, named .sig)
EEPROM_SIG="/tmp/pieeprom.sig"
sha256sum "$EEPROM_OUT" | awk '{print $1}' > "$EEPROM_SIG"
# Mount eMMC boot partition and copy the update files
echo "Mounting eMMC boot partition..."
if [ ! -b /dev/mmcblk0p1 ]; then
echo "ERROR: /dev/mmcblk0p1 not found. Is eMMC present?"
exit 1
fi
mkdir -p "$BOOT_MNT"
mount /dev/mmcblk0p1 "$BOOT_MNT" || { echo "ERROR: Could not mount boot partition"; exit 1; }
cp "$EEPROM_OUT" "$BOOT_MNT/pieeprom.upd"
cp "$EEPROM_SIG" "$BOOT_MNT/pieeprom.sig"
sync
echo ""
echo "=== EEPROM update written to eMMC boot partition ==="
echo " BOOT_ORDER=0xf21 (eMMC first, then network, restart)"
echo " NET_BOOT_MAX_RETRIES=3, DHCP_TIMEOUT=1500ms"
echo " Files: pieeprom.upd + pieeprom.sig on /dev/mmcblk0p1"
echo ""
echo "The bootloader will apply this update on next boot from eMMC."
echo ""
echo "Next steps:"
echo " 1. Disable network boot on the LXC (so next boot falls through to eMMC)"
echo " 2. Reboot: reboot -f (or: echo b > /proc/sysrq-trigger)"
umount "$BOOT_MNT" 2>/dev/null || true
rm -f "$CURRENT_CONF" "$NEW_CONF" "$EEPROM_OUT" "$EEPROM_SIG"
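
The "strip the keys we replace, then append the new values" step in automatic mode can be tried on its own. The sketch below is illustrative: `rewrite_eeprom_conf` is a hypothetical helper, it collapses the chain of `grep -v` calls into one extended regex, and the sample input is made up for the example.

```bash
#!/bin/sh
# Hypothetical helper: read an EEPROM config on stdin, drop the keys that
# will be replaced, then append the new values (same values as the script).
rewrite_eeprom_conf() {
    # grep exits non-zero when every input line is filtered out; ignore that.
    grep -Ev '^(BOOT_ORDER|NET_BOOT_MAX_RETRIES|DHCP_TIMEOUT|DHCP_REQ_TIMEOUT|TFTP_IP|NET_INSTALL_AT_POWER_ON)=' || true
    printf '%s\n' \
        'BOOT_ORDER=0xf21' \
        'NET_BOOT_MAX_RETRIES=3' \
        'DHCP_TIMEOUT=1500' \
        'DHCP_REQ_TIMEOUT=500' \
        'NET_INSTALL_AT_POWER_ON=0'
}

# Sample input (made up): a network-only config with an extra setting.
printf '[all]\nBOOT_UART=1\nBOOT_ORDER=0x2\nTFTP_IP=10.20.50.1\n' | rewrite_eeprom_conf
```

Unrelated settings such as `BOOT_UART=1` pass through untouched, while the old `BOOT_ORDER` and `TFTP_IP` lines are dropped before the new values are appended.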


@@ -0,0 +1,38 @@
#!/bin/sh
# Minimal udhcpc script: apply IP and default route when lease is obtained.
# udhcpc sets: $1=bound|renew|deconfig, $ip, $subnet (dotted), $router, $dns, $interface
mask2cidr() {
    # Convert dotted subnet (e.g. 255.255.255.0) to CIDR prefix (e.g. 24)
    _bits=0
    for _octet in $(echo "$1" | cut -d. -f1) $(echo "$1" | cut -d. -f2) $(echo "$1" | cut -d. -f3) $(echo "$1" | cut -d. -f4); do
        case "$_octet" in
            255) _bits=$((_bits+8)) ;; 254) _bits=$((_bits+7)) ;; 252) _bits=$((_bits+6)) ;;
            248) _bits=$((_bits+5)) ;; 240) _bits=$((_bits+4)) ;; 224) _bits=$((_bits+3)) ;;
            192) _bits=$((_bits+2)) ;; 128) _bits=$((_bits+1)) ;; 0) ;;
        esac
    done
    echo "$_bits"
}
case "$1" in
    deconfig)
        ip addr flush dev "$interface" 2>/dev/null
        ;;
    bound|renew)
        CIDR=$(mask2cidr "${subnet:-255.255.255.0}")
        ip addr flush dev "$interface" 2>/dev/null
        ip addr add "$ip/$CIDR" dev "$interface"
        if [ -n "$router" ]; then
            for r in $router; do
                ip route add default via "$r" dev "$interface" 2>/dev/null
            done
        fi
        if [ -n "$dns" ]; then
            : > /etc/resolv.conf
            for d in $dns; do
                echo "nameserver $d" >> /etc/resolv.conf
            done
        fi
        ;;
esac
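Reviewer note: mask2cidr adds up the set bits of each octet of the dotted netmask. The sketch below performs the same per-octet lookup but splits the mask with IFS instead of four `cut` calls, and shows two sample conversions. The name `mask_to_prefix` is hypothetical; this is an illustration, not a proposed change to the script.

```shell
#!/bin/sh
# Hypothetical variant of mask2cidr: split the mask on dots via IFS,
# then add the number of set bits contributed by each octet.
mask_to_prefix() {
    _bits=0
    _old_ifs=$IFS
    IFS=.
    set -- $1          # positional parameters become the four octets
    IFS=$_old_ifs
    for _octet in "$@"; do
        case "$_octet" in
            255) _bits=$((_bits + 8)) ;;
            254) _bits=$((_bits + 7)) ;;
            252) _bits=$((_bits + 6)) ;;
            248) _bits=$((_bits + 5)) ;;
            240) _bits=$((_bits + 4)) ;;
            224) _bits=$((_bits + 3)) ;;
            192) _bits=$((_bits + 2)) ;;
            128) _bits=$((_bits + 1)) ;;
            0) ;;
        esac
    done
    echo "$_bits"
}

mask_to_prefix 255.255.255.0    # prints 24
mask_to_prefix 255.255.252.0    # prints 22
```

Like the original, this assumes a well-formed contiguous netmask; an octet outside the listed values contributes nothing.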

@@ -0,0 +1,24 @@
#!/usr/bin/env bash
# Check whether DHCP network-boot options (66/67) are enabled on the LXC.
# Usage: ./check-dhcp-network-boot-on-lxc.sh [LXC_HOST]
# Example: ./check-dhcp-network-boot-on-lxc.sh root@10.20.30.153
LXC="${1:-root@10.20.30.153}"
PXE_CONF="/etc/dnsmasq.d/network-boot-pxe.conf"
echo "Checking DHCP network-boot status on $LXC ..."
ssh "$LXC" "bash -s" << 'REMOTE'
PXE_CONF="/etc/dnsmasq.d/network-boot-pxe.conf"
if [ -f "$PXE_CONF" ]; then
    echo "Status: ENABLED (option 66/67 are advertised - devices will try network boot)"
    echo "Content of $PXE_CONF:"
    cat "$PXE_CONF"
else
    echo "Status: DISABLED (no PXE options - devices get DHCP only and boot from local storage)"
fi
# Also show toggle script status if present
if [ -x /opt/cm4-provisioning/toggle-network-boot-dhcp.sh ]; then
    echo ""
    echo "Toggle script output: $(/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status 2>/dev/null)"
fi
REMOTE

@@ -1,11 +1,12 @@
#!/usr/bin/env bash
-# Check if network boot is set as first priority on a Pi 4 / CM4 (reTerminal).
+# Check if boot order is set as expected on a Pi 4 / CM4 (reTerminal).
# Run on the device: ./check-network-boot-priority.sh
# Or from your machine: ssh pi@<device-ip> 'bash -s' < scripts/check-network-boot-priority.sh
set -e
-# BOOT_ORDER: 0x2 = network, 0x1 = SD/eMMC. 0x21 = network first, then local storage.
-WANT_BOOT_ORDER="0x21"
+# BOOT_ORDER nibbles (right-to-left): 1=SD/eMMC, 2=network, f=restart.
+# 0xf21 = eMMC first, then network, then restart.
+WANT_BOOT_ORDER="0xf21"
get_config() {
if command -v vcgencmd >/dev/null 2>&1; then
@@ -29,12 +30,12 @@ if [[ -z "$BOOT_ORDER" ]]; then
fi
echo "BOOT_ORDER=$BOOT_ORDER (current)"
-echo "Expected for network first: $WANT_BOOT_ORDER (0x2=network, 0x1=SD/eMMC; 0x21 = network then local)"
+echo "Expected: $WANT_BOOT_ORDER (1=eMMC, 2=network, f=restart; eMMC first, then network)"
if [[ "$(echo "$BOOT_ORDER" | tr '[:upper:]' '[:lower:]')" == "$(echo "$WANT_BOOT_ORDER" | tr '[:upper:]' '[:lower:]')" ]]; then
-    echo "Result: Network boot is set as first priority."
+    echo "Result: Boot order matches expected (eMMC first, then network)."
    exit 0
fi
-echo "Result: Network boot is NOT first (current: $BOOT_ORDER). To set network first, set BOOT_ORDER=0x21 (e.g. via cloud-init first-boot or rpi-eeprom-config --edit)."
+echo "Result: Boot order does NOT match (current: $BOOT_ORDER, expected: $WANT_BOOT_ORDER). Set via rpi-eeprom-config --edit or cloud-init first-boot."
exit 2
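Reviewer note: the updated comment says BOOT_ORDER nibbles are read right-to-left. The sketch below shows that decoding for the three values the script names (1, 2, f); any other nibble is printed as `other(...)` rather than guessed. The function name `decode_boot_order` is hypothetical and not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper: print the boot modes encoded in a BOOT_ORDER value,
# in the order the bootloader tries them (least significant nibble first).
decode_boot_order() {
    _hex=$(echo "$1" | tr '[:upper:]' '[:lower:]')
    _hex=${_hex#0x}
    _out=""
    while [ -n "$_hex" ]; do
        _digit=${_hex#"${_hex%?}"}   # last hex digit = next mode to try
        _hex=${_hex%?}               # drop that digit
        case "$_digit" in
            1) _name="eMMC/SD" ;;
            2) _name="network" ;;
            f) _name="restart" ;;
            *) _name="other($_digit)" ;;
        esac
        _out="${_out:+$_out -> }$_name"
    done
    echo "$_out"
}

decode_boot_order 0xf21    # prints: eMMC/SD -> network -> restart
decode_boot_order 0xf12    # prints: network -> eMMC/SD -> restart
```

Read this way, 0xf21 tries eMMC first and falls back to network boot, which is the order this commit expects.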

@@ -1,6 +1,6 @@
#!/usr/bin/env bash
-# Ensure TFTP config.txt on the LXC has kernel=kernel8.img and initramfs initrd.img followkernel
-# so the bootloader loads the kernel and initrd (otherwise boot stops after start4.elf).
+# Ensure TFTP config.txt on the LXC has kernel=kernel8.img, initramfs initrd.img followkernel,
+# and uart_2ndstage=1 (GPU firmware logs to UART for netboot debugging).
# Run on LXC: bash ensure-tftpboot-config-kernel-initrd.sh
# Or: ssh root@10.20.30.153 'bash -s' < emmc-provisioning/scripts/ensure-tftpboot-config-kernel-initrd.sh
@@ -14,6 +14,12 @@ if [[ ! -f "$CONFIG" ]]; then
fi
CHANGED=0
+# enable_uart=1 must be present (and within first 4KB of config) so netboot firmware sets 8250.nr_uarts=1; else kernel has no serial console (Pi firmware #1575).
+if ! grep -qE 'enable_uart=1' "$CONFIG" 2>/dev/null; then
+    echo "Adding enable_uart=1 to $CONFIG (required for kernel serial on netboot)"
+    echo "enable_uart=1" >> "$CONFIG"
+    CHANGED=1
+fi
if ! grep -qE '^kernel=kernel8\.img' "$CONFIG" 2>/dev/null; then
echo "Adding kernel=kernel8.img to $CONFIG"
echo "kernel=kernel8.img" >> "$CONFIG"
@@ -26,20 +32,34 @@ if ! grep -qE 'initramfs initrd\.img' "$CONFIG" 2>/dev/null; then
echo "initramfs initrd.img followkernel" >> "$CONFIG"
CHANGED=1
fi
+if ! grep -qE 'uart_2ndstage=1' "$CONFIG" 2>/dev/null; then
+    echo "Adding uart_2ndstage=1 to $CONFIG (GPU firmware logs to UART for netboot debug)"
+    echo "" >> "$CONFIG"
+    echo "# GPU firmware logs to UART (see MESS: lines after PCI0 reset)" >> "$CONFIG"
+    echo "uart_2ndstage=1" >> "$CONFIG"
+    CHANGED=1
+fi
if [[ "$CHANGED" -eq 1 ]]; then
    echo "Config updated. Ensure $TFTP_ROOT has kernel8.img and initrd.img."
else
-    echo "Config already has kernel and initramfs lines."
+    echo "Config already has kernel, initramfs and uart_2ndstage lines."
fi
-grep -E 'kernel|initramfs' "$CONFIG" 2>/dev/null || true
+grep -E 'enable_uart|kernel|initramfs|uart_2ndstage' "$CONFIG" 2>/dev/null || true
-# Ensure serial-prefix dir gets a real copy of config (some TFTP servers don't follow symlinks)
+# Ensure serial-prefix dirs get a real copy of config and symlinks to DTB files.
+# GPU loads kernel/initrd/dtb from the serial prefix; missing DTBs cause "Failed to load Device Tree file '?'" and the kernel can hang.
for serial_dir in "$TFTP_ROOT"/[0-9a-f]*/; do
    [[ -d "$serial_dir" ]] || continue
-    if [[ -L "$serial_dir/config.txt" ]] || [[ ! -f "$serial_dir/config.txt" ]]; then
-        rm -f "$serial_dir/config.txt"
-        cp "$CONFIG" "$serial_dir/config.txt"
-        echo "Copied config.txt into $(basename "$serial_dir")/ (real file) so device gets full config."
-    fi
+    rm -f "$serial_dir/config.txt"
+    cp "$CONFIG" "$serial_dir/config.txt"
+    echo "Copied config.txt into $(basename "$serial_dir")/ (real file) so device gets full config."
+    for dtb in "$TFTP_ROOT"/*.dtb; do
+        [[ -f "$dtb" ]] || continue
+        base=$(basename "$dtb")
+        if [[ ! -e "$serial_dir/$base" ]]; then
+            ln -sf "../$base" "$serial_dir/$base"
+            echo "Linked $base into $(basename "$serial_dir")/"
+        fi
+    done
done
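Reviewer note: the script repeats the same check-then-append pattern for each required config.txt line. The sketch below factors that pattern into one function. The name `ensure_line` and the `^`-anchored patterns are assumptions for illustration: the commit itself uses unanchored patterns for `enable_uart=1` and `uart_2ndstage=1`, which would also match a commented-out line.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless a line matching PATTERN
# (an extended regex) already exists.
# Returns 0 if the file was changed, 1 if it was left alone.
ensure_line() {
    _file="$1"
    _pattern="$2"
    _line="$3"
    if grep -qE "$_pattern" "$_file" 2>/dev/null; then
        return 1
    fi
    echo "$_line" >> "$_file"
    return 0
}

# Example usage mirroring the checks in the script:
#   CHANGED=0
#   ensure_line "$CONFIG" '^enable_uart=1'       'enable_uart=1'       && CHANGED=1
#   ensure_line "$CONFIG" '^kernel=kernel8\.img' 'kernel=kernel8.img'  && CHANGED=1
#   ensure_line "$CONFIG" '^uart_2ndstage=1'     'uart_2ndstage=1'     && CHANGED=1
```

Running it a second time against the same file makes no change, so the script stays safe to re-run.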

@@ -35,12 +35,11 @@ fi
# 2) dnsmasq config for eth1 only (DHCP + TFTP); PXE options in network-boot-pxe.conf (toggle with toggle-network-boot-dhcp.sh)
mkdir -p /etc/dnsmasq.d
cat > /etc/dnsmasq.d/network-boot.conf << 'DNSMASQ'
-# DHCP + TFTP on eth1 only (provisioning LAN)
+# DHCP on eth1 only (provisioning LAN)
+# TFTP and PXE options are in network-boot-pxe.conf, controlled by toggle-network-boot-dhcp.sh
interface=eth1
bind-interfaces
dhcp-range=10.20.50.100,10.20.50.200,12h
-enable-tftp
-tftp-root=/srv/tftpboot
log-dhcp
log-queries
port=0