reterminal-dm4/emmc-provisioning/docs/NETWORK-BOOT-DEPLOYMENT-FLOW.md
nearxos 5238d457e8 Update boot order configuration for eMMC first, then network
Modify the first-boot script and documentation to set the EEPROM boot order to 0xf21, prioritizing eMMC boot followed by network boot. Adjust network boot settings for faster failure on DHCP timeouts and update related scripts and documentation to reflect these changes. Enhance the rescue script to directly modify EEPROM settings without requiring a chroot into eMMC, streamlining the recovery process for devices stuck in network-only boot. Update relevant documentation to ensure clarity on the new boot order and its implications.
2026-02-21 15:05:17 +02:00


How network boot deployment works

This document describes the full flow, from power-on to eMMC deploy or backup, when a device network-boots against the provisioning LXC.


Overview

  1. reTerminal is set to try eMMC first, then network (EEPROM BOOT_ORDER=0xf21).
  2. It is connected to the same LAN as the LXC's eth1 (e.g. 10.20.50.0/24).
  3. On power-on it gets an IP via DHCP and loads boot files via TFTP from the LXC.
  4. The netboot environment (kernel + rootfs) runs provisioning-client.sh, which registers with the dashboard and polls for an action.
  5. In the dashboard you see the device under “Device detected (Network)” and choose Deploy or Backup.
  6. The device performs the action (download image → write eMMC, or read eMMC → upload), then you can reboot to run from eMMC.
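
The BOOT_ORDER value in step 1 can be read with a small decoder. This is a minimal sketch, assuming the nibble meanings published for the Raspberry Pi 4 / CM4 bootloader; the function name `decode_boot_order` is hypothetical and not part of this repository. The bootloader consumes the value one hex digit at a time, starting from the rightmost digit.

```python
# Boot-mode nibbles as documented for the Raspberry Pi 4 / CM4 bootloader.
BOOT_MODES = {
    0x0: "SD card detect",
    0x1: "SD card / eMMC",
    0x2: "network",
    0x3: "RPIBOOT",
    0x4: "USB mass storage",
    0x5: "BCM-USB-MSD",
    0x6: "NVMe",
    0x7: "HTTP",
    0xE: "stop",
    0xF: "restart",
}


def decode_boot_order(value: int) -> list[str]:
    """Return the boot modes in the order the bootloader tries them.

    The least significant nibble is tried first, so the hex string is
    read right to left.
    """
    modes = []
    while value:
        nibble = value & 0xF
        modes.append(BOOT_MODES.get(nibble, f"unknown (0x{nibble:x})"))
        value >>= 4
    return modes


print(decode_boot_order(0xF21))  # ['SD card / eMMC', 'network', 'restart']
```

Read this way, 0xf21 means: try eMMC, then network, then restart the sequence.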

Step-by-step

1. LXC (provisioning server)

  • eth0 = WAN (e.g. 10.130.60.141), internet for the LXC.
  • eth1 = LAN (e.g. 10.20.50.1/24):
    • dnsmasq: DHCP on eth1 (e.g. 10.20.50.100-200) and TFTP with next-server = 10.20.50.1, boot file = start4cd.elf.
    • TFTP root /srv/tftpboot: Raspberry Pi 4/CM4 boot files (from GitHub: start4cd.elf, fixup4cd.dat, kernel8.img, etc.).
    • NAT: traffic from 10.20.50.0/24 is masqueraded out eth0 so netbooted devices have internet if needed.

The dashboard (Flask) runs in the LXC and is reachable at e.g. http://10.20.50.1:5000 from the LAN. The golden image for Deploy lives at /var/lib/cm4-provisioning/golden.img (same LXC or bind-mounted from host).
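
Before writing the dnsmasq configuration, it can help to sanity-check the address plan. The sketch below uses the example addresses from this section; the helper name `check_plan` is hypothetical, and the check is an illustration rather than part of the setup scripts.

```python
import ipaddress

# Example values taken from this section; adjust to your own LAN.
LAN = ipaddress.ip_network("10.20.50.0/24")
SERVER = ipaddress.ip_address("10.20.50.1")        # LXC eth1
POOL_START = ipaddress.ip_address("10.20.50.100")  # first DHCP lease
POOL_END = ipaddress.ip_address("10.20.50.200")    # last DHCP lease


def check_plan(lan, server, pool_start, pool_end) -> list[str]:
    """Return a list of problems with the address plan (empty if none)."""
    problems = []
    if server not in lan:
        problems.append(f"server {server} is outside {lan}")
    for addr in (pool_start, pool_end):
        if addr not in lan:
            problems.append(f"pool address {addr} is outside {lan}")
    if pool_start > pool_end:
        problems.append("pool start is after pool end")
    if pool_start <= server <= pool_end:
        problems.append(f"server {server} falls inside the DHCP pool")
    return problems


print(check_plan(LAN, SERVER, POOL_START, POOL_END))
```

An empty list means the server address and the DHCP pool are on the same subnet and do not collide.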

2. reTerminal (device)

  • EEPROM: BOOT_ORDER=0xf21 (eMMC first, then network). Can be set by cloud-init first-boot on an already-flashed device.
  • Network: Ethernet connected to the same segment as the LXC's eth1 (e.g. same switch/VLAN as 10.20.50.0/24).
  • On power-on:
    1. Pi 4/CM4 firmware does DHCP on the wired interface.
    2. DHCP reply gives: IP (e.g. 10.20.50.100), next-server (TFTP) = 10.20.50.1, boot filename = start4cd.elf.
    3. Device TFTPs boot files from the LXC (start4cd.elf, fixup4cd.dat, kernel, DTB, etc.).
    4. It boots the kernel (and optionally an initramfs or NFS root). That environment must have network, curl, and provisioning-client.sh.
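
To make step 3 concrete, this is what a basic TFTP read request looks like on the wire according to RFC 1350. It is an illustrative sketch only: the helper names are hypothetical, and the real firmware may request other paths or append option extensions that are not shown here.

```python
import struct

OPCODE_RRQ = 1  # read request, per RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OPCODE_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def parse_rrq(packet: bytes) -> tuple[str, str]:
    """Split a read request back into (filename, mode)."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != OPCODE_RRQ:
        raise ValueError(f"not a read request (opcode {opcode})")
    filename, mode, _rest = packet[2:].split(b"\x00", 2)
    return filename.decode("ascii"), mode.decode("ascii")


packet = build_rrq("start4cd.elf")
print(packet)             # b'\x00\x01start4cd.elf\x00octet\x00'
print(parse_rrq(packet))  # ('start4cd.elf', 'octet')
```

Seeing requests of this shape in a packet capture on eth1 is a quick way to confirm that the device reached the TFTP stage.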

3. Netboot root / environment

The TFTP-loaded kernel (and optional initramfs/NFS root) must end up in an environment where:

  • The device has an IP on the same LAN as the LXC (already from DHCP).
  • provisioning-client.sh is present and run (e.g. from init, a login script, or a systemd service).
  • PROVISIONING_SERVER is set to the dashboard URL on the LXC's LAN IP, e.g.
    PROVISIONING_SERVER=http://10.20.50.1:5000

So the “netboot environment” is either:

  • A custom initramfs (recommended): build it with network-boot-initramfs/build.sh, copy initrd.img to the TFTP root, and add the line "initramfs initrd.img followkernel" to config.txt. The initramfs brings up the network and runs the provisioning client. See network-boot-initramfs/README.md.
  • A minimal rootfs (e.g. NFS) that runs the client script at boot, or
  • Any other setup that gets the client running with network and the right PROVISIONING_SERVER.

4. Provisioning client (on the device)

  • provisioning-client.sh:
    1. Registers: POST /api/register-device with MAC and IP.
    2. Polls: GET /api/device-action-poll?mac=... every few seconds.
    3. When the dashboard returns action = deploy (with url):
      downloads the image from url and runs dd of=/dev/mmcblk0.
    4. When the dashboard returns action = backup (with upload_url):
      runs dd if=/dev/mmcblk0 and POSTs the stream to upload_url.
    5. Then exits (and you can reboot to eMMC after deploy).
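
The control flow of the steps above can be sketched as follows. The real client is a shell script; this Python version only illustrates the register, poll, act sequence. The JSON field names in the request bodies, the `Http` transport signature, and the `deploy` / `backup` callbacks are assumptions made for the sketch, and the final call to /api/action-done follows the dashboard description in this document.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlencode

# Hypothetical transport: (method, path, json_body) -> decoded JSON reply.
Http = Callable[[str, str, Optional[dict]], dict]


def run_client(
    http: Http,
    mac: str,
    ip: str,
    deploy: Callable[[str], None],
    backup: Callable[[str], None],
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Register, poll until an action arrives, run it, report completion."""
    # Step 1: register (body field names are assumed).
    http("POST", "/api/register-device", {"mac": mac, "ip": ip})
    while True:
        # Step 2: poll for an action.
        query = urlencode({"mac": mac})
        reply = http("GET", "/api/device-action-poll?" + query, None)
        action = reply.get("action")
        if action == "deploy":
            deploy(reply["url"])            # step 3: image -> eMMC
        elif action == "backup":
            backup(reply["upload_url"])     # step 4: eMMC -> upload
        else:
            sleep(poll_interval)            # nothing to do yet
            continue
        # Tell the server the action finished, then exit (step 5).
        http("POST", "/api/action-done", {"mac": mac})
        return action
```

Passing the transport and the two actions in as callables keeps the loop testable without a network or a real block device.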

5. Dashboard (your actions)

  • You open the dashboard at http://10.20.50.1:5000 (or the LXC's WAN IP if you're not on the provisioning LAN).
  • Under “Device detected (Network)” you see the device (identified by MAC).
  • You click Deploy, Backup, or Disable network boot.
  • Deploy / Backup: the dashboard sets the action and URL; the client runs dd + curl, then calls /api/action-done, which disables the DHCP network-boot options on the LXC so the device boots from eMMC on the next reboot. There is no need to unplug Ethernet.
  • Disable network boot: turns off DHCP options 66/67 (next-server, boot file) on the LXC. The DHCP server keeps running; devices just stop receiving netboot and will boot from local storage (eMMC) next time. Use this when you don't want to deploy or backup; the netbooted device can then reboot and boot from eMMC.
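
On the server side, the behaviour described above amounts to a small per-device state table plus a call to the toggle script. The sketch below is not the dashboard's actual Flask code: the class `ActionBoard`, its method names, and the reply shape are hypothetical. The script path and its "disable" argument are the ones listed later in this document.

```python
import subprocess

# Toggle script named in this document (accepts enable/disable/status).
TOGGLE_SCRIPT = "/opt/cm4-provisioning/toggle-network-boot-dhcp.sh"


class ActionBoard:
    """In-memory record of the pending action for each device (by MAC)."""

    def __init__(self, run=subprocess.run):
        self._pending = {}
        self._run = run  # injectable so the sketch can be tested offline

    def set_action(self, mac: str, action: str, url: str) -> None:
        """Called when the operator clicks Deploy or Backup."""
        if action not in ("deploy", "backup"):
            raise ValueError(f"unknown action: {action}")
        key = "url" if action == "deploy" else "upload_url"
        self._pending[mac.lower()] = {"action": action, key: url}

    def poll(self, mac: str) -> dict:
        """Reply for the client's poll; empty dict means nothing to do."""
        return dict(self._pending.get(mac.lower(), {}))

    def disable_network_boot(self) -> None:
        """Stop advertising netboot options; DHCP itself keeps running."""
        self._run([TOGGLE_SCRIPT, "disable"], check=True)

    def action_done(self, mac: str) -> None:
        """Called when the client reports completion."""
        self._pending.pop(mac.lower(), None)
        self.disable_network_boot()
```

Clearing the pending entry before toggling DHCP means a device that polls again after finishing does not repeat the same action.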

Data flow summary

| Stage | Where | What happens |
|---|---|---|
| Boot | reTerminal | DHCP (get IP + next-server + boot file), then TFTP (load start4cd.elf, kernel, etc.). |
| Boot | reTerminal | Kernel (and netboot root) start; run provisioning-client.sh with PROVISIONING_SERVER=http://10.20.50.1:5000. |
| Register | Device → LXC | POST /api/register-device (MAC, IP). |
| Poll | Device → LXC | GET /api/device-action-poll?mac=... every 5 s. |
| Your choice | You → LXC | In dashboard: click Deploy or Backup for that device. |
| Deploy | LXC → device | Client GETs image URL, streams to dd of=/dev/mmcblk0. |
| Backup | Device → LXC | Client dd if=/dev/mmcblk0 and POSTs to upload_url. |
| After | Device → LXC | Client calls POST /api/action-done; server disables DHCP netboot options. |
| After | reTerminal | Reboot; device boots from eMMC (no netboot advertised). |
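
Both the Deploy and Backup stages are streaming copies: the image is never held in memory as a whole, which matters because an eMMC image is far larger than the RAM available in a netboot environment. A minimal sketch of that pattern, with a hypothetical helper name and an assumed 4 MiB chunk size:

```python
import io

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size, not taken from the client


def stream_copy(src, dst, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst chunk by chunk and return the number of bytes copied.

    src and dst are binary file-like objects. For Deploy, src would be the
    HTTP response and dst the eMMC device; for Backup, the roles swap.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Offline demonstration with in-memory buffers instead of a real device.
source = io.BytesIO(b"\x00" * 10_000)
target = io.BytesIO()
print(stream_copy(source, target, chunk_size=4096))  # 10000
```

In the shell client the same role is played by piping curl into dd, or dd into curl.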

What you need in place

  • LXC: eth1 = 10.20.50.1/24, dnsmasq (DHCP + TFTP on eth1; netboot options 66/67 in a separate snippet so they can be toggled), /srv/tftpboot with RPi 4 boot files, NAT for 10.20.50.0/24 via eth0. Toggle script /opt/cm4-provisioning/toggle-network-boot-dhcp.sh (enable/disable/status). Dashboard running, golden.img present for Deploy.
    See NETWORK-BOOT-LXC.md and setup-network-boot-on-lxc.sh.
  • reTerminal: EEPROM boot order = eMMC first, then network; Ethernet on 10.20.50.0/24; netboot environment that runs provisioning-client.sh with PROVISIONING_SERVER=http://10.20.50.1:5000.
  • Netboot root: Must provide network, curl, and the client script (NFS, initramfs, or custom root).

The TFTP setup only gets the Pi to boot a kernel (and optional root). The provisioning itself (Deploy/Backup) is done by that kernel's environment running provisioning-client.sh against the dashboard on the LXC.