Update boot order configuration for eMMC first, then network
Set the EEPROM boot order to 0xf21 (eMMC first, then network) in the first-boot script and documentation. Adjust network boot settings so that DHCP timeouts fail faster, and update the related scripts and documentation to match. Rework the rescue script to modify EEPROM settings directly, without a chroot into eMMC, which streamlines recovery for devices stuck in network-only boot.
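The boot order value is easier to review once it is decoded. The Raspberry Pi bootloader reads `BOOT_ORDER` one hex digit at a time, starting from the least significant digit, so `0xf21` means: try mode 1 (SD card, which is the eMMC on a Compute Module 4), then mode 2 (network), then `f` (restart the sequence). The sketch below decodes a value that way. The function name `decode_boot_order` is hypothetical and is not part of this commit.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): decode a BOOT_ORDER value
# into the sequence of boot modes the bootloader tries, in order.
decode_boot_order() {
    local value=$(( $1 ))      # accepts hex such as 0xf21
    local out=() nibble
    while (( value > 0 )); do
        nibble=$(( value & 0xf ))   # lowest hex digit is tried first
        case $nibble in
            0)  out+=("SD-CARD-DETECT") ;;
            1)  out+=("SD/eMMC") ;;
            2)  out+=("NETWORK") ;;
            3)  out+=("RPIBOOT") ;;
            4)  out+=("USB-MSD") ;;
            5)  out+=("BCM-USB-MSD") ;;
            6)  out+=("NVME") ;;
            7)  out+=("HTTP") ;;
            14) out+=("STOP") ;;
            15) out+=("RESTART") ;;
            *)  out+=("mode-$nibble") ;;   # unknown or reserved digit
        esac
        value=$(( value >> 4 ))
    done
    local IFS=','
    echo "${out[*]}"
}

decode_boot_order 0xf21   # prints SD/eMMC,NETWORK,RESTART
decode_boot_order 0x2     # prints NETWORK
```

Decoding the previous value `0x21` the same way gives `SD/eMMC,NETWORK` with no restart step.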
@@ -48,7 +48,7 @@ Then **power off** the reTerminal and **power it on**. Watch where DHCP (and TFT
 |--------|---------------|--------|
 | No DHCP/TFTP on eth1 during boot; traffic only after OS | reTerminal on different segment than eth1 | Plug reTerminal into same VLAN/bridge as LXC eth1 (provisioning LAN) |
 | DHCP on eth0 during boot, none on eth1 | reTerminal on same segment as eth0 | Move reTerminal to provisioning segment (same as eth1) |
-| No DHCP on any interface during boot | Cable unplugged, BOOT_ORDER not 0x21, or device not attempting netboot | Check cable, confirm BOOT_ORDER=0x21, power cycle with cable in before power |
+| No DHCP on any interface during boot | Cable unplugged, BOOT_ORDER not 0xf21, or device not attempting netboot | Check cable, confirm BOOT_ORDER=0xf21, power cycle with cable in before power |

 ---

@@ -75,7 +75,7 @@ So the device **is** on the right segment (eth1, 10.20.50.x). The problem is tha
 - **Discover** (client 0.0.0.0 → broadcast) at the very start → that’s the bootloader.
 - **TFTP (port 69)** right after DHCP Ack → bootloader loading files.
 2. If you **never** see Discover or TFTP, only Request/Reply after the OS is up, then the bootloader is either not attempting network boot or is giving up (e.g. link not ready, timeout) and booting from eMMC. Try a full power-off (mains or PSU), wait 10 s, then power on with tcpdump already running.
-3. Confirm **BOOT_ORDER=0x21** on the device (network first) and that Ethernet is connected before power-on.
+3. Confirm **BOOT_ORDER=0xf21** on the device (eMMC first, then network) and that Ethernet is connected before power-on.

 ---

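The capture check in the hunk above (did the bootloader send DHCP and TFTP at all?) can be sketched as a small filter over tcpdump's text output. This is a minimal sketch, not part of this commit: the function name `summarise_netboot_capture` is hypothetical, and the matching assumes tcpdump's usual `src.port > dst.port:` line layout, which can differ between versions and verbosity levels.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count DHCP client packets and TFTP requests in
# tcpdump text output read from stdin. Capture with something like:
#   sudo tcpdump -lni eth1 'udp and (port 67 or port 68 or port 69)'
summarise_netboot_capture() {
    local dhcp=0 tftp=0 line
    while IFS= read -r line; do
        case $line in
            *".68 > "*".67: "*) dhcp=$((dhcp + 1)) ;;  # DHCP client -> server
            *" > "*".69: "*)    tftp=$((tftp + 1)) ;;  # request sent to TFTP port 69
        esac
    done
    echo "dhcp_client_packets=$dhcp tftp_requests=$tftp"
}
```

If a capture taken across a cold boot reports `tftp_requests=0`, that matches the "never see Discover or TFTP" case described in item 2.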
@@ -112,7 +112,7 @@ and you **never** see a line about network (e.g. "Trying DHCP", "TFTP", or "Boot
 1. **BOOT_ORDER not applied or not read**
 From the running OS, confirm:
 `sudo vcgencmd bootloader_config`
-and check that `BOOT_ORDER=0x21` (and optionally `NET_BOOT_MAX_RETRIES`, `DHCP_TIMEOUT`, `TFTP_IP`). If you see different or missing values, the EEPROM config in use at boot may be different (e.g. old EEPROM, or update not applied on cold boot).
+and check that `BOOT_ORDER=0xf21` (and optionally `NET_BOOT_MAX_RETRIES`, `DHCP_TIMEOUT`, `TFTP_IP`). If you see different or missing values, the EEPROM config in use at boot may be different (e.g. old EEPROM, or update not applied on cold boot).

 2. **Network tried but failed before any DHCP**
 The bootloader may try network, fail very early (e.g. no link, or timeout before sending DHCP), then fall back to SD without printing a “Trying network” line. Slower link-up (switch, cable) can cause this. Increasing `DHCP_TIMEOUT` and `NET_BOOT_MAX_RETRIES` (and setting `TFTP_IP`) gives the best chance.
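The `BOOT_ORDER` check in item 1 above can be scripted so that it compares values numerically instead of by eye. The following is a minimal sketch. The function name `check_boot_order` is hypothetical and not part of this commit. It only assumes that `vcgencmd bootloader_config` prints `KEY=value` lines, as the documentation text implies.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read bootloader config text on stdin and compare its
# BOOT_ORDER with the expected value. On the device:
#   sudo vcgencmd bootloader_config | check_boot_order 0xf21
check_boot_order() {
    local expected=$1 actual
    # Take the last BOOT_ORDER line in case the config sets it more than once.
    actual=$(sed -n 's/^BOOT_ORDER=//p' | tail -n 1)
    if [ -z "$actual" ]; then
        echo "BOOT_ORDER missing"
        return 2
    fi
    # Arithmetic comparison, so 0xF21 and 0xf21 are treated as equal.
    if (( actual == expected )); then
        echo "OK: BOOT_ORDER=$actual"
    else
        echo "MISMATCH: BOOT_ORDER=$actual (expected $expected)"
        return 1
    fi
}
```

A non-zero exit status makes the check usable in provisioning scripts.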
@@ -123,7 +123,7 @@ and you **never** see a line about network (e.g. "Trying DHCP", "TFTP", or "Boot
 **What to try:**

 - Re-apply EEPROM config with network first and timeouts (as in NETWORK-BOOT-TROUBLESHOOTING), then **full power cycle** (unplug power 10+ s, then power on) with serial connected. Watch from the first character for any “NET”, “DHCP”, “TFTP” or “order” line.
-- For a one-off test you can set `BOOT_ORDER=0x2` (network only). If network fails, the device won’t boot (no fallback to SD). Use only to confirm whether the bootloader tries network and what it prints; then set back to `0x21`. If the full serial log never shows "NET", "DHCP", or "TFTP" and goes straight to "Boot mode: SD (01) order 2", trying `BOOT_ORDER=0x2` (network only) once will force a network attempt and should produce DHCP/TFTP messages on serial.
+- For a one-off test you can set `BOOT_ORDER=0x2` (network only). If network fails, the device won’t boot (no fallback to SD). Use only to confirm whether the bootloader tries network and what it prints; then set back to `0xf21`. If the full serial log never shows "NET", "DHCP", or "TFTP" and goes straight to "Boot mode: SD (01) order 2", trying `BOOT_ORDER=0x2` (network only) once will force a network attempt and should produce DHCP/TFTP messages on serial.

 ---

@@ -204,16 +204,17 @@ After that, the bootloader’s first TFTP requests succeed. The device already h

 ## Stuck in network-only boot (BOOT_ORDER=0x2): get back to Raspbian and change boot order

-If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never try eMMC. To get back to Raspbian and set **BOOT_ORDER=0x1** or **0x21**, use **rescue mode**: the network boot chain loads the provisioning initramfs; with a special kernel cmdline it drops to a shell so you can mount eMMC and run **rpi-eeprom-config** from the eMMC install.
+If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never try eMMC. To fix the EEPROM config, use **rescue mode**: the network boot chain loads the Alpine-based provisioning initramfs which includes Python and `rpi-eeprom-config`; with a special kernel cmdline it drops to a shell so you can run `rpi-eeprom-config` directly from the initramfs (no chroot into eMMC needed).

 ### Prerequisites

-- **Initramfs with rescue support** — Build the initramfs (it includes `/rescue-eeprom.sh`) and copy it to the LXC TFTP root and into the serial dir:
+- **Initramfs with rescue support** — Build the Alpine-based initramfs (it includes `/rescue-eeprom.sh`, `rpi-eeprom-config`, and EEPROM firmware) and copy it to the LXC TFTP root and into the serial dir:
 ```bash
 cd emmc-provisioning/network-boot-initramfs && ./build.sh
 scp initrd.img root@<LXC-IP>:/srv/tftpboot/
 ssh root@<LXC-IP> 'cp /srv/tftpboot/initrd.img /srv/tftpboot/0d1ddbda/ 2>/dev/null || true'
 ```
+Building requires Docker or Podman with arm64 emulation (`qemu-user-static`).
 - **TFTP config** — Ensure `/srv/tftpboot/config.txt` (and thus `0d1ddbda/config.txt` if it’s a symlink) has `kernel=kernel8.img` and `initramfs initrd.img followkernel` so the full kernel+initrd chain runs.

 ### Steps
@@ -232,24 +233,18 @@ If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never
 2. **Power on the reTerminal** (or reboot). It will network boot, load kernel + initramfs, and **rescue mode** will start a shell (serial or console). You should see:
 `=== RESCUE MODE (provisioning_rescue=1) ===`

-3. **In the rescue shell**, run the helper to mount eMMC and run the EEPROM config from the eMMC install:
+3. **In the rescue shell**, run the rescue script. It automatically sets `BOOT_ORDER=0xf21` and writes the EEPROM update to the eMMC boot partition:
 ```bash
 /rescue-eeprom.sh
 ```
-In the editor that opens, set **BOOT_ORDER=0x1** (eMMC only) or **0x21** (network first, then eMMC). Save and exit the editor.
+The script runs `rpi-eeprom-config` directly from the initramfs (no chroot, no dependency on the eMMC OS). It creates a `pieeprom.upd` file on the eMMC boot partition with the updated config. For manual editing instead, use `/rescue-eeprom.sh --edit`.

-4. **Reboot** from the rescue shell:
-```bash
-reboot
-```
-The bootloader will apply the EEPROM update and on the next boot use the new order (eMMC only with 0x1, or network then eMMC with 0x21).
-
-5. **Reboot and apply the update** — The EEPROM update is only applied when the bootloader **boots from the same storage** where the update file was written. You wrote it to **eMMC**, so the bootloader must **boot from eMMC** once to apply it. With **BOOT_ORDER=0x2** (network only) the next reboot netboots again, so the bootloader never reads eMMC and the update is never applied. Do this **before** rebooting from the rescue shell:
+4. **Disable network boot and reboot** — The EEPROM update is only applied when the bootloader **boots from the same storage** where the update file was written. You wrote it to **eMMC**, so the bootloader must **boot from eMMC** once to apply it. With **BOOT_ORDER=0x2** (network only) the next reboot netboots again, so the bootloader never reads eMMC and the update is never applied. Do this **before** rebooting from the rescue shell:
 - **On the LXC**, disable PXE so the next boot does not advertise TFTP:
 `ssh root@<LXC-IP> '/opt/cm4-provisioning/toggle-network-boot-dhcp.sh disable'`
 - Then **power cycle** the reTerminal (or run `reboot -f` / `echo b > /proc/sysrq-trigger` in the rescue shell). The bootloader will get DHCP without option 66/67; it may then try eMMC (depending on firmware) and apply the update. If it still netboots (e.g. cached TFTP), unplug the Ethernet cable and power cycle so it has no choice but eMMC.

-6. **After you are back in Raspbian**, restore normal cmdline for the device so the next network boot runs the provisioning client, not rescue:
+5. **After you are back in Raspbian**, restore normal cmdline for the device so the next network boot runs the provisioning client, not rescue:
 ```bash
 ./emmc-provisioning/scripts/disable-rescue-cmdline-on-lxc.sh root@<LXC-IP> 0d1ddbda
 ```
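For reviewers who want to see what the rescue step amounts to, the config rewrite can be sketched by hand. This is not the content of `/rescue-eeprom.sh`, which this diff does not show. It is a minimal sketch under stated assumptions: the function name `set_boot_order` and the file names `bootconf.txt` and `pieeprom.bin` are placeholders, and the `rpi-eeprom-config` invocations in the comments use its `--config` and `--out` options.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace (or add) the BOOT_ORDER line in a bootloader
# config text file, leaving every other setting untouched.
set_boot_order() {
    local conf=$1 order=$2 tmp
    tmp=$(mktemp) || return 1
    # Drop any existing BOOT_ORDER line, then append the new value.
    grep -v '^BOOT_ORDER=' "$conf" > "$tmp" || true
    printf 'BOOT_ORDER=%s\n' "$order" >> "$tmp"
    mv "$tmp" "$conf"
}

# Possible manual flow in the rescue shell (paths are assumptions):
#   rpi-eeprom-config pieeprom.bin > bootconf.txt     # dump current config
#   set_boot_order bootconf.txt 0xf21                 # eMMC first, then network
#   rpi-eeprom-config --config bootconf.txt --out pieeprom.upd pieeprom.bin
#   # then copy pieeprom.upd to the eMMC boot partition, as step 3 describes
```

Appending the new line at the end keeps the sketch short. A config that uses conditional sections would need the line placed under the right section instead.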