# Network boot troubleshooting: no DHCP/TFTP during boot, only after OS is up

If you run **tcpdump** during power-on but see **no DHCP/TFTP traffic during boot**, and only see traffic **after** the device has booted to the OS, the reTerminal is most likely **not on the same L2 segment as the LXC's eth1**.

## What’s going on

- The Pi’s **bootloader** (EEPROM) sends DHCP Discover on the Ethernet port when it tries network boot.
- That request only reaches interfaces on the **same VLAN / same bridge** (same cable/switch segment).
- dnsmasq in the LXC listens only on **eth1** (provisioning LAN).
- If the reTerminal is plugged into the **main office LAN** (or the same segment as the LXC’s **eth0**), the netboot DHCP **never reaches eth1**, so you see no DHCP/TFTP on eth1 during boot.
- After the OS boots, it uses the same Ethernet port and gets an IP from the main LAN; you then see traffic (e.g. on eth0 or from the device’s new IP). That’s why you only see traffic “after the device boots to OS”.

## What to do

### 1. Confirm which interface sees the boot-time DHCP

On the LXC, run tcpdump on **both** interfaces in two terminals (or run one in the background):

```bash
# Terminal 1: provisioning LAN (where netboot should happen)
tcpdump -i eth1 -n -e port 67 or port 68 or port 69

# Terminal 2: WAN / main LAN
tcpdump -i eth0 -n -e port 67 or port 68 or port 69
```

Then **power off** the reTerminal and **power it on**. Watch where DHCP (and TFTP) appear:

- If you see DHCP **only on eth0** during boot → the reTerminal is on the same segment as **eth0**, not eth1. Netboot is not using your LXC’s dnsmasq; the device may get an IP from another DHCP server and fall back to eMMC boot.
- If you see DHCP **on eth1** during boot → the reTerminal is on the provisioning segment; you should then see TFTP (port 69) as well.

### 2. Fix: put the reTerminal on the same segment as eth1

- The reTerminal’s Ethernet cable must be connected to the **provisioning** segment: the same VLAN or bridge as the LXC’s **eth1** (e.g. 10.20.50.0/24).
- On Proxmox, eth1 is often on a **dedicated bridge** (e.g. `vmbr1`). The reTerminal must be plugged into a switch port that belongs to that same bridge/VLAN.
- If you have one physical switch: either put the LXC’s eth1 and the reTerminal in the same VLAN, or use a dedicated “provisioning” port group / switch.

### 3. Sanity check: same port as reTerminal

- Plug a **laptop** (or another device) into the **same port** (or same VLAN) as the reTerminal.
- Run: `sudo dhclient -v <interface>` with the laptop's wired interface name (or let it get DHCP automatically).
- If you get an IP in **10.20.50.x** → that segment is your provisioning LAN (eth1); the reTerminal should netboot from there.
- If you get a different range (e.g. 192.168.x.x) → that segment is **not** the provisioning LAN; move the reTerminal’s cable or VLAN to the segment where 10.20.50.x is served.

## Summary table

| Symptom | Likely cause | Action |
|--------|---------------|--------|
| No DHCP/TFTP on eth1 during boot; traffic only after OS | reTerminal on different segment than eth1 | Plug reTerminal into same VLAN/bridge as LXC eth1 (provisioning LAN) |
| DHCP on eth0 during boot, none on eth1 | reTerminal on same segment as eth0 | Move reTerminal to provisioning segment (same as eth1) |
| No DHCP on any interface during boot | Cable unplugged, `BOOT_ORDER` tries eMMC before network, or device not attempting netboot | Check cable; confirm `BOOT_ORDER` puts network first (digits are read right to left, so `0x12` is network then eMMC, while `0x21` is eMMC then network); power cycle with the cable connected before power is applied |

---

## I only see DHCP Request/Reply, and the client already has 10.20.50.x

If your tcpdump on **eth1** shows something like:

```text
0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 88:a2:9e:xx:xx:xx
10.20.50.1.67 > 10.20.50.147.68: BOOTP/DHCP, Reply
```

that is most likely **not** the bootloader; it is the **OS** DHCP client (renewal or re-request).
The client already has **10.20.50.147**, so this happens **after** the device has booted to the OS. Note that without `-v`, tcpdump labels every client message (Discover included) as `Request` and every server message as `Reply`; add `-v` to see the actual DHCP message type before drawing a conclusion.

- **Bootloader** (network boot): sends **DHCP Discover** (client 0.0.0.0, no IP yet), then you see **Offer**, **Request**, **Ack**, then **TFTP (port 69)** for start4cd.elf, kernel, etc.
- **OS**: sends **DHCP Request** (renew/rebind, often already with an IP or requesting a known one), then **Reply**; no Discover, no TFTP.

So the device **is** on the right segment (eth1, 10.20.50.x). The problem is that you are not seeing the **bootloader’s** DHCP/TFTP during the first seconds after power-on.

**What to do:**

1. **Start tcpdump before power-on**
   Run `tcpdump -i eth1 -n -e -v port 67 or port 68 or port 69` on the LXC, **then** power off the reTerminal, wait a few seconds, and power it on. Capture from the first second. Look for:
   - **Discover** (client 0.0.0.0 → broadcast) at the very start → that’s the bootloader.
   - **TFTP (port 69)** right after DHCP Ack → bootloader loading files.

2. If you **never** see Discover or TFTP, only Request/Reply after the OS is up, then the bootloader is either not attempting network boot or is giving up (e.g. link not ready, timeout) and booting from eMMC. Try a full power-off (mains or PSU), wait 10 s, then power on with tcpdump already running.

3. Confirm that `BOOT_ORDER` on the device puts network first and that Ethernet is connected before power-on. The bootloader reads the hex digits of `BOOT_ORDER` from right to left, so `0x21` tries SD/eMMC first and network second; use `0x12` for network first with eMMC fallback.

---

## reTerminal DM: serial console vs USB boot (rpiboot)

**The serial console is not on the same USB as rpiboot.**

| Port / interface | Purpose |
|------------------|--------|
| **USB Type-C** (next to boot-mode switch) | Power, and **rpiboot** when eMMC is disabled (USB device mode). No serial console here. |
| **40-pin GPIO header** (UART) | **Serial console.** Use a USB-to-serial adapter; connect its **RX** to **GPIO 14 (Pin 8, Pi TXD)**, its **TX** to **GPIO 15 (Pin 10, Pi RXD)**, and **GND** to any GND pin (e.g. Pin 6). |

**Baud rate:**

- **Bootloader (BOOT_UART=1):** use **115200** 8N1.
  This is the Pi EEPROM/bootloader debug output (network boot attempts, DHCP, TFTP, errors).
- **OS serial login:** some Seeed docs use **9600** for getty; many Pi images use **115200**. If you only care about bootloader messages, use **115200**.

So: use the **USB-C cable** only for power and rpiboot. For the serial console, use a **USB-to-serial adapter** on the **GPIO header** at **115200** to see bootloader output.

---

## Serial shows "Boot mode: SD (01)" and no network attempt

If the bootloader serial output shows something like:

```text
Boot mode: SD (01) order 2
```

and you **never** see a line about network (e.g. "Trying DHCP", "TFTP", or "Boot mode: NET (02)"), then the bootloader is **not** attempting network boot for this boot. It goes straight to SD/eMMC (01). That matches “no DHCP during boot, only after OS”.

**Possible causes:**

1. **BOOT_ORDER tries eMMC first, or is not applied**
   The bootloader reads `BOOT_ORDER` one hex digit at a time from right to left, so `0x21` tries SD/eMMC (`1`) first and network (`2`) only if that fails. The trailing `order 2` in the log line is consistent with network still waiting as the next entry. With a bootable OS on eMMC, the network step is never reached; use `0x12` to try network first and fall back to eMMC. From the running OS, run `sudo vcgencmd bootloader_config` and check `BOOT_ORDER` (and optionally `NET_BOOT_MAX_RETRIES`, `DHCP_TIMEOUT`, `TFTP_IP`). If you see different or missing values, the EEPROM config in use at boot may differ from what you set (e.g. old EEPROM, or update not applied on cold boot).

2. **Network tried but failed before any DHCP**
   The bootloader may try network, fail very early (e.g. no link, or timeout before sending DHCP), then fall back to SD without printing a “Trying network” line. Slower link-up (switch, cable) can cause this. Increasing `DHCP_TIMEOUT` and `NET_BOOT_MAX_RETRIES` (and setting `TFTP_IP`) gives the best chance.

3. **CM4 / carrier quirk**
   On some CM4 carriers the bootloader may skip or shorten the network attempt. Serial is the only way to see what it actually does; if you never see any network-related line, treat it as “network not attempted” for that boot.

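
Because `BOOT_ORDER` sits behind several of the causes above, it can help to decode a value before flashing it. The Raspberry Pi bootloader reads the hex digits from right to left. The following is a minimal sketch under that rule; the function name `decode_boot_order` is hypothetical, and only the digits relevant to this guide (1, 2, 4, f) are named, with any other digit printed as a raw number.

```bash
# Hypothetical helper: list boot modes in the order the bootloader tries them.
# BOOT_ORDER digits are consumed from the least significant nibble upward.
decode_boot_order() (
  value=$(( $1 ))            # accepts hex such as 0x21 or 0xf12
  result=""
  while [ "$value" -gt 0 ]; do
    nibble=$(( value & 15 ))
    case "$nibble" in
      1)  mode="SD/eMMC" ;;
      2)  mode="NETWORK" ;;
      4)  mode="USB-MSD" ;;
      15) mode="RESTART" ;;
      *)  mode="mode-$nibble" ;;   # digits this sketch does not name
    esac
    result="${result:+$result,}$mode"
    value=$(( value >> 4 ))
  done
  printf '%s\n' "$result"
)

decode_boot_order 0x21   # prints SD/eMMC,NETWORK
decode_boot_order 0x12   # prints NETWORK,SD/eMMC
```

Seen this way, a device with a bootable eMMC and `BOOT_ORDER=0x21` would be expected to boot from eMMC without sending any netboot DHCP.
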
**What to try:**

- Re-apply the EEPROM config with network first and timeouts (as in NETWORK-BOOT-TROUBLESHOOTING), then do a **full power cycle** (unplug power for 10+ s, then power on) with serial connected. Watch from the first character for any “NET”, “DHCP”, “TFTP” or “order” line.
- For a one-off test you can set `BOOT_ORDER=0x2` (network only). If network fails, the device won’t boot (no fallback to SD). Use this only to confirm whether the bootloader tries network and what it prints; then set it back to an order with eMMC fallback (`0x12` for network first, `0x21` for eMMC first).

If the full serial log never shows "NET", "DHCP", or "TFTP" and goes straight to "Boot mode: SD (01) order 2", trying `BOOT_ORDER=0x2` once forces a network attempt and should produce DHCP/TFTP messages on serial.

---

## TFTP "file .../SERIAL/start4.elf not found": serial-number prefix

The Pi bootloader may request files under a path named after the board serial number (e.g. `0d1ddbda/start4.elf`). If the TFTP root has no such subdirectory, those requests fail and the bootloader falls back to the root (e.g. `start4.elf`). To avoid "not found" for the first requests, on the LXC create the serial directory and symlink the boot files:

```bash
# On the LXC (replace 0d1ddbda with your Pi's serial from vcgencmd or serial output)
mkdir -p /srv/tftpboot/0d1ddbda
cd /srv/tftpboot/0d1ddbda
# Link each boot file that exists in the TFTP root into the serial directory
for f in start4.elf start4cd.elf start.elf fixup4.dat fixup4cd.dat config.txt cmdline.txt kernel8.img initrd.img; do
  [ -f "../$f" ] && ln -sf "../$f" "$f"
done
```

After that, the bootloader’s first TFTP requests succeed. In this setup the directory for serial `0d1ddbda` has already been created.

---

## Stuck in network-only boot (BOOT_ORDER=0x2): get back to Raspbian and change boot order

If you set **BOOT_ORDER=0x2** (network only) for testing, the device will never try eMMC.
To get back to Raspbian and set a boot order that includes eMMC (**0x1**, **0x12** or **0x21**), use **rescue mode**: the network boot chain loads the provisioning initramfs; with a special kernel cmdline it drops to a shell so you can mount eMMC and run **rpi-eeprom-config** from the eMMC install.

### Prerequisites

- **Initramfs with rescue support:** Build the initramfs (it includes `/rescue-eeprom.sh`) and copy it to the LXC TFTP root and into the serial dir:

  ```bash
  # <lxc-host> is a placeholder for the address of your LXC
  cd emmc-provisioning/network-boot-initramfs && ./build.sh
  scp initrd.img root@<lxc-host>:/srv/tftpboot/
  ssh root@<lxc-host> 'cp /srv/tftpboot/initrd.img /srv/tftpboot/0d1ddbda/ 2>/dev/null || true'
  ```

- **TFTP config:** Ensure `/srv/tftpboot/config.txt` (and thus `0d1ddbda/config.txt` if it’s a symlink) has `kernel=kernel8.img` and `initramfs initrd.img followkernel` so the full kernel+initrd chain runs.

### Steps

1. **On the LXC**, enable rescue for this device by serving a cmdline that includes **provisioning_rescue=1**. The Pi loads `0d1ddbda/cmdline.txt`; replace that with a **real file** (not a symlink) so this device gets the rescue cmdline:

   ```bash
   # On the LXC (replace 0d1ddbda with your Pi serial if different)
   CD="/srv/tftpboot/0d1ddbda"
   rm -f "$CD/cmdline.txt"
   # Same as root cmdline plus rescue flag (one line, space-separated)
   tr '\n' ' ' < /srv/tftpboot/cmdline.txt > "$CD/cmdline.txt"
   echo -n ' provisioning_rescue=1' >> "$CD/cmdline.txt"
   echo >> "$CD/cmdline.txt"
   ```

2. **Power on the reTerminal** (or reboot). It will network boot, load kernel + initramfs, and **rescue mode** will start a shell (serial or console). You should see: `=== RESCUE MODE (provisioning_rescue=1) ===`

3. **In the rescue shell**, run the helper to mount eMMC and run the EEPROM config from the eMMC install:

   ```bash
   /rescue-eeprom.sh
   ```

   In the editor that opens, set **BOOT_ORDER=0x1** (eMMC only), **0x12** (network first, then eMMC) or **0x21** (eMMC first, then network); the digits are read right to left. Save and exit the editor.

4. **Reboot** from the rescue shell:

   ```bash
   reboot
   ```

   The bootloader will apply the EEPROM update and use the new order on the next boot.

5. **On the LXC**, restore the normal cmdline for the device so the next network boot runs the provisioning client, not rescue:

   ```bash
   rm -f /srv/tftpboot/0d1ddbda/cmdline.txt
   ln -s ../cmdline.txt /srv/tftpboot/0d1ddbda/cmdline.txt
   ```

See also **NETWORK-BOOT-LXC.md** for setup and monitoring.
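
The cmdline swap in steps 1 and 5 of the rescue procedure can be wrapped in two small functions so that enabling and disabling rescue is harder to get wrong. This is only a sketch: the names `enable_rescue` and `disable_rescue` and the `TFTP_ROOT` variable are assumptions for illustration, not part of the provisioning repo. Unlike the manual commands, it strips an existing `provisioning_rescue=1` before appending, so running it twice does not duplicate the flag.

```bash
# Hypothetical helpers; TFTP_ROOT defaults to the path used in this guide.
TFTP_ROOT="${TFTP_ROOT:-/srv/tftpboot}"

# enable_rescue SERIAL: give one device a real cmdline.txt with the rescue flag.
enable_rescue() (
  dir="$TFTP_ROOT/$1"
  mkdir -p "$dir"
  rm -f "$dir/cmdline.txt"   # removes the symlink only, not the shared file
  # Flatten to one line, drop any earlier rescue flag and trailing spaces.
  base="$(tr '\n' ' ' < "$TFTP_ROOT/cmdline.txt" |
          sed -e 's/ *provisioning_rescue=1//g' -e 's/ *$//')"
  printf '%s provisioning_rescue=1\n' "$base" > "$dir/cmdline.txt"
)

# disable_rescue SERIAL: point the device back at the shared cmdline.txt.
disable_rescue() (
  dir="$TFTP_ROOT/$1"
  rm -f "$dir/cmdline.txt"
  ln -s ../cmdline.txt "$dir/cmdline.txt"
)
```

On the LXC this would be used as `enable_rescue 0d1ddbda` before powering on the device, and `disable_rescue 0d1ddbda` once the boot order has been changed.
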