reterminal-dm4/emmc-provisioning/docs/NETWORK-BOOT-LXC.md
nearxos 123fd8748e Update provisioning documentation and scripts for improved Proxmox deployment
Add a new step-by-step guide for deploying the CM4 eMMC provisioning service on a new Proxmox instance, enhancing clarity for users. Update existing documentation to reflect changes in network configuration options, including the introduction of LAN subnet settings for DHCP and TFTP. Modify cloud-init scripts to ensure proper management of DNS settings and improve the handling of network interfaces. Additionally, enhance the toggle script for network boot to dynamically read the LAN gateway from configuration files, streamlining the setup process and improving user experience.
2026-03-03 08:24:18 +02:00

# Network boot on the provisioning LXC (eth1 = LAN, eth0 = WAN)
The provisioning LXC can provide **network boot** (PXE-style) and **internet access** to devices connected on **eth1**, while **eth0** is used as WAN for the LXC itself.
## Roles
| Interface | Role | Typical config |
|-----------|------|-----------------|
| **eth0** | WAN | DHCP or static; default route; internet for the LXC |
| **eth1** | LAN (provisioning) | Static e.g. `10.20.50.1/24`; DHCP server + TFTP server; NAT so clients get internet via eth0 |
Devices plugged into the same network as **eth1** (e.g. reTerminals with network boot enabled) will:
1. Get an IP via **DHCP** (from the LXC on eth1).
2. Get **TFTP** boot files (Raspberry Pi firmware: `start4.elf`, `fixup4.dat`, kernel, etc.) for network boot.
3. Have **internet** via NAT through the LXC (eth0).
## What you need on the LXC
1. **DHCP server** on eth1 only (e.g. **dnsmasq**), handing out addresses in e.g. `10.20.50.100` to `10.20.50.200` and advertising the TFTP server (next-server = LXC's eth1 IP).
2. **TFTP server** (dnsmasq can provide this) with **TFTP root** containing Raspberry Pi 4 / CM4 boot files.
3. **IP forwarding** and **NAT** (nftables or iptables) so traffic from `10.20.50.0/24` is masqueraded out **eth0**.
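As a rough illustration of the DHCP and TFTP pieces above, the following sketch prints a minimal dnsmasq configuration for the provisioning LAN. It is not the configuration that `setup-network-boot-on-lxc.sh` writes; the helper name `render_dnsmasq_conf`, the 12h lease time, and the argument order are assumptions made for this example.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): print a minimal dnsmasq config.
# Arguments: LAN interface, LXC LAN IP, DHCP range start, DHCP range end, TFTP root.
render_dnsmasq_conf() {
  cat <<EOF
# Serve DHCP/TFTP on the LAN interface only
interface=$1
bind-interfaces
# Address pool for provisioning clients (12h lease is an assumed value)
dhcp-range=$3,$4,12h
# Default gateway handed to clients: the LXC's LAN address
dhcp-option=option:router,$2
# Built-in TFTP server
enable-tftp
tftp-root=$5
# PXE service string that the Raspberry Pi bootloader looks for
pxe-service=0,"Raspberry Pi Boot"
EOF
}
```

For example, `render_dnsmasq_conf eth1 10.20.50.1 10.20.50.100 10.20.50.200 /srv/tftpboot` prints a config that could be saved under `/etc/dnsmasq.d/` and checked with `dnsmasq --test` before restarting the service.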
## One-time setup (inside the LXC)
From your machine, run the setup script against the LXC (replace `10.130.60.141` with your LXC's IP):
```bash
# From the repo (script runs inside the LXC)
./emmc-provisioning/scripts/setup-network-boot-on-lxc.sh root@10.130.60.141
```
Or SSH into the LXC and run the script there:
```bash
ssh root@10.130.60.141
# Copy or rsync the emmc-provisioning tree into the container, then:
bash /path/to/setup-network-boot-on-lxc.sh
```
The script will:
- Install **dnsmasq** (DHCP + TFTP).
- Configure dnsmasq to listen only on **eth1**, with a DHCP range and TFTP root.
- Create `/srv/tftpboot` and **fetch Raspberry Pi 4 boot files from GitHub** (raspberrypi/firmware, `boot/` folder) if not already present.
- Enable **IPv4 forwarding** and **NAT** (nftables) so clients on eth1 use eth0 for internet.
- Enable and start the **dnsmasq** service.
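The forwarding and NAT step can be sketched roughly as follows. This is an illustration under stated assumptions, not the rules the setup script installs: the helper name `render_nat_ruleset` and the table name `provisioning_nat` are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): print an nftables ruleset that
# masquerades traffic from the LAN subnet when it leaves via the WAN interface.
# Arguments: LAN CIDR, WAN interface.
render_nat_ruleset() {
  cat <<EOF
table ip provisioning_nat {
  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
    ip saddr $1 oifname "$2" masquerade
  }
}
EOF
}
```

To try it by hand on the LXC, enable forwarding with `sysctl -w net.ipv4.ip_forward=1` and load the rules with `render_nat_ruleset 10.20.50.0/24 eth0 | nft -f -`. Neither command persists across reboots on its own.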
## Proxmox: adding eth1 to the LXC
Use the deploy script with **`DEPLOY_LXC_LAN_BRIDGE`** and **`DEPLOY_LXC_LAN_SUBNET`** so the LXC is created with eth1 (LAN) from the start:
```bash
DEPLOY_LXC_LAN_BRIDGE=vmbr1 DEPLOY_LXC_LAN_SUBNET=10.20.50.1/24 ./emmc-provisioning/scripts/deploy-to-proxmox.sh root@YOUR_PROXMOX_HOST
```
Or add a second interface to an existing container by hand:
1. On the **Proxmox host**, add a second network device to the container, e.g.:
   ```bash
   pct set <CTID> --net1 name=eth1,bridge=vmbr1,ip=10.20.50.1/24
   ```
   Use the bridge that corresponds to the physical LAN where reTerminals are connected (e.g. `vmbr1` or a dedicated provisioning bridge).
2. Inside the LXC, ensure **eth1** has a static address (e.g. in `/etc/network/interfaces`):
   ```
   auto eth1
   iface eth1 inet static
       address 10.20.50.1/24
   ```
If your LXC already has eth0 (e.g. 10.130.60.141) and eth1 (e.g. 10.20.50.1) configured, the setup script only adds DHCP, TFTP, and NAT.
### Changing the LAN subnet
When you deploy with **`DEPLOY_LXC_LAN_SUBNET`** (e.g. `10.100.1.1/24`), the deploy script writes **`/opt/cm4-provisioning/lan-subnet.conf`** inside the LXC with `LAN_GW`, `LAN_CIDR`, and the DHCP range. All LXC services use this file:
- **dnsmasq** (DHCP range and, via the toggle script, TFTP next-server)
- **nftables/iptables** (NAT source subnet)
- **toggle-network-boot-dhcp.sh** (option 66/67 next-server)
So changing `DEPLOY_LXC_LAN_SUBNET` and **re-running the deploy script** updates `lan-subnet.conf`. To apply the new subnet to dnsmasq and NAT, **re-run the setup script** after redeploying:
```bash
./emmc-provisioning/scripts/setup-network-boot-on-lxc.sh root@<LXC-IP>
```
Then run **toggle enable** again if you use network boot: `ssh root@<LXC-IP> /opt/cm4-provisioning/toggle-network-boot-dhcp.sh enable`
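To make the relationship between `DEPLOY_LXC_LAN_SUBNET` and the values in `lan-subnet.conf` concrete, here is a minimal sketch that derives the gateway and the network CIDR from an address such as `10.100.1.1/24`. The names `LAN_GW` and `LAN_CIDR` come from the text above; the `KEY=value` output format and the helper name `derive_lan_settings` are assumptions, and the real file also holds the DHCP range, which is not derived here.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): split "IP/prefix" into the
# gateway address and the network CIDR that contains it.
derive_lan_settings() (
  gw=${1%/*}        # part before the slash, e.g. 10.100.1.1
  prefix=${1#*/}    # part after the slash, e.g. 24
  IFS=. read -r a b c d <<EOF
$gw
EOF
  # Pack the four octets into one integer and mask off the host bits
  ip=$(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
  mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
  net=$(( $ip & $mask ))
  printf 'LAN_GW=%s\n' "$gw"
  printf 'LAN_CIDR=%d.%d.%d.%d/%s\n' \
    $(( ($net >> 24) & 255 )) $(( ($net >> 16) & 255 )) \
    $(( ($net >> 8) & 255 )) $(( $net & 255 )) "$prefix"
)
```

Running `derive_lan_settings 10.100.1.1/24` prints `LAN_GW=10.100.1.1` and `LAN_CIDR=10.100.1.0/24`: the first value is what clients use as gateway and TFTP server, the second is the source subnet for NAT.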
## After setup: reTerminal network boot
1. Set the reTerminal **boot order** to try eMMC first, then network (e.g. `BOOT_ORDER=0xf21`). Either use the dashboard's **Update EEPROM** action while the device is connected via USB boot, or set it manually (usbboot recovery, or `rpi-eeprom-config` on the device). It is not set by first-boot.
2. Connect the reTerminal to the **same network as the LXC's eth1** (e.g. 10.20.50.0/24).
3. Power on; it will get an IP via DHCP and load boot files via TFTP from the LXC.
4. For **provisioning** (Backup/Deploy), the netboot environment must run **network-client/provisioning-client.sh** with `PROVISIONING_SERVER=http://10.20.50.1:5000` so it talks to the dashboard on the LXC.
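The `BOOT_ORDER` value in step 1 is read by the bootloader one hex digit at a time, starting from the rightmost digit. The sketch below decodes a value into that sequence; the digit names follow the Raspberry Pi bootloader documentation, while the helper name `decode_boot_order` is made up for this example.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): decode a BOOT_ORDER value into
# the sequence the bootloader tries. Rightmost digit is tried first.
decode_boot_order() (
  value=${1#0x}
  order=""
  while [ -n "$value" ]; do
    digit=${value#"${value%?}"}   # rightmost hex digit
    value=${value%?}              # drop it for the next round
    case "$digit" in
      0) name="SD-CARD-DETECT" ;;
      1) name="SD/eMMC" ;;
      2) name="NETWORK" ;;
      3) name="RPIBOOT" ;;
      4) name="USB-MSD" ;;
      5) name="BCM-USB-MSD" ;;
      6) name="NVME" ;;
      7) name="HTTP" ;;
      e|E) name="STOP" ;;
      f|F) name="RESTART" ;;
      *) name="UNKNOWN($digit)" ;;
    esac
    order="${order:+$order -> }$name"
  done
  printf '%s\n' "$order"
)
```

`decode_boot_order 0xf21` prints `SD/eMMC -> NETWORK -> RESTART`, which matches "eMMC first, then network". On a running device, `rpi-eeprom-config | grep BOOT_ORDER` shows the value currently stored in the EEPROM.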
## TFTP boot files (Raspberry Pi 4 / CM4)
The setup script **automatically downloads** the official Raspberry Pi firmware `boot/` folder from GitHub (https://github.com/raspberrypi/firmware) into `/srv/tftpboot` when `start4cd.elf` is missing. No manual copy is needed.
To refresh or populate TFTP without re-running the full setup:
```bash
./emmc-provisioning/scripts/populate-tftpboot-from-git.sh root@<LXC-IP>
```
(Remove `/srv/tftpboot/start4cd.elf` on the LXC first if you want a full re-fetch.)
The TFTP root contains e.g. `start4cd.elf`, `fixup4cd.dat`, `config.txt`, `cmdline.txt`, `kernel8.img`, and other boot files. For a custom kernel or initramfs (e.g. for provisioning), add or replace files in `/srv/tftpboot` and adjust `config.txt` / `cmdline.txt` as needed.
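After adding or replacing files, a quick sanity check of the TFTP root can catch a missing or empty file before a device tries to boot. This is a minimal sketch: the file list is taken from the examples named above and may not be the complete set your boot configuration needs, and `check_tftp_root` is a hypothetical helper.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): report boot files that are
# missing or empty in the TFTP root. Exit status is 0 only if all are present.
check_tftp_root() (
  root=${1:-/srv/tftpboot}
  missing=0
  for f in start4cd.elf fixup4cd.dat config.txt cmdline.txt kernel8.img; do
    if [ ! -s "$root/$f" ]; then
      echo "missing or empty: $root/$f"
      missing=1
    fi
  done
  exit "$missing"
)
```

Run it on the LXC as `check_tftp_root` (defaults to `/srv/tftpboot`) or pass another directory as the first argument.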
## DHCP leases
On the LXC, dnsmasq stores DHCP leases in **`/var/lib/misc/dnsmasq.leases`** (Debian/Ubuntu default). To see which devices got an IP on the provisioning LAN:
```bash
# On the LXC (or via SSH)
cat /var/lib/misc/dnsmasq.leases
```
Each line is: *expiry_epoch MAC IP hostname client_id*. Example: `1734567890 aa:bb:cc:dd:ee:ff 10.20.50.101 reterminal 01:aa:bb:cc:dd:ee:ff`
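For a more readable view, the lease file can be reduced to MAC, IP, and hostname with `awk`. This is a small sketch based on the field order described above; `list_leases` is a hypothetical helper, and dnsmasq writes `*` in the hostname field when the client did not send one.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): print "MAC IP hostname" for each
# lease. Optional argument: path to the lease file.
list_leases() {
  awk '{
    host = ($4 == "*") ? "(no hostname)" : $4
    printf "%s %s %s\n", $2, $3, host
  }' "${1:-/var/lib/misc/dnsmasq.leases}"
}
```

For the example line above, `list_leases` prints `aa:bb:cc:dd:ee:ff 10.20.50.101 reterminal`.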
---
## Testing network boot
1. **Prerequisites**
- reTerminal has **BOOT_ORDER=0xf21** (eMMC first, then network). Check on the device:
`ssh pi@<device-ip> 'bash -s' < emmc-provisioning/scripts/check-network-boot-priority.sh`
- LXC network-boot options are **enabled**: on the LXC run
`/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status` → should print `enabled`. If not: `toggle-network-boot-dhcp.sh enable`
- reTerminal is on the **same LAN as the LXC's eth1** (e.g. 10.20.50.0/24), Ethernet connected.
2. **Power cycle the reTerminal** (or reboot if it's already running). It will request DHCP, get options 66/67 (TFTP server + boot file), then fetch boot files via TFTP from the LXC.
3. **What “working” looks like**
- **On the LXC**: a new lease appears in `/var/lib/misc/dnsmasq.leases` (device MAC + IP in 10.20.50.x).
- If the netboot environment runs **provisioning-client.sh** and registers with the dashboard: the device appears under **“Device detected (Network)”** on the dashboard (`http://<LXC-IP>:5000`), and you can choose Backup/Deploy.
- If you only use “plain” Pi netboot (no custom initramfs/provisioning client): you just see the DHCP lease and the device loading files via TFTP; it may boot to a minimal kernel/initramfs or NFS root depending on your TFTP config.
4. **Quick test without a reTerminal**
- From a Linux host on the same VLAN as eth1, run:
`sudo dhclient -v eth0` (or your interface) and check that you get an IP in 10.20.50.x and, if netboot is enabled, that the DHCP reply includes option 66 (next-server) and 67 (boot file).
- Or on the LXC run `tcpdump -i eth1 -n port 67 or port 68 or port 69` and power on the reTerminal: you should see DHCP (Discover/Offer/Request/Ack) and then TFTP read requests.
---
## Monitoring on the LXC
| What to check | How |
|--------------|-----|
| **Network boot enabled?** | `/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status` → `enabled` or `disabled` |
| **DHCP leases** | `cat /var/lib/misc/dnsmasq.leases` — lists MAC, IP, hostname for devices that got an IP from dnsmasq on eth1 |
| **dnsmasq (DHCP/TFTP) running** | `systemctl status dnsmasq` or `service dnsmasq status` |
| **TFTP root present** | `ls -la /srv/tftpboot/` — should contain e.g. `start4cd.elf`, `fixup4cd.dat`, `config.txt`, `kernel8.img` |
| **Live DHCP/TFTP traffic** | `tcpdump -i eth1 -n port 67 or port 68 or port 69` (67/68 = DHCP, 69 = TFTP). Run while powering on a device. |
| **Dashboard network devices** | Open `http://<LXC-IP>:5000`; under “Device detected (Network)” you see devices that have called `POST /api/register-device` (only if your netboot environment runs the provisioning client). |
| **Registered devices (raw)** | `cat /var/lib/cm4-provisioning/network_devices.json` (if the dashboard uses default path) — list of MAC, IP, action. |
Optional: enable extra dnsmasq logging to see the details of every DHCP transaction. Add to a config in `/etc/dnsmasq.d/` (e.g. `log-dhcp.conf`): `log-dhcp` and `log-facility=/var/log/dnsmasq.log`, then create the log file and run `systemctl restart dnsmasq` (dnsmasq does not re-read its configuration on a reload). Add `log-queries` as well if you also want DNS lookups logged. Check your distro's dnsmasq doc for log location.
---
## Summary
| Component | Where | Purpose |
|-------------|--------|--------|
| eth0 | LXC | WAN; LXC's internet |
| eth1 | LXC | LAN; 10.20.50.1/24; DHCP + TFTP |
| dnsmasq | LXC | DHCP (on eth1) + TFTP |
| TFTP root | LXC | e.g. `/srv/tftpboot` with RPi boot files |
| NAT | LXC | 10.20.50.0/24 → eth0 so LAN has internet |