Files
reterminal-dm4/emmc-provisioning/docs/NETWORK-BOOT-LXC.md
nearxos 10c200f994 Enhance network boot provisioning with support for extra LAN IPs and VLAN configuration
Update documentation and scripts to include configuration for extra LAN IPs on eth1 and VLAN interface eth1.40, allowing the LXC to serve multiple subnets and provide NAT for internet access. Modify nftables NAT configuration to accommodate these changes and ensure proper DHCP and DNS setup on eth1. This improves the overall network boot functionality and user experience for the CM4 eMMC provisioning service.
2026-03-04 19:28:53 +02:00

# Network boot on the provisioning LXC (eth1 = LAN, eth0 = WAN)
The provisioning LXC can provide **network boot** (PXE-style) and **internet access** to devices connected on **eth1**, while **eth0** is used as WAN for the LXC itself.
## Roles
| Interface | Role | Typical config |
|-----------|------|-----------------|
| **eth0** | WAN | DHCP or static; default route; internet for the LXC |
| **eth1** | LAN (provisioning) | Static e.g. `10.20.50.1/24`; DHCP server + TFTP server; NAT so clients get internet via eth0 |
Devices plugged into the same network as **eth1** (e.g. reTerminals with network boot enabled) will:
1. Get an IP via **DHCP** (from the LXC on eth1).
2. Get **TFTP** boot files (Raspberry Pi firmware: `start4.elf`, `fixup4.dat`, kernel, etc.) for network boot.
3. Have **internet** via NAT through the LXC (eth0).
## What you need on the LXC
1. **DHCP server** on eth1 only (e.g. **dnsmasq**), handing out addresses in e.g. `10.20.50.100` to `10.20.50.200` and advertising the TFTP server (next-server = LXC's eth1 IP).
2. **DNS server** on eth1 (dnsmasq): static name **file.server** → eth1 IP so scripts can use `http://file.server/...`; other queries forwarded upstream. See [DNSMASQ-DNS-FILESERVER.md](DNSMASQ-DNS-FILESERVER.md).
3. **TFTP server** (dnsmasq can provide this) with **TFTP root** containing Raspberry Pi 4 / CM4 boot files.
4. **IP forwarding** and **NAT** (nftables or iptables) so traffic from `10.20.50.0/24` is masqueraded out **eth0**.
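The DHCP, DNS, and TFTP requirements above map onto a handful of dnsmasq options. The following is a minimal sketch, not the file the setup script writes: the function name `render_dnsmasq_conf` and the `12h` lease time are assumptions, and the addresses are the example values used in this document.

```bash
#!/usr/bin/env bash
# Sketch only: print a dnsmasq config for the provisioning LAN.
# render_dnsmasq_conf is a hypothetical helper; the 12h lease time is an assumption.

render_dnsmasq_conf() {
  local lan_if="$1" lan_ip="$2" range_start="$3" range_end="$4" tftp_root="$5"
  cat <<EOF
# Serve DHCP/DNS/TFTP on the LAN interface only
interface=${lan_if}
bind-interfaces

# DHCP pool; clients use the LXC as gateway and DNS server
dhcp-range=${range_start},${range_end},12h
dhcp-option=option:router,${lan_ip}
dhcp-option=option:dns-server,${lan_ip}

# Static name used by provisioning scripts (http://file.server/...)
address=/file.server/${lan_ip}

# Built-in TFTP server
enable-tftp
tftp-root=${tftp_root}
EOF
}

# Print the config for the example subnet used in this document
render_dnsmasq_conf eth1 10.20.50.1 10.20.50.100 10.20.50.200 /srv/tftpboot
```

Redirecting the output to a file under `/etc/dnsmasq.d/` and restarting dnsmasq would apply it; the netboot-specific DHCP options are handled separately by `toggle-network-boot-dhcp.sh`.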
## One-time setup (inside the LXC)
From your machine, run the setup script **on the LXC** (replace with your LXC IP if different):
```bash
# From the repo (script runs inside the LXC)
./emmc-provisioning/scripts/setup-network-boot-on-lxc.sh root@10.130.60.141
```
Or SSH into the LXC and run the script there:
```bash
ssh root@10.130.60.141
# Copy or rsync the emmc-provisioning tree into the container, then:
bash /path/to/setup-network-boot-on-lxc.sh
```
The script will:
- Install **dnsmasq** (DHCP + TFTP + DNS).
- Configure dnsmasq to listen only on **eth1**, with a DHCP range, TFTP root, and DNS (including **file.server** → eth1).
- Create `/srv/tftpboot` and **fetch Raspberry Pi 4 boot files from GitHub** (raspberrypi/firmware, `boot/` folder) if not already present.
- Enable **IPv4 forwarding** and **NAT** (nftables) so clients on eth1 use eth0 for internet.
- Enable and start the **dnsmasq** service.
## Proxmox: adding eth1 to the LXC
Use the deploy script with **`DEPLOY_LXC_LAN_BRIDGE`** and **`DEPLOY_LXC_LAN_SUBNET`** so the LXC is created with eth1 (LAN) from the start:
```bash
DEPLOY_LXC_LAN_BRIDGE=vmbr1 DEPLOY_LXC_LAN_SUBNET=10.20.50.1/24 ./emmc-provisioning/scripts/deploy-to-proxmox.sh root@YOUR_PROXMOX_HOST
```
Or add a second interface to an existing container by hand:
1. On the **Proxmox host**, add a second network device to the container, e.g.:
```bash
pct set <CTID> --net1 name=eth1,bridge=vmbr1,ip=10.20.50.1/24
```
Use the bridge that corresponds to the physical LAN where reTerminals are connected (e.g. `vmbr1` or a dedicated provisioning bridge).
2. Inside the LXC, ensure **eth1** has a static address (e.g. in `/etc/network/interfaces`):
```
auto eth1
iface eth1 inet static
address 10.20.50.1/24
```
Your current LXC already has eth0 (10.130.60.141) and eth1 (10.20.50.1); the setup script only adds DHCP, TFTP, and NAT.
### Changing the LAN subnet
When you deploy with **`DEPLOY_LXC_LAN_SUBNET`** (e.g. `10.100.1.1/24`), the deploy script writes **`/opt/cm4-provisioning/lan-subnet.conf`** inside the LXC with `LAN_GW`, `LAN_CIDR`, and the DHCP range. All LXC services use this file:
- **dnsmasq** (DHCP range and, via the toggle script, TFTP next-server)
- **nftables/iptables** (NAT source subnet)
- **toggle-network-boot-dhcp.sh** (DHCP options 66/67: TFTP server and boot file)
So changing `DEPLOY_LXC_LAN_SUBNET` and **re-running the deploy script** updates `lan-subnet.conf`. To apply the new subnet to dnsmasq and NAT, **re-run the setup script** after redeploying:
```bash
./emmc-provisioning/scripts/setup-network-boot-on-lxc.sh root@<LXC-IP>
```
Then run **toggle enable** again if you use network boot: `ssh root@<LXC-IP> /opt/cm4-provisioning/toggle-network-boot-dhcp.sh enable`
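To illustrate how a value such as `10.100.1.1/24` splits into a gateway and a network CIDR, here is a small sketch in pure bash. It is not the deploy script's code: the helper names are hypothetical, only `LAN_GW` and `LAN_CIDR` come from this document, and the DHCP range entries are left out because their variable names are not documented here.

```bash
#!/usr/bin/env bash
# Sketch only: derive LAN_GW and LAN_CIDR from an "IP/prefix" value (IPv4).
# ip_to_int, int_to_ip and derive_lan_conf are hypothetical helper names.

ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

int_to_ip() {
  local n="$1"
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

derive_lan_conf() {
  local subnet="$1"
  local gw="${subnet%/*}"       # part before the slash: gateway address
  local prefix="${subnet#*/}"   # part after the slash: prefix length
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local net=$(( $(ip_to_int "$gw") & mask ))
  echo "LAN_GW=${gw}"
  echo "LAN_CIDR=$(int_to_ip "$net")/${prefix}"
}

derive_lan_conf 10.100.1.1/24
```

For `10.100.1.1/24` this prints `LAN_GW=10.100.1.1` and `LAN_CIDR=10.100.1.0/24`, which is the source subnet the NAT rule needs.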
### Extra LAN IPs and VLAN (eth1.40)
The setup script also configures **extra IPs on eth1** and a **VLAN interface** so the LXC can serve multiple subnets and provide internet (NAT) to all of them:
| Address / interface | Purpose |
|--------------------|--------|
| **Primary** (e.g. `10.20.50.1/24`) | Set at deploy; used by dnsmasq for DHCP/TFTP/DNS |
| **192.168.30.1/24** | Extra LAN on eth1 |
| **192.168.127.1/24** | Extra LAN on eth1 |
| **eth1.40**: **192.168.0.1/24** | VLAN 40 on eth1 |
- Config is persisted in **`/etc/network/interfaces.d/70-cm4-extra-lan`** (installed when you run `setup-network-boot-on-lxc.sh`).
- **NAT** is applied to all four: primary LAN, 192.168.30.0/24, 192.168.127.0/24, and 192.168.0.0/24 (VLAN 40), so clients on any of these subnets get internet via eth0.
- For **VLAN 40** to receive tagged traffic, the Proxmox bridge connected to eth1 (e.g. vmbr1) must be a trunk that passes VLAN 40. Alternatively, use a dedicated bridge (e.g. vmbr1.40) and attach the container to it as a second interface. The script creates eth1.40 inside the LXC for the in-container VLAN case.
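The NAT behaviour described above can be expressed as a single nftables masquerade rule over a set of source subnets. This is a sketch under assumptions, not the ruleset the setup script installs: the table name `cm4_nat` and the helper `render_nat_ruleset` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: print an nftables ruleset that masquerades several LAN subnets
# out of the WAN interface. Table name cm4_nat is a hypothetical choice.

render_nat_ruleset() {
  local wan_if="$1"; shift
  local subnets
  subnets="$(IFS=,; echo "$*")"   # join the remaining arguments with commas
  cat <<EOF
table ip cm4_nat {
  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
    ip saddr { ${subnets} } oifname "${wan_if}" masquerade
  }
}
EOF
}

# Primary LAN, the two extra LANs on eth1, and VLAN 40
render_nat_ruleset eth0 10.20.50.0/24 192.168.30.0/24 192.168.127.0/24 192.168.0.0/24

# To try it on the LXC (as root), something like:
#   sysctl -w net.ipv4.ip_forward=1
#   render_nat_ruleset eth0 10.20.50.0/24 ... | nft -f -
```

Keeping all subnets in one set means a new LAN only requires one more argument rather than another rule.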
## After setup: reTerminal network boot
1. Set the reTerminal **boot order** to try eMMC first, then network (e.g. `BOOT_ORDER=0xf21`). Use the dashboard **Update EEPROM** when the device is connected via USB boot, or set it manually (usbboot recovery / `rpi-eeprom-config` on the device). The boot order is not set by first-boot.
2. Connect the reTerminal to the **same network as the LXC's eth1** (e.g. 10.20.50.0/24).
3. Power on; it will get an IP via DHCP and load boot files via TFTP from the LXC.
4. For **provisioning** (Backup/Deploy), the netboot environment must run **network-client/provisioning-client.sh** with `PROVISIONING_SERVER=http://10.20.50.1:5000` so it talks to the dashboard on the LXC.
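The `BOOT_ORDER` value in step 1 is read one hex digit at a time, starting from the rightmost digit. The sketch below decodes a value into the sequence the bootloader tries; the helper name `decode_boot_order` is hypothetical, and the digit meanings follow the Raspberry Pi bootloader documentation as far as I know it, so check that documentation for your EEPROM version.

```bash
#!/usr/bin/env bash
# Sketch only: decode a Raspberry Pi BOOT_ORDER value into its boot sequence.
# Digits are consumed from the least-significant (rightmost) nibble first.

decode_boot_order() {
  local value=$(( $1 ))
  local nibble
  local out=()
  while (( value > 0 )); do
    nibble=$(( value & 0xf ))
    case "$nibble" in
      0)  out+=("SD-CARD-DETECT") ;;
      1)  out+=("SD/eMMC") ;;
      2)  out+=("NETWORK") ;;
      3)  out+=("RPIBOOT") ;;
      4)  out+=("USB-MSD") ;;
      5)  out+=("BCM-USB-MSD") ;;
      6)  out+=("NVME") ;;
      7)  out+=("HTTP") ;;
      14) out+=("STOP") ;;
      15) out+=("RESTART") ;;
      *)  out+=("UNKNOWN(${nibble})") ;;
    esac
    value=$(( value >> 4 ))
  done
  echo "${out[*]}"
}

decode_boot_order 0xf21
```

For `0xf21` this prints `SD/eMMC NETWORK RESTART`: eMMC is tried first, then network boot, then the sequence restarts.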
## TFTP boot files (Raspberry Pi 4 / CM4)
The setup script **automatically downloads** the official Raspberry Pi firmware `boot/` folder from GitHub (https://github.com/raspberrypi/firmware) into `/srv/tftpboot` when `start4cd.elf` is missing. No manual copy is needed.
To refresh or populate TFTP without re-running the full setup:
```bash
./emmc-provisioning/scripts/populate-tftpboot-from-git.sh root@<LXC-IP>
```
(Remove `/srv/tftpboot/start4cd.elf` on the LXC first if you want a full re-fetch.)
The TFTP root contains e.g. `start4cd.elf`, `fixup4cd.dat`, `config.txt`, `cmdline.txt`, `kernel8.img`, and other boot files. For a custom kernel or initramfs (e.g. for provisioning), add or replace files in `/srv/tftpboot` and adjust `config.txt` / `cmdline.txt` as needed.
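A quick way to confirm the TFTP root is populated is to check for the files named above. This is a sketch with a hypothetical helper name, `missing_boot_files`; the file list is taken from this document and may need adjusting for your boot setup.

```bash
#!/usr/bin/env bash
# Sketch only: report which expected boot files are absent (or empty)
# in a TFTP root. Returns 0 when all are present, 1 otherwise.

missing_boot_files() {
  local root="$1"; shift
  local f missing=0
  for f in "$@"; do
    if [[ ! -s "${root}/${f}" ]]; then
      echo "missing: ${f}"
      missing=1
    fi
  done
  return "$missing"
}

# Check the default TFTP root against the files listed in this document
missing_boot_files /srv/tftpboot \
  start4cd.elf fixup4cd.dat config.txt cmdline.txt kernel8.img || true
```

The non-zero return code makes the function usable as a guard before powering on a device, e.g. `missing_boot_files /srv/tftpboot start4cd.elf && echo ok`.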
## DHCP leases
On the LXC, dnsmasq stores DHCP leases in **`/var/lib/misc/dnsmasq.leases`** (Debian/Ubuntu default). To see which devices got an IP on the provisioning LAN:
```bash
# On the LXC (or via SSH)
cat /var/lib/misc/dnsmasq.leases
```
Each line is: *expiry_epoch MAC IP hostname client_id*. Example: `1734567890 aa:bb:cc:dd:ee:ff 10.20.50.101 reterminal 01:aa:bb:cc:dd:ee:ff`
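Given that field order, the lease file can be reduced to the MAC, IP, and hostname columns with a short read loop. The helper name `format_leases` is hypothetical; the sketch reads from standard input so it can be tried against the example line without access to the LXC.

```bash
#!/usr/bin/env bash
# Sketch only: print "MAC IP hostname" for each dnsmasq lease line on stdin.

format_leases() {
  local expiry mac ip host client_id
  while read -r expiry mac ip host client_id; do
    # Skip blank lines and the DHCPv6 "duid" line dnsmasq may write
    [[ -z "$expiry" || "$expiry" == "duid" ]] && continue
    printf '%s %s %s\n' "$mac" "$ip" "$host"
  done
}

# Example line from this document
format_leases <<<'1734567890 aa:bb:cc:dd:ee:ff 10.20.50.101 reterminal 01:aa:bb:cc:dd:ee:ff'
```

On the LXC the same function would be fed the real file: `format_leases < /var/lib/misc/dnsmasq.leases`.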
---
## Testing network boot
1. **Prerequisites**
- reTerminal has **BOOT_ORDER=0xf21** (eMMC first, then network). Check on the device:
`ssh pi@<device-ip> 'bash -s' < emmc-provisioning/scripts/check-network-boot-priority.sh`
- LXC network-boot options are **enabled**: on the LXC run
`/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status` → should print `enabled`. If not: `toggle-network-boot-dhcp.sh enable`
- reTerminal is on the **same LAN as the LXC's eth1** (e.g. 10.20.50.0/24), Ethernet connected.
2. **Power cycle the reTerminal** (or reboot if it's already running). It will request DHCP, get options 66/67 (TFTP server + boot file), then fetch boot files via TFTP from the LXC.
3. **What “working” looks like**
- **On the LXC**: a new lease appears in `/var/lib/misc/dnsmasq.leases` (device MAC + IP in 10.20.50.x).
- If the netboot environment runs **provisioning-client.sh** and registers with the dashboard: the device appears under **“Device detected (Network)”** on the dashboard (`http://<LXC-IP>:5000`), and you can choose Backup/Deploy.
- If you only use “plain” Pi netboot (no custom initramfs/provisioning client): you just see the DHCP lease and the device loading files via TFTP; it may boot to a minimal kernel/initramfs or NFS root depending on your TFTP config.
4. **Quick test without a reTerminal**
- From a Linux host on the same VLAN as eth1, run:
`sudo dhclient -v eth0` (or your interface) and check that you get an IP in 10.20.50.x and, if netboot is enabled, that the DHCP reply includes option 66 (TFTP server) and 67 (boot file).
- Or on the LXC run `tcpdump -i eth1 -n port 67 or port 68 or port 69` and power on the reTerminal: you should see DHCP (Discover/Offer/Request/Ack) and then the TFTP read requests on port 69. TFTP data transfers use other UDP ports, so this filter shows only the initial requests.
---
## Monitoring on the LXC
| What to check | How |
|--------------|-----|
| **Network boot enabled?** | `/opt/cm4-provisioning/toggle-network-boot-dhcp.sh status` → `enabled` or `disabled` |
| **DHCP leases** | `cat /var/lib/misc/dnsmasq.leases` — lists MAC, IP, hostname for devices that got an IP from dnsmasq on eth1 |
| **dnsmasq (DHCP/TFTP) running** | `systemctl status dnsmasq` or `service dnsmasq status` |
| **TFTP root present** | `ls -la /srv/tftpboot/` — should contain e.g. `start4cd.elf`, `fixup4cd.dat`, `config.txt`, `kernel8.img` |
| **Live DHCP/TFTP traffic** | `tcpdump -i eth1 -n port 67 or port 68 or port 69` (67/68 = DHCP, 69 = TFTP). Run while powering on a device. |
| **Dashboard network devices** | Open `http://<LXC-IP>:5000`; under “Device detected (Network)” you see devices that have called `POST /api/register-device` (only if your netboot environment runs the provisioning client). |
| **Registered devices (raw)** | `cat /var/lib/cm4-provisioning/network_devices.json` (if the dashboard uses default path) — list of MAC, IP, action. |
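Optional: enable extra dnsmasq logging to see every DHCP request. Add to a config in `/etc/dnsmasq.d/` (e.g. `log-queries.conf`): `log-dhcp` (logs DHCP transactions), `log-queries` (logs DNS queries), and `log-facility=/var/log/dnsmasq.log`. Then create the log file and run `systemctl restart dnsmasq`; a restart is needed because dnsmasq does not re-read its configuration on reload. Check your distro's dnsmasq documentation for the log location.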
---
## Summary
| Component | Where | Purpose |
|-------------|--------|--------|
| eth0 | LXC | WAN; LXC's internet |
| eth1 | LXC | LAN; 10.20.50.1/24; DHCP + TFTP + DNS |
| dnsmasq | LXC | DHCP + TFTP + DNS (on eth1) |
| TFTP root | LXC | e.g. `/srv/tftpboot` with RPi boot files |
| NAT | LXC | 10.20.50.0/24 → eth0 so LAN has internet |