diff --git a/README.md b/README.md index 5c778328..a38f4aff 100644 --- a/README.md +++ b/README.md @@ -182,6 +182,23 @@ This port follows the same model as the POSIX wrapper: - Socket wrappers serialize stack access with a mutex - Blocking operations wait on callback-driven wakeups (instead of busy polling) +## Documentation + +- [API reference](docs/API.md): core stack, socket, and protocol-client APIs +- [Porting guide](docs/porting_guide.md): designing device drivers (with and without DMA) and porting wolfIP to a new operating system + +Module how-tos: + +- [TLS over wolfIP](docs/tls_howto.md): running wolfSSL/TLS on wolfIP sockets, the I/O-callback bridge, and non-blocking handshakes +- [HTTP/HTTPS server](docs/http_server_howto.md): the `src/http/` server module, handler registration, and enabling HTTPS +- [IPsec ESP](docs/ipsec_esp_howto.md): securing traffic with ESP transport mode, SA setup, and Linux `ip xfrm` interop +- [wolfGuard (FIPS WireGuard)](docs/wolfguard_howto.md): the in-stack WireGuard tunnel, peer/key setup, and kernel interop +- [TFTP](docs/tftp_howto.md): the TFTP client/server module, callback wiring, and the firmware-download pattern +- [DHCP & DNS clients](docs/dhcp_dns_howto.md): acquiring a lease, resolving names, and the poll-loop lifecycle +- [Advanced IPv4](docs/advanced_ipv4_howto.md): multicast/IGMP, IPv4 forwarding, multiple interfaces, and loopback + +- [Migrating from lwIP](docs/migrating_from_lwIP.md): mapping lwIP concepts and APIs onto wolfIP + ## Source Layout - `src/wolfip.c`: core TCP/IP stack diff --git a/docs/API.md b/docs/API.md index 69636ae1..c7179d21 100644 --- a/docs/API.md +++ b/docs/API.md @@ -19,6 +19,21 @@ wolfIP is a minimal TCP/IP stack designed for resource-constrained embedded syst - TFTP (RFC 1350, RFC 2347, RFC 2348, RFC 2349, RFC 7440) via the reusable `src/tftp/` module - UDP (RFC 768) - unicast, optional IPv4 multicast with `IP_MULTICAST` - TCP (RFC 793) with options (Timestamps, MSS) + - IPsec ESP (RFC 
4303) - transport mode, manual keying, with `WOLFIP_ESP` + +## Module How-To Guides + +The core socket and stack APIs are documented below. Optional modules and +features have dedicated getting-started guides: + +- [TLS over wolfIP](tls_howto.md) — running wolfSSL/TLS on wolfIP sockets (`WOLFSSL_WOLFIP`), the I/O-callback bridge, and non-blocking handshakes. +- [HTTP/HTTPS server](http_server_howto.md) — the `src/http/` server module (`WOLFIP_ENABLE_HTTP`), handler registration, and enabling HTTPS via a `WOLFSSL_CTX`. +- [IPsec ESP how-to](ipsec_esp_howto.md) — build with `WOLFIP_ESP`, install Security Associations, and interoperate with Linux `ip xfrm`. +- [wolfGuard (FIPS WireGuard)](wolfguard_howto.md) — the in-stack WireGuard tunnel (`WOLFGUARD`), peer/key setup, and kernel interop. +- [TFTP how-to](tftp_howto.md) — the callback-driven, allocation-free TFTP client/server in `src/tftp/`, including the firmware-download pattern. +- [DHCP & DNS clients](dhcp_dns_howto.md) — acquiring a lease, resolving names with `nslookup`, and the poll-loop lifecycle. +- [Advanced IPv4](advanced_ipv4_howto.md) — multicast/IGMP, IPv4 forwarding, multiple interfaces, and loopback. +- [Porting guide](porting_guide.md) — writing device drivers and porting wolfIP to a new OS. ## Build Integration diff --git a/docs/advanced_ipv4_howto.md b/docs/advanced_ipv4_howto.md new file mode 100644 index 00000000..3b664ca4 --- /dev/null +++ b/docs/advanced_ipv4_howto.md @@ -0,0 +1,429 @@ +# Advanced IPv4 How-To + +This guide shows how to use wolfIP's **advanced IPv4 features**: UDP multicast +with IGMP membership reports, IPv4 forwarding between interfaces, multi-interface +configuration, and the internal loopback interface. Each feature is opt-in and +removed by the preprocessor when disabled, so the default single-interface +endpoint build is unchanged. + +It is a getting-started document, not a reference manual. 
The authoritative API +is `wolfip.h` (with the compile-time switches in `config.h`); the worked +examples come from `src/test/test_multicast_interop.c`, +`src/test/unit/unit_tests_multicast.c`, and `src/test/test_wolfssl_forwarding.c`. + +## Table of Contents + +- [1. Multicast and IGMP](#1-multicast-and-igmp) +- [2. IPv4 forwarding](#2-ipv4-forwarding) +- [3. Multiple interfaces](#3-multiple-interfaces) +- [4. The loopback interface](#4-the-loopback-interface) +- [5. Troubleshooting](#5-troubleshooting) + +--- + +## 1. Multicast and IGMP + +IPv4 UDP multicast is compiled out by default. Define `IP_MULTICAST` to enable +the BSD-style multicast socket options and the IGMPv3 any-source-multicast (ASM) +membership reports. When the macro is undefined, none of the multicast code or +state is built into the stack. + +The socket options are exposed under wolfIP-prefixed names so they resolve to the +host's `IP_*` constants when those headers are present and to fixed fallback +values otherwise (`wolfip.h`): + +| Option | Fallback value | `optval` type | Meaning | +|--------|---------------|---------------|---------| +| `WOLFIP_IP_ADD_MEMBERSHIP` | 35 | `struct wolfIP_ip_mreq` | Join a multicast group on an interface. | +| `WOLFIP_IP_DROP_MEMBERSHIP` | 36 | `struct wolfIP_ip_mreq` | Leave a previously joined group. | +| `WOLFIP_IP_MULTICAST_IF` | 32 | `struct wolfIP_mreq_addr` | Pin the egress interface for multicast sends. | +| `WOLFIP_IP_MULTICAST_TTL` | 33 | `int` or `uint8_t` | TTL for outgoing multicast datagrams. | +| `WOLFIP_IP_MULTICAST_LOOP` | 34 | `int` or `uint8_t` | Deliver this socket's own multicast sends back to local members. | + +All multicast options use socket level `WOLFIP_SOL_IP`, and they apply only to +**UDP** sockets. 
The `struct wolfIP_ip_mreq` / `struct wolfIP_mreq_addr` types are +defined in `wolfip.h`: + +```c +struct wolfIP_mreq_addr { + uint32_t s_addr; +}; + +struct wolfIP_ip_mreq { + struct wolfIP_mreq_addr imr_multiaddr; /* the group address */ + struct wolfIP_mreq_addr imr_interface; /* local iface IP, or ANY */ +}; +``` + +The `s_addr` fields are in **network byte order** — fill them with `inet_pton()` +or `htonl()`, exactly as with a host `struct ip_mreq`. + +### Joining a group + +To receive multicast, bind a UDP socket to the destination port, then join the +group with `WOLFIP_IP_ADD_MEMBERSHIP`. From `src/test/test_multicast_interop.c`: + +```c +wolf_fd = wolfIP_sock_socket(s, AF_INET, IPSTACK_SOCK_DGRAM, 17); + +memset(&bind_addr, 0, sizeof(bind_addr)); +bind_addr.sin_family = AF_INET; +bind_addr.sin_port = htons(MCAST_PORT); +bind_addr.sin_addr.s_addr = 0; +wolfIP_sock_bind(s, wolf_fd, (struct wolfIP_sockaddr *)&bind_addr, + sizeof(bind_addr)); + +memset(&mreq, 0, sizeof(mreq)); +inet_pton(AF_INET, MCAST_GROUP, &mreq.imr_multiaddr.s_addr); +mreq.imr_interface.s_addr = htonl(INADDR_ANY); /* let routing pick the iface */ +wolfIP_sock_setsockopt(s, wolf_fd, WOLFIP_SOL_IP, + WOLFIP_IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)); +``` + +`imr_interface` selects the interface: + +- `INADDR_ANY` resolves the interface from the route to the group, and that + interface **must already have a configured source IP** — the join is rejected + with `-WOLFIP_EINVAL` otherwise, because there would be no valid source address + to build the IGMP report from (`mcast_if_from_addr`, `src/wolfip.c`). +- A specific local-interface IP pins the membership to that interface. + +A datagram for a joined group is delivered only after the join: an unjoined +socket bound to the same port does not receive the group's traffic. 
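
Because the `s_addr` fields are in network byte order, and a join fails with `-WOLFIP_EINVAL` when the address is not a multicast group (see [Troubleshooting](#5-troubleshooting)), it can help to validate the group string before calling `wolfIP_sock_setsockopt()`. The following is a minimal sketch, not wolfIP code: `is_ipv4_multicast_be()` and `parse_mcast_group()` are hypothetical helper names, and it assumes a POSIX host that provides `inet_pton()` and `ntohl()`.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper (not part of wolfIP): returns 1 when addr_be, given in
 * network byte order, lies in the IPv4 multicast range 224.0.0.0/4. */
static int is_ipv4_multicast_be(uint32_t addr_be)
{
    return (ntohl(addr_be) & 0xF0000000u) == 0xE0000000u;
}

/* Hypothetical helper: parses a dotted-quad string and stores it in network
 * byte order in *group_be. Returns 0 on success, -1 if the string is not a
 * valid IPv4 address or is not a multicast group. */
static int parse_mcast_group(const char *text, uint32_t *group_be)
{
    struct in_addr parsed;

    if (inet_pton(AF_INET, text, &parsed) != 1)
        return -1;
    if (!is_ipv4_multicast_be(parsed.s_addr))
        return -1;
    *group_be = parsed.s_addr;
    return 0;
}
```

On success, `*group_be` is already in the byte order that `mreq.imr_multiaddr.s_addr` expects, so it can be assigned without further conversion.
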
+ +### IGMP report behavior + +The join/leave path is driven by the membership table in `struct wolfIP` +(`src/wolfip.c`), and emits IGMPv3 reports automatically: + +- **On the first join** of a `{interface, group}` pair, wolfIP sends an IGMPv3 + Current-State Report with record type `MODE_IS_EXCLUDE` (an ASM join). The + report is sent to the IGMPv3 all-routers address `224.0.0.22` + (`igmp_send_report`). +- Memberships are **reference-counted**: multiple sockets joining the same + `{interface, group}` share one membership entry, so only the first join and the + last leave hit the wire. +- **On the last leave**, wolfIP sends a `CHANGE_TO_INCLUDE` report (the ASM + leave). +- **Incoming IGMP Membership Queries** are answered, but not synchronously. Per + RFC 3376 §5.2, wolfIP schedules a Current-State Report after a random delay + drawn from the query's Max-Response-Time window, which coalesces a query flood + into one deferred report per group (`igmp_input`, `igmp_report_timer_cb`). +- A query is accepted only if it arrives with IP **TTL 1** and is addressed to + `224.0.0.1` (all-hosts) or to the group itself; anything else is dropped as + off-link or spoofed. + +### Sending multicast + +To transmit, set the desired TX options and `sendto()` the group address. The +default multicast TTL is `1` and loopback defaults to `1` for a new UDP socket +(`src/wolfip.c`). From `src/test/test_multicast_interop.c`: + +```c +int ttl = 3; +wolfIP_sock_setsockopt(s, wolf_fd, WOLFIP_SOL_IP, + WOLFIP_IP_MULTICAST_TTL, &ttl, sizeof(ttl)); + +memset(&dst, 0, sizeof(dst)); +dst.sin_family = AF_INET; +dst.sin_port = htons(WOLFIP_MCAST_PORT); +inet_pton(AF_INET, MCAST_GROUP, &dst.sin_addr.s_addr); +wolfIP_sock_sendto(s, wolf_fd, payload, sizeof(payload), 0, + (struct wolfIP_sockaddr *)&dst, sizeof(dst)); +``` + +Two more TX controls: + +- `WOLFIP_IP_MULTICAST_IF` pins the egress interface (by local-interface IP) for + this socket's multicast sends. 
Passing `INADDR_ANY` clears the pin and reverts + to per-destination routing. +- `WOLFIP_IP_MULTICAST_LOOP` controls local delivery of the socket's own sends. + When enabled (the default), a multicast datagram is looped back to local group + members **after a successful wire send**, inside `wolfIP_poll()` + (`src/wolfip.c`). A single socket that both joined a group and sent to it can + therefore read its own datagram back: + +```c +/* unit_tests_multicast.c: join + set TTL/LOOP, send, poll, then recv self */ +wolfIP_sock_setsockopt(&s, sd, WOLFIP_SOL_IP, WOLFIP_IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)); +wolfIP_sock_setsockopt(&s, sd, WOLFIP_SOL_IP, WOLFIP_IP_MULTICAST_TTL, &ttl, sizeof(ttl)); +wolfIP_sock_setsockopt(&s, sd, WOLFIP_SOL_IP, WOLFIP_IP_MULTICAST_LOOP, &loop, sizeof(loop)); +wolfIP_sock_sendto(&s, sd, payload, sizeof(payload), 0, (struct wolfIP_sockaddr *)&dst, sizeof(dst)); +wolfIP_poll(&s, 1); /* drives the wire send + loopback */ +wolfIP_sock_recvfrom(&s, sd, out, sizeof(out), 0, NULL, NULL); /* reads its own datagram */ +``` + +### Testing + +The repository ships a Linux interop test that creates a TAP interface +(`wmcast0`) and validates both directions — Linux sending to a wolfIP receiver +and wolfIP sending to a Linux receiver (`src/test/test_multicast_interop.c`): + +```sh +make unit-multicast +./build/test/unit + +make build/test-multicast-interop +sudo ./build/test-multicast-interop +``` + +Per-membership compile-time sizing (`src/wolfip.c`): each UDP socket holds up to +`WOLFIP_UDP_MCAST_MEMBERSHIPS` (default 4) joins, and the stack-wide table holds +`MAX_UDPSOCKETS * WOLFIP_UDP_MCAST_MEMBERSHIPS` distinct memberships. + +## 2. IPv4 forwarding + +By default wolfIP is an endpoint and does not route. Set +`WOLFIP_ENABLE_FORWARDING` to `1` at compile time (default `0` in `config.h`) to +turn the stack into a simple IPv4 router between its interfaces. 
+ +Forwarding is inherently multi-interface: `wolfIP_forward_interface()` returns +`-1` whenever `if_count < 2` (`src/wolfip.c`), so a useful forwarding build also +needs `WOLFIP_MAX_INTERFACES >= 2` (see [section 3](#3-multiple-interfaces)). + +### How a packet is forwarded + +When a frame arrives whose destination is **not** local to the receiving +interface, the IP input path asks `wolfIP_forward_interface()` for an egress +interface. That lookup covers both: + +- directly connected subnets on the stack's other interfaces, and +- optional static routes added with `wolfIP_route_add()`. + +If a route is found, the packet is forwarded out that interface: + +```text + in_if ─▶ IP input ─▶ dest local to this host? ─yes▶ deliver up the stack + │ no + ▼ + forward_interface(in_if, dest) + │ (connected-subnet or static-route lookup) + ┌───────────┴───────────┐ + │ no out iface │ out iface found + ▼ ▼ + (dropped) ttl <= 1 ? ─yes▶ ICMP TTL-exceeded back to src + │ no + ▼ + ttl--, recompute IP checksum, + ARP-resolve next hop, send on out_if +``` + +From the dispatch in `src/wolfip.c`: + +```c +int out_if = wolfIP_forward_interface(s, if_idx, dest); +if (out_if >= 0) { + if (ip->ttl <= 1) { + wolfIP_send_ttl_exceeded(s, if_idx, ip); /* ICMP type 11 to the source */ + return; + } + if (!wolfIP_forward_prepare(s, out_if, dest, mac, &broadcast)) { + arp_queue_packet(s, out_if, dest, ip, len); /* queue until ARP resolves */ + return; + } + ip->ttl--; + ip->csum = 0; + iphdr_set_checksum(ip); + wolfIP_forward_packet(s, out_if, ip, len, broadcast ? NULL : mac, broadcast); + return; +} +``` + +Key behaviors, all from `src/wolfip.c`: + +- **TTL is decremented** by one on every forwarded packet, and the IP header + checksum is recomputed. +- **TTL exhaustion** (`ttl <= 1`) produces an **ICMP TTL-exceeded** (type 11) back + to the original source instead of forwarding. 
+- **Next-hop resolution** uses ARP on Ethernet out-interfaces; if the MAC is not + yet known the packet is queued (`arp_queue_packet`) and sent once ARP resolves + — it is not dropped. +- A **reverse-path (RPF) check** drops a packet whose source address is local to + another of the host's interfaces before it is forwarded. +- Frames arriving on a non-loopback interface with a `127/8` source or + destination are dropped (loopback addresses must not appear on the wire). + +When forwarding is enabled, the optional static-route API is also compiled in +(`wolfip.h`): `wolfIP_route_add()`, `wolfIP_route_delete()`, +`wolfIP_route_lookup()`, `wolfIP_route_get()`, and `wolfIP_route_count()`. The +route lookup performs longest-prefix matching across connected subnets and static +routes together. + +### Wiring a router + +`src/test/test_wolfssl_forwarding.c` builds a two-interface router: interface 0 +on the LAN, interface 1 on the WAN, each with its own IP config. + +```c +/* router has WOLFIP_MAX_INTERFACES = 2, WOLFIP_ENABLE_FORWARDING = 1 */ +wolfIP_init(router_stack); + +tap_dev = wolfIP_getdev(router_stack); /* iface 0 driver */ +tap_init(tap_dev, TAP_IFNAME, host_addr.s_addr); + +mem_link_attach(wolfIP_getdev_ex(router_stack, 1), /* iface 1 driver */ ...); + +wolfIP_ipconfig_set_ex(router_stack, 0, router_lan_ip4, IP4(255,255,255,0), IP4(0,0,0,0)); +wolfIP_ipconfig_set_ex(router_stack, 1, router_wan_ip4, IP4(255,255,255,0), IP4(0,0,0,0)); +``` + +A host on the LAN reaching a server on the WAN sets the router's LAN address as +its gateway; the router forwards between the two connected subnets automatically. 
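
The longest-prefix matching that the route lookup performs over connected subnets and static routes can be sketched as follows. This is an illustration of the technique only, not the code in `src/wolfip.c`: `struct demo_route`, `DEMO_IP4()`, `demo_prefix_mask()`, and `demo_route_lookup()` are hypothetical names, addresses are plain host-order 32-bit values, and a connected subnet is modeled as a table entry whose prefix length comes from the interface netmask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address constructor for this sketch (host-order value). */
#define DEMO_IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical route entry; wolfIP's internal layout may differ. */
struct demo_route {
    uint32_t net;        /* network address */
    uint8_t  prefix_len; /* 0..32 */
    int      if_idx;     /* egress interface index */
};

/* Builds a netmask from a prefix length (0 gives 0.0.0.0, 32 gives all ones).
 * The 0 case is handled separately because shifting by 32 is undefined. */
static uint32_t demo_prefix_mask(uint8_t prefix_len)
{
    if (prefix_len == 0)
        return 0u;
    if (prefix_len >= 32)
        return 0xFFFFFFFFu;
    return 0xFFFFFFFFu << (32 - prefix_len);
}

/* Returns the egress interface of the most specific matching entry,
 * or -1 when no entry covers dest. */
static int demo_route_lookup(const struct demo_route *tbl, size_t n,
                             uint32_t dest)
{
    int best_if = -1;
    int best_len = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t mask = demo_prefix_mask(tbl[i].prefix_len);

        if ((dest & mask) == (tbl[i].net & mask) &&
            (int)tbl[i].prefix_len > best_len) {
            best_len = tbl[i].prefix_len;
            best_if = tbl[i].if_idx;
        }
    }
    return best_if;
}
```

With entries for `10.20.0.0/16` on interface 0 and `10.0.0.0/8` on interface 1, a packet for `10.20.3.4` matches both but leaves on interface 0, because the `/16` entry is more specific.
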
+ +If the next hop is **not** directly on one of those connected subnets, add a +static route: + +```c +/* 10.20.0.0/16 is reachable via 192.168.1.254 on interface 0 */ +wolfIP_route_add(s, 0, IP4(10,20,0,0), 16, IP4(192,168,1,254)); +``` + +The static-route API is compiled only when forwarding is enabled: +`wolfIP_route_add()`, `wolfIP_route_delete()`, `wolfIP_route_lookup()`, +`wolfIP_route_get()`, and `wolfIP_route_count()`. + +## 3. Multiple interfaces + +`WOLFIP_MAX_INTERFACES` (default `2` in `config.h`) sizes the per-stack arrays of +link-layer descriptors and IP configurations. `wolfIP_init()` sets `if_count` to +`WOLFIP_MAX_INTERFACES` and initialises every slot (`src/wolfip.c`). + +Each interface slot is addressed by a zero-based `if_idx`. There are two parallel +accessor families in `wolfip.h`: + +| Legacy (first hardware iface) | Indexed (`_ex`) | Purpose | +|-------------------------------|-----------------|---------| +| `wolfIP_getdev(s)` | `wolfIP_getdev_ex(s, if_idx)` | Get the `struct wolfIP_ll_dev *` to wire to a driver. | +| `wolfIP_ipconfig_set(s, ip, mask, gw)` | `wolfIP_ipconfig_set_ex(s, if_idx, ip, mask, gw)` | Set IP / netmask / gateway. | +| `wolfIP_ipconfig_get(s, &ip, &mask, &gw)` | `wolfIP_ipconfig_get_ex(s, if_idx, &ip, &mask, &gw)` | Read IP config. | +| `wolfIP_recv(s, buf, len)` | `wolfIP_recv_ex(s, if_idx, buf, len)` | Hand an inbound frame to the stack. | + +`wolfIP_getdev_ex()` returns `NULL` when `if_idx` is out of range. The legacy +helpers all target the **first hardware interface**, which is index `0` normally +but index `1` when loopback is enabled (see [section 4](#4-the-loopback-interface) +and the `WOLFIP_PRIMARY_IF_IDX` definition in `src/wolfip.c`). 
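
Application code that has to build both with and without loopback can hide the differing index behind a small helper and always use the `_ex` accessors. A minimal sketch under stated assumptions: `APP_FIRST_HW_IF_IDX` and `app_hw_if_idx()` are hypothetical application-side names (`WOLFIP_PRIMARY_IF_IDX` itself is defined in `src/wolfip.c`), and `WOLFIP_ENABLE_LOOPBACK` is assumed to be visible to the application as `0` or `1`.

```c
/* Fallback for this sketch only: treat loopback as disabled when the build
 * has not defined the switch. */
#ifndef WOLFIP_ENABLE_LOOPBACK
#define WOLFIP_ENABLE_LOOPBACK 0
#endif

/* Hypothetical constant: slot index of the first hardware interface.
 * Index 0 normally, index 1 when the loopback interface claims slot 0. */
#define APP_FIRST_HW_IF_IDX (WOLFIP_ENABLE_LOOPBACK ? 1u : 0u)

/* Hypothetical helper: maps the nth hardware interface (0-based) to the
 * if_idx value expected by the _ex accessors. */
static unsigned int app_hw_if_idx(unsigned int nth)
{
    return APP_FIRST_HW_IF_IDX + nth;
}
```

A call such as `wolfIP_getdev_ex(s, app_hw_if_idx(0))` then refers to the first NIC in either build.
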
+ +### Configuring two interfaces + +Wire each slot's `struct wolfIP_ll_dev` to a driver (set `mac`, `mtu`, and the +`poll`/`send` callbacks) and give each slot an IP config +(`src/test/test_wolfssl_forwarding.c`): + +```c +wolfIP_init(s); + +struct wolfIP_ll_dev *dev0 = wolfIP_getdev_ex(s, 0); +struct wolfIP_ll_dev *dev1 = wolfIP_getdev_ex(s, 1); +/* attach each dev to its driver: dev->poll, dev->send, dev->mac, dev->mtu ... */ + +wolfIP_ipconfig_set_ex(s, 0, IP4(192,168,1,1), IP4(255,255,255,0), IP4(0,0,0,0)); +wolfIP_ipconfig_set_ex(s, 1, IP4(10,0,0,1), IP4(255,255,255,0), IP4(0,0,0,0)); +``` + +### Feeding received frames + +`wolfIP_poll()` calls each interface's `poll` callback and routes the resulting +frame to the correct interface internally (`poll_devices`, `src/wolfip.c`), so a +driver that implements `poll` needs no extra plumbing. + +If instead you push frames into the stack yourself (for example from an ISR or a +bridge), tag each frame with the interface it arrived on using +`wolfIP_recv_ex()`: + +```c +/* a frame arrived on interface 1 */ +wolfIP_recv_ex(s, 1, frame_buf, frame_len); +``` + +`wolfIP_recv(s, ...)` is shorthand for `wolfIP_recv_ex(s, WOLFIP_PRIMARY_IF_IDX, ...)`, +so use the `_ex` form whenever more than one interface can deliver frames. + +## 4. The loopback interface + +Set `WOLFIP_ENABLE_LOOPBACK` to `1` (default `0`) to give the stack an internal +loopback interface. It **requires `WOLFIP_MAX_INTERFACES > 1`**; `config.h` +enforces this with a compile-time `#error`: + +```c +#if WOLFIP_ENABLE_LOOPBACK && WOLFIP_MAX_INTERFACES < 2 +#error "WOLFIP_ENABLE_LOOPBACK requires WOLFIP_MAX_INTERFACES > 1" +#endif +``` + +When enabled, `wolfIP_init()` configures **interface slot 0** as the loopback +device (`src/wolfip.c`): + +- IP `127.0.0.1`, mask `255.0.0.0` (`WOLFIP_LOOPBACK_IP` / `WOLFIP_LOOPBACK_MASK`, + i.e. `127.0.0.1/8`), gateway none.
+- `ifname` `"lo"`, `non_ethernet = 1`, with internal `poll`/`send` callbacks that + move frames through an in-memory queue (`wolfIP_loopback_poll` / + `wolfIP_loopback_send`) — there is no driver to wire. + +### The index shift + +Enabling loopback claims index `0`, so the **first hardware interface shifts to +index `1`**. This is encoded by `WOLFIP_PRIMARY_IF_IDX` (`src/wolfip.c`): + +```text +WOLFIP_ENABLE_LOOPBACK = 0 WOLFIP_ENABLE_LOOPBACK = 1 + idx 0 : first hardware iface idx 0 : loopback (127.0.0.1/8) + idx 1 : second hardware iface idx 1 : first hardware iface + ... idx 2 : second hardware iface + ... +WOLFIP_PRIMARY_IF_IDX = 0 WOLFIP_PRIMARY_IF_IDX = 1 +``` + +The consequence for the **legacy accessors** is exact: `wolfIP_getdev()`, +`wolfIP_ipconfig_set()/_get()`, and `wolfIP_recv()` all operate on +`WOLFIP_PRIMARY_IF_IDX`, so with loopback enabled they act on your hardware NIC at +**index 1**, not index 0. To touch a specific slot regardless of build, use the +`_ex` accessors with an explicit index: + +```c +/* loopback build: configure the real NIC explicitly at index 1 */ +wolfIP_ipconfig_set_ex(s, 1, my_ip, my_mask, my_gw); +struct wolfIP_ll_dev *nic = wolfIP_getdev_ex(s, 1); /* same as wolfIP_getdev() here */ +``` + +### What works over 127.0.0.1 + +The loopback interface is a normal interface as far as the socket layer is +concerned: a socket bound to or connecting to `127.0.0.1` exchanges UDP datagrams +and TCP segments with another local socket through the in-memory loopback queue, +never touching hardware. Loopback addresses are confined to it — frames carrying a +`127/8` source or destination that arrive on a non-loopback interface are dropped +(`src/wolfip.c`). The queue depth is `WOLFIP_LOOPBACK_QUEUE_DEPTH` (default 4); +when it drains, blocked senders are woken via +`wolfIP_notify_loopback_space_available()`. + +## 5. 
Troubleshooting + +**`WOLFIP_IP_ADD_MEMBERSHIP` returns `-WOLFIP_EINVAL`.** Either the address is not +a multicast group, or you joined with `imr_interface = INADDR_ANY` before giving +the resolved interface a source IP. Call `wolfIP_ipconfig_set*()` first, then +join. Joining the same `{interface, group}` twice on one socket also returns +`-WOLFIP_EINVAL`. + +**Joined but no multicast arrives.** Confirm the socket is bound to the +destination port and that you actually joined (an unjoined socket on the same port +gets nothing). On a real link, also confirm the upstream switch/router honors the +IGMPv3 report wolfIP sends on join. + +**Multicast sends never reach the network.** The default multicast TTL is `1`, +which does not cross a router. Raise it with `WOLFIP_IP_MULTICAST_TTL`. If you +expect to read your own sends back, leave `WOLFIP_IP_MULTICAST_LOOP` enabled and +remember the loopback copy is delivered inside `wolfIP_poll()` after the wire +send — poll the stack before `recvfrom()`. + +**Forwarding does nothing.** Check that `WOLFIP_ENABLE_FORWARDING = 1`, that +`WOLFIP_MAX_INTERFACES >= 2`, and that **both** interfaces have a configured, +non-zero IP — `wolfIP_forward_interface()` skips interfaces whose IP is +`IPADDR_ANY` and returns `-1` when `if_count < 2`. + +**Forwarded traffic stops with ICMP "time exceeded".** The packet's TTL reached 1 +at the router; this is expected behavior, not a bug. The originating host should +be using a TTL large enough for the hop count. + +**Wrong interface after enabling loopback.** Remember the index shift: the legacy +accessors now target index 1 (the first NIC). Use `wolfIP_getdev_ex()` / +`wolfIP_ipconfig_set_ex()` with explicit indices to avoid ambiguity. 
diff --git a/docs/dhcp_dns_howto.md b/docs/dhcp_dns_howto.md new file mode 100644 index 00000000..3ef577a2 --- /dev/null +++ b/docs/dhcp_dns_howto.md @@ -0,0 +1,387 @@ +# DHCP & DNS Client How-To + +This guide shows how to bring a wolfIP interface up automatically with the +**DHCP client** (RFC 2131) and how to resolve hostnames with the **DNS client** +(RFC 1035): how to start each one, how they make progress inside your poll loop, +how to read the results, and how to chain "get a lease, then resolve a host" +into a single startup sequence. + +It is a getting-started document, not a reference manual. The authoritative API +is `wolfip.h`; the worked examples come from `src/test/test_dhcp_dns.c` (a +full DHCP-then-DNS-then-connect test driven against `dnsmasq`) and +`src/port/stm32h563/main.c` (a bare-metal DHCP bring-up). + +## Table of Contents + +- [1. What the DHCP and DNS clients do (and do not)](#1-what-the-dhcp-and-dns-clients-do-and-do-not) +- [2. Mental model: everything happens inside `wolfIP_poll()`](#2-mental-model-everything-happens-inside-wolfip_poll) +- [3. The API](#3-the-api) +- [4. DHCP: acquiring a lease](#4-dhcp-acquiring-a-lease) +- [5. DHCP: lease lifecycle and renewal](#5-dhcp-lease-lifecycle-and-renewal) +- [6. DNS: configuring the resolver](#6-dns-configuring-the-resolver) +- [7. DNS: resolving a name](#7-dns-resolving-a-name) +- [8. End-to-end: DHCP up, then resolve a host](#8-end-to-end-dhcp-up-then-resolve-a-host) +- [9. Socket-pool implications](#9-socket-pool-implications) +- [10. Troubleshooting](#10-troubleshooting) + +--- + +## 1. What the DHCP and DNS clients do (and do not) + +wolfIP ships a **DHCP client** and a **DNS client**. Both are *client only* — there +is no DHCP server and no DNS server in the stack. + +The DHCP client implements the RFC 2131 four-message bring-up +(DISCOVER → OFFER → REQUEST → ACK) over UDP ports 68/67. 
On a successful ACK it +configures the primary interface's IP address, subnet mask and gateway, and (if +the server provided DHCP option 6) records a DNS server address. It then tracks +the lease and renews/rebinds it automatically. + +The DNS client implements RFC 1035 forward lookups (A records) over UDP port 53. +You hand it a hostname and a callback; when an answer arrives the callback fires +with the resolved IPv4 address. (A companion reverse-lookup function, +`wolfIP_dns_ptr_lookup()`, resolves PTR records to a name; the rest of this guide +focuses on the forward `nslookup()` path.) + +**Not** supported: acting as a DHCP or DNS server, IPv6 / AAAA records, more than +one outstanding DNS query at a time, caching of DNS answers, and TCP-based DNS +(truncated `TC` responses are dropped, not retried over TCP). + +## 2. Mental model: everything happens inside `wolfIP_poll()` + +Neither client blocks and neither spawns a thread. Both are driven entirely by +your normal `wolfIP_poll(s, now_ms)` loop: + +- `dhcp_client_init()` / `nslookup()` open a UDP socket, register an internal + callback on it, and send the first packet. +- Every later step — receiving an OFFER/ACK, sending the REQUEST, firing the + resolve callback, retransmitting on timeout, renewing the lease — happens when + `wolfIP_poll()` dispatches that UDP socket's readable event or services a + timer. + +```text + your loop: wolfIP_poll(s, now_ms) ──┐ + │ drains DHCP/DNS UDP sockets + │ runs DHCP discover/request timers + │ runs DHCP lease + DNS retransmit timers + ┌─────────────────────────────────────┘ + ▼ + dhcp_callback() ──▶ parse OFFER/ACK ──▶ set ip/mask/gw, dns_server, state=BOUND + dns_callback() ──▶ parse A record ──▶ lookup_cb(ip) +``` + +So the universal pattern is: kick off the operation once, then keep calling +`wolfIP_poll()` while polling a completion predicate (`dhcp_bound()`) or waiting +for a callback (`nslookup`). 
If `now_ms` does not advance between polls, the +retransmit and lease timers never fire. + +## 3. The API + +All declarations are in `wolfip.h`: + +```c +/* DHCP client */ +int dhcp_client_init(struct wolfIP *s); +int dhcp_bound(struct wolfIP *s); +int dhcp_client_is_running(struct wolfIP *s); +int wolfIP_dns_server_get(struct wolfIP *s, ip4 *dns_server); + +/* DNS client */ +int nslookup(struct wolfIP *s, const char *name, uint16_t *id, + void (*lookup_cb)(uint32_t ip)); +``` + +| Function | Returns | Meaning | +|----------|---------|---------| +| `dhcp_client_init(s)` | `0` on the first DISCOVER sent, negative on error | Opens/binds the DHCP UDP socket, picks a random transaction ID, and sends DISCOVER. Refuses (returns `-1`) if a DHCP session is already running. | +| `dhcp_bound(s)` | non-zero if a lease is held | True when state is `BOUND`, `RENEWING` or `REBINDING` — i.e. a usable IP is configured. | +| `dhcp_client_is_running(s)` | non-zero while in progress | True while DHCP is active but **not yet** bound (DISCOVER/REQUEST in flight). Useful as a loop guard so you stop polling if the client gives up. | +| `wolfIP_dns_server_get(s, &dns)` | `0` on success, `-WOLFIP_EINVAL` on null args | Reads back the DNS server address currently in effect (learned via DHCP or set statically). `0.0.0.0` means none is set. | +| `nslookup(s, name, &id, cb)` | `0` if the query was sent, negative otherwise | Sends an A-record query for `name`; `id` receives the DNS transaction ID; `cb(ip)` fires later with the answer (`ip` in host byte order). | + +`nslookup()` error returns worth knowing (from `src/wolfip.c`): + +| Value | Cause | +|-------|-------| +| `-22` | Invalid argument (null `s`/`name`/`id`/`cb`, or a name longer than 255 bytes / label longer than 63). | +| `-16` | A DNS query is already in progress (only one at a time). | +| `-101` | No DNS server configured (`dns_server == 0`). | + +## 4. 
DHCP: acquiring a lease + +The whole client-side sequence is: call `dhcp_client_init()` once, then poll +until `dhcp_bound()` is true. From `src/test/test_dhcp_dns.c`: + +```c +gettimeofday(&tv, NULL); +wolfIP_poll(s, tv.tv_sec * 1000 + tv.tv_usec / 1000); +dhcp_client_init(s); +do { + gettimeofday(&tv, NULL); + wolfIP_poll(s, tv.tv_sec * 1000 + tv.tv_usec / 1000); + usleep(1000); + wolfIP_ipconfig_get(s, &ip, &nm, &gw); +} while (!dhcp_bound(s)); +printf("DHCP: obtained IP address.\n"); +wolfIP_ipconfig_get(s, &ip, &nm, &gw); +``` + +Once `dhcp_bound()` returns true, `wolfIP_ipconfig_get(s, &ip, &nm, &gw)` +(`wolfip.h`) returns the leased address, mask and gateway — the same call you +would use after a static `wolfIP_ipconfig_set()`. The values are `ip4` in +network byte order. + +On constrained targets you usually also want a safety net: bound the wait so a +network with no DHCP server does not loop forever. The STM32H5 port +(`src/port/stm32h563/main.c`) combines `dhcp_bound()` with +`dhcp_client_is_running()` and a tick timeout: + +```c +dhcp_ret = dhcp_client_init(IPStack); +if (dhcp_ret < 0) { + /* DHCP init failed: socket pool full, or already running */ +} else { + dhcp_start_tick = tick; + while (!dhcp_bound(IPStack) && dhcp_client_is_running(IPStack)) { + (void)wolfIP_poll(IPStack, tick); + tick++; + delay(8000); /* ~1 ms per iteration */ + if ((tick - dhcp_start_tick) > dhcp_timeout) + break; /* safety-net timeout */ + } + if (dhcp_bound(IPStack)) { + ip4 ip = 0, nm = 0, gw = 0; + wolfIP_ipconfig_get(IPStack, &ip, &nm, &gw); + /* ... use ip/nm/gw ... */ + } +} +``` + +The `dhcp_client_is_running()` guard matters: if the DISCOVER and REQUEST retries +are exhausted the client returns to the `OFF` state, at which point +`dhcp_client_is_running()` goes false and the loop exits even though +`dhcp_bound()` never became true. 
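
The exit conditions of such a bounded wait can be factored into one pure function, which makes the timeout logic easy to test off-target. This is a minimal sketch, not code from the STM32H5 port: `dhcp_wait_check()` and its enum are hypothetical names, and the caller is assumed to pass in the current results of `dhcp_bound()` and `dhcp_client_is_running()` together with its own tick counter.

```c
#include <stdint.h>

/* Hypothetical verdicts for one iteration of the wait loop. */
enum dhcp_wait_verdict {
    DHCP_WAIT_CONTINUE = 0, /* keep polling */
    DHCP_WAIT_BOUND,        /* a lease is held */
    DHCP_WAIT_GAVE_UP,      /* client is no longer running and not bound */
    DHCP_WAIT_TIMEOUT       /* safety-net timeout elapsed */
};

/* Decides whether the wait loop should go on. 'bound' and 'running' are the
 * values the caller just read; ticks are in whatever unit the caller uses. */
static enum dhcp_wait_verdict dhcp_wait_check(int bound, int running,
                                              uint32_t start_tick,
                                              uint32_t now_tick,
                                              uint32_t timeout_ticks)
{
    if (bound)
        return DHCP_WAIT_BOUND;
    if (!running)
        return DHCP_WAIT_GAVE_UP;
    /* Unsigned subtraction stays correct across 32-bit tick wraparound. */
    if ((uint32_t)(now_tick - start_tick) > timeout_ticks)
        return DHCP_WAIT_TIMEOUT;
    return DHCP_WAIT_CONTINUE;
}
```

The loop body then reduces to polling the stack, advancing the tick, and breaking on any verdict other than `DHCP_WAIT_CONTINUE`. Distinguishing "gave up" from "timed out" lets the application decide whether to restart DHCP or fall back to a static address.
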
+ +Internally (`src/wolfip.c`), DISCOVER and REQUEST each retry up to three times +(`DHCP_DISCOVER_RETRIES` / `DHCP_REQUEST_RETRIES`, default `3`) with a +2-second base timeout (`DHCP_DISCOVER_TIMEOUT` / `DHCP_REQUEST_TIMEOUT`) and an +exponential backoff capped at `DHCP_BACKOFF_MAX_MS` (64 s). A `DHCPNAK` in +response to a REQUEST restarts the whole process from DISCOVER. + +## 5. DHCP: lease lifecycle and renewal + +A bound lease is not the end of the story. When the ACK is parsed +(`dhcp_parse_ack()` in `src/wolfip.c`) the client requires the mandatory +lease-time option (51) and arms timers from the renewal (T1) and rebind (T2) +times the server supplied. The state machine then moves: + +```text + DISCOVER_SENT ─OFFER─▶ REQUEST_SENT ─ACK─▶ BOUND + │ T1 expires + ▼ + RENEWING ─(no reply by T2)─▶ REBINDING + │ │ + ACK │ ACK │ + ▼ ▼ + BOUND ◀────────────────────────┘ +``` + +You do **not** call anything to renew — it happens inside `wolfIP_poll()` as the +timers fire. Note that `dhcp_bound()` deliberately returns true in `RENEWING` +and `REBINDING` as well as `BOUND`, because the address remains valid and usable +throughout renewal. If both renewal and rebinding fail before the lease expires, +the client tears the configuration down and returns to `OFF`. + +The DNS server learned from DHCP option 6 is stored at this point too, but only +if one is not already set (a statically configured DNS server is **not** +overwritten by DHCP — see §6). + +## 6. DNS: configuring the resolver + +`nslookup()` needs a DNS server address. There are two ways it gets one: + +1. **Learned from DHCP.** If the DHCP ACK carried option 6 (Domain Name Server) + *and* no DNS server was already configured, the client records the first + server. After `dhcp_bound()` you can read it back: + + ```c + ip4 dns = 0; + wolfIP_dns_server_get(s, &dns); /* 0 == none set */ + ``` + +2. **Configured statically for the built-in singleton stack.** Define + `WOLFIP_STATIC_DNS_IP` in `config.h`. 
The default configuration sets it to + Quad9: + + ```c + /* config.h */ + #define WOLFIP_STATIC_DNS_IP "9.9.9.9" + ``` + + `wolfIP_init_static()` applies it once at startup, **only if** the stack does + not already have a DNS server (`src/wolfip.c`): + + ```c + if (wolfIP_static.dns_server == 0) { + #ifdef WOLFIP_STATIC_DNS_IP + wolfIP_static.dns_server = atoip4(WOLFIP_STATIC_DNS_IP); + #endif + } + ``` + +Because the static value is applied first and DHCP only fills the slot when it is +still zero, a compile-time `WOLFIP_STATIC_DNS_IP` takes precedence over whatever +DHCP offers. Leave it undefined (and rely on DHCP) if you want the network to +choose the resolver. + +Two caveats: + +- The automatic `WOLFIP_STATIC_DNS_IP` assignment happens only in + `wolfIP_init_static()`. If your application allocates its own stack object and + calls `wolfIP_init()`, that macro is **not** applied for you. +- There is currently no public `wolfIP_dns_server_set()` API. For non-static + stacks, the practical choices today are "learn the resolver from DHCP option + 6" or set `s->dns_server` in your own integration code before calling + `nslookup()`. + +If neither path sets a server, `nslookup()` returns `-101`. + +## 7. DNS: resolving a name + +A lookup is a single non-blocking call plus a completion callback. The callback +has signature `void (*)(uint32_t ip)`, where `ip` is the resolved IPv4 address in +**host byte order**. From `src/test/test_dhcp_dns.c`: + +```c +static int example_com_resolved = 0; + +void ns_cb(uint32_t ip) +{ + printf("Obtained ip address for example.com: %s\n", + inet_ntoa(*(struct in_addr *)&ip)); + example_com_resolved = 1; +} + +/* ... */ +uint16_t dns_id; +nslookup(s, "example.com", &dns_id, ns_cb); + +while (!example_com_resolved) { + gettimeofday(&tv, NULL); + wolfIP_poll(s, tv.tv_sec * 1000 + tv.tv_usec / 1000); + usleep(1000); +} +``` + +The callback fires only on a successful answer. 
NXDOMAIN, a truncated UDP +response (`TC`), malformed replies, or retry exhaustion abort the query without +invoking `lookup_cb`, so production code should pair `nslookup()` with its own +timeout or state flag. + +Behaviour to rely on (verified in `src/wolfip.c`): + +- **One query at a time.** `nslookup()` returns `-16` if a query is already + outstanding. The slot frees when the callback fires or the query is aborted. +- **The transaction ID** written to `*id` is a random non-zero 16-bit value; the + response is dropped unless its ID matches, so a stale reply for an old query is + ignored. +- **When the callback fires.** `dns_callback()` runs inside `wolfIP_poll()` when + the answer datagram arrives. It walks the answer section and, on the first A + record (type A, class IN) it finds, invokes your `lookup_cb(ip)` exactly once, + then clears the query state. +- **The callback does not fire on failure.** A server error (`RCODE != 0`), a + truncated (`TC`) response, or exhausting the retransmit budget aborts the + query silently — the callback is simply never called. Time-box your wait + accordingly. +- **Retransmits.** The query is resent up to `DNS_QUERY_RETRIES` (default `3`) + times on a ~2 s timer (`DNS_QUERY_TIMEOUT`, with a jittered first interval) + before the query is abandoned. + +## 8. End-to-end: DHCP up, then resolve a host + +Putting §4 and §7 together gives the canonical startup sequence used in +`src/test/test_dhcp_dns.c`: bring the interface up with DHCP, then resolve a name +(using the DNS server DHCP just handed us, or the static fallback), then connect. + +```c +/* 1. Acquire a lease. */ +dhcp_client_init(s); +do { + gettimeofday(&tv, NULL); + wolfIP_poll(s, tv.tv_sec * 1000 + tv.tv_usec / 1000); + usleep(1000); +} while (!dhcp_bound(s)); +wolfIP_ipconfig_get(s, &ip, &nm, &gw); /* leased address now valid */ + +/* 2. Resolve a hostname. 
*/ +nslookup(s, "example.com", &dns_id, ns_cb); +while (!example_com_resolved) { + gettimeofday(&tv, NULL); + wolfIP_poll(s, tv.tv_sec * 1000 + tv.tv_usec / 1000); + usleep(1000); +} + +/* 3. ns_cb stashed the address; open a socket and connect to it as usual. */ +``` + +In the test the resolved address is then used to drive an echo client +(`test_wolfip_echoclient(s)`). The single shared `wolfIP_poll()` loop services +DHCP, DNS and the application sockets together — there is no separate "DHCP +thread" or "DNS thread". + +> Illustrative glue: in real code, have `ns_cb` save the `ip` into a global or +> a context struct (it is host byte order — convert with `ee32()` before placing +> it in a `wolfIP_sockaddr_in.sin_addr`) rather than only printing it, so step 3 +> can connect to it. + +## 9. Socket-pool implications + +Both clients consume entries from the **UDP socket pool**, sized by +`MAX_UDPSOCKETS` (`config.h`, default `2`): + +- `dhcp_client_init()` allocates one UDP socket (bound to port 68) for the + lifetime of the DHCP client. +- The first `nslookup()` (or `wolfIP_dns_ptr_lookup()`) lazily allocates one UDP + socket on port 53 and keeps it for reuse by later lookups. + +So if your application also opens UDP sockets, budget for DHCP (+1) and DNS (+1) +on top of your own. With the default `MAX_UDPSOCKETS 2`, enabling **both** DHCP +and DNS consumes the entire default pool — raise `MAX_UDPSOCKETS` in `config.h` +if your application needs UDP sockets of its own. If the pool is exhausted, +`dhcp_client_init()` returns negative and `nslookup()` fails to allocate its +socket. + +## 10. Troubleshooting + +**DHCP never gets bound.** Confirm `now_ms` advances between `wolfIP_poll()` +calls — the DISCOVER/REQUEST retransmit timers are time-driven, so a frozen clock +stalls the handshake. Check that the link is up and that DISCOVER broadcasts +(255.255.255.255:67) are reaching a server. 
After the retry budget the client +returns to `OFF`; use `dhcp_client_is_running()` as a loop guard to detect this +instead of spinning forever on `!dhcp_bound()`. + +**`dhcp_client_init()` returns negative.** Either DHCP is already running +(returns `-1`), or the UDP socket pool is full — raise `MAX_UDPSOCKETS`. + +**Lease binds but no DNS works.** The server may not have sent option 6. Read it +back with `wolfIP_dns_server_get()`; if it is `0.0.0.0`, set +`WOLFIP_STATIC_DNS_IP` in `config.h` as a fallback. + +**`nslookup()` returns `-101`.** No DNS server is configured. Either you called +it before `dhcp_bound()` populated option 6, or DHCP never supplied one and +`WOLFIP_STATIC_DNS_IP` is undefined. + +**`nslookup()` returns `-16`.** A previous query is still outstanding. Wait for +its callback (or its timeout to abort it) before issuing another — only one DNS +query runs at a time. + +**The resolve callback never fires.** This is the *expected* outcome on failure: +NXDOMAIN / server error (`RCODE != 0`), a truncated (`TC`) reply, or three timed-out +retransmits all abort the query without calling back. Do not wait unboundedly on +the callback flag — pair it with a timeout. Also confirm the DNS server address +is reachable and that `wolfIP_poll()` keeps running so the answer datagram is +drained. + +**Wrong-looking resolved address.** The callback delivers `ip` in **host byte +order**; convert with `ee32()` before storing it in a `sin_addr` (which is network +byte order). diff --git a/docs/http_server_howto.md b/docs/http_server_howto.md new file mode 100644 index 00000000..68407b45 --- /dev/null +++ b/docs/http_server_howto.md @@ -0,0 +1,405 @@ +# HTTP/HTTPS Server How-To + +This guide shows how to use the wolfIP **HTTP server** module (`src/http/`): how +to build it, how to bring up a server on a TCP port, how to register static +pages and dynamic handlers, and how to layer TLS on top of it to serve HTTPS. + +It is a getting-started document. 
The authoritative API is `src/http/httpd.h`; +the worked examples come from `src/test/test_httpd.c` (a TLS server serving a +static page) and `src/port/stm32h563/main.c` (an HTTPS status page served by a +dynamic handler on an STM32H5). + +## Table of Contents + +- [1. What the HTTP module is](#1-what-the-http-module-is) +- [2. Building with HTTP](#2-building-with-http) +- [3. Architecture: event callbacks, no threads](#3-architecture-event-callbacks-no-threads) +- [4. The server lifecycle](#4-the-server-lifecycle) +- [5. Registering content: static pages and handlers](#5-registering-content-static-pages-and-handlers) +- [6. The handler/callback API](#6-the-handlercallback-api) +- [7. Reading the request: methods, query and form args](#7-reading-the-request-methods-query-and-form-args) +- [8. Building the response](#8-building-the-response) +- [9. Serving a page: a minimal example](#9-serving-a-page-a-minimal-example) +- [10. Enabling HTTPS (TLS layering)](#10-enabling-https-tls-layering) +- [11. URL encoding/decoding helpers](#11-url-encodingdecoding-helpers) +- [12. Troubleshooting](#12-troubleshooting) + +--- + +## 1. What the HTTP module is + +`src/http/` is a small, self-contained HTTP/1.1 server (`httpd.c`/`httpd.h`) +built directly on the wolfIP socket API. Like the rest of wolfIP it performs +**zero dynamic allocation**: the server, its client slots, and the routing table +all live inside a single caller-provided `struct httpd`. Request and response +buffers are fixed-size and stack-allocated per call. + +The server is **event-driven**. It registers wolfIP socket callbacks and is +driven entirely from inside `wolfIP_poll()` on a single thread — there is no +accept loop or per-connection thread to manage. When data arrives on a client +socket, wolfIP calls back into the module, which parses one request and emits a +response synchronously. + +It supports: + +- **GET** and **POST** requests (any other method is rejected as + `400 Bad Request`). 
+- **Static pages** registered by path, served with `Content-Type: text/html`.
+- **Dynamic handlers** registered by path, which build the response themselves.
+- **HTTPS** by handing the server a `WOLFSSL_CTX`, in which case every client
+  connection is wrapped in a TLS session via the wolfSSL-over-wolfIP I/O layer.
+
+The module is deliberately minimal. Responses are sent immediately as they are
+generated, with no buffering; if the output socket is congested the write fails
+and the connection is closed (see the header comment in `httpd.c`).
+
+## 2. Building with HTTP
+
+The HTTP server is gated by the `WOLFIP_ENABLE_HTTP` macro. The default POSIX
+`config.h` enables it:
+
+```c
+/* config.h */
+/* Enable HTTP server for POSIX builds */
+#ifndef WOLFIP_ENABLE_HTTP
+#define WOLFIP_ENABLE_HTTP
+#endif
+```
+
+Unlike the TFTP module, `src/http/httpd.c` is **not** globbed into the main
+wolfIP library. It is compiled and linked per target, and it always depends on
+wolfSSL (its header pulls in a wolfSSL header unconditionally, even for plain
+HTTP). The relevant flags, from the top-level `Makefile`, are:
+
+| Flag | Purpose |
+|------|---------|
+| `-DWOLFIP_ENABLE_HTTP` | Compiles the module in (everything in `httpd.c`/`httpd.h` is `#ifdef`-guarded by it). |
+| `-DWOLFSSL_WOLFIP` | Selects the wolfSSL-over-wolfIP I/O backend used for HTTPS. |
+| `-Isrc/http` | Lets sources `#include "httpd.h"`. |
+
+To build the bundled HTTP/TLS server test:
+
+```sh
+make build/test-httpd
+```
+
+That target links `build/http/httpd.o`, the wolfSSL I/O glue
+(`build/port/wolfssl_io.o`), and a test certificate/key against `-lwolfssl`
+(`Makefile`). The CMake build defines an equivalent `test-httpd` target and a
+matching `add_test(NAME httpd ...)`.
+
+To use the module in your own build, compile `src/http/httpd.c` with the flags
+above, add `src/http` to the include path, and link wolfSSL.
+
+## 3.
Architecture: event callbacks, no threads + +The module installs two kinds of wolfIP callback and never blocks: + +```text + wolfIP_poll(s, now_ms) + │ + ├─ listen socket readable ─▶ http_accept_cb() + │ │ + │ ├─ wolfIP_sock_accept() ─▶ new client_sd + │ ├─ claim a free clients[] slot (max HTTPD_MAX_CLIENTS) + │ ├─ if ssl_ctx: wolfSSL_new() + wolfSSL_SetIO_wolfIP() + │ └─ wolfIP_register_callback(client_sd, http_recv_cb, &client) + │ + └─ client socket readable ─▶ http_recv_cb() + │ + ├─ read : wolfSSL_read() (HTTPS) or wolfIP_sock_recv() + ├─ parse : parse_http_request() ── one request, in place + ├─ route : http_find_url(path) + │ ├─ static_content ─▶ headers + body + │ └─ handler(...) ─▶ app builds the response + └─ write : wolfSSL_write() (HTTPS) or wolfIP_sock_send() +``` + +All you do at the application level is: create a `wolfIP` stack, call +`httpd_init()`, register your routes, and keep calling `wolfIP_poll()` from your +main loop. Everything above happens inside `poll`. + +## 4. The server lifecycle + +A server is one `struct httpd` plus a call to `httpd_init()` +(`src/http/httpd.h`): + +```c +int httpd_init(struct httpd *httpd, struct wolfIP *s, uint16_t port, void *ssl_ctx); +``` + +`httpd_init()` zeroes the `struct httpd`, then creates a TCP socket, binds it to +`port`, calls `listen()` with a backlog of 5, and registers the accept callback +(`httpd.c`). Pass `ssl_ctx = NULL` for plain HTTP, or a configured +`WOLFSSL_CTX *` for HTTPS (see [section 10](#10-enabling-https-tls-layering)). +It returns `0` on success or `-1` if any socket step fails. + +The lifecycle is: + +1. Bring up the wolfIP stack and configure its IP (DHCP or static). +2. `httpd_init(&httpd, s, port, ssl_ctx)`. +3. Register one or more routes with `httpd_register_static_page()` and/or + `httpd_register_handler()`. +4. Drive the stack: call `wolfIP_poll(s, now_ms)` in your main loop forever. 
+ +There is no explicit teardown call; individual client connections are closed +automatically by the module on error or after a response. `struct httpd` holds +its own listen socket, so its lifetime must match the server's. + +## 5. Registering content: static pages and handlers + +Routes are matched by **exact path** (`strcmp`), up to `HTTPD_MAX_URLS` (16) +entries. There are two registration calls (`src/http/httpd.h`): + +```c +int httpd_register_static_page(struct httpd *httpd, const char *path, + const char *content); +int httpd_register_handler(struct httpd *httpd, const char *path, + int (*handler)(struct httpd *httpd, struct http_client *hc, + struct http_request *req)); +``` + +- **Static page** — `content` is a NUL-terminated string the module serves + verbatim with status `200 OK` and `Content-Type: text/html`. The string is + referenced by pointer, not copied, so it must outlive the server. +- **Handler** — for any request to `path`, the module calls your function and + lets it build the entire response. + +Both copy the path into the route table (truncated to `HTTP_PATH_LEN`, 128) and +return `0`, or `-1` if all 16 slots are taken. Each successful registration +consumes one slot; there is no public deregistration API. A registered route with +neither a handler nor static content yields `503 Service Unavailable`; an +unmatched path yields `404 Not Found` (`parse_http_request()` in `httpd.c`). + +## 6. The handler/callback API + +A handler has this exact signature (`src/http/httpd.h`): + +```c +int my_handler(struct httpd *httpd, struct http_client *hc, + struct http_request *req); +``` + +- `httpd` — the server instance (often unused; cast to `(void)`). +- `hc` — the opaque per-client context. You pass it back to the response + helpers; you do not write its fields directly. +- `req` — the parsed request (see below). The struct is stack-allocated for the + duration of the call, so copy out anything you need to keep. 
+ +The parsed request (`struct http_request`, `httpd.h`) is fixed-size: + +```c +struct http_request { + char method[HTTP_METHOD_LEN]; /* "GET", "POST" (max 8) */ + char path[HTTP_PATH_LEN]; /* URL path, percent-decoded (max 128) */ + char query[HTTP_QUERY_LEN]; /* raw query string (max 256) */ + char headers[HTTP_HEADERS_LEN]; /* last header line seen (max 512) */ + char body[HTTP_BODY_LEN]; /* request body (max 1024) */ + size_t body_len; +}; +``` + +The return value is propagated by the module: return `0` after you have sent a +response. A negative return from your handler causes the module to close the +client connection (`http_recv_cb()` treats a negative parse/handler result as a +failure and tears the connection down). + +> **Note.** `req->headers` holds only the **last** header line parsed, not the +> full header block — the parser reuses one buffer. Use it for at most a single +> expected header; framing headers (`Content-Length`, `Transfer-Encoding`) are +> consumed internally and are not meant to be re-read here. + +## 7. Reading the request: methods, query and form args + +Only `GET` and `POST` reach a route; anything else is rejected before dispatch. +To pull a named argument out of a GET query string or a POST form body, use the +helper (`src/http/httpd.h`): + +```c +int httpd_get_request_arg(struct http_request *req, const char *name, + char *value, size_t value_len); +``` + +It searches `req->query` for a `GET` and `req->body` for a `POST`, looking for +`name=value` pairs separated by `&`. On a match it copies the value (NUL +terminated) into `value` and returns `0`; it returns `-1` if the key is not +found, the method is unsupported, or `value_len` is too small +(`httpd_get_request_arg()` in `httpd.c`). + +```c +char user[32]; +if (httpd_get_request_arg(req, "user", user, sizeof(user)) == 0) { + /* user now holds the value of ?user=... or user=... 
*/ +} +``` + +The body parser derives the body length from the declared `Content-Length` and +rejects `Transfer-Encoding` (chunked) request bodies, as well as a body with no +`Content-Length`, to avoid request-smuggling ambiguity (see the comment in +`parse_http_request()`). + +## 8. Building the response + +The response helpers all take the `struct http_client *hc` from your handler and +write straight to the socket (plain or TLS). From `src/http/httpd.h`: + +| Function | Purpose | +|----------|---------| +| `http_send_response_headers(hc, code, status_text, content_type, content_length)` | Emit the status line + headers. `content_length == 0` switches to `Transfer-Encoding: chunked`. | +| `http_send_response_body(hc, body, len)` | Send a response body (after fixed-length headers). | +| `http_send_response_chunk(hc, chunk, len)` | Send one chunk (after chunked headers). | +| `http_send_response_chunk_end(hc)` | Terminate a chunked response. | +| `http_send_200_OK(hc)` | Shorthand: 200 headers, `text/plain`, chunked. | +| `http_send_500_server_error(hc)` | Shorthand: `500 Internal Server Error`. | +| `http_send_503_service_unavailable(hc)` | Shorthand: `503 Service Unavailable`. | +| `http_send_418_teapot(hc)` | Shorthand: `418 I'm a teapot`. | + +The common fixed-length pattern is **headers then body**: + +```c +http_send_response_headers(hc, HTTP_STATUS_OK, "OK", "text/html", len); +http_send_response_body(hc, response, len); +return 0; +``` + +For a response whose size is not known up front, send chunked headers +(`content_length == 0`), then any number of `http_send_response_chunk()` calls, +and finish with `http_send_response_chunk_end()`. Any write failure inside these +helpers closes the client connection automatically. 
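To make the chunked pattern concrete, here is a minimal sketch of the standard HTTP/1.1 chunk framing that such a response uses on the wire. It illustrates the generic format only and is not a copy of what `httpd.c` does internally; `frame_chunk` and `CHUNK_END` are hypothetical names introduced for this example.

```c
#include <stdio.h>
#include <string.h>

/* Last-chunk marker that ends a chunked body: zero length, then a blank line. */
static const char CHUNK_END[] = "0\r\n\r\n";

/* Hypothetical helper, not part of httpd.h: frame one chunk as
 * "<hex length>\r\n<data>\r\n" into out.
 * Returns the framed size in bytes, or -1 if out is too small. */
static int frame_chunk(char *out, size_t out_len, const char *data, size_t len)
{
    int hdr = snprintf(out, out_len, "%zx\r\n", len);

    if (hdr < 0 || (size_t)hdr + len + 2 > out_len)
        return -1;
    memcpy(out + hdr, data, len);        /* chunk payload */
    memcpy(out + hdr + len, "\r\n", 2);  /* chunk trailer */
    return hdr + (int)len + 2;
}
```

Because a zero-length chunk is the end-of-body marker, a caller should skip empty writes rather than frame them, and emit the terminator exactly once when the response is complete.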
+
+Status code constants are defined in `httpd.h`: `HTTP_STATUS_OK` (200),
+`HTTP_STATUS_BAD_REQUEST` (400), `HTTP_STATUS_NOT_FOUND` (404),
+`HTTP_STATUS_TEAPOT` (418), `HTTP_STATUS_TOO_MANY_REQUESTS` (429),
+`HTTP_STATUS_INTERNAL_SERVER_ERROR` (500), `HTTP_STATUS_SERVICE_UNAVAILABLE`
+(503).
+
+## 9. Serving a page: a minimal example
+
+A complete dynamic handler that builds an HTML page and returns it, condensed
+from `src/port/stm32h563/main.c` (`https_status_handler`):
+
+```c
+static int status_handler(struct httpd *httpd, struct http_client *hc,
+                          struct http_request *req)
+{
+    char response[512];
+    int len;
+    (void)httpd;
+
+    /* Build the page (req->method / req->path are available if needed) */
+    len = snprintf(response, sizeof(response),
+        "<html><head><title>wolfIP</title></head>"
+        "<body><h1>wolfIP Status</h1>"
+        "<p>You asked for: %s</p></body></html>",
+        req->path);
+
+    http_send_response_headers(hc, HTTP_STATUS_OK, "OK", "text/html", len);
+    http_send_response_body(hc, response, len);
+    return 0;
+}
+```
+
+Wiring it up (the static-page variant is even shorter, from
+`src/test/test_httpd.c`):
+
+```c
+struct httpd httpd;
+const char homepage[] = "<html><body><h1>Hello, world!</h1></body></html>";
+
+/* Plain HTTP on port 80; pass NULL for the ssl_ctx argument. */
+if (httpd_init(&httpd, s, 80, NULL) < 0)
+    return -1;
+
+httpd_register_static_page(&httpd, "/", homepage); /* serves homepage */
+httpd_register_handler(&httpd, "/status", status_handler);
+
+/* Main loop: the server runs entirely inside wolfIP_poll().
+ * now_ms() stands for your platform's millisecond clock. */
+for (;;) {
+    uint32_t ms_next = wolfIP_poll(s, now_ms());
+    /* sleep up to ms_next, service other work, then loop */
+}
+```
+
+That is the whole server. `httpd_register_static_page(&httpd, "/", homepage)`
+is exactly what `test_httpd.c` does (over TLS, on port 443).
+
+## 10. Enabling HTTPS (TLS layering)
+
+HTTPS is enabled by passing a configured `WOLFSSL_CTX *` as the fourth argument
+to `httpd_init()` instead of `NULL`. The module then:
+
+- calls `wolfSSL_SetIO_wolfIP_CTX(ssl_ctx, ipstack)` in `httpd_init()` to bind
+  the TLS context to the wolfIP I/O backend, and
+- for every accepted connection creates a `WOLFSSL` with `wolfSSL_new()` and
+  attaches it to the client socket with `wolfSSL_SetIO_wolfIP()`
+  (`http_accept_cb()`).
+
+After that, every read uses `wolfSSL_read()` and every write
+`wolfSSL_write()` transparently; the same handlers and response helpers work
+for HTTP and HTTPS unchanged. The wolfSSL-over-wolfIP I/O glue
+(`wolfSSL_SetIO_wolfIP*`, `wolfSSL_CleanupIO_wolfIP`) lives in
+`src/port/wolfssl_io.c` and must be linked in.
+
+Setting up the context is ordinary wolfSSL.
From `src/test/test_httpd.c`: + +```c +WOLFSSL_CTX *server_ctx = wolfSSL_CTX_new(wolfTLSv1_2_server_method()); + +wolfSSL_CTX_use_certificate_buffer(server_ctx, server_der, server_der_len, + SSL_FILETYPE_ASN1); +wolfSSL_CTX_use_PrivateKey_buffer(server_ctx, server_key_der, server_key_der_len, + SSL_FILETYPE_ASN1); + +httpd_init(&httpd, s, 443, server_ctx); /* HTTPS on 443 */ +httpd_register_static_page(&httpd, "/", homepage); +``` + +`src/port/stm32h563/main.c` does the same with `wolfTLSv1_3_server_method()` and +PEM certificate/key buffers, then registers a dynamic handler. Either TLS +version works; the choice is entirely in how you build the `WOLFSSL_CTX`. + +On the read path, a `WOLFSSL_ERROR_WANT_READ` from `wolfSSL_read()` is handled +internally (the callback simply returns and waits for more data), so partial TLS +records do not break the connection. + +## 11. URL encoding/decoding helpers + +The module exposes its in-place percent-codec (`src/http/httpd.h`); the request +parser already uses `http_url_decode()` on `req->path`, but the functions are +public if you need them: + +```c +/* Returns the decoded length, or a negative error code. */ +int http_url_decode(char *buf, size_t len); +int http_url_encode(char *buf, size_t len, size_t max_len); +``` + +`http_url_decode()` rewrites `%XX` escapes in place and returns the new length, +or `HTTP_URL_DECODE_ERR_TRUNCATED` (-1) / `HTTP_URL_DECODE_ERR_BAD_ESCAPE` (-2). +`http_url_encode()` escapes spaces as `%20`, growing the buffer up to `max_len`, +and returns the new length or `-1` if there is not enough room. + +## 12. Troubleshooting + +- **Server never accepts connections.** `httpd_init()` returned `-1` (socket, + bind, or listen failed) — check the port is free and the stack has an IP + configured before `httpd_init()`. Also confirm you are calling + `wolfIP_poll()` continuously; nothing happens between polls. +- **All requests get `404`.** Routes are matched by **exact** path with + `strcmp`. 
`/status` and `/status/` are different, and there is no prefix or + wildcard matching. Register the exact path the client requests. +- **A registered route returns `503`.** The route exists but has neither a + handler nor static content (e.g. a handler registered as `NULL`). +- **POST body is empty or the request is `400`.** The parser requires a valid + `Content-Length` matching the body length and rejects chunked + (`Transfer-Encoding`) request bodies. The body is also capped at + `HTTP_BODY_LEN` (1024); larger bodies are rejected. +- **HTTPS connection resets at handshake.** Confirm `WOLFSSL_WOLFIP` is defined, + `src/port/wolfssl_io.c` is linked, the certificate/key loaded into the + `WOLFSSL_CTX` without error, and `wolfSSL_Init()` was called at startup. +- **Connections drop under load / large responses truncated.** Responses are + written immediately with no buffering; a congested or full TX window makes the + send fail and the module closes the connection. Keep responses within a TX + window, or send incrementally with the chunked helpers. +- **Fifth concurrent client is refused.** Only `HTTPD_MAX_CLIENTS` (4) client + slots exist; a new accept with no free slot is dropped. diff --git a/docs/ipsec_esp_howto.md b/docs/ipsec_esp_howto.md new file mode 100644 index 00000000..64cc5a36 --- /dev/null +++ b/docs/ipsec_esp_howto.md @@ -0,0 +1,370 @@ +# IPsec ESP How-To + +This guide shows how to secure wolfIP traffic with **IPsec ESP in transport +mode**: how to build wolfIP with ESP support, how to install Security +Associations (SAs), and how to interoperate with the Linux kernel's IPsec +stack for testing. + +It is a getting-started document, not a reference manual. The authoritative API +is `wolfesp.h`; the worked examples come from `src/test/esp/` and the +`ip xfrm` helper scripts in `tools/ip-xfrm/`. + +## Table of Contents + +- [1. What wolfIP ESP does (and does not) do](#1-what-wolfip-esp-does-and-does-not-do) +- [2. 
Building with ESP support](#2-building-with-esp-support) +- [3. Mental model: SAs and the data path](#3-mental-model-sas-and-the-data-path) +- [4. The SA API](#4-the-sa-api) +- [5. Step by step: securing a socket](#5-step-by-step-securing-a-socket) +- [6. Choosing a cipher suite](#6-choosing-a-cipher-suite) +- [7. Key material and IV/nonce handling](#7-key-material-and-ivnonce-handling) +- [8. Interoperating with Linux `ip xfrm`](#8-interoperating-with-linux-ip-xfrm) +- [9. Inspecting ESP traffic in Wireshark](#9-inspecting-esp-traffic-in-wireshark) +- [10. Troubleshooting](#10-troubleshooting) + +--- + +## 1. What wolfIP ESP does (and does not) do + +wolfIP implements the ESP datagram format (RFC 4303) in **transport mode**: it +encrypts and authenticates the payload of an IPv4 packet (the TCP/UDP/ICMP +segment) while leaving the IP header in the clear. When an SA matches, an +outbound packet is wrapped into ESP before it leaves the interface, and an +inbound ESP packet (IP protocol number 50) is unwrapped and verified before the +stack dispatches it to TCP/UDP/ICMP. + +Supported: + +- **Transport mode** only. +- Cipher suites: AES-CBC + HMAC (RFC 3602 / 4868), 3DES-CBC + HMAC (RFC 2451), + AES-GCM (RFC 4106), AES-GMAC (RFC 4543). +- A small static pool of SAs (`WOLFIP_ESP_NUM_SA`, default 2 inbound + 2 + outbound) with a 32-packet anti-replay window. + +The ESP data path is deliberately scoped for embedded use. Tunnel mode, UDP +encapsulation of ESP, IPv6, ESP across forwarded/routed packets, and a full SPD +(security policy database) with port/proto selectors are outside the current +scope — matching is by `{src IP, dst IP, SPI}`. These can be added as +deployments come to need them. + +**No IKE is a design choice, not a gap.** You provision keys out of band on both +peers, exactly as you would with a manually-keyed `ip xfrm state` on Linux. 
For +embedded line rates, devices are not expected to rekey often or to exhaust an SA +rapidly, so an online key-exchange daemon buys little for the code and attack +surface it adds. Sequence numbers are 32-bit, which comfortably covers the +packet volumes these links carry between planned key rotations. + +## 2. Building with ESP support + +ESP relies on wolfCrypt for the cipher and HMAC primitives, so you need wolfSSL +installed first. To exercise all suites, build it with 3DES and the streaming +AES-GCM API: + +```sh +./configure --enable-des3 --enable-aesgcm-stream +make +sudo make install +``` + +This is a full wolfSSL build so a single library can exercise every ESP suite +for interop testing. A practical deployment does not need all of it: ESP only +uses wolfCrypt, so you can build with `--enable-cryptonly` and enable just the +specific algorithms your SAs use (for example only AES-GCM), which yields a much +slimmer library. + +Then compile wolfIP with the ESP feature flags (`Makefile`, `ESP_CFLAGS`): + +``` +-DWOLFIP_ESP -DWOLFSSL_WOLFIP +``` + +A plain `make` in the wolfIP tree already produces the two ESP example +binaries, linked against `-lwolfssl`: + +- `./build/test-esp` — a self-contained event-loop test that spawns both an ESP + client and server. +- `./build/esp-server` — an ESP echo server (TCP or UDP) you can drive from the + Linux host. + +When ESP is enabled, `src/wolfesp.c` is included directly into `src/wolfip.c` +(see the `#include "src/wolfesp.c"` guarded by `WOLFIP_ESP`), so the ESP hooks +are compiled into the core data path rather than linked as a separate module. + +## 3. Mental model: SAs and the data path + +An **SA (Security Association)** is a one-way agreement on how to protect a flow: +a direction, a `{src, dst}` IP pair, an SPI (a 4-byte tag carried in every ESP +packet so the receiver knows which SA to use), and the cipher/auth keys. A +two-way TCP connection therefore needs **two** SAs: one outbound and one +inbound. 
+ +wolfIP keeps two separate pools: + +- **inbound SAs** — consulted when a packet with IP protocol 50 arrives. wolfIP + reads the SPI from the ESP header, finds the matching inbound SA, verifies the + ICV, decrypts, and hands the plaintext packet to the normal input path. +- **outbound SAs** — consulted just before a packet is transmitted. If an + outbound SA matches the packet's `{src, dst}`, wolfIP encrypts and wraps it as + ESP; otherwise the packet goes out in the clear. + +```text + socket send ─▶ TCP/UDP build ─▶ [outbound SA match?] ─yes▶ ESP wrap ─▶ wire + │no + └────────────────────────────▶ wire + + wire ─▶ IP input ─▶ [proto == 50 (ESP)?] ─yes▶ inbound SA lookup by SPI + │no ─▶ verify ICV ─▶ decrypt + └─▶ TCP/UDP/ICMP ◀──────────────────────┘ +``` + +The application never sees ESP. You install SAs once at startup, then use +ordinary `wolfIP_sock_*` calls (or the BSD wrapper). Encryption happens +transparently inside `wolfIP_poll()`. + +## 4. The SA API + +All functions are declared in `wolfesp.h`. Initialise the subsystem once, then +install one SA per direction. The first argument `in` selects the pool: +`1` = inbound, `0` = outbound. 
+ +```c +int wolfIP_esp_init(void); +void wolfIP_esp_sa_del_all(void); +void wolfIP_esp_sa_del(int in, uint8_t *spi); + +/* AEAD: AES-GCM (RFC 4106) or AES-GMAC (RFC 4543) */ +int wolfIP_esp_sa_new_gcm(int in, uint8_t *spi, ip4 src, ip4 dst, + esp_enc_t enc, uint8_t *enc_key, uint8_t enc_key_len); + +/* Authentication-only (NULL encryption) HMAC SA */ +int wolfIP_esp_sa_new_hmac(int in, uint8_t *spi, ip4 src, ip4 dst, + esp_auth_t auth, uint8_t *auth_key, + uint8_t auth_key_len, uint8_t icv_len); + +/* AES-CBC encryption + HMAC authentication */ +int wolfIP_esp_sa_new_cbc_hmac(int in, uint8_t *spi, ip4 src, ip4 dst, + uint8_t *enc_key, uint8_t enc_key_len, + esp_auth_t auth, uint8_t *auth_key, + uint8_t auth_key_len, uint8_t icv_len); + +/* 3DES-CBC encryption + HMAC authentication (key length fixed at 24 bytes) */ +int wolfIP_esp_sa_new_des3_hmac(int in, uint8_t *spi, ip4 src, ip4 dst, + uint8_t *enc_key, esp_auth_t auth, + uint8_t *auth_key, uint8_t auth_key_len, + uint8_t icv_len); +``` + +`spi` is a 4-byte array. `src`/`dst` are `ip4` values in **network byte order** +(use `atoip4("a.b.c.d")`). All functions return `0` on success and a negative +error on failure (for example, when the SA pool is full — raise +`WOLFIP_ESP_NUM_SA` if you need more than two SAs per direction). + +The relevant cipher/auth enums (`wolfesp.h`): + +```c +typedef enum { ESP_ENC_NONE, ESP_ENC_CBC_AES, ESP_ENC_CBC_DES3, + ESP_ENC_GCM_RFC4106, ESP_ENC_GCM_RFC4543 } esp_enc_t; + +typedef enum { ESP_AUTH_NONE, ESP_AUTH_MD5_RFC2403, ESP_AUTH_SHA1_RFC2404, + ESP_AUTH_SHA256_RFC4868, ESP_AUTH_GCM_RFC4106, + ESP_AUTH_GCM_RFC4543 } esp_auth_t; +``` + +ICV length constants: `ESP_ICVLEN_HMAC_96` (12) and `ESP_ICVLEN_HMAC_128` (16) +for HMAC suites; `ESP_GCM_RFC4106_ICV_LEN` (16) for GCM. + +## 5. Step by step: securing a socket + +The `esp-server` example (`src/test/esp/esp_server.c`) is the canonical +walk-through. 
The structure is: init the ESP subsystem, install the inbound and +outbound SAs, then open and use sockets normally. + +```c +#include "wolfip.h" +#include "wolfesp.h" + +/* SPIs (any unique 4-byte tags, agreed with the peer) */ +static uint8_t in_spi[ESP_SPI_LEN] = { 0x03, 0x03, 0x03, 0x03 }; +static uint8_t out_spi[ESP_SPI_LEN] = { 0x04, 0x04, 0x04, 0x04 }; + +/* AES-CBC keys (32 bytes) + HMAC-SHA256 keys (16 bytes), shared with the peer */ +static uint8_t in_enc[32] = { /* ... */ }; +static uint8_t out_enc[32] = { /* ... */ }; +static uint8_t in_auth[16] = { /* ... */ }; +static uint8_t out_auth[16] = { /* ... */ }; + +#define WOLFIP_IP "10.10.10.2" /* this node */ +#define PEER_IP "10.10.10.1" /* the other end */ + +int secure_init(void) +{ + int err = wolfIP_esp_init(); + if (err) return err; + + /* Inbound SA: packets FROM the peer TO us. src = peer, dst = us. */ + err = wolfIP_esp_sa_new_cbc_hmac(1, in_spi, + atoip4(PEER_IP), atoip4(WOLFIP_IP), + in_enc, sizeof(in_enc), + ESP_AUTH_SHA256_RFC4868, in_auth, sizeof(in_auth), + ESP_ICVLEN_HMAC_128); + if (err) return err; + + /* Outbound SA: packets FROM us TO the peer. src = us, dst = peer. */ + err = wolfIP_esp_sa_new_cbc_hmac(0, out_spi, + atoip4(WOLFIP_IP), atoip4(PEER_IP), + out_enc, sizeof(out_enc), + ESP_AUTH_SHA256_RFC4868, out_auth, sizeof(out_auth), + ESP_ICVLEN_HMAC_128); + return err; +} +``` + +After `secure_init()` succeeds, application code is unchanged — open a socket, +`bind`, `listen`/`accept` or `connect`, then `send`/`recv`. Every packet that +matches the outbound SA is encrypted on the way out, and matching inbound ESP +packets are verified and decrypted before your `recv` sees them. + +Note the **src/dst asymmetry** between the two SAs: the inbound SA's `dst` is +this node, the outbound SA's `src` is this node. The peer installs the mirror +image (its inbound SA = our outbound SA, same SPI and keys). 
Getting this +backwards is the most common cause of "packets go out encrypted but nothing +comes back." + +## 6. Choosing a cipher suite + +The `esp-server` example exposes all four supported suites via `-m `, +which is a convenient map of what to call: + +| Mode | Suite | Constructor | Notes | +|------|-------|-------------|-------| +| 0 | AES-GCM (RFC 4106) | `wolfIP_esp_sa_new_gcm(..., ESP_ENC_GCM_RFC4106, ...)` | AEAD; needs `--enable-aesgcm-stream`. Recommended. | +| 1 | AES-CBC + HMAC-SHA256 | `wolfIP_esp_sa_new_cbc_hmac(...)` | Encrypt-then-MAC; widely interoperable. | +| 2 | 3DES-CBC + HMAC-SHA256 | `wolfIP_esp_sa_new_des3_hmac(...)` | Legacy; needs `--enable-des3`. | +| 3 | AES-GMAC (RFC 4543) | `wolfIP_esp_sa_new_gcm(..., ESP_ENC_GCM_RFC4543, ...)` | Authentication only, no confidentiality. | + +For new designs prefer **AES-GCM (mode 0)**: a single AEAD pass for both +confidentiality and integrity, and the smallest per-packet overhead of the +encrypting suites. + +## 7. Key material and IV/nonce handling + +A few practical rules drawn from the example key tables +(`src/test/esp/esp_common.c`): + +- **GCM/GMAC keys carry a 4-byte salt.** For the RFC 4106/4543 constructors the + key buffer is the cipher key **plus** a trailing 4-byte nonce salt. The + example uses a 36-byte buffer (32-byte AES-256 key + 4-byte salt) and passes + `sizeof(key)` (36). The salt may be public and is shared with the peer; you + never supply a per-packet IV yourself. +- **The GCM nonce is built deterministically** following NIST SP 800-38D + §8.2.1. At SA creation wolfIP draws an 8-byte random pre-IV from + `wc_RNG_GenerateBlock()`; per packet, the low 32 bits of that pre-IV are XORed + with the ESP sequence number to form the 8-byte IV, and the 4-byte salt is + prepended to make the 12-byte GCM nonce. This gives a monotonic counter with a + random starting point, so nonces never repeat under a given key. 
+- **CBC/3DES keys do not carry the salt.** When reusing the same key bytes for + a CBC SA, pass `sizeof(key) - 4` to drop the trailing salt (the example does + exactly this: `sizeof(in_enc_key) - 4`). The CBC IV is generated randomly per + packet from wolfCrypt's RNG. +- **3DES keys are fixed at 24 bytes** (`ESP_DES3_KEY_LEN`); the constructor does + not take a length argument. +- **HMAC auth keys** are typically 16 bytes for HMAC-SHA256; the ICV length you + pass (`ESP_ICVLEN_HMAC_96` or `_128`) must match the peer. + +Keys must be identical on both peers for a given SPI/direction, and the +inbound/outbound pairing must mirror the peer's. Use real, independently +generated random keys in production — the all-`0x03`/`0x04` values in the +examples are for interop testing only. + +## 8. Interoperating with Linux `ip xfrm` + +The `tools/ip-xfrm/` scripts configure the Linux kernel as the ESP peer so you +can test wolfIP against a reference implementation. Each script installs the +matching `ip xfrm state` (the SAs) and `ip xfrm policy` (when to apply them) for +one suite: + +| Script | Suite | +|--------|-------| +| `cbc_auth` | AES-CBC (RFC 3602) + HMAC (RFC 2403/2404/4868) | +| `des3_auth` | 3DES-CBC (RFC 2451) + HMAC | +| `rfc4106` | AES-GCM (RFC 4106) | +| `rfc4543` | AES-GMAC (RFC 4543) | +| `show` | dump all xfrm state and policies | +| `delete_all`| remove all xfrm state and policies | + +The keys baked into these scripts match `src/test/esp/esp_common.c` and the SPIs +in `esp_sa.txt`, so a wolfIP example and the corresponding script form a working +pair out of the box. 
An AES-GCM state entry, for instance, looks like +(`tools/ip-xfrm/rfc4106`): + +```sh +sudo ip xfrm state add \ + src 10.10.10.1 dst 10.10.10.2 \ + proto esp spi 0x01010101 mode transport replay-window 64 \ + aead 'rfc4106(gcm(aes))' 0x0303...0a0b0c0d 128 \ + sel src 10.10.10.1 dst 10.10.10.2 +``` + +Note the AEAD key ends in `0a0b0c0d` — the same 4-byte salt wolfIP appends to +its GCM key, confirming the salt convention. + +**End-to-end test (AES-GCM, self-contained loop):** + +```sh +./tools/ip-xfrm/rfc4106 128 +sudo LD_LIBRARY_PATH=/usr/local/lib ./build/test-esp +./tools/ip-xfrm/delete_all +``` + +**End-to-end test (3DES + HMAC, UDP echo server driven from the host):** + +```sh +# terminal 1 — configure the Linux peer +./tools/ip-xfrm/delete_all +./tools/ip-xfrm/des3_auth sha256 128 udp + +# terminal 2 — run the wolfIP ESP echo server +sudo LD_LIBRARY_PATH=/usr/local/lib ./build/esp-server -m 2 -u + +# terminal 1 again — talk to it; bytes arrive/return as ESP packets +nc 10.10.10.2 8 -p 12345 -u +``` + +Always run `./tools/ip-xfrm/delete_all` between tests so stale SAs do not +shadow the new ones. + +## 9. Inspecting ESP traffic in Wireshark + +Because you hold the keys, you can decrypt and verify the captured ESP packets. +Copy the provided Wireshark `esp_sa` config and open a capture: + +```sh +cp tools/ip-xfrm/esp_sa.txt ~/.config/wireshark/esp_sa +wireshark test.pcap +``` + +Wireshark will then show the decrypted inner TCP/UDP payload, validate the ESP +ICV, and check the inner IP/transport checksums — invaluable when debugging a +suite mismatch or a bad key. + +## 10. Troubleshooting + +**Outbound packets are encrypted but no reply arrives.** The inbound SA's +`{src, dst}` is almost certainly reversed. Inbound `src` = peer, `dst` = this +node; outbound is the opposite. Confirm the peer's SAs are the mirror image. 
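
A reversed SA can be caught at configuration time, before any packet is sent. The sketch below is a hypothetical helper, not part of the wolfIP API: `struct sa_desc` and `sa_direction_ok()` are illustrative names, and plain `uint32_t` values stand in for the `ip4` arguments. It only encodes the rule stated above (inbound: `src` = peer, `dst` = this node; outbound: the opposite).

```c
#include <stdint.h>

/* Hypothetical description of one SA, for illustration only;
 * wolfIP does not define this struct. */
struct sa_desc {
    int      in;   /* 1 = inbound SA, 0 = outbound SA */
    uint32_t src;  /* stand-in for the ip4 src argument */
    uint32_t dst;  /* stand-in for the ip4 dst argument */
};

/* Returns 1 when the SA's addresses match its direction:
 * inbound  -> src is the peer and dst is this node;
 * outbound -> src is this node and dst is the peer. */
static int sa_direction_ok(const struct sa_desc *sa,
                           uint32_t local_ip, uint32_t peer_ip)
{
    if (sa->in)
        return sa->src == peer_ip && sa->dst == local_ip;
    return sa->src == local_ip && sa->dst == peer_ip;
}
```

Running a check like this on both SAs before calling the `wolfIP_esp_sa_new_*` constructors turns a silent "no reply" into an immediate configuration error.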
+ +**Inbound ESP packets are silently dropped.** Causes: SPI mismatch (the receiver +looks up the SA by SPI), wrong key, wrong ICV length, or a suite mismatch. +wolfIP logs `failed to unwrap esp packet, dropping` on the input path; capture +with Wireshark + `esp_sa` to see which check fails. + +**`error: gcm stream not built in` / `des3 not built in`.** wolfSSL was built +without `--enable-aesgcm-stream` or `--enable-des3`. Rebuild wolfSSL with the +needed options and reinstall. + +**SA creation returns a negative error.** The static pool is full. Increase +`WOLFIP_ESP_NUM_SA` (default 2 per direction) at compile time. + +**ESP packets on an interface marked `non_ethernet` are dropped.** ESP transport +unwrap currently runs only on Ethernet interfaces; L3-only links are not +supported on the inbound ESP path. diff --git a/docs/porting_guide.md b/docs/porting_guide.md new file mode 100644 index 00000000..970160aa --- /dev/null +++ b/docs/porting_guide.md @@ -0,0 +1,1116 @@ +# wolfIP Porting Guide + +## Table of Contents + +- [1. Scope](#1-scope) +- [2. Where a port plugs in](#2-where-a-port-plugs-in) +- [3. The link-layer device interface](#3-the-link-layer-device-interface) + - [3.1 L2 versus L3 drivers](#31-l2-versus-l3-drivers) +- [4. 
Designing a device driver](#4-designing-a-device-driver) + - [4.1 The driver contract](#41-the-driver-contract) + - [4.2 The simplest drivers: loopback, PIO, host-backed](#42-the-simplest-drivers-loopback-pio-host-backed) + - [4.3 A driver with DMA: descriptor rings](#43-a-driver-with-dma-descriptor-rings) + - [4.4 DMA ownership and the poll path](#44-dma-ownership-and-the-poll-path) + - [4.5 DMA ownership and the send path](#45-dma-ownership-and-the-send-path) + - [4.6 Cache coherency on DMA-capable cores](#46-cache-coherency-on-dma-capable-cores) + - [4.7 Descriptor format gotchas](#47-descriptor-format-gotchas) + - [4.8 PHY and MDIO bring-up](#48-phy-and-mdio-bring-up) + - [4.9 Wiring the driver into the stack](#49-wiring-the-driver-into-the-stack) + - [4.10 Driver design checklist](#410-driver-design-checklist) +- [5. Wiring a random-number source](#5-wiring-a-random-number-source) +- [6. Porting to a new operating system](#6-porting-to-a-new-operating-system) + - [6.1 What an OS port has to provide](#61-what-an-os-port-has-to-provide) + - [6.2 The poll task: the heartbeat of the stack](#62-the-poll-task-the-heartbeat-of-the-stack) + - [6.3 The core mutex](#63-the-core-mutex) + - [6.4 Building bsd_socket.c: the public FD table](#64-building-bsd_socketc-the-public-fd-table) + - [6.5 Turning -WOLFIP_EAGAIN into blocking](#65-turning--wolfip_eagain-into-blocking) + - [6.6 Bridging wolfIP callbacks to OS wakeups](#66-bridging-wolfip-callbacks-to-os-wakeups) + - [6.7 A complete wrapper: recv()](#67-a-complete-wrapper-recv) + - [6.8 Initialization and teardown](#68-initialization-and-teardown) +- [7. Case study: the Zephyr port](#7-case-study-the-zephyr-port) +- [8. RTOS locking rules](#8-rtos-locking-rules) +- [9. Porting checklist](#9-porting-checklist) +- [10. Common porting pitfalls](#10-common-porting-pitfalls) + +--- + +## 1. Scope + +This guide is for developers bringing wolfIP up on new hardware or under a new +operating system. 
It covers the two ports almost every integration needs: + +- a **link-layer device driver** that moves Ethernet frames between wolfIP and + the MAC — both the simple programmed-I/O case and the DMA descriptor-ring + case; +- an **operating-system integration layer** that runs `wolfIP_poll()`, + serializes access to the stack, and (optionally) exposes BSD-style blocking + sockets to application tasks. + +The snippets are drawn from the in-tree ports: the POSIX TAP driver +(`src/port/posix/tap_linux.c`), the NXP LPC and Vorago VA416xx Ethernet +drivers (`src/port/lpc_enet/`, `src/port/va416xx/`), the AMD/Xilinx GEM driver +(`src/port/amd/`), the FreeRTOS BSD wrapper (`src/port/freeRTOS/bsd_socket.c`), +and the Zephyr port (`port/zephyr/`). They are trimmed to show the porting +pattern, not every register or error path of a production driver. + +If you are coming from lwIP, read this guide alongside +[`migrating_from_lwIP.md`](migrating_from_lwIP.md), which maps lwIP concepts +(`netif`, `pbuf`, raw/ALTCP callbacks, `lwipopts.h`) onto the wolfIP +equivalents covered here. + +--- + +## 2. Where a port plugs in + +wolfIP is a single-threaded, statically-allocated stack. It does not own a +thread, it does not allocate memory at runtime, and it never blocks. All +progress happens inside one function: + +```c +int wolfIP_poll(struct wolfIP *s, uint64_t now); +``` + +Every call to `wolfIP_poll()` does four things in order: + +1. asks each link-layer device for one received frame (`ll->poll`); +2. processes that frame through ARP / IP / TCP / UDP / ICMP; +3. fires registered socket callbacks for any state changes (readable, + writable, closed, timeout); +4. drains pending TX by handing frames to the driver (`ll->send`). 
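
The order of those four steps can be sketched with a host-side toy model. Everything here is an assumption made for illustration: `toy_dev`, `toy_stack`, `toy_poll_cycle()` and the `fake_*` helpers do not exist in wolfIP, and "processing" is reduced to echoing the frame. The model shows only the sequencing and the fact that nothing in the cycle blocks.

```c
#include <stdint.h>
#include <string.h>

#define TOY_FRAME_MAX 64

/* Toy stand-ins for a driver and the stack; not wolfIP types. */
struct toy_dev {
    int (*poll)(struct toy_dev *d, void *buf, uint32_t len);
    int (*send)(struct toy_dev *d, void *buf, uint32_t len);
};

struct toy_stack {
    struct toy_dev *dev;
    uint8_t  tx_buf[TOY_FRAME_MAX];
    uint32_t tx_len;           /* 0 means nothing pending */
    void   (*on_event)(void);  /* stands in for a socket callback */
    char     trace[8];         /* records the order of the steps */
    uint32_t trace_len;
};

static void toy_trace(struct toy_stack *s, char step)
{
    if (s->trace_len < sizeof(s->trace) - 1)
        s->trace[s->trace_len++] = step;
}

/* One cycle: (1) RX one frame, (2) process it, (3) fire callbacks,
 * (4) drain pending TX. A frame the driver refuses stays queued. */
static void toy_poll_cycle(struct toy_stack *s)
{
    uint8_t frame[TOY_FRAME_MAX];
    int n = s->dev->poll(s->dev, frame, sizeof(frame));

    if (n > 0) {
        toy_trace(s, 'R');
        memcpy(s->tx_buf, frame, (size_t)n);  /* "process": echo it back */
        s->tx_len = (uint32_t)n;
        toy_trace(s, 'P');
        if (s->on_event) {
            s->on_event();
            toy_trace(s, 'C');
        }
    }
    if (s->tx_len > 0 && s->dev->send(s->dev, s->tx_buf, s->tx_len) >= 0) {
        s->tx_len = 0;
        toy_trace(s, 'T');
    }
}

/* One-shot fake driver so the model runs without hardware. */
static int fake_rx_ready = 1;
static uint32_t fake_sent_len;
static int fake_events;

static int fake_poll(struct toy_dev *d, void *buf, uint32_t len)
{
    (void)d;
    if (!fake_rx_ready || len < 4)
        return 0;                 /* nothing available */
    memcpy(buf, "ping", 4);
    fake_rx_ready = 0;
    return 4;
}

static int fake_send(struct toy_dev *d, void *buf, uint32_t len)
{
    (void)d; (void)buf;
    fake_sent_len = len;
    return (int)len;
}

static void fake_on_event(void) { fake_events++; }
```

With one frame available, a single call to `toy_poll_cycle()` records the steps in the order receive, process, callback, transmit; a second call with nothing queued records nothing.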
+ +A port therefore has exactly two contact surfaces: + +```text + application / sockets + | + +--------v---------+ OS integration: + | wolfIP core | <- call wolfIP_poll() on a timer/thread, + | wolfIP_poll() | serialize with a mutex, wake blocked tasks + +--------+---------+ + | + +--------v---------+ device driver: + | ll->poll() | <- copy one RX frame up + | ll->send() | <- copy one TX frame to hardware + +------------------+ +``` + +The driver surface is section 4. The OS surface is section 6. On bare metal +you only need the driver — `main()` calls `wolfIP_poll()` in a loop and that is +the whole "OS port." + +--- + +## 3. The link-layer device interface + +A driver is one `struct wolfIP_ll_dev` per interface (from `wolfip.h`): + +```c +struct wolfIP_ll_dev { + uint8_t mac[6]; + char ifname[16]; + uint8_t non_ethernet; + uint32_t mtu; + /* poll function */ + int (*poll)(struct wolfIP_ll_dev *ll, void *buf, uint32_t len); + /* send function */ + int (*send)(struct wolfIP_ll_dev *ll, void *buf, uint32_t len); + /* optional context private pointer */ + void *priv; + /* ... optional VLAN fields ... */ +}; +``` + +You fill in `mac`, `ifname`, `mtu`, `poll`, `send`, and optionally `priv` (a +back-pointer to your driver state, the wolfIP equivalent of lwIP's +`netif->state`). The stack retrieves the primary device with +`wolfIP_getdev(s)` and additional interfaces with `wolfIP_getdev_ex(s, idx)`. + +`buf` is always a **single, contiguous, linear** frame buffer owned by the +stack — there is no `pbuf` chain to walk. On an L2 interface the frame includes +the full Ethernet header (see below). Your driver must not retain the `buf` +pointer after the callback returns. + +### 3.1 L2 versus L3 drivers + +The `non_ethernet` flag selects the driver class: + +- **L2 / Ethernet driver** (`non_ethernet = 0`, the default). The driver moves + complete Ethernet frames: a 14-byte Ethernet header followed by the payload. 
+ wolfIP performs ARP / neighbour resolution and builds and parses the Ethernet + header itself. The `buf` passed to `poll` and `send` begins at the Ethernet + header. The TAP, LPC, VA416xx, and GEM drivers are all L2. + +- **L3 / point-to-point driver** (`non_ethernet = 1`). The link carries bare IP + packets — there is no Ethernet header and no ARP. On transmit, wolfIP strips + the 14-byte Ethernet header it built before calling your `send`, so `send` + receives the IP packet (`buf + ETH_HEADER_LEN`, `len - ETH_HEADER_LEN`). On + receive, your `poll` must return a buffer that begins at the IP header. The + built-in loopback interface and TUN-style devices + (`src/port/posix/linux_tun.c`, `IFF_TUN`) are L3. + +| | L2 (Ethernet) | L3 (point-to-point) | +|---|---|---| +| `non_ethernet` | `0` | `1` | +| Frame at `poll` / `send` | Ethernet header + IP | IP only | +| ARP / neighbour resolution | performed by wolfIP | skipped | +| Examples | TAP, LPC, VA416xx, GEM | loopback `lo`, TUN | + +The stack applies the L3 stripping in `wolfIP_ll_send_frame()`: + +```c +if (ll->non_ethernet) + return ll->send(ll, (uint8_t *)buf + ETH_HEADER_LEN, len - ETH_HEADER_LEN); +``` + +The `mtu` field always describes wolfIP's internal frame budget *including* +Ethernet headroom; on an L3 link the maximum IP payload handed to `send` is +therefore `mtu - ETH_HEADER_LEN`. + +--- + +## 4. Designing a device driver + +### 4.1 The driver contract + +Both callbacks have a small, strict contract. Getting the return values right +is what makes the stack progress correctly. + +**`poll(ll, buf, len)`** — "give me at most one received frame": + +| Return | Meaning | +|---|---| +| `> 0` | A complete frame of this many bytes was copied into `buf`. | +| `0` | No frame is available right now. | +| `< 0` | Driver error; the stack skips RX processing this cycle. 
| + +**`send(ll, buf, len)`** — "transmit this one complete frame": + +| Return | Meaning | +|---|---| +| `> 0` or `0` | The driver accepted/queued the frame. | +| `-WOLFIP_EAGAIN` | TX ring/queue is full; the stack retries on a later poll. | +| other `< 0` | Hard error (e.g. frame too large). | + +Two rules follow directly from this contract and are worth internalizing +before writing a line of driver code: + +- **One frame per call.** `poll` returns *one* frame even if several are + queued; `wolfIP_poll()` calls it again next cycle. `send` transmits exactly + the bytes handed to it. +- **Never block.** Both callbacks run inline inside `wolfIP_poll()`. If the + hardware is busy, return `0` (poll) or `-WOLFIP_EAGAIN` (send) and let the + next poll cycle make progress. + +### 4.2 The simplest drivers: loopback, PIO, host-backed + +The very simplest driver has no hardware at all. When `WOLFIP_ENABLE_LOOPBACK` +is set (and `WOLFIP_MAX_INTERFACES > 1`), `wolfIP_init()` installs an L3 +loopback interface at index 0 — `ifname` `"lo"`, `non_ethernet = 1`, address +`127.0.0.1/8` — whose `poll`/`send` move IP packets through a small in-memory +queue (`src/wolfip.c`): + +```c +static int wolfIP_loopback_send(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + struct wolfIP *s = WOLFIP_CONTAINER_OF(ll, struct wolfIP, ll_dev); + + if (len == 0 || len > IP_MTU_MAX) + return 0; + if (s->loopback_count >= WOLFIP_LOOPBACK_QUEUE_DEPTH) + return -WOLFIP_EAGAIN; /* queue full: retry later */ + /* buf is the IP packet — the Ethernet header was already stripped for + * this non_ethernet device. Store as-is; wolfIP_poll re-adds the prefix. 
*/ + memcpy(s->loopback_buf[s->loopback_tail], buf, len); + s->loopback_count++; + return (int)len; +} + +static int wolfIP_loopback_poll(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + struct wolfIP *s = WOLFIP_CONTAINER_OF(ll, struct wolfIP, ll_dev); + uint32_t pending; + + if (s->loopback_count == 0) + return 0; /* nothing queued */ + pending = s->loopback_pending_len[s->loopback_head]; + if (pending > len) + return 0; + memcpy(buf, s->loopback_buf[s->loopback_head], pending); + s->loopback_count--; + return (int)pending; /* one IP packet */ +} +``` + +This is the `poll`/`send` contract in its purest form: `send` queues a packet +(or returns `-WOLFIP_EAGAIN` when the queue is full), `poll` dequeues one packet +(or returns `0` when empty). No Ethernet header, no DMA, no cache maintenance — +exactly what an L3 driver does, with an in-memory queue standing in for the wire. + +The next simplest driver is a programmed-I/O Ethernet (L2) MAC: a register/FIFO +read on poll and a register/FIFO write on send. 
The POSIX TAP driver is the +canonical minimal example: the "hardware" is a host file descriptor, but the +shape is identical to a small MCU MAC that exposes an RX/TX FIFO +(`src/port/posix/tap_linux.c`): + +```c +static int tap_poll(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + struct pollfd pfd; + int ret; + (void)ll; + pfd.fd = tap_fd; + pfd.events = POLLIN; + ret = poll(&pfd, 1, 2); + if (ret < 0) { + perror("poll"); + return -1; /* driver error */ + } + if (ret == 0) { + return 0; /* nothing to receive */ + } + return read(tap_fd, buf, len); /* one frame copied into buf */ +} + +static int tap_send(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + (void)ll; + return write(tap_fd, buf, len); /* transmit the contiguous frame */ +} +``` + +For a real MCU without DMA the body changes but the skeleton does not: + +```c +static int my_pio_poll(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + uint32_t flen; + (void)ll; + + if (!(MAC_RX_STATUS & RX_FRAME_READY)) + return 0; /* no frame: return 0, never block */ + + flen = MAC_RX_LEN & RX_LEN_MASK; + if (flen > len) /* never overflow the stack buffer */ + flen = len; + + /* Drain the MAC RX FIFO word by word into the linear stack buffer. 
*/ + for (uint32_t i = 0; i < flen; i += 4) { + uint32_t word = MAC_RX_FIFO; /* one 32-bit FIFO read */ + uint32_t n = (flen - i < 4) ? (flen - i) : 4; + memcpy((uint8_t *)buf + i, &word, n); /* last word may be partial: stay in bounds */ + } + + MAC_RX_CMD = RX_RELEASE; /* hand the slot back to the MAC */ + return (int)flen; +} + +static int my_pio_send(struct wolfIP_ll_dev *ll, void *buf, uint32_t len) +{ + (void)ll; + if (!(MAC_TX_STATUS & TX_FIFO_FREE)) + return -WOLFIP_EAGAIN; /* full: ask the stack to retry */ + + MAC_TX_LEN = len; + for (uint32_t i = 0; i < len; i += 4) { + uint32_t word = 0; + uint32_t n = (len - i < 4) ? (len - i) : 4; + memcpy(&word, (uint8_t *)buf + i, n); /* do not read past the frame */ + MAC_TX_FIFO = word; + } + + MAC_TX_CMD = TX_START; + return (int)len; +} +``` + +No descriptor rings, no cache maintenance, no ownership flags. If your MAC can +copy a whole frame in and out through registers or a FIFO, this is all you +need. 
Most of the remaining complexity in this section exists only because DMA +introduces shared memory between the CPU and the MAC. + +### 4.3 A driver with DMA: descriptor rings + +A DMA-capable MAC does not use FIFOs. Instead the CPU and the MAC share a ring +of **descriptors** in RAM. Each descriptor points at a buffer and carries an +**OWN** bit that says whether the CPU or the MAC currently owns that slot. The +driver's job becomes: + +- on RX, find a descriptor the MAC has filled (OWN handed back to CPU), copy + the frame out, and re-arm the descriptor (OWN back to MAC); +- on TX, find a free descriptor (CPU owns it), copy the frame in, and set OWN + to hand it to the MAC. + +Declare the rings and buffers as static, aligned storage. The LPC driver uses +the Synopsys DesignWare "enhanced" 4-word descriptor +(`src/port/lpc_enet/lpc_enet.c`): + +```c +struct eth_desc { + volatile uint32_t des0; + volatile uint32_t des1; + volatile uint32_t des2; + volatile uint32_t des3; +}; + +#define RX_DESC_COUNT 4U +#define TX_DESC_COUNT 3U + +static struct eth_desc rx_ring[RX_DESC_COUNT] __attribute__((aligned(32))); +static struct eth_desc tx_ring[TX_DESC_COUNT] __attribute__((aligned(32))); +static uint8_t rx_buffers[RX_DESC_COUNT][RX_BUF_SIZE] __attribute__((aligned(32))); +static uint8_t tx_buffers[TX_DESC_COUNT][TX_BUF_SIZE] __attribute__((aligned(32))); + +static uint32_t rx_idx; /* next RX descriptor the CPU will inspect */ +static uint32_t tx_idx; /* next TX descriptor the CPU will fill */ +``` + +Alignment matters: many DMA engines require descriptors and buffers aligned to +the burst size (16 or 32 bytes), and on systems with a data cache the buffers +must be aligned to a cache line so that a clean/invalidate does not disturb +neighbouring data (see 4.6). 
On some parts the DMA can only reach a specific +RAM bank — the VA416xx driver pins all rings and buffers into a dedicated +section because the Ethernet DMA cannot access the code-bus RAM +(`src/port/va416xx/va416xx_eth.c`): + +```c +static struct eth_desc rx_ring[RX_DESC_COUNT] + __attribute__((aligned(16), section(".dma_bss"))); +``` + +### 4.4 DMA ownership and the poll path + +The RX poll walks to the current descriptor, checks the OWN bit, and bails out +with `0` if the MAC still owns it (no frame yet). Otherwise it copies the frame +out and re-arms the slot. The LPC enhanced-descriptor version +(`src/port/lpc_enet/lpc_enet.c`): + +```c +static int eth_poll(struct wolfIP_ll_dev *dev, void *frame, uint32_t len) +{ + struct eth_desc *desc; + uint32_t status, frame_len = 0; + (void)dev; + + desc = &rx_ring[rx_idx]; + if (desc->des3 & RDES3_OWN) + return 0; /* MAC still owns it: no frame */ + + status = desc->des3; + if ((status & (RDES3_FS | RDES3_LS)) == (RDES3_FS | RDES3_LS)) { + frame_len = status & RDES3_PL_MASK; + if (frame_len > len) frame_len = len; /* clamp to stack buffer */ + if (frame_len > 0) + memcpy(frame, rx_buffers[rx_idx], frame_len); + } + + /* Re-arm: point the descriptor back at its buffer and give OWN to the + * MAC so it can receive into this slot again. */ + desc->des0 = DMA_ADDR(rx_buffers[rx_idx]); + desc->des1 = 0; desc->des2 = 0; + __asm volatile ("dsb sy" ::: "memory"); + desc->des3 = RDES3_OWN | RDES3_IOC | RDES3_BUF1V; + __asm volatile ("dsb sy" ::: "memory"); + ETH_DMACRXDTPR = DMA_ADDR(desc); /* poke the DMA tail pointer */ + rx_idx = (rx_idx + 1) % RX_DESC_COUNT; + + return (int)frame_len; +} +``` + +Note the checks: only accept a descriptor that is both the **first and last +segment** of a frame (`FS | LS`) — a single un-fragmented Ethernet frame — and +always clamp `frame_len` to the buffer length the stack passed in. 
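
The ownership hand-off can be exercised on a host before any hardware exists. The following is a minimal simulation under stated assumptions: the `sim_*` names, the single-word descriptor, and the bit positions are invented for this sketch and match no real MAC; `sim_mac_receive()` plays the role of the DMA engine. The poll side follows the same shape as `eth_poll()` above: check OWN, require first and last segment, clamp, copy, re-arm.

```c
#include <stdint.h>
#include <string.h>

#define SIM_RX_COUNT  4U
#define SIM_BUF_SIZE  64U
#define SIM_OWN       0x80000000u  /* set: MAC owns the slot (made-up bit) */
#define SIM_FS        0x20000000u  /* first segment (made-up bit) */
#define SIM_LS        0x10000000u  /* last segment (made-up bit) */
#define SIM_LEN_MASK  0x00003FFFu

struct sim_desc { volatile uint32_t status; };

static struct sim_desc sim_rx_ring[SIM_RX_COUNT];
static uint8_t  sim_rx_buf[SIM_RX_COUNT][SIM_BUF_SIZE];
static uint32_t sim_rx_idx;   /* next slot the CPU inspects */
static uint32_t sim_mac_idx;  /* next slot the fake MAC fills */

/* Arm every slot: all descriptors start out owned by the MAC. */
static void sim_rx_init(void)
{
    for (uint32_t i = 0; i < SIM_RX_COUNT; i++)
        sim_rx_ring[i].status = SIM_OWN;
    sim_rx_idx = 0;
    sim_mac_idx = 0;
}

/* Fake MAC: write a frame into the next slot it owns, then clear OWN. */
static int sim_mac_receive(const void *data, uint32_t len)
{
    struct sim_desc *d = &sim_rx_ring[sim_mac_idx];

    if (!(d->status & SIM_OWN) || len > SIM_BUF_SIZE)
        return -1;                        /* no free slot: frame is lost */
    memcpy(sim_rx_buf[sim_mac_idx], data, len);
    d->status = SIM_FS | SIM_LS | (len & SIM_LEN_MASK);
    sim_mac_idx = (sim_mac_idx + 1U) % SIM_RX_COUNT;
    return 0;
}

/* Driver-side poll: one frame per call, never blocks. */
static int sim_poll(void *frame, uint32_t len)
{
    struct sim_desc *d = &sim_rx_ring[sim_rx_idx];
    uint32_t status = d->status;
    uint32_t flen = 0;

    if (status & SIM_OWN)
        return 0;                         /* MAC still owns it: no frame */
    if ((status & (SIM_FS | SIM_LS)) == (SIM_FS | SIM_LS)) {
        flen = status & SIM_LEN_MASK;
        if (flen > len)
            flen = len;                   /* clamp to the caller's buffer */
        memcpy(frame, sim_rx_buf[sim_rx_idx], flen);
    }
    d->status = SIM_OWN;                  /* re-arm the slot for the MAC */
    sim_rx_idx = (sim_rx_idx + 1U) % SIM_RX_COUNT;
    return (int)flen;
}
```

The simulation omits barriers and the tail-pointer write because there is no second bus master; on real hardware those stay exactly where `eth_poll()` places them.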
+ +### 4.5 DMA ownership and the send path + +The send path is the mirror image: check that the CPU owns the next TX +descriptor, copy the frame in, pad to the 60-byte Ethernet minimum, then set +OWN to hand it to the MAC. Crucially, if the CPU does **not** own the +descriptor (the ring is full), return `-WOLFIP_EAGAIN` so the stack retries +later instead of corrupting an in-flight frame: + +```c +static int eth_send(struct wolfIP_ll_dev *dev, void *frame, uint32_t len) +{ + struct eth_desc *desc; + uint32_t dma_len, next; + (void)dev; + + if (len == 0 || len > TX_BUF_SIZE) return -1; /* hard error */ + desc = &tx_ring[tx_idx]; + if (desc->des3 & TDES3_OWN) return -2; /* ring full */ + + memcpy(tx_buffers[tx_idx], frame, len); + dma_len = (len < FRAME_MIN_LEN) ? FRAME_MIN_LEN : len; /* pad to 60 */ + if (dma_len > len) memset(tx_buffers[tx_idx] + len, 0, dma_len - len); + + desc->des0 = DMA_ADDR(tx_buffers[tx_idx]); + desc->des1 = 0; + desc->des2 = (dma_len & TDES2_B1L_MASK); + __asm volatile ("dsb sy" ::: "memory"); + /* OWN is the doorbell: write the descriptor body first, OWN last. */ + desc->des3 = (dma_len & TDES3_FL_MASK) | TDES3_FD | TDES3_LD | TDES3_OWN; + __asm volatile ("dsb sy" ::: "memory"); + + ETH_DMACSR = DMACSR_TBU; + next = (tx_idx + 1) % TX_DESC_COUNT; + ETH_DMACTXDTPR = DMA_ADDR(&tx_ring[next]); /* kick TX DMA */ + tx_idx = next; + return (int)len; +} +``` + +> The wolfIP core treats any negative `send` return that is not a hard error as +> "try again later." The AMD GEM driver makes this explicit by returning +> `-WOLFIP_EAGAIN` when the BD ring is backed up (`src/port/amd/common/gem_core.c`): +> `if ((gem_tx_ring[idx].status & TXBUF_USED) == 0) return -WOLFIP_EAGAIN;` + +The ordering — **write the descriptor fields, memory barrier, then write the +OWN bit, barrier, then kick the DMA** — is not optional. The OWN bit is a +doorbell; if it becomes visible to the MAC before the buffer address and length +do, the MAC will DMA garbage. 
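
The same ordering can be tried on a host with a simulated ring. This is a sketch under assumptions, not driver code: the `sim_*` names and the two-word descriptor are invented, `SIM_EAGAIN` is a placeholder value for `WOLFIP_EAGAIN`, and C11 `atomic_thread_fence()` is only a portable stand-in for the `dsb` a real Arm driver would issue. `sim_mac_complete()` plays the MAC finishing a transmission.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SIM_TX_COUNT   2U
#define SIM_TX_BUF     128U
#define SIM_FRAME_MIN  60U           /* Ethernet minimum, FCS excluded */
#define SIM_TX_OWN     0x80000000u   /* set: MAC owns the slot (made-up bit) */
#define SIM_EAGAIN     11            /* placeholder; real code uses WOLFIP_EAGAIN */

struct sim_tx_desc {
    volatile uint32_t len;           /* descriptor body */
    volatile uint32_t ctrl;          /* holds the OWN doorbell */
};

static struct sim_tx_desc sim_tx_ring[SIM_TX_COUNT];
static uint8_t  sim_tx_buf[SIM_TX_COUNT][SIM_TX_BUF];
static uint32_t sim_tx_idx;          /* next slot the CPU fills */
static uint32_t sim_tx_done;         /* next slot the fake MAC completes */

static int sim_send(const void *frame, uint32_t len)
{
    struct sim_tx_desc *d = &sim_tx_ring[sim_tx_idx];
    uint32_t dma_len;

    if (len == 0 || len > SIM_TX_BUF)
        return -1;                               /* hard error */
    if (d->ctrl & SIM_TX_OWN)
        return -SIM_EAGAIN;                      /* ring full: retry later */

    memcpy(sim_tx_buf[sim_tx_idx], frame, len);
    dma_len = (len < SIM_FRAME_MIN) ? SIM_FRAME_MIN : len;
    if (dma_len > len)                           /* zero-pad short frames */
        memset(sim_tx_buf[sim_tx_idx] + len, 0, dma_len - len);

    d->len = dma_len;                            /* 1. descriptor body */
    atomic_thread_fence(memory_order_release);   /* 2. barrier */
    d->ctrl = SIM_TX_OWN;                        /* 3. doorbell last */
    sim_tx_idx = (sim_tx_idx + 1U) % SIM_TX_COUNT;
    return (int)len;
}

/* Fake MAC: "transmit" the oldest owned slot and hand it back to the CPU.
 * Returns the number of bytes that would have gone on the wire. */
static uint32_t sim_mac_complete(void)
{
    struct sim_tx_desc *d = &sim_tx_ring[sim_tx_done];
    uint32_t sent;

    if (!(d->ctrl & SIM_TX_OWN))
        return 0;                                /* nothing in flight */
    sent = d->len;
    d->ctrl = 0;                                 /* OWN back to the CPU */
    sim_tx_done = (sim_tx_done + 1U) % SIM_TX_COUNT;
    return sent;
}
```

With a two-slot ring, the third back-to-back `sim_send()` returns the retry code instead of overwriting an in-flight frame, and it succeeds again once `sim_mac_complete()` has released a slot.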
+ +### 4.6 Cache coherency on DMA-capable cores + +On a core with a data cache and no hardware cache-coherent DMA (Cortex-A, +Cortex-M7, etc.) the descriptors and buffers live in cacheable memory that both +the CPU and the MAC touch. You must bracket every DMA hand-off with cache +maintenance, or the CPU and MAC will see different memory: + +- **Before the MAC reads** CPU-written data (a TX buffer, a re-armed + descriptor): **clean** (write back) the cache so the MAC sees your writes. +- **Before the CPU reads** MAC-written data (an RX buffer, a completed + descriptor's status/OWN): **invalidate** the cache so you do not read a stale + copy. + +The AMD GEM port wraps the BD ring with exactly these operations. RX poll +(`src/port/amd/ip/gem_rx_poll.c`): + +```c +cache_inval(gem_rx_ring, sizeof(gem_rx_ring)); /* see fresh OWN/status */ +if (!(gem_rx_ring[gem_rx_next].addr & RXBUF_OWN_SW)) + return 0; + +frame_len = gem_rx_ring[gem_rx_next].status & RXBUF_LEN_MASK; +cache_inval(gem_rx_buf_pool[gem_rx_next], frame_len); /* see fresh payload */ +memcpy(buf, gem_rx_buf_pool[gem_rx_next], copy); + +/* re-arm, then push the descriptor back to memory for the MAC */ +gem_rx_ring[gem_rx_next].addr = addr; /* OWN=0 -> MAC owns */ +cache_clean(&gem_rx_ring[gem_rx_next], sizeof(gem_rx_ring[gem_rx_next])); +__asm__ volatile ("dsb" ::: "memory"); +``` + +TX send (`src/port/amd/common/gem_core.c`): + +```c +/* The USED bit is written back by MAC DMA - invalidate so the CPU does not + * see the stale USED=0 we wrote when we last armed this BD. */ +cache_inval(&gem_tx_ring[idx], sizeof(gem_tx_ring[idx])); +... +memcpy(gem_tx_buf_pool[idx], buf, len); +cache_clean(gem_tx_buf_pool[idx], len); /* MAC must see the frame */ +... 
+gem_tx_ring[idx].status = status; /* USED=0 -> ready for MAC */ +cache_clean(&gem_tx_ring[idx], sizeof(gem_tx_ring[idx])); +GEM_NWCTRL |= NWCTRL_STARTTX; +``` + +The helper semantics, from `src/port/amd/arch/aarch64/cache.h`: *"`cache_clean()` +writes back dirty lines before DMA reads; `cache_inval()` invalidates lines so +CPU reads pull fresh"* data the MAC just wrote. + +Two traps to avoid: + +- **Cache-line aliasing.** If a buffer is not cache-line aligned and padded to + a full line, an invalidate can throw away an adjacent variable, or a clean can + overwrite MAC-written bytes. Align DMA buffers to the cache line. +- **Barriers are not cache ops.** `dsb`/`__DSB()` orders memory accesses but + does not move data between cache and RAM. You need *both*: the cache op for + coherency and the barrier for ordering. The MCU drivers that run cache-off + (LPC, VA416xx) use only `dsb`; the cache-on AMD driver uses both. + +If your platform has an MPU/MMU, an even simpler option during bring-up is to +mark the DMA region **non-cacheable** and drop the per-frame cache ops +entirely, at a small throughput cost. + +### 4.7 Descriptor format gotchas + +DMA descriptor layouts vary even within one IP family, and the differences are +easy to get subtly wrong. The in-tree drivers document two real examples worth +knowing about before you write your own: + +- **Enhanced vs. normal descriptors.** The LPC driver uses the Synopsys + "enhanced" format where control bits (FD/LD/OWN) live in `des3`. The VA416xx + uses the same Synopsys GMAC in **normal/legacy** format, where on TX *only* + the OWN bit lives in `des0` and all frame control (FS/LS/IC) lives in `des1` + — `des0`'s other bits are status the DMA writes back. Loading control bits + into the wrong word makes the DMA advance linearly and never transmit. + +- **Ring wrap: tail pointer vs. chain vs. ring bit.** The LPC enhanced format + wraps via a tail-pointer register. 
The VA416xx normal format must use + **chain mode** (each `des3` points at the next descriptor) on TX, because the + DMA overwrites `des0` on completion and a ring-mode "end of ring" bit stored + there would be destroyed, sending the DMA walking off the end of the ring + into adjacent memory (`src/port/va416xx/va416xx_eth.c`): + + ```c + /* Chain: each descriptor points to the next; last wraps to first. + * The DMA only writes back to des0; des1/des2/des3 survive, so the + * chain pointer in des3 wraps the ring reliably. */ + tx_ring[i].des3 = (uint32_t)&tx_ring[(i + 1U) % TX_DESC_COUNT]; + ``` + +The lesson: read your MAC's reference driver (vendor SDK, U-Boot, Linux) to +confirm which descriptor variant the silicon actually implements, then copy +that variant's bit layout exactly. + +### 4.8 PHY and MDIO bring-up + +The MAC moves frames; the **PHY** brings the copper link up. You talk to it +over the MDIO (MII management) bus to reset it, start auto-negotiation, and +read back the negotiated speed/duplex so you can program the MAC to match. The +LPC driver's MDIO accessors are a compact reference +(`src/port/lpc_enet/lpc_enet.c`): + +```c +static uint16_t mdio_read(uint32_t phy, uint32_t reg) +{ + uint32_t cr; + mdio_wait(); /* wait for MII not-busy */ + cr = ETH_MACMDIOAR & MDIOAR_CR_MASK; + ETH_MACMDIOAR = cr | (MDIOAR_GOC_READ << MDIOAR_GOC_SHIFT) | + (phy << MDIOAR_PA_SHIFT) | (reg << MDIOAR_RDA_SHIFT) | + MDIOAR_MB; + mdio_wait(); + return (uint16_t)(ETH_MACMDIODR & 0xFFFFU); +} +``` + +Practical PHY-porting notes, all learned the hard way in the in-tree drivers: + +- **Scan for the PHY address.** The PHY's MDIO address (0–31) is a strapping + option; do not assume 0. Read the ID register at each address until you get a + non-`0x0000`/non-`0xFFFF` value. The LPC driver retries the scan because some + PHYs (LAN8742A) need the RMII clock stable before MDIO is reliable. 
+- **Match the MAC to the negotiated result.** After auto-negotiation completes, + read the PHY's status/vendor register and program the MAC's speed (FES) and + duplex (DM) bits to match. A duplex mismatch looks like "RX works, TX is + silently dropped as collisions." +- **Some status bits are latched.** BMSR link-status latches low on link loss + until read; the VA416xx driver double-reads to get the current state. + +If you are on a simulator or a direct MAC-to-MAC link with no PHY, you can skip +all of this and force the MAC speed/duplex — but on real copper, PHY bring-up +is usually where the time goes. + +### 4.9 Wiring the driver into the stack + +The driver's init function fills the `wolfIP_ll_dev` and brings the hardware +up. Pattern from `lpc_enet_init()`: + +```c +int lpc_enet_init(struct wolfIP_ll_dev *ll, const uint8_t *mac) +{ + if (!ll) return -1; + + memcpy(ll->mac, mac, 6); + strncpy(ll->ifname, "eth0", sizeof(ll->ifname) - 1); + ll->ifname[sizeof(ll->ifname) - 1] = '\0'; + ll->poll = eth_poll; /* <- the two callbacks */ + ll->send = eth_send; + + mac_stop(); + if (hw_reset() != 0) return -2; + mdio_init(); + config_mac(mac); + config_mtl(); + config_dma(); + init_desc(); /* lay out the descriptor rings */ + phy_init(); /* bring the PHY link up */ + config_speed_duplex(); + mac_start(); /* enable TX/RX, arm the RX ring */ + return 0; +} +``` + +The application then retrieves the device, points it at the driver, and sets +the IPv4 configuration (from the lwIP-migration guide's bare-metal pattern): + +```c +wolfIP_init_static(&ipstack); +dev = wolfIP_getdev(ipstack); +lpc_enet_init(dev, my_mac); +wolfIP_ipconfig_set(ipstack, + atoip4("192.168.1.50"), atoip4("255.255.255.0"), atoip4("192.168.1.1")); +``` + +You can fill the `wolfIP_ll_dev` fields in the init function (as above) or in +the caller after `wolfIP_getdev()` — both are used in-tree. For multiple +interfaces use `wolfIP_getdev_ex(s, idx)` and `wolfIP_ipconfig_set_ex()`. 
+ +### 4.10 Driver design checklist + +- `poll` returns `>0` (one frame copied), `0` (nothing), or `<0` (error); it + **never blocks**. +- `send` returns `≥0` (accepted) or `-WOLFIP_EAGAIN` (ring full); it never + blocks and never drops the frame silently when full. +- `poll` clamps the copied length to the `len` the stack passed in. +- `send` pads short frames to the 60-byte Ethernet minimum. +- DMA: write the descriptor body before the OWN doorbell, with a barrier + between, and kick the DMA tail pointer last. +- DMA on a cached core: `clean` before the MAC reads, `invalidate` before the + CPU reads; align buffers to a cache line. +- Descriptors/buffers are static, aligned, and (if required) placed in + DMA-reachable RAM. +- PHY address is discovered by scan; MAC speed/duplex follow the negotiated + result. +- `ll->mac`, `ll->ifname`, `ll->mtu`, `ll->poll`, `ll->send` are all set before + the first `wolfIP_poll()`. + +--- + +## 5. Wiring a random-number source + +Independently of the driver and OS, every port must provide one function the +stack calls for TCP initial sequence numbers, ephemeral ports, DNS IDs, and the +IP identification field (declared in `wolfip.h`): + +```c +uint32_t wolfIP_getrandom(void); +``` + +Back it with a real entropy source — a hardware TRNG, a seeded DRBG, or +wolfCrypt's RNG. The POSIX port uses the OS (`src/port/posix/tap_linux.c`): + +```c +uint32_t wolfIP_getrandom(void) +{ + uint32_t ret; + getrandom(&ret, sizeof(ret), 0); + return ret; +} +``` + +A wolfCrypt-backed version for products that already initialize wolfSSL: + +```c +uint32_t wolfIP_getrandom(void) +{ + static WC_RNG rng; + static int ready; + uint32_t v = 0; + if (!ready) { if (wc_InitRng(&rng) != 0) return 0; ready = 1; } + wc_RNG_GenerateBlock(&rng, (byte *)&v, sizeof(v)); + return v; +} +``` + +Do **not** ship a constant, a bare timer value, or unseeded `rand()`: +predictable sequence numbers and ports are a security hole. 
Decide explicitly +how the function behaves if the entropy source fails. + +--- + +## 6. Porting to a new operating system + +On bare metal the "OS port" is one line: call `wolfIP_poll()` in `main()`'s +loop. Under an RTOS you usually want three more things: + +- a **dedicated poll task** so the stack runs even while application tasks + block; +- a **mutex** so application tasks and the poll task do not enter the + single-threaded core concurrently; +- **blocking BSD sockets** so application code can write ordinary + `recv()`/`send()` instead of polling for `-WOLFIP_EAGAIN`. + +The FreeRTOS port (`src/port/freeRTOS/bsd_socket.c`) is the reference +implementation for all three; this section walks through building the +equivalent for a new RTOS, and section 7 shows a different shape (Zephyr's +socket-offload model) for contrast. + +### 6.1 What an OS port has to provide + +| Primitive | Used for | +|---|---| +| Mutex | Serialize every entry into the wolfIP core and `wolfIP_poll()`. | +| Binary semaphore / event per socket | Sleep a task until its socket is ready. | +| Task creation | Run the poll task. | +| Millisecond clock | Provide `now_ms` to `wolfIP_poll()`. | +| Sleep/delay | Idle the poll task between cycles. | + +That is the whole dependency list. wolfIP needs no dynamic memory, no per-socket +threads, and no timer callbacks from the OS. + +### 6.2 The poll task: the heartbeat of the stack + +The poll task is a forever-loop that takes the core mutex, runs one poll cycle, +releases the mutex, and sleeps for a bounded interval. 
`wolfIP_poll()` returns +`>= 0` on success and a negative value on error; the FreeRTOS version clamps the +sleep to a `[MIN, MAX]` window so the task neither spins nor oversleeps: + +```c +static void wolfip_bsd_poll_task(void *arg) +{ + struct wolfIP *ipstack = (struct wolfIP *)arg; + + for (;;) { + uint32_t next_ms; + TickType_t delay_ticks; + uint64_t now_ms = (uint64_t)xTaskGetTickCount() * (uint64_t)portTICK_PERIOD_MS; + + /* One poll cycle under the global lock so socket ops and timer + * processing see a consistent core state. */ + xSemaphoreTake(g_lock, portMAX_DELAY); + next_ms = (uint32_t)wolfIP_poll(ipstack, now_ms); + xSemaphoreGive(g_lock); + + if (next_ms < WOLFIP_FREERTOS_POLL_MIN_MS) next_ms = WOLFIP_FREERTOS_POLL_MIN_MS; + if (next_ms > WOLFIP_FREERTOS_POLL_MAX_MS) next_ms = WOLFIP_FREERTOS_POLL_MAX_MS; + + delay_ticks = pdMS_TO_TICKS(next_ms); + if (delay_ticks == 0) delay_ticks = 1; /* always yield at least 1 tick */ + vTaskDelay(delay_ticks); + } +} +``` + +The default bounds are 5 ms minimum and 20 ms maximum. That floor stops the +task from busy-spinning; the ceiling guarantees TCP retransmit timers, delayed +ACKs, DHCP, and DNS still fire promptly. Lower the ceiling for latency, raise it +for power — but verify TCP behaviour after raising it. + +### 6.3 The core mutex + +wolfIP is single-threaded. The mutex is what lets a multi-tasking OS use it +safely: **every** call into `wolfIP_*` — from the poll task and from every +socket wrapper — happens while holding it. The FreeRTOS port creates one mutex +at init: + +```c +g_lock = xSemaphoreCreateMutex(); +``` + +The non-negotiable rule (section 8) is: hold the mutex *only* while inside the +core, and **never** hold it while blocking on a semaphore. Hold-while-sleeping +deadlocks the poll task and stalls the whole stack. + +### 6.4 Building bsd_socket.c: the public FD table + +Application tasks should not juggle wolfIP's internal socket descriptors +directly. 
The wrapper keeps a small table mapping a **public fd** (the small +integer it hands back from `socket()`/`accept()`) to the **internal wolfIP fd** +plus the per-socket wakeup primitive: + +```c +typedef struct { + int in_use; + int internal_fd; /* the wolfIP_sock_* descriptor */ + SemaphoreHandle_t ready_sem; /* given by the callback, taken by waiters */ + volatile uint16_t wait_events; /* event bits this fd is blocked on */ + volatile uint16_t seen_events; /* event bits the callback has delivered */ +} wolfip_bsd_fd_entry; + +static struct wolfIP *g_ipstack; +static SemaphoreHandle_t g_lock; +static wolfip_bsd_fd_entry g_fds[WOLFIP_FREERTOS_BSD_MAX_FDS]; +``` + +`socket()` calls `wolfIP_sock_socket()` under the lock, then allocates a table +slot (and a fresh binary semaphore) for the returned internal fd: + +```c +int socket(int domain, int type, int protocol) +{ + int ret, public_fd; + xSemaphoreTake(g_lock, portMAX_DELAY); + ret = wolfIP_sock_socket(g_ipstack, domain, type, protocol); + if (ret < 0) { wolfip_bsd_set_error(ret); xSemaphoreGive(g_lock); return -1; } + public_fd = wolfip_bsd_fd_alloc(ret); /* creates ready_sem */ + if (public_fd < 0) { wolfIP_sock_close(g_ipstack, ret); xSemaphoreGive(g_lock); return -1; } + xSemaphoreGive(g_lock); + return public_fd; +} +``` + +Non-blocking calls (`bind`, `listen`, `setsockopt`, `getsockname`, …) are +trivial: validate the fd, take the lock, call the matching `wolfIP_sock_*`, +release, translate a negative return to `-1`. The interesting ones are the +calls that can return `-WOLFIP_EAGAIN`. + +### 6.5 Turning -WOLFIP_EAGAIN into blocking + +wolfIP's socket calls never block — they return `-WOLFIP_EAGAIN` when an +operation would have to wait. The wrapper converts that into POSIX blocking +semantics with a lock / try / register-callback / unlock / wait / retry loop: + +1. lock the core; +2. call the `wolfIP_sock_*` function; +3. success → unlock, return; +4. hard error → unlock, set errno, return `-1`; +5. 
`-WOLFIP_EAGAIN` → register a callback for the needed events, clear the + semaphore, **unlock**, then block on the semaphore; +6. when the callback wakes the task, loop and retry. + +Each blocking call waits on the event bits that make sense for it: + +- `accept()` → `CB_EVENT_READABLE | CB_EVENT_CLOSED` +- `connect()` → `CB_EVENT_WRITABLE | CB_EVENT_CLOSED` +- `send()` → `CB_EVENT_WRITABLE | CB_EVENT_CLOSED` +- `recv()` → `CB_EVENT_READABLE | CB_EVENT_CLOSED` +- `close()` → `CB_EVENT_CLOSED` (only if close itself returns `-WOLFIP_EAGAIN`) + +The "prepare wait" helper is the heart of the race-free hand-off. It must run +**while the core lock is held**, so a callback cannot fire between arming the +wait and registering for it: + +```c +static void wolfip_bsd_prepare_wait_locked(wolfip_bsd_fd_entry *entry, uint16_t wait_events) +{ + entry->seen_events = 0; + entry->wait_events = wait_events; + while (xSemaphoreTake(entry->ready_sem, 0) == pdTRUE) { } /* drain stale gives */ + wolfIP_register_callback(g_ipstack, entry->internal_fd, wolfip_bsd_socket_cb, entry); +} +``` + +### 6.6 Bridging wolfIP callbacks to OS wakeups + +wolfIP delivers socket events by calling a registered callback from inside +`wolfIP_poll()` (i.e. from the poll task, with the lock already held). The +callback's only job is to record the events and wake any task waiting on those +bits. It must **not** do socket I/O and must **not** block: + +```c +static void wolfip_bsd_socket_cb(int internal_fd, uint16_t events, void *arg) +{ + wolfip_bsd_fd_entry *entry = (wolfip_bsd_fd_entry *)arg; + (void)internal_fd; + if (entry == NULL) return; + + entry->seen_events |= events; + if ((events & entry->wait_events) != 0) + (void)xSemaphoreGive(entry->ready_sem); /* wake the blocked task */ +} +``` + +The `arg` you pass to `wolfIP_register_callback()` comes straight back here, so +pass the FD-table entry and you have O(1) access to the right semaphore. 
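
When porting this bridge to another RTOS, the event filter and the "drain stale gives" step can be exercised on a host before any real semaphore is involved. The sketch below assumes a hypothetical `sk_sem` type that stands in for your RTOS's binary semaphore, and hypothetical `SK_EVENT_*` bit values (the real `CB_EVENT_*` values come from `wolfip.h`). It is a single-threaded model of the logic, not a working synchronization primitive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; use the CB_EVENT_* values from wolfip.h. */
#define SK_EVENT_READABLE 0x01u
#define SK_EVENT_WRITABLE 0x02u
#define SK_EVENT_CLOSED   0x04u

/* Stand-in for a binary semaphore: give() saturates at 1. */
struct sk_sem { int count; };

static void sk_sem_give(struct sk_sem *s) { s->count = 1; }

static int sk_sem_try_take(struct sk_sem *s)
{
    if (s->count == 0) return 0;    /* would block */
    s->count = 0;
    return 1;                       /* taken */
}

struct sk_fd_entry {
    struct sk_sem     ready_sem;
    volatile uint16_t wait_events;  /* bits the blocked task cares about */
    volatile uint16_t seen_events;  /* bits delivered since the last arm */
};

/* Arm a wait; in a real port this runs with the core lock held. */
static void sk_prepare_wait_locked(struct sk_fd_entry *e, uint16_t wait_events)
{
    e->seen_events = 0;
    e->wait_events = wait_events;
    while (sk_sem_try_take(&e->ready_sem)) { }  /* drain stale gives */
}

/* Same shape as the socket callback: record, then wake only on a match. */
static void sk_socket_cb(int internal_fd, uint16_t events, void *arg)
{
    struct sk_fd_entry *e = (struct sk_fd_entry *)arg;

    (void)internal_fd;
    if (e == NULL) return;
    e->seen_events |= events;
    if ((events & e->wait_events) != 0)
        sk_sem_give(&e->ready_sem);
}
```

The property worth checking in your own port is the one this model makes visible: an event outside `wait_events` is still recorded in `seen_events` but does not wake the waiter, and a give left over from an earlier wait is discarded when the next wait is armed.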
+ +### 6.7 A complete wrapper: recv() + +Putting it together, here is the full blocking `recv()` — every other blocking +wrapper is the same skeleton with different event bits and a different +`wolfIP_sock_*` call: + +```c +int recv(int sockfd, void *buf, size_t len, int flags) +{ + int ret; + wolfip_bsd_fd_entry *entry; + + if (!wolfip_bsd_fd_valid(sockfd)) return -1; + entry = &g_fds[sockfd]; + + for (;;) { + xSemaphoreTake(g_lock, portMAX_DELAY); + ret = wolfIP_sock_recv(g_ipstack, entry->internal_fd, buf, len, flags); + if (ret >= 0) { /* data, or 0 on close */ + xSemaphoreGive(g_lock); + return ret; + } + if (ret != -WOLFIP_EAGAIN) { /* hard error */ + xSemaphoreGive(g_lock); + wolfip_bsd_set_error(ret); + return -1; + } + /* would block: arm the wait while still locked, then release+sleep */ + wolfip_bsd_prepare_wait_locked(entry, + (uint16_t)(CB_EVENT_READABLE | CB_EVENT_CLOSED)); + xSemaphoreGive(g_lock); + if (wolfip_bsd_wait_unlocked(entry) < 0) { /* block on ready_sem */ + wolfip_bsd_set_error(WOLFIP_EAGAIN); + return -1; + } + /* woken: loop and retry wolfIP_sock_recv() */ + } +} +``` + +Trace the lock discipline: the core is locked around `wolfIP_sock_recv()` and +around `prepare_wait`, then **released before** `wait_unlocked()` blocks on the +semaphore. That is what lets the poll task keep running (and the callback fire) +while this task sleeps. `close()` additionally has to handle the TCP core +destroying the socket the instant it delivers `CB_EVENT_CLOSED` — see the +`seen_events & CB_EVENT_CLOSED` branch in the reference file. 
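
The lock discipline traced above can be verified on a host with instrumented stubs. This is a sketch under stated assumptions: `sketch_port`, `sketch_lock()`, `sketch_arm_wait()`, and `sketch_wait()` are hypothetical placeholders for your RTOS mutex, the prepare-wait helper, and the semaphore wait, and `SKETCH_EAGAIN` stands in for `WOLFIP_EAGAIN`. The stubs only count what happened so that a test can prove the task never slept while holding the lock.

```c
#include <stddef.h>

#define SKETCH_EAGAIN 11            /* stand-in for WOLFIP_EAGAIN */

struct sketch_port {
    int lock_depth;                 /* 1 while the "core mutex" is held */
    int waits;                      /* how many times the task slept */
    int waited_locked;              /* set if it ever slept holding the lock */
    int armed_unlocked;             /* set if a wait was armed without the lock */
};

typedef int (*sketch_try_fn)(void *ctx);

static void sketch_lock(struct sketch_port *p)   { p->lock_depth++; }
static void sketch_unlock(struct sketch_port *p) { p->lock_depth--; }

static void sketch_arm_wait(struct sketch_port *p)
{
    if (p->lock_depth == 0) p->armed_unlocked = 1;
}

static int sketch_wait(struct sketch_port *p)
{
    if (p->lock_depth != 0) p->waited_locked = 1;
    p->waits++;
    return 0;                       /* pretend the callback woke us */
}

/* lock / try / arm / unlock / wait / retry, as in the recv() wrapper. */
static int sketch_blocking_call(struct sketch_port *p,
                                sketch_try_fn try_once, void *ctx)
{
    for (;;) {
        int ret;

        sketch_lock(p);
        ret = try_once(ctx);
        if (ret != -SKETCH_EAGAIN) {    /* success or hard error */
            sketch_unlock(p);
            return ret;
        }
        sketch_arm_wait(p);             /* still locked */
        sketch_unlock(p);               /* release BEFORE sleeping */
        if (sketch_wait(p) < 0)
            return -SKETCH_EAGAIN;
    }
}

/* Demo operation: would block *remaining times, then returns 5 bytes. */
static int sketch_try_recv(void *ctx)
{
    int *remaining = (int *)ctx;

    if (*remaining > 0) { (*remaining)--; return -SKETCH_EAGAIN; }
    return 5;
}
```

Swapping the order of `sketch_unlock()` and `sketch_wait()` makes `waited_locked` non-zero, which is the host-side signature of the hold-while-sleeping deadlock.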
+ +### 6.8 Initialization and teardown + +Init creates the mutex, clears the FD table, stores the stack pointer, and +spawns the poll task: + +```c +int wolfip_freertos_socket_init(struct wolfIP *ipstack, + UBaseType_t poll_task_priority, uint16_t poll_task_stack_words) +{ + if (ipstack == NULL) return -WOLFIP_EINVAL; + g_lock = xSemaphoreCreateMutex(); + if (g_lock == NULL) return -WOLFIP_ENOMEM; + for (int i = 0; i < WOLFIP_FREERTOS_BSD_MAX_FDS; i++) g_fds[i].in_use = 0; + g_ipstack = ipstack; + if (xTaskCreate(wolfip_bsd_poll_task, "wolfip_poll", poll_task_stack_words, + g_ipstack, poll_task_priority, NULL) != pdPASS) { + vSemaphoreDelete(g_lock); + return -WOLFIP_ENOMEM; + } + return 0; +} +``` + +On `close()`, clear the callback and free the table slot (deleting its +semaphore) under the lock so a concurrent callback never touches a freed entry: + +```c +wolfIP_register_callback(g_ipstack, entry->internal_fd, NULL, NULL); +wolfip_bsd_fd_free(sockfd); /* vSemaphoreDelete + mark slot free */ +``` + +--- + +## 7. Case study: the Zephyr port + +The Zephyr port (`port/zephyr/`) reaches the same destination as the FreeRTOS +wrapper — blocking BSD sockets backed by wolfIP — but through Zephyr's own +machinery, and it is worth studying because it shows the two structural pieces +of *any* OS port in a different idiom. + +**The poll task** is the same heartbeat, written with Zephyr primitives. 
A +dedicated thread loops over `wolfIP_poll()` under a mutex and sleeps on a +semaphore with a timeout (`port/zephyr/patches/0001-wolfip-glue-and-public-api.patch`): + +```c +static void wolfip_worker(void *arg1, void *arg2, void *arg3) +{ + while (true) { + k_mutex_lock(&wolfip_ctx.lock, K_FOREVER); + (void)wolfIP_poll(WOLFIP_STACK(), k_uptime_get()); + k_mutex_unlock(&wolfip_ctx.lock); + + (void)k_sem_take(&wolfip_ctx.wake_sem, + K_MSEC(CONFIG_WOLFIP_POLL_INTERVAL_MS)); + } +} +``` + +`k_mutex` is the core lock; `k_uptime_get()` is the millisecond clock; the +`wake_sem` with a timeout is the bounded sleep. Same three responsibilities as +section 6, different RTOS API. + +**The driver surface** is where Zephyr differs most interestingly. Rather than a +bare-metal `wolfIP_ll_dev`, the port installs a thin **L2 module** +(`NET_L2_WOLFIP`) that overlays Zephyr's existing Ethernet driver. On receive it +linearises the `net_pkt` into a flat buffer and hands the whole raw frame to +wolfIP; on transmit it delegates to the underlying driver's `send` because +wolfIP has already built the Ethernet header +(`port/zephyr/patches/0002-net-l2-wolfip-module.patch`): + +```c +static enum net_verdict wolfip_l2_recv(struct net_if *iface, struct net_pkt *pkt) +{ + ... + if (net_pkt_read(pkt, frame, len) < 0) ... /* linearise */ + wolfip_zephyr_l2_input(iface, frame, len); /* hand raw frame up */ + ... +} + +static int wolfip_l2_send(struct net_if *iface, struct net_pkt *pkt) +{ + ret = net_l2_send(api->send, net_if_get_device(iface), iface, pkt); + ... +} +``` + +This is the same idea as `ll->poll`/`ll->send` — *"copy one complete raw +Ethernet frame in or out"* — expressed against Zephyr's driver model instead of +registers. The application then uses ordinary `socket()`/`recv()`/`send()` via +Zephyr's socket-offload framework, which the port backs with `wolfIP_sock_*`. 
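
If your OS hands received packets over as a chain of fragments, the "linearise" step amounts to a bounded copy into one flat frame buffer. The sketch below is generic and makes no claim about Zephyr's API: `struct sk_frag` is a hypothetical fragment list standing in for whatever packet object your OS uses. It assumes an oversized frame is dropped rather than truncated, which is a policy choice you may want to make differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fragment chain; not an OS or wolfIP type. */
struct sk_frag {
    const uint8_t        *data;
    size_t                len;
    const struct sk_frag *next;
};

/* Copy the whole chain into `out`. Returns the frame length, or -1 if the
 * frame does not fit in `cap` bytes (the caller should drop it). */
static int sk_linearise(const struct sk_frag *f, uint8_t *out, size_t cap)
{
    size_t total = 0;

    for (; f != NULL; f = f->next) {
        if (f->len == 0) continue;              /* skip empty fragments */
        if (f->len > cap - total) return -1;    /* would overflow `out` */
        memcpy(out + total, f->data, f->len);
        total += f->len;
    }
    return (int)total;
}
```

The flat buffer and returned length are then what the port hands up to the stack as one complete raw Ethernet frame.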
+ +One wolfIP-specific subtlety the Zephyr port documents and that any +event-driven port should heed: wolfIP fires its socket callbacks from +`wolfIP_poll()`, and a listening socket processes one connection at a time. The +port therefore calls `wolfIP_poll()` synchronously after each received frame +and pre-accepts inside the listener callback, so a fast peer's final ACK does +not strand a half-open connection. If your OS port routes RX through an event +queue, make sure a poll cycle runs promptly after each frame. + +--- + +## 8. RTOS locking rules + +These rules apply to every OS port, FreeRTOS, Zephyr, or your own: + +- Hold the core mutex while calling **any** `wolfIP_sock_*` function. +- Hold the core mutex while calling `wolfIP_poll()`. +- **Never** hold the core mutex while blocking on a semaphore/event — release + it first, then wait. +- Keep wolfIP callbacks short: record events and wake a task, nothing more. +- Do not call a blocking socket wrapper from inside a wolfIP callback. +- Arm the wait (drain the semaphore, register the callback) **while locked**, + so no event slips through between the failed call and the wait. +- Protect the public FD table consistently, and clear a socket's callback + before freeing its table slot. +- Pick one error convention — BSD `-1` + errno, or raw wolfIP negatives — and + apply it everywhere. + +--- + +## 9. Porting checklist + +**Driver** +- [ ] `poll`/`send` implemented per the section 4.1 contract; neither blocks. +- [ ] DMA: OWN-bit hand-off ordered with barriers; tail pointer kicked last. +- [ ] DMA on a cached core: clean-before-MAC-read, invalidate-before-CPU-read; + buffers cache-line aligned and in DMA-reachable RAM. +- [ ] PHY discovered by MDIO scan; MAC speed/duplex follow negotiation. +- [ ] `mac`, `ifname`, `mtu`, `poll`, `send` set before first poll. + +**Random** +- [ ] `wolfIP_getrandom()` backed by a real entropy source. 
+ +**Bare metal** +- [ ] `wolfIP_init_static()` / `wolfIP_init()` called; IP config set. +- [ ] `wolfIP_poll()` called in the main loop with a millisecond clock. + +**RTOS** +- [ ] One poll task running `wolfIP_poll()` under the core mutex, bounded sleep. +- [ ] One core mutex around every core entry. +- [ ] Public FD table with one wakeup primitive per socket (if exposing BSD + sockets). +- [ ] `-WOLFIP_EAGAIN` converted to wait-and-retry on the right event bits. +- [ ] Callbacks only wake tasks; never block while holding the lock. + +--- + +## 10. Common porting pitfalls + +**Blocking inside a driver callback.** `poll`/`send` run inline in +`wolfIP_poll()`. Returning `0`/`-WOLFIP_EAGAIN` is how you say "later"; a busy- +wait stalls the whole stack. + +**Setting the DMA OWN bit before the buffer is visible.** Write the descriptor +body, barrier, *then* the OWN doorbell. Otherwise the MAC DMAs stale data. + +**Forgetting cache maintenance on a cached core.** Without clean/invalidate the +CPU and MAC see different memory: TX sends stale bytes, RX reads stale OWN bits. +During bring-up, mark the DMA region non-cacheable to isolate this. + +**Assuming the wrong descriptor variant.** Enhanced vs. normal Synopsys +descriptors put control bits in different words and wrap the ring differently. +Confirm against a known-good reference driver for your exact silicon. + +**Assuming PHY address 0.** It is strapped per board; scan for it. + +**Holding the core lock while sleeping.** The classic RTOS deadlock — the poll +task can never run to deliver the event you are waiting for. Unlock, then wait. + +**Doing socket I/O in a callback.** Callbacks fire from `wolfIP_poll()` with the +lock held. Wake a task and let *it* do the I/O. + +**Accepting only once per readable event.** A listener-readable event can cover +several pending connections; accept in a loop until `-WOLFIP_EAGAIN`. 
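
The accept-until-empty loop can be sketched as follows. This is illustrative only: `struct sk_listener` and `sk_accept_once()` are hypothetical stand-ins for a listening socket and the real non-blocking accept call, and `SK_EAGAIN` stands in for `WOLFIP_EAGAIN`.

```c
#include <stddef.h>

#define SK_EAGAIN 11                /* stand-in for WOLFIP_EAGAIN */

/* Hypothetical listener model: a count of pending connections. */
struct sk_listener {
    int pending;
    int next_fd;
};

/* Stand-in for one non-blocking accept attempt. */
static int sk_accept_once(struct sk_listener *l)
{
    if (l->pending == 0) return -SK_EAGAIN;
    l->pending--;
    return l->next_fd++;
}

/* Drain the backlog on one readable event. Returns how many connections were
 * accepted, or a negative error if the very first attempt failed hard. */
static int sk_accept_all(struct sk_listener *l, int *fds, int max_fds)
{
    int n = 0;

    while (n < max_fds) {
        int fd = sk_accept_once(l);

        if (fd == -SK_EAGAIN) break;            /* backlog is empty */
        if (fd < 0) return (n > 0) ? n : fd;    /* hard error */
        fds[n++] = fd;
    }
    return n;
}
```

If the loop stops because `max_fds` was reached rather than on `-SK_EAGAIN`, connections may still be pending, so run it again instead of waiting for another event.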
+ +**A weak `wolfIP_getrandom()`.** Predictable TCP sequence numbers and ephemeral +ports are a real vulnerability. Wire real entropy before shipping. diff --git a/docs/tftp_howto.md b/docs/tftp_howto.md new file mode 100644 index 00000000..f6ff364e --- /dev/null +++ b/docs/tftp_howto.md @@ -0,0 +1,394 @@ +# TFTP How-To + +This guide shows how to use the wolfIP **TFTP** module (`src/tftp/`): how to +build it, how to run a client (for example to pull a firmware image) and a +server, and how to wire its callback-driven, allocation-free core to wolfIP UDP +sockets. + +It is a getting-started document. The authoritative API is `src/tftp/wolftftp.h`; +the worked examples come from `src/port/stm32h563/tftp_client_demo.c` (a +firmware-download client) and `src/test/test_tftp_interop.c` (a server tested +against Linux `tftp-hpa`). + +## Table of Contents + +- [1. What the TFTP module is](#1-what-the-tftp-module-is) +- [2. Building with TFTP](#2-building-with-tftp) +- [3. Architecture: callbacks, not blocking calls](#3-architecture-callbacks-not-blocking-calls) +- [4. The transport callback (UDP send)](#4-the-transport-callback-udp-send) +- [5. The I/O callbacks (storage)](#5-the-io-callbacks-storage) +- [6. Writing a client](#6-writing-a-client) +- [7. Writing a server](#7-writing-a-server) +- [8. Protocol options: blksize, timeout, windowsize, tsize](#8-protocol-options-blksize-timeout-windowsize-tsize) +- [9. Firmware download pattern (hash + verify)](#9-firmware-download-pattern-hash--verify) +- [10. Interop testing with tftp-hpa](#10-interop-testing-with-tftp-hpa) +- [11. Error codes and troubleshooting](#11-error-codes-and-troubleshooting) + +--- + +## 1. What the TFTP module is + +`src/tftp/` is a reusable, self-contained TFTP engine (RFC 1350 plus the option +extensions RFC 2347/2348/2349 and the windowsize extension RFC 7440). 
It +implements both a **client** and a **multi-session server**, and like the rest +of wolfIP it performs **zero dynamic allocation** — all state lives in +caller-provided `struct wolftftp_client` / `struct wolftftp_server` objects. + +The module is deliberately decoupled from the socket layer. It never calls +`wolfIP_sock_*` itself; instead you give it: + +- a **transport callback** that sends a UDP datagram, and +- a set of **I/O callbacks** that open/read/write/close your storage. + +That makes the same engine usable over wolfIP, over POSIX sockets, or over any +other UDP transport. It also means the firmware-download path can stream bytes +straight into flash through your `write` callback, with an optional running hash +and a final verify step — no buffering of the whole image. + +## 2. Building with TFTP + +TFTP is opt-in. The default `config.h` sets `WOLFIP_ENABLE_TFTP 0`; enable it at +build time. With the top-level `Makefile`: + +```sh +make WOLFIP_ENABLE_TFTP=1 +``` + +This globs `src/tftp/*.c` into the shared library, static library, and +top-level executable, and defines `-DWOLFIP_ENABLE_TFTP=1`. The CMake build +globs `src/tftp/*.c` into the `wolfip` and `tcpip` targets with +`CONFIGURE_DEPENDS`, so no manual file list is needed there either. + +Tunable compile-time limits (override in `config.h` or via `-D`), from +`wolftftp.h`: + +| Macro | Default | Meaning | +|-------|---------|---------| +| `WOLFTFTP_PORT` | 69 | Well-known server port. | +| `WOLFTFTP_DEFAULT_BLKSIZE` | 512 | Block size before option negotiation. | +| `WOLFTFTP_MAX_BLKSIZE` | 1428 | Largest negotiable block (fits one Ethernet frame). | +| `WOLFTFTP_MAX_WINDOWSIZE` | 8 | Largest negotiable window (RFC 7440). | +| `WOLFTFTP_DEFAULT_TIMEOUT_S` | 1 | Per-block retransmit timeout. | +| `WOLFTFTP_MAX_RETRIES` | 5 | Retries before a transfer fails. | +| `WOLFTFTP_MAX_FILENAME` | 128 | Maximum request filename length. | +| `WOLFTFTP_SERVER_MAX_SESSIONS` | 4 | Concurrent server transfers. 
| +| `WOLFTFTP_SERVER_PORT_BASE` | 20000 | First ephemeral transfer port. | + +## 3. Architecture: callbacks, not blocking calls + +A TFTP transfer is driven by two engine entry points you call from your normal +poll loop, plus the callbacks the engine calls back into: + +```text + your UDP socket + │ ▲ + recvfrom() │ │ transport.send() (you send the datagram) + ▼ │ + wolftftp_client_receive() │ + │ │ + ┌─────┴──────────┴─────┐ + │ wolftftp engine │── io.open/read/write/close ──▶ storage + └─────────┬────────────┘── io.hash_update/verify ────▶ (optional) + │ + wolftftp_client_poll(now_ms) (drives timeouts/retransmits) +``` + +The loop is always the same shape: + +1. `recvfrom()` on your UDP socket; for each datagram call + `wolftftp_*_receive(...)` to feed it to the engine. +2. Call `wolftftp_*_poll(now_ms)` once per loop iteration to service timeouts + and retransmissions. +3. The engine calls your `transport.send` to put bytes on the wire and your + `io.*` callbacks to touch storage. + +Nothing blocks: `receive` and `poll` return immediately, so the TFTP engine +co-operates with the rest of `wolfIP_poll()` on a single thread. + +## 4. The transport callback (UDP send) + +The engine hands you a fully-formed TFTP datagram and a destination; you put it +on the wire. With wolfIP this is one `wolfIP_sock_sendto()` +(`src/port/stm32h563/tftp_client_demo.c`): + +```c +static int demo_udp_send(void *arg, uint16_t local_port, + const struct wolftftp_endpoint *remote, const uint8_t *buf, uint16_t len) +{ + struct wolfIP_sockaddr_in dst; + int ret; + + (void)arg; (void)local_port; + memset(&dst, 0, sizeof(dst)); + dst.sin_family = AF_INET; + dst.sin_port = ee16(remote->port); /* endpoint port is host order */ + dst.sin_addr.s_addr = ee32(remote->ip); /* endpoint ip is host order */ + ret = wolfIP_sock_sendto(g_stack, g_sock, buf, len, 0, + (struct wolfIP_sockaddr *)&dst, sizeof(dst)); + return (ret == (int)len) ? 0 : (ret < 0 ? 
ret : -1); +} +``` + +Two conventions to note: `struct wolftftp_endpoint` carries `ip`/`port` in +**host byte order**, so convert with `ee32()`/`ee16()` when filling the wolfIP +sockaddr; and the callback returns `0` on success, a negative value on failure. + +## 5. The I/O callbacks (storage) + +`struct wolftftp_io_ops` decouples the protocol from where bytes live. You +implement the subset your role needs (a download client needs `write`; a server +serving files needs `read`; both need `open`/`close`): + +```c +struct wolftftp_io_ops { + wolftftp_open_cb open; /* open(name, is_write, *size_hint, **handle) */ + wolftftp_read_cb read; /* server -> client: produce file bytes */ + wolftftp_write_cb write; /* client <- server: consume file bytes */ + wolftftp_hash_update_cb hash_update; /* optional: running hash of payload */ + wolftftp_verify_cb verify; /* optional: final integrity/size check */ + wolftftp_close_cb close; /* finalize, status = 0 ok or WOLFTFTP_ERR_* */ + void *arg; /* opaque context passed to every callback */ +}; +``` + +A filesystem-backed server `open`/`read` (`src/test/test_tftp_interop.c`): + +```c +static int io_open(void *arg, const char *name, int is_write, + uint32_t *size_hint, void **handle) +{ + struct tftp_file_ctx *ctx = (struct tftp_file_ctx *)arg; + struct stat st; + + ctx->fp = fopen(ctx->path, is_write ? "wb+" : "rb"); + if (ctx->fp == NULL) return -1; + if (!is_write && stat(ctx->path, &st) == 0 && size_hint != NULL) + *size_hint = (uint32_t)st.st_size; /* advertised as tsize */ + *handle = ctx->fp; + return 0; +} + +static int io_read(void *arg, void *handle, uint32_t offset, + uint8_t *buf, uint16_t max_len, uint16_t *out_len, int *is_last) +{ + FILE *fp = (FILE *)handle; + if (fseek(fp, (long)offset, SEEK_SET) != 0) return -1; + *out_len = (uint16_t)fread(buf, 1, max_len, fp); + /* Flag EOF only on a short read; a file ending on a block boundary needs + * one more 0-byte DATA block (RFC 1350), so a full read is NOT the last. 
*/ + *is_last = (*out_len < max_len) ? 1 : 0; + return 0; +} +``` + +`read` is **offset-addressed**: the engine tells you where to read from, which +is what lets it replay a window on a retransmit without you tracking position. +The `is_last` out-param and the trailing zero-byte block are the classic TFTP +EOF subtlety — see the header comment on `wolftftp_read_cb`. + +> **Security note.** The server rejects absolute paths and any `..` component +> before calling `open`, but `name` may still contain relative subdirectories. +> Resolve it against a confined root (chroot or a fixed base directory), not the +> process cwd. + +## 6. Writing a client + +A client transfer is: create and bind a UDP socket, fill the three config +structs, `wolftftp_client_init()`, then `wolftftp_client_start_rrq()` to kick off +a read request. Condensed from `tftp_client_demo.c`: + +```c +struct wolftftp_client g_client; +struct wolftftp_transport_ops tx = {0}; +struct wolftftp_io_ops io = {0}; +struct wolftftp_transfer_cfg cfg = {0}; +struct wolftftp_endpoint server_ep; + +/* 1. UDP socket bound to a fixed local port */ +g_sock = wolfIP_sock_socket(stack, AF_INET, IPSTACK_SOCK_DGRAM, 0); +/* bind g_sock to TFTP_CLIENT_LOCAL_PORT ... */ + +/* 2. transport + storage callbacks */ +tx.send = demo_udp_send; +io.open = demo_open; +io.write = demo_write; /* download: bytes go to flash */ +io.close = demo_close; + +/* 3. transfer parameters (0 leaves a field at its built-in default) */ +cfg.local_port = TFTP_CLIENT_LOCAL_PORT; +cfg.blksize = TFTP_DEMO_BLKSIZE; +cfg.timeout_s = TFTP_DEMO_TIMEOUT_S; +cfg.windowsize = TFTP_DEMO_WINDOWSIZE; +cfg.max_retries = TFTP_DEMO_MAX_RETRIES; +cfg.max_image_size = WOLFBOOT_PARTITION_SIZE; /* hard cap, refuses bigger */ + +/* 4. 
initialise and start the read request */ +wolftftp_client_init(&g_client, &tx, &io, &cfg); +server_ep.ip = server_ip; /* host byte order */ +server_ep.port = WOLFTFTP_PORT; /* 69 */ +wolftftp_client_start_rrq(&g_client, &server_ep, filename); +``` + +Then drive it from your poll loop — pump received datagrams into the engine and +call `poll` for timers: + +```c +void tftp_client_demo_poll(uint32_t now_ms) +{ + struct wolfIP_sockaddr_in remote; + socklen_t rlen = sizeof(remote); + int n; + + for (;;) { + n = wolfIP_sock_recvfrom(g_stack, g_sock, g_rx_buf, sizeof(g_rx_buf), + 0, (struct wolfIP_sockaddr *)&remote, &rlen); + if (n <= 0) break; + struct wolftftp_endpoint rep = { + .ip = ee32(remote.sin_addr.s_addr), /* back to host order */ + .port = ee16(remote.sin_port) + }; + wolftftp_client_receive(&g_client, TFTP_CLIENT_LOCAL_PORT, &rep, + g_rx_buf, (uint16_t)n); + } + wolftftp_client_poll(&g_client, now_ms); +} +``` + +Poll the result with `wolftftp_client_status()`: a positive value means "in +progress," `0` means success, and a negative value is a `WOLFTFTP_ERR_*` code. + +## 7. Writing a server + +The server is the same pattern with two sockets: a **listen** socket on port 69 +for incoming RRQ/WRQ, and one or more **transfer** sockets on ephemeral ports +(each active session gets its own TID). You feed datagrams from both into +`wolftftp_server_receive()`, tagging each with the local port it arrived on so +the engine can route it to the right session. 
From +`src/test/test_tftp_interop.c`: + +```c +struct wolftftp_server server; +struct wolftftp_transport_ops transport = {0}; +struct wolftftp_io_ops io = {0}; +struct wolftftp_transfer_cfg cfg = {0}; + +transport.send = server_send; +io.open = io_open; +io.read = io_read; /* serve file bytes */ +io.write = io_write; /* accept uploads */ +io.close = server_io_close; +io.arg = &file_ctx; + +cfg.blksize = WOLFTFTP_DEFAULT_BLKSIZE; +cfg.timeout_s = 2; +cfg.windowsize = 1; +cfg.max_retries = 5; + +wolftftp_server_init(&server, &transport, &io, &cfg); +server.listen_port = TFTP_INTEROP_PORT; /* 69 in production */ +server.transfer_port_base = TFTP_INTEROP_TRANSFER_PORT; /* ephemeral base */ +``` + +Poll loop — drain both sockets, route by local port, then service timers: + +```c +int socks[2] = { listen_sock, transfer_sock }; +uint16_t ports[2] = { TFTP_INTEROP_PORT, TFTP_INTEROP_TRANSFER_PORT }; + +wolfIP_poll(s, now_ms()); +for (int i = 0; i < 2; i++) { + for (;;) { + int n = wolfIP_sock_recvfrom(s, socks[i], pkt, sizeof(pkt), 0, + (struct wolfIP_sockaddr *)&remote, &rlen); + if (n <= 0) break; + struct wolftftp_endpoint rep = { + .ip = ee32(remote.sin_addr.s_addr), .port = ee16(remote.sin_port) }; + wolftftp_server_receive(&server, ports[i], &rep, pkt, (uint16_t)n); + } +} +wolftftp_server_poll(&server, (uint32_t)now_ms()); +``` + +Concurrency is bounded by `WOLFTFTP_SERVER_MAX_SESSIONS`; a new request that +finds no free session slot is rejected with a TFTP error rather than queued. + +## 8. Protocol options: blksize, timeout, windowsize, tsize + +The engine implements the TFTP option extensions and negotiates them in the +RRQ/WRQ → OACK exchange: + +- **blksize (RFC 2348)** — larger blocks mean fewer round trips. Set + `cfg.blksize` up to `WOLFTFTP_MAX_BLKSIZE` (1428, sized to fit one Ethernet + frame without IP fragmentation). +- **timeout (RFC 2349)** — `cfg.timeout_s` is the per-block retransmit timeout. 
+- **windowsize (RFC 7440)** — `cfg.windowsize > 1` lets the sender stream + several DATA blocks before waiting for an ACK, which dramatically improves + throughput on links with latency. Capped at `WOLFTFTP_MAX_WINDOWSIZE`. +- **tsize (RFC 2349)** — the transfer size. A server advertises it from the + `size_hint` your `open` returns; a download client can compare the final byte + count against the advertised `tsize` in its `verify` step. + +The negotiated values land in `struct wolftftp_negotiated` inside the +client/server object. Set a `cfg` field to `0` to keep the built-in default for +that option. + +## 9. Firmware download pattern (hash + verify) + +The reason the I/O layer exposes `hash_update` and `verify` is firmware +delivery: stream each DATA block straight into flash, fold it into a running +hash as it arrives, and validate the whole image at the end — without ever +holding the full image in RAM. The flow: + +1. `open(is_write=1)` — unlock/erase the target flash partition, stash a handle. +2. `write(offset, buf, len)` — program bytes at `offset` into flash. +3. `hash_update(buf, len)` — feed the same bytes to a streaming hash (optional). +4. `verify(total_size)` — compare `total_size` against the advertised `tsize`, + finalize the hash / signature check, and set the boot-update flag. +5. `close(status)` — lock flash; `status == 0` means the transfer succeeded. + +`cfg.max_image_size` is a hard ceiling: the engine refuses a transfer whose +advertised or actual size would exceed it, protecting a fixed flash partition. +See `tftp_client_demo.c` for a complete STM32H5 + wolfBoot implementation, +including erase-on-demand and trailer programming. + +## 10. Interop testing with tftp-hpa + +The repository ships an interop harness that runs the wolfIP server against the +standard Linux `tftp-hpa` client (`src/test/test_tftp_interop.c`, configured by +`tools/scripts/tftpd-hpa-wolfip.conf`). 
The Linux client is driven with +`tftp -c get `, issuing an RRQ that the wolfIP server answers, +serving the fixture file through `io_read`. This is the recommended way to +validate option negotiation (blksize/windowsize/tsize) against a reference +implementation when you change the engine. + +## 11. Error codes and troubleshooting + +Negative return values and `close`/`verify` statuses use the `WOLFTFTP_ERR_*` +codes from `wolftftp.h`: + +| Code | Meaning | +|------|---------| +| `WOLFTFTP_ERR_IO` (-1000) | An `io.*` callback failed (open/read/write/flash). | +| `WOLFTFTP_ERR_STATE` (-1001) | Operation invalid for the current transfer state. | +| `WOLFTFTP_ERR_PACKET` (-1002) | Malformed TFTP packet. | +| `WOLFTFTP_ERR_TIMEOUT` (-1003) | Retries exhausted with no progress. | +| `WOLFTFTP_ERR_SIZE` (-1004) | Transfer exceeds `cfg.max_image_size`. | +| `WOLFTFTP_ERR_VERIFY` (-1005) | Final `verify` callback rejected the image. | +| `WOLFTFTP_ERR_UNSUPPORTED` (-1006) | Unsupported request/option. | +| `WOLFTFTP_ERR_TID` (-1007) | Datagram from an unexpected transfer ID/port. | +| `WOLFTFTP_ERR_NO_SLOT` (-1008) | Server session pool full. | + +Common issues: + +- **Transfer stalls / times out.** You are probably not calling + `wolftftp_*_poll(now_ms)` every loop iteration, or `now_ms` is not advancing — + the engine needs a monotonic millisecond clock to drive retransmits. +- **Bytes never reach storage.** The `write` callback returned non-zero, or the + client wired `io.read` instead of `io.write` (download = `write`). +- **`WOLFTFTP_ERR_TID`.** A reply arrived on the wrong port. Make sure you pass + the correct `local_port` to `*_receive()` for the socket the datagram came in + on, and that the transfer socket is bound. +- **Endianness garbage in addresses.** `struct wolftftp_endpoint` is host byte + order; convert with `ee16()`/`ee32()` at the wolfIP sockaddr boundary. 
+- **Server rejects a new transfer.** All `WOLFTFTP_SERVER_MAX_SESSIONS` slots + are busy; raise the limit or shorten transfers. diff --git a/docs/tls_howto.md b/docs/tls_howto.md new file mode 100644 index 00000000..6d8b8ef1 --- /dev/null +++ b/docs/tls_howto.md @@ -0,0 +1,458 @@ +# TLS (wolfSSL) How-To + +This guide shows how to run **TLS with wolfSSL on top of wolfIP sockets**: how to +build wolfIP with the wolfSSL glue, how the I/O-callback bridge lets wolfSSL read +and write through a wolfIP socket, and how to drive non-blocking client and +server handshakes from the `wolfIP_poll()` loop. + +It is a getting-started document, not a reference manual. The authoritative glue +is `src/port/wolfssl_io.c` (declared in `wolfip.h` under `WOLFSSL_WOLFIP`); the +worked examples come from `src/test/test_native_wolfssl.c` (a wolfIP TLS echo +server tested against a Linux wolfSSL client) and the bare-metal demos +`src/port/stm32h563/tls_client.c` and `src/port/stm32h563/tls_server.c`. The +wolfSSL TLS API itself (`wolfSSL_CTX_new`, `wolfSSL_connect`, `wolfSSL_read`, …) +is documented by wolfSSL; this guide only covers the wolfIP integration points. + +## Table of Contents + +- [1. What the integration provides](#1-what-the-integration-provides) +- [2. Building with wolfSSL support](#2-building-with-wolfssl-support) +- [3. The I/O callback bridge](#3-the-io-callback-bridge) +- [4. Data path: TLS over a wolfIP socket](#4-data-path-tls-over-a-wolfip-socket) +- [5. The wolfIP glue API](#5-the-wolfip-glue-api) +- [6. Setting up a TLS client](#6-setting-up-a-tls-client) +- [7. Setting up a TLS server](#7-setting-up-a-tls-server) +- [8. Non-blocking handshakes and the poll loop](#8-non-blocking-handshakes-and-the-poll-loop) +- [9. Cleanup and the static descriptor pool](#9-cleanup-and-the-static-descriptor-pool) +- [10. Troubleshooting](#10-troubleshooting) + +--- + +## 1. 
What the integration provides + +wolfSSL speaks TLS over an abstract byte transport: it never touches the network +directly, but instead calls a pair of application-supplied **I/O callbacks** to +move ciphertext on and off the wire. The wolfIP integration supplies that pair, +bound to a wolfIP TCP socket. The result is plain TLS (any version your wolfSSL +build supports — the examples use TLS 1.3) running on top of `wolfIP_sock_*` +streams, with no extra abstraction layer. + +For applications migrating from lwIP's ALTCP-over-TLS, this is the wolfIP-side +replacement point: instead of an `altcp` allocator that wraps TLS under a PCB, +you open an ordinary wolfIP socket and attach a `WOLFSSL` object to it. See +`docs/migrating_from_lwIP.md` §9 for the lwIP-to-wolfIP mapping; this document +goes deeper on the concrete wiring. + +What the glue does: + +- Registers wolfIP send/recv callbacks on a `WOLFSSL_CTX` so every `WOLFSSL` + object created from that context reads and writes through wolfIP. +- Binds an individual `WOLFSSL` session to a specific wolfIP socket fd. +- Translates wolfIP's non-blocking `-WOLFIP_EAGAIN` into wolfSSL's + `WANT_READ`/`WANT_WRITE`, and treats every other failure as a fatal close. + +What it does **not** do: it does not manage certificates, sockets, or the TLS +state machine for you. You still create the `WOLFSSL_CTX`, load certs/keys, open +and connect/accept the wolfIP socket, and drive the handshake. The glue is only +the transport bridge. + +## 2. Building with wolfSSL support + +The integration is gated by the **`WOLFSSL_WOLFIP`** macro and lives in one +source file, `src/port/wolfssl_io.c`, which you compile in and link against +`-lwolfssl`. + +With the top-level `Makefile`, the wolfSSL test targets already define the flag +and link the glue. 
For example the wolfIP TLS echo-server test (`Makefile`): + +```make +build/test-wolfssl: CFLAGS += -Wno-cpp -DWOLFSSL_DEBUG -DWOLFSSL_WOLFIP +build/test-wolfssl: $(OBJ) build/test/test_native_wolfssl.o build/port/wolfssl_io.o \ + build/certs/server_key.o build/certs/ca_cert.o build/certs/server_cert.o + $(CC) $(CFLAGS) -o $@ ... -lwolfssl ... +``` + +The CMake build mirrors this (`CMakeLists.txt`): it `find_package(wolfssl)`, +compiles `src/port/wolfssl_io.c` into the target, sets +`-DWOLFSSL_WOLFIP`, and `target_link_libraries(... wolfssl)`. + +To add TLS to your own build: + +1. Build and install wolfSSL first (the examples use a TLS 1.3 configuration). +2. Compile `src/port/wolfssl_io.c` together with your application. +3. Add `-DWOLFSSL_WOLFIP` to the wolfIP/application `CFLAGS` so the declarations + in `wolfip.h` are exposed. +4. Link with `-lwolfssl`. + +When `WOLFSSL_WOLFIP` is defined, `wolfip.h` pulls in the wolfSSL headers and +declares the three glue functions (`wolfip.h`): + +```c +#ifdef WOLFSSL_WOLFIP + ... + #include + int wolfSSL_SetIO_wolfIP(WOLFSSL* ssl, int fd); + int wolfSSL_SetIO_wolfIP_CTX(WOLFSSL_CTX *ctx, struct wolfIP *s); + void wolfSSL_CleanupIO_wolfIP(WOLFSSL* ssl); +#endif /* WOLFSSL_WOLFIP */ +``` + +One compile-time knob, `MAX_WOLFIP_CTX` (default 8, in `src/port/wolfssl_io.c`), +sizes the static pools used internally — see §9. + +## 3. The I/O callback bridge + +wolfSSL calls a *receive* callback when it needs more ciphertext and a *send* +callback when it has ciphertext to emit. 
The glue implements both as thin +wrappers over `wolfIP_sock_recv()` / `wolfIP_sock_send()` +(`src/port/wolfssl_io.c`): + +```c +static int wolfIP_io_recv(WOLFSSL* ssl, char* buf, int sz, void* ctx) +{ + struct wolfip_io_desc *desc = (struct wolfip_io_desc *)ctx; + int ret; + (void)ssl; + + if (!desc || !desc->stack) + return WOLFSSL_CBIO_ERR_GENERAL; + + ret = wolfIP_sock_recv(desc->stack, desc->fd, buf, sz, 0); + /* Only -WOLFIP_EAGAIN means "would block" ... A -1 is the "not + * established" / torn-down case and must be reported as a fatal close. */ + if (ret == -WOLFIP_EAGAIN) + return WOLFSSL_CBIO_ERR_WANT_READ; + if (ret <= 0) + return WOLFSSL_CBIO_ERR_CONN_CLOSE; + return ret; +} +``` + +The send callback is symmetric: `-WOLFIP_EAGAIN` (TX buffer full, nothing +queued) maps to `WOLFSSL_CBIO_ERR_WANT_WRITE`, any other non-positive return +maps to `WOLFSSL_CBIO_ERR_CONN_CLOSE`, and a positive return is the byte count. + +The `ctx` argument is a `struct wolfip_io_desc { int fd; struct wolfIP *stack; }` +— the per-session binding of a wolfSSL object to one wolfIP socket on one stack. +This is the crux of the non-blocking model: **the glue never blocks.** When the +socket has no data (or no TX room), it returns `WANT_READ`/`WANT_WRITE` so +wolfSSL unwinds, and you re-drive it on the next poll iteration (§8). + +The error mapping is deliberate. `-WOLFIP_EAGAIN` is the *only* "try again +later" signal; any other negative or zero return is the connection being gone, +and is reported as a hard close so wolfSSL does not spin forever retrying a dead +socket and leaking its session. + +## 4. 
Data path: TLS over a wolfIP socket + +```text + app: wolfSSL_write(ssl, plaintext) + │ (TLS record layer encrypts) + ▼ + wolfIP_io_send ──▶ wolfIP_sock_send(stack, fd, ciphertext) + │ │ + │ -WOLFIP_EAGAIN ─▶ WANT_WRITE ▼ (queued; flushed by wolfIP_poll) + ▲ TCP / IP / Ethernet ──▶ wire + │ + app: wolfSSL_read(ssl, plaintext) + │ (TLS record layer decrypts) + ▼ + wolfIP_io_recv ◀── wolfIP_sock_recv(stack, fd, ciphertext) + │ ▲ + │ -WOLFIP_EAGAIN ─▶ WANT_READ │ + wire ──▶ Ethernet / IP / TCP +``` + +The application sees plaintext via `wolfSSL_read`/`wolfSSL_write`; wolfIP sees +ciphertext via ordinary `wolfIP_sock_recv`/`wolfIP_sock_send`. Actual transmit +and receive progress on the socket happens inside `wolfIP_poll()`, exactly as +for a non-TLS socket — TLS adds the record layer on top but changes nothing +about how the byte stream is pumped. + +## 5. The wolfIP glue API + +Three functions, all in `src/port/wolfssl_io.c`: + +| Function | When to call | Effect | +|----------|--------------|--------| +| `wolfSSL_SetIO_wolfIP_CTX(ctx, stack)` | once, after `wolfSSL_CTX_new`, before any `wolfSSL_new` | Installs the wolfIP send/recv callbacks on the CTX and records which `struct wolfIP *` stack this CTX uses. | +| `wolfSSL_SetIO_wolfIP(ssl, fd)` | once per session, after `wolfSSL_new`, before the handshake | Binds this `WOLFSSL` object to wolfIP socket `fd` (resolving the stack from the CTX) by setting its read/write I/O contexts. | +| `wolfSSL_CleanupIO_wolfIP(ssl)` | on every teardown path, before `wolfSSL_free` | Releases the static descriptor slot allocated by `wolfSSL_SetIO_wolfIP()`. | + +Return values: `wolfSSL_SetIO_wolfIP_CTX` returns `0`. `wolfSSL_SetIO_wolfIP` +returns `0` on success and `-1` on a bad argument or an exhausted descriptor pool +(it returns `WOLFSSL_CBIO_ERR_GENERAL` if the CTX has no registered stack — i.e. +you forgot the `_CTX` call). Always check it for `0`. 
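The `-1` "pool exhausted" case is easier to reason about with a toy model. The following is a minimal sketch, not the code in `src/port/wolfssl_io.c`: the names `demo_pool`, `demo_bind`, and `demo_release` are hypothetical, and the only facts taken from this guide are that the bindings live in a fixed-size static array (default size 8) and that a slot stays occupied until it is released explicitly.

```c
#include <stddef.h>

#define DEMO_MAX_DESC 8   /* assumption: mirrors the documented MAX_WOLFIP_CTX default */

/* Hypothetical stand-in for one {fd, stack} binding slot. */
struct demo_desc {
    int in_use;
    int fd;
};

static struct demo_desc demo_pool[DEMO_MAX_DESC];

/* Claim a free slot for fd. Returns the slot index, or -1 when fd is
 * invalid or every slot is already taken (the "exhausted pool" case). */
static int demo_bind(int fd)
{
    size_t i;

    if (fd < 0)
        return -1;
    for (i = 0; i < DEMO_MAX_DESC; i++) {
        if (!demo_pool[i].in_use) {
            demo_pool[i].in_use = 1;
            demo_pool[i].fd = fd;
            return (int)i;
        }
    }
    return -1;
}

/* Give a slot back; this plays the role of the explicit cleanup call. */
static void demo_release(int slot)
{
    if (slot >= 0 && slot < DEMO_MAX_DESC)
        demo_pool[slot].in_use = 0;
}
```

In this model the ninth bind fails until some earlier slot is released, which is the behaviour to expect when a teardown path skips `wolfSSL_CleanupIO_wolfIP()`.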
+ +The order matters and is the same for client and server: + +```c +wolfSSL_SetIO_wolfIP_CTX(ctx, stack); /* once per CTX */ +... +ssl = wolfSSL_new(ctx); +wolfSSL_SetIO_wolfIP(ssl, sockfd); /* once per WOLFSSL session */ +``` + +`wolfSSL_SetIO_wolfIP_CTX` must precede `wolfSSL_SetIO_wolfIP`, because the +per-session call looks up the stack that was registered against the CTX. + +## 6. Setting up a TLS client + +A client is: create a TLS-client `WOLFSSL_CTX`, configure verification and CAs, +register the wolfIP callbacks on the CTX, open and connect a wolfIP TCP socket, +then per connection create a `WOLFSSL` object, bind it to the fd, and drive +`wolfSSL_connect()` to completion. Condensed from +`src/port/stm32h563/tls_client.c`: + +```c +/* 1. Library + client context (TLS 1.3) */ +wolfSSL_Init(); +client.ctx = wolfSSL_CTX_new(wolfTLSv1_3_client_method()); + +/* 2. Verification policy. The demo disables it for testing; in production + * load a CA and keep verification on. */ +wolfSSL_CTX_set_verify(client.ctx, WOLFSSL_VERIFY_NONE, NULL); +/* production: wolfSSL_CTX_load_verify_buffer(ctx, ca_der, ca_der_len, + * SSL_FILETYPE_ASN1); and WOLFSSL_VERIFY_PEER */ + +/* 3. 
Register the wolfIP I/O callbacks on the CTX */ +wolfSSL_SetIO_wolfIP_CTX(client.ctx, stack); +``` + +Then per connection, open and connect a non-blocking wolfIP socket +(`tls_client.c`): + +```c +client.fd = wolfIP_sock_socket(client.stack, AF_INET, IPSTACK_SOCK_STREAM, 0); + +memset(&addr, 0, sizeof(addr)); +addr.sin_family = AF_INET; +addr.sin_port = ee16(port); +addr.sin_addr.s_addr = ee32(client.server_ip); + +ret = wolfIP_sock_connect(client.stack, client.fd, + (struct wolfIP_sockaddr *)&addr, sizeof(addr)); +/* -WOLFIP_EAGAIN means the connect is in progress: keep polling */ +``` + +Once the TCP connection is established, create the session, optionally set SNI, +bind it to the socket, and run the handshake (`tls_client.c`): + +```c +client.ssl = wolfSSL_new(client.ctx); + +/* Server Name Indication, required by most public servers */ +wolfSSL_UseSNI(client.ssl, WOLFSSL_SNI_HOST_NAME, host, host_len); + +ret = wolfSSL_SetIO_wolfIP(client.ssl, client.fd); /* bind to the wolfIP fd */ +if (ret != 0) { /* setup error */ } + +/* Drive the handshake non-blocking (see section 8) */ +ret = wolfSSL_connect(client.ssl); +if (ret != WOLFSSL_SUCCESS) { + int err = wolfSSL_get_error(client.ssl, ret); + if (err == WOLFSSL_ERROR_WANT_READ || err == WOLFSSL_ERROR_WANT_WRITE) { + /* handshake in progress: return and retry on the next poll cycle */ + } +} +``` + +After `wolfSSL_connect()` returns `WOLFSSL_SUCCESS`, use `wolfSSL_read()` / +`wolfSSL_write()` for application data — same `WANT_READ`/`WANT_WRITE` handling +applies. + +## 7. Setting up a TLS server + +A server adds an accept loop: a listening wolfIP socket, and one `WOLFSSL` +object per accepted connection. Create a TLS-server `WOLFSSL_CTX`, register the +wolfIP callbacks on it, load the certificate and private key, then `bind`, +`listen`, and accept. 
From `src/test/test_native_wolfssl.c`: + +```c +server_ctx = wolfSSL_CTX_new(wolfTLSv1_3_server_method()); + +/* Register the wolfIP I/O callbacks on the CTX */ +wolfSSL_SetIO_wolfIP_CTX(server_ctx, s); + +/* Load the server certificate and private key (DER here; PEM also works) */ +wolfSSL_CTX_use_certificate_buffer(server_ctx, server_der, server_der_len, + SSL_FILETYPE_ASN1); +wolfSSL_CTX_use_PrivateKey_buffer(server_ctx, server_key_der, server_key_der_len, + SSL_FILETYPE_ASN1); + +listen_fd = wolfIP_sock_socket(s, AF_INET, IPSTACK_SOCK_STREAM, 0); +wolfIP_register_callback(s, listen_fd, server_cb, s); +/* ... wolfIP_sock_bind() + wolfIP_sock_listen() ... */ +``` + +On a readable event on the listening socket, accept and bind a fresh `WOLFSSL` +object to the new fd (`test_native_wolfssl.c`, `server_cb`): + +```c +if ((fd == listen_fd) && (event & CB_EVENT_READABLE) && (client_fd == -1)) { + client_fd = wolfIP_sock_accept((struct wolfIP *)arg, listen_fd, NULL, NULL); + if (client_fd > 0) { + server_ssl = wolfSSL_new(server_ctx); + wolfSSL_SetIO_wolfIP(server_ssl, client_fd); + /* The first wolfSSL_read() drives the handshake; an explicit + * wolfSSL_accept() is optional. */ + } +} +``` + +`test_native_wolfssl.c` lets the first `wolfSSL_read()` trigger the handshake +implicitly. 
The bare-metal demo `src/port/stm32h563/tls_server.c` instead drives +an explicit `wolfSSL_accept()` from a per-client state machine, which is the +clearer pattern for multiple concurrent clients (`tls_server.c`, +`tls_client_cb`): + +```c +case TLS_CLIENT_STATE_HANDSHAKE: + ret = wolfSSL_accept(client->ssl); + if (ret == WOLFSSL_SUCCESS) { + client->state = TLS_CLIENT_STATE_CONNECTED; + } else { + err = wolfSSL_get_error(client->ssl, ret); + if (err != WOLFSSL_ERROR_WANT_READ && + err != WOLFSSL_ERROR_WANT_WRITE) { + /* real failure: tear the client down */ + tls_client_free(client); + } + /* WANT_READ/WANT_WRITE: handshake continues on the next callback */ + } + break; +``` + +In `tls_server.c` each accepted client gets its own slot, its own `WOLFSSL` +object, and its own per-fd wolfIP callback (`wolfIP_register_callback(stack, +client_fd, tls_client_cb, client)`), so several TLS sessions can be in flight at +once on a single stack. + +## 8. Non-blocking handshakes and the poll loop + +This is the crux of the integration. wolfIP is single-threaded and +event-driven: nothing blocks, and all socket progress happens inside +`wolfIP_poll()`. wolfSSL's handshake and data calls must therefore run in +non-blocking mode and be re-driven until they finish. + +The contract, end to end: + +1. The glue's recv/send callbacks return `WANT_READ`/`WANT_WRITE` whenever the + socket would block (`wolfIP_sock_recv`/`send` returned `-WOLFIP_EAGAIN`). +2. wolfSSL therefore returns from `wolfSSL_connect`/`accept`/`read`/`write` with + `wolfSSL_get_error()` equal to `WOLFSSL_ERROR_WANT_READ` or + `WOLFSSL_ERROR_WANT_WRITE` instead of completing. +3. Your code must treat those two as "**not an error, retry later**": return to + the event loop, call `wolfIP_poll()` (which actually moves bytes), and call + the same wolfSSL function again on the next readable/writable event. 
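The three-point contract above boils down to one decision per wolfSSL call. Here is a minimal sketch of that decision, written against stand-in enums so it compiles without wolfSSL. The `demo_*` names are hypothetical; real code would compare the result of `wolfSSL_get_error()` against `WOLFSSL_ERROR_WANT_READ`, `WOLFSSL_ERROR_WANT_WRITE`, and `WOLFSSL_ERROR_ZERO_RETURN`.

```c
/* Stand-in error kinds for illustration only; they are not wolfSSL values. */
enum demo_tls_err {
    DEMO_ERR_WANT_READ,
    DEMO_ERR_WANT_WRITE,
    DEMO_ERR_ZERO_RETURN,
    DEMO_ERR_OTHER
};

enum demo_action {
    DEMO_ADVANCE,      /* call succeeded: move to the next state */
    DEMO_RETRY_LATER,  /* would block: return to the loop and call again */
    DEMO_CLOSE_CLEAN,  /* peer sent close_notify: orderly shutdown */
    DEMO_TEAR_DOWN     /* anything else: close and free the session */
};

/* Decide what the state machine should do after one wolfSSL call.
 * call_succeeded is non-zero when the call reported success. */
static enum demo_action demo_next_action(int call_succeeded,
                                         enum demo_tls_err err)
{
    if (call_succeeded)
        return DEMO_ADVANCE;

    switch (err) {
    case DEMO_ERR_WANT_READ:
    case DEMO_ERR_WANT_WRITE:
        return DEMO_RETRY_LATER;
    case DEMO_ERR_ZERO_RETURN:
        return DEMO_CLOSE_CLEAN;
    default:
        return DEMO_TEAR_DOWN;
    }
}
```

Keeping this decision in one function makes it harder for a handshake path and a data path to drift apart in how they treat `WANT_READ`/`WANT_WRITE`.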
+ +So the loop shape is always: + +```text + for (;;) { + now_ms = monotonic_ms(); + wolfIP_poll(stack, now_ms); /* moves ciphertext, fires callbacks */ + /* in the socket callback (or your state machine): + * ret = wolfSSL_connect/accept/read/write(ssl, ...) + * if WANT_READ/WANT_WRITE -> just return, retry next iteration + * if SUCCESS -> advance state + * else -> error, tear down + */ + } +``` + +The `tls_client_poll()` state machine in `src/port/stm32h563/tls_client.c` is the +canonical shape: `CONNECTING` polls `wolfIP_sock_connect` until it stops +returning `-WOLFIP_EAGAIN`, then `HANDSHAKE` calls `wolfSSL_connect` and stays in +that state while the error is `WANT_READ`/`WANT_WRITE`, then `CONNECTED` does +`wolfSSL_read`/`wolfSSL_write`. The same `WANT_*` rule applies to application +data, not just the handshake: + +```c +ret = wolfSSL_read(client.ssl, client.rx_buf, sizeof(client.rx_buf) - 1); +if (ret > 0) { + /* got plaintext */ +} else { + err = wolfSSL_get_error(client.ssl, ret); + if (err == WOLFSSL_ERROR_ZERO_RETURN) { + /* peer sent close_notify: clean shutdown */ + } else if (err != WOLFSSL_ERROR_WANT_READ) { + /* real error */ + } + /* WANT_READ: nothing to read yet, try again next poll */ +} +``` + +> **Note.** Because the glue maps a non-`-WOLFIP_EAGAIN` failure to +> `WOLFSSL_CBIO_ERR_CONN_CLOSE`, a wolfSSL call that returns an error whose +> `wolfSSL_get_error()` is *neither* `WANT_READ`/`WANT_WRITE` *nor* +> `ZERO_RETURN` means the underlying wolfIP socket is gone — close and free the +> session rather than retrying. + +## 9. Cleanup and the static descriptor pool + +`wolfSSL_SetIO_wolfIP()` allocates a slot from a static array +(`io_descs[MAX_WOLFIP_CTX]` in `src/port/wolfssl_io.c`) to hold the +`{fd, stack}` binding for the session. That slot is **not** freed by +`wolfSSL_free()`. 
You must release it explicitly with +`wolfSSL_CleanupIO_wolfIP()` on **every** teardown path, before `wolfSSL_free`, +or the pool leaks one slot per connection and is exhausted after +`MAX_WOLFIP_CTX` sessions. + +The correct order (`src/port/stm32h563/tls_server.c`, `tls_client_free`): + +```c +if (client->ssl) { + wolfSSL_shutdown(client->ssl); + wolfSSL_CleanupIO_wolfIP(client->ssl); /* release the io_descs[] slot */ + wolfSSL_free(client->ssl); + client->ssl = NULL; +} +if (client->fd >= 0) { + wolfIP_sock_close(server.stack, client->fd); + client->fd = -1; +} +``` + +`MAX_WOLFIP_CTX` (default 8) caps both the number of registered `WOLFSSL_CTX` +stacks and the number of *simultaneously live* TLS sessions. If you need more +concurrent connections, raise it at compile time and ensure the cleanup call is +on every path (handshake failure, read error, normal close). + +## 10. Troubleshooting + +**`wolfSSL_SetIO_wolfIP()` returns `WOLFSSL_CBIO_ERR_GENERAL`.** The CTX has no +registered stack — you called it before `wolfSSL_SetIO_wolfIP_CTX(ctx, stack)`, +or on a CTX that was never registered. Register the CTX first. + +**`wolfSSL_SetIO_wolfIP()` returns `-1`.** Either `ssl` is NULL / `fd < 0`, or +the static descriptor pool is full. The pool fills when sessions are torn down +without `wolfSSL_CleanupIO_wolfIP()` (§9) — audit every close path — or you have +more than `MAX_WOLFIP_CTX` live sessions; raise the limit. + +**The handshake never completes / stalls in `WANT_READ` or `WANT_WRITE`.** You +are not re-driving wolfSSL after polling. `WANT_READ`/`WANT_WRITE` is normal and +expected on every non-blocking handshake; the call must be retried on the next +`wolfIP_poll()` iteration when the socket is readable/writable. Make sure +`wolfIP_poll()` is being called in the loop and that `now_ms` advances — without +it no bytes ever move and the handshake hangs forever. 
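If `now_ms` is the suspect, a quick check is to print it across a few iterations. On a POSIX host a monotonic millisecond source can be sketched as below; the helper name `monotonic_ms` follows the loop pseudocode in §8 and is not a wolfIP function. A bare-metal target would typically read a hardware tick counter instead (an assumption, since that code is board-specific).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Milliseconds from a clock that never jumps backwards (unlike wall time). */
static uint64_t monotonic_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}
```

Pass the returned value to `wolfIP_poll()` on every iteration; a value that is constant or derived from a clock that can step backwards is a likely cause of stalled retransmits.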
+ +**The connection dies immediately mid-handshake.** A wolfSSL error whose +`wolfSSL_get_error()` is not `WANT_READ`/`WANT_WRITE`/`ZERO_RETURN` means the +glue reported `CONN_CLOSE` — the wolfIP socket was reset or never established. +Confirm the TCP `connect`/`accept` actually completed (stop returning +`-WOLFIP_EAGAIN`) *before* the first `wolfSSL_connect`/`accept`. + +**Server: "no certificate" / handshake fails after ClientHello.** The server +CTX has no cert/key. Load both with +`wolfSSL_CTX_use_certificate_buffer()` and `wolfSSL_CTX_use_PrivateKey_buffer()` +(or the file variants) *before* accepting, and match `SSL_FILETYPE_ASN1` (DER) +vs `WOLFSSL_FILETYPE_PEM` to your buffer format. + +**Client: certificate verification fails.** Load the issuing CA with +`wolfSSL_CTX_load_verify_buffer()` and keep `WOLFSSL_VERIFY_PEER`. The +`tls_client.c` demo uses `WOLFSSL_VERIFY_NONE` for bring-up only — do not ship +that. Also set SNI with `wolfSSL_UseSNI()`; most public servers require it. + +**Link errors on `wolfSSL_*`.** `src/port/wolfssl_io.c` was not compiled with +`-DWOLFSSL_WOLFIP`, or you did not link `-lwolfssl`. See §2. diff --git a/docs/wolfguard_howto.md b/docs/wolfguard_howto.md new file mode 100644 index 00000000..a7cfb977 --- /dev/null +++ b/docs/wolfguard_howto.md @@ -0,0 +1,412 @@ +# wolfGuard How-To + +This guide shows how to use the wolfIP **wolfGuard** module (`src/wolfguard/`): +a FIPS-compliant implementation of the WireGuard VPN protocol that runs entirely +inside the wolfIP stack. It covers how to build it, how to provision keys and +peers, how a tunnel is established, how traffic flows through it, and how to +interoperate with the Linux kernel wolfGuard module for testing. + +It is a getting-started document, not a reference manual. 
The authoritative API +is `src/wolfguard/wolfguard.h`; the worked examples come from +`src/test/test_wolfguard_loopback.c` (two wolfIP stacks back-to-back) and +`src/test/test_wolfguard_interop.c` plus +`tools/scripts/test-interop-wolfguard.sh` (wolfIP against the kernel module). + +## Table of Contents + +- [1. What wolfGuard is](#1-what-wolfguard-is) +- [2. Building with wolfGuard](#2-building-with-wolfguard) +- [3. Mental model: the wg0 interface and the data path](#3-mental-model-the-wg0-interface-and-the-data-path) +- [4. The public API](#4-the-public-api) +- [5. Keys, peers and allowed IPs](#5-keys-peers-and-allowed-ips) +- [6. Step by step: bringing up a tunnel](#6-step-by-step-bringing-up-a-tunnel) +- [7. How a handshake is established and how traffic flows](#7-how-a-handshake-is-established-and-how-traffic-flows) +- [8. Interop testing against the kernel wolfGuard module](#8-interop-testing-against-the-kernel-wolfguard-module) +- [9. Troubleshooting](#9-troubleshooting) + +--- + +## 1. What wolfGuard is + +wolfGuard is a native wolfIP driver implementing the WireGuard tunnel protocol +on top of wolfSSL/wolfCrypt FIPS-certified primitives. It is WireGuard with the +cryptography replaced by FIPS-approved equivalents, so the construction string +is `Noise_IKpsk2_SECP256R1_AesGcm_SHA256` rather than the original +`Noise_IKpsk2_25519_ChaChaPoly_BLAKE2s` (from `src/wolfguard/wolfguard.h`): + +| WireGuard primitive | wolfGuard FIPS replacement | +|---|---| +| Curve25519 | ECDH with SECP256R1 (P-256) | +| ChaCha20-Poly1305 | AES-256-GCM | +| BLAKE2s | SHA-256 | +| BLAKE2s-HMAC | HMAC-SHA-256 | + +wolfGuard peers interoperate with other wolfGuard instances, including the +[wolfGuard kernel module](https://github.com/wolfssl/wolfguard). + +Supported: + +- The full Noise IKpsk2 1-RTT handshake (initiation / response / cookie / data + messages — `WG_MSG_INITIATION`, `WG_MSG_RESPONSE`, `WG_MSG_COOKIE`, + `WG_MSG_DATA`). 
+- An optional pre-shared key (PSK) per peer for the IKpsk2 mixing step. +- Static, allocation-free state: a fixed peer table (`WOLFGUARD_MAX_PEERS`), a + flat allowed-IPs table (`WOLFGUARD_MAX_ALLOWED_IPS`), and a per-peer replay + window (`WOLFGUARD_COUNTER_WINDOW` bits). +- Cookie-based DoS mitigation (mac1/mac2 + cookie reply) when under load. +- The WireGuard timer state machine: rekey-on-time/messages, reject-after-time, + persistent keepalive, and handshake retry with jitter. +- Per-peer staging of plaintext packets while a handshake is in flight + (`WOLFGUARD_STAGED_PACKETS`). + +**Not** supported / out of scope: interoperability with upstream Curve25519 +WireGuard; IPv6 inner traffic (the TX path parses an IPv4 header only); dynamic +allocation; and IKE-style negotiation of the cipher suite (the suite is fixed by +the FIPS construction). + +## 2. Building with wolfGuard + +wolfGuard requires a wolfSSL built with wolfGuard support: + +```sh +./configure --enable-wolfguard +make +sudo make install +``` + +In the wolfIP tree, wolfGuard is a separately-compiled module. The Makefile +defines its sources and flags (`Makefile`): + +```make +WOLFGUARD_CFLAGS = -DWOLFGUARD -Wno-cpp +WOLFGUARD_SRC := src/wolfguard/wolfguard.c \ + src/wolfguard/wg_noise.c \ + src/wolfguard/wg_crypto.c \ + src/wolfguard/wg_cookie.c \ + src/wolfguard/wg_allowedips.c \ + src/wolfguard/wg_packet.c \ + src/wolfguard/wg_timers.c +``` + +The whole module is wrapped in `#ifdef WOLFGUARD`, so `-DWOLFGUARD` is the master +switch. 
The pre-wired Makefile targets build and exercise it: + +```sh +make unit-wolfguard # unit tests +make test-wolfguard-loopback # two-stack loopback integration test +make bench-wolfguard # micro-benchmarks +make test-wolfguard-interop # interop binary (driven by the script in §8) +``` + +The unit and loopback tests also have AddressSanitizer / UBSan variants +(`unit-wolfguard-asan`, `unit-wolfguard-ubsan`, +`test-wolfguard-loopback-asan`, `test-wolfguard-loopback-ubsan`). + +### Compile-time configuration + +These tunables default in `src/wolfguard/wolfguard.h` and can be overridden in +`config.h` or via `-D`: + +| Macro | Default | Meaning | +|-------|---------|---------| +| `WOLFGUARD_MAX_PEERS` | 8 | Max peers per device (`struct wg_device.peers[]`). | +| `WOLFGUARD_MAX_ALLOWED_IPS` | 32 | Max allowed-IP table entries. | +| `WOLFGUARD_STAGED_PACKETS` | 16 | Plaintext packets queued per peer during a handshake. | +| `WOLFGUARD_COUNTER_WINDOW` | 1024 | Anti-replay window size, in bits. | + +Fixed protocol constants (also from `wolfguard.h`) follow the FIPS suite: +`WG_PUBLIC_KEY_LEN` is 65 (uncompressed P-256 point), `WG_PRIVATE_KEY_LEN` is 32, +`WG_SYMMETRIC_KEY_LEN` is 32 (AES-256), `WG_AUTHTAG_LEN` is 16 (AES-GCM tag), and +`WG_HASH_LEN` is 32 (SHA-256). + +## 3. Mental model: the wg0 interface and the data path + +wolfGuard plugs into wolfIP as an **L3 virtual interface** (conventionally +`wg0`). `wolfguard_init()` claims one of the stack's interface slots, marks it +`non_ethernet`, and installs send/poll callbacks (from +`src/wolfguard/wolfguard.c`). It also opens one wolfIP UDP socket bound to the +listen port — that socket is the *outer* transport that carries encrypted +WireGuard messages between endpoints. + +Two interfaces are therefore in play: + +- A **physical interface** (Ethernet or another `non_ethernet` link) carrying the + outer UDP/IP traffic to the remote endpoint. +- The **wg0 interface** carrying *inner* plaintext IP traffic. 
Applications send + to tunnel IPs (e.g. `10.0.0.2`) through ordinary `wolfIP_sock_*` calls; wolfIP + routes those packets out of `wg0`, where wolfGuard encrypts them. + +```text + app sendto(10.0.0.2) ─▶ wolfIP routing ─▶ wg0.send ─▶ wolfguard_output() + │ allowed-IP lookup + ▼ + wg_packet_send() ── encrypt ──┐ + ▼ + outer UDP socket ◀───────────────────── wolfIP_sock_sendto + │ (to peer endpoint) + ── wire (outer IP/UDP) ─────┘ + + wire ─▶ outer UDP socket ─▶ wg_udp_callback() ─▶ wg_packet_receive() + │ decrypt + replay check + ▼ + wolfIP_recv_ex() into wg0 ─▶ app recv() +``` + +Two entry points drive the module from your loop, alongside `wolfIP_poll()`: + +- **RX is push-based.** When an outer UDP datagram arrives, wolfIP invokes the + registered `wg_udp_callback`, which drains the socket and calls + `wg_packet_receive()`. Decrypted inner packets are injected straight into `wg0` + via `wolfIP_recv_ex()`. The `wg0` poll callback returns `0` (no spontaneous + data) by design — there is nothing to poll because all RX is push-based. +- **Timers are poll-based.** You call `wolfguard_poll(dev, now_ms)` once per loop + iteration; it updates the device clock, evaluates the `under_load` flag, and + ticks the timer state machine (`wg_timers_tick()`). + +The application never sees WireGuard framing — it speaks plain UDP/TCP to tunnel +IPs and wolfGuard does the rest transparently inside the poll loop. + +## 4. 
The public API + +All public functions are declared in `src/wolfguard/wolfguard.h` and implemented +in `src/wolfguard/wolfguard.c`: + +```c +int wolfguard_init(struct wg_device *dev, struct wolfIP *stack, + unsigned int wg_if_idx, uint16_t listen_port); +int wolfguard_set_private_key(struct wg_device *dev, + const uint8_t *private_key); +int wolfguard_add_peer(struct wg_device *dev, + const uint8_t *public_key, + const uint8_t *preshared_key, + uint32_t endpoint_ip, uint16_t endpoint_port, + uint16_t persistent_keepalive); +int wolfguard_add_allowed_ip(struct wg_device *dev, int peer_idx, + uint32_t ip, uint8_t cidr); +void wolfguard_poll(struct wg_device *dev, uint64_t now_ms); +int wolfguard_output(struct wg_device *dev, const uint8_t *packet, size_t len); +void wolfguard_destroy(struct wg_device *dev); +``` + +- `wolfguard_init` zeroes `dev`, initialises its RNG, configures `wg_if_idx` as + the `wg0` L3 interface (and sets its MTU to `LINK_MTU - 60` to leave room for + the outer IP/UDP/WireGuard overhead), opens the outer UDP socket, binds it to + `listen_port`, and registers the RX callback. Returns `0` on success, `-1` on + failure. +- `wolfguard_set_private_key` stores the 32-byte private key and derives the + device's 65-byte public key (`wg_pubkey_from_private`). It must be called + before adding peers; calling it again rotates the identity and drops live + session keys and staged packets for all peers (preserving each peer's PSK). +- `wolfguard_add_peer` registers a peer by its 65-byte public key and returns the + peer index (`>= 0`) or `-1` if the table is full. `preshared_key` may be `NULL` + (no PSK). `endpoint_ip` / `endpoint_port` are in **network byte order**. + `persistent_keepalive` is in seconds (`0` disables it). +- `wolfguard_add_allowed_ip` binds an IP/CIDR range to a peer in the allowed-IPs + table. `ip` is in **network byte order**, `cidr` is the prefix length 0–32. 
+- `wolfguard_output` is the `wg0` send hook: it parses the inner IPv4 header, + looks up the destination IP in the allowed-IPs table to find the peer, and + hands the packet to `wg_packet_send()`. You normally do not call it directly — + wolfIP routing invokes it through the interface `send` callback. +- `wolfguard_poll` drives the per-device timers; call it each loop after + `wolfIP_poll()`. +- `wolfguard_destroy` zeroes all key material, closes the outer socket, and frees + the RNG. + +The lower-level helpers (Noise handshake in `wg_noise.c`, crypto in +`wg_crypto.c`, cookie/DoS in `wg_cookie.c`, allowed-IPs in `wg_allowedips.c`, +packet processing in `wg_packet.c`, timers in `wg_timers.c`) are also declared in +`wolfguard.h`, but the seven functions above are the supported integration +surface; the rest are driven internally. + +## 5. Keys, peers and allowed IPs + +A wolfGuard identity is a P-256 key pair. The **private key** is 32 raw bytes +(`WG_PRIVATE_KEY_LEN`); the **public key** is a 65-byte uncompressed SECP256R1 +point (`WG_PUBLIC_KEY_LEN`). Generate a private key from the device RNG and let +wolfGuard derive the public key: + +```c +uint8_t priv[WG_PRIVATE_KEY_LEN]; +wc_RNG_GenerateBlock(&rng, priv, WG_PRIVATE_KEY_LEN); +wolfguard_set_private_key(&dev, priv); +/* dev.static_public now holds the 65-byte public key to share with peers */ +``` + +Each peer is identified by its public key and an **endpoint** — the outer IP and +UDP port where its WireGuard messages are sent. The **allowed-IPs** table maps +inner tunnel IP ranges to peers and serves two purposes (mirroring upstream +WireGuard): on TX it selects which peer encrypts a packet, and on RX it is the +cryptokey-routing source-address check. + +Endpoint and allowed-IP addresses are passed in **network byte order**. 
In the +wolfIP examples that means wrapping the host-order helper with `ee32()`/`ee16()`: + +```c +/* from src/test/test_wolfguard_loopback.c */ +peer_idx = wolfguard_add_peer(&wg_dev_a, wg_dev_b.static_public, NULL, + ee32(MAKE_IP4(192,168,1,2)), /* endpoint IP */ + ee16(51820), 0); /* endpoint port */ +wolfguard_add_allowed_ip(&wg_dev_a, peer_idx, + ee32(MAKE_IP4(10,0,0,0)), 24); /* 10.0.0.0/24 */ +``` + +## 6. Step by step: bringing up a tunnel + +The canonical end-to-end setup is `setup_loopback_stacks()` in +`src/test/test_wolfguard_loopback.c`, which connects two wolfIP stacks +back-to-back. Condensed to one side: + +```c +/* 1. Bring up wolfIP and a physical (outer) interface as usual */ +wolfIP_init(&stack_a); +ll = wolfIP_getdev_ex(&stack_a, TEST_PHYS_IF); +ll->non_ethernet = 1; +ll->poll = phys_a_poll; +ll->send = phys_a_send; +wolfIP_ipconfig_set_ex(&stack_a, TEST_PHYS_IF, + MAKE_IP4(192,168,1,1), MAKE_IP4(255,255,255,0), 0); + +/* 2. Initialise wolfGuard on a second interface (wg0), listen UDP port 51820 */ +wolfguard_init(&wg_dev_a, &stack_a, TEST_WG_IF, 51820); + +/* 3. Provision our identity */ +wc_RNG_GenerateBlock(&test_rng, priv_a, WG_PRIVATE_KEY_LEN); +wolfguard_set_private_key(&wg_dev_a, priv_a); + +/* 4. Give the wg0 interface its tunnel IP */ +wolfIP_ipconfig_set_ex(&stack_a, TEST_WG_IF, + MAKE_IP4(10,0,0,1), MAKE_IP4(255,255,255,0), 0); + +/* 5. Add the remote peer (public key + endpoint) and its allowed inner range */ +peer_idx = wolfguard_add_peer(&wg_dev_a, wg_dev_b.static_public, NULL, + ee32(MAKE_IP4(192,168,1,2)), ee16(51820), 0); +wolfguard_add_allowed_ip(&wg_dev_a, peer_idx, + ee32(MAKE_IP4(10,0,0,0)), 24); +``` + +The peer on the other end (`wg_dev_b`) installs the mirror image: it adds A's +public key, A's endpoint, and the same allowed-IP range. Public keys must be +exchanged out of band — each side passes the *other's* `static_public`. 
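Both sides can share the same `10.0.0.0/24` range because an allowed-IP entry is a prefix, not a single host. The sketch below illustrates the prefix test only; it is not the lookup in `wg_allowedips.c`, the `demo_*` helpers are hypothetical, and it works in host byte order for readability, whereas the wolfGuard API takes network byte order.

```c
#include <stdint.h>

/* Hypothetical helper: pack four octets into a host-order IPv4 value. */
static uint32_t demo_ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Returns 1 if addr lies inside net/cidr, 0 otherwise (host byte order). */
static int demo_prefix_match(uint32_t addr, uint32_t net, uint8_t cidr)
{
    uint32_t mask;

    if (cidr > 32)
        return 0;
    /* A /0 prefix matches everything; avoid an undefined 32-bit shift. */
    mask = (cidr == 0) ? 0u : (0xFFFFFFFFu << (32 - cidr));
    return (addr & mask) == (net & mask);
}
```

With this test, `10.0.0.1` and `10.0.0.2` both fall inside `10.0.0.0/24`, while `10.0.1.2` does not.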
+ +Then run both pollers in your event loop, advancing a monotonic millisecond +clock (`pump_stacks()` in the test): + +```c +wolfIP_poll(&stack_a, now); +wolfguard_poll(&wg_dev_a, now); +``` + +From here, application sockets bound to the `wg0` tunnel IPs send and receive +normally — the test sends a UDP datagram from `10.0.0.1` to `10.0.0.2:7777` with +an ordinary `wolfIP_sock_sendto()`, and it arrives decrypted on the far side. + +## 7. How a handshake is established and how traffic flows + +wolfGuard handshakes are **demand-driven**. There is no explicit "connect" call; +the first packet to a peer triggers the Noise exchange (from +`src/wolfguard/wg_packet.c`, `wg_packet_send()`): + +1. An application sends to a tunnel IP. wolfIP routes the inner packet out + `wg0`, calling `wolfguard_output()`, which resolves the peer via the + allowed-IPs table and calls `wg_packet_send()`. +2. If the peer has no valid session keypair, the plaintext packet is **staged** + (queued, up to `WOLFGUARD_STAGED_PACKETS`) and, if the handshake is idle, a + `WG_MSG_INITIATION` is created (`wg_noise_create_initiation`) and sent to the + peer's endpoint over the outer UDP socket. +3. The responder consumes the initiation, replies with `WG_MSG_RESPONSE`, and + both sides derive the session keypair (`wg_noise_begin_session`) — the 1-RTT + IKpsk2 handshake. +4. Once the session exists, staged packets are flushed + (`wg_packet_send_staged`), encrypted as `WG_MSG_DATA` with AES-256-GCM, and + sent. Subsequent packets encrypt directly. +5. Inbound `WG_MSG_DATA` is decrypted, replay-checked against the per-peer + counter window (`wg_counter_validate`), and injected into `wg0`. 
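The replay check in step 5 can be pictured as a sliding bitmap over recently seen counters. The sketch below is an illustration of that general technique only. It is not the code behind `wg_counter_validate`; the 64-entry window size and all `demo_*` names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_WINDOW_BITS 64u   /* assumed window size, for illustration */

struct demo_replay_window {
    uint64_t highest;  /* highest counter accepted so far */
    uint64_t bitmap;   /* bit i set => counter (highest - i) was seen */
    int      started;  /* 0 until the first counter is accepted */
};

/* Returns 1 if the counter is fresh (and records it), 0 if it is a
 * replay or too old to fit in the window. */
static int demo_replay_check(struct demo_replay_window *w, uint64_t counter)
{
    uint64_t diff;

    assert(w != NULL);

    if (!w->started) {
        w->started = 1;
        w->highest = counter;
        w->bitmap = 1u;
        return 1;
    }

    if (counter > w->highest) {
        diff = counter - w->highest;
        /* Slide the window forward; a jump of 64 or more clears it. */
        w->bitmap = (diff >= DEMO_WINDOW_BITS) ? 0 : (w->bitmap << diff);
        w->bitmap |= 1u;
        w->highest = counter;
        return 1;
    }

    diff = w->highest - counter;
    if (diff >= DEMO_WINDOW_BITS)
        return 0;                          /* older than the window */
    if (w->bitmap & ((uint64_t)1 << diff))
        return 0;                          /* already seen: replay */
    w->bitmap |= (uint64_t)1 << diff;      /* late but new: accept */
    return 1;
}
```

A receiver would normally consult a window like this only after the packet has authenticated, so that forged counters cannot advance it. The window lets packets that arrive slightly out of order through while rejecting exact duplicates.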
+ +The timer state machine (`wg_timers_tick`, driven by `wolfguard_poll`) handles +the rest of the lifecycle: rekeying after `WG_REKEY_AFTER_TIME` (120 s) or +`WG_REKEY_AFTER_MESSAGES`, rejecting sessions older than `WG_REJECT_AFTER_TIME` +(180 s), retrying handshakes with jitter up to `WG_MAX_HANDSHAKE_ATTEMPTS`, and +sending keepalives when `persistent_keepalive` is configured. Under high +handshake load the device sets `under_load` and engages cookie-based DoS +mitigation (mac1/mac2 validation and `WG_MSG_COOKIE` replies in +`wg_cookie.c`). + +## 8. Interop testing against the kernel wolfGuard module + +The repository ships an end-to-end interop harness that validates bidirectional +tunnel connectivity between wolfIP and the Linux kernel wolfGuard module +(`tools/scripts/test-interop-wolfguard.sh`, driving the +`src/test/test_wolfguard_interop.c` binary). It requires root, kernel headers, +and network access (it clones and builds wolfSSL and the wolfGuard kernel +module): + +```sh +sudo ./tools/scripts/test-interop-wolfguard.sh +``` + +What the script does: + +1. Builds wolfSSL twice — the userspace library (`--enable-wolfguard`) and the + kernel module (`--enable-wolfguard --enable-linuxkm`) — plus the `wg-fips` + user tool and the `wolfguard.ko` kernel module from + `github.com/wolfssl/wolfguard`. +2. Generates fresh P-256 keys with `wg-fips genkey` / `wg-fips pubkey`, then + base64-decodes the wolfIP private key (32 bytes) and the kernel public key (65 + bytes) into raw `.bin` files for the wolfIP binary. +3. Builds `make test-wolfguard-interop` and launches it, which creates a TUN + device for the outer transport. +4. Creates a kernel `wg0` interface (`ip link add wg0 type wolfguard`), + configures it with `wg-fips set` (private key, listen port, the wolfIP peer + public key, endpoint, and `allowed-ips`), assigns the tunnel IP, and starts a + `socat` UDP echo server on the tunnel. + +It then runs a two-phase test (matching the README): + +1. 
**wolfIP initiates** — wolfIP creates the handshake, sends a UDP probe through + the tunnel, and verifies the echo reply. +2. **Kernel initiates** — wolfIP resets its wolfGuard state; the kernel recreates + `wg0` for a fresh handshake and sends data through the tunnel, which wolfIP + verifies. + +The default addressing the script uses: kernel `wg0` `10.0.0.1/24` listening on +51820, wolfIP `wg0` `10.0.0.2/24` listening on 51821, outer TUN +`192.168.77.1/2`, echo port 7777. Exit code `0` means all interop phases passed. + +The `test-wolfguard-interop` binary takes the two key files as arguments and is +otherwise self-contained: + +```sh +./build/test/test-wolfguard-interop +``` + +The two-stack loopback test (`make test-wolfguard-loopback`) is the quicker +inner-loop check — it needs no kernel module and exercises the full TX/RX path, +rekey, key-zeroing on reconnect, and the cookie/DoS path between two in-process +wolfIP stacks. + +## 9. Troubleshooting + +- **The handshake never completes / no data arrives.** Confirm you call + `wolfguard_poll(dev, now_ms)` every loop iteration with a *monotonic, + advancing* millisecond clock — the handshake retry and rekey logic is entirely + timer-driven. In the tests, time advances on every `pump_stacks()` step. +- **`wolfguard_output` returns `-1` / packets are silently dropped on TX.** The + destination tunnel IP did not match any allowed-IP entry, or the matched peer + is not active. Check that `wolfguard_add_allowed_ip` covers the inner + destination and that `ip`/`cidr` are correct (IP in network byte order). The TX + path also rejects anything that is not a well-formed IPv4 packet (first nibble + must be `4`, length `>= 20`). +- **Endpoint or allowed-IP addresses look byte-swapped.** `endpoint_ip`, + `endpoint_port`, and the allowed-IP `ip` are all **network byte order**. + Convert host-order values with `ee32()` / `ee16()` as the examples do. 
+- **Nothing happens when a UDP message arrives on the listen port.** The RX + callback drains *all* queued datagrams per invocation; if you reimplement the + socket plumbing, replicate that loop — reading a single packet leaves stale + messages in the FIFO that corrupt later handshakes (see `wg_udp_callback` in + `wolfguard.c`). +- **It will not talk to stock WireGuard.** Expected — wolfGuard uses the FIPS + suite (P-256 / AES-GCM / SHA-256) and is interoperable only with other + wolfGuard peers (kernel module or another wolfIP instance). +- **Inner MTU surprises.** `wolfguard_init` sets the `wg0` MTU to + `LINK_MTU - 60` to reserve the outer IP/UDP/WireGuard overhead; size inner + payloads accordingly.
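
For the allowed-IP items above, it can help to check by hand whether a destination falls inside a configured range. The following is a minimal sketch of a prefix match over network-byte-order addresses. It is a plain linear comparison written for illustration, not the lookup in `wg_allowedips.c`, and every `demo_*` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a network-order IPv4 value from its octets. */
static uint32_t demo_ip4_net(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    uint8_t wire[4] = { a, b, c, d };
    uint32_t net;

    memcpy(&net, wire, sizeof(net));
    return net;
}

/* Hypothetical helper: read a network-order value as a host-order integer
 * so that the prefix mask can be applied with ordinary shifts. */
static uint32_t demo_net_to_host32(uint32_t net)
{
    uint8_t b[4];

    memcpy(b, &net, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Returns 1 if addr_net lies inside prefix_net/cidr, else 0.
 * Both addresses are in network byte order; cidr must be 0..32. */
static int demo_allowed_ip_match(uint32_t prefix_net, uint8_t cidr,
                                 uint32_t addr_net)
{
    uint32_t mask;

    if (cidr > 32)
        return 0;
    /* Shifting a 32-bit value by 32 is undefined, so handle /0 apart. */
    mask = (cidr == 0) ? 0 : (0xFFFFFFFFu << (32 - cidr));
    return ((demo_net_to_host32(prefix_net) ^
             demo_net_to_host32(addr_net)) & mask) == 0;
}

/* Usage: 10.0.0.0/24 covers 10.0.0.2 but not 10.0.1.2. */
static void demo_allowed_ip_self_check(void)
{
    uint32_t range = demo_ip4_net(10, 0, 0, 0);

    assert(demo_allowed_ip_match(range, 24, demo_ip4_net(10, 0, 0, 2)) == 1);
    assert(demo_allowed_ip_match(range, 24, demo_ip4_net(10, 0, 1, 2)) == 0);
}
```

This also shows why a host-order prefix fails on a little-endian machine: host-order `10.0.0.0` is stored as the bytes `00 00 00 0A`, which read on the wire as `0.0.0.10`, so a `/24` built from it never matches a `10.0.0.x` destination.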