feat(igc): enable tx checksum offload by ytakano · Pull Request #694 · tier4/awkernel

ytakano · 2026-06-08T09:35:56Z

Description

Add ActiveChecksumContext enum to Tx to track and reuse context descriptors
Add igc_tx_ctx_setup() to program the Advanced TX Context Descriptor for IPv4, TCP/IPv4, and UDP/IPv4 checksum offload
Update igc_send() to prepend a context descriptor when offload context changes
Advertise CSUM_IPv4, CSUM_TCPv4, and CSUM_UDPv4 capabilities
Enable TCP checksum offload in if_net and update the capability comment

How was this PR tested?

Notes for reviewers

- Add ActiveChecksumContext enum to Tx to track and reuse context descriptors - Add igc_tx_ctx_setup() to program the Advanced TX Context Descriptor for IPv4, TCP/IPv4, and UDP/IPv4 checksum offload - Update igc_send() to prepend a context descriptor when offload context changes - Advertise CSUM_IPv4, CSUM_TCPv4, and CSUM_UDPv4 capabilities - Enable TCP checksum offload in if_net and update the capability comment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>

Copilot

Pull request overview

This PR adds IPv4/TCPv4/UDPv4 TX checksum offload support for the Intel IGC (I225/I226) driver by introducing advanced TX context descriptors and advertising the corresponding network capabilities so the stack can request hardware checksum insertion.

Changes:

Add tracking for the “active” TX checksum context and emit an Advanced TX Context Descriptor when the offload context changes.
Advertise CSUM_IPv4, CSUM_TCPv4, and CSUM_UDPv4 capabilities in the IGC driver and enable TCP checksum offload in the network interface capabilities mapping.
Extend IGC descriptor definitions/structures to support Advanced Context Descriptors.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
awkernel_lib/src/net/if_net.rs	Enables TCP checksum offload reporting to smoltcp when the device advertises `CSUM_TCPv4`.
awkernel_drivers/src/pcie/intel/igc/igc_defines.rs	Adds constants needed to build Advanced TX Context Descriptors for checksum offload.
awkernel_drivers/src/pcie/intel/igc/igc_base.rs	Extends the advanced TX descriptor union to include the context descriptor view.
awkernel_drivers/src/pcie/intel/igc.rs	Implements context descriptor setup, context reuse tracking, capability advertisement, and send-path changes to prepend context descriptors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Validated on real I225 hardware (UDP echo + TCP both work, valid on-wire checksums confirmed via tcpdump on the peer). - IP header length in the advanced context descriptor was programmed in 32-bit words instead of bytes (`header_len()` returns IHL). With the wrong length the NIC mis-placed the checksums and IP packets never egressed correctly; shift IHL left by 2 to get bytes (matches the igb driver). - Seed the IPv4 pseudo-header checksum into the L4 checksum field before DMA. smoltcp leaves it zero when TX checksum is offloaded, so without the seed the NIC computed segment-only checksums and every TCP/UDP checksum was wrong on the wire. Add `igc_pseudo_cksum()` and write it at the L4 csum offset. - Restore `active_checksum_context` when `igc_send()` bails out due to insufficient descriptors, so the software context cannot desync from what the hardware actually saw (addresses the review's state-desync concern). - Reclaim context descriptors in `igc_txeof()`: they carry no DD writeback, so the per-descriptor DD scan would stall at the first one and eventually wedge TX once the ring filled. Treat a non-DD descriptor whose successor is done as a consumed context descriptor. - Derive `Copy, Clone` for `ActiveChecksumContext` to support the snapshot. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Factor the L4 checksum seed `(usize, u16)` into a `L4CksumSeed` type alias so `igc_tx_ctx_setup`'s return type is no longer flagged by clippy::type_complexity. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

atsushi421 · 2026-06-10T20:22:25Z

+        let mut vlan_macip_lens = 0u32;
+        let mut offload = false;
+
+        vlan_macip_lens |= (core::mem::size_of::<EtherHeader>() as u32) << IGC_ADVTXD_MACLEN_SHIFT;


Inline 802.1Q VLAN tags pass through extract_headers() and are still classified as IPv4/TCP/UDP, but MACLEN is hardcoded to size_of::<EtherHeader>() (14) and l4_off uses ETHER_HDR_LEN. For a frame carrying an 802.1Q tag the real L2 header is 18 bytes, so the NIC computes the L3/L4 checksum over the wrong range and the pseudo-header seed lands at the wrong offset.

Suggest deriving the L2 length from ext.ether_vlan.is_some() (14 vs 18) and using it for both vlan_macip_lens MACLEN and l4_off.

atsushi421 · 2026-06-10T20:22:25Z


+    /// Compute the IPv4 pseudo-header checksum that must be seeded into the L4
+    /// checksum field so the NIC can complete TCP/UDP checksum offload.
+    fn igc_pseudo_cksum(ip: &Ip) -> u16 {


igc_pseudo_cksum() and the seed offset computation assume the L4 header lives at ETHER_HDR_LEN + iphlen, but for any IP fragment (non-zero fragment offset or MF=1) the L4 header is not there. The seed is written into payload bytes and the NIC then "completes" a checksum over the wrong field, producing broken packets on the wire.

Suggest bailing out of offload when (ip.ip_off.swap_bytes() & 0x3fff) != 0 and falling back to the software path.

atsushi421 · 2026-06-10T20:22:25Z

+
+        vlan_macip_lens |= (core::mem::size_of::<EtherHeader>() as u32) << IGC_ADVTXD_MACLEN_SHIFT;
+
+        let NetworkHdr::Ipv4(ip) = &ext.network else {


tx_packet_header_flags() sets TCP_CSUM_OUT/UDP_CSUM_OUT for any TCP/UDP transport regardless of network family, but the new capability is CSUM_TCPv4/CSUM_UDPv4 only and this function early-returns for non-IPv4. An IPv6+TCP/UDP packet thus bypasses both smoltcp's software checksum (skipped because cap.checksum.{tcp,udp}.tx() == false) and the NIC offload, and leaves the wire with a zero/garbage L4 checksum.

Either gate the flags on IPv4 in tx_packet_header_flags(), or split the capability bits per family in if_net.rs.

atsushi421 · 2026-06-10T20:22:25Z

+
+        let ext = match extract_headers(ether_frame.data) {
+            Ok(e) => e,
+            Err(_) => return Ok((0, base_olinfo, None)),


If extract_headers() returns Err, this returns (0, base_olinfo, None) and the data descriptor is posted without offload. But smoltcp's TX layer, observing cap.checksum.{tcp,udp}.tx() == false, did not compute the L4 checksum either, so the frame goes out with an unfilled checksum field.

Treat this as a hard error (e.g. Err(IgcDriverErr::InvalidPacket)) and drop the packet, mirroring how igb/ixgbe handle it.

atsushi421 · 2026-06-10T20:22:25Z

+                // always immediately followed by its data descriptor, and a single queue
+                // completes in order, so if the next descriptor is done this one is a
+                // consumed context descriptor and can be reclaimed; otherwise stop.
+                let next = if idx + 1 == ring_len { 0 } else { idx + 1 };


The look-ahead lacks an idx == tx.next_avail_desc guard. When the ring is empty (next_to_clean == next_avail_desc), if the next slot still has a stale DD bit set (e.g. on a fresh queue before the first reclaim cycle), the loop treats idx as a consumed context descriptor and advances next_to_clean past next_avail_desc, corrupting the igc_desc_unused() accounting.

Add if idx == tx.next_avail_desc { break; } at the top of the loop, and guard the look-ahead so it only fires when next != tx.next_avail_desc.

atsushi421 · 2026-06-10T20:22:25Z

+    ) -> Result<(usize, u32, Option<L4CksumSeed>), IgcDriverErr> {
+        let base_olinfo = (ether_frame.data.len() as u32) << IGC_ADVTXD_PAYLEN_SHIFT;
+
+        let ext = match extract_headers(ether_frame.data) {


extract_headers() is parsed on every send, walking Ethernet/VLAN/IP/TCP/UDP with several unsafe reads, even when csum_flags is empty and no offload is requested. On the fast path this is the dominant per-packet cost added by this PR.

Early-return at the top: if ether_frame.csum_flags.is_empty() { return Ok((0, base_olinfo, None)); }.

atsushi421 · 2026-06-10T20:22:25Z

+        type_tucmd_mlhl |= IGC_ADVTXD_DTYP_CTXT | IGC_TXD_CMD_DEXT;
+        let mss_l4len_idx = l4len << IGC_ADVTXD_L4LEN_SHIFT;
+
+        let desc = &mut tx.tx_desc_ring.as_mut()[head];


igc_tx_ctx_setup writes the context descriptor and updates tx.active_checksum_context before igc_send confirms the ring has room (the in-code comment at L1142-L1143 acknowledges this), which is why the caller has to snapshot and restore saved_ctx on the bail-out path. Functionally it is fine — next_avail_desc does not move so the slot is not booked, the next igc_send overwrites the 16-byte descriptor in full, and the stale mss_l4len_idx overlapping wb.status keeps the DD bit at 0 so igc_txeof is not misled — but the side-effect boundary is fragile and the rollback is easy to forget when extending the function later.

Two cleaner options:

Make igc_tx_ctx_setup pure: take active_checksum_context by value and return (new_ctx, Option<ctx_payload>, olinfo_status, cksum_seed). igc_send commits both the descriptor write and the state update only after the room check.

Or do the conservative early check that the OpenBSD igc driver uses: at the top of igc_send, bail out if igc_desc_unused() < 2 (the maximum a single packet can consume). This removes the snapshot/restore dance with minimal churn.

ytakano and others added 2 commits June 8, 2026 18:30

cargo fmt

de5f0db

Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>

ytakano requested a review from Copilot June 8, 2026 09:53

Copilot started reviewing on behalf of ytakano June 8, 2026 09:53 View session

Copilot AI reviewed Jun 8, 2026

View reviewed changes

Comment thread awkernel_drivers/src/pcie/intel/igc.rs Outdated

Comment thread awkernel_drivers/src/pcie/intel/igc.rs

Comment thread awkernel_drivers/src/pcie/intel/igc.rs

Comment thread awkernel_drivers/src/pcie/intel/igc.rs Outdated

ytakano and others added 2 commits June 10, 2026 06:34

ytakano marked this pull request as ready for review June 10, 2026 01:40

ytakano requested a review from atsushi421 June 10, 2026 02:21

atsushi421 reviewed Jun 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(igc): enable tx checksum offload#694

feat(igc): enable tx checksum offload#694
ytakano wants to merge 4 commits into
tier4:mainfrom
ytakano:igc_tx_csum_offload

ytakano commented Jun 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

atsushi421 Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		vlan_macip_lens \|= (core::mem::size_of::<EtherHeader>() as u32) << IGC_ADVTXD_MACLEN_SHIFT;

		let NetworkHdr::Ipv4(ip) = &ext.network else {

Conversation

ytakano commented Jun 8, 2026

Description

Related links

How was this PR tested?

Notes for reviewers

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

atsushi421 Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants