Skip to content

Fix atomvm:read_priv and detect truncated packs#2340

Open
bettio wants to merge 3 commits into
atomvm:release-0.7from
bettio:fix-read_priv
Open

Fix atomvm:read_priv and detect truncated packs#2340
bettio wants to merge 3 commits into
atomvm:release-0.7from
bettio:fix-read_priv

Conversation

@bettio

@bettio bettio commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

The packbeam (.avm) scanners were unbounded and trusted section sizes,
so a lookup miss (e.g. atomvm:read_priv/2 for an unpacked file) ran off
the pack into erased flash and crashed the VM. A truncated pack failed
the same way.

Bound and validate every scan, detect truncation at load, and add a
build-time guard for the one pack known at build time. No on-disk
format change.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

bettio added 3 commits June 22, 2026 16:56
The packbeam scanners were unbounded, so a lookup miss (e.g. read_priv
for an unpacked file) ran off the pack into erased flash and crashed.
Bound and validate every scan against the stored pack size.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Reject truncated or incomplete packs (e.g. an image flashed to a
too-small partition) at load instead of failing late, validating by
tail check or bounded walk.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Fail at configure time when boot.avm does not fit its partition.

Signed-off-by: Davide Bettio <davide@uninstall.it>
@petermm

petermm commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

AMP, only for the 6e6a451 commit, pick and choose..

PR Review — Fix out-of-bounds read on avmpack lookup miss

Commit: 6e6a4517cc32600c1429d8aef84c7cb30c2a4584
Author: Davide Bettio · Mon Jun 22 2026
Reviewer: automated review (oracle-assisted), findings verified against committed code

Intent

The packbeam section scanners (avmpack_find_section_by_flag, avmpack_find_section_by_name,
avmpack_fold) were unbounded: they walked the section chain until a zero flags word, with no
knowledge of where the pack actually ends. A lookup miss — e.g. atomvm:read_priv/2 for a file
that is not in the pack — kept scanning past the last section into erased flash, eventually reading
garbage and crashing the VM.

The fix stores the pack size in struct AVMPackData, threads an avmpack_size argument through every
scanner, and introduces a single bounded/validating read_section helper that every scan now goes
through. All in-tree callers and platforms (esp32, stm32, rp2, generic_unix, emscripten, tests) are
updated to pass the size.

Verdict

Merge with follow-up changes. The central fix is sound: the section walk is now correctly bounded
and validated, which eliminates the reported OOB read. One residual edge case in the JIT-flash path
(a pack that boots but has no END marker) is only partially addressed and can still produce a bad
flash write address; it is largely pre-existing but worth closing while this code is being touched.
The remaining items are minor hardening / documentation nits.

Severity summary

# Severity Area Summary
1 Major (oracle: blocker) jit_stream_flash.c Pack with no END marker → find_first_jit_entry returns a bogus (~0) pointer; nif_jit_stream_flash_new ignores is_valid
2 Minor avmpack.c pad(int) signed cast/overflow for pathologically large name regions
3 Minor nifs.c (read_priv) Payload bounded to whole pack, not to the matched section
4 Nit avmpack.h Docs reference avmpack_check_complete / avmpack_compute_size, which don't exist at this commit

Findings

1. Major — JIT-flash append pointer can be bogus when no END marker is found

Files: globalcontext_find_first_jit_entry,
nif_jit_stream_flash_new

globalcontext_find_first_jit_entry initializes max_end_offset = NULL and only updates it when a
pack yields an END section. When avmpack_find_section_by_flag(... END ...) fails, this commit
correctly sets valid_cache = false and continues — good, the scan no longer reads OOB. But if
no pack produces an END, max_end_offset stays NULL and the function still computes:

uintptr_t max_end_offset_page = (((uintptr_t) NULL - 1) & ~(FLASH_SECTOR_SIZE - 1));
return (struct JITEntry *) (max_end_offset_page + FLASH_SECTOR_SIZE); // wraps to ~0x0

The callers mostly guard on is_valid:

  • globalcontext_find_last_jit_entry returns NULL when !is_valid
  • sys_get_cache_native_code returns false when !is_valid

…but nif_jit_stream_flash_new does not. When last_valid_entry == NULL it re-calls
globalcontext_find_first_jit_entry and uses new_entry while ignoring is_valid
(line 412-413),
then proceeds to erase/write flash at that address.

Scope / nuance (verified): the last_valid_entry == NULL branch is also the normal first-boot
path where the cache marker is still lowercase "end". In that case the END lookup succeeds,
max_end_offset is a valid pointer, and only valid_cache is false — so new_entry is correct. The
bug bites only a corrupt/truncated pack that has a valid start module (so the VM boots) but no END
marker at all
, with JIT-on-flash enabled. This is largely a pre-existing logic gap (the old code
ignored the return value entirely and would have read OOB instead), so this commit improves the
situation (bounded read) but does not fully close it — a bogus flash write is more dangerous than a
read.

Smallest fix: distinguish "END found but marker lowercase (valid append point)" from "END not
found (no valid append point)". E.g. add a found_end out-param (or return NULL) from
globalcontext_find_first_jit_entry when max_end_offset == NULL, and have
nif_jit_stream_flash_new bail / raise instead of using the pointer when the append point is unknown.


2. Minor — signed int padding can overflow for huge name regions

File: pad(int),
read_section

read_section accepts a uint32_t avmpack_size and computes
padded_name_len = (size_t) pad((int)((const char *) nul - name + 1)). Casting a size_t name
length to int and padding via signed (size + 4 - 1) is implementation-defined / UB if the NUL is
found more than ~2 GB into the name region (possible only with a multi-GB mapped pack). Real AtomVM
packs never approach this, but this is validation code that should not rely on signed overflow.

Smallest fix: do the padding in unsigned/size_t:

size_t name_len = (size_t)((const char *) nul - name) + 1;
size_t padded_name_len = (name_len + 3U) & ~(size_t) 3U;
if (padded_name_len > name_region) {
    return AVMPackSectionInvalid;
}

3. Minor — read_priv bounds the payload to the whole pack, not to the matched section

File: nif_atomvm_read_priv

The new guard file_size <= avmpack_data->size - content_offset - sizeof(uint32_t) correctly prevents
reading past the pack (which fixes the crash class). However, it does not bound the payload to the
matched section (section.size is returned in size but not used for this check). A corrupt
priv entry whose length prefix exceeds its own section payload can still return bytes from following
sections, as long as they stay inside the pack. No OOB / no crash, but it slightly undercuts the
"skip corrupt section and continue" intent.

Smallest fix: surface the section payload size from read_section/find_section_by_name and
require file_size <= payload_size - sizeof(uint32_t) (with a payload_size >= sizeof(uint32_t)
guard) before building the binary.


4. Nit — header docs reference functions that don't exist at this commit

File: avmpack_is_valid docs

The updated doc comment says to use avmpack_check_complete or avmpack_compute_size for full
validation, but those functions are not introduced until the follow-up commit (38ae415e5). At this
commit the references dangle.

Smallest fix: drop that sentence here, or move the doc change into the commit that adds those APIs.


Verified correct (no action)

  • Bounds ordering in read_section is safe. offset > avmpack_size is checked before
    avmpack_size - offset, and section_size > avmpack_size - offset avoids offset + section_size
    wraparound. The min-size-16 guard keeps header[0], header[1] and the terminator name bytes in
    bounds. (avmpack.c)
  • memchr(name, '\0', name_region) is bounded by the validated section size; the subsequent
    strcmp is therefore safe.
  • Terminator strcmp(name, "end"/"END") reads at most 4 name bytes (the literal NUL-terminates
    the comparison), which the 16-byte min-size guard guarantees in bounds.
  • END marker as a flag match for (END_OF_FILE_MASK=255, END_OF_FILE=0) preserves the old
    jit_stream_flash semantics: the old loop also matched flags == 0 at termination and returned data
    at offset + 16.
  • find_section_by_name / fold now stop at END or first invalid section — an intended behavior
    change for corrupt/truncated packs that is exactly what avoids the original miss crash.
  • rp2 over-estimated size (XIP_SRAM_BASE - MAIN_AVM, and MAIN_AVM - LIB_AVM for the lib pack)
    is not itself a scanner OOB: all reads stay within the supplied flash/XIP region, and erased flash
    (0xFFFFFFFF) parses as an invalid section header and stops the scan. (The real-size computation
    arrives in follow-up 38ae415e5.)
  • All in-tree callers were converted to the new signatures: globalcontext.c, nifs.c,
    jit_stream_flash.c, and platforms esp32/stm32/rp2/generic_unix/emscripten plus the C tests. No
    unconverted direct caller of the changed avmpack_* signatures was found. Zephyr does not call
    these directly.
  • avmpack_data_init size plumbing is consistent: emscripten now requests the file size from
    load_or_fetch_file, generic_unix uses mapped->size, esp32 uses the partition/file size,
    in-memory NIF packs use the binary size with a bin_size > UINT32_MAX guard added to
    nif_atomvm_add_avm_pack_binary.

Recommendation

Address Finding 1 (or explicitly confirm the no-END-with-JIT-enabled case is impossible on
supported targets) before relying on this on flash-JIT platforms. Findings 2–4 are safe to fold into
this change or a quick follow-up. The OOB-read fix that the commit targets is correct and well
structured.


Proposed fixes (diffs)

The diffs below apply on top of commit 6e6a4517c. Findings 1 and 2 are self-contained. Finding 3
surfaces a per-section payload size, so it touches avmpack.{c,h} and the two find_section_by_name
callers.

Fix 1 — return NULL when no append point exists, and check it in nif_jit_stream_flash_new

--- a/src/libAtomVM/jit_stream_flash.c
+++ b/src/libAtomVM/jit_stream_flash.c
@@ static struct JITEntry *globalcontext_find_first_jit_entry(GlobalContext *global, bool *is_valid)
     }
     synclist_unlock(&global->avmpack_data);
 
+    if (max_end_offset == NULL) {
+        // No pack produced an END/end marker: there is no known place to append, and the
+        // pointer math below would wrap to ~0. Report an invalid, unusable cache.
+        *is_valid = false;
+        return NULL;
+    }
+
     uintptr_t max_end_offset_page = ((((uintptr_t) max_end_offset) - 1) & ~(FLASH_SECTOR_SIZE - 1));
     *is_valid = valid_cache;
 
     TRACE("globalcontext_find_first_jit_entry: return %p\n", (void *) (max_end_offset_page + FLASH_SECTOR_SIZE));
 
     return (struct JITEntry *) (max_end_offset_page + FLASH_SECTOR_SIZE);
 }
@@ static term nif_jit_stream_flash_new(Context *ctx, int argc, term argv[])
     if (last_valid_entry == NULL) {
         // No valid entries, get the first position
         bool is_valid;
         new_entry = globalcontext_find_first_jit_entry(ctx->global, &is_valid);
+        if (IS_NULL_PTR(new_entry)) {
+            // No valid append point (e.g. truncated/corrupt pack with no END marker).
+            RAISE_ERROR(BADARG_ATOM);
+        }
     } else {
         // Get position after last valid entry
         new_entry = jit_entry_next(last_valid_entry);
     }

The normal first-boot path (lowercase "end" marker) is unaffected: there the END lookup
succeeds, so max_end_offset is non-NULL and a valid append pointer is returned. The other
callers (globalcontext_find_last_jit_entry, sys_get_cache_native_code,
globalcontext_set_cache_native_code) already guard on is_valid before dereferencing, so the
NULL return is safe for them.

Fix 2 — unsigned padding in read_section

--- a/src/libAtomVM/avmpack.c
+++ b/src/libAtomVM/avmpack.c
@@ static enum AVMPackSectionKind read_section(const void *avmpack_binary, uint32_t avmpack_size,
     size_t name_region = section_size - AVMPACK_SECTION_HEADER_SIZE;
     const void *nul = memchr(name, '\0', name_region);
     if (nul == NULL) {
         return AVMPackSectionInvalid;
     }
-    size_t padded_name_len = (size_t) pad((int) ((const char *) nul - name + 1));
+    size_t name_len = (size_t) ((const char *) nul - name) + 1;
+    size_t padded_name_len = (name_len + 3U) & ~(size_t) 3U;
     if (padded_name_len > name_region) {
         return AVMPackSectionInvalid;
     }

Fix 3 — bound the read_priv payload to the matched section, not the whole pack

Expose the section's payload size via read_section and a new optional out-param on
avmpack_find_section_by_name.

--- a/src/libAtomVM/avmpack.c
+++ b/src/libAtomVM/avmpack.c
@@ struct AVMPackSection
 {
     uint32_t size;
     uint32_t flags;
     const char *name;
     const void *data;
+    uint32_t data_size;
 };
@@ static enum AVMPackSectionKind read_section(const void *avmpack_binary, uint32_t avmpack_size,
     section->size = section_size;
     section->flags = flags;
     section->name = name;
     section->data
         = (const uint8_t *) avmpack_binary + offset + AVMPACK_SECTION_HEADER_SIZE + padded_name_len;
+    section->data_size = section_size - AVMPACK_SECTION_HEADER_SIZE - (uint32_t) padded_name_len;
 
     return AVMPackSectionRegular;
 }
@@ int avmpack_find_section_by_name(const void *avmpack_binary, uint32_t avmpack_size,
-    const char *name, const void **ptr, uint32_t *size)
+    const char *name, const void **ptr, uint32_t *size, uint32_t *content_size)
 {
     uint32_t offset = AVMPACK_SIZE;
     struct AVMPackSection section;
 
     while (read_section(avmpack_binary, avmpack_size, offset, &section) == AVMPackSectionRegular) {
         if (!strcmp(name, section.name)) {
             *ptr = section.data;
             *size = section.size;
+            if (content_size != NULL) {
+                *content_size = section.data_size;
+            }
             return 1;
         }
         offset += section.size;
     }
 
     return 0;
 }
--- a/src/libAtomVM/avmpack.h
+++ b/src/libAtomVM/avmpack.h
@@
- * @param size will be set to the file section size that has been found, if the section has not been
- * found it will not be updated.
+ * @param size will be set to the file section size that has been found, if the section has not been
+ * found it will not be updated.
+ * @param content_size if non-NULL, will be set to the section payload size (bytes after the header
+ * and padded name), used to bound reads against a corrupt in-section length prefix.
  * @returns 1 if the file section has been found, 0 otherwise.
  */
 
-int avmpack_find_section_by_name(const void *avmpack_binary, uint32_t avmpack_size,
-    const char *name, const void **ptr, uint32_t *size);
+int avmpack_find_section_by_name(const void *avmpack_binary, uint32_t avmpack_size,
+    const char *name, const void **ptr, uint32_t *size, uint32_t *content_size);
--- a/src/libAtomVM/globalcontext.c
+++ b/src/libAtomVM/globalcontext.c
@@ Module *globalcontext_load_module_from_avm(GlobalContext *global, const char *module_name)
-        if (avmpack_find_section_by_name(avmpack_data->data, avmpack_data->size, module_name,
-                &beam_module, &beam_module_size)) {
+        if (avmpack_find_section_by_name(avmpack_data->data, avmpack_data->size, module_name,
+                &beam_module, &beam_module_size, NULL)) {
             break;
         }
--- a/src/libAtomVM/nifs.c
+++ b/src/libAtomVM/nifs.c
@@ static term nif_atomvm_read_priv(Context *ctx, int argc, term argv[])
     const void *bin_data;
     uint32_t size;
+    uint32_t content_size;
@@
         bool prev_in_use = avmpack_data->in_use;
         avmpack_data->in_use = true;
-        if (avmpack_find_section_by_name(
-                avmpack_data->data, avmpack_data->size, complete_path, &bin_data, &size)) {
-            // Bound the length prefix and payload to the pack against a corrupt file size.
-            size_t content_offset
-                = (const uint8_t *) bin_data - (const uint8_t *) avmpack_data->data;
-            if (content_offset + sizeof(uint32_t) <= avmpack_data->size) {
-                uint32_t file_size = READ_32_ALIGNED((uint32_t *) bin_data);
-                if (file_size <= avmpack_data->size - content_offset - sizeof(uint32_t)) {
+        if (avmpack_find_section_by_name(avmpack_data->data, avmpack_data->size, complete_path,
+                &bin_data, &size, &content_size)) {
+            // Bound the length prefix and payload to the matched section against a corrupt file size.
+            if (content_size >= sizeof(uint32_t)) {
+                uint32_t file_size = READ_32_ALIGNED((uint32_t *) bin_data);
+                if (file_size <= content_size - sizeof(uint32_t)) {
                     free(complete_path);
                     complete_path = NULL;
                     if (UNLIKELY(memory_ensure_free_opt(
                                      ctx, TERM_BOXED_REFC_BINARY_SIZE, MEMORY_CAN_SHRINK)
                             != MEMORY_GC_OK)) {
                         avmpack_data->in_use = prev_in_use;
                         synclist_unlock(&glb->avmpack_data);
                         RAISE_ERROR(OUT_OF_MEMORY_ATOM);
                     }
                     result = term_from_const_binary(((uint8_t *) bin_data) + sizeof(uint32_t),
                         file_size, &ctx->heap, ctx->global);
                     break;
                 }
             }
             avmpack_data->in_use = prev_in_use;
         } else {
             avmpack_data->in_use = prev_in_use;
         }

avmpack_find_section_by_name is also declared in the tests/ harness expectations; the only
in-tree callers are globalcontext.c and nifs.c (both updated above). If you prefer to avoid the
signature change, an alternative is to recompute the payload bound in read_priv from
pad(strlen(complete_path) + 1) plus the section header size, but that duplicates the on-disk
layout constants into nifs.c and is more fragile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants