From 9daa16e4bd6374c7fd56412314b7b5dc5c8cdfbe Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 26 May 2026 21:58:27 +0000 Subject: [PATCH 1/6] Add thesis topic: Escaping the Hermetic Boundary New master thesis topic investigating the hermetic gap between what automated dependency prefetch tools (Hermeto, dream2nix) declare and what builds actually consume at runtime, measured via syscall tracing across npm, pip, Cargo, Go, and Maven ecosystems. https://claude.ai/code/session_01JNeeaZ1wtTqu25LVnFjoP3 --- master-thesis.md | 78 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 78 insertions(+) diff --git a/master-thesis.md b/master-thesis.md index 48bc329..a88a409 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -243,6 +243,84 @@ The stages of package installation +### Escaping the Hermetic Boundary: What Automated Dependency Prefetch Tools Miss Across Software Ecosystems +Contact: Aman Sharma + +

+Tools like Hermeto +[1] +promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. +In theory, the build then runs against a closed, auditable set of inputs. +In practice, the hermetic guarantee is partial: Hermeto addresses the declared dependency layer — what appears in lockfiles like package-lock.json, Cargo.lock, or requirements.txt — but leaves the toolchain and native dependency layer to the user. +Meanwhile, Nix +[2] +offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. +An ecosystem of automated translation tools — dream2nix, poetry2nix, cargo2nix +[3] +— attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. +

+ +

+This thesis investigates the hermetic gap: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. +Using syscall tracing (following the methodology of Zheng et al. +[4]), +the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. +The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. +

+ + + +

+Related Work:
+
+1. Hermeto — prefetch CLI for hermetic container builds
+2. Nix — purely functional package manager with content-addressed builds
+3. dream2nix — automated Nix derivation generation from package manager metadata
+4. Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025
+5. Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021
+6. SLSA — Supply-chain Levels for Software Artifacts framework
+7. The Design Space of Lockfiles Across Package Managers, Empirical Software Engineering 2025
+8. SBOM Generation Based on Code-Level External Component Trees, Scientific Reports 2025
+ + ### Dependency Fingerprinting: Reconstructing Full Dependency Trees from Partial Observations Contact: Aman Sharma, Eric Cornelissen From 459163a4caf03f1297478cb84af66440c03dc253 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 26 May 2026 22:05:42 +0000 Subject: [PATCH 2/6] Convert hermetic boundary thesis topic to plain Markdown Rewrite from HTML blocks to pure Markdown to fix rendering issue where inline anchor links caused all section text to appear underlined in the site renderer. https://claude.ai/code/session_01JNeeaZ1wtTqu25LVnFjoP3 --- master-thesis.md | 77 ++++++++++-------------------------------------- 1 file changed, 16 insertions(+), 61 deletions(-) diff --git a/master-thesis.md b/master-thesis.md index a88a409..03f892d 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -246,79 +246,34 @@ The stages of package installation ### Escaping the Hermetic Boundary: What Automated Dependency Prefetch Tools Miss Across Software Ecosystems Contact: Aman Sharma -

-Tools like Hermeto -[1] -promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. -In theory, the build then runs against a closed, auditable set of inputs. -In practice, the hermetic guarantee is partial: Hermeto addresses the declared dependency layer — what appears in lockfiles like package-lock.json, Cargo.lock, or requirements.txt — but leaves the toolchain and native dependency layer to the user. -Meanwhile, Nix -[2] -offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. -An ecosystem of automated translation tools — dream2nix, poetry2nix, cargo2nix -[3] -— attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. -

+Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build then runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. Meanwhile, [Nix][herm2] offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [3] — attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. -

-This thesis investigates the hermetic gap: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. -Using syscall tracing (following the methodology of Zheng et al. -[4]), -the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. -The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. -

+This thesis investigates the *hermetic gap*: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. Using syscall tracing (following the methodology of Zheng et al. [4]), the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. - +**RQ2 (Escape Taxonomy):** When dependencies escape the hermetic boundary, what classes do they fall into — undeclared system libraries, build toolchain leakage, native extension bindings, or implicit platform assumptions — and which classes are addressable by the tool vs. inherent to the ecosystem? -

-Related Work:
+Related Work:
-1. Hermeto — prefetch CLI for hermetic container builds
+[1] [Hermeto — prefetch CLI for hermetic container builds](https://github.com/hermetoproject/hermeto)
-2. Nix — purely functional package manager with content-addressed builds
+[2] [Nix — purely functional package manager with content-addressed builds](https://nixos.org/)
-3. dream2nix — automated Nix derivation generation from package manager metadata
+[3] [dream2nix — automated Nix derivation generation from package manager metadata](https://github.com/nix-community/dream2nix)
-4. Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025
+[4] [Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025](https://ieeexplore.ieee.org/document/10703127)
-5. Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021
+[5] [Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021](https://arxiv.org/pdf/2104.06020)
-6. SLSA — Supply-chain Levels for Software Artifacts framework
+[6] [SLSA — Supply-chain Levels for Software Artifacts framework](https://slsa.dev/)
-7. The Design Space of Lockfiles Across Package Managers, Empirical Software Engineering 2025
+[7] [The Design Space of Lockfiles Across Package Managers, Empirical Software Engineering 2025](https://arxiv.org/abs/2505.04834)
-8. SBOM Generation Based on Code-Level External Component Trees, Scientific Reports 2025
+[8] [SBOM Generation Based on Code-Level External Component Trees, Scientific Reports 2025](https://www.nature.com/articles/s41598-025-29762-0) + +[herm1]: https://github.com/hermetoproject/hermeto +[herm2]: https://nixos.org/ ### Dependency Fingerprinting: Reconstructing Full Dependency Trees from Partial Observations From 2e7146d79888b17ae6619f48fa1197c0111d89cc Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 26 May 2026 22:13:52 +0000 Subject: [PATCH 3/6] Remove references [2] (Nix homepage) and [8] (SBOM paper) Also renumber remaining references accordingly and drop [herm2] reference definition. https://claude.ai/code/session_01JNeeaZ1wtTqu25LVnFjoP3 --- master-thesis.md | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/master-thesis.md b/master-thesis.md index 03f892d..2730ab3 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -246,9 +246,9 @@ The stages of package installation ### Escaping the Hermetic Boundary: What Automated Dependency Prefetch Tools Miss Across Software Ecosystems Contact: Aman Sharma -Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build then runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. Meanwhile, [Nix][herm2] offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [3] — attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. 
+Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build then runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. Meanwhile, Nix offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [2] — attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. -This thesis investigates the *hermetic gap*: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. Using syscall tracing (following the methodology of Zheng et al. [4]), the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. +This thesis investigates the *hermetic gap*: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. Using syscall tracing (following the methodology of Zheng et al. 
[3]), the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. **RQ1 (Hermetic Coverage):** What fraction of actual build-time and runtime dependencies are captured by automated hermetic build tools versus observed via syscall tracing, and how does this fraction vary across ecosystems? @@ -258,22 +258,17 @@ Related Work: [1] [Hermeto — prefetch CLI for hermetic container builds](https://github.com/hermetoproject/hermeto) -[2] [Nix — purely functional package manager with content-addressed builds](https://nixos.org/) +[2] [dream2nix — automated Nix derivation generation from package manager metadata](https://github.com/nix-community/dream2nix) -[3] [dream2nix — automated Nix derivation generation from package manager metadata](https://github.com/nix-community/dream2nix) +[3] [Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025](https://ieeexplore.ieee.org/document/10703127) -[4] [Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025](https://ieeexplore.ieee.org/document/10703127) +[4] [Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021](https://arxiv.org/pdf/2104.06020) -[5] [Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021](https://arxiv.org/pdf/2104.06020) +[5] [SLSA — Supply-chain Levels for Software Artifacts framework](https://slsa.dev/) -[6] [SLSA — Supply-chain Levels for Software Artifacts framework](https://slsa.dev/) - -[7] [The Design Space of Lockfiles Across Package Managers, 
Empirical Software Engineering 2025](https://arxiv.org/abs/2505.04834) - -[8] [SBOM Generation Based on Code-Level External Component Trees, Scientific Reports 2025](https://www.nature.com/articles/s41598-025-29762-0) +[6] [The Design Space of Lockfiles Across Package Managers, Empirical Software Engineering 2025](https://arxiv.org/abs/2505.04834) [herm1]: https://github.com/hermetoproject/hermeto -[herm2]: https://nixos.org/ ### Dependency Fingerprinting: Reconstructing Full Dependency Trees from Partial Observations From e8b7ba2dd939ad15a8d9d91597f40a19f6d269e5 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 27 May 2026 07:19:15 +0000 Subject: [PATCH 4/6] Update reference [3] link to mcislab.github.io --- master-thesis.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/master-thesis.md b/master-thesis.md index 2730ab3..8e87616 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -260,7 +260,7 @@ Related Work: [2] [dream2nix — automated Nix derivation generation from package manager metadata](https://github.com/nix-community/dream2nix) -[3] [Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025](https://ieeexplore.ieee.org/document/10703127) +[3] [Zheng, Adams, Hassan — On Build Hermeticity in Bazel-Based Build Systems, IEEE Software 2025](https://mcislab.github.io/publications/2025/ieeesw-shenyu.pdf) [4] [Lamb & Zacchiroli — Reproducible Builds: Increasing the Integrity of Software Supply Chains, IEEE Software 2021](https://arxiv.org/pdf/2104.06020) From c53761cb7b02b3654522374cc7a95e689f9c9ae9 Mon Sep 17 00:00:00 2001 From: Aman Sharma Date: Wed, 3 Jun 2026 10:31:25 +0200 Subject: [PATCH 5/6] Shorten hermetic boundary topic and drop RQs Removes the two explicit research questions and trims the title, shifting the framing toward the conceptual hermeticity ideas rather than methodology. 
Co-Authored-By: Claude Sonnet 4.6 --- master-thesis.md | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/master-thesis.md b/master-thesis.md index 8e87616..615ae0e 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -243,16 +243,12 @@ The stages of package installation -### Escaping the Hermetic Boundary: What Automated Dependency Prefetch Tools Miss Across Software Ecosystems +### Escaping the Hermetic Boundary Contact: Aman Sharma -Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build then runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. Meanwhile, Nix offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [2] — attempts to generate these derivations from standard lockfiles, but their actual hermetic coverage has never been systematically measured. +Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is layered and partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. 
Nix offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [2] — attempts to generate these derivations from standard lockfiles, but the two models rest on different assumptions about what a hermetic boundary even means. -This thesis investigates the *hermetic gap*: the delta between what automated hermetic build tools declare as the dependency set and what the build actually consumes at runtime. Using syscall tracing (following the methodology of Zheng et al. [3]), the study will instrument builds of real-world projects across npm, pip, Cargo, Go, and Maven — first through Hermeto+container, then through auto-generated Nix derivations — and compare observed file accesses against declared inputs. The goal is not to build a new tool but to characterize where and why existing hermetic build approaches fail, and whether Nix's stronger model actually delivers on its theoretical advantage in practice. - -**RQ1 (Hermetic Coverage):** What fraction of actual build-time and runtime dependencies are captured by automated hermetic build tools versus observed via syscall tracing, and how does this fraction vary across ecosystems? - -**RQ2 (Escape Taxonomy):** When dependencies escape the hermetic boundary, what classes do they fall into — undeclared system libraries, build toolchain leakage, native extension bindings, or implicit platform assumptions — and which classes are addressable by the tool vs. inherent to the ecosystem? +This thesis investigates the *hermetic gap*: the delta between what a tool declares as its dependency set and what a build actually consumes. 
The central question is whether Nix's stronger closure model translates into a meaningfully tighter boundary in practice, and what classes of dependencies — undeclared system libraries, toolchain leakage, native extension bindings, implicit platform assumptions — fall outside the boundary regardless of which model is used. Related Work: From 928ecfb8b34dad3431e82b34096b4b6bc886ffc3 Mon Sep 17 00:00:00 2001 From: Aman Sharma Date: Wed, 3 Jun 2026 10:33:56 +0200 Subject: [PATCH 6/6] Rename hermetic boundary topic with clearer title Co-Authored-By: Claude Sonnet 4.6 --- master-thesis.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/master-thesis.md b/master-thesis.md index 615ae0e..56a7b86 100644 --- a/master-thesis.md +++ b/master-thesis.md @@ -243,7 +243,7 @@ The stages of package installation -### Escaping the Hermetic Boundary +### Beyond Declared Dependencies: The Limits of Hermetic Build Tools Contact: Aman Sharma Tools like [Hermeto][herm1] promise hermetic container builds by prefetching all declared dependencies before network isolation kicks in. In theory, the build runs against a closed, auditable set of inputs. In practice, the hermetic guarantee is layered and partial: Hermeto addresses the *declared dependency layer* — what appears in lockfiles like `package-lock.json`, `Cargo.lock`, or `requirements.txt` — but leaves the *toolchain and native dependency layer* to the user. Nix offers a theoretically stronger model: content-addressed derivations, sandboxed builds, and a store that captures the full dependency closure including compilers and system libraries. An ecosystem of automated translation tools — `dream2nix`, `poetry2nix`, `cargo2nix` [2] — attempts to generate these derivations from standard lockfiles, but the two models rest on different assumptions about what a hermetic boundary even means.
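
Reviewer note: the *hermetic gap* defined in this topic (what a tool declares versus what a build actually reads) can be illustrated as a set difference over traced file opens. The following is a minimal sketch and not part of the patch series. It assumes strace-style `open`/`openat` lines as input; the function names, the regular expression, and the `/work/deps` root are hypothetical choices made for illustration only.

```python
import re

# Assumed input format (strace-like), for example:
#   openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
# An optional leading PID column is tolerated.
OPEN_RE = re.compile(
    r'^(?:\d+\s+)?(?:open|openat)\(.*?"([^"]+)".*\)\s*=\s*(-?\d+)'
)


def observed_paths(trace_lines):
    """Return the set of paths that were opened successfully."""
    paths = set()
    for line in trace_lines:
        match = OPEN_RE.match(line.strip())
        # A negative return value means the open failed (e.g. ENOENT).
        if match and int(match.group(2)) >= 0:
            paths.add(match.group(1))
    return paths


def hermetic_gap(observed, declared_roots):
    """Return observed paths that lie outside every declared input root."""
    # Normalise roots to end in "/" so "/work/deps" does not match "/work/deps2".
    roots = tuple(root.rstrip("/") + "/" for root in declared_roots)
    return {path for path in observed if not path.startswith(roots)}


if __name__ == "__main__":
    # Hypothetical trace excerpt; paths are made up for the example.
    trace = [
        'openat(AT_FDCWD, "/work/deps/left-pad/index.js", O_RDONLY) = 3',
        'openat(AT_FDCWD, "/usr/lib/libssl.so.3", O_RDONLY|O_CLOEXEC) = 4',
        'openat(AT_FDCWD, "/etc/missing.conf", O_RDONLY) = -1 ENOENT (No such file or directory)',
    ]
    for path in sorted(hermetic_gap(observed_paths(trace), ["/work/deps"])):
        print(path)
```

A real study would likely need more than this: other syscalls such as `execve` and `stat`, relative paths resolved against the working directory, and a mapping from lockfile entries to on-disk locations, none of which the sketch attempts.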