Feat/memory optimize for sync molnix #2745

Merged
szabozoltan69 merged 2 commits into develop from feat/memory-optimize-for-sync-molnix
Jun 16, 2026

@thenav56 thenav56 commented May 22, 2026

Member

Addresses

  • Molnix provisioning issue in k8s

Changes

  • Replace the in-memory list with a generator to limit memory usage
  • Lower the Kubernetes CronJob memory requirement
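
A minimal sketch of the first change, assuming a page-based fetcher. The names `fetch_page`, `collect_all`, `iter_records`, and `fake_fetch_page` are hypothetical and do not come from sync_molnix.py; the point is only to contrast holding every record in a list with yielding records one page at a time.

```python
from typing import Callable, Iterator, List

Page = List[dict]


def collect_all(fetch_page: Callable[[int], Page]) -> List[dict]:
    """List-based approach: every record from every page is held in memory at once."""
    records: List[dict] = []
    page_no = 1
    while True:
        page = fetch_page(page_no)
        if not page:
            break
        records.extend(page)
        page_no += 1
    return records


def iter_records(fetch_page: Callable[[int], Page]) -> Iterator[dict]:
    """Generator approach: only the current page is resident while iterating."""
    page_no = 1
    while True:
        page = fetch_page(page_no)
        if not page:
            return
        yield from page
        page_no += 1


# Hypothetical in-memory stand-in for the remote API (assumption for illustration).
_FAKE_PAGES = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}


def fake_fetch_page(page_no: int) -> Page:
    """Return the fake page, or an empty list once the pages run out."""
    return _FAKE_PAGES.get(page_no, [])
```

A generator like this can be consumed only once, which is why code that loops over the same data several times needs some form of replay, such as the file cache described further down.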

Note

This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.


Headline numbers

Replay A/B (memray-instrumented, two DBs, simultaneous) and live A/B (real Molnix API, two DBs, simultaneous)

| Metric | Baseline (measured) | Modified (measured) | Δ (measured) | Remark |
|---|---|---|---|---|
| Peak RAM (replay, memray) | 6.035 GB | 560.3 MB | −91 %, ~11× smaller | Cleanest A/B since memray reads the actual heap. The ratio is what matters; absolute numbers include some fixture-loader overhead in both. |
| Peak RAM (live, docker stats RSS) | ~5.60 GiB (observed) | ~360 MiB (observed) | ~16× smaller | docker stats samples once per ~60 s so it can miss a brief spike, but the trend across many samples is consistent. |
| Total allocations (replay) | 10,908,529 | 14,304,384 | +31 % | Modified has more small allocs because each record goes through json.dumps/loads extra times when cached. |
| Total bytes allocated, sum over time (replay) | 117.9 GB | 152.6 GB | +29 % | Sum-over-time inflated by the fixture loader's per-page rescans; production would be much smaller. |
| Max single allocation (replay) | 3.845 MB | 3.845 MB | unchanged | One JSON page's worth; same either way. |
| Runtime (local cache only, no Molnix calls) | 91 s | 134 s | +47 % | Replay-only timing: no network at all, both runs read JSONL fixture files from local disk. The gap is the pure cost of _CachedPaginated's write+read tee with everything else being microseconds. In production the cache write happens during the network read, so its cost overlaps with network wait. |
| Runtime (live Molnix API) | 2196 s (36m 36s) | 2215 s (36m 55s) | +19 s (+0.9 %) | Within noise. Confirms the prediction: when network round-trips dominate the run, the cache I/O is invisible. |
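
The Δ column can be re-derived from the measured values. The short check below is only an arithmetic sketch; the helper names `pct_change` and `shrink_ratio` are made up for illustration, and the replay peak is converted assuming decimal units (1 GB = 1000 MB) while the live peak uses binary units (1 GiB = 1024 MiB).

```python
def pct_change(baseline: float, modified: float) -> float:
    """Relative change from baseline to modified, in percent."""
    return (modified - baseline) / baseline * 100.0


def shrink_ratio(baseline: float, modified: float) -> float:
    """How many times smaller the modified value is than the baseline."""
    return baseline / modified


# Peak RAM, replay: 6.035 GB vs 560.3 MB (both expressed in MB here).
peak_replay_ratio = shrink_ratio(6035.0, 560.3)
peak_replay_delta = pct_change(6035.0, 560.3)

# Peak RAM, live: ~5.60 GiB vs ~360 MiB (both expressed in MiB here).
peak_live_ratio = shrink_ratio(5.60 * 1024, 360.0)

# Allocation count, total bytes allocated, and runtimes.
alloc_count_delta = pct_change(10_908_529, 14_304_384)
alloc_bytes_delta = pct_change(117.9, 152.6)
replay_runtime_delta = pct_change(91, 134)
live_runtime_delta = pct_change(2196, 2215)

for label, value in [
    ("peak RAM replay ratio", peak_replay_ratio),
    ("peak RAM replay delta %", peak_replay_delta),
    ("peak RAM live ratio", peak_live_ratio),
    ("allocation count delta %", alloc_count_delta),
    ("allocated bytes delta %", alloc_bytes_delta),
    ("replay runtime delta %", replay_runtime_delta),
    ("live runtime delta %", live_runtime_delta),
]:
    print(f"{label}: {value:.2f}")
```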

Re-iteration counts (_CachedPaginated)

Each _CachedPaginated.__iter__ call emits a log.warning with a counter. The counts below are measured from log output of a replay run — first iteration streams from source (writes JSONL cache to /tmp/); subsequent iterations replay from the cache.

Example log output from a verification replay:

WARNING  _CachedPaginated[deployments] first iteration #1 streaming from source -> /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[positions]   first iteration #1 streaming from source -> /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #2 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #3 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #2 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #3 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #4 from cache /tmp/molnix_paginated_xxx.jsonl

| Wrapper | First iteration (from API) | Re-iterations (from cache) | Total __iter__ calls | Call sites in sync_molnix.py |
|---|---|---|---|---|
| _CachedPaginated[deployments] | 1 | 3 | 4 | get_unique_tags (L78) → [d["id"] for d in ...] (L260) → [get_go_event(d["tags"]) for d in ...] (L266) → for md in molnix_deployments: (L295) |
| _CachedPaginated[positions] | 1 | 2 | 3 | get_unique_tags (L83) → [p["id"] for p in ...] (L462) → for position in molnix_positions: (L468) |
| Total | 2 | 5 | 7 | |

With the cache, each re-iteration is a sequential read of a /tmp/molnix_paginated_*.jsonl file (deployments ~734 MB, positions ~890 MB).
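
The stream-then-replay behaviour described above could look roughly like the sketch below. This is not the PR's `_CachedPaginated` code: the class name `CachedPaginatedSketch`, its attributes, the temp-file prefix, and the `close()` helper are all assumptions made for illustration. It also assumes the first pass is consumed to the end before any re-iteration starts.

```python
import json
import logging
import os
import tempfile
from typing import Any, Dict, Iterable, Iterator

log = logging.getLogger(__name__)


class CachedPaginatedSketch:
    """Illustrative re-iterable wrapper (hypothetical, not the PR's implementation)."""

    def __init__(self, name: str, source: Iterable[Dict[str, Any]]) -> None:
        self.name = name
        self._source = source
        self._complete = False  # True once the first pass has written every record
        self.iter_count = 0
        # Reserve the JSONL cache file up front so its path can be logged.
        fd, self.cache_path = tempfile.mkstemp(prefix="paginated_sketch_", suffix=".jsonl")
        os.close(fd)

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        # Regular method (not a generator), so counting and logging happen at call time.
        self.iter_count += 1
        if self._complete:
            log.warning("%s re-iteration #%d from cache %s",
                        self.name, self.iter_count, self.cache_path)
            return self._replay()
        log.warning("%s first iteration #%d streaming from source -> %s",
                    self.name, self.iter_count, self.cache_path)
        return self._stream_and_cache()

    def _stream_and_cache(self) -> Iterator[Dict[str, Any]]:
        # Tee: write each record as one JSON line, then hand it to the caller.
        with open(self.cache_path, "w", encoding="utf-8") as fh:
            for record in self._source:
                fh.write(json.dumps(record) + "\n")
                yield record
        self._complete = True

    def _replay(self) -> Iterator[Dict[str, Any]]:
        # Sequential read of the cache; one record in memory at a time.
        with open(self.cache_path, encoding="utf-8") as fh:
            for line in fh:
                yield json.loads(line)

    def close(self) -> None:
        """Delete the cache file once the wrapper is no longer needed."""
        if os.path.exists(self.cache_path):
            os.remove(self.cache_path)
```

The trade-off matches the measurements: peak memory stays near one record or page, while every re-iteration pays for a full file read plus a `json.loads` per record.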

@thenav56 thenav56 force-pushed the feat/memory-optimize-for-sync-molnix branch from b5b90ff to 37f1a26 May 22, 2026 12:36
thenav56 added 2 commits June 16, 2026 09:51
NOTE: This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.
@thenav56 thenav56 force-pushed the feat/memory-optimize-for-sync-molnix branch from 37f1a26 to 5586a5f June 16, 2026 05:25
@thenav56 thenav56 marked this pull request as ready for review June 16, 2026 05:27
@thenav56

Member Author

Hey @szabozoltan69, when you have time, let's merge this and deploy it to staging, provided no live release from staging is planned soon.

It would be nice to test this in staging for a few days.

@szabozoltan69

Contributor

Thanks so much @thenav56 for this great, gap-filling fix! I'm deploying it to Staging now.

@szabozoltan69 szabozoltan69 merged commit 5f09713 into develop Jun 16, 2026
3 checks passed
@szabozoltan69 szabozoltan69 deleted the feat/memory-optimize-for-sync-molnix branch June 16, 2026 06:07
@szabozoltan69

Contributor

Deployed to Staging.
