Feat/memory optimize for sync molnix by thenav56 · Pull Request #2745 · IFRCGo/go-api

thenav56 · 2026-05-22T12:34:16Z

Addresses

Molnix provisioning issue in k8s

Changes

Replace in-memory lists with generators to limit memory usage
Lower the Kubernetes CronJob memory requirement
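
A minimal sketch of the first change, assuming a page-based API. The names `iter_paginated` and `fake_fetch_page` are hypothetical and do not come from `sync_molnix.py`; the point is only that a generator holds one page in memory at a time instead of the full result list.

```python
from __future__ import annotations

from typing import Callable, Iterator


def iter_paginated(fetch_page: Callable[[int], list[dict]]) -> Iterator[dict]:
    """Yield records one by one, keeping at most one page in memory."""
    page = 1
    while True:
        records = fetch_page(page)
        if not records:
            return  # an empty page marks the end of the data (assumption)
        yield from records
        page += 1


# Hypothetical in-memory stand-in for a paginated API.
_PAGES = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}


def fake_fetch_page(page: int) -> list[dict]:
    return _PAGES.get(page, [])


ids = [record["id"] for record in iter_paginated(fake_fetch_page)]
print(ids)  # [1, 2, 3]
```

The trade-off is that a plain generator can be consumed only once, which is why re-iteration needs separate handling.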

Note

This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.

Headline numbers

Replay A/B (memray-instrumented, two DBs, simultaneous) and live A/B (real Molnix API, two DBs, simultaneous)

| Metric | Baseline (measured) | Modified (measured) | Δ (measured) | Remark |
| --- | --- | --- | --- | --- |
| Peak RAM (replay, memray) | 6.035 GB | 560.3 MB | −91 %, ~11× smaller | Cleanest A/B since memray reads the actual heap. The ratio is what matters; absolute numbers include some fixture-loader overhead in both. |
| Peak RAM (live, docker stats RSS) | ~5.60 GiB (observed) | ~360 MiB (observed) | ~16× smaller | docker stats samples once per ~60 s so it can miss a brief spike, but the trend across many samples is consistent. |
| Total allocations (replay) | 10,908,529 | 14,304,384 | +31 % | Modified has more small allocs because each record goes through `json.dumps`/`loads` extra times when cached. |
| Total bytes allocated, sum over time (replay) | 117.9 GB | 152.6 GB | +29 % | Sum-over-time inflated by the fixture loader's per-page rescans; production would be much smaller. |
| Max single allocation (replay) | 3.845 MB | 3.845 MB | unchanged | One JSON page's worth; same either way. |
| Runtime (local cache only, no Molnix calls) | 91 s | 134 s | +47 % | Replay-only timing: no network at all, both runs read JSONL fixture files from local disk. The gap is the pure cost of `_CachedPaginated`'s write+read tee with everything else being microseconds. In production the cache write happens during the network read, so its cost overlaps with network wait. |
| Runtime (live Molnix API) | 2196 s (36m 36s) | 2215 s (36m 55s) | +19 s (+0.9 %) | Within noise. Confirms the prediction: when network round-trips dominate the run, the cache I/O is invisible. |
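
To illustrate the kind of peak-memory gap reported above, here is a small sketch that compares materializing records into a list against streaming them. It uses the standard-library `tracemalloc` as a stand-in for memray, and the record shape and counts are invented for illustration, so the absolute numbers are not comparable to the measurements in the table.

```python
import tracemalloc
from typing import Iterable, Iterator


def make_records(n: int) -> Iterator[dict]:
    """Hypothetical record source; the real data comes from the Molnix API."""
    for i in range(n):
        yield {"id": i, "name": f"record-{i}"}


def count_records(records: Iterable[dict]) -> int:
    return sum(1 for _ in records)


def peak_bytes(materialize: bool, n: int = 20_000) -> int:
    """Return the peak traced allocation size for one pass over n records."""
    tracemalloc.start()
    try:
        source = make_records(n)
        # Baseline behavior: hold every record at once. Modified: stream them.
        records = list(source) if materialize else source
        count_records(records)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


list_peak = peak_bytes(materialize=True)
stream_peak = peak_bytes(materialize=False)
print(f"list: {list_peak} B, stream: {stream_peak} B")
```

The streamed pass keeps only one record alive at a time, so its peak stays roughly constant as `n` grows, while the list's peak grows linearly.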

Re-iteration counts (`_CachedPaginated`)

Each `_CachedPaginated.__iter__` call emits a `log.warning` with a counter. The counts below are measured from the log output of a replay run: the first iteration streams from the source (writing a JSONL cache to `/tmp/`), and subsequent iterations replay from the cache.

Example log output from a verification replay:

WARNING  _CachedPaginated[deployments] first iteration #1 streaming from source -> /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[positions]   first iteration #1 streaming from source -> /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #2 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #3 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #2 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #3 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #4 from cache /tmp/molnix_paginated_xxx.jsonl
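
The behavior visible in these logs can be approximated with the sketch below. It is not the PR's `_CachedPaginated` implementation: the class name `CachedPaginatedSketch`, its constructor arguments, and the `cleanup` helper are assumptions made for illustration. The first pass streams from the source while writing a JSONL file, and later passes read that file back.

```python
import json
import logging
import os
import tempfile
from typing import Callable, Iterable, Iterator, Optional

logger = logging.getLogger(__name__)


class CachedPaginatedSketch:
    """Re-iterable wrapper: stream once from the source, replay later passes from disk."""

    def __init__(self, name: str, source_factory: Callable[[], Iterable[dict]]) -> None:
        self.name = name
        self._source_factory = source_factory
        self._cache_path: Optional[str] = None
        self._iterations = 0

    def __iter__(self) -> Iterator[dict]:
        self._iterations += 1
        if self._cache_path is None:
            return self._stream_and_cache(self._iterations)
        return self._replay(self._iterations)

    def _stream_and_cache(self, n: int) -> Iterator[dict]:
        fd, path = tempfile.mkstemp(prefix="molnix_paginated_", suffix=".jsonl")
        logger.warning("%s first iteration #%d streaming from source -> %s", self.name, n, path)
        with os.fdopen(fd, "w", encoding="utf-8") as cache:
            for record in self._source_factory():
                cache.write(json.dumps(record) + "\n")
                yield record
        # Publish the cache only after the source was fully consumed, so an
        # abandoned first pass never leaves a truncated cache in use.
        self._cache_path = path

    def _replay(self, n: int) -> Iterator[dict]:
        logger.warning("%s re-iteration #%d from cache %s", self.name, n, self._cache_path)
        with open(self._cache_path, encoding="utf-8") as cache:
            for line in cache:
                yield json.loads(line)

    def cleanup(self) -> None:
        """Remove the cache file once the caller is done with the data."""
        if self._cache_path is not None:
            os.remove(self._cache_path)
            self._cache_path = None


source_calls = []


def fake_source() -> Iterator[dict]:
    source_calls.append(1)  # record how often the source is actually hit
    yield from ({"id": i} for i in range(3))


deployments = CachedPaginatedSketch("deployments", fake_source)
first_ids = [d["id"] for d in deployments]   # streams from fake_source
second_ids = [d["id"] for d in deployments]  # replays from the JSONL cache
deployments.cleanup()
```

In this sketch an abandoned first pass leaves an orphaned temporary file behind; a production version would need to handle that case.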

| Wrapper | First iteration (from API) | Re-iterations (from cache) | Total `__iter__` calls | Call sites in `sync_molnix.py` |
| --- | --- | --- | --- | --- |
| `_CachedPaginated[deployments]` | 1 | 3 | 4 | `get_unique_tags` (L78) → `[d["id"] for d in ...]` (L260) → `[get_go_event(d["tags"]) for d in ...]` (L266) → `for md in molnix_deployments:` (L295) |
| `_CachedPaginated[positions]` | 1 | 2 | 3 | `get_unique_tags` (L83) → `[p["id"] for p in ...]` (L462) → `for position in molnix_positions:` (L468) |
| Total | 2 | 5 | 7 | |
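
The multiple passes listed above are the reason a bare generator is not enough on its own. A short illustration with invented records:

```python
def records():
    """Hypothetical record stream standing in for a paginated API result."""
    yield from ({"id": i} for i in range(3))


stream = records()
first_pass = [r["id"] for r in stream]
second_pass = [r["id"] for r in stream]  # the generator is already exhausted

print(first_pass)   # [0, 1, 2]
print(second_pass)  # []
```

A second pass over an exhausted generator silently yields nothing, so each later call site would see empty data unless the wrapper either re-fetches from the API or replays from a cache.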

With the cache, each re-iteration is a sequential read of a /tmp/molnix_paginated_*.jsonl file (deployments ~734 MB, positions ~890 MB).


thenav56 · 2026-06-16T05:41:58Z

Hey @szabozoltan69, when you have time, let's merge this and deploy it to staging, provided no live release from staging is planned soon.

It would be nice to test this in staging for a few days.

szabozoltan69 · 2026-06-16T06:07:44Z

Thanks so much @thenav56 for this great, gap-filling fix! I am deploying it to Staging now.

szabozoltan69 · 2026-06-16T06:51:17Z

Deployed to Staging.

thenav56 force-pushed the feat/memory-optimize-for-sync-molnix branch from b5b90ff to 37f1a26 Compare May 22, 2026 12:36

thenav56 added 2 commits June 16, 2026 09:51

feat(molnix): use generator with local cache

3226ea4

NOTE: This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.

feat(helm): lower the memory requirement for molnix

5586a5f

thenav56 force-pushed the feat/memory-optimize-for-sync-molnix branch from 37f1a26 to 5586a5f Compare June 16, 2026 05:25

thenav56 marked this pull request as ready for review June 16, 2026 05:27

szabozoltan69 merged commit 5f09713 into develop Jun 16, 2026
3 checks passed

szabozoltan69 deleted the feat/memory-optimize-for-sync-molnix branch June 16, 2026 06:07
