From 0e68bb01e89206d9123815cbda06188da090a60e Mon Sep 17 00:00:00 2001
From: Jake Jurek
Date: Sun, 7 Jun 2026 14:56:02 -0700
Subject: [PATCH] fix(clickhouse-ec2): disable text_log and lower log level in provisioning
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Stock ClickHouse ships at `trace` log level with system.text_log enabled
and no TTL. On the hosted instance text_log grew ~155 GiB in ~2 days and
filled the data volume, after which every INSERT failed with
NOT_ENOUGH_SPACE (Code 243). query_token/query_log writes from the query
service were blocked; reads were unaffected.

The running server was fixed by hand, but that fix lived only in the
container's writable layer because config.d was never bind-mounted, so it
would not survive an image bump or instance replacement. Bake it into
user-data instead, mirroring the existing users.d pattern:

- Write /etc/clickhouse-server/config.d/quiet-logs.xml setting the logger
  level to `information` and removing text_log.
- Bind-mount config.d (:ro) into the container so the override is applied.
- chown 101:101 so the container's clickhouse uid can read it (no secrets,
  so no chmod 600 like admin.xml).

System-log table TTLs (system.query_log etc.) are applied separately via
SQL; they persist on the preserved EBS data volume.

Co-Authored-By: Claude Opus 4.8
---
 .../modules/clickhouse-ec2/user-data.sh.tftpl | 22 +++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/infra/modules/clickhouse-ec2/user-data.sh.tftpl b/infra/modules/clickhouse-ec2/user-data.sh.tftpl
index aa4aaa9..651f0d0 100644
--- a/infra/modules/clickhouse-ec2/user-data.sh.tftpl
+++ b/infra/modules/clickhouse-ec2/user-data.sh.tftpl
@@ -64,6 +64,27 @@ EOF
 chown 101:101 /etc/clickhouse-server/users.d/admin.xml
 chmod 600 /etc/clickhouse-server/users.d/admin.xml
 
+# config.d overrides: server-level settings, merged with the image's default
+# config.xml at startup. Stock ClickHouse ships at `trace` log level with
+# system.text_log enabled and no TTL; on this box text_log grew ~155 GiB in
+# ~2 days and filled the data volume, after which every INSERT failed with
+# NOT_ENOUGH_SPACE (Code 243). Lower the level and disable text_log so the
+# server can't log itself to death on a fresh boot / image bump.
+mkdir -p /etc/clickhouse-server/config.d
+
+cat >/etc/clickhouse-server/config.d/quiet-logs.xml <<'EOF'
+<clickhouse>
+    <logger>
+        <level>information</level>
+    </logger>
+    <text_log remove="1"/>
+</clickhouse>
+EOF
+
+# Readable by the container's clickhouse uid 101 (no secrets here, so unlike
+# admin.xml this stays world-readable rather than chmod 600).
+chown 101:101 /etc/clickhouse-server/config.d/quiet-logs.xml
+
 # --network=host: nothing else on this box uses 8123 or 9000, so skip the
 # docker-proxy NAT layer. --restart=always: container survives reboots
 # (docker.service is enabled above). --ulimit nofile=262144: ClickHouse
@@ -76,4 +97,5 @@ docker run -d \
   --ulimit nofile=262144:262144 \
   -v /var/lib/clickhouse:/var/lib/clickhouse \
   -v /etc/clickhouse-server/users.d:/etc/clickhouse-server/users.d:ro \
+  -v /etc/clickhouse-server/config.d:/etc/clickhouse-server/config.d:ro \
   clickhouse/clickhouse-server:${clickhouse_version}