Skip to content

fix(mqtt): force instance replace on user_data change#72

Merged
BK1031 merged 1 commit into
mainfrom
bk1031/mqtt-user-data-replace-on-change
Jun 5, 2026
Merged

fix(mqtt): force instance replace on user_data change#72
BK1031 merged 1 commit into
mainfrom
bk1031/mqtt-user-data-replace-on-change

Conversation

@BK1031

@BK1031 BK1031 commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

The AWS provider's default for `aws_instance.user_data` is `user_data_replace_on_change = false`. terraform applies a user_data diff via `ModifyInstanceAttribute`, which stores the new user-data on the instance but doesn't re-execute cloud-init. The new file lands only on the next stop+start.

So the apply for #70 (which added the `mapache` user) updated state in place and never touched the running `/etc/nanomq_pwd.conf` — `mqtt pub -u mapache` got `BAD_USER_NAME_OR_PASSWORD` until I SSM'd in and patched the file by hand.

Setting `user_data_replace_on_change = true` makes user-data a force-replace attribute again, which is what the previous PR's removal of `user_data` from `lifecycle.ignore_changes` was actually trying to achieve.

Apply consequence

Nothing triggers on this PR alone. The running box was already manually patched via SSM, so state and code match the live `/etc/nanomq_pwd.conf`. The fix is forward-looking — the next legitimate user-data edit will trigger a clean instance replacement instead of the silent no-op trap.

Test plan

  • `terraform validate` + `fmt -check`
  • `terraform plan` after merge: zero changes
  • In a future PR that touches mqtt user-data: plan should show `module.mqtt.aws_instance.this` scheduled for replacement (not in-place update), and the apply should produce a fresh `/etc/nanomq_pwd.conf` automatically without the SSM dance

The AWS provider's default for aws_instance.user_data is
user_data_replace_on_change = false. terraform applies a user_data
diff via ModifyInstanceAttribute, which *stores* the new user-data but
doesn't re-execute cloud-init — the new file lands only on the next
stop+start. So the apply for PR #70 (which added the mapache user)
updated state in place and never touched the running /etc/nanomq_pwd.conf.

Setting user_data_replace_on_change = true makes user-data a force-
replace attribute again, which is what the previous PR's removal of
user_data from lifecycle.ignore_changes was actually trying to achieve.

This change won't trigger anything on its own (state and code match
the running box now that the running box was manually patched via SSM).
The next legitimate user-data edit will trigger a clean instance
replacement instead of the silent no-op trap.
@github-actions

github-actions Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Terraform plan: prod

step result
fmt success
init success
validate success
plan success
plan output
module.origin_cert.tls_private_key.this: Refreshing state... [id=4c5b5e0a4a4f13723ba2aebe888a7cb50529fcc0]
module.postgres.random_password.postgres: Refreshing state... [id=none]
module.clickhouse.random_password.admin: Refreshing state... [id=none]
module.mqtt.random_password.mqtt_tcm26: Refreshing state... [id=none]
module.mqtt.random_password.mqtt_mapache: Refreshing state... [id=none]
module.mqtt.random_password.mqtt: Refreshing state... [id=none]
data.cloudflare_zone.gauchoracing: Reading...
module.origin_cert.tls_cert_request.this: Refreshing state... [id=9398eb1d26e54eb59b18d476ced10810e80172e5]
module.origin_cert.cloudflare_origin_ca_certificate.this: Refreshing state... [id=307819530070461722629184968406649377122389744032]
module.eks.module.eks.aws_cloudwatch_log_group.this[0]: Refreshing state... [id=/aws/eks/gr-prod/cluster]
module.eks.module.eks.module.kms.data.aws_caller_identity.current[0]: Reading...
module.eks.module.eks.data.aws_caller_identity.current[0]: Reading...
module.eks.module.eks.data.aws_iam_policy_document.assume_role_policy[0]: Reading...
module.eks.module.eks.data.aws_partition.current[0]: Reading...
module.mqtt.data.aws_ami.al2023_arm64: Reading...
module.clickhouse.aws_ebs_volume.data: Refreshing state... [id=vol-0e312e8d71875ec89]
module.vpc.module.vpc.aws_vpc.this[0]: Refreshing state... [id=vpc-06e13a97395396a3b]
module.eks.module.eks.data.aws_iam_policy_document.assume_role_policy[0]: Read complete after 0s [id=2830595799]
module.eks.module.eks.data.aws_partition.current[0]: Read complete after 0s [id=aws]
module.eks.module.eks.module.kms.data.aws_partition.current[0]: Reading...
module.eks.module.eks.module.kms.data.aws_partition.current[0]: Read complete after 0s [id=aws]
module.postgres.aws_ebs_volume.data: Refreshing state... [id=vol-02b203218b57629b4]
module.clickhouse.data.aws_ami.al2023_arm64: Reading...
module.eks.module.eks.module.kms.data.aws_caller_identity.current[0]: Read complete after 0s [id=211125506628]
module.postgres.data.aws_ami.al2023_arm64: Reading...
module.eks.module.eks.data.aws_caller_identity.current[0]: Read complete after 0s [id=211125506628]
module.eks.module.eks.data.aws_iam_policy_document.node_assume_role_policy[0]: Reading...
module.eks.module.eks.data.aws_iam_policy_document.node_assume_role_policy[0]: Read complete after 0s [id=3518401652]
module.eks.module.eks.aws_iam_role.this[0]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002]
module.eks.module.eks.data.aws_iam_session_context.current[0]: Reading...
data.cloudflare_zone.gauchoracing: Read complete after 0s [id=5ac5ae9c6086e4b55c5e1b21ca963d94]
module.eks.module.eks.aws_iam_role.eks_auto[0]: Refreshing state... [id=gr-prod-eks-auto-20260601094833482500000004]
module.origin_cert.aws_acm_certificate.this: Refreshing state... [id=arn:aws:acm:us-west-2:211125506628:certificate/d10d5205-6d4b-4798-a152-293c69174660]
module.eks.module.eks.data.aws_iam_policy_document.custom[0]: Reading...
module.eks.module.eks.data.aws_iam_policy_document.custom[0]: Read complete after 0s [id=513122117]
cloudflare_ruleset.ssl_overrides: Refreshing state... [id=5a5ed8237d6f48418c172979ffe5da81]
module.eks.module.eks.aws_iam_policy.custom[0]: Refreshing state... [id=arn:aws:iam::211125506628:policy/gr-prod-cluster-20260601094833480900000001]
module.eks.module.eks.data.aws_iam_session_context.current[0]: Read complete after 0s [id=arn:aws:sts::211125506628:assumed-role/github-actions-terraform/GitHubActions]
module.clickhouse.data.aws_ami.al2023_arm64: Read complete after 0s [id=ami-0a2a049c945b84826]
module.postgres.data.aws_ami.al2023_arm64: Read complete after 0s [id=ami-0a2a049c945b84826]
module.mqtt.data.aws_ami.al2023_arm64: Read complete after 0s [id=ami-0a2a049c945b84826]
module.eks.module.eks.aws_iam_role_policy_attachment.this["AmazonEKSLoadBalancingPolicy"]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::aws:policy/AmazonEKSLoadBalancingPolicy]
module.eks.module.eks.aws_iam_role_policy_attachment.this["AmazonEKSNetworkingPolicy"]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::aws:policy/AmazonEKSNetworkingPolicy]
module.eks.module.eks.aws_iam_role_policy_attachment.this["AmazonEKSClusterPolicy"]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::aws:policy/AmazonEKSClusterPolicy]
module.eks.module.eks.aws_iam_role_policy_attachment.this["AmazonEKSComputePolicy"]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::aws:policy/AmazonEKSComputePolicy]
module.eks.module.eks.aws_iam_role_policy_attachment.custom[0]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::211125506628:policy/gr-prod-cluster-20260601094833480900000001]
module.eks.module.eks.module.kms.data.aws_iam_policy_document.this[0]: Reading...
module.eks.module.eks.aws_iam_role_policy_attachment.this["AmazonEKSBlockStoragePolicy"]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::aws:policy/AmazonEKSBlockStoragePolicy]
module.eks.module.eks.module.kms.data.aws_iam_policy_document.this[0]: Read complete after 0s [id=922405470]
module.eks.module.eks.module.kms.aws_kms_key.this[0]: Refreshing state... [id=7768801a-b38a-4c26-8bc3-7bf6fe2aac86]
module.eks.module.eks.aws_iam_role_policy_attachment.eks_auto["AmazonEC2ContainerRegistryPullOnly"]: Refreshing state... [id=gr-prod-eks-auto-20260601094833482500000004/arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly]
module.eks.module.eks.aws_iam_role_policy_attachment.eks_auto["AmazonEKSWorkerNodeMinimalPolicy"]: Refreshing state... [id=gr-prod-eks-auto-20260601094833482500000004/arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy]
module.vpc.module.vpc.aws_default_route_table.default[0]: Refreshing state... [id=rtb-08f817bde5f65eb92]
module.postgres.aws_security_group.this: Refreshing state... [id=sg-08a5b2e02e0540520]
module.vpc.module.vpc.aws_route_table.private[0]: Refreshing state... [id=rtb-0c18db918f54ad033]
module.vpc.module.vpc.aws_default_security_group.this[0]: Refreshing state... [id=sg-0a592b2169fd42df8]
module.vpc.module.vpc.aws_default_network_acl.this[0]: Refreshing state... [id=acl-0fd92b5b8eb95b2f9]
module.vpc.module.vpc.aws_route_table.public[0]: Refreshing state... [id=rtb-0815845194166b58b]
module.mqtt.aws_security_group.this: Refreshing state... [id=sg-0f5a2dc492283dafe]
module.vpc.module.vpc.aws_subnet.public[1]: Refreshing state... [id=subnet-0182a0562244a6fac]
module.vpc.module.vpc.aws_subnet.public[0]: Refreshing state... [id=subnet-0264aaaa19faa70f5]
module.vpc.module.vpc.aws_subnet.public[2]: Refreshing state... [id=subnet-0a5a8299f6da9bc59]
module.vpc.module.vpc.aws_subnet.private[0]: Refreshing state... [id=subnet-09fbaccd0b3aaab85]
module.vpc.module.vpc.aws_subnet.private[1]: Refreshing state... [id=subnet-022e58c410c24d794]
module.vpc.module.vpc.aws_subnet.private[2]: Refreshing state... [id=subnet-06540460bfd7d06a2]
module.vpc.module.vpc.aws_internet_gateway.this[0]: Refreshing state... [id=igw-0ace83040de106603]
module.eks.module.eks.module.kms.aws_kms_alias.this["cluster"]: Refreshing state... [id=alias/eks/gr-prod]
module.clickhouse.aws_security_group.this: Refreshing state... [id=sg-0dee964416e7aaeb5]
module.eks.module.eks.aws_security_group.cluster[0]: Refreshing state... [id=sg-0cac44db03a686436]
module.postgres.aws_security_group_rule.ingress_cidr[0]: Refreshing state... [id=sgrule-3721507580]
module.eks.module.eks.aws_security_group.node[0]: Refreshing state... [id=sg-0b19db83dbe18cbf1]
module.eks.module.eks.aws_iam_policy.cluster_encryption[0]: Refreshing state... [id=arn:aws:iam::211125506628:policy/gr-prod-cluster-ClusterEncryption20260601094855072000000006]
module.vpc.module.vpc.aws_route.public_internet_gateway[0]: Refreshing state... [id=r-rtb-0815845194166b58b1080289494]
module.mqtt.aws_security_group_rule.ingress_cidr[0]: Refreshing state... [id=sgrule-1294033340]
module.vpc.module.vpc.aws_eip.nat[0]: Refreshing state... [id=eipalloc-015b9b6ae09534761]
module.clickhouse.aws_security_group_rule.ingress_cidr["9000"]: Refreshing state... [id=sgrule-2026118668]
module.clickhouse.aws_security_group_rule.ingress_cidr["8123"]: Refreshing state... [id=sgrule-467233960]
module.vpc.module.vpc.aws_route_table_association.public[1]: Refreshing state... [id=rtbassoc-041ffe3137ec2ff7a]
module.vpc.module.vpc.aws_route_table_association.public[2]: Refreshing state... [id=rtbassoc-01d455080d37c3108]
module.vpc.module.vpc.aws_route_table_association.public[0]: Refreshing state... [id=rtbassoc-09913d53f1ff40442]
module.vpc.module.vpc.aws_route_table_association.private[2]: Refreshing state... [id=rtbassoc-0b736817e30c38444]
module.vpc.module.vpc.aws_route_table_association.private[0]: Refreshing state... [id=rtbassoc-0c836864d0eccc7fc]
module.vpc.module.vpc.aws_route_table_association.private[1]: Refreshing state... [id=rtbassoc-07ea61af9805b07cb]
module.eks.module.eks.aws_security_group_rule.node["ingress_self_coredns_tcp"]: Refreshing state... [id=sgrule-1577514230]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_8443_webhook"]: Refreshing state... [id=sgrule-3709110977]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_kubelet"]: Refreshing state... [id=sgrule-2079615841]
module.eks.module.eks.aws_security_group_rule.node["egress_all"]: Refreshing state... [id=sgrule-4004824215]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_10251_webhook"]: Refreshing state... [id=sgrule-1337801498]
module.eks.module.eks.aws_security_group_rule.node["ingress_self_coredns_udp"]: Refreshing state... [id=sgrule-4200159001]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_4443_webhook"]: Refreshing state... [id=sgrule-118149494]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_443"]: Refreshing state... [id=sgrule-500562133]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_9443_webhook"]: Refreshing state... [id=sgrule-3115694346]
module.eks.module.eks.aws_security_group_rule.node["ingress_nodes_ephemeral"]: Refreshing state... [id=sgrule-504933240]
module.eks.module.eks.aws_security_group_rule.node["ingress_cluster_6443_webhook"]: Refreshing state... [id=sgrule-3460871904]
module.eks.module.eks.aws_security_group_rule.cluster["ingress_nodes_443"]: Refreshing state... [id=sgrule-535925259]
module.clickhouse.aws_instance.this: Refreshing state... [id=i-0f862cc6460b5d98a]
module.postgres.aws_instance.this: Refreshing state... [id=i-013aab40e28a0b6b7]
module.mqtt.aws_instance.this: Refreshing state... [id=i-0bf98528bc8e9dab0]
module.eks.module.eks.aws_iam_role_policy_attachment.cluster_encryption[0]: Refreshing state... [id=gr-prod-cluster-20260601094833481300000002/arn:aws:iam::211125506628:policy/gr-prod-cluster-ClusterEncryption20260601094855072000000006]
module.vpc.module.vpc.aws_nat_gateway.this[0]: Refreshing state... [id=nat-019992cce709b8681]
module.clickhouse.aws_security_group_rule.ingress_sg["sg-0b19db83dbe18cbf1-8123"]: Refreshing state... [id=sgrule-4177740833]
module.mqtt.aws_security_group_rule.ingress_sg["sg-0b19db83dbe18cbf1"]: Refreshing state... [id=sgrule-1062873908]
module.postgres.aws_security_group_rule.ingress_sg["sg-0b19db83dbe18cbf1"]: Refreshing state... [id=sgrule-1130224857]
module.clickhouse.aws_security_group_rule.ingress_sg["sg-0b19db83dbe18cbf1-9000"]: Refreshing state... [id=sgrule-2466486093]
module.vpc.module.vpc.aws_route.private_nat_gateway[0]: Refreshing state... [id=r-rtb-0c18db918f54ad0331080289494]
module.eks.module.eks.aws_eks_cluster.this[0]: Refreshing state... [id=gr-prod]
module.eks.module.eks.data.tls_certificate.this[0]: Reading...
module.eks.module.eks.aws_eks_access_entry.this["arn-aws-iam--211125506628-user-admin-cli"]: Refreshing state... [id=gr-prod:arn:aws:iam::211125506628:user/admin-cli]
module.eks.module.eks.aws_eks_access_entry.this["arn-aws-iam--211125506628-role-github-actions-terraform"]: Refreshing state... [id=gr-prod:arn:aws:iam::211125506628:role/github-actions-terraform]
module.eks.module.eks.time_sleep.this[0]: Refreshing state... [id=2026-06-01T10:46:10Z]
module.eks.module.eks.data.tls_certificate.this[0]: Read complete after 0s [id=f97f646c2cd14cc0db0f757f0fccc96abbbe2af5]
module.eks.module.eks.aws_iam_openid_connect_provider.oidc_provider[0]: Refreshing state... [id=arn:aws:iam::211125506628:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/21512EE80634956C7C9D0B9647C70224]
module.eks.module.eks.aws_eks_access_policy_association.this["arn-aws-iam--211125506628-role-github-actions-terraform_admin"]: Refreshing state... [id=gr-prod#arn:aws:iam::211125506628:role/github-actions-terraform#arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy]
module.eks.module.eks.aws_eks_access_policy_association.this["arn-aws-iam--211125506628-user-admin-cli_admin"]: Refreshing state... [id=gr-prod#arn:aws:iam::211125506628:user/admin-cli#arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy]
module.argocd.helm_release.argocd: Refreshing state... [id=argocd]
module.mqtt.aws_eip.this[0]: Refreshing state... [id=eipalloc-000796d397533c6d8]
cloudflare_dns_record.gr_mqtt: Refreshing state... [id=53badd7d1ab83280dc57c671ee486f90]
module.postgres.aws_volume_attachment.data: Refreshing state... [id=vai-3584522434]
module.postgres.aws_eip.this[0]: Refreshing state... [id=eipalloc-06d6c59b1e0a49482]
module.clickhouse.aws_volume_attachment.data: Refreshing state... [id=vai-843739595]
module.clickhouse.aws_eip.this[0]: Refreshing state... [id=eipalloc-0970fced638b7d8d9]
cloudflare_dns_record.gr_clickhouse: Refreshing state... [id=d6f3d3c09db5c56703a7b107d6ac3f29]
cloudflare_dns_record.gr_postgres: Refreshing state... [id=a69d68353c27a536e84d7448af48e3f0]

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.mqtt.aws_instance.this will be updated in-place
  ~ resource "aws_instance" "this" {
        id                                   = "i-0bf98528bc8e9dab0"
        tags                                 = {
            "Name" = "gr-mqtt"
            "Role" = "nanomq"
        }
      ~ user_data_replace_on_change          = false -> true
        # (40 unchanged attributes hidden)

        # (9 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.

@BK1031 BK1031 merged commit 702f2a1 into main Jun 5, 2026
1 check passed
@BK1031 BK1031 deleted the bk1031/mqtt-user-data-replace-on-change branch June 5, 2026 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant