Issue creating a cluster with the OpenStack CCM as an additional manifest using the rancher2 Terraform provider

I am in the process of setting up a cluster using the rancher2 Terraform provider. However, I want to disable the built-in cloud controller and instead install the external cloud controller manager for OpenStack.
As recommended in this discussion (https://github.com/rancher/rke2/discussions/6131), I want to install it via the additional_manifest field. However, my cluster gets stuck at "Waiting for cluster agent to connect". This doesn't happen when I omit the additional manifest and the disable-cloud-controller: true setting in the machine_global_config.


resource "rancher2_cluster_v2" "cluster" {
  name                = var.cluster_name
  kubernetes_version  = "v1.30.8+rke2r1"
  enable_network_policy = false
  
  rke_config {
    machine_global_config = <<-EOF
  kubelet-arg:
    - cloud-provider=external
  disable-cloud-controller: true
  EOF

    additional_manifest = <<-EOF
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: cloud-config
    namespace: kube-system
  type: Opaque
  stringData:
    # Plain-text cloud config; Kubernetes base64-encodes stringData into data itself
    cloud.conf: |
      [Global]
      auth_url="${var.openstack_auth_url}"
      region="${var.openstack_region}"
      user_domain_name="${var.openstack_user_domain_name}"
      tenant_name="${var.openstack_tenant_name}"
      application_credential_id="${var.openstack_application_credential_id}"
      application_credential_secret="${var.openstack_application_credential_secret}"

      [LoadBalancer]
      use-octavia=true
      subnet-id="${data.terraform_remote_state.network.outputs.subnet_id}"
      floating-network-id="${data.terraform_remote_state.network.outputs.external_network_id}"
      lb-provider="amphora"
      lb-method=ROUND_ROBIN
      lb-create-monitor=true
      manage-security-groups=true
  ---
  apiVersion: helm.cattle.io/v1
  kind: HelmChart
  metadata:
    name: openstack-cloud-controller-manager
    namespace: kube-system
  spec:
    chart: openstack-cloud-controller-manager
    repo: https://kubernetes.github.io/cloud-provider-openstack
    targetNamespace: kube-system
    bootstrap: true
    valuesContent: |-
      cloud-config:
        secret: cloud-config
  EOF
    machine_pools {
      name               = "control-plane-etcd"
      control_plane_role = true
      etcd_role          = true
      worker_role        = false
      quantity           = 1
      drain_before_delete = true
      machine_config {
        kind = rancher2_machine_config_v2.machine_config.kind
        name = rancher2_machine_config_v2.machine_config.name
      }
    }

    machine_pools {
      name               = "worker"
      control_plane_role = false
      etcd_role          = false
      worker_role        = true
      quantity           = 2
      drain_before_delete = true
      machine_config {
        kind = rancher2_machine_config_v2.machine_config.kind
        name = rancher2_machine_config_v2.machine_config.name
      }
    }
  }
}

Does anybody have experience with this?

Based on your config I was able to get my cluster up and running. You have made two mistakes in the Helm chart values. The correct settings are:

  1. secret.create: false
  2. nodeSelector.node-role.kubernetes.io/control-plane: "true"

These are needed because you are creating your own cloud config and don't want the chart to create the same Secret itself. Setting secret.create: false ensures that and makes the chart use your config as is. The default secret name is cloud-config, so you can omit it from the values. And the node selector in the chart defaults to an empty string, but in the case of Rancher it must be set to "true".
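
For reference, here is a sketch of how the HelmChart document in additional_manifest could look with those two values applied. The value names follow the upstream openstack-cloud-controller-manager chart's values.yaml; double-check them against the chart version you deploy:

  ---
  apiVersion: helm.cattle.io/v1
  kind: HelmChart
  metadata:
    name: openstack-cloud-controller-manager
    namespace: kube-system
  spec:
    chart: openstack-cloud-controller-manager
    repo: https://kubernetes.github.io/cloud-provider-openstack
    targetNamespace: kube-system
    bootstrap: true
    valuesContent: |-
      secret:
        create: false        # use the cloud-config Secret you create yourself
        name: cloud-config   # optional, this is already the chart default
      nodeSelector:
        node-role.kubernetes.io/control-plane: "true"

With secret.create: false the chart only mounts the existing Secret instead of templating a new one, and the node selector matches the control-plane label that RKE2 sets on Rancher-provisioned control-plane nodes.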