CPU pinning or affinity on Harvester 1.3.2

I’m trying to set up CPU affinity as described in Mr Durden’s instructions here: https://github.com/bcdurden/affinity_on_harvester

My one-node cluster does not show the changes to the CPU manager after a reboot:

harvester01:~ # cat /var/lib/kubelet/cpu_manager_state
{"policyName":"none","defaultCpuSet":"","checksum":1353318690}

I have confirmed that the changes to /oem/90_custom.yaml are propagated to 90-harvester-server.yaml and 90-harvester-agent.yaml after a reboot of the node:

harvester01:~ # cat /etc/rancher/rke2/config.yaml.d/90-harvester-server.yaml
cni: multus,canal
cluster-cidr: 10.52.0.0/16
service-cidr: 10.53.0.0/16
cluster-dns: 10.53.0.10
tls-san:
  - 10.0.111.4
kubelet-arg:
- "max-pods=200"
- "cpu-manager-policy=static"
- "cpu-manager-policy-options=full-pcpus-only=true"
- "cpu-manager-reconcile-period=0s"
- "reserved-cpus=2"
audit-policy-file: /etc/rancher/rke2/config.yaml.d/92-harvester-kube-audit-policy.yaml
harvester01:~ # cat /etc/rancher/rke2/config.yaml.d/90-harvester-agent.yaml
kubelet-arg:
- "max-pods=200"
- "cpu-manager-policy=static"
- "cpu-manager-policy-options=full-pcpus-only=true"
- "cpu-manager-reconcile-period=0s"
- "reserved-cpus=2"

The top of my /oem/90_custom.yaml:

name: Harvester Configuration
stages:
    initramfs:
        - commands:
            - modprobe kvm
            - modprobe vhost_net
            - sed -i 's/^NETCONFIG_DNS_STATIC_SERVERS.*/NETCONFIG_DNS_STATIC_SERVERS="10.0.111.10"/' /etc/sysconfig/network/config
            - rm -f /etc/sysconfig/network/ifroute-mgmt-br
            - rm /var/lib/kubelet/cpu_manager_state || true
            - sysctl -w vm.nr_hugepages=16784

and my KubeVirt config:

spec:
  certificateRotateStrategy: {}
  configuration:
    developerConfiguration:
      featureGates:
      - LiveMigration
      - HotplugVolumes
      - HostDevices
      - GPU
      - NUMA
      - CPUManager

Any idea where to look? It seems the CPU manager is not picking up the config changes. The same changes worked on 1.4.0-dev, by the way.

The issue you’re seeing, cpu-manager-policy not being applied after a reboot on v1.3.2, is most likely a regression or a timing issue in how the kubelet state is initialized on Harvester’s immutable OS. Here are the key troubleshooting steps and observations.

Likely root cause:

Your /oem/90_custom.yaml includes rm /var/lib/kubelet/cpu_manager_state || true in initramfs commands, which should delete the stale state file before kubelet starts. However, in Harvester 1.3.x, the initramfs stage may run at a different point relative to when RKE2/kubelet initializes, causing the state file to be regenerated with the default none policy before the kubelet args take effect.
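
One way to test that timing hypothesis after a reboot is to compare the state file's timestamp against the rke2-server start time: if the file is newer than the service start and still shows policyName none, it was regenerated by a kubelet that never saw the static policy. A rough sketch with standard tools (unit and file names taken from your output above):

# when was the state file (re)created?
stat -c '%y %n' /var/lib/kubelet/cpu_manager_state

# when did rke2-server (which spawns the kubelet) start?
systemctl show rke2-server --property=ActiveEnterTimestamp

# when did the node last boot?
uptime -s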

Checks to perform:

1. Verify the kubelet is actually receiving the args:

   ps aux | grep kubelet | grep cpu-manager

   If `cpu-manager-policy=static` doesn't appear in the output, the kubelet args from `90-harvester-server.yaml` are not being passed through correctly.
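
   If the args are being applied, the kubelet command line should contain flags mirroring the entries in your config, roughly as below (this list is derived from your 90-harvester-server.yaml, not from a live node):

   --max-pods=200
   --cpu-manager-policy=static
   --cpu-manager-policy-options=full-pcpus-only=true
   --cpu-manager-reconcile-period=0s
   --reserved-cpus=2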
    
2. Check the kubelet arg config path. In Harvester v1.3.x, the kubelet args in the RKE2 config may be processed differently than in v1.4+, so verify that /etc/rancher/rke2/config.yaml.d/90-harvester-server.yaml is actually read at startup:

   journalctl -u rke2-server | grep cpu-manager
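
   The kubelet's own log is usually more direct than the rke2-server journal for this. On RKE2-based nodes it is typically written to /var/lib/rancher/rke2/agent/logs/kubelet.log (assumed path; adjust if it differs on your node). The kubelet also refuses to start the CPU manager when the configured policy differs from an existing state file, which is exactly why that file has to be removed, so grep for both:

   # look for CPU manager startup messages and state/checkpoint errors (log path assumed)
   grep -i 'cpu manager' /var/lib/rancher/rke2/agent/logs/kubelet.log
   grep -i 'checkpoint\|cpu_manager_state' /var/lib/rancher/rke2/agent/logs/kubelet.log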
    
    
3. State file timing. The cpu_manager_state file must be deleted before the kubelet starts. If your initramfs commands run before the kubelet starts but after the file was created by the previous boot, that should work; as a fallback, also delete it in the boot stage:

   stages:
     boot:
       - commands:
         - rm -f /var/lib/kubelet/cpu_manager_state
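
   Once the static policy actually takes effect, /var/lib/kubelet/cpu_manager_state should be regenerated with policyName set to static, along these lines (defaultCpuSet and checksum are purely illustrative and will differ on your node):

   {"policyName":"static","defaultCpuSet":"0-15","checksum":2519856613}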
        
4. KubeVirt CPUManager feature gate. Your KubeVirt config has CPUManager in featureGates, which is good. Also verify that the KubeVirt cpuAllocationRatio is set appropriately for dedicated CPU pinning (see the sketch below).
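
   For reference, cpuAllocationRatio lives alongside the feature gates in the KubeVirt CR; the value below is only an illustration, not a recommendation:

   spec:
     configuration:
       developerConfiguration:
         featureGates:
         - CPUManager
         # illustrative value, tune for your workloads
         cpuAllocationRatio: 1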
5. Note on v1.3.2 vs v1.4.0. Since this worked for you on v1.4.0-dev, it points to a bug in v1.3.2’s handling of kubelet args or the initramfs stage ordering. Consider upgrading to v1.4.x or later, where this was fixed.

Reference: Official Harvester docs > Advanced > Single Node Clusters and the Harvester GitHub issue tracker