Update the image and other properties of worker machines

Learn how to update the image and other properties of worker machines

If your cluster workload changes, you may need a different instance type for your worker machines. To keep the operating system of your worker machines up to date, use a machine image that includes a more recent patch version of the operating system.

By default, Konvoy groups your worker machines in the worker node pool. If you change properties of the machines and apply the change, the machines may be destroyed and re-created, disrupting their running workloads.

This tutorial describes how to update the properties of worker machines without disrupting your cluster workload. You create a new node pool with up-to-date properties, move your workload from the worker node pool to the new node pool, and then scale down the worker node pool.

Follow these steps:

  1. Use this command to list all node pools, and identify the node pool with worker machines:

    konvoy get nodepools
    

    NOTE: If your workers are grouped in the worker node pool, the default for a Konvoy cluster, skip this step. If your cluster uses a different worker node pool, use that node pool name instead of worker in the following steps.

  2. Create a new node pool, called worker2, copying the properties of the worker node pool.

    konvoy create nodepool worker2 --from worker
    
  3. Edit cluster.yaml to change the machine image and other properties of the worker2 node pool if needed. If necessary, update the count.

    This is an excerpt of an edited cluster.yaml. Compared to the worker node pool, the worker2 node pool has twice as many nodes, uses a different instance type and a different machine image, and allocates twice as much space for image and container storage.

    kind: ClusterProvisioner
    apiVersion: konvoy.mesosphere.io/v1beta2
    spec:
      nodePools:
      - name: worker
        count: 4
        machine:
          rootVolumeSize: 80
          rootVolumeType: gp2
          imagefsVolumeEnabled: true
          imagefsVolumeSize: 160
          imagefsVolumeType: gp2
          imagefsVolumeDevice: xvdb
          type: m5.2xlarge
          imageID: ami-01ed306a12b7d1c96
      - name: worker2
        count: 8
        machine:
          rootVolumeSize: 80
          rootVolumeType: gp2
          imagefsVolumeEnabled: true
          imagefsVolumeSize: 320
          imagefsVolumeType: gp2
          imagefsVolumeDevice: xvdb
          type: p2.xlarge
          imageID: ami-079f731edfe27c29c
    
  4. Apply the change to your infrastructure:

    konvoy up
    
  5. Move your workload from the machines in the worker node pool to the machines in the worker2 node pool. For more information on draining, see Safely Drain a Node.

    konvoy drain nodepool worker
    
  6. Verify your workload has been rescheduled and is healthy. To list all Pods that are not Running, use this command:

    kubectl get pods --all-namespaces=true --field-selector=status.phase!=Running
    

    NOTE: No single method applies to all workloads. A Pod that is not Running can be, but is not always, a sign of an unhealthy workload. We recommend you implement application health checks and exercise them when migrating your workload from one node pool to another. For information on implementing health checks in Kubernetes, see Configure Liveness, Readiness and Startup Probes.

  7. Scale down the worker node pool to zero.

    konvoy scale nodepool worker --count=0
    konvoy up
    

    NOTE: Due to a known issue, Konvoy does not currently support deleting a node pool from its configuration.
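
As the note in step 6 says, a Pod that is not Running is not always unhealthy. For example, Pods of finished Jobs are normally listed with the status Completed. The following is a minimal sketch, not part of Konvoy or kubectl, of one way to narrow the listing to Pods that likely need a closer look after the drain. It assumes the default column layout of `kubectl get pods --all-namespaces --no-headers` (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). The helper name `pods_needing_attention` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Konvoy).
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin.
# Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE
pods_needing_attention() {
  awk '
    {
      # READY looks like "1/2": ready containers / total containers.
      split($3, ready, "/")

      # Pods of finished Jobs are expected to be not Running; skip them.
      if ($4 == "Completed") next

      # Report Pods that are not Running, or Running but not fully ready.
      if ($4 != "Running" || ready[1] != ready[2]) {
        print $1 "/" $2 " " $4 " " $3
      }
    }'
}

# Usage against a live cluster (not executed by this sketch):
#   kubectl get pods --all-namespaces --no-headers | pods_needing_attention
```

An empty result only means that no Pod looks unscheduled or unready. It does not replace the application health checks recommended in step 6.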