Manage your Node Pools
What to know about Node Pools
- Node pools allow you to group one or more Crusoe Cloud instances of the same type and associate them with a specific cluster control plane, where they function as worker nodes.
- When creating a node pool, you must specify the number of VMs you want to create via a `count` field. Once specified, the node pool will maintain this VM count where possible. For example, if you stop or terminate one of the VMs in your node pool, the node pool will provision new VMs up to the specified `count`.
- You may scale your node pool up or down by specifying a new `count` value. Note that setting a `count` value lower than the current number of VMs does not automatically delete VMs from your node pool; you must manually delete the instances you want removed.
- We currently do not allow editing the startup script or image associated with node pools. To install packages or software on node bring-up, we recommend using DaemonSets.
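As one illustration of the DaemonSet approach, the manifest below runs a one-time setup step on each node as it joins the pool. This is a generic sketch, not a Crusoe-specific recommendation: the image, namespace, and install command are placeholders, and the `nsenter` pattern assumes the host runs a Debian/Ubuntu-based OS with `apt-get` available.

```yaml
# Hypothetical DaemonSet that installs a package on the host of every node at bring-up.
# Image, namespace, and package are placeholders; adapt to your environment.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-setup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-setup
  template:
    metadata:
      labels:
        app: node-setup
    spec:
      hostPID: true                    # lets nsenter reach the host's namespaces
      initContainers:
        - name: install
          image: ubuntu:22.04          # placeholder image (includes nsenter via util-linux)
          securityContext:
            privileged: true           # required for entering host namespaces
          # Enter the host mount namespace (PID 1) and install on the host itself,
          # not inside the container. Assumes an apt-based host OS.
          command: ["nsenter", "--target", "1", "--mount", "--",
                    "sh", "-c", "apt-get update && apt-get install -y htop"]
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # keeps the pod alive after setup completes
```

Because a DaemonSet schedules one pod per node, any node the pool provisions later (for example, a replacement VM created to maintain `count`) gets the same setup automatically.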
Creating a New Node Pool
- CLI
- UI
- Terraform
Node pools can be created using the kubernetes nodepools create command. Node pools must be created in the context of a specific cluster. Use the `--help` flag for an exhaustive list of options.
crusoe kubernetes nodepools create \
--name my-first-nodepool \
--cluster-id 6f8e2a1b-7b1d-4c8e-a9f2-8e3d6c1f2a0c \
--type h100-80gb-sxm-ib.8x \
--count 4 \
--ib-partition-id 4c8e2a1b-7b1d-4c8e-a9f2-8e3d6c1f2a0c
In order to create a node pool via the cloud console:
- Visit the Crusoe Cloud console
- Click the "Orchestration" tab in the left nav
- Select the cluster you want to edit
- Click the "Create Node Pool" button
- Fill out the required fields specifying the type of nodes you want to create and the count
- Click the "Create" button
You can use the crusoe_kubernetes_node_pool resource to create a new node pool in the context of a specific cluster via Terraform.
terraform {
required_providers {
crusoe = {
source = "crusoecloud/crusoe"
}
}
}
locals {
my_ssh_key = file("~/.ssh/id_ed25519.pub") # replace with path to your public SSH key if different
}
resource "crusoe_kubernetes_cluster" "my_first_cluster" {
name = "tf-cluster"
version = "1.31.7-cmk.x" # Replace with the version you want
location = "us-east1-a"
subnet_id = "6f8e2a1b-7b1d-4c8e-a9f2-8e3d6c1f2a0c"
add_ons = ["nvidia_gpu_operator","nvidia_network_operator","crusoe_csi"]
}
resource "crusoe_kubernetes_node_pool" "l40s_nodepool" {
name = "tf-l40s-nodepool"
cluster_id = crusoe_kubernetes_cluster.my_first_cluster.id
instance_count = "4"
type = "l40s-48gb.10x"
ssh_key = local.my_ssh_key
version = crusoe_kubernetes_cluster.my_first_cluster.version
requested_node_labels = {
# Optional: Kubernetes Node objects will be labeled with the following key:value pairs
# "labelkey" = "labelvalue"
}
}
Viewing Existing Node Pools
- CLI
- UI
Use the kubernetes nodepools list command to list all existing node pools across clusters. You can also use the `kubernetes nodepools get` command to retrieve individual node pool details.
crusoe kubernetes nodepools get <name/id>
To view your node pools via the Crusoe Cloud console:
- Visit the Crusoe Cloud console
- Click the "Orchestration" tab in the left nav
- Select the cluster you are interested in
- You will see a list of node pools in the 'Node Pools' section of the cluster details view.
Update your Node Pool
You can update the following properties of your node pool:
- Change the count of nodes in the node pool
- Set the pool to use (or not use) local ephemeral NVMe for containerd storage (CLI and Terraform only)
- Set the labels for your pool. The newly provided labels will overwrite all old labels. (CLI and Terraform only)
- Change the version of the pool's Kubernetes worker nodes. You can get a list of available versions by running `crusoe kubernetes versions list` from the Crusoe CLI. (CLI and Terraform only)
Updating properties of a nodepool will update the nodepool template, but will not perform an in-place upgrade of the existing nodes in the pool. If you scale the nodepool after applying an update, the updated template will only apply to the newly created nodes. To update existing nodes, you must perform a rolling upgrade (see section below).
- CLI
- UI
- Terraform
You may update the number of nodes in your node pool using the crusoe kubernetes nodepools update command.
crusoe kubernetes nodepools update <name/id> --count 3
For instructions on how to update additional elements of your nodepool (e.g. ephemeral storage for containerd, nodepool labels), run:
crusoe kubernetes nodepools update -h
To update the number of VMs in your node pool using the cloud console:
- Visit the Crusoe Cloud console
- Click the "Orchestration" tab in the left nav
- Select the cluster you want to update
- Select the edit icon next to the node pool you want to update
- Specify the new total number of nodes you want in the pool
- Click the "Update" button
resource "crusoe_kubernetes_node_pool" "l40s_nodepool" {
name = "tf-l40s-nodepool"
cluster_id = crusoe_kubernetes_cluster.my_first_cluster.id
# Optional: Set the desired instance count
instance_count = "6" # add 2 more nodes
# Optional: Set the desired CMK worker node version
# If not specified, the default is the latest stable version compatible with the cluster
# List available node pool versions with "crusoe kubernetes versions list"
version = "1.31.7-cmk.x" # Replace with the version you want
type = "l40s-48gb.10x"
# Optional: Kubernetes Node objects will be labeled with the following key:value pairs
# requested_node_labels = {
# "labelkey" = "labelvalue"
# }
# Optional: Use local ephemeral NVMe disks for containerd storage
# ephemeral_storage_for_containerd = true
}
Perform a Rolling Upgrade on your Node Pool
Rolling upgrades of nodepools is currently in Limited Availability and only offered through the Crusoe CLI and Terraform. If you would like to join the Limited Availability program and gain access to this feature, please contact customer support to request access.
If you have updated your nodepool template but still have existing nodes from an older template, you can perform a rolling upgrade of your nodepool to update existing nodes to the latest configuration.
Note: A rolling upgrade requires you to specify either a batch size (the number of nodes to upgrade at a time) or a batch percentage (the percent of the total node pool size to upgrade at a time). During the upgrade process, your node pool will be reduced by that many nodes at a time. For minimal workload interruption, use a batch size of 1.
- CLI
- Terraform
Once you have applied an update to your nodepool you can trigger a rollout of your updates to existing nodes in the pool. You must specify either a --batch-size or --batch-percentage flag to instruct the rollout process how to process your rollout. The limit to the number of nodes that can be upgraded at a time is 10. If you supply either a higher --batch-size than 10 or a --batch-percentage that results in a number greater than 10, your rollout will not start.
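To make the limit concrete, the arithmetic below shows how a batch percentage translates into a node count. The pool sizes are hypothetical, and the exact rounding behavior the CLI applies is an assumption here (integer division, rounding down):

```shell
# Illustrative arithmetic only; these values are hypothetical, not CLI output.
POOL_SIZE=40
# 10% of a 40-node pool -> 4 nodes per batch, within the limit of 10.
echo $(( POOL_SIZE * 10 / 100 ))
# 30% of a 40-node pool -> 12 nodes per batch, which exceeds the limit of 10,
# so a rollout requesting --batch-percentage 30 on this pool would not start.
echo $(( POOL_SIZE * 30 / 100 ))
```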
Example: Run a rolling upgrade, one node at a time
crusoe kubernetes nodepools rollout start <node-pool-name/node-pool-id> --batch-size 1
Example: Run a rolling upgrade, 10% of the nodepool at a time
crusoe kubernetes nodepools rollout start <node-pool-name/node-pool-id> --batch-percentage 10
You can monitor the status of the last initiated rollout.
crusoe kubernetes nodepools rollout status <node-pool-name/node-pool-id>
You can cancel an in-progress rollout. Any in-progress node upgrades will complete before the operation is cancelled. Subsequent rollouts on the same node pool will pick up where the cancelled operation left off.
crusoe kubernetes nodepools rollout cancel <node-pool-name/node-pool-id>
When updating your Terraform config, if you specify batch_size or batch_percentage, the applied configuration changes will be rolled out to existing nodes.
resource "crusoe_kubernetes_node_pool" "l40s_nodepool" {
name = "tf-l40s-nodepool"
cluster_id = crusoe_kubernetes_cluster.my_first_cluster.id
# Optional: Set the desired instance count
instance_count = "6" # add 2 more nodes
# Optional: Set the desired CMK worker node version
# If not specified, the default is the latest stable version compatible with the cluster
# List available node pool versions with "crusoe kubernetes versions list"
version = "1.31.7-cmk.x" # Replace with the version you want
type = "l40s-48gb.10x"
# Optional: Kubernetes Node objects will be labeled with the following key:value pairs
# requested_node_labels = {
# "labelkey" = "labelvalue"
# }
# Optional: Use local ephemeral NVMe disks for containerd storage
# ephemeral_storage_for_containerd = true
# Optional: Control the number of nodes to delete and recreate in batches when updating the node pool.
# If omitted, any existing nodes will not be updated, but future ones will use the new config.
# batch_size = 10 # The number of nodes to replace at a time
# batch_percentage = 100 # The percentage of nodes to replace at a time
}
Delete a Node Pool
- CLI
- UI
- Terraform
You can delete node pools using the kubernetes nodepools delete command, specifying either the name or ID of the node pool you want to delete.
crusoe kubernetes nodepools delete <name/id>
In order to delete a node pool via the cloud console:
- Visit the Crusoe Cloud console
- Click the "Orchestration" tab in the left nav
- Select the cluster you want to update
- Select the delete icon next to the node pool you want to delete
Node pools can be deleted by removing the corresponding crusoe_kubernetes_node_pool resource from your Terraform configuration and applying the change.
If you are having issues creating or deleting node pools, please contact support.