
Cluster Details

This page describes key aspects of CMK: currently supported versions, cluster components, and other operational details.

Version Support

CMK currently supports Kubernetes version 1.30 for clusters, with support for 1.31 and 1.32 planned. CMK versions carry a -cmk.x suffix, where x is a monotonically increasing counter, starting at 0, that identifies Crusoe-specific patch releases.
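
Because x is a counter rather than a string, tooling that compares CMK versions should compare it numerically. The following is a minimal sketch, assuming version strings of the form `<kubernetes version>-cmk.<x>` such as `1.30.8-cmk.2`; the exact shape of the Kubernetes portion and the helper names are assumptions for illustration, not part of the CMK API.

```python
import re
from typing import NamedTuple


class CmkVersion(NamedTuple):
    kubernetes: str  # upstream Kubernetes version, e.g. "1.30.8" (assumed format)
    cmk_patch: int   # Crusoe-specific patch counter from the -cmk.x suffix


# Hypothetical pattern: optional "v", a dotted Kubernetes version, then "-cmk.<x>".
_CMK_VERSION = re.compile(r"^v?(?P<k8s>\d+\.\d+(?:\.\d+)?)-cmk\.(?P<patch>\d+)$")


def parse_cmk_version(version: str) -> CmkVersion:
    """Split a version string into its Kubernetes part and CMK patch counter."""
    match = _CMK_VERSION.match(version)
    if match is None:
        raise ValueError(f"not a CMK version string: {version!r}")
    return CmkVersion(match.group("k8s"), int(match.group("patch")))


def sort_key(version: str) -> tuple:
    """Order by Kubernetes version first, then by the numeric CMK patch counter."""
    parsed = parse_cmk_version(version)
    k8s_parts = tuple(int(part) for part in parsed.kubernetes.split("."))
    return (k8s_parts, parsed.cmk_patch)


# Illustrative version strings only; these are not a list of real releases.
versions = ["1.30.8-cmk.0", "1.30.8-cmk.10", "1.30.8-cmk.2"]
newest = max(versions, key=sort_key)
```

Comparing the suffix as an integer avoids the lexical trap where `cmk.10` would sort before `cmk.2`.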

Cluster Components

CMK's cluster distribution closely follows upstream Kubernetes. We aim for our distribution to be as 'standard' as possible to simplify installation and configuration of packages and libraries typically used when building AI workloads.

Component                          Details
Container Runtime                  containerd
Cluster DNS                        CoreDNS
Container Network Interface (CNI)  Cilium
Worker Node OS                     Ubuntu 22.04
GPU / Network Driver Installation  Via operators, available through cluster add-ons
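
One way to confirm the runtime and OS on your own worker nodes is to read the `status.nodeInfo` fields that Kubernetes reports for every node. The sketch below assumes you have saved the output of `kubectl get nodes -o json` and parsed it into a dictionary; the `summarize_nodes` helper and the sample node data are hypothetical and only illustrate the shape of the response.

```python
import json


def summarize_nodes(node_list: dict) -> list:
    """Extract runtime, OS image and kubelet version from a Kubernetes NodeList."""
    rows = []
    for node in node_list.get("items", []):
        info = node["status"]["nodeInfo"]
        rows.append({
            "name": node["metadata"]["name"],
            # containerRuntimeVersion looks like "containerd://<version>"
            "runtime": info["containerRuntimeVersion"].split("://")[0],
            "os_image": info["osImage"],
            "kubelet": info["kubeletVersion"],
        })
    return rows


# Hypothetical sample data; real output comes from `kubectl get nodes -o json`.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "worker-0"},
      "status": {
        "nodeInfo": {
          "containerRuntimeVersion": "containerd://1.7.20",
          "osImage": "Ubuntu 22.04.4 LTS",
          "kubeletVersion": "v1.30.8"
        }
      }
    }
  ]
}
""")

for row in summarize_nodes(sample):
    print(row["name"], row["runtime"], row["os_image"], row["kubelet"])
```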

Shared Responsibility

CMK is a managed platform service. Crusoe owns the availability, monitoring and maintenance of control plane nodes and components, such as the API server, etcd and controller manager. We are also responsible for ensuring connectivity between various cluster components and node registration.

You are responsible for any configuration or software deployed on the cluster after provisioning, and for maintaining and patching worker node VMs. We periodically publish updated worker node images that you may consume, but you must manage re-provisioning your VMs with those updates.

Firewall Rules and Secrets

When you create a cluster in a subnet, we create and manage the following firewall rules to maintain connectivity to the cluster. These rules are created at cluster creation, updated as necessary, and removed when the cluster is deleted.

Note that for non-default VPCs, you must enable intra-VPC communication between nodes in your subnet for your cluster to bootstrap successfully; we do not create these rules by default. Additionally, the 'Destination Resource' for these rules may not be visible in our interfaces because our control plane nodes are fully managed.

Firewall Rule                  Details
cmk-cp-api-access-cp-*         Provides public API access to the cluster control plane
cmk-cp-metrics-access-cp-*     Opens up port 24224 on cluster control plane nodes to export metrics
cmk-cp-fluent-bit-access-cp-*  Opens up port 24224 on cluster control plane nodes to export logs
cmk-cp-ssh-access-cp-*         Provides SSH access to control plane nodes from a restricted set of Crusoe-managed IPs

In addition to firewall rules, we create an API token named cmk-{clusterName} that provides API access to relevant cluster components and add-ons, such as the Crusoe CSI or Cloud Controller Manager.
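
If you script against your own firewall rules or tokens, it can help to recognize the CMK-managed ones by name so they are not modified or deleted by accident. The following is a rough sketch that relies only on the naming patterns listed above; the helper names are hypothetical, and the sketch does not call any Crusoe API.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Name patterns and purposes taken from the firewall rule table above.
MANAGED_RULE_PATTERNS = {
    "cmk-cp-api-access-cp-*": "public API access to the control plane",
    "cmk-cp-metrics-access-cp-*": "metrics export from control plane nodes",
    "cmk-cp-fluent-bit-access-cp-*": "log export from control plane nodes",
    "cmk-cp-ssh-access-cp-*": "SSH access from Crusoe-managed IPs",
}


def describe_managed_rule(rule_name: str) -> Optional[str]:
    """Return the purpose of a CMK-managed rule, or None for any other rule."""
    for pattern, purpose in MANAGED_RULE_PATTERNS.items():
        if fnmatchcase(rule_name, pattern):
            return purpose
    return None


def cmk_token_name(cluster_name: str) -> str:
    """Build the API token name CMK creates for a cluster: cmk-{clusterName}."""
    return f"cmk-{cluster_name}"


# Hypothetical rule and cluster names, for illustration only.
for name in ["cmk-cp-ssh-access-cp-abc123", "allow-http"]:
    print(name, "->", describe_managed_rule(name) or "not CMK-managed")
print(cmk_token_name("training"))
```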

Cluster Upgrades

You may request an upgrade by contacting Crusoe Cloud support. Our team will work with you to schedule an appropriate window to upgrade your cluster's control plane.

Cluster Access Control

CMK cluster identity is currently decoupled from Crusoe identity and roles. Once a cluster is provisioned, 'admin' users in your organization can retrieve a cluster kubeconfig tied to an admin role within the cluster. This may be done via the UI, CLI or API.

Cluster Trust

Communication between the control plane and worker nodes, between the control plane and etcd, and between etcd members is encrypted via mTLS. Secrets stored in etcd are also encrypted at rest.