Optimize GPU Performance

Info: Crusoe Cloud is currently in private beta. If you do not have access, please request access to continue.

On all NVIDIA-based Crusoe accelerated instances, you can further optimize GPU performance by locking the GPU clocks, which reduces clock-switching latency and keeps workloads running at the maximum clock speed. With the NVIDIA driver installed, apply the following settings:

Ensure that the GPUs are in persistence mode:

sudo nvidia-smi -pm 1

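Graphics Clock Locking - Locking the graphics clock reduces clock-switching latency and ensures that the maximum SM clock is available to every kernel your code executes. The following command reads the first (highest) supported graphics clock reported by GPU 0 and locks all GPUs to that value: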

nvidia-smi -i 0 --query-supported-clocks="gr" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lgc

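Memory Clock Locking - Locking the memory clock reduces clock-switching latency and ensures that the maximum memory clock is available to every kernel and DMA transfer in your code. The following command reads the first (highest) supported memory clock reported by GPU 0 and locks all GPUs to that value: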

nvidia-smi -i 0 --query-supported-clocks="mem" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lmc
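The two pipelines above read the clock list from GPU 0 and rely on the first listed value being the highest. As a hedged sketch for cases where you would rather query each GPU individually, the helpers below pick the numerically largest supported graphics clock per GPU and lock to it. The function names (highest_mhz, lock_graphics_to_max, unlock_clocks) are illustrative and not part of any Crusoe or NVIDIA tooling. The reset flags -rgc and -rmc undo the locks; on GPUs that do not support memory clock locking, the memory reset may report an error.

```shell
# Read "--query-supported-clocks" rows (for example "2100 MHz") on stdin
# and print the numerically largest value.
highest_mhz() {
  awk '{print $1}' | sort -n | tail -n 1
}

# Lock one GPU ($1 = GPU index) to its own highest supported graphics clock.
lock_graphics_to_max() {
  gr=$(nvidia-smi -i "$1" --query-supported-clocks=gr --format=csv,noheader | highest_mhz)
  sudo nvidia-smi -i "$1" -lgc "$gr"
}

# Remove the graphics and memory clock locks on all GPUs.
unlock_clocks() {
  sudo nvidia-smi -rgc
  sudo nvidia-smi -rmc
}

# Example usage (not run automatically): lock every GPU, one index at a time.
# for i in $(nvidia-smi --query-gpu=index --format=csv,noheader); do
#   lock_graphics_to_max "$i"
# done
```

The memory clock can be handled the same way by querying "mem" and passing the result to -lmc, on GPUs where memory clock locking is supported.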

You can confirm that the settings were applied by comparing the graphics (SM) and memory clocks reported for each GPU against the values below:

GPU                      Max Graphics Clock (MHz)   Max Memory Clock (MHz)
NVIDIA A6000             2100                       8001
NVIDIA A40               1740                       7251
NVIDIA A100 40GB PCIe    1410                       1215*
NVIDIA A100 80GB PCIe    1410                       1512*
NVIDIA A100 80GB SXM4    1410                       1593
NVIDIA H100 SXM5         1980                       2619

*Setting the memory clock through the locking mechanism is not supported
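One way to make this comparison, sketched here under the assumption that the clocks reported by nvidia-smi match the locked values when no power or thermal throttling is active, is to query the current clocks in CSV form and check them against the expected numbers. The helper name report_clocks is hypothetical.

```shell
# Compare reported clocks with expected values.
# $1 = expected graphics MHz, $2 = expected memory MHz.
# Reads rows of "index, name, graphics, memory" on stdin.
report_clocks() {
  awk -F', ' -v gr="$1" -v mem="$2" '{
    status = ($3 == gr && $4 == mem) ? "OK" : "MISMATCH"
    printf "GPU %s (%s): graphics=%s memory=%s %s\n", $1, $2, $3, $4, status
  }'
}

# Example for an H100 SXM5 instance, using the values from the table
# (not run automatically):
# nvidia-smi --query-gpu=index,name,clocks.gr,clocks.mem \
#   --format=csv,noheader,nounits | report_clocks 1980 2619
```

For the GPUs marked with an asterisk, only the graphics value is expected to reflect a lock, so a memory mismatch there does not by itself indicate a problem.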

For workloads that can tolerate memory errors (such as graphics workloads), or if your code has an out-of-band error-correcting mechanism, you can maximize performance by disabling ECC. To disable ECC, run the following:

sudo nvidia-smi -e 0

This disables ECC for all GPUs in the instance. The instance must be rebooted for the change to take effect.
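Because the change only takes effect after a reboot, it can help to compare the current and pending ECC modes. The following is a minimal sketch; the helper name ecc_reboot_needed is hypothetical, and it assumes the CSV layout "index, current, pending" produced by the query shown in the comment.

```shell
# Succeeds (exit status 0) if any row of "index, current, pending" on stdin
# shows a pending ECC mode that differs from the current one.
ecc_reboot_needed() {
  awk -F', ' '$2 != $3 { found = 1 } END { exit !found }'
}

# Example usage (not run automatically):
# nvidia-smi --query-gpu=index,ecc.mode.current,ecc.mode.pending \
#   --format=csv,noheader | ecc_reboot_needed && echo "reboot required"
```

ECC can be turned back on with sudo nvidia-smi -e 1, which likewise requires a reboot.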

MIG - Certain GPU types can be partitioned into multiple instances that run different, isolated workloads. To enable MIG mode on a GPU, run:

sudo nvidia-smi -i 0 -mig 1
# -i 0 selects the GPU with index 0; -mig 1 enables MIG mode on it
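Enabling MIG mode does not by itself create any partitions; GPU instances are created in a separate step. The sketch below wraps the relevant nvidia-smi subcommands in helpers whose names (mig_status, list_profiles, create_instances) are hypothetical. It assumes GPU index 0, and the profile passed to create_instances is a placeholder you would take from the profile listing for your GPU. Depending on the system, the mode change may stay pending until the GPU is reset or the instance is rebooted.

```shell
# Show the current and pending MIG mode for every GPU.
mig_status() {
  nvidia-smi --query-gpu=index,mig.mode.current,mig.mode.pending --format=csv,noheader
}

# List the GPU instance profiles that GPU 0 offers once MIG mode is enabled.
list_profiles() {
  sudo nvidia-smi mig -i 0 -lgip
}

# Create GPU instances on GPU 0, plus their default compute instances (-C).
# $1 = comma-separated profile IDs or names taken from list_profiles.
create_instances() {
  sudo nvidia-smi mig -i 0 -cgi "$1" -C
}

# Example usage (not run automatically; PROFILE is a placeholder):
# mig_status
# list_profiles
# create_instances "PROFILE"
```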

For more information, including which NVIDIA GPUs support MIG, see this link.