
Optimize GPU Performance

On all NVIDIA-based Crusoe accelerated instances, you can further optimize GPU performance by locking the GPU clocks, which reduces latency and maximizes workload performance. With the NVIDIA driver installed, apply the following settings:

Ensure that the GPUs are in persistence mode:

sudo nvidia-smi -pm 1
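To check that persistence mode is active on every GPU, the output of nvidia-smi can be filtered. The following is a minimal sketch, not part of the original instructions: the helper name gpus_without_persistence is hypothetical, and it assumes the usual CSV output of the persistence_mode query field (rows such as "0, Enabled").

```shell
# Hypothetical helper: reads `index, persistence_mode` CSV rows on stdin and
# prints the index of every GPU whose persistence mode is not "Enabled".
gpus_without_persistence() {
  awk -F', *' '$2 != "Enabled" {print $1}'
}

# Intended use on an instance with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,persistence_mode --format=csv,noheader | gpus_without_persistence
```

If the pipeline prints nothing, all GPUs reported persistence mode as enabled.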

Graphics Clock Locking - Locking the graphics clock reduces clock-switching latency and ensures that the maximum SM clock is available to all execution kernels in your code. To lock all GPUs to the maximum graphics clock reported by GPU 0:

nvidia-smi -i 0 --query-supported-clocks="gr" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lgc
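The one-liner above reads the supported clocks of GPU 0 and applies the first listed value to every GPU. As a hedged alternative, the sketch below locks each GPU to its own maximum. The function names max_supported_clock and lock_max_graphics_clock are hypothetical, and the numeric sort is an assumption-free way to pick the largest value instead of relying on the order in which the driver lists clocks.

```shell
# Hypothetical helper: prints the highest value from
# `--query-supported-clocks` CSV output (e.g. "2100 MHz") read on stdin.
max_supported_clock() {
  awk '{print $1}' | sort -n | tail -n 1
}

# Hypothetical helper: locks the graphics clock of one GPU (index in $1)
# to the highest clock that this GPU itself reports as supported.
lock_max_graphics_clock() {
  gpu_index="$1"
  max_clk=$(nvidia-smi -i "$gpu_index" --query-supported-clocks=gr --format=csv,noheader | max_supported_clock)
  sudo nvidia-smi -i "$gpu_index" -lgc "$max_clk"
}

# Intended use, one call per GPU index:
#   for i in $(nvidia-smi --query-gpu=index --format=csv,noheader); do
#     lock_max_graphics_clock "$i"
#   done
```

On instances where every GPU is the same model, this should have the same effect as the single command above.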

Memory Clock Locking - Locking the memory clock reduces clock-switching latency and ensures that the maximum memory clock is available to all execution kernels and DMAs in your code. To lock all GPUs to the maximum memory clock reported by GPU 0:

nvidia-smi -i 0 --query-supported-clocks="mem" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lmc
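Clock locks can be undone when they are no longer wanted. The sketch below is an addition to the original steps: it wraps the nvidia-smi reset flags -rgc (reset graphics clocks) and -rmc (reset memory clocks) in a hypothetical function named reset_clock_locks. Whether -rmc is accepted depends on the GPU and driver, in the same way as -lmc.

```shell
# Hypothetical helper: removes the graphics and memory clock locks on all
# GPUs, returning clock management to the driver defaults.
reset_clock_locks() {
  sudo nvidia-smi -rgc
  sudo nvidia-smi -rmc
}

# Intended use:
#   reset_clock_locks
```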

You can confirm that the settings were applied by comparing the SM and memory clocks reported by nvidia-smi with the values below:

GPU                      Max Graphics Clock (MHz)   Max Memory Clock (MHz)
NVIDIA A6000             2100                       8001
NVIDIA A40               1740                       7251
NVIDIA A100 40GB PCIe    1410                       1215*
NVIDIA A100 80GB PCIe    1410                       1512*
NVIDIA A100 80GB SXM4    1410                       1593
NVIDIA H100 SXM5         1980                       2619

*Setting the memory clock through the locking mechanism is not supported
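One way to make this comparison is to query the current clocks and flag any GPU that differs from the expected values. This is a sketch under assumptions: check_locked_clocks is a hypothetical helper, the expected clocks are passed in by hand from the table, and a GPU that is being throttled may legitimately report a lower clock than the locked value.

```shell
# Hypothetical helper: reads `index, clocks.sm, clocks.mem` CSV rows (no
# units) on stdin and prints one line for each GPU whose current clocks
# differ from the expected SM clock ($1) and memory clock ($2), in MHz.
check_locked_clocks() {
  awk -F', *' -v sm="$1" -v mem="$2" '
    $2 != sm || $3 != mem {
      printf "GPU %s: SM %s MHz, memory %s MHz\n", $1, $2, $3
    }
  '
}

# Intended use, here with the H100 SXM5 values from the table:
#   nvidia-smi --query-gpu=index,clocks.sm,clocks.mem --format=csv,noheader,nounits \
#     | check_locked_clocks 1980 2619
```

No output means every GPU reported the expected clocks. For the GPUs marked with * the memory clock cannot be locked, so a mismatch in that column is not necessarily a problem.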

For workloads that can tolerate memory errors (e.g., graphics-targeted workloads), or if your code has an out-of-band error-correcting mechanism, you can maximize performance by disabling ECC. To disable ECC, run the following:

sudo nvidia-smi -e 0

This disables ECC for all GPUs in the instance. A reboot of the instance is required for the change to take effect.
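Because the ECC change only applies after a reboot, it can help to compare the current and pending ECC modes. The following sketch assumes the ecc.mode.current and ecc.mode.pending query fields of nvidia-smi; the helper name gpus_awaiting_reboot is hypothetical.

```shell
# Hypothetical helper: reads `index, ecc.mode.current, ecc.mode.pending`
# CSV rows on stdin and prints the index of every GPU whose pending ECC
# mode differs from its current one, i.e. a reboot is still outstanding.
gpus_awaiting_reboot() {
  awk -F', *' '$2 != $3 {print $1}'
}

# Intended use:
#   nvidia-smi --query-gpu=index,ecc.mode.current,ecc.mode.pending --format=csv,noheader \
#     | gpus_awaiting_reboot
```

After the reboot the pipeline should print nothing, and ecc.mode.current should read Disabled on GPUs that support ECC.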

MIG - To partition certain GPU types into multiple instances for running different, isolated workloads, enable MIG mode on the GPU by running:

sudo nvidia-smi -i 0 -mig 1
# -i 0 selects the GPU with index 0; -mig 1 enables MIG mode on it

For more information, and to see which NVIDIA GPUs support this, see this link.
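After enabling MIG mode, you may want to confirm which GPUs report it as active before creating instances. The sketch below is an illustration only: mig_enabled_gpus is a hypothetical helper built on the mig.mode.current query field, and the profile listing in the comments uses the nvidia-smi mig subcommand, whose output depends on the GPU model.

```shell
# Hypothetical helper: reads `index, mig.mode.current` CSV rows on stdin
# and prints the index of every GPU that reports MIG mode as "Enabled".
mig_enabled_gpus() {
  awk -F', *' '$2 == "Enabled" {print $1}'
}

# Intended use on a MIG-capable GPU:
#   nvidia-smi --query-gpu=index,mig.mode.current --format=csv,noheader | mig_enabled_gpus
#   sudo nvidia-smi mig -i 0 -lgip   # list the GPU instance profiles offered by GPU 0
```

GPUs that do not support MIG typically report a value other than Enabled, so they are simply left out of the output.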