Optimize GPU Performance

On all NVIDIA-based Crusoe accelerated instances, you can further optimize GPU performance by locking the GPU clocks, which reduces latency and maximizes workload performance. With the NVIDIA driver installed, apply the following settings:

Ensure that the GPUs are in persistent mode:

sudo nvidia-smi -pm 1

Graphics Clock Locking - Reduces clock switching latency and ensures that the maximum SM clock is available across all execution kernels in your code. The following command reads the maximum supported graphics clock from GPU 0 and applies it to all GPUs:

nvidia-smi -i 0 --query-supported-clocks="gr" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lgc

Memory Clock Locking - Reduces clock switching latency and ensures that the maximum memory clock is available across all execution kernels and DMAs in your code. The following command reads the maximum supported memory clock from GPU 0 and applies it to all GPUs:

nvidia-smi -i 0 --query-supported-clocks="mem" --format=csv,noheader | head -n 1 | awk '{print $1}' | xargs sudo nvidia-smi -lmc
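The two commands above take the first value reported for GPU 0 and apply it to every GPU. As a rough sketch of a per-GPU variant, the functions below pick the numerically largest supported clock for a given GPU index and lock that GPU only. The names max_mhz and lock_gpu_clocks are hypothetical helpers for illustration, not part of nvidia-smi, and the sketch assumes the query output has the form "2100 MHz", one value per line.

```shell
# Print the largest clock value (in MHz) from lines such as "2100 MHz" on stdin.
max_mhz() {
    awk 'BEGIN { max = 0 } $1 + 0 > max { max = $1 + 0 } END { if (max > 0) print max }'
}

# Lock the graphics and memory clocks of one GPU (given by index) to its own maximum.
# Hypothetical helper: assumes sudo access and an installed NVIDIA driver.
lock_gpu_clocks() {
    lgc_idx="$1"
    lgc_gr=$(nvidia-smi -i "$lgc_idx" --query-supported-clocks=gr --format=csv,noheader | max_mhz)
    lgc_mem=$(nvidia-smi -i "$lgc_idx" --query-supported-clocks=mem --format=csv,noheader | max_mhz)
    [ -n "$lgc_gr" ] && sudo nvidia-smi -i "$lgc_idx" -lgc "$lgc_gr"
    # Some GPUs do not support memory clock locking; report it instead of failing silently.
    [ -n "$lgc_mem" ] && { sudo nvidia-smi -i "$lgc_idx" -lmc "$lgc_mem" ||
        echo "GPU $lgc_idx: memory clock locking not supported" >&2; }
}

# Usage, applying the lock to every GPU in the instance:
#   for i in $(nvidia-smi --query-gpu=index --format=csv,noheader); do lock_gpu_clocks "$i"; done
```

Selecting the maximum numerically avoids relying on the order in which the supported clocks are listed.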

You can confirm that the settings were applied by comparing the SM and memory clocks reported for each GPU against the values below:

GPU	Max Graphics Clock (MHz)	Max Memory Clock (MHz)
NVIDIA A6000	2100	8001
NVIDIA A40	1740	7251
NVIDIA A100 40GB PCIe	1410	1215*
NVIDIA A100 80GB PCIe	1410	1512*
NVIDIA A100 80GB SXM4	1410	1593
NVIDIA H100 SXM5	1980	2619

* Setting the memory clock through the locking mechanism is not supported on this GPU.
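One possible way to make that comparison is to query the current and maximum clocks from nvidia-smi and summarize them per GPU. The sketch below assumes the rows come from the query shown in the usage comment; report_clocks is a hypothetical helper name. It compares against the maximum the driver reports, which is assumed here to match the table, and an idle GPU may briefly report other values.

```shell
# Read "index, name, sm, mem, max_sm, max_mem" rows on stdin and report whether
# each GPU is currently running at its maximum SM and memory clocks.
report_clocks() {
    awk -F', *' '{
        sm_state  = ($3 == $5) ? "ok" : "below max"
        mem_state = ($4 == $6) ? "ok" : "below max"
        printf "GPU %s (%s): SM %s/%s MHz %s, memory %s/%s MHz %s\n",
               $1, $2, $3, $5, sm_state, $4, $6, mem_state
    }'
}

# Usage:
#   nvidia-smi --query-gpu=index,name,clocks.sm,clocks.mem,clocks.max.sm,clocks.max.mem \
#              --format=csv,noheader,nounits | report_clocks
```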

For workloads that can tolerate memory errors (e.g., graphics workloads), or if your code has an out-of-band error-correcting mechanism, you can maximize performance by disabling ECC. To disable ECC, run the following:

sudo nvidia-smi -e 0

This disables ECC for all GPUs in the instance. The instance must be rebooted for the change to take effect.
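Because the ECC change only applies after a reboot, it can help to compare the current and pending ECC modes. The following is a minimal sketch, assuming rows from the query shown in the usage comment; ecc_pending_changes is a hypothetical helper name.

```shell
# Read "index, current_mode, pending_mode" rows on stdin and list the GPUs
# whose ECC mode will change at the next reboot.
ecc_pending_changes() {
    awk -F', *' '$2 != $3 { printf "GPU %s: ECC %s now, %s after reboot\n", $1, $2, $3 }'
}

# Usage:
#   nvidia-smi --query-gpu=index,ecc.mode.current,ecc.mode.pending \
#              --format=csv,noheader | ecc_pending_changes
```

No output means that no GPU has a pending ECC change.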

MIG - Multi-Instance GPU (MIG) partitions certain GPU types into multiple instances for running different, isolated workloads. To enable MIG mode on a GPU, run:

sudo nvidia-smi -i 0 -mig 1
# -i 0 selects the GPU with index 0

For more information and to see which NVIDIA GPUs support this, see this link.
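To see which GPUs in an instance currently have MIG mode enabled, one option is to query the MIG mode per GPU and summarize it. This is a sketch only: mig_status is a hypothetical helper name, and it assumes that any reported value other than Enabled or Disabled (for example "[N/A]") means the GPU does not support MIG.

```shell
# Read "index, name, mig_mode" rows on stdin and summarize MIG status per GPU.
# Assumption: values other than Enabled/Disabled indicate no MIG support.
mig_status() {
    awk -F', *' '{
        if ($3 == "Enabled" || $3 == "Disabled") state = "MIG " tolower($3)
        else state = "MIG not supported"
        printf "GPU %s (%s): %s\n", $1, $2, state
    }'
}

# Usage:
#   nvidia-smi --query-gpu=index,name,mig.mode.current --format=csv,noheader | mig_status
```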