Quickstart
Follow this quickstart to connect dstack to your Crusoe Cloud project and learn how to run the following three workload types on demand:
- Tasks for batch jobs
- Dev environments for interactive GPU access
- Services for long-running endpoints
Prerequisites
- A Crusoe Cloud account with a project, and sufficient quota for the GPU instances you plan to use. Contact customer support if you need a quota increase.
- An API key. See Manage your API keys for instructions on creating an API key.
- Python 3.8+ and
pip(oruv), or Docker, to run the dstack server.
1. Install and start the dstack server
Install dstack and start the server:
pip install "dstack[all]" -U
dstack server
Example output:
Applying ~/.dstack/server/config.yml...
The admin token is "bbae0f28-d3dd-4820-bf61-8f4bb40815da"
The server is running at http://127.0.0.1:3000/
Next, point the CLI to the server using the admin token from the output:
dstack project add \
--name main \
--url http://127.0.0.1:3000 \
--token bbae0f28-d3dd-4820-bf61-8f4bb40815da
The server can also run with Docker.
2. Configure a backend
A backend connects the dstack server to Crusoe. Choose one of the following two options, add it to ~/.dstack/server/config.yml, and restart the server.
- VMs (native backend)
- Crusoe Managed Kubernetes
With the native crusoe backend, dstack provisions instances directly through the Crusoe API:
projects:
- name: main
backends:
- type: crusoe
project_id: your-project-id
creds:
type: access_key
access_key: your-access-key
secret_key: your-secret-key
With the kubernetes backend, dstack schedules workloads onto an existing CMK cluster. Prepare the cluster and then add the backend configuration:
-
Go to Networking > Firewall Rules, click Create Firewall Rule, and allow ingress traffic on port
30022. The dstack server uses this port to reach the SSH jump host it deploys on the cluster. -
Go to Orchestration and click Create Cluster. Enable the NVIDIA GPU Operator add-on.
-
Open the cluster and click Create Node Pool. Select the instance type and the desired number of nodes, then wait until they're provisioned.
notedstack schedules workloads only onto nodes that are already provisioned. Enabling autoscaling on the node pool doesn't allow dstack to trigger scale-ups.
-
Configure the backend with the cluster's kubeconfig:
projects:- name: mainbackends:- type: kuberneteskubeconfig:filename: <kubeconfig path>proxy_jump:port: 30022
3. Create a fleet
A fleet is a pool of instances that runs are scheduled onto. Create fleet.dstack.yml:
type: fleet
name: my-fleet
nodes: 0..1
backends: [crusoe]
resources:
gpu: A100:80GB:8
Apply the configuration:
dstack apply -f fleet.dstack.yml
With nodes: 0..1, dstack provisions an instance only when you submit a workload and terminates it after the configured idle_duration (3 days by default), so an empty fleet costs nothing. Use a fixed count (nodes: 1) to keep instances up.
If you configured the CMK backend in the previous step, set backends: [kubernetes] instead of [crusoe]; dstack then uses your node pool's existing nodes.
This Quickstart uses a single-instance fleet. For multi-node InfiniBand clusters, add placement: cluster, covered in Clusters.
4. Run a task
A task runs commands to completion. Create hello-gpu.dstack.yml:
type: task
name: hello-gpu
commands:
- nvidia-smi
resources:
gpu: A100:80GB:8
Submit it:
dstack apply -f hello-gpu.dstack.yml
dstack schedules the task on the fleet, streams the output to your terminal, and the node's eight GPUs appear in the nvidia-smi output. You can also use tasks to run training jobs, including distributed, multi-node training—see Clusters.
5. Run a dev environment
A dev environment gives you interactive SSH and IDE access to a GPU instance. Create vscode.dstack.yml:
type: dev-environment
name: vscode
ide: vscode
resources:
gpu: A100:80GB:8
Apply the configuration:
dstack apply -f vscode.dstack.yml
After running the command, the CLI prints a link that opens the remote machine directly in VS Code. Dev environments can auto-stop after a period of inactivity using inactivity_duration.
6. Deploy a service
A service is a long-running workload exposed as an endpoint—for example, an inference server. Create llama-service.dstack.yml:
type: service
name: llama-service
env:
- HF_TOKEN
commands:
- pip install vllm
- vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000
model: meta-llama/Meta-Llama-3.1-8B-Instruct
resources:
gpu: A100:80GB:8
The model property makes the deployment available through an OpenAI-compatible endpoint. Services support replicas, auto-scaling, and custom domains through gateways; see the dstack services docs. For production inference stacks, including SGLang, vLLM, TensorRT-LLM, and disaggregated prefill/decode serving with Dynamo, see dstack's examples.
Manage runs and fleets
Use these CLI commands to inspect and manage runs, fleets, and available instances:
| Command | Description |
|---|---|
dstack ps | List runs and their status |
dstack logs <run> | View logs of a run |
dstack stop <run> | Stop a run |
dstack fleet list | List fleets and instances |
dstack offer -b crusoe | List available instance types and regions |
dstack delete -f fleet.dstack.yml | Delete a fleet and its instances |
Storage
The crusoe backend doesn't support dstack network volumes. Use instance volumes, bind-mounts of host directories into the run's container, for caching datasets and checkpoints between runs on the same instance.
Troubleshooting
| Issue | Resolution |
|---|---|
| No offers found when applying a configuration | Run dstack offer -b crusoe to check available instance types and regions; verify your resources spec and project quota. |
| Workloads stay queued on CMK | dstack uses only already-provisioned nodes. Check the node pool size and that the GPU Operator add-on is enabled. |
| dstack server can't reach the CMK cluster | Verify the firewall rule allowing ingress on port 30022. |
Volume creation fails on the crusoe backend | Network volumes aren't supported; use instance volumes instead. |
Next steps
- Clusters - Provision multi-node InfiniBand clusters and validate them with NCCL tests
- dstack Resources - Links to dstack concepts, inference and training examples, and GitHub
- dstack documentation - Full concepts, CLI, and API reference