Quickstart

Follow this quickstart to connect dstack to your Crusoe Cloud project and learn how to run the following three workload types on demand:

  • Tasks for batch jobs
  • Dev environments for interactive GPU access
  • Services for long-running endpoints

Prerequisites

  • A Crusoe Cloud account with a project, and sufficient quota for the GPU instances you plan to use. Contact customer support if you need a quota increase.
  • An API key. See Manage your API keys to create one.
  • Python 3.8+ and pip (or uv), or Docker, to run the dstack server.

1. Install and start the dstack server

Install dstack and start the server:

pip install "dstack[all]" -U
dstack server

Example output:

Applying ~/.dstack/server/config.yml...

The admin token is "bbae0f28-d3dd-4820-bf61-8f4bb40815da"
The server is running at http://127.0.0.1:3000/

Next, point the CLI to the server using the admin token from the output:

dstack project add \
    --name main \
    --url http://127.0.0.1:3000 \
    --token bbae0f28-d3dd-4820-bf61-8f4bb40815da

The server can also run with Docker.

2. Configure a backend

A backend connects the dstack server to Crusoe. Choose one of the following two options, add it to ~/.dstack/server/config.yml, and restart the server.

With the native crusoe backend, dstack provisions instances directly through the Crusoe API:

projects:
- name: main
  backends:
  - type: crusoe
    project_id: your-project-id
    creds:
      type: access_key
      access_key: your-access-key
      secret_key: your-secret-key

3. Create a fleet

A fleet is a pool of instances that runs are scheduled onto. Create fleet.dstack.yml:

type: fleet
name: my-fleet

nodes: 0..1

backends: [crusoe]

resources:
  gpu: A100:80GB:8

Apply the configuration:

dstack apply -f fleet.dstack.yml

With nodes: 0..1, dstack provisions an instance only when you submit a workload and terminates it after the configured idle_duration (3 days by default), so an empty fleet costs nothing. Use a fixed count (nodes: 1) to keep instances up.
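As an illustration of the idle_duration setting described above, the following sketch shortens the idle timeout so that an idle instance is terminated after one hour instead of three days. The 1h value is an arbitrary choice for this example; the remaining fields repeat fleet.dstack.yml.

```yaml
type: fleet
name: my-fleet

# Scale between zero and one instance on demand
nodes: 0..1
# Assumed value for illustration: terminate an instance after 1 hour of idling
idle_duration: 1h

backends: [crusoe]

resources:
  gpu: A100:80GB:8
```

A shorter timeout releases idle capacity sooner, at the cost of waiting for provisioning again on the next run.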

If you configured the CMK backend in the previous step, set backends: [kubernetes] instead of [crusoe]; dstack then uses your node pool's existing nodes.

Tip: This quickstart uses a single-instance fleet. For multi-node InfiniBand clusters, add placement: cluster, covered in Clusters.

4. Run a task

A task runs commands to completion. Create hello-gpu.dstack.yml:

type: task
name: hello-gpu

commands:
  - nvidia-smi

resources:
  gpu: A100:80GB:8

Submit it:

dstack apply -f hello-gpu.dstack.yml

dstack schedules the task on the fleet and streams its output to your terminal; the nvidia-smi output lists the node's eight GPUs. You can also use tasks to run training jobs, including distributed, multi-node training; see Clusters.

5. Run a dev environment

A dev environment gives you interactive SSH and IDE access to a GPU instance. Create vscode.dstack.yml:

type: dev-environment
name: vscode

ide: vscode

resources:
  gpu: A100:80GB:8

Apply the configuration:

dstack apply -f vscode.dstack.yml

The CLI then prints a link that opens the remote machine directly in VS Code. To stop a dev environment automatically after a period of inactivity, set inactivity_duration.
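A minimal sketch of the inactivity_duration setting, added to vscode.dstack.yml. The 2h value is an assumption chosen for illustration; the dstack documentation defines exactly what counts as inactivity.

```yaml
type: dev-environment
name: vscode

ide: vscode

# Assumed value for illustration: stop the dev environment after 2 hours of inactivity
inactivity_duration: 2h

resources:
  gpu: A100:80GB:8
```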

6. Deploy a service

A service is a long-running workload exposed as an endpoint—for example, an inference server. Create llama-service.dstack.yml:

type: service
name: llama-service

env:
  - HF_TOKEN
commands:
  - pip install vllm
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000
model: meta-llama/Meta-Llama-3.1-8B-Instruct

resources:
  gpu: A100:80GB:8

Set HF_TOKEN in your local environment, then deploy the service with dstack apply -f llama-service.dstack.yml. The model property makes the deployment available through an OpenAI-compatible endpoint. Services support replicas, auto-scaling, and custom domains through gateways; see the dstack services docs. For production inference stacks, including SGLang, vLLM, TensorRT-LLM, and disaggregated prefill/decode serving with Dynamo, see dstack's examples.
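To make the replica and auto-scaling options concrete, here is a hedged sketch that extends llama-service.dstack.yml. The replica range and the requests-per-second target are illustrative values, not recommendations, and auto-scaling may require a gateway; check the dstack services docs before relying on it.

```yaml
type: service
name: llama-service

env:
  - HF_TOKEN
commands:
  - pip install vllm
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000
model: meta-llama/Meta-Llama-3.1-8B-Instruct

# Illustrative values: run between 1 and 2 replicas,
# scaling on a target of about 10 requests per second per replica
replicas: 1..2
scaling:
  metric: rps
  target: 10

resources:
  gpu: A100:80GB:8
```

Each replica requests the full resources spec, so the fleet must allow enough nodes for the upper bound of the range.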

Manage runs and fleets

Use these CLI commands to inspect and manage runs, fleets, and available instances:

Command | Description
dstack ps | List runs and their status
dstack logs <run> | View logs of a run
dstack stop <run> | Stop a run
dstack fleet list | List fleets and instances
dstack offer -b crusoe | List available instance types and regions
dstack delete -f fleet.dstack.yml | Delete a fleet and its instances

Storage

The crusoe backend doesn't support dstack network volumes. Use instance volumes instead: they bind-mount host directories into the run's container, which lets you cache datasets and checkpoints between runs on the same instance.
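A minimal sketch of an instance volume in a task configuration. The run name and both paths are hypothetical, chosen only to show the host-path:container-path form of a volumes entry.

```yaml
type: task
name: cache-demo

commands:
  # List whatever earlier runs on this instance left in the cache
  - ls /cache

# Hypothetical paths: bind-mount the host directory /mnt/cache at /cache in the container
volumes:
  - /mnt/cache:/cache

resources:
  gpu: A100:80GB:8
```

Because the data lives on the instance's disk, it is available only to runs scheduled onto that same instance and is lost when the instance is terminated.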

Troubleshooting

Issue | Resolution
No offers found when applying a configuration | Run dstack offer -b crusoe to check available instance types and regions; verify your resources spec and project quota.
Workloads stay queued on CMK | dstack uses only already-provisioned nodes. Check the node pool size and that the GPU Operator add-on is enabled.
dstack server can't reach the CMK cluster | Verify the firewall rule allowing ingress on port 30022.
Volume creation fails on the crusoe backend | Network volumes aren't supported; use instance volumes instead.

Next steps

  • Clusters - Provision multi-node InfiniBand clusters and validate them with NCCL tests
  • dstack Resources - Links to dstack concepts, inference and training examples, and GitHub
  • dstack documentation - Full concepts, CLI, and API reference