Quickstart

Follow this quickstart to connect dstack to your Crusoe Cloud project and learn how to run the following three workload types on demand:

Tasks for batch jobs
Dev environments for interactive GPU access
Services for long-running endpoints

Prerequisites

A Crusoe Cloud account with a project, and sufficient quota for the GPU instances you plan to use. Contact customer support if you need a quota increase.
A Crusoe Cloud API key. See Manage your API keys for instructions on creating one.
Python 3.8+ and pip (or uv), or Docker, to run the dstack server.

1. Install and start the dstack server

Install dstack and start the server:

pip install "dstack[all]" -U
dstack server

Example output:

Applying ~/.dstack/server/config.yml...

The admin token is "bbae0f28-d3dd-4820-bf61-8f4bb40815da"
The server is running at http://127.0.0.1:3000/

Next, point the CLI to the server using the admin token from the output:

dstack project add \
  --name main \
  --url http://127.0.0.1:3000 \
  --token bbae0f28-d3dd-4820-bf61-8f4bb40815da

The server can also run with Docker.

2. Configure a backend

A backend connects the dstack server to Crusoe. Choose one of the following two options, add it to ~/.dstack/server/config.yml, and restart the server.

VMs (native backend)
Crusoe Managed Kubernetes

With the native crusoe backend, dstack provisions instances directly through the Crusoe API:

projects:
  - name: main
    backends:
      - type: crusoe
        project_id: your-project-id
        creds:
          type: access_key
          access_key: your-access-key
          secret_key: your-secret-key

With the kubernetes backend, dstack schedules workloads onto an existing Crusoe Managed Kubernetes (CMK) cluster. Prepare the cluster with the following steps, then add the backend configuration:

Go to Networking > Firewall Rules, click Create Firewall Rule, and allow ingress traffic on port 30022. The dstack server uses this port to reach the SSH jump host it deploys on the cluster.
Go to Orchestration and click Create Cluster. Enable the NVIDIA GPU Operator add-on.
Open the cluster and click Create Node Pool. Select the instance type and the desired number of nodes, then wait until they're provisioned.

Note: dstack schedules workloads only onto nodes that are already provisioned. Enabling autoscaling on the node pool doesn't allow dstack to trigger scale-ups.

Configure the backend with the cluster's kubeconfig:

projects:
  - name: main
    backends:
      - type: kubernetes
        kubeconfig:
          filename: <kubeconfig path>
        proxy_jump:
          port: 30022

3. Create a fleet

A fleet is a pool of instances that runs are scheduled onto. Create fleet.dstack.yml:

type: fleet
name: my-fleet

nodes: 0..1

backends: [crusoe]

resources:
  gpu: A100:80GB:8

Apply the configuration:

dstack apply -f fleet.dstack.yml

With nodes: 0..1, dstack provisions an instance only when you submit a workload and terminates it after the configured idle_duration (3 days by default), so an empty fleet costs nothing. Use a fixed count (nodes: 1) to keep instances up.
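
To shorten the idle window, idle_duration can be set in the fleet configuration. The following is a minimal sketch of fleet.dstack.yml; the 1h value is an illustrative assumption, not a recommendation from this guide:

```yaml
type: fleet
name: my-fleet

nodes: 0..1

# Assumed value: terminate an idle instance after one hour
# instead of the 3-day default
idle_duration: 1h

backends: [crusoe]

resources:
  gpu: A100:80GB:8
```

Re-run dstack apply -f fleet.dstack.yml after editing the file so the change takes effect.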

If you configured the CMK backend in the previous step, set backends: [kubernetes] instead of [crusoe]; dstack then uses your node pool's existing nodes.

Tip: This quickstart uses a single-instance fleet. For multi-node InfiniBand clusters, add placement: cluster, as covered in Clusters.
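
As a rough sketch of what that looks like, the fleet below requests two interconnected nodes. The fleet name and node count are hypothetical; see Clusters for the supported instance types and the full procedure:

```yaml
type: fleet
name: my-cluster-fleet  # hypothetical name

# Assumed node count; a cluster needs more than one node
nodes: 2
# Ask dstack to provision the nodes as an interconnected cluster
placement: cluster

backends: [crusoe]

resources:
  gpu: A100:80GB:8
```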

4. Run a task

A task runs commands to completion. Create hello-gpu.dstack.yml:

type: task
name: hello-gpu

commands:
  - nvidia-smi

resources:
  gpu: A100:80GB:8

Submit it:

dstack apply -f hello-gpu.dstack.yml

dstack schedules the task on the fleet and streams the output to your terminal. The nvidia-smi output should list the node's eight GPUs. You can also use tasks to run training jobs, including distributed, multi-node training; see Clusters.

5. Run a dev environment

A dev environment gives you interactive SSH and IDE access to a GPU instance. Create vscode.dstack.yml:

type: dev-environment
name: vscode

ide: vscode

resources:
  gpu: A100:80GB:8

Apply the configuration:

dstack apply -f vscode.dstack.yml

After you run the command, the CLI prints a link that opens the remote machine directly in VS Code. To stop a dev environment automatically after a period of inactivity, set inactivity_duration.
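
For instance, a variant of vscode.dstack.yml with auto-stop enabled might look like the sketch below. The 2h value is an assumption chosen for illustration:

```yaml
type: dev-environment
name: vscode

ide: vscode

# Assumed value: stop the dev environment after two hours of inactivity
inactivity_duration: 2h

resources:
  gpu: A100:80GB:8
```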

6. Deploy a service

A service is a long-running workload exposed as an endpoint, such as an inference server. Create llama-service.dstack.yml:

type: service
name: llama-service

env:
  - HF_TOKEN
commands:
  - pip install vllm
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000
model: meta-llama/Meta-Llama-3.1-8B-Instruct

resources:
  gpu: A100:80GB:8

Because env lists HF_TOKEN without a value, set it in your shell first, then apply the configuration:

export HF_TOKEN=<your Hugging Face token>
dstack apply -f llama-service.dstack.yml

The model property makes the deployment available through an OpenAI-compatible endpoint. Services support replicas, auto-scaling, and custom domains through gateways; see the dstack services docs. For production inference stacks, including SGLang, vLLM, TensorRT-LLM, and disaggregated prefill/decode serving with Dynamo, see dstack's examples.
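
As an illustration of replicas and auto-scaling, the fragment below could be added to llama-service.dstack.yml. The replica range and the requests-per-second target are assumed values, and auto-scaling may require a gateway; check the dstack services docs before relying on it:

```yaml
# Assumed range: run between one and four replicas
replicas: 1..4
scaling:
  # Scale on requests per second, averaged per replica
  metric: rps
  # Assumed target value
  target: 10
```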

Manage runs and fleets

Use these CLI commands to inspect and manage runs, fleets, and available instances:

Command	Description
`dstack ps`	List runs and their status
`dstack logs <run>`	View logs of a run
`dstack stop <run>`	Stop a run
`dstack fleet list`	List fleets and instances
`dstack offer -b crusoe`	List available instance types and regions
`dstack delete -f fleet.dstack.yml`	Delete a fleet and its instances

Storage

The crusoe backend doesn't support dstack network volumes. Instead, use instance volumes, which bind-mount host directories into the run's container, to cache datasets and checkpoints between runs on the same instance.
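
As a rough sketch, an instance volume can be declared with the volumes property of a run configuration. The run name, the host path /dstack-cache/huggingface, and the container path /root/.cache/huggingface are illustrative assumptions; adjust them to wherever your workload caches data:

```yaml
type: task
name: hello-cache  # hypothetical run name

commands:
  # List whatever earlier runs on this instance left in the cache
  - ls /root/.cache/huggingface

# Instance volume in <host path>:<container path> form
volumes:
  - /dstack-cache/huggingface:/root/.cache/huggingface

resources:
  gpu: A100:80GB:8
```

The cached data lives on the instance's disk, so it is lost when the fleet terminates the instance.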

Troubleshooting

Issue	Resolution
No offers found when applying a configuration	Run `dstack offer -b crusoe` to check available instance types and regions; verify your `resources` spec and project quota.
Workloads stay queued on CMK	dstack uses only already-provisioned nodes. Check the node pool size and that the GPU Operator add-on is enabled.
dstack server can't reach the CMK cluster	Verify the firewall rule allowing ingress on port `30022`.
Volume creation fails on the `crusoe` backend	Network volumes aren't supported; use instance volumes instead.

Next steps

Clusters - Provision multi-node InfiniBand clusters and validate them with NCCL tests
dstack Resources - Links to dstack concepts, inference and training examples, and GitHub
dstack documentation - Full concepts, CLI, and API reference
