Telemetry Relay
Telemetry Relay is currently in limited availability. Contact your Crusoe account team if you have any questions or would like to enable this feature for your project.
Telemetry Relay enables you to export infrastructure and custom metrics from your CMK cluster or VMs to external observability platforms. You can integrate Crusoe infrastructure data into your existing Grafana, Datadog, or Splunk dashboards without managing separate data pipelines.
Telemetry Relay exposes a Prometheus-compatible scraping endpoint for external tools and is available for Crusoe Managed Kubernetes (CMK) and Crusoe Virtual Machines (VMs).
How it Works
Telemetry Relay uses the same metric collection infrastructure as Command Center Metrics:
- The Crusoe Watch Agent collects metrics from CMK nodes or VMs at 60-second intervals.
- Metrics are published to the Crusoe metrics backend.
- Telemetry Relay exposes a Prometheus-compatible scraping endpoint.
- Your external platform scrapes the endpoint to retrieve metrics.
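To make the last step concrete, here is a minimal sketch of what a scraping client does with the response. It assumes the endpoint returns the standard Prometheus text exposition format, which the source implies ("Prometheus-compatible") but does not spell out. The function name `parse_exposition` is hypothetical, and the parser is deliberately simplified: it skips comment lines, ignores timestamps, and leaves escaped characters in label values as they are.

```python
import re

# A sample line is: metric name, optional {labels}, a value, and an optional timestamp.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+\S+)?$')
# A label is: name="value", where the value may contain backslash escapes.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Passing the body of a scrape response to `parse_exposition` yields one tuple per sample, which can then be filtered or forwarded by your own tooling.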
Prerequisites
To use Telemetry Relay, you need:
- A CMK cluster or VM with the Crusoe Watch Agent installed (see Metrics)
- External observability platform that supports Prometheus remote read or scraping
Available Metrics
You can export any infrastructure and custom metrics collected by the Crusoe Watch Agent, including GPU (DCGM), CPU, memory, network, InfiniBand, and NVLink metrics. See Infrastructure Metrics for the complete list.
Configuring Telemetry Relay
Endpoint
Use the following endpoint to access your metrics:
https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape
Authentication
You need a monitoring token to authenticate requests. Generate one using the Crusoe CLI. See Querying Metrics via API for instructions.
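As a rough illustration of how the endpoint and token fit together, the sketch below builds an authenticated scrape request with Python's standard library. The helper names `build_scrape_request` and `fetch_metrics` are hypothetical, and the endpoint URL and token are passed in rather than hard-coded. The sketch assumes the token is sent as an `Authorization: Bearer` header, as in the platform configurations in this guide, and that the response body is UTF-8 text.

```python
import urllib.request


def build_scrape_request(endpoint_url: str, monitoring_token: str) -> urllib.request.Request:
    """Return a GET request for the scrape endpoint with a Bearer token attached."""
    if not monitoring_token:
        raise ValueError("a monitoring token is required")
    return urllib.request.Request(
        endpoint_url,
        headers={"Authorization": f"Bearer {monitoring_token}"},
        method="GET",
    )


def fetch_metrics(endpoint_url: str, monitoring_token: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (assumed UTF-8)."""
    request = build_scrape_request(endpoint_url, monitoring_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

To try it, call `fetch_metrics` with the endpoint URL for your project and your monitoring token, keeping calls at least 60 seconds apart.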
Connecting to Grafana
To connect Grafana to Telemetry Relay:
In Grafana, navigate to Configuration > Data Sources > Add data source.
Select Prometheus.
Set the URL to:
https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape
Under Custom HTTP Headers, add:
- Header: Authorization
- Value: Bearer <monitoring-token>
Set the Scrape interval to a minimum of 60 seconds.
Click Save & Test.
You can now build dashboards using the available infrastructure metrics.
Connecting to Datadog
To connect Datadog to Telemetry Relay:
Add a Prometheus check to your Datadog Agent configuration:
instances:
  - prometheus_url: "https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape"
    namespace: "crusoe"
    metrics:
      - "*"
    headers:
      Authorization: "Bearer <monitoring-token>"
Replace the following placeholders:
- <project-id>: Your Crusoe project ID (find via crusoe projects list)
- <monitoring-token>: Generate with crusoe monitoring tokens create
Restart the Datadog Agent. Metrics will appear in Datadog under the configured namespace.
Connecting to Splunk
To connect Splunk to Telemetry Relay, configure your OpenTelemetry Collector to scrape Crusoe's endpoint and export the metrics to Splunk Observability Cloud. The following is an example configuration:
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "crusoe-metrics"
          scrape_interval: 60s
          scrape_timeout: 10s
          scheme: https
          authorization:
            type: Bearer
            credentials: <crusoe-monitoring-token>
          static_configs:
            - targets: ["api.crusoecloud.com"]
          metrics_path: "/api/v1alpha5/projects/<project-id>/metrics/scrape"
processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - delete_key(attributes, "crusoe_resource")
  batch:
    timeout: 10s
    send_batch_size: 1000
exporters:
  otlphttp:
    metrics_endpoint: "https://ingest.<splunk-realm>.signalfx.com/v2/datapoint/otlp"
    headers:
      X-SF-Token: "<splunk-access-token>"
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [transform, batch]
      exporters: [otlphttp]
Replace the following placeholders:
- <crusoe-monitoring-token>: Generate with crusoe monitoring tokens create
- <project-id>: Your Crusoe project ID (find via crusoe projects list)
- <splunk-access-token>: Your Splunk Observability Cloud access token
- <splunk-realm>: Your Splunk realm (e.g., us1, us2, eu0)
Restart the OpenTelemetry Collector to apply the configuration. Metrics will appear in Splunk Observability Cloud under Metrics → Metric Finder. Search for crusoe_ to find your Crusoe metrics.
Connecting to Other Prometheus-Compatible Platforms
You can connect any Prometheus-compatible platform to Telemetry Relay using the scrape endpoint.
Configure your platform with:
- Endpoint URL: https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape
- Authentication: Bearer token via Authorization header
- Scrape interval: Minimum 60 seconds
Replace the following placeholders:
- <project-id>: Your Crusoe project ID (find via crusoe projects list)
- <monitoring-token>: Generate a monitoring token with crusoe monitoring tokens create
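If your platform is a custom script rather than an off-the-shelf tool, the settings above translate into a polling loop along these lines. This is only a sketch: `effective_interval` and `poll` are hypothetical names, the `fetch` callable stands in for whatever HTTP client you use to call the endpoint with the Bearer token, and the only value taken from this guide is the 60-second minimum scrape interval.

```python
import time

MIN_SCRAPE_INTERVAL_SECONDS = 60  # minimum scrape interval stated in this guide


def effective_interval(requested_seconds):
    """Clamp a requested scrape interval so it never drops below the minimum."""
    return max(float(requested_seconds), float(MIN_SCRAPE_INTERVAL_SECONDS))


def poll(fetch, handle, requested_seconds, iterations, sleep=time.sleep):
    """Call fetch() `iterations` times, passing each result to handle().

    `fetch` is any zero-argument callable that returns the scrape response.
    `sleep` is injectable so the loop can be exercised without waiting.
    Returns the interval that was actually used.
    """
    interval = effective_interval(requested_seconds)
    for i in range(iterations):
        handle(fetch())
        # Wait between scrapes, but not after the final one.
        if i < iterations - 1:
            sleep(interval)
    return interval
```

Requesting a 15-second interval, for example, results in scrapes spaced 60 seconds apart, while a 120-second request is left unchanged.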
Filtering Metrics
All platforms support filtering metrics by adding query parameters to the scrape endpoint URL. This allows you to reduce the volume of metrics exported and focus on specific data.
Available Filters
| Parameter | Description | Example |
|---|---|---|
| metric_name | Filter by metric name (comma-separated list) | metric_name=crusoe_vm_memory_.* |
| labels | Filter by label key:value pairs (comma-separated) | labels=collector:disk,device:vda |
| metric_category | Filter by category (system or custom) | metric_category=system |
Filter Examples
Filter by memory-related metrics:
https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_memory_.*
Filter by labels:
https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape?labels=collector:disk
Filter by metric category (system metrics only):
https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape?metric_category=system
Combined filters for disk metrics on device vda1:
https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_disk_.*&labels=device:vda1
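If you generate scrape URLs programmatically, a small helper can assemble the query string from the three filter parameters. The function name `filtered_scrape_url` is hypothetical. The sketch keeps `*`, `:`, and `,` unescaped so that its output matches the URLs shown above; whether the endpoint also accepts percent-encoded values is not stated in this guide, so treat that choice as an assumption.

```python
from urllib.parse import urlencode


def filtered_scrape_url(endpoint_url, metric_name=None, labels=None, metric_category=None):
    """Append metric_name, labels, and metric_category filters to a scrape URL.

    `labels` is a dict that is rendered as comma-separated key:value pairs.
    """
    params = {}
    if metric_name:
        params["metric_name"] = metric_name
    if labels:
        params["labels"] = ",".join(f"{key}:{value}" for key, value in labels.items())
    if metric_category:
        if metric_category not in ("system", "custom"):
            raise ValueError("metric_category must be 'system' or 'custom'")
        params["metric_category"] = metric_category
    if not params:
        return endpoint_url
    # Leave *, :, and , unescaped to mirror the example URLs in this guide.
    return f"{endpoint_url}?{urlencode(params, safe='*:,')}"
```

Calling it with `metric_name="crusoe_vm_disk_.*"` and `labels={"device": "vda1"}` reproduces the combined-filter URL above for your project's endpoint.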
Platform-Specific Examples
Datadog:
instances:
  - prometheus_url: "https://api.crusoecloud.com/api/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_memory_.*"
    namespace: "crusoe"
    metrics:
      - "*"
    headers:
      Authorization: "Bearer <monitoring-token>"
Splunk OpenTelemetry Collector:
metrics_path: "/api/v1alpha5/projects/<project-id>/metrics/scrape?labels=collector:disk"
Grafana or other Prometheus-compatible platforms:
Add query parameters directly to the configured endpoint URL.
Limitations
- Metrics only — Log streaming is planned for a future release.
- Minimum scrape interval — 60 seconds.
- Metrics retention — Metrics are retained for 30 days on the Crusoe backend. External platform retention is governed by your platform's policies.
What's Next
- Metrics — View metrics directly in the Crusoe Console
- Logs — Access centralized log data
- Notifications — Get notified about resource health via email and in-console, and set up alert routing to Slack or webhooks