Telemetry Relay

Note: Telemetry Relay is currently in limited availability. Contact your Crusoe account team if you have any questions or would like to enable this feature for your project.

Telemetry Relay lets you export infrastructure and custom metrics from your Crusoe Managed Kubernetes (CMK) clusters or Crusoe Virtual Machines (VMs) to external observability platforms. You can integrate Crusoe infrastructure data into your existing Grafana, Datadog, or Splunk dashboards without managing separate data pipelines.

Telemetry Relay exposes a Prometheus-compatible scraping endpoint for external tools and is available for both CMK and VMs.

How it Works

Telemetry Relay uses the same metric collection infrastructure as Command Center Metrics:

The Crusoe Watch Agent collects metrics from CMK nodes or VMs at 60-second intervals.
Metrics are published to the Crusoe metrics backend.
Telemetry Relay exposes a Prometheus-compatible scraping endpoint.
Your external platform scrapes the endpoint to retrieve metrics.
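To make the last step concrete, the following is a minimal Python sketch of what a scraper does with the response body. It assumes the endpoint returns the standard Prometheus text exposition format, which "Prometheus-compatible" suggests but this page does not spell out. The metric names and values in SAMPLE are hypothetical.

```python
import re

# One sample line has the shape: name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Hypothetical payload, used only to illustrate the format.
SAMPLE = """\
# TYPE crusoe_vm_memory_used_bytes gauge
crusoe_vm_memory_used_bytes{collector="memory",device="vda"} 1024
crusoe_example_ratio 0.5 1700000000000
"""


def parse_exposition(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple sketch cannot parse
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

Calling parse_exposition(SAMPLE) yields one tuple per sample line. Production scrapers such as Prometheus, the Datadog Agent, or the OpenTelemetry Collector do this parsing for you, so the sketch is only meant to show the data shape.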

Prerequisites

To use Telemetry Relay, you need:

A CMK cluster or VM with the Crusoe Watch Agent installed (see Metrics)
An external observability platform that supports Prometheus remote read or scraping

Available Metrics

You can export any infrastructure and custom metrics collected by the Crusoe Watch Agent, including GPU (DCGM), CPU, memory, network, InfiniBand, and NVLink metrics. See Infrastructure Metrics for the complete list.

Configuring Telemetry Relay

Endpoint

Use the following endpoint to access your metrics:

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape

Authentication

You need a monitoring token to authenticate requests. Generate one using the Crusoe CLI. See Querying Metrics via API for instructions.
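As a quick way to check the endpoint and token before configuring a platform, the request can be sketched in Python with only the standard library. The Bearer scheme matches the header used in the platform sections of this page. The function names and the 10-second timeout are illustrative choices, not part of the Crusoe API.

```python
import urllib.request

API_BASE = "https://api.crusoecloud.com/v1alpha5"


def build_scrape_request(project_id: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the scrape endpoint."""
    url = f"{API_BASE}/projects/{project_id}/metrics/scrape"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(project_id: str, token: str, timeout: float = 10.0) -> str:
    """Fetch the current metrics payload and return it as text."""
    request = build_scrape_request(project_id, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, print(fetch_metrics(project_id, token)[:500]) shows the start of the payload. An HTTP 401 or 403 error from urlopen most likely points to a missing or invalid monitoring token.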

Connecting to Grafana

To connect Grafana to Telemetry Relay:

In Grafana, navigate to Configuration > Data Sources > Add data source.
Select Prometheus.

Set the URL to:

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape

Under Custom HTTP Headers, add:
- Header: Authorization
- Value: Bearer <monitoring-token>
Set the Scrape interval to a minimum of 60 seconds.
Click Save & Test.

You can now build dashboards using the available infrastructure metrics.

Connecting to Datadog

To connect Datadog to Telemetry Relay:

Add a Prometheus check to your Datadog Agent configuration:

instances:
  - prometheus_url: "https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape"
    namespace: "crusoe"
    metrics:
      - "*"
    headers:
      Authorization: "Bearer <monitoring-token>"

Replace the following placeholders:

<project-id>: Your Crusoe project ID (find via crusoe projects list)
<monitoring-token>: Generate with crusoe monitoring tokens create

Restart the Datadog Agent. Metrics will appear in Datadog under the configured namespace.

Connecting to Splunk

To connect Splunk to Telemetry Relay, configure your OpenTelemetry Collector to scrape the Crusoe endpoint and export the metrics to Splunk Observability Cloud. The following is an example configuration:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "crusoe-metrics"
          scrape_interval: 60s
          scrape_timeout: 10s
          scheme: https

          authorization:
            type: Bearer
            credentials: <crusoe-monitoring-token>

          static_configs:
            - targets: ["api.crusoecloud.com"]

          metrics_path: "/v1alpha5/projects/<project-id>/metrics/scrape"

processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - delete_key(attributes, "crusoe_resource")

  batch:
    timeout: 10s
    send_batch_size: 1000

exporters:
  otlphttp:
    metrics_endpoint: "https://ingest.<splunk-realm>.signalfx.com/v2/datapoint/otlp"
    headers:
      X-SF-Token: "<splunk-access-token>"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [transform, batch]
      exporters: [otlphttp]

Replace the following placeholders:

<crusoe-monitoring-token>: Generate with crusoe monitoring tokens create
<project-id>: Your Crusoe project ID (find via crusoe projects list)
<splunk-access-token>: Your Splunk Observability Cloud access token
<splunk-realm>: Your Splunk realm (e.g., us1, us2, eu0)

Restart the OpenTelemetry Collector to apply the configuration. Metrics will appear in Splunk Observability Cloud under Metrics → Metric Finder. Search for crusoe_ to find your Crusoe metrics.
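A configuration that still contains an unreplaced placeholder is easy to miss. The following is a small Python sketch that fills the <name> placeholders in a configuration template and fails loudly if a value is missing. The helper is hypothetical and not part of any Crusoe or Splunk tooling.

```python
import re

# Matches placeholders such as <project-id> or <splunk-realm>.
PLACEHOLDER_RE = re.compile(r"<[a-z][a-z0-9-]*>")


def fill_placeholders(template: str, values: dict) -> str:
    """Replace every <name> placeholder, raising KeyError if one has no value."""
    def substitute(match):
        key = match.group(0)[1:-1]  # strip the surrounding angle brackets
        if key not in values:
            raise KeyError(f"no value supplied for placeholder <{key}>")
        return values[key]

    return PLACEHOLDER_RE.sub(substitute, template)
```

For instance, passing the exporter section above together with {"splunk-realm": "us1", "splunk-access-token": "..."} returns the text with both values substituted. Avoid committing the rendered file to version control, since it contains secrets.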

Connecting to Other Prometheus-Compatible Platforms

You can connect any Prometheus-compatible platform to Telemetry Relay using the scrape endpoint.

Configure your platform with:

Endpoint URL: https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape
Authentication: Authorization header with the value Bearer <monitoring-token>
Scrape interval: Minimum 60 seconds

Replace the following placeholders:

<project-id>: Your Crusoe project ID (find via crusoe projects list)
<monitoring-token>: Generate with crusoe monitoring tokens create
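If you are writing your own scraper rather than configuring an existing platform, the 60-second minimum can be enforced in code. The following Python sketch shows one way to do it. The fetch and handle callables are placeholders you supply, and the sleep parameter exists only so the loop can be tested without waiting.

```python
import time

MIN_SCRAPE_INTERVAL_S = 60  # minimum scrape interval stated on this page


def poll(fetch, handle, interval_s=60, iterations=None, sleep=time.sleep):
    """Call fetch() repeatedly, never more often than the 60-second minimum.

    fetch:      callable returning the scraped payload
    handle:     callable that receives each payload
    iterations: number of scrapes to perform, or None to run until interrupted
    """
    interval = max(interval_s, MIN_SCRAPE_INTERVAL_S)
    done = 0
    while iterations is None or done < iterations:
        started = time.monotonic()
        handle(fetch())
        done += 1
        if iterations is not None and done >= iterations:
            break
        # Sleep for the rest of the interval, accounting for time spent fetching.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval - elapsed))
```

Requesting a shorter interval, such as interval_s=5, is silently raised to 60 seconds. Since the Crusoe Watch Agent collects at 60-second intervals, scraping faster would presumably return the same samples anyway.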

Filtering Metrics

All platforms support filtering metrics by adding query parameters to the scrape endpoint URL. This allows you to reduce the volume of metrics exported and focus on specific data.

Available Filters

Parameter	Description	Example
`metric_name`	Filter by metric name (comma-separated list)	`metric_name=crusoe_vm_memory_.*`
`labels`	Filter by label key:value pairs (comma-separated)	`labels=collector:disk,device:vda`
`metric_category`	Filter by category (`system` or `custom`)	`metric_category=system`

Filter Examples

Filter by memory-related metrics:

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_memory_.*

Filter by labels:

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape?labels=collector:disk

Filter by metric category (system metrics only):

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape?metric_category=system

Combined filters for disk metrics on device vda1:

https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_disk_.*&labels=device:vda1
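The filter URLs above can also be built programmatically. The following is a minimal Python sketch using only the standard library. The function name and its keyword arguments are illustrative, while the parameter names come from the Available Filters table. It leaves ':', ',' and '*' unescaped so the output matches the examples. Whether the endpoint also accepts the percent-encoded forms is not stated here.

```python
from urllib.parse import urlencode

API_BASE = "https://api.crusoecloud.com/v1alpha5"


def scrape_url(project_id, metric_name=None, labels=None, metric_category=None):
    """Return the scrape endpoint URL with optional filter parameters.

    labels is a dict such as {"collector": "disk", "device": "vda"}.
    """
    params = {}
    if metric_name:
        params["metric_name"] = metric_name
    if labels:
        # The labels filter takes comma-separated key:value pairs.
        params["labels"] = ",".join(f"{key}:{value}" for key, value in labels.items())
    if metric_category:
        params["metric_category"] = metric_category

    url = f"{API_BASE}/projects/{project_id}/metrics/scrape"
    if not params:
        return url
    # Keep ':', ',' and '*' literal so the URL matches the examples on this page.
    return f"{url}?{urlencode(params, safe=':,*')}"
```

For example, scrape_url(project_id, metric_name="crusoe_vm_disk_.*", labels={"device": "vda1"}) reproduces the combined filter URL shown above.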

Platform-Specific Examples

Datadog:

instances:
  - prometheus_url: "https://api.crusoecloud.com/v1alpha5/projects/<project-id>/metrics/scrape?metric_name=crusoe_vm_memory_.*"
    namespace: "crusoe"
    metrics:
      - "*"
    headers:
      Authorization: "Bearer <monitoring-token>"

Splunk OpenTelemetry Collector:

metrics_path: "/v1alpha5/projects/<project-id>/metrics/scrape"
params:
  labels: ["collector:disk"]

Grafana or other Prometheus-compatible platforms:

Add query parameters directly to the configured endpoint URL.

Limitations

Metrics only — Log streaming is planned for a future release.
Minimum scrape interval — 60 seconds.
Metrics retention — Metrics are retained for 30 days on the Crusoe backend. External platform retention is governed by your platform's policies.

What's Next

Metrics — View metrics directly in the Crusoe Console
Logs — Access centralized log data
Notifications — Get notified about resource health via email and in-console, and set up alert routing to Slack or webhooks
