Logs

System logs are collected automatically for Crusoe-managed resources and made available in the Console, so no SSH or manual log aggregation is required. The Crusoe Watch Agent collects the logs, which you can search, filter, and inspect directly in the Console or query via the API. The API also supports discovery queries that let you explore available log fields, field values, and streams before writing full queries.

Managed logs are available for both Crusoe Managed Kubernetes (CMK) clusters and Crusoe Virtual Machines (VMs).

Prerequisites

To use Logs, you need:

For CMK clusters:

  • CMK cluster with Crusoe Watch Agent version 0.3.1 or above installed (see Get started)
  • NVIDIA GPU Operator add-on (if using GPU nodes)

For VMs:

  • Crusoe Watch Agent version vm-v1.0.3 or above installed (see VM Telemetry)

Log Sources

The Crusoe Watch Agent collects the following log sources:

| Log Source | Description | Availability |
| --- | --- | --- |
| JournalD | System-level logs from journald, including kernel messages such as GPU XID errors and OOM events, and system services. CMK nodes also include kubelet and container runtime. Supported for NVIDIA GPU accelerated instances, AMD GPU accelerated instances, and non-GPU instances. | CMK and VM |
| crusoe-watch-agent | Crusoe Watch Agent service logs | CMK and VM |
| cwa-config-reloader | Crusoe Watch Agent config reloader logs | VM only |

Accessing Logs Using Console UI

You can access logs in the Console UI in two ways:

  • Managed Logs page — Navigate to Command Center in the left navigation bar, then select Managed Logs to search logs across all your CMK clusters and VMs in a unified view.
  • Resource-specific view — Navigate to Orchestration > select your cluster > Logs tab.

Searching and Filtering

You can use the following filters to narrow your log search:

| Filter | Description |
| --- | --- |
| Instance name | Filter logs by specific node or VM name |
| Log source | Filter by log source (see Log Sources) |
| Severity | Filter by log severity level (see severity levels below) |
| Time window | Specify a start and end time to narrow results |
| Text search | Search log content using basic text matching |

Combine multiple filters to narrow results. For example, search for XID errors in JournalD logs from a specific node within the last 24 hours.

Log Severity Levels

Logs are normalized to the 8-tier RFC 5424 severity taxonomy:

| Level | Severity | Description |
| --- | --- | --- |
| 0 | Emergency | System is unusable |
| 1 | Alert | Action must be taken immediately |
| 2 | Critical | Critical condition; application cannot continue |
| 3 | Error | Error handled, service continues |
| 4 | Warning | Unexpected situation, but handled gracefully |
| 5 | Notice | Normal but significant condition |
| 6 | Info | Normal operational events (startup, shutdown, config changes) |
| 7 | Debug | Detailed diagnostic information |
| | Undefined | Log entry has no severity field |
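
Because lower numbers mean higher severity, selecting "this level and worse" is a matter of taking every level up to a threshold. The sketch below is one way to build such a list for a repeatable levels filter. The names ERROR, WARNING, and INFO appear in this page's examples; the upper-case spellings of the other five names are an assumption, and levels_at_least is a hypothetical helper.

```python
# RFC 5424 numeric levels mapped to severity names.
# ERROR, WARNING, and INFO appear in this page's examples; the other
# upper-case spellings are assumed, not confirmed by the API reference.
SEVERITIES = {
    0: "EMERGENCY",
    1: "ALERT",
    2: "CRITICAL",
    3: "ERROR",
    4: "WARNING",
    5: "NOTICE",
    6: "INFO",
    7: "DEBUG",
}


def levels_at_least(name: str) -> list[str]:
    """Return every severity name at least as severe as `name`."""
    by_name = {label: number for number, label in SEVERITIES.items()}
    threshold = by_name[name.upper()]
    # Lower numbers are more severe, so include levels 0..threshold.
    return [SEVERITIES[number] for number in range(threshold + 1)]


print(levels_at_least("warning"))
```

Entries with an undefined severity have no numeric level, so a threshold filter like this one would not match them.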

Querying Logs via API

Queries use LogsQL, VictoriaLogs' query language.

Authentication

Use the same monitoring token generated for metrics access (see Get started). Pass it as a bearer token:

Authorization: Bearer $monitoring_token
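
For clients written in Python rather than curl, the same header can be attached with the standard library. This is a minimal sketch: MONITORING_TOKEN is a hypothetical environment variable name, PROJECT_ID is a placeholder, and the request is only constructed here, not sent.

```python
import os
import urllib.request


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the monitoring token as a bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# MONITORING_TOKEN is a hypothetical variable name; use whatever holds your token.
token = os.environ.get("MONITORING_TOKEN", "example-token")
req = authorized_request(
    "https://api.crusoecloud.com/v1/projects/PROJECT_ID/logs", token
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; reading the token from the environment keeps it out of source control.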

Conventions

  • Time formats accepted by start, end, start_time, end_time, time, step, and offset: Unix epoch seconds, relative durations (5m, 1h, 6h), RFC3339 (2026-05-10T12:00:00Z), or the literal now.
  • Default time window: the last 15 minutes (now-15m to now).
  • Retention boundary: a start_time older than the 7-day retention window returns 400.
  • Unknown query parameters return 400 with the list of accepted names.
  • LogsQL queries (query parameter) are limited to 4096 characters and 10 pipe operations.
  • Repeatable parameters (e.g. levels, instance_names, cluster_id) accept multiple occurrences: ?levels=ERROR&levels=WARNING.
  • NDJSON responses contain one JSON object per line; JSON responses are a single object.
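
Two of these conventions are easy to check on the client before a request is sent: the query limits and the encoding of repeatable parameters. The sketch below assumes a naive pipe count (every | character counts, even inside a quoted string), so it may reject a query the server would accept; check_query and encode_params are hypothetical helper names.

```python
from urllib.parse import urlencode

MAX_QUERY_CHARS = 4096  # documented LogsQL length limit
MAX_PIPES = 10          # documented pipe-operation limit


def check_query(query: str) -> None:
    """Raise ValueError if the query exceeds the documented limits."""
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"query has {len(query)} characters; limit is {MAX_QUERY_CHARS}")
    # Naive count: a '|' inside a quoted string is counted as a pipe too.
    pipes = query.count("|")
    if pipes > MAX_PIPES:
        raise ValueError(f"query has {pipes} pipes; limit is {MAX_PIPES}")


def encode_params(params: dict) -> str:
    """URL-encode parameters, repeating the key for each item of a list value."""
    return urlencode(params, doseq=True)


check_query("level:ERROR | stats count() AS errors")
print(encode_params({"levels": ["ERROR", "WARNING"], "start": "now-1h"}))
```

The doseq=True flag is what turns a list into repeated occurrences of the same key rather than a single stringified list.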

Endpoints

All endpoints are under https://api.crusoecloud.com/v1/projects/{project_id}. Cluster-scoped variants are under https://api.crusoecloud.com/v1/projects/{project_id}/clusters/{cluster_id}.
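
The two base paths differ only by the /clusters/{cluster_id} segment, so a single helper can produce both. A minimal sketch, with logs_url as a hypothetical function name:

```python
from typing import Optional
from urllib.parse import quote

API_ROOT = "https://api.crusoecloud.com/v1"


def logs_url(project_id: str, endpoint: str = "", cluster_id: Optional[str] = None) -> str:
    """Return the URL of a logs endpoint, cluster-scoped when cluster_id is given."""
    base = f"{API_ROOT}/projects/{quote(project_id, safe='')}"
    if cluster_id is not None:
        base += f"/clusters/{quote(cluster_id, safe='')}"
    suffix = f"/{endpoint}" if endpoint else ""
    return f"{base}/logs{suffix}"


print(logs_url("my-project", "count"))
print(logs_url("my-project", "count", cluster_id="my-cluster"))
```

Only the facets, count, and histogram endpoints are listed as having cluster-scoped variants, so passing cluster_id with any other endpoint name is not expected to work.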

Project-scoped endpoints:

| Endpoint | Purpose | Response |
| --- | --- | --- |
| GET /logs/query | Run a raw LogsQL query and return matching log entries. | NDJSON |
| GET /logs/tail | Live tail stream of incoming log entries (SSE). | SSE |
| GET /logs | Structured log listing, filterable by instance names, severity levels, cluster, and log source. | JSON |
| GET /logs/facets | Aggregated facet counts for use in filtering UI. | JSON |
| GET /logs/count | Total count of log entries matching a query. | JSON |
| GET /logs/histogram | Log counts bucketed over a time range. | JSON |
| GET /logs/fields | List field names present in matching logs, with hit counts. Use to discover available fields before writing queries. | JSON |
| GET /logs/field_values | List distinct values of a single field, with hit counts. Use to inspect what values a field takes. | JSON |
| GET /logs/streams | List log streams matching a LogsQL query. Use to enumerate available log streams. | JSON |
| GET /logs/stats | Point-in-time stats query (query must contain a stats pipe). | JSON |
| GET /logs/stats_range | Range stats query over time (query must contain a stats pipe). | JSON |

Cluster-scoped endpoints (under .../clusters/{cluster_id}):

| Endpoint | Purpose | Response |
| --- | --- | --- |
| GET /logs/facets | Cluster-scoped facet aggregation. | JSON |
| GET /logs/count | Cluster-scoped log count. | JSON |
| GET /logs/histogram | Cluster-scoped log counts bucketed over time. | JSON |

GET /logs/query

Run a raw LogsQL query. If the query has no _time: filter, the time bounds from start/end are injected automatically.

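| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string (LogsQL) | yes | | Validated |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 5000 | Must be > 0; values above 5000 are capped to 5000 |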
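
The limit rule can be mirrored on the client to predict how many entries a request can return at most. This sketch only restates the rule in the table; effective_limit is a hypothetical helper, and the server remains the source of truth.

```python
QUERY_LIMIT_CAP = 5000  # documented default and cap for GET /logs/query


def effective_limit(requested: int = QUERY_LIMIT_CAP, cap: int = QUERY_LIMIT_CAP) -> int:
    """Return the limit the server is documented to apply for a requested value."""
    if requested <= 0:
        raise ValueError("limit must be > 0")
    # Values above the cap are reduced to the cap rather than rejected.
    return min(requested, cap)


print(effective_limit(10))
print(effective_limit(20000))
```

A response that contains exactly the effective limit may have been truncated; narrowing the start and end window is one way to retrieve the remaining entries.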

Example — query logs for a specific VM:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id"

Example — limit results to 10 entries:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id" \
--data-urlencode "limit=10"

Example — search for error logs in a specific VM:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id AND error"
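
Since this endpoint returns NDJSON, the response body has to be decoded line by line rather than as one JSON document. A minimal sketch follows; the sample body is invented for illustration, using the _msg and level field names that appear elsewhere on this page.

```python
import json


def parse_ndjson(body: str) -> list[dict]:
    """Decode an NDJSON body into a list of log entries, skipping blank lines."""
    entries = []
    for line in body.splitlines():
        line = line.strip()
        if line:
            entries.append(json.loads(line))
    return entries


# Invented sample body; real entries carry more fields.
sample = (
    '{"_msg": "disk full", "level": "ERROR"}\n'
    '{"_msg": "service started", "level": "INFO"}\n'
)
for entry in parse_ndjson(sample):
    print(entry["level"], entry["_msg"])
```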

GET /logs/tail

Stream incoming log entries as a live tail using Server-Sent Events (SSE). Each event carries one log entry as a single-line JSON object.

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string (LogsQL) | yes | | Validated |
| start | string (time) | no | now | Stream logs received after this point in time |

Limits: Maximum session duration is 10 minutes. There is a limit on the number of concurrent tail connections per project; opening a new connection when the limit is reached returns 429.

Example — tail all logs:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/tail" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=*"
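
A client consuming the tail has to split the stream into SSE events before decoding them. The sketch below follows the standard SSE framing (data: lines, a blank line ends the event, lines starting with a colon are comments) and assumes each event's data is one JSON log entry; the sample lines are invented, and iter_sse_entries is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_sse_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded log entry per SSE event read from an iterable of lines."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
        # Comment lines (":...") and other SSE fields are ignored.


sample_stream = [
    'data: {"_msg": "node ready", "level": "INFO"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"_msg": "mount failed", "level": "ERROR"}\n',
    "\n",
]
for entry in iter_sse_entries(sample_stream):
    print(entry["level"], entry["_msg"])
```

Because sessions end after at most 10 minutes, a long-running consumer would need to reconnect, ideally passing a start value near the last entry it received.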

GET /logs

Return a structured list of log entries, with optional filters for instance names, severity levels, cluster, and log source.

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string (LogsQL) | no | * | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Max 1000 |
| instance_names | string | no | | Repeatable; filter to specific VM or node names |
| levels | string | no | | Repeatable; RFC 5424 severity names (e.g. ERROR) |
| cluster_id | string | no | | Repeatable; filter to specific cluster IDs |
| log_source | string | no | | Filter to a specific log source |

Example — list ERROR logs from the last hour:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "levels=ERROR" \
--data-urlencode "start=now-1h"

GET /logs/facets

Return aggregated facet counts for the matching log entries. Useful for populating filter dropdowns in a UI.

| Parameter | Type | Required | Default |
| --- | --- | --- | --- |
| query | string (LogsQL) | yes | |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |

Example — get facet counts for all logs in the last hour:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/facets" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=*" \
--data-urlencode "start=now-1h"

GET /logs/count

Return the total count of log entries matching a query within the given time window.

| Parameter | Type | Required | Default |
| --- | --- | --- | --- |
| query | string (LogsQL) | yes | |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |

Example — count ERROR logs in the last 24 hours:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/count" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-24h"

GET /logs/histogram

Return log counts bucketed over a time range. Use to draw a log-volume chart.

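| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string (LogsQL) | yes | | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| step | string (duration) | no | | Bucket size, e.g. 5m, 15m |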

Example — ERROR log histogram over the last 6 hours in 15-minute buckets:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/histogram" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-6h" \
--data-urlencode "step=15m"
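
When choosing a step, it helps to estimate how many buckets a window produces. The sketch below handles only single-unit durations such as 15m or 6h (the s and d units are an assumption) and gives an upper bound, since this page does not say whether empty buckets are returned.

```python
import math

# Seconds per unit for simple single-unit durations such as "15m" or "6h".
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_seconds(text: str) -> int:
    """Convert a duration like '15m' or '6h' to seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def bucket_count(window: str, step: str) -> int:
    """Upper bound on the number of buckets covering `window` at size `step`."""
    return math.ceil(duration_seconds(window) / duration_seconds(step))


# The example above: a 6-hour window in 15-minute buckets.
print(bucket_count("6h", "15m"))  # 24
```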

GET /logs/fields

List the field names present in logs matching the query, with hit counts.

| Parameter | Type | Required | Default |
| --- | --- | --- | --- |
| query | string (LogsQL) | yes | |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |

Response:

{
  "values": [
    { "value": "_msg", "hits": 1234 },
    { "value": "level", "hits": 1230 }
  ]
}

Example — list fields available in JournalD logs:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/fields" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=log_source:journald"

GET /logs/field_values

List distinct values of a single field, with hit counts.

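| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| field | string | yes | | Internal/forbidden fields return 400 |
| query | string (LogsQL) | yes | | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Must be ≥ 1; values above 1000 are capped to 1000 |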

Response:

{
  "values": [
    { "value": "INFO", "hits": 8123 },
    { "value": "ERROR", "hits": 142 }
  ]
}
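
A response of this shape converts naturally into a value-to-hits mapping. The sketch below parses the sample response shown above and derives each value's share of the total; hits_by_value is a hypothetical helper name.

```python
import json


def hits_by_value(body: str) -> dict[str, int]:
    """Map each value to its hit count, ordered from most to fewest hits."""
    payload = json.loads(body)
    pairs = [(item["value"], item["hits"]) for item in payload["values"]]
    return dict(sorted(pairs, key=lambda pair: pair[1], reverse=True))


# The sample response from this section, as a single string.
response = '{"values": [{"value": "INFO", "hits": 8123}, {"value": "ERROR", "hits": 142}]}'
counts = hits_by_value(response)
total = sum(counts.values())
for value, hits in counts.items():
    print(f"{value}: {hits} ({100 * hits / total:.1f}%)")
```

The same helper applies to the /logs/fields response, which uses the identical values/hits structure.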

Example — list the distinct severity levels seen in the last hour:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/field_values" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "field=level" \
--data-urlencode "query=*" \
--data-urlencode "start=now-1h"

GET /logs/streams

List log streams (label-set identifiers) matching a LogsQL query.

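| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | string (LogsQL) | yes | | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Must be ≥ 1; values above 1000 are capped to 1000 |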

Example — list streams emitting JournalD logs in the last hour:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/streams" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=log_source:journald" \
--data-urlencode "start=now-1h"

GET /logs/stats

Run a point-in-time LogsQL stats aggregation, e.g. * | stats count().

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| query | string (LogsQL) | yes | Must contain a stats pipe, otherwise 400 |
| time | string (time) | no | Point-in-time evaluation timestamp |

Example — total log count grouped by severity right now:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/stats" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=* | stats by (level) count() AS total"

GET /logs/stats_range

Run a LogsQL stats aggregation over a time range with stepping.

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| query | string (LogsQL) | yes | Must contain a stats pipe, otherwise 400 |
| start | string (time) | no | |
| end | string (time) | no | |
| step | string (duration) | no | Bucket size, e.g. 5m, 1h |
| offset | string (duration) | no | Time offset, e.g. 2h, 5h |

Example — error count per 5-minute bucket over the last 6 hours:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/stats_range" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR | stats count() AS errors" \
--data-urlencode "start=now-6h" \
--data-urlencode "step=5m"

Cluster-Scoped Endpoints

The following endpoints are identical to their project-scoped counterparts but automatically filter results to a specific cluster. Use them when you want to scope queries to a single CMK cluster without adding a cluster_id filter to every request.

Base URL: https://api.crusoecloud.com/v1/projects/{project_id}/clusters/{cluster_id}

  • GET /logs/facets — Cluster-scoped facet aggregation
  • GET /logs/count — Cluster-scoped log count
  • GET /logs/histogram — Cluster-scoped log histogram

Example — count error logs for a specific cluster in the last hour:

curl -G "https://api.crusoecloud.com/v1/projects/$project_id/clusters/$cluster_id/logs/count" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-1h"

Log Retention

Logs are retained for 7 days, then automatically purged.

Rate Limits and Quotas

| Limit | Value |
| --- | --- |
| Maximum time range per query | 7 days |
| Maximum queries per 5 minutes | 150 |
| LogsQL query quota (per project, per user, per day) | 10,000 (HTTP 429 returned if exceeded) |
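
A script that issues many queries can pace itself to stay under the 150-queries-per-5-minutes limit. The sketch below uses a client-side sliding window; it assumes the server counts requests over a rolling 5-minute window, which this page does not specify, and SlidingWindowLimiter is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track recent calls and report how long to wait before the next one."""

    def __init__(self, max_calls: int = 150, window_seconds: float = 300.0,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self._calls = deque()  # timestamps of calls still inside the window

    def wait_time(self) -> float:
        """Seconds until another call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._calls[0])

    def record(self) -> None:
        """Record that a call was just made."""
        self._calls.append(self.clock())


limiter = SlidingWindowLimiter()
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # call this right after sending each query
```

Client-side pacing does not replace handling HTTP 429: the daily quota is tracked separately, so a 429 response can still occur and should be treated as a signal to stop or back off.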

Common Troubleshooting Workflows

Diagnosing Storage Mount Issues

  1. Navigate to Logs and filter by node instance name.
  2. Set the log source to JournalD and search for kubelet entries.
  3. Search for mount errors: MountVolume, nfs.
  4. Check for filesystem errors, RAID issues, or NFS connectivity problems.
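
The same investigation can be approximated through GET /logs. The sketch below builds the query string for the first three steps; it assumes the log_source parameter accepts the value journald (as used in the LogsQL examples on this page) and that the text search maps to the LogsQL expression MountVolume OR nfs. The node name and mount_issue_params are hypothetical.

```python
from urllib.parse import quote, urlencode


def mount_issue_params(node_name: str, start: str = "now-1h") -> str:
    """Build a GET /logs query string for mount-related JournalD entries on one node."""
    params = {
        "instance_names": [node_name],    # step 1: filter by node instance name
        "log_source": "journald",         # step 2: assumed parameter value
        "query": "MountVolume OR nfs",    # step 3: match either search term
        "start": start,
    }
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return urlencode(params, doseq=True, quote_via=quote)


print(mount_issue_params("node-1"))
```

Appending the result to the project's /logs URL after a ? gives a request equivalent to the Console filters; reviewing the returned entries for filesystem, RAID, or NFS errors remains a manual step.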

What's Next

  • Topology — Identify unhealthy nodes and run diagnostics
  • Metrics — Correlate log events with performance data
  • Notifications — Get notified about resource health via email and in-console, and set up alert routing to Slack or webhooks