Logs
System logs for Crusoe-managed resources are collected automatically and made available in the Console. No SSH access or manual log aggregation is required. The Crusoe Watch Agent collects the logs, which you can search, filter, and inspect directly in the Console or query via the API. The API also supports discovery queries that let you explore available log fields, field values, and streams before writing full queries.
Managed Logs is available for both Crusoe Managed Kubernetes (CMK) clusters and Crusoe Virtual Machines (VMs).
Prerequisites
To use Logs, you need:
For CMK clusters:
- CMK cluster with Crusoe Watch Agent version 0.3.1 or above installed (see Get started)
- NVIDIA GPU Operator add-on (if using GPU nodes)
For VMs:
- Crusoe Watch Agent version vm-v1.0.3 or above installed (see VM Telemetry)
Log Sources
The Crusoe Watch Agent collects the following log sources:
| Log Source | Description | Availability |
|---|---|---|
| JournalD | System-level logs from journald, including kernel messages such as GPU XID errors and OOM events, and system services. CMK nodes also include kubelet and container runtime. Supported for NVIDIA GPU accelerated instances, AMD GPU accelerated instances, and non-GPU instances. | CMK and VM |
| crusoe-watch-agent | Crusoe Watch Agent service logs | CMK and VM |
| cwa-config-reloader | Crusoe Watch Agent config reloader logs | VM only |
Accessing Logs Using Console UI
You can access logs in the Console UI in two ways:
- Managed Logs page — Navigate to Command Center in the left navigation bar, then select Managed Logs to search logs across all your CMK clusters and VMs in a unified view.
- Resource-specific view — Navigate to Orchestration > select your cluster > Logs tab.
Searching and Filtering
You can use the following filters to narrow your log search:
| Filter | Description |
|---|---|
| Instance name | Filter logs by specific node or VM name |
| Log source | Filter by log source (see Log Sources) |
| Severity | Filter by log severity level (see severity levels below) |
| Time window | Specify a start and end time to narrow results |
| Text search | Search log content using basic text matching |
Combine multiple filters to narrow results. For example, search for XID errors in JournalD logs from a specific node within the last 24 hours.
Log Severity Levels
Logs are normalized to the 8-tier RFC 5424 severity taxonomy:
| Level | Severity | Description |
|---|---|---|
| 0 | Emergency | System is unusable |
| 1 | Alert | Action must be taken immediately |
| 2 | Critical | Critical condition; application cannot continue |
| 3 | Error | Error handled, service continues |
| 4 | Warning | Unexpected situation, but handled gracefully |
| 5 | Notice | Normal but significant condition |
| 6 | Info | Normal operational events (startup, shutdown, config changes) |
| 7 | Debug | Detailed diagnostic information |
| — | Undefined | Log entry has no severity field |
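The severity table above can be mirrored on the client side when you want to compare or threshold levels. The following is a minimal sketch, not part of the Crusoe API: the names `Severity`, `parse_severity`, and `at_least` are hypothetical, and it assumes the API reports levels as upper-case names such as ERROR, WARNING, and INFO (the forms used in the examples on this page).

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """RFC 5424 severity levels; lower numbers are more severe."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFO = 6
    DEBUG = 7


def parse_severity(name: Optional[str]) -> Optional[Severity]:
    """Return the Severity for a level name, or None for missing or unknown names."""
    if not name:
        return None  # corresponds to the "Undefined" row in the table
    return Severity.__members__.get(name.strip().upper())


def at_least(level: Optional[Severity], threshold: Severity) -> bool:
    """True when `level` is as severe as `threshold` or more severe."""
    return level is not None and level <= threshold
```

For example, `at_least(parse_severity("critical"), Severity.ERROR)` is true, while an entry with no severity field never passes a threshold.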
Querying Logs via API
Queries use LogsQL, VictoriaLogs' query language.
Authentication
Use the same monitoring token generated for metrics access (see Get started). Pass it as a bearer token:
Authorization: Bearer $monitoring_token
Conventions
- Time formats accepted by `start`, `end`, `start_time`, `end_time`, `time`, `step`, and `offset`: Unix epoch seconds, relative durations (`5m`, `1h`, `6h`), RFC3339 (`2026-05-10T12:00:00Z`), or the literal `now`.
- Default time window: the last 15 minutes (`now-15m` to `now`).
- Retention boundary: a `start_time` older than the 7-day retention window returns `400`.
- Unknown query parameters return `400` with the list of accepted names.
- LogsQL queries (`query` parameter) are limited to 4096 characters and 10 pipe operations.
- Repeatable parameters (e.g. `levels`, `instance_names`, `cluster_id`) accept multiple occurrences: `?levels=ERROR&levels=WARNING`.
- NDJSON responses contain one JSON object per line; JSON responses are a single object.
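The query size limits above can be checked on the client before a request is sent. The sketch below is an assumption-laden approximation: `count_pipes` and `check_query` are hypothetical helpers, and the pipe count is a naive scan for `|` outside quoted strings, which may not match how the server counts pipe operations.

```python
MAX_QUERY_CHARS = 4096  # documented query length limit
MAX_PIPES = 10          # documented pipe operation limit


def count_pipes(query: str) -> int:
    """Count '|' characters that sit outside quoted strings (naive scan)."""
    pipes = 0
    quote = None     # the quote character we are currently inside, if any
    escaped = False  # True right after a backslash inside a quoted string
    for ch in query:
        if quote is not None:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in ('"', "'", "`"):
            quote = ch
        elif ch == "|":
            pipes += 1
    return pipes


def check_query(query: str) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if len(query) > MAX_QUERY_CHARS:
        problems.append(f"query is {len(query)} characters; limit is {MAX_QUERY_CHARS}")
    pipes = count_pipes(query)
    if pipes > MAX_PIPES:
        problems.append(f"query has {pipes} pipes; limit is {MAX_PIPES}")
    return problems
```

A check like this only saves a round trip; the server remains the authority on what is accepted.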
Endpoints
All endpoints are under https://api.crusoecloud.com/v1/projects/{project_id}. Cluster-scoped variants are under https://api.crusoecloud.com/v1/projects/{project_id}/clusters/{cluster_id}.
Project-scoped endpoints:
| Endpoint | Purpose | Response |
|---|---|---|
| GET /logs/query | Run a raw LogsQL query and return matching log entries. | NDJSON |
| GET /logs/tail | Live tail stream of incoming log entries (SSE). | SSE |
| GET /logs | Structured log listing, filterable by instance names, severity levels, cluster, and log source. | JSON |
| GET /logs/facets | Aggregated facet counts for use in filtering UI. | JSON |
| GET /logs/count | Total count of log entries matching a query. | JSON |
| GET /logs/histogram | Log counts bucketed over a time range. | JSON |
| GET /logs/fields | List field names present in matching logs, with hit counts. Use to discover available fields before writing queries. | JSON |
| GET /logs/field_values | List distinct values of a single field, with hit counts. Use to inspect what values a field takes. | JSON |
| GET /logs/streams | List log streams matching a LogsQL query. Use to enumerate available log streams. | JSON |
| GET /logs/stats | Point-in-time stats query (query must contain a stats pipe). | JSON |
| GET /logs/stats_range | Range stats query over time (query must contain a stats pipe). | JSON |
Cluster-scoped endpoints (under .../clusters/{cluster_id}):
| Endpoint | Purpose | Response |
|---|---|---|
| GET /logs/facets | Cluster-scoped facet aggregation. | JSON |
| GET /logs/count | Cluster-scoped log count. | JSON |
| GET /logs/histogram | Cluster-scoped log counts bucketed over time. | JSON |
GET /logs/query
Run a raw LogsQL query. If the query has no _time: filter, the time bounds from start/end are injected automatically.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string (LogsQL) | yes | — | Validated |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 5000 | Must be > 0; values above 5000 are capped to 5000 |
Example — query logs for a specific VM:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id"
Example — limit results to 10 entries:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id" \
--data-urlencode "limit=10"
Example — search for error logs in a specific VM:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/query" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=crusoe_vm_id:$vm_id AND error"
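Because this endpoint returns NDJSON, a client reads the body line by line instead of parsing it as one JSON document. A minimal sketch of that step follows; `parse_ndjson` is a hypothetical helper, and the sample body is invented for illustration (only the `_msg` and `level` field names appear elsewhere on this page, and real entries may carry other fields).

```python
import json
from typing import Iterator


def parse_ndjson(body: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of an NDJSON response body."""
    for line in body.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical response body, for illustration only.
sample = (
    '{"_msg": "example error message", "level": "ERROR"}\n'
    '{"_msg": "example info message", "level": "INFO"}\n'
)
entries = list(parse_ndjson(sample))
errors = [entry for entry in entries if entry.get("level") == "ERROR"]
```

Parsing lazily like this also works on a streamed response, since each line is complete on its own.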
GET /logs/tail
Stream incoming log entries as a live tail using Server-Sent Events (SSE). Each event carries a single log entry as one JSON object, in the same shape as an NDJSON line.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string (LogsQL) | yes | — | Validated |
| start | string (time) | no | now | Stream logs received after this point in time |
Limits: Maximum session duration is 10 minutes. There is a limit on the number of concurrent tail connections per project; opening a new connection when the limit is reached returns 429.
Example — tail all logs:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/tail" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=*"
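A client consuming the tail stream has to split the SSE framing from the log payload. The sketch below handles only the generic SSE wire format (`data:` lines, blank-line event terminators, `:` comment lines); `iter_sse_data` is a hypothetical helper, and the sample stream and its keep-alive comment are assumptions, not captured output from the service.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
    if data:
        # Flush a trailing event if the stream ended without a blank line.
        yield "\n".join(data)


# Hypothetical stream contents, for illustration only.
stream = ['data: {"_msg": "a"}\n', "\n", ": ping\n", 'data: {"_msg": "b"}\n', "\n"]
messages = [json.loads(payload)["_msg"] for payload in iter_sse_data(stream)]
```

Since sessions end after 10 minutes, a long-running consumer would need to reconnect around a loop like this one.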
GET /logs
Return a structured list of log entries, with optional filters for instance names, severity levels, cluster, and log source.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string (LogsQL) | no | * | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Max 1000 |
| instance_names | string | no | — | Repeatable; filter to specific VM or node names |
| levels | string | no | — | Repeatable; RFC 5424 severity names (e.g. ERROR) |
| cluster_id | string | no | — | Repeatable; filter to specific cluster IDs |
| log_source | string | no | — | Filter to a specific log source |
Example — list ERROR logs from the last hour:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "levels=ERROR" \
--data-urlencode "start=now-1h"
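Repeatable parameters are sent by repeating the parameter name, which is easy to get wrong when building URLs by hand. The sketch below shows one way to assemble a request URL for this endpoint with the standard library. `build_logs_url` is a hypothetical helper; the client-side limit check assumes a minimum of 1 (the table only states the maximum), and the function only builds the URL without sending anything.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

API_ROOT = "https://api.crusoecloud.com/v1"


def build_logs_url(
    project_id: str,
    *,
    query: Optional[str] = None,
    start: Optional[str] = None,
    end: Optional[str] = None,
    limit: Optional[int] = None,
    log_source: Optional[str] = None,
    instance_names: Sequence[str] = (),
    levels: Sequence[str] = (),
    cluster_ids: Sequence[str] = (),
) -> str:
    """Build a GET /logs URL, repeating the name of each repeatable parameter."""
    if limit is not None and not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    single = {"query": query, "start": start, "end": end,
              "limit": limit, "log_source": log_source}
    pairs = [(name, str(value)) for name, value in single.items() if value is not None]
    # Repeatable parameters: one name=value pair per occurrence.
    pairs += [("instance_names", name) for name in instance_names]
    pairs += [("levels", level) for level in levels]
    pairs += [("cluster_id", cluster_id) for cluster_id in cluster_ids]
    url = f"{API_ROOT}/projects/{project_id}/logs"
    return f"{url}?{urlencode(pairs)}" if pairs else url
```

Calling `build_logs_url("p1", start="now-1h", levels=["ERROR", "WARNING"])` yields a URL ending in `?start=now-1h&levels=ERROR&levels=WARNING`.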
GET /logs/facets
Return aggregated facet counts for the matching log entries. Useful for populating filter dropdowns in a UI.
| Parameter | Type | Required | Default |
|---|---|---|---|
| query | string (LogsQL) | yes | — |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |
Example — get facet counts for all logs in the last hour:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/facets" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=*" \
--data-urlencode "start=now-1h"
GET /logs/count
Return the total count of log entries matching a query within the given time window.
| Parameter | Type | Required | Default |
|---|---|---|---|
| query | string (LogsQL) | yes | — |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |
Example — count ERROR logs in the last 24 hours:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/count" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-24h"
GET /logs/histogram
Return log counts bucketed over a time range. Use to draw a log-volume chart.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string (LogsQL) | yes | — | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| step | string (duration) | no | — | Bucket size, e.g. 5m, 15m |
Example — ERROR log histogram over the last 6 hours in 15-minute buckets:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/histogram" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-6h" \
--data-urlencode "step=15m"
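When choosing a step, it helps to estimate how many buckets a request will produce. The arithmetic is sketched below with hypothetical helpers `duration_seconds` and `bucket_count`. It assumes simple single-unit durations such as 15m or 6h; the API may accept other forms, and server-side bucket alignment could add one bucket at the edges.

```python
import math
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_seconds(text: str) -> int:
    """Convert a simple duration such as '15m' or '6h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def bucket_count(window: str, step: str) -> int:
    """Number of step-sized buckets needed to cover the window."""
    step_seconds = duration_seconds(step)
    if step_seconds == 0:
        raise ValueError("step must be greater than zero")
    # Round up so a partial final bucket is counted.
    return math.ceil(duration_seconds(window) / step_seconds)
```

For the example above, a 6-hour window with 15-minute buckets gives `bucket_count("6h", "15m")`, which is 24.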
GET /logs/fields
List the field names present in logs matching the query, with hit counts.
| Parameter | Type | Required | Default |
|---|---|---|---|
| query | string (LogsQL) | yes | — |
| start | string (time) | no | now-15m |
| end | string (time) | no | now |
Response:
{
"values": [
{ "value": "_msg", "hits": 1234 },
{ "value": "level", "hits": 1230 }
]
}
Example — list fields available in JournalD logs:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/fields" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=log_source:journald"
GET /logs/field_values
List distinct values of a single field, with hit counts.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| field | string | yes | — | Internal/forbidden fields return 400 |
| query | string (LogsQL) | yes | — | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Must be ≥ 1; values above 1000 are capped to 1000 |
Response:
{
"values": [
{ "value": "INFO", "hits": 8123 },
{ "value": "ERROR", "hits": 142 }
]
}
Example — list the distinct severity levels seen in the last hour:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/field_values" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "field=level" \
--data-urlencode "query=*" \
--data-urlencode "start=now-1h"
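The `values` response shape shown above (shared by the fields and field_values endpoints) is straightforward to post-process. As an illustration, the sketch below turns hit counts into shares of the total; `value_shares` is a hypothetical helper, and the sample body simply reuses the figures from the response example on this page.

```python
import json


def value_shares(body: str) -> dict[str, float]:
    """Map each value to its share of total hits (0.0 to 1.0)."""
    values = json.loads(body)["values"]
    total = sum(item["hits"] for item in values)
    if total == 0:
        return {item["value"]: 0.0 for item in values}
    return {item["value"]: item["hits"] / total for item in values}


# Same figures as the response example above.
sample = '{"values": [{"value": "INFO", "hits": 8123}, {"value": "ERROR", "hits": 142}]}'
shares = value_shares(sample)
```

With these figures, ERROR entries account for 142 of 8265 hits, a little under 2% of the matching logs.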
GET /logs/streams
List log streams (label-set identifiers) matching a LogsQL query.
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| query | string (LogsQL) | yes | — | |
| start | string (time) | no | now-15m | |
| end | string (time) | no | now | |
| limit | integer | no | 100 | Must be ≥ 1; values above 1000 are capped to 1000 |
Example — list streams emitting JournalD logs in the last hour:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/streams" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=log_source:journald" \
--data-urlencode "start=now-1h"
GET /logs/stats
Run a point-in-time LogsQL stats aggregation, e.g. * | stats count().
| Parameter | Type | Required | Notes |
|---|---|---|---|
| query | string (LogsQL) | yes | Must contain a stats pipe, otherwise 400 |
| time | string (time) | no | Point-in-time evaluation timestamp |
Example — total log count grouped by severity level, evaluated right now:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/stats" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=* | stats by (level) count() AS total"
GET /logs/stats_range
Run a LogsQL stats aggregation over a time range with stepping.
| Parameter | Type | Required | Notes |
|---|---|---|---|
| query | string (LogsQL) | yes | Must contain a stats pipe, otherwise 400 |
| start | string (time) | no | |
| end | string (time) | no | |
| step | string (duration) | no | Bucket size, e.g. 5m, 1h |
| offset | string (duration) | no | Time offset, e.g. 2h, 5h |
Example — ERROR log count per 5-minute bucket over the last 6 hours:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/logs/stats_range" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR | stats count() AS errors" \
--data-urlencode "start=now-6h" \
--data-urlencode "step=5m"
Cluster-Scoped Endpoints
The following endpoints are identical to their project-scoped counterparts but automatically filter results to a specific cluster. Use them when you want to scope queries to a single CMK cluster without adding a cluster_id filter to every request.
Base URL: https://api.crusoecloud.com/v1/projects/{project_id}/clusters/{cluster_id}
- GET /logs/facets — Cluster-scoped facet aggregation
- GET /logs/count — Cluster-scoped log count
- GET /logs/histogram — Cluster-scoped log histogram
Example — count error logs for a specific cluster in the last hour:
curl -G "https://api.crusoecloud.com/v1/projects/$project_id/clusters/$cluster_id/logs/count" \
-H "Authorization: Bearer $monitoring_token" \
--data-urlencode "query=level:ERROR" \
--data-urlencode "start=now-1h"
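Since only three endpoints have cluster-scoped variants, a client can centralize the choice between the two base URLs. The following is a sketch under that assumption; `logs_endpoint` and `CLUSTER_SCOPED` are hypothetical names, and the function only builds the path without making a request.

```python
from typing import Optional
from urllib.parse import quote

API_ROOT = "https://api.crusoecloud.com/v1"
# Endpoints documented as having a cluster-scoped variant.
CLUSTER_SCOPED = {"facets", "count", "histogram"}


def logs_endpoint(project_id: str, name: str, cluster_id: Optional[str] = None) -> str:
    """Return the project-scoped or cluster-scoped URL for a /logs/{name} endpoint."""
    base = f"{API_ROOT}/projects/{quote(project_id, safe='')}"
    if cluster_id is not None:
        if name not in CLUSTER_SCOPED:
            raise ValueError(f"/logs/{name} has no cluster-scoped variant")
        base += f"/clusters/{quote(cluster_id, safe='')}"
    return f"{base}/logs/{name}"
```

Rejecting unsupported combinations locally gives a clearer error than waiting for the API to reject an unknown path.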
Log Retention
Logs are retained for 7 days, then automatically purged.
Rate Limits and Quotas
| Limit | Value |
|---|---|
| Maximum time range per query | 7 days |
| Maximum queries per 5 minutes | 150 |
| LogsQL query quota (per project, per user, per day) | 10,000 (HTTP 429 returned if exceeded) |
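A client that issues many queries can pace itself against the 150-queries-per-5-minutes limit instead of waiting for 429 responses. The sketch below is one possible approach, a sliding-window limiter. `SlidingWindowLimiter` is a hypothetical class, and it assumes the limit behaves like a sliding window; the page does not say whether the service counts in fixed or sliding windows, or what scope the limit applies to.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Track request times and report how long to wait before the next one."""

    def __init__(self, max_calls: int = 150, window_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 means send now)."""
        now = self._clock()
        # Drop requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self._sent.append(self._clock())
```

Typical use is to call `wait_time()`, sleep for the returned number of seconds, send the request, then call `record()`. The injectable clock keeps the class testable without real delays.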
Common Troubleshooting Workflows
Diagnosing Storage Mount Issues
- Navigate to Logs and filter by node instance name.
- Set the log source to JournalD and search for Kubelet entries.
- Search for mount errors: MountVolume, nfs.
- Check for filesystem errors, RAID issues, or NFS connectivity problems.
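The same workflow can be approximated through the API. The sketch below builds query-string parameters for GET /logs that combine the JournalD source filter with the two search terms above. `any_word` and `mount_issue_params` are hypothetical helpers; the `log_source:journald` filter is borrowed from the fields example on this page, and combining the terms with OR is an assumption about how you would want to match them.

```python
from urllib.parse import urlencode


def any_word(terms: list[str]) -> str:
    """OR together plain word filters, wrapped in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def mount_issue_params(instance_name: str, start: str = "now-1h") -> str:
    """Build the query string for a mount-error search on one node."""
    query = f"log_source:journald AND {any_word(['MountVolume', 'nfs'])}"
    return urlencode([
        ("query", query),
        ("instance_names", instance_name),
        ("start", start),
    ])
```

The returned string can be appended after `?` to the project-scoped /logs URL; adjust the start value to cover the period when the mount failed.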
What's Next
- Topology — Identify unhealthy nodes and run diagnostics
- Metrics — Correlate log events with performance data
- Notifications — Get notified about resource health via email and in-console, and set up alert routing to Slack or webhooks