Object Storage Overview

Crusoe Cloud Object Storage provides high-performance, S3-compatible object storage designed for AI/ML workloads. Store and retrieve datasets, model checkpoints, training artifacts, and other unstructured data through the standard S3 API. It is suited to petabyte-scale datasets and can also be used to migrate data from other cloud providers to Crusoe.

Key Features

  • S3-Compatible API: Use existing S3 tools and libraries (boto3, s3cmd, rclone, AWS CLI) by pointing them at the Crusoe Object Storage endpoint
  • High Performance: Optimized for large file uploads and downloads common in ML workflows
  • Regional Storage: Data stored in the same location as your VMs for low-latency access
  • Versioning & Object Lock: Protect critical data from accidental deletion or modification
  • Multipart Upload Support: Efficient handling of large files with automatic chunking
  • Presigned URLs: Grant direct access to private objects without exposing your credentials

Prerequisites

Before using Object Storage, ensure the following:

  1. You have an active Crusoe Cloud Organization with Object Storage enabled. Contact your account team or Crusoe support if you do not see Object Storage in your Console.
  2. Your VMs run in the same location as the Object Storage bucket they need to access.
  3. Your project has been migrated to NFS for Shared Disks (if applicable). Projects still using virtiofs for Shared Disks must migrate to NFS before Object Storage can be enabled.
Note: Object Storage on Crusoe Cloud is in Limited Availability. Please contact your account team or Crusoe support to get access to Object Storage.

Architecture

Object Storage is a regional resource: buckets are created in specific locations and can only be accessed from VMs in the same location. Keeping data next to the compute that reads it gives low-latency access and high throughput for data-intensive workloads.

Object Storage Endpoints

Each location has a dedicated Object Storage endpoint:

https://object.<location>.crusoecloudcompute.com

For example:

  • https://object.us-east1-a.crusoecloudcompute.com
  • https://object.us-southcentral1-a.crusoecloudcompute.com
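
The endpoint pattern above can be filled in programmatically. The following is a minimal Python sketch; `object_storage_endpoint` is a hypothetical helper name chosen for illustration and is not part of any Crusoe SDK.

```python
def object_storage_endpoint(location: str) -> str:
    """Return the Object Storage endpoint URL for a Crusoe Cloud location.

    Follows the pattern https://object.<location>.crusoecloudcompute.com.
    """
    location = location.strip().lower()
    if not location:
        raise ValueError("location must be a non-empty string such as 'us-east1-a'")
    return f"https://object.{location}.crusoecloudcompute.com"
```

Calling `object_storage_endpoint("us-east1-a")` returns the first example URL listed above.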

Authentication

Object Storage uses dedicated Object Storage API keys (access key and secret key pairs) for authentication. These are separate from your Crusoe Cloud API tokens and are managed through the Console or CLI. See Managing Object Storage API Keys for more information.

Naming Rules

Bucket Names

Bucket names must:

  • Be unique across a Crusoe Cloud region
  • Be between 3 and 63 characters long
  • Contain only lowercase letters, numbers, and hyphens
  • Start and end with a letter or number
  • Not contain consecutive hyphens
  • Not be formatted as an IP address (e.g., 192.168.1.1)
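
The rules above can be checked locally before attempting to create a bucket. The sketch below is one possible validator; `bucket_name_problems` is a hypothetical helper name, and it cannot check uniqueness, which only the service can verify.

```python
import ipaddress
import re

# Lowercase letters, digits, and hyphens; first and last character must be
# a letter or digit.
_BUCKET_PATTERN = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def bucket_name_problems(name: str) -> list[str]:
    """Return the naming rules that `name` violates (empty list if none)."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("must be between 3 and 63 characters long")
    if not _BUCKET_PATTERN.fullmatch(name):
        problems.append(
            "must contain only lowercase letters, numbers, and hyphens, "
            "and start and end with a letter or number"
        )
    if "--" in name:
        problems.append("must not contain consecutive hyphens")
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        pass  # not an IP address, which is what we want
    else:
        problems.append("must not be formatted as an IP address")
    return problems
```

For example, `bucket_name_problems("training-data-01")` returns an empty list, while `bucket_name_problems("my--bucket")` reports the consecutive-hyphen rule.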

Object Keys

Object keys (file paths within buckets) can:

  • Be up to 1024 characters long
  • Contain any UTF-8 character
  • Use forward slashes (/) to create logical folder structures
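
A small helper can assemble keys with a consistent folder-like layout and enforce the length limit. This is a sketch: `make_object_key` is a hypothetical name, and it counts characters as stated above. AWS S3 measures the same limit in bytes of UTF-8; whether Crusoe does likewise is not stated here, so keys with non-ASCII characters may hit the limit sooner.

```python
MAX_KEY_CHARS = 1024  # limit stated in the Object Keys rules


def make_object_key(*parts: str) -> str:
    """Join path segments with '/' into a single object key."""
    # Drop leading/trailing slashes on each segment so joins never produce '//'.
    cleaned = [p.strip("/") for p in parts]
    key = "/".join(p for p in cleaned if p)
    if not key:
        raise ValueError("object key must not be empty")
    if len(key) > MAX_KEY_CHARS:
        raise ValueError(f"object key is {len(key)} characters; limit is {MAX_KEY_CHARS}")
    return key
```

For example, `make_object_key("datasets", "imagenet/", "/train/shard-0001.tar")` yields `datasets/imagenet/train/shard-0001.tar`.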

Getting Started

  1. Create an Object Storage API Key: Generate access credentials for Object Storage
  2. Create a Bucket: Set up an Object Storage bucket in your desired location
  3. Configure Your Object Storage Client: Point your tools to the Crusoe Object Storage endpoint
  4. Upload and Download Objects: Use standard S3 operations to manage your data

See Managing Object Storage API Keys and Managing Buckets for detailed instructions.
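
Once steps 1 to 3 are done, step 4 might look like the following with boto3. This is a sketch, not a definitive workflow: `s3` is assumed to be a boto3 S3 client already configured with the Crusoe endpoint and an Object Storage API key, the bucket is assumed to exist, and `round_trip` is a hypothetical helper name.

```python
def round_trip(s3, bucket: str, local_path: str, key: str, download_path: str) -> list[str]:
    """Upload a file, confirm it is listed, then download it again.

    Returns the object keys found under the uploaded key's prefix.
    """
    # Upload: boto3 switches to multipart upload automatically for large files.
    s3.upload_file(local_path, bucket, key)

    # List: a single call returns at most 1000 keys, which is enough here.
    response = s3.list_objects_v2(Bucket=bucket, Prefix=key)
    keys = [obj["Key"] for obj in response.get("Contents", [])]

    # Download the object to a local path.
    s3.download_file(bucket, key, download_path)
    return keys
```

The helper takes the client as a parameter so the same code works against any location's endpoint.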

Supported S3 Features

Crusoe Object Storage supports the following S3 features:

  • Basic object operations (PUT, GET, DELETE, HEAD)
  • Multipart uploads
  • Bucket and object listing
  • Bucket versioning
  • Object locking (WORM: Write Once, Read Many)
  • Bucket tagging
  • Object metadata
  • Range requests (partial downloads)
  • Presigned URLs

Features not currently supported:

  • Server-side encryption (SSE)
  • Access Control Lists (ACLs) beyond bucket-level permissions
  • Cross-region replication
  • Lifecycle policies
  • Event notifications
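
Two of the supported features, range requests and presigned URLs, are sketched below using standard boto3 calls (`get_object` with a `Range` argument and `generate_presigned_url`). The helper names are hypothetical, `s3` is assumed to be a boto3 S3 client configured for Crusoe Object Storage, and the one-hour expiry is an arbitrary illustrative default.

```python
def byte_range(start: int, length: int) -> str:
    """HTTP Range header value for `length` bytes starting at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # Range end positions are inclusive, hence the -1.
    return f"bytes={start}-{start + length - 1}"


def read_range(s3, bucket: str, key: str, start: int, length: int) -> bytes:
    """Download only part of an object, e.g. a header or one shard record."""
    response = s3.get_object(Bucket=bucket, Key=key, Range=byte_range(start, length))
    return response["Body"].read()


def presigned_download_url(s3, bucket: str, key: str, expires_in_seconds: int = 3600) -> str:
    """Create a time-limited URL that lets a holder fetch one private object."""
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in_seconds,
    )
```

The presigned URL is computed locally from the client's credentials, so the secret key itself is never shared with the recipient.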

Performance Characteristics

  • Upload Speed: Optimized for large file uploads (64 MB+ objects recommended)
  • Download Speed: High-throughput reads for training pipelines
  • Multipart Upload: Automatic chunking for large files
  • Concurrency: High concurrent request handling for distributed workloads

For optimal performance:

  • Use multipart uploads for files larger than 64 MB
  • Increase concurrency settings in your S3 client
  • Ensure your VM type has sufficient VPC network bandwidth
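
With boto3, the first two recommendations map onto `TransferConfig`. The sketch below uses the 64 MB threshold from this page; the 64 MB part size and the concurrency of 16 are illustrative assumptions to tune for your workload, not Crusoe recommendations, and the helper names are hypothetical.

```python
MB = 1024 * 1024


def transfer_settings(max_concurrency: int = 16) -> dict:
    """Keyword arguments for boto3.s3.transfer.TransferConfig."""
    return {
        # Use multipart upload for files larger than 64 MB.
        "multipart_threshold": 64 * MB,
        # Assumption: 64 MB parts; adjust for your file sizes.
        "multipart_chunksize": 64 * MB,
        # Number of threads transferring parts in parallel (boto3 default is 10).
        "max_concurrency": max_concurrency,
    }


def upload_large_file(s3, local_path: str, bucket: str, key: str, max_concurrency: int = 16) -> None:
    """Upload a file using multipart upload with raised concurrency."""
    # Imported here so transfer_settings works without boto3 installed.
    from boto3.s3.transfer import TransferConfig  # third-party: pip install boto3

    config = TransferConfig(**transfer_settings(max_concurrency))
    s3.upload_file(local_path, bucket, key, Config=config)
```

Raising concurrency helps only until the VM's network bandwidth becomes the bottleneck, which is why the third recommendation matters.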

Next Steps