Overview

Crusoe Cloud's Managed Inference service allows you to interact with supported models via our APIs, available on the Intelligence Foundry. Models are served on our proprietary inference engine with MemoryAlloy, a cluster-wide memory fabric with cache-aware routing that maximizes cache hits, improving time to first token (TTFT) and throughput.

You can find more information on available models below. Pricing is listed for each model on the model cards, accessible here.

Available Models

For text generation

We provide an OpenAI API-compatible endpoint at api.crusoe.ai for the models below. All Meta models provided by Crusoe are "Built with Llama". You may also interact with all of the models via a chat interface on the Intelligence Foundry, accessible via the Crusoe Cloud console.
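Because the endpoint is OpenAI API-compatible, you can call it with any OpenAI-style client or with plain HTTP. The sketch below builds a chat-completions request using only the Python standard library; the URL path (`/v1/chat/completions`) and the `CRUSOE_API_KEY` environment variable name are illustrative assumptions, so confirm the exact values in the Crusoe Cloud console.

```python
import json
import os
import urllib.request

# Assumed endpoint path; OpenAI-compatible APIs typically expose
# /v1/chat/completions, but confirm this in the Crusoe docs/console.
API_URL = "https://api.crusoe.ai/v1/chat/completions"

# Standard OpenAI-style chat payload; the model name comes from the
# table of available models below.
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Hypothetical environment variable holding your API key:
        "Authorization": f"Bearer {os.environ.get('CRUSOE_API_KEY', '')}",
    },
)

# To actually send the request (requires a valid API key):
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(request.full_url)
```

The same payload shape works with the official `openai` Python package by constructing the client with `base_url` pointed at the Crusoe endpoint.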

| Name | Provider | Type | Context Length | License | Acceptable Use Policy |
| --- | --- | --- | --- | --- | --- |
| meta-llama/Llama-3.3-70B-Instruct (Model card¹) | Meta | instruct | 128k | Llama 3.3 Community License Agreement | Llama 3.3 Acceptable Use Policy |
| openai/gpt-oss-120b (Model card¹) | OpenAI | instruct | 128k | Apache License 2.0 | Acceptable Use Policy |
| deepseek-ai/DeepSeek-V3-0324 (Model card¹) | DeepSeek | instruct | 160k | MIT License | MIT License |
| deepseek-ai/DeepSeek-R1-0528 (Model card¹) | DeepSeek | instruct | 160k | MIT License | MIT License |
| deepseek-ai/DeepSeek-V3.1 (Model card¹) | DeepSeek | instruct | 160k | MIT License | MIT License |
| Qwen/Qwen3-235B-A22B (Model card¹) | Qwen | instruct | 131k | Apache License 2.0 | Apache License 2.0 |
| google/gemma-3-12b-it (Model card¹) | Google | instruct | 128k | Gemma Terms of Use | Gemma Terms of Use, note use restrictions in Section 3.2 |
| moonshotai/Kimi-K2-Thinking (Model card¹) | Moonshot AI | instruct | 131k | Moonshot Terms of Use | |