Managed Inference Overview

Crusoe Cloud's Managed Inference service lets you interact with supported models through our Intelligence Foundry APIs. Models are served on our proprietary inference engine with MemoryAlloy, a cluster-wide memory fabric with cache-aware routing that maximizes cache hits, reducing time to first token (TTFT) and increasing throughput.

Available models

You can use the OpenAI API-compatible endpoint at managed-inference-api-proxy.crusoecloud.com to access the models below. You can also interact with all of the models through the Intelligence Foundry's chat interface. All Meta models provided by Crusoe are "Built with Llama".
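Because the endpoint is OpenAI API-compatible, a standard chat completion request works against it. The sketch below builds such a request with the Python standard library; the `/v1` path suffix and the `CRUSOE_API_KEY` environment variable are assumptions for illustration, so check your account settings for the actual base URL and credential name.

```python
# Minimal sketch of an OpenAI-compatible chat completion request against
# the managed inference endpoint. The /v1 path and CRUSOE_API_KEY env var
# are assumptions -- verify both against your account settings.
import json
import os
import urllib.request

BASE_URL = "https://managed-inference-api-proxy.crusoecloud.com/v1"  # assumed path


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Credential sourcing is an assumption; use your real API key.
            "Authorization": f"Bearer {os.environ.get('CRUSOE_API_KEY', '')}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("meta-llama/Llama-3.3-70B-Instruct", "Hello!")
    # Sending the request requires a valid API key:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
    print(req.full_url)
```

Any model identifier from the table below can be passed as the `model` field; the official OpenAI client libraries also work if you point their base URL at this endpoint.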

For each model's pricing information, see pricing.

| MODEL | PROVIDER | TYPE | CONTEXT LENGTH | LICENSE | ACCEPTABLE USE POLICY |
| --- | --- | --- | --- | --- | --- |
| deepseek-ai/DeepSeek-V3-0324 | DeepSeek | instruct | 160k | MIT License | |
| deepseek-ai/DeepSeek-V4-Flash | DeepSeek | instruct | 1M | MIT License | |
| deepseek-ai/DeepSeek-V4-Pro | DeepSeek | instruct | 1M | MIT License | |
| google/gemma-4-31b-it | Google | instruct | 262k | Apache License 2.0 | |
| meta-llama/Llama-3.3-70B-Instruct | Meta | instruct | 128k | Llama 3.3 Community License Agreement | Llama 3.3 Acceptable Use Policy |
| nvidia/Nemotron-3-Nano-30B-A3B | NVIDIA | instruct | 262k | NVIDIA Nemotron Open Model License | NVIDIA Acceptable Use Terms |
| nvidia/Nemotron-3-Nano-Omni-Reasoning-30B-A3B | NVIDIA | instruct | 262k | NVIDIA Open Model Agreement | |
| nvidia/Nemotron-3-Super-120B-A12B | NVIDIA | instruct | 262k | NVIDIA Nemotron Open Model License | NVIDIA Acceptable Use Terms |
| nvidia/Nemotron-3-VoiceChat | NVIDIA | speech-to-speech | 131k | NVIDIA Software and Model Evaluation License | NVIDIA Acceptable Use Terms |
| openai/gpt-oss-120b | OpenAI | instruct | 128k | Apache License 2.0 | Acceptable Use Policy |
| qwen/Qwen3-235B-A22B | Qwen | instruct | 131k | Apache License 2.0 | Apache License 2.0 |
| zai/GLM-5.1 | Z.ai | instruct | 202k | MIT License | |