| GPU | Typical workloads | Notes |
|---|---|---|
| H100 (80 GB) | Large models (14B+), tensor-parallel RL | RDMA-enabled for multi-node training |
| A100 (40/80 GB) | Mid-sized models, SFT | 40 GB for models up to 7B; 80 GB for 14B+ |
| A10G | Small models (≤4B), experiments | Sufficient for initial testing and SFT |
| L4 | Evaluation, preprocessing | Lower-cost option for non-training workloads |
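
The sizing guidance above can be encoded as a simple selection heuristic. This is an illustrative sketch, not an official sizing rule: the `choose_gpu` helper and its thresholds are assumptions derived from the table (A10G up to 4B parameters, A100 40 GB up to 7B, A100 80 GB below 14B, H100 80 GB for 14B+), and real choices also depend on batch size, sequence length, and parallelism strategy.

```python
def choose_gpu(model_params_b: float, training: bool = True) -> str:
    """Pick a GPU tier for a model of `model_params_b` billion parameters.

    Thresholds follow the sizing table above; `training=False` routes
    non-training workloads (evaluation, preprocessing) to the cheaper L4.
    """
    if not training:
        return "L4"           # lower-cost option for eval/preprocessing
    if model_params_b <= 4:
        return "A10G"         # small models and initial experiments
    if model_params_b <= 7:
        return "A100 (40GB)"  # mid-sized models, SFT
    if model_params_b < 14:
        return "A100 (80GB)"  # larger mid-sized models
    return "H100 (80GB)"      # 14B+, tensor-parallel / multi-node RL


if __name__ == "__main__":
    for size in (3, 7, 13, 30):
        print(f"{size}B -> {choose_gpu(size)}")
```

For example, a 7B SFT run maps to an A100 (40 GB), while a 30B tensor-parallel RL run maps to H100s; the `training` flag keeps evaluation jobs on the cheaper L4 tier.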