| GPU | Typical workloads | Notes |
| --- | --- | --- |
| H100 (80GB) | Large models (14B+), tensor-parallel RL | RDMA-enabled for multi-node training |
| A100 (40/80GB) | Mid-sized models, SFT | 40GB for models up to 7B, 80GB for 14B+ |
| A10G | Small models (≤4B), experiments | Sufficient for initial testing and SFT |
| L4 | Evaluation, preprocessing | Lower cost option for non-training workloads |
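The sizing guidance in the table can be sketched as a small selection helper. This is a minimal, hypothetical example (the function name and the `training` flag are illustrative, not part of any real API); the thresholds come directly from the table rows above.

```python
def choose_gpu(params_b: float, training: bool = True) -> str:
    """Pick a GPU tier for a model of `params_b` billion parameters.

    Thresholds mirror the table: H100 for 14B+, A100 80GB between
    7B and 14B, A100 40GB up to 7B, A10G for small (<=4B) models,
    and L4 for non-training workloads like evaluation/preprocessing.
    """
    if not training:
        return "L4"           # lower-cost option for non-training work
    if params_b >= 14:
        return "H100 (80GB)"  # large models, tensor-parallel RL
    if params_b > 7:
        return "A100 (80GB)"  # mid-sized models beyond 7B
    if params_b > 4:
        return "A100 (40GB)"  # models up to 7B
    return "A10G"             # small models (<=4B), initial testing
```

For example, `choose_gpu(70)` maps a 70B model to the H100 tier, while `choose_gpu(3, training=False)` recommends an L4 for evaluation.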
I