Troubleshooting
vLLM Deployment Pending Due to Insufficient Memory¶
Issue Observed¶
The vLLM deployment pod remains in a Pending state due to an insufficient memory error.
Debugging Steps¶
- Validate MIG Profile and GPU Capacity: Verify the configured MIG profile and the available GPU memory. The node provides a `3g.71gb` MIG profile (~71 GB), confirming that the selected MIG profile is adequate.
- Check GPU Utilization: Run `nvidia-smi` and confirm that no GPU workloads are running on the node.
- Review Pod Resource Configuration: Inspect the pod YAML and validate the requested resources. Identify that the memory request and limit are set higher than required.
- Verify vLLM SKU Service Profile: Review the vLLM SKU Service Profile configuration and confirm that its memory limit is also configured with a higher value than necessary.
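The debugging steps above can be carried out with standard Kubernetes and NVIDIA tooling. A sketch, assuming the pod and node names are placeholders you substitute for your environment:

```shell
# Step 1/2: inspect why the pod is Pending (look for "Insufficient memory"
# in the Events section) and check the node's allocatable GPU/memory resources.
kubectl describe pod <vllm-pod-name>
kubectl describe node <gpu-node-name> | grep -A 10 "Allocatable"

# Step 2: on the GPU node, confirm no workloads are currently using the GPU.
nvidia-smi

# Step 3: dump the pod spec and review the requested resources section.
kubectl get pod <vllm-pod-name> -o yaml | grep -B 2 -A 8 "resources:"
```

These commands are read-only, so they are safe to run against a live cluster while the pod is stuck in Pending.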
Resolution¶
Update the memory request and limit to appropriate values and redeploy the model. The pod then transitions to the Running state.
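As an illustration, the fix amounts to lowering the pod's memory request and limit so they fit within the node's allocatable capacity. A hypothetical fragment of the deployment spec (the values and the MIG resource name are assumptions, not taken from the original environment):

```yaml
# Hypothetical vLLM container resources; size these to your node's capacity.
resources:
  requests:
    memory: "48Gi"               # lowered from a value exceeding what the node can allocate
    nvidia.com/mig-3g.71gb: 1    # one 3g.71gb MIG slice (~71 GB GPU memory)
  limits:
    memory: "48Gi"
    nvidia.com/mig-3g.71gb: 1
```

Keeping the request and limit equal gives the pod the Guaranteed QoS class, which avoids eviction pressure once it is scheduled.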