Test
In this section, you will query the models deployed using RayLLM.
Step 1: Port forward the Kubernetes Serve service.¶
kubectl port-forward service/rayllm-serve-svc 8000:8000 -n kuberay
Step 2: Send a query to amazon/LightGPT
.¶
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "amazon/LightGPT",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What are the top 5 most popular programming languages?"}
],
"temperature": 0.7
}'
Response:
{"id":"amazon/LightGPT-336fc0f9f06af9cd182d7f7d9009427d","object":"text_completion","created":1700030387,"model":"amazon/LightGPT","choices":[{"message":{"role":"assistant","content":"1. Java\n2. C/C++\n3. C#\n4. JavaScript\n5. Python"},"index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":26,"completion_tokens":24,"total_tokens":50}}
¶
{"id":"amazon/LightGPT-336fc0f9f06af9cd182d7f7d9009427d","object":"text_completion","created":1700030387,"model":"amazon/LightGPT","choices":[{"message":{"role":"assistant","content":"1. Java\n2. C/C++\n3. C#\n4. JavaScript\n5. Python"},"index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":26,"completion_tokens":24,"total_tokens":50}}
Recap¶
Congratulations! You have successfully deployed the kuberay-operator on your managed Kubernetes cluster as an add-on in a custom cluster blueprint. You then queried the models deployed using RayLLM.