Skip to content

Endpoint

An endpoint is a URL where end user/application can access one/many LLMs via the OpenAI compatible API. An endpoint can be either multitenant or dedicated to a single tenant.


New Endpoint

  • In the Ops Console, click on GenAI and then Endpoint.
  • Now, click on "New Endpoint" to initiate the workflow

General Section

Provide a unique name for the endpoint and an optional description.

New Endpoint

Deployment Section

Enter the "host name" for the endpoint (e.g. https://api.inference.com) and select the compute cluster from the dropdown that will be used to power the inference service.

New Endpoint

Certificate Section

Users and applications that will access the Inference service's API endpoint will expect the service to be secured using server side TLS. Upload the server certificate (chain) and private key in PEM format.

New Endpoint


List All Endpoints

In the Ops Console, click on GenAI and then Endpoint. This will display the list of configured endpoints, their status and some metadata for the administrator.

List All Endpoints


View Endpoint Details

In the Ops Console, click on GenAI and then Endpoint. This will display details about the endpoint

View Endpoint Details


Delete Endpoint

To delete an endpoint, click on the ellipses (3 dots) under actions for the selected endpoint.

Important

This action is not reversible. Admins will need to recreate the endpoint in case of accidental deletion.