NemoClaw
In this guide you will use an inference endpoint from Rafay's Token Factory as a custom model provider within a self hosted, BYO NemoClaw instance.
Assumptions¶
This exercise assumes the following requirements are already in place.
- An active Token Factory model deployment (If using vLLM, be sure your deployment is using the following extra engine arguments, "--enable-auto-tool-choice --tool-call-parser hermes")
- A customer tenant org with access to a user with an end user role
- You have a machine to run NemoClaw that meets the prerequisites
1. Retrieve Model API Details¶
In this section, you will retrieve the Token Factory Model API details. These details will be used to configure the NemoClaw model provider in a later step.
- Log into the Developer Hub console as a tenant end user
- Navigate to GenAI -> Model APIs
- Click on the model card for the model you will be using with NemoClaw
- Click Get an API Key
- Enter a name for the key
- Click Create
- Copy the key provided and store in a safe location as it cannot be retrieved again
- Copy the model name and endpoint and save for later use
2. Install NemoClaw¶
In this section, you will install NemoClaw and configure a sandbox that will use your Token Factory model.
- Run the following command to install nemoclaw
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
After a few minutes, you will be presented onscreen with options to configure inference.
- Enter 3 for Other OpenAI-compatible endpoint and press enter
- Enter the Endpoint from Token Factory for the Model Provider Base URL (Be sure to remove "/chat/completions" from the end of the URL) and press Enter
- Enter the previously stored API key from Token Factory for the API Key and press Enter
- Enter the model name for the endpoint model and press Enter
Next, you will be presented to choose a name for the sandbox, keep the default and press Enter.
After a few minutes, you will be presented with the policy presets to select. Keep the default and press Enter.
If you intend to use the UI, be sure to copy the tokenized URL in the output.
3. Use NemoClaw with Token Factory¶
In this section, you will initiate a chat session from the NemoClaw.
- List the available sandboxes by running the following command
nemoclaw list
- Run the following command to connect to the sandbox instance
nemoclaw my-assistant connect
- Once connected, run the following command to initiate a chat
openclaw agent --agent main --local -m "Hello, what model are you using?" --session-id test
You will see a response showing your model name.
4. Verify Token Usage¶
In this section, you will verify token usage from NemoClaw within Token Factory.
- Log into the Developer Hub console as a tenant end user
- Navigate to GenAI -> Token Usage
- Select the Token Usage tab
You will see the token usage from the previously sent chat message.








