NemoClaw

In this guide you will use an inference endpoint from Rafay's Token Factory as a custom model provider within a self hosted, BYO NemoClaw instance.

Assumptions¶

This exercise assumes the following requirements are already in place.

An active Token Factory model deployment (If using vLLM, be sure your deployment is using the following extra engine arguments, "--enable-auto-tool-choice --tool-call-parser hermes")
A customer tenant org with access to a user with an end user role
You have a machine to run NemoClaw that meets the prerequisites

1. Retrieve Model API Details¶

In this section, you will retrieve the Token Factory Model API details. These details will be used to configure the NemoClaw model provider in a later step.

Log into the Developer Hub console as a tenant end user
Navigate to GenAI -> Model APIs
Click on the model card for the model you will be using with NemoClaw
Click Get an API Key
Enter a name for the key
Click Create

Copy the key provided and store in a safe location as it cannot be retrieved again
Copy the model name and endpoint and save for later use

2. Install NemoClaw¶

In this section, you will install NemoClaw and configure a sandbox that will use your Token Factory model.

Run the following command to install nemoclaw

curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash

After a few minutes, you will be presented onscreen with options to configure inference.

Enter 3 for Other OpenAI-compatible endpoint and press enter
Enter the Endpoint from Token Factory for the Model Provider Base URL (Be sure to remove "/chat/completions" from the end of the URL) and press Enter
Enter the previously stored API key from Token Factory for the API Key and press Enter
Enter the model name for the endpoint model and press Enter

Next, you will be presented to choose a name for the sandbox, keep the default and press Enter.

After a few minutes, you will be presented with the policy presets to select. Keep the default and press Enter.

If you intend to use the UI, be sure to copy the tokenized URL in the output.

3. Use NemoClaw with Token Factory¶

In this section, you will initiate a chat session from the NemoClaw.

List the available sandboxes by running the following command

nemoclaw list

Run the following command to connect to the sandbox instance

nemoclaw my-assistant connect

Once connected, run the following command to initiate a chat

openclaw agent --agent main --local -m "Hello, what model are you using?" --session-id test

You will see a response showing your model name.

4. Verify Token Usage¶

In this section, you will verify token usage from NemoClaw within Token Factory.

Log into the Developer Hub console as a tenant end user
Navigate to GenAI -> Token Usage
Select the Token Usage tab

You will see the token usage from the previously sent chat message.