Skip to content

NemoClaw

In this guide you will use an inference endpoint from Rafay's Token Factory as a custom model provider within a self hosted, BYO NemoClaw instance.

Architecture


Assumptions

This exercise assumes the following requirements are already in place.

  • An active Token Factory model deployment (If using vLLM, be sure your deployment is using the following extra engine arguments, "--enable-auto-tool-choice --tool-call-parser hermes")
  • A customer tenant org with access to a user with an end user role
  • You have a machine to run NemoClaw that meets the prerequisites

1. Retrieve Model API Details

In this section, you will retrieve the Token Factory Model API details. These details will be used to configure the NemoClaw model provider in a later step.

  • Log into the Developer Hub console as a tenant end user
  • Navigate to GenAI -> Model APIs
  • Click on the model card for the model you will be using with NemoClaw
  • Click Get an API Key
  • Enter a name for the key
  • Click Create

API Key

  • Copy the key provided and store in a safe location as it cannot be retrieved again
  • Copy the model name and endpoint and save for later use

API Key


2. Install NemoClaw

In this section, you will install NemoClaw and configure a sandbox that will use your Token Factory model.

  • Run the following command to install nemoclaw
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash

After a few minutes, you will be presented onscreen with options to configure inference.

  • Enter 3 for Other OpenAI-compatible endpoint and press enter
  • Enter the Endpoint from Token Factory for the Model Provider Base URL (Be sure to remove "/chat/completions" from the end of the URL) and press Enter
  • Enter the previously stored API key from Token Factory for the API Key and press Enter
  • Enter the model name for the endpoint model and press Enter

Installation

Next, you will be presented to choose a name for the sandbox, keep the default and press Enter.

Installation

After a few minutes, you will be presented with the policy presets to select. Keep the default and press Enter.

Installation

If you intend to use the UI, be sure to copy the tokenized URL in the output.


3. Use NemoClaw with Token Factory

In this section, you will initiate a chat session from the NemoClaw.

  • List the available sandboxes by running the following command
nemoclaw list

NemoClaw Chat

  • Run the following command to connect to the sandbox instance
nemoclaw my-assistant connect
  • Once connected, run the following command to initiate a chat
openclaw agent --agent main --local -m "Hello, what model are you using?" --session-id test

You will see a response showing your model name.

NemoClaw Chat


4. Verify Token Usage

In this section, you will verify token usage from NemoClaw within Token Factory.

  • Log into the Developer Hub console as a tenant end user
  • Navigate to GenAI -> Token Usage
  • Select the Token Usage tab

You will see the token usage from the previously sent chat message.

Token Usage