Use

At this point, the developer will have the ability to deploy and deprovision environments based on the shared environment template. Note that the developer

Does not need to have any knowledge of Terraform
Does not need access to privileged credentials for AWS
Does not any help from the Platform team to deploy their environment

Use Gen AI Environment¶

Once the developer logs into the Rafay Org (SSO using Identity Provider recommended), they will only have access to the specific Project they have been authorized to use. Their level of access in the newly created project will be controlled using RBAC (role based access control). It is recommended that they only be provided with the role "Environment Template User" which allows to use the provided Environment Templates, but nothing more.

Important

Although the recommended workflow assumes and recommends using an Integration with an Identity Provider (IdP) to provide a Single Sign On (SSO) experience, organizations can also use locally managed users.

sequenceDiagram
    participant dev as Developer
    participant rafay as Rafay <br> Environment Manager
    participant csp as ECS Cluster
    participant idp as Identity Provider 

    dev->>idp: Access Environment 
    idp-->>dev: Redirect to Rafay 
    dev-->>rafay: SSO to Rafay with <br> RBAC (Env Template User)

    dev->>rafay: Create Environment <br>based on Env Template 
    rafay->>csp: Provision new ECS Cluster w/VPC, subnets and Gen AI App  
    rafay-->>dev: Environment Ready
    dev->>csp: Uses GenAI Environment 
    dev-->>csp: Explore 1st Gen AI App 
    dev->>rafay: Deploy 2nd Gen AI App 
    dev-->>rafay: Deploy Custom Gen AI App

Step 1: Create Application Environment Resource¶

In this step, a second user, such as a developer, will create an environment resource in the controller which will use the previously created environment template. The environment resource will be used to create the VPC, ECS cluster and Generative AI application. This environment resource will be used to control the lifecycle of the application environment.

Log into the controller and select your project
Navigate to Environments -> Environments
Click New Environment
Enter gen-ai-ecs for the name
Select the existing application environment template
Select the environment template version
Click Create
Navigate to Input Variables
Click Add Variable
Enter image_location for the variable name
Select Text for the value type
Enter the image location public.ecr.aws/rafay-dev/gen-ai-sample-chat-app for the value
Click Add Variable
Enter container_port for the variable name
Select Text for the value type
Enter the container port number 8000 for the value
Click Save

Step 2: Deploy Application Environment¶

In this step, the developer user will now deploy the previously created application environment. Deploying the environment will create a VPC, ECS Cluster and deploy a generative AI application onto the cluster.

Log into the controller and select your project
Navigate to Environments -> Environments
Click on the gen-ai-ecs environment
Click Publish

The environment will begin to publish and could take ~5 minutes to complete.

Step 3: Access Application¶

We have provided two Gen AI example applications in a public ECR repository. The environment template will deploy one of the Gen AI example applications as part of the environment creation.

Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.

Log into the controller and select your project
Navigate to Environments -> Environments
Click on the gen-ai-ecs environment
Click Resource
Expand the resource named gen-ai-aws-app, you will see a public endpoint

Copy the endpoint and enter it into a browser

You will now access the first application. This application uses Amazon Bedrock to act as an intelligent chat bot. You can enter text into the chat and the engine will respond.

Step 4: Update Application¶

We will now deploy the second GenAI application to ECS using the Environment resource that was previously created.

Log into the controller and select your project
Navigate to Environments -> Environments
Click on the gen-ai-ecs environment
Click Edit Configuration
Navigate to Input Variables
Update the image location with public.ecr.aws/rafay-dev/genai:latest for the value
Update the container port number with 80 for the value
Click Save

Step 5: Deploy Application Environment¶

In this step, the developer user will now deploy the updated application environment. Deploying the environment will update the generative AI application with a new container image.

Log into the controller and select your project
Navigate to Environments -> Environments
Click on the gen-ai-ecs environment
Click Publish

The environment will begin to publish and could take ~5 minutes to complete.

Step 6: Access Application¶

Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.

Log into the controller and select your project
Navigate to Environments -> Environments
Click on the gen-ai-ecs environment
Click Resource
Expand the resource named gen-ai-aws-app, you will see a public endpoint

Copy the endpoint and enter it into a browser

You will now access the first application. This application takes a text file as input and summarizes the content. The application uses Amazon Bedrock to produce a summary of the text file.

Develop & Deploy Your Containers¶

At this point, the developer is ready to go ahead with the development and testing of their own Gen AI containerized applications. They are welcome to use the source code for the two example applications as the starting point. The typical steps are as follows

Build the new GenAI container image
Upload the container image to a container registry such as ECR
Deploy their Gen AI application by updating the image location within the Environment resource

In summary, with Rafay, developers can now develop, deploy and validate their Generative AI applications on Amazon ECS Clusters using Amazon Bedrock for the foundational models.