Use
At this point, the developer can deploy and deprovision environments based on the shared environment template. Note that the developer:
- Does not need to have any knowledge of Terraform
- Does not need access to privileged credentials for AWS
- Does not need any help from the Platform team to deploy their environment
Use Gen AI Environment
Once the developer logs into the Rafay Org (SSO using an Identity Provider is recommended), they will only have access to the specific Project they have been authorized to use. Their level of access in the newly created project is controlled using RBAC (role-based access control). It is recommended that they be given only the role "Environment Template User", which allows them to use the provided Environment Templates, but nothing more.
Important
Although the recommended workflow uses an integration with an Identity Provider (IdP) to provide a Single Sign-On (SSO) experience, organizations can also use locally managed users.
sequenceDiagram
participant dev as Developer
participant rafay as Rafay <br> Environment Manager
participant csp as ECS Cluster
participant idp as Identity Provider
dev->>idp: Access Environment
idp-->>dev: Redirect to Rafay
dev-->>rafay: SSO to Rafay with <br> RBAC (Env Template User)
dev->>rafay: Create Environment <br>based on Env Template
rafay->>csp: Provision new ECS Cluster w/VPC, subnets and Gen AI App
rafay-->>dev: Environment Ready
dev->>csp: Uses GenAI Environment
dev-->>csp: Explore 1st Gen AI App
dev->>rafay: Deploy 2nd Gen AI App
dev-->>rafay: Deploy Custom Gen AI App
Step 1: Create Application Environment Resource
In this step, a second user, such as a developer, creates an environment resource in the controller that uses the previously created environment template. The environment resource is used to create the VPC, ECS cluster and Generative AI application, and to control the lifecycle of the application environment.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click New Environment
- Enter gen-ai-ecs for the name
- Select the existing application environment template
- Select the environment template version
- Click Create
- Navigate to Input Variables
- Click Add Variable
- Enter image_location for the variable name
- Select Text for the value type
- Enter the image location public.ecr.aws/rafay-dev/gen-ai-sample-chat-app for the value
- Click Add Variable
- Enter container_port for the variable name
- Select Text for the value type
- Enter the container port number 8000 for the value
- Click Save
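The two input variables entered above can be sanity-checked before saving. The following is a minimal Python sketch and is not part of the Rafay product: the function name `validate_inputs` and the specific checks are assumptions made for illustration. Only the variable names and values come from the steps above. Note that `container_port` is entered as Text, so it is treated as a string here.

```python
def validate_inputs(variables: dict) -> list:
    """Return a list of problems found in the environment input variables.

    Hypothetical helper: the checks are illustrative assumptions, not
    rules enforced by the environment template.
    """
    problems = []

    # The image location should be a non-empty reference without spaces.
    image = variables.get("image_location", "")
    if not image or " " in image:
        problems.append("image_location must be a non-empty image reference")

    # The port is entered as Text, so check that it parses as a TCP port.
    port_text = variables.get("container_port", "")
    is_number = port_text.isascii() and port_text.isdigit()
    if not is_number or not 1 <= int(port_text) <= 65535:
        problems.append("container_port must be a number from 1 to 65535")

    return problems


# Values taken from Step 1.
first_app = {
    "image_location": "public.ecr.aws/rafay-dev/gen-ai-sample-chat-app",
    "container_port": "8000",
}
```

Calling `validate_inputs(first_app)` returns an empty list, meaning no problems were found.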
Step 2: Deploy Application Environment
In this step, the developer user will now deploy the previously created application environment. Deploying the environment will create a VPC, ECS Cluster and deploy a generative AI application onto the cluster.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click on the gen-ai-ecs environment
- Click Publish
The environment will begin to publish and could take ~5 minutes to complete.
Step 3: Access Application
We have provided two Gen AI example applications in a public ECR repository. The environment template will deploy one of the Gen AI example applications as part of the environment creation.
Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click on the gen-ai-ecs environment
- Click Resource
- Expand the resource named gen-ai-aws-app; you will see a public endpoint
- Copy the endpoint and enter it into a browser
You will now have access to the first application. This application uses Amazon Bedrock to act as an intelligent chatbot. You can enter text into the chat and the engine will respond.
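As an alternative to a browser, the endpoint can also be probed from a script. The sketch below uses only the Python standard library. It assumes the endpoint answers plain HTTP requests and that the copied value may or may not include a scheme; neither detail is stated above. The function names and the example hostname are hypothetical.

```python
import urllib.request


def normalize_endpoint(endpoint: str) -> str:
    """Trim whitespace and prepend http:// when no scheme is present.

    Assumption: the endpoint is served over plain HTTP by default.
    """
    endpoint = endpoint.strip()
    if not endpoint.lower().startswith(("http://", "https://")):
        endpoint = "http://" + endpoint
    return endpoint


def check_endpoint(endpoint: str, timeout: float = 10.0) -> int:
    """Request the endpoint and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError when the host cannot be reached.
    """
    url = normalize_endpoint(endpoint)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status
```

For example, `check_endpoint("my-endpoint.example.com")` (a placeholder hostname) returns the status code once the application is reachable.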
Step 4: Update Application
We will now deploy the second GenAI application to ECS using the Environment resource that was previously created.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click on the gen-ai-ecs environment
- Click Edit Configuration
- Navigate to Input Variables
- Update the image_location value to public.ecr.aws/rafay-dev/genai:latest
- Update the container_port value to 80
- Click Save
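The first image location was given without a tag, while the second ends in `:latest`. Container tooling such as Docker treats a reference without a tag as `latest`, so both point at the `latest` tag of their repositories. The following Python sketch illustrates that convention; the helper name `split_image_reference` is hypothetical and digest references (`@sha256:...`) are deliberately not handled.

```python
def split_image_reference(ref: str) -> tuple:
    """Split an image reference into (name, tag).

    The tag defaults to "latest" when omitted. A colon only counts as a
    tag separator when it appears after the last slash, so a registry
    port such as localhost:5000 is not mistaken for a tag.
    """
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" in last_segment:
        name, tag = ref.rsplit(":", 1)
        return name, tag
    return ref, "latest"


# Image locations used in Step 1 and Step 4.
first_image = split_image_reference(
    "public.ecr.aws/rafay-dev/gen-ai-sample-chat-app"
)
second_image = split_image_reference("public.ecr.aws/rafay-dev/genai:latest")
```

When pointing the environment at your own image, an explicit version tag makes it easier to tell which build is deployed.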
Step 5: Deploy Application Environment
In this step, the developer user will now deploy the updated application environment. Deploying the environment will update the generative AI application with a new container image.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click on the gen-ai-ecs environment
- Click Publish
The environment will begin to publish and could take ~5 minutes to complete.
Step 6: Access Application
Once the environment has finished deploying, the user can use the environment output to find the application endpoint. The endpoint can be entered into a browser to test the application.
- Log into the controller and select your project
- Navigate to Environments -> Environments
- Click on the gen-ai-ecs environment
- Click Resource
- Expand the resource named gen-ai-aws-app; you will see a public endpoint
- Copy the endpoint and enter it into a browser
You will now have access to the second application. This application takes a text file as input and uses Amazon Bedrock to produce a summary of its content.
Develop & Deploy Your Containers
At this point, the developer is ready to develop and test their own containerized Gen AI applications. They are welcome to use the source code for the two example applications as a starting point. The typical steps are as follows:
- Build the new GenAI container image
- Upload the container image to a container registry such as ECR
- Deploy their Gen AI application by updating the image location within the Environment resource
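The build and upload steps above can be sketched as a small Python helper that assembles the usual AWS CLI and Docker commands. This is an illustration under stated assumptions, not a prescribed workflow: it targets a private ECR registry (the example images in this guide live in public ECR), and the account ID, region, repository name and tag are placeholders. Whether the ECS tasks created by the environment template are permitted to pull from a private registry is not covered here.

```python
def build_and_push_commands(account_id, region, repository, tag, context_dir="."):
    """Return (image_uri, shell_commands) for building and pushing an image.

    Assumes a private ECR registry, whose hostname has the form
    <account_id>.dkr.ecr.<region>.amazonaws.com. All argument values
    used below are hypothetical placeholders.
    """
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    image_uri = f"{registry}/{repository}:{tag}"
    commands = [
        # Authenticate Docker against the registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # Build the image from the given build context.
        f"docker build -t {image_uri} {context_dir}",
        # Upload the image to the registry.
        f"docker push {image_uri}",
    ]
    return image_uri, commands


image_uri, commands = build_and_push_commands(
    "123456789012", "us-east-1", "my-genai-app", "v1"
)
```

The resulting `image_uri` is the value to enter for `image_location` in the Environment resource, together with the port the container listens on for `container_port`.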
In summary, with Rafay, developers can develop, deploy and validate their Generative AI applications on Amazon ECS clusters, using Amazon Bedrock for the foundation models.