Feature View
Please review the overview section to understand details about the feature store you will implement in the steps below.
Step 1: Login
In this step, you will log in to your MLOps Platform.
- Navigate to the URL (This will be provided by your platform team)
- Login using your local credentials or SSO credentials (Identity Provider such as Okta)
Once logged in, you will see the home dashboard screen.
Step 2: Create a Notebook
In this step, you will create a Jupyter notebook that will be used to create and manage a feature store. The Rafay MLOps platform, which is based on Kubeflow, runs each notebook in its own container.
- Navigate to Notebooks
- Click New Notebook
- Enter a name for the notebook
- Select JupyterLab
- Set the minimum CPU to 1
- Set the minimum memory to 1
- Click Launch
It will take 1-2 minutes to create the notebook.
Step 3: Create Feature Repository
In this step, you will configure the notebook with the appropriate Python packages to interact with Feast and use that package to create a Feast repository.
- Navigate to Notebooks
- Click Connect on the previously created notebook
- Click Terminal to open the terminal
- Enter the following command in the terminal to install feast
pip install --user feast
- Enter the following command in the terminal to create the repository named demo_repo
feast init demo_repo
You will see output similar to the following:
Notice that a directory named demo_repo was created in the file browser of the notebook.
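Because the package was installed with --user, it can be worth confirming that the notebook's Python interpreter actually sees it before continuing. The following is a minimal sketch that uses only the standard library; the helper name installed_version is illustrative and not part of Feast.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("feast")
    if version is None:
        print("feast is not visible to this interpreter; re-run the pip install step")
    else:
        print(f"feast {version} is installed")
```

You can paste this into a notebook cell or save it as a script and run it from the terminal.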
Step 4: Generate Data
In this step, you will generate sample data which will be used as the data source for the Feature Store.
- Download the following Python script (data_generator.py)
- In the left hand folder tree, double click the directory demo_repo
- In the left hand folder tree, click on the upload files icon
- Upload the previously downloaded data_generator.py file
- In the terminal, execute the following commands to run the data generation script
cd demo_repo/
python data_generator.py
You will see output similar to the following:
Generated data saved to feature_repo/data/customer_purchases_29102024.parquet
customer_id event_timestamp purchase_amount purchase_category
0 6 2024-10-19 10:11:00+00:00 185.70 toys
1 7 2024-10-19 01:21:00+00:00 474.91 toys
2 7 2024-10-19 18:01:00+00:00 91.55 electronics
3 3 2024-10-19 12:29:00+00:00 165.88 toys
4 8 2024-10-19 10:20:00+00:00 119.94 groceries
.. ... ... ... ...
995 8 2024-10-19 04:38:00+00:00 161.55 groceries
996 9 2024-10-19 12:02:00+00:00 301.76 electronics
997 8 2024-10-19 17:59:00+00:00 129.39 clothing
998 6 2024-10-19 07:47:00+00:00 162.52 groceries
999 9 2024-10-19 03:45:00+00:00 55.43 books
[1000 rows x 4 columns]
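The contents of data_generator.py are not reproduced here. As a rough illustration of how data with this shape could be produced, the sketch below builds 1000 synthetic purchase records with the same four columns. The function names, the customer ID range (1 to 10), the amount range, and the fixed date are assumptions inferred from the sample output above, not taken from the actual script.

```python
import random
from datetime import datetime, timedelta, timezone

# Categories seen in the sample output above.
CATEGORIES = ["toys", "electronics", "groceries", "clothing", "books"]


def generate_purchases(num_rows=1000, num_customers=10, day=None, seed=None):
    """Build a list of synthetic purchase records for a single UTC day."""
    rng = random.Random(seed)
    day = day or datetime(2024, 10, 19, tzinfo=timezone.utc)
    rows = []
    for _ in range(num_rows):
        rows.append({
            "customer_id": rng.randint(1, num_customers),
            # Random minute within the chosen day.
            "event_timestamp": day + timedelta(minutes=rng.randrange(24 * 60)),
            "purchase_amount": round(rng.uniform(5.0, 500.0), 2),
            "purchase_category": rng.choice(CATEGORIES),
        })
    return rows


def save_as_parquet(rows, path):
    """Write the records to a Parquet file (requires pandas and pyarrow)."""
    import pandas as pd  # imported lazily so generation needs only the stdlib

    df = pd.DataFrame(rows)
    df.to_parquet(path, index=False)
    return df


if __name__ == "__main__":
    rows = generate_purchases(seed=42)
    print(rows[:3])
    # Uncomment to write a file. The path is a placeholder, not the file
    # name used by the tutorial's script.
    # save_as_parquet(rows, "feature_repo/data/customer_purchases_sample.parquet")
```

Each record carries an entity key (customer_id) and an event_timestamp, which are the two columns Feast needs in order to join feature values to entities at a given point in time.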
Step 5: Define Feature Store and Feature View
In this step, you will define the configuration of the Feature Store and Feature View by loading their respective configuration files.
- Download the following YAML file which contains the Feature Store configuration
- In the left hand folder tree, navigate to /demo_repo/feature_repo/ directory
- In the left hand folder tree, click on the upload files icon
- Upload the previously downloaded feature_store.yaml file and overwrite the existing sample file in the directory
The contents of the file are below. Notice that the configuration uses a local, file-based offline store.
project: feast_demo
registry: data/registry.db
provider: local
offline_store:
    type: file
entity_key_serialization_version: 2
- Download the following Python script which defines the Feature View configuration
- In the left hand folder tree, navigate to /demo_repo/feature_repo/ directory
- In the left hand folder tree, click on the upload files icon
- Upload the previously downloaded define_feature_view.py file
This Feature View Python script will be used by Feast to create the Feature View.
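The contents of define_feature_view.py are not shown here. For orientation, a Feature View definition for this data set might look roughly like the sketch below. It is not the tutorial's actual script: the variable names, the source name, the ttl value, and the feature dtypes are assumptions, and the Parquet file name should match the one printed by the data generation step. The relative path is assumed to resolve against the feature_repo directory.

```python
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, String

# Entity: the key that feature rows are joined on.
customer = Entity(name="customer_id", join_keys=["customer_id"])

# Offline source: the Parquet file produced by the data generation step.
# Adjust the file name to match the one printed by data_generator.py.
purchases_source = FileSource(
    name="customer_purchases_source",  # assumed name
    path="data/customer_purchases_29102024.parquet",
    timestamp_field="event_timestamp",
)

# Feature View: groups the two purchase features under one name.
customer_purchases = FeatureView(
    name="customer_purchases",
    entities=[customer],
    ttl=timedelta(days=7),  # assumed; how long a feature value stays valid
    schema=[
        Field(name="purchase_amount", dtype=Float32),
        Field(name="purchase_category", dtype=String),
    ],
    source=purchases_source,
)
```

Feast discovers objects like these when it scans the Python files in the feature repository, so they are defined at module level rather than inside a function.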
Step 6: Build Feature Store and Feature View
In this step, you will build the previously defined Feature Store and Feature View.
Before building the components, first delete the sample files created in the repo. These are the files named example_repo.py and test_workflow.py.
- In the terminal, from the demo_repo/feature_repo directory (where feature_store.yaml is located), execute the following commands to build the Feature Store and Feature View
feast plan
feast apply
- In the terminal, execute the following command to view the Feature View
feast feature-views list
You will see output similar to the following:
NAME ENTITIES TYPE
customer_purchases {'customer_id'} FeatureView
Step 7: Extract Historical Data
In this step, you will extract historical data using the Feature View.
- Download the following Python script which will retrieve historical data from the Feature Store using the Feature View
- In the left hand folder tree, navigate to /demo_repo/ directory
- In the left hand folder tree, click on the upload files icon
- Upload the previously downloaded extract_historical_data.py file
- In the terminal, execute the following commands to run the script
cd ~/demo_repo
python extract_historical_data.py
You will see output similar to the following:
Generated entity_df:
customer_id event_timestamp
0 3 2024-10-19 17:37:29+00:00
1 1 2024-10-20 22:48:13+00:00
2 4 2024-10-20 10:20:25+00:00
3 4 2024-10-20 12:46:24+00:00
4 5 2024-10-19 21:16:39+00:00
Historical features DataFrame:
customer_id event_timestamp purchase_amount purchase_category
0 3 2024-10-19 17:37:29+00:00 92.94 books
1 5 2024-10-19 21:16:39+00:00 166.15 electronics
2 4 2024-10-20 10:20:25+00:00 127.59 books
3 4 2024-10-20 12:46:24+00:00 127.59 books
4 1 2024-10-20 22:48:13+00:00 24.39 toys
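Notice in the output that both entity rows for customer 4 receive the same feature values. Historical retrieval in Feast performs a point-in-time join: each entity row gets the most recent feature values recorded at or before its timestamp. The sketch below illustrates that idea in plain Python. It is a conceptual illustration only, not Feast's implementation; the function name and the sample values in the usage section are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def point_in_time_lookup(entity_rows, feature_rows, ttl=None):
    """For each entity row, attach the latest feature row for the same
    customer_id whose timestamp is at or before the entity timestamp."""
    results = []
    for entity in entity_rows:
        best = None
        for feat in feature_rows:
            if feat["customer_id"] != entity["customer_id"]:
                continue
            if feat["event_timestamp"] > entity["event_timestamp"]:
                continue  # never look into the future
            age = entity["event_timestamp"] - feat["event_timestamp"]
            if ttl is not None and age > ttl:
                continue  # too old to be considered valid
            if best is None or feat["event_timestamp"] > best["event_timestamp"]:
                best = feat
        joined = dict(entity)
        joined["purchase_amount"] = best["purchase_amount"] if best else None
        joined["purchase_category"] = best["purchase_category"] if best else None
        results.append(joined)
    return results


if __name__ == "__main__":
    utc = timezone.utc
    # Illustrative feature rows for one customer.
    features = [
        {"customer_id": 4, "event_timestamp": datetime(2024, 10, 19, 9, 0, tzinfo=utc),
         "purchase_amount": 50.00, "purchase_category": "toys"},
        {"customer_id": 4, "event_timestamp": datetime(2024, 10, 19, 23, 0, tzinfo=utc),
         "purchase_amount": 127.59, "purchase_category": "books"},
    ]
    entities = [
        {"customer_id": 4, "event_timestamp": datetime(2024, 10, 20, 10, 20, 25, tzinfo=utc)},
        {"customer_id": 4, "event_timestamp": datetime(2024, 10, 20, 12, 46, 24, tzinfo=utc)},
    ]
    for row in point_in_time_lookup(entities, features, ttl=timedelta(days=2)):
        print(row["event_timestamp"], row["purchase_amount"], row["purchase_category"])
```

Joining on "at or before" rather than "closest" is what prevents training data from leaking information from the future relative to each entity timestamp.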
Recap
Congratulations! At this point, you have successfully created a Feature Store with a Feature View and used the Feature View to retrieve historical data from a data source.