Overview
Feast (Feature Store) is an open-source feature store for machine learning (ML). Its primary role is to manage, store, and serve ML features across training and inference workflows. It bridges the gap between data engineering and ML operations by making feature data accessible consistently and efficiently.
Key Functions:
- Central repository for features: Ensures consistency between training and serving.
- Online & Offline Stores: The online store serves features at low latency for real-time inference; the offline store holds historical feature data for batch training.
- Decouples data pipelines from ML models: Promotes reuse and governance.
Feast Integration with Kubeflow
Feast can be integrated into Kubeflow pipelines to manage features consistently across stages.
Typical Integration Points:
1. Kubeflow Pipelines:
You can use Feast in a Kubeflow pipeline step to retrieve training features from the offline store or real-time features for model inference.
The Feast Python SDK can be called from Python-based Kubeflow pipeline components to fetch features or to trigger materialization.
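A minimal sketch of such a pipeline step follows. It assumes the KFP SDK v2 (`kfp.dsl`), a Feast feature repository reachable at `repo_path` inside the step's container, and input and output files on storage the step can read and write. The feature view `driver_hourly_stats`, its features, and the `driver_id` join key are hypothetical names used only for illustration.

```python
def fetch_training_features(repo_path: str, entity_csv: str, output_csv: str) -> None:
    """Pipeline step: join historical features onto labelled entity rows."""
    # Imports sit inside the function because a lightweight KFP component
    # copies only this function's source into the step's container.
    import pandas as pd
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)

    # The entity file must hold the join key (assumed: `driver_id`) and an
    # `event_timestamp` column that drives the point-in-time join.
    entity_df = pd.read_csv(entity_csv, parse_dates=["event_timestamp"])

    training_df = store.get_historical_features(
        entity_df=entity_df,
        features=[
            "driver_hourly_stats:conv_rate",  # hypothetical feature reference
            "driver_hourly_stats:acc_rate",   # hypothetical feature reference
        ],
    ).to_df()
    training_df.to_csv(output_csv, index=False)


def build_pipeline():
    """Wrap the step as a KFP v2 component and return a one-step pipeline."""
    from kfp import dsl  # imported lazily so the step itself has no KFP dependency

    fetch_op = dsl.component(
        fetch_training_features,
        base_image="python:3.11",               # assumed base image
        packages_to_install=["feast", "pandas"],
    )

    @dsl.pipeline(name="feast-training-features")
    def pipeline(repo_path: str, entity_csv: str, output_csv: str):
        fetch_op(repo_path=repo_path, entity_csv=entity_csv, output_csv=output_csv)

    return pipeline
```

The returned pipeline function can then be compiled with `kfp.compiler.Compiler().compile(...)`. In a real pipeline the CSV paths would more likely be KFP input and output artifacts; plain paths are used here only to keep the sketch short.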
2. Training Phase:
Use the Feast offline store to retrieve historical feature data to train models. Because training and serving read from the same feature definitions, the features a model is trained on match the features it is served at runtime.
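Historical retrieval is a point-in-time join: for each training row, the value used is the latest one recorded at or before that row's event timestamp, so no future data leaks into training. The pure-Python sketch below illustrates that rule on toy data. It is a conceptual illustration, not Feast's implementation; the row layout, the `ttl` handling, and all names are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Toy feature log: (entity_id, feature_timestamp, value)
FeatureRow = Tuple[int, datetime, float]


def point_in_time_lookup(
    feature_rows: List[FeatureRow],
    entity_id: int,
    event_time: datetime,
    ttl: timedelta,
) -> Optional[float]:
    """Return the newest value seen at or before `event_time`, within `ttl`."""
    best: Optional[Tuple[datetime, float]] = None
    for row_id, ts, value in feature_rows:
        if row_id != entity_id:
            continue
        if ts > event_time:
            continue  # recorded after the event: using it would leak the future
        if event_time - ts > ttl:
            continue  # too old to be considered valid
        if best is None or ts > best[0]:
            best = (ts, value)
    return None if best is None else best[1]


rows: List[FeatureRow] = [
    (1001, datetime(2024, 1, 1, 10, 0), 0.5),
    (1001, datetime(2024, 1, 1, 12, 0), 0.9),
    (1002, datetime(2024, 1, 1, 11, 0), 0.2),
]
one_day = timedelta(days=1)

# A label observed at 11:00 only sees the 10:00 value, not the 12:00 one.
value_at_11 = point_in_time_lookup(rows, 1001, datetime(2024, 1, 1, 11, 0), one_day)
```

With the real SDK, the same join is requested by passing an entity DataFrame with an `event_timestamp` column to `FeatureStore.get_historical_features`.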
3. Inference/Serving Phase:
Feast serves real-time features from the online store, giving low-latency access to precomputed or live features. A Feast client can be called from KServe (formerly KFServing) model deployments or from any other real-time inference service.
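As a rough sketch, an inference service could turn an online lookup into a model input vector as shown below. The `store` argument is assumed to be a `feast.FeatureStore`; it is passed in rather than created here so the helper can be exercised without a live feature repository. The feature references and the `driver_id` join key are hypothetical.

```python
from typing import Any, List, Sequence

# Hypothetical feature references of the form "<feature_view>:<feature>".
FEATURE_REFS = (
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
)


def fetch_online_vector(
    store: Any,
    driver_id: int,
    feature_refs: Sequence[str] = FEATURE_REFS,
) -> List[Any]:
    """Look up one entity in the online store and return its feature values in order."""
    response = store.get_online_features(
        features=list(feature_refs),
        entity_rows=[{"driver_id": driver_id}],
    ).to_dict()
    # By default the response is keyed by the bare feature name (no view prefix),
    # and each key maps to a list with one value per entity row.
    return [response[ref.split(":", 1)[1]][0] for ref in feature_refs]
```

A caller would build the store once at startup, for example `store = FeatureStore(repo_path=".")`, and call `fetch_online_vector(store, 1001)` per request, for instance from a preprocessing hook in front of the model. Features that have not been materialized come back as `None`, so the caller should decide how to handle missing values.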
4. Feature Materialization:
Schedule periodic jobs via Kubeflow Pipelines or other orchestration tools (e.g., Argo Workflows) to materialize features, that is, to load the latest feature values from batch sources into the online store.
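The body of such a scheduled job might look like the sketch below. It assumes `store` is a `feast.FeatureStore` and relies on its `materialize_incremental` and `materialize` methods; the wrapper function, its name, and its arguments are illustrative only.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def run_materialization(
    store: Any,
    end_date: Optional[datetime] = None,
    start_date: Optional[datetime] = None,
) -> datetime:
    """Load feature values up to `end_date` into the online store.

    Without `start_date`, only data newer than the previous run is loaded
    (incremental mode); with it, the explicit window is backfilled.
    """
    end = end_date or datetime.now(timezone.utc)
    if start_date is None:
        store.materialize_incremental(end_date=end)
    else:
        if start_date >= end:
            raise ValueError("start_date must be earlier than end_date")
        store.materialize(start_date=start_date, end_date=end)
    return end
```

A recurring pipeline run or cron-style workflow would call `run_materialization(FeatureStore(repo_path="."))` on each tick. The Feast CLI offers the same operation as `feast materialize-incremental <end-timestamp>`, which can be simpler when the job is just a container command.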