Ray
Ray (ray.io) is an open-source framework for scaling Python applications across multiple nodes in a cluster. It provides a simple API for building distributed, parallel applications, especially deep learning workloads.
Ray is a general-purpose distributed computing framework with a rich set of libraries for large-scale data processing, model training, reinforcement learning, and model serving. It is popular as a simple API for building and scaling AI and Python workloads. Its focus is on the application itself: users build distributed software with a unified and flexible set of APIs. Some of the advanced libraries offered by Ray are:
- RLlib for reinforcement learning
- Ray Tune for hyperparameter tuning
- Ray Train for distributed deep learning
- Ray Serve for scalable model serving
- Ray Data for preprocessing
Why Ray?
Many data scientists and researchers prefer Ray for the following reasons:
Python Native
Ray provides a Python-first API that is extensible and open. It integrates natively with much of the MLOps ecosystem, including:
- ML frameworks, e.g. PyTorch and TensorFlow
- Specialized libraries such as vLLM and TensorRT-LLM (TRT-LLM)
- MLOps tools such as Weights & Biases and MLflow
Libraries for Developers
Ray Data, Ray Train, Ray Tune, Ray Serve, and RLlib provide easy, familiar APIs for the most common AI workloads, so users can accelerate development.
Seamless Scale
Ray provides simple primitives: a single Python decorator is enough to turn a function into a distributed task, which makes it easy to scale from your laptop to the cloud.
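As a minimal sketch of that decorator-based primitive, the example below runs an ordinary function as parallel Ray tasks. The names `square` and `run_with_ray` are hypothetical and chosen for illustration. The sketch assumes Ray is installed (`pip install ray`) and starts a local Ray instance; on a cluster, `ray.init` would instead connect to the existing head node.

```python
from typing import List


def square(x: int) -> int:
    """An ordinary Python function; nothing Ray-specific here."""
    return x * x


def run_with_ray(values: List[int]) -> List[int]:
    """Run `square` on each value as a parallel Ray task (illustrative helper)."""
    # Imported lazily so this module can be loaded without Ray installed.
    import ray

    # Starts a local Ray instance, or reuses one that is already running.
    ray.init(ignore_reinit_error=True)

    # Equivalent to decorating `square` with @ray.remote.
    remote_square = ray.remote(square)

    # .remote() schedules the task and returns an object reference immediately.
    refs = [remote_square.remote(v) for v in values]

    # ray.get() blocks until all tasks finish and returns their results in order.
    return ray.get(refs)
```

Calling `run_with_ray([1, 2, 3])` on a machine with Ray installed would compute the squares as separate tasks. The same code should run unchanged on a multi-node cluster, since task placement is handled by Ray's scheduler.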
Unmatched Precision
Ray lets users run AI workloads on CPUs, GPUs, and TPUs. Resources can be partitioned, for example by assigning a fraction of a CPU or GPU to a task, to fine-tune utilization for each workload.
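One way such partitioning might look in practice is a fractional resource request on a task. The sketch below is an illustration under assumptions, not a recommended configuration: `resource_request`, `describe`, and `run_fractional` are hypothetical helpers, and the values (half a CPU per task) are arbitrary. On a node with GPUs, passing `num_gpus=0.5` would let two such tasks share one GPU. On a machine without GPUs, a task requesting one would stay pending, so the default here requests none.

```python
from typing import Dict, List


def resource_request(num_cpus: float = 0.5, num_gpus: float = 0.0) -> Dict[str, float]:
    """Build the per-task resource options (illustrative helper)."""
    if num_cpus < 0 or num_gpus < 0:
        raise ValueError("resource amounts must be non-negative")
    return {"num_cpus": num_cpus, "num_gpus": num_gpus}


def describe(tag: str) -> str:
    """Stand-in for real work, such as running inference on a batch."""
    return f"task {tag} finished"


def run_fractional(tags: List[str], num_gpus: float = 0.0) -> List[str]:
    """Run `describe` as Ray tasks that each reserve a fraction of a CPU."""
    # Imported lazily so this module can be loaded without Ray installed.
    import ray

    ray.init(ignore_reinit_error=True)

    # .options() overrides the resource requirements for these invocations.
    task = ray.remote(describe).options(**resource_request(0.5, num_gpus))

    refs = [task.remote(tag) for tag in tags]
    return ray.get(refs)
```

With a request of 0.5 CPUs per task, Ray's scheduler can place two such tasks on each logical CPU. These requests are used for scheduling only, so the tasks themselves are expected to stay within their share.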