Why Ray Train?

Here are a few of the reasons machine learning engineers choose Ray Train to scale their deep learning workloads.


Native multi-GPU support

Scale from single-threaded to multi-GPU training in under 10 lines of code.


Intuitive API

With one API for distributed gradient-descent training, move to production or scale to a large cluster without rewriting your code.



Works seamlessly with deep learning frameworks including PyTorch, TensorFlow, Horovod, and many more.

Try It Yourself

Install Ray Train and PyTorch with the command below, then give this example a try.

pip install ray torch

import torch
import ray.train as train
import ray.train.torch  # Makes the train.torch.* helpers available.
from ray.train import Trainer


def train_func():
    # Set up the model and wrap it for distributed training.
    model = torch.nn.Linear(1, 1)
    model = train.torch.prepare_model(model)
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    # Set up synthetic data for y = 2x and prepare the loader for distributed sampling.
    inputs = torch.randn(1000, 1)
    labels = inputs * 2
    dataset = torch.utils.data.TensorDataset(inputs, labels)
    dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
    dataloader = train.torch.prepare_data_loader(dataloader)

    # Train for 5 epochs.
    for _ in range(5):
        for X, y in dataloader:
            pred = model(X)
            loss = loss_fn(pred, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    return model.state_dict()


# Run train_func on 4 workers.
trainer = Trainer(backend="torch", num_workers=4)
trainer.start()
results = trainer.run(train_func)
trainer.shutdown()
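To make the idea behind the example above more concrete, here is a minimal, framework-free sketch of data-parallel gradient descent on the same y = 2x problem. It is an illustration under simplifying assumptions, not how Ray Train or PyTorch implement it: the "workers" are simulated in a single process, full-shard gradients are used instead of mini-batches, and the helper names (shard, local_gradient, data_parallel_sgd) are hypothetical.

```python
import random


def shard(data, num_workers):
    """Split data into round-robin shards, one per simulated worker."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w, b, batch):
    """Gradient of the mean squared error for y_hat = w * x + b on one shard."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def data_parallel_sgd(data, num_workers=4, lr=0.05, steps=200):
    """Simulate data-parallel training: each worker computes a gradient on
    its own shard, the gradients are averaged, and every worker applies the
    same update so the model replicas stay in sync."""
    shards = shard(data, num_workers)
    w, b = 0.0, 0.0
    for _ in range(steps):
        grads = [local_gradient(w, b, s) for s in shards]
        # Averaging stands in for the all-reduce step of a real cluster.
        grad_w = sum(g[0] for g in grads) / num_workers
        grad_b = sum(g[1] for g in grads) / num_workers
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data matching the example: labels are inputs * 2.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
data = [(x, 2 * x) for x in xs]

w, b = data_parallel_sgd(data)
print(f"w = {w:.4f}, b = {b:.4f}")
```

The learned weight should end up close to 2 and the bias close to 0. Because the shards are equal in size, the averaged gradient matches the gradient over the full dataset, which is why adding workers changes the speed of training rather than its result in this simplified setting.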

Scale more workloads with Ray

Expand your Ray journey beyond deep learning and bring fast and easy distributed execution to other use cases.

Ray Core

Scale general Python apps.

Ray Tune

Scale hyperparameter search.

Ray Datasets

Scale data loading and processing.

O'Reilly Learning Ray Book

Get your free copy of the early-release chapters of Learning Ray, the first and only comprehensive book on Ray and its ecosystem, authored by members of the Ray engineering team.