Why Ray SGD?
Here are a few of the reasons machine learning engineers choose to scale their deep learning workloads with Ray SGD.
Native multi-GPU support
Easily scale from single-threaded to multi-GPU training in under 10 lines of code.
With a best-in-class API for gradient descent, migrate to production or scale to a large cluster without rewriting code.
Works seamlessly with leading deep learning frameworks, including PyTorch, TensorFlow, Horovod, and many more.
Try It Yourself
Install Ray SGD (and PyTorch) with the command below, then give this example a try.
pip install ray torch
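import torch
import torch.optim as optim
from torch.nn.parallel import DistributedDataParallel

# Import path assumed for the Ray SGD v2 API; adjust it to your Ray version.
from ray.util.sgd.v2 import Trainer, TorchConfig

# get_data_loaders, ConvNet, train, and test are user-defined helpers (not shown).

def train_func(config=None):
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda" if use_cuda else "cpu")
    train_loader, test_loader = get_data_loaders()
    model = ConvNet().to(device)
    optimizer = optim.SGD(model.parameters(), lr=0.1)
    # Wrap the model so gradients are synchronized across workers.
    model = DistributedDataParallel(model)
    all_results = []
    for epoch in range(40):
        train(model, optimizer, train_loader, device)
        acc = test(model, test_loader, device)
        all_results.append(acc)
    # Unwrap the underlying model before returning it.
    return model.module, all_results

trainer = Trainer(
    num_workers=8,
    use_gpu=True,
    backend=TorchConfig())
print(trainer)  # prints a table of resource usage
trainer.start()  # launch the workers (assumed to be required before run)
results = trainer.run(train_func)  # scale out here! one return value per worker
trainer.shutdown()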
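To see what scaling out means for gradient descent, here is a minimal sketch of data-parallel SGD in plain Python. It is an illustration only, not how Ray SGD or DistributedDataParallel is implemented: the function names, the toy model y = w * x, and the in-process "workers" are all assumptions made for this example. Each worker computes a gradient on its own shard of the data, the gradients are averaged, and every worker applies the same update.

```python
# Hypothetical, stdlib-only sketch of data-parallel SGD (not Ray SGD's implementation).

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_sgd(data, num_workers=4, lr=0.01, epochs=200):
    # Give each worker an equally sized shard of the dataset.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0  # every worker starts from the same weight
    for _ in range(epochs):
        # Each worker computes a gradient on its own shard
        # (this loop would run in parallel on a real cluster).
        grads = [local_gradient(w, shard) for shard in shards]
        # Average the gradients so every worker applies the same update.
        avg_grad = sum(grads) / len(grads)
        w -= lr * avg_grad
    return w


# Toy data generated from y = 3x, so the learned weight should approach 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = data_parallel_sgd(data)
print(w)
```

Because the shards are the same size, averaging the per-worker gradients gives the same update as computing the gradient over the full dataset on one machine, which is why the training function itself does not need to change as workers are added.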
Scale more workloads with Ray
Expand your Ray journey beyond deep learning and bring fast and easy distributed execution to other use cases.
Scale general Python apps.
Scale hyperparameter search.
Scale data loading and processing.