Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
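A minimal sketch of how a model is typically wrapped with DeepSpeed's engine; the placeholder model, the ZeRO settings, and the training step below are illustrative assumptions, not code from the repository:

```python
# Minimal DeepSpeed wrapping sketch (model and config values are assumptions).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real network

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},  # partition optimizer states and gradients
}

# The returned engine handles data parallelism, gradient accumulation,
# and ZeRO partitioning behind an otherwise ordinary training loop.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

def train_step(batch, labels):
    outputs = model_engine(batch)
    loss = torch.nn.functional.mse_loss(outputs, labels)
    model_engine.backward(loss)   # engine scales and accumulates gradients
    model_engine.step()           # engine runs the optimizer and ZeRO bookkeeping
```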
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
PaddlePaddle (飞桨) large-model development suite, providing a full development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)
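The core idea is to quantize each gradient to three levels before communication so workers exchange a scale plus 2-bit values instead of full-precision tensors. A rough sketch of that ternarization step, based on my reading of the TernGrad paper rather than the linked code:

```python
# Stochastic gradient ternarization in the spirit of TernGrad (illustrative sketch).
import torch

def ternarize(grad: torch.Tensor) -> torch.Tensor:
    """Map each gradient entry to {-s, 0, +s}, where s is the max magnitude."""
    s = grad.abs().max()
    if s == 0:
        return torch.zeros_like(grad)
    # Keep the sign with probability |g_i| / s, zero otherwise; in expectation
    # the ternarized gradient equals the original, so the update stays unbiased.
    keep = torch.bernoulli(grad.abs() / s)
    return s * grad.sign() * keep

g = torch.randn(5)
print(g, ternarize(g))
```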
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
Large-scale 4D-parallel pre-training of 🤗 Transformers Mixture-of-Experts models *(still a work in progress)*
Distributed Keras Engine: make Keras faster with only one line of code.
Distributed training (multi-node) of a Transformer model
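The standard multi-node recipe for this kind of repository is PyTorch's `torch.distributed` with `DistributedDataParallel`. The sketch below is that generic pattern with a small stand-in Transformer, not the repository's own code:

```python
# Generic multi-node data-parallel pattern with torch.distributed / DDP.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for every process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Small Transformer encoder as a stand-in for the real model.
    model = torch.nn.TransformerEncoder(
        torch.nn.TransformerEncoderLayer(d_model=256, nhead=8), num_layers=4
    ).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])  # gradients all-reduced across ranks

    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
    for _ in range(10):                                       # dummy training loop
        x = torch.randn(16, 32, 256, device=local_rank)       # (seq, batch, d_model)
        loss = model(x).pow(2).mean()                          # placeholder loss
        optimizer.zero_grad()
        loss.backward()               # DDP overlaps the all-reduce with backward
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Such a script is usually launched once per node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_endpoint=<host>:29500 train.py`; the exact flags depend on the cluster setup.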
SC23 Deep Learning at Scale Tutorial Material
Orkhon: ML Inference Framework and Server Runtime
☕ Implementation of parallel matrix multiplication using the Fox algorithm on Peking University's high-performance computing system
Official Repository for the paper: Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
This repository provides hands-on labs on PyTorch-based distributed training and SageMaker distributed training. It is written to be easy for beginners to get started with, and walks through step-by-step code modifications starting from a basic BERT use case.
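One modification any PyTorch data-parallel setup needs is sharding the dataset across workers. A generic sketch of that step with `DistributedSampler` (the dataset and loop are stand-ins, not code from the labs):

```python
# Sharding a dataset across data-parallel workers with DistributedSampler.
# Assumes torch.distributed.init_process_group() has already been called.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.randn(1000, 128), torch.randint(0, 2, (1000,)))

# Each rank sees a disjoint 1/world_size slice of the data per epoch;
# rank and world size are read from the default process group.
sampler = DistributedSampler(dataset, shuffle=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reshuffle consistently across ranks each epoch
    for inputs, labels in loader:
        pass  # forward/backward/step as usual; DDP averages the gradients
```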
Fast and easy distributed model training examples.
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
Torch Automatic Distributed Neural Network (TorchAD-NN) training library. Built on top of TorchMPI, this module automatically parallelizes neural network training.