Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper
Updated Jul 16, 2024 - Python
Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks"
An implementation of the transformer deep learning model, based on the research paper "Attention Is All You Need"
Re-implementation of the paper "Attention Is All You Need" for language translation
Implementation of "Attention is all you need" Paper with only PyTorch
GPT-Mini is a small-scale implementation of a GPT using PyTorch.
PyTorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"
A TensorFlow implementation of the Transformer model for machine translation tasks. This package includes data loading, model definition, and training scripts for translating Portuguese to English using the TED Talks dataset. The repository provides a complete pipeline from preprocessing the data to training and testing the model.
Build your own Face App with Stable Diffusion 2.1
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Implementation of the original transformer model described by Vaswani et al. for English-to-German translation
Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"
Attention practice exercises better suited to Chinese learners; China should have its own Ramanujan too!
A simple PyTorch implementation of LongViT, built on my previous implementation of LongNet as a foundation.
Implementation of the transformer from the paper: "Real-World Humanoid Locomotion with Reinforcement Learning"
Implementation of Liquid Nets in PyTorch
PyTorch implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"
Simple Implementation of TinyGPTV in super simple Zeta lego blocks