A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Argilla is a collaboration platform for AI engineers and domain experts who require high-quality outputs, full data ownership, and overall efficiency.
⚗️ distilabel is a synthetic data and AI feedback framework for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
Official release of the InternLM2.5 7B base and chat models, with 1M-token context support.
Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)
Aligning a GPT-2 model to generate non-toxic text.
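As a reduced illustration of this kind of detoxification loop (a real pipeline would typically use PPO with a KL penalty to a reference model, e.g. via trl), here is a single REINFORCE-style step that downweights toxic continuations using an off-the-shelf toxicity classifier. The prompt and the `unitary/toxic-bert` classifier are assumptions, not the repo's exact setup:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, pipeline

tok = GPT2Tokenizer.from_pretrained("gpt2")
policy = GPT2LMHeadModel.from_pretrained("gpt2")
toxicity = pipeline("text-classification", model="unitary/toxic-bert")
opt = torch.optim.AdamW(policy.parameters(), lr=1e-5)

prompt = tok("The internet is", return_tensors="pt")
sample = policy.generate(**prompt, max_new_tokens=20, do_sample=True,
                         pad_token_id=tok.eos_token_id)
text = tok.decode(sample[0], skip_special_tokens=True)
reward = -toxicity(text)[0]["score"]  # top-label score as a toxicity proxy

# REINFORCE step: scale the sample's NLL by its (negative) reward so
# toxic continuations become less likely. No baseline or KL term here.
nll = policy(sample, labels=sample).loss
loss = reward * nll
opt.zero_grad(); loss.backward(); opt.step()
```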
A Python client library for improving your LLM app's accuracy.
SimPO: Simple Preference Optimization with a Reference-Free Reward
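For orientation, SimPO replaces DPO's reference-model log-ratios with length-normalized policy log-probabilities and adds a target reward margin. A minimal PyTorch sketch of that loss; tensor names and the β/γ defaults are illustrative, not the repo's API:

```python
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps, rejected_logps, chosen_len, rejected_len,
               beta=2.0, gamma=1.0):
    # Reference-free implicit reward: length-normalized policy log-likelihood.
    r_chosen = beta * chosen_logps / chosen_len
    r_rejected = beta * rejected_logps / rejected_len
    # Bradley-Terry logistic loss with a target reward margin gamma.
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```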
Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.
The official implementation of Self-Play Preference Optimization (SPPO)
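The paper frames each self-play iteration as a squared-loss regression: push the log-ratio between the updated policy and the current policy toward a scaled, centered estimate of the response's win probability. A hedged sketch under that reading; names and the η default are illustrative:

```python
import torch

def sppo_loss(policy_logps, current_logps, win_prob, eta=1e3):
    # Regress log(pi_theta / pi_t) onto the centered, scaled estimate
    # of P(y beats the current policy pi_t | x) from a preference model.
    log_ratio = policy_logps - current_logps
    target = eta * (win_prob - 0.5)
    return ((log_ratio - target) ** 2).mean()
```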
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
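At its core, this style of evaluator reports a pairwise win rate against a baseline, with an LLM acting as judge. A minimal sketch of the metric; the `judge` callable is a hypothetical stand-in for the annotator:

```python
def win_rate(model_outputs, baseline_outputs, judge):
    """Pairwise win rate: the judge (in practice an LLM annotator) picks
    the better output per instruction; ties count as half a win."""
    wins = 0.0
    for ours, theirs in zip(model_outputs, baseline_outputs):
        verdict = judge(ours, theirs)  # returns "A", "B", or "tie"
        wins += {"A": 1.0, "tie": 0.5}.get(verdict, 0.0)
    return wins / len(model_outputs)
```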
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models, such as VideoCrafter, OpenSora, ModelScope, and StableVideoDiffusion, by fine-tuning them with reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, and aesthetic scorers.
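Stripped to a toy, the recipe is: run a few differentiable sampler steps, score the result with a differentiable reward model, and backpropagate the reward gradient into the denoiser's weights. All modules below are toy stand-ins, not the repo's models:

```python
import torch

# Toy stand-ins: a denoiser and a differentiable reward model.
denoiser = torch.nn.Linear(16, 16)
reward_model = torch.nn.Linear(16, 1)
opt = torch.optim.AdamW(denoiser.parameters(), lr=1e-4)

for _ in range(100):
    x = torch.randn(8, 16)          # start from pure noise
    for _ in range(4):              # a few differentiable denoising steps
        x = x - 0.1 * denoiser(x)
    loss = -reward_model(x).mean()  # ascend the reward's gradient
    opt.zero_grad(); loss.backward(); opt.step()
```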
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
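One way to read the idea: keep the standard DPO logistic loss but calibrate β per batch from the implicit-reward margin. The sketch below uses a simplified batch-level rule; the paper's exact calibration and its data filtering differ:

```python
import torch
import torch.nn.functional as F

def dynamic_beta_dpo_loss(pi_w, pi_l, ref_w, ref_l,
                          beta0=0.1, alpha=0.6, m0=0.0):
    # Implicit-reward margin of the standard DPO objective.
    margin = (pi_w - ref_w) - (pi_l - ref_l)
    # Batch-level calibration: raise beta when the batch's mean margin
    # exceeds the target m0, lower it otherwise (simplified).
    beta = beta0 * (1.0 + alpha * (margin.mean().detach() - m0))
    return -F.logsigmoid(beta * margin).mean()
```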
RewardBench: the first evaluation tool for reward models.
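The benchmark's core metric is simple: a reward model passes an example when it scores the chosen response above the rejected one, and accuracy is averaged over pairs. A sketch, with `reward_model` as an assumed scoring callable:

```python
def pairwise_accuracy(reward_model, prompts, chosen, rejected):
    # A reward model "passes" an example when it scores the chosen
    # response above the rejected one; the benchmark reports accuracy.
    hits = sum(reward_model(p, c) > reward_model(p, r)
               for p, c, r in zip(prompts, chosen, rejected))
    return hits / len(prompts)
```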