[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
A curated list of human preference datasets for LLM fine-tuning, RLHF, and evaluation.
A Toolkit for Distributional Control of Generative Models
SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).
Models, data, and code for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
Fine-tuning Language Models with Conditioning on Two Human Preferences
Fine-tuning LLMs with conditional training to learn two human preferences. UCL module project: Statistical Natural Language Processing (COMP0087).