Welcome to the Awesome-RLHF-Vision repository! This is a curated collection of research papers focusing on Reinforcement Learning with Human Feedback (RLHF) as applied to vision models. Our goal is to create a comprehensive resource that is continuously updated to track the latest advancements in this exciting field.
⭐ If you find this repository useful or interesting, please consider giving it a star to show your support! ⭐
Here you can find research papers and human-preference datasets on RLHF for vision models, organized by topic.
Format:
- [title](paper link) [links]
  - author1, author2, and author3...
  - publisher
  - keyword
  - summary
  - code
  - experiment environments and datasets
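To make the format concrete, here is a hypothetical example entry; the title, authors, venue, and links below are placeholders, not a real paper:

```markdown
- [An Example Paper on RLHF for Vision](https://arxiv.org/abs/0000.00000)
  - First Author, Second Author, and Third Author
  - Example Venue 2024
  - Keyword: Image Generation, Reward Model
  - Summary: One or two sentences describing the paper's main contribution.
  - Code: [Official](https://github.com/example/repo)
  - Experiments: evaluation environments and datasets used in the paper
```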
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
  - Tianyu Yu, Yuan Yao, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
  - CVPR 2024
  - Keyword: Multimodal Understanding, Hallucination, Dense Direct Preference Optimization, Human-annotated, MiniCPM
  - Code: Official
- Diffusion Model Alignment Using Direct Preference Optimization
  - Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik
  - CVPR 2024
  - Keyword: Image Generation, Image-to-Image Editing, DPO, SDXL-1.0
  - Code: Official
- Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
  - Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, Omer Levy
  - NeurIPS 2023
  - Keyword: Image Generation, Reward Model, Human Preference Dataset
  - Code: Official
- ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
  - Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, Yuxiao Dong
  - NeurIPS 2023
  - Keyword: Image Generation, Reward Model, Reward Feedback Learning
  - Code: Official
- Aligning Text-to-Image Models Using Human Feedback
  - Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu
  - NeurIPS 2023
  - Keyword: Image Generation, Reward Model, Reward-weighted Learning
- Human Preference Score: Better Aligning Text-to-Image Models with Human Preference
  - Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li
  - ICCV 2023
  - Keyword: Image Generation, Reward Model (human preference classifier), Preference Tuning
  - Code: Official
- Rich Human Feedback for Text-to-Image Generation
  - Youwei Liang, Junfeng He, Gang Li, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, Vidhya Navalpakkam
  - CVPR 2024 Best Paper
  - Keyword: Image Generation, High-resolution, Reward/Critic Model, Fine-grained Feedback Signals, RFT
  - Code: Official
- RLHF-V-Dataset
  - OpenBMB
  - Keyword: Human Preference, Rankings, Trustworthiness
  - Task: Diverse multimodal understanding tasks
- RLAIF-V-Dataset
  - OpenBMB
  - Keyword: AI Preference, Rankings, Trustworthiness
  - Task: Diverse multimodal understanding tasks
- Pick-a-Pic v1
  - Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, Omer Levy
  - Keyword: Human Preference, Rankings
  - Task: Image Generation
- Pick-a-Pic v2
  - Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, Omer Levy
  - Keyword: Human Preference, Rankings
  - Task: Image Generation
- ImageRewardDB
  - THUDM
  - Keyword: Human Preference, Rankings
  - Task: Image Generation
- Simulacra Aesthetic Captions
  - John David Pressman, Katherine Crowson, and Simulacra Captions Contributors
  - Keyword: Human Preference, Ratings
  - Task: Image Generation
- Human Preference Dataset (HPD)
  - Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li
  - Keyword: Human Preference, Ratings
  - Task: Image Generation
- RichHF-18K
  - Google Research
  - Keyword: Human Preference, Ratings, Human-labeled Heatmaps (e.g., artifact regions of distorted pixels) and Misalignment Tokens in Prompts
  - Task: Image Generation
Note: This list is continually updated. Make sure to check back regularly for the most recent papers.
We welcome contributions from the community! If you have a paper or resource you'd like to add, feel free to submit a pull request or open an issue. Please see our contribution guidelines for details.
This repository is licensed under the MIT License. See the LICENSE file for more information.
We would like to thank the contributors and researchers whose efforts have made this compilation possible.
For any questions or suggestions, feel free to open an issue or contact us directly ([email protected]). We appreciate your feedback!
Thank you for visiting Awesome-RLHF-Vision! Happy reading and researching!