- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 2
- Contrastive Prefence Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47

Collections including paper arxiv:2312.00849:

- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
  Paper • 2312.00849 • Published • 8
- RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
  Paper • 2405.17220 • Published • 1
- ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
  Paper • 2304.05977 • Published • 1
- Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts
  Paper • 2406.12845 • Published • 1

- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 28
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 4
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 31

- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 10
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 47
- SuperHF: Supervised Iterative Learning from Human Feedback
  Paper • 2310.16763 • Published • 1
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
  Paper • 2311.15657 • Published • 2

- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 41
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 48
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1
- Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
  Paper • 2310.00653 • Published • 3

- Contrastive Prefence Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 47
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
  Paper • 2312.00849 • Published • 8

- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
  Paper • 2309.10150 • Published • 24
- Robotic Offline RL from Internet Videos via Value-Function Pre-Training
  Paper • 2309.13041 • Published • 8
- Voyager: An Open-Ended Embodied Agent with Large Language Models
  Paper • 2305.16291 • Published • 9