
Poisoned Reward Model

This reward model was used to align this generation model for the trojan detection competition co-located with SaTML 2024. For more information, visit the official competition website.