RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques • Paper • 2501.14492 • Published 8 days ago
Enabling Scalable Oversight via Self-Evolving Critic • Paper • 2501.05727 • Published 22 days ago