Reinforcement Learning from Human Feedback
Concept: AI training technique where human evaluators rate outputs to teach models preferred behavior
1 story