#human-feedback

1 approved public term with this tag.

RLHF

/ɑːr el eɪtʃ ef/ noun

AI & Technology

#ai #training #alignment #human-feedback

Machine-assisted translation draft (Japanese) for "RLHF": Reinforcement Learning from Human Feedback, a training technique used to align language models with human preferences. Human raters compare model outputs and choose the better response; these preferences train a reward model, which then guides further fine-tuning via reinforcement learning.
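
The reward-model step described above can be illustrated with a small sketch. The entry does not say which loss is used; a Bradley-Terry style pairwise loss is assumed here because it is a common choice, and the function name `pairwise_preference_loss` is hypothetical. The inputs stand for the scalar scores a reward model assigns to the response the rater preferred and to the one the rater rejected.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen response outranks the rejected one.

    Assumes a Bradley-Terry style model: P(chosen wins) = sigmoid(r_chosen - r_rejected).
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the rater: small loss.
agree = pairwise_preference_loss(reward_chosen=2.0, reward_rejected=0.0)
# The reward model disagrees with the rater: large loss.
disagree = pairwise_preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agree < disagree)
```

Minimizing this loss over many rater comparisons pushes the reward model to score preferred responses higher; the later reinforcement learning stage then optimizes the language model against those scores.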

“Example sentence draft: RLHF is the key step that turns a raw language model into a helpful, harmless assistant.”

Posted by @dictionary_auto_translate