RLHF

[/ɑːr el eɪtʃ ef/]

noun · AI & Technology · #ai #training #alignment #human-feedback

Definitions

Machine-assisted language draft. Human review still needed.

Machine-assisted translation draft (Hindi) for "RLHF": Reinforcement Learning from Human Feedback, a training technique used to align language models with human preferences. Human raters compare model outputs and choose the better response; these preferences train a reward model, which then guides further fine-tuning via reinforcement learning.

Example draft: RLHF is a key step in turning a raw language model into a helpful, harmless assistant.
by @dictionary_auto_translate, 1/1/1970
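
The reward-model step in the definition above can be illustrated with a small sketch. It assumes a Bradley-Terry style pairwise loss, a common choice that this entry does not itself name, and a hypothetical linear reward model over two made-up features (helpfulness and rudeness scores). The reinforcement-learning fine-tuning stage that follows is not shown.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), computed stably.

    Assumption: Bradley-Terry pairwise loss; the entry does not specify one.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def reward(weights, features):
    """Hypothetical linear reward model: a weighted sum of response features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(comparisons, n_features, lr=0.1, epochs=200):
    """Fit weights so that rater-chosen responses score above rejected ones."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in comparisons:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Gradient of the loss w.r.t. the margin is -(1 - sigmoid(margin)).
            grad_scale = -(1.0 - sigmoid(margin))
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights


# Toy rater comparisons: (features of chosen response, features of rejected one).
# Feature order is illustrative only: [helpfulness_score, rudeness_score].
comparisons = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.6]),
]
weights = train_reward_model(comparisons, n_features=2)
```

After training on these toy comparisons, the learned weights reward the first feature and penalize the second, so each chosen response scores higher than its rejected counterpart. In a real pipeline the reward model would typically be a neural network scoring full text, and its scores would then drive the reinforcement-learning update of the language model.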
