#training

3 approved public terms with this tag.

Fine-Tuning

/faɪn ˈtjuːnɪŋ/ noun

AI & Technology

The process of further training a pre-trained model on a smaller, task-specific dataset to adapt its behavior for a particular domain or style. Fine-tuning updates the model's weights to make it perform better on specific tasks without training from scratch.

“We fine-tuned the base model on our legal contracts corpus so it could draft clauses in the right style.”

by @mlresearcher
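
To make the Fine-Tuning definition above concrete, here is a minimal sketch in plain Python. It assumes a toy linear model whose "pre-trained" weights are invented for illustration; the function names, learning rate, and dataset are all hypothetical. The point it tries to show is only that training continues from existing weights on a small task dataset rather than starting from scratch.

```python
# Toy illustration of fine-tuning: keep training from existing weights.
# The model, weights, and data below are hypothetical, not a real system.

def predict(weights, bias, x):
    """Linear model: weighted sum of the inputs plus a bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def mse(weights, bias, data):
    """Mean squared error of the model over (input, target) pairs."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(weights, bias, data, lr=0.05, epochs=500):
    """Continue stochastic gradient descent from the given parameters."""
    weights = list(weights)  # copy so the base model's weights stay untouched
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Assumed "pre-trained" parameters (stand-ins for a large base model).
pretrained_w, pretrained_b = [1.0, 0.5], 0.0

# Small task-specific dataset; targets follow y = 1.2*x0 + 0.4*x1 + 0.1.
task_data = [
    ([1.0, 0.0], 1.3),
    ([0.0, 1.0], 0.5),
    ([1.0, 1.0], 1.7),
    ([0.5, 0.5], 0.9),
]

loss_before = mse(pretrained_w, pretrained_b, task_data)
tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, task_data)
loss_after = mse(tuned_w, tuned_b, task_data)
print(f"task loss before: {loss_before:.4f}, after: {loss_after:.6f}")
```

Real fine-tuning applies the same idea to neural networks with millions or billions of weights, usually with a small learning rate so the pre-trained behavior is adjusted rather than overwritten.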

RLHF

/ɑːr el eɪtʃ ef/ noun

AI & Technology

#ai #training #alignment #human-feedback

Reinforcement Learning from Human Feedback: a training technique used to align language models with human preferences. Human raters compare model outputs and choose the better response; these preferences are used to train a reward model, which then guides further fine-tuning via reinforcement learning.

“RLHF is the key step that turns a raw language model into a helpful, harmless assistant.”

by @aisafety
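
The reward-model step in the RLHF definition above can be sketched as follows. This is a simplified illustration, assuming each response is already summarized as a small feature vector (the two features, "helpfulness" and "rudeness", and all numbers are hypothetical) and using a pairwise logistic (Bradley-Terry style) loss, which is one common way to learn from comparisons. The subsequent reinforcement-learning step is not shown.

```python
import math

def reward(weights, features):
    """Linear reward model: higher score means more preferred."""
    return sum(w * f for w, f in zip(weights, features))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_reward_model(preferences, dim, lr=0.1, epochs=200):
    """Fit weights so chosen responses score higher than rejected ones.

    Minimizes -log(sigmoid(r(chosen) - r(rejected))) by gradient descent.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in preferences:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d/d(margin) of -log(sigmoid(margin)) is -(1 - sigmoid(margin)),
            # so the descent step moves weights along (chosen - rejected).
            scale = 1.0 - sigmoid(margin)
            weights = [
                w + lr * scale * (c - r)
                for w, c, r in zip(weights, chosen, rejected)
            ]
    return weights

# Hypothetical rater comparisons: (chosen_features, rejected_features),
# where features are [helpfulness, rudeness] on a 0..1 scale.
preferences = [
    ([0.9, 0.1], [0.2, 0.1]),
    ([0.8, 0.0], [0.8, 0.7]),
    ([0.6, 0.2], [0.3, 0.9]),
]

weights = train_reward_model(preferences, dim=2)
for chosen, rejected in preferences:
    print(reward(weights, chosen) > reward(weights, rejected))
```

In a full RLHF pipeline the learned reward would then score new model outputs, and a reinforcement-learning algorithm would update the language model to produce higher-scoring responses.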

Synthetic Data

/sɪnˈθetɪk ˈdeɪtə/ noun

AI & Technology

#ai #ml #training #data

Artificially generated data that mimics the statistical properties of real-world data, used for training or testing AI models. Synthetic data can be created by generative models, rule-based systems, or simulations, and is especially valuable when real data is scarce, sensitive, or expensive to collect.

“We generated synthetic medical records to train the model without risking patient privacy.”

by @mlresearcher
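
As an illustration of the rule-based approach mentioned in the Synthetic Data definition above, the sketch below generates fake patient-like records from simple statistical rules. The field names, distributions, and the assumed age-to-blood-pressure relationship are invented for the example and are not clinically validated.

```python
import random
import statistics

def generate_records(n, seed=0):
    """Generate n synthetic patient-like records from hand-written rules.

    Assumed rules (hypothetical): age is roughly normal around 50 and
    clipped to 18..90; systolic blood pressure rises with age plus noise.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    records = []
    for i in range(n):
        age = min(90, max(18, round(rng.gauss(50, 15))))
        systolic = round(100 + 0.5 * age + rng.gauss(0, 8))
        records.append({
            "patient_id": f"SYN-{i:05d}",  # synthetic ID, tied to no real person
            "age": age,
            "systolic_bp": systolic,
        })
    return records

records = generate_records(1000)
print(records[0])
print("mean age:", statistics.mean(r["age"] for r in records))
```

Because every value is drawn from the stated rules rather than copied from real people, the records carry no real patient information, although how closely they mimic real-world statistics depends entirely on how well the rules are chosen.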