#moderation
1 approved public terms with this tag.
Guardrails
/ˈɡɑːrdreɪlz/noun
机器辅助翻译草稿 (Chinese) for "Guardrails": Safety constraints and filters applied to AI systems to prevent harmful, offensive, or out-of-scope outputs. Guardrails can be implemented at the model level (via training), prompt level (system instructions), or application level (output classifiers) to keep AI behavior within acceptable boundaries.
“示例草稿: The guardrails blocked the model from providing detailed instructions on dangerous activities.”