Guardrails
[/ˈɡɑːrdreɪlz/]
nounAI & Technology#ai#safety#moderation#constraints0 views1 definitions
Definitions
1
+1218
Safety constraints and filters applied to AI systems to prevent harmful, offensive, or out-of-scope outputs. Guardrails can be implemented at the model level (via training), prompt level (system instructions), or application level (output classifiers) to keep AI behavior within acceptable boundaries.
“The guardrails blocked the model from providing detailed instructions on dangerous activities.”
by @aisafety1/1/1970