Jailbreak
[/ˈdʒeɪlbreɪk/]
noun/verb
AI & Technology
#ai #security #safety #bypass
Definitions
1
A technique used to bypass the safety filters and content policies of an AI model, typically by framing harmful requests in ways the model's defenses don't recognize. Jailbreaks often use role-play scenarios, hypothetical framings, or encoded instructions to make the model comply with prohibited requests.
“The ‘DAN’ jailbreak asked the model to pretend it was an AI with no restrictions.”
by @aisafety, 1/1/1970