Skip to content

Jailbreak

[/ˈdʒeɪlbreɪk/]

noun/verbAI & Technology#ai#security#safety#bypass
0 views1 definitions

Definitions

1
+2147

A technique used to bypass the safety filters and content policies of an AI model, typically by framing harmful requests in ways the model's defenses don't recognize. Jailbreaks often use role-play scenarios, hypothetical framings, or encoded instructions to make the model comply with prohibited requests.

The "DAN" jailbreak asked the model to pretend it was an AI with no restrictions.
by @aisafety1/1/1970

Related Terms

Related terms are generated only from public tags, classes, translations, and explicit relationships. No unavailable semantic relationships are fabricated.