ποΈ π’ Prompt Injection
Prompt injection is a technique used to hijack a language model's output(@branch2022evaluating)(@crothers2022machine)(@goodside2022inject)(@simon2022inject).
ποΈ π’ Prompt Leaking
Prompt leaking is a form of prompt injection in which the model is asked to
ποΈ π’ Jailbreaking
Jailbreaking is a type of prompt injection, in which prompts attempt to bypass safety and moderation features placed on LLMs by their creators(@perez2022jailbreak)(@brundage_2022)(@wang2022jailbreak).
ποΈ π’ Defensive Measures
Preventing prompt injection can be extremely difficult, and there exist few to no