- **Paper:** Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails (2504.11168, published Apr 15, 2025)
- **rogue-security/prompt-injection-jailbreak-sentinel-v2**: text classification, 0.6B parameters, updated 26 days ago
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0**: text classification, updated Sep 22, 2025
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0**: text classification, updated Sep 22, 2025
- **ShieldGemma** (collection): a family of models for text and image content moderation, 4 items, updated 25 days ago