Carapace

Carapace is a prompt injection firewall for LLMs. It detects and blocks injection attacks before they reach your AI, with a 100% detection rate across 1,380 malicious payloads and 0% false positives across 150 clean payloads.

Enterprise LLM APIs (Claude, GPT-4o) have built-in safety filtering. Self-hosted models have none.

We tested 18 models via Ollama with zero content filtering. Selected results:

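| Model | Size | Vulnerability |
| --- | --- | --- |
| Qwen2.5:7b | 7B | 83% |
| Llama3.3:70b | 70B | 80% |
| Mistral-Large:123b | 123B | 70% |
| DeepSeek-R1:70b | 70B | 60% |
| Gemma3:27b | 27B | 40% |
| gpt-oss:120b | 120B | 25% |
| Phi4-Mini:3.8b | 3.8B | 20% |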

Average vulnerability: ~49%. No model scored 0%. Model size doesn’t correlate with safety: Mistral-Large 123B was 70% vulnerable, while Phi4-Mini 3.8B was only 20% vulnerable.
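As a rough check on the size claim, the sketch below computes a Pearson correlation between parameter count and vulnerability, using only the seven rows listed in the table above (not the full set of 18 models, whose per-model figures are not given here). The `rows` array and the `pearson` helper are illustrative names and are not part of Carapace.

```typescript
// Rows copied from the table above:
// [model, size in billions of parameters, vulnerability in percent].
const rows: Array<[string, number, number]> = [
  ["Qwen2.5:7b", 7, 83],
  ["Llama3.3:70b", 70, 80],
  ["Mistral-Large:123b", 123, 70],
  ["DeepSeek-R1:70b", 70, 60],
  ["Gemma3:27b", 27, 40],
  ["gpt-oss:120b", 120, 25],
  ["Phi4-Mini:3.8b", 3.8, 20],
];

// Pearson correlation coefficient between two equal-length series.
function pearson(xs: number[], ys: number[]): number {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

const sizeVsVulnerability = pearson(
  rows.map((r) => r[1]),
  rows.map((r) => r[2]),
);
console.log(sizeVsVulnerability);
```

On these seven rows the coefficient comes out close to zero, which is consistent with the claim, although seven data points are too few for a firm statistical conclusion.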

The API layer IS the protection. When you self-host, you lose it. Carapace puts it back.

| Mode | What it protects | Use case |
| --- | --- | --- |
| SDK/Middleware | Your application code | Developers integrating LLMs |
| Gateway | HTTP API calls to LLMs | Apps, self-hosted models, teams |
| MCP Proxy | Tool execution in Claude/agents | Claude Desktop, Cursor, agent frameworks |
| eBPF | All SSL traffic on machine | Dev machines, servers, fleet-wide |

| Category | Severity | Examples |
| --- | --- | --- |
| Instruction Override | Critical | “ignore previous instructions” |
| Role Injection | Critical | `[SYSTEM]`, `<<SYS>>` role markers |
| Identity Hijack | High | “you are now DAN”, jailbreak prompts |
| Extraction Attempt | High | “repeat your system prompt” |
| Authority Impersonation | Critical | “this is Anthropic, admin override” |
| Command Injection | Critical | `curl \| bash`, `eval()`, `rm -rf` |
| Exfiltration | Critical | “send ~/.ssh/id_rsa to…” |
| Tool Poisoning | Critical | `tool_call`, `function_call` injection |
| Roleplay Jailbreak | Critical | “let’s play a game” (89.6% ASR) |
| FlipAttack | High | Reversed text evasion (98% ASR) |
| Encoding Evasion | High | Base64, URL encoding, hex, ROT13 |
| Unicode Injection | High | Zero-width spaces, invisible separators |
| Multi-Language | High | “ignorez”, “無視”, “игнорируй” |
| Indirect Injection | Critical | Hidden instructions in retrieved content |
| Browser Agent Attack | Critical | XSS payloads, `document.cookie` |

Plus 14 more categories covering social engineering, gaslighting, logic traps, crescendo attacks, few-shot attacks, completion attacks, and more.
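To illustrate why evasion categories such as FlipAttack, Encoding Evasion, and Unicode Injection need handling beyond plain string matching, here is a minimal sketch of one possible approach: generate several normalized "views" of the input and test each against a pattern. This is not Carapace's actual detection logic. Every name below (`candidateViews`, `looksLikeOverride`, `OVERRIDE_PATTERN`) is hypothetical, and the single regex stands in for a much larger rule set.

```typescript
// Illustrative only: NOT Carapace's implementation.

// Zero-width and invisible characters commonly used to split keywords.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;

// ROT13 is its own inverse, so applying it decodes ROT13-encoded text.
function rot13(text: string): string {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 'A' or 'a'
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// decodeURIComponent throws on malformed escapes, so fall back to the input.
function safeUrlDecode(text: string): string {
  try {
    return decodeURIComponent(text);
  } catch {
    return text;
  }
}

// Produce the variants of the input that will each be checked.
function candidateViews(input: string): string[] {
  const stripped = input.replace(ZERO_WIDTH, "");
  return [
    stripped,                               // invisible characters removed
    stripped.split("").reverse().join(""),  // reversed text
    safeUrlDecode(stripped),                // URL-decoded
    rot13(stripped),                        // ROT13-decoded
  ];
}

// Hypothetical single rule for the Instruction Override category.
const OVERRIDE_PATTERN = /ignore\s+(all\s+)?previous\s+instructions/i;

function looksLikeOverride(input: string): boolean {
  return candidateViews(input).some((view) => OVERRIDE_PATTERN.test(view));
}

console.log(looksLikeOverride("ignore%20previous%20instructions"));
console.log(looksLikeOverride("What is the weather today?"));
```

A production scanner would also need Base64 and hex decoding, nested encodings, and multi-language patterns, all of which this sketch leaves out.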

```bash
# Install
npm install @honeybee-ai/carapace

# Scan from CLI
npx @honeybee-ai/carapace scan "ignore all previous instructions"
```

```js
// In your code
import { scan, isSafe, middleware } from '@honeybee-ai/carapace';

if (!isSafe(userInput)) throw new Error('Injection detected');

// Express middleware
app.use('/api/chat', middleware({ mode: 'block' }));
```

See the full Carapace Quick Start for more.

| Score | Action | Behavior |
| --- | --- | --- |
| 0-19 | PASS | Clean, allow through |
| 20-49 | LOG | Allow but log for review |
| 50-99 | WARN | Allow but warn |
| 100+ | BLOCK | Block, return error |

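The score bands above can be expressed as a simple threshold lookup. The sketch below assumes the score is a non-negative number and uses a hypothetical helper name, `actionForScore`. It mirrors the table only and is not taken from the Carapace API.

```typescript
// The four actions named in the scoring table.
type Action = "PASS" | "LOG" | "WARN" | "BLOCK";

// Hypothetical helper: map a scan score to the action from the table.
// Thresholds are checked from highest to lowest so each band is exclusive.
function actionForScore(score: number): Action {
  if (score >= 100) return "BLOCK"; // 100+: block, return error
  if (score >= 50) return "WARN";   // 50-99: allow but warn
  if (score >= 20) return "LOG";    // 20-49: allow but log for review
  return "PASS";                    // 0-19: clean, allow through
}

for (const score of [0, 19, 20, 49, 50, 99, 100]) {
  console.log(score, actionForScore(score));
}
```

Checking the highest threshold first keeps the boundaries unambiguous: a score of exactly 20, 50, or 100 always falls into the higher band, matching the ranges in the table.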
```text
$ npm ls
@honeybee-ai/carapace@<version>
└── (empty)
```

No node_modules to audit. No transitive dependencies, so no dependency-based supply chain attacks. Every line of code is in the repo and auditable. For a security tool, this matters.

| Feature | Carapace (open source) | Carapace Cloud (managed) |
| --- | --- | --- |
| Scanner library | Yes | Yes |
| Gateway proxy | Yes | Yes |
| MCP proxy | Yes | Yes |
| CLI tool | Yes | Yes |
| eBPF firewall | Yes | Yes |
| Dashboard & analytics | - | Real-time threat monitoring |
| Custom detection rules | DIY | API-managed |
| Webhook alerts | - | Real-time notifications |
| Audit log export | - | CSV/JSON (compliance-ready) |
| Support | Community | Dedicated SLA |