Surfinguard
The trust layer for AI agents
v1.0.0 — Now on npm and PyPI

Surfinguard sits between AI agents and the actions they take. It scores URLs, commands, file operations, API calls, and 14 more action types against 5 risk primitives — catching threats before they cause harm.

18 analyzers
152 threat patterns
<1ms local latency
CheckResult: DANGER
  • Brand impersonation: google (MANIPULATION, score 6)
  • Risky TLD: .xyz (MANIPULATION, score 3)
  • Sensitive keywords detected (EXFILTRATION, score 2)
{
  "level": "DANGER",
  "score": 10,
  "allow": false,
  "primitive": "MANIPULATION"
}
JavaScript / TypeScript
npm install @surfinguard/sdk
Python
pip install surfinguard

SDKs & Tools

JavaScript
Python
Go
Rust
CLI
MCP Server

3 lines to protect your agent

Sub-millisecond local mode. No API calls needed.

import { Guard } from '@surfinguard/sdk';

const guard = await Guard.create({ mode: 'local' });

// Check before the agent acts
const result = guard.checkUrl('https://g00gle-login.tk/verify');
// => { level: "DANGER", score: 9, allow: false }

const cmd = guard.checkCommand('rm -rf / --no-preserve-root');
// => { level: "DANGER", score: 9, allow: false }

const safe = guard.checkCommand('ls -la');
// => { level: "SAFE", score: 0, allow: true }
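The checks above can gate what an agent actually runs. The following is a minimal sketch, not part of the SDK: `runIfAllowed` and `stubChecker` are hypothetical names, and the `CheckResult` shape is assumed from the fields shown in the examples (`level`, `score`, `allow`). The stub stands in for a real guard so the snippet runs on its own.

```typescript
// Result shape assumed from the examples above (level, score, allow).
type Level = "SAFE" | "CAUTION" | "DANGER";

interface CheckResult {
  level: Level;
  score: number;
  allow: boolean;
}

// Structural interface: any object with a matching checkCommand method fits.
interface CommandChecker {
  checkCommand(command: string): CheckResult;
}

type Outcome<T> =
  | { executed: true; value: T }
  | { executed: false; result: CheckResult };

// Hypothetical helper: call `execute` only when the checker allows the command.
function runIfAllowed<T>(
  checker: CommandChecker,
  command: string,
  execute: (command: string) => T
): Outcome<T> {
  const result = checker.checkCommand(command);
  if (!result.allow) {
    return { executed: false, result };
  }
  return { executed: true, value: execute(command) };
}

// Stand-in checker for illustration only; this is NOT the SDK's detection logic.
const stubChecker: CommandChecker = {
  checkCommand(command: string): CheckResult {
    return /\brm\s+-rf\b/.test(command)
      ? { level: "DANGER", score: 9, allow: false }
      : { level: "SAFE", score: 0, allow: true };
  },
};

const blocked = runIfAllowed(stubChecker, "rm -rf / --no-preserve-root", (c) => `ran: ${c}`);
const allowed = runIfAllowed(stubChecker, "ls -la", (c) => `ran: ${c}`);

console.log(blocked.executed); // false
console.log(allowed.executed); // true
```

Because `CommandChecker` is structural, an object exposing a compatible `checkCommand` method could be passed in place of the stub.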

How it works

Every action is scored against 5 risk primitives. The composite score is the maximum across all primitives.

01 Install the SDK
npm, pip, or go get. Works offline: no API key is needed for local mode.
02 Check before acting
Pass any agent action to the Guard. Get back a score, level, and reasons in <1ms.
03 Block or allow
SAFE (0-2), CAUTION (3-6), DANGER (7+). Your policy decides what happens.
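As a rough illustration of step 03, the sketch below maps a score to the levels listed above and then to a decision. The thresholds come from the page; the `Decision` values and `examplePolicy` table are hypothetical, since the page leaves the policy up to you.

```typescript
type Level = "SAFE" | "CAUTION" | "DANGER";

// Hypothetical decision names; pick whatever your agent runtime understands.
type Decision = "allow" | "confirm" | "block";

// Thresholds as listed above: SAFE (0-2), CAUTION (3-6), DANGER (7+).
function levelForScore(score: number): Level {
  if (score >= 7) return "DANGER";
  if (score >= 3) return "CAUTION";
  return "SAFE";
}

// Example policy table (an assumption): ask a human on CAUTION, block on DANGER.
const examplePolicy: Record<Level, Decision> = {
  SAFE: "allow",
  CAUTION: "confirm",
  DANGER: "block",
};

function decide(
  score: number,
  policy: Record<Level, Decision> = examplePolicy
): Decision {
  return policy[levelForScore(score)];
}

console.log(levelForScore(0), decide(0)); // SAFE allow
console.log(levelForScore(5), decide(5)); // CAUTION confirm
console.log(levelForScore(9), decide(9)); // DANGER block
```

Keeping the policy as a plain table makes it easy to tighten, for example by mapping CAUTION to "block" for unattended agents.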

5 risk primitives

Every threat maps to one of five dimensions. Scores are additive within each primitive, capped at 10.

DESTRUCTION: Detects rm -rf, DROP TABLE, disk formatting, and other irreversible operations.
EXFILTRATION: Catches data theft, including SSH key reads, credential forwarding, DNS tunneling, and cloud metadata access.
ESCALATION: Identifies privilege escalation, including sudo abuse, container escape, and IAM modifications.
PERSISTENCE: Finds backdoors, including crontab writes, shell config modifications, and SSH authorized_keys injection.
MANIPULATION: Detects social engineering, including brand impersonation, prompt injection, and phishing URLs.
18 analyzers: URLs, commands, text, files (read and write), API calls, SQL, code, messages, transactions, auth, git, UI, infra, agent-comm, data pipelines, documents, IoT.
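The scoring rules described in this section (scores add up within a primitive, each primitive is capped at 10, and the composite is the maximum across primitives) can be sketched as follows. This is an illustration of the stated rules only, not the SDK's implementation; the `Reason` type, function names, and sample reasons are made up for the example, and scores are assumed to be non-negative.

```typescript
type Primitive =
  | "DESTRUCTION"
  | "EXFILTRATION"
  | "ESCALATION"
  | "PERSISTENCE"
  | "MANIPULATION";

// Hypothetical shape for one matched threat pattern.
interface Reason {
  primitive: Primitive;
  score: number; // assumed non-negative
  label: string;
}

const PRIMITIVES: Primitive[] = [
  "DESTRUCTION",
  "EXFILTRATION",
  "ESCALATION",
  "PERSISTENCE",
  "MANIPULATION",
];

const PRIMITIVE_CAP = 10;

// Sum scores per primitive, capping each primitive at 10.
function scoreByPrimitive(reasons: Reason[]): Record<Primitive, number> {
  const totals: Record<Primitive, number> = {
    DESTRUCTION: 0,
    EXFILTRATION: 0,
    ESCALATION: 0,
    PERSISTENCE: 0,
    MANIPULATION: 0,
  };
  for (const reason of reasons) {
    totals[reason.primitive] = Math.min(
      totals[reason.primitive] + reason.score,
      PRIMITIVE_CAP
    );
  }
  return totals;
}

// Composite score: the maximum across all primitives.
function compositeScore(totals: Record<Primitive, number>): number {
  return Math.max(...PRIMITIVES.map((p) => totals[p]));
}

// Made-up reasons for illustration.
const exampleReasons: Reason[] = [
  { primitive: "DESTRUCTION", score: 7, label: "recursive delete" },
  { primitive: "DESTRUCTION", score: 5, label: "targets filesystem root" },
  { primitive: "EXFILTRATION", score: 2, label: "sensitive keywords" },
];

const totals = scoreByPrimitive(exampleReasons);
console.log(totals.DESTRUCTION); // 10 (7 + 5, capped)
console.log(totals.EXFILTRATION); // 2
console.log(compositeScore(totals)); // 10
```

Taking the maximum rather than a sum means one severe primitive dominates the result, while several unrelated low-risk signals do not add up to a block on their own.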

Pricing

Free to start. Scale when you need to.

Free
Free
For experimenting
  • 10 req/min
  • 1K req/day
  • All 18 analyzers
  • Local mode (unlimited)
Developer
$29/mo
For shipping products
  • 60 req/min
  • 50K req/day
  • LLM enhancement
  • Session tracking
  • Webhooks
Enterprise
Custom
For organizations
  • 1K req/min
  • 5M req/day
  • SSO / SAML / OIDC
  • Org RBAC & audit log
  • On-prem Docker
  • SLA & support

FAQ

Quick answers. No fluff.

Do I need an API key?
No. Local mode runs entirely in your process with zero network calls. API mode requires a key for cloud features like LLM enhancement, sessions, and webhooks.
What action types are supported?
18 types: URLs, commands, text (prompt injection), file read/write, API calls, SQL queries, code execution, messages, transactions, auth, git, UI actions, infrastructure, agent-to-agent communication, data pipelines, documents, and IoT commands.
How fast is it?
Local mode is sub-millisecond. All pattern matching runs in-process via WASM, with no network round trip.
Can I self-host?
Yes. Docker on-prem deployment uses SQLite instead of Cloudflare D1. Same API, your infrastructure. See the self-hosting docs.