v1.0.0 — Now on npm and PyPI

The trust layer for AI agents.

Surfinguard sits between AI agents and the actions they take. It scores URLs, commands, file operations, API calls, and 14 more action types against 5 risk primitives — catching threats before they cause harm.

Get started View on GitHub

18 analyzers

152 threat patterns

<1ms local latency

CheckResult

DANGER

Brand impersonation: google

MANIPULATION — score 6

Risky TLD: .xyz

MANIPULATION — score 3

Sensitive keywords detected

EXFILTRATION — score 2

{
  "level": "DANGER",
  "score": 9,
  "allow": false,
  "primitive": "MANIPULATION"
}

JavaScript / TypeScript

npm install @surfinguard/sdk

Python

pip install surfinguard

SDKs & Tools

JavaScript

Python

Rust

CLI

MCP Server

3 lines to protect your agent

Sub-millisecond local mode. No network calls needed.

import { Guard } from '@surfinguard/sdk';

const guard = await Guard.create({ mode: 'local' });

// Check before the agent acts
const result = guard.checkUrl('https://g00gle-login.tk/verify');
// => { level: "DANGER", score: 9, allow: false }

const cmd = guard.checkCommand('rm -rf / --no-preserve-root');
// => { level: "DANGER", score: 9, allow: false }

const safe = guard.checkCommand('ls -la');
// => { level: "SAFE", score: 0, allow: true }
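One way to put a check in front of an agent's tool call is a small wrapper that only executes the action when the result allows it. The sketch below is illustrative and does not import the SDK: `CommandChecker`, `Outcome`, `runGuarded`, and `stubCheck` are hypothetical names, and the stub stands in for something like `(cmd) => guard.checkCommand(cmd)`. Only the `level`, `score`, and `allow` fields are taken from the example output above.

```typescript
// Result fields shown in the example output above.
type Level = 'SAFE' | 'CAUTION' | 'DANGER';

interface CheckResult {
  level: Level;
  score: number;
  allow: boolean;
}

// Hypothetical: any function that scores a shell command,
// e.g. (cmd) => guard.checkCommand(cmd).
type CommandChecker = (command: string) => CheckResult;

// Either the action ran and produced a value, or it was blocked.
type Outcome<T> =
  | { ran: true; value: T }
  | { ran: false; result: CheckResult };

// Executes `execute` only when the checker allows the command.
function runGuarded<T>(
  check: CommandChecker,
  command: string,
  execute: (command: string) => T,
): Outcome<T> {
  const result = check(command);
  if (!result.allow) {
    return { ran: false, result };
  }
  return { ran: true, value: execute(command) };
}

// Stand-in checker so the sketch runs without the SDK installed.
// It is NOT how the real analyzers work.
const stubCheck: CommandChecker = (command) =>
  command.includes('rm -rf')
    ? { level: 'DANGER', score: 9, allow: false }
    : { level: 'SAFE', score: 0, allow: true };

const listed = runGuarded(stubCheck, 'ls -la', (cmd) => `ran: ${cmd}`);
const wiped = runGuarded(stubCheck, 'rm -rf / --no-preserve-root', (cmd) => `ran: ${cmd}`);
console.log(listed.ran, wiped.ran); // true false
```

Returning an outcome instead of throwing keeps the blocked result (score and level) available to the caller, which can log it or surface it to the user.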

How it works

Every action is scored against 5 risk primitives. The composite score is the maximum across all primitives.

Install the SDK

Install with npm or pip. Works offline; no API key is needed for local mode.

Check before acting

Pass any agent action to the Guard. Get back a score, level, and reasons in <1ms.

Block or allow

SAFE (0-2), CAUTION (3-6), DANGER (7+). Your policy decides what happens.
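The thresholds above, and the idea that your own policy decides the outcome, can be sketched as a lookup from level to decision. This is a minimal illustration, not the SDK's API: `Decision`, `levelForScore`, `defaultPolicy`, and `decide` are hypothetical names, and the "confirm" step for CAUTION is an assumed policy choice.

```typescript
type Level = 'SAFE' | 'CAUTION' | 'DANGER';

// Hypothetical decisions an agent runtime might support.
type Decision = 'allow' | 'confirm' | 'block';

// Thresholds from the text: SAFE 0-2, CAUTION 3-6, DANGER 7+.
function levelForScore(score: number): Level {
  if (score >= 7) return 'DANGER';
  if (score >= 3) return 'CAUTION';
  return 'SAFE';
}

// Assumed default: ask a human before CAUTION actions, refuse DANGER ones.
const defaultPolicy: Record<Level, Decision> = {
  SAFE: 'allow',
  CAUTION: 'confirm',
  DANGER: 'block',
};

// Maps a score to a decision under the given policy.
function decide(
  score: number,
  policy: Record<Level, Decision> = defaultPolicy,
): Decision {
  return policy[levelForScore(score)];
}

console.log(decide(0)); // allow
console.log(decide(4)); // confirm
console.log(decide(9)); // block
```

A stricter deployment could pass its own table, for example one that maps CAUTION to `'block'`, without touching the threshold logic.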

5 risk primitives

Every threat maps to one of five dimensions. Scores are additive within each primitive, capped at 10.

DESTRUCTION

Detects rm -rf, DROP TABLE, disk formatting, and other irreversible operations.

EXFILTRATION

Catches data theft: SSH key reads, credential forwarding, DNS tunneling, cloud metadata access.

ESCALATION

Identifies privilege escalation: sudo abuse, container escape, IAM modifications.

PERSISTENCE

Finds backdoors: crontab writes, shell config modifications, SSH authorized_keys injection.

MANIPULATION

Detects social engineering: brand impersonation, prompt injection, phishing URLs.

Analyzers

URLs, commands, text, files, API calls, SQL, code, messages, transactions, auth, git, UI, infra, agent-comm, data pipelines, documents, IoT.
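The scoring rules described on this page (scores add up within each primitive, each primitive is capped at 10, and the composite is the maximum across primitives) can be sketched as follows. The `Finding` shape, the function names, and the example findings are hypothetical and are not the SDK's API; the sketch also assumes individual scores are non-negative.

```typescript
type Primitive =
  | 'DESTRUCTION'
  | 'EXFILTRATION'
  | 'ESCALATION'
  | 'PERSISTENCE'
  | 'MANIPULATION';

const PRIMITIVES: Primitive[] = [
  'DESTRUCTION',
  'EXFILTRATION',
  'ESCALATION',
  'PERSISTENCE',
  'MANIPULATION',
];

// Hypothetical shape for one matched threat pattern.
interface Finding {
  primitive: Primitive;
  score: number; // assumed non-negative
  reason: string;
}

const PRIMITIVE_CAP = 10;

// Sums scores within each primitive, capping each total at 10.
function scoreByPrimitive(findings: Finding[]): Record<Primitive, number> {
  const totals: Record<Primitive, number> = {
    DESTRUCTION: 0,
    EXFILTRATION: 0,
    ESCALATION: 0,
    PERSISTENCE: 0,
    MANIPULATION: 0,
  };
  for (const finding of findings) {
    totals[finding.primitive] = Math.min(
      totals[finding.primitive] + finding.score,
      PRIMITIVE_CAP,
    );
  }
  return totals;
}

// Composite score is the maximum across primitives; also reports which one.
function compositeScore(
  findings: Finding[],
): { score: number; primitive: Primitive | null } {
  const totals = scoreByPrimitive(findings);
  let score = 0;
  let primitive: Primitive | null = null;
  for (const candidate of PRIMITIVES) {
    if (totals[candidate] > score) {
      score = totals[candidate];
      primitive = candidate;
    }
  }
  return { score, primitive };
}

// Made-up findings: DESTRUCTION sums to 12 and is capped at 10.
const exampleFindings: Finding[] = [
  { primitive: 'DESTRUCTION', score: 7, reason: 'recursive delete' },
  { primitive: 'DESTRUCTION', score: 5, reason: 'targets filesystem root' },
  { primitive: 'ESCALATION', score: 4, reason: 'runs under sudo' },
];

console.log(compositeScore(exampleFindings));
// { score: 10, primitive: 'DESTRUCTION' }
```

Taking the maximum rather than the sum across primitives means several low-severity findings in different dimensions do not add up to a DANGER verdict on their own.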

Pricing

Free to start. Scale when you need to.

Free

For experimenting

✓ 10 req/min
✓ 1K req/day
✓ All 18 analyzers
✓ Local mode (unlimited)


Developer

$29/mo

For shipping products

✓ 60 req/min
✓ 50K req/day
✓ LLM enhancement
✓ Session tracking
✓ Webhooks


Enterprise

Custom

For organizations

✓ 1K req/min
✓ 5M req/day
✓ SSO / SAML / OIDC
✓ Org RBAC & audit log
✓ On-prem Docker
✓ SLA & support

FAQ

Quick answers. No fluff.

Do I need an API key?

No. Local mode runs entirely in your process with zero network calls. API mode requires a key for cloud features like LLM enhancement, sessions, and webhooks.

What action types are supported?

18 types: URLs, commands, text (prompt injection), file read/write, API calls, SQL queries, code execution, messages, transactions, auth, git, UI actions, infrastructure, agent-to-agent communication, data pipelines, documents, and IoT commands.

How fast is it?

Local mode: sub-millisecond. All pattern matching runs in-process via WASM. No network roundtrip needed.

Can I self-host?

Yes. Docker on-prem deployment uses SQLite instead of Cloudflare D1. Same API, your infrastructure. See the self-hosting docs.