keyhog · 2026-05-28

Meet keyhog: a GPU-accelerated, open-source secret scanner.

keyhog is written in Rust. One install command, one binary, one SARIF file your CI already knows how to read. We built it because the scanners we were using were either fast or accurate, and we got tired of picking one.

Install it

linux / macos

curl -fsSL https://raw.githubusercontent.com/santhsecurity/keyhog/main/install.sh | sh

windows (powershell)

iwr https://raw.githubusercontent.com/santhsecurity/keyhog/main/install.ps1 -useb | iex

from source

git clone https://github.com/santhsecurity/keyhog && cd keyhog && cargo build --release

Then point it at a tree:

keyhog scan .

What it catches

Most detectors are tied to a specific service rather than a generic "this looks like a token" match. AWS, GitHub, Slack, Stripe, OpenAI, Anthropic, and the major cloud providers are covered, along with Twilio, SendGrid, Notion, Linear, PagerDuty, Datadog, Snowflake, and Databricks. So are the structural formats: Postgres connection strings, JWT bearer tokens, SSH private keys. When no service-specific detector matches, entropy detectors catch high-entropy strings that don't belong in source.
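To make the entropy fallback concrete, here is a minimal sketch of a Shannon-entropy check. It is not keyhog's implementation: the 20-character minimum, the 4.0 bits-per-character cutoff, and the function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical thresholds for illustration; keyhog's real values are not stated here.
MIN_LENGTH = 20
MIN_BITS_PER_CHAR = 4.0

# Candidate tokens: unbroken runs of characters commonly seen in keys.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{%d,}" % MIN_LENGTH)


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def high_entropy_tokens(line: str) -> list[str]:
    """Return the tokens in line that are long enough and look random."""
    return [
        token
        for token in TOKEN_RE.findall(line)
        if shannon_entropy(token) >= MIN_BITS_PER_CHAR
    ]
```

A repeated character scores 0 bits and a string of all-distinct characters scores log2 of its length, so the cutoff separates padding and identifiers from random-looking material. Real scanners usually pair this with context checks, because hashes and UUIDs are also high-entropy.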

A scan reports findings like this:

┌    CRITICAL ─── OpenAI API Key
│ Secret:     sk-9...M8vZ
│ Location:   src/llm/client.ts:4
│ Confidence: ■■■■■■ 100%
│ Action:     Revoke immediately and rotate.
└─────────────────────────────────────────────

┌    CRITICAL ─── AWS Access Key
│ Secret:     AKIA...JX7Q
│ Location:   infra/terraform/main.tf:142
│ Confidence: ■■■■■■ 100%
│ Action:     Revoke immediately and rotate.
└─────────────────────────────────────────────

Severities run from CRITICAL (a live credential that grants control of a paid production account) down to CLIENT-SAFE (a key that is public by design, like a Sentry DSN). Each tier has its own exit code within the same scan, so a CI gate can fail on CRITICAL and HIGH without breaking on a Sentry DSN that is meant to ship in your client bundle.

Live verification: is this key actually live?

Most scanners stop at the pattern match. keyhog can check with the provider instead. Pass --verify and every match with a known liveness endpoint gets one HTTP probe before it's printed:

keyhog scan . --verify

Each finding then carries a verdict. LIVE means the provider answered "yes, this key works right now". REVOKED means the provider knows the key and has explicitly disabled it. DEAD means the provider rejected it outright. UNVERIFIED means the detector has no probe (yet), so the finding stands on detection alone and is never silently upgraded.

┌    CRITICAL ─── AWS Access Key
│ Secret:       AKIA...JX7Q
│ Location:     infra/terraform/main.tf:142
│ Confidence:   ■■■■■■ 100%
│ Verification: LIVE  (sts:GetCallerIdentity → 200, account 123…)
│ Action:       Revoke immediately and rotate.
└─────────────────────────────────────────────

Many detectors carry a verifier today (AWS STS, GitHub /user, Slack auth.test, OpenAI /v1/models, Stripe balance, Cloudflare token-verify, Anthropic /v1/messages with empty body, Twilio account fetch, and more).
SSRF-safe by construction. Verifiers can't be redirected at internal IPs (RFC1918 / link-local / loopback are blocked at the request layer), can't be aimed at arbitrary hosts (domain allowlist per detector), and are rate-limited so a scan of a large tree full of leaked keys won't flood a vendor with requests.
OOB. Out-of-band probes cover the credentials that need a Burp-Collaborator-style callback rather than a direct HTTP request.
Cached and idempotent. Re-scanning a tree with the same key hits the verifier cache, not the vendor. A pre-commit hook on a fast-changing repo can verify every commit without spamming the AWS STS quota.
Class-separated exit code. exit 10 is reserved for "one or more LIVE credentials found", distinct from the regular finding code. CI gates can block a deploy on LIVE keys while letting an unverified pattern match through.
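The SSRF guard described above can be sketched in a few lines. This is an illustration of the idea, not keyhog's request layer: ALLOWED_HOSTS, probe_allowed, and the HTTPS-only rule are assumptions, and the caller is assumed to pass in the addresses the hostname actually resolved to.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical per-detector allowlist; the real lists belong to keyhog's detectors.
ALLOWED_HOSTS = {
    "github": {"api.github.com"},
    "slack": {"slack.com"},
}


def is_blocked_ip(addr: str) -> bool:
    """True for private (RFC 1918 and similar), loopback, and link-local addresses."""
    ip = ipaddress.ip_address(addr)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def probe_allowed(detector: str, url: str, resolved_ips: list[str]) -> bool:
    """Decide whether a verifier probe may be sent to url."""
    parts = urlsplit(url)
    # Assumption: verifiers only ever speak HTTPS.
    if parts.scheme != "https":
        return False
    # The host must be on this detector's allowlist.
    if parts.hostname not in ALLOWED_HOSTS.get(detector, set()):
        return False
    # Check every resolved address, so an allowlisted name that resolves
    # to an internal IP is still refused.
    return bool(resolved_ips) and not any(is_blocked_ip(ip) for ip in resolved_ips)
```

The same check has to run again on every redirect hop. Otherwise an allowlisted host could bounce the probe to an internal address.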

How it stays accurate

A scanner that fires on everything gets switched off, and one that's switched off finds nothing. So most of the work here goes into keeping false positives down. Some of the mechanisms:

Companion-required validation. An AWS access key without its 40-char secret? Skipped. A Twilio API key without its auth token? Skipped. Noisy detectors need two signals before firing, which clears the common git log -G ghp_ false-positive cluster.
Decode-through scanning. keyhog decodes Kubernetes Secret manifests, JWT payloads, base64-wrapped envs, helm values, and docker-config auth: blobs in place, then scans the plaintext. A key hidden one base64 layer down is still a key.
Multiline reassembly. keyhog reassembles "sk-proj-" + \ continuations in JavaScript, YAML multi-line strings, Makefile backslash-continuation, and Helm / Jinja templated output before it matches, so a secret split across lines doesn't slip through.
Confidence that reflects reality. The badge on each finding isn't a fixed number per detector. It reflects how reliable that detector has been on real inputs, so a noisy pattern reads lower than a precise one and you know where to look first.
Context-aware suppression. A key inside a Markdown fence labelled "do not use", or a comment that says "example", is documentation, not a leak. keyhog reads the surrounding context instead of firing on the match alone.
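Of the mechanisms above, decode-through scanning is the easiest to show in miniature. The sketch below handles one base64 layer and uses a single detector (the well-known AKIA shape of an AWS access key ID). The function names and the 16-character minimum run length are assumptions, and keyhog's own decoders cover far more formats.

```python
import base64
import binascii
import re

# Well-known shape of an AWS access key ID, used here as the only detector.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Runs that could be standard base64; the length is checked below.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_layers(text: str):
    """Yield the plaintext of each base64 run in text that decodes to UTF-8."""
    for match in B64_RUN.finditer(text):
        blob = match.group(0)
        if len(blob) % 4:
            continue  # not a whole number of base64 quanta
        try:
            yield base64.b64decode(blob, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not base64, or not text once decoded


def scan(text: str) -> list[str]:
    """Match the raw text first, then every decodable base64 layer inside it."""
    hits = AWS_KEY_ID.findall(text)
    for plain in decoded_layers(text):
        hits.extend(AWS_KEY_ID.findall(plain))
    return hits
```

Calling scan on a Kubernetes-style line such as `key: QUtJQUlPU0ZPRE5ON0VYQU1QTEU=` reports the AWS documentation example key, which the raw pattern alone would miss.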

How fast

keyhog routes each scan to the fastest backend on the host. No flags, no config: it uses SIMD on the CPU, and the GPU when a file is large enough to be worth the dispatch cost.
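"Large enough to be worth the dispatch cost" is a break-even calculation. The sketch below uses a deliberately simple cost model: CPU time is size divided by throughput, and GPU time is a fixed dispatch cost plus size divided by throughput. The model, the function names, and every number you feed it are assumptions; keyhog's actual routing heuristics are not described here.

```python
def break_even_bytes(
    dispatch_cost_s: float,
    cpu_bytes_per_s: float,
    gpu_bytes_per_s: float,
) -> float:
    """File size above which the GPU path wins under the simple model.

    cpu_time = size / cpu_bytes_per_s
    gpu_time = dispatch_cost_s + size / gpu_bytes_per_s
    """
    if gpu_bytes_per_s <= cpu_bytes_per_s:
        return float("inf")  # the GPU never catches up
    return dispatch_cost_s / (1.0 / cpu_bytes_per_s - 1.0 / gpu_bytes_per_s)


def choose_backend(size_bytes: int, gpu_available: bool, threshold_bytes: float) -> str:
    """Pick a backend name for one file; falls back to SIMD without a GPU."""
    if gpu_available and size_bytes >= threshold_bytes:
        return "gpu"
    return "simd"
```

With a hypothetical 1 ms dispatch cost, 1 GB/s on SIMD, and 10 GB/s on the GPU, the break-even point is a little over 1.1 MB. Under those made-up numbers a 4 KiB config file stays on the CPU and a 50 MB blob goes to the GPU.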

On a full Linux kernel checkout, a scan with verification off finishes in a couple of seconds on the GPU path and a few more on SIMD. The findings are the same either way, so the GPU path is the default rather than something you have to opt into.

Daemon mode: near-instant re-scans

Every keyhog invocation pays a short cold start to compile its detectors into the Hyperscan / Vyre automata. For a one-shot CI scan that's invisible. For a pre-commit hook on a small diff it's a tax you pay every time. Run keyhog as a long-lived daemon and you pay it once per host - every scan after that returns almost instantly:

keyhog daemon start                    # Unix socket on $XDG_RUNTIME_DIR
keyhog scan --stdin --daemon < .env    # near-instant; no per-scan cold start
keyhog daemon status
keyhog daemon stop

Drop it into a pre-commit hook (keyhog hook install), an IDE save handler, or a per-commit CI loop and secret scanning stops costing anything you would notice. keyhog watch ./src goes one step further: near-instant findings on every save via inotify / FSEvents / ReadDirectoryChangesW.

System-wide credential triage

Tree-scoped scans are for repos. For incident response, M&A inheritance audits, and quarterly developer-laptop sweeps, keyhog has a triage mode that walks every mounted drive on the host, skips pseudo-filesystems (/proc, /sys, tmpfs, nsfs, fuse.snapfuse), auto-discovers every .git (worktrees + bare repos + submodules), and runs the full scan plus git-history pipeline under a hard space ceiling:

sudo keyhog scan-system --space 50G                  # default 50 GiB ceiling
sudo keyhog scan-system --space 1T --include-network # also NFS / SMB
sudo keyhog scan-system --space 10G --no-git-history # skip historical blobs

Exits 1 on findings. Pair with --verify and the report lists exactly the keys that are still live right now across every drive on the box, which is usually what an IR or M&A audit wants to know.
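The mount-filtering step can be illustrated with a small parser for text in /proc/mounts format. This is a sketch, not keyhog's walker: the function name and the NETWORK_FS set are assumptions, and the pseudo-filesystem set simply mirrors the types listed above (/proc mounts as type proc, /sys as type sysfs).

```python
# Filesystem types the text lists as skipped.
PSEUDO_FS = {"proc", "sysfs", "tmpfs", "nsfs", "fuse.snapfuse"}
# Hypothetical set of network filesystem types, walked only on request.
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3"}


def scannable_mounts(mounts_text: str, include_network: bool = False) -> list[str]:
    """Return the mount points worth walking, given /proc/mounts-style text.

    Each line has the form: device mount_point fs_type options dump pass.
    Mount points containing spaces appear with \\040 escapes, which this
    sketch leaves undecoded.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        mount_point, fs_type = fields[1], fields[2]
        if fs_type in PSEUDO_FS:
            continue
        if fs_type in NETWORK_FS and not include_network:
            continue
        result.append(mount_point)
    return result
```

On a Linux host the input would come from reading /proc/mounts. Passing include_network=True plays the role that --include-network plays on the command line.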

Lockdown mode for high-trust hosts

For deployments where keyhog runs on the same machine that holds the secrets - paired with EnvSeal on a vault host, say - the scanner itself is in the trust boundary. --lockdown hardens it:

mlockall on Linux so credentials never page to swap.
PR_SET_DUMPABLE = 0 (always on, even outside lockdown) - core dumps, ptrace, and /proc/<pid>/mem reads are disabled. macOS gets PT_DENY_ATTACH.
Refuses to run if ~/.cache/keyhog/* exists, refuses --incremental writes, refuses --verify, refuses --show-secrets, refuses to start if the kernel's coredump_filter would dump anonymous pages.

The always-on hardening (PR_SET_DUMPABLE = 0 on Linux, PT_DENY_ATTACH on macOS) applies to every invocation, so a stock keyhog binary can't be coredumped or ptraced. Lockdown adds mlockall and the strict refusals on top.
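The coredump_filter refusal comes down to a bitmask test. On Linux, /proc/<pid>/coredump_filter holds a hexadecimal mask in which bit 0 selects anonymous private mappings and bit 1 selects anonymous shared mappings. The sketch below shows that check on its own; the function name is an assumption, and how keyhog performs the check is not described here.

```python
# Bits 0 and 1 of coredump_filter select anonymous private and anonymous
# shared mappings respectively (see core(5)).
ANON_PRIVATE = 0x1
ANON_SHARED = 0x2


def dumps_anonymous_pages(filter_text: str) -> bool:
    """True if a core dump taken under this filter would include anonymous memory."""
    mask = int(filter_text.strip(), 16)
    return bool(mask & (ANON_PRIVATE | ANON_SHARED))
```

The usual kernel default is 00000033, which has both bits set, so a lockdown-style check would refuse to start until the filter is tightened. On a live system the input is the contents of /proc/self/coredump_filter.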

How you use it day-to-day

One-shot scan. keyhog scan . walks the tree and emits human-readable or JSON output.
Live TUI. keyhog tui . streams findings as they land, with stats and the active backend visible.
Verify. keyhog scan . --verify probes credentials that expose a status endpoint. See above.
SARIF for GitHub Code Scanning. keyhog scan . --format sarif --output keyhog.sarif drops straight into the Security tab via the standard github/codeql-action/upload-sarif step.
Pre-commit + CI. keyhog hook install wires a git pre-commit; the in-tree composite Action santhsecurity/keyhog/.github/actions/keyhog@main runs the same scanner in CI with SARIF upload and class-separated exit codes.
Library. The same crates that power the CLI live in crates/core, crates/scanner, crates/sources, crates/verifier and are consumable as workspace deps today; standalone crates.io publishing lands with the next tagged release.
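If you drive keyhog from your own CI script rather than the composite Action, the class-separated exit codes can be turned into a gate like the one sketched below. Only exit code 10 (one or more LIVE credentials) comes from this post. Treating every other nonzero status as "findings, none confirmed live" is an assumption, as are the function names, so check the release notes for the full table before relying on it.

```python
import subprocess

LIVE_EXIT_CODE = 10  # stated in the text: one or more LIVE credentials found


def gate(returncode: int, block_unverified: bool = False) -> str:
    """Map a keyhog exit status to a CI decision: 'pass', 'warn', or 'block'."""
    if returncode == 0:
        return "pass"
    if returncode == LIVE_EXIT_CODE:
        return "block"
    # Assumption: any other nonzero status means findings that were not
    # confirmed live. A crash would also land here, so stricter pipelines
    # may prefer block_unverified=True.
    return "block" if block_unverified else "warn"


def run_gate(path: str = ".") -> str:
    """Run a verified scan with the flags shown in the text and gate on the result."""
    proc = subprocess.run(["keyhog", "scan", path, "--verify"])
    return gate(proc.returncode)
```

Keeping gate as a pure function makes the policy easy to unit-test without a keyhog binary on the test machine.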

Under the hood, briefly

The GPU path is built on Vyre, our own GPU compute substrate. You write an ordinary Rust state machine once, and Vyre runs it on NVIDIA, AMD, or Apple hardware while producing the same results as the CPU. That last part is what makes the GPU safe to use by default: if the fast path and the slow path could disagree, you couldn't trust the fast one. keyhog is the first tool we've shipped on it.

Try it

github.com/santhsecurity/keyhog - MIT licensed, single binary, no telemetry, no network calls unless you pass --verify. Release notes for every version live at github.com/santhsecurity/keyhog/releases.