Quick start
After cargo install nyx-scanner (or dropping a release binary on your PATH), point Nyx at a directory:
nyx scan ./my-project
First run builds a SQLite index under .nyx/; later runs skip files whose content hash hasn’t changed.
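The skip-unchanged behaviour can be pictured with a small sketch. This is not Nyx's implementation: the hash function (SHA-256), the plain dict standing in for the SQLite index, and the names `file_digest` and `files_to_rescan` are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash the file contents. SHA-256 is an assumption, not Nyx's documented choice."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_rescan(paths, index):
    """Return the paths whose content hash differs from the stored one.

    `index` is a dict (path string -> digest) standing in for the SQLite index;
    it is updated in place so the next call sees the new hashes.
    """
    changed = []
    for path in paths:
        digest = file_digest(path)
        if index.get(str(path)) != digest:
            changed.append(path)
            index[str(path)] = digest
    return changed
```

On a first call every file is returned because the index is empty; a second call with unchanged files returns an empty list.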
What a finding looks like

A scan’s console output:
/tmp/demo/cmdi_direct.py
6:5 ✖ [HIGH] taint-unsanitised-flow (source 5:11) (Score: 81, Confidence: High)
Unsanitised user input flows from request.args.get → os.system
Source: request.args.get (5:11)
Sink: os.system
6:5 ✖ [HIGH] py.cmdi.os_system (Score: 64, Confidence: High)
os.system() runs a shell command
/tmp/demo/xss_document_write.js
5:5 ✖ [HIGH] taint-unsanitised-flow (source 3:18) (Score: 81, Confidence: High)
Unsanitised user input flows from req.query.content → document.write
Source: req.query.content (3:18)
Sink: document.write
5:5 ⚠ [MEDIUM] js.xss.document_write (Score: 34, Confidence: High)
document.write() is an XSS sink
warning 'demo' generated 10 issues.
Finished in 0.054s.
Each finding is one line of header plus evidence. Fields that matter:
| Field | Meaning |
|---|---|
| `[HIGH]` / `[MEDIUM]` / `[LOW]` | Severity after the non-prod downgrade |
| Rule ID | Either a taint rule (taint-unsanitised-flow), a structural rule (cfg-*, state-*), or an AST pattern (<lang>.<category>.<name>) |
| Score | Attack-surface ranking (severity + analysis kind + source kind + evidence). Higher is more exploitable |
| Confidence | High, Medium, Low. Drops for AST-only matches, capped widened flows, and lowered-to-Low backwards-infeasible findings |
| Source / Sink | Where tainted data entered and where the dangerous call happened |
Two rules firing on the same line (the taint finding plus the AST pattern) is normal. The pattern matches the structural presence of document.write; the taint rule adds the evidence that req.query.content actually reached it. Both carry distinct rule IDs so suppressions can target one without the other.
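To make the header fields concrete, here is a rough sketch that pulls them out of a console header line like the ones above. The regular expression is inferred only from the sample output on this page and the name `parse_header` is hypothetical; the console layout is not a documented interface, so `--format json` is likely the better input for real tooling.

```python
import re

# Pattern inferred from the sample console output; not a stable format guarantee.
HEADER = re.compile(
    r"^\s*(?P<line>\d+):(?P<col>\d+)\s+\S+\s+\[(?P<severity>[A-Z]+)\]\s+"
    r"(?P<rule>\S+)"
    r"(?:\s+\(source (?P<src_line>\d+):(?P<src_col>\d+)\))?"
    r"\s+\(Score: (?P<score>\d+), Confidence: (?P<confidence>\w+)\)\s*$"
)


def parse_header(text):
    """Return the header fields as a dict, or None if the line is not a header."""
    match = HEADER.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    source = None
    if fields["src_line"] is not None:
        # Only taint findings in the sample carry a "(source L:C)" part.
        source = (int(fields["src_line"]), int(fields["src_col"]))
    return {
        "line": int(fields["line"]),
        "col": int(fields["col"]),
        "severity": fields["severity"],
        "rule": fields["rule"],
        "source": source,
        "score": int(fields["score"]),
        "confidence": fields["confidence"],
    }
```

Evidence lines such as `Source: ...` and `Sink: ...` do not match the pattern and return `None`, which is how a caller could tell headers from evidence.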
Fail a CI job on High findings
nyx scan . --fail-on HIGH --quiet
Nyx exits with code 1 if any HIGH finding remains. --quiet drops the “Using default configuration” banner so CI logs stay tidy.
Emit SARIF for GitHub Code Scanning
nyx scan . --format sarif > results.sarif
Full SARIF schema and GitHub Actions wiring: cli.md and output.md.
Tighten the gate
# Only HIGH findings
nyx scan . --severity HIGH
# HIGH + MEDIUM
nyx scan . --severity ">=MEDIUM"
# Drop anything below Medium confidence (useful for CI)
nyx scan . --min-confidence medium
# Also drop findings the engine could not fully resolve (widened / bailed)
nyx scan . --require-converged
--require-converged keeps under-report findings (the emitted flow is still real) but drops over-reports and widenings. Intended for strict gates where a noisy finding is worse than nothing.
Skip dataflow for a fast first pass
nyx scan . --mode ast
AST-only mode runs tree-sitter patterns without building a CFG or running taint. It’s fast and still catches banned-API uses, weak crypto, and obvious XSS sinks, but it can’t tell eval("1+1") apart from eval(userInput). Use it as a pre-commit filter, not as a CI gate replacement.
Next
- CLI reference for every flag and subcommand.
- Configuration for the `nyx.conf` / `nyx.local` schema, profiles, and custom rules.
- `nyx serve` for the browser UI, triage workflow, and scan history.
- Language maturity for per-language tier and known FP/FN patterns.
Installation
For the happy path (cargo install nyx-scanner, release binary on PATH), see the README. This page covers platform-specific notes and upgrade paths.
Supported platforms
Release binaries are published for:
| Platform | Archive |
|---|---|
| Linux x86_64 | nyx-x86_64-unknown-linux-gnu.zip |
| macOS Intel | nyx-x86_64-apple-darwin.zip |
| macOS Apple Silicon | nyx-aarch64-apple-darwin.zip |
| Windows x86_64 | nyx-x86_64-pc-windows-msvc.zip |
Building from source works on any target supported by stable Rust 1.88 or later (edition 2024).
Verify the download
Each release attaches a SHA256SUMS file. When the maintainer signs the release, a detached SHA256SUMS.asc is published alongside it.
# Verify the checksum file's signature (skip if .asc isn't present)
gpg --verify SHA256SUMS.asc SHA256SUMS
# Then check your archive against it
sha256sum -c SHA256SUMS --ignore-missing
If sha256sum is missing on macOS, shasum -a 256 -c SHA256SUMS --ignore-missing is equivalent.
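Where neither tool is available, the same check can be approximated in a few lines of Python. This is a minimal sketch that assumes SHA256SUMS uses the usual `sha256sum` layout (hex digest, whitespace, file name); the function name `verify_archive` is hypothetical, and the GPG signature check is not covered.

```python
import hashlib
from pathlib import Path


def verify_archive(archive: Path, sums_file: Path) -> bool:
    """Compare the archive's SHA-256 with its entry in a sha256sum-style file."""
    actual = hashlib.sha256(archive.read_bytes()).hexdigest()
    for line in sums_file.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts
        # sha256sum prefixes binary-mode entries with '*'
        if name.lstrip("*") == archive.name:
            return expected.lower() == actual
    raise LookupError(f"{archive.name} is not listed in {sums_file}")
```

A call such as `verify_archive(Path("nyx-x86_64-pc-windows-msvc.zip"), Path("SHA256SUMS"))` returns `True` only when the digests match, and raises if the archive has no entry at all.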
Windows
Expand-Archive -Path nyx-x86_64-pc-windows-msvc.zip -DestinationPath .
Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\"
# Add C:\Program Files\Nyx to PATH in System Properties → Environment Variables
nyx --version
Build from source
git clone https://github.com/elicpeter/nyx.git
cd nyx
cargo build --release
# Binary at target/release/nyx
The frontend is built and embedded into the binary during cargo build, so there’s no separate step for nyx serve. Node is only required if you’re working on the frontend itself; see CONTRIBUTING.md.
Optional features:
| Flag | Adds |
|---|---|
| `--features smt` | Bundles Z3 for stronger path-constraint solving. MIT-licensed; distributors should include Z3’s license in their attribution |
| `--features smt-system-z3` | Links against a system-installed Z3 instead of bundling |
Upgrading
Nyx stores its scanner version in the project’s index database. When the binary’s version differs from the stored version, the index is wiped on the next scan and rebuilt against the new engine. You’ll see one info-level log line:
engine version changed (0.4.0 → 0.5.0), rebuilding index
No flag needed. If you see this on every scan, the metadata row isn’t being persisted; file an issue.
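The version-stamp check can be sketched as follows. The table names (`meta`, `files`) and the function `open_index` are assumptions for illustration and do not reflect Nyx's actual schema; the point is only the compare, wipe, and re-stamp sequence.

```python
import sqlite3


def open_index(conn, engine_version):
    """Wipe cached rows if the stored engine version differs; return True if wiped."""
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, hash TEXT)")
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'engine_version'"
    ).fetchone()
    stored = row[0] if row else None
    rebuilt = stored is not None and stored != engine_version
    if rebuilt:
        print(f"engine version changed ({stored} → {engine_version}), rebuilding index")
        conn.execute("DELETE FROM files")
    # Persist the current version. If this write were lost, every scan would
    # see a mismatch and rebuild, which is the symptom described above.
    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('engine_version', ?)",
        (engine_version,),
    )
    conn.commit()
    return rebuilt
```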
Corrupt database recovery
If the SQLite file itself is damaged (killed scan, full disk), delete it and let the next scan rebuild from scratch:
rm "$(nyx config path)"/<project>.sqlite*
Only the named project’s rows are affected.
CLI Reference
Global
nyx [COMMAND]
nyx --version
nyx --help
nyx scan
Run a security scan on a directory.
nyx scan [PATH] [OPTIONS]
PATH defaults to . (current directory).
Analysis Mode
| Flag | Default | Description |
|---|---|---|
| `--mode <MODE>` | full | Analysis mode: full, ast, cfg, or taint |
| Mode | What runs |
|---|---|
| `full` | AST patterns + CFG structural analysis + taint analysis |
| `ast` | AST patterns only (fastest, no CFG or taint) |
| `cfg` / `taint` | CFG + taint analysis only (no AST patterns) |
Deprecated aliases: --ast-only (use --mode ast), --cfg-only (use --mode cfg), --all-targets (use --mode full).
Index Control
| Flag | Default | Description |
|---|---|---|
| `--index <MODE>` | auto | Index behavior: auto, off, or rebuild |
| Index Mode | Behavior |
|---|---|
| `auto` | Use existing index if available; build if missing |
| `off` | Skip indexing, scan filesystem directly |
| `rebuild` | Force rebuild index before scanning |
Deprecated aliases: --no-index (use --index off), --rebuild-index (use --index rebuild).
Output
| Flag | Default | Description |
|---|---|---|
| `-f, --format <FMT>` | console | Output format: console, json, or sarif |
| `--quiet` | off | Suppress status messages (stderr), including the Preview-tier banner for C/C++ scans |
| `--no-rank` | off | Disable attack-surface ranking |
| `--no-state` | off | Disable state-model analysis (resource lifecycle + auth state). Overrides scanner.enable_state_analysis |
Profiles
| Flag | Default | Description |
|---|---|---|
| `--profile <NAME>` | (none) | Apply a named scan profile. Built-ins: quick, full, ci, taint_only, conservative_large_repo. User-defined profiles override built-ins with the same name. CLI flags still take precedence over profile values |
Filtering
| Flag | Default | Description |
|---|---|---|
| `--severity <EXPR>` | (none) | Filter findings by severity |
| `--min-score <N>` | (none) | Drop findings with rank score below N |
| `--min-confidence <LEVEL>` | (none) | Drop findings below this confidence level (low, medium, high) |
| `--require-converged` | off | Drop findings whose engine provenance notes indicate widening (over-report) or analysis bail. Keeps under-report findings (emitted flow is still real). Intended for strict CI gates. |
| `--fail-on <SEV>` | (none) | Exit code 1 if any finding >= this severity |
| `--show-suppressed` | off | Show inline-suppressed findings (dimmed, tagged [SUPPRESSED]) |
| `--keep-nonprod-severity` | off | Don’t downgrade severity for test/vendor paths |
| `--all` | off | Disable category filtering, rollups, and LOW budgets – show everything |
| `--include-quality` | off | Include Quality-category findings (hidden by default) |
| `--max-low <N>` | 20 | Maximum total LOW findings to show |
| `--max-low-per-file <N>` | 1 | Maximum LOW findings per file |
| `--max-low-per-rule <N>` | 10 | Maximum LOW findings per rule |
| `--rollup-examples <N>` | 5 | Number of example locations in rollup findings |
| `--show-instances <RULE>` | (none) | Expand all instances of a specific rule (bypass rollup) |
Severity expression formats:
--severity HIGH # Only high
--severity "HIGH,MEDIUM" # High or medium
--severity ">=MEDIUM" # Medium and above (high + medium)
--severity ">= low" # All severities (case-insensitive)
Deprecated aliases: --high-only (use --severity HIGH), --include-nonprod (use --keep-nonprod-severity).
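The four expression forms above can be read as a tiny grammar: a single level, a comma-separated list, or a `>=` floor, all case-insensitive. The sketch below models just those forms; `parse_severity` is a hypothetical name, and the real parser may accept or reject inputs differently.

```python
LEVELS = ["LOW", "MEDIUM", "HIGH"]  # ordered from least to most severe


def parse_severity(expr):
    """Return the set of severities selected by a --severity style expression."""
    text = expr.strip().upper()
    if text.startswith(">="):
        floor = text[2:].strip()
        if floor not in LEVELS:
            raise ValueError(f"unknown severity: {floor!r}")
        # Everything at or above the floor.
        return set(LEVELS[LEVELS.index(floor):])
    wanted = {part.strip() for part in text.split(",")}
    unknown = wanted - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)!r}")
    return wanted
```

Under this reading, `">= low"` selects all three levels and `"HIGH,MEDIUM"` selects exactly two.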
--fail-on returns a non-zero exit code when the threshold trips, so CI jobs fail without further wiring.

Quality-category and rollup-prone Low findings are filtered down by default. The footer of the scan output tells you exactly what got dropped and which knob to turn.

Analysis Engine Toggles
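Override the corresponding [analysis.engine] values in nyx.conf for a single run. All toggles default on except backwards analysis, which defaults off; pass the --no-* variant to disable a pass. --parse-timeout-ms takes a value rather than a toggle pair.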
| Pair | Config field | Effect when disabled |
|---|---|---|
| `--constraint-solving` / `--no-constraint-solving` | constraint_solving | Skip path-constraint solving; infeasible paths no longer pruned |
| `--abstract-interp` / `--no-abstract-interp` | abstract_interpretation | Skip interval / string / bit abstract domains |
| `--context-sensitive` / `--no-context-sensitive` | context_sensitive | Treat intra-file callees insensitively (summary-only) |
| `--symex` / `--no-symex` | symex.enabled | Skip the symex pipeline; no symbolic verdicts or witnesses |
| `--cross-file-symex` / `--no-cross-file-symex` | symex.cross_file | Skip extracting / consulting cross-file SSA bodies |
| `--symex-interproc` / `--no-symex-interproc` | symex.interprocedural | Cap symex frame stack at the entry function |
| `--smt` / `--no-smt` | symex.smt | Skip the SMT backend (still a no-op without the smt feature) |
| `--backwards-analysis` / `--no-backwards-analysis` | backwards_analysis | Demand-driven backwards taint walk from sinks (default off) |
| `--parse-timeout-ms <N>` | parse_timeout_ms | Per-file tree-sitter parse timeout (ms); 0 disables the cap |
Lattice-width Caps
Two caps bound the width of taint origin sets and points-to sets per SSA value. When a set would exceed the cap, entries are truncated deterministically and an engine note (OriginsTruncated / PointsToTruncated) is recorded on affected findings so you can see when precision was lost.
| Flag | Default | Description |
|---|---|---|
| `--max-origins <N>` | 32 | Max taint origins retained per lattice value. Raise on very wide codebases where truncation is observed; lower only when lattice width is a measured bottleneck. Also set via NYX_MAX_ORIGINS |
| `--max-pointsto <N>` | 32 | Max abstract heap objects retained per points-to set. Raise on factory-heavy codebases where truncation is observed. Also set via NYX_MAX_POINTSTO |
See configuration.md for the full schema.
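A minimal sketch of what a deterministic cap could look like, assuming origins can be sorted into a stable order before truncation. The sort-then-slice strategy and the name `cap_origins` are assumptions; this page only states that truncation is deterministic and that an engine note is recorded.

```python
def cap_origins(origins, max_origins=32):
    """Truncate an origin set to `max_origins` entries in a stable order.

    Returns (kept, note). `note` is "OriginsTruncated" when entries were
    dropped and None otherwise, mirroring the engine note described above.
    """
    ordered = sorted(origins)  # stable ordering makes the result reproducible
    if len(ordered) <= max_origins:
        return ordered, None
    return ordered[:max_origins], "OriginsTruncated"
```

Because the input is sorted first, the same set always loses the same entries, regardless of insertion order.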
Engine-Depth Profile
Individual engine toggles are fine-grained but hard to remember in combination. The --engine-profile shortcut sets the whole stack in one shot, and individual flags are layered on top after the profile is applied.
| Profile | Backwards | Symex | Abstract-interp | Context-sensitive |
|---|---|---|---|---|
| `fast` | off | off | off | off |
| `balanced` (default) | off | off | on | on |
| `deep` | on | on (cross-file + interprocedural) | on | on |
All three profiles build the AST, CFG, and SSA lattice and run forward taint; the columns above show which additional analyses each profile enables. SMT (symex.smt) is always off unless Nyx was built with --features smt.
Individual flags override the profile. For example, --engine-profile fast --backwards-analysis runs the fast stack but with backwards analysis on.
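The layering can be sketched as a profile lookup followed by per-flag overrides. The dictionary keys below are shorthand for the table columns and `resolve_engine` is a hypothetical helper; symex is collapsed to a single boolean, so the cross-file and interprocedural sub-toggles that `deep` also enables are not modelled.

```python
# Toggle matrix copied from the profile table above.
PROFILES = {
    "fast": {"backwards": False, "symex": False, "abstract_interp": False, "context_sensitive": False},
    "balanced": {"backwards": False, "symex": False, "abstract_interp": True, "context_sensitive": True},
    "deep": {"backwards": True, "symex": True, "abstract_interp": True, "context_sensitive": True},
}


def resolve_engine(profile="balanced", overrides=None):
    """Start from the profile's settings, then apply individual flag overrides."""
    settings = dict(PROFILES[profile])
    settings.update(overrides or {})
    return settings
```

For example, `resolve_engine("fast", {"backwards": True})` models `--engine-profile fast --backwards-analysis`: everything stays off except backwards analysis.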
Explain Effective Engine
--explain-engine prints the resolved engine configuration (profile + config + CLI overrides + env-var fallbacks) to stdout and exits without scanning. Useful for sanity-checking a CI invocation.
nyx scan --engine-profile deep --no-smt --explain-engine

Examples
# Basic scan
nyx scan
# Scan specific path, JSON output
nyx scan ./server --format json
# CI gate: fail on medium+, SARIF output
nyx scan . --format sarif --fail-on medium > results.sarif
# Fast AST-only scan, no index
nyx scan . --mode ast --index off
# High-severity only, quiet mode
nyx scan . --severity HIGH --quiet
# Only findings scoring 50 or above
nyx scan . --min-score 50
# Only medium+ confidence findings
nyx scan . --min-confidence medium
# Show everything (no filtering, no rollups)
nyx scan . --all
# Include quality findings but keep rollups and budgets
nyx scan . --include-quality
# See all unwrap findings expanded
nyx scan . --include-quality --show-instances rs.quality.unwrap
# Allow more LOW findings
nyx scan . --max-low 50 --max-low-per-file 5
nyx index
Manage the SQLite file index.
nyx index build
nyx index build [PATH] [--force]
Build or update the index for the given path (default: .).
| Flag | Description |
|---|---|
| `-f, --force` | Force full rebuild, ignoring cached file hashes |
nyx index status
nyx index status [PATH]
Display index statistics (file count, size, last modified) for the given path.

nyx list
nyx list [-v]
List all indexed projects.
| Flag | Description |
|---|---|
| `-v, --verbose` | Show detailed information per project |
nyx clean
nyx clean [PROJECT] [--all]
Remove index data.
| Argument/Flag | Description |
|---|---|
| `PROJECT` | Project name or path to clean |
| `--all` | Clean all indexed projects |
nyx config
Manage configuration.
nyx config show
Print the effective merged configuration as TOML. Useful for sanity-checking what the scanner is actually using after nyx.conf and nyx.local merge:
![nyx config show output: TOML dump of the merged scanner config showing [scanner] mode/min_severity/excluded_extensions/excluded_directories, [database] settings, and resolved engine toggles](../assets/screenshots/docs/cli-configshow.png)
nyx config path
Print the configuration directory path.
nyx config add-rule
nyx config add-rule --lang <LANG> --matcher <MATCHER> --kind <KIND> --cap <CAP>
Add a custom taint rule. Written to nyx.local.
| Flag | Values |
|---|---|
| `--lang` | rust, javascript, typescript, python, go, java, c, cpp, php, ruby |
| `--matcher` | Function or property name to match |
| `--kind` | source, sanitizer, sink |
| `--cap` | env_var, html_escape, shell_escape, url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, ssrf, code_exec, crypto, unauthorized_id, all |
nyx config add-terminator
nyx config add-terminator --lang <LANG> --name <NAME>
Add a terminator function (e.g. process.exit). Written to nyx.local.
Exit codes
See output.md. Summary: 0 on success (including findings without --fail-on), 1 when --fail-on trips, non-zero on scan errors.
Environment variables
Runtime behaviour:
| Variable | Description |
|---|---|
| `RUST_LOG` | Set tracing verbosity (e.g. RUST_LOG=debug nyx scan .) |
| `NO_COLOR` | Disable ANSI color output |
Engine toggles (legacy, still honored; prefer CLI flags or [analysis.engine] config):
| Variable | Matches |
|---|---|
| `NYX_CONSTRAINT` | --constraint-solving |
| `NYX_ABSTRACT_INTERP` | --abstract-interp |
| `NYX_CONTEXT_SENSITIVE` | --context-sensitive |
| `NYX_SYMEX`, `NYX_CROSS_FILE_SYMEX`, `NYX_SYMEX_INTERPROC` | --symex and friends |
| `NYX_SMT` | --smt (no-op without the smt feature) |
| `NYX_BACKWARDS` | --backwards-analysis |
| `NYX_PARSE_TIMEOUT_MS` | --parse-timeout-ms |
| `NYX_MAX_ORIGINS`, `NYX_MAX_POINTSTO` | --max-origins, --max-pointsto |
nyx serve: the browser UI
The CLI is fine for CI. For triage, you want context: the source snippet, the dataflow path, the history of how a finding has moved across scans, and a place to record decisions that survive the next run. nyx serve boots a local React UI bound to loopback.
nyx serve # opens http://localhost:9700 in your default browser
nyx serve ./my-project # serve a specific project root
nyx serve --port 9750 # override port
nyx serve --no-browser # don't auto-open
Persistent settings live under [server] in nyx.conf / nyx.local.

What it serves, and what it doesn’t
The frontend is built and embedded into the nyx binary at compile time. There’s no separate install step, and the binary serves the entire UI from memory; nothing is fetched from a CDN. The UI talks to the local Nyx process over a small JSON API.
There is no account, no telemetry, no remote logging, no auto-update ping. The data the UI shows is the data on your disk: the SQLite project index plus .nyx/triage.json.
Security model
nyx serve enforces three things at the HTTP layer (src/server/security.rs):
- Loopback bind only. `--host` and `[server].host` are clamped to `127.0.0.1`, `localhost`, or `::1`. Any other value is refused at startup with `Nyx serve only binds to loopback addresses; refused host '<value>'`.
- Host-header check. Every request must carry a `Host` header that matches the bound address and port. Missing or mismatched headers get a `400 invalid Host header`. Defends against DNS rebinding.
- CSRF on mutations. `POST` / `PUT` / `PATCH` / `DELETE` requests must carry a per-process CSRF token in the `x-nyx-csrf` header. The token is generated once when the server starts and exposed at `GET /api/health` so the embedded SPA can read it. Cross-origin mutations are rejected before the CSRF check via the `Origin` header.
If you forward the port over SSH or expose it through a reverse proxy, the host-header check will reject the request because the Host won’t match localhost:9700. That’s the intended behaviour. Don’t do this without a deliberate reason; the loopback bind is part of the security model.
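The first two checks and the CSRF check can be sketched roughly as below. This is an illustration, not the code in src/server/security.rs: the function names, the exact-match `Host` comparison, and the 403 status for a bad CSRF token are assumptions, and the `Origin` check and IPv6 bracket handling are left out.

```python
import hmac
import secrets

LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}
MUTATING = {"POST", "PUT", "PATCH", "DELETE"}


def check_bind_host(host):
    """Refuse any bind address that is not loopback."""
    if host not in LOOPBACK_HOSTS:
        raise ValueError(
            f"Nyx serve only binds to loopback addresses; refused host '{host}'"
        )
    return host


def check_request(method, headers, bound_host, port, csrf_token):
    """Return (status, reason) for a request, applying Host and CSRF checks."""
    if headers.get("Host") != f"{bound_host}:{port}":
        return 400, "invalid Host header"
    if method in MUTATING:
        supplied = headers.get("x-nyx-csrf", "")
        # Constant-time comparison; the 403 status is an assumption.
        if not hmac.compare_digest(supplied.encode(), csrf_token.encode()):
            return 403, "missing or invalid CSRF token"
    return 200, "ok"


# One token per server process, generated at startup.
token = secrets.token_hex(16)
```

A forwarded request arriving with `Host: example.com` fails the first comparison, which is the rejection described above.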
The pages
| Path | Page |
|---|---|
| `/` | Overview |
| `/findings` | Findings list |
| `/findings/:id` | Finding detail |
| `/triage` | Triage |
| `/explorer` | Explorer |
| `/scans` | Scans |
| `/scans/:id` | Scan detail and compare |
| `/rules` | Rules |
| `/rules/:id` | Rule detail |
| `/config` | Config |
The numeric :id for finding URLs is the position index in the current scan, not a stable fingerprint. Bookmarks across scans aren’t reliable; rely on file path + line.
Overview and Health Score
The overview is the landing page after a scan. It shows severity counts, top affected files, OWASP coverage, and a 0 to 100 Health Score with a letter grade.
How the Health Score is calculated
Two things drive the score: the density of risk in the codebase, and hard guardrails that decide what the grade can mean.
Each finding contributes weight = severity_base × confidence_factor × verdict_factor × context_factor:
- Severity base: HIGH 10, MEDIUM 3, LOW (security) 0.5
- Confidence: High 1.0, Medium 0.6, Low 0.3
- Symex verdict: Confirmed 1.2, NotAttempted 1.0, Inconclusive 0.7, Infeasible 0.1
- Context: cross-file taint flow 1.15, intra-file flow 1.0, AST-only or no flow 0.75, test path 0.3
Quality lints (rule IDs containing .quality.) skip the per-finding weight and instead apply a saturating drag, capped at 15 points (so 1000 unwrap lints don’t grade worse than 300 do). Total weight gets divided by sqrt(files / 100), clamped between 1 and roughly 22, so a 100-file repo and a 50000-file repo see different denominators but a monorepo can’t dilute its way out of a real HIGH.
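As a worked example of the per-finding weight and the size divisor, consider the sketch below. The factor values come from the list above; the dictionary keys for the context factor, the function names, and the use of exactly 22 for the "roughly 22" upper clamp are assumptions. How Nyx picks a context factor when more than one applies is not specified here, so the sketch takes a single context label.

```python
import math

SEVERITY_BASE = {"HIGH": 10.0, "MEDIUM": 3.0, "LOW": 0.5}
CONFIDENCE = {"High": 1.0, "Medium": 0.6, "Low": 0.3}
VERDICT = {"Confirmed": 1.2, "NotAttempted": 1.0, "Inconclusive": 0.7, "Infeasible": 0.1}
# Key names are illustrative labels for the four context cases.
CONTEXT = {"cross_file": 1.15, "intra_file": 1.0, "ast_only": 0.75, "test_path": 0.3}


def finding_weight(severity, confidence, verdict, context):
    """weight = severity_base x confidence_factor x verdict_factor x context_factor"""
    return (
        SEVERITY_BASE[severity]
        * CONFIDENCE[confidence]
        * VERDICT[verdict]
        * CONTEXT[context]
    )


def size_divisor(files):
    """sqrt(files / 100), clamped to [1, 22]; the upper bound of 22 is approximate."""
    return min(max(math.sqrt(files / 100), 1.0), 22.0)
```

A confirmed, high-confidence, cross-file HIGH weighs about 13.8 (10 × 1.0 × 1.2 × 1.15), while a low-confidence, infeasible LOW on a test path weighs about 0.0045. A 400-file repo divides the total by 2; a 50000-file repo hits the upper clamp.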
The result feeds a log curve into a 0 to 100 base, minus the quality drag. Then HIGH guardrails apply, keyed on the credibility-adjusted HIGH count rather than the raw count:
| effective HIGH | ceiling |
|---|---|
| 0 | 100 |
| 1 | 85 |
| 2 | 78 |
| 3 to 5 | 68 |
| 6 to 10 | 58 |
| 11+ | 45 |
A repo with zero effective HIGHs never grades below C (70). That floor is the structural promise that the score isn’t an automated F-machine for projects that have lots of LOW noise but no critical issues.
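The ceiling table and the zero-HIGH floor amount to a small lookup. In this sketch the effective HIGH count is assumed to already be an integer, the function names are hypothetical, and the ±5 modifiers are not modelled.

```python
def high_ceiling(effective_high):
    """Maximum score allowed for a given credibility-adjusted HIGH count."""
    if effective_high <= 0:
        return 100
    if effective_high == 1:
        return 85
    if effective_high == 2:
        return 78
    if effective_high <= 5:
        return 68
    if effective_high <= 10:
        return 58
    return 45


def apply_guardrails(base_score, effective_high):
    """Clamp a base score to the HIGH ceiling, with a floor of 70 at zero HIGHs."""
    score = min(base_score, high_ceiling(effective_high))
    if effective_high == 0:
        score = max(score, 70)
    return score
```

So a base score of 95 with one effective HIGH is capped at 85, and a base score of 40 with no effective HIGHs is lifted to 70.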
Modifiers in the ±5 range nudge the result for trend (only after the second scan), triage coverage (only when total findings ≥ 20), reintroduced findings, and stale HIGHs more than 30 days old.
What the score doesn’t measure
It’s a Nyx-finding-pressure metric, not a security audit. Score 100 means Nyx didn’t find anything under its current rules and language coverage; it doesn’t certify the absence of vulnerabilities. The score doesn’t see runtime config, IAM, secret stores, dependency CVEs, or anything outside the source tree being scanned. A repo of mostly Kotlin (where Nyx coverage is thin) will score artificially well because most of the code never gets evaluated.
The current ceilings are calibrated for v0.5 scanner false-positive rates. As symex coverage and rule precision improve, the ceilings tighten. Calibration data and the rationale behind each tunable live in health-score-audit.md.
Findings and Finding detail
The findings list is filterable by severity, confidence, category, language, rule ID, and triage state.

Clicking through opens the flow visualiser: a numbered walk from source to sink with the snippet at each step, cross-file markers when the path leaves the current file, the rule’s “How to fix” guidance, and the engine’s evidence object inline.

Engine notes call out when precision was bounded for that finding (OriginsTruncated, PointsToTruncated, PathWidened, ForwardBailed, etc.). Anything tagged under-report means the emitted flow is real and the result set is a lower bound; over-report means widening or bail. --require-converged in the CLI drops the over-report ones for strict gates.
Triage
Each finding carries a triage state: open, investigating, false_positive, accepted_risk, suppressed, or fixed. The triage page bulk-updates them and shows the audit trail.

State writes are persisted to SQLite immediately, and (when [server].triage_sync = true, default on) mirrored to .nyx/triage.json in the project root. Commit that file:
git add .nyx/triage.json
It carries decisions across machines so a teammate’s local scan reflects yours. The format is documented in src/server/triage_sync.rs; the schema is stable and round-trip-safe with nyx serve re-imports.
Explorer
A file tree with per-file finding counts, syntax-highlighted source, and a right rail with the file’s symbols and findings. Useful for “what’s wrong with this module” rather than “what’s wrong with this finding”.

The file query parameter preselects a file: /explorer?file=src/handler.rs.
Scans and compare
Past runs are persisted when [runs].persist = true (off by default to avoid disk growth on heavy users). When persistence is on, /scans lists historical runs.

Each run drills into a detail page with files scanned, findings count, duration, languages, and a per-pass timing breakdown.

Pick two scans to diff and see what got introduced, fixed, or rediscovered between runs. The retention cap is [runs].max_runs (default 100). Each run can also optionally save its log and stdout (save_logs, save_stdout); both are off by default. Code snippets are saved by default (save_code_snippets = true); turn this off if storage is tight.
Rules
Every rule the engine knows about, built-in plus user-added. Each row shows the matchers, kind (source / sanitizer / sink), capability, language, and how many findings it produced in the latest scan. Filter by language, by kind, or by free text.

User-added rules can be deleted from this page; built-ins are immutable. Built-ins live in src/labels/<lang>.rs and src/patterns/<lang>.rs; user-added entries write to nyx.local.
Config
A live config editor. Reads the merged config (nyx.conf + nyx.local), lets you flip switches and add custom source / sanitizer / sink rules, and writes back to nyx.local. Changes apply to the next scan; the running server uses its initial config snapshot.

The custom-rule form picks a language, a matcher (function or property name), and a capability. The capability list matches the Cap bitflags the taint engine uses; see rules.md for what each one means.
API surface
For tooling, the JSON endpoints under /api/ are stable enough to script against. The full route map lives in src/server/routes/mod.rs. Mutating endpoints require the x-nyx-csrf header (read it from GET /api/health).
Disabling
If you don’t want the UI for a project, set:
[server]
enabled = false
nyx serve will refuse to start. The CLI continues to work.
Configuration
Nyx uses TOML configuration files. A default config is auto-generated on first run. If you’d rather edit settings and rules from the browser, the Config page in nyx serve is a live editor that writes back to nyx.local.

File Locations
| Platform | Directory |
|---|---|
| Linux | ~/.config/nyx/ |
| macOS | ~/Library/Application Support/nyx/ |
| Windows | %APPDATA%\elicpeter\nyx\config\ |
Run nyx config path to see the exact directory on your system.
File Precedence
- `nyx.conf` – Default config (auto-created from built-in template on first run)
- `nyx.local` – User overrides (loaded on top of defaults)
Both files are optional. CLI flags take precedence over both.
Merge Strategy
| Type | Behavior |
|---|---|
| Scalars (mode, min_severity, booleans) | User value wins |
| Arrays (excluded_extensions, excluded_directories, excluded_files) | Union + deduplicate |
| Analysis rules | Per-language union with deduplication |
| Profiles | User profile with same name fully replaces built-in |
| Server / Runs | User value wins (full section override) |
Example:
# nyx.conf (default):
excluded_extensions = ["jpg", "png", "exe"]
# nyx.local (user):
excluded_extensions = ["foo", "jpg"]
# Effective result:
# ["exe", "foo", "jpg", "png"] -- sorted, deduped union
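The scalar and array rules can be sketched as a small merge function. This is an illustration of the table above rather than Nyx's merge code: `merge_config` is a hypothetical name, and the per-language rule union, profile replacement, and section-level overrides are not modelled.

```python
def merge_config(default, user):
    """Merge user overrides onto defaults: arrays are unioned, scalars replaced."""
    merged = dict(default)
    for key, value in user.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            # Union, dedupe, and sort, matching the example above.
            merged[key] = sorted(set(merged[key]) | set(value))
        else:
            merged[key] = value  # user value wins
    return merged
```

Applied to the example, `merge_config({"excluded_extensions": ["jpg", "png", "exe"]}, {"excluded_extensions": ["foo", "jpg"]})` yields the same four-entry sorted list.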
Full Schema
[scanner]
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | "full" \| "ast" \| "cfg" \| "taint" | "full" | Analysis mode |
| `min_severity` | "Low" \| "Medium" \| "High" | "Low" | Minimum severity to report |
| `max_file_size_mb` | int \| null | 16 | Max file size in MiB; null = unlimited. Default is a safe ceiling for untrusted repos; lift explicitly when scanning trusted codebases with large generated files |
| `excluded_extensions` | [string] | ["jpg", "png", "gif", "mp4", ...] | File extensions to skip |
| `excluded_directories` | [string] | ["node_modules", ".git", "target", ...] | Directories to skip |
| `excluded_files` | [string] | [] | Specific files to skip |
| `read_global_ignore` | bool | false | Honor global ignore file (RESERVED) |
| `read_vcsignore` | bool | true | Honor .gitignore / .hgignore |
| `require_git_to_read_vcsignore` | bool | true | Require .git dir to apply gitignore |
| `one_file_system` | bool | false | Don’t cross filesystem boundaries |
| `follow_symlinks` | bool | false | Follow symbolic links |
| `scan_hidden_files` | bool | false | Scan dot-files |
| `include_nonprod` | bool | false | Keep original severity for test/vendor paths |
| `enable_state_analysis` | bool | true | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires mode = "full" or mode = "taint". |
[database]
| Field | Type | Default | Description |
|---|---|---|---|
| `path` | string | "" | Custom SQLite DB path; empty = platform default (RESERVED) |
| `auto_cleanup_days` | int | 30 | Days to keep DB files (RESERVED) |
| `max_db_size_mb` | int | 1024 | Maximum DB size in MiB (RESERVED) |
| `vacuum_on_startup` | bool | false | Run VACUUM before indexed scans |
[output]
| Field | Type | Default | Description |
|---|---|---|---|
| `default_format` | "console" \| "json" \| "sarif" | "console" | Default output format (used when --format is not specified) |
| `quiet` | bool | false | Suppress status messages |
| `max_results` | int \| null | null | Cap number of findings; null = unlimited |
| `attack_surface_ranking` | bool | true | Enable attack-surface ranking |
| `min_score` | int \| null | null | Minimum rank score to include; null = no minimum |
| `min_confidence` | string \| null | null | Minimum confidence level ("low", "medium", "high"); null = no minimum |
| `include_quality` | bool | false | Include Quality-category findings (hidden by default) |
| `show_all` | bool | false | Disable category filtering, rollups, and LOW budgets |
| `max_low` | int | 20 | Maximum total LOW findings to show (rollups count as 1) |
| `max_low_per_file` | int | 1 | Maximum LOW findings per file (rollups count as 1) |
| `max_low_per_rule` | int | 10 | Maximum LOW findings per rule (rollups count as 1) |
| `rollup_examples` | int | 5 | Number of example locations stored in rollup findings |
[performance]
| Field | Type | Default | Description |
|---|---|---|---|
| `max_depth` | int \| null | null | Max filesystem traversal depth; null = unlimited |
| `min_depth` | int \| null | null | Min depth for reported entries (RESERVED) |
| `prune` | bool | false | Stop traversing into matching directories (RESERVED) |
| `worker_threads` | int \| null | null | Worker thread count; null/0 = auto-detect |
| `batch_size` | int | 100 | Files per index batch |
| `channel_multiplier` | int | 4 | Channel capacity = threads x multiplier |
| `rayon_thread_stack_size` | int | 8388608 | Rayon thread stack size in bytes (8 MiB) |
| `scan_timeout_secs` | int \| null | null | Per-file timeout in seconds (RESERVED) |
| `memory_limit_mb` | int | 512 | Max memory in MiB (RESERVED) |
[server]
Configuration for the local web UI (nyx serve).
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | true | Whether the serve command is enabled |
| `host` | string | "127.0.0.1" | Host to bind to (localhost by default) |
| `port` | int | 9700 | Port for the web UI |
| `open_browser` | bool | true | Open browser automatically on serve |
| `auto_reload` | bool | true | Auto-reload UI when scan results change |
| `persist_runs` | bool | true | Persist scan runs for history view |
| `max_saved_runs` | int | 50 | Maximum number of saved runs |
[runs]
Configuration for scan run persistence and history.
| Field | Type | Default | Description |
|---|---|---|---|
| `persist` | bool | false | Persist scan run history to disk |
| `max_runs` | int | 100 | Maximum number of runs to keep |
| `save_logs` | bool | false | Save scan logs with each run |
| `save_stdout` | bool | false | Save stdout capture with each run |
| `save_code_snippets` | bool | true | Save code snippets in findings |
[profiles.<name>]
Named scan presets that override scan-related config. Activate with --profile <name>.
All fields are optional; omitted fields inherit from the base config.
| Field | Type | Description |
|---|---|---|
| `mode` | string | Analysis mode |
| `min_severity` | string | Minimum severity |
| `max_file_size_mb` | int | Max file size in MiB |
| `include_nonprod` | bool | Keep original severity for test/vendor |
| `enable_state_analysis` | bool | Enable state analysis |
| `default_format` | string | Output format |
| `quiet` | bool | Suppress status output |
| `attack_surface_ranking` | bool | Enable ranking |
| `max_results` | int | Max findings |
| `min_score` | int | Min rank score |
| `show_all` | bool | Show all findings |
| `include_quality` | bool | Include quality findings |
| `worker_threads` | int | Worker thread count |
| `max_depth` | int | Max traversal depth |
Built-in profiles:
| Name | Description |
|---|---|
| `quick` | AST-only, medium+ severity |
| `full` | Full analysis with state analysis enabled |
| `ci` | Full analysis, medium+ severity, quiet, SARIF output |
| `taint_only` | Taint analysis only |
| `conservative_large_repo` | AST-only, high severity, 5 MiB file limit, depth 10 |
User-defined profiles with the same name as a built-in will override it.
[analysis.engine]
Release-grade switches for the optional analysis passes. Each toggle has a
matching CLI flag (pair of --foo / --no-foo) that overrides the config
value for a single run. These used to be NYX_* environment variables
(NYX_CONSTRAINT, NYX_ABSTRACT_INTERP, NYX_SYMEX, NYX_CROSS_FILE_SYMEX,
NYX_SYMEX_INTERPROC, NYX_CONTEXT_SENSITIVE, NYX_PARSE_TIMEOUT_MS,
NYX_SMT); those env vars are still honored as a last-resort override when
nyx is used as a library (no CLI entry point), but the config/CLI surface is
the stable path.
| Field | Type | Default | Description |
|---|---|---|---|
constraint_solving | bool | true | Path-constraint solving (prunes infeasible paths in taint) |
abstract_interpretation | bool | true | Interval / string / bit abstract domains carried through the SSA worklist |
context_sensitive | bool | true | k=1 context-sensitive callee inlining for intra-file calls |
backwards_analysis | bool | false | Demand-driven backwards taint walk from sinks (adds scan time; default off) |
parse_timeout_ms | int | 10000 | Per-file tree-sitter parse timeout; 0 disables the cap |
[analysis.engine.symex] sub-section:
| Field | Type | Default | Description |
|---|---|---|---|
enabled | bool | true | Run the symex pipeline after taint; adds witness strings and symbolic verdicts |
cross_file | bool | true | Persist / consult cross-file SSA bodies so symex can reason about callees defined in other files |
interprocedural | bool | true | Intra-file interprocedural symex (k ≥ 2 via frame stack) |
smt | bool | true | Use the SMT backend when nyx is built with the smt feature; ignored otherwise |
CLI flag map (each boolean field maps to a --<flag> / --no-<flag> pair; parse_timeout_ms takes a value instead):
| Config field | CLI flags |
|---|---|
constraint_solving | --constraint-solving / --no-constraint-solving |
abstract_interpretation | --abstract-interp / --no-abstract-interp |
context_sensitive | --context-sensitive / --no-context-sensitive |
backwards_analysis | --backwards-analysis / --no-backwards-analysis |
parse_timeout_ms | --parse-timeout-ms <N> |
symex.enabled | --symex / --no-symex |
symex.cross_file | --cross-file-symex / --no-cross-file-symex |
symex.interprocedural | --symex-interproc / --no-symex-interproc |
symex.smt | --smt / --no-smt |
Engine-depth profile shortcut: instead of flipping individual toggles, pass --engine-profile {fast,balanced,deep} to set the whole stack at once. Individual flags override the profile, so --engine-profile fast --backwards-analysis runs the fast stack with backwards analysis on. See docs/cli.md for the exact toggle matrix.
Explain effective engine: pass --explain-engine to print the resolved engine configuration (profile + config + CLI overrides) and exit without scanning.
[analysis.languages.<slug>]
Per-language custom rules. <slug> is one of: rust, javascript, typescript, python, go, java, c, cpp, php, ruby.
| Field | Type | Description |
|---|---|---|
rules | array of rule objects | Custom label rules |
terminators | [string] | Functions that terminate execution |
event_handlers | [string] | Event handler function names |
Rule object:
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer" # "source" | "sanitizer" | "sink"
cap = "html_escape" # "env_var" | "html_escape" | "shell_escape" |
# "url_encode" | "json_parse" | "file_io" |
# "fmt_string" | "sql_query" | "deserialize" |
# "ssrf" | "code_exec" | "crypto" | "all"
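If you generate rule objects from a script, a small pre-flight check against the kind and cap vocabularies listed above can catch typos before Nyx loads the file. This is a hedged sketch: `check_rule` is a hypothetical helper, and the requirement that matchers be a non-empty list of strings is an assumption rather than documented behavior.

```python
VALID_KINDS = {"source", "sanitizer", "sink"}
VALID_CAPS = {
    "env_var", "html_escape", "shell_escape", "url_encode", "json_parse",
    "file_io", "fmt_string", "sql_query", "deserialize", "ssrf",
    "code_exec", "crypto", "all",
}


def check_rule(rule: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the rule looks fine."""
    problems = []
    matchers = rule.get("matchers")
    # Assumption: at least one non-empty string matcher is required.
    if not (isinstance(matchers, list) and matchers
            and all(isinstance(m, str) and m for m in matchers)):
        problems.append("matchers must be a non-empty list of strings")
    if rule.get("kind") not in VALID_KINDS:
        problems.append(f"kind must be one of {sorted(VALID_KINDS)}")
    if rule.get("cap") not in VALID_CAPS:
        problems.append(f"cap must be one of {sorted(VALID_CAPS)}")
    return problems


print(check_rule({"matchers": ["escapeHtml"], "kind": "sanitizer", "cap": "html_escape"}))
print(check_rule({"matchers": ["escapeHtml"], "kind": "sanitiser", "cap": "html"}))
```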
Example Configurations
Minimal override (nyx.local)
[scanner]
min_severity = "Medium"
[output]
default_format = "json"
max_results = 100
CI-optimized
[scanner]
mode = "full"
min_severity = "Medium"
excluded_directories = ["node_modules", ".git", "target", "vendor", "dist"]
[output]
quiet = true
default_format = "sarif"
[performance]
worker_threads = 4
Using a scan profile
# Use a built-in profile
nyx scan --profile ci
# CLI flags still override profile values
nyx scan --profile ci --format json
Custom profile
[profiles.security_audit]
mode = "full"
min_severity = "Low"
enable_state_analysis = true
show_all = true
Custom rules for a Node.js project
[analysis.languages.javascript]
terminators = ["process.exit", "abort"]
event_handlers = ["addEventListener"]
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"
[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetInnerHTML"]
kind = "sink"
cap = "html_escape"
[[analysis.languages.javascript.rules]]
matchers = ["getRequestBody", "readUserInput"]
kind = "source"
cap = "all"
Adding rules via CLI
# Add a sanitizer
nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape
# Add a terminator
nyx config add-terminator --lang javascript --name process.exit
# Verify
nyx config show
Config Validation
Config is validated after loading and merging. Validation checks include:
- Server port must be 1–65535
- Server host must not be empty
- max_saved_runs must be > 0 when persist_runs is true
- max_runs must be > 0 when persist is true
- batch_size and channel_multiplier must be > 0
- rollup_examples must be > 0
- Profile names must be alphanumeric with underscores only
Invalid config produces structured error messages identifying the section, field, and issue.
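A few of the checks above, and the "section, field, issue" shape of the resulting errors, can be sketched as follows. This is a rough model, not Nyx's validator: the `ConfigError` type, the `validate` function, and the section names `server` and `runs` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigError:
    section: str
    field: str
    issue: str


_PROFILE_NAME = re.compile(r"[A-Za-z0-9_]+")


def validate(config: dict) -> list[ConfigError]:
    """Run a subset of the documented checks over an already-merged config."""
    errors = []
    server = config.get("server", {})
    port = server.get("port")
    if port is not None and not (isinstance(port, int) and 1 <= port <= 65535):
        errors.append(ConfigError("server", "port", "must be 1-65535"))
    host = server.get("host")
    if host is not None and not str(host).strip():
        errors.append(ConfigError("server", "host", "must not be empty"))
    runs = config.get("runs", {})  # hypothetical section name
    if runs.get("persist") and runs.get("max_runs", 1) <= 0:
        errors.append(ConfigError("runs", "max_runs", "must be > 0 when persist is true"))
    for name in config.get("profiles", {}):
        if not _PROFILE_NAME.fullmatch(name):
            errors.append(ConfigError("profiles", name, "name must be alphanumeric with underscores only"))
    return errors


for err in validate({"server": {"port": 0}, "profiles": {"bad-name": {}}}):
    print(f"[{err.section}] {err.field}: {err.issue}")
```

Collecting every error before reporting, rather than stopping at the first, matches the idea of validating after all config layers are merged.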
State Analysis
State analysis detects resource lifecycle violations (use-after-close, double-close, resource leaks) and unauthenticated access patterns. It is enabled by default.
To disable:
[scanner]
enable_state_analysis = false
State analysis requires mode = "full" or mode = "taint". It has no effect in mode = "ast".
Tradeoffs:
- Additional per-function state-machine pass adds some scan time
- May produce findings that require domain knowledge to evaluate (e.g., whether a resource handle is intentionally left open)
- Most useful for C, C++, Rust, Go, and Java where acquire/release patterns are common
Upgrading
Engine-version mismatch is handled automatically
Nyx stores the scanner’s CARGO_PKG_VERSION in the project index database.
When the version recorded in the DB differs from the running binary, or the
row is missing entirely, every cached summary, SSA body, and file-hash row
is wiped on the next open so the next scan rebuilds the index against the new
engine. No flag is needed; CI pipelines keep working across upgrades.
The rebuild is logged at info level:
engine version changed (0.4.0 → 0.5.0), rebuilding index
If you see this once per upgrade it is working as intended. If you see it on every scan, the metadata row is not being persisted; file an issue.
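The version-check-then-wipe behavior can be modeled with a few lines of SQLite. This is a sketch under stated assumptions: the `meta` table, its `engine_version` key, the two-column cache schema, and the `open_index` function are hypothetical; only the cache table names are taken from this page.

```python
import sqlite3

# Cache tables named in this document; their real schemas are not shown here.
CACHE_TABLES = ("files", "function_summaries",
                "ssa_function_summaries", "ssa_function_bodies")


def open_index(conn: sqlite3.Connection, running_version: str) -> bool:
    """Return True if the cache was wiped because the stored version differed."""
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'engine_version'"
    ).fetchone()
    stored = row[0] if row else None  # a missing row counts as a mismatch
    if stored == running_version:
        return False
    for table in CACHE_TABLES:
        # Placeholder schema so the sketch is self-contained.
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (project TEXT, payload TEXT)")
        conn.execute(f"DELETE FROM {table}")
    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('engine_version', ?)",
        (running_version,),
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
print(open_index(conn, "0.4.0"))  # first open: no row yet, so the cache is reset
print(open_index(conn, "0.4.0"))  # same version: cache kept
print(open_index(conn, "0.5.0"))  # upgrade: cache reset once
```

The symptom described above (a rebuild on every scan) corresponds to the final `INSERT OR REPLACE` never being persisted in this model.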
Forcing a reindex
Use --index rebuild to throw away the current project’s cached summaries
and re-run pass 1 against the current rules. Useful after editing
nyx.local rules, after an upgrade that changed label definitions without
changing the engine version, or when you want a known-clean baseline:
nyx scan --index rebuild .
This clears the current project’s rows in files, function_summaries,
ssa_function_summaries, and ssa_function_bodies; other projects sharing
the same DB directory are untouched.
Recovering from a corrupt database
If the .sqlite file itself is damaged (e.g. from a killed scan or full
disk) and nyx scan fails to open it, delete the file and let the next
scan recreate it:
rm "$(nyx config path)"/<project>.sqlite*
On the next scan Nyx builds a fresh index from scratch.
Reserved Fields
Some config fields are defined but not yet implemented. They are marked (RESERVED) in the default config and accept values without effect. This allows forward-compatible config files; settings will activate when the feature is implemented without requiring config changes.
Output Formats
Nyx supports three output formats, selected with --format or output.default_format in config.
Console (default)
Human-readable, color-coded output to stdout. Status messages go to stderr.
[HIGH] taint-unsanitised-flow (source 5:11) src/handler.rs:12:5 (Score: 76, Confidence: High)
Source: env::var("CMD") → Command::new("sh").arg("-c")
[MEDIUM] cfg-unguarded-sink src/handler.rs:12:5 (Score: 35, Confidence: Medium)
[LOW] rs.quality.unwrap src/lib.rs:88:5 (Score: 10, Confidence: High)
Severity indicators
| Tag | Color | Meaning |
|---|---|---|
[HIGH] | Red, bold | Critical – likely exploitable |
[MEDIUM] | Orange, bold | Important – may be exploitable |
[LOW] | Muted blue-gray | Informational – code quality or weak signal |
Evidence fields
Taint and state findings include structured evidence:
| Label | Meaning |
|---|---|
| Source | Where tainted data originated (function name + location) |
| Sink | Where the dangerous operation happens |
| Path guard | Type of validation predicate protecting the path |
Score
When attack-surface ranking is enabled (default), each finding shows a Score value. Higher scores indicate greater exploitability. See Detector Overview for the scoring formula.
Rollup findings
High-frequency LOW Quality findings (e.g. rs.quality.unwrap) are grouped into rollup findings by (file, rule):
21:10 ● [LOW] rs.quality.unwrap
rs.quality.unwrap (38 occurrences)
Examples: 21:10, 50:10, 79:10, 105:10, 134:10
Run: nyx scan --show-instances rs.quality.unwrap
Rollups count as one finding for LOW budget enforcement. Use --show-instances <RULE> to expand a specific rule or --all to disable rollups entirely.
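Grouping by (file, rule) can be sketched as below. The helper `roll_up` is hypothetical, the default of five examples is an assumption taken from the console sample, and the sketch rolls up whatever findings it is given; how Nyx decides that a rule is "high-frequency" is not modeled.

```python
from collections import defaultdict


def roll_up(findings: list[dict], rollup_examples: int = 5) -> list[dict]:
    """Collapse findings that share (path, rule id) into one rollup finding each."""
    groups = defaultdict(list)
    for f in findings:
        groups[(f["path"], f["id"])].append(f)

    rolled = []
    for (path, rule_id), items in groups.items():
        ordered = sorted(items, key=lambda f: (f["line"], f["col"]))
        first = ordered[0]
        rolled.append({
            "path": path,
            "id": rule_id,
            "line": first["line"],  # the rollup is anchored at the first occurrence
            "col": first["col"],
            "rollup": {
                "count": len(ordered),
                "occurrences": [
                    {"line": f["line"], "col": f["col"]}
                    for f in ordered[:rollup_examples]
                ],
            },
        })
    return rolled


sample = [
    {"path": "src/lib.rs", "id": "rs.quality.unwrap", "line": 50, "col": 10},
    {"path": "src/lib.rs", "id": "rs.quality.unwrap", "line": 21, "col": 10},
    {"path": "src/main.rs", "id": "rs.quality.unwrap", "line": 3, "col": 1},
]
for r in roll_up(sample):
    print(r["path"], r["rollup"]["count"])
```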
Suppression footer
When findings are suppressed by the prioritization pipeline, a footer is shown:
Suppressed 195 LOW/Quality findings.
Active filters:
include_quality = false
max_low = 20
max_low_per_file = 1
max_low_per_rule = 10
Use --include-quality, --max-low, or --all to adjust.
JSON
Machine-readable JSON array. Each finding is an object:
[
{
"path": "src/handler.rs",
"line": 12,
"col": 5,
"severity": "High",
"id": "taint-unsanitised-flow (source 5:11)",
"path_validated": false,
"labels": [
["Source", "env::var(\"CMD\") at 5:11"],
["Sink", "Command::new(\"sh\").arg(\"-c\")"]
],
"confidence": "High",
"evidence": {
"source": {
"path": "src/handler.rs",
"line": 5,
"col": 11,
"kind": "source",
"snippet": "env::var(\"CMD\")"
},
"sink": {
"path": "src/handler.rs",
"line": 12,
"col": 5,
"kind": "sink",
"snippet": "Command::new(\"sh\")"
},
"notes": ["source_kind:EnvironmentConfig"]
},
"rank_score": 76.0,
"rank_reason": [
["severity_base", "60"],
["analysis_kind", "10"],
["source_kind", "5"],
["evidence_count", "1"]
]
}
]
Field descriptions
| Field | Type | Always present | Description |
|---|---|---|---|
path | string | yes | File path relative to scan root |
line | int | yes | 1-indexed line number |
col | int | yes | 1-indexed column number |
severity | string | yes | "High", "Medium", or "Low" |
id | string | yes | Rule ID |
category | string | yes | Finding category: "Security", "Reliability", or "Quality" |
path_validated | bool | no | True if guarded by validation predicate |
guard_kind | string | no | Predicate type (e.g. "NullCheck", "ValidationCall") |
message | string | no | Human-readable context (state analysis findings) |
labels | array | no | Array of [label, value] pairs for console display |
confidence | string | no | Confidence level: "Low", "Medium", or "High" |
evidence | object | no | Structured evidence (source/sink spans, state, notes) |
rank_score | float | no | Attack-surface score (omitted when ranking disabled) |
rank_reason | array | no | Score breakdown (omitted when ranking disabled) |
rollup | object | no | Rollup data when findings are grouped (see below) |
Fields marked “no” are omitted when empty/null/false to keep output compact.
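Because optional fields are omitted rather than set to null, a consumer should read them defensively. The sketch below shows one way to do that; `load_findings`, `triage_rows`, and `reason_total` are hypothetical helper names. In the sample finding above the rank_reason components add up to rank_score; treating that as a general invariant is an assumption.

```python
import json


def load_findings(report_json: str) -> list[dict]:
    """Parse the JSON report and check that it is the documented top-level array."""
    findings = json.loads(report_json)
    if not isinstance(findings, list):
        raise ValueError("expected a JSON array of findings")
    return findings


def reason_total(finding: dict) -> float:
    """Sum the [label, value] pairs in rank_reason; 0.0 when the field is omitted."""
    return sum(float(value) for _, value in finding.get("rank_reason", []))


def triage_rows(findings: list[dict], min_score: float = 0.0) -> list[dict]:
    """Keep unscored findings and those at or above min_score, highest score first."""
    rows = []
    for f in findings:
        score = f.get("rank_score")  # omitted when ranking is disabled
        if score is not None and score < min_score:
            continue
        rows.append({
            "location": f"{f['path']}:{f['line']}:{f['col']}",
            "severity": f["severity"],
            "id": f["id"],
            "confidence": f.get("confidence", "n/a"),
            "score": score,
        })
    rows.sort(key=lambda r: r["score"] if r["score"] is not None else -1.0, reverse=True)
    return rows
```

A typical use is `triage_rows(load_findings(open("report.json").read()), min_score=50)`, where `report.json` is whatever file you redirected the JSON output to.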
Confidence levels
| Level | Meaning |
|---|---|
High | Strong signal – taint-confirmed flow, definite state violation |
Medium | Moderate signal – resource leak, path-validated taint, CFG structural |
Low | Weak signal – AST pattern match, possible resource leak, degraded analysis |
Evidence object
The evidence field provides structured provenance data:
| Field | Type | Description |
|---|---|---|
source | object | Source span (path, line, col, kind, snippet) |
sink | object | Sink span (path, line, col, kind, snippet) |
guards | array | Validation guard spans |
sanitizers | array | Sanitizer spans |
state | object | State-machine evidence (machine, subject, from_state, to_state) |
notes | array | Free-form notes (e.g. "source_kind:UserInput", "path_validated") |
All fields are omitted when empty/null.
Rollup object
When a finding is a rollup (grouped from multiple occurrences), the rollup field is present:
{
"rollup": {
"count": 38,
"occurrences": [
{ "line": 21, "col": 10 },
{ "line": 50, "col": 10 },
{ "line": 79, "col": 10 }
]
}
}
| Field | Type | Description |
|---|---|---|
count | int | Total number of occurrences |
occurrences | array | First N example locations (controlled by rollup_examples) |
SARIF (Static Analysis Results Interchange Format)
SARIF 2.1.0 JSON, suitable for GitHub Code Scanning and other SARIF-compatible tools.
nyx scan . --format sarif > results.sarif
The SARIF output includes:
- Tool metadata – Nyx name and version
- Rules – Rule ID, description, severity mapping
- Results – One result per finding with location, message, and properties
- Properties – Each result includes category and optionally confidence and rollup.count
- Related locations – Rollup findings include example locations in relatedLocations
- Artifacts – File paths referenced by findings
GitHub Code Scanning integration
- name: Run Nyx
run: nyx scan . --format sarif > results.sarif
- name: Upload SARIF
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: results.sarif
Exit Codes
| Code | Meaning |
|---|---|
0 | Scan completed successfully; no findings matched --fail-on threshold |
1 | --fail-on threshold breached (at least one finding meets or exceeds the specified severity) |
| Non-zero | Error (I/O, config, database, parse error) |
Without --fail-on, Nyx always exits 0 on a successful scan regardless of findings count.
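The exit-code rule for a successful scan can be written as a small predicate. This is a sketch, not the CLI's code: `expected_exit_code` is a hypothetical name, findings are assumed to be dictionaries shaped like the JSON output, and the error path (non-zero on I/O, config, database, or parse failures) is not modeled.

```python
from typing import Optional

SEVERITY_ORDER = {"Low": 0, "Medium": 1, "High": 2}


def expected_exit_code(findings: list[dict], fail_on: Optional[str] = None) -> int:
    """Exit code for a scan that completed without errors."""
    if fail_on is None:
        return 0  # without --fail-on, findings never change the exit code
    threshold = SEVERITY_ORDER[fail_on.capitalize()]  # accepts "HIGH" or "High"
    breached = any(
        SEVERITY_ORDER[f["severity"]] >= threshold
        and not f.get("suppressed", False)  # suppressed findings do not count
        for f in findings
    )
    return 1 if breached else 0


findings = [{"severity": "Medium"}, {"severity": "High", "suppressed": True}]
print(expected_exit_code(findings))                    # 0
print(expected_exit_code(findings, fail_on="HIGH"))    # 0
print(expected_exit_code(findings, fail_on="MEDIUM"))  # 1
```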
Severity Levels
| Level | Description | Typical rules |
|---|---|---|
| High | Critical vulnerabilities – likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources |
| Medium | Important issues – may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks |
| Low | Informational – code quality or weak signals | Weak crypto algorithms, insecure randomness, unwrap()/panic!(), type-safety escapes |
Non-production severity downgrade
By default, findings in paths matching common non-production patterns (tests/, test/, vendor/, build/, examples/, benchmarks/) are downgraded by one tier:
- High → Medium
- Medium → Low
- Low → Low (unchanged)
Use --keep-nonprod-severity to disable this behavior.
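The one-tier downgrade is easy to model. In the sketch below, `effective_severity` is a hypothetical helper, and matching on any directory component of the path is an assumption; the exact pattern matching Nyx applies is not specified on this page.

```python
from pathlib import PurePosixPath

NONPROD_DIRS = {"tests", "test", "vendor", "build", "examples", "benchmarks"}
DOWNGRADE = {"High": "Medium", "Medium": "Low", "Low": "Low"}


def effective_severity(path: str, severity: str,
                       keep_nonprod_severity: bool = False) -> str:
    """Apply the one-tier downgrade to findings under non-production directories."""
    if keep_nonprod_severity:
        return severity
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    if any(part in NONPROD_DIRS for part in directories):
        return DOWNGRADE[severity]
    return severity


print(effective_severity("tests/test_handler.py", "High"))  # Medium
print(effective_severity("src/handler.py", "High"))         # High
print(effective_severity("vendor/lib/x.go", "Low"))         # Low
```

Note the interaction with CI gates: a High finding under tests/ becomes Medium, so it no longer trips a High-only threshold unless --keep-nonprod-severity is set.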
Inline Suppressions
Suppress specific findings directly in source code using nyx:ignore comments. Suppressed findings are excluded from output, severity counts, and --fail-on checks by default.
Comment syntax
| Language | Comment styles |
|---|---|
| Rust, C, C++, Java, Go, JS, TS | // nyx:ignore ... or /* nyx:ignore ... */ |
| Python, Ruby | # nyx:ignore ... |
| PHP | // nyx:ignore ..., # nyx:ignore ..., or /* nyx:ignore ... */ |
Directive forms
x = dangerous() # nyx:ignore taint-unsanitised-flow ← suppresses this line
# nyx:ignore-next-line taint-unsanitised-flow
x = dangerous() ← suppresses this line
- nyx:ignore <RULE_ID> – suppresses findings on the same line as the comment.
- nyx:ignore-next-line <RULE_ID> – suppresses findings on the next line.
- For taint findings, the primary line is the sink line (the line field in output).
Rule ID matching
- Case-sensitive, exact match after canonicalization.
- Comma-separated: nyx:ignore rule-a, rule-b
- Wildcard suffix: nyx:ignore rs.quality.* matches any ID starting with rs.quality.
- Taint IDs are canonicalized: nyx:ignore taint-unsanitised-flow matches taint-unsanitised-flow (source 5:1) (parenthetical suffix stripped).
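The matching rules above fit in a few lines. The following is a sketch of those rules as described here, not Nyx's own matcher; `canonical_rule_id` and `directive_matches` are hypothetical names.

```python
import re


def canonical_rule_id(rule_id: str) -> str:
    """Strip a trailing parenthetical such as ' (source 5:11)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", rule_id).strip()


def directive_matches(patterns: str, rule_id: str) -> bool:
    """True if any comma-separated pattern in the directive matches the rule ID."""
    target = canonical_rule_id(rule_id)
    for raw in patterns.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        if pattern.endswith("*"):
            # Wildcard suffix: case-sensitive prefix match.
            if target.startswith(pattern[:-1]):
                return True
        elif pattern == target:
            # Otherwise: case-sensitive exact match.
            return True
    return False


print(directive_matches("taint-unsanitised-flow", "taint-unsanitised-flow (source 5:1)"))  # True
print(directive_matches("rs.quality.*", "rs.quality.unwrap"))                              # True
print(directive_matches("rule-a, rule-b", "rule-b"))                                       # True
print(directive_matches("RS.QUALITY.*", "rs.quality.unwrap"))                              # False
```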
Console behavior
- Default: suppressed findings are hidden entirely.
- --show-suppressed: suppressed findings appear dimmed with a [SUPPRESSED] tag. Summary shows "N issues (M suppressed)".
JSON / SARIF behavior
- Default: suppressed findings are excluded from JSON/SARIF output.
- --show-suppressed: suppressed findings are included with additional fields:
{
"suppressed": true,
"suppression": {
"kind": "SameLine",
"matched_pattern": "taint-unsanitised-flow",
"directive_line": 42
}
}
Exit code
Suppressed findings do not trigger --fail-on. A scan with only suppressed findings exits 0.
Rule ID Format
| Prefix | Detector | Example |
|---|---|---|
taint-* | Taint analysis | taint-unsanitised-flow (source 5:11) |
cfg-* | CFG structural | cfg-unguarded-sink, cfg-auth-gap |
state-* | State model | state-use-after-close, state-resource-leak |
<lang>.*.* | AST patterns | rs.memory.transmute, js.code_exec.eval |
See the Rule Reference for a complete listing.
Language Maturity Matrix
Nyx supports ten languages, but support depth is not uniform. This page gives an honest per-language picture so you can calibrate expectations before depending on Nyx for a given stack.
The classifications here are grounded in three concrete signals:
- Rule depth: how many distinct source / sanitizer / sink matchers exist for the language in src/labels/<lang>.rs, and how many vulnerability classes (Cap bits) those matchers cover.
- Benchmark results: rule-level precision / recall / F1 on the 433-case corpus in tests/benchmark/RESULTS.md, last measured 2026-04-29 with scanner version 0.5.0.
- Known weak spots: FPs and FNs the maintainers have deliberately left in the benchmark rather than suppressed, plus structural engine limitations the corpus does not stress, documented release-by-release in RESULTS.md.
As of 2026-04-29 the synthetic corpus has effectively saturated: every real-CVE fixture fires and rule-level recall is 100%. Nine of ten languages report rule-level F1 = 100.0%; Go reports 98.0% on the back of a single safe-fixture FP. Aggregate rule-level P=0.995, R=1.000, F1=0.998. That means F1 alone no longer differentiates tiers, so the differentiators are rule depth, gated-sink coverage, and structural idioms the corpus does not fully stress (deep pointer aliasing in C/C++, framework-specific context). All parser integrations use tree-sitter and are stable; parsing is not a differentiator.
Tier Summary
| Tier | Languages | F1 | What to expect |
|---|---|---|---|
| Stable | Python, JavaScript, TypeScript | 100% | Deep rule sets, gated sinks (argument-role-aware), framework detection, extensive fixtures, and the bulk of advanced-analysis (SSA two-level solve, context-sensitivity, symbolic execution, abstract interpretation) coverage. Safe to depend on in CI gates. |
| Beta | Go, Java, PHP, Ruby, Rust | 98.0% to 100% | Solid mid-depth rule sets with narrower cap coverage and no gated sinks. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates. |
| Preview | C, C++ | 100% on synthetic corpus | Recent work taught the engine to follow taint through std::vector / std::string / map containers (including c_str()), through fluent builder chains like Socket::builder().host(h).connect(), and through inline class member functions. Function pointers and deeper pointer aliasing through *p / p->field are still not tracked. Rule-level scores against a corpus of obvious unsafe-API uses look perfect, but that is not the same as a clean audit on a real codebase. Pair with clang-tidy, Clang Static Analyzer, or Infer. |
Per-Language Detail
Stable tier
Python: 100% P / 100% R / 100% F1 (46-case corpus)
- Rule depth: 5 source families, 7 sanitizer families, 21 sink matchers spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- Framework context: Flask, Django, argparse source matchers; flask_request import-alias support.
- Advanced analysis: gated sinks (Popen, subprocess.run/call with activation-arg awareness), most SSA-equivalence and symbolic-execution fixtures target Python.
- Fixtures: 125 under tests/fixtures/ plus 42 benchmark cases.
- Blind spots: f-string interpolation is not explicitly modeled as a distinct taint-producing construct; string-formatting flows are caught by the general concatenation path.
JavaScript: 100% P / 100% R / 100% F1 (42-case corpus)
- Rule depth: 3 source families, 10 sanitizer families, 24 sink matchers spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O.
- Advanced analysis: gated sinks (setAttribute, parseFromString), two-level SSA solve for top-level + per-function scopes (analyse_ssa_js_two_level), prefix-locked SSRF suppression via StringFact, abstract-interpretation interval tracking.
- Framework context: Express, Koa, Fastify (via in-file import scan when package.json is absent).
- Fixtures: 238 under tests/fixtures/; the largest fixture set of any language.
- Blind spots: template literals are lowered through concatenation rather than modeled as a first-class taint operator; dynamic property access (obj[user]) is conservatively treated.
TypeScript: 100% P / 100% R / 100% F1 (47-case corpus)
- Rule depth: Shares the JS ruleset (3 sources, 10 sanitizers, 24 sinks) plus TS-specific grammar handling.
- Advanced analysis: TSX and JSX grammars wired; discriminated-union narrowing, generic erasure, decorator flow, and interface dispatch are all validated against adversarial type-system stressors.
- Framework context: Fastify detection via detect_in_file_frameworks (import-driven, no package.json required).
- Fixtures: 39 test fixtures plus 42 benchmark cases.
- Blind spots: as any casts and any-typed flows are handled conservatively (treated as tainted).
Beta tier
Go: 96.2% P / 100.0% R / 98.0% F1 (53-case corpus, 1 FP, 0 FNs)
- Rule depth: 4 source families, 4 sanitizer families, 9 sink matchers covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O.
- Framework context: Gin, Echo source matchers.
- Open weak spots: one safe Go fixture (go-safe-009) draws a spurious CMDi finding.
- Known gaps: no gated sinks, no deserialization class. fmt.Sprintf is deliberately not a sink. Cap coverage is narrower than the Stable tier and argument-role-aware sink modeling is not yet implemented for Go, so production CI gates may surface additional FPs the corpus does not exercise.
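To see how a single false positive produces the Go figures, the standard precision / recall / F1 formulas can be applied directly. The true-positive count of 25 used below is not stated in this document; it is an assumption, chosen because it is the value that reproduces 96.2% precision with exactly one FP and no FNs.

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from rule-level confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# tp=25 is an inferred value (see the note above); fp=1 and fn=0 are from the text.
p, r, f1 = prf1(tp=25, fp=1, fn=0)
print(f"P={p:.1%} R={r:.1%} F1={f1:.1%}")  # P=96.2% R=100.0% F1=98.0%
```

With recall pinned at 100%, F1 moves roughly half as far from 100% as precision does, which is why one spurious finding costs about 2 points of F1 rather than 4.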
Java: 100% P / 100% R / 100% F1 (35-case corpus)
- Rule depth: 3 source families, 8 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization.
- Framework context: Spring, JPA, Hibernate ORM rules; JNDI injection sinks.
- Known gaps: no gated sinks. Variable-receiver method calls (client.send(...) vs HttpClient.send(...)) rely on type-qualified resolution from receiver-type inference; flows where the receiver type cannot be inferred are conservatively over-tainted on unusual builder chains.
PHP: 100% P / 100% R / 100% F1 (37-case corpus)
- Rule depth: 3 source families ($_GET, $_POST, $_REQUEST superglobals), 7 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- Known gaps: no gated sinks. Limited framework context (Laravel raw methods only). echo language-construct detection is wired but its inner-argument propagation is narrower than function-call sinks.
Ruby: 100% P / 100% R / 100% F1 (39-case corpus)
- Rule depth: 3 source families, 7 sanitizer families, 15 sink matchers covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- Framework context: Rails helpers (sanitize_sql, permit, require).
- Known gaps: string interpolation inside shell and SQL strings is recognized structurally but not modeled as a distinct operator. begin/rescue/ensure exception-edge wiring is documented as deferred (structurally incompatible with build_try()). The previous open rb-interproc-001 FN closed in the 2026-04-28 baseline after the Ruby Kernel#open CMDI sink and exact-match sigil work landed.
Rust: 100% P / 100% R / 100% F1 (70-case adversarial corpus)
Rust holds the largest per-language adversarial corpus and was promoted
from Experimental to Beta in the 2026-04-25 measurement after the PathFact
landings closed every previously-open rs-safe-* regression.
- Rule depth: 6 source families, 2 sanitizer families (prefix and type-coercion), 11 sink matchers covering HTML, Shell, SQL, SSRF, Deserialization, and File I/O. Extensive framework source coverage (Axum, Actix, Rocket); the most of any language on the source side. The narrow sanitizer count is the primary reason Rust is not in the Stable tier. Engine-side path/typed sanitizer recognition (PathFact) compensates, but the ruleset itself is shallow.
- Recent additions: SQL class (rusqlite, sqlx, diesel, postgres), Deserialization class (serde_yaml, bincode, rmp_serde, ciborium, ron, toml), expanded file I/O (fs::remove_file/dir/rename/copy), reqwest SSRF builder chain.
- Closed by recent PathFact landings (src/abstract_interp/path_domain.rs + per-return-path PathFact entries on SsaFuncSummary): rs-safe-007 (.replace("..","") sanitiser), rs-safe-008 (negative-validation return), rs-safe-009 (match-arm guards via condition lifting), rs-safe-010 (static-map lookup), rs-safe-012 (.contains("..") + .starts_with('/') rejection), rs-safe-014 (Option-returning user sanitiser), rs-safe-015 (Path::new(p).is_absolute() typed rejection), rs-safe-016 (cross-function .contains("..") rejection), and CVE patches CVE-2018-20997, CVE-2022-36113, CVE-2024-24576.
- Not yet covered: unsafe FFI / std::mem::transmute (no rules), Tokio process::Command async variants (not distinguished from sync), hyper / surf / ureq SSRF clients (reqwest family only).
Preview tier
C and C++ remain Preview despite reporting 100% rule-level F1 on the
synthetic corpus. A run of additions in late April taught the engine to
follow taint through several constructs that used to be hard cutoffs (STL
containers, builder chains, inline member functions, the wider std::sto*
family), so the gap between “passes the synthetic corpus” and “would catch
the same flow on a real codebase” is narrower than it used to be. It is not
zero. The biggest remaining gaps are deep pointer aliasing and function
pointers, both of which are pervasive in real C/C++ code. Treat a clean
report as a starting point, not an audit. Pair Nyx with clang-tidy, the
Clang Static Analyzer, or Infer for production use.
What now works (added in late April):
- STL container flow. vec.push_back(tainted) followed by vec.front().c_str() carries taint into a downstream system() sink. std::map::insert_or_assign, find, count, at, and data all participate in the container store/load model.
- Inline class member functions. class C { void run(...) { ... } }; bodies are now extracted as their own functions, so an intra-file call like inner.run(input) resolves to the body summary. Same fix covers struct_specifier, union_specifier, enum_specifier, template_declaration, and extern "C" blocks.
- Lambda passthrough. auto echo = [](const char* s) { return s; }; carries argument taint into the result via the engine’s default call-argument propagation.
- Builder chains. Socket::builder().host(user).port(8080).connect() resolves the chained returns and fires on .connect() when user is tainted; the safe variant with a hardcoded host stays quiet.
- Wider numeric sanitizer family. The full std::sto* set (including stoll, stoull, stold) and the C-stdlib forms (atoi, atof, strtol, etc.) clear all caps when they’re called.
- More header / source extensions. .cc, .cxx, .hpp, .hxx, .hh, and .h++ are recognized as C++ on top of .cpp and .c++. .h is intentionally still routed to C since it’s ambiguous without a build system.
Still not modeled (common to both C and C++):
- Deep pointer aliasing. Taint through *p, p->field, and arbitrary pointer arithmetic is not tracked through arbitrary aliased writes. Field-sensitive points-to (see Advanced analysis) handles the “lock on a sub-field” case but is not a general escape analysis.
- Function pointers and callback dispatch. An indirect call through void (*fn)(char *) resolves to no callee, so cross-pointer flows are invisible.
- Array-element taint by index. Writes to buf[i] do not always propagate taint to buf as a whole; the recent subscript-handling work helps the general case but doesn’t make buf an alias for every element.
- Nested classes beyond one level (C++ only).
C: 100% P / 100% R / 100% F1 (30-case corpus)
- Rule depth: 3 source families, 2 sanitizer families (the sanitize_* prefix and numeric-parse functions), 5 sink matchers spanning Shell, File, SSRF, and Format-String.
- Known gaps: no framework rules, no gated sinks. The structural limitations listed above are the dominant concern; rule additions alone will not lift this language out of the Preview tier.
C++: 100% P / 100% R / 100% F1 (33-case corpus, plus 6 new fixtures for STL / builder / inline-method flows)
- Rule depth: Builds on the C ruleset with std::cin / std::getline sources and a wider numeric-sanitizer set covering the full std::sto* family (3 sources, 3 sanitizer families, 5 sinks).
- Known gaps: still no framework rules and no gated sinks. The structural blind spots are now narrower than they were a release ago (see “What now works” above), but function pointers and the harder pointer-aliasing patterns still produce false negatives.
How the tiers were assigned
Because rule-level F1 has saturated for nine of ten languages, the tier boundaries are drawn primarily on rule depth and engine coverage of real-world idioms rather than on benchmark scores alone.
A language lands in Stable when all three hold:
- Rule set covers ≥ 8 vulnerability classes with both source and sink matchers, and at least one class has argument-role-aware gated-sink modeling (e.g. setAttribute("href", url) only flags href-like attrs).
- Benchmark F1 ≥ 95% on a corpus of ≥ 25 cases.
- Advanced analysis (SSA lowering, context-sensitivity, symbolic execution, abstract interpretation) is exercised by fixtures for the language.
A language lands in Beta when benchmark F1 is in the mid-90s or higher on a meaningful corpus but at least one Stable criterion fails. Typical gaps: absence of gated sinks, or sanitizer rule depth narrow enough that the engine compensates structurally rather than via the ruleset.
A language lands in Preview when the engine has documented structural blind spots for constructs that are pervasive in typical codebases for that language. For C and C++ that means deep pointer aliasing, function pointers, and array-element taint; STL container flow and builder chains have moved out of the blind-spot list. Synthetic-corpus F1 is not a reliable signal for Preview-tier languages: a clean report can coexist with structural gaps.
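The three tier rules can be read as a decision procedure. The sketch below is one possible encoding, not the maintainers' process: the `LanguageProfile` fields, the order in which the checks run, and the 0.95 floor standing in for "mid-90s or higher" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LanguageProfile:
    vuln_classes: int             # classes with both source and sink matchers
    has_gated_sinks: bool         # argument-role-aware sink modeling exists
    f1: float                     # rule-level F1 as a fraction, e.g. 0.98
    corpus_cases: int
    advanced_fixtures: bool       # advanced analysis exercised by fixtures
    structural_blind_spots: bool  # pervasive constructs the engine cannot follow


def assign_tier(lang: LanguageProfile, beta_f1_floor: float = 0.95) -> Optional[str]:
    """Return 'Stable', 'Beta', 'Preview', or None if no tier applies."""
    # Assumed ordering: structural blind spots take precedence over scores.
    if lang.structural_blind_spots:
        return "Preview"
    stable = (
        lang.vuln_classes >= 8 and lang.has_gated_sinks
        and lang.f1 >= 0.95 and lang.corpus_cases >= 25
        and lang.advanced_fixtures
    )
    if stable:
        return "Stable"
    if lang.f1 >= beta_f1_floor:  # assumed reading of "mid-90s or higher"
        return "Beta"
    return None


# Illustrative inputs only; they are not measured values.
print(assign_tier(LanguageProfile(8, True, 1.0, 46, True, False)))    # Stable
print(assign_tier(LanguageProfile(6, False, 1.0, 70, True, False)))   # Beta
print(assign_tier(LanguageProfile(4, False, 1.0, 30, False, True)))   # Preview
```

The third call shows why a perfect F1 does not rescue a Preview-tier language: the blind-spot check runs before any score is consulted.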
(The previous Experimental tier was retired in the 2026-04-25 measurement when Rust’s adversarial corpus reached 100% F1; no language currently sits in that tier.)
What this means for you
- CI gates: safe to set strict --fail-on HIGH gates on Stable-tier languages. On Beta-tier, expect occasional FP triage on production code (the synthetic corpus does not cover every framework idiom); the weak-spot lists above tell you what to skim for. On Preview-tier, treat Nyx findings as a starting point for manual review rather than authoritative. STL container flow and builder chains are tracked now, but deep pointer aliasing and function pointers are not, so a clean report does not tell you what the engine could not see.
- Rule contributions: the shortest path to raising a language’s tier is contributing sink matchers and gated-sink registrations. Label files live at src/labels/<lang>.rs; benchmark cases live at tests/benchmark/corpus/<lang>/.
- Scope planning: if your primary stack is C or C++, Nyx will surface real findings on obvious unsafe-API uses, but budget for review time and combine Nyx with clang-tidy or the Clang Static Analyzer. Rust is now Beta-tier and suitable as a CI gate; pair with cargo-audit for dependency CVEs.
The benchmark thresholds in tests/benchmark_test.rs are deliberately set
~5 pp below current baselines, so a drop beyond that margin in a language’s F1 fails CI. Tier
promotions require sustained benchmark performance, not just rule additions.
Rule reference
Every finding Nyx emits has a rule ID. This page enumerates the IDs that ship with scanner 0.5.0, grouped by family.
This page is written by hand and can drift from the code. Authoritative sources: src/patterns/<lang>.rs for AST patterns, src/labels/<lang>.rs for taint matchers, and src/auth_analysis/config.rs for auth rules. If a rule fires that isn’t listed here, the source file is right and this page is wrong.
If you’d rather browse rules interactively, nyx serve ships a Rules page that lists every loaded matcher with its language, kind, and capability:

ID format
| Prefix | Detector | Example |
|---|---|---|
| taint-* | Taint analysis | taint-unsanitised-flow (source 5:11) |
| cfg-* | CFG structural | cfg-unguarded-sink, cfg-auth-gap |
| state-* | State model | state-use-after-close, state-resource-leak |
| <lang>.auth.* | Auth analysis | rs.auth.missing_ownership_check |
| <lang>.<category>.<name> | AST patterns | rs.memory.transmute, js.code_exec.eval |
Language prefixes: rs, c, cpp, go, java, js, ts, py, php, rb.
Cross-language rules
Taint
One rule covers every source-to-sink flow. The parenthetical identifies the source location.
| Rule ID | Severity |
|---|---|
| taint-unsanitised-flow (source L:C) | Varies by source kind and sink capability |
The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in src/labels/<lang>.rs. Language maturity gives per-language counts and what’s covered.
CFG structural
| Rule ID | Severity |
|---|---|
| cfg-unguarded-sink | High/Medium |
| cfg-auth-gap | High |
| cfg-unreachable-sink | Medium |
| cfg-unreachable-sanitizer | Low |
| cfg-unreachable-source | Low |
| cfg-error-fallthrough | High/Medium |
| cfg-resource-leak | Medium |
| cfg-lock-not-released | Medium |
State model
| Rule ID | Severity |
|---|---|
| state-use-after-close | High |
| state-double-close | Medium |
| state-resource-leak | Medium |
| state-resource-leak-possible | Low |
| state-unauthed-access | High |
Auth analysis (Rust only, today)
| Rule ID | Severity |
|---|---|
| rs.auth.missing_ownership_check | High |
| rs.auth.missing_ownership_check.taint | High (gated by scanner.enable_auth_as_taint) |
See auth.md for scope, the five sink-classes, and tuning.
AST patterns by language
Each language ships a tree-sitter pattern registry. Structural match on the pattern, no dataflow. Some patterns also have a Tier B heuristic guard (e.g. SQL execute must receive a concatenation, not a literal) noted in the registry.
The tables below are generated from src/patterns/<lang>.rs by tools/docgen. Run cargo run --features docgen --bin nyx-docgen after changing the registry to refresh them.
C: 8 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| c.cmdi.system | High | A | High |
| c.memory.gets | High | A | High |
| c.memory.printf_no_fmt | High | B | Medium |
| c.memory.scanf_percent_s | High | A | High |
| c.memory.sprintf | High | A | High |
| c.memory.strcat | High | A | High |
| c.memory.strcpy | High | A | High |
| c.cmdi.popen | Medium | A | High |
C++: 9 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| cpp.cmdi.popen | High | A | High |
| cpp.cmdi.system | High | A | High |
| cpp.memory.gets | High | A | High |
| cpp.memory.printf_no_fmt | High | B | Medium |
| cpp.memory.sprintf | High | A | High |
| cpp.memory.strcat | High | A | High |
| cpp.memory.strcpy | High | A | High |
| cpp.memory.const_cast | Medium | A | High |
| cpp.memory.reinterpret_cast | Medium | A | High |
Go: 8 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| go.cmdi.exec_command | High | A | High |
| go.transport.insecure_skip_verify | High | A | High |
| go.deser.gob_decode | Medium | A | High |
| go.memory.unsafe_pointer | Medium | A | High |
| go.secrets.hardcoded_key | Medium | A | High |
| go.sqli.query_concat | Medium | B | Medium |
| go.crypto.md5 | Low | A | Medium |
| go.crypto.sha1 | Low | A | Medium |
Java: 8 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| java.cmdi.runtime_exec | High | A | High |
| java.deser.readobject | High | A | High |
| java.reflection.class_forname | Medium | A | High |
| java.reflection.method_invoke | Medium | A | High |
| java.sqli.execute_concat | Medium | B | Medium |
| java.xss.getwriter_print | Medium | A | High |
| java.crypto.insecure_random | Low | A | Medium |
| java.crypto.weak_digest | Low | A | Medium |
JavaScript: 22 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| js.code_exec.eval | High | A | High |
| js.code_exec.new_function | High | A | High |
| js.config.cors_dynamic_origin | High | A | Medium |
| js.code_exec.settimeout_string | Medium | A | High |
| js.config.insecure_session_httponly | Medium | A | High |
| js.config.reject_unauthorized | Medium | A | High |
| js.config.verbose_error_response | Medium | A | Medium |
| js.crypto.weak_hash_import | Medium | A | Medium |
| js.prototype.extend_object | Medium | A | High |
| js.prototype.proto_assignment | Medium | A | High |
| js.secrets.fallback_secret | Medium | A | Medium |
| js.xss.cookie_write | Medium | A | High |
| js.xss.document_write | Medium | A | High |
| js.xss.insert_adjacent_html | Medium | A | High |
| js.xss.location_assign | Medium | A | High |
| js.xss.outer_html | Medium | A | High |
| js.config.insecure_session_samesite | Low | A | High |
| js.config.insecure_session_secure | Low | A | Medium |
| js.crypto.math_random | Low | A | Medium |
| js.crypto.weak_hash | Low | A | Medium |
| js.secrets.hardcoded_secret | Low | A | Medium |
| js.transport.fetch_http | Low | A | Medium |
PHP: 11 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| php.cmdi.system | High | A | High |
| php.code_exec.assert_string | High | A | High |
| php.code_exec.create_function | High | A | High |
| php.code_exec.eval | High | A | High |
| php.code_exec.preg_replace_e | High | A | High |
| php.deser.unserialize | High | A | High |
| php.path.include_variable | High | B | Medium |
| php.sqli.query_concat | Medium | B | Medium |
| php.crypto.md5 | Low | A | Medium |
| php.crypto.rand | Low | A | Medium |
| php.crypto.sha1 | Low | A | Medium |
Python: 13 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| py.cmdi.os_popen | High | A | High |
| py.cmdi.os_system | High | A | High |
| py.cmdi.subprocess_shell | High | B | Medium |
| py.code_exec.eval | High | A | High |
| py.code_exec.exec | High | A | High |
| py.deser.pickle_loads | High | A | High |
| py.deser.yaml_load | High | A | High |
| py.code_exec.compile | Medium | A | High |
| py.deser.shelve_open | Medium | A | High |
| py.sqli.execute_format | Medium | B | Medium |
| py.xss.jinja_from_string | Medium | A | High |
| py.crypto.md5 | Low | A | Medium |
| py.crypto.sha1 | Low | A | Medium |
Ruby: 11 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| rb.cmdi.backtick | High | A | High |
| rb.cmdi.system_interp | High | A | High |
| rb.code_exec.class_eval | High | A | High |
| rb.code_exec.eval | High | A | High |
| rb.code_exec.instance_eval | High | A | High |
| rb.deser.marshal_load | High | A | High |
| rb.deser.yaml_load | High | A | High |
| rb.reflection.constantize | Medium | A | High |
| rb.reflection.send_dynamic | Medium | B | Medium |
| rb.ssrf.open_uri | Medium | A | High |
| rb.crypto.md5 | Low | A | Medium |
Rust: 13 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| rs.memory.copy_nonoverlapping | High | A | High |
| rs.memory.get_unchecked | High | A | High |
| rs.memory.mem_zeroed | High | A | High |
| rs.memory.ptr_read | High | A | High |
| rs.memory.transmute | High | A | High |
| rs.quality.unsafe_block | Medium | A | High |
| rs.quality.unsafe_fn | Medium | A | High |
| rs.memory.mem_forget | Low | A | High |
| rs.memory.narrow_cast | Low | A | Medium |
| rs.quality.expect | Low | A | High |
| rs.quality.panic_macro | Low | A | High |
| rs.quality.todo | Low | A | High |
| rs.quality.unwrap | Low | A | High |
TypeScript: 22 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| ts.code_exec.eval | High | A | High |
| ts.code_exec.new_function | High | A | High |
| ts.config.cors_dynamic_origin | High | A | Medium |
| ts.code_exec.settimeout_string | Medium | A | High |
| ts.config.insecure_session_httponly | Medium | A | High |
| ts.config.reject_unauthorized | Medium | A | High |
| ts.config.verbose_error_response | Medium | A | Medium |
| ts.crypto.weak_hash_import | Medium | A | Medium |
| ts.prototype.proto_assignment | Medium | A | High |
| ts.secrets.fallback_secret | Medium | A | Medium |
| ts.xss.document_write | Medium | A | High |
| ts.xss.insert_adjacent_html | Medium | A | High |
| ts.xss.location_assign | Medium | A | High |
| ts.xss.outer_html | Medium | A | High |
| ts.config.insecure_session_samesite | Low | A | High |
| ts.config.insecure_session_secure | Low | A | Medium |
| ts.crypto.math_random | Low | A | Medium |
| ts.crypto.weak_hash | Low | A | Medium |
| ts.quality.any_annotation | Low | A | Medium |
| ts.quality.as_any | Low | A | Medium |
| ts.secrets.hardcoded_secret | Low | A | Medium |
| ts.xss.cookie_write | Low | A | Medium |
Capability list for custom rules
nyx config add-rule --cap <name> and [analysis.languages.*.rules] in config accept:
env_var, html_escape, shell_escape, url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, ssrf, code_exec, crypto, unauthorized_id, all
Source for both the enum and the to_cap mapping: src/labels/mod.rs (Cap) and src/utils/config.rs (CapName).
Auth analysis
Rust today. Other languages have rule scaffolding in src/auth_analysis/config.rs (Python, Ruby, Go, Java, JavaScript, TypeScript), but only Rust has benchmark corpus coverage and the precision work to back it. Treat findings on other languages as preview; the rule prefix (py.auth.*, js.auth.*, rb.auth.*, go.auth.*, java.auth.*) is reserved but the matchers haven’t been validated against real codebases yet.
What it catches
The Rust rule is rs.auth.missing_ownership_check. It fires when a request handler reaches a privileged operation that takes a scoped identifier (*_id, row reference, scoped resource) without a preceding ownership or membership check.
Concretely, it looks for five patterns of authorization in the function body and flags the call when none are present:
- A call to a recognised authorization helper. Defaults: check_ownership, has_ownership, require_ownership, ensure_ownership, is_owner, authorize, verify_access, has_permission, can_access, can_manage, plus *_membership and require_{group,org,workspace,tenant,team}_member variants. Extend in [analysis.languages.rust].
- An ownership-equality check on a row reference: if owner_id != user.id { return 403 } or any field_id != self_actor shape. The check writes AuthCheck evidence back to the row-fetch arguments via AnalysisUnit.row_field_vars.
- A self-actor reference: let user = require_auth(...).await? followed by use of user.id, user.user_id, user.uid. The actor is recognised from typed extractor params (Extension<Session>, CurrentUser, etc.) and from typed helper bindings.
- A SQL query that joins through an ACL table or filters by user_id predicate. Detected without a SQL parser via sql_semantics.rs; the authorized result variable propagates through let row = ...prepare(LIT)..., for row in result, let id = row.get(...).
- A helper-summary lift: handler calls validate_target(db, widget_id, user.id) whose body contains a require_*_member call. Cross-function summaries are merged at fixed-point (capped at 4 iterations).
Sink classification
The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag:
| Class | Examples | Default treatment |
|---|---|---|
| InMemoryLocal | map.insert, set.insert, vec.push on tracked local | Never a sink |
| RealtimePublish | realtime.publish_to_group, pubsub.send | Sink unless ownership is established for the channel scope |
| OutboundNetwork | http.post, reqwest::Client::post | Sink unless a sanitiser is on the path |
| CacheCrossTenant | redis.set, memcached.set with scoped keys | Sink unless tenant is checked |
| DbMutation | db.insert, repo.save with scoped IDs | Sink unless ownership is established |
| DbCrossTenantRead | db.query returning rows from a tenant scope | Sink unless ACL-join or tenant predicate is present |
Receiver type drives the classification when SSA type facts are available, so client.send(...) correctly resolves through the receiver’s inferred type.
What it can’t catch
- Non-Rust frameworks, in practice. Scaffolding exists; coverage doesn’t.
- Type-system authorization. A typestate pattern that makes unauthenticated handlers fail to compile (fn endpoint(user: AuthenticatedUser<Admin>)) is invisible. This is mostly fine because the type system already enforced the check, but the rule won’t credit it.
- Authorization performed only via macros that the AST doesn’t expose as a recognisable call.
- Cross-async-boundary actor binding. If the handler binds let user = require_auth(...).await? and then uses user.id inside a task started with tokio::spawn, the spawn body is treated as a separate scope.
The taint-based variant
A second rule, rs.auth.missing_ownership_check.taint, folds the same logic into the SSA/taint engine using the Cap::UNAUTHORIZED_ID capability (bit 12). Request-bound handler parameters seed UNAUTHORIZED_ID into taint state; ownership checks act as sanitizers that strip the cap; sinks that take scoped IDs require it absent.
This path is off by default while the standalone analyser carries the stable signal. Enable both:
[scanner]
enable_auth_as_taint = true
Run them together; if both fire for the same site, treat it as the same finding (the taint variant carries fuller flow evidence).
Tuning
Add a project-specific authorization helper
[[analysis.languages.rust.rules]]
matchers = ["require_subscription", "ensure_paid_seat"]
kind = "sanitizer"
cap = "unauthorized_id"
The same rule recognised in the standalone analyser also strips Cap::UNAUTHORIZED_ID for the taint-based variant.
Recognised actor names
Recognised by default: user.id, user.user_id, user.uid, session.user_id, current_user.id, plus typed extractor parameters with CurrentUser, SessionUser, AuthUser, Extension<...> shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic is in src/auth_analysis/checks.rs under extract_validation_target and friends.
Suppress
Inline:
db.insert(widget_id, value)?; // nyx:ignore rs.auth.missing_ownership_check
Or filter by severity / confidence in CI:
nyx scan . --severity ">=MEDIUM" --min-confidence medium
In the UI
Auth findings render alongside taint findings in the browser UI. The flow visualiser shows the sink call, the actor reference (when one was found), and any helper-summary path the engine traversed; the How to fix panel mirrors the rule’s recommendation.

Where the work was done
The remediation work is documented release-by-release in tests/benchmark/RESULTS.md under the Rust auth row. Phases A1 through B5 (precision and structural improvements) and Phase C (taint-based variant) all landed on the 0.5.0 release branch. The benchmark corpus at tests/benchmark/corpus/rust/auth/ is 10 fixtures covering the five FP patterns plus a true-positive control.
How Nyx works
If you’re going to act on a finding, it helps to know how the scanner got there. This page is the short version. Source paths are linked where the answer to “exactly what does it do” lives in the code.
The pipeline
A scan runs in two passes over the file tree, with an optional SQLite index that lets later scans skip files whose content hash hasn’t changed.
Pass 1, per file. Tree-sitter parses the file. Nyx builds an intra-procedural control-flow graph, lowers it to SSA, and extracts a summary per function describing what that function does at the boundary: which arguments flow to sinks, which sources it reads from, which sinks it calls, what taint it strips, what it returns. Summaries are persisted to SQLite (src/summary/, src/database.rs).
Summary merge. All per-file summaries get unioned into a global map keyed by qualified function name.
Pass 2, per file. Each file is reanalysed with the global summaries available. The taint engine runs a forward dataflow worklist over the SSA representation. When it hits a call, it consults summaries to decide whether the call propagates taint, sanitizes it, or terminates the flow. Findings are produced when tainted data reaches a sink whose required capability is still set on the value.
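The capability check at the end of pass 2 can be pictured as bitmask arithmetic. The sketch below is illustrative only: the Caps type, its two bit positions, and the CallEffect enum are hypothetical stand-ins rather than Nyx's actual Cap type or summary format, and it follows one value through a straight-line call sequence instead of running a worklist.

```rust
/// Hypothetical capability bitmask; Nyx's real Cap type lives in src/labels/mod.rs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Caps(pub u16);

impl Caps {
    // Bit positions are illustrative assumptions, not Nyx's actual layout.
    pub const SHELL: Caps = Caps(1 << 0);
    pub const SQL: Caps = Caps(1 << 1);

    pub fn union(self, other: Caps) -> Caps {
        Caps(self.0 | other.0)
    }
    pub fn without(self, other: Caps) -> Caps {
        Caps(self.0 & !other.0)
    }
    pub fn intersects(self, other: Caps) -> bool {
        (self.0 & other.0) != 0
    }
}

/// What a call does to the taint on the value passing through it (assumed shape).
pub enum CallEffect {
    /// Taint passes through unchanged.
    Propagate,
    /// The call strips the listed capabilities from the value.
    Sanitize(Caps),
    /// The call is a sink that fires if any listed capability is still set.
    Sink(Caps),
}

/// Follows one tainted value through a sequence of calls and returns the
/// index of the first sink that fires, if any.
pub fn first_finding(source_caps: Caps, calls: &[CallEffect]) -> Option<usize> {
    let mut taint = source_caps;
    for (i, call) in calls.iter().enumerate() {
        match call {
            CallEffect::Propagate => {}
            CallEffect::Sanitize(stripped) => taint = taint.without(*stripped),
            CallEffect::Sink(required) => {
                if taint.intersects(*required) {
                    return Some(i);
                }
            }
        }
    }
    None
}
```

In this model a shell-escape sanitizer silences a shell sink but leaves a SQL sink live, which is why sanitizers and sinks are matched per capability rather than with a single tainted/clean flag.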
Two extra layers tune precision around calls. Context-sensitive inlining (k=1) re-runs intra-file callees with the actual argument taint at the call site, so a helper called once with tainted input and once with sanitized input produces the right result for each call. SCC fixed-point: when a group of mutually-recursive functions forms a strongly-connected component in the call graph, the engine iterates summaries to a joint fixed-point (capped at 64 iterations). SCCs that span files are also handled.
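A capped fixed-point over a recursive cluster can be sketched as follows. This is a simplified model, not Nyx's summary format: each summary is reduced to one hypothetical bit ("the return value carries taint"), and FuncBody and solve_scc are names invented for the example. Only the 64-iteration cap comes from the text above.

```rust
/// Hypothetical one-bit function body: enough to show the iteration scheme.
pub struct FuncBody {
    /// The function reads a taint source and returns it.
    pub reads_source: bool,
    /// Indices of callees whose return value flows into this function's return.
    pub returns_result_of: Vec<usize>,
}

/// Iteration cap quoted in the surrounding text.
pub const MAX_SCC_ITERATIONS: usize = 64;

/// Iterates the summaries of one strongly-connected component until nothing
/// changes or the cap is hit. Returns the summaries and whether they converged.
pub fn solve_scc(funcs: &[FuncBody]) -> (Vec<bool>, bool) {
    let mut returns_taint = vec![false; funcs.len()];
    for _ in 0..MAX_SCC_ITERATIONS {
        let mut changed = false;
        for (i, f) in funcs.iter().enumerate() {
            // A summary only ever moves from false to true, so the loop terminates.
            let new = f.reads_source || f.returns_result_of.iter().any(|&c| returns_taint[c]);
            if new != returns_taint[i] {
                returns_taint[i] = new;
                changed = true;
            }
        }
        if !changed {
            return (returns_taint, true);
        }
    }
    // Cap hit: hand back the best summaries so far and report non-convergence.
    (returns_taint, false)
}
```

The boolean returned alongside the summaries plays the role of the engine note: a caller can keep the result but mark dependent findings as coming from a non-converged cluster.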
When a method call has a receiver typed as a super-class, trait, or interface, hierarchy fan-out widens the resolved callee set to every concrete implementer the engine has seen. A class diagram extracted in pass 1 (Java extends/implements, Rust impl-for, TS/JS extends, Python bases, Ruby includes, PHP extends/implements, C++ inheritance) feeds an index that the call resolver consults during pass 2. The fan-out is capped at 8 implementers per call site; over-fanning is a precision tax, not a soundness issue.
A separate field-sensitive points-to pass tracks abstract locations down to the field level, so c.mu.Lock() is a lock on Field(c, mu) rather than on c as a whole. That distinction is what lets the resource-lifecycle and taint passes tell obj.field = tainted; sink(obj.other_field) apart from the conservative whole-variable approximation. Subscript reads and writes (arr[i], map[k] = v) lower to synthetic __index_get__ / __index_set__ calls so the same container model handles them. Set NYX_POINTER_ANALYSIS=0 to fall back to the pre-pointer-pass behaviour for one release if you need to compare baselines.
Optional analyses on top
These run on top of the forward taint pass. They’re independently switchable via [analysis.engine] config or matching CLI flags. See advanced-analysis.md for the full description and tradeoffs.
| Pass | Purpose | Default |
|---|---|---|
| Abstract interpretation | Carries interval and string prefix/suffix bounds alongside taint. Suppresses findings on proven-bounded integers and locked-prefix URLs | on |
| Context sensitivity | k=1 inlining for intra-file callees | on |
| Field-sensitive points-to | Distinguishes obj.field from obj itself, so a tainted write to one field does not poison reads from another. Also gives the resource-lifecycle pass per-field locks | on |
| Hierarchy fan-out | When a method call’s receiver is typed as a super-class, trait, or interface, widens callee resolution to every concrete implementer the engine has seen | on |
| Constraint solving | Drops paths whose accumulated branch predicates are unsatisfiable. Optional Z3 backend with --features smt | on |
| Symbolic execution | Builds an expression tree per tainted value. Produces a witness string at the sink. Detects sanitization patterns the taint engine alone would miss | on |
| Backwards analysis | After the forward pass, walks backwards from each sink to confirm or invalidate the flow. Annotates findings as backwards-confirmed, backwards-infeasible, or backwards-budget-exhausted | off |
--engine-profile fast | balanced | deep flips groups of these at once. balanced is the default and the configuration the benchmark numbers in language-maturity.md are measured against.
Where bounds live
Static analysis at scale means choosing where to stop. Nyx exposes its bounds rather than hiding them:
- Inline depth is k=1. Callees larger than the inline body-size cap fall back to summary-based resolution.
- SCC fixed-point is capped at 64 iterations. If a recursive cluster doesn’t converge, the engine emits the best summary it has and records an engine_note on affected findings.
- Lattice width is bounded. Taint origin sets cap at 32 entries per SSA value (--max-origins); points-to sets cap at 32 heap objects (--max-pointsto). Truncation is recorded as OriginsTruncated / PointsToTruncated so you can see when precision was lost.
- Symbolic expressions cap at depth 32. Deeper expressions degrade to Unknown rather than growing without bound.
Findings whose engine notes indicate a bound was hit can be filtered with --require-converged for strict CI gates. The flag drops over-reports and bails; under-reports (where the emitted finding is still real but the result set is a lower bound) are kept.
What you get out
Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.
For the JSON shape and SARIF mapping, see output.md.
Advanced Analysis
Nyx layers several analysis passes on top of the core SSA taint engine.
Most are switchable via config ([analysis.engine] in nyx.conf /
nyx.local), a matching CLI flag pair, or, as a last-resort override for
library users with no CLI entry point, a NYX_* environment variable. The
five precision-tuning passes (abstract interpretation, context sensitivity,
symbolic execution, constraint solving, field-sensitive points-to) are
on by default because the benchmark numbers in
language-maturity.md are measured with them on.
The demand-driven backwards walk and hierarchy fan-out sit alongside but
are not user-toggleable in the same way.
See Configuration for the full config
surface and CLI flag table. This page explains what each pass does, why it
helps, how to disable it, and what it does not cover.
Abstract interpretation
What it does. Propagates interval and string abstract domains through the
SSA worklist alongside taint. Integer values carry [lo, hi] bounds;
string values carry a prefix and suffix (plus a bit domain for known-zero /
known-one bits). Values are joined at merge points and widened at loop
heads so the worklist always terminates.
Why it helps. Lets Nyx suppress findings that are provably safe given the abstract value: a proven-bounded integer flowing into a SQL sink is not an injection risk, and an SSRF sink whose URL prefix is locked to a trusted host stays quiet. This turns a large class of FPs on numeric and locked-prefix paths into true negatives.
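The join-and-widen behaviour of the interval domain can be illustrated with a small sketch. The Interval type and its methods are hypothetical and far simpler than whatever src/abstract_interp/ implements; in particular, using i64::MIN and i64::MAX as the "unbounded" markers is an assumption made for the example.

```rust
/// Hypothetical integer interval [lo, hi]; i64::MIN / i64::MAX stand for "unbounded".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Interval {
    pub lo: i64,
    pub hi: i64,
}

impl Interval {
    pub fn new(lo: i64, hi: i64) -> Self {
        Interval { lo, hi }
    }

    /// Join at a control-flow merge: the smallest interval covering both inputs.
    pub fn join(self, other: Interval) -> Interval {
        Interval {
            lo: self.lo.min(other.lo),
            hi: self.hi.max(other.hi),
        }
    }

    /// Widening at a loop head: a bound that moved since the previous
    /// iteration is dropped to unbounded instead of being chased further.
    pub fn widen(self, next: Interval) -> Interval {
        Interval {
            lo: if next.lo < self.lo { i64::MIN } else { self.lo },
            hi: if next.hi > self.hi { i64::MAX } else { self.hi },
        }
    }

    /// Both ends are known, so the value counts as "proven bounded" in this sketch.
    pub fn is_bounded(self) -> bool {
        self.lo != i64::MIN && self.hi != i64::MAX
    }
}
```

Joining [0, 10] and [5, 20] keeps both bounds, so the value is still bounded; a loop counter that grows from [0, 0] to [0, 1] loses its upper bound on widening and would no longer justify suppressing a finding.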
How to turn it off.
| Surface | Value |
|---|---|
| Config | abstract_interpretation = false under [analysis.engine] |
| CLI flag | --no-abstract-interp |
| Env var (legacy) | NYX_ABSTRACT_INTERP=0 |
Limitations. The interval domain is 64-bit signed; very wide or
overflow-producing arithmetic degrades to ⊤ (unbounded). String prefix /
suffix tracking is concat-only; it does not model reordering, reversal, or
character-level regex constraints. Loop widening deliberately drops
changing bounds rather than chasing fixpoints.
Source: src/abstract_interp/.
Context-sensitive analysis
What it does. Adds k=1 call-site-sensitive taint propagation for
intra-file callees. When a function is invoked, Nyx reanalyzes the callee
body with the actual per-argument taint signature of the call site,
producing call-site-specific return taint. Results are cached by
(function_name, ArgTaintSig) so repeated calls with the same signature
are free.
Why it helps. A helper called once with a tainted argument and once with a sanitized argument produces a different result at each call site; without k=1 sensitivity, the conservative union of both call sites would be applied to the sanitized call, producing a spurious finding there.
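The caching scheme can be sketched as a map keyed by callee name and per-argument taint. Everything here is a hypothetical stand-in: InlineCacheSketch is not Nyx's InlineCache, the argument signature is reduced to one u16 bitmask per argument, and the callee "analysis" is a closure supplied by the caller.

```rust
use std::collections::HashMap;

/// Hypothetical per-argument taint signature: one capability bitmask per argument.
type ArgSig = Vec<u16>;

/// Sketch of a call-site-sensitive result cache keyed by (callee name, signature).
pub struct InlineCacheSketch {
    cache: HashMap<(String, ArgSig), u16>,
    /// Number of times a callee body actually had to be re-analysed.
    pub analyses_run: usize,
}

impl InlineCacheSketch {
    pub fn new() -> Self {
        InlineCacheSketch {
            cache: HashMap::new(),
            analyses_run: 0,
        }
    }

    /// Returns the return-value taint of `callee` for this call site's
    /// argument taint, re-running `analyse` only on a cache miss.
    pub fn return_taint(
        &mut self,
        callee: &str,
        args: &[u16],
        analyse: impl Fn(&[u16]) -> u16,
    ) -> u16 {
        let key = (callee.to_string(), args.to_vec());
        if let Some(&hit) = self.cache.get(&key) {
            return hit;
        }
        self.analyses_run += 1;
        let result = analyse(args);
        self.cache.insert(key, result);
        result
    }
}
```

A pass-through helper called with a tainted argument, then a clean one, then the tainted one again yields three call-site-specific answers from two analyses; the third call is served from the cache.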
How to turn it off.
| Surface | Value |
|---|---|
| Config | context_sensitive = false under [analysis.engine] |
| CLI flag | --no-context-sensitive |
| Env var (legacy) | NYX_CONTEXT_SENSITIVE=0 |
Limitations. Intra-file only. Cross-file callees are resolved via
summaries (see src/summary/) rather than re-inlined. Depth is capped at
k=1 to prevent cache blow-up and re-entrancy; higher k would require a
different cache key design. Callee bodies larger than the internal
MAX_INLINE_BLOCKS threshold fall back to the summary path. Cache keys
hash per-argument Cap bits but not source-origin identity, so two
callers with identical caps but different origins share cached
origin-attribution.
Source: src/taint/ssa_transfer.rs
(ArgTaintSig, InlineCache, inline_analyse_callee).
Field-sensitive points-to
What it does. Runs a Steensgaard-style alias analysis that interns field
accesses as their own abstract locations. c.mu becomes Field(c, mu),
distinct from c itself; a write to obj.cache and a read from
obj.cache in different methods both land on the same abstract location;
subscript reads and writes (arr[i], map[k] = v) lower to synthetic
__index_get__ / __index_set__ calls so the engine can model them
through the same container store/load primitives used for STL containers,
Python lists, JS arrays, and similar.
Why it helps. It removes a class of false positives that the
whole-variable taint model produced. Before this pass, obj.field = tainted; sink(obj.other_field) would taint obj as a whole and fire on
the safe field; the receiver-type / sub-field distinction is also what
lets the resource-lifecycle pass attribute a c.mu.Lock() to the lock
field rather than to its container. Cross-method field flow (writer in
one method, reader in another) shows up only when fields have stable
identity independent of the parent value.
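The difference between field-level and whole-variable tracking can be shown with a toy location model. This is not the Steensgaard-style pass itself, only the idea of keying taint by Field(base, name); Loc, TaintState, and both query methods are hypothetical names.

```rust
use std::collections::HashSet;

/// Hypothetical abstract location: a variable, or a field of another location.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Loc {
    Var(String),
    Field(Box<Loc>, String),
}

/// Builds the location for `base.name`.
pub fn field(base: &Loc, name: &str) -> Loc {
    Loc::Field(Box::new(base.clone()), name.to_string())
}

/// Returns the variable at the root of a field chain.
fn root_of(loc: &Loc) -> &Loc {
    match loc {
        Loc::Field(base, _) => root_of(base),
        Loc::Var(_) => loc,
    }
}

#[derive(Default)]
pub struct TaintState {
    tainted: HashSet<Loc>,
}

impl TaintState {
    pub fn write_tainted(&mut self, loc: Loc) {
        self.tainted.insert(loc);
    }

    /// Field-sensitive read: tainted only if this location, or a location it
    /// is nested inside, was written with tainted data.
    pub fn is_tainted(&self, loc: &Loc) -> bool {
        let mut cur = loc;
        loop {
            if self.tainted.contains(cur) {
                return true;
            }
            match cur {
                Loc::Field(base, _) => cur = &**base,
                Loc::Var(_) => return false,
            }
        }
    }

    /// Whole-variable approximation: any tainted write under the same root
    /// variable taints every read of that variable's fields.
    pub fn is_tainted_whole_var(&self, loc: &Loc) -> bool {
        let root = root_of(loc);
        self.tainted.iter().any(|t| root_of(t) == root)
    }
}
```

After a tainted write to obj.field, the field-sensitive query reports obj.other_field as clean while the whole-variable query reports it as tainted, which is the false positive the pass is meant to remove.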
How to turn it off.
| Surface | Value |
|---|---|
| Env var | NYX_POINTER_ANALYSIS=0 |
The pass is on by default as of 2026-04-26. The env-var override is kept for one release so you can compare against the pre-pointer baseline, and will then be removed.
Limitations. This is not a general escape analysis. Function pointers
and arbitrary indirect calls still resolve to no callee, and deep alias
chains through *p / p->field in C/C++ are not tracked beyond the
direct field case. The points-to set per value is capped at
--max-pointsto (default 32); when truncation happens, an engine note
records the precision loss.
Source: src/pointer/.
Hierarchy fan-out for virtual dispatch
What it does. Builds a per-language type-hierarchy index in pass 1 (extends, implements, impl-for, includes; the exact construct depends on the language) and uses it in pass 2 to widen method-call resolution. When a call’s receiver is statically typed as a super-class, trait, or interface, the resolver returns every concrete implementer it has seen in the codebase rather than just the first match.
Why it helps. Without it, a call like repository.findById(id) where
repository is typed as the interface gets resolved against whatever the
single-result resolver finds first; if the matching implementer is in
another file the call effectively goes opaque. With the hierarchy, the
taint engine sees the union of every implementer’s transform and the
flow shows up regardless of which file holds the concrete class.
Limitations. Fan-out is capped at 8 implementers per call site; over that, the tail is silently dropped (a debug log records the cap hit) and the call is treated as a non-deterministic union of the kept implementers. Languages that use structural / implicit interface satisfaction (Go) are deliberately skipped because per-file extraction is intractable; those calls fall back to the single-result resolver. The extractor covers Java, Rust, TS/JS/TSX, Python, Ruby, PHP, and C++.
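The widening and its cap can be sketched with a plain map from super-type to implementers. HierarchySketch and resolve_widened are hypothetical names chosen for the example (they are not the TypeHierarchyIndex or resolve_callee_widened in the source tree), and callees are represented as "Type::method" strings for readability.

```rust
use std::collections::HashMap;

/// Fan-out cap quoted in the surrounding text.
pub const MAX_FANOUT: usize = 8;

/// Hypothetical hierarchy index: super-type name to concrete implementers,
/// in the order they were seen.
pub struct HierarchySketch {
    implementers: HashMap<String, Vec<String>>,
}

impl HierarchySketch {
    pub fn new() -> Self {
        HierarchySketch {
            implementers: HashMap::new(),
        }
    }

    /// Records that `concrete` extends or implements `super_type`.
    pub fn add_impl(&mut self, super_type: &str, concrete: &str) {
        self.implementers
            .entry(super_type.to_string())
            .or_default()
            .push(concrete.to_string());
    }

    /// Resolves `receiver_type.method(...)` to every known implementer, keeping
    /// at most MAX_FANOUT of them. The flag reports whether the tail was dropped.
    pub fn resolve_widened(&self, receiver_type: &str, method: &str) -> (Vec<String>, bool) {
        match self.implementers.get(receiver_type) {
            Some(impls) => {
                let kept: Vec<String> = impls
                    .iter()
                    .take(MAX_FANOUT)
                    .map(|c| format!("{c}::{method}"))
                    .collect();
                (kept, impls.len() > MAX_FANOUT)
            }
            // No known implementers: fall back to the declared receiver type.
            None => (vec![format!("{receiver_type}::{method}")], false),
        }
    }
}
```

The taint engine would then take the union of the returned callees' summaries; when the flag is set, that union is a subset of the true callee set.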
Source: src/cfg/hierarchy.rs
and src/summary/mod.rs
(TypeHierarchyIndex, resolve_callee_widened).
Symbolic execution
What it does. Builds a symbolic expression tree per tainted SSA value,
generates a witness string for each taint finding (the concrete-looking
shape of the dangerous value at the sink), and detects sanitization
patterns that the taint engine alone would miss. Supports string
operations (trim, replace, toLower, substring, strlen, …),
arithmetic, concatenation, phi nodes, and opaque calls.
Why it helps. Raises finding quality. A taint finding with a rendered
witness like "SELECT * FROM t WHERE id=" + userInput is substantially
easier to triage than one without. Also powers some confidence-gating for
downstream display.
How to turn it off.
| Surface | Value |
|---|---|
| Config | symex.enabled = false under [analysis.engine] |
| CLI flag | --no-symex |
| Env var (legacy) | NYX_SYMEX=0 |
Two nested switches refine the scope without disabling symex entirely:
| Setting | CLI | Env | Default | Effect |
|---|---|---|---|---|
| symex.cross_file | --no-cross-file-symex | NYX_CROSS_FILE_SYMEX=0 | on | Consult cross-file SSA bodies so symex can reason about callees defined in other files |
| symex.interprocedural | --no-symex-interproc | NYX_SYMEX_INTERPROC=0 | on | Intra-file interprocedural symex (k ≥ 2 via frame stack) |
Limitations. Expression trees are bounded at MAX_EXPR_DEPTH=32;
deeper expressions degrade to Unknown rather than growing unboundedly.
Sanitizer detection is informational: string-replace sanitizer patterns
are reported as witness metadata, not used to clear taint.
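A depth-bounded expression tree with witness rendering might look like the sketch below. SymExpr and its three concrete variants are hypothetical and cover only literals, tainted inputs, and concatenation; the depth cap of 32 and the degradation to Unknown are the parts taken from the text.

```rust
/// Depth cap quoted in the surrounding text.
pub const MAX_EXPR_DEPTH: usize = 32;

/// Hypothetical symbolic expression over strings.
#[derive(Clone, Debug, PartialEq)]
pub enum SymExpr {
    /// A string literal from the source code.
    Lit(String),
    /// A named tainted input.
    Tainted(String),
    Concat(Box<SymExpr>, Box<SymExpr>),
    /// Placeholder for anything the tree no longer tracks.
    Unknown,
}

impl SymExpr {
    pub fn depth(&self) -> usize {
        match self {
            SymExpr::Concat(a, b) => 1 + a.depth().max(b.depth()),
            _ => 1,
        }
    }

    /// Builds `a + b`, degrading to Unknown once the tree would exceed the cap.
    pub fn concat(a: SymExpr, b: SymExpr) -> SymExpr {
        let e = SymExpr::Concat(Box::new(a), Box::new(b));
        if e.depth() > MAX_EXPR_DEPTH {
            SymExpr::Unknown
        } else {
            e
        }
    }

    /// Renders a human-readable witness for the value.
    pub fn witness(&self) -> String {
        match self {
            SymExpr::Lit(s) => format!("{s:?}"),
            SymExpr::Tainted(name) => name.clone(),
            SymExpr::Concat(a, b) => format!("{} + {}", a.witness(), b.witness()),
            SymExpr::Unknown => "<unknown>".to_string(),
        }
    }
}
```

Concatenating the literal query prefix with a tainted userInput renders the same shape as the witness quoted above, and a chain of forty concatenations stays within the cap because the over-deep prefix collapses to Unknown.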
Source: src/symex/.
Demand-driven analysis
What it does. After the forward pass-2 taint analysis finishes, runs a
backwards walk from each sink’s tainted SSA operands. The walk follows
reverse SSA-edge transfer (phi fan-out, Assign operand-fanout, Call
body-expansion or arg-fanout) until it reaches a taint source, proves
the flow infeasible via an accumulated path predicate, or exhausts its
budget. Each forward finding is then annotated with the aggregate verdict:
- backwards-confirmed: a matching source was reached. Finding picks up a small confidence boost and the note appears in evidence.symbolic.cutoff_notes.
- backwards-infeasible: every walk proved the flow unreachable. Finding is capped to Low confidence and a user-readable limiter is attached.
- backwards-budget-exhausted: the walk hit BACKWARDS_VALUE_BUDGET without a verdict. Recorded as a limiter so operators can see when the pass could not keep up.
- Inconclusive outcomes are a no-op: the forward finding is untouched.
Because the backwards walk can consult GlobalSummaries.bodies_by_key
(populated by the cross-file callee body persistence layer) it closes
across file boundaries; when a callee body is not loadable the walk
falls back to fanning out over the call’s arguments so local reach-back
is still possible.
Why it helps. Inverts the analysis direction so budget follows the question the scanner actually cares about (“does any source reach this sink?”) instead of proving every potential source-to-sink path. Corroborated findings are a stronger signal than forward-only ones, and proven-infeasible flows provide a principled way to lower confidence on forward false positives without silently dropping them.
How to turn it on. Defaults off so the benchmark floor is preserved while the pass stabilises.
| Surface | Value |
|---|---|
| Config | backwards_analysis = true under [analysis.engine] |
| CLI flag | --backwards-analysis / --no-backwards-analysis |
| Env var (legacy) | NYX_BACKWARDS=1 |
Limitations (first cut). Reverse call-graph expansion past a
ReachedParam is deferred; the walk terminates at function parameters
rather than crossing back into callers. Path-constraint pruning is
conservative: only the accumulated PredicateSummary bits are consulted,
not the full symbolic predicate stack. Depth-bounded at k=2 for
cross-function body expansion. See DEFAULT_BACKWARDS_DEPTH,
BACKWARDS_VALUE_BUDGET, and MAX_BACKWARDS_CALLEE_BLOCKS in
src/taint/backwards.rs for the exact bounds.
Source: src/taint/backwards.rs.
Constraint solving
What it does. Collects path constraints at each branch in SSA and
propagates them alongside taint. Prunes paths whose accumulated constraint
set is unsatisfiable; a taint flow guarded by if x < 0 && x > 10 is
dropped rather than surfaced. Optionally delegates the satisfiability
check to Z3 when Nyx is built with the smt Cargo feature.
Why it helps. Removes a class of FPs rooted in clearly-infeasible control-flow combinations. Without path constraints, a taint flow that only occurs when mutually-exclusive branches are simultaneously taken can still produce a finding.
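A lightweight satisfiability check of the kind that catches literal contradictions can be sketched by intersecting bounds. The Pred enum and satisfiable function are hypothetical, and the sketch handles only comparisons of a single integer variable against constants, which is an assumption about scope rather than a description of src/constraint/.

```rust
/// Hypothetical branch predicate over one integer variable x.
#[derive(Clone, Copy, Debug)]
pub enum Pred {
    Lt(i64), // x < c
    Le(i64), // x <= c
    Gt(i64), // x > c
    Ge(i64), // x >= c
    Eq(i64), // x == c
}

/// Returns true if some integer x satisfies every predicate in the list.
/// Works by narrowing [lo, hi] and checking that the range stays non-empty.
pub fn satisfiable(preds: &[Pred]) -> bool {
    let (mut lo, mut hi) = (i64::MIN, i64::MAX);
    for p in preds {
        match *p {
            Pred::Lt(c) => {
                // Nothing is below i64::MIN; also avoids overflow in c - 1.
                if c == i64::MIN {
                    return false;
                }
                hi = hi.min(c - 1);
            }
            Pred::Le(c) => hi = hi.min(c),
            Pred::Gt(c) => {
                // Nothing is above i64::MAX; also avoids overflow in c + 1.
                if c == i64::MAX {
                    return false;
                }
                lo = lo.max(c + 1);
            }
            Pred::Ge(c) => lo = lo.max(c),
            Pred::Eq(c) => {
                lo = lo.max(c);
                hi = hi.min(c);
            }
        }
    }
    lo <= hi
}
```

A path guarded by x < 0 && x > 10 narrows to an empty range and is pruned, while x > 0 && x < 10 stays feasible. Relations between several variables are outside what this sketch can decide, which is the kind of reasoning the text assigns to the optional Z3 backend.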
How to turn it off.
| Surface | Value |
|---|---|
| Config | constraint_solving = false under [analysis.engine] |
| CLI flag | --no-constraint-solving |
| Env var (legacy) | NYX_CONSTRAINT=0 |
The SMT backend is a separate switch:
| Setting | CLI | Env | Default | Effect |
|---|---|---|---|---|
| symex.smt | --no-smt | NYX_SMT=0 | on when built with smt feature | Delegate satisfiability checks to Z3; ignored if Nyx was built without smt |
Limitations. The default path-constraint domain is syntactic;
trivially-inconsistent pairs are caught without an SMT solver, but richer
algebraic unsatisfiability requires the smt feature (Z3). Without smt,
Nyx ships a lightweight satisfiability check that catches literal
contradictions but not deeper reasoning.
Source: src/constraint/.
Combining the switches
The defaults (all on) are the configuration Nyx is benchmarked against.
Turning any switch off trades precision for speed and may move findings
relative to the published baseline; CI regression gates assume defaults.
If you need a minimal-overhead scan (for very large repositories or a
pre-commit fast path), the AST-only scan mode (--mode ast) skips CFG,
taint, and all four advanced passes entirely and is the right tool.
Detectors
Nyx ships four independent detector families. They run together in --mode full, the default. Findings are merged, deduplicated, ranked, and printed in one result set.
| Family | Rule prefix | Looks at | What it finds |
|---|---|---|---|
| Taint analysis | taint-* | Cross-file dataflow | Unsanitized data flowing source to sink |
| CFG structural | cfg-* | Per-function control flow | Auth gaps, unguarded sinks, error fallthrough, resource release on all paths |
| State model | state-* | Per-function state lattice | Use-after-close, double-close, leaks, unauthenticated access |
| AST patterns | <lang>.<cat>.<name> | Tree-sitter structural match | Banned APIs, weak crypto, dangerous constructs |
For Rust auth-specific rules (rs.auth.*), see auth.md.
How they combine
In --mode full:
- Taint and AST can both fire on one line. If eval(userInput) triggers both js.code_exec.eval (AST) and taint-unsanitised-flow (taint), both are kept with distinct rule IDs. The taint finding ranks higher because of the analysis-kind bonus.
- State supersedes CFG on resource leaks. When state-resource-leak and cfg-resource-leak fire at the same location, the CFG one is dropped.
- Exact duplicates are removed. Same line, column, rule ID, severity → one finding.
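The merge rules above can be sketched as a small post-processing step. This is an illustration under assumptions, not Nyx's implementation: the Finding record and its fields are hypothetical, and including the file path in the location key is a choice made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    col: int
    rule_id: str
    severity: str


def merge_findings(findings):
    """Drop exact duplicates, then let state leaks supersede CFG leaks."""
    # dict.fromkeys keeps the first occurrence of each finding, in order.
    unique = list(dict.fromkeys(findings))
    state_leaks = {
        (f.path, f.line, f.col)
        for f in unique
        if f.rule_id == "state-resource-leak"
    }
    return [
        f
        for f in unique
        if not (
            f.rule_id == "cfg-resource-leak"
            and (f.path, f.line, f.col) in state_leaks
        )
    ]
```

Findings with different rule IDs on the same line, such as a taint finding next to an AST pattern, pass through untouched because neither rule removes them.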
Modes
| Mode | Active detectors |
|---|---|
| full (default) | All four |
| ast | AST patterns only |
| cfg | Taint + CFG + State (no AST patterns) |
| taint | Taint + State |
Attack-surface ranking
Every finding gets a deterministic score. Findings are sorted by descending score by default. Disable with --no-rank or output.attack_surface_ranking = false.
score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty
| Component | Values |
|---|---|
| Severity base | High=60, Medium=30, Low=10 |
| Analysis kind | taint=+10, state=+8, cfg with evidence=+5, cfg without evidence=+3, ast=+0 |
| Evidence strength | +1 per evidence item up to 4; +2 to +6 for source kind |
| State bonus | use-after-close / unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 |
| Validation penalty | -5 if path-validated |
Source-kind contributions (taint only):
| Source | Bonus |
|---|---|
| User input (req.body, argv, stdin, form, query, params) | +6 |
| Environment (env::var, getenv, process.env) | +5 |
| Unknown | +4 |
| File system | +3 |
| Database | +2 |
Approximate score ranges:
| Finding type | Score |
|---|---|
| High taint with user input | 76 to 81 |
| High state (use-after-close) | ~74 |
| High CFG structural | 63 to 68 |
| Medium taint with env source | 45 to 50 |
| Medium state (resource leak) | ~40 |
| Low AST-only pattern | ~10 |
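A minimal sketch of the scoring arithmetic, using the component values from the tables above. The function signature and dictionary keys are hypothetical, and reading "+1 per evidence item up to 4" as a cap of 4 points is an assumption; the real scorer may weigh evidence differently.

```python
SEVERITY_BASE = {"high": 60, "medium": 30, "low": 10}
ANALYSIS_KIND = {"taint": 10, "state": 8, "cfg_evidence": 5, "cfg": 3, "ast": 0}
SOURCE_KIND = {"user_input": 6, "environment": 5, "unknown": 4,
               "file_system": 3, "database": 2}
STATE_BONUS = {"use_after_close": 6, "unauthed": 6, "double_close": 3,
               "must_leak": 2, "may_leak": 1}


def score(severity, kind, evidence_items=0, source_kind=None,
          state=None, path_validated=False):
    """severity_base + analysis_kind + evidence_strength + state_bonus - penalty."""
    total = SEVERITY_BASE[severity] + ANALYSIS_KIND[kind]
    total += min(evidence_items, 4)          # +1 per evidence item, capped
    if kind == "taint" and source_kind is not None:
        total += SOURCE_KIND[source_kind]    # source-kind bonus is taint only
    if state is not None:
        total += STATE_BONUS[state]
    if path_validated:
        total -= 5
    return total


print(score("high", "state", state="use_after_close"))  # 74
print(score("medium", "state", state="must_leak"))      # 40
print(score("low", "ast"))                              # 10
```

Under these assumptions a High taint finding with a user-input source and two evidence items scores 60 + 10 + 2 + 6 = 78, which falls inside the 76 to 81 range listed above.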
For the engine’s runtime model (passes, summaries, SCC fixed-point), see how-it-works.md.
AST patterns
AST patterns are tree-sitter queries that match dangerous structural shapes in source. No dataflow, no CFG. A match means the construct is present; it’s not proof the construct is exploitable.
Patterns run whenever the selected mode includes the AST detector: --mode full (the default) and --mode ast. In --mode ast they’re the only active detector.
Rule IDs
<lang>.<category>.<name>
Examples: js.code_exec.eval, py.deser.pickle_loads, c.memory.gets, java.sqli.execute_concat.
Full list: rules.md.
Tiers
| Tier | Meaning |
|---|---|
| A | Structural presence alone is high-signal. gets, eval, pickle.loads, mem::transmute |
| B | Pattern includes a tree-sitter heuristic guard. Example: java.sqli.execute_concat only fires when executeQuery receives a binary_expression (string concatenation), not a literal or a parameterized statement |
Categories
| Category | Examples |
|---|---|
| CommandExec | system, os.system, Runtime.exec, backticks |
| CodeExec | eval, Function, PHP assert("string"), class_eval, instance_eval |
| Deserialization | pickle.loads, yaml.load, Marshal.load, readObject, unserialize |
| SqlInjection | executeQuery/Query/execute with concatenated argument (Tier B) |
| PathTraversal | PHP include $var |
| Xss | document.write, outerHTML, insertAdjacentHTML, getWriter().print |
| Crypto | md5, sha1, Math.random, java.util.Random for security use |
| Secrets | hardcoded API keys (Go, JS, TS) |
| InsecureTransport | InsecureSkipVerify, fetch("http://...") |
| Reflection | Class.forName, Method.invoke, send, constantize |
| MemorySafety | transmute, unsafe, gets, strcpy, sprintf |
| Prototype | __proto__ assignment, Object.prototype.* |
| Config | CORS dynamic origin, rejectUnauthorized: false, insecure session settings |
| CodeQuality | unwrap, panic!, as any |
What patterns can’t tell you
- Dataflow. eval("1+1") (safe) and eval(userInput) (dangerous) both match js.code_exec.eval. The taint detector is the one that distinguishes them.
- Reachability. A pattern in dead code matches identically.
- Semantics. strcpy(dst, src) always matches, regardless of buffer sizes.
- Indirect calls. let e = eval; e(input) doesn’t match eval.
- Aliased imports. from os import system as s; s(cmd) won’t match system.
- Macro expansions. Tree-sitter parses the macro call site, not the expansion.
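The dataflow and indirect-call limits can be reproduced with any purely structural matcher. The sketch below uses Python's standard ast module as a stand-in for a tree-sitter query (an assumption made so the example is self-contained; it is not how Nyx matches patterns) and looks for calls whose callee is the bare name eval.

```python
import ast


def find_eval_calls(source):
    """Return the line numbers of calls whose callee is the bare name `eval`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
    )


SAMPLE = '''\
eval("1+1")
eval(user_input)
e = eval
e(user_input)
'''

# Lines 1 and 2 match regardless of the argument; the aliased call on
# line 4 is missed because its callee is named `e`, not `eval`.
print(find_eval_calls(SAMPLE))  # [1, 2]
```

The matcher cannot tell the safe literal on line 1 from the dangerous call on line 2, and it misses line 4 entirely, which is why the structural match needs the taint detector alongside it.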
Common false positives
| Scenario | Why | Mitigation |
|---|---|---|
| eval("hardcoded literal") | Pattern matches structure | Run --mode cfg to drop AST patterns and rely on taint |
| unsafe block with sound justification | Every unsafe matches rs.quality.unsafe_block | Filter >=MEDIUM (it’s Medium) or accept the noise |
| .unwrap() in tests | Acceptable in test code | Default non-prod severity downgrade reduces it |
| md5 for non-cryptographic checksums | Pattern can’t see intent | Suppress with --severity ">=MEDIUM" or per-line nyx:ignore |
| SQL concat with trusted data (Tier B) | Heuristic can’t verify the source | Taint is more precise; or convert to a parameterized query |
Confidence levels
Every AST pattern carries an explicit confidence:
| Confidence | Use |
|---|---|
| High | Inherently dangerous construct with no safe usage. gets, pickle.loads, eval with no guard |
| Medium | Likely issue, context may change the call. SQL concatenation (Tier B), unsafe blocks, exec |
| Low | Heuristic. Often appears in safe code. Weak crypto for checksums, unwrap outside tests, Math.random |
--min-confidence medium (or output.min_confidence = "medium") drops Low-confidence matches.
Tuning
nyx scan . --severity ">=MEDIUM" # drop Low-tier patterns
nyx scan . --severity HIGH # banned APIs and code-exec only
nyx scan . --mode cfg # drop AST patterns; keep taint + state + cfg
[scanner]
excluded_directories = ["node_modules", "vendor", "generated"]
Examples
Tier A, structural presence:
char buf[64];
gets(buf); // c.memory.gets
import pickle
data = pickle.loads(user_input)  # py.deser.pickle_loads
Tier B, heuristic guard:
// Fires: concatenated argument
stmt.executeQuery("SELECT * FROM users WHERE id=" + userId); // java.sqli.execute_concat
// Does not fire: parameterized
stmt.executeQuery(preparedSql);
printf(user_input); // c.memory.printf_no_fmt: fires (variable as fmt)
printf("%s", user_input); // does not fire (literal fmt)
CFG structural analysis
Nyx builds an intra-procedural control-flow graph per function and checks structural properties: whether sinks are guarded by sanitizers or validators, whether web handlers check authentication, whether resources are released on all exit paths, and whether error paths terminate before reaching dangerous code.
These detectors use dominator analysis. A guard dominates a sink when the guard must execute before the sink on every path from entry.
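As an illustration of the dominance check, here is a minimal sketch that computes dominator sets with the classic iterative algorithm and asks whether a guard node dominates a sink node. The CFG encoding (a dict from node to successor list, with every node present as a key) and the node names are assumptions for the example, not Nyx's internal representation.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {node: [successors]}."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def guard_dominates(cfg, entry, guard, sink):
    """True when every path from entry to sink passes through guard."""
    return guard in dominators(cfg, entry)[sink]


# Guard on the only path to the sink.
GUARDED = {"entry": ["guard"], "guard": ["sink"], "sink": ["exit"], "exit": []}
# A second edge lets execution reach the sink without the guard.
BYPASS = {"entry": ["guard", "sink"], "guard": ["sink"], "sink": ["exit"], "exit": []}

print(guard_dominates(GUARDED, "entry", "guard", "sink"))  # True
print(guard_dominates(BYPASS, "entry", "guard", "sink"))   # False
```

The second graph is the shape behind an unguarded-sink finding: the guard exists in the function, but one path skips it.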
Rule IDs
| Rule ID | Severity |
|---|---|
| cfg-unguarded-sink | High/Medium |
| cfg-auth-gap | High |
| cfg-unreachable-sink | Medium |
| cfg-unreachable-sanitizer | Low |
| cfg-unreachable-source | Low |
| cfg-error-fallthrough | High/Medium |
| cfg-resource-leak | Medium |
| cfg-lock-not-released | Medium |
What it detects
cfg-unguarded-sink: A sink call (system, eval, Command::new, db.execute, etc.) is reachable from function entry without passing through any guard or sanitizer that matches the sink’s capability.
cfg-auth-gap: A function identified as a web handler (by parameter naming conventions like req, res, ctx, request, language-dependent) reaches a privileged sink (shell execution, file I/O) without a preceding authentication call.
cfg-unreachable-*: Sinks, sanitizers, or sources in dead code. Usually signals a refactoring error that silently disabled security-relevant logic.
cfg-error-fallthrough: An error-handling branch (null check, error-return check) does not terminate. Execution falls through to a dangerous operation on the error path.
cfg-resource-leak, cfg-lock-not-released: A resource acquisition (File::open, fopen, socket, Lock) is not matched by a release on every exit path from the function.
What it can’t detect
- Inter-procedural guards. Middleware-level auth, helper functions that internally call auth, and cleanup performed in a caller are invisible.
- Dynamic dispatch. Virtual calls, function pointers, closures resolve to no specific callee.
- Correctness of guards. The detector checks a guard dominates the sink. It cannot check the guard is correct. A no-op if true {} would suppress the finding.
- Custom validation logic. Only recognised guard names are checked. if password == expected is not a recognised guard.
- Cross-function resource flows. If a file handle opens in one function and closes in another, the opener gets flagged as a leak. This is the largest source of FPs on factory-pattern code.
Common false positives
| Scenario | Why | Mitigation |
|---|---|---|
| Framework middleware auth | Handler doesn’t call auth directly | Expected; suppress with severity filter or exclude handlers |
| RAII / defer cleanup | Implicit release not visible to CFG (partially handled for Rust Drop and Go defer) | Known limitation |
| Custom guard name | Function not in the recognised guard list | Add it as a sanitizer rule in config |
| Test handlers | Intentional lack of auth | Default non-prod downgrade reduces severity; or exclude test dirs |
Common false negatives
| Scenario | Why |
|---|---|
| Auth in a called helper | Cross-function guards not tracked |
| Type-system guards | Rust AuthenticatedUser<T> wrappers, typestate patterns not analysed |
| Cleanup in finally/ensure/defer in callers | Cross-function cleanup not tracked |
Tuning
Recognised guard names
Nyx accepts these patterns as dominating guards:
| Pattern | Applies to |
|---|---|
| validate*, sanitize* | All sinks |
| check_*, verify_*, assert_* | All sinks |
| shell_escape | Shell sinks |
| html_escape | HTML/XSS sinks |
| url_encode | URL sinks |
| which | Shell execution (binary lookup) |
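A minimal sketch of how name patterns like these could be matched against a callee, assuming simple glob semantics. The sink-kind labels ("shell", "html", "url") and the function name are hypothetical; Nyx's own matcher may normalise names or handle qualified paths differently.

```python
from fnmatch import fnmatchcase

# (pattern, sink kind it applies to); "all" applies to every sink.
GUARD_PATTERNS = [
    ("validate*", "all"), ("sanitize*", "all"),
    ("check_*", "all"), ("verify_*", "all"), ("assert_*", "all"),
    ("shell_escape", "shell"), ("html_escape", "html"),
    ("url_encode", "url"), ("which", "shell"),
]


def is_guard_for(callee, sink_kind):
    """True when the callee name matches a guard pattern valid for this sink."""
    return any(
        fnmatchcase(callee, pattern) and applies_to in ("all", sink_kind)
        for pattern, applies_to in GUARD_PATTERNS
    )


print(is_guard_for("validate_input", "shell"))  # True
print(is_guard_for("html_escape", "shell"))     # False: wrong sink kind
print(is_guard_for("checkpoint", "shell"))      # False: check_* needs the underscore
```

The last case shows why prefix patterns with an underscore are narrower than they look: a function has to follow the naming convention exactly to count as a guard.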
Recognised auth names
| Pattern | Language |
|---|---|
| is_authenticated, require_auth, check_permission, authorize, authenticate, require_login, check_auth, verify_token, validate_token | Cross-language |
| middleware.auth, auth.required | Go |
| isAuthenticated, checkPermission, hasAuthority, hasRole | Java |
For Rust auth checks (require_*, ownership equality, row-level checks), see auth.md.
Custom guards
[[analysis.languages.python.rules]]
matchers = ["validate_request", "check_csrf"]
kind = "sanitizer"
cap = "all"
Custom auth functions
[[analysis.languages.javascript.rules]]
matchers = ["ensureLoggedIn", "requirePermission"]
kind = "sanitizer"
cap = "all"
Examples
Unguarded sink:
func handler(w http.ResponseWriter, r *http.Request) {
cmd := r.URL.Query().Get("cmd")
exec.Command("sh", "-c", cmd).Run() // cfg-unguarded-sink
}
Auth gap:
app.get('/admin/delete', (req, res) => {
// No auth call
db.execute("DELETE FROM users WHERE id = " + req.params.id); // cfg-auth-gap
});
Resource leak:
#include <stdio.h>

void process(int error) {
    FILE *f = fopen("data.txt", "r");
    if (error) {
        return; // cfg-resource-leak: f not closed on this path
    }
    fclose(f);
}
State model analysis
Tracks resource lifecycle and authentication state through a function. Detects use-after-close, double-close, leaks, and unauthenticated access to privileged operations.
State analysis is on by default. Disable with scanner.enable_state_analysis = false. It runs in --mode full, --mode cfg, and --mode taint; AST-only mode skips it.
Rule IDs
| Rule ID | Severity |
|---|---|
| state-use-after-close | High |
| state-double-close | Medium |
| state-resource-leak | Medium |
| state-resource-leak-possible | Low |
| state-unauthed-access | High |
What it detects
state-use-after-close: Resource transitions to CLOSED (via close, fclose, disconnect, …), then a use operation happens on it.
FILE *f = fopen("data.txt", "r");
fclose(f);
fread(buf, 1, 100, f); // state-use-after-close
state-double-close: Resource closed twice. Crashes or undefined behaviour on most runtimes.
state-resource-leak: Resource opened but never closed on any path through the function. Definite leak.
state-resource-leak-possible: Resource closed on some paths but not others. Lower confidence; often an early-return error path.
state-unauthed-access: A function recognised as a web handler reaches a privileged sink without an auth call on the path.
A function counts as a web handler if its name starts with handle_, route_, or api_ (sufficient on its own), or starts with serve_/process_ and the file uses web-shaped parameter names (request, req, ctx, res, response, w, writer, language-dependent). main is excluded.
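The naming heuristic above can be sketched as a small predicate. This is an illustration only: the function name and the way parameter names are passed in are assumptions, and the language-dependent part of the rule is ignored here.

```python
STRONG_PREFIXES = ("handle_", "route_", "api_")
WEAK_PREFIXES = ("serve_", "process_")
WEB_PARAM_NAMES = {"request", "req", "ctx", "res", "response", "w", "writer"}


def is_web_handler(func_name, param_names_in_file):
    """Apply the prefix rules; weak prefixes need web-shaped parameter names."""
    if func_name == "main":
        return False
    if func_name.startswith(STRONG_PREFIXES):
        return True  # sufficient on its own
    if func_name.startswith(WEAK_PREFIXES):
        return bool(WEB_PARAM_NAMES & set(param_names_in_file))
    return False


print(is_web_handler("handle_upload", []))             # True
print(is_web_handler("process_batch", ["rows"]))       # False
print(is_web_handler("process_order", ["req", "db"]))  # True
```

A heuristic of this shape explains both failure modes: a batch job named handle_queue is treated as a handler, while a real handler named upload is not.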
Managed-resource suppression
Several language-specific cleanup patterns suppress leak findings:
| Pattern | Languages | Effect |
|---|---|---|
| RAII / Drop | Rust | All leak findings suppressed except alloc/dealloc |
| Smart pointers | C++ | make_unique/make_shared treated as managed; raw new/malloc still tracked |
| defer | Go | defer f.Close() suppresses leak at exit |
| with context manager | Python | with open(f) as f: suppresses leak for the bound name |
| try-with-resources | Java | TWR-bound resources suppressed |
What it can’t detect
- Cross-function resource ownership. Open in one function, close in another, leak gets reported in the opener. The most common FP source for leak detection.
- Factory / builder functions that return a resource for the caller to manage.
- Variable shadowing across scopes. Same name in inner and outer scope shares one symbol; an inner close masks an outer leak.
- Resources stored in collections. Handles in arrays / maps / channels and cleaned up via iteration are not tracked.
- Dynamic dispatch. Close called via trait object or interface may not be recognised.
- Type-state authentication. AuthenticatedRequest<T> and similar Rust patterns are not recognised as auth.
Common false positives
| Scenario | Why | Mitigation |
|---|---|---|
| Factory returns a resource | Caller owns it | Known limitation |
| Framework-managed handles | Connection pool, request scope | Exclude framework code or downgrade |
| Variable name shadowing | Same name reused | Known limitation |
Per-language detection
| Language | Leak | Double-close | Use-after-close | Notes |
|---|---|---|---|---|
| C | yes | yes | yes | fopen/fclose, malloc/free, pthread_mutex_* |
| C++ | yes | yes | yes | C pairs plus new/delete; smart pointers suppressed |
| Python | yes | yes | yes | with suppressed; open, socket, connect |
| Go | yes | yes | yes | defer suppressed; os.Open / .Close |
| Rust | unsafe only | n/a | n/a | RAII suppresses everything except alloc/dealloc |
| JavaScript | yes | yes | partial | fs.openSync/closeSync |
| TypeScript | yes | yes | partial | Same as JS |
| PHP | yes | yes | partial | fopen/fclose, curl_init/curl_close, mysqli_* |
| Ruby | partial | partial | partial | File.open/close, TCPSocket |
| Java | limited | limited | limited | Constructor-callee matching is incomplete |
Tuning
nyx scan . --severity ">=MEDIUM" # Skip "possible" leaks (Low)
[scanner]
enable_state_analysis = true # default
excluded_directories = ["tests", "test", "spec"]
Recognised pairs
The state engine ships these acquire/release pairs. Custom pairs are not yet configurable; file an issue if you need one.
C / C++
| Acquire | Release |
|---|---|
| fopen | fclose |
| open | close |
| socket | close |
| malloc, calloc, realloc | free |
| pthread_mutex_lock | pthread_mutex_unlock |
| new, new[] (C++) | delete, delete[] |
Rust
| Acquire | Release |
|---|---|
| File::open, File::create | drop, close |
| TcpStream::connect | shutdown |
| lock, read, write (Mutex/RwLock) | drop |
Java
| Acquire | Release |
|---|---|
| new FileInputStream (and friends) | close |
| getConnection | close |
| new Socket | close |
Go, Python, JavaScript, Ruby, PHP follow language-idiomatic equivalents.
Use-after-close triggers
These operations on a closed resource fire state-use-after-close:
read, write, send, recv, fread, fwrite, fgets, fputs, fprintf, fscanf,
fflush, fseek, ftell, rewind, feof, ferror, fgetc, fputc, getc, putc,
ungetc, query, execute, fetch, sendto, recvfrom, ioctl, fcntl,
strcpy, strncpy, strcat, strncat, memcpy, memmove, memset, memcmp,
strcmp, strncmp, strlen, sprintf, snprintf
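To show how acquire/release pairs and use triggers fit together, here is a minimal sketch of a resource state machine over a straight-line sequence of operations. It is a simplification under assumptions: the event encoding is hypothetical, only a subset of the pairs and triggers is listed, and there is no branching, so the "possible" leak variant cannot arise.

```python
ACQUIRE = {"fopen", "open", "socket"}
RELEASE = {"fclose", "close"}
USE_OPS = {"read", "write", "fread", "fwrite", "fgets", "fputs"}  # subset


def check_lifecycle(events):
    """events: list of (operation, handle) pairs in execution order."""
    state = {}
    findings = []
    for op, handle in events:
        if op in ACQUIRE:
            state[handle] = "OPEN"
        elif op in RELEASE:
            if state.get(handle) == "CLOSED":
                findings.append(("state-double-close", handle))
            state[handle] = "CLOSED"
        elif op in USE_OPS and state.get(handle) == "CLOSED":
            findings.append(("state-use-after-close", handle))
    # Anything still OPEN at the end of the function was never released.
    findings.extend(
        ("state-resource-leak", h) for h, s in state.items() if s == "OPEN"
    )
    return findings


print(check_lifecycle([("fopen", "f"), ("fclose", "f"), ("fread", "f")]))
# [('state-use-after-close', 'f')]
```

Running the same state per CFG path and comparing the results at each exit is what separates a definite leak from a possible one; this sketch covers only the single-path case.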
Taint analysis
Nyx tracks untrusted data from sources (where it enters the program) through assignments and function calls to sinks (where it’s used dangerously). If the flow reaches a sink without passing a matching sanitizer, a finding fires.
The engine is a monotone forward dataflow over a finite lattice with guaranteed termination. It’s flow-sensitive inside a function, and interprocedural across files via persisted per-function summaries.
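The termination argument can be seen in a minimal worklist sketch: the state at each node is a set of tainted variables, the join is set union, and a node is re-queued only when its input grows. The CFG encoding, node names, and transfer functions below are hypothetical, and Nyx's actual lattice carries more than variable names.

```python
from collections import deque


def forward_taint(cfg, transfer, entry):
    """Worklist fixed point over {node: [successors]} with set-union join."""
    in_state = {n: frozenset() for n in cfg}
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        out = transfer[node](in_state[node])
        for succ in cfg[node]:
            joined = in_state[succ] | out
            if joined != in_state[succ]:  # re-queue only when the state grows
                in_state[succ] = joined
                worklist.append(succ)
    return in_state


# x = source(); while ...: y = x; then sink(y)
CFG = {"entry": ["loop"], "loop": ["body", "sink"], "body": ["loop"], "sink": []}
TRANSFER = {
    "entry": lambda s: s | {"x"},                       # x becomes tainted
    "loop": lambda s: s,
    "body": lambda s: s | {"y"} if "x" in s else s,     # y = x copies taint
    "sink": lambda s: s,
}

result = forward_taint(CFG, TRANSFER, "entry")
print(sorted(result["sink"]))  # ['x', 'y']
```

Because states only grow and the set of variables is finite, each node can be re-queued a bounded number of times, so the loop in the example converges after the second visit to the loop header.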
Rule ID
taint-unsanitised-flow (source <line>:<col>)
One rule ID, parameterized by the source location. Suppressions can target either the base ID or the full string.
What it detects
- User input flowing to shell execution: req.body.cmd → child_process.exec
- User input flowing to code evaluation: req.query.code → eval
- User input flowing to SQL: request.args.get('id') → cursor.execute(f"... {id}")
- Environment variables flowing to shell: env::var("CMD") → Command::new("sh").arg("-c")
- Request parameters flowing to HTML: req.query.name → innerHTML
- File contents flowing to privileged sinks: fs::read_to_string → db.execute
- Any other source-to-sink flow where the sink’s required capability is not stripped along the way
What it can’t detect
- Library calls without summaries. If a callee has no summary (no source, binary-only dependency), Nyx treats it as neither propagating nor sanitizing. This is conservative for sanitization but lossy for propagation.
- Deep pointer aliasing. let y = &x; sink(*y) works through one level, but arbitrary chains of pointer arithmetic and aliased writes (*p, p->field in C/C++) are not tracked end-to-end. Function pointers and indirect calls resolve to no callee.
- Implicit flows. Taint follows explicit data, not branching signal. if (secret) x = 1 else x = 0 does not taint x.
- Globals and statics across functions. Not tracked across function boundaries.
Common false positives
| Scenario | Why | Mitigation |
|---|---|---|
| Custom sanitizer not recognised | Only built-in + configured sanitizers match | Add a custom sanitizer rule in config |
| Container holds mixed-typed items the engine cannot tell apart | A vector<int> of port numbers and a vector<string> of user input share the same store/load model | Sanitize the values on the way in (numeric parse / explicit validator) so the values themselves carry no cap, not just the container |
| Dead branches | Path-insensitive within a function | Constraint solving catches trivially infeasible combos; path-validated findings are scored lower |
| Library wrapper re-introduces taint | Wrapper opaque, or summary marks it as propagating | Summarize the wrapper explicitly or add it as a sanitizer |
Common false negatives
| Scenario | Why |
|---|---|
| Third-party library on the path | No summary available, callee treated opaquely |
| Globals / statics across function boundaries | Not tracked |
| Some closure captures | Closure analysis is limited. JS/TS/Ruby/Go anonymous functions passed as callbacks are analyzed as separate scopes |
| Very deep cross-file chains | Summary approximation loses precision at depth |
Confidence signals
Higher confidence:
- Source + Sink both present in evidence with specific call locations.
- source_kind: user_input (direct attacker control).
- path_validated: false.
- No dominating guard on the path.
- Symex produced a witness string (rendered sink value visible in JSON/SARIF evidence.symbolic.witness).
Lower confidence:
- Path-validated taint (path_validated: true).
- Source is a database read or internal file (pre-validated at insertion is common).
- Engine note ForwardBailed/PathWidened. Use --require-converged to drop these in strict gates.
Tuning
Custom sanitizer
# nyx.local
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"
Or: nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape.
Filter by severity or confidence
nyx scan . --severity HIGH
nyx scan . --min-confidence medium
Skip dataflow entirely
nyx scan . --mode ast
AST-only mode gives you structural pattern matches without taint.
In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took.
Example
Rust:
use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap(); // source
    Command::new("sh").arg("-c").arg(&cmd).output(); // sink
}
Finding:
[HIGH] taint-unsanitised-flow (source 5:15) src/main.rs:6:5
Unsanitised user input flows from env::var → Command::new
Source: env::var (5:15)
Sink: Command::new
Safe rewrite: drop the shell and pass the value as argv directly (Command::new(&cmd).output()), or validate against an allowlist before passing to the shell.
Capabilities
Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer only clears taint for the cap it declares. A sink only fires when the remaining taint still carries its required cap.
| Capability | Typical source | Typical sanitizer | Typical sink |
|---|---|---|---|
| env_var | env::var, getenv, process.env | | |
| html_escape | | html.escape, DOMPurify.sanitize | innerHTML, document.write |
| shell_escape | | shlex.quote, shell_escape::escape | system, Command::new, eval |
| url_encode | | encodeURIComponent | location.href, HTTP client URL arg |
| json_parse | | JSON.parse | |
| file_io | | os.path.realpath, filepath.Clean | open, fs::read_to_string, send_file |
| fmt_string | | | printf(var) |
| sql_query | | parameterized query binders | cursor.execute, db.query with concatenation |
| deserialize | | | pickle.loads, yaml.load, Marshal.load |
| ssrf | | URL-prefix locks | requests.get, fetch, HttpClient.send |
| code_exec | | | eval, exec, Function |
| crypto | | | weak-algorithm constructors |
| unauthorized_id | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write |
| all | Sources typically use all so they match any sink | | |
Sources typically use cap = "all" so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
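A minimal sketch of the capability rule, assuming taint is modelled as a set of cap names: a source tagged all starts with every cap, each sanitizer removes only the cap it declares, and a sink fires when its required cap is still present. The cap subset and the lookup tables are illustrative, and the function name is hypothetical.

```python
ALL_CAPS = frozenset({"html_escape", "shell_escape", "sql_query", "file_io"})

# Sanitizer -> the one cap it clears; sink -> the cap it requires.
SANITIZERS = {"html.escape": "html_escape", "shlex.quote": "shell_escape"}
SINKS = {"innerHTML": "html_escape", "system": "shell_escape"}


def sink_fires(sanitizer_calls, sink):
    """Start from a source tagged `all`, strip caps, then test the sink."""
    taint = set(ALL_CAPS)
    for call in sanitizer_calls:
        taint.discard(SANITIZERS[call])  # clears only the declared cap
    return SINKS[sink] in taint


print(sink_fires([], "system"))               # True: nothing sanitized
print(sink_fires(["shlex.quote"], "system"))  # False: shell cap cleared
print(sink_fires(["html.escape"], "system"))  # True: wrong sanitizer for this sink
```

The third call is the case that matters in practice: escaping for HTML does nothing for a shell sink, so the finding still fires.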