Quick start

After cargo install nyx-scanner (or dropping a release binary on your PATH), point Nyx at a directory:

nyx scan ./my-project

First run builds a SQLite index under .nyx/; later runs skip files whose content hash hasn’t changed.

What a finding looks like

[Screenshot: nyx scan output showing HIGH taint flows from req.params.user, req.query.url, and req.query.path into exec/fetch/fs.readFileSync]

The same scan in console form:

/tmp/demo/cmdi_direct.py
  6:5  ✖ [HIGH] taint-unsanitised-flow (source 5:11)  (Score: 81, Confidence: High)
      Unsanitised user input flows from request.args.get → os.system

      Source: request.args.get (5:11)
      Sink:   os.system

  6:5  ✖ [HIGH] py.cmdi.os_system  (Score: 64, Confidence: High)
      os.system() runs a shell command

/tmp/demo/xss_document_write.js
  5:5  ✖ [HIGH] taint-unsanitised-flow (source 3:18)  (Score: 81, Confidence: High)
      Unsanitised user input flows from req.query.content → document.write

      Source: req.query.content (3:18)
      Sink:   document.write

  5:5  ⚠ [MEDIUM] js.xss.document_write  (Score: 34, Confidence: High)
      document.write() is an XSS sink

warning 'demo' generated 10 issues.
Finished in 0.054s.

Each finding is one line of header plus evidence. Fields that matter:

Field	Meaning
`[HIGH]` / `[MEDIUM]` / `[LOW]`	Severity after the non-prod downgrade
Rule ID	Either a taint rule (`taint-unsanitised-flow`), a structural rule (`cfg-`, `state-`), or an AST pattern (`<lang>.<category>.<name>`)
Score	Attack-surface ranking (severity + analysis kind + source kind + evidence). Higher is more exploitable
Confidence	`High`, `Medium`, `Low`. Drops for AST-only matches, capped widened flows, and findings that backwards analysis judged infeasible (those are lowered to `Low`)
Source / Sink	Where tainted data entered and where the dangerous call happened

Two rules firing on the same line (the taint finding plus the AST pattern) is normal. The pattern matches the structural presence of document.write; the taint rule adds the evidence that req.query.content actually reached it. Both carry distinct rule IDs so suppressions can target one without the other.
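When post-processing findings, it can help to sort rule IDs into the three families from the table above. The following is a minimal Python sketch, not part of Nyx: the `classify_rule_id` helper and the regular expression for the `<lang>.<category>.<name>` shape are assumptions made for illustration, as is treating every `taint-` prefix as a taint rule.

```python
import re

# Assumed shape of an AST pattern ID: <lang>.<category>.<name>,
# for example py.cmdi.os_system or js.xss.document_write.
_AST_PATTERN = re.compile(r"^[a-z]+\.[a-z_]+\.[a-z0-9_]+$")


def classify_rule_id(rule_id):
    """Return 'taint', 'structural', 'ast-pattern', or 'unknown'."""
    if rule_id.startswith("taint-"):
        return "taint"
    if rule_id.startswith(("cfg-", "state-")):
        return "structural"
    if _AST_PATTERN.match(rule_id):
        return "ast-pattern"
    return "unknown"


for rid in ("taint-unsanitised-flow", "py.cmdi.os_system", "js.xss.document_write"):
    print(rid, "->", classify_rule_id(rid))
```

Grouping by family makes it easy to see when a line carries both a taint finding and its matching AST pattern.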

Fail a CI job on High findings

nyx scan . --fail-on HIGH --quiet

Exits with code 1 if any HIGH finding remains. --quiet drops the “Using default configuration” banner so CI logs stay tidy.
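If your CI system drives steps from Python rather than from a shell, the same gate can be wrapped in a few lines. This is a sketch under the assumption that `nyx` is on PATH; `run_gate` is a hypothetical helper name, and it treats any non-zero exit (a tripped gate or a scan error) as a failure.

```python
import subprocess
import sys


def run_gate(argv):
    """Run a scan command and return True only when it exits with code 0."""
    completed = subprocess.run(argv)
    return completed.returncode == 0


# Typical call from a CI script (requires nyx on PATH):
#   ok = run_gate(["nyx", "scan", ".", "--fail-on", "HIGH", "--quiet"])
#   sys.exit(0 if ok else 1)
```

Passing the command as a list avoids shell quoting issues when the scan path contains spaces.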

Emit SARIF for GitHub Code Scanning

nyx scan . --format sarif > results.sarif

Full SARIF schema and GitHub Actions wiring: cli.md and output.md.
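For a quick look at a results file before uploading it, a per-rule count is often enough. The sketch below assumes only the standard SARIF 2.1.0 layout (`runs[].results[].ruleId`); it makes no claim about any Nyx-specific properties in the file, and the function names are illustrative.

```python
import json
from collections import Counter


def count_by_rule(sarif):
    """Count SARIF results per ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


def load_and_count(path):
    """Read a SARIF file from disk and count its results per rule."""
    with open(path, encoding="utf-8") as fh:
        return count_by_rule(json.load(fh))


# Example: load_and_count("results.sarif").most_common(5)
```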

Tighten the gate

# Only HIGH findings
nyx scan . --severity HIGH

# HIGH + MEDIUM
nyx scan . --severity ">=MEDIUM"

# Drop anything below Medium confidence (useful for CI)
nyx scan . --min-confidence medium

# Also drop findings the engine could not fully resolve (widened / bailed)
nyx scan . --require-converged

--require-converged keeps under-report findings (the emitted flow is still real) but drops over-reports and widenings. Intended for strict gates where a noisy finding is worse than nothing.

Skip dataflow for a fast first pass

nyx scan . --mode ast

AST-only mode runs tree-sitter patterns without building a CFG or running taint. It’s fast and still catches banned-API uses, weak crypto, and obvious XSS sinks, but it can’t tell eval("1+1") apart from eval(userInput). Use it as a pre-commit filter, not as a CI gate replacement.

CLI reference for every flag and subcommand.
Configuration for the nyx.conf / nyx.local schema, profiles, and custom rules.
nyx serve for the browser UI, triage workflow, and scan history.
Language maturity for per-language tier and known FP/FN patterns.

Installation

For the happy path (cargo install nyx-scanner, release binary on PATH), see the README. This page covers platform-specific notes and upgrade paths.

Supported platforms

Release binaries are published for:

Platform	Archive
Linux x86_64	`nyx-x86_64-unknown-linux-gnu.zip`
macOS Intel	`nyx-x86_64-apple-darwin.zip`
macOS Apple Silicon	`nyx-aarch64-apple-darwin.zip`
Windows x86_64	`nyx-x86_64-pc-windows-msvc.zip`

Building from source works on any target supported by stable Rust 1.88 or later (the crate uses edition 2024).

Verify the download

Each release attaches a SHA256SUMS file. When the maintainer signs the release, a detached SHA256SUMS.asc is published alongside it.

# Verify the checksum file's signature (skip if .asc isn't present)
gpg --verify SHA256SUMS.asc SHA256SUMS

# Then check your archive against it
sha256sum -c SHA256SUMS --ignore-missing

If sha256sum is missing on macOS, shasum -a 256 -c SHA256SUMS --ignore-missing is equivalent.
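Where neither tool is available (some minimal containers, for example), the same check can be done with the Python standard library. This sketch assumes the usual sha256sum line format of a hex digest, whitespace, and a file name; the helper names are illustrative.

```python
import hashlib


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large archives do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_hash(sums_text, filename):
    """Find filename's digest in sha256sum-style text, or return None."""
    for line in sums_text.splitlines():
        parts = line.split(maxsplit=1)
        # A leading '*' marks binary mode in sha256sum output.
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(archive_path, sums_text, filename):
    """True when the archive's digest matches its entry in the checksum text."""
    want = expected_hash(sums_text, filename)
    return want is not None and sha256_of(archive_path) == want
```

An archive that has no entry in the checksum text fails verification here, which is stricter than `--ignore-missing`.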

Windows

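Expand-Archive -Path nyx-x86_64-pc-windows-msvc.zip -DestinationPath .
# Create the install directory first; Move-Item fails if it does not exist (needs an elevated shell)
New-Item -ItemType Directory -Force -Path "C:\Program Files\Nyx" | Out-Null
Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\"
# Add C:\Program Files\Nyx to PATH in System Properties → Environment Variables
nyx --version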

Build from source

git clone https://github.com/elicpeter/nyx.git
cd nyx
cargo build --release
# Binary at target/release/nyx

The frontend is built and embedded into the binary during cargo build, so there’s no separate step for nyx serve. Node is only required if you’re working on the frontend itself; see CONTRIBUTING.md.

Optional features:

Flag	Adds
`--features smt`	Bundles Z3 for stronger path-constraint solving. Z3 is MIT-licensed; distributors should include Z3’s license in their attribution
`--features smt-system-z3`	Links against a system-installed Z3 instead of bundling

Upgrading

Nyx stores its scanner version in the project’s index database. When the binary’s version differs from the stored version, the index is wiped on the next scan and rebuilt against the new engine. You’ll see one info-level log line:

engine version changed (0.4.0 → 0.5.0), rebuilding index

No flag needed. If you see this on every scan, the metadata row isn’t being persisted; file an issue.

Corrupt database recovery

If the SQLite file itself is damaged (killed scan, full disk), delete it and let the next scan rebuild from scratch:

rm "$(nyx config path)"/<project>.sqlite*

Only the named project’s index is removed; other projects are unaffected.

CLI Reference

Global

nyx [COMMAND]
nyx --version
nyx --help

`nyx scan`

Run a security scan on a directory.

nyx scan [PATH] [OPTIONS]

PATH defaults to . (current directory).

Analysis Mode

Flag	Default	Description
`--mode <MODE>`	`full`	Analysis mode: `full`, `ast`, `cfg`, or `taint`

Mode	What runs
`full`	AST patterns + CFG structural analysis + taint analysis
`ast`	AST patterns only (fastest, no CFG or taint)
`cfg` / `taint`	CFG + taint analysis only (no AST patterns)

Deprecated aliases: --ast-only (use --mode ast), --cfg-only (use --mode cfg), --all-targets (use --mode full).

Index Control

Flag	Default	Description
`--index <MODE>`	`auto`	Index behavior: `auto`, `off`, or `rebuild`

Index Mode	Behavior
`auto`	Use existing index if available; build if missing
`off`	Skip indexing, scan filesystem directly
`rebuild`	Force rebuild index before scanning

Deprecated aliases: --no-index (use --index off), --rebuild-index (use --index rebuild).

Output

Flag	Default	Description
`-f, --format <FMT>`	`console`	Output format: `console`, `json`, or `sarif`
`--quiet`	off	Suppress status messages (stderr), including the Preview-tier banner for C/C++ scans
`--no-rank`	off	Disable attack-surface ranking
`--no-state`	off	Disable state-model analysis (resource lifecycle + auth state). Overrides `scanner.enable_state_analysis`

Profiles

Flag	Default	Description
`--profile <NAME>`	(none)	Apply a named scan profile. Built-ins: `quick`, `full`, `ci`, `taint_only`, `conservative_large_repo`. User-defined profiles override built-ins with the same name. CLI flags still take precedence over profile values

Filtering

Flag	Default	Description
`--severity <EXPR>`	(none)	Filter findings by severity
`--min-score <N>`	(none)	Drop findings with rank score below N
`--min-confidence <LEVEL>`	(none)	Drop findings below this confidence level (`low`, `medium`, `high`)
`--require-converged`	off	Drop findings whose engine provenance notes indicate widening (over-report) or analysis bail. Keeps `under-report` findings (emitted flow is still real). Intended for strict CI gates.
`--fail-on <SEV>`	(none)	Exit with code 1 if any finding is at or above this severity
`--show-suppressed`	off	Show inline-suppressed findings (dimmed, tagged `[SUPPRESSED]`)
`--keep-nonprod-severity`	off	Don’t downgrade severity for test/vendor paths
`--all`	off	Disable category filtering, rollups, and LOW budgets – show everything
`--include-quality`	off	Include Quality-category findings (hidden by default)
`--max-low <N>`	`20`	Maximum total LOW findings to show
`--max-low-per-file <N>`	`1`	Maximum LOW findings per file
`--max-low-per-rule <N>`	`10`	Maximum LOW findings per rule
`--rollup-examples <N>`	`5`	Number of example locations in rollup findings
`--show-instances <RULE>`	(none)	Expand all instances of a specific rule (bypass rollup)

Severity expression formats:

--severity HIGH              # Only high
--severity "HIGH,MEDIUM"     # High or medium
--severity ">=MEDIUM"        # Medium and above (high + medium)
--severity ">= low"         # All severities (case-insensitive)
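The four forms above can be read as producing a set of accepted severities. Here is a Python sketch of that reading; it is an illustration of the documented semantics, not Nyx’s parser, and the `parse_severity_expr` name and its error handling are assumptions.

```python
ORDER = ["LOW", "MEDIUM", "HIGH"]  # ascending severity


def parse_severity_expr(expr):
    """Turn a --severity expression into the set of accepted severities."""
    text = expr.strip().upper()  # expressions are case-insensitive
    if text.startswith(">="):
        floor = text[2:].strip()
        if floor not in ORDER:
            raise ValueError("unknown severity: " + floor)
        return set(ORDER[ORDER.index(floor):])
    levels = {part.strip() for part in text.split(",") if part.strip()}
    unknown = levels - set(ORDER)
    if unknown or not levels:
        raise ValueError("unknown severity expression: " + expr)
    return levels


for example in ("HIGH", "HIGH,MEDIUM", ">=MEDIUM", ">= low"):
    print(example, "->", sorted(parse_severity_expr(example)))
```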

Deprecated aliases: --high-only (use --severity HIGH), --include-nonprod (use --keep-nonprod-severity).

--fail-on returns a non-zero exit code when the threshold trips, so CI jobs fail without further wiring:

[Screenshot: nyx scan with --fail-on HIGH against a small fixture: three HIGH taint findings printed, followed by exit=1 from the shell]

Quality-category and rollup-prone Low findings are filtered down by default. The footer tells you exactly what got dropped and which knob to turn:

[Screenshot: nyx scan tail: warning '*' generated 57 issues; Suppressed 92 LOW/Quality findings; Active filters max_low=20, max_low_per_file=1, max_low_per_rule=10; Use --include-quality, --max-low, or --all to adjust]
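To see how the three LOW budgets interact, the sketch below applies them in a single pass. It is a simplified model under stated assumptions: findings are visited in input order, rollups are ignored, and the `severity`, `file`, and `rule` keys are hypothetical field names. The real engine may rank or roll up findings before applying budgets.

```python
from collections import Counter


def apply_low_budgets(findings, max_low=20, max_low_per_file=1, max_low_per_rule=10):
    """Keep every non-LOW finding; keep LOW findings until a budget is spent."""
    kept = []
    total = 0
    per_file = Counter()
    per_rule = Counter()
    for finding in findings:
        if finding["severity"] != "LOW":
            kept.append(finding)
            continue
        over_budget = (
            total >= max_low
            or per_file[finding["file"]] >= max_low_per_file
            or per_rule[finding["rule"]] >= max_low_per_rule
        )
        if over_budget:
            continue  # this is the kind of finding the footer reports as suppressed
        total += 1
        per_file[finding["file"]] += 1
        per_rule[finding["rule"]] += 1
        kept.append(finding)
    return kept
```

With the defaults, a file that produces three LOW findings contributes only one to the output, which is why raising `--max-low` alone often changes little until `--max-low-per-file` is raised as well.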

Analysis Engine Toggles

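These flags override the corresponding [analysis.engine] values in nyx.conf for a single run. Toggles default on unless the table below or the active engine profile says otherwise (backwards analysis defaults off); pass the --no-* variant to disable one.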

Pair	Config field	Effect when disabled
`--constraint-solving` / `--no-constraint-solving`	`constraint_solving`	Skip path-constraint solving; infeasible paths no longer pruned
`--abstract-interp` / `--no-abstract-interp`	`abstract_interpretation`	Skip interval / string / bit abstract domains
`--context-sensitive` / `--no-context-sensitive`	`context_sensitive`	Treat intra-file callees insensitively (summary-only)
`--symex` / `--no-symex`	`symex.enabled`	Skip the symex pipeline; no symbolic verdicts or witnesses
`--cross-file-symex` / `--no-cross-file-symex`	`symex.cross_file`	Skip extracting / consulting cross-file SSA bodies
`--symex-interproc` / `--no-symex-interproc`	`symex.interprocedural`	Cap symex frame stack at the entry function
`--smt` / `--no-smt`	`symex.smt`	Skip the SMT backend (still a no-op without the `smt` feature)
`--backwards-analysis` / `--no-backwards-analysis`	`backwards_analysis`	Skip the demand-driven backwards taint walk from sinks (default off, so pass `--backwards-analysis` to enable it)
`--parse-timeout-ms <N>`	`parse_timeout_ms`	Per-file tree-sitter parse timeout (ms); `0` disables the cap

Lattice-width Caps

Two caps bound the width of taint origin sets and points-to sets per SSA value. When a set would exceed the cap, entries are truncated deterministically and an engine note (OriginsTruncated / PointsToTruncated) is recorded on affected findings so you can see when precision was lost.

Flag	Default	Description
`--max-origins <N>`	`32`	Max taint origins retained per lattice value. Raise on very wide codebases where truncation is observed; lower only when lattice width is a measured bottleneck. Also set via `NYX_MAX_ORIGINS`
`--max-pointsto <N>`	`32`	Max abstract heap objects retained per points-to set. Raise on factory-heavy codebases where truncation is observed. Also set via `NYX_MAX_POINTSTO`

See configuration.md for the full schema.
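As a mental model for the caps, deterministic truncation can be pictured as "put the set in a fixed order, keep the first N, record a note". The Python sketch below assumes plain sorting as the fixed order; the ordering Nyx actually uses is not documented here, and `cap_set` is a hypothetical helper.

```python
def cap_set(values, cap=32, note="OriginsTruncated"):
    """Keep at most `cap` entries; return (kept, note or None)."""
    ordered = sorted(values)  # a fixed order makes the truncation reproducible
    if len(ordered) <= cap:
        return ordered, None
    return ordered[:cap], note


kept, note = cap_set({"req.query.b", "req.query.a", "req.query.c"}, cap=2)
print(kept, note)
```

The point of the note is that a finding produced from a truncated set can be recognised later, for example when deciding whether to raise the cap.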

Engine-Depth Profile

Individual engine toggles are fine-grained but hard to remember in combination. The --engine-profile shortcut sets the whole stack in one shot, and individual flags are layered on top after the profile is applied.

Profile	Backwards	Symex	Abstract-interp	Context-sensitive
`fast`	off	off	off	off
`balanced` (default)	off	off	on	on
`deep`	on	on (cross-file + interprocedural)	on	on

All three profiles build the AST, CFG, and SSA lattice and run forward taint; the columns above show which additional analyses each profile enables. SMT (symex.smt) is always off unless Nyx was built with --features smt.

Individual flags override the profile. For example, --engine-profile fast --backwards-analysis runs the fast stack but with backwards analysis on.
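The layering can be sketched in a few lines of Python. The table values are copied from the profile table above; the dictionary keys and the `resolve_engine` function are illustrative names, not Nyx internals.

```python
PROFILES = {
    "fast": {"backwards": False, "symex": False, "abstract_interp": False, "context_sensitive": False},
    "balanced": {"backwards": False, "symex": False, "abstract_interp": True, "context_sensitive": True},
    "deep": {"backwards": True, "symex": True, "abstract_interp": True, "context_sensitive": True},
}


def resolve_engine(profile="balanced", **overrides):
    """Start from the profile, then layer explicit per-flag overrides on top."""
    if profile not in PROFILES:
        raise ValueError("unknown engine profile: " + profile)
    resolved = dict(PROFILES[profile])
    for key, value in overrides.items():
        if key not in resolved:
            raise KeyError(key)
        if value is not None:  # None means "flag not given on the command line"
            resolved[key] = value
    return resolved


# Mirrors: --engine-profile fast --backwards-analysis
print(resolve_engine("fast", backwards=True))
```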

Explain Effective Engine

--explain-engine prints the resolved engine configuration (profile + config + CLI overrides + env-var fallbacks) to stdout and exits without scanning. Useful for sanity-checking a CI invocation.

nyx scan --engine-profile deep --no-smt --explain-engine

[Screenshot: nyx scan --engine-profile deep --explain-engine output: resolved config showing every analysis pass, its current state, and the CLI flag/env var that controls it]

Examples

# Basic scan
nyx scan

# Scan specific path, JSON output
nyx scan ./server --format json

# CI gate: fail on medium+, SARIF output
nyx scan . --format sarif --fail-on medium > results.sarif

# Fast AST-only scan, no index
nyx scan . --mode ast --index off

# High-severity only, quiet mode
nyx scan . --severity HIGH --quiet

# Only findings scoring 50 or above
nyx scan . --min-score 50

# Only medium+ confidence findings
nyx scan . --min-confidence medium

# Show everything (no filtering, no rollups)
nyx scan . --all

# Include quality findings but keep rollups and budgets
nyx scan . --include-quality

# See all unwrap findings expanded
nyx scan . --include-quality --show-instances rs.quality.unwrap

# Allow more LOW findings
nyx scan . --max-low 50 --max-low-per-file 5

`nyx index`

Manage the SQLite file index.

`nyx index build`

nyx index build [PATH] [--force]

Build or update the index for the given path (default: .).

Flag	Description
`-f, --force`	Force full rebuild, ignoring cached file hashes
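The effect of --force is easiest to see next to the normal incremental path. The sketch below models an index that remembers one content hash per file. SHA-256 is a stand-in, since the hash Nyx uses is not stated here, and `files_to_rescan` is a hypothetical helper.

```python
import hashlib


def files_to_rescan(current, cached_hashes, force=False):
    """Return paths whose content hash differs from the cache (all paths when forced).

    current maps path -> file bytes; cached_hashes maps path -> hex digest.
    """
    stale = []
    for path, content in sorted(current.items()):
        digest = hashlib.sha256(content).hexdigest()
        if force or cached_hashes.get(path) != digest:
            stale.append(path)
    return stale
```

A file that is new to the index has no cached hash, so it is always rescanned.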

`nyx index status`

nyx index status [PATH]

Display index statistics (file count, size, last modified) for the given path.

[Screenshot: nyx index status output: project name, index path under the platform config dir, exists/size/modified fields]

`nyx list`

nyx list [-v]

List all indexed projects.

Flag	Description
`-v, --verbose`	Show detailed information per project

`nyx clean`

nyx clean [PROJECT] [--all]

Remove index data.

Argument/Flag	Description
`PROJECT`	Project name or path to clean
`--all`	Clean all indexed projects

`nyx config`

Manage configuration.

`nyx config show`

Print the effective merged configuration as TOML. Useful for sanity-checking what the scanner is actually using after nyx.conf and nyx.local merge:

[Screenshot: nyx config show output: TOML dump of the merged scanner config showing [scanner] mode/min_severity/excluded_extensions/excluded_directories, [database] settings, and resolved engine toggles]

`nyx config path`

Print the configuration directory path.

`nyx config add-rule`

nyx config add-rule --lang <LANG> --matcher <MATCHER> --kind <KIND> --cap <CAP>

Add a custom taint rule. Written to nyx.local.

Flag	Values
`--lang`	`rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby`
`--matcher`	Function or property name to match
`--kind`	`source`, `sanitizer`, `sink`
`--cap`	`env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all`

`nyx config add-terminator`

nyx config add-terminator --lang <LANG> --name <NAME>

Add a terminator function (e.g. process.exit). Written to nyx.local.

Exit codes

See output.md. Summary: 0 on success (including findings without --fail-on), 1 when --fail-on trips, non-zero on scan errors.

Environment variables

Runtime behaviour:

Variable	Description
`RUST_LOG`	Set tracing verbosity (e.g. `RUST_LOG=debug nyx scan .`)
`NO_COLOR`	Disable ANSI color output

Engine toggles (legacy, still honored; prefer CLI flags or [analysis.engine] config):

Variable	Matches
`NYX_CONSTRAINT`	`--constraint-solving`
`NYX_ABSTRACT_INTERP`	`--abstract-interp`
`NYX_CONTEXT_SENSITIVE`	`--context-sensitive`
`NYX_SYMEX`, `NYX_CROSS_FILE_SYMEX`, `NYX_SYMEX_INTERPROC`	`--symex` and friends
`NYX_SMT`	`--smt` (no-op without the `smt` feature)
`NYX_BACKWARDS`	`--backwards-analysis`
`NYX_PARSE_TIMEOUT_MS`	`--parse-timeout-ms`
`NYX_MAX_ORIGINS`, `NYX_MAX_POINTSTO`	`--max-origins`, `--max-pointsto`

`nyx serve`: the browser UI

The CLI is fine for CI. For triage, you want context: the source snippet, the dataflow path, the history of how a finding has moved across scans, and a place to record decisions that survive the next run. nyx serve boots a local React UI bound to loopback.

nyx serve                         # opens http://localhost:9700 in your default browser
nyx serve ./my-project            # serve a specific project root
nyx serve --port 9750             # override port
nyx serve --no-browser            # don't auto-open

Persistent settings live under [server] in nyx.conf / nyx.local.

[Screenshot: Nyx UI overview: total findings, severity breakdown, language and category distribution, top affected files]

What it serves, and what it doesn’t

The frontend is built and embedded into the nyx binary at compile time. There’s no separate install step, and the binary serves the entire UI from memory; nothing is fetched from a CDN. The UI talks to the local Nyx process over a small JSON API.

There is no account, no telemetry, no remote logging, no auto-update ping. The data the UI shows is the data on your disk: the SQLite project index plus .nyx/triage.json.

Security model

nyx serve enforces three things at the HTTP layer (src/server/security.rs):

Loopback bind only. --host and [server].host are clamped to 127.0.0.1, localhost, or ::1. Any other value is refused at startup with the error “Nyx serve only binds to loopback addresses; refused host '<value>'”.
Host-header check. Every request must carry a Host header that matches the bound address and port. Missing or mismatched headers get a 400 response with “invalid Host header”. This defends against DNS rebinding.
CSRF on mutations. POST / PUT / PATCH / DELETE requests must carry a per-process CSRF token in the x-nyx-csrf header. The token is generated once when the server starts and exposed at GET /api/health so the embedded SPA can read it. Cross-origin mutations are rejected before the CSRF check via the Origin header.
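The three checks compose in a fixed order, which the following Python sketch tries to make concrete. It is a simplified model, not the code in src/server/security.rs: header names are assumed to be lower-cased already, IPv6 bracket syntax in Host is ignored, and the function names and rejection reasons other than the two documented messages are assumptions.

```python
import hmac

LOOPBACK = {"127.0.0.1", "localhost", "::1"}
MUTATING = {"POST", "PUT", "PATCH", "DELETE"}


def validate_bind_host(host):
    """Refuse any bind address that is not a loopback name."""
    if host not in LOOPBACK:
        raise ValueError(
            "Nyx serve only binds to loopback addresses; refused host '%s'" % host
        )
    return host


def check_request(method, headers, bound_host, bound_port, csrf_token):
    """Return (allowed, reason) for a request against the bound address."""
    expected_host = "%s:%d" % (bound_host, bound_port)
    if headers.get("host") != expected_host:
        return False, "invalid Host header"
    if method.upper() in MUTATING:
        origin = headers.get("origin")
        # The Origin check runs before the CSRF token check.
        if origin is not None and origin != "http://" + expected_host:
            return False, "cross-origin mutation"
        supplied = headers.get("x-nyx-csrf", "")
        # Constant-time comparison avoids leaking the token through timing.
        if not hmac.compare_digest(supplied.encode(), csrf_token.encode()):
            return False, "missing or wrong CSRF token"
    return True, "ok"
```

Read-only requests only need a matching Host header; mutations additionally need a same-origin request and the token.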

If you forward the port over SSH or expose it through a reverse proxy, the host-header check will reject the request because the Host won’t match localhost:9700. That’s the intended behaviour. Don’t do this without a deliberate reason; the loopback bind is part of the security model.

The pages

Path	Page
`/`	Overview
`/findings`	Findings list
`/findings/:id`	Finding detail
`/triage`	Triage
`/explorer`	Explorer
`/scans`	Scans
`/scans/:id`	Scan detail and compare
`/rules`	Rules
`/rules/:id`	Rule detail
`/config`	Config

The numeric :id for finding URLs is the position index in the current scan, not a stable fingerprint. Bookmarks across scans aren’t reliable; rely on file path + line.

Overview and Health Score

The overview is the landing page after a scan. It shows severity counts, top affected files, OWASP coverage, and a 0 to 100 Health Score with a letter grade.

How the Health Score is calculated

Two things drive the score: the density of risk in the codebase, and hard guardrails that decide what the grade can mean.

Each finding contributes weight = severity_base × confidence_factor × verdict_factor × context_factor:

Severity base: HIGH 10, MEDIUM 3, LOW (security) 0.5
Confidence: High 1.0, Medium 0.6, Low 0.3
Symex verdict: Confirmed 1.2, NotAttempted 1.0, Inconclusive 0.7, Infeasible 0.1
Context: cross-file taint flow 1.15, intra-file flow 1.0, AST-only or no flow 0.75, test path 0.3

Quality lints (rule IDs containing .quality.) skip the per-finding weight and instead apply a saturating drag, capped at 15 points (so 1000 unwrap lints don’t grade worse than 300 do). Total weight is divided by sqrt(files / 100), clamped between 1 and roughly 22, so a 100-file repo divides by 1 and a 50000-file repo by about 22, but a monorepo can’t dilute its way out of a real HIGH.
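A worked example makes the per-finding weight and the size divisor easier to check by hand. The factor values in this Python sketch are copied from the list above; the dictionary keys for context, the exact upper clamp of 22.0, and the function names are assumptions, and the quality drag and log curve are not modelled.

```python
import math

SEVERITY_BASE = {"HIGH": 10.0, "MEDIUM": 3.0, "LOW": 0.5}
CONFIDENCE = {"High": 1.0, "Medium": 0.6, "Low": 0.3}
VERDICT = {"Confirmed": 1.2, "NotAttempted": 1.0, "Inconclusive": 0.7, "Infeasible": 0.1}
CONTEXT = {"cross_file": 1.15, "intra_file": 1.0, "ast_only": 0.75, "test_path": 0.3}


def finding_weight(severity, confidence, verdict, context):
    """weight = severity_base x confidence_factor x verdict_factor x context_factor"""
    return SEVERITY_BASE[severity] * CONFIDENCE[confidence] * VERDICT[verdict] * CONTEXT[context]


def size_divisor(files, upper=22.0):
    """sqrt(files / 100), clamped to [1, upper]; upper=22.0 is an assumed value."""
    return min(upper, max(1.0, math.sqrt(files / 100)))


# A confirmed cross-file HIGH: 10 x 1.0 x 1.2 x 1.15 = 13.8
print(round(finding_weight("HIGH", "High", "Confirmed", "cross_file"), 3))
# An AST-only MEDIUM at Medium confidence, inconclusive: 3 x 0.6 x 0.7 x 0.75 = 0.945
print(round(finding_weight("MEDIUM", "Medium", "Inconclusive", "ast_only"), 3))
```

One confirmed cross-file HIGH outweighs more than a dozen of the weaker MEDIUM findings, which matches the intent that credible HIGHs dominate the score.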

The result feeds a log curve into a 0 to 100 base, minus the quality drag. Then HIGH guardrails apply, keyed on the credibility-adjusted HIGH count rather than the raw count:

effective HIGH	ceiling
0	100
1	85
2	78
3 to 5	68
6 to 10	58
11+	45

A repo with zero effective HIGHs never grades below a C (70). That floor is the structural promise that the score isn’t an automated F-machine for projects that have lots of LOW noise but no critical issues.
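The ceiling table and the zero-HIGH floor can be written as a pair of small functions. This is a sketch of the guardrails as described, with hypothetical function names; it assumes the effective HIGH count has already been reduced to an integer and leaves out the ±5 modifiers.

```python
def high_ceiling(effective_high):
    """Maximum score allowed for a given credibility-adjusted HIGH count."""
    if effective_high <= 0:
        return 100
    if effective_high == 1:
        return 85
    if effective_high == 2:
        return 78
    if effective_high <= 5:
        return 68
    if effective_high <= 10:
        return 58
    return 45


def apply_guardrails(base_score, effective_high):
    """Clamp the base score to its ceiling, then apply the zero-HIGH floor of 70."""
    score = min(base_score, high_ceiling(effective_high))
    if effective_high == 0:
        score = max(score, 70)
    return score


print(apply_guardrails(95, 1))   # a single effective HIGH caps the score at 85
print(apply_guardrails(40, 0))   # LOW noise alone cannot push the score under 70
```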

Modifiers in the ±5 range nudge the result for trend (only after the second scan), triage coverage (only when total findings ≥ 20), reintroduced findings, and stale HIGHs more than 30 days old.

What the score doesn’t measure

It’s a Nyx-finding-pressure metric, not a security audit. Score 100 means Nyx didn’t find anything under its current rules and language coverage; it doesn’t certify the absence of vulnerabilities. The score doesn’t see runtime config, IAM, secret stores, dependency CVEs, or anything outside the source tree being scanned. A repo of mostly Kotlin (where Nyx coverage is thin) will score artificially well because most of the code never gets evaluated.

The current ceilings are calibrated for v0.5 scanner false-positive rates. As symex coverage and rule precision improve, the ceilings tighten. Calibration data and the rationale behind each tunable live in health-score-audit.md.

Findings and Finding detail

The findings list is filterable by severity, confidence, category, language, rule ID, and triage state.

[Screenshot: Nyx findings list: 13 findings filtered by severity/confidence/rule, with status badges, file paths, and language tags]

Clicking through opens the flow visualiser: a numbered walk from source to sink with the snippet at each step, cross-file markers when the path leaves the current file, the rule’s “How to fix” guidance, and the engine’s evidence object inline.

[Screenshot: Nyx finding detail: HIGH taint-unsanitised-flow showing source → call → sink steps, How to fix guidance, and evidence panel]

Engine notes call out when precision was bounded for that finding (OriginsTruncated, PointsToTruncated, PathWidened, ForwardBailed, etc.). Anything tagged under-report means the emitted flow is real and the result set is a lower bound; over-report means widening or bail. --require-converged in the CLI drops the over-report ones for strict gates.

Triage

Each finding carries a triage state: open, investigating, false_positive, accepted_risk, suppressed, or fixed. The triage page bulk-updates them and shows the audit trail.

[Screenshot: Nyx triage page: 13 findings need attention, severity breakdown, Findings/Suppression rules/Audit log tabs, rule chips, Investigate buttons]

State writes are persisted to SQLite immediately, and (when [server].triage_sync = true, default on) mirrored to .nyx/triage.json in the project root. Commit that file:

git add .nyx/triage.json

It carries decisions across machines so a teammate’s local scan reflects yours. The format is documented in src/server/triage_sync.rs; the schema is stable and round-trip-safe with nyx serve re-imports.

Explorer

A file tree with per-file finding counts, syntax-highlighted source, and a right rail with the file’s symbols and findings. Useful for “what’s wrong with this module” rather than “what’s wrong with this finding”.

[Screenshot: Nyx explorer: file tree with per-file finding counts, syntax-highlighted Python source with red sink marker on the os.system line, file-summary right rail with findings]

The file query parameter preselects a file: /explorer?file=src/handler.rs.

Scans and compare

Past runs are persisted when [runs].persist = true (off by default to avoid disk growth on heavy users). When persistence is on, /scans lists historical runs.

[Screenshot: Nyx scans list: completed scan run with root, duration, finding count, languages, and started timestamp]

Each run drills into a detail page with files scanned, findings count, duration, languages, and a per-pass timing breakdown.

[Screenshot: Nyx scan detail: Summary tab with files scanned, findings, duration, languages; Details panel with Scan ID, Root, Engine version, started/finished timestamps; Timing breakdown bar showing Walk/Pass 1/Call Graph/Pass 2/Post]

Pick two scans to diff and see what got introduced, fixed, or rediscovered between runs. The retention cap is [runs].max_runs (default 100). Each run can also save its log and stdout (save_logs, save_stdout); both are off by default. Code snippets are saved by default (save_code_snippets = true); turn this off if storage is tight.

Rules

Every rule the engine knows about, built-in plus user-added. Each row shows the matchers, kind (source / sanitizer / sink), capability, language, and how many findings it produced in the latest scan. Filter by language, by kind, or by free text.

[Screenshot: Nyx rules page: 218 rules with language/kind dropdowns and a matcher search; rows showing rule title, language, kind (SOURCE/SANITIZER/SINK), cap, and finding count]

User-added rules can be deleted from this page; built-ins are immutable. Built-ins live in src/labels/<lang>.rs and src/patterns/<lang>.rs; user-added entries write to nyx.local.

Config

A live config editor. Reads the merged config (nyx.conf + nyx.local), lets you flip switches and add custom source / sanitizer / sink rules, and writes back to nyx.local. Changes apply to the next scan; the running server uses its initial config snapshot.

[Screenshot: Nyx config page: General settings (analysis mode, max file size, excluded extensions, attack-surface ranking), Triage Sync toggle, Sources section with language/matcher/capability dropdowns and a per-language matcher table]

The custom-rule form picks a language, a matcher (function or property name), and a capability. The capability list matches the Cap bitflags the taint engine uses; see rules.md for what each one means.

API surface

For tooling, the JSON endpoints under /api/ are stable enough to script against. The full route map lives in src/server/routes/mod.rs. Mutating endpoints require the x-nyx-csrf header (read it from GET /api/health).

Disabling

If you don’t want the UI for a project, set:

[server]
enabled = false

nyx serve will refuse to start. The CLI continues to work.

Configuration

Nyx uses TOML configuration files. A default config is auto-generated on first run. If you’d rather edit settings and rules from the browser, the Config page in nyx serve is a live editor that writes back to nyx.local:

[Screenshot: Nyx config page: General settings, Triage Sync toggle, Sources panel with language/matcher/capability dropdowns and a per-language matcher table]

File Locations

Platform	Directory
Linux	`~/.config/nyx/`
macOS	`~/Library/Application Support/nyx/`
Windows	`%APPDATA%\elicpeter\nyx\config\`

Run nyx config path to see the exact directory on your system.

File Precedence

nyx.conf – Default config (auto-created from built-in template on first run)
nyx.local – User overrides (loaded on top of defaults)

Both files are optional. CLI flags take precedence over both.

Merge Strategy

Type	Behavior
Scalars (`mode`, `min_severity`, booleans)	User value wins
Arrays (`excluded_extensions`, `excluded_directories`, `excluded_files`)	Union + deduplicate
Analysis rules	Per-language union with deduplication
Profiles	User profile with same name fully replaces built-in
Server / Runs	User value wins (full section override)

Example:

# nyx.conf (default):
excluded_extensions = ["jpg", "png", "exe"]

# nyx.local (user):
excluded_extensions = ["foo", "jpg"]

# Effective result:
# ["exe", "foo", "jpg", "png"]  -- sorted, deduped union
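The scalar and array rows of the merge table can be expressed as a short Python sketch. It covers only flat scalars and arrays (not per-language rules, profiles, or whole-section overrides), and `merge_config` is an illustrative name rather than a Nyx function.

```python
def merge_config(defaults, user):
    """Merge user overrides onto defaults: arrays are unioned, scalars replaced."""
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            # Union, deduplicate, and sort, as in the example above.
            merged[key] = sorted(set(merged[key]) | set(value))
        else:
            merged[key] = value  # user value wins
    return merged


defaults = {"mode": "full", "excluded_extensions": ["jpg", "png", "exe"]}
user = {"mode": "ast", "excluded_extensions": ["foo", "jpg"]}
print(merge_config(defaults, user))
```

Because arrays are unioned, nyx.local can add exclusions but cannot remove one that nyx.conf already lists.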

Full Schema

`[scanner]`

Field	Type	Default	Description
`mode`	`"full"` \| `"ast"` \| `"cfg"` \| `"taint"`	`"full"`	Analysis mode
`min_severity`	`"Low"` \| `"Medium"` \| `"High"`	`"Low"`	Minimum severity to report
`max_file_size_mb`	int \| null	16	Max file size in MiB; null = unlimited. Default is a safe ceiling for untrusted repos; lift explicitly when scanning trusted codebases with large generated files
`excluded_extensions`	[string]	`["jpg", "png", "gif", "mp4", ...]`	File extensions to skip
`excluded_directories`	[string]	`["node_modules", ".git", "target", ...]`	Directories to skip
`excluded_files`	[string]	`[]`	Specific files to skip
`read_global_ignore`	bool	`false`	Honor global ignore file (RESERVED)
`read_vcsignore`	bool	`true`	Honor `.gitignore` / `.hgignore`
`require_git_to_read_vcsignore`	bool	`true`	Require `.git` dir to apply gitignore
`one_file_system`	bool	`false`	Don’t cross filesystem boundaries
`follow_symlinks`	bool	`false`	Follow symbolic links
`scan_hidden_files`	bool	`false`	Scan dot-files
`include_nonprod`	bool	`false`	Keep original severity for test/vendor paths
`enable_state_analysis`	bool	`true`	Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires `mode = "full"` or `mode = "taint"`.

`[database]`

Field	Type	Default	Description
`path`	string	`""`	Custom SQLite DB path; empty = platform default (RESERVED)
`auto_cleanup_days`	int	`30`	Days to keep DB files (RESERVED)
`max_db_size_mb`	int	`1024`	Maximum DB size in MiB (RESERVED)
`vacuum_on_startup`	bool	`false`	Run VACUUM before indexed scans

`[output]`

Field	Type	Default	Description
`default_format`	`"console"` \| `"json"` \| `"sarif"`	`"console"`	Default output format (used when `--format` is not specified)
`quiet`	bool	`false`	Suppress status messages
`max_results`	int \| null	null	Cap number of findings; null = unlimited
`attack_surface_ranking`	bool	`true`	Enable attack-surface ranking
`min_score`	int \| null	null	Minimum rank score to include; null = no minimum
`min_confidence`	string \| null	null	Minimum confidence level (`"low"`, `"medium"`, `"high"`); null = no minimum
`include_quality`	bool	`false`	Include Quality-category findings (hidden by default)
`show_all`	bool	`false`	Disable category filtering, rollups, and LOW budgets
`max_low`	int	`20`	Maximum total LOW findings to show (rollups count as 1)
`max_low_per_file`	int	`1`	Maximum LOW findings per file (rollups count as 1)
`max_low_per_rule`	int	`10`	Maximum LOW findings per rule (rollups count as 1)
`rollup_examples`	int	`5`	Number of example locations stored in rollup findings

`[performance]`

Field	Type	Default	Description
`max_depth`	int \| null	null	Max filesystem traversal depth; null = unlimited
`min_depth`	int \| null	null	Min depth for reported entries (RESERVED)
`prune`	bool	`false`	Stop traversing into matching directories (RESERVED)
`worker_threads`	int \| null	null	Worker thread count; null/0 = auto-detect
`batch_size`	int	`100`	Files per index batch
`channel_multiplier`	int	`4`	Channel capacity = threads x multiplier
`rayon_thread_stack_size`	int	`8388608`	Rayon thread stack size in bytes (8 MiB)
`scan_timeout_secs`	int \| null	null	Per-file timeout in seconds (RESERVED)
`memory_limit_mb`	int	`512`	Max memory in MiB (RESERVED)

`[server]`

Configuration for the local web UI (nyx serve).

Field	Type	Default	Description
`enabled`	bool	`true`	Whether the serve command is enabled
`host`	string	`"127.0.0.1"`	Host to bind to (localhost by default)
`port`	int	`9700`	Port for the web UI
`open_browser`	bool	`true`	Open browser automatically on serve
`auto_reload`	bool	`true`	Auto-reload UI when scan results change
`persist_runs`	bool	`true`	Persist scan runs for history view
`max_saved_runs`	int	`50`	Maximum number of saved runs

`[runs]`

Configuration for scan run persistence and history.

Field	Type	Default	Description
`persist`	bool	`false`	Persist scan run history to disk
`max_runs`	int	`100`	Maximum number of runs to keep
`save_logs`	bool	`false`	Save scan logs with each run
`save_stdout`	bool	`false`	Save stdout capture with each run
`save_code_snippets`	bool	`true`	Save code snippets in findings

`[profiles.<name>]`

Named scan presets that override scan-related config. Activate with --profile <name>.

All fields are optional; omitted fields inherit from the base config.

Field	Type	Description
`mode`	string	Analysis mode
`min_severity`	string	Minimum severity
`max_file_size_mb`	int	Max file size in MiB
`include_nonprod`	bool	Keep original severity for test/vendor
`enable_state_analysis`	bool	Enable state analysis
`default_format`	string	Output format
`quiet`	bool	Suppress status output
`attack_surface_ranking`	bool	Enable ranking
`max_results`	int	Max findings
`min_score`	int	Min rank score
`show_all`	bool	Show all findings
`include_quality`	bool	Include quality findings
`worker_threads`	int	Worker thread count
`max_depth`	int	Max traversal depth

Built-in profiles:

Name	Description
`quick`	AST-only, medium+ severity
`full`	Full analysis with state analysis enabled
`ci`	Full analysis, medium+ severity, quiet, SARIF output
`taint_only`	Taint analysis only
`conservative_large_repo`	AST-only, high severity, 5 MiB file limit, depth 10

User-defined profiles with the same name as a built-in will override it.
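
The layering described above (base config, then profile, then CLI flags) can be sketched as a plain dictionary overlay. This is an illustration rather than Nyx's implementation: `resolve_profile`, the flattened `base` dictionary, and the two built-in entries shown are assumptions reconstructed from the tables, and the sketch assumes a user-defined profile replaces a same-named built-in wholesale.

```python
# Hypothetical subset of the built-in presets, filled in from the table above.
BUILTIN_PROFILES = {
    "quick": {"mode": "ast", "min_severity": "Medium"},
    "ci": {"mode": "full", "min_severity": "Medium",
           "quiet": True, "default_format": "sarif"},
}


def resolve_profile(base, user_profiles, name, cli_overrides=None):
    """Overlay a named profile, then CLI overrides, on the base config."""
    # A user-defined profile with the same name shadows the built-in one.
    profiles = {**BUILTIN_PROFILES, **user_profiles}
    if name not in profiles:
        raise KeyError(f"unknown profile: {name}")
    effective = dict(base)
    effective.update(profiles[name])        # omitted fields inherit from base
    effective.update(cli_overrides or {})   # CLI flags win over profile values
    return effective
```

With this shape, `resolve_profile(base, {}, "ci", {"default_format": "json"})` mirrors `nyx scan --profile ci --format json`: the profile sets `quiet`, and the CLI override replaces the profile's output format.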

`[analysis.engine]`

Release-grade switches for the optional analysis passes. Each toggle has a matching CLI flag (pair of --foo / --no-foo) that overrides the config value for a single run. These used to be NYX_* environment variables (NYX_CONSTRAINT, NYX_ABSTRACT_INTERP, NYX_SYMEX, NYX_CROSS_FILE_SYMEX, NYX_SYMEX_INTERPROC, NYX_CONTEXT_SENSITIVE, NYX_PARSE_TIMEOUT_MS, NYX_SMT); those env vars are still honored as a last-resort override when nyx is used as a library (no CLI entry point), but the config/CLI surface is the stable path.

Field	Type	Default	Description
`constraint_solving`	bool	`true`	Path-constraint solving (prunes infeasible paths in taint)
`abstract_interpretation`	bool	`true`	Interval / string / bit abstract domains carried through the SSA worklist
`context_sensitive`	bool	`true`	k=1 context-sensitive callee inlining for intra-file calls
`backwards_analysis`	bool	`false`	Demand-driven backwards taint walk from sinks (adds scan time; default off)
`parse_timeout_ms`	int	`10000`	Per-file tree-sitter parse timeout; `0` disables the cap

[analysis.engine.symex] sub-section:

Field	Type	Default	Description
`enabled`	bool	`true`	Run the symex pipeline after taint; adds witness strings and symbolic verdicts
`cross_file`	bool	`true`	Persist / consult cross-file SSA bodies so symex can reason about callees defined in other files
`interprocedural`	bool	`true`	Intra-file interprocedural symex (k ≥ 2 via frame stack)
`smt`	bool	`true`	Use the SMT backend when nyx is built with the `smt` feature; ignored otherwise

CLI flag map (boolean toggles take a --<flag> / --no-<flag> pair; parse_timeout_ms takes a value):

Config field	CLI flags
`constraint_solving`	`--constraint-solving` / `--no-constraint-solving`
`abstract_interpretation`	`--abstract-interp` / `--no-abstract-interp`
`context_sensitive`	`--context-sensitive` / `--no-context-sensitive`
`backwards_analysis`	`--backwards-analysis` / `--no-backwards-analysis`
`parse_timeout_ms`	`--parse-timeout-ms <N>`
`symex.enabled`	`--symex` / `--no-symex`
`symex.cross_file`	`--cross-file-symex` / `--no-cross-file-symex`
`symex.interprocedural`	`--symex-interproc` / `--no-symex-interproc`
`symex.smt`	`--smt` / `--no-smt`

Engine-depth profile shortcut: instead of flipping individual toggles, pass --engine-profile {fast,balanced,deep} to set the whole stack at once. Individual flags override the profile, so --engine-profile fast --backwards-analysis runs the fast stack with backwards analysis on. See docs/cli.md for the exact toggle matrix.

Explain effective engine: pass --explain-engine to print the resolved engine configuration (profile + config + CLI overrides) and exit without scanning.

`[analysis.languages.<slug>]`

Per-language custom rules. <slug> is one of: rust, javascript, typescript, python, go, java, c, cpp, php, ruby.

Field	Type	Description
`rules`	array of rule objects	Custom label rules
`terminators`	[string]	Functions that terminate execution
`event_handlers`	[string]	Event handler function names

Rule object:

[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"        # "source" | "sanitizer" | "sink"
cap = "html_escape"       # "env_var" | "html_escape" | "shell_escape" |
                          # "url_encode" | "json_parse" | "file_io" |
                          # "fmt_string" | "sql_query" | "deserialize" |
                          # "ssrf" | "code_exec" | "crypto" | "all"

Example Configurations

Minimal override (`nyx.local`)

[scanner]
min_severity = "Medium"

[output]
default_format = "json"
max_results = 100

CI-optimized

[scanner]
mode = "full"
min_severity = "Medium"
excluded_directories = ["node_modules", ".git", "target", "vendor", "dist"]

[output]
quiet = true
default_format = "sarif"

[performance]
worker_threads = 4

Using a scan profile

# Use a built-in profile
nyx scan --profile ci

# CLI flags still override profile values
nyx scan --profile ci --format json

Custom profile

[profiles.security_audit]
mode = "full"
min_severity = "Low"
enable_state_analysis = true
show_all = true

Custom rules for a Node.js project

[analysis.languages.javascript]
terminators = ["process.exit", "abort"]
event_handlers = ["addEventListener"]

[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetInnerHTML"]
kind = "sink"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["getRequestBody", "readUserInput"]
kind = "source"
cap = "all"

Adding rules via CLI

# Add a sanitizer
nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape

# Add a terminator
nyx config add-terminator --lang javascript --name process.exit

# Verify
nyx config show

Config Validation

Config is validated after loading and merging. Validation checks include:

Server port must be 1–65535
Server host must not be empty
max_saved_runs must be > 0 when persist_runs is true
max_runs must be > 0 when persist is true
batch_size and channel_multiplier must be > 0
rollup_examples must be > 0
Profile names must be alphanumeric with underscores only

Invalid config produces structured error messages identifying the section, field, and issue.
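
The checks listed above can be read as a function from a merged config to a list of (section, field, issue) records. The following is a minimal sketch under stated assumptions: `validate` is a hypothetical name, the config is a nested dictionary, defaults are taken from the tables in this document, and `rollup_examples` is assumed to live under `[output]`.

```python
import re


def validate(cfg):
    """Return (section, field, issue) tuples for each violated check."""
    errors = []
    server = cfg.get("server", {})
    runs = cfg.get("runs", {})
    perf = cfg.get("performance", {})
    output = cfg.get("output", {})  # assumed home of rollup_examples

    if not 1 <= server.get("port", 9700) <= 65535:
        errors.append(("server", "port", "must be 1-65535"))
    if not server.get("host", "127.0.0.1"):
        errors.append(("server", "host", "must not be empty"))
    if server.get("persist_runs", True) and server.get("max_saved_runs", 50) <= 0:
        errors.append(("server", "max_saved_runs",
                       "must be > 0 when persist_runs is true"))
    if runs.get("persist", False) and runs.get("max_runs", 100) <= 0:
        errors.append(("runs", "max_runs", "must be > 0 when persist is true"))
    for field, default in (("batch_size", 100), ("channel_multiplier", 4)):
        if perf.get(field, default) <= 0:
            errors.append(("performance", field, "must be > 0"))
    if output.get("rollup_examples", 5) <= 0:
        errors.append(("output", "rollup_examples", "must be > 0"))
    for name in cfg.get("profiles", {}):
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            errors.append(("profiles", name,
                           "name must be alphanumeric with underscores only"))
    return errors
```

Collecting every violation instead of stopping at the first one matches the "structured error messages" behaviour described above, though the exact message wording Nyx prints is not reproduced here.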

State Analysis

State analysis detects resource lifecycle violations (use-after-close, double-close, resource leaks) and unauthenticated access patterns. It is enabled by default.

To disable:

[scanner]
enable_state_analysis = false

State analysis requires mode = "full" or mode = "taint". It has no effect in mode = "ast".

Tradeoffs:

Additional per-function state-machine pass adds some scan time
May produce findings that require domain knowledge to evaluate (e.g., whether a resource handle is intentionally left open)
Most useful for C, C++, Rust, Go, and Java where acquire/release patterns are common

Upgrading

Engine-version mismatch is handled automatically

Nyx stores the scanner’s CARGO_PKG_VERSION in the project index database. When the version recorded in the DB differs from the running binary, or the row is missing entirely, every cached summary, SSA body, and file-hash row is wiped on the next open so that the next scan rebuilds the index against the new engine. No flag is needed; CI pipelines keep working across upgrades.

The rebuild is logged at info level:

engine version changed (0.4.0 → 0.5.0), rebuilding index

If you see this once per upgrade it is working as intended. If you see it on every scan, the metadata row is not being persisted; file an issue.
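
The version check can be pictured with an in-memory SQLite database. This is a sketch of the idea only: the `meta` table and its `engine_version` key are hypothetical names, and the cache table names are borrowed from the reindex section of this page on the assumption that the same tables are wiped.

```python
import sqlite3

CACHE_TABLES = ("files", "function_summaries",
                "ssa_function_summaries", "ssa_function_bodies")


def open_index(conn, running_version):
    """Wipe cached rows if the stored engine version differs or is missing.

    Returns True when a rebuild was triggered.
    """
    # `meta` and its key/value layout are hypothetical names for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'engine_version'").fetchone()
    if row is not None and row[0] == running_version:
        return False  # same engine: cached summaries stay valid
    for table in CACHE_TABLES:
        conn.execute(f"DELETE FROM {table}")
    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('engine_version', ?)",
        (running_version,),
    )
    conn.commit()
    return True
```

Because the new version is written back in the same step, a second open with the same binary returns False. That is the "once per upgrade" behaviour; a rebuild on every scan would mean the write is not being persisted.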

Forcing a reindex

Use --index rebuild to throw away the current project’s cached summaries and re-run pass 1 against the current rules. Useful after editing nyx.local rules, after an upgrade that changed label definitions without changing the engine version, or when you want a known-clean baseline:

nyx scan --index rebuild .

This clears the current project’s rows in files, function_summaries, ssa_function_summaries, and ssa_function_bodies; other projects sharing the same DB directory are untouched.

Recovering from a corrupt database

If the .sqlite file itself is damaged (e.g. from a killed scan or full disk) and nyx scan fails to open it, delete the file and let the next scan recreate it:

rm "$(nyx config path)"/<project>.sqlite*

On the next scan Nyx builds a fresh index from scratch.

Reserved Fields

Some config fields are defined but not yet implemented. They are marked (RESERVED) in the default config and accept values without effect. This allows forward-compatible config files; settings will activate when the feature is implemented without requiring config changes.

Output Formats

Nyx supports three output formats, selected with --format or output.default_format in config.

Console (default)

Human-readable, color-coded output to stdout. Status messages go to stderr.

[HIGH]   taint-unsanitised-flow (source 5:11)  src/handler.rs:12:5 (Score: 76, Confidence: High)
         Source: env::var("CMD") → Command::new("sh").arg("-c")

[MEDIUM] cfg-unguarded-sink                    src/handler.rs:12:5 (Score: 35, Confidence: Medium)

[LOW]    rs.quality.unwrap                     src/lib.rs:88:5 (Score: 10, Confidence: High)

Severity indicators

Tag	Color	Meaning
`[HIGH]`	Red, bold	Critical – likely exploitable
`[MEDIUM]`	Orange, bold	Important – may be exploitable
`[LOW]`	Muted blue-gray	Informational – code quality or weak signal

Evidence fields

Taint and state findings include structured evidence:

Label	Meaning
Source	Where tainted data originated (function name + location)
Sink	Where the dangerous operation happens
Path guard	Type of validation predicate protecting the path

Score

When attack-surface ranking is enabled (default), each finding shows a Score value. Higher scores indicate greater exploitability. See Detector Overview for the scoring formula.

Rollup findings

High-frequency LOW Quality findings (e.g. rs.quality.unwrap) are grouped into rollup findings by (file, rule):

  21:10  ● [LOW]   rs.quality.unwrap
      rs.quality.unwrap (38 occurrences)
      Examples: 21:10, 50:10, 79:10, 105:10, 134:10
      Run: nyx scan --show-instances rs.quality.unwrap

Rollups count as one finding for LOW budget enforcement. Use --show-instances <RULE> to expand a specific rule or --all to disable rollups entirely.

When findings are suppressed by the prioritization pipeline, a footer is shown:

Suppressed 195 LOW/Quality findings.
Active filters:
  include_quality = false
  max_low = 20
  max_low_per_file = 1
  max_low_per_rule = 10

Use --include-quality, --max-low, or --all to adjust.
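
How the three LOW budgets interact is easiest to see in code. The sketch below assumes findings are visited in their existing order and that a finding is shown only if all three budgets still have room; `apply_low_budgets` is a hypothetical helper, and the order in which Nyx actually selects findings is not specified here.

```python
from collections import Counter


def apply_low_budgets(low_findings, max_low=20,
                      max_low_per_file=1, max_low_per_rule=10):
    """Split LOW findings into (shown, suppressed_count) under three budgets.

    Each finding is a dict with "path" and "id"; a rollup is a single entry.
    """
    per_file, per_rule = Counter(), Counter()
    shown = []
    for f in low_findings:
        over_budget = (
            len(shown) >= max_low
            or per_file[f["path"]] >= max_low_per_file
            or per_rule[f["id"]] >= max_low_per_rule
        )
        if over_budget:
            continue
        per_file[f["path"]] += 1
        per_rule[f["id"]] += 1
        shown.append(f)
    return shown, len(low_findings) - len(shown)
```

With the defaults, thirty `rs.quality.unwrap` entries spread over fifteen files yield ten shown findings: the per-file budget would allow fifteen, but the per-rule budget of 10 is reached first.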

JSON

Machine-readable JSON array. Each finding is an object:

[
  {
    "path": "src/handler.rs",
    "line": 12,
    "col": 5,
    "severity": "High",
    "id": "taint-unsanitised-flow (source 5:11)",
    "category": "Security",
    "path_validated": false,
    "labels": [
      ["Source", "env::var(\"CMD\") at 5:11"],
      ["Sink", "Command::new(\"sh\").arg(\"-c\")"]
    ],
    "confidence": "High",
    "evidence": {
      "source": {
        "path": "src/handler.rs",
        "line": 5,
        "col": 11,
        "kind": "source",
        "snippet": "env::var(\"CMD\")"
      },
      "sink": {
        "path": "src/handler.rs",
        "line": 12,
        "col": 5,
        "kind": "sink",
        "snippet": "Command::new(\"sh\")"
      },
      "notes": ["source_kind:EnvironmentConfig"]
    },
    "rank_score": 76.0,
    "rank_reason": [
      ["severity_base", "60"],
      ["analysis_kind", "10"],
      ["source_kind", "5"],
      ["evidence_count", "1"]
    ]
  }
]

Field descriptions

Field	Type	Always present	Description
`path`	string	yes	File path relative to scan root
`line`	int	yes	1-indexed line number
`col`	int	yes	1-indexed column number
`severity`	string	yes	`"High"`, `"Medium"`, or `"Low"`
`id`	string	yes	Rule ID
`category`	string	yes	Finding category: `"Security"`, `"Reliability"`, or `"Quality"`
`path_validated`	bool	no	True if guarded by validation predicate
`guard_kind`	string	no	Predicate type (e.g. `"NullCheck"`, `"ValidationCall"`)
`message`	string	no	Human-readable context (state analysis findings)
`labels`	array	no	Array of `[label, value]` pairs for console display
`confidence`	string	no	Confidence level: `"Low"`, `"Medium"`, or `"High"`
`evidence`	object	no	Structured evidence (source/sink spans, state, notes)
`rank_score`	float	no	Attack-surface score (omitted when ranking disabled)
`rank_reason`	array	no	Score breakdown (omitted when ranking disabled)
`rollup`	object	no	Rollup data when findings are grouped (see below)

Fields marked “no” are omitted when empty/null/false to keep output compact.
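
Because optional fields may be absent, a consumer should read them defensively. The following sketch summarizes a report produced with `--format json`; `summarize` is a hypothetical helper, and it relies only on the field names documented in the table above.

```python
import json


def summarize(report_text):
    """Count findings per severity and list High findings."""
    findings = json.loads(report_text)
    counts = {"High": 0, "Medium": 0, "Low": 0}
    high = []
    for f in findings:
        counts[f["severity"]] += 1  # always-present field
        if f["severity"] == "High":
            high.append({
                "where": f'{f["path"]}:{f["line"]}:{f["col"]}',
                "id": f["id"],
                # Optional fields are omitted when empty/null/false.
                "confidence": f.get("confidence"),
                "path_validated": f.get("path_validated", False),
                "rank_score": f.get("rank_score"),
            })
    return counts, high
```

Treating a missing `path_validated` as `False` follows from the omission rule: the field is only serialized when it is true.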

Confidence levels

Level	Meaning
`High`	Strong signal – taint-confirmed flow, definite state violation
`Medium`	Moderate signal – resource leak, path-validated taint, CFG structural
`Low`	Weak signal – AST pattern match, possible resource leak, degraded analysis

Evidence object

The evidence field provides structured provenance data:

Field	Type	Description
`source`	object	Source span (path, line, col, kind, snippet)
`sink`	object	Sink span (path, line, col, kind, snippet)
`guards`	array	Validation guard spans
`sanitizers`	array	Sanitizer spans
`state`	object	State-machine evidence (machine, subject, from_state, to_state)
`notes`	array	Free-form notes (e.g. `"source_kind:UserInput"`, `"path_validated"`)

All fields are omitted when empty/null.

Rollup object

When a finding is a rollup (grouped from multiple occurrences), the rollup field is present:

{
  "rollup": {
    "count": 38,
    "occurrences": [
      { "line": 21, "col": 10 },
      { "line": 50, "col": 10 },
      { "line": 79, "col": 10 }
    ]
  }
}

Field	Type	Description
`count`	int	Total number of occurrences
`occurrences`	array	First N example locations (controlled by `rollup_examples`)
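
Since a rollup stands in for many occurrences, counting array entries undercounts the real total. A small sketch of the adjustment, with `total_occurrences` as a hypothetical helper operating on the parsed JSON array:

```python
def total_occurrences(findings):
    """Count a rollup as rollup.count and any other finding as 1."""
    return sum(f.get("rollup", {}).get("count", 1) for f in findings)
```

Note that `occurrences` holds only the first few example locations, so `count` is the figure to use for totals, not `len(occurrences)`.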

SARIF (Static Analysis Results Interchange Format)

SARIF 2.1.0 JSON, suitable for GitHub Code Scanning and other SARIF-compatible tools.

nyx scan . --format sarif > results.sarif

The SARIF output includes:

Tool metadata – Nyx name and version
Rules – Rule ID, description, severity mapping
Results – One result per finding with location, message, and properties
Properties – Each result includes category and optionally confidence and rollup.count
Related locations – Rollup findings include example locations in relatedLocations
Artifacts – File paths referenced by findings

GitHub Code Scanning integration

- name: Run Nyx
  run: nyx scan . --format sarif > results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif

Exit Codes

Code	Meaning
`0`	Scan completed successfully; no findings matched `--fail-on` threshold
`1`	`--fail-on` threshold breached (at least one finding meets or exceeds the specified severity)
Non-zero	Error (I/O, config, database, parse error)

Without --fail-on, Nyx always exits 0 on a successful scan regardless of findings count.
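
The threshold rule can be sketched as a small function over parsed findings. This is an illustration, not Nyx's code: `exit_code` is a hypothetical name, it covers only scans that completed without an error, and it skips findings carrying the `suppressed` flag described under Inline Suppressions.

```python
SEVERITY_ORDER = {"Low": 0, "Medium": 1, "High": 2}


def exit_code(findings, fail_on=None):
    """Return 0 or 1 for a scan that completed without errors."""
    if fail_on is None:
        return 0  # without --fail-on, findings never change the exit code
    threshold = SEVERITY_ORDER[fail_on.capitalize()]  # accepts "HIGH" or "High"
    breached = any(
        SEVERITY_ORDER[f["severity"]] >= threshold
        for f in findings
        if not f.get("suppressed", False)
    )
    return 1 if breached else 0
```

"Meets or exceeds" is the key comparison: a threshold of Medium is breached by Medium and High findings alike.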

Severity Levels

Level	Description	Typical rules
High	Critical vulnerabilities – likely exploitable	Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources
Medium	Important issues – may be exploitable with additional context	SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks
Low	Informational – code quality or weak signals	Weak crypto algorithms, insecure randomness, `unwrap()`/`panic!()`, type-safety escapes

Non-production severity downgrade

By default, findings in paths matching common non-production patterns (tests/, test/, vendor/, build/, examples/, benchmarks/) are downgraded by one tier:

High → Medium
Medium → Low
Low → Low (unchanged)

Use --keep-nonprod-severity to disable this behavior.
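
The one-tier downgrade can be expressed as a lookup keyed on the path. In this sketch a path counts as non-production when any of its directory components equals one of the listed names; that matching rule and the name `effective_severity` are assumptions, since the exact pattern syntax Nyx uses is not given here.

```python
from pathlib import PurePosixPath

NONPROD_DIRS = {"tests", "test", "vendor", "build", "examples", "benchmarks"}
DOWNGRADE = {"High": "Medium", "Medium": "Low", "Low": "Low"}


def effective_severity(path, severity, keep_nonprod_severity=False):
    """Downgrade by one tier when a directory component is a non-prod name."""
    if keep_nonprod_severity:
        return severity
    directories = PurePosixPath(path).parts[:-1]  # drop the file name
    if NONPROD_DIRS.intersection(directories):
        return DOWNGRADE[severity]
    return severity
```

Under this reading `tests/handler_test.py` is downgraded while `src/test.py` is not, because only directory names are compared.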

Inline Suppressions

Suppress specific findings directly in source code using nyx:ignore comments. Suppressed findings are excluded from output, severity counts, and --fail-on checks by default.

Comment syntax

Language	Comment styles
Rust, C, C++, Java, Go, JS, TS	`// nyx:ignore ...` or `/* nyx:ignore ... */`
Python, Ruby	`# nyx:ignore ...`
PHP	`// nyx:ignore ...`, `# nyx:ignore ...`, or `/* nyx:ignore ... */`

Directive forms

x = dangerous()  # nyx:ignore taint-unsanitised-flow     ← suppresses this line
# nyx:ignore-next-line taint-unsanitised-flow
x = dangerous()                                           ← suppresses this line

nyx:ignore <RULE_ID> – suppresses findings on the same line as the comment.
nyx:ignore-next-line <RULE_ID> – suppresses findings on the next line.
For taint findings, the primary line is the sink line (the line field in output).

Rule ID matching

Case-sensitive, exact match after canonicalization.
Comma-separated: nyx:ignore rule-a, rule-b
Wildcard suffix: nyx:ignore rs.quality.* matches any ID starting with rs.quality.
Taint IDs are canonicalized: nyx:ignore taint-unsanitised-flow matches taint-unsanitised-flow (source 5:1) (parenthetical suffix stripped).
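
The four matching rules above fit in a few lines. The sketch below is a reading of those rules, not the scanner's implementation; `canonical_id` and `is_suppressed` are hypothetical names, and `directive_args` stands for the text that follows `nyx:ignore` in the comment.

```python
import re


def canonical_id(rule_id):
    """Strip a trailing parenthetical such as ' (source 5:11)'."""
    return re.sub(r"\s*\(.*\)\s*$", "", rule_id)


def is_suppressed(finding_id, directive_args):
    """Check a finding ID against e.g. 'rule-a, rs.quality.*'."""
    target = canonical_id(finding_id)
    for pattern in (p.strip() for p in directive_args.split(",")):
        if not pattern:
            continue
        if pattern.endswith("*"):
            # Wildcard suffix: prefix match on everything before the '*'.
            if target.startswith(pattern[:-1]):
                return True
        elif pattern == target:  # case-sensitive exact match
            return True
    return False
```

Canonicalizing the finding ID first is what lets a single `nyx:ignore taint-unsanitised-flow` cover taint findings regardless of where their source is located.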

Console behavior

Default: suppressed findings are hidden entirely.
--show-suppressed: suppressed findings appear dimmed with [SUPPRESSED] tag. Summary shows "N issues (M suppressed)".

JSON / SARIF behavior

Default: suppressed findings are excluded from JSON/SARIF output.
--show-suppressed: suppressed findings are included with additional fields:

{
  "suppressed": true,
  "suppression": {
    "kind": "SameLine",
    "matched_pattern": "taint-unsanitised-flow",
    "directive_line": 42
  }
}

Exit code

Suppressed findings do not trigger --fail-on. A scan with only suppressed findings exits 0.

Rule ID Format

Prefix	Detector	Example
`taint-*`	Taint analysis	`taint-unsanitised-flow (source 5:11)`
`cfg-*`	CFG structural	`cfg-unguarded-sink`, `cfg-auth-gap`
`state-*`	State model	`state-use-after-close`, `state-resource-leak`
`<lang>.<category>.<name>`	AST patterns	`rs.memory.transmute`, `js.code_exec.eval`

See the Rule Reference for a complete listing.

Language Maturity Matrix

Nyx supports ten languages, but support depth is not uniform. This page gives an honest per-language picture so you can calibrate expectations before depending on Nyx for a given stack.

The classifications here are grounded in three concrete signals:

Rule depth: how many distinct source / sanitizer / sink matchers exist for the language in src/labels/<lang>.rs, and how many vulnerability classes (Cap bits) those matchers cover.
Benchmark results: rule-level precision / recall / F1 on the 433-case corpus in tests/benchmark/RESULTS.md, last measured 2026-04-29 with scanner version 0.5.0.
Known weak spots: FPs and FNs the maintainers have deliberately left in the benchmark rather than suppressed, plus structural engine limitations the corpus does not stress, documented release-by-release in RESULTS.md.

As of 2026-04-29 the synthetic corpus has effectively saturated: every real-CVE fixture fires and rule-level recall is 100%. Nine of ten languages report rule-level F1 = 100.0%; Go reports 98.0% on the back of a single safe-fixture FP. Aggregate rule-level P=0.995, R=1.000, F1=0.998. That means F1 alone no longer differentiates tiers, so the differentiators are rule depth, gated-sink coverage, and structural idioms the corpus does not fully stress (deep pointer aliasing in C/C++, framework-specific context). All parser integrations use tree-sitter and are stable; parsing is not a differentiator.

Tier Summary

Tier	Languages	F1	What to expect
Stable	Python, JavaScript, TypeScript	100%	Deep rule sets, gated sinks (argument-role-aware), framework detection, extensive fixtures, and the bulk of advanced-analysis (SSA two-level solve, context-sensitivity, symbolic execution, abstract interpretation) coverage. Safe to depend on in CI gates.
Beta	Go, Java, PHP, Ruby, Rust	98.0% to 100%	Solid mid-depth rule sets with narrower cap coverage and no gated sinks. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates.
Preview	C, C++	100% on synthetic corpus	Recent work taught the engine to follow taint through `std::vector` / `std::string` / map containers (including `c_str()`), through fluent builder chains like `Socket::builder().host(h).connect()`, and through inline class member functions. Function pointers and deeper pointer aliasing through `*p` / `p->field` are still not tracked. Rule-level scores against a corpus of obvious unsafe-API uses look perfect, but that is not the same as a clean audit on a real codebase. Pair with clang-tidy, Clang Static Analyzer, or Infer.

Per-Language Detail

Stable tier

Python: 100% P / 100% R / 100% F1 (46-case corpus)

Rule depth: 5 source families, 7 sanitizer families, 21 sink matchers spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
Framework context: Flask, Django, argparse source matchers; flask_request import-alias support.
Advanced analysis: gated sinks (Popen, subprocess.run/call with activation-arg awareness), most SSA-equivalence and symbolic-execution fixtures target Python.
Fixtures: 125 under tests/fixtures/ plus 42 benchmark cases.
Blind spots: f-string interpolation is not explicitly modeled as a distinct taint-producing construct; string-formatting flows are caught by the general concatenation path.

JavaScript: 100% P / 100% R / 100% F1 (42-case corpus)

Rule depth: 3 source families, 10 sanitizer families, 24 sink matchers spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O.
Advanced analysis: gated sinks (setAttribute, parseFromString), two-level SSA solve for top-level + per-function scopes (analyse_ssa_js_two_level), prefix-locked SSRF suppression via StringFact, abstract-interpretation interval tracking.
Framework context: Express, Koa, Fastify (via in-file import scan when package.json is absent).
Fixtures: 238 under tests/fixtures/; the largest fixture set of any language.
Blind spots: template literals are lowered through concatenation rather than modeled as a first-class taint operator; dynamic property access (obj[user]) is conservatively treated.

TypeScript: 100% P / 100% R / 100% F1 (47-case corpus)

Rule depth: Shares the JS ruleset (3 sources, 10 sanitizers, 24 sinks) plus TS-specific grammar handling.
Advanced analysis: TSX and JSX grammars wired; discriminated-union narrowing, generic erasure, decorator flow, and interface dispatch are all validated against adversarial type-system stressors.
Framework context: Fastify detection via detect_in_file_frameworks (import-driven, no package.json required).
Fixtures: 39 test fixtures plus 42 benchmark cases.
Blind spots: as any casts and any-typed flows are handled conservatively (treated as tainted).

Beta tier

Go: 96.2% P / 100.0% R / 98.0% F1 (53-case corpus, 1 FP, 0 FNs)

Rule depth: 4 source families, 4 sanitizer families, 9 sink matchers covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O.
Framework context: Gin, Echo source matchers.
Open weak spots: one safe Go fixture (go-safe-009) draws a spurious CMDi finding.
Known gaps: no gated sinks, no deserialization class. fmt.Sprintf is deliberately not a sink. Cap coverage is narrower than the Stable tier and argument-role-aware sink modeling is not yet implemented for Go, so production CI gates may surface additional FPs the corpus does not exercise.
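
To see how a single false positive produces the Go figures, the standard precision/recall/F1 formulas can be applied directly. The true-positive count used below (25) is an assumption: it is not stated in this page and is simply one value that reproduces the published percentages with 1 FP and 0 FNs.

```python
def prf(tp, fp, fn):
    """Standard precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# tp=25 is an assumed count chosen to match the published ratios.
p, r, f1 = prf(tp=25, fp=1, fn=0)
print(f"P={p:.1%} R={r:.1%} F1={f1:.1%}")  # prints P=96.2% R=100.0% F1=98.0%
```

With recall fixed at 100%, F1 depends only on precision, which is why one spurious finding is enough to separate Go from the nine languages at 100.0%.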

Java: 100% P / 100% R / 100% F1 (35-case corpus)

Rule depth: 3 source families, 8 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization.
Framework context: Spring, JPA, Hibernate ORM rules; JNDI injection sinks.
Known gaps: no gated sinks. Variable-receiver method calls (client.send(...) vs HttpClient.send(...)) rely on type-qualified resolution from receiver-type inference; where the receiver type cannot be inferred, as on unusual builder chains, the flow is conservatively over-tainted.

PHP: 100% P / 100% R / 100% F1 (37-case corpus)

Rule depth: 3 source families ($_GET, $_POST, $_REQUEST superglobals), 7 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
Known gaps: no gated sinks. Limited framework context (Laravel raw methods only). echo language-construct detection is wired but its inner-argument propagation is narrower than function-call sinks.

Ruby: 100% P / 100% R / 100% F1 (39-case corpus)

Rule depth: 3 source families, 7 sanitizer families, 15 sink matchers covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
Framework context: Rails helpers (sanitize_sql, permit, require).
Known gaps: string interpolation inside shell and SQL strings is recognized structurally but not modeled as a distinct operator. begin/rescue/ensure exception-edge wiring is documented as deferred (structurally incompatible with build_try()). The previously open rb-interproc-001 FN closed in the 2026-04-28 baseline after the Ruby Kernel#open CMDI sink and exact-match sigil work landed.

Rust: 100% P / 100% R / 100% F1 (70-case adversarial corpus)

Rust holds the largest per-language adversarial corpus and was promoted from Experimental to Beta in the 2026-04-25 measurement after the PathFact landings closed every previously-open rs-safe-* regression.

Rule depth: 6 source families, 2 sanitizer families (prefix and type-coercion), 11 sink matchers covering HTML, Shell, SQL, SSRF, Deserialization, and File I/O. Extensive framework source coverage (Axum, Actix, Rocket), the most of any language on the source side. The narrow sanitizer count is the primary reason Rust is not in the Stable tier. Engine-side path/typed sanitizer recognition (PathFact) compensates, but the ruleset itself is shallow.
Recent additions: SQL class (rusqlite, sqlx, diesel, postgres), Deserialization class (serde_yaml, bincode, rmp_serde, ciborium, ron, toml), expanded file I/O (fs::remove_file/dir/rename/copy), reqwest SSRF builder chain.
Closed by recent PathFact landings (src/abstract_interp/path_domain.rs + per-return-path PathFact entries on SsaFuncSummary): rs-safe-007 (.replace("..","") sanitiser), rs-safe-008 (negative-validation return), rs-safe-009 (match-arm guards via condition lifting), rs-safe-010 (static-map lookup), rs-safe-012 (.contains("..") + .starts_with('/') rejection), rs-safe-014 (Option-returning user sanitiser), rs-safe-015 (Path::new(p).is_absolute() typed rejection), rs-safe-016 (cross-function .contains("..") rejection), and CVE patches CVE-2018-20997, CVE-2022-36113, CVE-2024-24576.
Not yet covered: unsafe FFI / std::mem::transmute (no rules), Tokio process::Command async variants (not distinguished from sync), hyper / surf / ureq SSRF clients (reqwest family only).

Preview tier

C and C++ remain Preview despite reporting 100% rule-level F1 on the synthetic corpus. A run of additions in late April taught the engine to follow taint through several constructs that used to be hard cutoffs (STL containers, builder chains, inline member functions, the wider std::sto* family), so the gap between “passes the synthetic corpus” and “would catch the same flow on a real codebase” is narrower than it used to be. It is not zero. The biggest remaining gaps are deep pointer aliasing and function pointers, both of which are pervasive in real C/C++ code. Treat a clean report as a starting point, not an audit. Pair Nyx with clang-tidy, the Clang Static Analyzer, or Infer for production use.

What now works (added in late April):

STL container flow. vec.push_back(tainted) followed by vec.front().c_str() carries taint into a downstream system() sink. std::map::insert_or_assign, find, count, at, and data all participate in the container store/load model.
Inline class member functions. class C { void run(...) { ... } }; bodies are now extracted as their own functions, so an intra-file call like inner.run(input) resolves to the body summary. Same fix covers struct_specifier, union_specifier, enum_specifier, template_declaration, and extern "C" blocks.
Lambda passthrough. auto echo = [](const char* s) { return s; }; carries argument taint into the result via the engine’s default call-argument propagation.
Builder chains. Socket::builder().host(user).port(8080).connect() resolves the chained returns and fires on .connect() when user is tainted; the safe variant with a hardcoded host stays quiet.
Wider numeric sanitizer family. The full std::sto* set (including stoll, stoull, stold) and the C-stdlib forms (atoi, atof, strtol, etc.) clear all caps when they’re called.
More header / source extensions. .cc, .cxx, .hpp, .hxx, .hh, and .h++ are recognized as C++ on top of .cpp and .c++. .h is intentionally still routed to C since it’s ambiguous without a build system.

Still not modeled (common to both C and C++):

Deep pointer aliasing. Taint written through *p, p->field, or pointer arithmetic is not tracked across aliases. Field-sensitive points-to (see Advanced analysis) handles the “lock on a sub-field” case but is not a general escape analysis.
Function pointers and callback dispatch. An indirect call through void (*fn)(char *) resolves to no callee, so cross-pointer flows are invisible.
Array-element taint by index. Writes to buf[i] do not always propagate taint to buf as a whole; the recent subscript-handling work helps the general case but doesn’t make buf an alias for every element.
Nested classes beyond one level (C++ only).

C: 100% P / 100% R / 100% F1 (30-case corpus)

Rule depth: 3 source families, 2 sanitizer families (the sanitize_* prefix and numeric-parse functions), 5 sink matchers spanning Shell, File, SSRF, and Format-String.
Known gaps: no framework rules, no gated sinks. The structural limitations listed above are the dominant concern; rule additions alone will not lift this language out of the Preview tier.

C++: 100% P / 100% R / 100% F1 (33-case corpus, plus 6 new fixtures for STL / builder / inline-method flows)

Rule depth: Builds on the C ruleset with std::cin / std::getline sources and a wider numeric-sanitizer set covering the full std::sto* family (3 sources, 3 sanitizer families, 5 sinks).
Known gaps: still no framework rules and no gated sinks. The structural blind spots are now narrower than they were a release ago (see “What now works” above), but function pointers and the harder pointer-aliasing patterns still produce false negatives.

How the tiers were assigned

Because rule-level F1 has saturated for nine of ten languages, the tier boundaries are drawn primarily on rule depth and engine coverage of real-world idioms rather than on benchmark scores alone.

A language lands in Stable when all three hold:

Rule set covers ≥ 8 vulnerability classes with both source and sink matchers, and at least one class has argument-role-aware gated-sink modeling (e.g. setAttribute("href", url) only flags href-like attrs).
Benchmark F1 ≥ 95% on a corpus of ≥ 25 cases.
Advanced analysis (SSA lowering, context-sensitivity, symbolic execution, abstract interpretation) is exercised by fixtures for the language.

A language lands in Beta when benchmark F1 is in the mid-90s or higher on a meaningful corpus but at least one Stable criterion fails. Typical gaps: absence of gated sinks, or sanitizer rule depth narrow enough that the engine compensates structurally rather than via the ruleset.

A language lands in Preview when the engine has documented structural blind spots for constructs that are pervasive in typical codebases for that language. For C and C++ that means deep pointer aliasing, function pointers, and array-element taint; STL container flow and builder chains have moved out of the blind-spot list. Synthetic-corpus F1 is not a reliable signal for Preview-tier languages: a clean report can coexist with structural gaps.

(The previous Experimental tier was retired in the 2026-04-25 measurement when Rust’s adversarial corpus reached 100% F1; no language currently sits in that tier.)
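
The tier rules above can be condensed into a decision function. This is a reading of the prose, not a tool that ships with Nyx: `assign_tier` and its parameters are hypothetical, "mid-90s or higher" is interpreted as F1 ≥ 95, and structural blind spots are checked first because C and C++ sit in Preview despite a 100% F1.

```python
def assign_tier(vuln_classes, has_gated_sink, f1, corpus_size,
                advanced_fixtures, pervasive_blind_spots):
    """Return "Stable", "Beta", "Preview", or None if no rule applies."""
    # Structural blind spots take precedence over benchmark scores.
    if pervasive_blind_spots:
        return "Preview"
    stable = (
        vuln_classes >= 8 and has_gated_sink        # rule depth + gated sinks
        and f1 >= 95.0 and corpus_size >= 25        # benchmark criterion
        and advanced_fixtures                       # advanced-analysis coverage
    )
    if stable:
        return "Stable"
    # "Mid-90s or higher" is read as F1 >= 95; the exact cutoff is an assumption.
    if f1 >= 95.0:
        return "Beta"
    return None  # the text does not define a tier for this case
```

Fed the Go figures from this page (7 classes, no gated sinks, 98.0% F1, 53 cases), the function returns "Beta"; the missing gated-sink modeling alone is enough to fail the Stable test.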

What this means for you

CI gates: safe to set strict --fail-on HIGH gates on Stable-tier languages. On Beta-tier, expect occasional FP triage on production code (the synthetic corpus does not cover every framework idiom); the weak-spot lists above tell you what to skim for. On Preview-tier, treat Nyx findings as a starting point for manual review rather than authoritative. STL container flow and builder chains are tracked now, but deep pointer aliasing and function pointers are not, so a clean report does not tell you what the engine could not see.
Rule contributions: the shortest path to raising a language’s tier is contributing sink matchers and gated-sink registrations. Label files live at src/labels/<lang>.rs; benchmark cases live at tests/benchmark/corpus/<lang>/.
Scope planning: if your primary stack is C or C++, Nyx will surface real findings on obvious unsafe-API uses, but budget for review time and combine Nyx with clang-tidy or the Clang Static Analyzer. Rust is now Beta-tier and suitable as a CI gate; pair with cargo-audit for dependency CVEs.

The benchmark thresholds in tests/benchmark_test.rs are deliberately set ~5 pp below current baselines, so a drop of more than about 5 pp in a language’s F1 fails CI. Tier promotions require sustained benchmark performance, not just rule additions.

Rule reference

Every finding Nyx emits has a rule ID. This page enumerates the IDs that ship with scanner 0.5.0, grouped by family.

Apart from the generated AST pattern tables, this page is written by hand and can drift from the code. Authoritative sources: src/patterns/<lang>.rs for AST patterns, src/labels/<lang>.rs for taint matchers, and src/auth_analysis/config.rs for auth rules. If a rule fires that isn’t listed here, the source file is right and this page is wrong.

If you’d rather browse rules interactively, nyx serve ships a Rules page that lists every loaded matcher with its language, kind, and capability:

Nyx Rules page: filterable list of 218 rules with language, kind (SOURCE/SANITIZER/SINK), capability, and finding count columns

ID format

Prefix	Detector	Example
`taint-*`	Taint analysis	`taint-unsanitised-flow (source 5:11)`
`cfg-*`	CFG structural	`cfg-unguarded-sink`, `cfg-auth-gap`
`state-*`	State model	`state-use-after-close`, `state-resource-leak`
`<lang>.auth.*`	Auth analysis	`rs.auth.missing_ownership_check`
`<lang>.<category>.<name>`	AST patterns	`rs.memory.transmute`, `js.code_exec.eval`

Language prefixes: rs, c, cpp, go, java, js, ts, py, php, rb.
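If you post-process findings, the prefix table maps onto a small classifier. This is a sketch for tooling built on top of Nyx output, not part of Nyx; the enum and function names are made up for illustration.

```rust
/// Detector families from the ID format table.
#[derive(Debug, PartialEq, Eq)]
pub enum Detector {
    Taint,
    CfgStructural,
    StateModel,
    Auth,
    AstPattern,
    Unknown,
}

/// Language prefixes listed above.
const LANG_PREFIXES: [&str; 10] = ["rs", "c", "cpp", "go", "java", "js", "ts", "py", "php", "rb"];

pub fn classify_rule_id(id: &str) -> Detector {
    // Drop a trailing parenthetical such as " (source 5:11)".
    let id = id.split(" (").next().unwrap_or(id).trim();
    if id.starts_with("taint-") {
        return Detector::Taint;
    }
    if id.starts_with("cfg-") {
        return Detector::CfgStructural;
    }
    if id.starts_with("state-") {
        return Detector::StateModel;
    }
    // Remaining IDs are dotted: <lang>.<category>.<name>[...].
    let mut parts = id.split('.');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(lang), Some(category), Some(_name)) if LANG_PREFIXES.contains(&lang) => {
            // <lang>.auth.* belongs to auth analysis, not the AST registry.
            if category == "auth" {
                Detector::Auth
            } else {
                Detector::AstPattern
            }
        }
        _ => Detector::Unknown,
    }
}
```

The auth check has to run before the generic dotted case, because `rs.auth.missing_ownership_check` also fits the `<lang>.<category>.<name>` shape.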

Cross-language rules

Taint

One rule covers every source-to-sink flow. The parenthetical identifies the source location.

Rule ID	Severity
`taint-unsanitised-flow (source L:C)`	Varies by source kind and sink capability

The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in src/labels/<lang>.rs. Language maturity gives per-language counts and what’s covered.

CFG structural

Rule ID	Severity
`cfg-unguarded-sink`	High/Medium
`cfg-auth-gap`	High
`cfg-unreachable-sink`	Medium
`cfg-unreachable-sanitizer`	Low
`cfg-unreachable-source`	Low
`cfg-error-fallthrough`	High/Medium
`cfg-resource-leak`	Medium
`cfg-lock-not-released`	Medium

State model

Rule ID	Severity
`state-use-after-close`	High
`state-double-close`	Medium
`state-resource-leak`	Medium
`state-resource-leak-possible`	Low
`state-unauthed-access`	High

Auth analysis (Rust only, today)

Rule ID	Severity
`rs.auth.missing_ownership_check`	High
`rs.auth.missing_ownership_check.taint`	High (gated by `scanner.enable_auth_as_taint`)

See auth.md for scope, the five sink-classes, and tuning.

AST patterns by language

Each language ships a tree-sitter pattern registry. Structural match on the pattern, no dataflow. Some patterns also have a Tier B heuristic guard (e.g. SQL execute must receive a concatenation, not a literal) noted in the registry.

The tables below are generated from src/patterns/<lang>.rs by tools/docgen. Run cargo run --features docgen --bin nyx-docgen after changing the registry to refresh them.

C: 8 patterns

Rule ID	Severity	Tier	Confidence
`c.cmdi.system`	High	A	High
`c.memory.gets`	High	A	High
`c.memory.printf_no_fmt`	High	B	Medium
`c.memory.scanf_percent_s`	High	A	High
`c.memory.sprintf`	High	A	High
`c.memory.strcat`	High	A	High
`c.memory.strcpy`	High	A	High
`c.cmdi.popen`	Medium	A	High

C++: 9 patterns

Rule ID	Severity	Tier	Confidence
`cpp.cmdi.popen`	High	A	High
`cpp.cmdi.system`	High	A	High
`cpp.memory.gets`	High	A	High
`cpp.memory.printf_no_fmt`	High	B	Medium
`cpp.memory.sprintf`	High	A	High
`cpp.memory.strcat`	High	A	High
`cpp.memory.strcpy`	High	A	High
`cpp.memory.const_cast`	Medium	A	High
`cpp.memory.reinterpret_cast`	Medium	A	High

Go: 8 patterns

Rule ID	Severity	Tier	Confidence
`go.cmdi.exec_command`	High	A	High
`go.transport.insecure_skip_verify`	High	A	High
`go.deser.gob_decode`	Medium	A	High
`go.memory.unsafe_pointer`	Medium	A	High
`go.secrets.hardcoded_key`	Medium	A	High
`go.sqli.query_concat`	Medium	B	Medium
`go.crypto.md5`	Low	A	Medium
`go.crypto.sha1`	Low	A	Medium

Java: 8 patterns

Rule ID	Severity	Tier	Confidence
`java.cmdi.runtime_exec`	High	A	High
`java.deser.readobject`	High	A	High
`java.reflection.class_forname`	Medium	A	High
`java.reflection.method_invoke`	Medium	A	High
`java.sqli.execute_concat`	Medium	B	Medium
`java.xss.getwriter_print`	Medium	A	High
`java.crypto.insecure_random`	Low	A	Medium
`java.crypto.weak_digest`	Low	A	Medium

JavaScript: 22 patterns

Rule ID	Severity	Tier	Confidence
`js.code_exec.eval`	High	A	High
`js.code_exec.new_function`	High	A	High
`js.config.cors_dynamic_origin`	High	A	Medium
`js.code_exec.settimeout_string`	Medium	A	High
`js.config.insecure_session_httponly`	Medium	A	High
`js.config.reject_unauthorized`	Medium	A	High
`js.config.verbose_error_response`	Medium	A	Medium
`js.crypto.weak_hash_import`	Medium	A	Medium
`js.prototype.extend_object`	Medium	A	High
`js.prototype.proto_assignment`	Medium	A	High
`js.secrets.fallback_secret`	Medium	A	Medium
`js.xss.cookie_write`	Medium	A	High
`js.xss.document_write`	Medium	A	High
`js.xss.insert_adjacent_html`	Medium	A	High
`js.xss.location_assign`	Medium	A	High
`js.xss.outer_html`	Medium	A	High
`js.config.insecure_session_samesite`	Low	A	High
`js.config.insecure_session_secure`	Low	A	Medium
`js.crypto.math_random`	Low	A	Medium
`js.crypto.weak_hash`	Low	A	Medium
`js.secrets.hardcoded_secret`	Low	A	Medium
`js.transport.fetch_http`	Low	A	Medium

PHP: 11 patterns

Rule ID	Severity	Tier	Confidence
`php.cmdi.system`	High	A	High
`php.code_exec.assert_string`	High	A	High
`php.code_exec.create_function`	High	A	High
`php.code_exec.eval`	High	A	High
`php.code_exec.preg_replace_e`	High	A	High
`php.deser.unserialize`	High	A	High
`php.path.include_variable`	High	B	Medium
`php.sqli.query_concat`	Medium	B	Medium
`php.crypto.md5`	Low	A	Medium
`php.crypto.rand`	Low	A	Medium
`php.crypto.sha1`	Low	A	Medium

Python: 13 patterns

Rule ID	Severity	Tier	Confidence
`py.cmdi.os_popen`	High	A	High
`py.cmdi.os_system`	High	A	High
`py.cmdi.subprocess_shell`	High	B	Medium
`py.code_exec.eval`	High	A	High
`py.code_exec.exec`	High	A	High
`py.deser.pickle_loads`	High	A	High
`py.deser.yaml_load`	High	A	High
`py.code_exec.compile`	Medium	A	High
`py.deser.shelve_open`	Medium	A	High
`py.sqli.execute_format`	Medium	B	Medium
`py.xss.jinja_from_string`	Medium	A	High
`py.crypto.md5`	Low	A	Medium
`py.crypto.sha1`	Low	A	Medium

Ruby: 11 patterns

Rule ID	Severity	Tier	Confidence
`rb.cmdi.backtick`	High	A	High
`rb.cmdi.system_interp`	High	A	High
`rb.code_exec.class_eval`	High	A	High
`rb.code_exec.eval`	High	A	High
`rb.code_exec.instance_eval`	High	A	High
`rb.deser.marshal_load`	High	A	High
`rb.deser.yaml_load`	High	A	High
`rb.reflection.constantize`	Medium	A	High
`rb.reflection.send_dynamic`	Medium	B	Medium
`rb.ssrf.open_uri`	Medium	A	High
`rb.crypto.md5`	Low	A	Medium

Rust: 13 patterns

Rule ID	Severity	Tier	Confidence
`rs.memory.copy_nonoverlapping`	High	A	High
`rs.memory.get_unchecked`	High	A	High
`rs.memory.mem_zeroed`	High	A	High
`rs.memory.ptr_read`	High	A	High
`rs.memory.transmute`	High	A	High
`rs.quality.unsafe_block`	Medium	A	High
`rs.quality.unsafe_fn`	Medium	A	High
`rs.memory.mem_forget`	Low	A	High
`rs.memory.narrow_cast`	Low	A	Medium
`rs.quality.expect`	Low	A	High
`rs.quality.panic_macro`	Low	A	High
`rs.quality.todo`	Low	A	High
`rs.quality.unwrap`	Low	A	High

TypeScript: 22 patterns

Rule ID	Severity	Tier	Confidence
`ts.code_exec.eval`	High	A	High
`ts.code_exec.new_function`	High	A	High
`ts.config.cors_dynamic_origin`	High	A	Medium
`ts.code_exec.settimeout_string`	Medium	A	High
`ts.config.insecure_session_httponly`	Medium	A	High
`ts.config.reject_unauthorized`	Medium	A	High
`ts.config.verbose_error_response`	Medium	A	Medium
`ts.crypto.weak_hash_import`	Medium	A	Medium
`ts.prototype.proto_assignment`	Medium	A	High
`ts.secrets.fallback_secret`	Medium	A	Medium
`ts.xss.document_write`	Medium	A	High
`ts.xss.insert_adjacent_html`	Medium	A	High
`ts.xss.location_assign`	Medium	A	High
`ts.xss.outer_html`	Medium	A	High
`ts.config.insecure_session_samesite`	Low	A	High
`ts.config.insecure_session_secure`	Low	A	Medium
`ts.crypto.math_random`	Low	A	Medium
`ts.crypto.weak_hash`	Low	A	Medium
`ts.quality.any_annotation`	Low	A	Medium
`ts.quality.as_any`	Low	A	Medium
`ts.secrets.hardcoded_secret`	Low	A	Medium
`ts.xss.cookie_write`	Low	A	Medium

Capability list for custom rules

nyx config add-rule --cap <name> and [analysis.languages.*.rules] in config accept:

env_var, html_escape, shell_escape, url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, ssrf, code_exec, crypto, unauthorized_id, all

Source for both the enum and the to_cap mapping: src/labels/mod.rs (Cap) and src/utils/config.rs (CapName).

Auth analysis

Rust today. Other languages have rule scaffolding in src/auth_analysis/config.rs (Python, Ruby, Go, Java, JavaScript, TypeScript), but only Rust has benchmark corpus coverage and the precision work to back it. Treat findings on other languages as preview; the rule prefix (py.auth.*, js.auth.*, rb.auth.*, go.auth.*, java.auth.*) is reserved but the matchers haven’t been validated against real codebases yet.

What it catches

The Rust rule is rs.auth.missing_ownership_check. It fires when a request handler reaches a privileged operation that takes a scoped identifier (*_id, row reference, scoped resource) without a preceding ownership or membership check.

Concretely, it looks for five patterns of authorization in the function body and flags the call when none are present:

A call to a recognised authorization helper. Defaults: check_ownership, has_ownership, require_ownership, ensure_ownership, is_owner, authorize, verify_access, has_permission, can_access, can_manage, plus *_membership and require_{group,org,workspace,tenant,team}_member variants. Extend in [analysis.languages.rust].
An ownership-equality check on a row reference: if owner_id != user.id { return 403 } or any field_id != self_actor shape. The check writes AuthCheck evidence back to the row-fetch arguments via AnalysisUnit.row_field_vars.
A self-actor reference: let user = require_auth(...).await? followed by use of user.id, user.user_id, user.uid. The actor is recognised from typed extractor params (Extension<Session>, CurrentUser, etc.) and from typed helper bindings.
A SQL query that joins through an ACL table or filters by user_id predicate. Detected without a SQL parser via sql_semantics.rs; the authorized result variable propagates through let row = ...prepare(LIT)..., for row in result, let id = row.get(...).
A helper-summary lift: handler calls validate_target(db, widget_id, user.id) whose body contains a require_*_member call. Cross-function summaries are merged at fixed-point (capped at 4 iterations).
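To make the shape concrete, here is a self-contained sketch of the code pattern the rule is described as targeting, next to the ownership-equality variant from the second bullet. `Db`, `Widget`, `User`, and both functions are hypothetical stand-ins; whether Nyx would treat these plain functions as request handlers is not established here, so read the comments as intent rather than verified scanner output.

```rust
use std::collections::HashMap;

pub struct User {
    pub id: u64,
}

pub struct Widget {
    pub owner_id: u64,
    pub value: String,
}

/// Hypothetical stand-in for a database handle.
#[derive(Default)]
pub struct Db {
    widgets: HashMap<u64, Widget>,
}

impl Db {
    pub fn insert(&mut self, widget_id: u64, w: Widget) {
        self.widgets.insert(widget_id, w);
    }

    pub fn get(&self, widget_id: u64) -> Option<&Widget> {
        self.widgets.get(&widget_id)
    }

    pub fn update(&mut self, widget_id: u64, value: &str) -> Result<(), &'static str> {
        let w = self.widgets.get_mut(&widget_id).ok_or("not found")?;
        w.value = value.to_string();
        Ok(())
    }
}

/// Shape the rule targets: a scoped ID reaches a mutation with no check.
pub fn update_widget_unchecked(
    db: &mut Db,
    _user: &User,
    widget_id: u64,
    value: &str,
) -> Result<(), &'static str> {
    db.update(widget_id, value)
}

/// Ownership-equality check on the row before the mutation.
pub fn update_widget_checked(
    db: &mut Db,
    user: &User,
    widget_id: u64,
    value: &str,
) -> Result<(), &'static str> {
    let owner_id = db.get(widget_id).ok_or("not found")?.owner_id;
    if owner_id != user.id {
        return Err("forbidden");
    }
    db.update(widget_id, value)
}
```

The unchecked variant lets any caller overwrite any widget; the checked variant rejects a caller whose `user.id` does not match the row’s `owner_id`.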

Sink classification

The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag:

Class	Examples	Default treatment
`InMemoryLocal`	`map.insert`, `set.insert`, `vec.push` on tracked local	Never a sink
`RealtimePublish`	`realtime.publish_to_group`, `pubsub.send`	Sink unless ownership is established for the channel scope
`OutboundNetwork`	`http.post`, `reqwest::Client::post`	Sink unless a sanitiser is on the path
`CacheCrossTenant`	`redis.set`, `memcached.set` with scoped keys	Sink unless tenant is checked
`DbMutation`	`db.insert`, `repo.save` with scoped IDs	Sink unless ownership is established
`DbCrossTenantRead`	`db.query` returning rows from a tenant scope	Sink unless ACL-join or tenant predicate is present

Receiver type drives the classification when SSA type facts are available, so client.send(...) correctly resolves through the receiver’s inferred type.

What it can’t catch

Non-Rust frameworks, in practice. Scaffolding exists; coverage doesn’t.
Type-system authorization. A typestate pattern that makes unauthenticated handlers fail to compile (fn endpoint(user: AuthenticatedUser<Admin>)) is invisible. This is mostly fine because the type system already enforced the check, but the rule won’t credit it.
Authorization performed only via macros that the AST doesn’t expose as a recognisable call.
Cross-async-boundary actor binding. If the handler binds let user = require_auth(...).await? and then reads user.id inside a tokio::spawn task, the spawn body is treated as a separate scope.

The taint-based variant

A second rule, rs.auth.missing_ownership_check.taint, folds the same logic into the SSA/taint engine using the Cap::UNAUTHORIZED_ID capability (bit 12). Request-bound handler parameters seed UNAUTHORIZED_ID into taint state; ownership checks act as sanitizers that strip the cap; sinks that take scoped IDs require it absent.
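The seed / strip / require-absent cycle is ordinary bitset arithmetic. A minimal sketch, assuming a 32-bit capability word: only the bit position (12) comes from the text, and the type and function names are illustrative rather than the definitions in src/labels/mod.rs.

```rust
/// Capability bitset sketch; the u32 width is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cap(pub u32);

impl Cap {
    pub const NONE: Cap = Cap(0);
    /// Bit 12, as stated in the text.
    pub const UNAUTHORIZED_ID: Cap = Cap(1 << 12);

    pub fn contains(self, other: Cap) -> bool {
        self.0 & other.0 == other.0
    }

    pub fn with(self, other: Cap) -> Cap {
        Cap(self.0 | other.0)
    }

    pub fn without(self, other: Cap) -> Cap {
        Cap(self.0 & !other.0)
    }
}

/// A request-bound handler parameter starts out carrying UNAUTHORIZED_ID.
pub fn seed_request_param() -> Cap {
    Cap::NONE.with(Cap::UNAUTHORIZED_ID)
}

/// An ownership check acts as a sanitizer: it strips the capability.
pub fn after_ownership_check(value: Cap) -> Cap {
    value.without(Cap::UNAUTHORIZED_ID)
}

/// A sink that takes a scoped ID reports when the capability is still set.
pub fn sink_reports(value: Cap) -> bool {
    value.contains(Cap::UNAUTHORIZED_ID)
}
```

A value that passes through `after_ownership_check` before reaching the sink no longer carries the bit, so `sink_reports` returns false for it.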

This path is off by default while the standalone analyser carries the stable signal. To run both, enable the taint variant:

[scanner]
enable_auth_as_taint = true

Run them together; if both fire for the same site, treat it as the same finding (the taint variant carries fuller flow evidence).

Tuning

Add a project-specific authorization helper

[[analysis.languages.rust.rules]]
matchers = ["require_subscription", "ensure_paid_seat"]
kind     = "sanitizer"
cap      = "unauthorized_id"

The same rule recognised in the standalone analyser also strips Cap::UNAUTHORIZED_ID for the taint-based variant.

Recognised actor names

Recognised by default: user.id, user.user_id, user.uid, session.user_id, current_user.id, plus typed extractor parameters with CurrentUser, SessionUser, AuthUser, Extension<...> shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic is in src/auth_analysis/checks.rs under extract_validation_target and friends.

Suppress

Inline:

db.insert(widget_id, value)?;  // nyx:ignore rs.auth.missing_ownership_check

Or filter by severity / confidence in CI:

nyx scan . --severity ">=MEDIUM" --min-confidence medium

In the UI

Auth findings render alongside taint findings in the browser UI. The flow visualiser shows the sink call, the actor reference (when one was found), and any helper-summary path the engine traversed; the How to fix panel mirrors the rule’s recommendation.

Nyx finding detail: numbered source → call → sink walk with a How to fix panel and an inline evidence object

Where the work was done

The remediation work is documented release-by-release in tests/benchmark/RESULTS.md under the Rust auth row. Phases A1 through B5 (precision and structural improvements) and Phase C (taint-based variant) all landed on the 0.5.0 release branch. The benchmark corpus at tests/benchmark/corpus/rust/auth/ is 10 fixtures covering the five FP patterns plus a true-positive control.

How Nyx works

If you’re going to act on a finding, it helps to know how the scanner got there. This page is the short version. Source paths are linked where the answer to “exactly what does it do” lives in the code.

The pipeline

A scan runs in two passes over the file tree, with an optional SQLite index that lets later scans skip files whose content hash hasn’t changed.

Pass 1, per file. Tree-sitter parses the file. Nyx builds an intra-procedural control-flow graph, lowers it to SSA, and extracts a summary per function describing what that function does at the boundary: which arguments flow to sinks, which sources it reads from, which sinks it calls, what taint it strips, what it returns. Summaries are persisted to SQLite (src/summary/, src/database.rs).

Summary merge. All per-file summaries get unioned into a global map keyed by qualified function name.

Pass 2, per file. Each file is reanalysed with the global summaries available. The taint engine runs a forward dataflow worklist over the SSA representation. When it hits a call, it consults summaries to decide whether the call propagates taint, sanitizes it, or terminates the flow. Findings are produced when tainted data reaches a sink whose required capability is still set on the value.

Two extra layers tune precision around calls. Context-sensitive inlining (k=1) re-runs intra-file callees with the actual argument taint at the call site, so a helper called once with tainted input and once with sanitized input produces the right result for each call. SCC fixed-point: when a group of mutually-recursive functions forms a strongly-connected component in the call graph, the engine iterates summaries to a joint fixed-point (capped at 64 iterations). SCCs that span files are also handled.
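The SCC fixed-point can be pictured with a toy summary of one bit per function (“may return tainted data”). The sketch below is not the engine’s summary format; `FnFacts` and `solve_scc` are hypothetical, and only the 64-iteration cap comes from the text.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Iteration cap stated in the text.
pub const MAX_SCC_ITERATIONS: usize = 64;

/// Hypothetical per-function facts for a toy summary.
pub struct FnFacts {
    /// The body reads a taint source and returns it.
    pub reads_source: bool,
    /// Callees whose return value flows into this function's return value.
    pub returns_result_of: Vec<&'static str>,
}

/// Iterates summaries until nothing changes or the cap is hit.
/// Returns the functions whose return may be tainted, and whether it converged.
pub fn solve_scc(facts: &BTreeMap<&'static str, FnFacts>) -> (BTreeSet<&'static str>, bool) {
    let mut tainted: BTreeSet<&'static str> = BTreeSet::new();
    for _ in 0..MAX_SCC_ITERATIONS {
        let mut changed = false;
        for (name, f) in facts {
            let now = f.reads_source || f.returns_result_of.iter().any(|c| tainted.contains(c));
            // `insert` returns true only when the summary actually grew.
            if now && tainted.insert(*name) {
                changed = true;
            }
        }
        if !changed {
            return (tainted, true);
        }
    }
    // Cap hit: return the best summary so far and flag non-convergence.
    (tainted, false)
}
```

With `a` and `b` calling each other and only `b` reading a source, the first sweep marks `b`, the second marks `a`, and the third confirms nothing changed.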

When a method call has a receiver typed as a super-class, trait, or interface, hierarchy fan-out widens the resolved callee set to every concrete implementer the engine has seen. A class diagram extracted in pass 1 (Java extends/implements, Rust impl-for, TS/JS extends, Python bases, Ruby includes, PHP extends/implements, C++ inheritance) feeds an index that the call resolver consults during pass 2. The fan-out is capped at 8 implementers per call site; over-fanning is a precision tax, not a soundness issue.

A separate field-sensitive points-to pass tracks abstract locations down to the field level, so c.mu.Lock() is a lock on Field(c, mu) rather than on c as a whole. That distinction is what lets the resource-lifecycle and taint passes tell obj.field = tainted; sink(obj.other_field) apart from the conservative whole-variable approximation. Subscript reads and writes (arr[i], map[k] = v) lower to synthetic __index_get__ / __index_set__ calls so the same container model handles them. Set NYX_POINTER_ANALYSIS=0 to fall back to the pre-pointer-pass behaviour for one release if you need to compare baselines.

Optional analyses on top

These run on top of the forward taint pass. Most are independently switchable via [analysis.engine] config or matching CLI flags. See advanced-analysis.md for the full description and tradeoffs.

Pass	Purpose	Default
Abstract interpretation	Carries interval and string prefix/suffix bounds alongside taint. Suppresses findings on proven-bounded integers and locked-prefix URLs	on
Context sensitivity	k=1 inlining for intra-file callees	on
Field-sensitive points-to	Distinguishes `obj.field` from `obj` itself, so a tainted write to one field does not poison reads from another. Also gives the resource-lifecycle pass per-field locks	on
Hierarchy fan-out	When a method call’s receiver is typed as a super-class, trait, or interface, widens callee resolution to every concrete implementer the engine has seen	on
Constraint solving	Drops paths whose accumulated branch predicates are unsatisfiable. Optional Z3 backend with `--features smt`	on
Symbolic execution	Builds an expression tree per tainted value. Produces a witness string at the sink. Detects sanitization patterns the taint engine alone would miss	on
Backwards analysis	After the forward pass, walks backwards from each sink to confirm or invalidate the flow. Annotates findings as `backwards-confirmed`, `backwards-infeasible`, or `backwards-budget-exhausted`	off

--engine-profile fast | balanced | deep flips groups of these at once. balanced is the default and the configuration the benchmark numbers in language-maturity.md are measured against.

Where bounds live

Static analysis at scale means choosing where to stop. Nyx exposes its bounds rather than hiding them:

Inline depth is k=1. Callees larger than the inline body-size cap fall back to summary-based resolution.
SCC fixed-point is capped at 64 iterations. If a recursive cluster doesn’t converge, the engine emits the best summary it has and records an engine_note on affected findings.
Lattice width is bounded. Taint origin sets cap at 32 entries per SSA value (--max-origins); points-to sets cap at 32 heap objects (--max-pointsto). Truncation is recorded as OriginsTruncated / PointsToTruncated so you can see when precision was lost.
Symbolic expressions cap at depth 32. Deeper expressions degrade to Unknown rather than growing without bound.
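A bounded set with an explicit truncation flag is the common pattern behind these caps. A minimal sketch, assuming origins are (line, column) pairs and that values past the cap are dropped; `OriginSet` is a hypothetical type, and only the default of 32 comes from the text.

```rust
use std::collections::BTreeSet;

/// Default cap from the text (--max-origins).
pub const MAX_ORIGINS: usize = 32;

/// Hypothetical origin set; origins are modelled as (line, column) pairs.
#[derive(Debug, Default)]
pub struct OriginSet {
    origins: BTreeSet<(u32, u32)>,
    /// Set once an insert was dropped because the cap was reached.
    pub truncated: bool,
}

impl OriginSet {
    pub fn insert(&mut self, origin: (u32, u32)) {
        // Re-inserting a known origin never counts as precision loss.
        if self.origins.contains(&origin) {
            return;
        }
        if self.origins.len() >= MAX_ORIGINS {
            self.truncated = true;
            return;
        }
        self.origins.insert(origin);
    }

    pub fn len(&self) -> usize {
        self.origins.len()
    }
}
```

Recording the flag instead of silently dropping entries is what lets a downstream filter tell a complete result from a lower bound.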

Findings whose engine notes indicate a bound was hit can be filtered with --require-converged for strict CI gates. The flag drops over-reports and bails; under-reports (where the emitted finding is still real but the result set is a lower bound) are kept.

What you get out

Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.

For the JSON shape and SARIF mapping, see output.md.

Advanced Analysis

Nyx layers several analysis passes on top of the core SSA taint engine. Most are switchable via config ([analysis.engine] in nyx.conf / nyx.local), a matching CLI flag pair, or, as a last-resort override for library users with no CLI entry point, a NYX_* environment variable. The five precision-tuning passes (abstract interpretation, context sensitivity, symbolic execution, constraint solving, field-sensitive points-to) are on by default because the benchmark numbers in language-maturity.md are measured with them on. The demand-driven backwards walk is off by default and opt-in, and hierarchy fan-out has no switch documented on this page.

See Configuration for the full config surface and CLI flag table. This page explains what each pass does, why it helps, how to disable it, and what it does not cover.

Abstract interpretation

What it does. Propagates interval and string abstract domains through the SSA worklist alongside taint. Integer values carry [lo, hi] bounds; string values carry a prefix and suffix (plus a bit domain for known-zero / known-one bits). Values are joined at merge points and widened at loop heads so the worklist always terminates.

Why it helps. Lets Nyx suppress findings that are provably safe given the abstract value: a proven-bounded integer does not flow into a SQL sink as an injection risk, and an SSRF sink whose URL prefix is locked to a trusted host stays quiet. This turns a large class of FPs on numeric and locked-prefix paths into true negatives.
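The join and widening steps on the interval domain look roughly like this. It is a textbook sketch of the technique, not the code in src/abstract_interp/; the `Interval` type and its methods are illustrative, and i64::MIN / i64::MAX stand in for unbounded ends.

```rust
/// Integer interval [lo, hi] over 64-bit signed values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub lo: i64,
    pub hi: i64,
}

impl Interval {
    /// Top of the lattice: no information about the value.
    pub const TOP: Interval = Interval { lo: i64::MIN, hi: i64::MAX };

    /// Join at a merge point: the smallest interval covering both inputs.
    pub fn join(self, other: Interval) -> Interval {
        Interval {
            lo: self.lo.min(other.lo),
            hi: self.hi.max(other.hi),
        }
    }

    /// Widening at a loop head: a bound that moved since the previous
    /// iteration is dropped to infinity so the worklist terminates.
    pub fn widen(self, next: Interval) -> Interval {
        Interval {
            lo: if next.lo < self.lo { i64::MIN } else { self.lo },
            hi: if next.hi > self.hi { i64::MAX } else { self.hi },
        }
    }

    /// True when both ends are finite, i.e. the value is proven bounded.
    pub fn is_bounded(self) -> bool {
        self.lo > i64::MIN && self.hi < i64::MAX
    }
}
```

A loop counter that grows from [0, 10] to [0, 11] widens to an unbounded upper end after one step, which is the “drops changing bounds” behaviour rather than a search for the exact fixpoint.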

How to turn it off.

Surface	Value
Config	`abstract_interpretation = false` under `[analysis.engine]`
CLI flag	`--no-abstract-interp`
Env var (legacy)	`NYX_ABSTRACT_INTERP=0`

Limitations. The interval domain is 64-bit signed; very wide or overflow-producing arithmetic degrades to ⊤ (unbounded). String prefix / suffix tracking is concat-only; it does not model reordering, reversal, or character-level regex constraints. Loop widening deliberately drops changing bounds rather than chasing fixpoints.

Source: src/abstract_interp/.

Context-sensitive analysis

What it does. Adds k=1 call-site-sensitive taint propagation for intra-file callees. When a function is invoked, Nyx reanalyzes the callee body with the actual per-argument taint signature of the call site, producing call-site-specific return taint. Results are cached by (function_name, ArgTaintSig) so repeated calls with the same signature are free.

Why it helps. A helper called once with a tainted argument and once with a sanitized argument produces two different results; without k=1 sensitivity, the conservative union of both call sites would be applied to the sanitized call, producing a spurious finding there.
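The caching scheme amounts to memoising the callee analysis on (function name, argument taint signature). In this sketch the signature is a slice of per-argument capability bits and the “analysis” is a closure; both are simplifications, and `InlineCache` here only borrows the name from the source file, not its implementation.

```rust
use std::collections::HashMap;

/// Memoises callee return taint per (function name, argument signature).
pub struct InlineCache {
    results: HashMap<(String, Vec<u32>), u32>,
    /// Counts how often the callee body was actually re-analysed.
    pub analyses_run: usize,
}

impl InlineCache {
    pub fn new() -> Self {
        Self {
            results: HashMap::new(),
            analyses_run: 0,
        }
    }

    /// Return taint for `callee` at this call site. `analyse` stands in for
    /// re-running the callee body with the call site's argument taint.
    pub fn return_taint<F>(&mut self, callee: &str, sig: &[u32], analyse: F) -> u32
    where
        F: FnOnce(&[u32]) -> u32,
    {
        let key = (callee.to_string(), sig.to_vec());
        if let Some(&hit) = self.results.get(&key) {
            return hit;
        }
        self.analyses_run += 1;
        let out = analyse(sig);
        self.results.insert(key, out);
        out
    }
}

/// Toy callee: returns its first argument unchanged.
pub fn identity_helper(sig: &[u32]) -> u32 {
    sig.first().copied().unwrap_or(0)
}
```

Calling the helper with a tainted argument, then a clean one, then the tainted one again runs the analysis twice and yields a distinct result per signature.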

How to turn it off.

Surface	Value
Config	`context_sensitive = false` under `[analysis.engine]`
CLI flag	`--no-context-sensitive`
Env var (legacy)	`NYX_CONTEXT_SENSITIVE=0`

Limitations. Intra-file only. Cross-file callees are resolved via summaries (see src/summary/) rather than re-inlined. Depth is capped at k=1 to prevent cache blow-up and re-entrancy; higher k would require a different cache key design. Callee bodies larger than the internal MAX_INLINE_BLOCKS threshold fall back to the summary path. Cache keys hash per-argument Cap bits but not source-origin identity, so two callers with identical caps but different origins share cached origin-attribution.

Source: src/taint/ssa_transfer.rs (ArgTaintSig, InlineCache, inline_analyse_callee).

Field-sensitive points-to

What it does. Runs a Steensgaard-style alias analysis that interns field accesses as their own abstract locations. c.mu becomes Field(c, mu), distinct from c itself; a write to obj.cache and a read from obj.cache in different methods both land on the same abstract location; subscript reads and writes (arr[i], map[k] = v) lower to synthetic __index_get__ / __index_set__ calls so the engine can model them through the same container store/load primitives used for STL containers, Python lists, JS arrays, and similar.

Why it helps. It removes a class of false positives that the whole-variable taint model produced. Before this pass, obj.field = tainted; sink(obj.other_field) would taint obj as a whole and fire on the safe field; the receiver-type / sub-field distinction is also what lets the resource-lifecycle pass attribute a c.mu.Lock() to the lock field rather than to its container. Cross-method field flow (writer in one method, reader in another) shows up only when fields have stable identity independent of the parent value.
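The core idea is that a field gets its own key in the taint state. The sketch below models only that keying; it is not a Steensgaard-style unification analysis, and `Loc`, `TaintState`, and the rule that a tainted base taints its fields are illustrative assumptions.

```rust
use std::collections::HashSet;

/// Abstract location: a variable, or a field of another location.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Loc {
    Var(String),
    Field(Box<Loc>, String),
}

pub fn var(name: &str) -> Loc {
    Loc::Var(name.to_string())
}

pub fn field(base: Loc, name: &str) -> Loc {
    Loc::Field(Box::new(base), name.to_string())
}

#[derive(Default)]
pub struct TaintState {
    tainted: HashSet<Loc>,
}

impl TaintState {
    pub fn write_tainted(&mut self, loc: Loc) {
        self.tainted.insert(loc);
    }

    /// A read is tainted if the location itself, or any enclosing
    /// location, was written with tainted data.
    pub fn is_tainted(&self, loc: &Loc) -> bool {
        let mut cur = loc;
        loop {
            if self.tainted.contains(cur) {
                return true;
            }
            match cur {
                Loc::Field(base, _) => cur = &**base,
                Loc::Var(_) => return false,
            }
        }
    }
}
```

After a tainted write to `Field(obj, field)`, a read of `Field(obj, other_field)` stays clean, which is the case the whole-variable model got wrong.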

How to turn it off.

Surface	Value
Env var	`NYX_POINTER_ANALYSIS=0`

The pass is on by default as of 2026-04-26. The env-var override is kept for one release so you can compare against the pre-pointer baseline, then will be removed.

Limitations. This is not a general escape analysis. Function pointers and arbitrary indirect calls still resolve to no callee, and deep alias chains through *p / p->field in C/C++ are not tracked beyond the direct field case. The points-to set per value is capped at --max-pointsto (default 32); when truncation happens, an engine note records the precision loss.

Source: src/pointer/.

Hierarchy fan-out for virtual dispatch

What it does. Builds a per-language type-hierarchy index in pass 1 (extends, implements, impl-for, includes; the exact construct depends on the language) and uses it in pass 2 to widen method-call resolution. When a call’s receiver is statically typed as a super-class, trait, or interface, the resolver returns every concrete implementer it has seen in the codebase rather than just the first match.

Why it helps. Without it, a call like repository.findById(id) where repository is typed as the interface gets resolved against whatever the single-result resolver finds first; if the matching implementer is in another file the call effectively goes opaque. With the hierarchy, the taint engine sees the union of every implementer’s transform and the flow shows up regardless of which file holds the concrete class.

Limitations. Fan-out is capped at 8 implementers per call site; over that, the tail is silently dropped (a debug log records the cap hit) and the call is treated as a non-deterministic union of the kept implementers. Languages that use structural / implicit interface satisfaction (Go) are deliberately skipped because per-file extraction is intractable; those calls fall back to the single-result resolver. The extractor covers Java, Rust, TS/JS/TSX, Python, Ruby, PHP, and C++.
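A capped fan-out lookup can be sketched as a map from supertype to implementers. The struct, its methods, and the “Impl::method” string encoding are hypothetical; only the cap of 8 and the drop-the-tail behaviour come from the text.

```rust
use std::collections::BTreeMap;

/// Per-call-site cap stated in the text.
pub const MAX_FANOUT: usize = 8;

/// Hypothetical index: supertype name -> concrete implementers seen in pass 1.
pub struct TypeHierarchy {
    implementers: BTreeMap<String, Vec<String>>,
}

impl TypeHierarchy {
    pub fn new() -> Self {
        Self {
            implementers: BTreeMap::new(),
        }
    }

    pub fn record(&mut self, supertype: &str, implementer: &str) {
        self.implementers
            .entry(supertype.to_string())
            .or_default()
            .push(implementer.to_string());
    }

    /// Returns candidate callees as "Impl::method" plus a flag that is true
    /// when implementers beyond the cap were dropped.
    pub fn resolve_widened(&self, receiver_type: &str, method: &str) -> (Vec<String>, bool) {
        match self.implementers.get(receiver_type) {
            Some(impls) => {
                let truncated = impls.len() > MAX_FANOUT;
                let callees = impls
                    .iter()
                    .take(MAX_FANOUT)
                    .map(|i| format!("{i}::{method}"))
                    .collect();
                (callees, truncated)
            }
            // No known implementers: treat the receiver type as concrete.
            None => (vec![format!("{receiver_type}::{method}")], false),
        }
    }
}
```

Returning the truncation flag alongside the callee list is a design choice of this sketch; the text says the real engine records the cap hit in a debug log.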

Source: src/cfg/hierarchy.rs and src/summary/mod.rs (TypeHierarchyIndex, resolve_callee_widened).

Symbolic execution

What it does. Builds a symbolic expression tree per tainted SSA value, generates a witness string for each taint finding (the concrete-looking shape of the dangerous value at the sink), and detects sanitization patterns that the taint engine alone would miss. Supports string operations (trim, replace, toLower, substring, strlen, …), arithmetic, concatenation, phi nodes, and opaque calls.

Why it helps. Raises finding quality. A taint finding with a rendered witness like "SELECT * FROM t WHERE id=" + userInput is substantially easier to triage than one without. Also powers some confidence-gating for downstream display.

How to turn it off.

Surface	Value
Config	`symex.enabled = false` under `[analysis.engine]`
CLI flag	`--no-symex`
Env var (legacy)	`NYX_SYMEX=0`

Two nested switches refine the scope without disabling symex entirely:

Setting	CLI	Env	Default	Effect
`symex.cross_file`	`--no-cross-file-symex`	`NYX_CROSS_FILE_SYMEX=0`	on	Consult cross-file SSA bodies so symex can reason about callees defined in other files
`symex.interprocedural`	`--no-symex-interproc`	`NYX_SYMEX_INTERPROC=0`	on	Intra-file interprocedural symex (k ≥ 2 via frame stack)

Limitations. Expression trees are bounded at MAX_EXPR_DEPTH=32; deeper expressions degrade to Unknown rather than growing unboundedly. Sanitizer detection is informational: string-replace sanitizer patterns are reported as witness metadata, not used to clear taint.
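A witness is essentially a pretty-printed expression tree with a depth bound. The sketch below covers literals, tainted leaves, and concatenation only, and applies the bound while rendering rather than while building the tree; the `Expr` type and function names are illustrative, and only the depth of 32 comes from the text.

```rust
/// Depth bound stated in the text.
pub const MAX_EXPR_DEPTH: usize = 32;

/// Minimal symbolic expression tree (a subset of what the text lists).
#[derive(Debug, Clone)]
pub enum Expr {
    Lit(String),
    Tainted(String),
    Concat(Box<Expr>, Box<Expr>),
    Unknown,
}

pub fn concat(lhs: Expr, rhs: Expr) -> Expr {
    Expr::Concat(Box::new(lhs), Box::new(rhs))
}

/// Renders the shape of the value that reaches the sink.
pub fn witness(e: &Expr) -> String {
    render(e, 0)
}

fn render(e: &Expr, depth: usize) -> String {
    // Past the bound, degrade to Unknown instead of recursing further.
    if depth >= MAX_EXPR_DEPTH {
        return "<unknown>".to_string();
    }
    match e {
        Expr::Lit(s) => format!("{s:?}"),
        Expr::Tainted(name) => name.clone(),
        Expr::Concat(a, b) => format!("{} + {}", render(a, depth + 1), render(b, depth + 1)),
        Expr::Unknown => "<unknown>".to_string(),
    }
}
```

Concatenating the literal `SELECT * FROM t WHERE id=` with a tainted `userInput` leaf renders as the quoted literal, a plus sign, and the variable name.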

Source: src/symex/.

Demand-driven analysis

What it does. After the forward pass-2 taint analysis finishes, runs a backwards walk from each sink’s tainted SSA operands. The walk follows reverse SSA-edge transfer (phi fan-out, Assign operand-fanout, Call body-expansion or arg-fanout) until it reaches a taint source, proves the flow infeasible via an accumulated path predicate, or exhausts its budget. Each forward finding is then annotated with the aggregate verdict:

backwards-confirmed: a matching source was reached. Finding picks up a small confidence boost and the note appears in evidence.symbolic.cutoff_notes.
backwards-infeasible: every walk proved the flow unreachable. Finding is capped to Low confidence and a user-readable limiter is attached.
backwards-budget-exhausted: the walk hit BACKWARDS_VALUE_BUDGET without a verdict. Recorded as a limiter so operators can see when the pass could not keep up.
Inconclusive outcomes are a no-op: the forward finding is untouched.
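One way to read the aggregation rule is as a fold over per-operand walk outcomes. The precedence used here (any confirmation wins, infeasible requires every walk, budget exhaustion comes next) is an interpretation of the list above, and the enum and function names are hypothetical.

```rust
/// Outcome of a single backwards walk from one sink operand.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WalkOutcome {
    ReachedSource,
    Infeasible,
    BudgetExhausted,
    Inconclusive,
}

/// Aggregate verdict attached to the forward finding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Confirmed,
    Infeasible,
    BudgetExhausted,
    Inconclusive,
}

pub fn aggregate(walks: &[WalkOutcome]) -> Verdict {
    // No walks at all: leave the forward finding untouched.
    if walks.is_empty() {
        return Verdict::Inconclusive;
    }
    // One matching source is enough to corroborate the flow.
    if walks.iter().any(|w| *w == WalkOutcome::ReachedSource) {
        return Verdict::Confirmed;
    }
    // Infeasible only when every walk proved the flow unreachable.
    if walks.iter().all(|w| *w == WalkOutcome::Infeasible) {
        return Verdict::Infeasible;
    }
    if walks.iter().any(|w| *w == WalkOutcome::BudgetExhausted) {
        return Verdict::BudgetExhausted;
    }
    Verdict::Inconclusive
}
```

Requiring all walks to be infeasible before lowering confidence keeps a single pruned path from masking a live one.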

Because the backwards walk can consult GlobalSummaries.bodies_by_key (populated by the cross-file callee body persistence layer) it closes across file boundaries; when a callee body is not loadable the walk falls back to fanning out over the call’s arguments so local reach-back is still possible.

Why it helps. Inverts the analysis direction so budget follows the question the scanner actually cares about (“does any source reach this sink?”) instead of proving every potential source-to-sink path. Corroborated findings are a stronger signal than forward-only ones, and proven-infeasible flows provide a principled way to lower confidence on forward false positives without silently dropping them.

How to turn it on. Defaults off so the benchmark floor is preserved while the pass stabilises.

Surface	Value
Config	`backwards_analysis = true` under `[analysis.engine]`
CLI flag	`--backwards-analysis` / `--no-backwards-analysis`
Env var (legacy)	`NYX_BACKWARDS=1`

Limitations (first cut). Reverse call-graph expansion past a ReachedParam is deferred; the walk terminates at function parameters rather than crossing back into callers. Path-constraint pruning is conservative: only the accumulated PredicateSummary bits are consulted, not the full symbolic predicate stack. Depth-bounded at k=2 for cross-function body expansion. See DEFAULT_BACKWARDS_DEPTH, BACKWARDS_VALUE_BUDGET, and MAX_BACKWARDS_CALLEE_BLOCKS in src/taint/backwards.rs for the exact bounds.

Source: src/taint/backwards.rs.

Constraint solving

What it does. Collects path constraints at each branch in SSA and propagates them alongside taint. Prunes paths whose accumulated constraint set is unsatisfiable: a taint flow guarded by if x < 0 && x > 10 is dropped rather than surfaced. Optionally delegates the satisfiability check to Z3 when Nyx is built with the smt Cargo feature.

Why it helps. Removes a class of FPs rooted in clearly-infeasible control-flow combinations. Without path constraints, a taint flow that only occurs when mutually-exclusive branches are simultaneously taken can still produce a finding.

How to turn it off.

Surface	Value
Config	`constraint_solving = false` under `[analysis.engine]`
CLI flag	`--no-constraint-solving`
Env var (legacy)	`NYX_CONSTRAINT=0`

The SMT backend is a separate switch:

Setting	CLI	Env	Default	Effect
`symex.smt`	`--no-smt`	`NYX_SMT=0`	on when built with `smt` feature	Delegate satisfiability checks to Z3; ignored if Nyx was built without `smt`

Limitations. The default path-constraint domain is syntactic; trivially-inconsistent pairs are caught without an SMT solver, but richer algebraic unsatisfiability requires the smt feature (Z3). Without smt, Nyx ships a lightweight satisfiability check that catches literal contradictions but not deeper reasoning.

Source: src/constraint/.

Combining the switches

The defaults (every pass on except the opt-in backwards analysis) are the configuration Nyx is benchmarked against. Changing any switch from its default trades precision against speed and may move findings relative to the published baseline; CI regression gates assume defaults. If you need a minimal-overhead scan (for very large repositories or a pre-commit fast path), the AST-only scan mode (--mode ast) skips CFG, taint, and all four advanced passes entirely and is the right tool.

Detectors

Nyx ships four independent detector families. They run together in --mode full, the default. Findings are merged, deduplicated, ranked, and printed in one result set.

Family	Rule prefix	Looks at	What it finds
Taint analysis	`taint-*`	Cross-file dataflow	Unsanitized data flowing source to sink
CFG structural	`cfg-*`	Per-function control flow	Auth gaps, unguarded sinks, error fallthrough, resource release on all paths
State model	`state-*`	Per-function state lattice	Use-after-close, double-close, leaks, unauthenticated access
AST patterns	`<lang>.<cat>.<name>`	Tree-sitter structural match	Banned APIs, weak crypto, dangerous constructs

For Rust auth-specific rules (rs.auth.*), see auth.md.

How they combine

In --mode full:

Taint and AST can both fire on one line. If eval(userInput) triggers both js.code_exec.eval (AST) and taint-unsanitised-flow (taint), both are kept with distinct rule IDs. The taint finding ranks higher because of the analysis-kind bonus.
State supersedes CFG on resource leaks. When state-resource-leak and cfg-resource-leak fire at the same location, the CFG one is dropped.
Exact duplicates are removed. Same line, column, rule ID, severity → one finding.
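The three merge rules above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Finding` record and `merge_findings` helper; the file path is included in the duplicate key as an assumption, since the text only names line, column, rule ID, and severity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    col: int
    rule_id: str
    severity: str

def merge_findings(findings):
    # Exact duplicates collapse to one; dict.fromkeys keeps first-seen order.
    unique = list(dict.fromkeys(findings))
    # State supersedes CFG: drop cfg-resource-leak where a state leak sits at the same spot.
    state_leaks = {(f.path, f.line, f.col)
                   for f in unique if f.rule_id == "state-resource-leak"}
    return [f for f in unique
            if not (f.rule_id == "cfg-resource-leak"
                    and (f.path, f.line, f.col) in state_leaks)]

raw = [
    Finding("app.js", 3, 5, "js.code_exec.eval", "HIGH"),
    Finding("app.js", 3, 5, "taint-unsanitised-flow", "HIGH"),
    Finding("app.js", 3, 5, "taint-unsanitised-flow", "HIGH"),   # exact duplicate
    Finding("io.c", 9, 5, "state-resource-leak", "MEDIUM"),
    Finding("io.c", 9, 5, "cfg-resource-leak", "MEDIUM"),        # superseded by state
]
merged = merge_findings(raw)
print([f.rule_id for f in merged])
# ['js.code_exec.eval', 'taint-unsanitised-flow', 'state-resource-leak']
```

The AST and taint findings on the same line both survive because their rule IDs differ.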

Modes

Mode	Active detectors
`full` (default)	All four
`ast`	AST patterns only
`cfg`	Taint + CFG + State (no AST patterns)
`taint`	Taint + State

Attack-surface ranking

Every finding gets a deterministic score. Findings are sorted by descending score by default. Disable with --no-rank or output.attack_surface_ranking = false.

score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty

Component	Values
Severity base	High=60, Medium=30, Low=10
Analysis kind	taint=+10, state=+8, cfg with evidence=+5, cfg without evidence=+3, ast=+0
Evidence strength	+1 per evidence item up to 4; +2 to +6 for source kind
State bonus	use-after-close / unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1
Validation penalty	-5 if path-validated

Source-kind contributions (taint only):

Source	Bonus
User input (`req.body`, `argv`, `stdin`, `form`, `query`, `params`)	+6
Environment (`env::var`, `getenv`, `process.env`)	+5
Unknown	+4
File system	+3
Database	+2

Approximate score ranges:

Finding type	Score
High taint with user input	76 to 81
High state (use-after-close)	~74
High CFG structural	63 to 68
Medium taint with env source	45 to 50
Medium state (resource leak)	~40
Low AST-only pattern	~10
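The formula and component tables above can be sketched as a small function. This is an illustration only: the dictionary keys and the `score` signature are hypothetical, and how the engine counts evidence items for a given finding is not specified here, so the sketch takes the count as an input.

```python
SEVERITY_BASE = {"high": 60, "medium": 30, "low": 10}
ANALYSIS_KIND = {"taint": 10, "state": 8, "cfg_with_evidence": 5,
                 "cfg_without_evidence": 3, "ast": 0}
SOURCE_KIND = {"user_input": 6, "environment": 5, "unknown": 4,
               "file_system": 3, "database": 2}
STATE_BONUS = {"use_after_close": 6, "unauthed": 6, "double_close": 3,
               "must_leak": 2, "may_leak": 1}

def score(severity, kind, evidence_items=0, source_kind=None,
          state=None, path_validated=False):
    total = SEVERITY_BASE[severity] + ANALYSIS_KIND[kind]
    total += min(evidence_items, 4)              # +1 per evidence item, capped at 4
    if kind == "taint" and source_kind is not None:
        total += SOURCE_KIND[source_kind]        # source-kind bonus applies to taint only
    if state is not None:
        total += STATE_BONUS[state]
    if path_validated:
        total -= 5                               # validation penalty
    return total

print(score("high", "taint", evidence_items=2, source_kind="user_input"))  # 78
print(score("high", "state", state="use_after_close"))                     # 74
print(score("medium", "state", state="must_leak"))                         # 40
print(score("low", "ast"))                                                 # 10
```

The last three results line up with the approximate ranges in the table; the first falls inside the 76 to 81 band for High taint with user input.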

For the engine’s runtime model (passes, summaries, SCC fixed-point), see how-it-works.md.

AST patterns

AST patterns are tree-sitter queries that match dangerous structural shapes in source. No dataflow, no CFG. A match means the construct is present; it’s not proof the construct is exploitable.

Patterns run in --mode full and --mode ast. In --mode ast they’re the only active detector; --mode cfg and --mode taint skip them.

Rule IDs

<lang>.<category>.<name>

Examples: js.code_exec.eval, py.deser.pickle_loads, c.memory.gets, java.sqli.execute_concat.

Full list: rules.md.

Tiers

Tier	Meaning
A	Structural presence alone is high-signal. `gets`, `eval`, `pickle.loads`, `mem::transmute`
B	Pattern includes a tree-sitter heuristic guard. Example: `java.sqli.execute_concat` only fires when `executeQuery` receives a `binary_expression` (string concatenation), not a literal or a parameterized statement
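A Tier B guard can be pictured with Python's standard `ast` module standing in for tree-sitter. This is an analogy, not Nyx's query: `concat_execute_lines` is a hypothetical helper that reports an `.execute(...)` call only when its first argument is a `+` expression.

```python
import ast

def concat_execute_lines(source):
    """Lines where .execute() receives a `+` expression as its first argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args
                and isinstance(node.args[0], ast.BinOp)
                and isinstance(node.args[0].op, ast.Add)):
            hits.append(node.lineno)
    return sorted(hits)

sample = '''\
cursor.execute("SELECT * FROM users WHERE id=" + user_id)
cursor.execute("SELECT * FROM users WHERE id=%s", (user_id,))
cursor.execute(prepared_sql)
'''
print(concat_execute_lines(sample))  # [1]
```

Only the concatenated call matches; the parameterized call and the plain variable do not, which is the same shape of guard the table describes for `java.sqli.execute_concat`.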

What patterns can’t tell you

Dataflow. eval("1+1") (safe) and eval(userInput) (dangerous) both match js.code_exec.eval. The taint detector is the one that distinguishes them.
Reachability. A pattern in dead code matches identically.
Semantics. strcpy(dst, src) always matches, regardless of buffer sizes.
Indirect calls. let e = eval; e(input) doesn’t match eval.
Aliased imports. from os import system as s; s(cmd) won’t match system.
Macro expansions. Tree-sitter parses the macro call site, not the expansion.
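The dataflow and indirect-call limits above are easy to reproduce with any purely structural matcher. The sketch below uses Python's standard `ast` module as a stand-in for tree-sitter; `eval_call_lines` is a hypothetical helper, not a Nyx API.

```python
import ast

def eval_call_lines(source):
    """Line numbers of calls whose callee is the bare name `eval`."""
    tree = ast.parse(source)   # parses only; the sample is never executed
    return sorted(node.lineno for node in ast.walk(tree)
                  if isinstance(node, ast.Call)
                  and isinstance(node.func, ast.Name)
                  and node.func.id == "eval")

sample = '''\
eval("1+1")
eval(user_input)
e = eval
e(user_input)
'''
print(eval_call_lines(sample))  # [1, 2]
```

The harmless literal on line 1 and the dangerous call on line 2 match identically, while the aliased call on line 4 is missed.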

Common false positives

Scenario	Why	Mitigation
`eval("hardcoded literal")`	Pattern matches structure	Run `--mode cfg` to drop AST patterns and rely on taint
`unsafe` block with sound justification	Every `unsafe` matches `rs.quality.unsafe_block`	It’s Medium, so `--severity ">=MEDIUM"` keeps it; filter with `--severity HIGH`, suppress per line with `nyx:ignore`, or accept the noise
`.unwrap()` in tests	Acceptable in test code	Default non-prod severity downgrade reduces it
`md5` for non-cryptographic checksums	Pattern can’t see intent	Suppress with `--severity ">=MEDIUM"` or per-line `nyx:ignore`
SQL concat with trusted data (Tier B)	Heuristic can’t verify the source	Taint is more precise; or convert to a parameterized query

Confidence levels

Every AST pattern carries an explicit confidence:

Confidence	Use
High	Inherently dangerous construct with no safe usage. `gets`, `pickle.loads`, `eval` with no guard
Medium	Likely issue, context may change the call. SQL concatenation (Tier B), `unsafe` blocks, `exec`
Low	Heuristic. Often appears in safe code. Weak crypto for checksums, `unwrap` outside tests, `Math.random`

--min-confidence medium (or output.min_confidence = "medium") drops Low-confidence matches.

Tuning

nyx scan . --severity ">=MEDIUM"        # drop Low-tier patterns
nyx scan . --severity HIGH              # banned APIs and code-exec only
nyx scan . --mode cfg                   # drop AST patterns; keep taint + state + cfg

[scanner]
excluded_directories = ["node_modules", "vendor", "generated"]

Examples

Tier A, structural presence:

char buf[64];
gets(buf);                              // c.memory.gets

import pickle
data = pickle.loads(user_input)         # py.deser.pickle_loads

Tier B, heuristic guard:

// Fires: concatenated argument
stmt.executeQuery("SELECT * FROM users WHERE id=" + userId);  // java.sqli.execute_concat

// Does not fire: parameterized
stmt.executeQuery(preparedSql);

printf(user_input);                     // c.memory.printf_no_fmt: fires (variable as fmt)
printf("%s", user_input);               // does not fire (literal fmt)

CFG structural analysis

Nyx builds an intra-procedural control-flow graph per function and checks structural properties: whether sinks are guarded by sanitizers or validators, whether web handlers check authentication, whether resources are released on all exit paths, and whether error paths terminate before reaching dangerous code.

These detectors use dominator analysis. A guard dominates a sink when the guard must execute before the sink on every path from entry.
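The dominance check can be sketched with the textbook iterative algorithm. This is a minimal illustration under stated assumptions: the CFG is a plain dict of successor lists, the node names (`entry`, `validate`, `sink`) are hypothetical, and Nyx's own implementation may compute dominators differently.

```python
def dominators(cfg, entry):
    """Iterative dominator sets: dom[n] = {n} plus the intersection of dom[p] over predecessors p."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Every path to the sink passes through the guard.
guarded = {"entry": ["validate"], "validate": ["sink"], "sink": []}
# One path jumps straight from entry to the sink, bypassing the guard.
bypass = {"entry": ["validate", "sink"], "validate": ["sink"], "sink": []}

print("validate" in dominators(guarded, "entry")["sink"])  # True
print("validate" in dominators(bypass, "entry")["sink"])   # False
```

In the second graph the guard exists but does not dominate the sink, which is the situation an unguarded-sink rule is meant to catch.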

Rule IDs

Rule ID	Severity
`cfg-unguarded-sink`	High/Medium
`cfg-auth-gap`	High
`cfg-unreachable-sink`	Medium
`cfg-unreachable-sanitizer`	Low
`cfg-unreachable-source`	Low
`cfg-error-fallthrough`	High/Medium
`cfg-resource-leak`	Medium
`cfg-lock-not-released`	Medium

What it detects

cfg-unguarded-sink: A sink call (system, eval, Command::new, db.execute, etc.) is reachable from function entry without passing through any guard or sanitizer that matches the sink’s capability.

cfg-auth-gap: A function identified as a web handler (by parameter naming conventions like req, res, ctx, request, language-dependent) reaches a privileged sink (shell execution, file I/O) without a preceding authentication call.

cfg-unreachable-*: Sinks, sanitizers, or sources in dead code. Usually signals a refactoring error that silently disabled security-relevant logic.

cfg-error-fallthrough: An error-handling branch (null check, error-return check) does not terminate. Execution falls through to a dangerous operation on the error path.

cfg-resource-leak, cfg-lock-not-released: A resource acquisition (File::open, fopen, socket, Lock) is not matched by a release on every exit path from the function.

What it can’t detect

Inter-procedural guards. Middleware-level auth, helper functions that internally call auth, and cleanup performed in a caller are invisible.
Dynamic dispatch. Virtual calls, function pointers, closures resolve to no specific callee.
Correctness of guards. The detector checks a guard dominates the sink. It cannot check the guard is correct. A no-op if true {} would suppress the finding.
Custom validation logic. Only recognised guard names are checked. if password == expected is not a recognised guard.
Cross-function resource flows. If a file handle opens in one function and closes in another, the opener gets flagged as a leak. This is the largest source of FPs on factory-pattern code.

Common false positives

Scenario	Why	Mitigation
Framework middleware auth	Handler doesn’t call auth directly	Expected; suppress with severity filter or exclude handlers
RAII / defer cleanup	Implicit release not visible to CFG (partially handled for Rust Drop and Go defer)	Known limitation
Custom guard name	Function not in the recognised guard list	Add it as a sanitizer rule in config
Test handlers	Intentional lack of auth	Default non-prod downgrade reduces severity; or exclude test dirs

Common false negatives

Scenario	Why
Auth in a called helper	Cross-function guards not tracked
Type-system guards	Rust `AuthenticatedUser<T>` wrappers, typestate patterns not analysed
Cleanup in `finally`/`ensure`/`defer` in callers	Cross-function cleanup not tracked

Tuning

Recognised guard names

Nyx accepts these patterns as dominating guards:

Pattern	Applies to
`validate`, `sanitize`	All sinks
`check_*`, `verify_*`, `assert_*`	All sinks
`shell_escape`	Shell sinks
`html_escape`	HTML/XSS sinks
`url_encode`	URL sinks
`which`	Shell execution (binary lookup)

Recognised auth names

Pattern	Language
`is_authenticated`, `require_auth`, `check_permission`, `authorize`, `authenticate`, `require_login`, `check_auth`, `verify_token`, `validate_token`	Cross-language
`middleware.auth`, `auth.required`	Go
`isAuthenticated`, `checkPermission`, `hasAuthority`, `hasRole`	Java

For Rust auth checks (require_*, ownership equality, row-level checks), see auth.md.

Custom guards

[[analysis.languages.python.rules]]
matchers = ["validate_request", "check_csrf"]
kind = "sanitizer"
cap  = "all"

Custom auth functions

[[analysis.languages.javascript.rules]]
matchers = ["ensureLoggedIn", "requirePermission"]
kind = "sanitizer"
cap  = "all"

Examples

Unguarded sink:

func handler(w http.ResponseWriter, r *http.Request) {
    cmd := r.URL.Query().Get("cmd")
    exec.Command("sh", "-c", cmd).Run()  // cfg-unguarded-sink
}

Auth gap:

app.get('/admin/delete', (req, res) => {
    // No auth call
    db.execute("DELETE FROM users WHERE id = " + req.params.id);  // cfg-auth-gap
});

Resource leak:

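#include <stdio.h>

void process(int error) {
    FILE *f = fopen("data.txt", "r");
    if (f == NULL) {
        return;           // open failed, so there is no handle to close
    }
    if (error) {
        return;           // cfg-resource-leak: f not closed on this path
    }
    fclose(f);
}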

State model analysis

Tracks resource lifecycle and authentication state through a function. Detects use-after-close, double-close, leaks, and unauthenticated access to privileged operations.

State analysis is on by default. Disable with scanner.enable_state_analysis = false. It runs in --mode full, --mode cfg, and --mode taint; AST-only mode skips it.

Rule IDs

Rule ID	Severity
`state-use-after-close`	High
`state-double-close`	Medium
`state-resource-leak`	Medium
`state-resource-leak-possible`	Low
`state-unauthed-access`	High

What it detects

state-use-after-close: Resource transitions to CLOSED (via close, fclose, disconnect, …), then a use operation happens on it.

char buf[100];
FILE *f = fopen("data.txt", "r");
fclose(f);
fread(buf, 1, 100, f);  // state-use-after-close

state-double-close: Resource closed twice. Crashes or undefined behaviour on most runtimes.

state-resource-leak: Resource opened but never closed on any path through the function. Definite leak.

state-resource-leak-possible: Resource closed on some paths but not others. Lower confidence; often an early-return error path.

state-unauthed-access: A function recognised as a web handler reaches a privileged sink without an auth call on the path.

A function counts as a web handler if its name starts with handle_, route_, or api_ (sufficient on its own), or starts with serve_/process_ and the file uses web-shaped parameter names (request, req, ctx, res, response, w, writer, language-dependent). main is excluded.
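The handler heuristic just described can be sketched as a small predicate. The function and constant names are hypothetical, and the sketch ignores the language-dependent details mentioned above; it only mirrors the prefix and parameter-name rules as stated.

```python
STRONG_PREFIXES = ("handle_", "route_", "api_")    # sufficient on their own
WEAK_PREFIXES = ("serve_", "process_")             # need web-shaped parameters in the file
WEB_PARAM_NAMES = {"request", "req", "ctx", "res", "response", "w", "writer"}

def is_web_handler(func_name, param_names_in_file):
    if func_name == "main":
        return False                               # main is always excluded
    if func_name.startswith(STRONG_PREFIXES):
        return True
    if func_name.startswith(WEAK_PREFIXES):
        return bool(WEB_PARAM_NAMES & set(param_names_in_file))
    return False

print(is_web_handler("handle_upload", []))              # True
print(is_web_handler("process_order", ["req", "res"]))  # True
print(is_web_handler("process_order", ["items"]))       # False
print(is_web_handler("main", ["req"]))                  # False
```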

Managed-resource suppression

Several language-specific cleanup patterns suppress leak findings:

Pattern	Languages	Effect
RAII / Drop	Rust	All leak findings suppressed except `alloc`/`dealloc`
Smart pointers	C++	`make_unique`/`make_shared` treated as managed; raw `new`/`malloc` still tracked
`defer`	Go	`defer f.Close()` suppresses leak at exit
`with` context manager	Python	`with open(path) as f:` suppresses leak for the bound name
try-with-resources	Java	TWR-bound resources suppressed

What it can’t detect

Cross-function resource ownership. When a resource is opened in one function and closed in another, the leak is reported in the opener. The most common FP source for leak detection.
Factory / builder functions that return a resource for the caller to manage.
Variable shadowing across scopes. Same name in inner and outer scope shares one symbol; an inner close masks an outer leak.
Resources stored in collections. Handles in arrays / maps / channels and cleaned up via iteration are not tracked.
Dynamic dispatch. Close called via trait object or interface may not be recognised.
Type-state authentication. AuthenticatedRequest<T> and similar Rust patterns are not recognised as auth.

Common false positives

Scenario	Why	Mitigation
Factory returns a resource	Caller owns it	Known limitation
Framework-managed handles	Connection pool, request scope	Exclude framework code or downgrade
Variable name shadowing	Same name reused	Known limitation

Per-language detection

Language	Leak	Double-close	Use-after-close	Notes
C	yes	yes	yes	`fopen`/`fclose`, `malloc`/`free`, `pthread_mutex_*`
C++	yes	yes	yes	C pairs plus `new`/`delete`; smart pointers suppressed
Python	yes	yes	yes	`with` suppressed; `open`, `socket`, `connect`
Go	yes	yes	yes	`defer` suppressed; `os.Open` / `.Close`
Rust	unsafe only	n/a	n/a	RAII suppresses everything except `alloc`/`dealloc`
JavaScript	yes	yes	partial	`fs.openSync`/`closeSync`
TypeScript	yes	yes	partial	Same as JS
PHP	yes	yes	partial	`fopen`/`fclose`, `curl_init`/`curl_close`, `mysqli_*`
Ruby	partial	partial	partial	`File.open`/`close`, `TCPSocket`
Java	limited	limited	limited	Constructor-callee matching is incomplete

Tuning

nyx scan . --severity ">=MEDIUM"   # Skip "possible" leaks (Low)

[scanner]
enable_state_analysis = true        # default
excluded_directories  = ["tests", "test", "spec"]

Recognised pairs

The state engine ships these acquire/release pairs. Custom pairs are not yet configurable; file an issue if you need one.

C / C++

Acquire	Release
`fopen`	`fclose`
`open`	`close`
`socket`	`close`
`malloc`, `calloc`, `realloc`	`free`
`pthread_mutex_lock`	`pthread_mutex_unlock`
`new`, `new[]` (C++)	`delete`, `delete[]`

Rust

Acquire	Release
`File::open`, `File::create`	`drop`, `close`
`TcpStream::connect`	`shutdown`
`lock`, `read`, `write` (Mutex/RwLock)	`drop`

Java

Acquire	Release
`new FileInputStream` (and friends)	`close`
`getConnection`	`close`
`new Socket`	`close`

Go, Python, JavaScript, Ruby, PHP follow language-idiomatic equivalents.

Use-after-close triggers

These operations on a closed resource fire state-use-after-close:

read, write, send, recv, fread, fwrite, fgets, fputs, fprintf, fscanf,
fflush, fseek, ftell, rewind, feof, ferror, fgetc, fputc, getc, putc,
ungetc, query, execute, fetch, sendto, recvfrom, ioctl, fcntl,
strcpy, strncpy, strcat, strncat, memcpy, memmove, memset, memcmp,
strcmp, strncmp, strlen, sprintf, snprintf
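The lifecycle rules in this section can be pictured as a small state machine over a single execution path. This is a simplified sketch, not the state engine: `check_lifecycle` is a hypothetical name, only a handful of the acquire, release, and use operations listed above are included, and because it sees one path at a time it cannot tell a definite leak from a possible one.

```python
ACQUIRE = {"fopen", "open", "socket"}
RELEASE = {"fclose", "close"}
USE_OPS = {"read", "write", "send", "recv", "fread", "fwrite"}  # subset of the trigger list

def check_lifecycle(events):
    """events: (operation, resource_name) pairs in execution order on one path."""
    state = {}
    findings = []
    for op, name in events:
        if op in ACQUIRE:
            state[name] = "OPEN"
        elif op in RELEASE:
            if state.get(name) == "CLOSED":
                findings.append(("state-double-close", name))
            state[name] = "CLOSED"
        elif op in USE_OPS and state.get(name) == "CLOSED":
            findings.append(("state-use-after-close", name))
    # Anything still OPEN at function exit leaked on this path.
    findings.extend(("state-resource-leak", n) for n, s in state.items() if s == "OPEN")
    return findings

print(check_lifecycle([("fopen", "f"), ("fclose", "f"), ("fread", "f")]))
# [('state-use-after-close', 'f')]
print(check_lifecycle([("fopen", "f"), ("fclose", "f"), ("fclose", "f")]))
# [('state-double-close', 'f')]
print(check_lifecycle([("fopen", "f")]))
# [('state-resource-leak', 'f')]
```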

Taint analysis

Nyx tracks untrusted data from sources (where it enters the program) through assignments and function calls to sinks (where it’s used dangerously). If the flow reaches a sink without passing a matching sanitizer, a finding fires.

The engine is a monotone forward dataflow over a finite lattice with guaranteed termination. It’s flow-sensitive inside a function, and interprocedural across files via persisted per-function summaries.

Rule ID

taint-unsanitised-flow (source <line>:<col>)

One rule ID, parameterized by the source location. Suppressions can target either the base ID or the full string.

What it detects

User input flowing to shell execution: req.body.cmd → child_process.exec
User input flowing to code evaluation: req.query.code → eval
User input flowing to SQL: request.args.get('id') → cursor.execute(f"... {id}")
Environment variables flowing to shell: env::var("CMD") → Command::new("sh").arg("-c")
Request parameters flowing to HTML: req.query.name → innerHTML
File contents flowing to privileged sinks: fs::read_to_string → db.execute
Any other source-to-sink flow where the sink’s required capability is not stripped along the way

What it can’t detect

Library calls without summaries. If a callee has no summary (no source, binary-only dependency), Nyx treats it as neither propagating nor sanitizing. This is conservative for sanitization but lossy for propagation.
Deep pointer aliasing. let y = &x; sink(*y) works through one level, but arbitrary chains of pointer arithmetic and aliased writes (*p, p->field in C/C++) are not tracked end-to-end. Function pointers and indirect calls resolve to no callee.
Implicit flows. Taint follows explicit data, not branching signal. if (secret) x = 1 else x = 0 does not taint x.
Globals and statics across functions. Not tracked across function boundaries.

Common false positives

Scenario	Why	Mitigation
Custom sanitizer not recognised	Only built-in + configured sanitizers match	Add a custom sanitizer rule in config
Container holds mixed-typed items the engine cannot tell apart	A `vector<int>` of port numbers and a `vector<string>` of user input share the same store/load model	Sanitize the values on the way in (numeric parse / explicit validator) so the values themselves carry no cap, not just the container
Dead branches	Path-insensitive within a function	Constraint solving catches trivially infeasible combos; path-validated findings are scored lower
Library wrapper re-introduces taint	Wrapper opaque, or summary marks it as propagating	Summarize the wrapper explicitly or add it as a sanitizer

Common false negatives

Scenario	Why
Third-party library on the path	No summary available, callee treated opaquely
Globals / statics across function boundaries	Not tracked
Some closure captures	Closure analysis is limited. JS/TS/Ruby/Go anonymous functions passed as callbacks are analyzed as separate scopes
Very deep cross-file chains	Summary approximation loses precision at depth

Confidence signals

Higher confidence:

Source + Sink both present in evidence with specific call locations.
source_kind: user_input (direct attacker control).
path_validated: false.
No dominating guard on the path.
Symex produced a witness string (rendered sink value visible in JSON/SARIF evidence.symbolic.witness).

Lower confidence:

Path-validated taint (path_validated: true).
Source is a database read or internal file (pre-validated at insertion is common).
Engine note ForwardBailed / PathWidened. Use --require-converged to drop these in strict gates.

Tuning

Custom sanitizer

# nyx.local
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind     = "sanitizer"
cap      = "html_escape"

Or: nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape.

Filter by severity or confidence

nyx scan . --severity HIGH
nyx scan . --min-confidence medium

Skip dataflow entirely

nyx scan . --mode ast

AST-only mode gives you structural pattern matches without taint.

In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took:

Nyx finding detail: HIGH taint-unsanitised-flow with numbered source → call → sink steps and How to fix guidance

Example

Rust:

use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap();           // source
    Command::new("sh").arg("-c").arg(&cmd).output();   // sink
}

Finding:

[HIGH] taint-unsanitised-flow (source 5:15)  src/main.rs:6:5
       Unsanitised user input flows from env::var → Command::new
       Source: env::var (5:15)
       Sink:   Command::new

Safe rewrite: drop the shell and pass the value as argv directly (Command::new(&cmd).output()), or validate against an allowlist before passing to the shell.

Capabilities

Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer only clears taint for the cap it declares. A sink only fires when the remaining taint still carries its required cap.

Capability	Typical source	Typical sanitizer	Typical sink
`env_var`	`env::var`, `getenv`, `process.env`
`html_escape`		`html.escape`, `DOMPurify.sanitize`	`innerHTML`, `document.write`
`shell_escape`		`shlex.quote`, `shell_escape::escape`	`system`, `Command::new`, `eval`
`url_encode`		`encodeURIComponent`	`location.href`, HTTP client URL arg
`json_parse`		`JSON.parse`
`file_io`		`os.path.realpath`, `filepath.Clean`	`open`, `fs::read_to_string`, `send_file`
`fmt_string`			`printf(var)`
`sql_query`		parameterized query binders	`cursor.execute`, `db.query` with concatenation
`deserialize`			`pickle.loads`, `yaml.load`, `Marshal.load`
`ssrf`		URL-prefix locks	`requests.get`, `fetch`, `HttpClient.send`
`code_exec`			`eval`, `exec`, `Function`
`crypto`			weak-algorithm constructors
`unauthorized_id`	request-bound scoped IDs (Rust auth analysis)	ownership check	row-level write
`all`	Sources typically use `all` so they match any sink

Sources typically use cap = "all" so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
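The capability rules can be sketched by modelling taint as a set of caps. This is an illustrative model only: the helper names are hypothetical, and treating a sanitizer declared with cap "all" as clearing every cap is an assumption.

```python
ALL_CAPS = frozenset({
    "env_var", "html_escape", "shell_escape", "url_encode", "json_parse",
    "file_io", "fmt_string", "sql_query", "deserialize", "ssrf",
    "code_exec", "crypto", "unauthorized_id",
})

def from_source(cap="all"):
    """A source with cap "all" taints the value for every sink."""
    return set(ALL_CAPS) if cap == "all" else {cap}

def sanitize(taint, cap):
    """A sanitizer clears only the cap it names (assumption: "all" clears everything)."""
    return set() if cap == "all" else taint - {cap}

def sink_fires(taint, required_cap):
    """A sink fires only if the remaining taint still carries its required cap."""
    return required_cap in taint

taint = from_source()                      # e.g. a value read from a request
taint = sanitize(taint, "html_escape")     # e.g. passed through an HTML escaper
print(sink_fires(taint, "html_escape"))    # False: safe for an HTML sink
print(sink_fires(taint, "shell_escape"))   # True: still dangerous for a shell sink
```

HTML-escaping a value does nothing for a shell sink, which is why the wrong sanitizer does not suppress a finding.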
