Quick start

After cargo install nyx-scanner (or dropping a release binary on your PATH), point Nyx at a directory:

nyx scan ./my-project

First run builds a SQLite index under .nyx/; later runs skip files whose content hash hasn’t changed.
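The skip-unchanged behaviour can be pictured with a small sketch. Everything in it is an assumption made for illustration: the hash function (SHA-256), the in-memory dict standing in for the SQLite index, and the function names are hypothetical and not taken from Nyx's source.

```python
import hashlib

def content_hash(path):
    """Hash a file's bytes. SHA-256 is an assumption; the docs do not name the algorithm."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()

def files_to_rescan(paths, index):
    """Return the paths whose hash differs from `index`, updating `index` in place.

    `index` is a plain dict (path -> hash) standing in for the on-disk index.
    """
    changed = []
    for path in paths:
        digest = content_hash(path)
        if index.get(path) != digest:
            changed.append(path)
            index[path] = digest
    return changed
```

On a second call with untouched files the function returns an empty list, which is the effect the paragraph above describes for later runs.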

What a finding looks like

[Screenshot: nyx scan output with HIGH taint flows from req.params.user, req.query.url, and req.query.path into exec/fetch/fs.readFileSync]

The same scan in console form:

/tmp/demo/cmdi_direct.py
  6:5  ✖ [HIGH] taint-unsanitised-flow (source 5:11)  (Score: 81, Confidence: High)
      Unsanitised user input flows from request.args.get → os.system

      Source: request.args.get (5:11)
      Sink:   os.system

  6:5  ✖ [HIGH] py.cmdi.os_system  (Score: 64, Confidence: High)
      os.system() runs a shell command

/tmp/demo/xss_document_write.js
  5:5  ✖ [HIGH] taint-unsanitised-flow (source 3:18)  (Score: 81, Confidence: High)
      Unsanitised user input flows from req.query.content → document.write

      Source: req.query.content (3:18)
      Sink:   document.write

  5:5  ⚠ [MEDIUM] js.xss.document_write  (Score: 34, Confidence: High)
      document.write() is an XSS sink

warning 'demo' generated 10 issues.
Finished in 0.054s.

Each finding is one line of header plus evidence. Fields that matter:

| Field | Meaning |
| --- | --- |
| [HIGH] / [MEDIUM] / [LOW] | Severity after the non-prod downgrade |
| Rule ID | Either a taint rule (taint-unsanitised-flow), a structural rule (cfg-*, state-*), or an AST pattern (<lang>.<category>.<name>) |
| Score | Attack-surface ranking (severity + analysis kind + source kind + evidence). Higher is more exploitable |
| Confidence | High, Medium, Low. Drops for AST-only matches, capped widened flows, and lowered-to-Low backwards-infeasible findings |
| Source / Sink | Where tainted data entered and where the dangerous call happened |

Two rules firing on the same line (the taint finding plus the AST pattern) is normal. The pattern matches the structural presence of document.write; the taint rule adds the evidence that req.query.content actually reached it. Both carry distinct rule IDs so suppressions can target one without the other.
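For quick ad-hoc scripting, the header line shown in the console output above can be picked apart with a regular expression. This is a sketch that assumes the header layout is exactly what the sample shows; the pattern and the parse_header name are illustrative, and for anything durable the json or sarif formats are likely a better fit than scraping console text.

```python
import re

# Matches header lines such as:
#   6:5  ✖ [HIGH] taint-unsanitised-flow (source 5:11)  (Score: 81, Confidence: High)
#   5:5  ⚠ [MEDIUM] js.xss.document_write  (Score: 34, Confidence: High)
HEADER = re.compile(
    r"^\s*(?P<line>\d+):(?P<col>\d+)\s+\S+\s+"
    r"\[(?P<severity>HIGH|MEDIUM|LOW)\]\s+"
    r"(?P<rule>\S+)"
    r"(?:\s+\(source\s+(?P<src_line>\d+):(?P<src_col>\d+)\))?"
    r"\s+\(Score:\s*(?P<score>\d+),\s*Confidence:\s*(?P<confidence>\w+)\)\s*$"
)

def parse_header(text):
    """Return the header fields as a dict, or None if the line is not a header."""
    match = HEADER.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("line", "col", "score"):
        fields[key] = int(fields[key])
    src_line = fields.pop("src_line")
    src_col = fields.pop("src_col")
    # Only taint findings carry a "(source L:C)" marker in the sample output.
    fields["source"] = (int(src_line), int(src_col)) if src_line else None
    return fields
```

Because the taint finding and the AST pattern carry different rule IDs, grouping parsed headers by (line, col) shows which locations were reported by both.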

Fail a CI job on High findings

nyx scan . --fail-on HIGH --quiet

Nyx exits with code 1 if any HIGH finding remains. --quiet drops the “Using default configuration” banner so CI logs stay tidy.

Emit SARIF for GitHub Code Scanning

nyx scan . --format sarif > results.sarif

Full SARIF schema and GitHub Actions wiring: cli.md and output.md.

Tighten the gate

# Only HIGH findings
nyx scan . --severity HIGH

# HIGH + MEDIUM
nyx scan . --severity ">=MEDIUM"

# Drop anything below Medium confidence (useful for CI)
nyx scan . --min-confidence medium

# Also drop findings the engine could not fully resolve (widened / bailed)
nyx scan . --require-converged

--require-converged keeps under-report findings (the emitted flow is still real) but drops over-reports and widenings. It is intended for strict gates where a noisy finding is worse than no finding at all.

Skip dataflow for a fast first pass

nyx scan . --mode ast

AST-only mode runs tree-sitter patterns without building a CFG or running taint. It’s fast and still catches banned-API uses, weak crypto, and obvious XSS sinks, but it can’t tell eval("1+1") apart from eval(userInput). Use it as a pre-commit filter, not as a CI gate replacement.

Installation

For the happy path (cargo install nyx-scanner, release binary on PATH), see the README. This page covers platform-specific notes and upgrade paths.

Supported platforms

Release binaries are published for:

| Platform | Archive |
| --- | --- |
| Linux x86_64 | nyx-x86_64-unknown-linux-gnu.zip |
| macOS Intel | nyx-x86_64-apple-darwin.zip |
| macOS Apple Silicon | nyx-aarch64-apple-darwin.zip |
| Windows x86_64 | nyx-x86_64-pc-windows-msvc.zip |

Building from source works on any target supported by stable Rust 1.88 or later (edition 2024).

Verify the download

Each release attaches a SHA256SUMS file. When the maintainer signs the release, a detached SHA256SUMS.asc is published alongside it.

# Verify the checksum file's signature (skip if .asc isn't present)
gpg --verify SHA256SUMS.asc SHA256SUMS

# Then check your archive against it
sha256sum -c SHA256SUMS --ignore-missing

If sha256sum is missing on macOS, shasum -a 256 -c SHA256SUMS --ignore-missing is equivalent.
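On a machine with neither sha256sum nor shasum, the same check can be sketched in a few lines of Python. This assumes SHA256SUMS uses the usual sha256sum layout (hex digest, whitespace, file name, with an optional * before the name); the function names are illustrative. It does not replace the gpg signature check above.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large archives are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(sums_file, archive):
    """Return True on match, False on mismatch, None if `archive` is not listed."""
    name = Path(archive).name
    for raw in Path(sums_file).read_text().splitlines():
        parts = raw.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, listed = parts
        # sha256sum prefixes the name with "*" when it hashed in binary mode.
        if listed.strip().lstrip("*") == name:
            return sha256_of(archive) == expected.lower()
    return None
```

Returning None for unlisted files mirrors the intent of --ignore-missing: entries for archives you did not download are simply skipped.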

Windows

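# Create the install directory first; Move-Item fails if it does not exist (needs an elevated shell)
New-Item -ItemType Directory -Force -Path "C:\Program Files\Nyx" | Out-Null
Expand-Archive -Path nyx-x86_64-pc-windows-msvc.zip -DestinationPath .
Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\"
# Add C:\Program Files\Nyx to PATH in System Properties → Environment Variables, then open a new shell
nyx --version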

Build from source

git clone https://github.com/elicpeter/nyx.git
cd nyx
cargo build --release
# Binary at target/release/nyx

The frontend is built and embedded into the binary during cargo build, so there’s no separate step for nyx serve. Node is only required if you’re working on the frontend itself; see CONTRIBUTING.md.

Optional features:

| Flag | Adds |
| --- | --- |
| --features smt | Bundles Z3 for stronger path-constraint solving. MIT-licensed; distributors should include Z3’s license in their attribution |
| --features smt-system-z3 | Links against a system-installed Z3 instead of bundling |

Upgrading

Nyx stores its scanner version in the project’s index database. When the binary’s version differs from the stored version, the index is wiped on the next scan and rebuilt against the new engine. You’ll see one info-level log line:

engine version changed (0.4.0 → 0.5.0), rebuilding index

No flag needed. If you see this on every scan, the metadata row isn’t being persisted; file an issue.

Corrupt database recovery

If the SQLite file itself is damaged (killed scan, full disk), delete it and let the next scan rebuild from scratch:

rm "$(nyx config path)"/<project>.sqlite*

Only the named project’s index is removed; other projects’ databases are untouched.

CLI Reference

Global

nyx [COMMAND]
nyx --version
nyx --help

nyx scan

Run a security scan on a directory.

nyx scan [PATH] [OPTIONS]

PATH defaults to . (current directory).

Analysis Mode

| Flag | Default | Description |
| --- | --- | --- |
| --mode <MODE> | full | Analysis mode: full, ast, cfg, or taint |

| Mode | What runs |
| --- | --- |
| full | AST patterns + CFG structural analysis + taint analysis |
| ast | AST patterns only (fastest, no CFG or taint) |
| cfg / taint | CFG + taint analysis only (no AST patterns) |

Deprecated aliases: --ast-only (use --mode ast), --cfg-only (use --mode cfg), --all-targets (use --mode full).

Index Control

| Flag | Default | Description |
| --- | --- | --- |
| --index <MODE> | auto | Index behavior: auto, off, or rebuild |

| Index Mode | Behavior |
| --- | --- |
| auto | Use existing index if available; build if missing |
| off | Skip indexing, scan filesystem directly |
| rebuild | Force rebuild index before scanning |

Deprecated aliases: --no-index (use --index off), --rebuild-index (use --index rebuild).

Output

| Flag | Default | Description |
| --- | --- | --- |
| -f, --format <FMT> | console | Output format: console, json, or sarif |
| --quiet | off | Suppress status messages (stderr), including the Preview-tier banner for C/C++ scans |
| --no-rank | off | Disable attack-surface ranking |
| --no-state | off | Disable state-model analysis (resource lifecycle + auth state). Overrides scanner.enable_state_analysis |

Profiles

| Flag | Default | Description |
| --- | --- | --- |
| --profile <NAME> | (none) | Apply a named scan profile. Built-ins: quick, full, ci, taint_only, conservative_large_repo. User-defined profiles override built-ins with the same name. CLI flags still take precedence over profile values |

Filtering

| Flag | Default | Description |
| --- | --- | --- |
| --severity <EXPR> | (none) | Filter findings by severity |
| --min-score <N> | (none) | Drop findings with rank score below N |
| --min-confidence <LEVEL> | (none) | Drop findings below this confidence level (low, medium, high) |
| --require-converged | off | Drop findings whose engine provenance notes indicate widening (over-report) or analysis bail. Keeps under-report findings (emitted flow is still real). Intended for strict CI gates. |
| --fail-on <SEV> | (none) | Exit code 1 if any finding >= this severity |
| --show-suppressed | off | Show inline-suppressed findings (dimmed, tagged [SUPPRESSED]) |
| --keep-nonprod-severity | off | Don’t downgrade severity for test/vendor paths |
| --all | off | Disable category filtering, rollups, and LOW budgets – show everything |
| --include-quality | off | Include Quality-category findings (hidden by default) |
| --max-low <N> | 20 | Maximum total LOW findings to show |
| --max-low-per-file <N> | 1 | Maximum LOW findings per file |
| --max-low-per-rule <N> | 10 | Maximum LOW findings per rule |
| --rollup-examples <N> | 5 | Number of example locations in rollup findings |
| --show-instances <RULE> | (none) | Expand all instances of a specific rule (bypass rollup) |

Severity expression formats:

--severity HIGH              # Only high
--severity "HIGH,MEDIUM"     # High or medium
--severity ">=MEDIUM"        # Medium and above (high + medium)
--severity ">= low"         # All severities (case-insensitive)
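The four expression forms above can be read as set selections over the ordered levels LOW < MEDIUM < HIGH. The following sketch models that reading; it is an illustration of the semantics as documented here, not Nyx's actual parser, and parse_severity is a hypothetical name.

```python
LEVELS = ["LOW", "MEDIUM", "HIGH"]  # ascending order

def parse_severity(expr):
    """Return the set of severities selected by a --severity style expression."""
    text = expr.strip().upper()  # expressions are case-insensitive
    if text.startswith(">="):
        floor = text[2:].strip()
        if floor not in LEVELS:
            raise ValueError(f"unknown severity: {floor!r}")
        # Everything at or above the floor.
        return set(LEVELS[LEVELS.index(floor):])
    chosen = {part.strip() for part in text.split(",")}
    unknown = chosen - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)!r}")
    return chosen
```

Under this reading, ">=MEDIUM" and "HIGH,MEDIUM" select the same findings, and ">= low" selects everything.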

Deprecated aliases: --high-only (use --severity HIGH), --include-nonprod (use --keep-nonprod-severity).

--fail-on returns a non-zero exit code when the threshold trips, so CI jobs fail without further wiring:

[Screenshot: nyx scan with --fail-on HIGH against a small fixture; three HIGH taint findings are printed, followed by exit=1 from the shell]

Quality-category and rollup-prone Low findings are filtered down by default. The footer tells you exactly what got dropped and which knob to turn:

[Screenshot: tail of nyx scan output reading: warning '*' generated 57 issues; Suppressed 92 LOW/Quality findings; Active filters max_low=20, max_low_per_file=1, max_low_per_rule=10; Use --include-quality, --max-low, or --all to adjust]

Analysis Engine Toggles

These flags override the corresponding [analysis.engine] values in nyx.conf for a single run. Every toggle defaults to on except --backwards-analysis, which defaults to off; pass the --no-* variant to disable a pass.

| Flag | Config field | Effect when disabled |
| --- | --- | --- |
| --constraint-solving / --no-constraint-solving | constraint_solving | Skip path-constraint solving; infeasible paths no longer pruned |
| --abstract-interp / --no-abstract-interp | abstract_interpretation | Skip interval / string / bit abstract domains |
| --context-sensitive / --no-context-sensitive | context_sensitive | Treat intra-file callees insensitively (summary-only) |
| --symex / --no-symex | symex.enabled | Skip the symex pipeline; no symbolic verdicts or witnesses |
| --cross-file-symex / --no-cross-file-symex | symex.cross_file | Skip extracting / consulting cross-file SSA bodies |
| --symex-interproc / --no-symex-interproc | symex.interprocedural | Cap symex frame stack at the entry function |
| --smt / --no-smt | symex.smt | Skip the SMT backend (still a no-op without the smt feature) |
| --backwards-analysis / --no-backwards-analysis | backwards_analysis | Skip the demand-driven backwards taint walk from sinks (already off by default) |
| --parse-timeout-ms <N> | parse_timeout_ms | Not a toggle: sets the per-file tree-sitter parse timeout (ms); 0 disables the cap |

Lattice-width Caps

Two caps bound the width of taint origin sets and points-to sets per SSA value. When a set would exceed the cap, entries are truncated deterministically and an engine note (OriginsTruncated / PointsToTruncated) is recorded on affected findings so you can see when precision was lost.

| Flag | Default | Description |
| --- | --- | --- |
| --max-origins <N> | 32 | Max taint origins retained per lattice value. Raise on very wide codebases where truncation is observed; lower only when lattice width is a measured bottleneck. Also set via NYX_MAX_ORIGINS |
| --max-pointsto <N> | 32 | Max abstract heap objects retained per points-to set. Raise on factory-heavy codebases where truncation is observed. Also set via NYX_MAX_POINTSTO |

See configuration.md for the full schema.

Engine-Depth Profile

Individual engine toggles are fine-grained but hard to remember in combination. The --engine-profile shortcut sets the whole stack in one shot, and individual flags are layered on top after the profile is applied.

| Profile | Backwards | Symex | Abstract-interp | Context-sensitive |
| --- | --- | --- | --- | --- |
| fast | off | off | off | off |
| balanced (default) | off | off | on | on |
| deep | on | on (cross-file + interprocedural) | on | on |

All three profiles build the AST, CFG, and SSA lattice and run forward taint; the columns above show which additional analyses each profile enables. SMT (symex.smt) is always off unless Nyx was built with --features smt.

Individual flags override the profile. For example, --engine-profile fast --backwards-analysis runs the fast stack but with backwards analysis on.
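The layering rule (profile first, individual flags on top) can be modelled as a dictionary update. The values below are copied from the profile table in this section; the key names and the resolve_engine function are illustrative assumptions, not Nyx internals.

```python
# One entry per profile, with the four columns of the profile table.
PROFILES = {
    "fast": {"backwards_analysis": False, "symex": False,
             "abstract_interp": False, "context_sensitive": False},
    "balanced": {"backwards_analysis": False, "symex": False,
                 "abstract_interp": True, "context_sensitive": True},
    "deep": {"backwards_analysis": True, "symex": True,
             "abstract_interp": True, "context_sensitive": True},
}

def resolve_engine(profile="balanced", overrides=None):
    """Start from the profile's settings, then apply individual flag overrides."""
    settings = dict(PROFILES[profile])  # copy so the table is never mutated
    settings.update(overrides or {})
    return settings
```

For instance, resolve_engine("fast", {"backwards_analysis": True}) models --engine-profile fast --backwards-analysis: backwards analysis is on and the other three passes stay off.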

Explain Effective Engine

--explain-engine prints the resolved engine configuration (profile + config + CLI overrides + env-var fallbacks) to stdout and exits without scanning. Useful for sanity-checking a CI invocation.

nyx scan --engine-profile deep --no-smt --explain-engine

[Screenshot: nyx scan --engine-profile deep --explain-engine output; the resolved config shows every analysis pass, its current state, and the CLI flag/env var that controls it]

Examples

# Basic scan
nyx scan

# Scan specific path, JSON output
nyx scan ./server --format json

# CI gate: fail on medium+, SARIF output
nyx scan . --format sarif --fail-on medium > results.sarif

# Fast AST-only scan, no index
nyx scan . --mode ast --index off

# High-severity only, quiet mode
nyx scan . --severity HIGH --quiet

# Only findings scoring 50 or above
nyx scan . --min-score 50

# Only medium+ confidence findings
nyx scan . --min-confidence medium

# Show everything (no filtering, no rollups)
nyx scan . --all

# Include quality findings but keep rollups and budgets
nyx scan . --include-quality

# See all unwrap findings expanded
nyx scan . --include-quality --show-instances rs.quality.unwrap

# Allow more LOW findings
nyx scan . --max-low 50 --max-low-per-file 5

nyx index

Manage the SQLite file index.

nyx index build

nyx index build [PATH] [--force]

Build or update the index for the given path (default: .).

| Flag | Description |
| --- | --- |
| -f, --force | Force full rebuild, ignoring cached file hashes |

nyx index status

nyx index status [PATH]

Display index statistics (file count, size, last modified) for the given path.

[Screenshot: nyx index status output with the project name, the index path under the platform config dir, and exists/size/modified fields]


nyx list

nyx list [-v]

List all indexed projects.

| Flag | Description |
| --- | --- |
| -v, --verbose | Show detailed information per project |

nyx clean

nyx clean [PROJECT] [--all]

Remove index data.

| Argument/Flag | Description |
| --- | --- |
| PROJECT | Project name or path to clean |
| --all | Clean all indexed projects |

nyx config

Manage configuration.

nyx config show

Print the effective merged configuration as TOML. Useful for sanity-checking what the scanner is actually using after nyx.conf and nyx.local merge:

[Screenshot: nyx config show output; a TOML dump of the merged scanner config showing [scanner] mode/min_severity/excluded_extensions/excluded_directories, [database] settings, and resolved engine toggles]

nyx config path

Print the configuration directory path.

nyx config add-rule

nyx config add-rule --lang <LANG> --matcher <MATCHER> --kind <KIND> --cap <CAP>

Add a custom taint rule. Written to nyx.local.

| Flag | Values |
| --- | --- |
| --lang | rust, javascript, typescript, python, go, java, c, cpp, php, ruby |
| --matcher | Function or property name to match |
| --kind | source, sanitizer, sink |
| --cap | env_var, html_escape, shell_escape, url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, ssrf, code_exec, crypto, unauthorized_id, all |

nyx config add-terminator

nyx config add-terminator --lang <LANG> --name <NAME>

Add a terminator function (e.g. process.exit). Written to nyx.local.


Exit codes

See output.md. Summary: 0 on success (including findings without --fail-on), 1 when --fail-on trips, non-zero on scan errors.
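A CI wrapper can branch on the exit status following the summary above. This sketch is an assumption-laden illustration: the summary does not say which non-zero codes scan errors use, so the code treats 1 as "gate tripped" only when --fail-on was passed and everything else non-zero as an error. Check output.md before relying on that split. The function names are hypothetical.

```python
import subprocess

def classify_exit(returncode, fail_on_set):
    """Map a scan's exit status to a coarse outcome."""
    if returncode == 0:
        return "ok"            # success, including findings below any gate
    if returncode == 1 and fail_on_set:
        return "gate-tripped"  # assumed not to collide with an error code
    return "error"

def run_and_classify(cmd, fail_on_set=True):
    """Run `cmd` (a list of arguments) and classify its exit status."""
    completed = subprocess.run(cmd, check=False)
    return classify_exit(completed.returncode, fail_on_set)
```

A typical call would be run_and_classify(["nyx", "scan", ".", "--fail-on", "HIGH", "--quiet"]), using only flags documented on this page.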


Environment variables

Runtime behaviour:

| Variable | Description |
| --- | --- |
| RUST_LOG | Set tracing verbosity (e.g. RUST_LOG=debug nyx scan .) |
| NO_COLOR | Disable ANSI color output |

Engine toggles (legacy, still honored; prefer CLI flags or [analysis.engine] config):

| Variable | Matches |
| --- | --- |
| NYX_CONSTRAINT | --constraint-solving |
| NYX_ABSTRACT_INTERP | --abstract-interp |
| NYX_CONTEXT_SENSITIVE | --context-sensitive |
| NYX_SYMEX, NYX_CROSS_FILE_SYMEX, NYX_SYMEX_INTERPROC | --symex, --cross-file-symex, --symex-interproc |
| NYX_SMT | --smt (no-op without the smt feature) |
| NYX_BACKWARDS | --backwards-analysis |
| NYX_PARSE_TIMEOUT_MS | --parse-timeout-ms |
| NYX_MAX_ORIGINS, NYX_MAX_POINTSTO | --max-origins, --max-pointsto |

nyx serve: the browser UI

The CLI is fine for CI. For triage, you want context: the source snippet, the dataflow path, the history of how a finding has moved across scans, and a place to record decisions that survive the next run. nyx serve boots a local React UI bound to loopback.

nyx serve                         # opens http://localhost:9700 in your default browser
nyx serve ./my-project            # serve a specific project root
nyx serve --port 9750             # override port
nyx serve --no-browser            # don't auto-open

Persistent settings live under [server] in nyx.conf / nyx.local.

[Screenshot: Nyx UI overview with total findings, severity breakdown, language and category distribution, and top affected files]

What it serves, and what it doesn’t

The frontend is built and embedded into the nyx binary at compile time. There’s no separate install step, and the binary serves the entire UI from memory; nothing is fetched from a CDN. The UI talks to the local Nyx process over a small JSON API.

There is no account, no telemetry, no remote logging, no auto-update ping. The data the UI shows is the data on your disk: the SQLite project index plus .nyx/triage.json.

Security model

nyx serve enforces three things at the HTTP layer (src/server/security.rs):

  1. Loopback bind only. --host and [server].host are clamped to 127.0.0.1, localhost, or ::1. Any other value is refused at startup with the message “Nyx serve only binds to loopback addresses; refused host '<value>'”.
  2. Host-header check. Every request must carry a Host header that matches the bound address and port. Missing or mismatched headers get a 400 response (“invalid Host header”). This defends against DNS rebinding.
  3. CSRF on mutations. POST / PUT / PATCH / DELETE requests must carry a per-process CSRF token in the x-nyx-csrf header. The token is generated once when the server starts and exposed at GET /api/health so the embedded SPA can read it. Cross-origin mutations are rejected via the Origin header before the CSRF check runs.
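The second and third checks can be pictured as two small predicates. This is a sketch of the idea only, not the code in src/server/security.rs: the function names are hypothetical, the headers are assumed to arrive as a dict with lower-cased keys, and the Host comparison is a strict string match (whether the real server also accepts aliases such as localhost for a 127.0.0.1 bind is not covered here).

```python
import hmac

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

def host_header_ok(header, bound_host, bound_port):
    """True when the Host header names exactly the bound address and port."""
    if not header:
        return False  # a missing header is rejected just like a mismatch
    # IPv6 literals are bracketed in Host headers, e.g. "[::1]:9700".
    host = f"[{bound_host}]" if ":" in bound_host else bound_host
    return header.strip().lower() == f"{host}:{bound_port}".lower()

def csrf_ok(method, headers, token):
    """True for non-mutating methods, or when x-nyx-csrf carries the process token."""
    if method.upper() not in MUTATING_METHODS:
        return True
    supplied = headers.get("x-nyx-csrf", "")
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), token.encode())
```

A DNS-rebinding page reaches the server with its own domain in the Host header, so host_header_ok returns False for it even though the TCP connection lands on loopback.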

If you forward the port over SSH or expose it through a reverse proxy, the host-header check will reject the request because the Host won’t match localhost:9700. That’s the intended behaviour. Don’t do this without a deliberate reason; the loopback bind is part of the security model.

The pages

| Path | Page |
| --- | --- |
| / | Overview |
| /findings | Findings list |
| /findings/:id | Finding detail |
| /triage | Triage |
| /explorer | Explorer |
| /scans | Scans |
| /scans/:id | Scan detail and compare |
| /rules | Rules |
| /rules/:id | Rule detail |
| /config | Config |

The numeric :id for finding URLs is the position index in the current scan, not a stable fingerprint. Bookmarks across scans aren’t reliable; rely on file path + line.

Overview and Health Score

The overview is the landing page after a scan. It shows severity counts, top affected files, OWASP coverage, and a 0 to 100 Health Score with a letter grade.

How the Health Score is calculated

Two things drive the score: the density of risk in the codebase, and hard guardrails that bound what the grade can be.

Each finding contributes weight = severity_base × confidence_factor × verdict_factor × context_factor:

  • Severity base: HIGH 10, MEDIUM 3, LOW (security) 0.5
  • Confidence: High 1.0, Medium 0.6, Low 0.3
  • Symex verdict: Confirmed 1.2, NotAttempted 1.0, Inconclusive 0.7, Infeasible 0.1
  • Context: cross-file taint flow 1.15, intra-file flow 1.0, AST-only or no flow 0.75, test path 0.3

Quality lints (rule IDs containing .quality.) skip the per-finding weight and instead apply a saturating drag, capped at 15 points (so 1000 unwrap lints don’t grade worse than 300 do). Total weight gets divided by sqrt(files / 100), clamped between 1 and roughly 22, so a 100-file repo and a 50000-file repo see different denominators but a monorepo can’t dilute its way out of a real HIGH.
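The per-finding weight and the size divisor can be worked through numerically. The factor values below are the ones listed above; the dictionary keys, the function names, and the exact upper clamp of 22.0 (the text says "roughly 22") are assumptions for illustration. The log curve and the quality drag are not specified here, so they are not reproduced.

```python
import math

SEVERITY_BASE = {"HIGH": 10.0, "MEDIUM": 3.0, "LOW": 0.5}
CONFIDENCE = {"High": 1.0, "Medium": 0.6, "Low": 0.3}
VERDICT = {"Confirmed": 1.2, "NotAttempted": 1.0, "Inconclusive": 0.7, "Infeasible": 0.1}
CONTEXT = {"cross_file": 1.15, "intra_file": 1.0, "ast_only": 0.75, "test_path": 0.3}

def finding_weight(severity, confidence, verdict, context):
    """weight = severity_base x confidence_factor x verdict_factor x context_factor"""
    return (SEVERITY_BASE[severity] * CONFIDENCE[confidence]
            * VERDICT[verdict] * CONTEXT[context])

def size_divisor(files, upper=22.0):
    """sqrt(files / 100), clamped to [1, upper]; upper=22.0 is an approximation."""
    return min(max(math.sqrt(files / 100), 1.0), upper)
```

A confirmed, high-confidence, cross-file HIGH weighs 10 × 1.0 × 1.2 × 1.15 = 13.8, while an AST-only MEDIUM at medium confidence with an inconclusive verdict weighs 3 × 0.6 × 0.7 × 0.75 = 0.945. A 400-file repo divides the total by 2; a 50000-file repo hits the upper clamp rather than dividing by sqrt(500) ≈ 22.4.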

The result feeds a log curve into a 0 to 100 base, minus the quality drag. Then HIGH guardrails apply, keyed on the credibility-adjusted HIGH count rather than the raw count:

| Effective HIGH count | Ceiling |
| --- | --- |
| 0 | 100 |
| 1 | 85 |
| 2 | 78 |
| 3 to 5 | 68 |
| 6 to 10 | 58 |
| 11+ | 45 |

A repo with zero effective HIGHs never grades below a C (70). That floor is the structural promise that the score isn’t an automated F-machine for projects that have lots of LOW noise but no critical issues.
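The ceiling table and the zero-HIGH floor amount to a lookup followed by a clamp. The sketch below assumes the effective HIGH count has already been reduced to an integer and that the guardrails are applied before the ±5 modifiers; neither detail is spelled out here, and the names are hypothetical.

```python
# (inclusive upper bound on effective HIGHs, score ceiling), from the table above.
HIGH_CEILINGS = [(0, 100), (1, 85), (2, 78), (5, 68), (10, 58)]
CEILING_11_PLUS = 45
FLOOR_NO_HIGHS = 70

def high_ceiling(effective_highs):
    """Look up the score ceiling for a given effective HIGH count."""
    for upper, ceiling in HIGH_CEILINGS:
        if effective_highs <= upper:
            return ceiling
    return CEILING_11_PLUS

def apply_guardrails(base_score, effective_highs):
    """Cap the base score by the HIGH ceiling; floor it at 70 when there are no HIGHs."""
    score = min(base_score, high_ceiling(effective_highs))
    if effective_highs == 0:
        score = max(score, FLOOR_NO_HIGHS)
    return score
```

So a base score of 95 with one effective HIGH is capped at 85, and a base score of 40 with no effective HIGHs is lifted to 70.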

Modifiers in the ±5 range nudge the result for trend (only after the second scan), triage coverage (only when total findings ≥ 20), reintroduced findings, and stale HIGHs more than 30 days old.

What the score doesn’t measure

It’s a Nyx-finding-pressure metric, not a security audit. Score 100 means Nyx didn’t find anything under its current rules and language coverage; it doesn’t certify the absence of vulnerabilities. The score doesn’t see runtime config, IAM, secret stores, dependency CVEs, or anything outside the source tree being scanned. A repo of mostly Kotlin (where Nyx coverage is thin) will score artificially well because most of the code never gets evaluated.

The current ceilings are calibrated for v0.5 scanner false-positive rates. As symex coverage and rule precision improve, the ceilings tighten. Calibration data and the rationale behind each tunable live in health-score-audit.md.

Findings and Finding detail

The findings list is filterable by severity, confidence, category, language, rule ID, and triage state.

[Screenshot: Nyx findings list with 13 findings filtered by severity/confidence/rule, showing status badges, file paths, and language tags]

Clicking through opens the flow visualiser: a numbered walk from source to sink with the snippet at each step, cross-file markers when the path leaves the current file, the rule’s “How to fix” guidance, and the engine’s evidence object inline.

[Screenshot: Nyx finding detail for a HIGH taint-unsanitised-flow, showing source → call → sink steps, How to fix guidance, and the evidence panel]

Engine notes call out when precision was bounded for that finding (OriginsTruncated, PointsToTruncated, PathWidened, ForwardBailed, etc.). Anything tagged under-report means the emitted flow is real and the result set is a lower bound; over-report means widening or bail. --require-converged in the CLI drops the over-report ones for strict gates.

Triage

Each finding carries a triage state: open, investigating, false_positive, accepted_risk, suppressed, or fixed. The triage page bulk-updates them and shows the audit trail.

[Screenshot: Nyx triage page with 13 findings needing attention, severity breakdown, Findings/Suppression rules/Audit log tabs, rule chips, and Investigate buttons]

State writes are persisted to SQLite immediately, and (when [server].triage_sync = true, default on) mirrored to .nyx/triage.json in the project root. Commit that file:

git add .nyx/triage.json

It carries decisions across machines so a teammate’s local scan reflects yours. The format is documented in src/server/triage_sync.rs; the schema is stable and round-trip-safe with nyx serve re-imports.

Explorer

A file tree with per-file finding counts, syntax-highlighted source, and a right rail with the file’s symbols and findings. Useful for “what’s wrong with this module” rather than “what’s wrong with this finding”.

[Screenshot: Nyx explorer with a file tree showing per-file finding counts, syntax-highlighted Python source with a red sink marker on the os.system line, and a file-summary right rail with findings]

The file query parameter preselects a file: /explorer?file=src/handler.rs.

Scans and compare

Past runs are persisted when [runs].persist = true (off by default to avoid disk growth for heavy users). When persistence is on, /scans lists historical runs.

[Screenshot: Nyx scans list with a completed scan run showing root, duration, finding count, languages, and started timestamp]

Each run drills into a detail page with files scanned, findings count, duration, languages, and a per-pass timing breakdown.

[Screenshot: Nyx scan detail; Summary tab with files scanned, findings, duration, languages; Details panel with Scan ID, Root, Engine version, started/finished timestamps; Timing breakdown bar showing Walk/Pass 1/Call Graph/Pass 2/Post]

Pick two scans to diff and see what got introduced, fixed, or rediscovered between runs. The retention cap is [runs].max_runs (default 100). Each run can also optionally save its log and stdout (save_logs, save_stdout); both are off by default. Code snippets are saved by default (save_code_snippets = true); turn this off if storage is tight.

Rules

Every rule the engine knows about, built-in plus user-added. Each row shows the matchers, kind (source / sanitizer / sink), capability, language, and how many findings it produced in the latest scan. Filter by language, by kind, or by free text.

[Screenshot: Nyx rules page with 218 rules, language/kind dropdowns, and a matcher search; rows show rule title, language, kind (SOURCE/SANITIZER/SINK), cap, and finding count]

User-added rules can be deleted from this page; built-ins are immutable. Built-ins live in src/labels/<lang>.rs and src/patterns/<lang>.rs; user-added entries write to nyx.local.

Config

A live config editor. Reads the merged config (nyx.conf + nyx.local), lets you flip switches and add custom source / sanitizer / sink rules, and writes back to nyx.local. Changes apply to the next scan; the running server uses its initial config snapshot.

[Screenshot: Nyx config page with General settings (analysis mode, max file size, excluded extensions, attack-surface ranking), Triage Sync toggle, and a Sources section with language/matcher/capability dropdowns and a per-language matcher table]

The custom-rule form picks a language, a matcher (function or property name), and a capability. The capability list matches the Cap bitflags the taint engine uses; see rules.md for what each one means.

API surface

For tooling, the JSON endpoints under /api/ are stable enough to script against. The full route map lives in src/server/routes/mod.rs. Mutating endpoints require the x-nyx-csrf header (read it from GET /api/health).

Disabling

If you don’t want the UI for a project, set:

[server]
enabled = false

nyx serve will refuse to start. The CLI continues to work.

Configuration

Nyx uses TOML configuration files. A default config is auto-generated on first run. If you’d rather edit settings and rules from the browser, the Config page in nyx serve is a live editor that writes back to nyx.local:

[Screenshot: Nyx config page with General settings, Triage Sync toggle, and a Sources panel with language/matcher/capability dropdowns and a per-language matcher table]

File Locations

| Platform | Directory |
| --- | --- |
| Linux | ~/.config/nyx/ |
| macOS | ~/Library/Application Support/nyx/ |
| Windows | %APPDATA%\elicpeter\nyx\config\ |

Run nyx config path to see the exact directory on your system.

File Precedence

  1. nyx.conf – Default config (auto-created from built-in template on first run)
  2. nyx.local – User overrides (loaded on top of defaults)

Both files are optional. CLI flags take precedence over both.

Merge Strategy

| Type | Behavior |
| --- | --- |
| Scalars (mode, min_severity, booleans) | User value wins |
| Arrays (excluded_extensions, excluded_directories, excluded_files) | Union + deduplicate |
| Analysis rules | Per-language union with deduplication |
| Profiles | User profile with same name fully replaces built-in |
| Server / Runs | User value wins (full section override) |

Example:

# nyx.conf (default):
excluded_extensions = ["jpg", "png", "exe"]

# nyx.local (user):
excluded_extensions = ["foo", "jpg"]

# Effective result:
# ["exe", "foo", "jpg", "png"]  -- sorted, deduped union
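The array and scalar rows of the merge table can be expressed in a couple of lines. This is a sketch of the documented behaviour rather than Nyx's implementation; the function names are illustrative, and None stands in for "the user file does not set this key".

```python
def merge_arrays(default, user):
    """Arrays merge as a sorted, de-duplicated union of both files."""
    return sorted(set(default) | set(user))

def merge_scalar(default, user):
    """Scalars: the user value wins whenever nyx.local sets one."""
    return default if user is None else user
```

Running merge_arrays(["jpg", "png", "exe"], ["foo", "jpg"]) reproduces the effective result shown in the example above. One consequence of union semantics is that nyx.local can add entries to an array but cannot remove a default one.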

Full Schema

[scanner]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | "full", "ast", "cfg", or "taint" | "full" | Analysis mode |
| min_severity | "Low", "Medium", or "High" | "Low" | Minimum severity to report |
| max_file_size_mb | int or null | 16 | Max file size in MiB; null = unlimited. Default is a safe ceiling for untrusted repos; lift explicitly when scanning trusted codebases with large generated files |
| excluded_extensions | [string] | ["jpg", "png", "gif", "mp4", ...] | File extensions to skip |
| excluded_directories | [string] | ["node_modules", ".git", "target", ...] | Directories to skip |
| excluded_files | [string] | [] | Specific files to skip |
| read_global_ignore | bool | false | Honor global ignore file (RESERVED) |
| read_vcsignore | bool | true | Honor .gitignore / .hgignore |
| require_git_to_read_vcsignore | bool | true | Require .git dir to apply gitignore |
| one_file_system | bool | false | Don’t cross filesystem boundaries |
| follow_symlinks | bool | false | Follow symbolic links |
| scan_hidden_files | bool | false | Scan dot-files |
| include_nonprod | bool | false | Keep original severity for test/vendor paths |
| enable_state_analysis | bool | true | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires mode = "full" or mode = "taint". |

[database]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| path | string | "" | Custom SQLite DB path; empty = platform default (RESERVED) |
| auto_cleanup_days | int | 30 | Days to keep DB files (RESERVED) |
| max_db_size_mb | int | 1024 | Maximum DB size in MiB (RESERVED) |
| vacuum_on_startup | bool | false | Run VACUUM before indexed scans |

[output]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default_format | "console", "json", or "sarif" | "console" | Default output format (used when --format is not specified) |
| quiet | bool | false | Suppress status messages |
| max_results | int or null | null | Cap number of findings; null = unlimited |
| attack_surface_ranking | bool | true | Enable attack-surface ranking |
| min_score | int or null | null | Minimum rank score to include; null = no minimum |
| min_confidence | string or null | null | Minimum confidence level ("low", "medium", "high"); null = no minimum |
| include_quality | bool | false | Include Quality-category findings (hidden by default) |
| show_all | bool | false | Disable category filtering, rollups, and LOW budgets |
| max_low | int | 20 | Maximum total LOW findings to show (rollups count as 1) |
| max_low_per_file | int | 1 | Maximum LOW findings per file (rollups count as 1) |
| max_low_per_rule | int | 10 | Maximum LOW findings per rule (rollups count as 1) |
| rollup_examples | int | 5 | Number of example locations stored in rollup findings |

[performance]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_depth | int or null | null | Max filesystem traversal depth; null = unlimited |
| min_depth | int or null | null | Min depth for reported entries (RESERVED) |
| prune | bool | false | Stop traversing into matching directories (RESERVED) |
| worker_threads | int or null | null | Worker thread count; null/0 = auto-detect |
| batch_size | int | 100 | Files per index batch |
| channel_multiplier | int | 4 | Channel capacity = threads x multiplier |
| rayon_thread_stack_size | int | 8388608 | Rayon thread stack size in bytes (8 MiB) |
| scan_timeout_secs | int or null | null | Per-file timeout in seconds (RESERVED) |
| memory_limit_mb | int | 512 | Max memory in MiB (RESERVED) |

[server]

Configuration for the local web UI (nyx serve).

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Whether the serve command is enabled |
| host | string | "127.0.0.1" | Host to bind to (localhost by default) |
| port | int | 9700 | Port for the web UI |
| open_browser | bool | true | Open browser automatically on serve |
| auto_reload | bool | true | Auto-reload UI when scan results change |
| persist_runs | bool | true | Persist scan runs for history view |
| max_saved_runs | int | 50 | Maximum number of saved runs |

[runs]

Configuration for scan run persistence and history.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| persist | bool | false | Persist scan run history to disk |
| max_runs | int | 100 | Maximum number of runs to keep |
| save_logs | bool | false | Save scan logs with each run |
| save_stdout | bool | false | Save stdout capture with each run |
| save_code_snippets | bool | true | Save code snippets in findings |

[profiles.<name>]

Named scan presets that override scan-related config. Activate with --profile <name>.

All fields are optional; omitted fields inherit from the base config.

| Field | Type | Description |
| --- | --- | --- |
| mode | string | Analysis mode |
| min_severity | string | Minimum severity |
| max_file_size_mb | int | Max file size in MiB |
| include_nonprod | bool | Keep original severity for test/vendor |
| enable_state_analysis | bool | Enable state analysis |
| default_format | string | Output format |
| quiet | bool | Suppress status output |
| attack_surface_ranking | bool | Enable ranking |
| max_results | int | Max findings |
| min_score | int | Min rank score |
| show_all | bool | Show all findings |
| include_quality | bool | Include quality findings |
| worker_threads | int | Worker thread count |
| max_depth | int | Max traversal depth |

Built-in profiles:

| Name | Description |
| --- | --- |
| quick | AST-only, medium+ severity |
| full | Full analysis with state analysis enabled |
| ci | Full analysis, medium+ severity, quiet, SARIF output |
| taint_only | Taint analysis only |
| conservative_large_repo | AST-only, high severity, 5 MiB file limit, depth 10 |

User-defined profiles with the same name as a built-in will override it.
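The inheritance rule (omitted profile fields fall back to the base config, and CLI flags override both) maps naturally onto a layered lookup. In this sketch the base values are the documented [scanner] and [output] defaults, while the field values for the built-in ci profile are inferred from its one-line description in the table above and may not match the real definition; effective_config is a hypothetical name.

```python
from collections import ChainMap

# Documented defaults for four of the profile-overridable fields.
BASE = {"mode": "full", "min_severity": "Low",
        "default_format": "console", "quiet": False}

# Inferred from "Full analysis, medium+ severity, quiet, SARIF output".
CI_PROFILE = {"mode": "full", "min_severity": "Medium",
              "default_format": "sarif", "quiet": True}

def effective_config(base, profile=None, cli=None):
    """Resolve settings with precedence: CLI flags > profile > base config."""
    # ChainMap returns the value from the first mapping that has the key.
    return dict(ChainMap(cli or {}, profile or {}, base))
```

With this model, effective_config(BASE, CI_PROFILE, {"default_format": "json"}) keeps the profile's Medium severity floor and quiet setting but emits JSON, which is what --profile ci --format json would be expected to do.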

[analysis.engine]

Release-grade switches for the optional analysis passes. Each toggle has a matching CLI flag (pair of --foo / --no-foo) that overrides the config value for a single run. These used to be NYX_* environment variables (NYX_CONSTRAINT, NYX_ABSTRACT_INTERP, NYX_SYMEX, NYX_CROSS_FILE_SYMEX, NYX_SYMEX_INTERPROC, NYX_CONTEXT_SENSITIVE, NYX_PARSE_TIMEOUT_MS, NYX_SMT); those env vars are still honored as a last-resort override when nyx is used as a library (no CLI entry point), but the config/CLI surface is the stable path.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| constraint_solving | bool | true | Path-constraint solving (prunes infeasible paths in taint) |
| abstract_interpretation | bool | true | Interval / string / bit abstract domains carried through the SSA worklist |
| context_sensitive | bool | true | k=1 context-sensitive callee inlining for intra-file calls |
| backwards_analysis | bool | false | Demand-driven backwards taint walk from sinks (adds scan time; default off) |
| parse_timeout_ms | int | 10000 | Per-file tree-sitter parse timeout; 0 disables the cap |

[analysis.engine.symex] sub-section:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Run the symex pipeline after taint; adds witness strings and symbolic verdicts |
| cross_file | bool | true | Persist / consult cross-file SSA bodies so symex can reason about callees defined in other files |
| interprocedural | bool | true | Intra-file interprocedural symex (k ≥ 2 via frame stack) |
| smt | bool | true | Use the SMT backend when nyx is built with the smt feature; ignored otherwise |

CLI flag map (each boolean toggle is a --foo / --no-foo pair; parse_timeout_ms takes a value instead):

| Config field | CLI flags |
| --- | --- |
| constraint_solving | --constraint-solving / --no-constraint-solving |
| abstract_interpretation | --abstract-interp / --no-abstract-interp |
| context_sensitive | --context-sensitive / --no-context-sensitive |
| backwards_analysis | --backwards-analysis / --no-backwards-analysis |
| parse_timeout_ms | --parse-timeout-ms <N> |
| symex.enabled | --symex / --no-symex |
| symex.cross_file | --cross-file-symex / --no-cross-file-symex |
| symex.interprocedural | --symex-interproc / --no-symex-interproc |
| symex.smt | --smt / --no-smt |
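
For wrapper scripts that build a nyx command line from a dictionary of per-run overrides, the flag map can be encoded directly. This is a minimal sketch: the engine_flags helper and the dotted key names for the symex fields are assumptions made for illustration, while the flag spellings are copied from the map above.

```python
# Config field -> CLI flag stem, copied from the flag map.
BOOL_FLAGS = {
    "constraint_solving": "constraint-solving",
    "abstract_interpretation": "abstract-interp",
    "context_sensitive": "context-sensitive",
    "backwards_analysis": "backwards-analysis",
    "symex.enabled": "symex",
    "symex.cross_file": "cross-file-symex",
    "symex.interprocedural": "symex-interproc",
    "symex.smt": "smt",
}


def engine_flags(overrides: dict) -> list[str]:
    """Translate {field: value} overrides into CLI arguments (hypothetical helper)."""
    args: list[str] = []
    for field, value in overrides.items():
        if field == "parse_timeout_ms":
            # The only non-boolean engine switch: it takes a numeric argument.
            args += ["--parse-timeout-ms", str(int(value))]
        elif field in BOOL_FLAGS:
            prefix = "--" if value else "--no-"
            args.append(prefix + BOOL_FLAGS[field])
        else:
            raise KeyError(f"unknown engine field: {field}")
    return args


example_args = engine_flags(
    {"backwards_analysis": True, "symex.smt": False, "parse_timeout_ms": 5000}
)
```

The resulting list can be appended to ["nyx", "scan", "."] before handing it to a process launcher.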

Engine-depth profile shortcut: instead of flipping individual toggles, pass --engine-profile {fast,balanced,deep} to set the whole stack at once. Individual flags override the profile, so --engine-profile fast --backwards-analysis runs the fast stack with backwards analysis on. See docs/cli.md for the exact toggle matrix.

Explain effective engine: pass --explain-engine to print the resolved engine configuration (profile + config + CLI overrides) and exit without scanning.

[analysis.languages.<slug>]

Per-language custom rules. <slug> is one of: rust, javascript, typescript, python, go, java, c, cpp, php, ruby.

| Field | Type | Description |
| --- | --- | --- |
| rules | array of rule objects | Custom label rules |
| terminators | [string] | Functions that terminate execution |
| event_handlers | [string] | Event handler function names |

Rule object:

[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"        # "source" | "sanitizer" | "sink"
cap = "html_escape"       # "env_var" | "html_escape" | "shell_escape" |
                          # "url_encode" | "json_parse" | "file_io" |
                          # "fmt_string" | "sql_query" | "deserialize" |
                          # "ssrf" | "code_exec" | "crypto" | "all"
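
A rule object can be sanity-checked before it goes into nyx.local. The sketch below only encodes the kind and cap values listed in the comments above; the check_rule helper is hypothetical, and the non-empty matchers requirement is an assumption rather than documented Nyx behavior.

```python
KINDS = {"source", "sanitizer", "sink"}
CAPS = {
    "env_var", "html_escape", "shell_escape", "url_encode", "json_parse",
    "file_io", "fmt_string", "sql_query", "deserialize", "ssrf",
    "code_exec", "crypto", "all",
}


def check_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one rule object (empty means it looks fine)."""
    problems = []
    matchers = rule.get("matchers")
    # Assumption: at least one matcher string is required.
    if (
        not isinstance(matchers, list)
        or not matchers
        or not all(isinstance(m, str) for m in matchers)
    ):
        problems.append("matchers must be a non-empty list of strings")
    if rule.get("kind") not in KINDS:
        problems.append(f"kind must be one of {sorted(KINDS)}")
    if rule.get("cap") not in CAPS:
        problems.append(f"cap must be one of {sorted(CAPS)}")
    return problems


good = {"matchers": ["escapeHtml"], "kind": "sanitizer", "cap": "html_escape"}
bad = {"matchers": [], "kind": "sanitiser", "cap": "html"}
```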

Example Configurations

Minimal override (nyx.local)

[scanner]
min_severity = "Medium"

[output]
default_format = "json"
max_results = 100

CI-optimized

[scanner]
mode = "full"
min_severity = "Medium"
excluded_directories = ["node_modules", ".git", "target", "vendor", "dist"]

[output]
quiet = true
default_format = "sarif"

[performance]
worker_threads = 4

Using a scan profile

# Use a built-in profile
nyx scan --profile ci

# CLI flags still override profile values
nyx scan --profile ci --format json

Custom profile

[profiles.security_audit]
mode = "full"
min_severity = "Low"
enable_state_analysis = true
show_all = true

Custom rules for a Node.js project

[analysis.languages.javascript]
terminators = ["process.exit", "abort"]
event_handlers = ["addEventListener"]

[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetInnerHTML"]
kind = "sink"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["getRequestBody", "readUserInput"]
kind = "source"
cap = "all"

Adding rules via CLI

# Add a sanitizer
nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape

# Add a terminator
nyx config add-terminator --lang javascript --name process.exit

# Verify
nyx config show

Config Validation

Config is validated after loading and merging. Validation checks include:

  • Server port must be 1–65535
  • Server host must not be empty
  • max_saved_runs must be > 0 when persist_runs is true
  • max_runs must be > 0 when persist is true
  • batch_size and channel_multiplier must be > 0
  • rollup_examples must be > 0
  • Profile names must be alphanumeric with underscores only

Invalid config produces structured error messages identifying the section, field, and issue.
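
To make the checks concrete, here is a small sketch of a few of them in Python. It is not Nyx's validator: the validate function and the message wording are hypothetical, the defaults are taken from the [server] and [runs] tables, and the batch_size, channel_multiplier, and rollup_examples checks are left out because their sections are not shown here.

```python
import re


def validate(config: dict) -> list[str]:
    """Return 'section field: issue' strings for a merged config dict (sketch only)."""
    errors = []
    server = config.get("server", {})
    runs = config.get("runs", {})

    if not 1 <= server.get("port", 9700) <= 65535:
        errors.append("[server] port: must be 1-65535")
    if server.get("host", "127.0.0.1") == "":
        errors.append("[server] host: must not be empty")
    if server.get("persist_runs", True) and server.get("max_saved_runs", 50) <= 0:
        errors.append("[server] max_saved_runs: must be > 0 when persist_runs is true")
    if runs.get("persist", False) and runs.get("max_runs", 100) <= 0:
        errors.append("[runs] max_runs: must be > 0 when persist is true")

    for name in config.get("profiles", {}):
        # Assumption: "alphanumeric" means ASCII letters and digits.
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            errors.append(f"[profiles.{name}] name: must be alphanumeric with underscores only")
    return errors
```

Note that max_runs = 0 is only rejected when persist is true, matching the conditional wording of the check.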


State Analysis

State analysis detects resource lifecycle violations (use-after-close, double-close, resource leaks) and unauthenticated access patterns. It is enabled by default.

To disable:

[scanner]
enable_state_analysis = false

State analysis requires mode = "full" or mode = "taint". It has no effect in mode = "ast".

Tradeoffs:

  • Additional per-function state-machine pass adds some scan time
  • May produce findings that require domain knowledge to evaluate (e.g., whether a resource handle is intentionally left open)
  • Most useful for C, C++, Rust, Go, and Java where acquire/release patterns are common

Upgrading

Engine-version mismatch is handled automatically

Nyx stores the scanner’s CARGO_PKG_VERSION in the project index database. When the version recorded in the DB differs from the running binary, or the row is missing entirely, every cached summary, SSA body, and file-hash row is wiped on the next open so the next scan rebuilds the index against the new engine. No flag is needed; CI pipelines keep working across upgrades.

The rebuild is logged at info level:

engine version changed (0.4.0 → 0.5.0), rebuilding index

If you see this once per upgrade it is working as intended. If you see it on every scan, the metadata row is not being persisted; file an issue.
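
The version check can be pictured with a few lines of Python and SQLite. This is a sketch of the idea, not Nyx's schema: the meta table and its engine_version key are hypothetical, and a single files table stands in for all cached rows.

```python
import sqlite3


def open_index(conn: sqlite3.Connection, engine_version: str) -> bool:
    """Wipe cached rows if the stored version differs or is missing.

    Returns True when the cache was wiped (a rebuild is needed).
    """
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, hash TEXT)")
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'engine_version'"
    ).fetchone()
    if row is not None and row[0] == engine_version:
        return False  # same engine: keep the cache
    # Missing row or version mismatch: drop cached data and record the new version.
    conn.execute("DELETE FROM files")
    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('engine_version', ?)",
        (engine_version,),
    )
    conn.commit()
    return True
```

If the final INSERT OR REPLACE never reached disk, every open would look like a version change, which is the "rebuilds on every scan" symptom described above.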

Forcing a reindex

Use --index rebuild to throw away the current project’s cached summaries and re-run pass 1 against the current rules. Useful after editing nyx.local rules, after an upgrade that changed label definitions without changing the engine version, or when you want a known-clean baseline:

nyx scan --index rebuild .

This clears the current project’s rows in files, function_summaries, ssa_function_summaries, and ssa_function_bodies; other projects sharing the same DB directory are untouched.

Recovering from a corrupt database

If the .sqlite file itself is damaged (e.g. from a killed scan or full disk) and nyx scan fails to open it, delete the file and let the next scan recreate it:

rm "$(nyx config path)"/<project>.sqlite*

On the next scan Nyx builds a fresh index from scratch.


Reserved Fields

Some config fields are defined but not yet implemented. They are marked (RESERVED) in the default config and accept values without effect. This allows forward-compatible config files; settings will activate when the feature is implemented without requiring config changes.

Output Formats

Nyx supports three output formats, selected with --format or output.default_format in config.

Console (default)

Human-readable, color-coded output to stdout. Status messages go to stderr.

[HIGH]   taint-unsanitised-flow (source 5:11)  src/handler.rs:12:5 (Score: 76, Confidence: High)
         Source: env::var("CMD") → Command::new("sh").arg("-c")

[MEDIUM] cfg-unguarded-sink                    src/handler.rs:12:5 (Score: 35, Confidence: Medium)

[LOW]    rs.quality.unwrap                     src/lib.rs:88:5 (Score: 10, Confidence: High)

Severity indicators

| Tag | Color | Meaning |
| --- | --- | --- |
| [HIGH] | Red, bold | Critical – likely exploitable |
| [MEDIUM] | Orange, bold | Important – may be exploitable |
| [LOW] | Muted blue-gray | Informational – code quality or weak signal |

Evidence fields

Taint and state findings include structured evidence:

| Label | Meaning |
| --- | --- |
| Source | Where tainted data originated (function name + location) |
| Sink | Where the dangerous operation happens |
| Path guard | Type of validation predicate protecting the path |

Score

When attack-surface ranking is enabled (default), each finding shows a Score value. Higher scores indicate greater exploitability. See Detector Overview for the scoring formula.

Rollup findings

High-frequency LOW Quality findings (e.g. rs.quality.unwrap) are grouped into rollup findings by (file, rule):

  21:10  ● [LOW]   rs.quality.unwrap
      rs.quality.unwrap (38 occurrences)
      Examples: 21:10, 50:10, 79:10, 105:10, 134:10
      Run: nyx scan --show-instances rs.quality.unwrap

Rollups count as one finding for LOW budget enforcement. Use --show-instances <RULE> to expand a specific rule or --all to disable rollups entirely.
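
As a rough model of the grouping, the sketch below rolls LOW Quality findings up by (file, rule) and keeps the first few locations as examples. The roll_up function and its min_group threshold are assumptions for illustration (this page does not say how many occurrences count as high-frequency); rollup_examples mirrors the config field of the same name.

```python
from collections import defaultdict


def roll_up(findings, min_group=2, rollup_examples=5):
    """Group LOW Quality findings by (file, rule); pass everything else through."""
    passthrough = []
    groups = defaultdict(list)
    for f in findings:
        if f["severity"] == "Low" and f["category"] == "Quality":
            groups[(f["path"], f["id"])].append(f)
        else:
            passthrough.append(f)

    rolled = []
    for items in groups.values():
        if len(items) < min_group:
            rolled.extend(items)  # too few occurrences to be worth grouping
            continue
        items.sort(key=lambda f: (f["line"], f["col"]))
        head = dict(items[0])  # the rollup is reported at the first occurrence
        head["rollup"] = {
            "count": len(items),
            "occurrences": [
                {"line": f["line"], "col": f["col"]} for f in items[:rollup_examples]
            ],
        }
        rolled.append(head)
    return passthrough + rolled
```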

When findings are suppressed by the prioritization pipeline, a footer is shown:

Suppressed 195 LOW/Quality findings.
Active filters:
  include_quality = false
  max_low = 20
  max_low_per_file = 1
  max_low_per_rule = 10

Use --include-quality, --max-low, or --all to adjust.

JSON

Machine-readable JSON array. Each finding is an object:

[
  {
    "path": "src/handler.rs",
    "line": 12,
    "col": 5,
    "severity": "High",
    "id": "taint-unsanitised-flow (source 5:11)",
    "path_validated": false,
    "labels": [
      ["Source", "env::var(\"CMD\") at 5:11"],
      ["Sink", "Command::new(\"sh\").arg(\"-c\")"]
    ],
    "confidence": "High",
    "evidence": {
      "source": {
        "path": "src/handler.rs",
        "line": 5,
        "col": 11,
        "kind": "source",
        "snippet": "env::var(\"CMD\")"
      },
      "sink": {
        "path": "src/handler.rs",
        "line": 12,
        "col": 5,
        "kind": "sink",
        "snippet": "Command::new(\"sh\")"
      },
      "notes": ["source_kind:EnvironmentConfig"]
    },
    "rank_score": 76.0,
    "rank_reason": [
      ["severity_base", "60"],
      ["analysis_kind", "10"],
      ["source_kind", "5"],
      ["evidence_count", "1"]
    ]
  }
]

Field descriptions

| Field | Type | Always present | Description |
| --- | --- | --- | --- |
| path | string | yes | File path relative to scan root |
| line | int | yes | 1-indexed line number |
| col | int | yes | 1-indexed column number |
| severity | string | yes | "High", "Medium", or "Low" |
| id | string | yes | Rule ID |
| category | string | yes | Finding category: "Security", "Reliability", or "Quality" |
| path_validated | bool | no | True if guarded by validation predicate |
| guard_kind | string | no | Predicate type (e.g. "NullCheck", "ValidationCall") |
| message | string | no | Human-readable context (state analysis findings) |
| labels | array | no | Array of [label, value] pairs for console display |
| confidence | string | no | Confidence level: "Low", "Medium", or "High" |
| evidence | object | no | Structured evidence (source/sink spans, state, notes) |
| rank_score | float | no | Attack-surface score (omitted when ranking disabled) |
| rank_reason | array | no | Score breakdown (omitted when ranking disabled) |
| rollup | object | no | Rollup data when findings are grouped (see below) |

Fields marked “no” are omitted when empty/null/false to keep output compact.
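
Because optional fields may be absent, consumers should read them defensively. The sketch below parses a trimmed copy of the sample finding and uses dict.get for every field marked "no"; the summarize helper is hypothetical. It also sums the rank_reason components, which add up to rank_score in the sample (60 + 10 + 5 + 1 = 76); whether the score is always a plain sum is an assumption here, so consult the scoring formula before relying on it.

```python
import json

# Trimmed copy of the sample finding shown above.
SAMPLE = """[
  {
    "path": "src/handler.rs", "line": 12, "col": 5,
    "severity": "High",
    "id": "taint-unsanitised-flow (source 5:11)",
    "confidence": "High",
    "rank_score": 76.0,
    "rank_reason": [["severity_base", "60"], ["analysis_kind", "10"],
                    ["source_kind", "5"], ["evidence_count", "1"]]
  }
]"""


def summarize(raw: str) -> list[dict]:
    """Flatten each finding, tolerating omitted optional fields."""
    rows = []
    for f in json.loads(raw):
        rows.append({
            "location": f'{f["path"]}:{f["line"]}:{f["col"]}',
            "severity": f["severity"],
            "rule": f["id"],
            "path_validated": f.get("path_validated", False),  # omitted when false
            "confidence": f.get("confidence"),
            "rank_score": f.get("rank_score"),
            # rank_reason values are strings in the sample, so convert before summing.
            "reason_total": sum(float(v) for _, v in f.get("rank_reason", [])),
        })
    return rows


rows = summarize(SAMPLE)
```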

Confidence levels

| Level | Meaning |
| --- | --- |
| High | Strong signal – taint-confirmed flow, definite state violation |
| Medium | Moderate signal – resource leak, path-validated taint, CFG structural |
| Low | Weak signal – AST pattern match, possible resource leak, degraded analysis |

Evidence object

The evidence field provides structured provenance data:

| Field | Type | Description |
| --- | --- | --- |
| source | object | Source span (path, line, col, kind, snippet) |
| sink | object | Sink span (path, line, col, kind, snippet) |
| guards | array | Validation guard spans |
| sanitizers | array | Sanitizer spans |
| state | object | State-machine evidence (machine, subject, from_state, to_state) |
| notes | array | Free-form notes (e.g. "source_kind:UserInput", "path_validated") |

All fields are omitted when empty/null.

Rollup object

When a finding is a rollup (grouped from multiple occurrences), the rollup field is present:

{
  "rollup": {
    "count": 38,
    "occurrences": [
      { "line": 21, "col": 10 },
      { "line": 50, "col": 10 },
      { "line": 79, "col": 10 }
    ]
  }
}

| Field | Type | Description |
| --- | --- | --- |
| count | int | Total number of occurrences |
| occurrences | array | First N example locations (controlled by rollup_examples) |

SARIF (Static Analysis Results Interchange Format)

SARIF 2.1.0 JSON, suitable for GitHub Code Scanning and other SARIF-compatible tools.

nyx scan . --format sarif > results.sarif

The SARIF output includes:

  • Tool metadata – Nyx name and version
  • Rules – Rule ID, description, severity mapping
  • Results – One result per finding with location, message, and properties
  • Properties – Each result includes category and optionally confidence and rollup.count
  • Related locations – Rollup findings include example locations in relatedLocations
  • Artifacts – File paths referenced by findings
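
A SARIF file can be post-processed with any JSON library. The sketch below counts results per rule and per category. It relies only on the standard SARIF 2.1.0 layout (runs[].results[].ruleId) plus the properties.category field mentioned above; the summarize_sarif helper itself is hypothetical.

```python
import json
from collections import Counter


def summarize_sarif(sarif_text: str):
    """Return (results per ruleId, results per properties.category)."""
    doc = json.loads(sarif_text)
    by_rule = Counter()
    by_category = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            by_rule[result.get("ruleId", "<unknown>")] += 1
            category = result.get("properties", {}).get("category")
            if category is not None:
                by_category[category] += 1
    return by_rule, by_category
```

Typical use is to load results.sarif from disk with open(...).read() and pass the text in.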

GitHub Code Scanning integration

- name: Run Nyx
  run: nyx scan . --format sarif > results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Scan completed successfully; no findings matched --fail-on threshold |
| 1 | --fail-on threshold breached (at least one finding meets or exceeds the specified severity) |
| Non-zero | Error (I/O, config, database, parse error) |

Without --fail-on, Nyx always exits 0 on a successful scan regardless of findings count.
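
The success-path logic can be summarized in a few lines. This sketch models only exit codes 0 and 1 for a scan that completed; the exit_code function is hypothetical, error exits are not modeled, and suppressed findings are skipped as described under Inline Suppressions.

```python
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


def exit_code(findings, fail_on=None):
    """Exit status for a scan that completed without errors (sketch only)."""
    if fail_on is None:
        return 0  # no --fail-on: findings never affect the exit status
    threshold = SEVERITY_ORDER[fail_on.lower()]
    for f in findings:
        if f.get("suppressed"):
            continue  # suppressed findings do not trigger --fail-on
        if SEVERITY_ORDER[f["severity"].lower()] >= threshold:
            return 1
    return 0
```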


Severity Levels

| Level | Description | Typical rules |
| --- | --- | --- |
| High | Critical vulnerabilities – likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources |
| Medium | Important issues – may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks |
| Low | Informational – code quality or weak signals | Weak crypto algorithms, insecure randomness, unwrap()/panic!(), type-safety escapes |

Non-production severity downgrade

By default, findings in paths matching common non-production patterns (tests/, test/, vendor/, build/, examples/, benchmarks/) are downgraded by one tier:

  • High → Medium
  • Medium → Low
  • Low → Low (unchanged)

Use --keep-nonprod-severity to disable this behavior.
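
The downgrade amounts to a one-tier lookup. In the sketch below, matching on any directory component of the path is an assumption (the exact pattern matching Nyx uses is not spelled out here), and effective_severity is a hypothetical helper.

```python
from pathlib import PurePosixPath

NONPROD_DIRS = {"tests", "test", "vendor", "build", "examples", "benchmarks"}
DOWNGRADE = {"High": "Medium", "Medium": "Low", "Low": "Low"}


def effective_severity(path, severity, keep_nonprod_severity=False):
    """Apply the one-tier downgrade to findings under non-production directories."""
    if keep_nonprod_severity:
        return severity
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    if any(part in NONPROD_DIRS for part in directories):
        return DOWNGRADE[severity]
    return severity
```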


Inline Suppressions

Suppress specific findings directly in source code using nyx:ignore comments. Suppressed findings are excluded from output, severity counts, and --fail-on checks by default.

Comment syntax

| Language | Comment styles |
| --- | --- |
| Rust, C, C++, Java, Go, JS, TS | // nyx:ignore ... or /* nyx:ignore ... */ |
| Python, Ruby | # nyx:ignore ... |
| PHP | // nyx:ignore ..., # nyx:ignore ..., or /* nyx:ignore ... */ |

Directive forms

x = dangerous()  # nyx:ignore taint-unsanitised-flow     ← suppresses this line
# nyx:ignore-next-line taint-unsanitised-flow
x = dangerous()                                           ← suppresses this line
  • nyx:ignore <RULE_ID> – suppresses findings on the same line as the comment.
  • nyx:ignore-next-line <RULE_ID> – suppresses findings on the next line.
  • For taint findings, the primary line is the sink line (the line field in output).

Rule ID matching

  • Case-sensitive, exact match after canonicalization.
  • Comma-separated: nyx:ignore rule-a, rule-b
  • Wildcard suffix: nyx:ignore rs.quality.* matches any ID starting with rs.quality.
  • Taint IDs are canonicalized: nyx:ignore taint-unsanitised-flow matches taint-unsanitised-flow (source 5:1) (parenthetical suffix stripped).
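
The matching rules above translate into a short function. This is a sketch under the stated rules only (case-sensitive, comma-separated, wildcard suffix, parenthetical stripped); the canonical and is_suppressed names are hypothetical and edge cases may differ from Nyx's matcher.

```python
import re


def canonical(rule_id: str) -> str:
    """Strip a trailing parenthetical such as ' (source 5:1)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", rule_id)


def is_suppressed(rule_id: str, directive_arg: str) -> bool:
    """True if any comma-separated pattern in the directive matches the rule ID."""
    target = canonical(rule_id)
    for pattern in (p.strip() for p in directive_arg.split(",")):
        if not pattern:
            continue
        if pattern.endswith("*"):
            # Wildcard suffix: match any ID that starts with the text before '*'.
            if target.startswith(pattern[:-1]):
                return True
        elif target == pattern:  # case-sensitive exact match
            return True
    return False
```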

Console behavior

  • Default: suppressed findings are hidden entirely.
  • --show-suppressed: suppressed findings appear dimmed with [SUPPRESSED] tag. Summary shows "N issues (M suppressed)".

JSON / SARIF behavior

  • Default: suppressed findings are excluded from JSON/SARIF output.
  • --show-suppressed: suppressed findings are included with additional fields:
{
  "suppressed": true,
  "suppression": {
    "kind": "SameLine",
    "matched_pattern": "taint-unsanitised-flow",
    "directive_line": 42
  }
}

Exit code

Suppressed findings do not trigger --fail-on. A scan with only suppressed findings exits 0.


Rule ID Format

| Prefix | Detector | Example |
| --- | --- | --- |
| taint-* | Taint analysis | taint-unsanitised-flow (source 5:11) |
| cfg-* | CFG structural | cfg-unguarded-sink, cfg-auth-gap |
| state-* | State model | state-use-after-close, state-resource-leak |
| <lang>.*.* | AST patterns | rs.memory.transmute, js.code_exec.eval |

See the Rule Reference for a complete listing.

Language Maturity Matrix

Nyx supports ten languages, but support depth is not uniform. This page gives an honest per-language picture so you can calibrate expectations before depending on Nyx for a given stack.

The classifications here are grounded in three concrete signals:

  1. Rule depth: how many distinct source / sanitizer / sink matchers exist for the language in src/labels/<lang>.rs, and how many vulnerability classes (Cap bits) those matchers cover.
  2. Benchmark results: rule-level precision / recall / F1 on the 433-case corpus in tests/benchmark/RESULTS.md, last measured 2026-04-29 with scanner version 0.5.0.
  3. Known weak spots: FPs and FNs the maintainers have deliberately left in the benchmark rather than suppressed, plus structural engine limitations the corpus does not stress, documented release-by-release in RESULTS.md.

As of 2026-04-29 the synthetic corpus has effectively saturated: every real-CVE fixture fires and rule-level recall is 100%. Nine of ten languages report rule-level F1 = 100.0%; Go reports 98.0% on the back of a single safe-fixture FP. Aggregate rule-level P=0.995, R=1.000, F1=0.998. That means F1 alone no longer differentiates tiers, so the differentiators are rule depth, gated-sink coverage, and structural idioms the corpus does not fully stress (deep pointer aliasing in C/C++, framework-specific context). All parser integrations use tree-sitter and are stable; parsing is not a differentiator.


Tier Summary

| Tier | Languages | F1 | What to expect |
| --- | --- | --- | --- |
| Stable | Python, JavaScript, TypeScript | 100% | Deep rule sets, gated sinks (argument-role-aware), framework detection, extensive fixtures, and the bulk of advanced-analysis (SSA two-level solve, context-sensitivity, symbolic execution, abstract interpretation) coverage. Safe to depend on in CI gates. |
| Beta | Go, Java, PHP, Ruby, Rust | 98.0% to 100% | Solid mid-depth rule sets with narrower cap coverage and no gated sinks. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates. |
| Preview | C, C++ | 100% on synthetic corpus | Recent work taught the engine to follow taint through std::vector / std::string / map containers (including c_str()), through fluent builder chains like Socket::builder().host(h).connect(), and through inline class member functions. Function pointers and deeper pointer aliasing through *p / p->field are still not tracked. Rule-level scores against a corpus of obvious unsafe-API uses look perfect, but that is not the same as a clean audit on a real codebase. Pair with clang-tidy, Clang Static Analyzer, or Infer. |

Per-Language Detail

Stable tier

Python: 100% P / 100% R / 100% F1 (46-case corpus)

  • Rule depth: 5 source families, 7 sanitizer families, 21 sink matchers spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
  • Framework context: Flask, Django, argparse source matchers; flask_request import-alias support.
  • Advanced analysis: gated sinks (Popen, subprocess.run/call with activation-arg awareness), most SSA-equivalence and symbolic-execution fixtures target Python.
  • Fixtures: 125 under tests/fixtures/ plus 42 benchmark cases.
  • Blind spots: f-string interpolation is not explicitly modeled as a distinct taint-producing construct; string-formatting flows are caught by the general concatenation path.

JavaScript: 100% P / 100% R / 100% F1 (42-case corpus)

  • Rule depth: 3 source families, 10 sanitizer families, 24 sink matchers spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O.
  • Advanced analysis: gated sinks (setAttribute, parseFromString), two-level SSA solve for top-level + per-function scopes (analyse_ssa_js_two_level), prefix-locked SSRF suppression via StringFact, abstract-interpretation interval tracking.
  • Framework context: Express, Koa, Fastify (via in-file import scan when package.json is absent).
  • Fixtures: 238 under tests/fixtures/; the largest fixture set of any language.
  • Blind spots: template literals are lowered through concatenation rather than modeled as a first-class taint operator; dynamic property access (obj[user]) is conservatively treated.

TypeScript: 100% P / 100% R / 100% F1 (47-case corpus)

  • Rule depth: Shares the JS ruleset (3 sources, 10 sanitizers, 24 sinks) plus TS-specific grammar handling.
  • Advanced analysis: TSX and JSX grammars wired; discriminated-union narrowing, generic erasure, decorator flow, and interface dispatch are all validated against adversarial type-system stressors.
  • Framework context: Fastify detection via detect_in_file_frameworks (import-driven, no package.json required).
  • Fixtures: 39 test fixtures plus 42 benchmark cases.
  • Blind spots: as any casts and any-typed flows are handled conservatively (treated as tainted).

Beta tier

Go: 96.2% P / 100.0% R / 98.0% F1 (53-case corpus, 1 FP, 0 FNs)

  • Rule depth: 4 source families, 4 sanitizer families, 9 sink matchers covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O.
  • Framework context: Gin, Echo source matchers.
  • Open weak spots: one safe Go fixture (go-safe-009) draws a spurious CMDi finding.
  • Known gaps: no gated sinks, no deserialization class. fmt.Sprintf is deliberately not a sink. Cap coverage is narrower than the Stable tier and argument-role-aware sink modeling is not yet implemented for Go, so production CI gates may surface additional FPs the corpus does not exercise.
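
The Go figures follow from the usual precision / recall / F1 definitions. In the worked check below, the one false positive and zero false negatives come from the line above, but the true-positive count of 25 is inferred (it is the count consistent with the reported 96.2% precision) and is not stated in the source.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard definitions; returns fractions in the range 0..1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# tp=25 is an inferred value, not a reported one.
p, r, f1 = precision_recall_f1(tp=25, fp=1, fn=0)
go_summary = tuple(round(100 * x, 1) for x in (p, r, f1))
```

With those counts, go_summary reproduces the reported 96.2 / 100.0 / 98.0.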

Java: 100% P / 100% R / 100% F1 (35-case corpus)

  • Rule depth: 3 source families, 8 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization.
  • Framework context: Spring, JPA, Hibernate ORM rules; JNDI injection sinks.
  • Known gaps: no gated sinks. Variable-receiver method calls (client.send(...) vs HttpClient.send(...)) rely on type-qualified resolution from receiver-type inference; when the receiver type cannot be inferred, as on unusual builder chains, the flow is conservatively over-tainted.

PHP: 100% P / 100% R / 100% F1 (37-case corpus)

  • Rule depth: 3 source families ($_GET, $_POST, $_REQUEST superglobals), 7 sanitizer families, 10 sink matchers covering HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
  • Known gaps: no gated sinks. Limited framework context (Laravel raw methods only). echo language-construct detection is wired but its inner-argument propagation is narrower than function-call sinks.

Ruby: 100% P / 100% R / 100% F1 (39-case corpus)

  • Rule depth: 3 source families, 7 sanitizer families, 15 sink matchers covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
  • Framework context: Rails helpers (sanitize_sql, permit, require).
  • Known gaps: string interpolation inside shell and SQL strings is recognized structurally but not modeled as a distinct operator. begin/rescue/ensure exception-edge wiring is documented as deferred (structurally incompatible with build_try()). The previous open rb-interproc-001 FN closed in the 2026-04-28 baseline after the Ruby Kernel#open CMDI sink and exact-match sigil work landed.

Rust: 100% P / 100% R / 100% F1 (70-case adversarial corpus)

Rust holds the largest per-language adversarial corpus and was promoted from Experimental to Beta in the 2026-04-25 measurement after the PathFact landings closed every previously-open rs-safe-* regression.

  • Rule depth: 6 source families, 2 sanitizer families (prefix and type-coercion), 11 sink matchers covering HTML, Shell, SQL, SSRF, Deserialization, and File I/O. Extensive framework source coverage (Axum, Actix, Rocket); the most of any language on the source side. The narrow sanitizer count is the primary reason Rust is not in the Stable tier. Engine-side path/typed sanitizer recognition (PathFact) compensates, but the ruleset itself is shallow.
  • Recent additions: SQL class (rusqlite, sqlx, diesel, postgres), Deserialization class (serde_yaml, bincode, rmp_serde, ciborium, ron, toml), expanded file I/O (fs::remove_file/dir/rename/copy), reqwest SSRF builder chain.
  • Closed by recent PathFact landings (src/abstract_interp/path_domain.rs + per-return-path PathFact entries on SsaFuncSummary): rs-safe-007 (.replace("..","") sanitiser), rs-safe-008 (negative-validation return), rs-safe-009 (match-arm guards via condition lifting), rs-safe-010 (static-map lookup), rs-safe-012 (.contains("..") + .starts_with('/') rejection), rs-safe-014 (Option-returning user sanitiser), rs-safe-015 (Path::new(p).is_absolute() typed rejection), rs-safe-016 (cross-function .contains("..") rejection), and CVE patches CVE-2018-20997, CVE-2022-36113, CVE-2024-24576.
  • Not yet covered: unsafe FFI / std::mem::transmute (no rules), Tokio process::Command async variants (not distinguished from sync), hyper / surf / ureq SSRF clients (reqwest family only).

Preview tier

C and C++ remain Preview despite reporting 100% rule-level F1 on the synthetic corpus. A run of additions in late April taught the engine to follow taint through several constructs that used to be hard cutoffs (STL containers, builder chains, inline member functions, the wider std::sto* family), so the gap between “passes the synthetic corpus” and “would catch the same flow on a real codebase” is narrower than it used to be. It is not zero. The biggest remaining gaps are deep pointer aliasing and function pointers, both of which are pervasive in real C/C++ code. Treat a clean report as a starting point, not an audit. Pair Nyx with clang-tidy, the Clang Static Analyzer, or Infer for production use.

What now works (added in late April):

  • STL container flow. vec.push_back(tainted) followed by vec.front().c_str() carries taint into a downstream system() sink. std::map::insert_or_assign, find, count, at, and data all participate in the container store/load model.
  • Inline class member functions. class C { void run(...) { ... } }; bodies are now extracted as their own functions, so an intra-file call like inner.run(input) resolves to the body summary. Same fix covers struct_specifier, union_specifier, enum_specifier, template_declaration, and extern "C" blocks.
  • Lambda passthrough. auto echo = [](const char* s) { return s; }; carries argument taint into the result via the engine’s default call-argument propagation.
  • Builder chains. Socket::builder().host(user).port(8080).connect() resolves the chained returns and fires on .connect() when user is tainted; the safe variant with a hardcoded host stays quiet.
  • Wider numeric sanitizer family. The full std::sto* set (including stoll, stoull, stold) and the C-stdlib forms (atoi, atof, strtol, etc.) clear all caps when they’re called.
  • More header / source extensions. .cc, .cxx, .hpp, .hxx, .hh, and .h++ are recognized as C++ on top of .cpp and .c++. .h is intentionally still routed to C since it’s ambiguous without a build system.

Still not modeled (common to both C and C++):

  • Deep pointer aliasing. Taint through *p, p->field, and pointer arithmetic is not tracked across aliased writes. Field-sensitive points-to (see Advanced analysis) handles the “lock on a sub-field” case but is not a general escape analysis.
  • Function pointers and callback dispatch. An indirect call through void (*fn)(char *) resolves to no callee, so cross-pointer flows are invisible.
  • Array-element taint by index. Writes to buf[i] do not always propagate taint to buf as a whole; the recent subscript-handling work helps the general case but doesn’t make buf an alias for every element.
  • Nested classes beyond one level (C++ only).

C: 100% P / 100% R / 100% F1 (30-case corpus)

  • Rule depth: 3 source families, 2 sanitizer families (the sanitize_* prefix and numeric-parse functions), 5 sink matchers spanning Shell, File, SSRF, and Format-String.
  • Known gaps: no framework rules, no gated sinks. The structural limitations listed above are the dominant concern; rule additions alone will not lift this language out of the Preview tier.

C++: 100% P / 100% R / 100% F1 (33-case corpus, plus 6 new fixtures for STL / builder / inline-method flows)

  • Rule depth: Builds on the C ruleset with std::cin / std::getline sources and a wider numeric-sanitizer set covering the full std::sto* family (3 sources, 3 sanitizer families, 5 sinks).
  • Known gaps: still no framework rules and no gated sinks. The structural blind spots are now narrower than they were a release ago (see “What now works” above), but function pointers and the harder pointer-aliasing patterns still produce false negatives.

How the tiers were assigned

Because rule-level F1 has saturated for nine of ten languages, the tier boundaries are drawn primarily on rule depth and engine coverage of real-world idioms rather than on benchmark scores alone.

A language lands in Stable when all three hold:

  • Rule set covers ≥ 8 vulnerability classes with both source and sink matchers, and at least one class has argument-role-aware gated-sink modeling (e.g. setAttribute("href", url) only flags href-like attrs).
  • Benchmark F1 ≥ 95% on a corpus of ≥ 25 cases.
  • Advanced analysis (SSA lowering, context-sensitivity, symbolic execution, abstract interpretation) is exercised by fixtures for the language.

A language lands in Beta when benchmark F1 is in the mid-90s or higher on a meaningful corpus but at least one Stable criterion fails. Typical gaps: absence of gated sinks, or sanitizer rule depth narrow enough that the engine compensates structurally rather than via the ruleset.

A language lands in Preview when the engine has documented structural blind spots for constructs that are pervasive in typical codebases for that language. For C and C++ that means deep pointer aliasing, function pointers, and array-element taint; STL container flow and builder chains have moved out of the blind-spot list. Synthetic-corpus F1 is not a reliable signal for Preview-tier languages: a clean report can coexist with structural gaps.
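
Read as a decision procedure, the criteria look roughly like the sketch below. It is an interpretation, not the maintainers' process: the LanguageProfile type is hypothetical, checking the Preview condition first is an assumption, "mid-90s or higher" is read as F1 ≥ 95, and the sample inputs are partly illustrative.

```python
from dataclasses import dataclass


@dataclass
class LanguageProfile:
    vuln_classes: int              # classes with both source and sink matchers
    has_gated_sink: bool
    f1: float                      # rule-level benchmark F1, in percent
    corpus_cases: int
    advanced_fixtures: bool        # advanced analysis exercised by fixtures
    structural_blind_spots: bool   # pervasive constructs the engine cannot follow


def classify(lang: LanguageProfile) -> str:
    if lang.structural_blind_spots:
        return "Preview"  # assumption: this check takes precedence over F1
    stable = (
        lang.vuln_classes >= 8
        and lang.has_gated_sink
        and lang.f1 >= 95.0
        and lang.corpus_cases >= 25
        and lang.advanced_fixtures
    )
    if stable:
        return "Stable"
    if lang.f1 >= 95.0:  # "mid-90s or higher", read as >= 95
        return "Beta"
    return "Unclassified"


# Inputs loosely based on the per-language detail above; advanced_fixtures for
# Go and C is an illustrative guess.
python_like = LanguageProfile(8, True, 100.0, 46, True, False)
go_like = LanguageProfile(7, False, 98.0, 53, True, False)
c_like = LanguageProfile(4, False, 100.0, 30, False, True)
```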

(The previous Experimental tier was retired in the 2026-04-25 measurement when Rust’s adversarial corpus reached 100% F1; no language currently sits in that tier.)


What this means for you

  • CI gates: safe to set strict --fail-on HIGH gates on Stable-tier languages. On Beta-tier, expect occasional FP triage on production code (the synthetic corpus does not cover every framework idiom); the weak-spot lists above tell you what to skim for. On Preview-tier, treat Nyx findings as a starting point for manual review rather than authoritative. STL container flow and builder chains are tracked now, but deep pointer aliasing and function pointers are not, so a clean report does not tell you what the engine could not see.
  • Rule contributions: the shortest path to raising a language’s tier is contributing sink matchers and gated-sink registrations. Label files live at src/labels/<lang>.rs; benchmark cases live at tests/benchmark/corpus/<lang>/.
  • Scope planning: if your primary stack is C or C++, Nyx will surface real findings on obvious unsafe-API uses, but budget for review time and combine Nyx with clang-tidy or the Clang Static Analyzer. Rust is now Beta-tier and suitable as a CI gate; pair with cargo-audit for dependency CVEs.

The benchmark thresholds in tests/benchmark_test.rs are deliberately set ~5 pp below current baselines, so a drop of more than about 5 pp in a language’s F1 fails CI. Tier promotions require sustained benchmark performance, not just rule additions.

Rule reference

Every finding Nyx emits has a rule ID. This page enumerates the IDs that ship with scanner 0.5.0, grouped by family.

Apart from the generated AST pattern tables, this page is written by hand and can drift from the code. Authoritative sources: src/patterns/<lang>.rs for AST patterns, src/labels/<lang>.rs for taint matchers, and src/auth_analysis/config.rs for auth rules. If a rule fires that isn’t listed here, the source file is right and this page is wrong.

If you’d rather browse rules interactively, nyx serve ships a Rules page that lists every loaded matcher with its language, kind, and capability:

Nyx Rules page: filterable list of 218 rules with language, kind (SOURCE/SANITIZER/SINK), capability, and finding count columns

ID format

| Prefix | Detector | Example |
|---|---|---|
| taint-* | Taint analysis | taint-unsanitised-flow (source 5:11) |
| cfg-* | CFG structural | cfg-unguarded-sink, cfg-auth-gap |
| state-* | State model | state-use-after-close, state-resource-leak |
| <lang>.auth.* | Auth analysis | rs.auth.missing_ownership_check |
| <lang>.<category>.<name> | AST patterns | rs.memory.transmute, js.code_exec.eval |

Language prefixes: rs, c, cpp, go, java, js, ts, py, php, rb.
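
The prefix table can be read as a small classifier. The sketch below is illustrative only: Family, LANGS, and classify are hypothetical names, not part of Nyx, and it assumes that the prefixes listed above are exhaustive. Note that <lang>.auth.* has to be tested before the generic <lang>.<category>.<name> shape because both start with a language prefix.

```rust
/// Detector families from the ID-format table (names are illustrative).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Family {
    Taint,
    CfgStructural,
    StateModel,
    Auth,
    AstPattern,
    Unknown,
}

/// Language prefixes listed in this section.
const LANGS: [&str; 10] = ["rs", "c", "cpp", "go", "java", "js", "ts", "py", "php", "rb"];

/// Hypothetical helper: map a rule ID to its detector family by prefix.
pub fn classify(rule_id: &str) -> Family {
    if rule_id.starts_with("taint-") {
        return Family::Taint;
    }
    if rule_id.starts_with("cfg-") {
        return Family::CfgStructural;
    }
    if rule_id.starts_with("state-") {
        return Family::StateModel;
    }
    // Dotted IDs have the shape <lang>.<category>.<name>, possibly with extra segments.
    let mut parts = rule_id.split('.');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(lang), Some(category), Some(_name)) if LANGS.contains(&lang) => {
            // Auth rules share the <lang>. prefix, so check them first.
            if category == "auth" {
                Family::Auth
            } else {
                Family::AstPattern
            }
        }
        _ => Family::Unknown,
    }
}
```

A suppression or reporting script could use a classifier like this to group findings by family without hard-coding every rule ID.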

Cross-language rules

Taint

One rule covers every source-to-sink flow. The parenthetical identifies the source location.

| Rule ID | Severity |
|---|---|
| taint-unsanitised-flow (source L:C) | Varies by source kind and sink capability |

The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in src/labels/<lang>.rs. Language maturity gives per-language counts and what’s covered.

CFG structural

| Rule ID | Severity |
|---|---|
| cfg-unguarded-sink | High/Medium |
| cfg-auth-gap | High |
| cfg-unreachable-sink | Medium |
| cfg-unreachable-sanitizer | Low |
| cfg-unreachable-source | Low |
| cfg-error-fallthrough | High/Medium |
| cfg-resource-leak | Medium |
| cfg-lock-not-released | Medium |

State model

| Rule ID | Severity |
|---|---|
| state-use-after-close | High |
| state-double-close | Medium |
| state-resource-leak | Medium |
| state-resource-leak-possible | Low |
| state-unauthed-access | High |

Auth analysis (Rust only, today)

| Rule ID | Severity |
|---|---|
| rs.auth.missing_ownership_check | High |
| rs.auth.missing_ownership_check.taint | High (gated by scanner.enable_auth_as_taint) |

See auth.md for scope, the five sink-classes, and tuning.

AST patterns by language

Each language ships a tree-sitter pattern registry. Patterns are structural matches only, with no dataflow. Some patterns also carry a Tier B heuristic guard (e.g. SQL execute must receive a concatenation, not a literal), noted in the registry.

The tables below are generated from src/patterns/<lang>.rs by tools/docgen. Run cargo run --features docgen --bin nyx-docgen after changing the registry to refresh them.

C: 8 patterns

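| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| c.cmdi.system | High | A | High |
| c.memory.gets | High | A | High |
| c.memory.printf_no_fmt | High | B | Medium |
| c.memory.scanf_percent_s | High | A | High |
| c.memory.sprintf | High | A | High |
| c.memory.strcat | High | A | High |
| c.memory.strcpy | High | A | High |
| c.cmdi.popen | Medium | A | High |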

C++: 9 patterns

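| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| cpp.cmdi.popen | High | A | High |
| cpp.cmdi.system | High | A | High |
| cpp.memory.gets | High | A | High |
| cpp.memory.printf_no_fmt | High | B | Medium |
| cpp.memory.sprintf | High | A | High |
| cpp.memory.strcat | High | A | High |
| cpp.memory.strcpy | High | A | High |
| cpp.memory.const_cast | Medium | A | High |
| cpp.memory.reinterpret_cast | Medium | A | High |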

Go: 8 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| go.cmdi.exec_command | High | A | High |
| go.transport.insecure_skip_verify | High | A | High |
| go.deser.gob_decode | Medium | A | High |
| go.memory.unsafe_pointer | Medium | A | High |
| go.secrets.hardcoded_key | Medium | A | High |
| go.sqli.query_concat | Medium | B | Medium |
| go.crypto.md5 | Low | A | Medium |
| go.crypto.sha1 | Low | A | Medium |

Java: 8 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| java.cmdi.runtime_exec | High | A | High |
| java.deser.readobject | High | A | High |
| java.reflection.class_forname | Medium | A | High |
| java.reflection.method_invoke | Medium | A | High |
| java.sqli.execute_concat | Medium | B | Medium |
| java.xss.getwriter_print | Medium | A | High |
| java.crypto.insecure_random | Low | A | Medium |
| java.crypto.weak_digest | Low | A | Medium |

JavaScript: 22 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| js.code_exec.eval | High | A | High |
| js.code_exec.new_function | High | A | High |
| js.config.cors_dynamic_origin | High | A | Medium |
| js.code_exec.settimeout_string | Medium | A | High |
| js.config.insecure_session_httponly | Medium | A | High |
| js.config.reject_unauthorized | Medium | A | High |
| js.config.verbose_error_response | Medium | A | Medium |
| js.crypto.weak_hash_import | Medium | A | Medium |
| js.prototype.extend_object | Medium | A | High |
| js.prototype.proto_assignment | Medium | A | High |
| js.secrets.fallback_secret | Medium | A | Medium |
| js.xss.cookie_write | Medium | A | High |
| js.xss.document_write | Medium | A | High |
| js.xss.insert_adjacent_html | Medium | A | High |
| js.xss.location_assign | Medium | A | High |
| js.xss.outer_html | Medium | A | High |
| js.config.insecure_session_samesite | Low | A | High |
| js.config.insecure_session_secure | Low | A | Medium |
| js.crypto.math_random | Low | A | Medium |
| js.crypto.weak_hash | Low | A | Medium |
| js.secrets.hardcoded_secret | Low | A | Medium |
| js.transport.fetch_http | Low | A | Medium |

PHP: 11 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| php.cmdi.system | High | A | High |
| php.code_exec.assert_string | High | A | High |
| php.code_exec.create_function | High | A | High |
| php.code_exec.eval | High | A | High |
| php.code_exec.preg_replace_e | High | A | High |
| php.deser.unserialize | High | A | High |
| php.path.include_variable | High | B | Medium |
| php.sqli.query_concat | Medium | B | Medium |
| php.crypto.md5 | Low | A | Medium |
| php.crypto.rand | Low | A | Medium |
| php.crypto.sha1 | Low | A | Medium |

Python: 13 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| py.cmdi.os_popen | High | A | High |
| py.cmdi.os_system | High | A | High |
| py.cmdi.subprocess_shell | High | B | Medium |
| py.code_exec.eval | High | A | High |
| py.code_exec.exec | High | A | High |
| py.deser.pickle_loads | High | A | High |
| py.deser.yaml_load | High | A | High |
| py.code_exec.compile | Medium | A | High |
| py.deser.shelve_open | Medium | A | High |
| py.sqli.execute_format | Medium | B | Medium |
| py.xss.jinja_from_string | Medium | A | High |
| py.crypto.md5 | Low | A | Medium |
| py.crypto.sha1 | Low | A | Medium |

Ruby: 11 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| rb.cmdi.backtick | High | A | High |
| rb.cmdi.system_interp | High | A | High |
| rb.code_exec.class_eval | High | A | High |
| rb.code_exec.eval | High | A | High |
| rb.code_exec.instance_eval | High | A | High |
| rb.deser.marshal_load | High | A | High |
| rb.deser.yaml_load | High | A | High |
| rb.reflection.constantize | Medium | A | High |
| rb.reflection.send_dynamic | Medium | B | Medium |
| rb.ssrf.open_uri | Medium | A | High |
| rb.crypto.md5 | Low | A | Medium |

Rust: 13 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| rs.memory.copy_nonoverlapping | High | A | High |
| rs.memory.get_unchecked | High | A | High |
| rs.memory.mem_zeroed | High | A | High |
| rs.memory.ptr_read | High | A | High |
| rs.memory.transmute | High | A | High |
| rs.quality.unsafe_block | Medium | A | High |
| rs.quality.unsafe_fn | Medium | A | High |
| rs.memory.mem_forget | Low | A | High |
| rs.memory.narrow_cast | Low | A | Medium |
| rs.quality.expect | Low | A | High |
| rs.quality.panic_macro | Low | A | High |
| rs.quality.todo | Low | A | High |
| rs.quality.unwrap | Low | A | High |

TypeScript: 22 patterns

| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| ts.code_exec.eval | High | A | High |
| ts.code_exec.new_function | High | A | High |
| ts.config.cors_dynamic_origin | High | A | Medium |
| ts.code_exec.settimeout_string | Medium | A | High |
| ts.config.insecure_session_httponly | Medium | A | High |
| ts.config.reject_unauthorized | Medium | A | High |
| ts.config.verbose_error_response | Medium | A | Medium |
| ts.crypto.weak_hash_import | Medium | A | Medium |
| ts.prototype.proto_assignment | Medium | A | High |
| ts.secrets.fallback_secret | Medium | A | Medium |
| ts.xss.document_write | Medium | A | High |
| ts.xss.insert_adjacent_html | Medium | A | High |
| ts.xss.location_assign | Medium | A | High |
| ts.xss.outer_html | Medium | A | High |
| ts.config.insecure_session_samesite | Low | A | High |
| ts.config.insecure_session_secure | Low | A | Medium |
| ts.crypto.math_random | Low | A | Medium |
| ts.crypto.weak_hash | Low | A | Medium |
| ts.quality.any_annotation | Low | A | Medium |
| ts.quality.as_any | Low | A | Medium |
| ts.secrets.hardcoded_secret | Low | A | Medium |
| ts.xss.cookie_write | Low | A | Medium |

Capability list for custom rules

nyx config add-rule --cap <name> and [analysis.languages.*.rules] in config accept:

env_var, html_escape, shell_escape, url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, ssrf, code_exec, crypto, unauthorized_id, all

Source for both the enum and the to_cap mapping: src/labels/mod.rs (Cap) and src/utils/config.rs (CapName).

Auth analysis

Auth analysis covers Rust today. Other languages have rule scaffolding in src/auth_analysis/config.rs (Python, Ruby, Go, Java, JavaScript, TypeScript), but only Rust has benchmark corpus coverage and the precision work to back it. Treat findings on other languages as preview; the rule prefix (py.auth.*, js.auth.*, rb.auth.*, go.auth.*, java.auth.*) is reserved but the matchers haven’t been validated against real codebases yet.

What it catches

The Rust rule is rs.auth.missing_ownership_check. It fires when a request handler reaches a privileged operation that takes a scoped identifier (*_id, row reference, scoped resource) without a preceding ownership or membership check.

Concretely, it looks for five patterns of authorization in the function body and flags the call when none are present:

  • A call to a recognised authorization helper. Defaults: check_ownership, has_ownership, require_ownership, ensure_ownership, is_owner, authorize, verify_access, has_permission, can_access, can_manage, plus *_membership and require_{group,org,workspace,tenant,team}_member variants. Extend in [analysis.languages.rust].
  • An ownership-equality check on a row reference: if owner_id != user.id { return 403 } or any field_id != self_actor shape. The check writes AuthCheck evidence back to the row-fetch arguments via AnalysisUnit.row_field_vars.
  • A self-actor reference: let user = require_auth(...).await? followed by use of user.id, user.user_id, user.uid. The actor is recognised from typed extractor params (Extension<Session>, CurrentUser, etc.) and from typed helper bindings.
  • A SQL query that joins through an ACL table or filters by user_id predicate. Detected without a SQL parser via sql_semantics.rs; the authorized result variable propagates through let row = ...prepare(LIT)..., for row in result, let id = row.get(...).
  • A helper-summary lift: handler calls validate_target(db, widget_id, user.id) whose body contains a require_*_member call. Cross-function summaries are merged at fixed-point (capped at 4 iterations).
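
The first pattern, helper-name recognition, can be sketched as a plain name matcher. This is a rough illustration, not the code in src/auth_analysis/: is_auth_helper and its extra parameter are hypothetical, and reading *_membership as a suffix match is an assumption.

```rust
/// Default helper names listed in the first bullet above.
const EXACT: [&str; 10] = [
    "check_ownership", "has_ownership", "require_ownership", "ensure_ownership",
    "is_owner", "authorize", "verify_access", "has_permission", "can_access", "can_manage",
];

/// Scopes used by the require_{...}_member variants.
const SCOPES: [&str; 5] = ["group", "org", "workspace", "tenant", "team"];

/// Hypothetical matcher: does `name` look like an authorization helper?
/// `extra` stands in for project-specific helpers added through config.
pub fn is_auth_helper(name: &str, extra: &[&str]) -> bool {
    EXACT.contains(&name)
        || extra.contains(&name)
        // Assumption: "*_membership" is treated as a suffix pattern.
        || name.ends_with("_membership")
        || SCOPES.iter().any(|s| name == format!("require_{s}_member"))
}
```

A handler body containing a call for which a matcher like this returns true would count as carrying authorization evidence; the other four patterns need dataflow and are not covered by this sketch.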

Sink classification

The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag:

| Class | Examples | Default treatment |
|---|---|---|
| InMemoryLocal | map.insert, set.insert, vec.push on tracked local | Never a sink |
| RealtimePublish | realtime.publish_to_group, pubsub.send | Sink unless ownership is established for the channel scope |
| OutboundNetwork | http.post, reqwest::Client::post | Sink unless a sanitiser is on the path |
| CacheCrossTenant | redis.set, memcached.set with scoped keys | Sink unless tenant is checked |
| DbMutation | db.insert, repo.save with scoped IDs | Sink unless ownership is established |
| DbCrossTenantRead | db.query returning rows from a tenant scope | Sink unless ACL-join or tenant predicate is present |

Receiver type drives the classification when SSA type facts are available, so client.send(...) correctly resolves through the receiver’s inferred type.

What it can’t catch

  • Non-Rust frameworks, in practice. Scaffolding exists; coverage doesn’t.
  • Type-system authorization. A typestate pattern that makes unauthenticated handlers fail to compile (fn endpoint(user: AuthenticatedUser<Admin>)) is invisible. This is mostly fine because the type system already enforced the check, but the rule won’t credit it.
  • Authorization performed only via macros that the AST doesn’t expose as a recognisable call.
  • Cross-async-boundary actor binding. If the handler binds let user = require_auth(...).await? and then uses user.id inside a tokio::spawn task, the spawn body is treated as a separate scope.

The taint-based variant

A second rule, rs.auth.missing_ownership_check.taint, folds the same logic into the SSA/taint engine using the Cap::UNAUTHORIZED_ID capability (bit 12). Request-bound handler parameters seed UNAUTHORIZED_ID into taint state; ownership checks act as sanitizers that strip the cap; sinks that take scoped IDs require it absent.
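
The seed / strip / require-absent cycle is ordinary bit-set arithmetic. The sketch below assumes a 32-bit mask; only the bit position (12) comes from the text above, and Caps plus the three helper functions are hypothetical names rather than the engine's API.

```rust
/// Capability bit set. Only the position of UNAUTHORIZED_ID (bit 12) is taken
/// from the text; the type and method names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Caps(u32);

impl Caps {
    pub const NONE: Caps = Caps(0);
    pub const UNAUTHORIZED_ID: Caps = Caps(1 << 12);

    pub fn contains(self, other: Caps) -> bool {
        (self.0 & other.0) == other.0
    }
    pub fn with(self, other: Caps) -> Caps {
        Caps(self.0 | other.0)
    }
    pub fn without(self, other: Caps) -> Caps {
        Caps(self.0 & !other.0)
    }
}

/// A request-bound handler parameter seeds the capability.
pub fn seed_request_param() -> Caps {
    Caps::NONE.with(Caps::UNAUTHORIZED_ID)
}

/// An ownership check acts as a sanitizer and strips it.
pub fn after_ownership_check(value: Caps) -> Caps {
    value.without(Caps::UNAUTHORIZED_ID)
}

/// A sink that takes a scoped ID fires while the capability is still set.
pub fn sink_fires(value: Caps) -> bool {
    value.contains(Caps::UNAUTHORIZED_ID)
}
```

With this model, a widget_id taken from the request fires at db.insert until some ownership check on the path has cleared bit 12.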

This path is off by default while the standalone analyser carries the stable signal. To run both analysers, enable the taint variant:

[scanner]
enable_auth_as_taint = true

Run them together; if both fire for the same site, treat it as the same finding (the taint variant carries fuller flow evidence).

Tuning

Add a project-specific authorization helper

[[analysis.languages.rust.rules]]
matchers = ["require_subscription", "ensure_paid_seat"]
kind     = "sanitizer"
cap      = "unauthorized_id"

The same rule recognised in the standalone analyser also strips Cap::UNAUTHORIZED_ID for the taint-based variant.

Recognised actor names

Recognised by default: user.id, user.user_id, user.uid, session.user_id, current_user.id, plus typed extractor parameters with CurrentUser, SessionUser, AuthUser, Extension<...> shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic is in src/auth_analysis/checks.rs under extract_validation_target and friends.

Suppress

Inline:

db.insert(widget_id, value)?;  // nyx:ignore rs.auth.missing_ownership_check

Or filter by severity / confidence in CI:

nyx scan . --severity ">=MEDIUM" --min-confidence medium

In the UI

Auth findings render alongside taint findings in the browser UI. The flow visualiser shows the sink call, the actor reference (when one was found), and any helper-summary path the engine traversed; the How to fix panel mirrors the rule’s recommendation.

Nyx finding detail: numbered source → call → sink walk with a How to fix panel and an inline evidence object

Where the work was done

The remediation work is documented release-by-release in tests/benchmark/RESULTS.md under the Rust auth row. Phases A1 through B5 (precision and structural improvements) and Phase C (taint-based variant) all landed on the 0.5.0 release branch. The benchmark corpus at tests/benchmark/corpus/rust/auth/ is 10 fixtures covering the five FP patterns plus a true-positive control.

How Nyx works

If you’re going to act on a finding, it helps to know how the scanner got there. This page is the short version. Source paths are linked where the answer to “exactly what does it do” lives in the code.

The pipeline

A scan runs in two passes over the file tree, with an optional SQLite index that lets later scans skip files whose content hash hasn’t changed.

Pass 1, per file. Tree-sitter parses the file. Nyx builds an intra-procedural control-flow graph, lowers it to SSA, and extracts a summary per function describing what that function does at the boundary: which arguments flow to sinks, which sources it reads from, which sinks it calls, what taint it strips, what it returns. Summaries are persisted to SQLite (src/summary/, src/database.rs).

Summary merge. All per-file summaries get unioned into a global map keyed by qualified function name.

Pass 2, per file. Each file is reanalysed with the global summaries available. The taint engine runs a forward dataflow worklist over the SSA representation. When it hits a call, it consults summaries to decide whether the call propagates taint, sanitizes it, or terminates the flow. Findings are produced when tainted data reaches a sink whose required capability is still set on the value.

Two extra layers tune precision around calls. Context-sensitive inlining (k=1) re-runs intra-file callees with the actual argument taint at the call site, so a helper called once with tainted input and once with sanitized input produces the right result for each call. SCC fixed-point: when a group of mutually-recursive functions forms a strongly-connected component in the call graph, the engine iterates summaries to a joint fixed-point (capped at 64 iterations). SCCs that span files are also handled.
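
To make the capped fixed-point concrete, here is a toy version in which a summary is a single boolean ("returns tainted data"). Real summaries are far richer; solve_scc and its inputs are hypothetical, and only the cap of 64 comes from the text.

```rust
/// Iteration cap stated above for SCC fixed-point.
pub const MAX_SCC_ITERATIONS: usize = 64;

/// Toy per-function summary: does the function return tainted data?
/// `reads_source[i]`: function i reads a source directly.
/// `callees[i]`: functions whose return value function i passes through.
/// Both slices must have the same length. Returns the summaries and
/// whether the iteration converged before hitting the cap.
pub fn solve_scc(reads_source: &[bool], callees: &[Vec<usize>]) -> (Vec<bool>, bool) {
    let n = reads_source.len();
    let mut returns_taint = vec![false; n];
    for _ in 0..MAX_SCC_ITERATIONS {
        let mut changed = false;
        for i in 0..n {
            let new = reads_source[i] || callees[i].iter().any(|&c| returns_taint[c]);
            if new != returns_taint[i] {
                returns_taint[i] = new;
                changed = true;
            }
        }
        if !changed {
            return (returns_taint, true); // joint fixed-point reached
        }
    }
    (returns_taint, false) // cap hit: best summary so far
}
```

For two mutually recursive functions where only the second reads a source, `solve_scc(&[false, true], &[vec![1], vec![0]])` converges with both marked as returning taint. The second tuple element plays the role of the convergence signal that the engine records as an engine note.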

When a method call has a receiver typed as a super-class, trait, or interface, hierarchy fan-out widens the resolved callee set to every concrete implementer the engine has seen. A class diagram extracted in pass 1 (Java extends/implements, Rust impl-for, TS/JS extends, Python bases, Ruby includes, PHP extends/implements, C++ inheritance) feeds an index that the call resolver consults during pass 2. The fan-out is capped at 8 implementers per call site; over-fanning is a precision tax, not a soundness issue.

A separate field-sensitive points-to pass tracks abstract locations down to the field level, so c.mu.Lock() is a lock on Field(c, mu) rather than on c as a whole. That distinction is what lets the resource-lifecycle and taint passes tell obj.field = tainted; sink(obj.other_field) apart from the conservative whole-variable approximation. Subscript reads and writes (arr[i], map[k] = v) lower to synthetic __index_get__ / __index_set__ calls so the same container model handles them. Set NYX_POINTER_ANALYSIS=0 to fall back to the pre-pointer-pass behaviour for one release if you need to compare baselines.

Optional analyses on top

These run on top of the forward taint pass. They’re independently switchable via [analysis.engine] config or matching CLI flags. See advanced-analysis.md for the full description and tradeoffs.

| Pass | Purpose | Default |
|---|---|---|
| Abstract interpretation | Carries interval and string prefix/suffix bounds alongside taint. Suppresses findings on proven-bounded integers and locked-prefix URLs | on |
| Context sensitivity | k=1 inlining for intra-file callees | on |
| Field-sensitive points-to | Distinguishes obj.field from obj itself, so a tainted write to one field does not poison reads from another. Also gives the resource-lifecycle pass per-field locks | on |
| Hierarchy fan-out | When a method call’s receiver is typed as a super-class, trait, or interface, widens callee resolution to every concrete implementer the engine has seen | on |
| Constraint solving | Drops paths whose accumulated branch predicates are unsatisfiable. Optional Z3 backend with --features smt | on |
| Symbolic execution | Builds an expression tree per tainted value. Produces a witness string at the sink. Detects sanitization patterns the taint engine alone would miss | on |
| Backwards analysis | After the forward pass, walks backwards from each sink to confirm or invalidate the flow. Annotates findings as backwards-confirmed, backwards-infeasible, or backwards-budget-exhausted | off |

--engine-profile fast | balanced | deep flips groups of these at once. balanced is the default and the configuration the benchmark numbers in language-maturity.md are measured against.

Where bounds live

Static analysis at scale means choosing where to stop. Nyx exposes its bounds rather than hiding them:

  • Inline depth is k=1. Callees larger than the inline body-size cap fall back to summary-based resolution.
  • SCC fixed-point is capped at 64 iterations. If a recursive cluster doesn’t converge, the engine emits the best summary it has and records an engine_note on affected findings.
  • Lattice width is bounded. Taint origin sets cap at 32 entries per SSA value (--max-origins); points-to sets cap at 32 heap objects (--max-pointsto). Truncation is recorded as OriginsTruncated / PointsToTruncated so you can see when precision was lost.
  • Symbolic expressions cap at depth 32. Deeper expressions degrade to Unknown rather than growing without bound.

Findings whose engine notes indicate a bound was hit can be filtered with --require-converged for strict CI gates. The flag drops findings whose notes mark an over-report or a bail; findings whose notes mark an under-report (the emitted finding is still real, but the result set is a lower bound) are kept.
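
A bounded lattice element with a truncation marker can be sketched as follows. The cap of 32 is the documented default for --max-origins; OriginSet, its fields, and the choice to drop the newest origin on overflow are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Default cap from the text above (--max-origins).
pub const MAX_ORIGINS: usize = 32;

/// Hypothetical origin set: (line, column) source positions plus a truncation flag.
#[derive(Debug, Default, Clone)]
pub struct OriginSet {
    origins: BTreeSet<(u32, u32)>,
    /// Set once an origin had to be dropped; stands in for an OriginsTruncated note.
    pub truncated: bool,
}

impl OriginSet {
    /// Add one origin. At the cap, new origins are dropped and the loss is flagged.
    pub fn insert(&mut self, origin: (u32, u32)) {
        if self.origins.contains(&origin) {
            return;
        }
        if self.origins.len() >= MAX_ORIGINS {
            self.truncated = true;
        } else {
            self.origins.insert(origin);
        }
    }

    /// Join at a merge point: bounded union that carries the flag forward.
    pub fn join(&mut self, other: &OriginSet) {
        for &o in &other.origins {
            self.insert(o);
        }
        self.truncated |= other.truncated;
    }

    pub fn len(&self) -> usize {
        self.origins.len()
    }
}
```

The point of the flag is that precision loss stays visible: a finding still reports up to 32 origins, and the truncation marker tells you the list is incomplete.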

What you get out

Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.

For the JSON shape and SARIF mapping, see output.md.

Advanced Analysis

Nyx layers several analysis passes on top of the core SSA taint engine. Most are switchable via config ([analysis.engine] in nyx.conf / nyx.local), a matching CLI flag pair, or, as a last-resort override for library users with no CLI entry point, a NYX_* environment variable. The five precision-tuning passes (abstract interpretation, context sensitivity, symbolic execution, constraint solving, field-sensitive points-to) are on by default because the benchmark numbers in language-maturity.md are measured with them on. The demand-driven backwards walk sits alongside them but is off by default and opt-in; hierarchy fan-out is on by default, and no switch for it is documented here.

See Configuration for the full config surface and CLI flag table. This page explains what each pass does, why it helps, how to disable it, and what it does not cover.


Abstract interpretation

What it does. Propagates interval and string abstract domains through the SSA worklist alongside taint. Integer values carry [lo, hi] bounds; string values carry a prefix and suffix (plus a bit domain for known-zero / known-one bits). Values are joined at merge points and widened at loop heads so the worklist always terminates.
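
A minimal sketch of the integer half of this domain, assuming i64::MIN and i64::MAX stand in for "unbounded". Interval and its methods are hypothetical names; the widening rule shown (drop any bound that moved) is one simple choice that guarantees termination, not necessarily the one in src/abstract_interp/.

```rust
/// Closed integer interval [lo, hi]; i64::MIN / i64::MAX stand in for "unbounded".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub lo: i64,
    pub hi: i64,
}

impl Interval {
    pub const TOP: Interval = Interval { lo: i64::MIN, hi: i64::MAX };

    /// Join at a control-flow merge: the smallest interval covering both inputs.
    pub fn join(self, other: Interval) -> Interval {
        Interval { lo: self.lo.min(other.lo), hi: self.hi.max(other.hi) }
    }

    /// Widen at a loop head: a bound that moved since the previous iteration
    /// is dropped to unbounded, so the sequence stabilises after one step.
    pub fn widen(self, next: Interval) -> Interval {
        Interval {
            lo: if next.lo < self.lo { i64::MIN } else { self.lo },
            hi: if next.hi > self.hi { i64::MAX } else { self.hi },
        }
    }

    /// True when both ends are finite, i.e. the value is proven bounded.
    pub fn is_bounded(self) -> bool {
        self.lo > i64::MIN && self.hi < i64::MAX
    }
}
```

For a loop counter that starts at [0, 0] and becomes [0, 1] after one iteration, widening keeps the lower bound and gives up on the upper one, so the value is no longer proven bounded.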

Why it helps. Lets Nyx suppress findings that are provably safe given the abstract value: a proven-bounded integer does not flow into a SQL sink as an injection risk, and an SSRF sink whose URL prefix is locked to a trusted host stays quiet. This turns a large class of FPs on numeric and locked-prefix paths into true negatives.

How to turn it off.

| Surface | Value |
|---|---|
| Config | abstract_interpretation = false under [analysis.engine] |
| CLI flag | --no-abstract-interp |
| Env var (legacy) | NYX_ABSTRACT_INTERP=0 |

Limitations. The interval domain is 64-bit signed; very wide or overflow-producing arithmetic degrades to an unbounded interval. String prefix / suffix tracking is concat-only; it does not model reordering, reversal, or character-level regex constraints. Loop widening deliberately drops changing bounds rather than chasing fixpoints.

Source: src/abstract_interp/.


Context-sensitive analysis

What it does. Adds k=1 call-site-sensitive taint propagation for intra-file callees. When a function is invoked, Nyx reanalyzes the callee body with the actual per-argument taint signature of the call site, producing call-site-specific return taint. Results are cached by (function_name, ArgTaintSig) so repeated calls with the same signature are free.

Why it helps. A helper called once with a tainted argument and once with a sanitized argument should produce two different results: a finding for the first call and none for the second. Without k=1 sensitivity, the conservative union of both call sites would be applied to the sanitized call, producing a spurious finding there.
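
The caching idea can be sketched with a map keyed by callee name and per-argument taint bits. InlineCacheSketch, ArgSig, and identity_body are hypothetical stand-ins for illustration; they are not the InlineCache or ArgTaintSig types in the engine.

```rust
use std::collections::HashMap;

/// Per-argument capability bits at a call site (a stand-in for ArgTaintSig).
pub type ArgSig = Vec<u32>;

/// Hypothetical cache of callee results keyed by (function name, argument signature).
#[derive(Default)]
pub struct InlineCacheSketch {
    results: HashMap<(String, ArgSig), u32>,
    /// Counts how often a callee body was actually re-analysed.
    pub analyses_run: usize,
}

impl InlineCacheSketch {
    /// Return taint for `callee` under `sig`, re-analysing only on a cache miss.
    pub fn return_taint(
        &mut self,
        callee: &str,
        sig: &ArgSig,
        analyse: impl Fn(&ArgSig) -> u32,
    ) -> u32 {
        let key = (callee.to_string(), sig.clone());
        if let Some(&hit) = self.results.get(&key) {
            return hit;
        }
        self.analyses_run += 1;
        let out = analyse(sig);
        self.results.insert(key, out);
        out
    }
}

/// Toy callee body: an identity helper returns whatever taint its first argument has.
pub fn identity_body(sig: &ArgSig) -> u32 {
    sig.first().copied().unwrap_or(0)
}
```

Calling return_taint for the same helper with a tainted signature and then a clean one yields two different answers from two analyses; a third call that repeats the tainted signature is served from the cache.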

How to turn it off.

| Surface | Value |
|---|---|
| Config | context_sensitive = false under [analysis.engine] |
| CLI flag | --no-context-sensitive |
| Env var (legacy) | NYX_CONTEXT_SENSITIVE=0 |

Limitations. Intra-file only. Cross-file callees are resolved via summaries (see src/summary/) rather than re-inlined. Depth is capped at k=1 to prevent cache blow-up and re-entrancy; higher k would require a different cache key design. Callee bodies larger than the internal MAX_INLINE_BLOCKS threshold fall back to the summary path. Cache keys hash per-argument Cap bits but not source-origin identity, so two callers with identical caps but different origins share cached origin-attribution.

Source: src/taint/ssa_transfer.rs (ArgTaintSig, InlineCache, inline_analyse_callee).


Field-sensitive points-to

What it does. Runs a Steensgaard-style alias analysis that interns field accesses as their own abstract locations. c.mu becomes Field(c, mu), distinct from c itself; a write to obj.cache and a read from obj.cache in different methods both land on the same abstract location; subscript reads and writes (arr[i], map[k] = v) lower to synthetic __index_get__ / __index_set__ calls so the engine can model them through the same container store/load primitives used for STL containers, Python lists, JS arrays, and similar.

Why it helps. It removes a class of false positives that the whole-variable taint model produced. Before this pass, obj.field = tainted; sink(obj.other_field) would taint obj as a whole and fire on the safe field. The same field-level distinction lets the resource-lifecycle pass attribute a c.mu.Lock() to the lock field rather than to its container. Cross-method field flow (writer in one method, reader in another) shows up only when fields have stable identity independent of the parent value.
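
The difference between the two models fits in a few lines. In this sketch Loc mirrors the Field(c, mu) notation used above; TaintStore and its methods are hypothetical and ignore aliasing entirely.

```rust
use std::collections::HashSet;

/// Abstract location: a whole variable, or one field of it (cf. Field(c, mu)).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Loc {
    Var(String),
    Field(String, String),
}

/// Convenience constructor for a field location.
pub fn field(base: &str, name: &str) -> Loc {
    Loc::Field(base.to_string(), name.to_string())
}

/// Hypothetical taint store keyed by abstract location.
#[derive(Default)]
pub struct TaintStore {
    tainted: HashSet<Loc>,
}

impl TaintStore {
    pub fn write_tainted(&mut self, loc: Loc) {
        self.tainted.insert(loc);
    }

    /// Field-sensitive read: only the exact location counts.
    pub fn is_tainted(&self, loc: &Loc) -> bool {
        self.tainted.contains(loc)
    }

    /// Whole-variable approximation: any tainted field taints every read of the base.
    pub fn is_tainted_whole_var(&self, base: &str) -> bool {
        self.tainted.iter().any(|l| match l {
            Loc::Var(v) | Loc::Field(v, _) => v.as_str() == base,
        })
    }
}
```

After a tainted write to field("obj", "field"), the field-sensitive query on field("obj", "other_field") stays clean, while the whole-variable query on "obj" reports taint. That gap is the false positive this pass is meant to remove.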

How to turn it off.

| Surface | Value |
|---|---|
| Env var | NYX_POINTER_ANALYSIS=0 |

The pass is on by default as of 2026-04-26. The env-var override is kept for one release so you can compare against the pre-pointer baseline, then will be removed.

Limitations. This is not a general escape analysis. Function pointers and arbitrary indirect calls still resolve to no callee, and deep alias chains through *p / p->field in C/C++ are not tracked beyond the direct field case. The points-to set per value is capped at --max-pointsto (default 32); when truncation happens, an engine note records the precision loss.

Source: src/pointer/.


Hierarchy fan-out for virtual dispatch

What it does. Builds a per-language type-hierarchy index in pass 1 (extends, implements, impl-for, includes; the exact construct depends on the language) and uses it in pass 2 to widen method-call resolution. When a call’s receiver is statically typed as a super-class, trait, or interface, the resolver returns every concrete implementer it has seen in the codebase rather than just the first match.

Why it helps. Without it, a call like repository.findById(id) where repository is typed as the interface gets resolved against whatever the single-result resolver finds first; if the matching implementer is in another file the call effectively goes opaque. With the hierarchy, the taint engine sees the union of every implementer’s transform and the flow shows up regardless of which file holds the concrete class.

Limitations. Fan-out is capped at 8 implementers per call site; over that, the tail is silently dropped (a debug log records the cap hit) and the call is treated as a non-deterministic union of the kept implementers. Languages that use structural / implicit interface satisfaction (Go) are deliberately skipped because per-file extraction is intractable; those calls fall back to the single-result resolver. The extractor covers Java, Rust, TS/JS/TSX, Python, Ruby, PHP, and C++.
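
A rough sketch of capped fan-out, assuming implementers are kept in discovery order. HierarchyIndex here is a hypothetical simplification and is not the TypeHierarchyIndex in src/summary/mod.rs; only the cap of 8 comes from the text.

```rust
use std::collections::HashMap;

/// Cap from the text above: at most 8 implementers per call site.
pub const MAX_FANOUT: usize = 8;

/// Hypothetical hierarchy index: super-type name -> concrete implementers seen in pass 1.
#[derive(Default)]
pub struct HierarchyIndex {
    implementers: HashMap<String, Vec<String>>,
}

impl HierarchyIndex {
    /// Record that `concrete` extends or implements `super_type`.
    pub fn record(&mut self, super_type: &str, concrete: &str) {
        self.implementers
            .entry(super_type.to_string())
            .or_default()
            .push(concrete.to_string());
    }

    /// Widened resolution: one callee per known implementer, truncated at the cap.
    /// Returns the kept callees and whether the tail was dropped.
    pub fn resolve(&self, receiver_type: &str, method: &str) -> (Vec<String>, bool) {
        let all = self
            .implementers
            .get(receiver_type)
            .map(|v| v.as_slice())
            .unwrap_or(&[]);
        let kept = all
            .iter()
            .take(MAX_FANOUT)
            .map(|c| format!("{c}::{method}"))
            .collect();
        (kept, all.len() > MAX_FANOUT)
    }
}
```

With ten recorded implementers of Repository, resolve("Repository", "findById") returns eight callees and reports that the tail was dropped, which corresponds to the debug-logged cap hit described above.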

Source: src/cfg/hierarchy.rs and src/summary/mod.rs (TypeHierarchyIndex, resolve_callee_widened).


Symbolic execution

What it does. Builds a symbolic expression tree per tainted SSA value, generates a witness string for each taint finding (the concrete-looking shape of the dangerous value at the sink), and detects sanitization patterns that the taint engine alone would miss. Supports string operations (trim, replace, toLower, substring, strlen, …), arithmetic, concatenation, phi nodes, and opaque calls.

Why it helps. Raises finding quality. A taint finding with a rendered witness like "SELECT * FROM t WHERE id=" + userInput is substantially easier to triage than one without. Also powers some confidence-gating for downstream display.

How to turn it off.

| Surface | Value |
|---|---|
| Config | symex.enabled = false under [analysis.engine] |
| CLI flag | --no-symex |
| Env var (legacy) | NYX_SYMEX=0 |

Two nested switches refine the scope without disabling symex entirely:

| Setting | CLI | Env | Default | Effect |
|---|---|---|---|---|
| symex.cross_file | --no-cross-file-symex | NYX_CROSS_FILE_SYMEX=0 | on | Consult cross-file SSA bodies so symex can reason about callees defined in other files |
| symex.interprocedural | --no-symex-interproc | NYX_SYMEX_INTERPROC=0 | on | Intra-file interprocedural symex (k ≥ 2 via frame stack) |

Limitations. Expression trees are bounded at MAX_EXPR_DEPTH=32; deeper expressions degrade to Unknown rather than growing unboundedly. Sanitizer detection is informational: string-replace sanitizer patterns are reported as witness metadata, not used to clear taint.

Source: src/symex/.


Demand-driven analysis

What it does. After the forward pass-2 taint analysis finishes, runs a backwards walk from each sink’s tainted SSA operands. The walk follows reverse SSA-edge transfer (phi fan-out, Assign operand-fanout, Call body-expansion or arg-fanout) until it reaches a taint source, proves the flow infeasible via an accumulated path predicate, or exhausts its budget. Each forward finding is then annotated with the aggregate verdict:

  • backwards-confirmed: a matching source was reached. Finding picks up a small confidence boost and the note appears in evidence.symbolic.cutoff_notes.
  • backwards-infeasible: every walk proved the flow unreachable. Finding is capped to Low confidence and a user-readable limiter is attached.
  • backwards-budget-exhausted: the walk hit BACKWARDS_VALUE_BUDGET without a verdict. Recorded as a limiter so operators can see when the pass could not keep up.
  • Inconclusive outcomes are a no-op: the forward finding is untouched.

Because the backwards walk can consult GlobalSummaries.bodies_by_key (populated by the cross-file callee body persistence layer) it closes across file boundaries; when a callee body is not loadable the walk falls back to fanning out over the call’s arguments so local reach-back is still possible.

Why it helps. Inverts the analysis direction so budget follows the question the scanner actually cares about (“does any source reach this sink?”) instead of proving every potential source-to-sink path. Corroborated findings are a stronger signal than forward-only ones, and proven-infeasible flows provide a principled way to lower confidence on forward false positives without silently dropping them.

How to turn it on. Defaults off so the benchmark floor is preserved while the pass stabilises.

| Surface | Value |
|---|---|
| Config | backwards_analysis = true under [analysis.engine] |
| CLI flag | --backwards-analysis / --no-backwards-analysis |
| Env var (legacy) | NYX_BACKWARDS=1 |

Limitations (first cut). Reverse call-graph expansion past a ReachedParam is deferred; the walk terminates at function parameters rather than crossing back into callers. Path-constraint pruning is conservative: only the accumulated PredicateSummary bits are consulted, not the full symbolic predicate stack. Depth-bounded at k=2 for cross-function body expansion. See DEFAULT_BACKWARDS_DEPTH, BACKWARDS_VALUE_BUDGET, and MAX_BACKWARDS_CALLEE_BLOCKS in src/taint/backwards.rs for the exact bounds.

Source: src/taint/backwards.rs.


Constraint solving

What it does. Collects path constraints at each branch in SSA and propagates them alongside taint. Prunes paths whose accumulated constraint set is unsatisfiable: for example, a taint flow guarded by if x < 0 && x > 10 is dropped rather than surfaced. Optionally delegates the satisfiability check to Z3 when Nyx is built with the smt Cargo feature.

Why it helps. Removes a class of FPs rooted in clearly-infeasible control-flow combinations. Without path constraints, a taint flow that only occurs when mutually-exclusive branches are simultaneously taken can still produce a finding.
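
One lightweight way to catch contradictions such as x < 0 && x > 10 without an SMT solver is to intersect the bounds each predicate implies. The sketch below handles a single integer variable only; Pred and satisfiable are hypothetical, and this interval check is a stand-in for illustration rather than the domain implemented in src/constraint/.

```rust
/// A single comparison of one integer variable `x` against a constant.
#[derive(Debug, Clone, Copy)]
pub enum Pred {
    Lt(i64),
    Le(i64),
    Gt(i64),
    Ge(i64),
    Eq(i64),
}

/// Conjoin the predicates on a path by narrowing [lo, hi].
/// An empty interval means no value of `x` can take this path.
pub fn satisfiable(path: &[Pred]) -> bool {
    let (mut lo, mut hi) = (i64::MIN, i64::MAX);
    for p in path {
        match *p {
            // x < c is x <= c - 1; x < i64::MIN has no solution.
            Pred::Lt(c) => match c.checked_sub(1) {
                Some(m) => hi = hi.min(m),
                None => return false,
            },
            Pred::Le(c) => hi = hi.min(c),
            // x > c is x >= c + 1; x > i64::MAX has no solution.
            Pred::Gt(c) => match c.checked_add(1) {
                Some(m) => lo = lo.max(m),
                None => return false,
            },
            Pred::Ge(c) => lo = lo.max(c),
            Pred::Eq(c) => {
                lo = lo.max(c);
                hi = hi.min(c);
            }
        }
    }
    lo <= hi
}
```

Here `satisfiable(&[Pred::Lt(0), Pred::Gt(10)])` is false, so a flow that requires both branches would be pruned. Constraints that relate several variables are beyond this sketch and are the kind of reasoning the Z3 backend is described as covering.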

How to turn it off.

| Surface | Value |
|---|---|
| Config | constraint_solving = false under [analysis.engine] |
| CLI flag | --no-constraint-solving |
| Env var (legacy) | NYX_CONSTRAINT=0 |

The SMT backend is a separate switch:

| Setting | CLI | Env | Default | Effect |
|---|---|---|---|---|
| symex.smt | --no-smt | NYX_SMT=0 | on when built with smt feature | Delegate satisfiability checks to Z3; ignored if Nyx was built without smt |

Limitations. The default path-constraint domain is syntactic; trivially-inconsistent pairs are caught without an SMT solver, but richer algebraic unsatisfiability requires the smt feature (Z3). Without smt, Nyx ships a lightweight satisfiability check that catches literal contradictions but not deeper reasoning.

Source: src/constraint/.


Combining the switches

The defaults (the five precision passes and hierarchy fan-out on, backwards analysis off) are the configuration Nyx is benchmarked against. Turning any switch off trades precision for speed and may move findings relative to the published baseline; CI regression gates assume defaults. If you need a minimal-overhead scan (for very large repositories or a pre-commit fast path), the AST-only scan mode (--mode ast) skips CFG, taint, and all of the advanced passes entirely and is the right tool.

Detectors

Nyx ships four independent detector families. They run together in --mode full, the default. Findings are merged, deduplicated, ranked, and printed in one result set.

| Family | Rule prefix | Looks at | What it finds |
|---|---|---|---|
| Taint analysis | taint-* | Cross-file dataflow | Unsanitized data flowing source to sink |
| CFG structural | cfg-* | Per-function control flow | Auth gaps, unguarded sinks, error fallthrough, resource release on all paths |
| State model | state-* | Per-function state lattice | Use-after-close, double-close, leaks, unauthenticated access |
| AST patterns | <lang>.<cat>.<name> | Tree-sitter structural match | Banned APIs, weak crypto, dangerous constructs |

For Rust auth-specific rules (rs.auth.*), see auth.md.

How they combine

In --mode full:

  1. Taint and AST can both fire on one line. If eval(userInput) triggers both js.code_exec.eval (AST) and taint-unsanitised-flow (taint), both are kept with distinct rule IDs. The taint finding ranks higher because of the analysis-kind bonus.
  2. State supersedes CFG on resource leaks. When state-resource-leak and cfg-resource-leak fire at the same location, the CFG one is dropped.
  3. Exact duplicates are removed. Same line, column, rule ID, severity → one finding.
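
The three merge rules can be sketched as follows. This is an illustration under assumptions, not Nyx's merge code: the `Finding` tuple and `merge` function are hypothetical, and including the file path in the duplicate key is an assumption on top of the fields listed in rule 3.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    path: str
    line: int
    col: int
    rule_id: str
    severity: str


def merge(findings):
    # Rule 3: exact duplicates collapse to one finding (first-seen order kept).
    unique = list(dict.fromkeys(findings))
    # Rule 2: a state leak suppresses the CFG leak at the same location.
    state_leaks = {(f.path, f.line, f.col) for f in unique
                   if f.rule_id == "state-resource-leak"}
    # Rule 1 needs no code: taint and AST findings have distinct rule IDs,
    # so neither is treated as a duplicate of the other.
    return [f for f in unique
            if not (f.rule_id == "cfg-resource-leak"
                    and (f.path, f.line, f.col) in state_leaks)]


findings = [
    Finding("a.js", 5, 5, "taint-unsanitised-flow", "High"),
    Finding("a.js", 5, 5, "js.code_exec.eval", "High"),
    Finding("a.js", 5, 5, "js.code_exec.eval", "High"),   # exact duplicate
    Finding("b.c", 9, 3, "cfg-resource-leak", "Medium"),
    Finding("b.c", 9, 3, "state-resource-leak", "Medium"),
]
print([f.rule_id for f in merge(findings)])
# ['taint-unsanitised-flow', 'js.code_exec.eval', 'state-resource-leak']
```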

Modes

| Mode | Active detectors |
| --- | --- |
| full (default) | All four |
| ast | AST patterns only |
| cfg | Taint + CFG + State (no AST patterns) |
| taint | Taint + State |

Attack-surface ranking

Every finding gets a deterministic score. Findings are sorted by descending score by default. Disable with --no-rank or output.attack_surface_ranking = false.

score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty

| Component | Values |
| --- | --- |
| Severity base | High=60, Medium=30, Low=10 |
| Analysis kind | taint=+10, state=+8, cfg with evidence=+5, cfg without evidence=+3, ast=+0 |
| Evidence strength | +1 per evidence item up to 4; +2 to +6 for source kind |
| State bonus | use-after-close / unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 |
| Validation penalty | 5, subtracted if the finding is path-validated |

Source-kind contributions (taint only):

| Source | Bonus |
| --- | --- |
| User input (req.body, argv, stdin, form, query, params) | +6 |
| Environment (env::var, getenv, process.env) | +5 |
| Unknown | +4 |
| File system | +3 |
| Database | +2 |

Approximate score ranges:

| Finding type | Score |
| --- | --- |
| High taint with user input | 76 to 81 |
| High state (use-after-close) | ~74 |
| High CFG structural | 63 to 68 |
| Medium taint with env source | 45 to 50 |
| Medium state (resource leak) | ~40 |
| Low AST-only pattern | ~10 |
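
As a worked example, the formula and component tables above can be transcribed into a few lines of Python. This is a sketch only: the dictionary keys and the `score` signature are hypothetical names, and the way evidence items are counted is an assumption, so it reproduces the lower ends of the ranges rather than every published value.

```python
SEVERITY_BASE = {"High": 60, "Medium": 30, "Low": 10}
ANALYSIS_KIND = {"taint": 10, "state": 8, "cfg_evidence": 5, "cfg": 3, "ast": 0}
SOURCE_KIND = {"user_input": 6, "environment": 5, "unknown": 4,
               "file_system": 3, "database": 2}
STATE_BONUS = {"use-after-close": 6, "unauthed": 6, "double-close": 3,
               "must-leak": 2, "may-leak": 1}


def score(severity, kind, evidence_items=0, source_kind=None,
          state=None, path_validated=False):
    total = SEVERITY_BASE[severity] + ANALYSIS_KIND[kind]
    total += min(evidence_items, 4)            # +1 per evidence item, up to 4
    if kind == "taint" and source_kind is not None:
        total += SOURCE_KIND[source_kind]      # source-kind bonus is taint only
    if state is not None:
        total += STATE_BONUS[state]
    if path_validated:
        total -= 5                             # validation penalty
    return total


print(score("High", "taint", source_kind="user_input"))     # 76
print(score("High", "state", state="use-after-close"))      # 74
print(score("Medium", "taint", source_kind="environment"))  # 45
print(score("Medium", "state", state="must-leak"))          # 40
print(score("Low", "ast"))                                  # 10
```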

For the engine’s runtime model (passes, summaries, SCC fixed-point), see how-it-works.md.

AST patterns

AST patterns are tree-sitter queries that match dangerous structural shapes in source. No dataflow, no CFG. A match means the construct is present; it’s not proof the construct is exploitable.

Patterns run in --mode full (the default) and --mode ast. In --mode ast they're the only active detector; --mode cfg and --mode taint skip them.

Rule IDs

<lang>.<category>.<name>

Examples: js.code_exec.eval, py.deser.pickle_loads, c.memory.gets, java.sqli.execute_concat.

Full list: rules.md.

Tiers

| Tier | Meaning |
| --- | --- |
| A | Structural presence alone is high-signal. gets, eval, pickle.loads, mem::transmute |
| B | Pattern includes a tree-sitter heuristic guard. Example: java.sqli.execute_concat only fires when executeQuery receives a binary_expression (string concatenation), not a literal or a parameterized statement |

Categories

| Category | Examples |
| --- | --- |
| CommandExec | system, os.system, Runtime.exec, backticks |
| CodeExec | eval, Function, PHP assert("string"), class_eval, instance_eval |
| Deserialization | pickle.loads, yaml.load, Marshal.load, readObject, unserialize |
| SqlInjection | executeQuery/Query/execute with concatenated argument (Tier B) |
| PathTraversal | PHP include $var |
| Xss | document.write, outerHTML, insertAdjacentHTML, getWriter().print |
| Crypto | md5, sha1, Math.random, java.util.Random for security use |
| Secrets | hardcoded API keys (Go, JS, TS) |
| InsecureTransport | InsecureSkipVerify, fetch("http://...") |
| Reflection | Class.forName, Method.invoke, send, constantize |
| MemorySafety | transmute, unsafe, gets, strcpy, sprintf |
| Prototype | __proto__ assignment, Object.prototype.* |
| Config | CORS dynamic origin, rejectUnauthorized: false, insecure session settings |
| CodeQuality | unwrap, panic!, as any |

What patterns can’t tell you

  • Dataflow. eval("1+1") (safe) and eval(userInput) (dangerous) both match js.code_exec.eval. The taint detector is the one that distinguishes them.
  • Reachability. A pattern in dead code matches identically.
  • Semantics. strcpy(dst, src) always matches, regardless of buffer sizes.
  • Indirect calls. let e = eval; e(input) doesn’t match eval.
  • Aliased imports. from os import system as s; s(cmd) won’t match system.
  • Macro expansions. Tree-sitter parses the macro call site, not the expansion.
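
The indirect-call limitation is easy to reproduce with any name-based structural matcher. The sketch below uses Python's standard `ast` module rather than tree-sitter, and `find_eval_calls` is a hypothetical helper, so treat it as an analogy for how such patterns behave rather than as Nyx's matcher.

```python
import ast


def find_eval_calls(source: str):
    """Return line numbers of calls whose callee is literally named `eval`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"):
            hits.append(node.lineno)
    return sorted(hits)


# The sample is only parsed, never executed.
sample = '''\
eval("1+1")
eval(user_input)
e = eval
e(user_input)
'''
print(find_eval_calls(sample))  # [1, 2]
```

Lines 1 and 2 match identically even though only one is dangerous, and the aliased call on line 4 is missed entirely.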

Common false positives

| Scenario | Why | Mitigation |
| --- | --- | --- |
| eval("hardcoded literal") | Pattern matches structure | Run --mode cfg to drop AST patterns and rely on taint |
| unsafe block with sound justification | Every unsafe matches rs.quality.unsafe_block | Filter with --severity HIGH (the rule is Medium) or accept the noise |
| .unwrap() in tests | Acceptable in test code | Default non-prod severity downgrade reduces it |
| md5 for non-cryptographic checksums | Pattern can't see intent | Suppress with --severity ">=MEDIUM" or per-line nyx:ignore |
| SQL concat with trusted data (Tier B) | Heuristic can't verify the source | Taint is more precise; or convert to a parameterized query |

Confidence levels

Every AST pattern carries an explicit confidence:

| Confidence | Use |
| --- | --- |
| High | Inherently dangerous construct with no safe usage. gets, pickle.loads, eval with no guard |
| Medium | Likely issue, context may change the call. SQL concatenation (Tier B), unsafe blocks, exec |
| Low | Heuristic. Often appears in safe code. Weak crypto for checksums, unwrap outside tests, Math.random |

--min-confidence medium (or output.min_confidence = "medium") drops Low-confidence matches.

Tuning

nyx scan . --severity ">=MEDIUM"        # drop Low-tier patterns
nyx scan . --severity HIGH              # banned APIs and code-exec only
nyx scan . --mode cfg                   # drop AST patterns; keep taint + state + cfg

[scanner]
excluded_directories = ["node_modules", "vendor", "generated"]

Examples

Tier A, structural presence:

char buf[64];
gets(buf);                              // c.memory.gets

import pickle
data = pickle.loads(user_input)         # py.deser.pickle_loads

Tier B, heuristic guard:

// Fires: concatenated argument
stmt.executeQuery("SELECT * FROM users WHERE id=" + userId);  // java.sqli.execute_concat

// Does not fire: parameterized
stmt.executeQuery(preparedSql);

printf(user_input);                     // c.memory.printf_no_fmt: fires (variable as fmt)
printf("%s", user_input);               // does not fire (literal fmt)

CFG structural analysis

Nyx builds an intra-procedural control-flow graph per function and checks structural properties: whether sinks are guarded by sanitizers or validators, whether web handlers check authentication, whether resources are released on all exit paths, and whether error paths terminate before reaching dangerous code.

These detectors use dominator analysis. A guard dominates a sink when the guard must execute before the sink on every path from entry.
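
To make the dominance check concrete, here is a small sketch of the textbook iterative dominator computation on a hand-written CFG. It is not Nyx's implementation; the graph encoding (`{node: [successors]}`) and the node names are assumptions made for the example.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {node: [successors]}."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Guard on the only path to the sink.
guarded = {"entry": ["validate"], "validate": ["sink"], "sink": []}
# A second edge lets execution reach the sink without the guard.
bypass = {"entry": ["validate", "sink"], "validate": ["sink"], "sink": []}

print("validate" in dominators(guarded, "entry")["sink"])  # True
print("validate" in dominators(bypass, "entry")["sink"])   # False
```

In the second graph the guard exists but does not dominate the sink, which is the situation an unguarded-sink rule is meant to surface.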

Rule IDs

| Rule ID | Severity |
| --- | --- |
| cfg-unguarded-sink | High/Medium |
| cfg-auth-gap | High |
| cfg-unreachable-sink | Medium |
| cfg-unreachable-sanitizer | Low |
| cfg-unreachable-source | Low |
| cfg-error-fallthrough | High/Medium |
| cfg-resource-leak | Medium |
| cfg-lock-not-released | Medium |

What it detects

cfg-unguarded-sink: A sink call (system, eval, Command::new, db.execute, etc.) is reachable from function entry without passing through any guard or sanitizer that matches the sink’s capability.

cfg-auth-gap: A function identified as a web handler (by parameter naming conventions like req, res, ctx, request, language-dependent) reaches a privileged sink (shell execution, file I/O) without a preceding authentication call.

cfg-unreachable-*: Sinks, sanitizers, or sources in dead code. Usually signals a refactoring error that silently disabled security-relevant logic.

cfg-error-fallthrough: An error-handling branch (null check, error-return check) does not terminate. Execution falls through to a dangerous operation on the error path.

cfg-resource-leak, cfg-lock-not-released: A resource acquisition (File::open, fopen, socket, Lock) is not matched by a release on every exit path from the function.

What it can’t detect

  • Inter-procedural guards. Middleware-level auth, helper functions that internally call auth, and cleanup performed in a caller are invisible.
  • Dynamic dispatch. Virtual calls, function pointers, closures resolve to no specific callee.
  • Correctness of guards. The detector checks a guard dominates the sink. It cannot check the guard is correct. A no-op if true {} would suppress the finding.
  • Custom validation logic. Only recognised guard names are checked. if password == expected is not a recognised guard.
  • Cross-function resource flows. If a file handle opens in one function and closes in another, the opener gets flagged as a leak. This is the largest source of FPs on factory-pattern code.

Common false positives

| Scenario | Why | Mitigation |
| --- | --- | --- |
| Framework middleware auth | Handler doesn't call auth directly | Expected; suppress with severity filter or exclude handlers |
| RAII / defer cleanup | Implicit release not visible to CFG (partially handled for Rust Drop and Go defer) | Known limitation |
| Custom guard name | Function not in the recognised guard list | Add it as a sanitizer rule in config |
| Test handlers | Intentional lack of auth | Default non-prod downgrade reduces severity; or exclude test dirs |

Common false negatives

| Scenario | Why |
| --- | --- |
| Auth in a called helper | Cross-function guards not tracked |
| Type-system guards | Rust AuthenticatedUser<T> wrappers, typestate patterns not analysed |
| Cleanup in finally/ensure/defer in callers | Cross-function cleanup not tracked |

Tuning

Recognised guard names

Nyx accepts these patterns as dominating guards:

| Pattern | Applies to |
| --- | --- |
| validate*, sanitize* | All sinks |
| check_*, verify_*, assert_* | All sinks |
| shell_escape | Shell sinks |
| html_escape | HTML/XSS sinks |
| url_encode | URL sinks |
| which | Shell execution (binary lookup) |
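
One way to read the table is as glob patterns paired with the sink kinds they cover. The sketch below assumes case-sensitive glob semantics and uses made-up sink-kind labels ("shell", "html", "url"); Nyx's actual matching rules may differ.

```python
from fnmatch import fnmatchcase

# (glob pattern, sink kind it covers); "all" means every sink.
GUARDS = [
    ("validate*", "all"), ("sanitize*", "all"),
    ("check_*", "all"), ("verify_*", "all"), ("assert_*", "all"),
    ("shell_escape", "shell"), ("html_escape", "html"),
    ("url_encode", "url"), ("which", "shell"),
]


def guards_sink(callee: str, sink_kind: str) -> bool:
    """True if `callee` is a recognised guard for a sink of `sink_kind`."""
    return any(fnmatchcase(callee, pattern) and scope in ("all", sink_kind)
               for pattern, scope in GUARDS)


print(guards_sink("validate_input", "shell"))  # True
print(guards_sink("html_escape", "html"))      # True
print(guards_sink("html_escape", "shell"))     # False
print(guards_sink("checkpoint", "shell"))      # False
```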

Recognised auth names

| Pattern | Language |
| --- | --- |
| is_authenticated, require_auth, check_permission, authorize, authenticate, require_login, check_auth, verify_token, validate_token | Cross-language |
| middleware.auth, auth.required | Go |
| isAuthenticated, checkPermission, hasAuthority, hasRole | Java |

For Rust auth checks (require_*, ownership equality, row-level checks), see auth.md.

Custom guards

[[analysis.languages.python.rules]]
matchers = ["validate_request", "check_csrf"]
kind = "sanitizer"
cap  = "all"

Custom auth functions

[[analysis.languages.javascript.rules]]
matchers = ["ensureLoggedIn", "requirePermission"]
kind = "sanitizer"
cap  = "all"

Examples

Unguarded sink:

func handler(w http.ResponseWriter, r *http.Request) {
    cmd := r.URL.Query().Get("cmd")
    exec.Command("sh", "-c", cmd).Run()  // cfg-unguarded-sink
}

Auth gap:

app.get('/admin/delete', (req, res) => {
    // No auth call
    db.execute("DELETE FROM users WHERE id = " + req.params.id);  // cfg-auth-gap
});

Resource leak:

#include <stdio.h>

void process(int error) {
    FILE *f = fopen("data.txt", "r");
    if (error) {
        return;           // cfg-resource-leak: f not closed on this path
    }
    fclose(f);
}

State model analysis

Tracks resource lifecycle and authentication state through a function. Detects use-after-close, double-close, leaks, and unauthenticated access to privileged operations.

State analysis is on by default. Disable with scanner.enable_state_analysis = false. It runs in --mode full, --mode cfg, and --mode taint; AST-only mode skips it.

Rule IDs

| Rule ID | Severity |
| --- | --- |
| state-use-after-close | High |
| state-double-close | Medium |
| state-resource-leak | Medium |
| state-resource-leak-possible | Low |
| state-unauthed-access | High |

What it detects

state-use-after-close: Resource transitions to CLOSED (via close, fclose, disconnect, …), then a use operation happens on it.

FILE *f = fopen("data.txt", "r");
fclose(f);
fread(buf, 1, 100, f);  // state-use-after-close

state-double-close: Resource closed twice. Crashes or undefined behaviour on most runtimes.

state-resource-leak: Resource opened but never closed on any path through the function. Definite leak.

state-resource-leak-possible: Resource closed on some paths but not others. Lower confidence; often an early-return error path.
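
The four resource rules above can be read as a small state machine. The following is a simplified sketch rather than Nyx's lattice: it walks explicit per-path operation lists, the `check_path` and `classify_leak` helpers are hypothetical, and only a handful of use and close operations are modelled.

```python
USES = {"read", "write", "fread", "fwrite"}   # small subset, for illustration
CLOSES = {"close", "fclose"}


def check_path(ops):
    """ops: list of (operation, resource) along one path.

    Returns (findings, resources still OPEN at exit).
    """
    state, findings = {}, []
    for op, res in ops:
        if op == "open":
            state[res] = "OPEN"
        elif op in CLOSES:
            if state.get(res) == "CLOSED":
                findings.append(("state-double-close", res))
            state[res] = "CLOSED"
        elif op in USES and state.get(res) == "CLOSED":
            findings.append(("state-use-after-close", res))
    still_open = {r for r, s in state.items() if s == "OPEN"}
    return findings, still_open


def classify_leak(paths, res):
    """Definite leak if open at exit on every path, possible if only on some."""
    leaked = [res in check_path(p)[1] for p in paths]
    if all(leaked):
        return "state-resource-leak"
    if any(leaked):
        return "state-resource-leak-possible"
    return None


print(check_path([("open", "f"), ("fclose", "f"), ("fread", "f")])[0])
# [('state-use-after-close', 'f')]
print(classify_leak([[("open", "f"), ("fclose", "f")], [("open", "f")]], "f"))
# state-resource-leak-possible
```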

state-unauthed-access: A function recognised as a web handler reaches a privileged sink without an auth call on the path.

A function counts as a web handler if its name starts with handle_, route_, or api_ (sufficient on its own), or starts with serve_/process_ and the file uses web-shaped parameter names (request, req, ctx, res, response, w, writer, language-dependent). main is excluded.
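
The handler heuristic described above reduces to a few prefix checks. This sketch is a direct transcription of that paragraph; the function name and the `file_param_names` argument are hypothetical, and the language-dependent parts of the real rule are not modelled.

```python
STRONG_PREFIXES = ("handle_", "route_", "api_")
WEAK_PREFIXES = ("serve_", "process_")
WEB_PARAMS = {"request", "req", "ctx", "res", "response", "w", "writer"}


def is_web_handler(name: str, file_param_names) -> bool:
    """file_param_names: parameter names seen anywhere in the same file."""
    if name == "main":
        return False
    if name.startswith(STRONG_PREFIXES):
        return True   # strong prefixes are sufficient on their own
    # Weak prefixes also need web-shaped parameter names in the file.
    return name.startswith(WEAK_PREFIXES) and bool(WEB_PARAMS & set(file_param_names))


print(is_web_handler("handle_login", []))               # True
print(is_web_handler("process_order", ["order"]))       # False
print(is_web_handler("process_order", ["req", "res"]))  # True
print(is_web_handler("main", ["req"]))                  # False
```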

Managed-resource suppression

Several language-specific cleanup patterns suppress leak findings:

| Pattern | Languages | Effect |
| --- | --- | --- |
| RAII / Drop | Rust | All leak findings suppressed except alloc/dealloc |
| Smart pointers | C++ | make_unique/make_shared treated as managed; raw new/malloc still tracked |
| defer | Go | defer f.Close() suppresses leak at exit |
| with context manager | Python | with open(f) as f: suppresses leak for the bound name |
| try-with-resources | Java | TWR-bound resources suppressed |

What it can’t detect

  • Cross-function resource ownership. Open in one function, close in another, leak gets reported in the opener. The most common FP source for leak detection.
  • Factory / builder functions that return a resource for the caller to manage.
  • Variable shadowing across scopes. Same name in inner and outer scope shares one symbol; an inner close masks an outer leak.
  • Resources stored in collections. Handles in arrays / maps / channels and cleaned up via iteration are not tracked.
  • Dynamic dispatch. Close called via trait object or interface may not be recognised.
  • Type-state authentication. AuthenticatedRequest<T> and similar Rust patterns are not recognised as auth.

Common false positives

| Scenario | Why | Mitigation |
| --- | --- | --- |
| Factory returns a resource | Caller owns it | Known limitation |
| Framework-managed handles | Connection pool, request scope | Exclude framework code or downgrade |
| Variable name shadowing | Same name reused | Known limitation |

Per-language detection

| Language | Leak | Double-close | Use-after-close | Notes |
| --- | --- | --- | --- | --- |
| C | yes | yes | yes | fopen/fclose, malloc/free, pthread_mutex_* |
| C++ | yes | yes | yes | C pairs plus new/delete; smart pointers suppressed |
| Python | yes | yes | yes | with suppressed; open, socket, connect |
| Go | yes | yes | yes | defer suppressed; os.Open / .Close |
| Rust | unsafe only | n/a | n/a | RAII suppresses everything except alloc/dealloc |
| JavaScript | yes | yes | partial | fs.openSync/closeSync |
| TypeScript | yes | yes | partial | Same as JS |
| PHP | yes | yes | partial | fopen/fclose, curl_init/curl_close, mysqli_* |
| Ruby | partial | partial | partial | File.open/close, TCPSocket |
| Java | limited | limited | limited | Constructor-callee matching is incomplete |

Tuning

nyx scan . --severity ">=MEDIUM"   # Skip "possible" leaks (Low)

[scanner]
enable_state_analysis = true        # default
excluded_directories  = ["tests", "test", "spec"]

Recognised pairs

The state engine ships these acquire/release pairs. Custom pairs are not yet configurable; file an issue if you need one.

C / C++

| Acquire | Release |
| --- | --- |
| fopen | fclose |
| open | close |
| socket | close |
| malloc, calloc, realloc | free |
| pthread_mutex_lock | pthread_mutex_unlock |
| new, new[] (C++) | delete, delete[] |

Rust

| Acquire | Release |
| --- | --- |
| File::open, File::create | drop, close |
| TcpStream::connect | shutdown |
| lock, read, write (Mutex/RwLock) | drop |

Java

| Acquire | Release |
| --- | --- |
| new FileInputStream (and friends) | close |
| getConnection | close |
| new Socket | close |

Go, Python, JavaScript, Ruby, PHP follow language-idiomatic equivalents.

Use-after-close triggers

These operations on a closed resource fire state-use-after-close:

read, write, send, recv, fread, fwrite, fgets, fputs, fprintf, fscanf,
fflush, fseek, ftell, rewind, feof, ferror, fgetc, fputc, getc, putc,
ungetc, query, execute, fetch, sendto, recvfrom, ioctl, fcntl,
strcpy, strncpy, strcat, strncat, memcpy, memmove, memset, memcmp,
strcmp, strncmp, strlen, sprintf, snprintf

Taint analysis

Nyx tracks untrusted data from sources (where it enters the program) through assignments and function calls to sinks (where it’s used dangerously). If the flow reaches a sink without passing a matching sanitizer, a finding fires.

The engine is a monotone forward dataflow over a finite lattice with guaranteed termination. It’s flow-sensitive inside a function, and interprocedural across files via persisted per-function summaries.
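
Why a monotone forward analysis over a finite lattice terminates can be seen in a toy worklist solver. This is a generic textbook sketch, not Nyx's SSA engine: facts are sets of tainted variable names joined by union, and the CFG, node names, and transfer functions are invented for the example.

```python
def propagate(cfg, transfer):
    """Forward worklist dataflow over {node: [successors]}.

    Facts are frozensets of tainted variable names; the join is set union.
    """
    out = {n: frozenset() for n in cfg}
    preds = {n: [p for p in cfg if n in cfg[p]] for n in cfg}
    worklist = list(cfg)              # visit every node at least once
    while worklist:
        node = worklist.pop()
        incoming = frozenset().union(*(out[p] for p in preds[node]))
        new_out = transfer[node](incoming)
        if new_out != out[node]:      # facts only grow, so this stops firing
            out[node] = new_out
            worklist.extend(cfg[node])
    return out


def assign(dst, src):
    """dst = src: dst is tainted exactly when src is."""
    return lambda s: (s - {dst}) | ({dst} if src in s else frozenset())


# n1: cmd = <source>; n2: loop head; n3: arg = cmd (loops back); n4: sink(arg)
cfg = {"n1": ["n2"], "n2": ["n3", "n4"], "n3": ["n2"], "n4": []}
transfer = {
    "n1": lambda s: s | {"cmd"},
    "n2": lambda s: s,
    "n3": assign("arg", "cmd"),
    "n4": lambda s: s,
}
facts = propagate(cfg, transfer)
print("arg" in facts["n4"])  # True
```

Because each node's fact set can only grow and there are finitely many variables, the loop between n2 and n3 is re-examined a bounded number of times before the worklist empties.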

Rule ID

taint-unsanitised-flow (source <line>:<col>)

One rule ID, parameterized by the source location. Suppressions can target either the base ID or the full string.

What it detects

  • User input flowing to shell execution: req.body.cmd → child_process.exec
  • User input flowing to code evaluation: req.query.code → eval
  • User input flowing to SQL: request.args.get('id') → cursor.execute(f"... {id}")
  • Environment variables flowing to shell: env::var("CMD") → Command::new("sh").arg("-c")
  • Request parameters flowing to HTML: req.query.name → innerHTML
  • File contents flowing to privileged sinks: fs::read_to_string → db.execute
  • Any other source-to-sink flow where the sink’s required capability is not stripped along the way

What it can’t detect

  • Library calls without summaries. If a callee has no summary (no source, binary-only dependency), Nyx treats it as neither propagating nor sanitizing. This is conservative for sanitization but lossy for propagation.
  • Deep pointer aliasing. let y = &x; sink(*y) works through one level, but arbitrary chains of pointer arithmetic and aliased writes (*p, p->field in C/C++) are not tracked end-to-end. Function pointers and indirect calls resolve to no callee.
  • Implicit flows. Taint follows explicit data, not branching signal. if (secret) x = 1 else x = 0 does not taint x.
  • Globals and statics across functions. Not tracked across function boundaries.

Common false positives

| Scenario | Why | Mitigation |
| --- | --- | --- |
| Custom sanitizer not recognised | Only built-in + configured sanitizers match | Add a custom sanitizer rule in config |
| Container holds mixed-typed items the engine cannot tell apart | A vector<int> of port numbers and a vector<string> of user input share the same store/load model | Sanitize the values on the way in (numeric parse / explicit validator) so the values themselves carry no cap, not just the container |
| Dead branches | Path-insensitive within a function | Constraint solving catches trivially infeasible combos; path-validated findings are scored lower |
| Library wrapper re-introduces taint | Wrapper opaque, or summary marks it as propagating | Summarize the wrapper explicitly or add it as a sanitizer |

Common false negatives

| Scenario | Why |
| --- | --- |
| Third-party library on the path | No summary available, callee treated opaquely |
| Globals / statics across function boundaries | Not tracked |
| Some closure captures | Closure analysis is limited. JS/TS/Ruby/Go anonymous functions passed as callbacks are analyzed as separate scopes |
| Very deep cross-file chains | Summary approximation loses precision at depth |

Confidence signals

Higher confidence:

  • Source + Sink both present in evidence with specific call locations.
  • source_kind: user_input (direct attacker control).
  • path_validated: false.
  • No dominating guard on the path.
  • Symex produced a witness string (rendered sink value visible in JSON/SARIF evidence.symbolic.witness).

Lower confidence:

  • Path-validated taint (path_validated: true).
  • Source is a database read or internal file (pre-validated at insertion is common).
  • Engine note ForwardBailed / PathWidened. Use --require-converged to drop these in strict gates.

Tuning

Custom sanitizer

# nyx.local
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind     = "sanitizer"
cap      = "html_escape"

Or: nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape.

Filter by severity or confidence

nyx scan . --severity HIGH
nyx scan . --min-confidence medium

Skip dataflow entirely

nyx scan . --mode ast

AST-only mode gives you structural pattern matches without taint.

In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took:

Nyx finding detail: HIGH taint-unsanitised-flow with numbered source → call → sink steps and How to fix guidance

Example

Rust:

use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap();           // source
    Command::new("sh").arg("-c").arg(&cmd).output();   // sink
}

Finding:

[HIGH] taint-unsanitised-flow (source 5:15)  src/main.rs:6:5
       Unsanitised user input flows from env::var → Command::new
       Source: env::var (5:15)
       Sink:   Command::new

Safe rewrite: drop the shell and pass the value as a single argument to a fixed program (for example, Command::new("ls").arg(&cmd).output()), or validate it against an allowlist before passing it to the shell.

Capabilities

Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer only clears taint for the cap it declares. A sink only fires when the remaining taint still carries its required cap.

| Capability | Typical source | Typical sanitizer | Typical sink |
| --- | --- | --- | --- |
| env_var | env::var, getenv, process.env | | |
| html_escape | | html.escape, DOMPurify.sanitize | innerHTML, document.write |
| shell_escape | | shlex.quote, shell_escape::escape | system, Command::new, eval |
| url_encode | | encodeURIComponent | location.href, HTTP client URL arg |
| json_parse | | JSON.parse | |
| file_io | | os.path.realpath, filepath.Clean | open, fs::read_to_string, send_file |
| fmt_string | | | printf(var) |
| sql_query | | parameterized query binders | cursor.execute, db.query with concatenation |
| deserialize | | | pickle.loads, yaml.load, Marshal.load |
| ssrf | | URL-prefix locks | requests.get, fetch, HttpClient.send |
| code_exec | | | eval, exec, Function |
| crypto | | | weak-algorithm constructors |
| unauthorized_id | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write |
| all | Sources typically use all so they match any sink | | |

Sources typically use cap = "all" so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
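
The capability rules can be modelled as plain set operations. This is a minimal sketch under assumptions: taint is represented as a set of capability names, only four capabilities are included, and `apply_sanitizer` and `sink_fires` are hypothetical helpers rather than Nyx APIs.

```python
# A small subset of capabilities, standing in for cap = "all" on a source.
ALL_CAPS = frozenset({"html_escape", "shell_escape", "sql_query", "file_io"})


def apply_sanitizer(taint: frozenset, cap: str) -> frozenset:
    """A sanitizer strips only the capability it declares."""
    return taint - {cap}


def sink_fires(taint: frozenset, required_cap: str) -> bool:
    """A sink fires only if the remaining taint still carries its cap."""
    return required_cap in taint


value = ALL_CAPS                                  # fresh from a source
escaped = apply_sanitizer(value, "html_escape")   # e.g. after an HTML escaper

print(sink_fires(escaped, "html_escape"))   # False
print(sink_fires(escaped, "shell_escape"))  # True
```

HTML-escaping a value makes it safe for an HTML sink but leaves it tainted for a shell sink, which is why a sanitizer's declared cap has to match the sink it is meant to protect.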