Taint analysis

Nyx tracks untrusted data from sources (where it enters the program) through assignments and function calls to sinks (where it’s used dangerously). If a flow reaches a sink without passing through a matching sanitizer, Nyx reports a finding.

The engine is a monotone forward dataflow over a finite lattice with guaranteed termination. It’s flow-sensitive inside a function, and interprocedural across files via persisted per-function summaries.
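To make "monotone forward dataflow over a finite lattice" concrete, here is a minimal sketch of a worklist solver over a toy taint lattice. It is not Nyx's implementation: the `Stmt`, `Block`, and `State` types, the two-bit capability encoding, and the `demo_cfg` control-flow graph are all assumptions invented for illustration. The join is a bitwise OR, so every per-variable bitset can only grow, and with finitely many variables and bits the loop must reach a fixpoint.

```rust
use std::collections::{BTreeMap, VecDeque};

// Illustrative model only: variable name -> bitset of capabilities still attached.
type State = BTreeMap<&'static str, u8>;

enum Stmt {
    Source { dst: &'static str, caps: u8 },
    Assign { dst: &'static str, src: &'static str },
    Sanitize { dst: &'static str, src: &'static str, clears: u8 },
}

struct Block {
    stmts: Vec<Stmt>,
    succs: Vec<usize>, // indices of successor blocks
}

// Lattice join: pointwise bitwise OR. Returns true if `into` gained any bit.
fn join(into: &mut State, from: &State) -> bool {
    let mut changed = false;
    for (var, caps) in from {
        let slot = into.entry(*var).or_insert(0);
        let merged = *slot | *caps;
        if merged != *slot {
            *slot = merged;
            changed = true;
        }
    }
    changed
}

// Monotone transfer function: apply a block's statements in order.
fn transfer(block: &Block, input: &State) -> State {
    let mut out = input.clone();
    for stmt in &block.stmts {
        match *stmt {
            Stmt::Source { dst, caps } => {
                out.insert(dst, caps);
            }
            Stmt::Assign { dst, src } => {
                let caps = out.get(src).copied().unwrap_or(0);
                out.insert(dst, caps);
            }
            Stmt::Sanitize { dst, src, clears } => {
                let caps = out.get(src).copied().unwrap_or(0);
                out.insert(dst, caps & !clears);
            }
        }
    }
    out
}

// Worklist iteration to a fixpoint; returns the state at entry to each block.
fn solve(blocks: &[Block]) -> Vec<State> {
    let mut inputs: Vec<State> = vec![State::new(); blocks.len()];
    let mut work: VecDeque<usize> = (0..blocks.len()).collect();
    while let Some(b) = work.pop_front() {
        let out = transfer(&blocks[b], &inputs[b]);
        for &s in &blocks[b].succs {
            // Re-queue a successor only when its entry state actually grew.
            if join(&mut inputs[s], &out) {
                work.push_back(s);
            }
        }
    }
    inputs
}

// Diamond CFG: block 1 sanitizes `x` on one path, block 2 does not.
fn demo_cfg() -> Vec<Block> {
    vec![
        Block { stmts: vec![Stmt::Source { dst: "x", caps: 0b11 }], succs: vec![1, 2] },
        Block { stmts: vec![Stmt::Sanitize { dst: "x", src: "x", clears: 0b01 }], succs: vec![3] },
        Block { stmts: vec![], succs: vec![3] },
        Block { stmts: vec![Stmt::Assign { dst: "y", src: "x" }], succs: vec![] },
    ]
}
```

In this toy graph, `solve(&demo_cfg())` leaves `x` with both bits set at entry to block 3: the sanitized path contributes `0b10`, the unsanitized path contributes `0b11`, and the join keeps the union.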

Rule ID

taint-unsanitised-flow (source <line>:<col>)

One rule ID, parameterized by the source location. Suppressions can target either the base ID or the full string.
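The two suppression targets can be pictured with a small matching function. This is only a reading of the sentence above, not Nyx's suppression syntax or code: `suppresses` is a hypothetical helper, and it assumes the base ID is everything before the " (" that starts the source location.

```rust
// Hypothetical helper: does a suppression pattern cover a reported finding ID?
// Assumes IDs look like "taint-unsanitised-flow (source 5:15)".
fn suppresses(pattern: &str, finding_id: &str) -> bool {
    // The base ID is the part before the parenthesized source location.
    let base = finding_id.split(" (").next().unwrap_or(finding_id);
    pattern == finding_id || pattern == base
}
```

Under this model, suppressing the base ID silences every taint finding, while suppressing the full string silences only the flow that starts at that source location.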

What it detects

User input flowing to shell execution: req.body.cmd → child_process.exec
User input flowing to code evaluation: req.query.code → eval
User input flowing to SQL: request.args.get('id') → cursor.execute(f"... {id}")
Environment variables flowing to shell: env::var("CMD") → Command::new("sh").arg("-c")
Request parameters flowing to HTML: req.query.name → innerHTML
File contents flowing to privileged sinks: fs::read_to_string → db.execute
Any other source-to-sink flow where the sink’s required capability is not stripped along the way

What it can’t detect

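Library calls without summaries. If a callee has no summary (no source, binary-only dependency), Nyx treats it as neither propagating nor sanitizing. This is conservative for sanitization (an unknown call never clears taint) but lossy for propagation (taint that passes through the call is dropped, which can hide real flows).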
Deep pointer aliasing. let y = &x; sink(*y) works through one level, but arbitrary chains of pointer arithmetic and aliased writes (*p, p->field in C/C++) are not tracked end-to-end. Function pointers and indirect calls resolve to no callee.
Implicit flows. Taint follows explicit data flow, not control dependence. if (secret) x = 1 else x = 0 does not taint x.
Globals and statics. Taint written to a global or static in one function is not tracked into reads of it in another function.
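Two of the limitations above are easy to picture in Rust. The snippet below is an illustrative sketch, not taken from Nyx's test suite: `leak_bit`, `remember`, `recall`, and `LAST_INPUT` are hypothetical names. Going by the descriptions above, neither pattern would be expected to produce a taint finding.

```rust
use std::sync::Mutex;

// Hypothetical shared state used to carry a value between two functions.
static LAST_INPUT: Mutex<String> = Mutex::new(String::new());

// Implicit flow: the result is decided by `secret`, but no value is copied
// from it, so explicit data-flow tracking sees only the constants 1 and 0.
fn leak_bit(secret: &str) -> u8 {
    if secret.starts_with('a') { 1 } else { 0 }
}

// A value is written to a static in one function...
fn remember(input: &str) {
    *LAST_INPUT.lock().unwrap() = input.to_string();
}

// ...and read back in another. The link between the write and this read
// crosses a function boundary through a static.
fn recall() -> String {
    LAST_INPUT.lock().unwrap().clone()
}
```

If a value needs to be tracked, passing it as an explicit argument or return value keeps it on a path the engine follows.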

Common false positives

Scenario	Why	Mitigation
Custom sanitizer not recognized	Only built-in and configured sanitizers match	Add a custom sanitizer rule in config
Container holds mixed-typed items the engine cannot tell apart	A `vector<int>` of port numbers and a `vector<string>` of user input share the same store/load model	Sanitize the values on the way in (numeric parse / explicit validator) so the values themselves carry no cap, not just the container
Dead branches	The analysis is path-insensitive within a function	Constraint solving catches trivially infeasible combinations; path-validated findings are scored lower
Library wrapper re-introduces taint	Wrapper opaque, or summary marks it as propagating	Summarize the wrapper explicitly or add it as a sanitizer

Common false negatives

Scenario	Why
Third-party library on the path	No summary available, callee treated opaquely
Globals / statics across function boundaries	Not tracked
Some closure captures	Closure analysis is limited. JS/TS/Ruby/Go anonymous functions passed as callbacks are analyzed as separate scopes
Very deep cross-file chains	Summary approximation loses precision at depth

Confidence signals

Higher confidence:

Source + Sink both present in evidence with specific call locations.
source_kind: user_input (direct attacker control).
path_validated: false.
No dominating guard on the path.
Symex produced a witness string (rendered sink value visible in JSON/SARIF evidence.symbolic.witness).

Lower confidence:

Path-validated taint (path_validated: true).
Source is a database read or internal file (pre-validated at insertion is common).
The finding carries a ForwardBailed or PathWidened engine note. Use --require-converged to drop these in strict gates.

Tuning

Custom sanitizer

# nyx.local
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind     = "sanitizer"
cap      = "html_escape"

Or: nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape.

Filter by severity or confidence

nyx scan . --severity HIGH
nyx scan . --min-confidence medium

Skip dataflow entirely

nyx scan . --mode ast

AST-only mode gives you structural pattern matches without taint.

In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took:

[Screenshot: Nyx finding detail showing a HIGH taint-unsanitised-flow finding with numbered source → call → sink steps and How to fix guidance]

Example

Rust:

use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap();           // source
    Command::new("sh").arg("-c").arg(&cmd).output();   // sink
}

Finding:

[HIGH] taint-unsanitised-flow (source 5:15)  src/main.rs:6:5
       Unsanitised user input flows from env::var → Command::new
       Source: env::var (5:15)
       Sink:   Command::new

Safe rewrite: drop the shell so the value is never parsed by sh (Command::new(&cmd).output() runs it as a program name, with no shell metacharacter expansion), or validate it against an allowlist before passing it to the shell.
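The allowlist option could look like the sketch below. The function names and the allowlist entries (`date`, `uptime`) are placeholders chosen for illustration, and whether Nyx treats the `match` as a sanitizer depends on its rules and configuration, which this sketch does not assume. The point is that the program handed to `Command::new` is always a constant from the `match`, never the raw input.

```rust
use std::env;
use std::io;
use std::process::{Command, Output};

// Map an untrusted name onto a fixed program. Entries are hypothetical.
fn allowed_program(name: &str) -> Option<&'static str> {
    match name {
        "date" => Some("date"),
        "uptime" => Some("uptime"),
        _ => None,
    }
}

// Run only allowlisted programs, with no shell in between.
fn run_user_command(name: &str) -> io::Result<Output> {
    let program = allowed_program(name).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "command not in allowlist")
    })?;
    // `program` is one of the constants above, not the caller's string.
    Command::new(program).output()
}

// Entry point mirroring the example: read USER_CMD, then validate before use.
fn run_from_env() -> io::Result<Output> {
    let requested = env::var("USER_CMD")
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    run_user_command(&requested)
}
```

Anything outside the allowlist, including input such as `date; rm -rf /`, is rejected with an `InvalidInput` error before any process is spawned.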

Capabilities

Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer only clears taint for the cap it declares. A sink only fires when the remaining taint still carries its required cap.

Capability	Typical source	Typical sanitizer	Typical sink
`env_var`	`env::var`, `getenv`, `process.env`
`html_escape`		`html.escape`, `DOMPurify.sanitize`	`innerHTML`, `document.write`
`shell_escape`		`shlex.quote`, `shell_escape::escape`	`system`, `Command::new`, `eval`
`url_encode`		`encodeURIComponent`	`location.href`, HTTP client URL arg
`json_parse`		`JSON.parse`
`file_io`		`os.path.realpath`, `filepath.Clean`	`open`, `fs::read_to_string`, `send_file`
`fmt_string`			`printf(var)`
`sql_query`		parameterized query binders	`cursor.execute`, `db.query` with concatenation
`deserialize`			`pickle.loads`, `yaml.load`, `Marshal.load`
`ssrf`		URL-prefix locks	`requests.get`, `fetch`, `HttpClient.send`
`code_exec`			`eval`, `exec`, `Function`
`crypto`			weak-algorithm constructors
`unauthorized_id`	request-bound scoped IDs (Rust auth analysis)	ownership check	row-level write
`all`	Sources typically use `all` so they match any sink

Sources typically use cap = "all" so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
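The three rules above can be modelled with a bitset. This is a sketch of the idea under assumed names, not Nyx's internal representation: the `Taint` type, the bit positions, and the method names are all invented for illustration, while the capability names mirror the table.

```rust
// Capability bits. The names follow the table; the bit layout is an assumption.
const HTML_ESCAPE: u16 = 1 << 0;
const SHELL_ESCAPE: u16 = 1 << 1;
const SQL_QUERY: u16 = 1 << 2;
const ALL: u16 = u16::MAX;

// The set of capabilities a value still carries.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Taint(u16);

impl Taint {
    // A source with cap = "all" carries every capability.
    fn from_source() -> Self {
        Taint(ALL)
    }

    // A sanitizer clears only the capability it names.
    fn sanitize(self, cap: u16) -> Self {
        Taint(self.0 & !cap)
    }

    // A sink fires when the remaining taint still carries its required cap.
    fn fires(self, required: u16) -> bool {
        (self.0 & required) != 0
    }
}
```

For example, `Taint::from_source().sanitize(HTML_ESCAPE)` no longer fires an `HTML_ESCAPE` sink but still fires `SHELL_ESCAPE` and `SQL_QUERY` sinks, which is why HTML-escaping a value does not make it safe to pass to a shell.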
