The Scan Pipeline
Code orchestration from input to verdict — the four stages, workspace layout, crate boundaries, and the Probe and Analyzer traits.
parlov's detection pipeline separates observation from interpretation from corroboration. Each layer has a distinct responsibility and zero knowledge of the layers above it.
Layer 1 — Detection (Code-Blind)
Layer 1 has no knowledge of HTTP semantics. It compares two values and checks whether the difference is reproducible.
Input: Two sets of responses — identical requests differing only in the resource identifier. Baseline uses the known-valid identifier. Probe uses the candidate identifier.
Process:
- Send baseline request, record response.
- Send probe request, record response.
- Compare: does any observable property differ?
- If a difference is detected, re-send both requests N times.
- If the differential is stable across all retries, emit a stable differential.
Output: Either a stable differential (with the raw values from both sides and the sample count) or nothing.
The default sample count is 3 — sufficient to filter transient infrastructure noise (GC pauses, load balancer jitter, connection resets) without being expensive. Stability is strict: all N samples on each side must return the same status code. Any inconsistency means the signal is unstable and is not forwarded to Layer 2.
What Layer 1 does not do: no exclusion lists, no pattern matching, no severity assignment, no verdict. It answers exactly one question — "is this difference real and reproducible?" — and nothing else.
Adaptive short-circuit: If the first pair shows no differential (same status code), Layer 1 returns "not present" after a single sample without retrying. Retries only happen when a differential is detected and needs stability confirmation.
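The Layer 1 loop described above can be sketched in Rust as follows. This is a minimal illustration, not parlov's actual code: `StableDifferential`, `detect`, and the two closures are hypothetical names, the comparison is reduced to status codes, and the first pair is assumed to count as sample 1 of N.

```rust
/// A reproducible difference between baseline and probe (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct StableDifferential {
    pub baseline: u16,
    pub probe: u16,
    pub samples: usize,
}

/// Hypothetical Layer 1 loop. `send_baseline` and `send_probe` stand in for
/// whatever issues the real requests; each call returns one status code.
pub fn detect(
    mut send_baseline: impl FnMut() -> u16,
    mut send_probe: impl FnMut() -> u16,
    samples: usize,
) -> Option<StableDifferential> {
    let baseline = send_baseline();
    let probe = send_probe();

    // Adaptive short-circuit: no differential on the first pair, so no retries.
    if baseline == probe {
        return None;
    }

    // Strict stability: every further sample must match the first on its side.
    // Assumption: the first pair counts as sample 1, so N - 1 pairs remain.
    for _ in 1..samples {
        if send_baseline() != baseline || send_probe() != probe {
            return None; // unstable signals are not forwarded to Layer 2
        }
    }

    Some(StableDifferential { baseline, probe, samples })
}
```

Calling `detect(|| 403, || 404, 3)` with closures that always answer the same way yields a stable differential, while a probe that flips between codes yields nothing. Note that the function never inspects what the codes mean, which matches the code-blind role of this layer.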
Layer 2 — Classification (Code-Aware)
Layer 2 receives a confirmed stable differential and applies protocol-informed semantics to label and score it.
The classification draws on a pattern table that maps differential pairs to known oracle types. Each entry carries:
- Label — a human-readable name for the oracle (e.g., "Authorization-based differential", "Conditional-request differential")
- Confidence level — how strongly the pattern indicates an oracle
- RFC basis — the specific specification section that grounds the behavior
Examples from the pattern table:
| Baseline | Probe | Label | Confidence | RFC Basis |
|---|---|---|---|---|
| 403 Forbidden | 404 Not Found | Authorization-based differential | High | RFC 9110 §15.5.4 |
| 304 Not Modified | 404 Not Found | Conditional-request differential | High | RFC 9110 §15.4.5 |
| 409 Conflict | 201 Created | Conflict-based creation differential | High | RFC 9110 §15.5.10 |
| 412 Precondition Failed | 404 Not Found | Precondition-failed differential | High | RFC 9110 §13.1.1 |
| 429 Too Many Requests | 404 Not Found | Rate-limit-based differential | Medium | RFC 6585 §4 |
| 422 Unprocessable Content | 404 Not Found | Validation-path differential | High | RFC 9110 §15.5.21 |
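One way to picture the pattern table is as a static list of entries searched by status-code pair. The sketch below is an assumption about shape only: the `Pattern` struct, the `PATTERNS` table, and `lookup` are hypothetical names, and the rows are copied from the examples above rather than from parlov's source.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Confidence {
    High,
    Medium,
}

/// One pattern-table entry (hypothetical layout).
#[derive(Debug, PartialEq)]
pub struct Pattern {
    pub baseline: u16,
    pub probe: u16,
    pub label: &'static str,
    pub confidence: Confidence,
    pub rfc_basis: &'static str,
}

/// Rows taken from the example table in this section.
pub static PATTERNS: &[Pattern] = &[
    Pattern { baseline: 403, probe: 404, label: "Authorization-based differential",
              confidence: Confidence::High, rfc_basis: "RFC 9110 §15.5.4" },
    Pattern { baseline: 304, probe: 404, label: "Conditional-request differential",
              confidence: Confidence::High, rfc_basis: "RFC 9110 §15.4.5" },
    Pattern { baseline: 409, probe: 201, label: "Conflict-based creation differential",
              confidence: Confidence::High, rfc_basis: "RFC 9110 §15.5.10" },
    Pattern { baseline: 412, probe: 404, label: "Precondition-failed differential",
              confidence: Confidence::High, rfc_basis: "RFC 9110 §13.1.1" },
    Pattern { baseline: 429, probe: 404, label: "Rate-limit-based differential",
              confidence: Confidence::Medium, rfc_basis: "RFC 6585 §4" },
    Pattern { baseline: 422, probe: 404, label: "Validation-path differential",
              confidence: Confidence::High, rfc_basis: "RFC 9110 §15.5.21" },
];

/// Returns the matching entry, or `None` for an unclassified differential.
pub fn lookup(baseline: u16, probe: u16) -> Option<&'static Pattern> {
    PATTERNS
        .iter()
        .find(|p| p.baseline == baseline && p.probe == probe)
}
```

With this shape, `lookup(403, 404)` returns the authorization-based entry, and a pair such as `(200, 500)` returns `None`, which corresponds to the unclassified case.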
Verdict mapping:
- High confidence patterns produce a Confirmed verdict — the differential has a well-understood RFC basis and the behavior is unambiguous.
- Medium confidence patterns produce a Likely verdict — the differential is real but the status code semantics are broad or context-dependent.
- Unclassified stable differentials (pairs not in the pattern table) receive a base confidence of 40. That is below the Likely threshold (60), so on their own they produce a NotPresent verdict. Corroborating signals, which Layer 3 is planned to supply, can raise an unclassified differential into Likely or Confirmed territory.
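The verdict mapping can be sketched as a single function. The names here (`Verdict`, `verdict`, `corroboration`) are hypothetical. The base confidence of 40 and the Likely threshold of 60 come from the text above; the Confirmed threshold is not stated in this section, so the sketch takes it as a caller-supplied parameter, and it assumes a score equal to a threshold counts as reaching it.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Confidence {
    High,
    Medium,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Confirmed,
    Likely,
    NotPresent,
}

/// Values stated in this section.
pub const UNCLASSIFIED_BASE: u8 = 40;
pub const LIKELY_THRESHOLD: u8 = 60;

/// `pattern` is `None` for a differential that is not in the pattern table.
/// `corroboration` is the extra confidence contributed by other signals.
/// `confirmed_threshold` is an assumption: the section does not give a value.
pub fn verdict(pattern: Option<Confidence>, corroboration: u8, confirmed_threshold: u8) -> Verdict {
    match pattern {
        Some(Confidence::High) => Verdict::Confirmed,
        Some(Confidence::Medium) => Verdict::Likely,
        None => {
            let score = UNCLASSIFIED_BASE.saturating_add(corroboration);
            if score >= confirmed_threshold {
                Verdict::Confirmed
            } else if score >= LIKELY_THRESHOLD {
                Verdict::Likely
            } else {
                Verdict::NotPresent
            }
        }
    }
}
```

An unclassified differential with no corroboration scores 40 and falls through to `NotPresent`; it needs at least 20 additional points to reach `Likely` under these assumptions.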
Layer 3 — Corroboration (Multi-Signal) (planned, not yet implemented)
Layer 3 is designed to cross-check medium- or low-confidence findings against additional signal classes and to promote or demote their confidence accordingly. It is not yet active in the current pipeline.
Planned corroboration signals:
- Body differential: do the response bodies diverge between baseline and probe? If a 400/201 differential also shows distinct body content ("email already exists" vs. account created), confidence would be promoted.
- Header differential: do the header sets differ between baseline and probe? Examples include WWW-Authenticate presence, Allow header values, and Set-Cookie differences.
- Timing differential: does one code path take measurably longer, suggesting deeper server-side execution?
- Cross-method consistency: does the same differential appear across GET, HEAD, and DELETE for the same resource? Consistency across methods would strengthen confidence.
Planned promotion and demotion:
- Likely + body corroboration → Confirmed
- Likely + no corroborating signals → remains Likely
- Likely + contradicting signals (e.g., bodies are identical despite status code diff) → demoted
When implemented, Layer 3 will operate at the orchestration level, composing evidence across signal types collected by independent probes.
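Since Layer 3 is not implemented, the following is only a sketch of how the planned promotion and demotion rules for a Likely finding might be expressed. `Signal`, `Adjustment`, and `adjust_likely` are hypothetical names, and two points are assumptions beyond the text: any corroborating signal (not only a body differential) promotes, and a contradiction takes precedence over corroboration.

```rust
/// What one corroboration probe reports about a finding (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Signal {
    Corroborates,
    Contradicts,
}

/// The planned outcome for a Likely finding.
#[derive(Debug, PartialEq)]
pub enum Adjustment {
    Promote, // Likely becomes Confirmed
    Keep,    // remains Likely
    Demote,  // target level is not specified in this section
}

/// Applies the planned rules to the signals collected for one Likely finding.
pub fn adjust_likely(signals: &[Signal]) -> Adjustment {
    if signals.contains(&Signal::Contradicts) {
        // Assumption: one contradicting signal outweighs any corroboration.
        Adjustment::Demote
    } else if signals.contains(&Signal::Corroborates) {
        Adjustment::Promote
    } else {
        // No corroborating signals were collected.
        Adjustment::Keep
    }
}
```

Keeping the adjustment separate from the verdict type leaves the demotion target open, which this section does not define.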
RFC Compliance as the Source
Oracles exist because servers implement RFCs correctly. The server is not broken — it is leaking through correct behavior.
Auth Contexts
The three perspectives from which every oracle must be evaluated. The same endpoint can have an oracle in one context and none in another.