Architecture

How SilentWitness turns a DFIR investigation into cited findings, audit rows, and a verifiable report.

SilentWitness is a custom MCP server for incident response work. The agent can plan an investigation, form hypotheses, and ask for evidence records, but it cannot write unsupported findings. The MCP server owns the evidence boundary, the citation checks, the entity checks, and the audit trail.

The review path:

1. Register evidence.
2. Prepare and index the case.
3. Investigate through MCP tools.
4. Review staged findings.
5. Verify the audit chain.
6. Export the report.

The important idea is simple: the model may suggest a claim, but the server decides whether that claim is grounded enough to enter the report.

What Runs Where

Examiner and SIFT CLI. The examiner starts the case, registers evidence, builds the index, launches the investigator, and exports the report.

Read-only evidence boundary. Source artifacts stay mounted read-only. The agent never receives a raw evidence write path.

Offline ingest spine. dfVFS extraction and targeted feeders parse artifacts into citable records. Each useful record gets a stable ID, a source pointer, and an audit row.
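
To make "a stable ID, a source pointer, and an audit row" concrete, here is a minimal sketch. The field names, the `make_record` and `audit_row_for` helpers, and the choice to derive the ID from a SHA-256 hash of the source pointer and content are all assumptions for illustration, not the SilentWitness schema.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EvidenceRecord:
    record_id: str    # stable ID, derived from source pointer + content (assumption)
    source_path: str  # hypothetical source pointer: path of the source artifact
    offset: int       # hypothetical source pointer: position inside the artifact
    content: str      # parsed, citable text of the record


def make_record(source_path: str, offset: int, content: str) -> EvidenceRecord:
    """Build a record whose ID is the same every time the same input is ingested."""
    digest = hashlib.sha256(
        json.dumps([source_path, offset, content]).encode("utf-8")
    ).hexdigest()
    return EvidenceRecord(f"rec-{digest[:16]}", source_path, offset, content)


def audit_row_for(record: EvidenceRecord) -> Dict[str, object]:
    """Hypothetical audit row noting that the record was ingested."""
    return {
        "event": "ingest",
        "record_id": record.record_id,
        "source_path": record.source_path,
        "offset": record.offset,
    }
```

Deriving the ID from the input rather than from a counter means that re-indexing the same artifact would yield the same IDs, so existing citations keep resolving. Whether SilentWitness does this is not stated here.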

MCP server. This is the product. It exposes a small tool surface over registered evidence, indexed records, findings, and audit output. It also runs the code-level gates before anything becomes a finding.

Pydantic AI investigator. The agent sequences the work: it lists detections, forms one hypothesis at a time, searches evidence records, records observations, pivots when evidence contradicts the hypothesis, and writes the final narrative through server tools.

The Hallucination Firewall

Every claim has to pass multiple checks before it can become a finding, so unsupported text is rejected along the way. The core checks are enforced in code, not by asking the model to behave.

The enforced path:

1. The agent proposes an observation through record_observation.
2. The server rejects claims without a cited evidence span.
3. The entity gate checks named entities against indexed evidence.
4. Contradiction detectors and the live critic can challenge weak findings.
5. The coverage gate blocks finalization until the report answers WHO, WHAT, WHEN, WHERE, and HOW with citations.
6. The audit chain records the tool call, the record reference, and the report mutation so verification can replay the trail.
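
The citation, entity, and coverage gates can be sketched as plain code checks. This is only an illustration under stated assumptions: the `Observation` fields, the `GateError` type, the in-memory `index` dictionary, the substring-based entity match, and the signature given to `record_observation` are hypothetical. Only the tool name comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

QUESTIONS = ("WHO", "WHAT", "WHEN", "WHERE", "HOW")


class GateError(ValueError):
    """Raised when an observation fails a server-side gate."""


@dataclass
class Observation:
    claim: str
    answers: str                            # which of QUESTIONS the claim addresses
    record_id: Optional[str] = None         # cited evidence record
    span: Optional[Tuple[int, int]] = None  # cited character span in that record
    entities: List[str] = field(default_factory=list)  # named entities in the claim


def record_observation(obs: Observation, index: Dict[str, str],
                       staged: List[Observation]) -> None:
    """Stage an observation only if it passes the citation and entity gates."""
    # Citation gate: the claim must cite a real span of an indexed record.
    if obs.record_id is None or obs.span is None:
        raise GateError("rejected: no cited evidence span")
    text = index.get(obs.record_id)
    if text is None:
        raise GateError("rejected: cited record is not indexed")
    start, end = obs.span
    if not 0 <= start < end <= len(text):
        raise GateError("rejected: span falls outside the cited record")
    # Entity gate: every named entity must appear somewhere in indexed evidence.
    corpus = " ".join(index.values()).lower()
    unknown = [e for e in obs.entities if e.lower() not in corpus]
    if unknown:
        raise GateError(f"rejected: entities not found in evidence: {unknown}")
    staged.append(obs)


def missing_coverage(staged: List[Observation]) -> Set[str]:
    """Coverage gate: questions that no staged, cited observation answers yet."""
    return set(QUESTIONS) - {obs.answers for obs in staged}
```

In this sketch, finalization would be allowed only when `missing_coverage(staged)` returns an empty set. Because the gates raise before anything is appended, a rejected claim never reaches the staged list, whatever the prompt said.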

Prompt instructions still matter for quality, but they are supplementary. If a prompt is removed, the investigation gets worse, but the removal does not grant the model permission to write unsupported findings.

Investigation Loop

The investigator works like a senior analyst: one hypothesis at a time, tested against evidence, then confirmed, pivoted, or abandoned.

The loop is intentionally small:

1. Start from staged detections.
2. Form a concrete hypothesis.
3. Search evidence and fetch records.
4. Stage only cited observations.
5. Confirm, pivot, or abandon the hypothesis.
6. Let the critic and coverage gate challenge the result before export.

That keeps the demo understandable. It also keeps the implementation auditable: every transition leaves a record.
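
One way to picture "every transition leaves a record" is a hypothesis object that can be resolved only once and logs how it was resolved. The `State` enum, the `Hypothesis` class, and the `resolve` method below are hypothetical names for a rough sketch, not the investigator's actual types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class State(Enum):
    OPEN = "open"
    CONFIRMED = "confirmed"
    PIVOTED = "pivoted"
    ABANDONED = "abandoned"


@dataclass
class Hypothesis:
    statement: str
    state: State = State.OPEN
    # Each entry is (from_state, to_state, reason), kept for later audit.
    transitions: List[Tuple[str, str, str]] = field(default_factory=list)

    def resolve(self, new_state: State, reason: str) -> None:
        """Move an open hypothesis to confirmed, pivoted, or abandoned."""
        if self.state is not State.OPEN:
            raise ValueError("hypothesis is already resolved")
        if new_state is State.OPEN:
            raise ValueError("resolve requires a terminal state")
        self.transitions.append((self.state.value, new_state.value, reason))
        self.state = new_state
```

For example, `Hypothesis("Initial access came through RDP").resolve(State.PIVOTED, "no RDP logon records in the window")` would leave one logged transition from open to pivoted. Under the one-hypothesis-at-a-time rule, the next hypothesis would then start as a new object.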

Claim Trace

A report finding is not just prose. It points back to the record and tool execution that produced it.
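
A common way to make such a trail verifiable is to hash-chain the audit rows, so that editing any earlier row breaks every later hash. The sketch below assumes that design. The row fields, the `append_row` and `verify_chain` helpers, and the use of SHA-256 are illustrative assumptions, not a description of how SilentWitness stores its audit chain.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first row


def _digest(prev_hash: str, payload: Dict[str, str]) -> str:
    """Hash the previous row's hash together with this row's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_row(chain: List[Dict[str, str]], tool: str,
               record_id: str, mutation: str) -> Dict[str, str]:
    """Append a row covering the tool call, record reference, and report mutation."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    payload = {"tool": tool, "record_id": record_id, "mutation": mutation}
    row = {**payload, "prev_hash": prev_hash, "hash": _digest(prev_hash, payload)}
    chain.append(row)
    return row


def verify_chain(chain: List[Dict[str, str]]) -> bool:
    """Replay the chain from the start and confirm every link and hash."""
    prev_hash = GENESIS
    for row in chain:
        payload = {key: row[key] for key in ("tool", "record_id", "mutation")}
        if row["prev_hash"] != prev_hash or row["hash"] != _digest(prev_hash, payload):
            return False
        prev_hash = row["hash"]
    return True
```

With a structure like this, tracing a finding means following its `record_id` to the audit row and then to the source artifact, while `verify_chain` confirms that no row in between was altered after the fact.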

For judges, the quickest proof path is:

Three-Claim Trace: three findings traced from report text to cited record, audit row, and source artifact.
Accuracy Report: measured recall, misses, and the bugs fixed during evaluation.
Try It Out: dataset walkthroughs for the shipped demo cases.
Full architecture spec: the long internal spec with pins, ADRs, and implementation notes.
