From CVE Publication to Working Proof Capsule in Under Six Hours: The Nightly Chain Walkthrough

1. The window between publication and exploitation

The interval between a CVE being published and the first observed exploitation attempt in the wild has been compressing for years. Mandiant's M-Trends 2025 report measures the median time-to-exploitation for newly-published high-severity CVEs at under five days; for the top decile of impactful CVEs, the interval is under 24 hours. The vendors who can ship a detection or a patch inside that window protect their customers. The vendors who cannot ship inside it leave customers exposed during the highest-density attack period of the CVE's lifetime.

We took the position eighteen months ago that under-six-hours, every weekday night, was the right operating tempo for our research-to-detection pipeline. The pipeline that delivers it is a nine-stage Temporal-orchestrated chain that runs at 03:00 UTC daily. The output of the chain is a set of newly-minted Test Capsules and Proof Capsules covering every CVE the chain considers in-scope for our customer base. The remainder of this article walks the nine stages, their inputs and outputs, and the engineering choices that make them resilient.

2. Stage 1, Feed ingestion

The chain begins with a parallel ingest of four primary feeds: NVD (the National Vulnerability Database), the GitHub Advisory Database, vendor advisories (the long tail of vendor RSS feeds we maintain a registry for), and CISA KEV. Each feed runs in its own ingest activity inside Temporal, so a single feed's outage does not stall the rest.

The deduplication step is non-trivial. The same CVE often appears in three of the four feeds with slightly different metadata. We normalise on the CVE identifier as the join key, take the union of references, and prefer NVD's CVSS scoring as the authoritative severity field while keeping every other source's scoring as a comparison row. The output is a single canonical CVE record per identifier with provenance attached to every field.
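The merge described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production code: the record shape (`cve_id`, `source`, `cvss`, `references`), the source labels, and the `CanonicalCve` type are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical source labels. NVD comes first because the text treats its
# CVSS score as authoritative; the order of the rest is an assumption.
SEVERITY_PRIORITY = ["nvd", "ghsa", "vendor", "kev"]


@dataclass
class CanonicalCve:
    cve_id: str
    references: set = field(default_factory=set)
    cvss: Optional[float] = None
    cvss_source: Optional[str] = None
    cvss_comparison: dict = field(default_factory=dict)  # source -> score
    seen_in: set = field(default_factory=set)  # provenance: feeds that carried it


def merge_feed_records(records):
    """Collapse raw feed records into one canonical record per CVE identifier."""
    merged = {}
    for rec in records:
        cve_id = rec["cve_id"].strip().upper()  # normalise the join key
        canon = merged.setdefault(cve_id, CanonicalCve(cve_id))
        canon.seen_in.add(rec["source"])
        canon.references |= set(rec.get("references", []))  # union of references
        if rec.get("cvss") is not None:
            canon.cvss_comparison[rec["source"]] = rec["cvss"]
    # Pick the authoritative severity; every source's score stays available
    # in cvss_comparison for side-by-side review.
    for canon in merged.values():
        for source in SEVERITY_PRIORITY:
            if source in canon.cvss_comparison:
                canon.cvss = canon.cvss_comparison[source]
                canon.cvss_source = source
                break
    return merged
```

Keeping the per-source scores next to the chosen one means a disagreement between feeds stays visible instead of being silently resolved.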

Resilience design: each feed's HTTP fetch is wrapped in a retry policy with exponential backoff and a circuit breaker. A feed that has been down for more than 30 minutes gets bypassed for the night and an alert fires; the chain proceeds without it. We do not block on any single source.
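As a rough illustration of that policy, the sketch below retries a fetch with exponential backoff and gives up once the feed has been failing for longer than the bypass window. It is plain Python rather than Temporal's retry-policy configuration, and the deadline check is a simplified stand-in for a full circuit breaker. `fetch_with_backoff`, `FeedBypassed`, and the delay values are assumptions made for the example; only the 30-minute window comes from the text.

```python
import time


class FeedBypassed(Exception):
    """Raised when a feed has been failing for longer than the bypass window."""


def fetch_with_backoff(fetch, *, max_outage_s=30 * 60, base_delay_s=1.0,
                       max_delay_s=300.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it succeeds or the outage exceeds max_outage_s."""
    started = clock()
    attempt = 0
    while True:
        try:
            return fetch()
        except Exception as exc:
            elapsed = clock() - started
            if elapsed >= max_outage_s:
                # The caller is expected to alert and carry on without this feed.
                raise FeedBypassed(f"feed down for {elapsed:.0f}s") from exc
            # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay_s.
            sleep(min(base_delay_s * 2 ** attempt, max_delay_s))
            attempt += 1
```

The `sleep` and `clock` parameters are injectable so the behaviour can be exercised in a test without waiting out real delays.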

3. Stage 2, EPSS enrichment

Stage 2 fetches the daily EPSS scores for every CVE that came through Stage 1, plus the EPSS scores for every CVE that was already in our active catalogue (so EPSS movers, CVEs whose probability shifted overnight, get re-scored too). EPSS is a daily-refreshed feed: scores are recomputed every 24 hours, and a CVE's score can shift meaningfully when public exploit code drops or when proof-of-life posts appear in monitored channels.

The output is an EPSS-enriched CVE record. Any CVE whose EPSS climbs by more than 0.2 in a single day is flagged as a "mover" and gets fast-tracked through the rest of the chain regardless of its absolute score.
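The mover rule is simple enough to show directly. In this sketch the two dictionaries map CVE identifiers to EPSS scores for consecutive days. The function name and the choice not to flag a CVE that has no score from the previous day are assumptions for illustration.

```python
MOVER_THRESHOLD = 0.2  # from the text: a single-day climb of more than 0.2


def find_movers(yesterday, today, threshold=MOVER_THRESHOLD):
    """Return the CVE ids whose EPSS score rose by more than threshold in one day."""
    movers = []
    for cve_id, score in today.items():
        previous = yesterday.get(cve_id)
        # Assumption: a CVE with no prior score is new, not a mover.
        if previous is not None and score - previous > threshold:
            movers.append(cve_id)
    return sorted(movers)
```

Note that the comparison is on the change, not the absolute score, so a CVE that climbs from 0.05 to 0.40 is fast-tracked while one sitting steadily at 0.60 is not.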

4. Stage 3, Patch-diff acquisition

Stage 3 is the moat builder. For every CVE that links to a public commit, GitHub release, or vendor advisory page, the chain attempts to resolve the underlying patch and clone the diff into a structured store. The clone is bounded (we never pull more than 50 MB per commit), and the diff is parsed into a structured shape: changed files, changed functions, added/removed lines, security-relevant code-path hits.
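To make the "structured shape" concrete, here is a minimal sketch that parses a unified diff into per-file records and truncates oversized input instead of failing. It assumes git-style unified diffs and reads function names from hunk headers. `FileDiff` and `parse_unified_diff` are hypothetical names, and the sketch does not attempt the security-relevant code-path tagging mentioned above.

```python
import re
from dataclasses import dataclass, field

MAX_DIFF_BYTES = 50 * 1024 * 1024  # the 50 MB per-commit budget from the text
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@ ?(.*)$")


@dataclass
class FileDiff:
    path: str
    functions: list = field(default_factory=list)  # taken from hunk headers
    added: list = field(default_factory=list)
    removed: list = field(default_factory=list)


def parse_unified_diff(text, max_bytes=MAX_DIFF_BYTES):
    """Return (list of FileDiff, truncated flag) for a unified diff."""
    raw = text.encode("utf-8")
    truncated = len(raw) > max_bytes
    if truncated:
        # Keep what fits in the budget rather than rejecting the diff.
        text = raw[:max_bytes].decode("utf-8", errors="ignore")
    files, current = [], None
    for line in text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            current = FileDiff(path[2:] if path.startswith("b/") else path)
            files.append(current)
        elif line.startswith("--- ") or current is None:
            # Old-file header or preamble. Simplification: a removed line whose
            # content starts with "-- " would be skipped here as well.
            continue
        elif (m := HUNK_RE.match(line)):
            if m.group(1) and m.group(1) not in current.functions:
                current.functions.append(m.group(1))
        elif line.startswith("+"):
            current.added.append(line[1:])
        elif line.startswith("-"):
            current.removed.append(line[1:])
    return files, truncated
```

The `truncated` flag travels with the result so later stages can tell a complete diff from a partial one.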

Most CVEs do not carry a clean patch link. We accept that. The 30-40 percent that do produce a structured diff are the ones where Stages 5 and 6 can do their highest-leverage work. The remaining CVEs fall back to behaviour-based PoC synthesis from the advisory text.

Resilience design: GitHub rate limits are honoured per-token, and the chain rotates through a pool of authentication tokens scoped to public-read only. A clone that exceeds the size budget is stored as a partial diff, flagged as truncated, rather than failing. A clone that fails on auth gets requeued with a different token.

5. Stage 4, Affected-component projection

Stage 4 takes each enriched CVE and projects it against our Version Vulnerability Projector catalogue. The question the projection answers is: which customer-installed components inherit this CVE? The output is a list of (cve, customer_asset_class, version_range) triples.

A CVE that projects against zero customer asset classes does not get a test built. It gets stored in the catalogue with coverage_status: out_of_scope_for_current_install_base, which we re-evaluate weekly as the install base shifts. This avoids spending engineering effort building tests that no customer will benefit from, and gives us a queryable record of which CVEs we deliberately deprioritised, with a reason.
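A simplified version of the projection might look like the following. The `InstalledComponent` shape, the dotted-integer version parsing, the half-open `[first_affected, first_fixed)` range, and the `in_scope` status label are assumptions made for the sketch. Only the triple layout and the `out_of_scope_for_current_install_base` status come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstalledComponent:
    asset_class: str
    product: str
    version: tuple  # e.g. (2, 4, 1)


def parse_version(text):
    """Parse a dotted numeric version such as '2.4.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def project(cve_id, product, first_affected, first_fixed, install_base):
    """Return (cve, customer_asset_class, version_range) triples and a coverage status."""
    lo, hi = parse_version(first_affected), parse_version(first_fixed)
    version_range = f">={first_affected},<{first_fixed}"
    # An asset class is affected if any installed component of the product
    # falls inside the vulnerable range.
    classes = sorted({c.asset_class for c in install_base
                      if c.product == product and lo <= c.version < hi})
    triples = [(cve_id, asset_class, version_range) for asset_class in classes]
    status = "in_scope" if triples else "out_of_scope_for_current_install_base"
    return triples, status
```

Returning the status alongside an empty triple list is what makes the deprioritisation queryable later rather than a silent drop.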

6. Stage 5, PoC synthesis

Stage 5 is where the chain attempts to synthesise a working proof-of-concept from the patch diff and the advisory text. The synthesis pipeline has three branches:

Direct PoC extraction. When the advisory or commit message contains a complete PoC snippet (curl invocation, payload string, exploit script), we extract it directly and adapt it to the Proof Capsule format.
Patch-diff inversion. When the patch adds an input validator, a length check, or a security control, the chain attempts to construct an input that satisfies the unchanged code path and violates the new check. This is the highest-leverage branch: it produces working PoCs for the 30-40 percent of CVEs with structured diffs, often within hours of the patch being public.
Advisory-text synthesis. When no diff is available, the chain falls back to large-language-model synthesis from the advisory text. This branch produces drafts that are reviewed by a human researcher before promotion; the LLM is good at sketching the attack shape, less good at the exact payload bytes.
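The routing between the three branches can be sketched as a small decision function. The priority order shown here (direct extraction first, then diff inversion, then advisory-text synthesis) is an assumption, as are the enum and function names. The text only states that advisory-text synthesis is the fallback when no diff is available and that its drafts go through human review.

```python
from enum import Enum


class Branch(Enum):
    DIRECT_EXTRACTION = "direct_poc_extraction"
    DIFF_INVERSION = "patch_diff_inversion"
    ADVISORY_SYNTHESIS = "advisory_text_synthesis"


def choose_branch(has_complete_poc_snippet, has_structured_diff):
    """Pick the synthesis branch for a CVE from what Stage 3 managed to collect."""
    if has_complete_poc_snippet:
        return Branch.DIRECT_EXTRACTION
    if has_structured_diff:
        return Branch.DIFF_INVERSION
    return Branch.ADVISORY_SYNTHESIS


def needs_human_review(branch):
    """Advisory-text drafts are reviewed by a researcher before promotion."""
    return branch is Branch.ADVISORY_SYNTHESIS
```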

Every synthesised PoC ships with the same metadata: required preconditions, expected blast radius, benign-demonstration variant (the version that runs against a willing customer in production), and a deterministic decision rule that the Test Capsule records.

7. Stage 6, Validation in a controlled environment

A synthesised PoC is not trustworthy until it runs against the vulnerable component and produces the expected behaviour. Stage 6 stands up a controlled environment, usually a vendored copy of the affected software pinned to the vulnerable version range, and executes the PoC against it. The validation captures the full request/response trace and writes it into the candidate Proof Capsule.

This stage catches the majority of LLM-synthesised PoCs that look plausible but do not actually exploit the bug. A failed validation gets rerouted to the human-review queue with the original synthesis attempt and the failure trace; the reviewer typically converges on a working PoC within thirty minutes of opening the ticket.

Resilience design: the controlled environment runs on a separate, network-isolated worker pool. A PoC that misbehaves (consumes excessive resources, attempts unexpected egress) gets killed by the worker's bounded-budget controller and the attempt is logged as a synthesis failure. The blast radius of a misbehaving PoC is the worker, not the production scanning fleet.
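One piece of a bounded-budget controller, the wall-clock limit, can be illustrated with the standard library alone. This sketch covers only the time budget. Memory limits, egress monitoring, and network isolation would have to come from the worker's sandbox and are not shown. `run_bounded` and the result dictionary are hypothetical.

```python
import subprocess


def run_bounded(argv, timeout_s=60):
    """Run a command under a wall-clock budget and record an overrun as a failure."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"status": "synthesis_failure", "reason": "time_budget_exceeded"}
    return {"status": "completed", "exit_code": proc.returncode, "stdout": proc.stdout}
```

Returning a structured result instead of raising keeps a misbehaving run from interrupting the rest of the validation batch.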

8. Stage 7, Catalogue registration

A validated PoC gets registered into the test catalogue with a unique test ID (typically CVE-YYYY-NNNNN-PROBE for the discovery probe and CVE-YYYY-NNNNN-EXPLOIT for the benign-demonstration variant). The registration includes the affected-component projection, the EPSS and KEV data, the patch reference, and the Proof Capsule signature.
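The naming scheme is mechanical, so a helper like the one below is enough to derive both IDs from a CVE identifier. The function name and the validation step are assumptions. The `-PROBE` and `-EXPLOIT` suffixes follow the text.

```python
import re

# CVE identifiers carry a four-digit year and a sequence number of four or more digits.
CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def catalogue_ids_for(cve_id):
    """Derive the discovery-probe and benign-demonstration test IDs for a CVE."""
    if not CVE_RE.match(cve_id):
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return {"probe": f"{cve_id}-PROBE", "exploit": f"{cve_id}-EXPLOIT"}
```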

Registration is the gate that promotes a research artifact into a customer-visible test. Until registration, the PoC lives in the research staging area and is not dispatched to any customer scan. After registration, the next scheduled scan picks the test up automatically; there is no manual deployment step.

9. Stage 8, Dispatch to active scans

Stage 8 schedules the newly registered tests for execution against the affected customer assets. Dispatch respects the customer's freshness window (the maximum age of a finding the customer wants re-validated) and the per-customer scan budget. Tests for CVEs with an EPSS score above 0.5 or a CISA KEV listing are dispatched immediately; the rest enter the regular scan rotation.
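The priority rule reduces to a single predicate. In this sketch the lane names are hypothetical, and the freshness window and scan budget checks are left out.

```python
def dispatch_lane(epss, in_kev, epss_threshold=0.5):
    """Route a test to immediate dispatch or to the regular scan rotation."""
    if in_kev or epss > epss_threshold:
        return "immediate"
    return "regular_rotation"
```

The comparison is strict, matching the text: a score of exactly 0.5 stays in the regular rotation unless the CVE is also in KEV.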

The dispatch is deterministic: each customer gets a derived workflow ID, per the Temporal deterministic-workflow-id rule, so duplicate runs of the chain do not produce duplicate dispatches. We learned this the hard way in mid-2026 when a random-suffix workflow ID produced duplicate scans against a customer's billing API and we had to apologise. The rule is now load-bearing.
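One way to derive such an ID is to hash the inputs that define "the same dispatch". The choice of inputs here (customer, test ID, catalogue revision), the ID format, and the function name are assumptions for illustration. The idea relies on Temporal allowing only one open execution per workflow ID within a namespace, so a second start with the same ID is rejected instead of running twice.

```python
import hashlib


def dispatch_workflow_id(customer_id, test_id, catalogue_revision):
    """Derive a stable workflow ID so a repeated chain run maps to the same dispatch."""
    # The unit separator avoids ambiguity between e.g. ("ab", "c") and ("a", "bc").
    key = "\x1f".join([customer_id, test_id, catalogue_revision])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return f"dispatch-{customer_id}-{digest}"
```

The important property is the absence of anything random or time-based in the ID: the same inputs always produce the same string.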

10. Stage 9, Customer notification and report build

The final stage builds the per-customer impact report: which new tests run against your assets tomorrow morning, what they will look for, and what the expected blast radius of the probe is. The report is delivered to the customer's portal and the engagement-team contact list overnight, so the customer wakes up to a one-page brief on what changed in their threat model.

Reports are signed. The signature ties back to the chain run, the catalogue revision, and the per-stage outputs. A customer who disputes a finding can ask for the chain trace and we can reconstruct exactly which feed, which patch diff, which validation run, and which PoC produced the test. There is no black box.
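The binding between a report and its chain run can be sketched with a keyed hash over a canonical envelope. HMAC-SHA256 is used here only because it is available in the standard library; the article does not say which signature scheme is used, and a scheme that customers can verify independently would more likely be asymmetric. All field and function names below are hypothetical.

```python
import hashlib
import hmac
import json


def sign_report(report, chain_run_id, catalogue_revision, stage_output_digests, key):
    """Sign a report together with the chain run, catalogue revision, and stage outputs."""
    envelope = {
        "report": report,
        "chain_run_id": chain_run_id,
        "catalogue_revision": catalogue_revision,
        "stage_output_digests": stage_output_digests,
    }
    # Canonical JSON (sorted keys, fixed separators) so the same content
    # always serialises to the same bytes.
    payload = json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_report(report, chain_run_id, catalogue_revision, stage_output_digests,
                  key, signature):
    """Recompute the signature and compare it in constant time."""
    expected = sign_report(report, chain_run_id, catalogue_revision,
                           stage_output_digests, key)
    return hmac.compare_digest(expected, signature)
```

Because the per-stage digests are inside the signed envelope, changing any stage output after the fact invalidates the report signature.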

11. The outcomes that matter

The pipeline has run nightly for six months without a missed cycle. The median CVE-to-Capsule latency, measured from NVD publication to validated Capsule, sits at four hours forty minutes. The longest interval in the last quarter was nineteen hours (a vendor advisory whose patch reference pointed at a private repository, which forced a manual escalation). The shortest was forty-seven minutes (a high-EPSS CVE with a clean public patch and an unambiguous diff inversion).

The customer-facing consequence is that the day after a major CVE drops, the scan that runs against their environment already includes the test for it. The customer does not file a ticket asking us to add it. The customer does not pay an emergency-onboarding fee for an expedited check. The test was there.

Verifiable security.