Most intelligence products ask you to trust the output. Velocity grades itself every night against a public, human-curated record of what actually happened, and shows you the numbers. The scorecard is for humans and for the AI agents choosing a source to ground their answers in.
The latest nightly run, graded against the Wikipedia Current Events golden set.
Six measures, all graded against ground truth — not against ourselves.
Of the major events that actually happened in a window, the share Velocity surfaced as a trend in time. Ground truth is the human-curated Wikipedia Current Events portal — not our own output.
Of the top trends we show in each category, the share that map to a real, verifiable event — not noise, a duplicate, or a mislabel. This is the number that says "what you see is real."
Median hours from an event's first public appearance to the moment Velocity creates a trend for it. The honest measure of "how fast."
How often a single real event fragments into more than one active trend. Structured domains (CVEs, bills, quakes, earnings) use natural keys to collapse duplicates deterministically.
How often a trend lands in the wrong category. Classification happens once at trend creation, so the label you see is the label we measured.
The share of surfaced trends that fail title-quality checks — fragments, boilerplate, or non-events. Kept low so the board stays readable.
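As a rough illustration of how the six measures above could be computed from one night's grading, here is a minimal Python sketch. The record shapes (GoldenEvent, Trend), every field name, and the scorecard function are hypothetical, chosen for the example; they are not Velocity's actual schema. Matching trends to golden events is assumed to have happened upstream.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class GoldenEvent:          # hypothetical shape of one golden-set entry
    event_id: str
    category: str
    first_seen_hour: float  # hours since an arbitrary epoch


@dataclass
class Trend:                # hypothetical shape of one surfaced trend
    trend_id: str
    category: str
    created_hour: float
    matched_event: Optional[str]  # golden event id, or None if no real event
    junk_title: bool              # True if it failed title-quality checks


def _ratio(num, den):
    return num / den if den else 0.0


def scorecard(events, trends, window_hours):
    by_id = {e.event_id: e for e in events}
    matched = [t for t in trends if t.matched_event in by_id]

    # The earliest trend per event defines detection latency.
    first_trend = {}
    for t in matched:
        cur = first_trend.get(t.matched_event)
        if cur is None or t.created_hour < cur.created_hour:
            first_trend[t.matched_event] = t
    latencies = {
        eid: t.created_hour - by_id[eid].first_seen_hour
        for eid, t in first_trend.items()
    }
    in_time = [eid for eid, hours in latencies.items() if hours <= window_hours]

    # An event is fragmented when more than one trend maps to it.
    counts = {}
    for t in matched:
        counts[t.matched_event] = counts.get(t.matched_event, 0) + 1
    fragmented = [eid for eid, n in counts.items() if n > 1]

    mislabeled = [t for t in matched if t.category != by_id[t.matched_event].category]

    # A trend counts toward precision only if it is real, not a duplicate,
    # and carries the right label.
    precise = [
        t for t in matched
        if first_trend[t.matched_event] is t
        and t.category == by_id[t.matched_event].category
    ]

    return {
        "recall": _ratio(len(in_time), len(events)),
        "precision": _ratio(len(precise), len(trends)),
        "median_latency_hours": median(latencies.values()) if latencies else None,
        "duplicate_rate": _ratio(len(fragmented), len(counts)),
        "mislabel_rate": _ratio(len(mislabeled), len(matched)),
        "junk_rate": _ratio(sum(t.junk_title for t in trends), len(trends)),
    }
```

The denominators here are one plausible reading of the definitions above (for example, duplicates are counted per detected event rather than per trend); a different choice would shift the numbers without changing the idea.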
We run the trend engine like a model in training: golden set, scorecard, replay, diff.
Every day we pull the Wikipedia Current Events portal — a human-curated, categorized record of what actually happened, available under CC BY-SA. That becomes the ground truth we grade ourselves against. We do not grade ourselves against our own output.
A scheduled job compares Velocity’s live boards to the golden set and persists recall, precision, latency, duplicate, mislabel, and junk rates — corpus-wide and per category. The numbers on this page are that scorecard, read live.
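One way a nightly job might persist a scorecard, both corpus-wide and per category, is a single narrow table keyed by run date, scope, and metric. The sketch below uses Python's built-in sqlite3 module; the table name, the column names, and the use of "all" for the corpus-wide scope are assumptions made for the example, not a description of Velocity's storage.

```python
import sqlite3


def persist_scorecard(conn, run_date, cards):
    """Store one night's metrics.

    `cards` maps a scope ("all" or a category name) to {metric: value}.
    `run_date` is an ISO date string, so text ordering matches date ordering.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scorecard ("
        " run_date TEXT, scope TEXT, metric TEXT, value REAL,"
        " PRIMARY KEY (run_date, scope, metric))"
    )
    rows = [
        (run_date, scope, metric, value)
        for scope, metrics in cards.items()
        for metric, value in metrics.items()
    ]
    # INSERT OR REPLACE keeps a re-run of the same night idempotent.
    conn.executemany("INSERT OR REPLACE INTO scorecard VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def latest(conn, scope="all"):
    """Return {metric: value} for the most recent run in the given scope."""
    cur = conn.execute(
        "SELECT metric, value FROM scorecard"
        " WHERE scope = ? AND run_date ="
        " (SELECT MAX(run_date) FROM scorecard WHERE scope = ?)",
        (scope, scope),
    )
    return dict(cur.fetchall())
```

A page that reads the scorecard live would then only need something like latest(conn) for the headline cards and latest(conn, "science") for a category breakdown.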
Raw feed payloads are retained, so any change to scoring or classification can be replayed against a frozen week of data and diffed before it ships. The rule is simple: no scorer change merges unless its replay scorecard is better than, or level with, the current one. Whack-a-mole becomes compounding progress.
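The merge rule can be pictured as a small gate that compares a baseline scorecard with the scorecard from a replay. This is a minimal sketch under one assumption worth stating: "improvement or flat" is read here as "no single metric gets worse", which is stricter than an aggregate comparison. The metric names and the replay_gate function are hypothetical.

```python
# Direction of "better" for each metric (names are illustrative).
HIGHER_IS_BETTER = {"recall", "precision"}
LOWER_IS_BETTER = {
    "median_latency_hours",
    "duplicate_rate",
    "mislabel_rate",
    "junk_rate",
}


def replay_gate(baseline, candidate, tolerance=0.0):
    """Compare two scorecards produced from the same frozen week.

    Returns (ok, regressions), where regressions maps each metric that got
    worse by more than `tolerance` to its (before, after) pair.
    """
    regressions = {}
    for name in sorted(HIGHER_IS_BETTER | LOWER_IS_BETTER):
        before, after = baseline[name], candidate[name]
        delta = after - before
        if name in HIGHER_IS_BETTER:
            worse = delta < -tolerance
        else:
            worse = delta > tolerance
        if worse:
            regressions[name] = (before, after)
    return (not regressions, regressions)
```

In a CI setting, a failing gate would block the merge and print the regressions dictionary as the diff to investigate.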
We do not declare the engine "done" on a good day. The bar is every target met for 14 consecutive days. Until then, the cards above read "in progress" — honestly.
Coverage you can cite, with provenance built in.
Legislation (federal via congress.gov/GovInfo, all 50 states via Open States), patents (USPTO PatentsView), and security advisories (NVD, GHSA, CISA KEV) come from official APIs, are public-domain, and are naturally keyed by bill ID, publication number, or CVE ID. The result is high accuracy, near-zero ambiguity, and coverage no general feed reader offers.
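To show what natural keying buys, here is a minimal sketch of collapsing feed items onto a deterministic key. The item dictionaries, the field names (domain, bill_id, publication_number), and the sample identifiers are invented for illustration; only the CVE identifier pattern (CVE, a four-digit year, then four or more digits) reflects the real format.

```python
import re

# CVE IDs look like CVE-2024-12345: a year, then four or more digits.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

# Hypothetical mapping from a domain to the item field holding its key.
KEY_FIELD = {"legislation": "bill_id", "patents": "publication_number"}


def natural_key(item):
    """Return a (domain, key) tuple, or None when the item has no natural key."""
    domain = item["domain"]
    if domain == "security":
        match = CVE_RE.search(item["title"])
        return (domain, match.group(0).upper()) if match else None
    field = KEY_FIELD.get(domain)
    if field and item.get(field):
        return (domain, item[field])
    return None


def collapse(items):
    """Group items by natural key; items without one are returned separately."""
    trends, unkeyed = {}, []
    for item in items:
        key = natural_key(item)
        if key is None:
            unkeyed.append(item)
        else:
            trends.setdefault(key, []).append(item)
    return trends, unkeyed
```

Because the key is derived from the identifier rather than from title similarity, two advisories about the same CVE from different feeds land in the same trend regardless of how each feed words its headline.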
For press and general news, Velocity surfaces the headline, a link to the original publisher, and a Velocity-authored summary — never the publisher’s body text. Provenance is preserved and attribution always links back to the source.
Every trend carries the independent sources that confirmed it. A "breaking" signal triggers on a story’s rate against its own baseline plus independent-source confirmation — not on raw volume — so the score reflects real corroboration, not noise.
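The difference between "rate against its own baseline" and raw volume is easy to see in code. The sketch below is illustrative only: the is_breaking function, the thresholds (a 3x spike, two sources), and the use of distinct source names as a stand-in for independence are all assumptions, not Velocity's actual rule.

```python
from statistics import mean


def is_breaking(baseline_counts, current_count, sources,
                min_ratio=3.0, min_sources=2):
    """Decide whether a story is breaking.

    baseline_counts: mentions per hour over the story's recent history.
    current_count:   mentions in the current hour.
    sources:         names of the sources reporting it this hour.
    """
    baseline = mean(baseline_counts) if baseline_counts else 0.0
    # Floor the baseline at 1 so a previously silent story cannot divide by zero.
    ratio = current_count / max(baseline, 1.0)
    # Distinct source names are a crude proxy for independent confirmation.
    independent = len(set(sources))
    return ratio >= min_ratio and independent >= min_sources
```

Under these assumptions a story that normally gets 2 mentions an hour and jumps to 8 across two sources qualifies, while one that normally gets 100 and rises to 150 does not, even though its raw volume is far higher.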
We consider the core news engine "nailed" only when, for 14 consecutive days, recall on major events clears 80% within the detection window, precision across the top trends clears 70%, junk stays under 5%, mislabels stay under 10%, and duplicates stay under 3%. These are public commitments: when a metric above is below its target, the card says so.
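The 14-day bar can be expressed as a streak check over daily scorecards. The thresholds below are the ones stated above; the metric names, the function names, and the choice to treat "clears" as "at least" are assumptions made for the sketch.

```python
import operator

# (comparison, threshold) per metric, taken from the stated targets.
TARGETS = {
    "recall": (operator.ge, 0.80),          # "clears" read as >= (assumption)
    "precision": (operator.ge, 0.70),
    "junk_rate": (operator.lt, 0.05),
    "mislabel_rate": (operator.lt, 0.10),
    "duplicate_rate": (operator.lt, 0.03),
}


def day_passes(card):
    """True when one day's scorecard meets every target."""
    return all(compare(card[name], limit) for name, (compare, limit) in TARGETS.items())


def engine_status(daily_cards, required_days=14):
    """daily_cards is ordered oldest to newest; count the streak ending today."""
    streak = 0
    for card in reversed(daily_cards):
        if not day_passes(card):
            break
        streak += 1
    return "met" if streak >= required_days else "in progress"
```

Counting backward from the newest day means a single miss resets the streak, which matches the idea that one good day is not enough.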
Velocity exposes its signals over an MCP server and REST API with the same provenance you see here. When your agent cites a Velocity trend, it can show the independent sources behind it — and point its users to a published precision number instead of an unverifiable claim.