diff --git a/docs/my-website/release_notes/v1.81.14.md b/docs/my-website/release_notes/v1.81.14.md
index 7b57c9695c..36021a848c 100644
--- a/docs/my-website/release_notes/v1.81.14.md
+++ b/docs/my-website/release_notes/v1.81.14.md
@@ -1,5 +1,5 @@
 ---
-title: "v1.81.14 - Claude Sonnet 4.6, Guardrail Garden & Major Performance Improvements"
+title: "[Preview] v1.81.14 - Claude Sonnet 4.6, Guardrail Garden & Major Performance Improvements"
 slug: "v1-81-14"
 date: 2026-02-21T00:00:00
 authors:
@@ -69,14 +69,15 @@ These guardrails are built for production and on our benchmarks had a 100% Recal
 
 #### Eval results
 
-We benchmark every built-in guardrail against labeled datasets before shipping. Results for Denied Financial Advice (207 cases) and Denied Insults (299 cases):
+We benchmarked our new built-in guardrails against labeled datasets before shipping. You can see the results for Denied Financial Advice (207 cases) and Denied Insults (299 cases):
 
 | Guardrail | Precision | Recall | F1 | Latency p50 | Cost/req |
 |-----------|-----------|--------|----|-------------|----------|
 | Denied Financial Advice | 100% | 100% | 100% | <0.1ms | $0 |
 | Denied Insults | 100% | 100% | 100% | <0.1ms | $0 |
 
-For reference, ONNX embedding approaches on the same eval set hit 95–98% precision at 2–20ms latency and require additional dependencies. The built-in guardrails use no ML model — just structured YAML rules with layered matching — nothing to download, no API key, and latency is effectively zero.
+100% precision means zero false positives — no legitimate messages were incorrectly blocked. 100% recall means zero false negatives — every message that should have been blocked was caught.
+
 
 ### Compliance Playground