Ishaan Jaffer
2026-02-21 17:51:45 -08:00
parent 522954fe0d
commit 356eb5a413
+4 -3
@@ -1,5 +1,5 @@
---
title: "v1.81.14 - Claude Sonnet 4.6, Guardrail Garden & Major Performance Improvements"
title: "[Preview] v1.81.14 - Claude Sonnet 4.6, Guardrail Garden & Major Performance Improvements"
slug: "v1-81-14"
date: 2026-02-21T00:00:00
authors:
@@ -69,14 +69,15 @@ These guardrails are built for production and on our benchmarks had a 100% Recal
#### Eval results
We benchmark every built-in guardrail against labeled datasets before shipping. Results for Denied Financial Advice (207 cases) and Denied Insults (299 cases):
We benchmarked our new built-in guardrails against labeled datasets before shipping. You can see the results for Denied Financial Advice (207 cases) and Denied Insults (299 cases):
| Guardrail | Precision | Recall | F1 | Latency p50 | Cost/req |
|-----------|-----------|--------|----|-------------|----------|
| Denied Financial Advice | 100% | 100% | 100% | <0.1ms | $0 |
| Denied Insults | 100% | 100% | 100% | <0.1ms | $0 |
For reference, ONNX embedding approaches on the same eval set hit 95-98% precision at 220ms latency and require additional dependencies. The built-in guardrails use no ML model, just structured YAML rules with layered matching: nothing to download, no API key, and latency is effectively zero.
100% precision means zero false positives — no legitimate messages were incorrectly blocked. 100% recall means zero false negatives — every message that should have been blocked was caught.
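The definitions above can be checked with a few lines of arithmetic. This is a minimal sketch, not the benchmark harness used for these results: the helper name `precision_recall_f1` and the example counts are hypothetical, chosen only to show how the three metrics follow from true positives, false positives, and false negatives.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts.

    tp: messages correctly blocked
    fp: legitimate messages incorrectly blocked
    fn: messages that should have been blocked but were allowed
    """
    # Precision: of everything blocked, how much deserved to be blocked.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that deserved blocking, how much was caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (not taken from the eval set):
# zero false positives and zero false negatives give 100% on all three metrics.
perfect = precision_recall_f1(tp=100, fp=0, fn=0)

# Two false positives lower precision while recall stays at 100%.
with_false_positives = precision_recall_f1(tp=8, fp=2, fn=0)
```

With zero false positives and zero false negatives, all three metrics come out to 1.0, which is what the 100% figures in the table correspond to.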
### Compliance Playground