Deterministic Sampling¶
The SDK uses a deterministic SHA-256 gate to decide which spans qualify for downstream vector embedding generation and prompt hashing. This minimizes CPU and memory consumption while ensuring statistical representation.
Sampling Gate Logic¶
Span ID (UUID / String / Bytes)
│
▼
[SHA-256 Hash]
│
▼
[Convert to Integer]
│
▼
[Modulo 100] ───► Result (0 - 99)
│
▼
( == 0 ? )
┌──────────┴──────────┐
▼ ▼
( Yes ) ( No )
│ │
▼ ▼
is_sampled=True is_sampled=False
[Full Metrics, [Full Metrics,
Embeddings, No Embeddings,
Prompt Hashes] No Prompt Hashes]
- Roughly 1% of spans are sampled.
- The outcome is deterministic — the same
span_idwill always yield the identical sampling state. - Unsampled spans are still reported to the dashboards; they simply skip vector/hash compute cycles.
Feature Matrix by Sampling State¶
| Operation / Field Captured | Sampled (True) | Unsampled (False) |
|---|---|---|
Latency Tracking (latency_ms_total) | ✅ Captured | ✅ Captured |
Token Tracking (prompt_tokens, etc.) | ✅ Captured | ✅ Captured |
| USD Cost Calculation | ✅ Calculated | ✅ Calculated |
| PII & Injection Scan | ✅ Scanned | ✅ Scanned |
Prompt SHA-256 Hash (prompt_hash) | ✅ Computed | ❌ None |
384-dim Embeddings (prompt_embedding) | ✅ Generated | ❌ Skipped |
Programmatic Usage¶
Check if a given ID would pass the sampling gate:
from instrumentation_sdk import should_sample
import uuid
# Check a newly generated UUID
span_id = uuid.uuid4()
sampled = should_sample(span_id)
print(f"Span {span_id} -> sampled: {sampled}")
should_sample accepts various formats:
| Format | Example |
|---|---|
uuid.UUID | uuid.uuid4() |
str (UUID format) | "550e8400-e29b-41d4-a716-446655440000" |
str (generic) | "user-session-1234" |
bytes | b"\x12\x34..." |
Behavior Inside Spans¶
Sampling is run automatically when initializing a span. You can read the state directly from the span properties:
async with llm_span(model="gpt-4o", provider="openai") as span:
print(span._data["is_sampled"]) # True or False
PII Scan Precedence
If PII is detected, the SDK immediately redacts the prompt text and clears the hashes/embeddings, even if the span was selected by the sampling gate.
Verification via REST API¶
Check the gate status for any ID using curl:
curl -X POST http://localhost:8002/v1/sampling/should-sample \
-H "Content-Type: application/json" \
-d '{"span_id": "00000000-0000-0000-0000-000000000000"}'
Response:
Next Steps¶
- MiniLM Embeddings - Learn how sampled prompts are vector-mapped.
- Prometheus Metrics & Grafana - Visualizing sampled metric rates.