REST Management API

Full reference for all HTTP endpoints exposed by the observability container. By default, the API is served at http://localhost:8002/v1.

Endpoint Feature Map

FastAPI REST API (localhost:8002)
│
├── /instrumentation
│   ├── POST /init         → Enable monkey-patching
│   ├── POST /uninstrument → Remove active patches
│   ├── POST /detect       → Discover provider/model
│   └── POST /test-call    → Verify trace output
│
├── /token-counting
│   └── POST /count        → Local token evaluation
│
├── /streaming
│   └── POST /test-stream-call → Server-Sent Events test
│
├── /pii-injection
│   └── POST /scan         → Aho-Corasick & regex match
│
├── /sampling
│   └── POST /should-sample → Evaluate modulo gate
│
├── /embeddings
│   └── POST /embed        → Vector conversion
│
└── /metrics
    ├── POST /init         → Start scrape endpoint
    ├── GET  /health       → Check metrics status
    ├── POST /record       → Log single span
    └── POST /record-batch → Log bulk spans

1. Instrumentation Management

`POST /instrumentation/init`

Enable auto-instrumentation globally in the application runtime.

- Request Body: None
- Response (application/json):

{"success": true, "message": "Auto-instrumentation initialized"}

`POST /instrumentation/uninstrument`

Remove all active auto-instrumentation monkey-patches.

- Request Body: None
- Response (application/json):

{"success": true, "message": "All instrumentation disabled"}

`POST /instrumentation/detect`

Parse a sample request, given as its target URL and raw body, to discover the provider name and model.

- Request Body:

{
  "url": "https://api.openai.com/v1/chat/completions",
  "body": "{\"model\": \"gpt-4o\"}"
}

- Response:

{"provider": "openai", "model": "gpt-4o"}
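Note that `body` in the request above is itself a JSON document serialized to a string, so it is encoded twice by the time it is sent. A minimal sketch of building that payload in Python follows; the helper name `build_detect_payload` is hypothetical and not part of the API.

```python
import json


def build_detect_payload(url: str, upstream_body: dict) -> str:
    """Return the JSON text for POST /instrumentation/detect.

    Hypothetical helper: the upstream request body is serialized first,
    then embedded as a plain string inside the outer JSON object.
    """
    inner = json.dumps(upstream_body)  # inner JSON document, as a string
    return json.dumps({"url": url, "body": inner})


payload = build_detect_payload(
    "https://api.openai.com/v1/chat/completions",
    {"model": "gpt-4o"},
)
```

Decoding `payload` once yields an object whose `body` is still a string; decoding that string a second time yields the original upstream body.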

`POST /instrumentation/test-call`

Trigger an outbound test call to verify that metrics and traces are emitted.

- Request Body:

{
  "method": "httpx",
  "provider": "openai"
}

Allowed method values: httpx, requests, sdk. Allowed provider values: openai, anthropic.

- Response:

{"success": true, "message": "Test call triggered via httpx for openai"}
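As a rough client-side sketch, the request above can be prepared with the Python standard library. The helper `build_test_call` and the constant names are hypothetical; the base URL and the allowed values are the ones stated on this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8002/v1"  # default address given on this page
ALLOWED_METHODS = {"httpx", "requests", "sdk"}
ALLOWED_PROVIDERS = {"openai", "anthropic"}


def build_test_call(method: str, provider: str) -> urllib.request.Request:
    """Build (but do not send) a POST /instrumentation/test-call request."""
    # Reject values the endpoint is documented not to accept.
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    data = json.dumps({"method": method, "provider": provider}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/instrumentation/test-call",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_test_call("httpx", "openai")
# To send it against a running container:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```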

2. Utility Engine

`POST /token-counting/count`

Count prompt tokens locally without contacting the LLM provider.

- Request Body (Plain String):

{
  "prompt": "Hello, how are you?",
  "model": "gpt-4o"
}

- Request Body (Chat Messages):

{
  "prompt": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."}
  ],
  "model": "gpt-4o"
}

- Response:

{"tokens": 20, "method": "tiktoken"}
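Because `prompt` accepts either a plain string or a list of chat messages, a client may want to check the shape before sending. The following is a small sketch under that assumption; `build_count_payload` is a hypothetical helper, and the validation rules are inferred from the two examples above rather than from a published schema.

```python
from typing import Dict, List, Union

Prompt = Union[str, List[Dict[str, str]]]


def build_count_payload(prompt: Prompt, model: str) -> dict:
    """Build the body for POST /token-counting/count (hypothetical helper)."""
    if isinstance(prompt, list):
        # Chat form: every message needs a role and a content field.
        for message in prompt:
            if not {"role", "content"} <= set(message):
                raise ValueError(f"message missing role/content: {message!r}")
    elif not isinstance(prompt, str):
        raise TypeError("prompt must be a string or a list of messages")
    return {"prompt": prompt, "model": model}


plain = build_count_payload("Hello, how are you?", "gpt-4o")
chat = build_count_payload(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this document."},
    ],
    "gpt-4o",
)
```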

`POST /pii-injection/scan`

Scan input text for PII and injection attempts.

- Request Body:

{"prompt": "My email is test@example.com and DROP TABLE users;"}

- Response:

{
  "pii_detected": true,
  "injection_attempt": true
}
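To illustrate the kind of check this endpoint performs, here is a deliberately simplified sketch. It is not the service's implementation: the feature map lists Aho-Corasick and regex matching, whereas this version swaps in a plain substring scan for the keyword part, and both the email pattern and the keyword list are assumptions chosen for the example.

```python
import re

# Assumed pattern: a loose email matcher, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed keyword list; the real patterns come from the service's config.
INJECTION_KEYWORDS = ("drop table", "ignore previous instructions")


def scan(prompt: str) -> dict:
    """Return flags shaped like the /pii-injection/scan response."""
    lowered = prompt.lower()
    return {
        "pii_detected": EMAIL_RE.search(prompt) is not None,
        # Plain substring scan standing in for Aho-Corasick matching.
        "injection_attempt": any(k in lowered for k in INJECTION_KEYWORDS),
    }


result = scan("My email is test@example.com and DROP TABLE users;")
```

An automaton such as Aho-Corasick matters once the keyword list grows large, since it matches all keywords in a single pass over the text.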

`POST /embeddings/embed`

Convert text into a 384-dimensional MiniLM-L6-v2 vector embedding.

- Request Body:

{"text": "Explain transformers."}

- Response:

{
  "embedding": [0.021, -0.103, 0.044, "...381 more floats..."]
}

`POST /sampling/should-sample`

Check whether a span ID passes the deterministic 1% sampling gate (modulo 100).

- Request Body:

{"span_id": "550e8400-e29b-41d4-a716-446655440000"}

- Response:

{"is_sampled": false}
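A deterministic modulo gate of this kind can be sketched as follows. How the service maps a span ID to an integer is not documented here, so this version assumes the span ID is a UUID and uses its 128-bit integer value; the function and constant names are hypothetical.

```python
import uuid

SAMPLE_MODULUS = 100   # bucket count
SAMPLE_THRESHOLD = 1   # one bucket out of 100 passes, i.e. 1%


def should_sample(span_id: str) -> bool:
    """Return True when the span falls into the sampled bucket.

    Assumption: the span ID is a UUID string and its integer value is
    reduced modulo 100. The same ID always gives the same answer.
    """
    value = uuid.UUID(span_id).int
    return value % SAMPLE_MODULUS < SAMPLE_THRESHOLD


decision = should_sample("550e8400-e29b-41d4-a716-446655440000")
```

Because the decision depends only on the ID, every component that sees the same span reaches the same verdict without coordination.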

3. Prometheus Metrics & Records

`POST /metrics/init`

Initialize the local Prometheus scrape endpoint on the given port.

- Request Body:

{"port": 9464}

- Response:

{"initialized": true, "message": "Metrics pipeline initialized"}

`POST /metrics/record`

Record the attributes of a single span to generate metrics and compute its cost in USD.

- Request Body:

{
  "model": "gpt-4o",
  "provider": "openai",
  "service_name": "chat-api",
  "prompt_tokens": 120,
  "completion_tokens": 60,
  "latency_ms_total": 380,
  "latency_ms_ttft": 90,
  "finish_reason": "stop",
  "status": "success",
  "pii_detected": false,
  "injection_attempt": false,
  "retry_count": 0
}

- Response:

{
  "recorded": true,
  "cost_usd_micro": 1500,
  "price_version": "2025-01-15"
}
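The cost figure in the sample response can be reproduced with simple arithmetic, assuming `cost_usd_micro` means millionths of a US dollar. The prices below (5 USD per million prompt tokens, 15 USD per million completion tokens) are assumptions picked because they match the sample; the actual values come from the service's price configuration.

```python
def cost_usd_micro(
    prompt_tokens: int,
    completion_tokens: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
) -> int:
    """Return the span cost in micro-USD (hypothetical helper).

    A price of N USD per million tokens equals N micro-USD per token,
    so no further unit conversion is needed.
    """
    total = (
        prompt_tokens * input_usd_per_million
        + completion_tokens * output_usd_per_million
    )
    return round(total)


# Assumed prices: 5 USD / 1M prompt tokens, 15 USD / 1M completion tokens.
cost = cost_usd_micro(120, 60, 5, 15)  # 120*5 + 60*15 = 1500
```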

Next Steps

Docker & CLI Deployment - Run and expose the API container.
Config Files Reference - Learn how prices and regex patterns are configured.
