Version: 0.1.0

Router Access Log Fields Reference

This document provides a comprehensive reference for all fields available in Kthena Router access logs.

Overview

Kthena Router generates structured access logs for all AI inference requests. These logs provide detailed information about request processing, including timing breakdowns, routing decisions, token usage, and error information.

Log Format

Access logs are available in two formats:

  • JSON: Structured JSON format suitable for log aggregation and analysis
  • Text: Human-readable format for development and debugging

Text Format Structure

The text format follows this structure:

[timestamp] "METHOD /path PROTOCOL" status_code [error=type:message] model_name=name model_route=route model_server=server selected_pod=pod request_id=id tokens=input/output timings=totalms(req+upstream+resp)

Key features of the text format:

  • Error placement: Error information appears immediately after the status code when present
  • Timing format: Shows total time with breakdown in parentheses: timings=2350ms(45+2180+5)
  • Compact representation: All information on a single line for easy parsing
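
Because every entry sits on one line, a single regular expression is usually enough to split it into fields. The following is a minimal sketch, not the router's own parser: the pattern and the helper name parse_access_log_line are assumptions written from the structure described above, using the timings=2350ms(45+2180+5) form.

```python
import re

# Hypothetical pattern written from the documented text layout; it is not
# taken from the router's source code.
ACCESS_LOG_RE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status_code>\d{3})'
    # Optional error block. The message may contain spaces, so it is matched
    # lazily up to the " model_name=" key that follows it.
    r'(?: error=(?P<error_type>[^:\s]+):(?P<error_message>.*?))?'
    r' model_name=(?P<model_name>\S+)'
    r' model_route=(?P<model_route>\S+)'
    r' model_server=(?P<model_server>\S+)'
    r' selected_pod=(?P<selected_pod>\S+)'
    r' request_id=(?P<request_id>\S+)'
    r' tokens=(?P<input_tokens>\d+)/(?P<output_tokens>\d+)'
    r' timings=(?P<duration_total>\d+)ms'
    r'\((?P<duration_request_processing>\d+)'
    r'\+(?P<duration_upstream_processing>\d+)'
    r'\+(?P<duration_response_processing>\d+)\)$'
)

INT_FIELDS = (
    "status_code",
    "input_tokens",
    "output_tokens",
    "duration_total",
    "duration_request_processing",
    "duration_upstream_processing",
    "duration_response_processing",
)


def parse_access_log_line(line):
    """Return a dict of fields for one text-format line, or None if it does not match."""
    match = ACCESS_LOG_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()  # error_type / error_message are None when absent
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


sample = (
    '[2024-01-15T10:35:22.456Z] "POST /v1/chat/completions HTTP/1.1" 504 '
    'error=timeout:Model inference timeout after 30s '
    'model_name=llama2-7b model_route=default/llama2-route-v1 '
    'model_server=default/llama2-server '
    'selected_pod=llama2-deployment-5f7b8c9d-xk2p4 '
    'request_id=660e8400-e29b-41d4-a716-446655440001 '
    'tokens=200/0 timings=30050ms(50+30000+0)'
)
record = parse_access_log_line(sample)
print(record["error_type"], "|", record["error_message"], "|", record["duration_total"])
```

The sketch assumes the fields always appear in the documented order and that an error message never contains the literal text " model_name=". For anything beyond ad hoc debugging, the JSON format avoids both assumptions.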

Field Reference

Standard HTTP Fields

These fields follow the Envoy access log format for compatibility with existing log processing tools.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| timestamp | string (RFC3339) | ISO 8601 timestamp when the request was received | 2024-01-15T10:30:45.123Z |
| method | string | HTTP method used for the request | POST, GET |
| path | string | Request path including query parameters | /v1/chat/completions |
| protocol | string | HTTP protocol version | HTTP/1.1, HTTP/2 |
| status_code | integer | HTTP response status code | 200, 400, 500 |

Error Information

Error information is included only when a request fails. In the text format it appears immediately after the status code; in the JSON format it is a nested error object.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| error.type | string | Error category or type | timeout, rate_limit, model_not_found |
| error.message | string | Detailed error message | Model inference timeout after 30s |

Common Error Types

| Error Type | Description | Typical Status Code |
| --- | --- | --- |
| timeout | Request exceeded configured timeout | 504 |
| rate_limit | Request was rate limited | 429 |
| model_not_found | Requested model is not available | 404 |
| authentication_failed | Authentication credentials invalid | 401 |
| authorization_failed | User lacks required permissions | 403 |
| upstream_error | Error from model inference backend | 502, 503 |
| invalid_request | Malformed request body or parameters | 400 |

AI-Specific Routing Information

These fields provide information about how the request was routed through the AI router.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| model_name | string | Name of the AI model requested | llama2-7b, gpt-3.5-turbo |
| model_route | string | Name of the ModelRoute resource used | default/llama2-route-v1 |
| model_server | string | ModelServer that handled the request | default/llama2-server |
| selected_pod | string | Specific pod that processed the inference | llama2-deployment-5f7b8c9d-xk2p4 |
| request_id | string | Unique identifier for request tracing | 550e8400-e29b-41d4-a716-446655440000 |

Token Information

Token usage metrics for the inference request.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| input_tokens | integer | Number of tokens in the request prompt | 150 |
| output_tokens | integer | Number of tokens generated in response | 75 |
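
Token counts become more useful when combined with the timing fields. As a rough sketch, the hypothetical helper below (not part of Kthena) estimates generation throughput from output_tokens and duration_upstream_processing. It assumes the record is a JSON-format log entry already loaded as a dict, and it treats the whole upstream duration as generation time, which overstates it whenever prompt processing is significant.

```python
def output_tokens_per_second(record):
    """Estimate generated tokens per second for one access log record.

    Returns None when the upstream duration is zero or missing, for example
    when the request failed before reaching a backend pod.
    """
    upstream_ms = record.get("duration_upstream_processing", 0)
    if upstream_ms <= 0:
        return None
    # Durations are logged in milliseconds, so scale to seconds.
    return record["output_tokens"] * 1000.0 / upstream_ms


example = {"input_tokens": 150, "output_tokens": 75, "duration_upstream_processing": 2180}
print(round(output_tokens_per_second(example), 2))
```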

Timing Breakdown

All timing values are integers expressed in milliseconds.

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| duration_total | integer | Total end-to-end request processing time (ms) | 2350 |
| duration_request_processing | integer | Router request processing overhead (ms) | 45 |
| duration_upstream_processing | integer | Model inference time on backend pod (ms) | 2180 |
| duration_response_processing | integer | Response processing and serialization time (ms) | 5 |

Timing Phases

  1. Request Processing: Time spent parsing the request, authentication, rate limiting, and routing decisions
  2. Upstream Processing: Time spent on actual model inference in the backend pod
  3. Response Processing: Time spent formatting and serializing the response
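
To see where a request spent its time, the three phases can be compared against duration_total. The sketch below is illustrative only; timing_breakdown is a hypothetical helper name. This reference does not state that the phases must add up to the total: in the successful example used throughout this document they sum to 2230 ms against a total of 2350 ms. The sketch therefore reports the difference instead of assuming it is zero.

```python
PHASES = (
    "duration_request_processing",
    "duration_upstream_processing",
    "duration_response_processing",
)


def timing_breakdown(record):
    """Return each phase's share of the total plus any time not covered by a phase."""
    total = record["duration_total"]
    phase_sum = sum(record[name] for name in PHASES)
    # Guard against a zero total so the shares are always defined.
    shares = {name: (record[name] / total if total else 0.0) for name in PHASES}
    return {"unaccounted_ms": total - phase_sum, "shares": shares}


example = {
    "duration_total": 2350,
    "duration_request_processing": 45,
    "duration_upstream_processing": 2180,
    "duration_response_processing": 5,
}
result = timing_breakdown(example)
print(result["unaccounted_ms"])
print(round(result["shares"]["duration_upstream_processing"], 3))
```

A large upstream share points at the model backend, while a growing request-processing share points at the router itself (authentication, rate limiting, routing).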

Example Access Logs

Successful Request (JSON Format)

{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "method": "POST",
  "path": "/v1/chat/completions",
  "protocol": "HTTP/1.1",
  "status_code": 200,
  "model_name": "llama2-7b",
  "model_route": "default/llama2-route-v1",
  "model_server": "default/llama2-server",
  "selected_pod": "llama2-deployment-5f7b8c9d-xk2p4",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "input_tokens": 150,
  "output_tokens": 75,
  "duration_total": 2350,
  "duration_request_processing": 45,
  "duration_upstream_processing": 2180,
  "duration_response_processing": 5
}

Failed Request with Error (JSON Format)

{
  "timestamp": "2024-01-15T10:35:22.456Z",
  "method": "POST",
  "path": "/v1/chat/completions",
  "protocol": "HTTP/1.1",
  "status_code": 504,
  "error": {
    "type": "timeout",
    "message": "Model inference timeout after 30s"
  },
  "model_name": "llama2-7b",
  "model_route": "default/llama2-route-v1",
  "model_server": "default/llama2-server",
  "selected_pod": "llama2-deployment-5f7b8c9d-xk2p4",
  "request_id": "660e8400-e29b-41d4-a716-446655440001",
  "input_tokens": 200,
  "output_tokens": 0,
  "duration_total": 30050,
  "duration_request_processing": 50,
  "duration_upstream_processing": 30000,
  "duration_response_processing": 0
}
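
JSON logs are straightforward to aggregate. The sketch below counts failed requests by status code and error type. It assumes the router emits one JSON object per line (the examples above are pretty-printed for readability, so this is an assumption), and summarize_errors is a hypothetical helper, not a Kthena tool.

```python
import json
from collections import Counter


def summarize_errors(lines):
    """Count failed requests by (status_code, error type) from JSON access log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        record = json.loads(line)
        error = record.get("error")  # present only on failed requests
        if error:
            counts[(record["status_code"], error["type"])] += 1
    return counts


log_lines = [
    '{"status_code": 200, "model_name": "llama2-7b", "duration_total": 2350}',
    '{"status_code": 504, "error": {"type": "timeout", '
    '"message": "Model inference timeout after 30s"}, "model_name": "llama2-7b"}',
]
for (status, error_type), count in summarize_errors(log_lines).items():
    print(status, error_type, count)
```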

Text Format Example

[2024-01-15T10:30:45.123Z] "POST /v1/chat/completions HTTP/1.1" 200 model_name=llama2-7b model_route=default/llama2-route-v1 model_server=default/llama2-server selected_pod=llama2-deployment-5f7b8c9d-xk2p4 request_id=550e8400-e29b-41d4-a716-446655440000 tokens=150/75 timings=2350ms(45+2180+5)

Text Format with Error

[2024-01-15T10:35:22.456Z] "POST /v1/chat/completions HTTP/1.1" 504 error=timeout:Model inference timeout after 30s model_name=llama2-7b model_route=default/llama2-route-v1 model_server=default/llama2-server selected_pod=llama2-deployment-5f7b8c9d-xk2p4 request_id=660e8400-e29b-41d4-a716-446655440001 tokens=200/0 timings=30050ms(50+30000+0)

Configuration

Access logging is configured through environment variables in the Kthena Router deployment.

Environment Variables

| Variable | Description | Default | Valid Values |
| --- | --- | --- | --- |
| ACCESS_LOG_ENABLED | Enable or disable access logging | true | true, false |
| ACCESS_LOG_FORMAT | Log output format | text | json, text |
| ACCESS_LOG_OUTPUT | Where to write logs | stdout | stdout, stderr, or file path |
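
Before rolling out a deployment, it can help to check the intended values against the table above. The following is a hypothetical pre-flight check, not the router's own configuration loader: it applies the documented defaults and flags values outside the documented sets. The router itself may be more lenient, for example about letter case, so treat the strict matching here as an assumption.

```python
# Defaults and allowed values copied from the environment variable table.
DEFAULTS = {
    "ACCESS_LOG_ENABLED": "true",
    "ACCESS_LOG_FORMAT": "text",
    "ACCESS_LOG_OUTPUT": "stdout",
}
ALLOWED = {
    "ACCESS_LOG_ENABLED": ("true", "false"),
    "ACCESS_LOG_FORMAT": ("json", "text"),
}


def resolve_access_log_settings(env):
    """Return (settings, problems) for a mapping of environment variables."""
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    problems = []
    for name, allowed in ALLOWED.items():
        if settings[name] not in allowed:
            problems.append(f"{name}={settings[name]!r} is not one of {', '.join(allowed)}")
    # ACCESS_LOG_OUTPUT accepts stdout, stderr, or a file path, so only an
    # empty value can be rejected without touching the filesystem.
    if not settings["ACCESS_LOG_OUTPUT"]:
        problems.append("ACCESS_LOG_OUTPUT must not be empty")
    return settings, problems


settings, problems = resolve_access_log_settings({"ACCESS_LOG_FORMAT": "json"})
print(settings)
print(problems)
```

In a real check, os.environ (or the environment section of the deployment manifest) would be passed in place of the literal dict.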