synth_ai.core.tracing_v3.abstractions
Core data structures for tracing v3.
This module defines the fundamental data structures used throughout the tracing system.
All structures are implemented as frozen dataclasses for immutability and type safety.
The hierarchy is designed to support different types of events while maintaining
a consistent interface for storage and processing.
Event Type Hierarchy:
- BaseEvent: Common fields for all events
- RuntimeEvent: Events from the runtime system (e.g., actions taken)
- EnvironmentEvent: Events from the environment (e.g., rewards, termination)
- LMCAISEvent: Language model events with token/cost tracking
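The hierarchy above can be sketched with frozen dataclasses. This is a minimal illustration under stated assumptions, not the module's actual definitions: the `Sketch*` class names, the reduced field sets, and the `describe` helper are all hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any


# Hypothetical stand-ins that mirror the documented hierarchy; not the real classes.
@dataclass(frozen=True)
class SketchBaseEvent:
    system_instance_id: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class SketchRuntimeEvent(SketchBaseEvent):
    actions: tuple[int, ...] = ()


@dataclass(frozen=True)
class SketchEnvironmentEvent(SketchBaseEvent):
    reward: float = 0.0
    terminated: bool = False


def describe(event: SketchBaseEvent) -> str:
    """Handle any event type through the shared base-class fields."""
    return f"{type(event).__name__} from {event.system_instance_id}"


runtime_event = SketchRuntimeEvent(system_instance_id="tool_executor", actions=(2,))
env_event = SketchEnvironmentEvent(system_instance_id="environment", reward=1.0)

try:
    env_event.reward = 5.0  # frozen dataclasses reject attribute assignment
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

Because every subclass shares the base fields, storage or processing code can accept any event type without branching on its concrete class.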
Session Structure:
- SessionTrace: Top-level container for a complete session
- SessionTimeStep: Logical steps within a session (e.g., conversation turns)
- Events: Individual events that occurred during the timestep
- Messages: Information passed between subsystems (user, agent, runtime, environments)
Concepts:
- Events capture something that happened inside a subsystem. They may or may not be externally visible. Examples include an LLM API call (LMCAISEvent), a tool selection (RuntimeEvent), or a tool execution outcome (EnvironmentEvent).
- Messages represent information transmitted between subsystems within the session. They record communications such as a user sending input to the agent, the agent/runtime sending a tool invocation to an environment, the environment sending a tool result back, and the agent sending a reply to the user. Do not confuse these with provider-specific LLM API “messages” (prompt formatting); those belong inside an LMCAISEvent as part of its input/output content, not as SessionEventMessages.
Classes
TimeRecord
Time information for events and messages.
This class captures timing information with microsecond precision for event
correlation and performance analysis.
Attributes:
event_time: Unix timestamp (float) when the event occurred. This is the primary timestamp used for ordering and correlation.
message_time: Optional integer timestamp for message-specific timing. Can be used for external message IDs or sequence numbers.
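As a rough sketch of how these two fields might be used, the example below orders records by `event_time` and derives an elapsed time. `SketchTimeRecord` and `order_by_event_time` are hypothetical names introduced for illustration; only the two field names come from the description above.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchTimeRecord:  # hypothetical stand-in, not the library class
    event_time: float                   # Unix timestamp in seconds
    message_time: Optional[int] = None  # optional sequence number or external ID


def order_by_event_time(records):
    """Sort records chronologically using the primary timestamp."""
    return sorted(records, key=lambda record: record.event_time)


start = time.time()
records = [
    SketchTimeRecord(event_time=start + 0.002, message_time=2),
    SketchTimeRecord(event_time=start, message_time=1),
    SketchTimeRecord(event_time=start + 0.001),
]

ordered = order_by_event_time(records)
elapsed_ms = (ordered[-1].event_time - ordered[0].event_time) * 1000.0
```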
SessionMessageContent
Normalized payload stored alongside session messages.
Methods:
as_text
has_json
SessionEventMarkovBlanketMessage
Message crossing Markov blanket boundaries between systems in a session.
IMPORTANT: This represents information transfer BETWEEN distinct systems/subsystems,
where each system is conceptualized as having a Markov blanket that separates its
internal states from the external environment. These messages cross those boundaries.
This is NOT for chat messages within an LLM conversation (those belong in LLMCallRecord).
Instead, this captures inter-system communication such as:
- Human -> Agent system (user providing instructions)
- Agent -> Runtime (agent deciding on an action)
- Runtime -> Environment (executing a tool/action)
- Environment -> Runtime (returning results)
- Runtime -> Agent (passing back results)
- Agent -> Human (final response)
Attributes:
content: The actual message content crossing the boundary (text, JSON, etc.)
message_type: Type of boundary crossing (e.g., ‘observation’, ‘action’, ‘result’)
time_record: Timing information for the boundary crossing
metadata: Boundary crossing metadata. Recommended keys:
- ‘step_id’: Timestep identifier
- ‘from_system_instance_id’: UUID of the sending system
- ‘to_system_instance_id’: UUID of the receiving system
- ‘from_system_role’: Role of sender (e.g., ‘human’, ‘agent’, ‘runtime’, ‘environment’)
- ‘to_system_role’: Role of receiver
- ‘boundary_type’: Type of Markov blanket boundary being crossed
- ‘call_id’: Correlate request/response pairs across boundaries
- ‘causal_influence’: Direction of causal flow
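The recommended metadata keys can be illustrated with a small sketch that records a Runtime -> Environment request and its reply, then correlates them through `call_id`. `SketchMessage`, `boundary_metadata`, and `pair_by_call_id` are hypothetical helpers, and the message is simplified to a flat `event_time` field rather than a full time record.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SketchMessage:  # hypothetical stand-in for a boundary-crossing message
    content: str
    message_type: str
    event_time: float
    metadata: dict[str, Any] = field(default_factory=dict)


def boundary_metadata(step_id, from_role, to_role, call_id):
    """Build a subset of the recommended metadata keys for one crossing."""
    return {
        "step_id": step_id,
        "from_system_role": from_role,
        "to_system_role": to_role,
        "call_id": call_id,
    }


def pair_by_call_id(messages):
    """Group messages that share a call_id, preserving input order."""
    pairs = {}
    for message in messages:
        pairs.setdefault(message.metadata.get("call_id"), []).append(message)
    return pairs


call_id = str(uuid.uuid4())
request = SketchMessage(
    '{"tool": "ls", "args": ["-la"]}', "action", time.time(),
    boundary_metadata("step_1", "runtime", "environment", call_id),
)
response = SketchMessage(
    '{"exit_code": 0}', "result", time.time(),
    boundary_metadata("step_1", "environment", "runtime", call_id),
)
pairs = pair_by_call_id([request, response])
```

Keeping the sender and receiver roles in metadata makes the direction of each crossing explicit, so the request and its reply remain distinguishable even though they share a `call_id`.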
BaseEvent
Base class for all event types.
This is the foundation for all events in the tracing system. Every event must
have a system identifier and timing information. Events are intra-system facts
(they occur within a subsystem) and are not necessarily direct communications.
Attributes:
system_instance_id: Identifier for the system/component that generated this event (e.g., ‘llm’, ‘environment’, ‘tool_executor’)
time_record: Timing information for the event
metadata: Flexible dictionary for event-specific data. Common keys include:
- ‘step_id’: Associated timestep identifier
- ‘error’: Error information if event failed
- ‘duration_ms’: Event duration in milliseconds
event_metadata: Optional list for structured metadata that doesn’t fit in the dictionary format (e.g., tensor data, embeddings)
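One way the common metadata keys might be populated is to time an operation and capture any failure. The helper below is a hypothetical sketch, not part of the module; it only produces a plain dictionary using the key names listed above.

```python
import time
from typing import Any, Callable


def run_with_event_metadata(step_id: str, fn: Callable[[], Any]) -> dict[str, Any]:
    """Run fn and return metadata with 'step_id', 'duration_ms', and 'error' if it failed."""
    metadata: dict[str, Any] = {"step_id": step_id}
    started = time.perf_counter()
    try:
        fn()
    except Exception as exc:  # record the failure instead of propagating it
        metadata["error"] = f"{type(exc).__name__}: {exc}"
    metadata["duration_ms"] = (time.perf_counter() - started) * 1000.0
    return metadata


ok_metadata = run_with_event_metadata("step_1", lambda: sum(range(1000)))
failed_metadata = run_with_event_metadata("step_2", lambda: 1 / 0)
```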
RuntimeEvent
Event from runtime system.
Captures events from the AI system’s runtime, typically representing decisions
or actions taken by the system (e.g., selecting a tool with arguments).
Use paired SessionEventMessages to record the communication of this choice to
the environment.
Attributes:
actions: List of action identifiers or indices. The interpretation depends on the system (e.g., discrete action indices for RL, tool selection IDs for agents, etc.)
EnvironmentEvent
Event from environment.
Captures feedback from the environment in response to system actions (e.g.,
command output, exit codes, observations). Use a paired SessionEventMessage
to record the environment-to-agent communication of the result.
Follows the Gymnasium/OpenAI Gym convention for compatibility.
Attributes:
reward: Scalar reward signal from the environment
terminated: Whether the episode ended due to reaching a terminal state
truncated: Whether the episode ended due to a time/step limit
system_state_before: System state before the action (for debugging)
system_state_after: System state after the action (observations)
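The distinction between `terminated` and `truncated` follows the Gymnasium step convention of returning `(observation, reward, terminated, truncated, info)`. The sketch below uses a toy step function instead of a real Gymnasium environment; `SketchEnvironmentEvent`, `toy_step`, and `event_from_step` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SketchEnvironmentEvent:  # hypothetical stand-in, not the library class
    reward: float
    terminated: bool
    truncated: bool
    system_state_before: Optional[dict[str, Any]] = None
    system_state_after: Optional[dict[str, Any]] = None


def toy_step(position: int, action: int, step_count: int,
             goal: int = 3, max_steps: int = 5):
    """Toy environment step that returns a Gymnasium-style 5-tuple."""
    new_position = position + action
    terminated = new_position >= goal  # reached a terminal state
    truncated = (not terminated) and (step_count + 1 >= max_steps)  # hit the step limit
    reward = 1.0 if terminated else 0.0
    return {"position": new_position}, reward, terminated, truncated, {}


def event_from_step(state_before, step_result):
    """Map a 5-tuple step result onto the documented event fields."""
    observation, reward, terminated, truncated, _info = step_result
    return SketchEnvironmentEvent(reward, terminated, truncated, state_before, observation)


goal_event = event_from_step({"position": 2}, toy_step(2, 1, step_count=0))
timeout_event = event_from_step({"position": 0}, toy_step(0, 0, step_count=4))
```

Keeping the two flags separate lets downstream analysis tell a genuine terminal state apart from an episode that was cut off by a limit.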
LMCAISEvent
Extended CAIS event for language model interactions.
CAIS (Claude AI System) events capture detailed information about LLM calls,
including performance metrics, cost tracking, and distributed tracing support.
Treat provider-specific prompt/completion structures as part of this event’s
data. Do not emit them as SessionEventMessages.
Attributes:
model_name: The specific model used (e.g., ‘gpt-4’, ‘claude-3-opus’)
provider: LLM provider (e.g., ‘openai’, ‘anthropic’, ‘local’)
input_tokens: Number of tokens in the prompt/input
output_tokens: Number of tokens in the completion/output
total_tokens: Total tokens used (input + output)
cost_usd: Estimated cost in US dollars for this call
latency_ms: End-to-end latency in milliseconds
span_id: OpenTelemetry compatible span identifier
trace_id: OpenTelemetry compatible trace identifier
system_state_before: State snapshot before the LLM call
system_state_after: State snapshot after the LLM call
call_records: List of normalized LLM call records capturing request/response details (messages, tool calls/results, usage, params, etc.).
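The token and cost fields relate to each other arithmetically: `total_tokens` is the sum of input and output tokens, and `cost_usd` is an estimate derived from them. The sketch below assumes a hypothetical price table and a hypothetical `build_lm_event` helper; the prices are placeholders, not real provider rates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchLMEvent:  # hypothetical stand-in, not the library class
    model_name: str
    provider: str
    input_tokens: int
    output_tokens: int
    total_tokens: int
    cost_usd: Optional[float]
    latency_ms: int


# Hypothetical prices in USD per 1,000 tokens; real prices vary by provider and model.
EXAMPLE_PRICES = {"example-model": {"input": 0.5, "output": 1.5}}


def build_lm_event(model_name, provider, input_tokens, output_tokens, latency_ms):
    """Derive total_tokens and an estimated cost from the raw token counts."""
    prices = EXAMPLE_PRICES.get(model_name)
    cost = None  # leave the cost unset when no price is known
    if prices is not None:
        cost = (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1000.0
    return SketchLMEvent(
        model_name, provider, input_tokens, output_tokens,
        input_tokens + output_tokens, cost, latency_ms,
    )


priced_event = build_lm_event("example-model", "local", 1200, 300, 850)
unpriced_event = build_lm_event("unknown-model", "local", 10, 5, 120)
```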
SessionTimeStep
A logical timestep within a session.
Represents a discrete step in the session timeline. In conversational AI,
this often corresponds to a single turn of dialogue. In RL systems, it
might represent a single environment step.
Attributes:
step_id: Unique identifier for this step (e.g., ‘turn_1’, ‘step_42’)
step_index: Sequential index of this step within the session
timestamp: When this timestep started (UTC)
turn_number: Optional turn number for conversational contexts
events: All events that occurred during this timestep
step_messages: Messages exchanged during this timestep
step_metadata: Additional metadata specific to this step (e.g., ‘user_feedback’, ‘context_switches’, ‘tool_calls’)
completed_at: When this timestep was completed (None if still active)
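A short sketch of the timestep lifecycle: a step starts with `completed_at` unset and is later closed, at which point its duration can be computed. `SketchTimeStep` and `duration_seconds` are hypothetical, and closing the step with `dataclasses.replace` is an assumption made here to stay consistent with immutable structures; the module itself may handle completion differently.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class SketchTimeStep:  # hypothetical stand-in with a reduced field set
    step_id: str
    step_index: int
    timestamp: datetime
    events: tuple = ()
    step_messages: tuple = ()
    completed_at: Optional[datetime] = None


def duration_seconds(step: SketchTimeStep) -> Optional[float]:
    """Return the step duration, or None while the step is still active."""
    if step.completed_at is None:
        return None
    return (step.completed_at - step.timestamp).total_seconds()


started = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
open_step = SketchTimeStep(step_id="turn_1", step_index=0, timestamp=started)

# Closing the step produces a new instance; the original stays unchanged.
closed_step = replace(open_step, completed_at=started + timedelta(seconds=2.5))
```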
SessionTrace
Complete trace of a session.
The top-level container that holds all data for a single execution session.
This could represent a complete conversation, an RL episode, or any other
bounded interaction sequence.
Attributes:
session_id: Unique identifier for this session
created_at: When the session started (UTC)
session_time_steps: Ordered list of timesteps in this session
event_history: Complete chronological list of all events
message_history: Complete chronological list of all messages
metadata: Session-level metadata (e.g., ‘user_id’, ‘experiment_id’, ‘model_config’, ‘environment_name’)
session_metadata: Optional list of structured metadata entries that don’t fit the dictionary format
Methods:
to_dict
Returns: A dictionary containing all session data, suitable for JSON serialization or database storage.
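To show what a JSON-ready dictionary could look like, here is a minimal sketch of a `to_dict`-style method built on `dataclasses.asdict`. The `Sketch*` classes and their reduced fields are hypothetical, and converting `created_at` to an ISO 8601 string is an assumption; the real method's output format is not specified above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class SketchEvent:  # hypothetical stand-in for an event
    system_instance_id: str
    event_time: float


@dataclass(frozen=True)
class SketchSessionTrace:  # hypothetical stand-in with a reduced field set
    session_id: str
    created_at: datetime
    event_history: tuple = ()
    metadata: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        data = asdict(self)  # recursively converts nested dataclasses to dicts
        data["created_at"] = self.created_at.isoformat()  # datetime is not JSON-native
        return data


trace = SketchSessionTrace(
    session_id="session_1",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    event_history=(SketchEvent("llm", 1704067200.0),),
    metadata={"experiment_id": "exp_a"},
)

payload = json.dumps(trace.to_dict())
restored = json.loads(payload)
```

The round trip through `json.dumps` and `json.loads` is a quick check that nothing non-serializable, such as a raw `datetime`, remains in the dictionary.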