Debugging agents is frustrating. You need to find failing agent runs, identify the step or steps where the trajectory broke down, and isolate the cause. It slows you down while you’re developing your agent app, and becomes untenable when you’re serving it in prod a thousand times a day. Fortunately, Synth already does this work for you under the hood, and we’re releasing our Errors product as the best way to explore and act on this data.
You upload agent logs to Synth. Synth applies standard algorithms to score individual trajectories and identify common failure modes across all available data. Those become error clusters.
Each error cluster comes with high-level information and a list of individual trajectories we believe demonstrate the problem. This information populates the Errors dashboard.
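To make the shape of this data concrete, here is a minimal Python sketch of scored trajectories being grouped into clusters. It is an illustration only, not Synth's scoring or clustering method: the `Trajectory` and `ErrorCluster` types, the `failure_mode` label, and the `score_threshold` parameter are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trajectory:
    run_id: str
    score: float                  # hypothetical quality score in [0, 1]
    failure_mode: Optional[str]   # hypothetical label; None means no failure found


@dataclass
class ErrorCluster:
    failure_mode: str                                  # high-level description of the problem
    run_ids: List[str] = field(default_factory=list)   # trajectories that show it

    @property
    def size(self) -> int:
        return len(self.run_ids)


def build_clusters(trajectories, score_threshold=0.5):
    """Group low-scoring, labeled trajectories by failure mode (toy stand-in)."""
    grouped = defaultdict(list)
    for t in trajectories:
        if t.failure_mode is not None and t.score < score_threshold:
            grouped[t.failure_mode].append(t.run_id)
    clusters = [ErrorCluster(mode, ids) for mode, ids in grouped.items()]
    # Largest clusters first, so the most common failure mode leads the list.
    clusters.sort(key=lambda c: c.size, reverse=True)
    return clusters
```

Each resulting cluster carries a short description plus the run IDs that demonstrate it, which mirrors the summary-plus-instances layout described above.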
Paid customers can configure agents to review and update these errors offline to reflect their priorities.
All customers can configure Slack notifications to be alerted to errors as they’re detected.
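As a rough picture of what such an alert might look like on the wire, the sketch below builds and posts a message to a Slack incoming webhook using only the Python standard library. Slack incoming webhooks accept a JSON body with a `text` field; everything else here is an assumption for illustration, including the message wording, the function names, and the idea that alerts are delivered through a webhook at all.

```python
import json
import urllib.request


def build_alert_payload(cluster_title, instance_count):
    """Build the JSON body for a Slack incoming webhook (hypothetical wording)."""
    text = f"New error cluster detected: {cluster_title} ({instance_count} affected runs)"
    return {"text": text}


def send_alert(webhook_url, payload):
    """POST the payload to a Slack incoming webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping payload construction separate from sending makes the message format easy to check without touching the network.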
To dig into an error, select its cluster to bring it into context. You’ll see its details and instances in the left panel, and it will be added to context in the AI panel on the right.
For questions that can be answered quickly from the high-level data in the error cluster and its instances, ask in the chat panel.
For queries that require more context (many instances, comparing clusters, or reviewing unclustered traces), query our search agent, Cuvier. It comes equipped with vector search, SQL access to the underlying data, and plenty of compute to throw at the problem.
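To give a feel for the kind of question SQL access can answer, here is a small self-contained sketch using Python's built-in `sqlite3` module. The `trajectories` table, its columns, and the sample rows are hypothetical and invented for this example; they are not Synth's actual schema.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical trajectories table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trajectories ("
        "run_id TEXT PRIMARY KEY, cluster_id TEXT, failed_step INTEGER)"
    )
    conn.executemany(
        "INSERT INTO trajectories VALUES (?, ?, ?)",
        [
            ("run-1", "tool_timeout", 4),
            ("run-2", "tool_timeout", 6),
            ("run-3", "bad_args", 2),
            ("run-4", None, None),  # a trace not assigned to any cluster
        ],
    )
    return conn


def cluster_counts(conn):
    """Compare clusters by the number of trajectories assigned to each."""
    return conn.execute(
        "SELECT cluster_id, COUNT(*) AS n FROM trajectories "
        "WHERE cluster_id IS NOT NULL "
        "GROUP BY cluster_id ORDER BY n DESC, cluster_id"
    ).fetchall()


def unclustered_runs(conn):
    """List run IDs that have not been assigned to a cluster."""
    rows = conn.execute(
        "SELECT run_id FROM trajectories WHERE cluster_id IS NULL ORDER BY run_id"
    ).fetchall()
    return [row[0] for row in rows]
```

The two queries correspond to two of the cases named above: comparing clusters and reviewing traces that have not been clustered.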
Cuvier introduces intelligent error analysis to help teams debug AI agents at scale. As agents run thousands of times per day, manually reviewing logs becomes impractical. Cuvier automatically identifies error patterns and provides tools to investigate them efficiently.
Errors will alert you to issues sooner and help you fix them faster.