WHAT IS AN AGENT
Agents are a new form of application. Thousand Birds is an ecosystem for building, deploying, and managing agentic software systems.
┌────────┐          ┌────────────────┐         ┌──────────────────────┐
│  LLMs  │ ◀──────▶ │  Orchestrator  │ ◀─────▶ │ External Environment │
└────────┘          └────────────────┘         └──────────────────────┘
Large Language Models (LLMs) have provided new capabilities for natural language processing and generative workflows. In many cases, code can be authored by prompting a language model.
The Challenges of Agents
Building agents is hard. They are complex systems that require the integration of many different technologies. LLMs are expensive and require fine-tuning or prompt engineering to achieve desired results. Behavior can be difficult to debug and understand. Agents are difficult to audit and verify.
Our goal is to address as many of these challenges for you as we can:
- How do we understand what our agent's system is doing during development?
- The cost of evaluating LLMs during testing and development can add up. How do we manage inference costs?
- Debugging agents can mean digging through thousands of lines of natural language log output. How do we make that less arduous?
- Once the agent is deployed, how will we monitor and manage it?
- And more...
Debug your agents with time travel.
Time travel debugging allows you to pause, rewind, and inspect previous executions of your agents. Going beyond observability, Thousand Birds supports modifying either the definition or the state of your agent in prior runs, enabling iterative development within long-running executions.
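To make the idea concrete, here is a minimal sketch of rewind-and-resume in plain Python. It is not the Thousand Birds API: `TimeTravelRun`, `plan`, and `act` are hypothetical names invented for illustration, and the sketch assumes agent state is a plain dictionary that can be deep-copied after every step.

```python
import copy
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


class TimeTravelRun:
    """Hypothetical recorder that snapshots the state around every step."""

    def __init__(self, steps: List[Step], initial: State) -> None:
        self.steps = steps
        # history[i] is the state before step i runs; history[0] is the input.
        self.history: List[State] = [copy.deepcopy(initial)]

    def run(self, start: int = 0) -> State:
        """Execute steps from index `start`, discarding later snapshots."""
        self.history = self.history[: start + 1]
        state = copy.deepcopy(self.history[start])
        for step in self.steps[start:]:
            state = step(state)
            self.history.append(copy.deepcopy(state))
        return state

    def inspect(self, index: int) -> State:
        """Return a copy of the state recorded before step `index`."""
        return copy.deepcopy(self.history[index])

    def rewind_and_patch(self, index: int, patch: State) -> State:
        """Rewind to step `index`, edit the recorded state, and re-run."""
        self.history = self.history[: index + 1]
        self.history[index].update(patch)
        return self.run(start=index)


# Two stand-in steps; a real agent would call an LLM or a tool here.
def plan(state: State) -> State:
    return {**state, "plan": f"search for {state['goal']}"}


def act(state: State) -> State:
    return {**state, "result": state["plan"].upper()}


run = TimeTravelRun([plan, act], {"goal": "weather"})
final = run.run()
before_act = run.inspect(1)  # state as it was just before `act` ran
patched = run.rewind_and_patch(1, {"plan": "ask the user"})
```

Only the steps after the rewind point are re-executed, which is what makes this style of debugging attractive when the earlier steps involve slow or costly model calls.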
Audit the execution history of your agents.
Our framework is built around a reactive database that captures the execution history of your agents. The result is a rich audit trail of agent behavior that you can query and inspect.
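As a rough illustration of what a queryable execution history can look like, the sketch below uses an in-memory SQLite database as a stand-in. This is an assumption made for the example only: the table layout and the `open_audit_log`, `record`, and `history` helpers are hypothetical, and the reactive behavior of the actual database is not modeled.

```python
import json
import sqlite3
from typing import Any, Dict, List, Tuple


def open_audit_log() -> sqlite3.Connection:
    """Create an in-memory, append-only event table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " run_id TEXT NOT NULL,"
        " step TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def record(conn: sqlite3.Connection, run_id: str, step: str,
           payload: Dict[str, Any]) -> None:
    """Append one event; rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO events (run_id, step, payload) VALUES (?, ?, ?)",
        (run_id, step, json.dumps(payload)),
    )
    conn.commit()


def history(conn: sqlite3.Connection,
            run_id: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Return every event for a run, in the order it was recorded."""
    rows = conn.execute(
        "SELECT step, payload FROM events WHERE run_id = ? ORDER BY id",
        (run_id,),
    ).fetchall()
    return [(step, json.loads(payload)) for step, payload in rows]


conn = open_audit_log()
record(conn, "run-1", "prompt", {"text": "Summarize the report"})
record(conn, "run-1", "completion", {"text": "The report says..."})
record(conn, "run-2", "prompt", {"text": "Translate to French"})

trail = history(conn, "run-1")
```

Keeping the log append-only means a later audit sees exactly what the agent saw at each step, rather than a state that was overwritten afterwards.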
Understand how often you're getting specific behavior.
Thousand Birds provides a rich set of tools for building evaluation metrics for your agents. These metrics can be used to monitor the behavior of your agents in production, or to evaluate the performance of your agents during development.
Combined with our structural definition features, these metrics give you a powerful toolset for understanding the probability distribution of your agent's behavior.
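One simple way to think about "how often" is to run the agent many times and count the outcomes. The sketch below estimates an empirical distribution over behavior labels. It is an assumption-laden illustration rather than the Thousand Birds metrics API: `behavior_distribution` and `stub_agent` are hypothetical, and the stub draws labels at random in place of real LLM calls.

```python
import random
from collections import Counter
from typing import Callable, Dict


def behavior_distribution(agent: Callable[[random.Random], str],
                          trials: int, seed: int = 0) -> Dict[str, float]:
    """Run `agent` `trials` times and return the share of each outcome label."""
    rng = random.Random(seed)  # seeded so repeated evaluations are comparable
    counts = Counter(agent(rng) for _ in range(trials))
    return {label: n / trials for label, n in counts.items()}


def stub_agent(rng: random.Random) -> str:
    """Stand-in for a real agent run: returns a behavior label at random."""
    labels = ["answered", "refused", "tool_error"]
    return rng.choices(labels, weights=[0.8, 0.15, 0.05])[0]


dist = behavior_distribution(stub_agent, trials=1000)
for label, share in sorted(dist.items()):
    print(f"{label}: {share:.1%}")
```

With a real agent, each trial costs an inference call, so the number of trials becomes a trade-off between the precision of the estimate and the evaluation budget.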