In our previous session, we introduced Agentic AI and showed how autonomous reasoning systems can cut through alert storms and accelerate root cause analysis in your AWS cloud environments. But how does this translate into the customized runbooks and playbooks your teams rely on every day?
In this Level-200 session, we move from theory to practical application by examining how to use Agentic AI to transform static runbooks into dynamic, real-time investigative playbooks. Traditional runbooks are written for known problems. Modern cloud incidents are rarely known problems.
We will use real-world design patterns, including modern AI SaaS stacks inspired by our own product, to show how to set up and use Agentic AI effectively. You will see how Agentic AI systems ingest telemetry from sources such as Amazon CloudWatch, Kubernetes, PagerDuty, and Datadog, then generate investigation workflows on the fly that mirror how experienced SREs troubleshoot incidents.
Instead of asking engineers to follow static steps, these systems:
Integrate general context and scoping from initial product setup
Build live service topology from telemetry during incidents
Correlate symptoms to upstream and downstream causes automatically
Generate investigative playbooks in real time based on what the system is observing
Collapse thousands of alerts into a single operational narrative
Integrate directly into existing incident response processes and tooling
This is the evolution from bloated runbook-driven operations to agile playbook-driven autonomous investigation.
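As a rough illustration of the correlation and alert-collapsing ideas above, the following minimal Python sketch walks a service dependency map to find which alerting services have no alerting dependency of their own, then condenses the alerts into one summary line. The topology, service names, alert messages, and function names are all hypothetical and chosen for illustration; a real agentic system would derive topology from live telemetry and reason well beyond this simple rule.

```python
# Hypothetical topology: each service maps to the services it depends on.
# In practice this would be built from telemetry, not hard-coded.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["db", "cache"],
    "db": [],
    "cache": [],
}


def root_candidates(alerting, depends_on):
    """Return alerting services that have no alerting dependency, direct or transitive."""

    def has_alerting_dependency(service, seen):
        for dep in depends_on.get(service, []):
            if dep in seen:
                continue  # already visited; also guards against dependency cycles
            seen.add(dep)
            if dep in alerting or has_alerting_dependency(dep, seen):
                return True
        return False

    return sorted(s for s in alerting if not has_alerting_dependency(s, {s}))


def summarize(alerts, depends_on):
    """Collapse (service, message) alerts into a one-line narrative."""
    alerting = {service for service, _ in alerts}
    roots = root_candidates(alerting, depends_on)
    return (
        f"{len(alerts)} alerts across {len(alerting)} services; "
        f"likely origin: {', '.join(roots)}"
    )


# Illustrative alerts only; not taken from any real incident.
ALERTS = [
    ("web", "5xx rate high"),
    ("web", "latency high"),
    ("api", "timeouts"),
    ("db", "connection pool exhausted"),
]

if __name__ == "__main__":
    print(summarize(ALERTS, DEPENDS_ON))
```

In this toy example the alerts on web and api are treated as downstream symptoms because both depend, directly or indirectly, on the alerting db service, so the summary points to db as the likely origin.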
Attendees will learn:
Why traditional runbooks break down in dynamic cloud environments
The difference between dense static runbooks and AI-generated specialized playbooks
How Agentic AI creates playbooks dynamically during an incident
How telemetry reasoning informs each step of the investigation workflow
How to integrate autonomous playbooks into existing on-call and incident processes
Practical steps to reduce dependence on manual troubleshooting and war rooms
This session is ideal for SREs, DevOps engineers, platform teams, and engineering leaders who want to modernize their operational practices and evolve beyond static runbooks toward intelligent, autonomous incident investigation.