SREDAY

Site Reliability, DevOps and Cloud

October 11, 2025 · Ford, Chennai, India

1 Day · 25+ Speakers · 1 Track · 200 Attendees

Taming the Chaos: Avoiding Kubernetes Landmines with Observability & AIOps

Kaustubha Shravan

Kubernetes promises portability and scalability, but in practice most production outages trace back to avoidable mistakes. Security gaps, misconfigured health checks, and poor scaling strategies can all derail even experienced teams.

In this session, we’ll uncover:

Security Disasters → The risk of running containers as root, overlooked image vulnerabilities, and RBAC pitfalls.

Configuration Catastrophes → Why “works on my machine” never works, and how resource mismanagement wrecks clusters.

Observability Blind Spots → Missing runtime security monitoring, misleading CPU/memory metrics, and logging anti-patterns.

Scaling Traps → HPA-induced thrashing, node scheduling inefficiencies, and bottlenecks hidden until too late.

But it’s not just about problems—we’ll explore solutions:

Actionable hardening checklists

Tools for continuous monitoring & runtime security

Automation strategies to prevent config drift

Proven observability practices for anomaly detection
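
As a rough illustration of what an automated hardening checklist could look like, the sketch below scans a pod spec (as a plain Python dict, the shape you would get from parsing a manifest) for three of the landmines named above: containers that may run as root, missing resource requests or limits, and missing health checks. The function name `check_pod_spec` and the example spec are hypothetical and chosen only for this illustration; the field names (`securityContext.runAsNonRoot`, `resources`, `livenessProbe`, `readinessProbe`) are standard Kubernetes pod spec fields. This is an assumption about the kind of check the session has in mind, not the speaker's own tooling.

```python
from typing import Any


def check_pod_spec(pod_spec: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one pod spec (hypothetical helper)."""
    findings: list[str] = []
    pod_ctx = pod_spec.get("securityContext") or {}

    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}

        # A container-level runAsNonRoot takes precedence over the pod-level one.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not run_as_non_root:
            findings.append(f"{name}: may run as root (runAsNonRoot not set)")

        # Missing requests/limits make scheduling and eviction unpredictable.
        resources = container.get("resources") or {}
        if not resources.get("limits"):
            findings.append(f"{name}: no resource limits")
        if not resources.get("requests"):
            findings.append(f"{name}: no resource requests")

        # Without probes, Kubernetes cannot tell a hung container from a healthy one.
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in container:
                findings.append(f"{name}: missing {probe}")

    return findings


# Hypothetical example spec: requests are set, but nothing else is hardened.
example_spec = {
    "containers": [
        {
            "name": "web",
            "image": "example/web:1.0",
            "resources": {"requests": {"cpu": "100m"}},
        }
    ]
}

for finding in check_pod_spec(example_spec):
    print(finding)
```

A check like this only inspects the declared configuration. It would sit alongside, not replace, the runtime monitoring and drift prevention listed above.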

Whether you’re new to Kubernetes or running enterprise-scale clusters, you’ll leave this session with practical, battle-tested strategies to keep your systems safe, stable, and observable.

Kaustubha Shravan is a Cloud Architect who designs and operates resilient, measurable, and cost-efficient platforms across Azure, AWS, and GCP. She blends reliability engineering with data-driven practices—SLOs, error budgets, and ML-assisted incident response—to make outages rare and recovery fast. With 46+ cloud certifications, she has led initiatives such as Benchmarking-as-a-Service and production-grade ML inference pipelines that improved performance while cutting spend. Kaustubha is a Women Techmakers Ambassador and frequent community mentor; her work has been showcased at NeurIPS workshops. She speaks about pragmatic reliability patterns, observability that drives action, and culture—how to turn postmortems into durable engineering improvements. When she’s not shipping guardrails, she’s helping teams adopt sustainable, privacy-aware AI practices and sharing playbooks that teams can put to work immediately.
