SREDAY

Site Reliability, DevOps and Cloud

Sep 19-20, 2024 London, UK

2 Days
50+ Speakers
6 Tracks
250+ Attendees

Tickets

Schedule (WIP - subject to change)

Day 1

09:30

Keynote: The History of DevOps in 30 minutes

Coralogix
Let’s wander down memory lane and revisit all of the strange things we used to do to keep systems alive. Remember the horrible things we used to do to the tail command? File roll-over errors? Staring in horror at failed SSH connections? Well, I do, and now so will you.

Keynote: Supercharging Observability with Feature Flagging

Pulumi
Feature flags allow you to enable and disable code without changing or deploying any source code, as well as letting you selectively route traffic to certain users or a percentage of certain users, along with other great tricks. It’s powerful stuff … but when you combine it with observability...

10:30

Coffee break

Main lobby


Improve application resilience with chaos engineering

AWS
This session covers the basics of how to use isolation boundaries to build resilient applications, and dives deep into using chaos engineering and AWS Fault Injection Service (AWS FIS) to test those boundaries.

11:30

Navigating the Transition: SRE Challenges and Highlights in Shifting from Monolith to Microservices at adidas e-commerce

Adidas
Explore adidas e-commerce's shift from monolithic to microservices in this talk. Uncover the challenges and insights faced by SREs – from scalability to incident response. Join us for practical lessons, best practices, and success stories navigating SRE in the microservices era.

How to Destroy an SRE

Elastic
To understand how to retain, build and grow SREs, we need to know what not to do. In my time in tech I've been a contributor and a manager, and have had good and bad experiences of managing and being managed. Join me as I share how not to manage SRE teams and the mistakes made in this ecosystem.

12:30

Beyond K8S - Maturing Your Platform Engineering Initiative

OpenCredo
Trying to mature your platform engineering team? Don't know where to start, or what pitfalls may be waiting for you? Then this talk is for you. Based on real-world experience gained across a variety of clients, and using the CNCF platform maturity model as a guide, we explore and answer such questions.

13:00

Lunch & networking

Main lobby


Declarative Linux with the Common Operating System Interface

Sidero Labs
We run declarative systems built on top of an imperative operating system. It’s time we simplified our interface for Linux with a declarative API.

14:30

The RunWhen Authors Community: Royalties For Runbooks

RunWhen
Inspired by work from hyperscale SRE teams, RunWhen is building the industry's largest open source library of troubleshooting automation. Contributing Authors receive royalties when the company's enterprise customers use their code to automate root cause analysis and remediation.

Cloud Native Technologies: The Building Blocks of Modern Database Software

Severalnines
Learn how cloud-native technologies like Kubernetes are revolutionizing database software, enabling scalability, resilience, and agility. Join Divine, a Data on Kubernetes Ambassador, to explore real-world examples and best practices for building the next generation of database solutions.

Mystery of the disappearing S3 Keys

GlobalLogic
Imagine a system designed to process millions of events per day, but as use grows, data mysteriously seems to disappear, only to reappear later. We’ll talk about the what, the why, and how we investigated the symptoms, overturned some assumptions and finally delivered a resilient, serverless solution.

16:00

10:30

Coffee break

Main lobby


11:00

Developer productivity is waste

VMware / Pivotal
We're developer productivity obsessed; worse, we're developer productivity METRICS obsessed. Deployment frequency! MTTR! eNPS! It's become a delightful distraction. Besides, who really benefits from this focus? This talk makes the case that we should focus more on actual craft and less on metrics.

How Google SRE and developers work together

Google
SREs are Google's specialists for designing, building, and running complex services that are reliable, scalable, efficient, and maintainable. The SRE Engagement Model describes how the collaboration between developers and SREs works and how reliability engineering can be applied early in development.

From Hero to Zero: practical guide to destroy your best employees

Cognizant
All of us have attended lots of talks with the premise “from Zero to Hero…”, but what does it take for a real Rock Star, a real Ninja, a magical Unicorn, a superhero to become a complete Zero? How can we destroy those heroes in a way that every supervillain in every film has dreamt about every day?

Navigating the Path to Reliability: Crafting an SRE Roadmap for Enterprise Success

NTT DATA
If you need a roadmap for building more reliable products by adopting SRE ways of working, you need a strategy with a solid why and a roadmap that reaches outcomes iteratively as SRE adoption progresses at your company. In this talk we share a real...

13:00

Lunch & networking

Main lobby


14:00

From free-kicks to git commits

Steven Wade Consulting
Join me as I transition from football to tech, focusing on Kubernetes. Learn how teamwork, discipline, and strategy from sports influenced my tech career. Discover my journey into building developer platforms and the parallels between leading a sports team and orchestrating cloud-native environments.

Accelerating Developer Experience with Backstage

StatusNeo
This session focuses on the importance of Engineering Portals in accelerating Developer Experience by letting developers "Create" Reusable Patterns, "Manage" their software catalog in a centralized inventory, and "Explore" the software ecosystem in a unified Single Pane of Glass.

15:00

Vertically Integrated in Platform Engineering: Secrets to Success When Operating the Company Within the Company

Fanatics
Why don't you get dedicated product support from the company? Why is it so hard to get a headcount for your platform team? Why is it so hard to drive adoption of the products you create for the dev org? The truth is that Platform Engineering is a company within a company. Run it that way.

15:30

10:30

Coffee break

Main lobby


Harness the power of Karpenter to scale, optimize & upgrade Kubernetes

AWS
Unlock the full potential of Kubernetes with Karpenter! Scale effortlessly, optimize efficiently, and upgrade seamlessly. Join my talk to revolutionize your cloud infrastructure journey in just minutes! Don't miss out on this game-changing solution!

Cloud Development Kit: Less Code, Better Infrastructure

AWS
Join our session led by AWS DevOps experts! Dive deep into Infrastructure as Code, focusing on AWS CDK. Write less code, ensure consistency, reuse components, and stay in your IDE. Perfect for developers simplifying infrastructure management.

Mastering Metal: Our Journey to Stable, Reproducible Infrastructures with Talos Linux and Cozystack

Ænix
How to deploy a reproducible environment on bare metal with Talos Linux. Our experience in developing an open platform designed for on-premise. How we ensure component stability on any hardware.

A multi-environment deployment strategy for a Kubernetes-based microservice architecture

Agilelab
Imagine a product made of a number of microservices on Kubernetes and dozens of devs working on them. Now imagine multiple environments, where they must be deployed, tested and demoed at the speed of light. Let's put all of this together in a design that satisfies devs, customers and SREs' nights!

13:00

Lunch & networking

Main lobby


14:00

Virtual Kubernetes Clusters: A New Approach to Multi-tenancy

Loft Labs
A virtual Kubernetes cluster is a new approach to dealing with Kubernetes multi-tenancy pain. With virtual k8s clusters, each tenant has what feels like a full-blown, dedicated cluster inside a shared host cluster. Let's learn how you can use a virtual cluster to save costs and improve productivity.

14:30

When Infrastructure as Code Ends - Jump in and Create Some More

Riskified
Adding new functionality to Terraform can be daunting: it’s written only in Go (which you may not know), and you have to understand the architecture and work through less-than-welcoming documentation. I’ll provide a walkthrough from my experience with it, going from zero to publishing a provider.

Architecture of a Fintech Startup: Tackling Growing Complexity at Scale

Stenn International
A comprehensive overview of various aspects of building a highly scalable fintech startup with a product-led strategy. In this talk, I will share insights on different problems and solutions for tackling the inevitable growing complexity during the scaling of a fintech startup. I will also...

15:30

Caching the uncacheable in Varnish

Varnish Software
Learn how to accelerate web applications and APIs by caching the HTTP output in Varnish. Instead of focusing on basic use cases with static content that is easily cacheable, this presentation shows how to cache personalized content "on the edge" that is otherwise deemed uncacheable.

16:00

Day 2

09:00

Keynote: Developers are all the same

Viam
We know that developers are not all the same. But wait, they kind of are. Let’s explore the changing landscape that affects individuals and infrastructure alike, and a more recent approach in development where reproducibility, scalability, and resilience take precedence over handcrafted code.

Keynote: The state of SRE in 2024

SRE Author
Come and explore the landscape of SRE as it is in 2024, with the new trends, techniques and tools on the horizon.

10:00

TBD

TBD
TBD

10:30

Coffee break

Main lobby


Mastering OpenTelemetry Collector Configuration

Cisco
Are you struggling to make sense of your telemetry data or overwhelmed by your system's sheer volume and complexity? Learn how the OpenTelemetry Collector can help you streamline data collection, optimize performance, and gain unparalleled insights into your applications.

The Subtle Art of Lying with Statistics

NGINX
Lies, damned lies and statistics. But statistics allow you to lie to yourself. Statistics can trick us into believing things that are less than true, though not on purpose. Learn how data choice, event focus and scale change perspective. See how graphs mislead and correlation can cause confusion.

A deep dive into Perses, the GitOps Native dashboard visualization tool

Fullstaq
Perses is to Prometheus what Grafana was to Graphite. And even though Grafana has become so much more, Perses is an extremely interesting new entry into the dashboard/visualisation space. This talk will include an up-to-date demo and cover some of the unique advantages Perses offers over alternatives.

Unlocking key metrics and patterns using Grafana

Grafana Labs
This talk is for data analysts, many of whom work as data scientists, who need to visualize data that is mostly available as spreadsheet data sets in the cloud. It shows how Grafana can help visualize and monitor that data by using a spreadsheet plugin.

13:00

Lunch & networking

Main lobby


Improving the efficiency of more than 800 databases with observability: 4 years later

OVHcloud
The presentation explores how observability can improve service quality down to the database level. The speaker will discuss how OVHcloud refined the efficiency of their information system databases. The talk shares a recipe for getting the most out of your DBMS and how to get rid of slow queries.

14:30

Alerts don't suck, YOUR alerts suck

Kentik
When people tell me "alerts suck", I tell them "No, YOUR alerts suck." Here's why: We suck at creating alerts that are useful, meaningful, and most of all actionable. But there's good news: We'll show you how your alerts can suck less, and even be more manageable, using a few easy techniques.

DevOps Is Not Dead. Not by a Long Shot

FanDuel / Blip.pt
DevOps is not a trend that has come and gone. It's a cultural shift that has fundamentally changed how software is developed and deployed. While emerging practices like Platform Engineering may seem to be taking over, they are not meant to replace DevOps; instead, they are complementary to it.

15:30

Beyond Spans and Traces: Advanced Python Observability with OpenTelemetry

Outshift by Cisco
Attendees will learn through practical examples how to integrate OpenTelemetry for logs and metrics, moving beyond basic tracing to achieve a more holistic view of their applications. The session will also share examples of using open source tools for visualization.

16:00

10:30

Coffee break

Main lobby


Practical AI with Machine Learning for Observability in Netdata

Netdata
At SREcon19, Todd Underwood from Google gave a presentation with the title “All of Our ML Ideas Are Bad (and We Should Feel Bad)”. Let’s see a few ML ideas, implemented in the open-source Netdata, that may not actually be that bad.

Beyond Reactive Security: Next-Gen Kubernetes Threat Hunting Powered by GenAI and eBPF

Accenture
Unlock the future of Kubernetes security with GenAI and eBPF. Learn to predict, detect, and neutralize threats in real-time. This session reveals how AI and monitoring transform security strategies for engineers and platforms in fast-paced environments.

ChatGPT, LLMs, and LangChains - A Beginner's Guide

TantusData
Would you like to do something more than just create prompts for ChatGPT? How about building an application? An application that utilises the power of generative AI. From scratch! All you need is Python knowledge, your laptop, and an open mind. We will cover the tricky bits and pitfalls as well!

12:30

New Approaches to Reduce Alert Noise AIOps

ilert
This talk will be particularly valuable for DevOps engineers looking to optimize their alert management systems and reduce the cognitive load caused by alert fatigue.

13:00

Lunch & networking

Main lobby


Generative AI Powered Omni-functional Decision Insights for SREs

Accenture
This talk will explore the integration of Generative Artificial Intelligence (AI) to enhance decision-making and diagnosis in Site Reliability Engineering (SRE). Leveraging the capabilities of Generative AI, we propose an Omni-functional approach to synthesizing insights from diverse sources,...

Adopting Zero-Trust Security Strategy in Serverless Application Deployment

GlobalLogic
A Zero Trust security framework, deployed as guardrails, verifies every request as if it originates from an open network and anticipates that threats can be both internal and external. This talk covers enterprise solutions like HashiCorp Sentinel and open-source tools like Kyverno and Open Policy Agent.

What does "high priority" mean? The secret to happy queues

Indeed
When delegating work to background jobs, developers often manage multiple queues, which can lead to unpredictable problems initially. In this talk, I propose a different approach based on latency tolerance to address common issues encountered with queues.

15:30

10:30

Coffee break

Main lobby


11:00

Kubernetes, AWS, ALBs, Terraform and no Helm!

Modus Create
You will see the advantages of using the k8s provider for Terraform instead of Helm to manage the k8s objects ("yamls"). It allows you to see and confirm exactly what changes will happen in your cluster. Later we'll see how to integrate that with the AWS ALB and forward traffic directly to pods.

Config files vs. flags: story of pain

Have you encountered a project where every single app reads a config file on startup? Have you struggled to find those files or change them? Did you feel that there is something off with this approach? Look no further! We will discuss when flags can solve all of these issues (and when they can't).

Klustered: Kubernetes Debugging Live

Rawkode Academy
You don't learn until your back is up against the wall and the clock is ticking down. Join us as we explore 3 broken Kubernetes clusters and attempt to fix each one in less than 30 minutes.

12:30

On-Prem is the new Black

Alchemist Accelerator
The general trend in the industry is shifting towards cloud repatriation, and this shift has caused what I call a knowledge gap. In this talk I aim to demystify on-prem environments and show engineers how easy and smooth it is to repatriate data from the cloud to an air-gapped on-prem environment.

13:00

Lunch & networking

Main lobby


Oops, I deployed too hard

Omnistrate
Sometimes you delete production, you drop tables, you change a conf and everything breaks. But what if you upgraded the wrong environment, to the wrong version, of the wrong customer (who also happens to be the biggest your company has)? And what if you did it right before going to a long lunch?...

Debezium Server - New CDC runtime on the horizon!

Red Hat
Discover Debezium: Real-time database change streaming made seamless. Say goodbye to vendor lock-in with the innovative Debezium Server. Learn about parallelization and Kubernetes deployment using the Debezium Operator and supercharge your application performance.

Using, and mis-using, Kubernetes Dynamic Admission Control

Thought Machine
Kubernetes Dynamic Admission Control can be used for advanced validation and error correction of workloads. If you've ever wanted to make sure workloads in your clusters conform to security best practices, this talk will show you how!

15:30

Chaos as an Art: Crafting Chaos, Creating Order

OpenPayd
Imagine a canvas where chaos reigns supreme, but in its depths a jaw-dropping order emerges. Witness chaos transformed into art and order into a masterpiece. In this talk, we will explore how to organise chaos in a controlled manner and create chaos scenarios with k6 fault injection.

16:00

Speakers

AJ Jester
Alchemist Accelerator

Ajuna Kyaruzi
Datadog

Alayshia Knighten
Pulumi

Aleksei Popov
Stenn International

Andrei Kvapil
Ænix

Andreia Otto
Adidas

Anthony Ekpechue
GlobalLogic

Antonio Cobo Cuenca
Cognizant

Birol Yildiz
ilert

Carlo Ventrella
Agilelab

Carly Richmond
Elastic

Chakkree Tipsupa
AWS

Chris Cooney
Coralogix

Costa Tsaousis
Netdata

Daniel Magliola
Indeed

Dave McAllister
NGINX

David Flanagan
Rawkode Academy

Divine Odazie
Severalnines

Erwin de Keijzer
Fullstaq

George Lestaris
Google

Gunnar Grosch
AWS

Harel Safra
Riskified

Hrittik Roy
Loft Labs

Jorge Luis Castro Toribio
NTT DATA

Joyce Lin
Viam

Justin Garrison
Sidero Labs

Kyle Forster
RunWhen

Leon Adato
Kentik

Mahesh Venkataraman
Accenture

Marcin Szymaniuk
TantusData

Marcos Diez
Modus Create

Mateusz Zaremba & Krzysztof Wilczynski
AWS

Matt Simons
Fanatics

Matteo Bianchi
Omnistrate

Michael Cote
VMware / Pivotal

Michele Dodic & Francesco Sbaraglia
Accenture

Miko Pawlikowski
SRE Author

Nicki Watt
OpenCredo

Nishkarsh Raj
StatusNeo

Oleg Fatkhiev

Ondrej Babec & Jiri Novotny
Red Hat

Ricardo Castro
FanDuel / Blip.pt

Richard Finlay Tweed
Thought Machine

Simon Hanmer & Ross Walker
GlobalLogic

Steve Flanders
Cisco

Steve Wade
Steven Wade Consulting

Syed Usman Ahmad
Grafana Labs

TBD
TBD

Thijs Feryn
Varnish Software

Wilfried Roset
OVHcloud

Yosef Arbiv
Outshift by Cisco

Yusuf Tayman
OpenPayd

Venue

Everyman Canary Wharf

Crossrail Place,
Canary Wharf,
E14 5AR, London, UK
Level -2

Tube access
Jubilee, Elizabeth and DLR lines: Canary Wharf station

Sponsors & Partners

Want to become a sponsor? Get in touch!