SREday 2023

14 - 15 Sep London, UK
Site Reliability Engineering Conference
2 Days
50+ Speakers
6 Tracks
150+ Attendees

Event finished


Schedule

Day 1 - Sep 14

09:00

Keynote: Futuristic Luxury Incident Response

PagerDuty
Responding to incidents is work. It’s unplanned, sometimes chaotic, and often stressful. It should be getting better, but many organizations find improving difficult and often backslide into bad practices. Teams tackling too many incidents see more burnout and have less time to work on work that...

Keynote: Supercharging Observability with Feature Flagging

Freelance
Feature flags allow you to enable and disable code without changing or deploying any source code, as well as letting you selectively route traffic to certain users or a percentage of certain users, along with other great tricks. It’s powerful stuff … but when you combine it with observability...

Keynote: Fake It 'Til You Make It: Get the Most out of Incident Simulations

Rootly
The worst time to discover your processes are totally broken? During a live incident. Avoid incident management flops by running regular simulations that actually reflect a realistic incident environment. I'll teach you how.

10:30

Coffee break

Main lobby


11:00

Five Challenges in Real-time Stream Processing and Five Solutions

Hazelcast
Real-time stream processing has grown exponentially in recent years, as businesses need to gather insights from real-time data as soon as it’s generated. To do this, developers and software architects use various pipelines and tools to capture and process data in motion. Real-time stream processing...

Lessons learned after more than 20 years working in IT

Contino
When you reach the point in life where you have work colleagues whose dads are in the same age bracket as you, it’s time to reflect on your career and think about all the learning you have accumulated and all the advice you would have loved to hear at each stage of your professional career.

12:00

Cracking Cholera’s Code: Victorian Insights for Today’s Technologist

Curious Coffee Club
As Steve Jobs reminded us, technology can be a bicycle for the mind. It can be a force multiplier, helping us achieve what we could otherwise not. However, it is too easy for technology to constrain rather than enable us. Transformations too often fall short, and new technology rarely creates the...

Cloud Fitness Engineering - Maximizing Business Value from Cloud Implementations

Accenture
Distributed system design is hard. Learn how to address the challenge of uncertainty, complexity, and dynamism in the cloud and mitigate three risks (value, cost, and failure) by applying evolutionary system design, continuous optimization, iterative simulation, and empirical learning through the use of data science.

13:00

Lunch & networking

Main lobby


14:00

Lessons learned from 7 years of running developer platforms

VMware/Pivotal
How do you get developers to actually use your platform? Come hear what groups like Mercedes-Benz, JP Morgan Chase, and more have proven out: product managing the platform, attracting and retaining developers, seeding trust and skills, re-skilling existing ops staff.

Overcoming SRE Anti-Pattern Roadblocks: Rebranding the Operations Team

FanDuel/Blip.pt
This talk will detail the anti-pattern where traditional operations are rebranded as “SRE” but everything else stays nearly the same. The same tools, processes, and interactions persist. SRE is seen as a mandate without any real change. In practice, the only change is the team’s renaming.

Platform Engineering meets Service Mesh: an SRE love story

Solo.io
Learn why every SRE should know what a service mesh is and what it can do for them: control and management of application networking, automated zero trust security, and near-perfect observability of cloud native applications.

Navigating GraphQL Vulnerabilities: Proactive Discoveries and Resilience Building for SRE/DevSecOps

Escape.tech
Dive into the chaotic world of GraphQL vulnerabilities with Escape's co-founders. Their exhaustive research unveiled a daunting 46,000+ security issues across more than 1500 GraphQL endpoints. Walk away armed with a resilience strategy to fortify your production GraphQL applications.

16:00

10:30

Coffee break

Main lobby


On-premise data centers do not need to be legacy

Red Hat
We will explore the shift from on-premise to the public cloud: the reasons behind it, its advantages, and its disadvantages. We will discuss what learnings and design choices we can bring back to private clouds. Uncover how to structure a modern private cloud, and discover which technologies can help you.

My Journey to SRE

Depop
In this talk, I share my experience and perspective as an SE going into SRE. The talk is divided into two parts: the "pre-journey", as I like to call it, focuses on the key skills you pick up as an SE that prepare you for the role, and then the actual journey continues with how you start to fit into SRE.

Deploying applications to Kubernetes using Helm

Pure Storage
Helm is a package manager for Kubernetes which simplifies deployments to Kubernetes. In this session we'll go over exactly what Helm is, the benefits it brings, and how to deploy applications to Kubernetes.

12:30

Building a Viral Open Source AI Chatbot: A Journey from Concept to Reality

Aleios
This talk is about how we built Quivr, the open-source virtual brain. In just over 2 months we are at 14k stars and 6.5k active users. This is a story of the crazy highs, some broken production and how we created a community behind this project. I would love to be able to share the journey with you.

13:00

Lunch & networking

Main lobby


Smart Contract-driven AI: A New Era of Decentralization

Free Will AI
Until now, dapps have never been able to incorporate machine learning models. It is hard for a dapp such as DTube to compete against centralized apps such as YouTube that can offer better content recommendations to their users. We'll explain how anyone can easily deploy an ML model to the blockchain.

Slaying the SLAs: Mastering Effective Communication for Seamless Customer Experience!

Smarsh
In today's fast-paced development landscape, meeting SLAs is crucial for any SaaS organisation to maintain a competitive edge. Join us to learn how to build & manage SLOs to prevent customer SLAs from being breached. As an SRE, it's crucial to break down those barriers & communicate effectively.

From Docker'fail to Dockerfile.. a true story!

Spotify
In this session, we compiled 10 horror Dockerfile don'ts, a.k.a. “Dockerfails”. To end on an optimistic note, we will showcase as many good practices as we can in some popular CI/CD tools (GitHub/GitLab/Jenkins), with a touch of GitOps.

Resilient Distributed Caching in Highly Available Real-Time Payment Systems

Discover Financial Services
Have you ever wondered what it takes to build a highly resilient distributed caching platform for critical real-time payment systems? Join us as we share our journey of building a highly available and fault-tolerant caching solution while leveraging automation to achieve a faster MTTR.

16:00

10:30

Coffee break

Main lobby


11:00

The efficient way of Autoscaling Your Workload in Kubernetes

Dynatrace
Struggling with limited observability data for HPA? Tired of provisioning resources that lead to unnecessary costs? Keptn's Metric Server solves this by exposing new metrics in K8s. Join me to see how to scale workloads with efficient HPA rules.

11:30

DevOps 2.0 - Bigger, Badder with More Automation

LinearB
It’s hard to imagine ever going back on the process & quality improvements brought by CI/CD. But it only got us so far. It's time we enter the next phase of the DevOps evolution: to continue streamlining our engineering operations, we need to rethink our processes & eliminate the mounting friction.

Synthetic Monitoring and E2E Testing: 2 Sides of the Same Coin

Elastic
As a developer who loves SRE, I want to collaborate with SREs and support engineers to build amazing software. With traditional support teams' adoption of SRE, we automate similar workflows with different tools. Let's use Synthetic Monitoring as E2E tests to validate the user experience together!

Chaos Synergy: A Game Theory Approach to Chaos Engineering for Enhanced System Resilience

Samsung
Unlock the power of game theory principles in chaos engineering! Our AI model, Chaos Synergy, predicts system behavior under different failure conditions, providing insights for better system hardening. Join me at SREday 2023 and learn how to enhance your system's resiliency with this novel approach.

13:00

Lunch & networking

Main lobby


Stop configuring infrastructure, start coding it!

AWS
Infrastructure as Code is a best practice, but we are writing configuration files, not real code. Is there another way? Yes! In this session, we’ll dive into the open-source Cloud Development Kit that lets us define cloud infrastructure in languages like Python and TypeScript.

Know your data: The stats behind your alerts

NGINX
Quick, what's the difference between mean, mode and median? Review how statistical behavior impacts alerting. Learn why a median is best for historical anomaly detection. Jump into distributions, data alignment challenges and the trouble with sampling. Walk out with a deeper understanding of your metrics.

15:00

Feature Flags with OpenFeature: Enabling Faster, More Collaborative Development

Dynatrace
Have you ever thought about using feature flags to reduce the risk every time you release? Why not get started with OpenFeature, an open-source option to dive into progressive delivery?

Reliability Engineering: What we learnt so far and what's next

Netquest
From Systems Engineering to Platforms Team, a journey through DevOps and SRE adoptions, progressing from losing one team member per year to “The happiest team in Netquest”. A tale of building trust and reliability while adopting anti-fragility; with best practices, anti-patterns and teaming tips.

16:00

Day 2 - Sep 15

Keynote: The state of SRE in 2023

SRE Author
Come and explore the landscape of SRE as it is in 2023, with the new trends, techniques and tools on the horizon.

Keynote: The Big One

Datadog
On March 8th, Datadog had a massive global outage. It took more than 500 engineers split amongst many teams over two days to coordinate the incident response. In this talk, I will go over the trigger of the incident and why it took such large-scale efforts to resolve, and some of the technical...

10:00

Keynote: Failing to Autoscale?

StormForge
Resource inefficiency with K8S autoscaling often begins with improper vertical scaling. Then horizontal scaling compounds these issues, which manifest as high cloud costs as the cluster autoscaler adds instances. In this talk, I will show how you can use ML to enable effective autoscaling.

10:30

Coffee break

Main lobby


11:00

Statuscake vs Kibana alerting

Elastic
Moving from external alerting tools to in-house built solutions can be an exciting challenge. Learn how Observability at Elastic has put Synthetic monitoring in place to drive the Kibana alerting solution for a better SRE experience.

Data Radar Maps: A Snapshot of Your Organizational Data Framework

The Joy of Data
Meet Data Radar Maps: actionable data tools that help product and engineering teams grow with data science. They allow you to measure and track the maturity and progress of the data framework in your organization. All this and a bit of knowledge sharing!

Use continuous profiling to gain a deeper understanding of your incidents

Grafana Labs
During this talk I will show how continuous profiling can aid investigators during an active incident and reduce the time to recovery. Continuous profiling data can also give you more clues to finding the root cause in the aftermath. I will share our experiences with real world...

Centralise legacy auth at the ingress gateway

StackAdapt
Tired of "just use JWT!" tutorials? Learn how you could move your existing legacy authn/authz to a centralised service working together with your ingress gateway. Convert basic, bearer or other authentication mechanisms into a common format, even handling multiple auth types for all your endpoints.

13:00

Lunch & networking

Main lobby


Meet Kairos, an open-source project building the immutable Kubernetes edge

Spectro Cloud
The edge is the new cloud! Kairos is an open-source project that acts as a factory to produce immutable operating systems with any Linux distribution from container images and build Kubernetes clusters in small-footprint environments at the edge. All of this in a flash with a few manual actions!

How We Stopped Thanos from Snapping $100,000 from our Infra Budget

Zenduty
In a galaxy not so far away, where data is as vast as the cosmos, our team was troubled with observability data chaos. Seeking some clarity, we sought salvation with Thanos and Fluentbit – fabled titans against our metric storage and logging issues. Thanos empowered us with a Prometheus setup...

15:00

10:30

Coffee break

Main lobby


We need to talk about SRE burnout

Cloudsoft
Incident resolution is stressful, but competing demands, cultural misfires and lack of automation are tipping SRE stress over to SRE burnout, impacting the health of SREs, and SRE success. But, it can be addressed. Come to this talk for actionable insights into preventing burnout and reducing toil!

Data Reliability Engineering: The Secret Sauce Behind Our Success

Sahaj Software
Data products break the software rules. Schemas shift, data acts up, systems revolt. Yet demand skyrockets. We built solutions. You'll master our playbook—data contracts, testing, monitoring—forcing order upon data chaos. Tame the wild west; lead the data age.

Revolutionizing Developer Experience: Accelerating Software Development in the Modern Era

Parsectix
Revitalize your DevEx! Unleash innovation & efficiency in engineering teams by mastering Infrastructure as Code, CI/CD, & more. Streamline onboarding, create dynamic developer hubs, & cultivate a thriving team culture. Witness a real-life transformation and revolutionize software development today.

Automation Best Practices for SRE and Security: Insights from Building a Workflow Automation Product

Datadog
When building a workflow automation product, I noted that different types of teams and organizations faced different issues with the automation they envisioned. In this talk, I aim to outline best practices distilled from the many ways I saw different teams building automation.

13:00

Lunch & networking

Main lobby


14:00

Culture Trumps YAML

Dell Technologies
Over my last few jobs, I have been brought on as a senior SRE. Instead of wrangling CI/CD or messaging queues, I found my time was better spent introducing, or reintroducing, the concepts of observability, incident & change management and other core SRE topics. SRE is a positive lever for culture.

14:30

Mastering Incident Communications

PagerDuty
Clear comms are critical when moments matter. Translating engineering terms to customer-friendly comms is harder than you think. Come to this talk to learn how to work with your customer-facing teams to translate your systems to customer symptoms, allowing you to get back to focusing on a fix.

15:00

10:30

Coffee break

Main lobby


11:00

Kubernetes, Decisions

AWS
The top 7 decisions a DevOps architect will need to face when running Kubernetes at high scale.

11:30

Unwiring High Cardinality

last9.io
Observability relies on metrics as a crucial aspect, providing a cost-effective and speedy way to address SDLC and Software health queries. From combating Noisy Neighbors to battling in the Streaming Wars and dealing with the pulse of High Cardinality, what are the best workflows to deal with it?

Observability Visualization in the Age of OpenTelemetry

Swisscom
OpenTelemetry isn't just a step forward; it's a leap into a new era of observability. Born from the complex needs of K8s, it has unlocked a world of possibilities that extend beyond the conventional realm. We can now visualize data like never before, illuminating insights that were once hidden.

12:30

Lunch & networking

Main lobby


13:30

Tickets

General Admission

£799
If your company is paying for you to attend, please pick this option!

Self-Funding

£149
If you're paying from your own pocket to gain new skills for a better job, pick this!

Student

£49
All students are welcome!

Speakers

Ajuna Kyaruzi - Datadog
Akshay Karle & Carmen Mardiros - Sahaj Software
Alayshia Knighten - Freelance
Alessandro Vozza - Solo.io
Aman Sardana & Bhargav Nachegari - Discover Financial Services
Andrew Kirkpatrick - StackAdapt
Andrew Pruski - Pure Storage
Antoine Carossio & Tristan Kalos - Escape.tech
Antonio Cobo Cuenca - Contino
Ashley Sawatsky - Rootly
Brian Murphy - Dell Technologies
Carly Richmond - Elastic
Christian Simon - Grafana Labs
Dave McAllister - NGINX
David Hirsch - Dynatrace
Devrim Demiroz - Swisscom
Diana Todea - Elastic
Erwin Daria - StormForge
Fabio Alessandro Locati - Red Hat
Fawaz Ghali - Hazelcast
Henrik Rexed - Dynatrace
Joy Chatterjee - The Joy of Data
Kat Gaines - PagerDuty
Kobi Biton - AWS
Krishna Pomar - Smarsh
Lucas Roitman - Free Will AI
Ludovic Farine - Cloudsoft
Madhu Kumar Reddy - Samsung
Mahesh Venkataraman - Accenture
Mandi Walls - PagerDuty
Martin Sakowski & Julian Lang - AWS
Matt Carey - Aleios
Michael Cote - VMware/Pivotal
Miko Pawlikowski - SRE Author
Mohammed Aboullaite & Djalal Elbaz - Spotify
Nicolas Vermande - Spectro Cloud
Pavlos Kleanthous - Parsectix
Piyush Verma - last9.io
Ricardo Castro - FanDuel/Blip.pt
Ruggero Tonelli & Forlidar Macias - Netquest
Sarjeel Yusuf - Datadog
Shubham Srivastava & Deepak Kumar - Zenduty
Simon Copsey - Curious Coffee Club
Temitope Faro - Depop
Yishai Beeri - LinearB

Venue

Everyman Canary Wharf

Crossrail Place,
Canary Wharf,
E14 5AR, London, UK
Level -2

Tube access
Jubilee, Elizabeth and DLR lines: Canary Wharf station

Sponsors & Partners

Want to become a sponsor? Get in touch!