SREDAY

Site Reliability, DevOps and Cloud

June 22, 2024, San Francisco, USA

1 Day
10 Speakers
~100 Attendees

Schedule

10:00

Keynote: Increasing the collaboration between SREs and Developers for the best outcomes on Incident Management

Harness
SREs have been at the forefront of incident management when it comes to recovering services during outages. While SREs can follow best practices such as setting up SLOs and auto-recovery systems as much as possible, P0 and P1 outages require developers to join forces...

10:30

Keynote: Achieving DORA outcomes by building reliability capabilities into your platform

Google
Infrastructure breaks, but systems can persist! We really want systems that withstand unavoidable failures. Abstractly, we understand that. In this talk, Steve presents concepts like spanning failure domains and using generic mitigations, as well as introducing a lab environment where teams can...

11:00

Keynote: Redefining Observability: Cost-Free Cardinality

Cardinal
In Observability, Cardinality often gets the short end of the stick due to high costs. This forces engineers to minimize its use. However, SREs, who need to identify business impact quickly, suffer the most from this limitation because of misaligned incentives between developers & SREs....

11:30

Keynote: AIOps, are we there yet?

Google
Coming soon...

12:00

Coffee break

Main space


12:30

The RunWhen Authors Community: Royalties For Runbooks

RunWhen
Inspired by work from hyperscale SRE teams, RunWhen is building the industry's largest open source library of troubleshooting automation. Contributing Authors receive royalties when the company's enterprise customers use their code to automate root cause analysis and remediation.

13:00

Tools and Best Practices from Running Unikernels in Prod for a Few Years

NanoVMs
Unikernels have been running in production for years now, and that experience has produced a set of new tools, best practices, and workflows. In this talk, come hear about how and why your peers have been pushing unikernels to production and how new methods have been established. Whether it's...

TDD: Are you sure about that?

Codejet
Test-Driven Development (TDD) gives you the peace of mind to sleep well, knowing your code is robust and reliable. By writing tests before the implementation, you ensure that each change meets the requirements and works as intended, reducing the anxiety of unexpected bugs. TDD also enhances the...

Rock around the clock (synchronization): Improve performance for end users with high precision time!

Clockwork Systems
Is the app slow or the network lagging? When it comes to latency across distributed systems, it can be hard to pinpoint where exactly the issue lies. To add to this, business demands drive where we run our workloads today - whether on-premises, cloud or hybrid environments - to enhance agility,...

14:30

Lunch & networking

Main space


The IaC Tooling Face-off for Modern Cloud Native SRE Practices

Env0
With Infrastructure as Code (IaC) becoming the de facto way we manage our infrastructure today, a lot of excellent tools have become widely adopted that each have a different set of strengths. In this talk, we'd like to take a look at the evolution of the IaC landscape over the past decade and...

16:00

How Workload Rightsizing Increases Availability: Acquia’s journey to Kubernetes and achieving 99.99% uptime

StormForge
When migrating to Kubernetes from legacy architecture, organizations typically see application availability suffer. That wasn’t the case though for Acquia, whose website hosting platform supports tens of thousands of highly variable workloads. In this session, we’ll see how Acquia overcame the...

16:30

The Journey of Reliability: Why bother at all?

Stealth Startup
Modern software is all about delivering moments of delight to customers. With those moments constantly competing for users' attention, reliability is not just a luxury; it's a necessity. But what does reliability truly mean for stakeholders, and how do they ensure it in our software systems? This talk...

Top 10 Security Vulnerabilities and Misconfigurations in Kubernetes Infrastructure

Teams often struggle to properly secure Kubernetes clusters because of high workloads. Meanwhile, gaining shell access to a container within the cluster is entirely possible. In this session, we will demonstrate potential attack vectors in a situation where a malicious actor could gain shell access...

SREday - it's a wrap!

SRE Author
Time to wrap up SREday SF: where we go next, and how you can get involved.

Speakers

Uma Mukkara, Harness
Steve McGhee, Google
Ruchir Jha, Cardinal
Thais Melo, Google
Kyle Forster, RunWhen
Ian Eyberg, NanoVMs
Sebastian Kurzynowski, Codejet
Lerna Ekmekcioglu, Clockwork Systems
Ohad Maislish, Env0
Erwin Daria, StormForge
Piyush Verma, Stealth Startup
Vitaliy Shynkar & Bogdan Barchuk
Miko Pawlikowski, SRE Author

Venue

The offices of Harness.io

55 Stockton St, San Francisco,
CA 94108, United States
