Talks & Presentations

5 Years of Monitoring & Metrics

5 years ago, monitoring was just beginning to emerge from the dark ages.

Since then there's been a Cambrian explosion of tools, a rough formalisation of how the tools should be strung together, the emergence of the monitoringsucks meme, the transformation of #monitoringsucks into #monitoringlove, and the rise of a sister community around Monitorama.

Alert fatigue has become a concept that's entered the devops consciousness, and more advanced shops along the monitoring continuum are analysing their alerting data to help humans and machines work better together.

But Nagios is still the dominant check executor. Plenty of sites still use RRDtool. And plenty of people are still chained to their pagers, with no relief in sight.

What's holding us back? What will the next 5 years look like? Will we still be using Nagios? Have we misjudged our audience? What are our biggest challenges?


The DevOps Field Guide to Understanding Cognitive Biases

As devops practitioners we focus on improving the culture of collaboration so that others play nicely with us & we play nicely with others - but what if the biggest thing holding us back from change is our own brains?

Cognitive biases can deeply affect our behaviour towards others by herding us towards mental shortcuts that are optimised for timeliness over accuracy, and that leave us rationalising irrational behaviour.

You are probably pushing these biases onto other people every day but don't even know it. Does that idea make you feel uncomfortable? You are probably experiencing the Semmelweis reflex kicking your confirmation bias right now.

Knowing is half the battle. This talk will delve into some of the well-known and less well-known biases that may be affecting your ability to work with your peers, and your team's ability to work constructively with other teams.

Attendees will leave the talk with an overview of biases they run into every day, how to hack their brains to use these biases to their advantage, and some tips on how to mitigate the effects of the limitations baked into their wetware.

We have met the enemy and he is us.


Escalating complexity: DevOps learnings from Air France 447

On June 1, 2009, Air France 447 crashed into the Atlantic Ocean, killing all 228 passengers and crew. The 15 minutes leading up to the impact were a terrifying demonstration of how thick the fog of war is in complex systems.

Mainstream reports of the incident put the blame on the pilots - a common motif in incident reports that conveniently ignore a simple fact: people were just actors within a complex system, doing their best based on the information at hand.

While the systems you build and operate likely don't control the fate of people's lives, they share many of the same complexity characteristics. Dev and Ops can learn a great deal from how the feedback loops between these aviation systems are designed and how these systems are operated.

In this talk we cover what happened on the flight, why the mainstream explanation doesn't add up, how design assumptions can impact people's ability to respond to rapidly developing situations, and how to improve your operational effectiveness when dealing with rapidly developing failure scenarios.