FutureStack14, New Relic’s unique technology and user conference in San Francisco this October, will attract a wide variety of innovators who are doing beautiful, important things with data and software. Etsy’s Daniel Schauenberg, who will give a talk on Data Driven Monitoring on Thursday, October 9th at 1:45 p.m., perfectly embodies those qualities.
Daniel (@mrtazz on Twitter) is a senior software engineer—or, as he likes to call himself, an “Infrastructure Toolsmith.” He’s part of the infrastructure and development tools team at Etsy, the online handmade-goods marketplace. Previously, he worked in systems and network administration, on connecting chemical plants to IT systems, and as an embedded systems networking engineer.
A lover of automation and monitoring, Daniel will discuss Etsy’s monitoring and on-call setup and how the company uses it to gather data about system health and performance. Etsy feeds the data it collects back into the system to make informed decisions about what data is missing and how to design its alerting infrastructure to walk the fine line between being helpful and overwhelming on-call engineers with alerts.
We talked with Daniel about his background and his talk at FutureStack14:
When did you first realize you were a data nerd?
I think I have always been a dormant data nerd without knowing it. In previous jobs, I always had the feeling that something was missing when we approached problems, because there wasn’t such a strong culture of making decisions based on data. When I joined Etsy in 2011 and learned about the culture of graphing everything and making decisions based on data, I was immediately hooked and everything suddenly made sense.
At Etsy we have the motto “when in doubt, graph it.” But this approach also comes with the potential for engineers to feel buried under a pile of graphs and alerts. So figuring out which metrics matter most, and which things to alert on to keep the monitoring system relevant, is a hard but exciting challenge.
Why are graphs so important?
Graphs give a very simple and effective way to visualize things that would otherwise often be invisible in software. I like that they provide a quick glance into the system while also often providing the foundation for finding previously unknown relationships between systems.
What kind of data do you monitor, in particular?
We monitor (or more accurately, graph) everything! We monitor all the standard things like page-response times, requests per second, and utilization of different resources like CPU, memory, and disk across all the systems. In addition, we also monitor how often we have to page the on-call engineer for particular monitoring checks (spoiler: disk space issues are the clear winner), how actionable different checks are, and whether or not they woke up the on-call engineer.
What attracted you to speaking at FutureStack14?
I had heard about the FutureStack13 conference last year when a friend of mine spoke. The topics sounded very interesting, so when I was asked to speak at this year’s conference, I was immediately excited to get the opportunity.
I’m especially looking forward to finding out how other companies use different kinds of data to make things better and to meeting friends and colleagues from the industry.
What are your personal goals? What do you want to monitor that you aren’t monitoring yet?
I want to gain a better understanding of how to apply forms of monitoring other than just binary and simple threshold monitoring, and then to build tools to easily utilize those insights in existing monitoring systems. And to never get woken up by a disk-space alert again!