Scaling a cloud company to handle a billion data requests every hour is part of the challenge that drew Mikey Butler, New Relic’s senior vice president of engineering, to the company. We talk to Mikey about the challenges and opportunities of New Relic’s ever-increasing scale. (Part 1 of this series examined Mikey’s journey to New Relic and the importance of engineering management, while part 3 discusses New Relic’s Next Generation Engineering Process.)
New Relic: How quickly is New Relic scaling?
Mikey: When you look at it in terms of HTTP requests per day, we are one of the busiest websites in the world.
We’re handling over a billion requests an hour. With each one of those requests, there’s a little payload of data that’s coming from our customers that we have to ingest and understand so we can provide the value of that data back to them. And it’s growing by leaps and bounds.
New Relic: How does that kind of scale change things?
Mikey: At a smaller scale, companies can buy their way out of growth problems with very advanced hardware and very advanced software systems: essentially just adding cookie-cutter capacity.
But at a certain point you have to look at a more refined approach, where data gets stored in different storage tiers with different costs at different performance capabilities.
If it’s a small amount of data or a moderate amount of data, the fact that you have very expensive storage doesn’t hurt you. But at the scale that we’re at now, and where we hope to go, we have to be much more strategic about where we store data and how expensive it is, and consider whether we need to trade off some speed of access for cost.
That’s something that all the big players have to do. Everyone is now dealing with the concept of different storage tiers, and the cost differences of those storage tiers. For example, why should you have data that you don’t look at very often in a very expensive solid-state system? Why can’t that be on rotating media?
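The tiering decision Mikey describes can be sketched as a simple routing rule: recent, frequently queried data lands on fast, expensive storage, while older, rarely touched data moves to cheaper, slower media. This is a minimal illustration only; the tier names, age thresholds, and policy are hypothetical, not New Relic's actual configuration.

```python
# Hypothetical cost-aware tier selection by data age. The thresholds
# and tier names are illustrative assumptions, not New Relic's policy.
HOT_DAYS = 7      # keep very recent data on fast solid-state storage
WARM_DAYS = 90    # mid-cost tier for data queried occasionally

def choose_tier(age_days: int) -> str:
    """Route a data point to a storage tier based on its age in days."""
    if age_days <= HOT_DAYS:
        return "hot"    # expensive SSD-backed store, fastest access
    if age_days <= WARM_DAYS:
        return "warm"   # cheaper storage, moderate latency
    return "cold"       # rotating media: cheapest per byte, slowest

print(choose_tier(3))    # "hot"
print(choose_tier(365))  # "cold"
```

In a real system the routing policy would also weigh access frequency and query patterns, not just age, but the cost-versus-latency trade-off is the same.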
The idea of storage cost optimization is very much at play here. The idea of finding elegant ways of distributing queries across those different storage tiers is a big priority for New Relic.
In the past, we had the simplistic approach of having one storage tier, and we could query across the whole thing. Now we’re going to have different storage tiers, with different characteristics, holding different types of data. But you’ve still got to look across all these different types of data and different types of storage backend systems, integrate an answer that spans all that, and present it to the user as an integrated, unified whole.
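The cross-tier query Mikey describes is essentially a scatter-gather: fan the query out to every tier, then merge the partial results into one unified answer. Here is a minimal sketch of that pattern; the in-memory tier data, tier names, and event shape are invented for illustration and stand in for real backend systems.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for three storage tiers,
# each holding time-stamped events.
TIERS = {
    "hot":  [{"ts": 300, "value": 7}],   # e.g., fast SSD-backed store
    "warm": [{"ts": 200, "value": 5}],   # e.g., mid-cost storage
    "cold": [{"ts": 100, "value": 3}],   # e.g., rotating media
}

def query_tier(name, start_ts, end_ts):
    """Query a single tier for events with ts in [start_ts, end_ts]."""
    return [e for e in TIERS[name] if start_ts <= e["ts"] <= end_ts]

def federated_query(start_ts, end_ts):
    """Scatter the query to every tier in parallel, then gather and
    merge the partial results into one time-ordered answer."""
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda n: query_tier(n, start_ts, end_ts), TIERS)
    merged = [event for part in parts for event in part]
    return sorted(merged, key=lambda e: e["ts"])

print(federated_query(100, 300))
```

A production version would hide tier-specific query languages behind each `query_tier` adapter and stream results rather than collecting them in memory, but the scatter-gather-merge shape is the core of presenting many backends as one integrated whole.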
New Relic: What is the company doing to figure that out?
Mikey: Well, if we make mistakes, they should be what I call punching a hole above the waterline in the canoe as opposed to below the waterline in the canoe. That’s kind of the trick of it.
Part of it is that we learn from best-of-breed players that are already out there. We spend time talking to the Googles and the Twitters and the Facebooks to get their learnings from further down the road with respect to scaling. Our data is different, of course, so we have to actually do some interpretation and interpolation to make it relevant to our use case. But there are lessons to be learned.
New Relic: How are we translating those learnings into best practices?
Mikey: We tend to solve the storage-tier problem largely with software. We have our own software for the very high-performance data, in the NRDB. But we integrate off-the-shelf third-party storage offerings for some of the lower-performance tiers. ObjectSpace, for example, or Cassandra-based storage. Being able to do queries across all three of those is part of the game.
It’s off-the-shelf software and off-the-shelf hardware. Then we put value on top of that, and we do the integration that goes with it. But to the degree that we can get away with open source to meet our performance and uptime SLAs, we try to do that.
New Relic: What’s the next step in this process?
Mikey: This is an ongoing challenge. You’re not going to solve it one day and say, “What next?” because the scale keeps growing.