What are X-Ray Sessions?
X-Ray sessions provide deeper insight into your application’s key transactions. Under normal conditions, New Relic gathers the slowest transaction trace each minute. With X-Ray sessions, your key transactions will gather multiple transaction traces whether they’re slow or not. As an added bonus, because the Ruby agent supports thread profiling, the agent gathers profiles targeted at just the x-rayed transactions.
While we’re excited to bring this feature to Ruby, this post isn’t just about explaining the feature, which is thoroughly documented elsewhere. Instead, I want to show how we used New Relic along with other tools to validate that x-ray sessions perform up to our high standards.
Why Do We Care?
In an abstract sense, what kind of performance problems could a feature like x-ray sessions cause? Two big cases come to mind.
First off, there’s memory consumption from transaction traces. The allure of x-ray sessions is getting lots of transaction traces you care about, but that data grows rapidly. If the agent didn’t properly manage its collection of transaction traces, an application could bloat up and consequently slow down.
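To make the memory concern concrete, here is a minimal sketch of one way to keep a trace collection bounded. It is not the agent's actual implementation: the `BoundedTraceBuffer` class, its `capacity` argument, and the drop-the-oldest policy are all assumptions made for illustration.

```ruby
# Hypothetical sketch: a fixed-capacity store for transaction traces.
# Not taken from the New Relic agent.
class BoundedTraceBuffer
  Trace = Struct.new(:name, :duration_ms)

  attr_reader :traces

  def initialize(capacity)
    @capacity = capacity
    @traces = []
  end

  # Add a trace; once full, discard the oldest so memory stays bounded.
  def add(name, duration_ms)
    @traces << Trace.new(name, duration_ms)
    @traces.shift if @traces.size > @capacity
    self
  end
end

buffer = BoundedTraceBuffer.new(3)
[120, 45, 300, 80, 210].each_with_index do |ms, i|
  buffer.add("Controller/action_#{i}", ms)
end
puts buffer.traces.size # never exceeds the capacity of 3
```

Whatever eviction policy is chosen, the point is that the number of retained traces has a hard ceiling no matter how long the session runs.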
The second area of caution is thread profiling. This feature hinges on a background thread polling for backtraces, generally ten times a second. (If you’re interested in more about the techniques that went into building the thread profiler, I gave a talk about it at MountainWest RubyConf.) If the backtrace processing on each polling cycle is too slow, it could drag an application down too.
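The polling approach can be sketched in a few lines of plain Ruby. This is an illustration only, not the agent's profiler: the `BacktraceSampler` class and its bookkeeping are hypothetical, and it relies only on the standard `Thread.list` and `Thread#backtrace` calls.

```ruby
# Hypothetical sketch of a polling thread profiler; not the agent's code.
class BacktraceSampler
  attr_reader :samples

  def initialize(interval_seconds = 0.1) # 0.1 s = ten polls per second
    @interval = interval_seconds
    @samples = Hash.new(0) # top stack frame => number of times it was seen
    @running = false
  end

  def start
    @running = true
    @poller = Thread.new do
      while @running
        poll
        sleep @interval
      end
    end
  end

  def stop
    @running = false
    @poller.join
  end

  private

  # One polling cycle: record the top frame of every other live thread.
  def poll
    Thread.list.each do |thread|
      next if thread == Thread.current # skip the poller itself
      frames = thread.backtrace        # nil if the thread has died
      next if frames.nil? || frames.empty?
      @samples[frames.first] += 1
    end
  end
end

sampler = BacktraceSampler.new(0.01)
worker = Thread.new { 200_000.times { |i| Math.sqrt(i) }; sleep 0.2 }
sampler.start
worker.join
sampler.stop
puts "distinct top frames seen: #{sampler.samples.size}"
```

Everything inside `poll` runs on every cycle for every thread, which is why the cost of that method matters so much with a pool of 64 threads.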
With these concerns in mind we benchmarked the feature using–what else!–New Relic.
Given that this feature accesses the active thread list in a process, I ran the performance tests using Puma (a multithreaded server) on both JRuby 1.7.4 and MRI 2.0. I gave the app a generous thread pool of 64 to work with.
I benchmarked a standard Rails 3.2 application running against MySQL. The test focused on two actions: one heavy with repeated database access, the other quick and lightweight. I also attended to a few other points so that the test setup reasonably approximated a real production environment:
- Run Rails set to the “production” environment. This keeps code reloading and other dev side-effects from muddying the numbers.
- Limit logging, especially for anything per-request. That meant turning down Puma, Rails, and New Relic’s logging.
- Enlarge the default database pool so we didn’t accidentally serialize our “concurrent” access behind a database bottleneck.
For driving load I used the excellent siege with a simulated user count of 100 to put a significant load on the Puma process. Along with providing great support for simulating multiple concurrent users, siege also accepts a file of URLs, via the -f flag, for each of the simulated users to cycle through. The end result is more varied traffic than hitting the same URL repeatedly.
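As a rough sketch of that setup, the snippet below writes a URL file and assembles a matching siege command line. The host, port, and paths are placeholders rather than the ones used in the benchmark; `-c` (concurrent users) and `-f` (URL file) are standard siege flags.

```ruby
# Hypothetical host and paths; substitute the actions you want to exercise.
BASE_URL = "http://localhost:9292"
PATHS = ["/reports/heavy", "/status/light"]

# siege reads one URL per line from the file passed with -f.
def write_urls_file(path, base_url, paths)
  File.write(path, paths.map { |p| "#{base_url}#{p}" }.join("\n") + "\n")
  path
end

# Build the siege invocation: -c sets concurrent users, -f the URL file.
def siege_command(urls_file, users)
  ["siege", "-c", users.to_s, "-f", urls_file]
end

file = write_urls_file("urls.txt", BASE_URL, PATHS)
puts siege_command(file, 100).join(" ")
```

The script only prints the command, so it can be inspected before any load is actually sent.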
[The server is now under siege…]
On each benchmarking run, I followed these steps:
- Start up the application
- Commence the siege, sending load against the app
- Allow a settling-in period, at least 10 minutes
- Confirm in New Relic and siege output that performance is level
- Start x-ray sessions
- Run x-rays to completion
- Let the app run another 10 minutes
These steps gave a good comparison, not only of the normal steady state of the application, but also of the impact during and after an x-ray session. Watching afterward is especially important since it confirms you only pay the overhead for an x-ray session while it’s running: no leftover objects or threads remain to influence the application after the x-ray session is finished.
New Relic for Benchmarking
While New Relic is typically used for ongoing monitoring, it proved useful for this type of benchmarking too.
In particular, I watched the response time and throughput of the application while x-raying transactions. Since that information is right at hand in the web transactions view, I ran my application as usual, confident New Relic was gathering the necessary data. In addition, these numbers were cross-checked with the results coming out of siege to ensure their accuracy.
[03_WebTransaction_View] <caption: Steady state before x-rays>
But this did leave one issue: finding the right time periods for tallying up my measurements. While New Relic’s graphs are great, there wasn’t any visual indication when an x-ray started. How could I easily locate the time ranges for comparison?
Deployments to the Rescue
Another feature of New Relic filled this gap neatly. Deployments let you mark arbitrary points in time in New Relic. These markers are layered onto all the pertinent graphs across the UI. By recording a deployment at each stage of testing, I could use those markers in the graphs to gather the performance data.
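The idea of turning stage markers into comparison windows can be sketched as follows. This is only an illustration of the bookkeeping, not a New Relic API: the `Marker` struct, both helper methods, the minute offsets, and the sample values are made up to exercise the code.

```ruby
# Hypothetical sketch: turn stage markers into named time windows.
Marker = Struct.new(:label, :time)

# Each window runs from one marker to the next; the last marker ends the run.
def windows_from(markers)
  markers.sort_by(&:time).each_cons(2).map do |from, to|
    { stage: from.label, from: from.time, to: to.time }
  end
end

# Average the [time, value] samples whose time falls inside a window.
def average_in(window, samples)
  values = samples.select { |t, _| t >= window[:from] && t < window[:to] }
                  .map { |_, v| v }
  return nil if values.empty?
  values.inject(:+).to_f / values.size
end

# Made-up markers (minutes since start) and response times (ms).
markers = [
  Marker.new("baseline", 0),
  Marker.new("x-ray running", 10),
  Marker.new("after x-ray", 20),
  Marker.new("end", 30)
]
samples = [[5, 100.0], [15, 102.0], [25, 100.0]]

windows_from(markers).each do |window|
  puts "#{window[:stage]}: #{average_in(window, samples)} ms"
end
```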
That’s how I gathered the data, but what were the results? Even under heavily loaded conditions, X-Ray Sessions only incurred a 2-3% overhead. In absolute terms, I saw a cost of roughly 2ms per request on my heaviest, 100+ ms action. Alongside the measurements in New Relic, we also independently confirmed the values with the results from siege.
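For reference, the percentage follows directly from the absolute numbers. The helper below is a hypothetical convenience, not part of any tool mentioned here; it uses the rounded figures from the text (a roughly 100 ms action costing about 2 ms more).

```ruby
# Relative overhead of a feature, as a percentage of the baseline time.
def overhead_percent(baseline_ms, with_feature_ms)
  (with_feature_ms - baseline_ms) * 100.0 / baseline_ms
end

# Rounded figures from the text: ~100 ms baseline, ~2 ms extra per request.
puts overhead_percent(100, 102) # => 2.0
```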
For the wealth of information the x-ray session provides and the pointers it can give to improving your application’s performance, that overhead will often be a worthwhile investment.
An interesting side-note was that while MRI and JRuby performed similarly overall, JRuby spent more time in the polling. Backtracing in JRuby is known to be expensive, which could account for that. Also, since JRuby doesn’t have a GIL, our internal locks faced real contention, which could have contributed too.
So there’s how we used New Relic to x-ray our X-Ray Sessions. Now go out there and get your x-rays going!