Load testing is crucial. No one wants their website or app to crash—and load tests help us check whether our systems can cope with high traffic loads. But load testing alone isn’t enough. To get the full picture, you need to run load tests and solve the performance issues revealed by those tests.
That requires a comprehensive, cross-platform solution combining load testing with software analytics. New Relic Insights answers this need; it lets you collect metrics from any external source and visualize this data on its dashboards to pinpoint the root cause of your system’s performance issue.
Let’s look at how New Relic Insights works with BlazeMeter’s performance testing platform—and how together they can answer your specific queries, reveal patterns that are hidden in normal graphs, and provide drill-down analytics to show the underlying sources of performance issues.
Feeding Data Into New Relic’s Insights Platform
First, you need to set up a data feed into New Relic. Here, I have already set up a data feed on the request and transaction levels. The next step is to get all the data in one place, so I’m going to send all the results from a BlazeMeter load test directly into Insights. BlazeMeter is fully integrated with New Relic Insights so all it takes is a click of a button on BlazeMeter’s ‘Test Edit’ page.
Using NRQL to ask the database your questions
Once you’ve enabled the Insights integration, you can start the test. Just check that the data is there and make your first Insights NRQL query. This is a really cool function that lets you ask the database for information on the specific data that you want to see. Just go to New Relic’s query field, type “SELECT * from BlazeMeter”, hit Enter, and you’ll see the results in a matter of seconds.
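If you would rather run the same query from a script than from the query field, something along these lines may work. This is only a sketch: the endpoint and the `X-Query-Key` header reflect my understanding of the Insights Query API and should be checked against New Relic's documentation. `ACCOUNT_ID` and `QUERY_KEY` are placeholders, and `build_insights_request` is a hypothetical helper. The code builds the request but does not send it.

```python
import urllib.parse
import urllib.request

# Placeholders (assumptions): substitute your own account ID and query key.
ACCOUNT_ID = "1234567"
QUERY_KEY = "YOUR_QUERY_KEY"


def build_insights_request(nrql, account_id=ACCOUNT_ID, query_key=QUERY_KEY):
    """Build (but do not send) a GET request carrying an NRQL query."""
    base = f"https://insights-api.newrelic.com/v1/accounts/{account_id}/query"
    # Percent-encode the whole query so spaces and '*' are safe in the URL.
    url = base + "?nrql=" + urllib.parse.quote(nrql, safe="")
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json", "X-Query-Key": query_key},
    )


request = build_insights_request("SELECT * FROM BlazeMeter")
print(request.full_url)
# To actually run the query: urllib.request.urlopen(request).read()
```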
Now you’re ready to analyze the data that’s come from the test.
I wanted to check out the throughput and, once again, NRQL queries make this easy. Just use a simple counting function and limit the timeframe to show the most recent data. My query looks like this:
SELECT count(*) from BlazeMeter since 1 minute ago
That makes it easy to see the current throughput as a huge number on the screen:
To break this down further, you might want to see which requests, and in what proportion, make up the overall count. To view this, you’d use the “FACET” clause in the NRQL queries so the query looks like this:
SELECT count(*) FROM BlazeMeter FACET label SINCE 1 minute ago
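To make the FACET idea concrete, here is a small Python sketch of the same grouping done by hand. The label values are invented for illustration and the helper names are hypothetical; Insights does this work for you on the server side.

```python
from collections import Counter

# Hypothetical sample of request labels from the last minute of a test.
labels = ["home", "demo", "home", "login", "demo", "home"]


def facet_counts(values):
    """Count occurrences per value, like count(*) ... FACET label."""
    return Counter(values)


def facet_shares(values):
    """Return each value's share of the total, as a fraction between 0 and 1."""
    counts = facet_counts(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


counts = facet_counts(labels)
shares = facet_shares(labels)
for label, count in counts.most_common():
    print(f"{label}: {count} requests ({shares[label]:.0%})")
```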
These are just two examples, but you can get pretty much any information you want just by entering the right syntax in your NRQL query. To learn more about creating NRQL queries, see New Relic’s NRQL documentation.
Analyzing the dashboards
You can easily view and analyze this data by putting it on New Relic Insights’ dynamic dashboards. For example, let’s add that big number into the dashboard as a widget:
Now let’s make it dynamic! We can view the request counts in a time series. In fact, this is exactly what the clause in New Relic’s NRQL query is called: TIMESERIES. Our query now changes to:
SELECT count(*) FROM BlazeMeter FACET label TIMESERIES 1 minute SINCE 30 minutes ago
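Conceptually, combining FACET with TIMESERIES means counting requests per label inside each one-minute bucket. The following Python sketch shows that bucketing on made-up data; the timestamps, labels, and the `timeseries_counts` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical samples: (seconds since test start, request label).
samples = [(5, "home"), (42, "demo"), (61, "home"), (75, "home"), (130, "demo")]


def timeseries_counts(rows, bucket_seconds=60):
    """Count rows per (time bucket, label); bucket 0 covers seconds 0-59, etc."""
    series = defaultdict(lambda: defaultdict(int))
    for timestamp, label in rows:
        bucket = timestamp // bucket_seconds
        series[bucket][label] += 1
    return {bucket: dict(by_label) for bucket, by_label in series.items()}


series = timeseries_counts(samples)
for bucket in sorted(series):
    print(f"minute {bucket}: {series[bucket]}")
```

Each bucket corresponds to one vertical slice of the stacked graph, and each label to one band within it.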
And what do we get? A beautiful stacked graph, which shows precisely when the load hits our servers—and segments it by the requests made:
Add this as a widget to your Insights dashboard and spend a few minutes watching it. You’ll see that the dashboard automatically updates to constantly reflect the most recent changes in your metrics. (If you want to plot the graph with active virtual users in the test, you can find the active users count in the “users” field of the “BlazeMeter” table.)
Now let’s check how our service responds to those requests. We want to analyze as many metrics as possible with a single query, so we’ll ask Insights to plot the timeline for the minimum, average, median, and 90th percentile response times. This is how the query should look:
SELECT min(responseTime), average(responseTime), percentile(responseTime, 50,90) from BlazeMeter timeseries 1 minute since 30 minutes ago
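For readers less familiar with percentiles, this Python sketch computes the same four figures for a small, invented list of response times. It uses the simple nearest-rank definition of a percentile, which is an assumption on my part and not necessarily how Insights computes its values, so expect small differences on real data.

```python
import math
import statistics

# Hypothetical response times in milliseconds.
response_times = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]


def nearest_rank_percentile(values, percent):
    """Smallest value with at least `percent` percent of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


summary = {
    "min": min(response_times),
    "average": statistics.mean(response_times),
    "p50": nearest_rank_percentile(response_times, 50),
    "p90": nearest_rank_percentile(response_times, 90),
}
print(summary)
```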
That’s useful, but simple aggregated charts can hide important information about response times. That’s why, in load testing, people often search for important patterns in the distribution of response times. Insights makes this easy, too. Just use the “histogram” pseudo-function to plot the bucket-distribution from a huge data set:
SELECT histogram(responseTime, 10000, 50) FROM BlazeMeter since 5 minutes ago
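In that query, 10000 is the ceiling in milliseconds and 50 is the number of buckets, so each bucket is 200 ms wide. The Python sketch below shows the bucketing on invented data. Placing values above the ceiling in the last bucket is my assumption about how outliers are handled, and the local `histogram` function is a stand-in for illustration, not New Relic code.

```python
def histogram(values, ceiling, buckets):
    """Count values into `buckets` equal-width bins covering 0..ceiling.

    Assumption: values at or above the ceiling land in the last bucket,
    and negative values land in the first.
    """
    width = ceiling / buckets
    counts = [0] * buckets
    for value in values:
        index = int(value // width)
        index = max(0, min(index, buckets - 1))
        counts[index] += 1
    return counts


# Hypothetical response times in milliseconds, with two clusters.
response_times = [150, 180, 210, 390, 4010, 4100, 4190, 12000]

counts = histogram(response_times, ceiling=10000, buckets=50)
width = 10000 / 50
for index, count in enumerate(counts):
    if count:
        low = index * width
        print(f"{low:.0f}-{low + width:.0f} ms: {count}")
```

Two separate clusters of non-empty buckets, as in this toy data, are the kind of pattern an average alone would hide.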
Now it’s easy to see an important fact that we couldn’t before: The response times aren’t constant or normally distributed.
To find out what is causing these peaks, we need to drill down. Fortunately, Insights’ dashboard filtering makes that easy as well. You can drill down for more specific details on previously accessed analytics without having to repeat the queries, and you don’t need to add a WHERE clause. You can just use the Filters feature on your Insights dashboard. To enable this feature, start editing your current dashboard and click the “Edit Filter” link at the top. Then choose the “BlazeMeter” table and the “label” field, and save your changes.
Now let’s start using the Filter field at the top of the dashboard. We already know the possible values for the labels from the widget we created (which showed the composition of requests). Now we can filter by them and view the distribution widget. The graph shows that our overall distribution is composed of several Gaussian distributions. For example, in my load test I was able to identify a bell curve at around 4,000 milliseconds, and discover that it was generated entirely by requests labeled “demo.”
Now I want to see how these externally measured requests and responses correlate with internal application metrics. I had previously set up the collection of request and transaction level data using New Relic agents, so this data has already been fed into the “Transaction” table in Insights. The table contains internally measured response times at both the database level and the application level. Again, I’d like to see several metrics on a single graph, so my query is:
SELECT average(duration), average(databaseDuration), average(webDuration) from Transaction timeseries 1 minute since 30 minutes ago
After adding it to the dashboard, we can see that the values correlate perfectly with what we measured from the load generator side. That leads to an important conclusion: because the internally measured durations track the externally measured response times, the network is not what is causing the slowness, so the source is likely to be in the application itself.
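The reasoning can be checked numerically. The Python sketch below uses invented per-minute averages to compute how closely the two series move together and how much time is spent outside the application. The numbers are hypothetical and only illustrate the argument; they are not results from my test.

```python
import math

# Hypothetical per-minute averages in milliseconds.
external = [820, 1450, 3900, 4100, 2300]   # measured by the load generator
internal = [800, 1425, 3880, 4075, 2280]   # measured inside the application


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Time not accounted for by the application (network, queuing, and so on).
overhead = [e - i for e, i in zip(external, internal)]
print(f"correlation: {pearson(external, internal):.3f}")
print(f"time spent outside the application (ms): {overhead}")
```

A correlation close to 1 with a small, steady overhead points at the application; a large or erratic overhead would point at the network instead.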
A static blog post can’t fully express how exciting it feels to see your load test displayed on a live dashboard. It’s really alive! It refreshes every few seconds so it’s constantly up-to-date, and it needs no additional set up. I can simply observe what’s happening with the response times and conduct experiments on the server side to optimize performance—and see the immediate effect.
Finally, I love the fact that I can just query the data and put it on my own dashboard in the exact way that I want. Here’s how mine looks:
Oh, and if you want to turn your live dashboard into a static one, Insights can do that for you, too. Just check out Insights’ documentation on using absolute date values for “SINCE … UNTIL” clauses in NRQL.
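As a rough illustration, this Python sketch builds such a fixed-window query from a test start time and duration. The `static_window_query` helper and the start time are hypothetical, and the quoted `YYYY-MM-DD HH:MM:SS` timestamp format is my assumption, so confirm the exact format and time zone handling in the documentation.

```python
from datetime import datetime, timedelta


def static_window_query(base_query, start, duration):
    """Append absolute SINCE/UNTIL clauses so the result no longer moves."""
    end = start + duration
    fmt = "%Y-%m-%d %H:%M:%S"
    return f"{base_query} SINCE '{start.strftime(fmt)}' UNTIL '{end.strftime(fmt)}'"


# Hypothetical test start time and length.
query = static_window_query(
    "SELECT count(*) FROM BlazeMeter FACET label TIMESERIES 1 minute",
    start=datetime(2015, 3, 1, 14, 0, 0),
    duration=timedelta(minutes=30),
)
print(query)
```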
New Relic Insights and BlazeMeter make a uniquely powerful team. Together, they make it easy to answer specific queries, find hidden patterns, and drill down to see the underlying sources of performance issues.