App Management Anti-Pattern: Monitoring a subset of hosts

Working with customers to solve support issues is a great learning experience for our engineers. Not only do we get valuable feedback on features that shapes our product roadmap, but we also get insight into how our customers use New Relic in their unique environments. And that’s also how we learn about application management anti-patterns.

Anti-Pattern: Monitoring a subset of hosts

We occasionally work with customers who are monitoring a subset of hosts in production, rather than all of their production hosts. How do we know they’re not monitoring all of their hosts? Several specific issues tend to crop up when you only manage parts of your application.

Missing Transactions

Murphy’s law says that critical business transactions are bound to happen on the unmanaged hosts. Errors, slow requests and poorly performing queries are all invisible when they happen on unmanaged hosts. By monitoring all hosts, you close that blind spot: a poorly performing transaction is recorded no matter which host serves it.

Seeing only parts of the whole picture

When you monitor only a subset of hosts, virtually all of the metrics you see in New Relic are skewed. Instead of seeing the throughput of your entire application, you see the throughput of activity on just a few hosts. Even with “fair” load balancing across a number of hosts, distribution is not typically equal. We consistently see a level of variability in response time and throughput across hosts supporting a given application.
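To make the skew concrete, here is a small Python sketch. The host names and per-host figures are invented for illustration and do not come from any real account; the point is only that totals and averages computed from two of four hosts can differ noticeably from the figures for the whole application.

```python
# Hypothetical per-host figures: (requests per minute, mean response time in ms).
# These numbers are made up purely to illustrate the effect.
HOSTS = {
    "web-1": (1200, 180.0),
    "web-2": (1100, 190.0),
    "web-3": (900, 240.0),
    "web-4": (600, 420.0),
}


def summarize(host_names):
    """Return (total throughput, throughput-weighted mean response time)."""
    total_rpm = sum(HOSTS[name][0] for name in host_names)
    weighted_ms = sum(HOSTS[name][0] * HOSTS[name][1] for name in host_names) / total_rpm
    return total_rpm, weighted_ms


# What you would see if only web-1 and web-2 were monitored.
monitored = summarize(["web-1", "web-2"])

# What the application is actually doing across all four hosts.
everything = summarize(list(HOSTS))

print("monitored subset:", monitored)
print("all hosts:       ", everything)
```

In this made-up data set the two monitored hosts happen to be the fastest ones, so the subset understates both the total throughput and the average response time of the application as a whole.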

Scalability and Optimization features work best with a complete picture

Our Gold level of service includes capacity and scalability reporting for optimizing application performance. These features are incredibly valuable in helping you scale an application over time and proactively manage performance, but they work best with a complete picture of your application’s performance. It’s very difficult to extrapolate a capacity analysis, for example, from data that covers only some of your hosts.

Gathering performance metrics in pre-production environments

We believe managing performance in production is essential. Real performance data comes from real customers using your application every day. For highly performance-sensitive environments with very large amounts of traffic, we often see customers also deploying to QA and/or load testing environments before releasing to production. In these environments they may run synthetic load tests, specific stress tests or other forms of quality assurance that aim to find the “low hanging fruit” for functional and performance-related issues.

We recommend that customers include the name of the non-production environment in the application name, so they are able to track the performance of these environments separately. For example, for a production application named “My Application”, you might configure the application name in your load testing environment to “My Application (load testing)”.
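As a rough sketch of that naming convention, the Python helper below derives an environment-specific application name from a base name. The function name `environment_app_name` and the environment labels are hypothetical and are not part of any New Relic agent; how the resulting name is handed to the agent depends on your agent's configuration and is not shown here.

```python
def environment_app_name(base_name: str, environment: str) -> str:
    """Build an application name that identifies a non-production environment.

    Hypothetical helper for illustration only: production keeps the plain
    name, and every other environment gets its name appended in parentheses.
    """
    env = environment.strip().lower()
    if env in ("", "production"):
        return base_name
    # Turn identifiers such as "load_testing" into a readable label.
    return f"{base_name} ({env.replace('_', ' ')})"


print(environment_app_name("My Application", "production"))
print(environment_app_name("My Application", "load_testing"))
```

Deriving the name in one place keeps the labels consistent across environments, so production data is never mixed with load-test data under the same application name.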

Spread the love of New Relic everywhere!

Having performance management available across your entire application is the best way to mitigate application problems while optimizing for long-term scalability. Sign up for New Relic today or upgrade your existing service to cover your entire environment. You’ll be glad you did!
