Guest author Christof Geirnaert, Global Director of Integration at managed service provider Sentia Group, explains how Sentia’s Belgium team uses observability to reduce mean time to resolution and deliver on its customer promise and service level agreements.
Sentia Group is a European leader in managed services for complex and critical applications on scalable cloud platforms. As a cloud-agnostic managed service provider (MSP), we help our customers join the digital revolution and take advantage of the benefits of hybrid, private, and public cloud infrastructure.
We pride ourselves on taking extreme ownership of our customers’ application landscape and IT infrastructure by managing availability and performance for their key business services, as well as helping them innovate. To help us deliver on our customer commitments, we extended our toolkit with New Relic One to give us flexible, open, and programmable observability across the full technology stack, including hybrid, hosted private, and public cloud. Our custom approach gives our service delivery team a way to visualize data that prioritizes alerts and issues, so we know where to focus our attention first.
Monitoring the full estate
While cloud services bring many advantages for our customers, such as speed to market, scalability, and agility, these environments also increase complexity, especially when it comes to understanding why something isn’t functioning properly.
New Relic gives us a central observability platform with wide coverage and the ability to ingest data from open source and other tools, and then see that data all in one place. Now we can monitor everything hosted by Sentia, whether that’s on our private Sentia Cloud, Amazon Web Services, Google Cloud Platform, or Microsoft Azure infrastructure. As one of the fastest-growing IT service providers in northwest Europe, we need visibility across the hundreds of sub-accounts and thousands of hosts that we manage, and a central platform gives us exactly that.
Customizing the platform for our use cases
When we searched for the right strategic partner for the next phase of our service provider journey, it was important to us to choose a platform that supports all of our use cases and situations, either through out-of-the-box functionality or through apps we create ourselves.
For example, we need to visualize data using different perspectives, dimensions, and roles based on our company’s and our customers’ needs. New Relic One gives us the flexibility to do that. If we want to ingest data into New Relic from open source tools, New Relic One supports that. We can adapt the platform to best meet our needs as they evolve.
Developing a situational awareness app
One of the first things we wanted to do when we rolled out New Relic One was to provide our service delivery organization with greater context into which alerts and issues are the most important for their teams to focus on. To accomplish this, we decided to co-develop a custom app that allows us to visualize the information we need as an MSP, letting us quickly surface what matters most from the many notifications we get across our infrastructure, customers, and accounts.
When you have a high volume of alerts, it’s difficult and time-consuming to understand what is going on so that problems can be prioritized and resolved as quickly as possible. However, a visual representation that surfaces the most critical issues across the entire estate would give our teams instant visibility into where important problems are happening, so they can decide which resources are needed to resolve them.
Working together with New Relic Expert Services, we co-developed a New Relic One custom app that we call our Situational Awareness Dashboard. It provides our operations leaders with the context and situational awareness they need in an easy-to-understand, visual format.
Automatically surfacing issues
The Situational Awareness Dashboard shows the full estate for a particular business unit, including multiple cloud environments, customers, and accounts. It prioritizes alerts and shows them in groups based on whether they are mission-critical, business-critical, or non-critical for meeting our SLAs. Customers and accounts having issues are at the top of the visualization, so it’s immediately obvious where there’s a potential problem.
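The ordering described above can be illustrated with a minimal sketch. This is not the code of the Situational Awareness Dashboard; the `Alert` shape, the `TIER_ORDER` mapping, the `rank_customers` helper, and the sample data are all hypothetical, chosen only to show how customers with the most urgent issues could be sorted to the top.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical tier ordering: a lower number means more urgent.
TIER_ORDER = {"mission-critical": 0, "business-critical": 1, "non-critical": 2}


@dataclass(frozen=True)
class Alert:
    customer: str
    account: str
    tier: str  # expected to be one of the keys in TIER_ORDER


def rank_customers(alerts):
    """Return (customer, worst_tier, open_alert_count) tuples, most urgent first."""
    by_customer = defaultdict(list)
    for alert in alerts:
        by_customer[alert.customer].append(alert)

    ranked = []
    for customer, items in by_customer.items():
        # The most urgent tier among this customer's open alerts.
        worst = min((a.tier for a in items), key=TIER_ORDER.__getitem__)
        ranked.append((customer, worst, len(items)))

    # Sort by worst tier, then by alert volume (descending), then by name.
    ranked.sort(key=lambda row: (TIER_ORDER[row[1]], -row[2], row[0]))
    return ranked


# Made-up sample data for illustration only.
sample_alerts = [
    Alert("customer-a", "acct-1", "non-critical"),
    Alert("customer-b", "acct-2", "mission-critical"),
    Alert("customer-a", "acct-1", "business-critical"),
    Alert("customer-c", "acct-3", "business-critical"),
]
ranking = rank_customers(sample_alerts)
```

In this sketch, a single mission-critical alert outranks any number of lower-tier alerts, which matches the idea of grouping by SLA impact before looking at volume.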
The telemetry data is enriched with labels that can be used to aggregate and visualize information. We provide different views that show metrics, regions, cloud, and customers. Using red as a visual aid lets us immediately see areas where we need to focus and prioritize.
Now our service delivery teams can quickly see whether a problem is concentrated around a specific availability zone, region, customer, or environment. They can make informed decisions about whether it’s more important to focus on the 100 synthetic alerts popping up or to contact a cloud provider to report problems with its service, for example.
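As a rough illustration of how labels make this kind of concentration visible, the sketch below counts open alerts per label value. The dictionary layout, the label names (`region`, `cloud`, `customer`), and the `concentration` helper are assumptions made for this example, not the app's actual data model.

```python
from collections import Counter


def concentration(alerts, label):
    """Count open alerts per value of one label and return each value's share."""
    counts = Counter(a["labels"].get(label, "unknown") for a in alerts)
    total = sum(counts.values())
    # Largest groups first; an empty input yields an empty list.
    return [(value, n, n / total) for value, n in counts.most_common()]


# Made-up alerts enriched with hypothetical labels.
labeled_alerts = [
    {"name": "synthetic-check-1", "labels": {"region": "eu-west-1", "cloud": "aws", "customer": "customer-a"}},
    {"name": "synthetic-check-2", "labels": {"region": "eu-west-1", "cloud": "aws", "customer": "customer-b"}},
    {"name": "synthetic-check-3", "labels": {"region": "eu-west-1", "cloud": "aws", "customer": "customer-c"}},
    {"name": "host-cpu", "labels": {"region": "westeurope", "cloud": "azure", "customer": "customer-a"}},
]

by_region = concentration(labeled_alerts, "region")
by_customer = concentration(labeled_alerts, "customer")
```

Here three of the four alerts share one region while spanning three customers, the kind of pattern that points toward a provider-side problem rather than a single customer's application.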
Beyond reducing mean time to resolution
While our new app is helping us restore services more quickly, reduce our mean time to resolution, and meet our customer SLAs, we’re already looking to expand its capabilities and use cases to include data that gives the engineer more customer context. Enriching the current information with impact analysis and the strategic importance of the service for our customers would give us an instantly digestible visualization of where we should prioritize our resources.
We expect Full-Stack Observability to support the sustainable growth of our business in other ways as well. As our customers progress and mature on their cloud journey, they will increasingly be prepared to invest in Full-Stack Observability for their applications. Offering visibility and context to reduce cloud complexity gives us a way to differentiate our business and increase our value to our customers.