Launch day. Years of effort have gone into this moment: You’ve obsessed over every detail of your product and worked tirelessly to prepare your technology stack. As you get ready to launch, the last thing you should have to worry about is your monitoring platform.
But all too frequently, launches are a white-knuckle moment, and complicated monitoring setups add to that. What if something goes wrong? Will you be able to find it in a sea of specialized tools, each holding only a piece of the puzzle? Are you recording the right information in the right format so that you can answer the questions that will come up? Did you build your data pipeline so that it can scale as its workload increases?
There is another way. We believe that monitoring should reduce anxiety instead of adding to it. All your data should be in one place and connected. Tools should help you answer questions quickly, even the ones you never thought you’d need to ask. And scaling up should be our problem, not yours.
This blog post introduces the Telemetry Data Platform behind New Relic One: its design principles, how it achieves millisecond performance, and how it can support you on your biggest days.
Three design principles
As a pioneer of SaaS APM, we have over a decade of experience supporting our customers’ biggest deadlines, sporting events, and product launches year after year. That experience is manifest in everything we’ve built, and at the heart of it all is the Telemetry Data Platform, the world’s most powerful platform to analyze your operational data. The Telemetry Data Platform serves the needs of over 180,000 accounts around the globe by ingesting over a billion telemetry data points every minute. Its unique power comes from adhering to three design principles:
- Observability requires a unified telemetry platform
- Real-time investigation requires both speed and flexibility
- Dynamic demand requires unlimited scalability
Observability requires a unified telemetry platform
By combining your metrics, events, logs, and traces in a unified platform, the Telemetry Data Platform gives you a complete view of your technology stack, enabling you to identify, understand, and resolve the issues that impact your business. No more combing through multiple systems to hunt for needles in different haystacks, while minutes, or even hours, are wasted getting your systems back online. With the New Relic Query Language (NRQL), you get a single interface to explore all of your data.
Real-time investigation requires both speed and flexibility
Most databases require you to choose between speed and flexibility: You can get answers lightning fast, as long as you chose the right schema and indexes. Or, you can ask any question you want, as long as you are willing to wait for the answer.
In today’s complex world of distributed systems, microservices, and ephemeral infrastructure, it’s impossible to predict every question you will need to ask of your data. When trouble strikes, you may need to answer questions you had never thought to ask, and you need those answers fast. That’s why we built the Telemetry Data Platform from the ground up as a schema-less database that answers ad hoc queries quickly without requiring indexes to be defined in advance. You don’t have to choose between speed and flexibility; you get both.
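To make the idea of schema-less, unindexed querying concrete, here is a minimal Python sketch. It is not how the Telemetry Data Platform is implemented; the sample events, field names, and the `scan` and `average` helpers are all hypothetical. It only illustrates that when events share no fixed schema and no index is declared up front, any question can still be answered by scanning and filtering at query time.

```python
from typing import Any, Callable, Optional

Event = dict[str, Any]

# Hypothetical sample events: no shared schema, fields vary per event.
EVENTS: list[Event] = [
    {"type": "Transaction", "app": "checkout", "duration_ms": 120},
    {"type": "Transaction", "app": "checkout", "duration_ms": 480, "region": "eu"},
    {"type": "Log", "level": "error", "message": "timeout", "region": "eu"},
    {"type": "Transaction", "app": "search", "duration_ms": 35},
]


def scan(events: list[Event], predicate: Callable[[Event], bool]) -> list[Event]:
    """Answer an ad hoc question by scanning every event; no index is consulted."""
    return [event for event in events if predicate(event)]


def average(events: list[Event], field: str) -> Optional[float]:
    """Average a numeric field, skipping events that do not carry it."""
    values = [event[field] for event in events if field in event]
    return sum(values) / len(values) if values else None


# A question nobody planned for: which EU events took longer than 100 ms?
slow_eu = scan(
    EVENTS,
    lambda e: e.get("region") == "eu" and e.get("duration_ms", 0) > 100,
)

# Another ad hoc question over the same data, with no schema change needed.
transactions = scan(EVENTS, lambda e: e.get("type") == "Transaction")
avg_duration = average(transactions, "duration_ms")
```

The trade-off in this toy version is that every query touches every event, which is why a real system needs the parallelism described in the next section to keep answers fast.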
How the Telemetry Data Platform achieves millisecond performance
Answering unindexed queries requires processing huge amounts of data, so we’ve optimized the Telemetry Data Platform for speed and parallelization. Every second, the Telemetry Data Platform serves thousands of queries for customers whose answers are spread across multiple terabytes of data. Moving all that data around to search through it doesn’t make sense, so instead, we take the query to the data.
Every New Relic query starts at a query router, which locates the relevant data in the cluster and sends the query to the hundreds, or even thousands, of workers that hold it. To balance memory and IO needs in the Telemetry Data Platform’s multi-tenant cluster, very large queries are broken up into smaller pieces, and those pieces are sent to other routers that deliver the partial queries to the workers holding the data. For recently executed queries, the Telemetry Data Platform’s in-memory cache provides the fastest results; for queries asked less often, the workers scan the data from disk.
Once each worker has read its files, the process runs in reverse. First, the results from each file are merged on the worker. Then, each worker’s result is merged recursively through the routers until the original router has the complete answer, which it returns to the user.
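The router-and-worker flow described above is a form of scatter-gather. The Python sketch below illustrates the pattern under stated assumptions: the `WORKERS` layout, the `fanout` parameter, and the functions `scan_file`, `run_on_worker`, and `route` are hypothetical names invented for this example, and splitting a query by halving the worker list is a simplification, not a description of how New Relic's routers actually partition work. Caching is left out.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster layout: each worker holds several "files" of events.
WORKERS = {
    "worker-1": [[{"app": "checkout"}, {"app": "search"}], [{"app": "checkout"}]],
    "worker-2": [[{"app": "search"}], [{"app": "search"}, {"app": "login"}]],
    "worker-3": [[{"app": "checkout"}]],
    "worker-4": [[{"app": "login"}, {"app": "login"}]],
}


def scan_file(events, field):
    """Count the values of `field` within a single file."""
    return Counter(event[field] for event in events if field in event)


def run_on_worker(worker_id, field):
    """Scan every file on one worker and merge the per-file results locally."""
    partial = Counter()
    for events in WORKERS[worker_id]:
        partial += scan_file(events, field)
    return partial


def route(worker_ids, field, fanout=2):
    """Send the query to the data, then merge partial results on the way back.

    A router with more workers than `fanout` splits the work between two
    sub-routers (modeled here as recursive calls); otherwise it queries its
    workers directly and in parallel.
    """
    if len(worker_ids) <= max(fanout, 1):
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(lambda w: run_on_worker(w, field), worker_ids))
    else:
        mid = len(worker_ids) // 2
        partials = [
            route(worker_ids[:mid], field, fanout),
            route(worker_ids[mid:], field, fanout),
        ]
    merged = Counter()
    for partial in partials:
        merged += partial
    return merged


# The "original router" receives the query and returns the merged answer.
result = route(sorted(WORKERS), "app")
```

Counting works well in this sketch because partial counts can be merged in any order; aggregates such as averages would need each worker to return a sum and a count so that the merge stays correct.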
Dynamic demand requires unlimited scalability
We designed the Telemetry Data Platform to scale without limits to support the unpredictable demand of our customers around the globe. As our customer base has grown over the past decade, from retail to entertainment, apparel to healthcare, and gaming to e-commerce, we have scaled the Telemetry Data Platform along with it, and that scale minimizes the overall impact of local spikes in demand. The platform already ingests over 1 billion data points per minute, so when any customer experiences increased demand, it handles the incremental hundreds of millions of data points with ease.
How the Telemetry Data Platform benefits you
With a lightning-fast median query response of 60 milliseconds and the ability to analyze over 50 billion events in a single query, the Telemetry Data Platform enables you to find the needles within your largest haystacks. And because of its multi-tenant architecture, our smallest customers benefit from the same massive computing resources as our largest users. Additionally, the Telemetry Data Platform delivers the following capabilities:
- Single query interface: Use NRQL to search all your telemetry data
- Intelligence: Correlate insights across all your data sources
- Performance: Query tens of billions of data points with results in milliseconds
- Elasticity: Scale your business and trust your data retention will scale too
- Predictable costs: Only pay for what you need
Don’t miss “How the Telemetry Data Platform Is Built for Scale and Flexibility” to learn more about how the world’s most powerful telemetry database powers the New Relic platform.
Sign up for 100GB of ingest per month and one Full-Stack Observability user license—free forever!
This post contains “forward-looking” statements, as that term is defined under the federal securities laws, including but not limited to statements regarding market trends and opportunities in the Telemetry Data Platform and the benefits that the Telemetry Data Platform may provide to current and potential New Relic customers. The achievement or success of the matters covered by such forward-looking statements are based on New Relic’s current assumptions, expectations, and beliefs and are subject to substantial risks, uncertainties, assumptions, and changes in circumstances that may cause New Relic’s actual results, performance, or achievements to differ materially from those expressed or implied in any forward-looking statement. Further information on factors that could affect New Relic’s financial and other results and the forward-looking statements in this post is included in the filings New Relic makes with the SEC from time to time, including in New Relic’s most recent Form 10-Q, particularly under the captions “Risk Factors” and “Management’s Discussion and Analysis of Financial Condition and Results of Operations.” Copies of these documents may be obtained by visiting New Relic’s Investor Relations website at http://ir.newrelic.com or the SEC’s website at www.sec.gov. New Relic assumes no obligation and does not intend to update these forward-looking statements, except as required by law.