Trust… but verify.
With Amazon Web Services revenues reaching $5 billion over the last year (as reported in Amazon’s financial statement) and Bessemer Venture Partners’ Cloud Index reflecting another $17 billion in 2014 cloud revenues, many organizations are clearly putting their money on the cloud and Software-as-a-Service (SaaS). While we believe these billions are growing faster than just about anything else in tech, what will it take to 10x these numbers? Many surveys cite trust and security as critical to accelerating cloud adoption.
So what does it mean to trust the cloud? Is it more access controls, encryption, and firewalls? Sure, but just as important, it means that accelerating cloud success requires accountability and visibility.
In the information technology world, the word “trust” has been highly correlated with the marketing of bolt-on security technologies aiming to induce fear and produce money. We believe the reality, however, is that in a world of applications made up of distributed components, teams, service providers, and services, trust without verification is for amateurs. Similarly, trust with verification and accountability isn’t just for professional paranoids; it’s also for professional software teams who are working with an ever-expanding set of distributed components and teams.
These three examples illustrate places where trust isn’t enough—verification and accountability are required:
1. The trust-demarcation point in the cloud
Every service provider has a demarcation point where delivery of a service ends and a customer’s responsibility begins. For example, in the router in your house, the demarc point is the point between the connection to the router controlled by your connectivity provider and the Ethernet ports (or wireless LAN access point) that you use to connect your devices. If you are having a problem with your Internet-connected thermostat, your connectivity provider might be able to offer some assistance, but the thermostat is your responsibility, not theirs. These demarc lines often move from provider to provider and service to service, but they always exist.
In a cloud service, the demarc point varies depending on where in the stack the provider draws the line between the service and your code. For example, in an infrastructure service like Amazon’s EC2, the physical infrastructure is the service provider’s responsibility, while the operating systems, virtual machines, containers, and the code running on them are your responsibility. In a Platform-as-a-Service (PaaS) like AWS Elastic Beanstalk, Pivotal, Heroku, or Azure, the demarc point is between the run-time service and your code. By design (due to security, governance, and liability), Infrastructure-as-a-Service (IaaS) and PaaS providers generally do not want access to your code to help fix issues. It’s fine to trust that the line between the service and your code is properly placed, but anyone worth their salt running a software business runs tools to verify and monitor the state of the service at that demarc point.
Why is it important to verify the demarc point between your application and service providers? In a word, accountability. When something goes wrong, is it the provider or you? It’s critical to determine whether it’s a service provider causing problems or something your team changed in an infrastructure configuration or recent release. Note: Even if you architect your applications to be resilient to service failures, you should be cognizant of the breadth and depth of those failures when making future architectural and service provider decisions. Search “Chaos Monkey” for more on that front.
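To make the accountability question concrete, here is a minimal sketch in Python of probing both sides of a demarc point and attributing a failure. Everything in it is an assumption for illustration: the names ProbeResult, run_probe, and attribute_failure are hypothetical, and the checks are stand-in functions rather than calls to any real provider API. In practice, each check might hit a provider status endpoint on one side and your own application endpoint on the other.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    """Outcome of one probe on one side of the demarc point (hypothetical type)."""
    name: str
    ok: bool
    detail: str = ""


def run_probe(name: str, check: Callable[[], None]) -> ProbeResult:
    """Run a check that signals failure by raising; never let it crash the monitor."""
    try:
        check()
        return ProbeResult(name, True)
    except Exception as exc:
        return ProbeResult(name, False, f"{type(exc).__name__}: {exc}")


def attribute_failure(provider: ProbeResult, app: ProbeResult) -> str:
    """Rough first guess at which side of the demarc point owns the problem."""
    if provider.ok and app.ok:
        return "healthy"
    if not provider.ok:
        # The platform itself is failing its probe, so start with the provider.
        return "provider"
    # The platform looks fine but our code does not: suspect a release or config change.
    return "ours"


if __name__ == "__main__":
    def provider_check() -> None:
        pass  # stand-in: pretend the platform responded normally

    def app_check() -> None:
        raise RuntimeError("HTTP 500 after latest release")  # stand-in failure

    provider = run_probe("platform", provider_check)
    app = run_probe("application", app_check)
    print(attribute_failure(provider, app), "|", app.detail)
```

The attribution is deliberately coarse. It only suggests where to look first, and a real setup would record these results over time so that the breadth and depth of provider failures can inform later architectural decisions.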
2. Trust in the ties between microservices
How do software teams deal with the challenge of dependency-filled monolithic code bases that need innovation and iteration? We break them apart into services. The cool kids call them “microservices.”
Service-based application architectures have been around for a while, but with infrastructure version control, containers, and API standards, building and maintaining composable applications has become much easier. Meanwhile, the need to scale developer productivity and break the man-month constraints of monolithic apps is critical to scaling up successful software businesses. This has resulted in a growing number of applications becoming composites of services that are developed, tested, scaled, and toggled on and off in ways that give an application resiliency and the ability to evolve with fewer build dependencies.
Instead of simply trusting a service to do a job, the best practice is to build in authentication mechanisms, service directories, API monitoring, failure and fallback scenarios, and the ability to turn an internal-only service into an external one.
Otherwise, if the only way these services communicate with each other is via a standardized interface, then the team next door could muck up and inadvertently launch a denial-of-service attack against your service. Organizations need to verify that each service is working. And as noted in Steve Yegge’s famous rant on why and how Amazon is so good at platforms and scaling developer productivity, that means being aware that “when your service says, ‘Oh yes, I’m fine,’ it may well be the case that the only thing still functioning in the server is the little component that knows how to say ‘I’m fine, roger roger, over and out’ in a cheery droid voice.”
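One way to avoid the cheery-droid problem is a "deep" health check that exercises each dependency instead of just answering that the process is alive. The following is a rough Python sketch under stated assumptions: deep_health and the dependency names are hypothetical, and each probe is a plain callable standing in for a real query against a database, queue, or downstream API.

```python
import time
from typing import Callable, Dict


def deep_health(dependencies: Dict[str, Callable[[], bool]],
                budget_seconds: float = 1.0) -> dict:
    """Exercise every dependency and report per-check results.

    Note: this flags probes that ran longer than the budget after the fact;
    it does not interrupt a probe that hangs.
    """
    checks = {}
    for name, probe in dependencies.items():
        started = time.monotonic()
        try:
            ok = bool(probe())
            error = None
        except Exception as exc:
            ok = False
            error = str(exc)
        elapsed = time.monotonic() - started
        checks[name] = {
            "ok": ok and elapsed <= budget_seconds,
            "seconds": round(elapsed, 3),
            "error": error,
        }
    status = "ok" if all(c["ok"] for c in checks.values()) else "degraded"
    return {"status": status, "checks": checks}


if __name__ == "__main__":
    def database_probe() -> bool:
        return True  # stand-in for something like running "SELECT 1"

    def queue_probe() -> bool:
        raise ConnectionError("broker unreachable")  # stand-in failure

    print(deep_health({"database": database_probe, "queue": queue_probe}))
```

A service exposing this kind of report says "degraded" as soon as one dependency fails, which gives callers and monitoring tools something to verify beyond a bare "I'm fine."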
3. In the customer-experience race, code changes are constant
Removing friction and providing moments of joy and discovery in your customer’s experience requires endless art and experimentation. The science behind enabling this art requires constant code changes.
In a world of constantly changing software, a change to a single line of code could turn a site or app that worked fine yesterday into an outage that takes down the entire business today. Fear of these scenarios could keep developers from pushing code to production.
Instead of simply trusting that code is ready for production, best practices call for verifying the code with CI tools, automated testing, automated security testing, and load testing, as well as monitoring production applications for availability, performance, feature adoption, cost controls, and more.
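As an illustration of verifying a release rather than trusting it, the sketch below compares a new release's production metrics against a baseline and lists anything that should block promotion. It is only a sketch: ReleaseMetrics, verify_release, and the threshold values are hypothetical choices made for this example, and real numbers would come from your own monitoring and load-testing tools.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReleaseMetrics:
    """Metrics observed for one release over a fixed window (hypothetical shape)."""
    requests: int
    errors: int
    p95_latency_ms: float


def verify_release(baseline: ReleaseMetrics,
                   candidate: ReleaseMetrics,
                   max_error_rate: float = 0.01,
                   max_latency_regression: float = 1.2) -> List[str]:
    """Return a list of problems; an empty list means the release passed this gate."""
    if candidate.requests == 0:
        return ["no traffic observed, so nothing was verified"]

    problems = []
    error_rate = candidate.errors / candidate.requests
    if error_rate > max_error_rate:
        problems.append(f"error rate {error_rate:.2%} exceeds {max_error_rate:.2%}")

    latency_limit = baseline.p95_latency_ms * max_latency_regression
    if candidate.p95_latency_ms > latency_limit:
        problems.append(
            f"p95 latency {candidate.p95_latency_ms:.0f} ms exceeds "
            f"limit of {latency_limit:.0f} ms"
        )
    return problems


if __name__ == "__main__":
    baseline = ReleaseMetrics(requests=1000, errors=1, p95_latency_ms=200.0)
    candidate = ReleaseMetrics(requests=1000, errors=50, p95_latency_ms=400.0)
    for problem in verify_release(baseline, candidate):
        print("BLOCK:", problem)
```

A gate like this can run automatically after each deploy, so that pushing code stays routine while regressions are caught by measurement rather than by customers.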
(For more on this topic, see this BlazeMeter guest post on Two Vital Components Your Continuous Delivery Process Can’t Ignore.)
Don’t trust the cloud…
The three examples above demonstrate cloud-based relationships that require accountability, verification, and monitoring in a modern distributed application portfolio. However, trust and verification are needed just about anywhere multiple teams work together, especially when you’re dealing with contractors, cost controls, and other moving parts. Cloud services can give software teams unprecedented levels of focus and speed—but the smart approach is always to make sure you have visibility into everything.