You can tell a lot about people by the way they handle a trip to the airport. For some, the goal is to avoid waiting in the terminal with the rest of the cattle. They’ll try to cut it as close as possible and assume everything will go off without a hitch. Others are more cautious. They anticipate delays and allow extra time for traffic, long security lines, and baggage snafus in an effort to reduce their risk of missing the flight. The approach one takes doesn’t change the probability of unexpected delays, but it does determine the severity of the consequences. It’s a lot like dealing with latency in distributed systems, except that latency is inevitable and assuming otherwise almost always yields an undesirable outcome.
This is the second installment in our series examining the Fallacies of Distributed Computing and their relevance to modern web app development. Last week we looked at the importance of recognizing that the network isn’t always reliable. This post will explore Fallacy number two, “Latency is Zero,” and why it’s dangerous not to account for it.
The fact is, there’s no such thing as instantaneous data transfer. And just like everything else, the farther the destination, the longer it takes to get there. Our own Brian Doll covered this before and we’re pretty sure time travel hasn’t been pioneered since then. But there’s another subtle wrinkle to this Fallacy. Latency is also unpredictable. It’s influenced by several variables, and is constantly in flux. Nowhere is this truer than in web systems, due to the sheer volume and distance of network calls.
Developers who ignore the effects of latency are essentially treating remote calls as equal to local calls, a mistake that can degrade or disrupt the user experience and send customers jumping ship for a more reliable alternative. Reasonable or not, the rapid pace of technology innovation combined with the preponderance of web apps has sent user expectations of speed and consistent performance soaring. Failing to meet this challenge is often costly.
As we pointed out in the last post, opinions differ on the Fallacies’ role in web apps – with noted technology authority and Sun veteran Tim Bray arguing that web developers sometimes have the latitude to believe them. Yet, he concedes the web makes latency worse “because every interchange, on average, requires connection setup/teardown.” His contention is that users have come to understand such delays as intrinsic to the environment. With some degree of latency unavoidable, and aspirations to the contrary unrealistic, he recommends focusing on setting the appropriate expectation up front.
It’s a smart tactic. Managing user expectations of web apps is always a good idea. But we don’t think this method alone is a sufficient response to the Fallacy of Zero Latency because it does nothing to address the problem itself, especially when there are viable engineering solutions for minimizing latency within distributed systems, including web apps.
The most obvious alternative, as Brian pointed out when we visited the topic last year, is to move your services physically closer to users through cloud availability zones and content delivery networks. If that’s not an option, or it’s still not enough to mitigate the delays, additional remedies have been proposed by other evangelists of the Fallacies.
For starters, accepting that latency is neither zero nor constant means figuring out how to reduce the number of calls made and move more data through each one. As Tim Bray notes, each interchange on the web typically pays for connection setup and teardown, so the fewer there are, the better. It’s also helpful to make calls idempotent whenever possible, so they can be safely retried when a response is slow or never arrives.
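To make those two ideas concrete, here is a minimal Python sketch. Everything in it is hypothetical: `FakeRemoteService`, its methods, and `call_with_retry` are stand-ins invented for illustration, and the "network" is just a counter of round trips. It shows how batching collapses many round trips into one, and how an idempotency key lets a caller retry after a lost reply without creating a duplicate.

```python
import uuid


class FakeRemoteService:
    """Hypothetical stand-in for a remote API; counts round trips, no real network."""

    def __init__(self):
        self.round_trips = 0
        self.orders = {}        # idempotency key -> stored order
        self.fail_next = False  # when True, the next reply is "lost" after the work is done

    def get_user(self, user_id):
        self.round_trips += 1   # one round trip per user
        return {"id": user_id}

    def get_users(self, user_ids):
        self.round_trips += 1   # one round trip carries the whole batch
        return [{"id": uid} for uid in user_ids]

    def create_order(self, key, item):
        self.round_trips += 1
        # Idempotent: a repeated key returns the original order, never a duplicate.
        order = self.orders.setdefault(key, {"item": item})
        if self.fail_next:
            self.fail_next = False
            raise TimeoutError("reply lost in transit")
        return order


def call_with_retry(func, attempts=3):
    """Retry a call that is known to be idempotent, up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except TimeoutError as err:
            last_error = err
    raise last_error


def demo():
    chatty = FakeRemoteService()
    for uid in range(50):
        chatty.get_user(uid)                 # 50 separate round trips

    batched = FakeRemoteService()
    batched.get_users(list(range(50)))       # 1 round trip for the same data

    retried = FakeRemoteService()
    retried.fail_next = True                 # first reply is lost, so the caller retries
    key = str(uuid.uuid4())                  # same key is reused on every attempt
    call_with_retry(lambda: retried.create_order(key, "book"))

    return chatty.round_trips, batched.round_trips, len(retried.orders)
```

Calling `demo()` returns `(50, 1, 1)`: fifty round trips for the chatty version, one for the batched version, and a single stored order even though the order call was sent twice. If each round trip cost an assumed 100 ms, the chatty version would spend about five seconds waiting on the network where the batched one spends a tenth of a second.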
No matter how few calls your web app makes, however, the constant fluctuations in latency make it difficult to assess the true impact on end users. That’s why thorough testing is essential, specifically under conditions that mimic the loads of a production environment. Anything less, and you might get caught off guard when your app doesn’t perform like it did in staging.
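One way to see why fluctuating latency is hard to assess is to look at more than the average. The sketch below is only an illustration under made-up assumptions: the latency model (a 40 ms floor, random jitter, and an occasional 800 ms spike) and the function names are invented, not measurements from any real system. It generates repeatable samples and compares the mean with the median and the 99th percentile.

```python
import random
import statistics


def simulated_latencies(n, seed=0, base_ms=40.0, jitter_ms=25.0,
                        spike_chance=0.05, spike_ms=800.0):
    """Generate n fake latency samples in milliseconds (assumed model, not real data)."""
    rng = random.Random(seed)  # seeded so every test run sees the same samples
    samples = []
    for _ in range(n):
        # Fixed floor plus exponentially distributed jitter with mean jitter_ms.
        latency = base_ms + rng.expovariate(1.0 / jitter_ms)
        # Occasionally add a large spike, e.g. a retransmit or a cold connection.
        if rng.random() < spike_chance:
            latency += spike_ms
        samples.append(latency)
    return samples


def summarize(samples):
    """Return the mean, median (p50), and 99th percentile (p99) of the samples."""
    ordered = sorted(samples)

    def percentile(p):
        index = min(len(ordered) - 1, int(p / 100 * len(ordered)))
        return ordered[index]

    return {
        "mean": statistics.mean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
    }


if __name__ == "__main__":
    summary = summarize(simulated_latencies(1000))
    for name, value in summary.items():
        print(f"{name}: {value:.1f} ms")
```

In this model the median stays close to the floor while the 99th percentile lands in spike territory, so a test that checks only the average would miss the requests users actually complain about. In a staging test, a delay drawn this way could be injected in front of a stubbed remote call to see how the app behaves when the slow cases show up.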
There’s no way to avoid the latency problem, and everyone seems to agree it’s a greater hindrance within the web environment than for other distributed systems. So while it’s definitely advantageous to “defuse the expectation” of zero latency, as Bray recommends, embracing the inconvenient truth and taking additional precautions can fundamentally improve an app’s response time. And when it comes down to customer satisfaction and the success of your app, a few seconds make a huge difference.
Next up: Bandwidth matters, especially on today’s mobile web.