
Net Lag: The Fallacy of Zero Latency and the Web

You can tell a lot about people by the way they handle a trip to the airport. For some, the goal is to avoid waiting in the terminal with the rest of the cattle. They’ll try to cut it as close as possible and assume everything will go smoothly, without a hitch. Others are more cautious. They anticipate delays and allow extra time in case of traffic, long security lines, baggage snafus, etc. – in an effort to reduce their risk of missing the flight. The approach one takes doesn’t change the probability of unexpected delays, but it does determine the severity of the consequences. It’s a lot like dealing with latency in distributed systems, except that latency is inevitable and assuming otherwise almost always yields an undesirable outcome.

This is the second installment in our series examining the Fallacies of Distributed Computing and their correlation to modern web app development. Last week we looked at the importance of recognizing the network isn’t always reliable. This post will explore Fallacy number two, “Latency is Zero,” and why it’s dangerous not to account for it.

The fact is, there’s no such thing as instantaneous data transfer. And just like everything else, the farther the destination, the longer it takes to get there. Our own Brian Doll covered this before and we’re pretty sure time travel hasn’t been pioneered since then. But there’s another subtle wrinkle to this Fallacy. Latency is also unpredictable. It’s influenced by several variables, and is constantly in flux. Nowhere is this truer than in web systems, due to the sheer volume and distance of network calls.
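
To put a rough number on the distance effect, here is a back-of-the-envelope sketch in Python. It assumes signals travel through fiber at about two-thirds the speed of light in a vacuum, and the distances are illustrative round figures rather than measured routes. Real paths are longer and add routing, queuing, and processing delays, so treat the result as a floor, not a prediction.

```python
# Rough lower bound on round-trip time imposed by distance alone.
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FACTOR = 2 / 3               # assumption: light in fiber moves at roughly 2/3 of that


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round trip in milliseconds over a straight fiber path."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # Hypothetical distances, chosen only to show how the floor scales.
    for label, km in [("same metro area", 50),
                      ("cross-continent", 4_000),
                      ("intercontinental", 10_000)]:
        print(f"{label:>16}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

Even this best case grows linearly with distance, and it is paid again on every round trip a page or API call needs.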

Developers who ignore the effects of latency are essentially treating remote calls as if they were local calls, a mistake that can degrade or disrupt the user experience and send customers jumping ship for a more reliable alternative. Reasonable or not, the rapid pace of technology innovation combined with the preponderance of web apps has sent user expectations of speed and consistent performance soaring. Failing to meet this challenge is often costly.

As we pointed out in the last post, opinions differ on the Fallacies’ role in web apps – with noted technology authority and Sun veteran Tim Bray arguing that web developers sometimes have the latitude to believe them. Yet, he concedes the web makes latency worse “because every interchange, on average, requires connection setup/teardown.” His contention is that users have come to understand such delays as intrinsic to the environment. With some degree of latency unavoidable, and aspirations to the contrary unrealistic, he recommends focusing on setting the appropriate expectation up front.

It’s a smart tactic. Managing user expectations of web apps is always a good idea. But we don’t think this method alone is a sufficient response to the Fallacy of Zero Latency, because it does nothing to address the problem itself, especially when there are viable engineering solutions for minimizing latency within distributed systems, including web apps.

The most obvious alternative, as Brian pointed out when we visited the topic last year, is to move your services physically closer to users through cloud availability zones and content delivery networks. If that’s not an option, or it’s still not enough to mitigate the delays, additional remedies have been proposed by other evangelists of the Fallacies.

For starters, accepting that latency is neither zero nor constant means figuring out how to reduce the number of calls made and move more data through each one. It’s also helpful to make calls idempotent whenever possible, so they can safely be repeated if a response is slow or lost, without applying the same change twice. As Tim Bray notes, each interchange typically carries the cost of connection setup and teardown, so the fewer there are, the better.
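
Here is a minimal Python sketch of both ideas: batching and idempotent retries. Everything in it is hypothetical: `FakeOrderService` is an in-memory stand-in for a remote service, `lossy` simulates a response that gets lost after the server has already done the work, and the idempotency key is just one common way to make a write safe to repeat.

```python
import uuid


class FakeOrderService:
    """Hypothetical in-memory stand-in for a remote service."""

    def __init__(self):
        self.round_trips = 0   # how many "network calls" we have paid for
        self.orders = []
        self._results = {}     # idempotency key -> result already returned

    def get_item(self, item_id):
        self.round_trips += 1
        return {"id": item_id}

    def get_items(self, item_ids):
        # Batched variant: many items, one round trip.
        self.round_trips += 1
        return [{"id": i} for i in item_ids]

    def create_order(self, key, payload):
        # Idempotent: a repeated key returns the stored result
        # instead of creating a second order.
        self.round_trips += 1
        if key not in self._results:
            self.orders.append(payload)
            self._results[key] = {"order_number": len(self.orders)}
        return self._results[key]


def lossy(fn, failures):
    """Run fn, but drop its response (raise TimeoutError) the first `failures` times."""
    remaining = {"n": failures}

    def wrapper():
        result = fn()  # the server does the work...
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TimeoutError("response lost in transit")  # ...but we never hear back
        return result

    return wrapper


def call_with_retry(fn, attempts=3):
    """Retry fn on TimeoutError. Only safe when fn is idempotent."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except TimeoutError as err:
            last_error = err
    raise last_error


if __name__ == "__main__":
    service = FakeOrderService()
    ids = list(range(20))

    chatty = [service.get_item(i) for i in ids]
    print("one call per item:", service.round_trips, "round trips")

    before = service.round_trips
    batched = service.get_items(ids)
    print("single batched call:", service.round_trips - before, "round trip")

    key = str(uuid.uuid4())
    send = lossy(lambda: service.create_order(key, {"sku": "abc"}), failures=2)
    print("result after retries:", call_with_retry(send))
    print("orders actually created:", len(service.orders))
```

The retry loop repeats the request three times, yet only one order exists at the end, because the service recognizes the repeated key. Without that guarantee, retrying a slow call would risk duplicating it.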

No matter how few calls your web app makes, however, the constant fluctuations in latency make it difficult to assess the true impact on end users. That’s why thorough testing is essential, specifically under conditions that mimic the loads of a production environment. Anything less, and you might get caught off guard when your app doesn’t perform like it did in staging.
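
One way to approximate those conditions before production is to inject artificial, variable delay in front of a call and look at the tail of the response times rather than the average. The Python sketch below is an illustration only: the log-normal delay model, the `median_ms` and `jitter` parameters, and the `fetch_profile` function are all assumptions, not measurements of any real network or API.

```python
import math
import random
import statistics
import time


def with_injected_latency(fn, median_ms, jitter=0.5, rng=None, sleep=time.sleep):
    """Wrap fn so each call first waits a random, log-normally distributed delay.

    median_ms is the median injected delay; jitter is the sigma of the
    underlying normal distribution (larger means a heavier tail).
    """
    rng = rng or random.Random()
    mu = math.log(median_ms)  # the median of a log-normal is exp(mu)

    def wrapper(*args, **kwargs):
        delay_ms = rng.lognormvariate(mu, jitter)
        sleep(delay_ms / 1000)
        return fn(*args, **kwargs)

    return wrapper


def measure_ms(fn, calls):
    """Call fn repeatedly and return each call's wall-clock duration in ms."""
    samples = []
    for _ in range(calls):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def summarize(samples):
    """Return the 50th, 95th and 99th percentile of the samples."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


if __name__ == "__main__":
    def fetch_profile():
        # Hypothetical stand-in for the code path under test.
        return {"name": "example"}

    slow_fetch = with_injected_latency(fetch_profile, median_ms=5)
    stats = summarize(measure_ms(slow_fetch, calls=100))
    for name, value in stats.items():
        print(f"{name}: {value:.1f} ms")
```

The `sleep` parameter is injectable so the wrapper itself can be unit tested without waiting. In a real test suite the same wrapper would sit in front of the app's network client, which makes it easier to see how the UI behaves when a small share of requests take several times longer than the median.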

There’s no way to avoid the latency problem, and everyone seems to agree it’s a greater hindrance within the web environment than for other distributed systems. So while it’s definitely advantageous to “defuse the expectation” of zero latency, as Bray recommends, embracing the inconvenient truth and taking additional precautions can fundamentally improve an app’s response time. And when it comes down to customer satisfaction and the success of your app, a few seconds make a huge difference.

Next up: Bandwidth matters, especially on today’s mobile web.


Comments


  1. Latency on the web is rarely related to the proximity of the hardware to the end users in this day and age. The #1 reason for latency is inefficient code, and particularly inefficient queries.


    Jonah Williams Reply:

    Fadi raises valid points; physical distance is not necessarily a significant factor compared to other sources of latency, and slow server responses are a depressingly common problem. However, I read the comment above as suggesting that eliminating “inefficient code” might somehow fix the problem, and I’m afraid that is not the case.
    Some of your site’s users are competing for access to coffee shop access points with a nasty habit of redirecting requests to “terms of service” click-through pages. Some of your users are on overloaded conference wifi networks. Some of them are using mobile devices or cell network access points. Some of them rely on satellite or wide area wireless networks. No matter how performant your backend systems are, your users will still experience all of the limitations captured in the Fallacies list. Conveniently, many of the steps you might take to reduce the impact of these network conditions will also improve the experience for users interacting with an inefficient backend.


    Posted: 21 January 2012 at 7:54 am by Fadi

  2. I’m hopeful that the current trend toward “mobile first” products will encourage more consideration of real world network conditions when designing interactions which depend on network connections. It’s disappointing to see how many sites don’t include some sort of activity indicator while a long request is in flight or can’t handle a failed request. As developers we have many options for dealing with these network conditions, but until we test against them and prioritize the experience of users dealing with them, I think it is no surprise that the precautions mentioned above are happily ignored.

    Posted: 30 January 2012 at 12:22 pm by Jonah Williams