At RailsConf 2012, we sat down with five of our customers (including Jesse Proudman, panel moderator and CEO of the Blue Box Group), and asked them about the ups and downs of scaling massive websites. In the session, they discussed how to manage millions of unique visitors, unexpected traffic bursts and more.
NR: What are the three most actionable items you should pay attention to as your application grows?
PV: I’d recommend watching the following:
- Current and potential bottlenecks (e.g. request queues in front of the application server, connections to third-party services).
- Monitoring as much as possible, both the performance of different infrastructure pieces (request time, memory usage) and behavior within the app (e.g. are signup numbers dropping after the last deployment?).
- Code quality. It’s not as hard as some people believe it to be. Scheduling a week per month for only cleanup/refactoring goes a long way.
NR: What two items caught your team by surprise?
PV: These would be:
- Growth of disk usage by databases. We expected some tables to grow fast and took measures by archiving or truncating them. But others grew slowly and still got pretty big.
- Integration testing becomes complex, if not impossible, as we add new services for our apps to use.
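The slow-growing tables in the first bullet are the kind of surprise a simple projection can surface early. Here is a minimal sketch, assuming you already record periodic size samples per table; the function name `days_until_full` and the sample format are hypothetical, and the linear growth model is an assumption that real tables may not follow.

```python
from datetime import date


def days_until_full(samples, capacity):
    """Project how many days remain before a table reaches `capacity`.

    `samples` is a list of (date, size) tuples sorted by date, with size
    in the same unit as `capacity`. Growth is assumed to be linear between
    the first and last sample, which is a simplification.
    Returns None when the table is not growing.
    """
    (first_day, first_size), (last_day, last_size) = samples[0], samples[-1]
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0:
        raise ValueError("need samples spanning at least one day")

    growth_per_day = (last_size - first_size) / elapsed_days
    if growth_per_day <= 0:
        return None
    return (capacity - last_size) / growth_per_day


# Hypothetical table sizes in MB, sampled 30 days apart.
samples = [(date(2012, 1, 1), 10_000), (date(2012, 1, 31), 13_000)]
remaining = days_until_full(samples, capacity=25_000)
```

Running this for every table, not only the ones expected to grow fast, is what makes the slow growers visible.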
NR: Walk us through your capacity planning process.
PV: New Relic is a big part of our capacity planning. We check it regularly to make sure our apps allow for (sometimes unexpected) growth. Based on those graphs, we can easily provision more app servers or DB slaves, or vertically scale the master DB.
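To make the "provision more app servers" step concrete, here is a rough back-of-the-envelope sketch. It is not the process described in the interview; it assumes you can read peak throughput and average response time off your monitoring graphs, and it applies Little's law (concurrent requests = throughput × response time). The function name, the worker count per server and the 70% utilization target are all illustrative assumptions.

```python
import math


def app_servers_needed(peak_rps, avg_response_s, workers_per_server,
                       target_utilization=0.7):
    """Estimate the number of app servers needed to absorb peak traffic.

    Little's law gives the number of workers busy at any moment:
    throughput (requests/second) times average response time (seconds).
    Each server only contributes a fraction of its workers, so that
    there is headroom left for unexpected bursts.
    """
    busy_workers = peak_rps * avg_response_s
    usable_workers_per_server = workers_per_server * target_utilization
    return math.ceil(busy_workers / usable_workers_per_server)


# Hypothetical numbers: 500 req/s at peak, 200 ms average response time,
# 8 worker processes per server, servers kept at or below 70% busy.
servers = app_servers_needed(500, 0.2, 8)
```

The same arithmetic shows why response time matters as much as traffic: halving the average response time halves the number of busy workers.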
NR: How many people are accessing your site on mobile devices and how do you optimize for that?
PV: 90% of our traffic is mobile. We customize our CSS and HTML for different platforms, leveraging the strengths and abstracting away the weaknesses of each.
NR: How do you solve the data challenge and what do you do with the data you collect?
PV: We use MongoDB’s map/reduce facilities, though we are always investigating alternative solutions. We collect as much information as we possibly can, so that we can always query data going back to the beginning and extract meaningful results. One thing we learnt is that you can’t have too much data, because you never know what questions you will come up with next week.
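For readers unfamiliar with the map/reduce pattern mentioned above, here is a minimal in-memory sketch in plain Python. It does not use MongoDB or its API; the event fields and helper names are hypothetical and only illustrate how a map step emits key/value pairs and a reduce step folds the values for each key.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit one (key, value) pair per raw event."""
    yield event["platform"], 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for a single key."""
    return sum(values)


def map_reduce(records, mapper, reducer):
    """Group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical raw events, stored in full so new questions can be asked later.
events = [
    {"platform": "ios", "action": "signup"},
    {"platform": "android", "action": "signup"},
    {"platform": "ios", "action": "login"},
]
visits_per_platform = map_reduce(events, map_event, reduce_counts)
```

Because the raw events are kept, answering next week's question only means writing a new mapper, for example one that emits `event["action"]` instead of the platform.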
Watch this video for more information on managing real-world web apps at massive scale: