At our recent New Relic London Summit, New Relic Site Reliability Engineer Alice Goldfuss presented a talk on “Scalable Meatfrastructure: Building Stable DevOps Teams.” Alice explained to the initially perplexed crowd that the “meat” in “meatfrastructure” refers to the most important infrastructure underpinning any organization: its people.
“At the end of the day,” Alice said, “your people are the ones who pick out your technologies, install them, are going to fix them when they break at 3 a.m. And if those people don’t have processes in place to keep them reliable and happy, everything is going to crumble.”
She shared her advice on growing a stable—and happy—meatfrastructure in your own organization:
Move your team’s culture from oral to written
It is important to determine the “bus factor” of your team, Alice explained. Bluntly, that means, “How many of your people can get hit by a bus before you no longer have the knowledge you need to move forward?” Most teams have a bus factor of 1 or 2, which is way too low. The best way to raise your bus factor, she said, is through documentation.
Moving your team from an oral culture to a written one, while challenging, can make all the difference. Be sure everything is well documented and available to everyone at any time: your team’s mission, its chosen tools and methodologies, the standard practices and procedures, and so on. Alice advised: “Factor time into your tickets or tasks for documentation, and everyone should feel comfortable in writing and asking for docs.”
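To make the bus factor concrete, here is a minimal Python sketch that computes it from a map of knowledge areas to the people who understand each one. The area names, people, and the `threshold` parameter are hypothetical and not from Alice’s talk; the sketch simply assumes the bus factor is set by the least-covered area.

```python
from typing import Dict, List, Set


def bus_factor(knowledge: Dict[str, Set[str]]) -> int:
    """Return the number of people covering the least-covered knowledge area.

    Losing that many people (the ones who know that area) is enough to
    leave the team without the knowledge it needs.
    """
    if not knowledge:
        raise ValueError("need at least one knowledge area")
    return min(len(people) for people in knowledge.values())


def at_risk_areas(knowledge: Dict[str, Set[str]], threshold: int = 2) -> List[str]:
    """List areas known by `threshold` people or fewer, sorted by name."""
    return sorted(
        area for area, people in knowledge.items() if len(people) <= threshold
    )


# Hypothetical team data, for illustration only.
knowledge = {
    "deploy pipeline": {"ana", "raj"},
    "database failover": {"ana"},
    "alerting config": {"raj", "kim", "ana"},
}

print(bus_factor(knowledge))     # 1
print(at_risk_areas(knowledge))  # ['database failover', 'deploy pipeline']
```

The areas the second function returns are good candidates for the documentation time Alice recommends building into tickets.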
Make it easy for other teams to work with yours
Alice described how New Relic’s Engineering teams have benefited from the concept of the “hero” role. On a weekly rotating basis, one developer on each team is designated the hero, available to answer any questions from outside the team. Initially, this role was instituted to help field questions coming from support, but today every engineering team at New Relic, even those that don’t interact with support, has a rotating hero so that other engineering teams can interact with it more easily. “Think of it like an API endpoint for your team,” she said.
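A weekly rotation like this is easy to automate. The following Python sketch is one possible way to pick the hero for a given date, assuming a fixed team order and a rotation that starts on a known Monday. The function name `hero_for`, the team members, and the start date are all hypothetical; the talk does not describe how New Relic schedules the role.

```python
from datetime import date
from typing import List


def hero_for(day: date, team: List[str], rotation_start: date) -> str:
    """Return the team member on hero duty for the week containing `day`.

    The rotation advances by one person every seven days, counted from
    `rotation_start`, and wraps around at the end of the list.
    """
    if not team:
        raise ValueError("team must not be empty")
    if day < rotation_start:
        raise ValueError("day is before the rotation started")
    weeks_elapsed = (day - rotation_start).days // 7
    return team[weeks_elapsed % len(team)]


# Hypothetical team and start date (2024-01-01 was a Monday).
team = ["ana", "raj", "kim"]
start = date(2024, 1, 1)

print(hero_for(date(2024, 1, 10), team, start))  # raj
```

Publishing the result somewhere visible, such as a chat topic or a team page, gives other teams the single “endpoint” Alice describes.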
If you want other teams to follow certain processes, like filling out tickets for requests, take the time to train them on how to do it properly, or else provide documentation. Alice’s team recently launched an internal request UI, essentially a Google form that asks for all the needed information, which is then used to produce a JIRA ticket. This self-service approach has been a hit. “This is a DevOps approach,” Alice explained. “We used a development strategy to solve an operations problem.”
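As a rough illustration of the form-to-ticket idea, the Python sketch below validates a submitted form and turns it into a ticket payload. It is not the tool Alice’s team built: the form field names, the `OPS` project key, and the `build_ticket` function are hypothetical, and the payload is only shaped like the `fields` body that Jira’s REST issue-creation endpoint accepts. The sketch builds the payload and does not send it anywhere.

```python
from typing import Dict

# Hypothetical form fields that a request must fill in.
REQUIRED_FIELDS = ("requester", "summary", "details")


def build_ticket(form: Dict[str, str], project_key: str = "OPS") -> dict:
    """Validate a submitted request form and return a ticket payload.

    Raises ValueError naming every required field that is missing or blank,
    so the requester can fix the form before a ticket is ever created.
    """
    missing = [name for name in REQUIRED_FIELDS if not form.get(name, "").strip()]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    requester = form["requester"].strip()
    details = form["details"].strip()
    return {
        "fields": {
            "project": {"key": project_key},   # assumed project key
            "issuetype": {"name": "Task"},     # assumed issue type
            "summary": form["summary"].strip(),
            "description": f"Requested by {requester}\n\n{details}",
        }
    }


# Hypothetical submission, for illustration only.
ticket = build_ticket(
    {
        "requester": "ana",
        "summary": "New staging database",
        "details": "Need a Postgres instance for load testing.",
    }
)
print(ticket["fields"]["summary"])  # New staging database
```

Rejecting incomplete forms up front is what makes the approach self-service: the requesting team gets immediate feedback, and the receiving team only sees tickets with all the needed information.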
Make remote workers feel part of your team
As your team grows, eventually you may need to hire remote workers. It’s crucial that you make offsite workers feel like they are part of your team by recording meetings for them, flying them in for face time at least once a quarter, and taking care of the little things, like making sure they get the same company swag everyone else does.
Alice shared two creative ways her team works to include its remote members. First, so that everyone has a record of what was covered in standup meetings, notes from those meetings are fed to Hubot, GitHub’s customizable chat robot, which publishes them as an archive to a GitHub gist. Second, one of Alice’s remote colleagues works in front of a camera so that he is always “present” on a video screen nearby. “If I have a question about something he’s working on, I can just go sit down next to him and talk to him face to face,” she said.
To listen to more of Alice’s insightful tips for growing and managing a stable DevOps team, watch the video of her full New Relic London Summit talk: