In this episode, Mando Escamilla talks about his transition from a DevOps individual contributor to becoming a manager, the relationship between engineering and Ops, striving to be on the same page as a team, and attempting to do his best work while wanting to do what’s best for the company as a whole and the customers they serve. It’s all about coming together, growing together, and having empathy.

Should you find a burning need to share your thoughts or rants about the show please spray them at devrel@newrelic.com. While you’re going to all the trouble of shipping us some bytes, please consider taking a moment to let us know what you’d like to hear on the show in the future. Despite the all-caps flaming you will receive in response, please know that we are sincerely interested in your feedback; we aim to appease. Follow us on the Twitters: @ObservyMcObserv.

Jonan Scheffler: Hello and welcome back to Observy McObservface, the observability podcast we let the internet name and got exactly what we deserve. My name is Jonan. I’m on the developer relations team here at New Relic, and I will be back every week with a new guest and the latest in observability news and trends. If you have an idea for a topic that you’d like to hear me cover on this show, or perhaps a guest you would like to hear from, maybe you would like to appear as a guest yourself, please reach out. My email address is jonan@newrelic.com. You can also find me on Twitter as @thejonanshow. We are here to give the people what they want. This is the people’s observability podcast. Thank you so much for joining us. Enjoy the show.

 

I am joined today by Mando Escamilla. Hi, Mando.

Mando Escamilla: Hey, Jonan. I’m doing great. How are you doing, man?

Jonan: I’m hanging in. I’ve got to learn to stop asking the question—eventually, I will. We need another like casual intro, that’s just like, “I see you are also human and still alive. Well done!”

Mando: [laugh] Yeah. It’s actually been a pretty good day and feeling good. Pretty good. I don’t know if that’s the gin that’s in my cup that I was drinking before this to settle my nerves a little bit or what, but I’m actually feeling pretty good.

Jonan: You know, I should really drink when I podcast—that’s, I think, maybe the secret sauce that I’ve been looking for, literally.

Mando: [laugh].

Jonan: Yeah. I’m going to open whiskey next time. That’s a brilliant idea. So tell us about yourself. What sort of things have you gotten into over your career and what are you doing now?

Mando: I started off my career as a young, bright-eyed, bushy-tailed Perl developer back in the mid- to late ’90s. Back then, server-side applications were like Perl scripts that ran and [laugh] it’s like a background process. And then you started wiring up these Perl scripts to Apache, right, and putting them in a cgi-bin directory. And so, did a little Perl, did a little bit of sysadmin stuff, and rode the wave a little bit. I caught a tiny bit of the Java web application stuff in the early 2000s, but not a whole lot. And ended up kind of shimmy-shucking my way into the Ruby on Rails world.

Jonan: Nice. My wheelhouse. I love it.

Mando: So I lived there for a while as a software developer, spent most of my career at small, early-stage startups. So ended up wearing like a lot of hats.

Jonan: Yeah, I mean you don’t just write it. You’ve got to get it out there, right? You’re out there writing the code and then shipping it over the fence and maintaining it in production. And back in the early days, that was a much different proposition than it is today. So, you’re doing Rails for a while, and then what happened?

Mando: So doing Rails for a while, and then I got the opportunity to work at the company that I’m at now, moving over from software development, application engineering into Ops. Like you were saying, a lot of the stuff that I had done in my career was obviously adjacent or at least had Ops parts to it. But this is going to be my first full-time “I am an operator, hear me [roar].”

Jonan: Yeah.

Mando: I guess about a year and a half ago, I transitioned from being an individual contributor on the team to being a manager. And so I got to lead this fantastic group of folks through some pretty dynamic times, is the way that we’re supposed to talk about it.

Jonan: Dynamic. Yes. May you live in dynamic times.

Mando: [laugh]. As they used to say. And then I ended up here, drinking on Zoom.

Jonan: Drinking gin and talking on a silly-named podcast about observability.

Mando: That’s right.

Jonan: I worked at Heroku for a long time and you’re expected to own your stuff. You could shut the door. So I definitely had an understanding. That being said, I worked at Heroku, I never had to know anything about any of these containerization technologies. That whole revolution happened at the same time that I had access to infinite free Heroku.

Mando: Oh fair enough. Yeah.

Jonan: I’m just going to click this button. And so then I had a lot of catching up to do recently and now being back at New Relic, I’m continuing that education so I can keep up with the community. The Kubernetes ecosystem is shockingly complex and…

Mando: And vast.

Jonan: Yeah. And yet still way better than anything we’ve ever had before.

Mando: Absolutely.

Jonan: It’s like you effectively click the button, you run the one command, I have my kube cuddle and it’s out. I’m sure I’m saying that incorrectly, kubectl, I don’t know.

Mando: I don’t know.

Jonan: People have a lot of opinions about how to pronounce the Kubernetes ecosystem names of things but…

Mando: It’s like I say about my own name— call me whatever you like, so long as you don’t call me late for dinner. I’m fine.

Jonan: Yes. I feel very similarly, but I will also very often correct people because Jonan gets mispronounced as Jonah and it’s viral. Like once one person says Jonah, then it’s just…

Mando: Everyone is Jonah?

Jonan: For years. For years.

Mando: I feel you. Yep.

Jonan: Yep.

Mando: Talking about the old systems and the old ways that we used to do these kinds of things versus Kubernetes, it just seems as though the Kubernetes core team, or however you want to refer to them, really made some key decisions correctly and well. And really found a good level of abstraction around the kinds of things that we’ve been building for years. But they’ve all been bash scripts or Ruby scripts or Python scripts or Jenkins jobs or whatever. And they codified it in this system that, like you said, is one kube cuddle, one kubectl apply, away from magic, you know?

Jonan: Right. It put so much more power in my hands as a developer—[chuckles] for better or worse. To get in there and work with these systems, I think, eases the burden on either side of the aisle and frankly, an aisle that is disappearing entirely, right? We talk about the term DevOps and there is historically a little bit of an us-versus-them mentality when it comes to Ops and Devs or vice versa. How do you feel about that, having been someone who bridged that gap earlier on in your career?

Mando: There has to be some sort of stages of grief when [chuckles] talking about the relationship between engineering and Ops, because I’ve gone through a lot of them, and a couple of different times. I started off when I came over, I was super hopeful. This is a complete aside and I apologize, but have you ever watched “Ted Lasso,” the TV show?

Jonan: No, but I’m taking a note to watch it. It sounds lovely.

Mando: It is. Yeah, it is a delight and one of the episodes is titled “It’s the hope that kills you,” and so it was, in fact, the hope that killed me. I was super-hopeful coming into this role. I was like, “I’ve got all these years of experience as an engineer and as an application engineer and I have this background in operating the kind of systems that we’re going to be running.” In my most egotistic times, I saw myself as the Anakin Skywalker chosen one. I’m going to come in here and I’m going to do this. I’ll be the one who can do this. That lasted about a week and a half.

Jonan: Right. Turns out that stuff’s real hard.

Mando: Yeah. There’s a reason. It’s really hard on both sides. A bunch of really, really smart people, much smarter than me, have swung and missed at this ball. Right?

Jonan: Yeah.

Mando: As I progressed through this role, there have been times where I’ve thought that software development would be a great career, a great place to be, if it just weren’t for the pesky software devs. They come through and ruin everything.

Jonan: They just keep mucking everything up, they don’t even seem to care about memory.

Mando: More importantly to me, they don’t care that they don’t care. Right?

Jonan: Yeah. I just shipped it. This is a problem in the industry at large. And I think it also comes out of the fact that we have such different spheres of information. I was talking to a code school the other day. I often speak at these code schools, because I came through one way back in the day, and I was talking to students about the idea that, like, “Look, if you have a really crazy-successful career in tech, by the end of it, you might acquire 1% of all of the knowledge that is in tech.” And the numbers are arbitrary because it’s going to be way less than that. But you have your slice of the pie and everywhere you look, someone has a different slice of the pie. And it’s easy for people to do this gatekeeping trash that they do sometimes where they’re like, “Oh, you don’t even know that, I can’t believe you don’t know how kernels behave in this.” Like, “OK, come on.”

Mando: Right.

Jonan: It’s because we all have this different piece of the pie. And I think that amplifies this divide sometimes where an application developer is going to be concerned about a very specific set of things and an Ops person is going to be concerned about an entirely different set of things and…

Mando: It is so true.

Jonan: Today we’re in a world where we’re coming together again, and I love that. I love this DevOps theory of development and operations.

Mando: Yeah. It does require a bit of assumption of good faith.

Jonan: Yes.

Mando: Like we all have to be on the same page—at the very least the same page—of, “We’re all trying to do our best work here, and we want to do what’s best for the company. We want to do what’s best for the other members, for our customers.” And so long as we can all agree that this is where we’re coming from, then we can come and grow together. But there are times where it’s more difficult than others to presume that.

Jonan: Yeah, it is. I think that developers across the aisle are not known for their strong communication skills.

Mando: Operators either. Right?

Jonan: Yep, absolutely. So let’s talk a little bit about what it’s like to lead an operations team, something I have never done, but it seems like a complicated endeavor, as leading always is. You talked about moving from an IC role to a manager role. This is a term, individual contributor, that is often used by managers, and I’m convinced it’s a term that helps managers feel as though they are still just a different kind of contributor. Whereas I think of it as the people who built the thing, then me who has the meetings about the thing, right?

Mando: 100% right. It can be a useful piece of vocabulary. But like I tell everyone, even now I tell everyone at work, “I’m the guy who lives in Google spreadsheets and JIRA tickets and meetings, and it’s my team that does the actual stuff.”

Jonan: Yeah.

Mando: My job is to make sure that they can do that stuff well-coordinated and all that. But it was very different being a manager of an Ops team than leading or managing software Devs. Or application engineers, or whatever you want to call them. Primarily around the amount of unplanned work that comes up.

Jonan: Right.

Mando: For the team. I find it extremely difficult to manage the work that comes in, in any sort of sprint cadence or anything like that. Either the unplanned work or the planned work. I mean, we have projects that we need to do. We have Kubernetes clusters to upgrade and we have new infrastructure to roll out, and all of these things we can sit and we can scope them and we can talk about how long they’ll take and assign them to people. But if a customer outage hits, or we have what we call P zeros, which are like the all-hands-on-deck, the-world-is-on-fire events, that comes up and your projects kind of get thrown out the window. Even more common things can throw you off: you have an engineer who comes and asks you a question. “How do I do this? Where do I get the logs for this?” Or, “What kind of metrics should I be using for this service you’re capturing here?” And you want to engage, and you want to provide this information for them so that they can continue on and do their job. And they’re actually reaching out and trying to work with you.

Jonan: Yeah.

Mando: And so you have this conversation and then you realize that an hour and a half has gone by, and it’s almost time for you to close the laptop for the day, and you didn’t get done the thing that you…

Jonan: Needed to get done.

Mando: Right.

Jonan: You still have so many kubes to cuddle after this.

Mando: So many, so many to cuddle.

Jonan: This surprise work stuff that happens in the Ops world, there are certainly emergency deployments and bug fixes and things that come up as an application developer when you’re focused on the code base directly all the time, but you mostly live in a world that is burning all the time. Ideally, things are relatively stable, but you are very often interrupt-driven as compared to someone who can plan for a feature. I’ve got two weeks to ship a new button on the webpage, and then I start typing on the button and there are unlikely to be any button emergencies between here and complete.

Mando: Yes. And not to say that engineers don’t get interrupt-driven, right? There can be P zero–level events that happen because an API changed, something outside of their control that they’re the ones who are best equipped to fix. But you’re absolutely right. That’s way less common than in the world of Ops, especially in (I don’t know if it’s especially, it feels more especially) the world where your infrastructure is based in the cloud.

Jonan: Yep.

Mando: At any given time, AWS or Google or Azure, they can just make parts of your infrastructure go away.

Jonan: Yeah.

Mando: And you always hope that you’ve planned for that, and it’ll be fine and it can heal on its own and all of these things.

Jonan: It’s kind of hard.

Mando: It is hard and…

Jonan: What are you going to do?

Mando: What are you going to do other than drop what you were working on and fix it.

Jonan: Yep. Exactly.

Mando: The other issue that I’ve struggled with is the ratio of the software engineering organization to the Ops organization. Say an engineer asks someone on my team a question: that one person sitting there and engaging with that application engineer makes up one-fifth of the entire Ops organization.

Jonan: Yeah. You’re taking 20% of the time the whole team has for this hour and a half.

Mando: Right. And it’s 1/100th of the application engineer’s time.

Jonan: Right.

Mando: It’s difficult at times for devs to understand that. Because they’re focused on their team. So their team is three, four, or five, maybe six people, 10 people, whatever it is, and so that ratio doesn’t seem skewed.

Jonan: Yeah.

Mando: But that’s because they don’t see all the other teams that the Ops team ends up interacting with on a daily basis.
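[Editor's sketch: the ratio Mando and Jonan describe can be worked through with a little arithmetic. This is only an illustration. The five-person Ops team comes from the "one-fifth" above; the figure of 100 application engineers is an assumption inferred from "1/100th", and the function name is made up for this example.]

```python
from fractions import Fraction


def share_of_team(people_involved: int, team_size: int) -> Fraction:
    """Fraction of a team's capacity tied up while `people_involved` are busy."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return Fraction(people_involved, team_size)


# Sizes taken from the conversation: "one-fifth" of Ops, "1/100th" of engineering.
OPS_TEAM_SIZE = 5
APP_ENGINEERS = 100  # assumption inferred from "1/100th", not a stated headcount

ops_share = share_of_team(1, OPS_TEAM_SIZE)
dev_share = share_of_team(1, APP_ENGINEERS)

print(f"Ops capacity tied up: {float(ops_share):.0%}")
print(f"App engineering capacity tied up: {float(dev_share):.0%}")
print(f"Relative cost to Ops: {ops_share / dev_share}x")
```

Under those assumed sizes, the same ninety-minute conversation costs the Ops side twenty times more of its total capacity than it costs engineering, which is the skew Mando says individual dev teams rarely see.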

Jonan: Yeah. I think fundamentally what we’re talking about here is kind of a lack of empathy sometimes that exists between an engineering organization and an Ops team or an understanding of the work. What kind of strategies do you have for people who find themselves in a similar situation to help build empathy with the Ops work? Besides just like a Slack auto-reply that says “Go away, we’re busy.”

Mando: Right. I think it’s a really good question, because the lack of being able to answer that question well leads to the types of policies and procedures that you see where no one talks to Ops without making a JIRA ticket first.

Jonan: Yeah. Very common.

Mando: Put it on a JIRA ticket and then we will address it. But until then, I’m not going to talk to you. You end up feeling like you’ve been pushed this far. And the only things that have really worked for me are deep, frank, vulnerable conversations with team leads, with entire engineering teams.

Jonan: Yeah.

Mando: And not just one—that’s one of the secrets. This is a conversation that has to happen over and over and over again.

Jonan: It’s ongoing.

Mando: It has to be. Because it’s so easy to get focused and distracted by what you’re doing.

Jonan: Yeah.

Mando: Everyone’s busy. We don’t have like a PM or anything like that for my team. We self-evaluate the work, and schedule it ourselves. But application engineers are beholden to managers at times, and they come in and they say, “This has to be done by this day, come hell or high water. I don’t care if Ops says that they can’t give you a new Postgres instance by then, it has to get done.” It’s hard for operators to see that. And we don’t see that side of their world, either. So trying to create this world where people feel safe and secure enough to have a meeting where they can say—this is something that I’m remembering, a meeting that I had not a couple of months ago, where I sat in a Zoom room with all the engineering leads for this one part of the company, and I told them, “I thought we were friends, and what you said and what you did really hurt my feelings.”

Jonan: Yeah. That’s the kind of open honesty you need in a workplace to actually get things done sometimes.

Mando: I think so. And it’s very easy for me to say that. I shouldn’t say it was easy for me, because it was difficult. But I think, more than most people, I have an easier way or an easier path to having those kinds of conversations. Right?

Jonan: Yeah.

Mando: I’ve been working in this career for over 20 years. I come with a lot of experience, I’m in a leadership position. I have all of this privilege that makes it easier for me to be open and be vulnerable and say those kinds of things. I imagine there are a lot of people in the organization who wouldn’t feel OK to say that.

Jonan: I think it’s easy for people to forget how much their ability to be vulnerable in a professional environment depends on their privilege.

Mando: For sure.

Jonan: It is, for me, a much easier thing to do than to ask of people. You start to be in a leadership role and you’ve got to think about asking that of your team. It’s very important for the kind of creative work that we’re mostly doing. We’re creating a lot of content and we’ve got to think about how we want that to be and what kind of psychological safety needs to exist. And there are a lot of reasons why people cannot just be their authentic self and be vulnerable.

Mando: Yes.

Jonan: Being able to address those things as a leader is really important, but also to stand up and do it yourself? Right? You get an opportunity to lead by example.

Mando: Right. I think in a lot of cases, that’s the first step. The first step in creating that world, is leading by example. Like you said, showing it, and then you have to constantly feed it out to the team.

Jonan: Yeah.

Mando: And do things that make them feel like, at the very least, they can be open and honest and vulnerable with you. Maybe they can with the rest of the team—build those circles out until they can get to the place where they can do it.

Jonan: Exactly. You’ve got to help build up that muscle over time or it atrophies. It’s a very easy thing to damage the psychological safety of a team.

Mando: Yes.

Jonan: You get one person who speaks up, and someone says, “That’s a stupid idea” or “I hate this plan,” and in engineering teams, it is very common. And we actually, I think, do a better job of keeping criticism of ideas separate from criticism of people than some areas of the industry. And especially in a situation like what you’re describing (and I don’t know the details of your meeting), I’m imagining engineering throwing shade at the Ops side of the house about something that went wrong. Definitely wrong.

Mando: Sure. That was definitely part of it. Yeah.

Jonan: They weren’t prepared for this and we did what we needed to do right, and they didn’t. And I see over time as those two roles grow closer through methodologies like DevOps, there may be a world where we’re all kind of on the same team and we’re all interacting. You have someone who is setting up a Kubernetes cluster and also shipping some application code that is deployed on that Kubernetes cluster and vice versa. You have your hands in both pies and that really helps to build the empathy. But ultimately that introduces such a broad sphere of responsibility, and the skill sets are so deep, that I have a hard time believing there could exist people who are deeply specialized in all of the things necessary to own that whole stack.

Mando: I think you’re hitting the nail right on the head. I find myself falling into this trap often, where I’ve trained myself to not say these kinds of things, but sometimes I sure do think “How can you not know this?”

Jonan: Yeah.

Mando: You know what I mean?

Jonan: Yeah.

Mando: I have to step that back in my own head. I have to be like, “No, there is no way that this person knows about the admission controller in a Kubernetes cluster or knows how to properly assume a role in AWS or whatever.” You know what I mean? It may be second nature to us, right?

Jonan: Yeah.

Mando: But it is near Greek to most other super-smart, very capable engineers. It’s just outside of their sphere of knowledge.

Jonan: So their piece of the pie, their 1%, just is on the other side of the fence. And I have to tell you, speaking of things that are second nature, literally, no part of AWS has ever felt second nature to me. [chuckles]. I give them an image and I’m like, “Hey, I’d like to see this image again someday, maybe on a website.” And they’re like, “No, it’s ours now. Good luck.”

Mando: Thank you. Please, you know, life just goes on.

Jonan: Goes on. [laugh]. You get the surprise billing. Oh, man! The horror stories I’ve heard, but I’m not going to dig it, because frankly AWS changed the world in a lot of ways. Now, you’ve got half of the internet that goes down, as you talked about earlier, when AWS disappears—we shouldn’t talk too much trash or we jinx it. We might break the internet if we keep digging.

Mando: Yeah, this is a good point. I’m going to look around for something to knock on really quick here, because you never know.

Jonan: [laugh].

Mando: You never know.

Jonan: So let’s talk about when things go wrong. How do you handle that? I think, as a team leader, you have a unique perspective on how you get through those times. We talked a little bit when this episode opened about the state of the world, and I don’t even want to think about it right now in the middle of a conversation that I’m enjoying, but there is some terrible, terrible world going on right now.

Mando: There’s a lot of terrible world going on.

Jonan: And you’ve got to work through that with people, and still ship. You’ve got to be productive. How do you lead a team in that circumstance?

Mando: Maybe more delicately. If the application is down or a service is down or, you know what I mean, it’s on Ops to fix it, right?

Jonan: Yeah.

Mando: There’s no one for us to turn to and say, “Can you do this instead?”

Jonan: Yeah.

Mando: And so it doesn’t matter if you’ve been staying up till three in the morning every night, watching BLM protests. Or doomscrolling on Twitter after a presidential debate. At some point, if the pager goes off, the pager goes off, and it’s you and your team that has to fix it. It’s been a challenge. It really has. I’ve been doing what I can to support the team and to try and defer work that doesn’t have to happen.

Jonan: Yeah.

Mando: So that the team can stay a little fresh. And it’s interesting to see the team’s reaction to this, because there’ll be times where we’ll sit in standup, and they’ll be like, “I don’t have any tickets assigned to me. I need a project to work on.” And I’ll tell them, “Right now, we don’t have anything pressing today. We don’t have anything pressing today or tomorrow. Take a break, keep your phone close, but take a break.”

Jonan: Yeah.

Mando: Because the infrastructure may not be on fire, but the world is kind of on fire.

Jonan: All around us and deferring that work is, I mean, they got a ticket. Their ticket is like, try to remain sane for another 48 hours, because we need you on deck, you know?

Mando: Yes. You’ve gotta be ready. And I’ve got your back. One of the first things that I did was—after it became clear that this wasn’t going to be a two- or four-week kind of work from home situation, right around April or May? I remember, at least for me, [I thought,] “We all go work-from-home early March.” I really thought, I really thought it was like two or three weeks, tops.

Jonan: Oh, man! I was shocked. I remember I had a conversation with someone. They said, “You know, I think all of the events are going to be canceled.” And I was like, “When you said that, I thought you were nuts.” This is, of course, after all of the events have been canceled for months, like, “I thought you’re totally insane. And this is me apologizing for that.” [laugh]. This is far beyond what anyone could have predicted or expected. And you talk about deferring work, the emotional labor that remains for the world to do of just healing from our collective trauma having lived through this is terrifying just in and of itself. Not to even look at the part that is still happening every day. And you’ve got to ask your people to be ready and you’re absolutely right that on an Ops team, the buck stops here. You don’t have a different Ops team you get to call and be like, “You know what? We’re sleepy today. Take it away.”

Mando: “Had a rough night.” Yeah. So first thing that I did was, I went in and I added myself to all of their PagerDuty rotations. I get paged when they get paged. Not to jump in on things, if they get woken up in the middle of the night.

Jonan: Sure.

Mando: We’ve got things pretty well-handled, and so the pager load is pretty low on the team.

Jonan: Wow.

Mando: Yeah. We worked really hard to get there.

Jonan: Why don’t you show off somewhere? My goodness!

Mando: [laugh] But you know, I didn’t want those one or two a week to go off. And then someone sleeps through it, because the world is on fire. And then feel bad because they let the org down, you know what I mean?

Jonan: Yeah. Oh, it’s crushing to be that person who either didn’t respond to the page or showed up and couldn’t solve the problem, had to involve other people and wreck other people’s night. It’s just tough to be in that spot every time. Every time the page comes, it’s hard. But especially when you feel like you’ve failed your peers.

Mando: Right? I wanted to avoid compounding that emotional stress. The world is hard enough. Work is hard enough. I was going to say that it is a uniquely Ops problem, but not so much anymore: more and more application engineering teams are starting to carry a pager.

Jonan: Yeah.

Mando: An anachronism, right? If something with their service happens, they’re the ones who get paged first.

Jonan: And they should. They should own that code into production, if it’s your responsibility to write the good code. I hope that very soon the world where I throw my code over the fence to QA, who writes the tests for the code that I just YOLO-coded, and then they pass it on to Ops to keep it running forever, is gone. That world is just dysfunctional to an extreme, in my opinion.

Mando: Oh preach, brother! You are dropping truth bombs over here.

Jonan: Yeah.

Mando: And again, we started out this conversation talking about the tension between…

Jonan: Sure.

Mando: … Dev and Ops. But that’s a whole other branch of this love triangle, my friend. The QA folks are facing the same kind of challenges. And I’m sure there’s a QA podcast somewhere, where there’s people talking about the same thing.

Jonan: [laugh] Yeah, they are.

Mando: But, yeah, it is a dysfunctional way to manage a relationship.

Jonan: I get the instinct to dissect things and try and divide spheres of responsibility, but I don’t think that this particular way that software is built is healthy. I think that trying to take ownership away from a portion, like if you write a line of code and the line of code is bad, it is your line of code. You own that line of code all the way through testing and all the way through production. And had you written a test, the line of code would have been better in the first place. Test-drive your stuff. My humble opinion—that is not so humble—I think at the risk of being the person who says, “Technology will solve this in the long line of all the problems technology will solve.” [laugh]. And, of course technology is proving not to solve any real problems, but we do have this Kubernetes thing now. And I think it is helping to bridge this gap. I think it makes it easier for people in all parts of that equation to be involved without stomping on each other.

Mando: Yes. That is a great way to put it. You’re absolutely right. There’s an ability to interact with your—I don’t know, I don’t even know what you want to call it—your execution environment.

Jonan: Yeah.

Mando: You know what I mean?

Jonan: Your deployment environment. You own some portion of responsibility. For me, when I’m an app developer and I ship an app, I have to think, “What resources do I legitimately expect to have?”

Mando: Yes.

Jonan: How many cores? How much RAM? How much disk space? Where am I going to get the disk space? How much do I depend on it being available? That’s a conversation you need to have with your team and yourself to get the app onto Kubernetes.
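[Editor's sketch: the questions Jonan lists map onto the `resources` section of a Kubernetes container spec, where `requests` state what the app expects and `limits` cap it. The snippet below builds that section as a plain Python dictionary. The helper name and every number are hypothetical, chosen only for illustration, and setting the memory limit equal to the request is one possible policy, not a recommendation from the speakers.]

```python
import json


def container_resources(cpu: str, memory: str, ephemeral_storage: str) -> dict:
    """Build the `resources` section of a Kubernetes container spec.

    Requests tell the scheduler what the app expects to have;
    the memory limit is the ceiling the container may not exceed.
    """
    return {
        "requests": {
            "cpu": cpu,
            "memory": memory,
            "ephemeral-storage": ephemeral_storage,
        },
        "limits": {
            # Illustrative policy: cap memory at exactly what was requested.
            "memory": memory,
        },
    }


# Hypothetical answers to "how many cores, how much RAM, how much disk?"
resources = container_resources(cpu="500m", memory="512Mi", ephemeral_storage="1Gi")
print(json.dumps(resources, indent=2))
```

Writing the numbers down this way turns the conversation with your team into something reviewable: the values sit in the deployment next to the code that needs them.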

Mando: Absolutely. And this is where, like you were saying, these lines get fuzzy. Because if my team is managing this Kubernetes cluster, and we’ve got our five Kubernetes minions. And one night, all of a sudden, we get paged because CPU utilization across the cluster has gone past a certain threshold. Or memory utilization. And so we look at the graphs to see what’s going on. And we drill down and we find, “Oh, OK, well Team X’s deployment has been gradually increasing its memory utilization over time. Now we’re at a point where it has pushed us over the edge. Our only option at this point is to add another minion to the cluster.”

Jonan: Right.

Mando: And so whose responsibility is it to be monitoring Team X’s deployment’s memory utilization?

Jonan: Yeah.

Mando: Or we put in memory limits.

Jonan: Sure.

Mando: In our deployments. And then all of a sudden, we get a notification from a team saying that their application keeps getting killed.

Jonan: Yeah. Right. My app is just dying over and over again, and we’re impacting our customers’ lives and you’ve created a situation, and what I want you to do immediately is to bump this arbitrary limit that I know you’ve chosen for no good reason. Just to punish me.

Mando: For no good reason, yep.
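[Editor's sketch: one way either team could spot the "gradually increasing memory utilization" Mando describes before a limit starts killing the app is to fit a trend line to recent samples and project when it crosses the limit. This is a minimal sketch, not anything the speakers describe using. It assumes evenly spaced samples in MiB, and the function name, sample values, and limit are all hypothetical.]

```python
from typing import Optional, Sequence


def hours_until_limit(samples_mib: Sequence[float], limit_mib: float,
                      hours_between_samples: float = 1.0) -> Optional[float]:
    """Estimate hours until memory use reaches the limit.

    Fits a least-squares line through evenly spaced samples, then
    extrapolates from the latest sample at that slope.
    Returns None when usage is flat or falling.
    """
    n = len(samples_mib)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples_mib) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples_mib))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator  # MiB gained per sample interval
    if slope <= 0:
        return None
    remaining = limit_mib - samples_mib[-1]
    if remaining <= 0:
        return 0.0  # already at or past the limit
    return remaining / slope * hours_between_samples


# Hypothetical hourly readings for Team X's deployment, with a 150 MiB limit.
growing = [100.0, 110.0, 120.0, 130.0]
steady = [100.0, 100.0, 100.0]
print(hours_until_limit(growing, limit_mib=150.0))
print(hours_until_limit(steady, limit_mib=150.0))
```

A projection like this gives both sides a shared number to talk about, so the limit looks less like an arbitrary punishment and the leak looks less like Ops's problem alone.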

Jonan: This brings us back to that observability story, where having visibility into all parts of your infrastructure and your application together matters. There are a lot of ways that people can show you that a thing broke. There are a lot of ways that you can just determine that a thing broke, but what counts is being able to see inside and see why it broke, understand how that fits into the entire system, and use that kind of systems view to understand these things. And I’m not talking about just finding a root cause where the root cause is a human. But understanding how these things come to be on both sides of the house: where the humans exist and have developed systems, ways of working, and processes together that led to a situation where this could occur, but also in the software and in the hardware itself. How did we get to a place where this could happen? That’s your cause. And that’s the part that you want to address.

Mando: Yeah. And it is not the kind of thing that can be the Ops team’s sole responsibility or the application engineering team’s sole responsibility. It gets fuzzy. It gets kind of gray and subtle. Right?

Jonan: Yeah.

Mando: And for some people I think they enjoy kind of when you get to live in that subtle space and the gray space, you know what I mean?

Jonan: Yeah.

Mando: And some people super don’t. Some people are like, “What are you talking about? I don’t want to take a philosophy class. You’re telling me there’s no right answers. What’s the matter with you?” You know what I mean? You need all types. You need all those different types of people on your teams. And so it takes all kinds, and some kinds are going to be better at having these conversations and navigating this stuff than other folks.

Jonan: It’s a difficult line to walk, I think all the way across the board. But I think we probably both agree that— I don’t want to put words in your mouth—but I think that empathy here is key all the way around, and understanding that.

Mando: If Mr. Rogers taught me anything, right? And he taught me many things. It’s that you cannot go wrong when you are doing the best you can to put yourself in the other person’s shoes.

Jonan: Yep. And just imagine what it’s going to be like for the person you’re about to page to fix a memory error caused by your app, and it’s a memory leak.

Mando: Yes, yes, yes. Or the other side. Imagine what it’s like for your friends over in Team X when they get paged because their application goes down and starts throwing exceptions because you didn’t set up your Aurora failover stuff correctly. So they got woken up and now they have to wake you up.

Jonan: Yeah.

Mando: Because you didn’t build out your own metrics and monitors. So you just throw them a thing, and now it’s broken and everyone’s woken up because of you.

Jonan: Yeah. It’s frustrating for people on either side of this equation. And it happens to be a thing that humans are very good at. We’re good at othering because we have to…

Mando: Yes, dude, yeah.

Jonan: … fit the world into our brain. We want things to get boxed up and put in clean, little, tidy rows. And sometimes there is no answer. In a system this complex, with this many elements, human and otherwise, there is no clear path. You do the best you can.

Mando: Yeah. I was talking about this with a friend of mine earlier in the week. I don’t know, time’s a flat circle these days. It’s hard to know when [laugh] I was having this conversation. We were talking about the differences between what we build now and what we built five years ago, 10 years ago. The applications that we’re building now are vastly more complex than they used to be. And we can do that thanks to the latest flock of tools and infrastructure programs. And Kubernetes.

Jonan: Sure.

Mando: AWS, like you were talking about earlier. But it’s almost like it’s in our nature to use these things and build out to the limits of them instead of just making our lives easier with the simple systems that we used to have, right?

Jonan: Right.

Mando: We have kept that level of technological pain the same. It didn’t actually make things better for us. We can just build larger castles now.

Jonan: We could build larger, more complicated castles and they do more stuff. But the bigger—it’s still hard.

Mando: We could have just stopped while we were ahead. But we didn’t, you know?

Jonan: Yeah. Because we’re in technology, we don’t do that. The stop-while-you’re-ahead thing.

Mando: No.

Jonan: We’re not great at that.

Mando: It might just be a human thing. You know what I mean? We just don’t in general.

Jonan: We just don’t. You’re like, “Wow! You know what’s good? Is this fire I’ve got in my house, because I chopped down that tree. I’m going to go get every last tree. My house is going to be so warm.”

Mando: So freaking warm. Yeah. [laugh].

Jonan: “Go big” seems to be the human way. Now, what does that mean for our future then? I’m doing this lightning talk at KubeCon where I’m going to set up Kubernetes on Raspberry Pi in five minutes. And that’s an amazing thing to me, that I can take a microSD card and a Raspberry Pi and five minutes later deploy a Rails app to it in a mock scalable way. I’m not expecting everyone to replace their data center with a bunch of Raspberry Pis, but it’s impressive to me that we’re in a place where that happens. What is the future like then? We are moving towards a place where we’re able to do more, and we are building more things that get more complicated. So make a prediction for me, so that I can have you back in a year and shame you for your misguided attempts at doing the impossible.

Mando: Yeah. I look forward to coming back in a year and being shamed.

Jonan: Yeah. That’s one of my primary features on this show—the shaming hour. [laugh].

Mando: If I have a flaw, it is that I do not have enough shame in my life. Personal, deep-seated lack of self-confidence and shame—that’s not on this side, my friend.

Jonan: My brain is often coming up with additional reasons. It’s like, “Hey, remember when you popped that basketball that kid had when you were eight?”

Mando: Oh my god! Yep.

Jonan: When I’m trying to fall asleep—if I finished the doomscrolling—that’s when I hear about that.

Mando: Right. Once the phone has hit your forehead.

Jonan: [laugh].

Mando: That’s when the little troll comes knocking at the door. I think that in a year’s time, we’re going to see a convergence on—not cloud providers, that’s not what I mean—we’re going to see better tooling and better abstraction around interacting with cloud providers. Like Terraform, but better. Terraform-like kinds of things, where I think the cloud providers all really understand now that they’re a commodity, and they’re continuing in this race to the bottom.

Jonan: The walled garden is breaking down. The walls are coming down around these companies.

Mando: Yeah, absolutely. And there’s only so much that you can do when you’re using these kinds of infrastructure environments like Kubernetes. It is one thing to be all in on, say, AWS. And have all of your applications running on EC2 instances. And then you call your AWS sales rep, and ask for a discount. And they’re like, “Yeah, nah. Nah, you’re not going to get it.” But it’s a very different thing to say, “75% of our spend is on this set of 10 Kubernetes clusters.”

Jonan: Right.

Mando: And none of these applications know that they run inside of AWS.

Jonan: Right. This changes the game.

Mando: It changes the game a little bit.

Jonan: They come and poach each other’s business right now. If you’re an AWS shop, Google comes along and they say, “Hey, we’re going to give you two years of free GCP, or millions of dollars of discounts.” And once you get over there, you get tied into their ecosystem. You start using those services that work together for free and authenticate more easily. And it all just works naturally, but Kubernetes just breaks that. And people joke about multi-cloud being ridiculous as a goal. But what multi-cloud provides you is the opportunity to flee on a whim. And suddenly those discounts become very easy to acquire, like moving credit card balances for that zero APR.

Mando: Yes, exactly. I’ve got nine more months on this one. It is a little pie-in-the-sky and it is way oversimplified. If I’ve got a three terabyte RDS instance that I have five different services in my Kubernetes cluster talking to, how realistic is it that I’m going to move those to a Kubernetes cluster run by Google? Like, probably not, but it’s way easier than it would have been five or 10 years ago. And most importantly, the cloud providers understand this, I think.

Jonan: I’m really interested to see what they do to respond to that. I would not want to be in a cloud provider’s shoes today trying to stay competitive in an environment where I’m suddenly selling electricity.

Mando: Yes. And tubes, internet tubes. You know what I mean?

Jonan: Yep. Well, my wire is the best one, because it’s oxygenated. It’s like selling Monster Cables.

Mando: [laugh].

Jonan: This is oxygenated stereo cable. You can tell it’s way better.

Mando: That’s right.

Jonan: It’s just faster.

Mando: It’s just faster. It has more 24 karat gold in it, I guess. I dunno.

Jonan: Exactly. It’s an interesting world.

Mando: It is.

Jonan: So, it has come to the end, unfortunately, but I do want to ask you for some parting words for our listeners. Do you have any advice for people who may be following in your footsteps? Think about what you wish you had known when you first… I mean, I guess you’ve been in both application development and in Ops, and now in leadership. At what phase, and what advice would you give to someone who’s looking at that particular transition that you’ve made?

Mando: If someone’s looking to move into a leadership or management role—specifically coming from a technology background as a developer or an operator or systems engineer—I think that the most successful folks who make that transition tend to have a couple of things in common. They do it for reasons that might seem a little counterintuitive. I think they do it not for the reasons of trying to garner more control or more power.

Jonan: Right.

Mando: They don’t do it because they’re going to be in more control. Because the truth is, you’re going to actually have less control over certain areas of things. But the folks who are good at this, I think they do it because they care about people.

Jonan: Yeah.

Mando: And they care about the way that people grow. And kind of going back to what we were just talking about earlier, they care about making the workplace be safe and secure. And be as authentic as they can be. And this is going to sound weird—and I don’t mean it to sound this weird—but I think this is a place where folks who are very aware of their privilege can use their privilege very well. And what’s weird about that is, it kind of sounds like I’m saying that it should only be people with privilege who are in leadership positions, which is not at all what I want to have happen. But folks who are in these leadership positions, if they can start to go down that journey of understanding what their privilege affords them and what good they can do with it, that can be kind of a superpower.

Jonan: You recognize this part of yourself and understand how to use it—to lift other people up and have opportunities that would not otherwise have been afforded to them, provide those safe spaces that people need to operate wherever they are and whoever they are and whatever background they have—it’s a very fulfilling part of being a leader.

Mando: It’s one of the best things about my job. And for engineers and for operators as they move through their careers, and managers too: like what you said earlier about focusing on finding that empathy for others in the org, and not just people that you had considered across the DevOps aisle. Or the dev aisle. But finding empathy for TMs, for your support folks, frontline people who are actually on the phone getting yelled at by customers because of your crappy infrastructure or your crappy code, you know what I mean?

Jonan: Yeah.

Mando: Being able to find that empathy is going to do more for, not only for your career, but your own…

Jonan: …your own process.

Mando: … and spiritual well-being.

Jonan: This is where self-actualization comes from, this empathy.

Mando: It 100% does. Especially with the world as it is now, it can be hard to find things to be hopeful about or feel happy about sometimes. And sometimes the only thing that gets me through the day is knowing that I did something to help someone on my team or help someone on another team—putting myself in a position to help others. I don’t know if it’s just that endorphin rush or if it’s something more meaningful than that. But it helps.

Jonan: Yeah.

Mando: That comes from empathy.

Jonan: I absolutely agree with you. And I tell you, one of the things that helps me get through my days is exactly this. Just sitting down and having a real conversation with a real human being about things that matter.

Mando: Yeah.

Jonan: Having that empathy for each other’s stories and just being in that moment. It means a lot to me that you came and joined us on the show today.

Mando: Thanks, man.

Jonan: Really appreciate your time. If you want people to find you online, assuming that’s a thing you would like, where would they look for you?

Mando: Probably Twitter is the best place to do it. I’m not super-active beyond the doomscrolling and the re-tweeting of dumb political memes.

Jonan: Yeah. But it’s always a good Rolodex for those of us in tech. I advise everyone to get on there if only so you can DM.

Mando: Absolutely. I mean, this is how I got on here, right?

Jonan: Yeah.

Mando: I got on here because your wonderful producer, Mandy…

Jonan: She’s fantastic.

Mando: …She’s asking for folks. Shout out to Mandy. That’s how I was able to meet you and have this very lovely conversation. I am Mando Escamilla on Twitter. If anyone needs, I don’t know, anything. Right? Like, you know…

Jonan: Advice, you want to reach out.

Mando: Having a bad day, you know?

Jonan: Yeah.

Mando: Yeah. Look for me there. And then we can have a chat.

Jonan: I’m volunteering Mando to be the world’s counselor going forward. All of the emotional labor. You don’t want to do it yourself. Just reach out to Mando on Twitter.

Mando: Reach on out. Yeah. I am 100% capable. [laugh].

Jonan: Live psychotherapy is a really valuable use for Twitter.

Mando: [laugh].

Jonan: With that, we’ll just call it a show and thank you again for joining me, Mando.

Mando: Thanks, Jonan. I appreciate it, buddy. You have a good night.

Jonan: Thank you so much for joining us for another episode of Observy McObservface. This podcast is available on Spotify and iTunes and wherever fine podcasts are sold. Please remember to subscribe so you don’t miss an episode. If you have an idea for a topic or a guest you would like to hear on the show, please reach out to me. My email address is jonan@newrelic.com. You can also find me on Twitter as @thejonanshow. The show notes for today’s episode along with many other lovely nerdy things are available on developer.newrelic.com. Stop by and check it out. Thank you so much. Have a great day.

Jonan spends most of his time staring into tiny boxes and pushing buttons. He likes Ruby, Go, machine learning and playing with robots.