Jay Gordon: You're listening to the On-Call Nightmares podcast. Just about every week I go out of my way to bring you conversations with technologists who have spent some time on call, and this time I'm speaking with Tim Yocum. He's a passionate builder and leader of support, DevOps, SRE, and engineering teams. We had a really great conversation about what he's currently doing at InfluxData and what he's done before. I really appreciate speaking with him because he's got such a long and very interesting career: he's doing some really bleeding-edge stuff nowadays, and what he did in his past was really on the edge of new technology. So I'm really glad that he spent some time to talk with me, and you can do the same. That's right. He found me. He started listening to the podcast and he offered his time, and I'd like you to do the same. All you have to do is send me an email at oncallnightmares@gmail.com, or you can reach out on Twitter. It's @oncallnightmare or @jaydestro. So what have I been up to lately? Well, I've been home for the most part. I haven't been to any conferences, and it's been nice to be home because I've been working on some new content and coming up with ideas on how to expand the podcast and do some new things, maybe some best-of episodes with themes, where I pull some clips from previous conversations. I've been thinking about stuff like that. If you have any recommendations on new segments you'd like to hear in the podcast, or something else like that, I'd be happy to do that for you. So let's get into the conversation with Tim Yocum of InfluxData and learn a little bit about, you know, when things go really, really wet and wild for someone at a data center.

Jay Gordon: Welcome back to the On-Call Nightmares podcast. I say just about every week, and this one makes a total of 30.
Now, I go out into the world and I look for technologists who've spent time on call, and this week I've got Tim Yocum. Hi, Tim.

Tim Yocum: Hey there.

Jay Gordon: So, Tim, you've got a pretty big background, and I don't want to minimize it, because you wrote a bio that's pretty small compared to the breadth of your experience. But Tim is an engineering manager now at InfluxData, and you've got a good 20 years of experience, going back to working in dial-up all the way to now working in hybrid architecture, containerization. We're taking buzzwords now and just putting them in a Yahtzee thing and shaking it up and coming up with what you've done, because it seems like you've been around for a while and you've done a lot.

Tim Yocum: Yeah, a little bit of everything. You know, it's interesting to talk to people who just started in this industry and to say, you know, there was a point in time where we would listen on the phone to people's modems dialing up, and we could tell that they were having problems based on the beeps and the whirs of the modem negotiating. So, yeah, it goes way back. And you say on-call nightmares. I look back at the last 20 years or so and realize that I've basically been on call my entire career, which is a nightmare in itself. But it's also been perhaps one of the most rewarding experiences of my entire career, too.

Jay Gordon: Well, it's always really cool to look back and see what the old nightmares were and where we are now. And I think that's why I've been doing this podcast: there are so many people, no matter where they are in this industry, whether you've just started or you've been in it 20 years, who've got stories, and it's either going to be "Wow, this was a great experience" or "Oh my God, this was a terrible experience."
There doesn't really seem to be a lot in the middle, and I like to capture both, because somewhere I think we can find the middle and find better ways of doing operations.

Tim Yocum: Yeah, for sure. And, you know, I reached out to you because I was thinking about some of the more interesting events that have happened in my career, and how something completely unrelated to pushing a bad config or doing something that causes a cascading failure, something on the physical side that wasn't related to code at all, was so instructive and so interesting to learn from. Even to this day, some of what I learned from these incidents really resonates, and we can pull value from a lot of these, even old-school issues, just as much as from the new stuff, when you run into a Kubernetes problem that's new and intriguing.

Jay Gordon: So let's just take a step way, way back, because we've already started talking about where we are today, and I want to talk a little bit about where you started. It's a pretty long time ago, not to make you feel bad about age. That's not what I'm trying to do. But in the context of our modern Internet, you know, you started in the late nineties working with dial-up and then eventually going over to Verio, and Verio was a pretty big deal in the late nineties and early 2000s for just providing Internet and DNS, all these things, to the world.

Tim Yocum: Yeah, they had an interesting model. When I started, it was kind of an interesting scenario where I was just graduating high school, and I was annoyed because there was a guy who had compromised one of the FreeBSD hosts that I was using, and I took it upon myself to figure out: okay, what is this guy doing? Where is he connecting from? Let's figure this out and track him down. And, you know, that just snowballed into a conversation with the small provider that I was using at the time.
And lo and behold, I ended up working there, and then Verio came along. They had a strategy of basically buying up regional Internet providers, mashing them together, and building out this huge provider. They wanted to take on, like, the AOLs of the world. But they built a huge, huge company out of all the small constituent parts.

Jay Gordon: Yeah, I remember it was a big part of, like, just a consolidation plan that they had, and then eventually it became "Now we have too much and we can't really afford what we bought," which is what it came to towards the end. At least that's the way it looked from the outside looking in.

Tim Yocum: Yeah, I mean, it's challenging to run an Internet provider that does everything from dial-up through hosting and leased lines and DSL and all that good stuff. When you buy, like, 100 or 200 of these companies and try to mash them all together, you learn that operations are entirely different: how issues are handled, how tickets come in, what customers expect. There's so much difference there. And it was really interesting for me because I was out training people at a kind of nationwide call center on how to deal with our customers, and every single person there was commenting on how different it was from every other provider they had. So when a call came in, they didn't know what they were going to deal with. And in a sense, though they weren't on call, every call was like being paged. You don't know what to expect. You don't know what's coming up next.

Jay Gordon: So the next step in your career is you bump over to Playboy Enterprises, at a time, you know, 2000 and 2005, when we're talking about the edge of providing video content and the really early big CDNs that people had to create on their own because they didn't exist. Talk to me a little about that, because, you know, I had a little early time in adult when I was getting started working at a provider.
And the one thing that I saw more than anything was just the bleeding-edge aspect of it. So maybe you could talk a little bit about it.

Tim Yocum: Yeah, it was really the golden era, it seems in retrospect. You say Playboy, and everybody thinks of the magazine. But back in the day, when I was there, they started up what they called the Cyber Club, which was a pay-for-play type deal. There weren't many membership services out there at the time, so all of this software was built in house, from user management to subscription billing. Everything was done organically and in house. It was a really small team. On the operations side, we only had three or four people, tops, which is kind of amazing to think about. And yeah, I mean, we were building out our own infrastructure. We had data center space out on the West Coast and here in Chicago, which is where Playboy was based, and we managed all of that, and we worked on directing traffic to the node closest to the end user. We were really mindful of speed and accessibility and just getting content to people as quickly as possible. So, yeah, I mean, we really were at the forefront of building a lot of the building blocks of what's now considered just table stakes, like resilient DNS and GSLB stuff, and being able to have a robust CDN, so you could take maintenance and not take your site completely offline. A lot of fun and a ton of learning in that space at that time.

Jay Gordon: I really have to agree. Just the amount of education I got around how to get the most out of the gear that I had, especially around video encoding a couple of years later, once that stuff started. It's really amazing. And the amount of pressure that was put on providers to have faster links because of the amount of content that was being provided. Maybe it was adult.
Maybe it was not adult, but the fact is that we saw a transition, and a lot of it came from people who needed to provide video content that people were going to pay for, period.

Tim Yocum: Yeah, absolutely. And still to this day, there are pioneers in the video delivery space that are in the adult arena. It's just fascinating to see how various parts of the Internet push the envelope with technology. I mean, back at Playboy we were using cutting-edge storage technology. We started doing some outsourcing stuff that just wasn't commonly done with content providers at the time. And we had people from large news organizations, three letters that shall remain nameless, primarily in the UK, who would ask us how we were doing things, because we were both kind of learning at the same velocity. And yet here we were, a very small ops team, talking to huge established companies who were at the same point in their online evolution, trying to figure out how to deliver content as quickly as possible with what we had, because we didn't have a cloud then.

Jay Gordon: So now you jump forward a couple of years, and you take on a bigger role, a much bigger role, at ServerCentral. And there you kind of go from hands-on all the way to managing people. You're basically director of technology operations. How many people did you have underneath you at that point? How many directs did you have? How many did they have, et cetera?

Tim Yocum: It was pretty crazy. You know, ServerCentral's a small colo provider that started out in downstate Illinois, hosting content out of a dorm room, and grew into a provider in Chicago, with multiple PoPs in Chicago but also overseas and throughout the U.S. When I joined, there was a network operations center, a small group, a handful of people, 24/7. But we really needed to expand, with multiple facilities in Chicago and elsewhere.
So ultimately I had, gosh, I think around 50 folks who were either running data center operations crews or actually out there racking and stacking gear, dealing with customers. You know, the full life cycle, from pre-sales to turn-ups and migrations. Basically the startup mentality that you come into today, where you wear so many different hats. You do what needs to get done for the customer. Absolutely. Yeah. I mean, that's kind of why I'm always drawn to startups. In a sense, I flip-flop between the bigger companies and the startups, because there's just so much to learn when you get thrown into situations that are not necessarily part of your wheelhouse.

Jay Gordon: Understood. Now you come along over to Compose, and you end up being part of running their SRE team as a director. You were there until eventually the IBM acquisition happened, which was a big deal. What was it like making that transition from the go-getter kind of place that Compose was to being part of Big Blue?

Tim Yocum: Oh, it was pretty crazy. You know, we used Slack at Compose to talk amongst each other, because we were basically 100% distributed. I think everybody uses Slack these days. And, you know, we had maybe 40 or 50 people in there, and it was a nice small community. And then we joined IBM, and there are channels with 10,000 people in them. You know, well over 200,000-some people on Slack. I mean, you think of the Twitter firehose of information. This was IBMers saying, "Oh, now we have a managed database provider. I don't need to run databases on my own. Let's go talk to them." And so, you know, I would wake up in the morning and have dozens of messages from people and just a constant scroll of "Hey, we want to do this. We want to do that." Which is great, because obviously we filled a need that they had.
But it was certainly different from startup days, where you really get to know every customer and you have a lot of focus time for solving tough problems. It just became kind of like we were an internal vendor of sorts, where people were clamoring for the product, which is a great feeling, sure, but just a different context, a really different feeling.

Jay Gordon: So now we go to the modern day, and now you're the engineering manager over at InfluxData. You've been there since around December. They're doing lots of really interesting things, and we'll get to talk more specifically about what you're doing there later in the conversation. But I know it's another database as a service. Want to just give us a brief rundown on what Influx is?

Tim Yocum: So Influx is popular for InfluxDB and the TICK stack. So, a lot of systems monitoring. At its heart, it's a time series database. I can't seem to get away from databases, for whatever reason. I actually started there because at Compose we were going to launch InfluxDB as a hosted solution, and that's where I got my first exposure to it. It's super easy to monitor systems and collect metrics and pop up dashboards and all the fun stuff that you might do on a day-to-day basis to get visibility into your infrastructure, or IoT-type stuff. And yeah, I joined Influx primarily to see what we can do with our cloud solution. So, hosting databases, basically the same stuff as at Compose, only in this environment it's a little bit more Kubernetes-based and containerized, and just really, really exciting, fun stuff, with a smaller crew than you might have at IBM, where you have, like I was saying, 10,000 people in a Slack channel. That's not an exaggeration.

Jay Gordon: So it's a more focused group of people working on a product, like you would expect at pretty much any startup?
Tim Yocum: Yeah, I mean, there's just this excitement of trying to build something new from scratch and put it out there and see what people think of it, and get that feedback and be able to continually integrate, continually iterate, and just improve repeatedly. I mean, that's just so exciting to me.

Jay Gordon: Very cool. So let's now get back in the wayback machine. We've given everybody kind of an introduction to where you are, and I didn't ask the whole "Well, let's talk about the first time you were on call," because, let's face it, it sounds like you've been on call pretty much your whole career. So let's get into the rules, because the rules are the important part to make sure that everything goes to plan in the podcast. We've already gotten through who you are and how you got here, and now we're going to talk about something that happened. But before that, we're going to remember the rules. Number one: don't incriminate yourself. Ever. Don't incriminate yourself, because, you know what, nobody wants to get in any trouble. We don't want to say "I did this and I..." No, let's not. Don't incriminate others, because we are blameless in how we do our retrospectives. And help us learn, because this podcast is just one giant retrospective. So let's hear an interesting story from an on-call incident and kind of how it turned into a nightmare.

Tim Yocum: Yeah, sure. And I will step back one second and say I've been on call long enough that I actually did have a real pager, and I was excited to have a two-way pager. But the worst part was having Nextel phones, where you would be awoken by a chirp followed by a grumpy coworker just talking at you. So I am so glad to not have Nextel phones anymore. Anyway, here's a story that I've told a few times and still find useful to think about, from back in the colo days.
So we were running colocation services out of facilities, basically leasing space, bandwidth, and hands-on time. Okay. And it was a small group. So one night after work we decided, the founder and some of the folks at the office, that we were going to go out to dinner and just blow off some steam. Small group. We all fit in one car, you know, that type of team. So we're enjoying our Brazilian steakhouse meal, getting sufficiently stuffed, and it was a very cliché, almost movie scene where, almost in sequence, everybody's phone starts to vibrate. At that point we were all on call at multiple levels, so it was weird that everybody's phone was buzzing at the same time. So we're looking at our phones, and then we look up at each other with just fear and confusion in our eyes. Turns out the on-site data center staff was reaching out to basically everybody in the company because there's water in the colo.

Jay Gordon: Oh, boy.

Tim Yocum: Not something you expect. You know, it's kind of, "Wait, water? Like, where? Is the office wet? You know, is a water fountain leaking?" No, actually, there's water falling onto racks. So we quickly pay the bill and we all pile into the owner's car, and he's as confused as everybody else, because we have no idea what the scope of this is. You know, we're trying to figure out what's alive and what's not. Clearly the phones are working, so at least we know that's up. So we haul over to the data center and rush upstairs, and we find that, in fact, there are several racks where there's water cascading from the ceiling into the racks, and there's actually water shooting out of the floppy drives. This tells you how old this is: there were still servers with floppies. It's one of those surreal moments where you look at it and you're like, you know, the lights are still on, they're still flashing, but there's water coming out of them. That cannot be good.
So, you know, at that moment we didn't know what to do. We had never thought about this. We had never planned for this. We had no clue how to react. One guy's climbing up on a ladder looking to unplug the PDUs, thinking, "Well, let's just shut everything down." And other people are looking at him like, "Yeah, water and electricity don't mix. Maybe you shouldn't be doing that." We're all standing there trying to figure out what we do next. It's one of those things you never expect. You just don't think that a hole in the roof is going to happen and you're going to have water pouring into your infrastructure. It was only a few racks, so it wasn't everything. But what do you do? You know, it's like, how do we proceed? So the first thing we do is get the facility to shut power down, and we tell them, "Hey, there's water in here," which they didn't even know about at the time, which was interesting. And they have the same reaction we did, kind of looking at each other like, "You're kidding, right? Water?"

Jay Gordon: That's not supposed to happen. Yeah, I know. It's not supposed to happen. Let's find out why it is happening.

Tim Yocum: Yeah, and it's like, "No, really, there's water and you need to do something." And they didn't really have a plan either. So we get the power off, and that was the first step. The second step was: well, we still have water coming in. How do we get the water to stop running onto our servers? So we sort of rigged up a tarp above the racks to divert the water off, and the facility had a guy go up on the roof and look for this hole and plug it, or put something over it at least. Because if you've ever been in a rain shower in Chicago, they're not very pleasant. They can get really, really nasty, and this was one of the nasty ones.

Jay Gordon: I'm supposed to go there on Saturday, and I'm really hoping it doesn't rain because I want to see the coat.
Tim Yocum: So it's 90 degrees now, so hopefully it'll cool off a little bit.

Jay Gordon: Cool off a little bit. Yes, as long as I get at least, you know, an afternoon of no rain, I'll take it. Anyway, so you've got some rain in the data center. They luckily start finding the hole to plug it up, but you're still left with several racks of wet gear.

Tim Yocum: Yeah, and this is very multi-tenant. So this isn't just one rack, one customer. This is a couple of racks of gear and probably five, six, seven customers, maybe. You know, we really didn't know at that point how many customers were impacted, but the first thought was: well, just get everything powered down and let's figure out what to do next. And at that point we broke up into teams, and, you know, we said, okay, first of all, we need the NOC to start communicating to customers. Figure out who's down, let's identify all the systems that are wet, and reach out to these people and say, "Hey, there's been a problem." And fortunately everything was labeled with barcodes, so it was pretty easy to figure out who was impacted. But that person really needed to be away from the issue. They needed to be in an office, just calling people, saying, "Okay, here's what's going on. It's all under control. Here's what we're doing," and just keep calling people repeatedly. It was one of the best decisions that we made at the time, and what I still tell people when dealing with emergencies is that you need to have that point person for communication. You have to tell customers what's going on and make sure that they understand what you're doing to mitigate the problem, because you're not the only one freaking out. Customers are freaking out, too.

Jay Gordon: Like, not necessarily the incident commander, but a person who's assigned to be a communicator.

Tim Yocum: Exactly.

Jay Gordon: To just do outbound communication to people who aren't involved.
And that can be really important, because having someone to just say, "Look, this is the situation, this is what we're doing, and this is when we'll talk to you next," especially in a tremendously massive outage, is important, because getting those three pieces of information, what happened, what's going on, and when you'll hear from us next, is just incredibly important.

Tim Yocum: Yeah, it really is huge, and I can't overstate just how important it is to be open and transparent with your customers. You know, if you have a problem, yeah, there's water coming into the data center. Whatever SLAs you have, who knows whether that's covered. But look, your systems are down. We understand that they're down, and here's what we're doing to fix it. And you don't want that person to be someone like me or the other people who are standing there soaking wet, trying to figure out what to do next, because we're just going to be saying, "I don't know. We're trying to figure this out on the fly," and communicating that out to your customers isn't the greatest thing in the world. You want somebody who can confidently say, "Here's what we're doing. Tell us your concerns," take that all down, take good notes, and keep continuously communicating with them. And this is, of course, before Slack, before customers could tweet at you and say "What's going on?" and you could really blast out an update. So these are people that you had to call back, or they were calling in. You know, totally one-on-one conversations with support people. So we got that done and really got that process rolling, and it really speaks to the people that we had hired, because everybody was so customer focused. It was no problem: "I'm going to take that. I'm going to go talk to these customers." "That's great. Go do it." The next step was: all right, let's get this stuff unracked and crack open all the chassis and try to dry them out, I guess.
So you can imagine, if you're thinking of a typical data center, you've got cages of equipment, and then you have us, with dozens of servers with the tops popped off, puddles of water around them, and basically a bunch of box fans that we just happened to have in the office blowing over the top, trying to dry them off. And I'm not sure why we wanted to do that, because this was not SSD days. It's not like we were going to stick the servers in a bowl of rice and they would power up again. We were talking about spinning rust. You know, these systems were literally spinning rust at this point. These hard drives are likely toast. Everything in here is just dead. There's really nothing that we can do to recover these things, but that's what we tried to do, because the thought was, let's dry them off and see what happens. Once we realized that, you know, we've been drying these things off but they're still dripping wet, and plugging them in is likely not going to result in anything good other than releasing the magic smoke, we had another group of people take inventory of what we had out there and what we needed to basically recreate. And we dispatched that group to go out and start racking up or finding systems of like quality, so that we could get them started up and get the OS installed, because we had all this stuff in an inventory program. So we knew what people were running. Get these things reinstalled, get them in some capacity on the network, so that we can get people back up and running as quickly as possible. And that was probably the smoothest part of the operation, because we had all of the details. So again, this is before cloud. You can't just say, "Give me 12 instances of what we'd just been turning up." Before, it was all bespoke. It was all one-off. Yes, we were PXE booting systems, so there's a little bit less manual work with installing.
But after that, it's essentially the customer's responsibility to go log in and install whatever their apps are, things like that. And throughout this whole process, and I'm jumping a little bit ahead, one of the takeaways that we learned at this point was just taking notes. We were running all of this on the fly, so these were just decisions made in the heat of the moment. And we did tend to take a little bit of detail down as to what we were doing next, what the next plan was. And after this all wrapped up, that was huge.

Jay Gordon: So, real-time scribes, if you will.

Tim Yocum: Yes, exactly. So any time there's a big issue, something like this (hopefully you don't have water pouring out of your systems), have somebody take notes and just record the decisions that are made, because you'll find that you make a lot of great decisions, and you make some decisions that make no sense at all. But they all fold into developing that continually improving response to problems you have. And also you can identify gaps in procedures and policies. We found, and this is really small, that we didn't have barcodes all in the same location. Some of them were on the back of the server. Some of them were on the front. Why? We didn't care about that. But after we had this incident, it was like, all right, maybe we need to standardize on where we put these labels, to make it easier to find out what we have. You know, and structured cabling, things like that, that, yeah, we did a pretty good job of, but in the heat of the moment, if we had done a little bit better, we would have accelerated the repairs that we needed to do. So getting notes together was huge. But at the end of the day,
getting this resolved really revealed some interesting problems. The biggest one was, you know, we got people back online. We got systems up for them, and the systems were largely a complete loss. Fortunately, insurance covers that, but insurance doesn't cover the data on them. It never does. And we learned that some of our customers had great backups. This was all self-managed, so it was like, "No big deal. We'll push our code back up, and we're all good. Not a problem." Other customers said, "Oh, yeah, our primary server, you know, it was, say, server 2001, and our backup server was 2002, and they're racked right on top of one another." Yeah. Yeah, about that backup plan you have. So it's stuff like that, pre-cloud, that you really never thought of too much. You figured if you had a separate machine, what's the worst that could happen? A hard drive dies, you've got another machine there with your data. No problem. But long term, what really came about, and what I use this story to teach these days, is the fact that you just can't assume that data centers are bulletproof. You can't assume that AWS won't have a leaky roof, or that your zonal redundancy is actually going to be redundant enough. So I always figure I don't know what's actually going on there. I don't know where my systems are running. I don't know where my data is stored, so I'm going to store it in multiple places. Maybe different providers, maybe different regions and different providers. I'm really going to be careful with what I do and where I do it, because I don't know what might occur in these physical locations. And in the cloud, we've abstracted all of the physicality of it. It's all just this ephemeral thing out there in the environment. But ultimately it's still servers, still in racks, still subject to Mother Nature doing her worst. And you have to be prepared, because ultimately AWS isn't going to fix your data. Azure is not going to fix your data. No one is. That's on you, still.

Jay Gordon: Yes, I think that one of the things that we've done in this industry is abstracted
so much of the things that equate to heavy lifting or friction, and we don't recognize sometimes that there are environmental variables, like you would have in anything, that can impact you and that you just can't plan for. You know, we've seen things with data centers getting, you know, waterlogged here in New York City after Sandy happened, with huge websites like Tumblr and BuzzFeed that made all these really great decisions about building data redundancy, and in the end it was all in the same data center. And, yeah, it didn't amount to a hill of beans, as they say.

Tim Yocum: And I remember, I think it was Katrina, when Katrina hit, there were providers where people were live-blogging the fact that they had feet of water in their facility and were running low on fuel in their generators. You just never can tell what might happen, so you have to prepare for the worst. And, you know, the other big takeaway that we had was that you get tunnel vision in these types of outages. You say, "Okay, there's water coming down on these two racks. Let's fix it." In retrospect, maybe we should have been doing rounds, looking at all the rest of our footprint to make sure there wasn't water leaking elsewhere. We didn't do that because we were so focused on those two racks. And the same applies for any sort of cloud-related technologies or anywhere you might be deploying code. You're diagnosing an issue, but what's the blast radius? Have you really identified that? And have you thought about it in advance? If something like this happens and you see an outage in one particular area, maybe you need to look at other areas to see if there's a cascading failure. And I use the term cascading again because this translates well into having water in your data center. You really have to think: where is the water going to flow next?

Jay Gordon: Yeah, that's a great point.
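
The placement problem Tim describes, a primary server and its backup racked right on top of one another, is the kind of thing an inventory can be asked about before the roof leaks. What follows is only a minimal sketch, not anything from the episode: the `Placement` record, its field names, and the sample customers and sites are all hypothetical, and it assumes an inventory that records a provider, a site, and a rack (or zone) for each copy of a customer's data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One copy of a customer's data (hypothetical inventory record)."""
    customer: str
    role: str      # "primary" or "backup"
    provider: str  # colo company or cloud vendor
    site: str      # data center or cloud region
    rack: str      # physical rack or availability zone


def shared_failure_domains(placements):
    """Map each at-risk customer to the narrowest domain all copies share.

    Customers whose copies span more than one provider are left out.
    """
    by_customer = defaultdict(list)
    for p in placements:
        by_customer[p.customer].append(p)

    findings = {}
    for customer, copies in by_customer.items():
        if len(copies) < 2:
            findings[customer] = "no second copy"
            continue
        racks = {(p.provider, p.site, p.rack) for p in copies}
        sites = {(p.provider, p.site) for p in copies}
        providers = {p.provider for p in copies}
        if len(racks) == 1:
            findings[customer] = "same rack"
        elif len(sites) == 1:
            findings[customer] = "same site"
        elif len(providers) == 1:
            findings[customer] = "same provider"
    return findings


# Made-up sample data, for illustration only.
inventory = [
    Placement("acme", "primary", "colo-a", "chi-1", "rack-12"),
    Placement("acme", "backup", "colo-a", "chi-1", "rack-12"),
    Placement("globex", "primary", "colo-a", "chi-1", "rack-12"),
    Placement("globex", "backup", "cloud-b", "us-east", "zone-1"),
    Placement("initech", "primary", "colo-a", "chi-1", "rack-13"),
]

report = shared_failure_domains(inventory)
print(report)  # {'acme': 'same rack', 'initech': 'no second copy'}
```

The check reports the narrowest shared domain on purpose: a backup in the same rack is lost to the same leak, one in the same site to the same flood, and one with the same provider to the same provider-wide failure. Where the line for "redundant enough" sits is a judgment call this sketch does not make.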
So, uh, if someone's on call right now, um, what piece of advice would you give them if, let's say, they're two weeks into the rotation? Tim Yocum: Two weeks into rotation, I'd say embrace it. These days, with on call, there have been so many developments in how to handle on call, how to make procedures, doing retrospectives, and having an owner-operator model where you don't have developers throwing code over the fence saying, "Okay, just go run it." Everybody has a stake in it. And presuming you're working in an environment like that, being on call is where you're gonna learn all sorts of good stuff. It's going to be a lot of the dirty underbelly of the product. You're going to learn where all the duct tape is applied, and that might really shake you. But that's also how you're going to learn the ins and outs of a product. And you're really going to grow professionally by having those on-call alerts that hit you at three in the morning, when you're groggy and you're trying to figure out what's going on. At least in my career, I've found that I've really felt super productive, and like I'm really advancing my skills, when I'm faced with a problem I've never seen before and I barely even know what I'm looking at, but it's on me to fix it. And if you find yourself in that position and you don't know who you can call for support, that's a big red flag. When you're on call, you should always have a backup. There's always a release valve. There should always be someone you can lean on who's hopefully been there a little bit longer than you, or is closer to the code itself, who can give you some hints and tips. And it's in those types of pairing sessions, when you pull that ripcord and you need the help, that you really do level up and you start to become a true subject matter expert in all aspects of what you're on call for.
So if that's a software platform, you're going to learn about every component. And keep an open mind, and don't get defensive, which I've seen happen a lot, where it's like, "Well, this problem is outside of my expertise, I can't do anything." Not necessarily. Try to take it on. If you can, try to take it on, pull people in, um, really take advantage of that situation to learn more. Jay Gordon: That's some great advice. So let's talk a little bit about what you're up to, really, as an engineering manager at InfluxData. Tim Yocum: At Influx, we have a cloud platform, and it's been working really well. But we're releasing version two. So InfluxDB has gone through a dramatic rewrite, and we're looking to offer that as a hosted solution as well. So what I've been involved in is building up an SRE/ops team to put forth the infrastructure that that will run on, and to make sure that we're taking some of the best practices out there in the SRE realm: thinking about SLOs and SLIs, and actually promoting the owner-operator experience. When I started, developers were not on call. It just wasn't a thing. And developers are now all on call. And it's something that you can't just brute force. When I started, I was very open about that being my mentality, that everybody should be on call, because it's worked in the past. And really, in a startup at least, everybody needs to feel like they're vested in the success of the product. And again, the best way to learn is really being on call. And so, you know, we've really been blessed with people who are open-minded, willing to get on call, who have some questions because they've never been on call before, perhaps, but are totally jumping in and taking ownership. And by doing that, we've already seen improvements make their way out to our open source software, based on what people have seen in doing internal monitoring and being on call and looking at problems that come up.
So it really feeds into itself. We're very pro open source. We're an OSS-first company, and that's really where all of our passions lie. Um, and what's fun for me is the fact that I'm working with a group that is pushing really hard to find the problems, and to take those problems that we have in our paid platform and make sure that that translates into improvements on the open source side, so that everybody in the community wins from that 3 a.m. call where something has gone wrong. Hopefully that 3 a.m. call results in a code fix that goes out to hundreds of thousands of people running InfluxDB or Telegraf or any part of our stack. It's super rewarding to think that those little things have such huge impact. Jay Gordon: Absolutely. Well, Tim, thank you so much for your time today. It was really a pleasure talking to you about all these really interesting parts of your career and the raining data center. Uh, really appreciate it. Are there any other projects or anything you wanna share with listeners before we kind of wrap it up? Tim Yocum: You know, I feel like on call is something that we as a community need to really embrace more. And, you know, the title of the show, On-Call Nightmares, it really is a big learning experience, and I feel like we should be taking the knowledge that we've gained over all these years and helping the people who are not yet, perhaps, in the cloud, who are still deploying to systems in a colo facility. Really push our knowledge down. It's not necessarily a project, but it's a passion of mine to say, you know, we're really in the upper, I don't know, 5% or so, I would say, of technologists that are really embracing cutting-edge technology in the cloud. There's a lot of people deploying systems in data centers or in a closet in their office, and there's so much that they can learn from our experience.
So I'd say, people in my position, your position, listeners: write about that, put it out there, leave practical advice for people who are also coming up the same way that we all have, and let's bring everybody up with the experience, so hopefully they can avoid the problems that we've faced in the past. And I know that, to put a nice cap on it, at the data center we were at, they now have roof leak kits wherever possible, and they're known about. And so if there is a roof leak, it's really easy to divert the water and get help as soon as possible. So lessons were learned all around that night. Jay Gordon: Awesome. Thank you so much, Tim. Um, we're gonna wrap up. That was a great conversation, and I really, really do appreciate your time. So thanks so much for having interest in it. How can people find you on the Internet? Tim Yocum: I'm on Twitter at tkyocum, Y-O-C-U-M, and also at tky dot io, where I have a little blurb about fixing on call and also fixing hiring, because I think hiring's kind of broken too. I look at this industry, and there's a lot of things to fix. Jay Gordon: Yeah, absolutely. Well, thanks a lot. And, uh, we'll be right back to wrap up this episode of On-Call Nightmares. Well, that's it for this week's On-Call Nightmares. Thanks a lot, Tim, for your time. I really enjoyed that conversation with you. And if you'd like to be part of a future one, just reach out at on Call Nightmares at gmail dot com, or reach out on Twitter at onCallNightmare or at jay destro. I really appreciate all your listens, all of your emails, the tweets, all of it. I really do appreciate it. There's no podcast without you, and we'll catch you next week. We'll have another conversation with a technologist who spent time on call. See ya.