Founders Talk – Episode #92
Enabling a world where all software is reliable
with Robert Ross, Founder and CEO of FireHydrant
This week Adam is joined by Robert Ross founder and CEO of FireHydrant — the glue layer between your tech stack and your teams to mitigate and resolve incidents at scale.
Robert shares his journey to become a software engineer, his time at DigitalOcean, this idea of incident management as a platform and how he shifted his focus from creating courses on incident management to recognizing the value of the software he was creating for the course — what is now known as FireHydrant. We also talk through his first experience in raising capital, what happens when the bar is raised on the reliability of the world’s software, and why their mantra is “Hire great people, who build, sell and market a great product, and you’ll have a great company.”
Robert Ross, thanks for joining me. We’ve been chatting behind the scenes for a while now. You’re one of our sponsors, and I love catching up with the sponsors on the show too, because for us, just to be super-clear with our listening audience in case it hasn’t been that clear, we choose our sponsors just as well as they choose us. And I think FireHydrant is such a cool company, and the conversations I’ve had with you behind the scenes just about your journey has been very interesting to me. So I’ve been looking forward to finally getting you here on Founders Talk and just sharing this journey for you. So welcome to Founders Talk.
Thanks so much for having me. Excited to be here.
And it has been a journey, right? You’ve been going for a while in terms of your career. You started young…
I started semi-young… I think I touched the computer for the first time in my late teens. Now, we may date each other. I’m 43. I don’t know how old you are. You don’t have to share your age if you don’t want to, but I’m guessing that I’m older than you.
I’m 31, so let’s just put that out. [laughs]
[05:42] Gotcha. I mean, I didn’t really use a computer until I was in my 20s. Like, really use a computer. And it was just so interesting – today kids grow up, a whole generation literally grows up with technology. I just think that’s just so interesting, honestly. People get started younger, some get started older… I didn’t start building websites until I was in my mid 20s. You were in your mid teens. You know what I mean? So there’s just an interesting thing, when you look at people’s lives and say “When did you begin?” but it’s also when were you born that kind of determines when you begin. I guess at least for this next 30 years.
Yeah. Well, I didn’t have a say in when I was born…
But luckily, we had a computer when I was pretty young, that was capable of playing some simple video games. And I had a pretty simple editor and Notepad++. I was a Windows person when I first got started. I’m on Mac now, but when I was 12 years old we had a little Compaq Presario… I think we had the one that said “We’ll never age.” I don’t know if you ever saw those Compaqs they sold. They had a sticker on them, –
Is that right?
“Future Proof” or something silly like that, which is so funny to think about…
That didn’t age well. [laughs]
It did not age well. But yeah, we had a computer that was capable of making some simple websites, and one day I was just pretty curious… My mom was off at work, and we had just moved to Oregon, actually; I was living in Lincoln City, Oregon. I’m originally from San Diego. So I didn’t have any friends yet. I was home alone, school hadn’t started, and my mom was off at work, so one day I had to call her to ask her if I could use the dial-up… Because that meant she couldn’t call me. And she said yes, and I just searched one day “How to Make a Website.” And that was kind of the beginning.
I started making all these silly little websites for myself. The first websites I was making, which is funny to me to think about now, were tutorial websites. So I was 12-13 years old, making these tutorial websites on how to make websites. And I always think that’s funny. Maybe there was somebody out there at a bank or something like that, going “Oh, how do I do this weird PHP thing?” and they were reading this 12-year-old’s website on how to make those things that they wanted to make.
But yeah, that’s kind of how I got started in the earliest days. It gave me the bug, I just wanted to keep making things… And I just kept asking questions, and I was very annoying, I’m sure, on all the forums back then in 2004 and 2003.
Oh, well… Right? Oh, well. No one remembers. Time’s moved on. That’s the funny thing about, I guess, embarrassing moments. We think they’re really embarrassing, or you may have been annoying to somebody, but they don’t even care anymore. Time moves on.
Not at all.
What year were you 12 years old then? Was that around 2003-2005?
Cool. That’s roughly my timing, too. So my timing was around 2004-2005. And I just had a gap in my life, essentially, between my ability to have a full-time job - I was trying to immigrate to Canada… And things were just really weird where I couldn’t work for nine months, because of immigration issues. I couldn’t go back to the United States, and I had to stay in Canada, but I was unemployed, because I was unemployable. And I just had this curiosity bug; it was around 2000-2005. Similar - just websites, WordPress… Thank God for open source and WordPress, for me at least… Because that was – when I draw my straight line from then to now, WordPress is in that earliest day.
Yeah. When I was getting started, there was this forum software, it was called phpBB.
[09:45] And it was open source, and I was able to look at how they were writing phpBB. And that was actually a really good resource for me, because I would just crack open whatever PHP files were in that repository, or whatever it was at that point, and just read and learn and view source back then. We didn’t have the tools that – we’re in 2004, 2005. You didn’t have “Inspect element” really. You had Firebug, and that was kind of a way to look at the DOM… But really, you had to view source and find the line of HTML. You’re like “Oh, how did they do that HTML thing on this website?” That’s how I tinkered all the time, was just constantly open, view source.
And then I got really nefarious with it. I was 13, I wanted to watch this movie that was a rated R movie, and what I did is I opened up the IMDB page, and to prove to my mom that it was PG-13 I copied the View source, modified the content page from R to PG-13, and then saved that file locally, opened it up, and said “Mom, it’s PG-13.” She’s like “Okay, fine. You can watch it.” So I became a little dastardly with it in the early days.
Wow… I saw [unintelligible 00:10:59.11] on Twitter recently, she had a pretty cool meme video she shared… The easiest way to say it was “You can make your website perform better just by Inspecting element and making all the tests pass, essentially… Making all the performance tests pass.” Like, “It’s not 58%; do this, do this, do this, and bang, it’s 100%. You have the best-performing website ever.” It was really funny.
At the end she’s like “Don’t do this. This isn’t real advice.” But I thought it was just the same kind of thing; yeah, View source, change things, and you can make the website be whatever you want it to be.
I love the curiosity though. The pursuit of your curiosity. So where did you go from there then? So you’re 12, you’ve proven to your mom that the movie is not an R movie, it’s PG-13… You’re doing some tinkering stuff… How did it go from - okay, young… And young people are always curious, right? You can’t help but be curious, because you’re so new in life. But how did it go from curiosity to “Okay, I can actually go this route. There’s a market opening up for the world, and as I pursue a career when I get older, this is a direction I can go”? How did it get serious for you?
Yeah. So when I was 13, I started making pretty, I’ll say more advanced websites. I was making applications with logins, and I was making things with bulletin boards, and this was – I was a big PHP person back then. PHP 3 I think was the language of the era; PHP 4 was just getting out there… And I was making these back-end-heavy applications. And what happened is that a lot of small businesses started wanting websites around that time period. And when I say small business, I mean a landscaping company. I was living in San Diego, moving back from Oregon to San Diego at this point, actually; there were fishing boat companies that wanted little websites for themselves. So what I – my mom basically became a salesperson a little bit for me on the side…
And whenever she was with her friends who had small businesses at the bar, or whatever she was doing back then - we’re kids, we never really know, right? And she was saying, “Well, my son is making these websites, if your business needs a website.” And I was charging $599, because I read somewhere - and I don’t know how true this one is, but I remember at the time that I didn’t need to tell the IRS if it was less than 600. And I don’t remember where I learned that, or if that was even true, but I was charging $599. Very specific. And really, what I was doing is I was just copying and pasting this content management system I made for myself in PHP, and FTP-ing it to another server, changing the domain, and re-skinning it for these businesses. That’s how I was making my money at the time. So I made – I don’t know, half a dozen or so small business websites around San Diego.
And the brilliance of this – I didn’t realize I was doing it at the time… For me, it was “I want to buy a new Xbox game. I have to go seek out new business.”
One must have goals, right?
[14:05] Yeah. And I wanted the latest Halo, or whatever was happening at the time. Halo 2, I think. And what I didn’t realize I was doing is I was actually making a portfolio. I had half a dozen websites that were live on domains, with real people using them… And that was good, because not only did I realize this is a monetizable skill, it gave me experience, and I was still running my tutorial website on the side. And I was talking about all the things I was doing. I was writing down these articles, “How to do logging in in PHP” and “How to read cookies in PHP.”
Side note, there is a post on a forum that I used to ask questions on from 2004 that I asked “How do I keep a user logged in?” That was the title. And then ten years later, I recovered my account on that website, and ten years later I replied to myself with the answer. So you can see 2004, and then 2014, the same person replying back.
Just for posterity’s sake, right?
Yeah, I just wanted to make sure I had the right answer. So by the time I had actually graduated high school, I wasn’t going to college… I wasn’t applying – my mom had passed away when I was 16, and my dad wasn’t in the picture, really my entire life. So I was basically like “Oh, well, I need a job.” So I started looking for a job right after high school. And I really just was looking for money. I applied everywhere - Barnes and Noble, McDonald’s, the movie theater… I just was tossing up my resume to this boardwalk in San Diego. I just walked down and handed my resume in to all of these businesses. And the company that, of course, called me back - none of those called me back - was the one that I applied for online, this little agency in San Diego (it’s still going) called [unintelligible 00:15:52.11]. And they brought me in for an interview. My first ever interview for any job ever. I wore a suit that was way too big, and I brought in – I’ve done this twice in my career; I brought in printed out code, in plastic sleeves… I brought in this printed out PHP templating system framework that I had written.
Back then Smarty PHP was the templating language, and I had mine. And they’re talking to me, and they’re asking me “Explain this, show me this website…” And then later that night, they offered me a job. I was offered, at that time, 15 bucks an hour, 18 years old. That was great. It was plenty to live on. So I took the job, and that’s where the career started. So it went from making nonsense websites to a full time web developer.
That’s super-cool that you printed out your code, man. I mean…
I’ve done that twice.
It’s actually genius, honestly.
Well, iPads didn’t exist, so that’s my excuse.
Right. Well, at the same time, you have to – I’m sure GitHub probably didn’t exist then either. I don’t know what –
I think it was really early.
Okay. So really early.
Yeah. And at this point, you know, I hadn’t worked at a job that needed source control. I had some experience with SVN. Very, very little with Git, and so that was my solution.
Well, the point I was trying to make is how else are you going to show your code? I guess you can open up your computer potentially…
…and maybe, I guess you’d probably have it locally… But printed, it’s like “Hey, here’s – you could read it right here. We can look at it together.” You can share the view really easily, because it’s literally a piece of paper. It’s actually quite ingenious, honestly, if you ask me. I mean, sure, if you had an iPad, that may be better; but maybe even an iPad, you can’t interact with it that much maybe. I guess paper you can’t interact at all.
Yeah. And I think that now we do interviews with tools like VS Code, like remote workspaces, and do interviews that way. But at that time, it was like “How do I show that I know what I’m talking about? I’ll just print this out.” And it worked out for me, so I’m glad I did that.
[18:02] How did you get to the point then where you are an engineer where incidents became an issue, and you were able to overcome these challenges? Because FireHydrant is about incidents, it’s about incident management… How did you get from the age of 16, making 15 an hour, probably still making similar, but better, less nonsense websites, more legitimate websites maybe… I don’t know how to describe it. To a point where you’re “an engineer”? And you literally were an engineer. I don’t want to in-quotes engineer you. But I mean, you go from that 15 an hour standpoint… Behind the scenes you mentioned new school, so just… I’m trying to figure out how you get to an engineering standpoint?
That’s a good question. I don’t think I even have a definition, like what’s the difference between a developer and an engineer; in some countries you have to have a degree to even say you’re an engineer. But for me, I think it’s when the problems became more complex, and we weren’t just doing like Drupal installations and things like that. And no shame in that; it’s perfectly legitimate, it’s a huge market. There’s a lot of people that rely on those types of tools. But when you have to incept something from scratch, with no prior example, that’s when software becomes challenging, because you can’t Stack Overflow questions anymore. You can’t search for things, because the question hasn’t been asked. So you have to get creative. And for a while, when I was going from 12 to 18 in that first job as a web developer, really all I was doing is I was collecting Legos. The reason I love the Lego analogy for really any career but software - it’s a really good one - is that when you build, say logging in, for some site, just a feature logging in, that’s a Lego set that you’ve built. And then you get another Lego set in the line, where it’s “Posting comments”, and that’s another Lego set. And you can transfer that knowledge. I can build a comment system, I can build a logging in system, and I’ve done those. I’ve built that Lego set multiple times in my career.
Now, I think you transition much more into an engineering mindset when you can start to think of “Well, what can I do with this individual Lego piece that I used on this logging-in Lego set that I built? How can I use these pieces from this other set that I have?” and now you can start to layer in creativity into how you build new software. It’s “How can I munge all of these Lego sets that I’ve been collecting over the years into something new?” And that’s kind of what led to my career; all the Lego sets that I ended up collecting over the years, I was building a lot of internal stuff, I was building internal tools, internal admin systems for websites.
When I was at DigitalOcean, I was on one of the teams that worked on – I was at DO for just over a year, but I worked on one of the systems that was the internal dashboard of all the servers, and all of that, and I was on-call. And then I went to Namely; these were my last two gigs before starting FireHydrant. I was the core services engineering manager. So all the Lego sets that I started kind of building and collecting in my career were much more focused on my peers, and their day to day life, than the software that was being built for the people paying us money. And when you start to have that career trajectory, like internal tools, core services, developer-focused tools, that’s kind of what led into this journey of incident management, because I was on – I’d been on-call for a number of years now. I think I’m still technically on call at FireHydrant, but at a much higher tier of escalation.
Gotcha. That’s interesting, in terms of DigitalOcean, too. I’ve used DigitalOcean plenty in my day, of course, and I’ve used what you’ve built, which is even more cool. I’ve used the dashboard to see all the different droplets I’ve got out there in the world, and what their status is, and if I need to do a backup, or clone them, or whatever it might be. So that’s cool to see that you built that.
[22:13] So this idea of incident management I would say has become maybe more of a known term to me in the last couple of years, I would say. And less before that. I mean, incidents have always been a thing; I’ve always enjoyed a good post mortem, especially as you see JIRA being down for weeks, or whatever it might be… Like, what’s a post mortem on something like that? Or this Heroku thing going on right now currently, with security stuff; there’s gonna be a post mortem on that. But these are bigger incidents. But this idea of incident - it can be small, it can be big. We’ve had incidents here at Changelog where our database will fall over because we didn’t have a managed Postgres, and we had an issue when we were in Kubernetes, and this or that. But there’s a whole show on our podcast called Ship It, that you can listen to, listener, if you want to go deeper on that. But the point is that incidents are this challenging thing, and a lot of people from the outside see it from this post mortem perspective, like “Oh, something really big happened” and how does the world get told about this?
But internally, when you’re on call for these incidents - it could be like PagerDuty, or two in the morning, a server falls over… But how did you get into this incident space? How did you begin to think about, from an engineering standpoint, the challenges of those who are on call for those incidents, and have to manage the incidents and deal with whatever might have happened, whether small or large? How did you get in there and start thinking “I can solve these problems. I’ve dealt with this. Now I can actually build something that can actually solve these problems for engineers”?
Yeah, I think a lot of it actually was a little bit of luck in my career. I happened to pick companies that were kind of pushing the needle with software. Even from my first job, we were kind of doing new stuff; we were experimenting with new things. And my second job, we were – I feel pretty confident saying this; in 2010, we were pushing to production when the tests were green. And in 2010, it was just getting started, the DevOps lifecycle.
So I got to work at all these companies… One of the things that happens when you push the needle is you probably encounter some pretty hairy incidents; you’re gonna encounter hairy incidents regardless, but when you are introducing things like Kubernetes… We were running Kubernetes at Namely on 1.2. Really early. Somehow I was allowed to install a service mesh; we used Istio at Namely, on 0.2, and I took down production, and then I had to reply to my own page, and reply to the incident… At DigitalOcean we had databases being dropped every so often, and those were major outages… So I just happened to – this is the luck component; I just happened to be where a lot of incidents were happening.
But also - maybe I don’t like myself a little bit, but I love incidents. I think they’re really fun, in a weird way. And I would never be an arsonist and cause an incident for customers, because you always want to build trust with them. But incidents are kind of a fun moment for me. And they always have been. I always love jumping into incidents, and figuring out what’s the fastest way that we can solve this. How can we manage this really well? Do we need to communicate with customer success?
And because of that love of just always wanting to help during an incident, that was a natural point to start FireHydrant on the side, because I was living that problem. But FireHydrant actually – I don’t know if you know this, Adam… We didn’t start – it wasn’t actually supposed to be a company initially.
Oh, is that right?
From the first line of code.
I didn’t know this.
It actually was supposed to be a video series. So I was a teacher at night, part-time, teaching very basic how to make websites, transitioning from writing tutorials on my websites in 2005 to actually teaching at night, starting in Los Angeles and now New York City at that point… And I realized that there was this gap, where these kind of code bootcamps or part time classes end, and where production-ready software really begins. It’s a canyon that these newcomers, beginners to software engineering have to cross.
[26:25] So I was actually – I set out originally to make a video series on how I would approach building an application from scratch. And that application that I chose to build was FireHydrant, because I was trying to kind of kill two birds with one stone. I wanted to build something which I could use in my job, which was an incident response management tool, while also recording a video series that I was actually planning on monetizing.
I got about 40 hours into recording, which is why I even have this mic boom, is because I was recording that series with all the curse words included. And I had a friend eventually say “What you’re building, FireHydrant, is far more valuable than the videos that you’re making and producing.” And I kind of took a look and was like “Yeah, you’re 100% right.”
“Wow. Thank you.”
Thank you for doing that, whoever – if you ever hear this, and you’re that person, please tell me, because I owe you something. And so I stopped recording. And it turns out when you stop recording something, you go way faster. So I started building FireHydrant on the side, burning the midnight oil, getting up super-early before going to my full-time job… And in a couple of months I had something that was an incident response tool. And that’s when it really kind of started to become a company. And then we raised our series seed in December of 2018. That’s when it started, was at that point.
Yeah, that’s interesting. I love those moments too, the beginnings or the fun parts… And I think that the advice you had gotten from the person you can’t remember is key, because sometimes you’re doing things in life, and you can profit more from the exhaust you’re creating, the by-product, so to speak, than the main thing you’re doing. And I think that’s super-interesting, that somebody shared that with you and you’re like “You know what, you’re right.” And then you stopped recording, and you moved faster. Because you’re right, when you’re not recording, you can move faster, because – well, you were trying to teach people how to do things, versus actually doing the thing. You were doing the thing, but you’re probably moving at half the speed, because you were more interested in sharing the knowledge than you were actually building the thing.
Typing and coding and explaining what you’re coding at the same time - way slower.
Right. Exactly. I’m sure even mentally, the gymnastics in your brain you had to do were probably even more challenging. You were probably mentally/cognitively more challenged, because…
…having to explain what you’re doing while doing it is challenging. There are blooper reels too, of all of it.
Is that right?
Oh, yeah. There’s just full-on videos of me finding bugs, and cursing, and… Yeah.
Are they on the internet, or are they behind the scenes?
There’s somewhere… They’re in an S3 bucket somewhere I have…
So not public, okay. Not on YouTube. This isn’t like FireHydrant Inception Outtakes, so to speak.
The product has evolved so much since those days… You don’t want to show “This is literally how we built the product.” Maybe I can find them at some point.
Well, it’s more or less to laugh with you, kind of thing. Talk to me about the midnight oil aspect and the – I would just say the early charge, the early energy you had, with your discovery of, you know, living this problem day to day in your job, and then building a tool to manage it and resolve it easier. How did you feel when you first started to do this? Talk to me about the energy level you were feeling to make you want to get up earlier and stay up later. Not just that founder drive, but something in that moment where you find sort of like “This is the next thing I can do. This is the next big thing I can work on.”
[29:56] I think it comes down to a lot of factors. When FireHydrant had its first round of funding I brought on two people I deeply trust as co-founders. So Dylan Nielsen, who’s our head of product, and Daniel Condomitti, who’s our head of engineering. And when you have – let’s call them like startup accountability buddies, that helped a lot.
Another thing that was big is - I knew the problem, I deeply understood the problem as an on-call engineer that was fighting fires for so long, which is why the company is called FireHydrant. What do you need to put out a fire? A fire hydrant. And I think that’s still what excites me, is that this is such a large problem, with so much return on investment for companies, that I just want to solve it. I want to solve it as much as I can. And I don’t think it’s ever truly solved. We’re always going to have incidents in our systems, but the impact of them, how long they last, and the morale impact and the internal impact of them… There’s just so many facets of incidents that are problems that we are setting out to solve. And that’s what keeps me going. And just the quality of the customers that use our tool today. It’s just gotten more exciting every day.
It helps that you love incidents. I mean, honestly, right?
It’s semi-sadistic in a way, because it’s like torture, right? Because for everybody they’re not fun, right? Especially when you’re on the chopping block. Like, “Oh gosh, I took down production with Istio, or whatever. I should be fired?” “No, no, no. I actually solved the problem. Here’s how the incident went down. Here’s the playbook we could run next time. Here’s the learning from this challenge we faced.” “Okay, great. We deployed Istio and there was this challenge. But it was because we weren’t planning in this way, and now we have reliability because of this incident.” I think that’s kind of the beauty that comes from the learnings, I suppose, of incidents.
The opposite of incidents may not necessarily be opposite, but it’s reliability, right? Because if you have reliability, you probably have less incidents, or maybe less severe incidents.
Yeah. And at FireHydrant we think of reliability – so we think that there’s a staircase, and incident response is ad-hoc, freak out, get there as fast as you can, pour water on the problem, and…
…drive off into the night. Incident management I think is where you have a graduation into service ownership, and you have people responding to incidents; not at every hour of the day, because you’re constantly learning more and more about your system, and how it behaves, and therefore you can get smarter about how you prevent incidents, and not only reply to them quickly, but mitigate them quickly.
And then reliability - what I think is really interesting is reliability to me is a business metric. It’s not an engineering metric. It’s not a – it’s the whole company… Because reliability impacts every corner of the business. So if you think about - let’s say you’re running an eCommerce site, and you have a checkout system, and it has a little box for a promo code. And we’re going into Black Friday, and marketing is about to send an email to 2 million people with a promo code for your eCommerce site. And you’re down when this email is about to go out.
Reliability at this point is no longer an engineering problem. This is a marketing problem, because people are going to click that link in that email, go to a dead page, get a 502 or whatever it is, and go “What the heck?!” So it’s a much bigger problem than just engineers replying to incidents, responding to incidents; it’s “How can we build a tool that impacts literally every corner of the business?” Because that’s what an incident is. Every department gets impacted, even down to legal. I mean, legal is gonna have to go review contracts; like, “What is the SLA for this company over here using our tool? Oh, well, they have a different number, so we’ve got to recalculate that one.” In 2018 Slack had $8.2 million in SLA refunds.
That was a lot of departments involved in getting to that number. Finance was involved, legal was involved. Engineering was certainly involved, like figuring out which customers were impacted, for how long. So we’re moving towards this world where the reality is that – in this decade, and this is my big bet – reliability is going to be a metric that publicly traded companies have to start reporting on… Because investors are gonna see “Well, sure, you gained this many customers, but how reliable were you for them?” Because that’s a good indicator of how happy they are, and if they’re willing to stay. And if you have a competitor that’s more reliable than you… It starts to change the equation of businesses.
I like the fact that it puts the ownership too, because I guess – it’s one thing to win business, it’s a whole different thing to keep the business, right? Gain the contract, sign it off, they’re legal, you’re legal, everybody’s cool… Okay, that SLA works for us… But then actually performing for it and knowing the difference there.
So you’re saying that reliability being a business metric more than just simply an engineering metric means that - at that point, how do you then track it? Would you just simply call it reliability? Is there a better name for it? In the next decade what will the term be used when CNBC is talking about it, or someone’s on Bloomberg talking about – you know, some CEOs talking about their reliability index, or whatever? What will the metric be, what will it be called?
Reliability index. I think that’s a good one. I mean, the State of DevOps report actually used to measure availability, but now it’s measured as reliability. Because they’re saying in that report more goes into it. I can’t say for certain; I just have my big bet, bold, H1, publicly-traded companies are going to have a report on reliability.
If I were to have to guess though, I’m going to assume that they’re going to have to say the number of refunds issued because of reliability SLA violations, I’m going to assume that they’re going to have to talk about potentially even DORA metrics I could see coming into the fold here, which I could see like change failure rate. So for how many times we deployed in this quarter, how many times did that cause a reliability or availability outage? And that goes into your reliability index.
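As a rough illustration of the change failure rate Robert describes (of the deploys in a quarter, how many caused a reliability or availability outage), here is a minimal sketch. The `Deploy` record and the sample data are assumptions made up for the example, not anything from FireHydrant or the DORA reports.

```python
from dataclasses import dataclass


@dataclass
class Deploy:
    """One production deploy in the reporting period (hypothetical record shape)."""
    service: str
    caused_incident: bool


def change_failure_rate(deploys: list[Deploy]) -> float:
    """Fraction of deploys that led to a reliability or availability incident."""
    if not deploys:
        return 0.0  # no deploys means nothing failed
    failures = sum(1 for d in deploys if d.caused_incident)
    return failures / len(deploys)


# Made-up quarter: four deploys, one of which caused an outage.
quarter = [
    Deploy("checkout", False),
    Deploy("checkout", True),
    Deploy("search", False),
    Deploy("search", False),
]
print(f"Change failure rate: {change_failure_rate(quarter):.0%}")
```

A number like this could be one input to a broader reliability index, alongside things like SLA refunds and bug reports.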
I think there’s a lot there that goes into reliability, because… Another one could be how many bug tickets did we have reported? Many times we don’t treat bug tickets as an incident, but to that person that hit that bug - it’s an incident for them; it’s a reliability problem for them. So I think we’re gonna see a lot of changes in the next few years towards this business metric of reliability.
I think it’s interesting too how the developer and software engineer is getting more and more closer to business goals, where before it was sort of like an “us and them” side of things. And now, when I look at – I have an idea for a podcast that we’ll eventually potentially do, that I won’t say here, but it very much looks at the way engineering departments, and essentially the way the software ran companies, which most companies these days are really software companies. My grocery store, HEB, down the road - here in Texas, we have HEB. Everybody loves HEB. If you came to Texas, you’ve been to Austin, you’ve been to Houston, Dallas, wherever, you’ve been to an HEB. It’s an amazing grocery store. They do great things locally, they have great business practices, but they also have an amazing app and great software that runs their business; they make great decisions. And so I think the value of that company, specifically of that company, or others, is predicated on their ability to engineer good software. And so in some ways, you can speculate the value of this company based upon the ability for the engineering teams to execute and do well.
And so in this case here, if we look at incidents, or reliability, or this reliability index, that very much could be a key metric you watch and leverage as a potential investor, whether it’s an employee, actual money in the stock market, if it’s a public-traded company, an angel investor, or a seed round funder, or series A lead, or however you look at it; you might look at this reliability index number, or whatever this might become, as a key indicator of why it’s bad or not.
Yeah, I think you nailed it. I mean, there’s so many surprising companies coming out now… And sorry, by surprising companies I mean companies that you wouldn’t expect to have amazing software. I have a Roomba. And the Roomba app is fantastic. It’s internet-connected, it does a push notification to me when it’s done… My vacuum cleaner has no business having amazing software behind it, as far as I’m concerned…
Yeah… I have a Roomba too, and it’s spectacular. It’s really good.
It’s weirdly good. And you can feel it in the app, like “Wow, they’ve put effort into this app.” This isn’t like a UI WebView in an iPhone container. This is a real native application from a vacuum cleaner. And that’s gonna start happening more and more, because the expectations of customers are changing rapidly.
I’m in New York City, and I shop at Whole Foods, and I am just amazed at how easy it is for me to get my groceries now through Whole Foods, through the app. I can say “This is what I want.” And it’s to the point now where if the person on the other end can’t find the item, or they’re out of stock, that I get a text message saying “We’ve found a substitution for you.”
Grocery shopping is changing rapidly. So I think that we tend to have a narrow mind on where software is, I think, because we’re just surrounded by developer tools. That’s where we kind of live as engineers, is in our developer tools, and things that… But the world is just surrounded now by great software, and a lot of other software that’s trying to be great.
The way I’m thinking about it more and more lately - I walked by a table at a coffee shop, and I’m just having fun in my head, I’m like “What software touched that table?” Well, it was probably logistics software that got it here, there’s probably an AutoCAD file somewhere in a cloud server that’s being hosted, and there’s some software involved there… And there’s some software for the person on the truck… And it’s just gets kind of insane when you think how much software touches mundane things that we just walk by now.
My favorite one lately is trees. I like thinking about how much software is touching a tree. And so far, I’ve only been able to come up with the amount of water used to water that tree. That’s all I have.
[44:09] Yeah. Well, I guess you might have the seed package at some point, or maybe the tractor involved, that at some point harvested something, which led to the possibility of the seed… Or the clearing of the field, or I guess potentially of not clearing the field in the case of a tree, right? It’s not cleared.
Yeah, New York City Parks Department probably has something tracking something for the state of a park. When was the last time we – you know, customer complaints about a park that they have to address and… Yeah, it made me excited about how much opportunity there is to improve reliability just for everyone, not just engineers.
I like that aspect of it, because I can almost imagine you walking through your life on a day to day basis and you’re thinking “They can use FireHydrant… They can use FireHydrant…” In the case of Roomba maybe not so much, because they’re pretty amazing… Or maybe they do use FireHydrant or something like it. Maybe it’s an internal tool that eventually they can just let go and use your service, because it is so reliable. Maybe they have great practices behind the scenes, so when things do catch fire, they put it out quickly, or it’s unnoticeable, or perceived unnoticeable to someone like you, because they’ve done a great job in their engineering practices. And I think that’s really what I’m learning more and more about incident management, is - and it seems like it can almost touch the hierarchy of a business too, because you mentioned owners, service owners. So that means that the business has enough wisdom to 1) hire more than one person to run the engineering department, and then 2) to actually dedicate someone who’s owning it, either the incident… Maybe you can educate me more about the behind the scenes of this and how it permeates, but I’m thinking that as teams become more and more mature, they identify how to handle their incidents, and they either have a homegrown system, an ad-hoc system that, like you said, you just pour water on it, you run around like a crazy ant or something like that, because somebody stomped on your ant hill, or you have calm, cool, collected, pre-thought-out ways to handle things. And because you have a mature team, you’ve got owners, that owner reports to so-and-so… You know, when this happens, legal needs to be brought in, because it’s likely an SLA is attached to this kind of incident, or this kind of service, or whatever it might be.
Help me understand the breakdown of incidents and how it doesn’t just involve the engineering department, but has maturity and has other teams involved that they all sort of collect, and be informed, and do their work, and then put out the fire, for a lack of better terms.
Yeah, so there’s something happening in the industry right now, that I think is to the benefit of everyone; not just the people operating software, but people utilizing software… It’s service ownership. We’re seeing a world where you not only build, deploy, manage, you’re now directly responsible for the reliability of that service operating in production.
I’ll give you an example. At FireHydrant we’re very product-focused on our team separation. So we have an incident management team, we have a Service Catalog team, we have a foundations team, which is like user management, invitations, signups, things like that… And we have an integrations team. And the value of that is that we now have very carefully crafted like “Well, if it’s an incident management issue, this is the team that responds to it.”
And that’s important, because what can happen if you don’t create these, say, lines in the sand, you can accidentally introduce heroes to your incident management processes. So I used to be – I’m not going to call myself a hero; I used to be a first-responder, whether or not I was called to the fire or not. I was like “I smell smoke. I’m gonna go see if I can help.” And one thing that we introduced into FireHydrant to help identify this is responder stats, like “Do you have a person that is just always responding to incidents?” Because now you have – if that person wants to leave your business, go become a baker someday, they’re tired of software, who’s gonna respond to incidents anymore?
[48:19] So I think service ownership is super-important, because you’re spreading important knowledge about those applications across a team, with known barriers… And it also improves how you respond, because maybe, maybe a portion of your product isn’t as impactful to a certain part of your bottom line. So you can actually create different escalation policies based on product areas. So now we can say, “Oh, the analytics tool is five minutes behind. Is that worth waking up an engineer at 2am? No, probably not.” Well, maybe it is, if you’re an observability type company. But for some other company, it really depends on what you do. They might say, “Well, that team doesn’t need to wake up at 2am” and you can start to segment really, really nicely down there.
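The idea of different escalation policies per product area can be sketched roughly as follows. The product areas, team names, and business hours here are invented for illustration; this is not how FireHydrant models escalation.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class EscalationPolicy:
    team: str
    page_after_hours: bool  # wake someone up outside business hours?


# Hypothetical product areas and their policies.
POLICIES = {
    "checkout": EscalationPolicy(team="payments", page_after_hours=True),
    "analytics": EscalationPolicy(team="data", page_after_hours=False),
}

BUSINESS_START, BUSINESS_END = time(9, 0), time(18, 0)


def should_page(product_area: str, now: time) -> bool:
    """Page during business hours, or at any hour if the policy allows it."""
    policy = POLICIES[product_area]
    in_business_hours = BUSINESS_START <= now < BUSINESS_END
    return in_business_hours or policy.page_after_hours


# Analytics running behind at 2am waits until morning; checkout does not.
print(should_page("analytics", time(2, 0)))
print(should_page("checkout", time(2, 0)))
```

An observability company would likely flip the analytics policy, which is the point: the policy follows what matters to that particular business.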
And something that we’ve done in our team too is this has actually created a nice way that we do our observability. All of our traces, all of our logs include the owning team now. So you can actually go into – we use Honeycomb. Amazing. But you can go in there and you can actually type in “show me all the traces for this team.”
And it creates this really, really well-threaded line throughout the people that write the code, people that have to reply to the incidents, all the way down to the same tools to power all of that. And I think that’s what’s going to have to happen. Our systems are getting so complex that a single site reliability engineer that has consistently responded to your incidents for the last couple of years is no longer gonna be able to cut it, because it’s impossible to maintain that much complexity in one human brain; you have to spread it out.
So does that make then ops departments, engineering departments more like everyone’s an SRE in a way then? Or is it like specific owners become, in quotes – because SRE is in a lot of people’s titles, right?
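One way to picture "all of our logs include the owning team" is a logging filter that stamps each record with a team name. This sketch uses only Python's standard `logging` module; the team and logger names are made up, and it does not show Honeycomb's API or FireHydrant's actual setup.

```python
import logging


class OwningTeamFilter(logging.Filter):
    """Attach the owning team to every record so it can be queried later."""

    def __init__(self, team: str) -> None:
        super().__init__()
        self.team = team

    def filter(self, record: logging.LogRecord) -> bool:
        record.team = self.team  # extra attribute, usable in format strings
        return True              # never drop the record


def team_logger(name: str, team: str) -> logging.Logger:
    """Return a logger whose records all carry the given (hypothetical) team."""
    logger = logging.getLogger(name)
    logger.addFilter(OwningTeamFilter(team))
    return logger


logging.basicConfig(
    format="%(levelname)s team=%(team)s %(message)s", level=logging.INFO
)
log = team_logger("runbooks", team="incident-management")
log.info("runbook step executed")
```

With the team on every record, "show me everything for this team" becomes a simple filter in whatever tool stores the data.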
And so that seems like this is making it in a world where almost everyone’s responsible in a way, but there’s a particular person, obviously, that’s key; because you don’t want one person owning all the knowledge.
Right. So I think SREs are – if we’re using the quite literal term from Google is that they’re building software to empower reliability. They certainly own a lot of aspects of reliability, and maybe some of the core systems or the platforms that people build on top of… But they should be building tools that enable the other teams to manage their own reliability, too. Or at least in my world.
For example, if I need to roll something back really, really fast because of a reliability issue, I should have a tool that someone else has provided to me. And maybe that’s an ops team, maybe it’s an SRE team, maybe you have an internal tools team… But that’s kind of where the line in the sand I think exists.
How then does a platform like this enable other parts of the organization to play a role? So if you’ve got service owners, which is obviously the majority of the team, and you’re also not having too many heroes, as you had said, so that you don’t have isolated or compressed knowledge in one person’s mind, you have it distributed across the team, and there isn’t one response from one person, it’s distributed that way… How then does FireHydrant enable legal to get involved, or marketing? Or how does the rest of the team care or get involved? Care is probably the first step, and then get involved is the next… Because at some point, it’s like “That’s just not my problem.”
[51:58] Yeah. Well, you have to know about it–
Sure, I want to send this email, as you had said, but how does in the case you gave before the marketing team who’s about to send the email and the site’s down - how does that knowledge, I guess, get to the rest of organization? Does an incident actually have to occur? Or how does this enable more team members to care about the reliability?
Our approach is, within incident management, we have communications, so status pages are key; internal status pages, product related status pages, external status pages… When we say communication, it’s really like even just sending an email to someone. Because when you communicate about an incident quickly, you build trust internally with the team. So for us, every incident that happens through our tool, you automatically get a status page. That can be sent to anyone in your company. It doesn’t require a license. Because if there’s a fire and you need to tell people, tell them. So that’s part of our solution.
And then the other angle that we take is we have a built-in Service Catalog with service ownership, with teams and team assignments. So when you do encounter an incident, our belief is that you should be able to very quickly get the right people to that incident as fast as possible. And I’m a big analogies kind of guy, and we’re called FireHydrant… So imagine you live in Brooklyn, you call 911 and you say, “There’s a fire in my apartment.” They’re not going to send a fire truck from the Bronx. They’re going to send a fire truck from around you. And the way they know how to do that is because they understand your neighborhood, they understand who owns your neighborhood, and the best people suited to get there quickly, and know where the fire hydrants are, right?
There’s so many things that come from service ownership and service catalog that you have to have in place. And we’ve been building that central pillar of FireHydrant from day one. We’ve had a service catalog in our product from the first few lines of code.
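The Brooklyn-versus-Bronx point, getting the team that owns the impacted service rather than whoever happens to be around, can be sketched as a lookup against a service catalog. The services, teams, and fallback name below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    owning_team: str


# Hypothetical catalog: each service has exactly one owning team.
CATALOG = {
    s.name: s
    for s in [
        Service("checkout-api", "payments"),
        Service("status-pages", "incident-management"),
        Service("signup", "foundations"),
    ]
}

FALLBACK_TEAM = "on-call-default"  # assumed catch-all for uncatalogued services


def responders_for(impacted_services: list[str]) -> set[str]:
    """Return the teams that own the impacted services."""
    return {
        CATALOG[name].owning_team if name in CATALOG else FALLBACK_TEAM
        for name in impacted_services
    }


print(responders_for(["checkout-api", "signup"]))
```

The lookup is only as good as the catalog behind it, which is why the ownership data has to exist before the incident does.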
So in a lot of ways when you decide to use a tool like this, in a lot of ways it helps you organize. It’s almost like the exit plan that you see when you walk into a room. Usually, in commercial buildings, when you walk into a certain room, like “Here is the fire exit” kind of thing. It’s the forethought to inform, right? It’s the forethought to say “Okay, this is the service owner, this is the Service Catalog, here is the services we have, here’s who owns those services, here’s how things happen.” So you have to do a lot of, I guess, preparatory stuff, to get the benefits of the software, right? Because you can’t just plug it in and boom, it just works. You have to sort of tell the software who is what, what is where, why it’s there etc, and that gets more and more complex as the enterprise gets bigger and bigger, I’m sure, and as you use certain features. But the point is that to sign up day one, it’s not just “Okay, receive benefit.” It’s “Okay, get organized, tell the software how you organize, and the software informs you based upon incidents” and all those different things with status pages, and informs the right people at the right place, as you had said. Not calling the Bronx, calling somebody actually in Brooklyn, right down the street from you… You know, all that good stuff.
Yeah. And the way that we think – you should have a quick… Like, if you just have a lot of fires going on, and you just need something to reply to them quickly - perfect. You don’t need to use Service Catalog in FireHydrant. We certainly recommend that you do, but if you have an acute knee pain and you just need Advil - certainly, we are great for that. That’s what our free tier is for. And then once you’re on that page, and you have the right tool to get there quickly, it’s about managing those incidents substantially better. And that’s kind of where my example came in, of get people from Brooklyn, not from the Bronx, which I’m definitely going to start stealing that analogy more and more.
Yeah, I like the analogy. It does take some knowledge of New York City, five boroughs; you have to understand that the Bronx is different than Brooklyn… So there is some localized knowledge you could have.
For folks that don’t know, in New York City, even though they’re only four miles apart, it’s probably an hour and a half drive, so…
[56:02] They’re very far.
…you definitely don’t want a fire truck from the Bronx coming to Brooklyn.
Lots of traffic… And yeah, it’s a short drive, but lots of traffic. It’s usually the traffic and whatnot; pedestrians having the right of way, in some cases…
What is it about the state of incident management - is it early innings for this, like a management tool that isn’t internal? When you look across the most successful companies out there today, how many of them use an organized tool like FireHydrant? Not so much FireHydrant itself, but –
…are organized enough to respond to incidents well? I’m thinking more like TAM, total addressable market. I’m curious about that. Is this a big market? What’s the future?
So we have a number of enterprise clients using FireHydrant, and thousands upon thousands of engineers in one company using our tool. And for one particular company that uses us, they had a tool that they built and managed internally for - I think it was over six years. And the reason they built that is because we didn’t exist yet. And so they are certainly on the – crossing the chasm; they’re an early kind of adopter type of company. But if you think about just the scale, that a company with thousands of software engineers is operating at - I mean, that just kind of tells you how pervasive the problem is, that they need to build a tool internally for this.
We have a number of companies that have used open source tools that just were very bare-bones, called incident response, and then switch to us. But to the direct question of how big this market is - I mean, every company that operates software is going to have an incident. It’s not if, it’s when. And they’re going to need a tool to start responding to this.
I imagine we’re going to start to see compliance get in the mix here. I mean, even SOC compliance already asks, “What’s your disaster recovery plan?” And I think that we’re gonna start to see on questionnaires, like vendor questionnaires, “What tool do you use when you have an incident?” Because we already see that question for “What tool do you use for responding to security incidents?” As a vendor ourselves, we see that question. I think we’re going to start to see that more and more, because people are realizing, “Oh, we have a lot of incidents.” And it’s only a matter of time. And people are catching up… This is a big market; we have a lot of people that take interest in FireHydrant every single day, new people, and it hasn’t slowed down, it has only accelerated. COVID has, unfortunately, accelerated that, because everyone’s online even more now. We rely on software more than ever before, especially before COVID. This market is just only going to get bigger; it’s expanding every single day. And the way I think about it is every time someone introduces a new microservice into production, or a new deploy, that’s a more complex system somewhere in the world, and that’s happening hundreds of thousands of times a day, and every complex system is going to need a tool like this.
When you look at the market then, currently, any business with hundreds or thousands of engineers, if it’s when, what’s in place currently? When you look at the market of addressable servicing, I suppose, what is it being used? If it’s FireHydrant at 2%, or 5% - I have no idea where you’re at, but what are people using today? How are they getting by? If it’s when they’re clearly having incidents, things happen, how are these teams managing these things? Is it mostly internal? Is it mostly just ad-hoc? Is it nothing at all? Is it maybe a small sliver using FireHydrant because you’re newer to the game and you’re still gaining market share? How do you break it down?
[01:00:03.12] When we talk to people, they’ll have some tool that tells them about an incident; it’s kind of a smoke detector. It’s like “I smell smoke. There’s an incident.” And that’s just kind of where the value ends of that. It gets you up, right? It gets you up at 2am.
And the handoff from that to incident management is predominantly ad-hoc. There’s a lot of people that when they tell us their stories about how they manage incidents today, it’s like “Well, I manually create a Slack channel, I manually create a JIRA ticket, I manually notify a customer support team in Slack, and doing all that manually, figure out I have to reset my password for status page, and go in there…” We hear these stories, just a lot of ad-hoc, manual, very, very little tooling. And that’s why they come to us. Because we saw that immediate – from 10 minutes of getting to finally being able to start responding to the incident, to 10 seconds. And it’s because of all that manual ad hoc freakout that we see in the market.
We’ve built a tool from where people know where they are today, which is manual, ad-hoc, freakout a lot of the time, to automated respond to an incident, and get to be able to start mitigating the incident faster, instead of doing the bureaucracy of create a Slack channel, notify everyone, create a JIRA ticket… All those things that I just said.
Yeah. The maintenance of that is scary, because as you had said before, back to the hero analogy, as you had said, you have service owners inside of FireHydrant, so you’re sort of identifying these owner stats, as you mentioned. In an ad-hoc scenario, you’re not doing that. So you’re allowing the heroes to be heroes, right? Which is okay. It’s not terrible, but that means that there’s a bus factor there. If that person leaves, then all that manual process and the knowledge of how to do those manual processes leaves with them. And so in a high churn environment, which is engineering, specifically… So if it begins in engineering, but then permeates to the rest of the business world - marketing, legal, whomever else - then that knowledge of how to deal with these things, if it’s ad-hoc, that ad-hocness also leaves. And someone has to relearn, by – how do you learn? By doing. Having more incidents, right? So it’s kind of scary, honestly, in a lot of ways.
You mentioned Honeycomb before, and Christine Yen has been on the show before, and I would call her a friend… And one thing we talked about recently, specifically to their most recent fundraise, was observability for everyone. And this sort of like “for everyone” after it seems like maybe teams currently use observability as the smoke signal potentially. Observability is almost the ad-hoc version of incident management. Because observability is like “Did my CPU spike in production? What’s happening in production? Alert me. Tell me. Let me go there and ask questions”, all that good stuff. How do you see, I guess, since you mentioned Honeycomb - do you use Honeycomb? Is that part of FireHydrant’s tooling? Do you use that behind the scenes? That kind of stuff. And then two, how does observability play into the bigger picture of incident management, and then more importantly, reliability?
Yeah, I think that observability is the vitals of your system, in many ways. It’s the heartbeat of how your system is behaving. So we use Honeycomb for that. And if we feel the heartbeat going faster, or customer pain coming into the picture, that’s one of the tools that we first go to. “Well, let’s take a look at where these errors are happening.” And that’s a good signal into how you respond to the incident. Think about going to a doctor, right?
The severity, right? If the heart beats too fast, you want to raise the signal level of this incident. Like, “Hey, call in all the shots. Everybody’s gotta come, fast.”
[01:04:00.10] Well, potentially. And this is the hill that I’ve been dying on lately, is that CPU at 99% doesn’t mean anything. It’s nothing. But if CPU is at 99% and people can’t check out or load a page or log in or do whatever your system is supposed to be doing, then maybe CPU is a part of that equation.
But if you think about a doctor - you don’t go to a doctor because your heart beats fast, you go to a doctor because “I’m feeling lightheaded, and I can’t think.” So I think that there’s a very, very – SoundCloud has a great blog post on this, where they say you should be basically alerting on symptoms, right? High CPU - that’s a vital, and that’s why they always measure all those things. Whenever you go into a doctor, what do they do? They check your heart rate, they check your blood pressure… It’s the same thing, every single time. And the reason they do that is because they’re trying to correlate to the symptom that you’ve told them, and why you’re sitting in that chair in the first place. So I think that’s the difference that I see.
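The distinction between alerting on symptoms and merely recording vitals can be sketched like this. The metric names and thresholds are assumptions chosen for the example; the point is only that CPU on its own never triggers a page, while customer-facing symptoms do.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    cpu_percent: float          # a vital: useful context, not a reason to page
    checkout_error_rate: float  # a symptom: fraction of checkouts that failed
    p95_latency_ms: float       # a symptom: how slow pages feel to customers


def should_alert(
    s: Snapshot, max_error_rate: float = 0.01, max_latency_ms: float = 1500.0
) -> bool:
    """Page only when customers are feeling pain; CPU alone is ignored."""
    return s.checkout_error_rate > max_error_rate or s.p95_latency_ms > max_latency_ms


busy_but_healthy = Snapshot(cpu_percent=99.0, checkout_error_rate=0.0, p95_latency_ms=300.0)
customers_hurting = Snapshot(cpu_percent=99.0, checkout_error_rate=0.2, p95_latency_ms=300.0)

print(should_alert(busy_but_healthy))
print(should_alert(customers_hurting))
```

In the second case the CPU reading is still worth attaching to the incident, the way a doctor checks heart rate to explain the symptom the patient walked in with.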
That’s just interesting, because one thing we talked about internally recently was this idea… So just to frame it a little bit, we have a podcast called Ship It. That podcast was really born from this once a year, twice a year podcast we did with Gerhard Lazu, who’s been our infrastructure operator and SRE and friend for many years, helping us build out the infrastructure here. And a newer acronym brought into our world has been SLO, which is also, you know, kind of crossing the chasm between observability as well. It’s because you want to have an objective for how the service performs, and do things as a result of that.
So there’s certain things that are near and dear to us… Like, if our RSS feeds are down, if they’re incorrect, so cache missed, or whatever - if they’re down, or if they’re incorrect, then to us, that’s a lifeblood. Our business is podcasts; if the information isn’t getting out, if Apple, Spotify and all the other podcast indexes that index our feeds can’t get the latest/greatest or whatever is most new, to us that’s a fire issue. We need to fix that ASAP.
So there’s particular things that we’re looking at in terms of SLO, and that’s something that is shared in terms of a metric between the observability world and FireHydrant, is this idea to set an objective level for a service and respond accordingly. And that’s why I was asking about observability… How does it go from observability to an incident and incident management? And how are those worlds sort of – will your worlds collide? Will you eventually be an observability tool as well? Will you marry Honeycomb, or will you merge? Is that how things will work? Or do you operate with other observability tools like Grafana, or if somebody’s using a time-series database, or whatever, a homegrown thing… How does FireHydrant play in that world of observability, and then also, how does the idea of SLOs and things like that get into that world?
I’m going to start by saying we’re not going to be an observability tool… That they’re good at what they do, and we are going to integrate with those tools. But to your point around SLOs, I think that’s the touchpoint…
Where you touch, yeah.
Because I think that an SLO is this really, really nice, simple tool… We actually just posted a really amazing update about SLOs on our blog, but I think that SLOs are sometimes, in my experience, still suffering from some of the same problems of reporting on vitals.
[01:07:48.28] An SLO that says “We want our CPU to be lower than 90%.” Like, why? Why is that your objective? “Well, because if it’s over 90%, our site slows down and customers are unhappy.” It’s like, “Okay, that’s your SLO. The site is too slow.” It’s a good joke, actually… [laughs] Sorry, I’m just gonna give myself a little chuckle on that one. Your website’s responding not quick enough, and you have a service level objective attached to that. And then the indicators that go into that, from your tools, like the Honeycombs of the world or whatever else you’re using, give the signal, they give the vitals that can compose of that.
And then on our end, it is way easier if your SLOs are detailed to the point of customer pain, and like a starting point there, because now your incident management gets better. Because now “Oh, the SLO says that–” I’ll just use us as an example… “Incident management is broken in some way, like a runbook step is running on our platform.” Okay, perfect. That’s a great SLO, because it doesn’t – I actually don’t think you should talk too much about tech; I think you should have references to “And here’s a dashboard for it. And here’s where we’ve got everything for it.” Because now in incident management world, I know which team to get on that. I know exactly “Here’s the team I need to assign to this incident. Here’s the severity of it. Here is the potential priority of that incident. Here’s all the recent deploys for that area of code…” There’s just so many other things that you can stem off of really amazing observability and objectives in your system. So I think that’s where I see a really good marrying between the two systems. It’s not to say you couldn’t use FireHydrant without an SLO. Most people do, frankly… But that’s what I see happening in the next few years, too.
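An SLO phrased in terms of customer pain ("the site is too slow") rather than a vital ("CPU under 90%") might look roughly like this, with the owning team and a dashboard reference attached so that a breach points straight at who should respond. All names, numbers, and the example URL are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    name: str
    target: float        # required fraction of "good" requests, e.g. 0.99
    threshold_ms: float  # a request is "good" if it finishes within this time
    owning_team: str
    dashboard_url: str   # a reference to the tech details, kept out of the SLO itself


def compliance(slo: SLO, latencies_ms: list[float]) -> float:
    """Fraction of requests that met the latency threshold."""
    if not latencies_ms:
        return 1.0  # no traffic, nothing violated
    good = sum(1 for ms in latencies_ms if ms <= slo.threshold_ms)
    return good / len(latencies_ms)


def is_breached(slo: SLO, latencies_ms: list[float]) -> bool:
    return compliance(slo, latencies_ms) < slo.target


page_speed = SLO(
    name="Pages load quickly for customers",
    target=0.99,
    threshold_ms=1000.0,
    owning_team="foundations",
    dashboard_url="https://example.com/dashboards/page-speed",
)

sample = [200.0, 300.0, 2500.0, 400.0]
if is_breached(page_speed, sample):
    print(f"Assign incident to {page_speed.owning_team}; see {page_speed.dashboard_url}")
```

The indicators feeding `latencies_ms` would come from whatever observability tool is in use; the SLO itself stays readable to someone outside engineering.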
Let’s go back in time a little bit… I think you’d mentioned your seed round was in 2018… But when did you begin to – take me back to the day you mentioned, when you were doing the course, and accidentally created a company. What year was that? How long before your seed did you work on the software, and what was that initial journey from engineer to potentially future CEO? I say potentially because we know you’re here now, of course, but in that moment, you were potentially…
Yeah. So I think the first line of code was in September 2017, so not too much earlier… But yeah, September 2017, is when the first lines of code were written for FireHydrant, and the commit messages were really beautifully formatted, because I was recording… There was perfectly linked GitHub issues to everything, and acceptance criteria… Because I was trying to be the best engineer I could possibly be, because I was recording it. And I stopped recording, and then punctuation goes to hell in a handbasket, and you have WIP commits in main… [laughs] That’s really when it started, was in 2017, the first couple of lines of code.
What’s funny is now, years later, those comments on those commits don’t even matter. That’s kind of funny, how much effort we put into a commit message. Sure, if you have a team and you need to communicate and commit messages are for your communication, then sure.
But early days on software, just atomic commits… You know, move fast.
There is one commit that I still get ragged on by the engineers that we have on our team now… I think it’s Bobby 318, which is March 18th, I think is the date… And it’s because there’s this just massive pull request that gets merged in, and the only comment on the pull request merge was my co-founder who just says “Oh, God. Approved.”
And that was years ago now, but that one comes up consistently, and there’s a Slack auto-response for it and everything, so… I still get haunted by some of those. [laughs]
How did you go then from “Okay, this idea makes sense…” Because we’ve sort of touched on what’s happened. Let’s go back and let’s talk about what could have happened. So how did you go from “Okay, this idea makes sense. I’m feeling this pain every day. I’m kind of weird in that I like incidents… Okay, let me embrace this a bit. I can write some software on this. There needs to be more management around this.” Maybe even seeing the future of what enterprises will deal with; not the ifs, but the whens of these incidents happening in organizations… And I guess probably just the desire to have more reliable software around you. How did you go from that to the seed round? Because you were an engineer. Did you have a network? How did you attach to venture capital? How did you leverage your network? What was your first step to even being like “Okay, we need money. Let’s build.” How did that happen?
So this is the part where luck is just a huge part of this journey… And I totally embrace that. I actually wasn’t looking for VC money, and I’ll be the first to tell you that FireHydrant – I was going to run it myself on the side; I had signed up for an LLC using Stripe Atlas, and I was fully ready to just run this as a side bootstrapped thing.
A side hustle, yeah.
That was the original game plan. I wanted to try that. And then what happened is - there’s a couple other players in our space, and during the journey of looking into the other players as potential investments at that time, in 2018, our first seed investor, Work-Bench, stumbled on my silly, little side project. And they reached out, they just happened to be in New York City, down the street from where I was living, I could walk there, and I was able to raise a seed round with them two months after meeting them. Because they had already done the legwork, they had already researched this wonderful world of reliability and incidents, and they had a thesis on this idea already formed.
So when I showed them the initial product that was being built in coffee shops and late at night at home, they said “This aligns perfectly with a thesis that we’ve built on this. We’d love to do a seed round.” And I love and hate that story, because you hear all these stories, like, people have amazing ideas all the time, and they can’t raise capital, and then here’s me, having it find us… And I think I was just really lucky and really loved that initial seed. And they remain amazing investors to this day.
Were you scared going in there? What were you thinking? Like, “Oh, my gosh…” Where are you thinking–
Absolutely terrified, yeah.
Absolutely terrified. Did you wear a suit that was too big?
I did not wear a suit that was too big. So their mantra at Work-Bench is “The intersection between suits and hoodies” so I think I wore a hoodie.
Okay, cool. So you were nervous… What were you thinking then? Were you thinking like “Give me –” Do your best to remember how you were feeling in the moment. Not just nervous, in terms of an adjective, but specifically, were you thinking like “Gosh, I need this money. I want this money” or “I want these people involved, because I’ve researched Work-Bench, and here’s the other investments… And I couldn’t be in this.” What were you feeling?
I was feeling – I mean, I kind of alluded to it; I didn’t exactly have an easy childhood… And when it comes to safety nets, just in general, I don’t really have one. So whenever I work, it’s all for me. I have my apartment, and that’s where all my stuff is. I don’t have – for people my age, a lot of people have their parents’ house that they can still… You know, if it really goes badly, they have that. And that’s a dire situation. But when you’re starting a startup, that’s a reality; that is a real thing that can happen, is that it doesn’t go anywhere, and suddenly you don’t have a job, and you are scrambling to figure something out.
So that was on my mind… It was also on my mind of “I’m a software engineer. I don’t have any experience running a company. I’m a first-time founder.” So I was kind of looking into what kind of support do I get with this investor as well. And luckily, they provided the best support I could have asked for.
And I think a little bit on my mind was all of the what ifs. This is a million and a half dollars; that’s the most money I’ve ever seen in my entire life at that point. And what if we built this tool and people start adopting it and we go out of business? What if that happens? Now I’ve not only lost a million dollars, but also these companies in the early days that put their faith in us also lost money. And so there was just all these early-day what ifs, and it just made it really kind of nerve-wracking.
[01:19:59.15] We have amazing customers now, we’re well capitalized, we’ve raised a series B, and all of those problems have gone away. But in the early days, that was on my mind all the time, was that fear of failure.
How many of those what-ifs – I’m just gonna go a little bit psychological with you… How many of those what-ifs can you recall were positive? Because you said negative what-ifs.
I did say negative what-ifs.
Did you have any positives? I’m just curious if you did, and why do you think that is?
Plenty of positives… And I think that I’m working on that, Adam. I’ve gotta be more positive. But I think the positives - we can change how people think and build and deploy software. We have all of the components, in the earliest days, where the puck has been going, since we started. Service Catalog is the pillar. Amazing incident management after that; role assignment. We’ve been building all these things because we kind of thought to ourselves, “This is what we want, as software engineers. Let’s build that thing.”
And it was so cool… I just remember being in this room with co-founders and just thinking about ideas or problems that we had, and solving those. And one of them was – I remember clear as day Dylan and I were trying to figure out, “Well, how do we associate recent deploys to incidents?” which is part of our tool now. And it’s like “Well, what about the deploys that you didn’t think were the incident problem, and you didn’t think to go look for them?” Just ideas like that. Suspect deploys - how can we add that to the product? And all these fun ideas coming into one tool that we have… Thinking about that in the early days was super-fun, and really invigorating. And this is why we got up every day.
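To make the "suspect deploys" idea concrete, here is a minimal Python sketch under stated assumptions. The function name `suspect_deploys`, the two-hour lookback window, and the sample data are all hypothetical; the conversation does not describe how the real feature works. The one thing taken from the discussion is that deploys are not filtered by service, so deploys nobody thought to look at still surface.

```python
from datetime import datetime, timedelta


def suspect_deploys(incident_start, deploys, lookback=timedelta(hours=2)):
    """Return deploys that landed within `lookback` before the incident, newest first.

    `deploys` is a list of (service, deployed_at) tuples. The service is deliberately
    not used as a filter, so unrelated-looking deploys are still listed.
    """
    window_start = incident_start - lookback
    candidates = [d for d in deploys if window_start <= d[1] <= incident_start]
    return sorted(candidates, key=lambda d: d[1], reverse=True)


# Made-up sample data.
start = datetime(2024, 1, 1, 12, 0)
history = [
    ("api", datetime(2024, 1, 1, 11, 45)),
    ("billing", datetime(2024, 1, 1, 10, 30)),
    ("web", datetime(2024, 1, 1, 8, 0)),     # outside the two-hour window
    ("api", datetime(2024, 1, 1, 12, 30)),   # landed after the incident began
]
for service, deployed_at in suspect_deploys(start, history):
    print(service, deployed_at.isoformat())
```

Sorting newest first is a design guess: the most recent change before an incident is often the first thing a responder wants to rule out.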
Yeah. Well, we are years later, many years later, from the series A, the scary moment… Or sorry, the seed round; the scary moment where we had negative what-ifs, not positive what-ifs… And I think the reason why I asked you that question is less to put you on the spot and say, “Come on, Robert, why can’t you be more positive?” but more like – part of this show is I want to share the raw story, especially someone like you, who’s gone from engineer to CEO, and the chasm that’s between those two roles. There’s a lot shared, but there’s a lot from your engineering role that informs product direction and the ability to CEO… But it’s a strange, new world. But I also want people who listen to this show to hear stories like yours and be like “Wow, it’s okay to be scared in that moment. Or to be afraid of the what-ifs.”
I’d be worried if you weren’t… [laughs]
Yeah. You know what I mean? But at the same time, it’s like “You know what - it’s possible” and in a lot of ways it’s representational. “Can I actually do this? Can I kind of luck my way…?” And I think it was luck, and I guess you could say – what is the definition of that version of luck, where it’s like preparation meets timing, or something like that? I’m paraphrasing here, but it’s something like that. But that’s kind of your moment. There’s somebody else, a venture capitalist who thinks about the future of markets, thinks about the future of software, comes up with a thesis, essentially, that matches or mirrors in many ways your thesis, which was written software, of a future where enterprises rely upon a certain piece of software to describe to them when they’re down or up, and how to change accordingly. And to communicate and involve. That seems like a very possible future, and it’s just interesting how you think you lucked it into there, but it’s just more like I guess serendipity… I don’t know.
And I just asked you the question of the positivity side because I just wonder why people – if there’s someone listening to this going to go into their potential series A in the next few months, ask yourself the positive what-ifs, too… Because what if we can change the way enterprises organize around incidents? What if we can learn from these incidents? What if we can actually enable greater tools for future reliability of software?
[01:24:04.07] So those are like three positive what-ifs… Because that’s what you did, right? What if I could go from engineer to CEO and kick butt at it? What if I could hire amazing people to have fun at their job, and to help people create reliable software? There’s a few more positive sides to that, because… I wonder if you’d have had a little less concern or fear going into that seed round meeting had you asked a couple positives.
I think that it’s a great call. I mean, for folks that listen to this and are about to raise a round, have raised a round, or are thinking about starting a company, taking the dive, a few things I’ve learned are don’t measure the world with your own ruler. And that way, you’ll be able to get more excited sometimes. If you have that what-if-bad-things, change it to what-if-good-things.
One of the ways that we’ve done that is we look at just how many people are building and operating software, and the potential to improve their lives. We say at FireHydrant like “We want to make a dent in the universe.” Our vision for the company, our company vision is a world where all software is reliable. That is a monstrous vision. That is huge. And having that vision, that giant thing that you’re chasing, that’s just basically unattainable - that’s where it gets fun… Because you don’t run out of things to do. The effort is the prize. And for us, I get asked pretty consistently, “Well, do you want to be acquired? Do you want to IPO?” and I just don’t think that’s the right level of thinking. I want to build a great company, that has great people, that build, sell, market and support an amazing product. IPOs and acquisitions are a result of that virtuous cycle. So set your sights on a great product, with great people, and you’ll have a great business, and you will have a great outcome for everyone involved.
Those are wise words, Robert. I like that. Those are wise words. It’s challenging to have that perspective, though. Because so often – I think I mentioned Sid Sijbrandij, either on the show or in the pre-call, recently on the show, and obviously GitLab IPOed, and that was a question he got asked a lot too. And I think maybe because GitLab in many ways stood in the shadows of GitHub, and they grew up together. And obviously, GitHub was acquired. So the next obvious question for someone like Sid and GitLab and those who are on the board, were thinking like “Should we get acquired or should we IPO?” And obviously, they’ve IPOed.
But so often is it like, okay, you’ve got a potentially large company here. So the next object is not “Build great company”, because that’s already kind of been done, in a way; you’ve still got to do the work, it’s still the possibility of a good company… But you know, so too often do we minimize it to just simply “Can it be acquired by another great company?” and exit, and kind of stop doing the thing, in a way, like as a founder, or early person involved; in some ways, it’s a new version of the road, but in a lot of ways it’s just the off-ramp, really. It might be a slow off-ramp, but in many cases – I mean, you can look back at all the stats, in most cases it’s an off-ramp to the thing. Right?
An acquisition is usually an off-ramp. A nice payday; maybe it’s a good effort, equity acquired, whatever, or [unintelligible 01:27:33.26] liquid or IPO, which is a whole different challenge.
[01:27:40.16] A whole different thing. And we’re far away from that being an option for us. We’re three and a half years old, and we’re one of the oldest doing this… And that kind of tells you just how early this new category is forming… But again, if you set your sights on – you’re supposed to aim past what you really want. You don’t throw a ball to the glove, you throw it past the glove, that way you get the right power. You aim for an IPO, you aim for an acquisition, and then you just kind of come up shorter than you really wanted… So just hire great people, that build a great product, and then you have a great company. And I will keep banging that drum till forever.
I like that. Did you come up with that, or is that something you heard from somebody? Is that something that was baked in from behind the scenes? I know you said it was your mission, but was that something that you formed yourself, and sort of graduated to the company mantra, or the mission? Or is that something that you heard from somebody else and you’re like “I agree with that, too. I’m gonna bang on that drum as well”?
I can’t say that I heard it said that way. I think I’ve heard the ingredients just as my career has progressed… But one of the things that we also said in the earliest days - my other co-founders and I, we were in a car, traveling around Calgary, long story… And someone - I think it was Dylan that said “I want to build the company that I want to work at.” That’s been a big guiding light of how we’ve formed all of these opinions. And yeah, I don’t know, I think if you focus on the people in your company, and hiring amazing people that you want to work with every single day, that also want to build a great product… I still don’t know where you can really go wrong, in a lot of ways.
What are some of the challenges that you particularly face today? Today, this week… What are some specifics “I’m gonna get off this call with you, Adam, and go…” And maybe it’s not even a challenge. Maybe it’s a triumphant moment. Maybe it’s a meeting that you’re like “This is the next big deal for us. I play a hand in our sales, because I’m one of the faces of the company, and when I show up, things happen”, whatever. What’s something you’re dealing with, challenging or triumphant?
I think right now the largest challenge that I have, and we have, is there’s just so much opportunity, which means there’s so much to do. And as a founder, as a CEO - it’s an old joke, it’s not that funny, but Chief Everything Officer… And I think that’s actually pretty accurate for a company at our stage and my role. So it really depends on the day. I’m on the same team, and I wear the same jersey, and that’s another mantra at FireHydrant… We all wear the same jersey. And if you need me to come into a call for sales that helps you with that - perfect. If you need to interview me for something in marketing, for a website update - perfect. Put it on my calendar. And it’s really just finding – but the challenge in there is the context-switching, like “How are you exceptional at all of those things, one after another?” You go from a sales meeting, to an engineering meeting, to a marketing meeting, in the same day. That’s where it gets challenging. And I’m always looking for ways to continuously improve my context-switching at this point.
I think I asked you before the call how much you prioritize your health, because if folks get to see some clips on Twitter or YouTube, they’ll see your bike behind you, and in the pre-call I said, “Hey, nice bike behind you. Do you ride that often?” You said not in a couple months, but then I asked you, “And how do you prioritize – do you prioritize health, and stuff like that?” So kind of two questions for you. And I think this one’s first, more or less, but you can go into any health aspects that might eek in as well… But how do you then remain focused? What do you push back on?
So in some cases, it’s like “Well, I focus on my health, and I take long walks” or “I live in Brooklyn, so I walk the bridge once a day”, or whatever it might be. I don’t know, whatever it might be. So how do you – given that context-switch and the need to be strong in all those points consistently, how do you carve out time for you? How do you remain focused? How do you know what to focus on?
[01:31:58.13] I think that if you’re not focusing on yourself - this goes for any job, not just mine. Any role, in any company - if you’re not focusing on yourself, you’re not going to be the best at your job. Because health is – I don’t know, we’re people; we need to make sure that we’re number one. We’ve phrased it at the company like Family, Friends, FireHydrant, in that order. And if there was one that started with F for yourself, that would probably be the first one. But… You know, alliteration.
But jokes aside, I try to prioritize – I work out multiple times a week; it’s one of the most important things that I’ve started doing for myself. It didn’t use to be that way. I didn’t grow up that way. I had to kind of force that habit. I do have a bike behind me, and when it gets just a little bit warmer, I’ll go on some bike rides. I took up skiing in the last few years… So I was working remotely and skiing across the country, which - that was amazing, being able to just get out of the house and go do something for me. And probably the silliest thing that I do lately for myself is I have a Google Sheet on my personal Gmail that I – or a personal Google that I call the Activity Buffet. So if I find myself with some downtime, some personal time, or I’m bored on a weekend, I can go to this Google Sheet and I just have a list of things that I could do. It could be ride a bike, go take some photos, just go for a walk. And I actually have a scoring system that I have to hit five every week. So I’m basically forcing myself to go do things that are not work-related, that are Robert-related, at least five times a week.
Yeah. It’s good to be specific about that, because when you get busy, and you’re so needed, and there is so much opportunity, it’s easy just to kind of pull yourself back into work, or the Easy mode, so to speak. Hard mode is actually disconnecting, distracting from the main thing, and sort of taking a break and feeding into some things for you.
We’re getting close to the end here, Robert… What’s the horizon like for you? I didn’t prep you for this one, so forgive me if you’ve got nothing. But if you do, even more amazing. What’s something that people don’t know much about, or nothing at all, on your personal horizon, FireHydrant’s horizon, that you can share or tease here today?
That is – that’s a good one. On the personal horizon - everyone at FireHydrant knows this, I just told the whole company this last week… I’m interviewing for my job every day, every single day. Like, it’s a new job, and on the horizon it’s just gonna be a new interview, every day, at 7am when my alarm goes off, I need to be better than my previous day. And that’s just kind of the way that I’ve been operating for years now in this role. So on the horizon, a new version of me, every single day, until the end of time.
For FireHydrant, we’ve built tools for incident response, we’ve built the tools for incident management, built tools for people that care about reliability… You’re going to see a lot, even more of that. You’re going to see new ways of thinking about old problems that exist in reliability, we’re going to continue to push that service ownership is the future of building reliable systems, and you’re gonna see some pretty badass reliability. It’s just gonna continue to get pretty amazing. And I know that because I still appear at all the product roadmaps and everything, and there are a lot of days where I’m like “Holy crap, I wish I was still a software engineer full-time at FireHydrant, because that looks fun as hell to build.” So I’m really envious of the engineers at FireHydrant, because there’s some really cool stuff coming.
[01:35:58.02] Since you said you’re interviewing for your job every day to get better, is there a day that you think you might step away into a CTO role, instead of CEO role, or hire a CEO to let you get – not so much more into product, but… There’s a lot of responsibility a CEO has; and I don’t want to say MORE into product, because you’re probably still in it quite a bit… But that you don’t have to fully hold the CEO responsibility, which is… Which is a lot.
You know, I’d be lying if I didn’t think about it… I’ve thought about it – or you just think about it from time to time. But I think that it’s very natural, and for folks that are thinking about starting companies - you should be thinking about that. It’s a perfectly healthy thing to be thinking about, is “Am I the right person for a role?” Because then you start to identify your gaps on “Well, you go work on this…” and then you’ll actually surprise yourself, like “Oh, I can actually go get better at this thing.”
But if you look at pretty much every startup that has IPOed in – let’s use tech startups as the example. The executive team is not the team that it started with. And that is okay. I think we should normalize that. I think it’s okay to move on. And I have also told anyone that I’ve ever interviewed for a role at FireHydrant, I don’t even think this is my last job. We want FireHydrant to be a stepping stone on your career ladder, for you to move on to something that’s even better for you. And I take that away from myself, too.
Wow. I guess the only question I would ask you then to be very blunt about it – because that question is for me too, and my company is obviously much smaller than yours, in terms of like we haven’t raised venture capital, we have probably nowhere near the revenue you have, we have nowhere near the headcount you have… Our software is probably just as amazing…
A hundred percent.
The point I’m trying to make is questioning like “Are you the right person for the CEO job?” And I think the answer for me – now, obviously, we have a small company, so CEO is very loosely held, but person in charge might be better for me than CEO. I always say yes, because it was originally my vision, I think I know where I want it to go, I believe I can drive it there… And until the day I truly feel I can’t, can somebody truly do the job better than me, will somebody be more passionate than I am, will they think about the future more than I do? Will they sweat the details more than I will? You know, those are the things that really matter when you get down to it. Because you can be an amazing executive, but can you be an amazing sweat the details person? …which is such a nuance, and such a curated, creative process, that often can only be done by a founder or a co-founder?
Yeah. I have an analogy that I use for this one. My job - my job now, no longer writing software for our tool, or much else - is to pick the mountain that we are going to climb, all of FireHydrant. We’re all going to climb that mountain over there. And it’s the mountain of software, and a world where all software is reliable… And then I’ll know if I’m not doing my job well if I can’t convince people it’s worthwhile to climb that mountain with me, and giving the right supplies to everyone in the company that needs it to go distances. You know, if you want to go fast, go alone. If you wanna go far, go together.
So my role becomes I need to give my sales team shelter, and I need to give my engineering team water, and everyone needs to go with me up this mountain. And if I’m unable to convince people of any of that, then it’s probably time to change roles. And I’m not there. I see a big freakin’ mountain in front of me right now, and I want to continue to climb it. And the people that have joined FireHydrant - we actually had everyone together in New York City just last week, and we knew it was a good team. We knew it was a good team based on Slack and Zooms that we’ve been on for the last two years… But we didn’t know it was this good.
And how fun and how creative our company has been. And that in-person element just really was so special to see “Wow, this is the team we’re climbing this mountain with.” And I’m pretty sure we’re gonna get to the top of this thing with this team behind me.
That’s good, man. It’s good for you and the team to solidify a lot of - not necessarily question marks, but kind of question marks, because there’s something about meeting, hugging, shaking hands, physically being in the same space… We are electrical beings, right? Electrical, chemical beings. And there’s something that happens when you’re in the same room. There is energy transfer, literal energy transfer. And that data point is missing from Zoom. There is a portion of it, but there is definitely a missing component in a virtual setting, which is the physical nearness to people. And the chemical reactions just happen. I mean, it’s just–
We’re social beings, we need that oxytocin. It’s how we succeeded as humans. There was no one human, it was a group of humans… And that’s how you succeed.
You must be on cloud nine though, after that kind of hangout… I know I would be.
Oh, yeah, it was surreal. It was – you’re walking around and you’re like “Wow, this is a lot of people that I’ve never met in-person, ever, and they’re all here to work on this problem as a unit.” And I think that was – that was a highlight. Certainly a highlight, a milestone for me in this role was last week, for sure, meeting everyone.
Is there anything left unsaid? Anything I didn’t ask you, Robert, that you were like “Man, I really wish we could have talked about this”? Anything in closing that I didn’t get out or bring out as part of your story.
Honestly, we’ve covered a lot. We’ve hit childhood, we’re up to today, we talked about teenage years… The only thing that’s slightly interesting, but there’s just no way to transition is I’m a huge marching band geek. So for all of my marching band geeks out there, what’s up?
We’re actually gonna go see a drumline in a couple weeks, which is kind of a marching band, but I guess it’s maybe a tangent to it. It’s DCI. They’re coming to town and we’re gonna go see that. My son is taking music lessons in drums now, so…
That’s what I did. I did Drum Corps International.
Yeah. So we’re going to that. We have similar roots then at least.
[01:42:56.18] I did three seasons of that, and my co-founder did seven seasons of Drum Corps International.
So that’s funny that you’re going to a Drum Corps International show. That’s exactly what I did. My camera is hiding the DCI medals, actually…
Wow…! Okay, so listeners, you’re not seeing what I just saw… He’s got some medals on the wall, and they’re from Drum Corps International. That’s super-cool.
From Marching Band, yes.
Wow. So we love that. I mean, if I’m on YouTube, or somewhere - pick Instagram, TikTok, YouTube - and they feed me, the algorithm feeds me this video for dopamine’s sake, on drum corps lines or something like that, I’m watching it, for sure.
I’m curious what show you’re gonna go see, because the season’s only just getting started here in a couple of weeks.
I don’t know. The place where my son goes to music lessons just shared the email with us. I didn’t get all the details; they’re coming like sometime in June, and they’re “We’ve got to get tickets now as a togetherness. We’re gonna go together”, so like “Yes, put us down for four tickets.” That’s all I know for now. So I figured, DCI - and I’m down for it.
It’s always gonna be a good show. And I have a two-year-old, so he can make a lot of noise there… Because that’s what you do at those kind of places, you get excited. So I’m sure it won’t be an issue for us with a two-year-old, which can be challenging.
I’m certain – yeah, you’re gonna have a blast; it’s super-fun, seeing the athleticism of a marching band. You never thought it was possible. So you’re gonna have a blast here, but yeah - huge part of my life.
Some of the best shows we go to, or some of the best times out of a Friday night football game is halftime, right?
Watching that band, yeah.
Amazing… You know, dueling bands… I didn’t tell you this because I wouldn’t call myself a marching band fan necessarily, but I was in the marching band when I grew up.
So I played the toms, I played snare, I played quads… So I did that growing up, and I loved it. I loved the cadence, I love all those things. It’s cool.
I still have trumpets sitting right next to me…
Well, Robert, it’s been a blast catching up with you on your past, present and your future, and I’m looking forward to it. I’m a big fan. I love having you guys as a sponsor, I love being able to come to this context here and go deep on your story and the details behind creating reliable software, which I think is obviously a much-needed thing in the world, if we’re moving towards the entropy of software, which is just more reliance on it in the future… So why not make it more reliable?
It’s what we’re doing, every day.
Well, Robert, thank you so much for your time. It’s been awesome. I appreciate you.
Thanks so much for having me.