Founders Talk – Episode #95

Builder journey to streaming data platform

with Alex Gallego, Founder & CEO at Redpanda Data

This week Adam is joined by Alex Gallego, Founder & CEO at Redpanda Data, to share his builder journey to create the Redpanda streaming data platform.

Sponsors

Sentry: Session Replay! Rewind and replay every step of the user’s journey before and after they encountered an issue. Eliminate the guesswork and get to the root cause of an issue, faster. Use the code CHANGELOG and get the team plan free for three months.

Square: Develop on the platform that sellers trust. There is a massive opportunity for developers to support Square sellers by building apps for today’s business needs. Learn more at developer.squareup.com to dive into the docs, APIs, SDKs and to create your Square Developer account — tell them Changelog sent you.

Fastly: Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

Fly.io: The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Chapters

1 00:00 On Founders Talk...
2 00:51 Start the show!
3 02:46 Graduating young
4 07:17 Now you're an engineer
5 11:43 Playing with hardware to learn
6 15:50 Three poor kids from Brooklyn
7 18:44 On being a first-generation immigrant
8 23:55 Sponsor: Sentry
9 26:14 Digging into Redpanda
10 33:32 Building for NVMe
11 39:26 Missing on Cloud
12 44:02 The lack of latency
13 46:07 The importance of sovereignty
14 48:05 Sponsor: Square
15 51:00 Infrastructure choices
16 53:23 The state of Redpanda Cloud
17 56:16 Choosing the BSL license
18 1:02:25 Going public is next
19 1:04:27 Hack the Planet scholarship
20 1:08:55 Up next on Founders Talk

Transcript



Play the audio to listen along while you enjoy the transcript. 🎧

One interesting fact about your journey is you’re an immigrant, and you’ve got this sort of like origin story of building, and you’re a builder… Let’s begin there, because I think that surfaces a lot of the things you’ve done at Redpanda - that’s where you’re at now - and what you’ve done to be a founder and a CEO, and to be on this show, and to do all these fun things… But let’s go back, I suppose, and talk about your builder journey, because you’ve been a builder for a very long time.

Yeah, my good friend [unintelligible 00:01:21.23] she likes to say that I was the kid that took apart the television, if you weren’t watching… And so I guess part of my personal identity has always been being a builder. Growing up I was part of my uncle’s – he used to fix racing motorcycles, so like KTMs and Kawasakis and that sort of thing, or dirt bikes… And really super-young, in the summers I had nothing to do, my mom was a single mom, and so she would send me to my grandma’s, and I just took engines apart and put them back together… And it was a lot of fun. And that really sort of inspired me to build things, and in many ways, take things apart, which came on as I applied for college and I was trying to figure that out.

Later on in life I went on to build other types of electronics. I think I was nine when I built my first tattoo machine… And anyways, I ended up selling that to another artist. It was really fun to take these engines apart out of these electric cars, and then put them back together in a different form. That definitely sort of had an impact, especially when I came to the US; the things that I wanted to do were always around building. And in my mind, it was kind of more around hacking; I think not in the physical space, though I still like to do that on my own… But when I was trying to figure out, “Hey, what do I do next?” after – I graduated high school super-young, and so kind of building systems became a good place for me to start to tinker with things.

How did you graduate young? What’s the story there?

Yeah, I’ve been very lucky, but I skipped a bunch of grades; I skipped like third grade, and part of seventh grade and eighth grade and 12th grade. I just got lucky, in many ways.

Wow. So you’re like four years early, basically.

Well, so I started late, because – when my mom put me in school, I started a year late, and then I skipped, because I was bored, and so they’re like “Oh, actually, Alex could do fourth grade.” And then I started doing the papers to migrate to the US, so I couldn’t finish my seventh grade. I got here, I took a placement test, and then I placed into ninth grade. And then when I was there, I didn’t know a word of English. Actually, my first class in the US - it was this history class, I remember it vividly, and I was asking the kid in front of me, I was like “Hey, can you translate?”, which to the instructor seemed like I was challenging his authority, because he kept repeating not to talk to each other. Anyways, 45 minutes into my first class in the US I’m sitting in the hallway. I was like “I guess this is how you take classes in the United States…” [laughter]

Yeah, and so I started – I was like “Well, I have to figure this out, the language thing”, and so I started a reading club. It turns out I managed to accrue all of my credits by the time I was in 11th grade, and… You know, that’s just how it was. It was like a year of study where I would only show up in the morning, then I would go work, doing whatever, and so I graduated early.

It’s interesting what happens when you’ve got a thirst for, I guess, progress, or achievement… I’ve gotta imagine your journey - you felt held back as you started school later, right? But then you were able to progress sooner, and skip around, because you excelled, because you had a desire for it. It’s just interesting what happens whenever you sort of have that tension there in the progress of learning.

You know, actually it was a similar story when I went to college. I graduated in like three and a half years. And the challenge – it actually started being hard, because I went to not a great high school. It was just like – I landed in the US and I kind of self-registered myself to go to high school. I had to figure out something to do. So anyway, so I go to college, and then my math skills and a bunch of other skills weren’t necessarily as advanced; for the high school they were good, but not when compared to my engineering class at the NYU School of Engineering.

So long story short, I started just fine, and I went into a cryptography class, and the reason I kind of dropped out of my graduate studies in cryptography was – I guess through the first few years I just took a class and I was like “This is fascinating.” And then I ended up building prototype systems, got a bunch of awards, and then did so much research that actually my advisors and my counselor were like “Here’s a semester worth of credits.”

Is that right?

Because I got to university.

Yeah. And so we ended up inventing class credit numbers… Because they actually don’t exist in the books. You can’t register for the classes that are registered to my name. And it was really all because I kind of became obsessed with this idea, I was like “How do you make cryptography usable?” And then I ended up kind of advancing that. And so it’s been a few lucky breaks, I suppose, in that way.

[05:47] Yeah. Well, I mean, I’m not gonna say I’m the same, but I get these obsessions. And I always have something. But I tend to obsess, and it’s just interesting - there are certain hobbies where I go super-deep, because I just can’t find all the knowledge. I can’t satiate the yearning and desire to just accumulate the knowledge of this unique thing. I don’t know how to describe it; it seems like you have that same thing, where cryptography, or the engines, or just learning in general, it just seems like you just obsess, in a positive way, over the thing.

You know, the story with Redpanda, and my previous company, Concord - I actually built them all on the weekends. And part of it was sort of – with Redpanda in particular, it was “What is the gap between what the hardware platform could do, and what the state of the art software back then - which was Kafka - could do?” With Concord, the previous company, it was a similar thing. I was like “Hey, what is the gap between –” Back then it was this project called Apache Storm, and I think Flink, which was getting started… I was like “How can I make some of these things –” and it’s the kind of thing that I just think about all the time. And I like it, and I enjoy it, and I’m lucky enough to basically get paid to work on the things that I really enjoy.

Kind of one of those lucky things. And I know that for a lot of people – my partner, she’s a writer, and so she doesn’t get paid well for the things that she likes to do right now… So it’s lucky when kind of those two things get combined.

At what point did you go from cryptography, and school, and earning a semester full of credits etc. to getting to a point where now you’re an engineer; not just into cryptography, and different things like that; you’re building different systems with software. How did you get to that point?

My mind was blown when I went to college. And the caveat was I was the first one in my family to go to college. And it was blown on two dimensions. One, I was never really challenged until I went to college. I think school was easy. And then two, I met this person named Joel Wine, who was part of the distributed systems world. Anyways, he was part of some of the early work on TCP congestion control protocols, and it blew my mind that this one person could actually help shape the way the world works today.

But to me, the best part about that experience was like “Well, he’s just as smart as my friends.” I was like “Why can’t I do this? Why couldn’t we change how the world works?” And a lot of it really just came down to trying. And so that was a big fundamental departure; I was like “Cryptography is fun”, but in my head back then I was like “The world is broken. I want to help fix it, and so I’m gonna go build systems.”

And so I actually left cryptography, went into distributed systems, went into a fast-scaling startup in New York for ads; it’s called Yieldmo. Anyways, back then we used to own all of the mobile traffic for Forbes, MSNBC, Reuters etc., because frankly, we were better than Google at the time, and so as a young grad, the CEO was like “Hey, how about you compete with Google?” I was like “Yeah, that sounds like an awesome challenge.” Plus - why not? So that’s when I sort of got started into building super-high-volume systems with low latency etc.

So my experience is now I guess 13 years into streaming, but it really came from this ad tech background where you’re serving literally millions of impressions, then there’s like a 10x fan-out in the backend with a bunch of pipelines, and things like that… And Concord was a compute framework, and then Redpanda became the storage framework.

Streaming really is the combination of a little bit of compute, and storage; you sort of chain it together, and then at the end you have something useful. When I sold Concord to Akamai, I couldn’t find a storage engine that was able to keep up with the volumes of data I was trying to push. And that whole experiment really started in 2017. I became obsessed, just to go back to the earlier point, with “How fast can you push hardware?” I just wanted to understand “What can the hardware actually give software in 2017?” And the answer to the question was that a 20-microsecond round trip between two [unintelligible 00:09:57.06] was actually the state of the art.

[10:02] I remember there was a vendor that came to Akamai, they were selling [unintelligible 00:10:04.22] for 22-microseconds round trip at the tail latencies. And I built this product, but it was like 26 microseconds. And so chasing that performance edge was the humble beginnings of what eventually became Redpanda in 2019.

Yeah, that’s – did you say microseconds? Is that less than milliseconds?

Yes. One microsecond is one thousand –

Break that down. How fast is that, really?

So I guess a nanosecond is about like a page - like an 8 1/2 x 11 page; the length of a page is a nanosecond - and so 1,000 of those is a microsecond. So if you align those pages, 1,000 of them, that’s about a microsecond, which I guess in instructions is a few million, but in terms of really like a program execution time it’s really a tiny, tiny fraction. For context, where high-frequency trading is today is mind-blowing; it’s about 800 nanoseconds, 700 nanoseconds, when the logic is embedded into the hardware itself, into like an FPGA. Right after that, you have one microsecond, two microseconds, that sort of thing. That still for some users is high-frequency trading. The RPC mechanism, that was with plain hardware that existed in the networks of Akamai… But what it did is it went directly from the NIC into the program userspace, and it bypassed a bunch of complexities that we tend to create as a way to facilitate programming models on the hardware. Long story short, it’s super-fast.
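To put those time scales side by side, here is a quick back-of-the-envelope sketch in Python; the figures are just the rough numbers mentioned in the conversation, not measurements:

```python
# Rough latency scales mentioned in the conversation, in seconds.
latencies = {
    "FPGA high-frequency trading (~750 ns)": 750e-9,
    "kernel-bypass RPC round trip (~20 us)": 20e-6,
    "typical 'fast' network request (~1 ms)": 1e-3,
}

for label, seconds in latencies.items():
    # Express each value in nanoseconds, microseconds and milliseconds
    # to make the thousand-fold jumps between units visible.
    print(f"{label}: {seconds * 1e9:,.0f} ns = "
          f"{seconds * 1e6:,.2f} us = {seconds * 1e3:,.5f} ms")
```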

Yeah. Do you mind if we dig deep a little bit on the hardware side? I’m just curious how did you come to play with the actual hardware? Did you actually – were you in the data center? Did you rack and stack - not so much like service to demand, but how did you… Were you in some sort of like workshop and you just had a couple super-fast NICs directly attached to a CPU? Give me an example of you playing with the hardware to push the hardware as fast as it can for the software’s sake.

Yeah, I have like three great stories. So one is, personally, after I sold Concord, I actually just bought a little rack of three single-U servers, and I put it in my Miami apartment, and then I just plugged it into the wall. And I needed that mostly to have access to profiling tools and kernel settings that are really hard to get in any cloud, right? I needed to change those, like let the kernel turn on the Intel metrics of this particular CPU, so that I can track –

You needed direct access to the BIOS, and stuff like that. You needed full control, yeah.

Yeah. Cache line access, cache line misses, latency between them, that sort of thing… And so there’s a project called PMU Tools that allows you to measure much deeper, especially for an x86 processor, like whether you’re top- or bottom-heavy. Basically, do you bottleneck on the instruction decoding, or on the instruction execution, or on retiring instructions…? There’s kind of a whole complexity on that. And so I bought hardware to try and understand just the basics of that. And now, when I got into NICs, I didn’t have access to the kind of NVMe devices; they were very expensive. Akamai obviously had access to that, and at the time I ended up working for Akamai Labs. And so they had 15,000 computers, and it wasn’t very hard to call the lab person. I was like “Hey, can you plug in this cable?”

“Can I get one?”

Yes, exactly. [unintelligible 00:13:29.06] computers. And so I wrote it – I actually open sourced a part of this work under a project called SMRF. That is like this RPC framework using FlatBuffers; I patched the FlatBuffers compiler, I wrote a code generator for it… And it was really honestly just for my internal knowledge. I never intended, at least back in 2017, to build a company; it was really just trying to understand “How do things actually work?”

That’s so interesting, how that begins with “How does this work?” Because we kind of got here to some degree through this description of obsession, these micro obsessions, at least in my case, and like you’ve built the company, I haven’t… But it’s just funny how you can sort of be super-curious, be willing to throw some money at hardware, plug it into a wall in your apartment or whatever, and then whatever access you had at Akamai to get super-fast NVMe storage to play with… Whatever that is. It’s just really interesting how you just pursue that deep curiosity, and out the other end, something valuable comes out.

[14:31] Exactly. Well, in 2019 the funny part about this whole story is that – so I’m in Miami, and I leave Akamai in I think late December 2018, or like January 1st 2019. Anyways, January 1st, for all intents and purposes. So I spend the month in my cave, which is really like my office. I’m almost sleeping there; it’s just like “Oh, there’s this idea that I just – I need to write it.” And that’s what it is to be an engineer - it’s so funny, because it’s just Emacs, and GCC, on like a Fedora, whatever it was at the time, and it was just like “I’m just gonna write it.” You don’t have to ask for permission for that kind of stuff, you just do it.

And then I migrated to – I guess I moved myself to San Francisco, because I didn’t have a job. And it was fine, I was just trying to figure this out. I moved here, and honestly, it took off. The developers really loved the product. There was this need for something simpler. I think as a practitioner sometimes managing existing streaming systems felt like distributed systems whack-a-mole, where you’re just kind of trying to whack the systems into obedience. You’re like “Okay, I have to remove the znode in ZooKeeper. Why is my cluster crashing?” kind of thing.

So I was like “Okay, something’s gotta be simpler.” So I moved here, and the rest of the company kind of took off from there.

Right. And here is San Francisco. Okay, cool. You sold Concord to Akamai, you didn’t have a job… Were you independently wealthy? Did you make out well in that transaction? Was it positive for you? I’m sure it was, so…

So for context, we were three poor kids from Brooklyn before that. That is the truth. I knew how to write C++, but largely, I was living in Brooklyn, and I was broke. We built a really cool system, it was actually used by some of the largest financial organizations, and then Akamai were kind of in the stocks for it… It wasn’t like a huge, hundred-million-dollar kind of thing. It was a small exit, but it was enough to make a big difference in everyone’s lives, or at least for me. I think there was a small return for investors; it wasn’t like a big return. But we didn’t lose money, we all made a little bit of money, and… Anyways, for me at the time it was important. I graduated with student loans. Even though I had a full scholarship in terms of tuition, I still had to pay for room and board, and get loans for that. So that was like a pretty important step I guess mentally for me to just like “Okay, I don’t have any debt” kind of thing. So it wasn’t like tens of millions of dollars, but I did have the flexibility to move without needing to have a job at the same time.

Yeah, what I’m getting at really is the specifics less so, but more so the buffer it creates. And part of this show is just to not so much – it’s kind of to chronicle somebody’s story like you, but also to sort of share a path that is possible. So you had an early exit that gave you buffer to essentially camp out and be curious. You didn’t have to worry about earning money. I’m sure you were temporarily financially independent, even if it’s not long-term, right? You had some buffer; that’s what I mean by the buffer - you had this margin to play with. And within that margin, you just didn’t go and sip MaiTais and sit by the pool… Not that that’s a bad thing. Maybe you did it for a day or two, and you were like “You know what, I’m just going to recharge, I’m going to regain my margin, I’m gonna rediscover myself, do a little retrospective, but I’m gonna get back to work, because I have curiosity to pursue.” But it’s this buffer that that moment created for you, that enabled that next leap. And that’s kind of what I was trying to figure out there, is how much did that acquisition and all that play into that ability to create that buffer to get you where you’re at now?

[18:09] It did, it gave me a buffer, but I think maybe more importantly, it allowed me to be more ambitious and dream a little bigger. And I know it seems cheesy, but the context is I am a first-generation immigrant; you kind of have to take baby steps sometimes. It’s like, okay, one step, and then the next one… And so when I created Concord, I just – I’m not sure I was kind of ready for the impact, and maybe the breadth that Redpanda has now. And so having that buffer really allowed me to dream bigger… And it’s cheesy, but it’s true.

What does it mean to be a first-generation immigrant? What is that? Break that down for me?

You know, there is no backup. There is no other place to go. My mom didn’t obviously have the means to convert Colombian pesos into US dollars and support me if I fail. And to me, for my context, it means “I have to figure this out. There is no alternative. I either figure this out, or I’m on the street.” There is no buffer. There is no generational wealth. I don’t have a cousin in the US that is like “Oh yeah, why don’t you come and crash on my couch for the next six months? I’ll pay your food, and you’ll be fine.” That didn’t exist. You do it because you have to, and there’s no alternative.

Was that like a daily reminder to you, or was this sort of like an undercurrent to your life? How did that manifest daily to your hunger?

I think later in life I probably should have more backup plans… But largely, for the majority of my life, I had this one thing that I kind of became obsessed with, and I don’t really have an alternative; in my mind, where I sit today, Redpanda, the product, will be successful, and if the product is successful, we’ll make money with the company, and it will happen. To me, I don’t see any alternative for that not to work out.

At this point in my life, even if it’s not the financial success, the obsession is – to me, I see an opportunity for changing how developers actually build applications. And for better or worse, I don’t know, I don’t have any backup. There’s nothing else. And maybe for the bigger parts of my life that’s how I operate. And so it’s not so much an undercurrent as more of like a mental framework. I was like “We will figure this out somehow.” Maybe not with the same impact, or reach, or speed, or whatever. Something will change because that’s how life works out, but I think it’s fundamentally how I approach a lot of things, and it’s worked out well for me. I know it’s probably bad advice for a lot of people, but so far it’s worked out for me.

Well, I’m not a first-generation immigrant, but I can compare my story to yours, because I didn’t have a backup either in my journey. I grew up poor, my dad died when I was young… So a similar – single mom kind of thing; a lot of similarities in our story, except that I’m not a first-generation immigrant, but there’s still that… I didn’t have a backup either, so I can totally empathize with that with you, because that was a lot of my story. It’s just interesting how that works out, but…

Yeah. And I wonder if you found – I guess where you were growing up, maybe there’s like subcultures, I think, in the US. For me, it was skateboarding and punk rock.

I skateboarded. I was a straight edge when I was in high school, you know?

Oh my God, that’s amazing. Listening to probably like Minor Threat, and some of the other [unintelligible 00:21:37.13]

Yeah, exactly. That’s amazing. I wasn’t in the straight edge, but I did go to a [unintelligible 00:21:45.16] punk rock, and…

I don’t know if I was really straight edge, but I said I was. I mean, I was on a journey to grow up, I was an adolescent, so I was anything I needed to be, I suppose… But I was straight edge, I skated, I had Etnies, I had duct tape on my shoes, because I skated… My shoes were torn up, I had long hair… I was way different. Way different.

[22:06] Yeah. Well, the culture that this provides is sort of this – there’s like this sense of independence, I think, in that subculture, that was really powerful for me, at least when I was a teenager, just trying to attach myself to something… I was like “Well, whatever. These people may not be the same background as me, but we sort of identify with the same struggles.”

Struggles, yeah. Exactly. That’s the thing with humanity - while you may have a different background and a different journey, and I have a different background and a different journey, we have very similar struggles, despite being so different in our backgrounds and upbringings. That’s the beauty of humanity - there’s so much connectedness, while also being way different. And that’s the thing I love to bring together when it comes to animosity: “Where do we have similarities, less the differences? Are your struggles somewhat like mine?”, that kind of thing. And then that’s where you sort of empathize and connect and unite and come together.

We have people in 22 countries right now, and the fascinating thing to me is – and this thing sounds truthy, but it’s a different thing to live it and experience it, which is, you know, most people just roughly want kind of the same thing. They just want to be successful, they want their kids to be fine, and go to a good school…

How simple is that, right?

Honestly, the basics. And to me – like, they sound true to you in the abstract, but when you actually get to experience it, it’s like “Wow, it actually is exactly like you would expect.” You know, there’s a bunch of good humans, and the struggles are very similar across the board.

Let’s dig deep into Redpanda. I want to dig into the actual tech, the inception of it… I’ll describe some of the, I guess, overarching story, and we can sort of poke holes in it as necessary. So you were a fan of Kafka, you were inspired by that team, but yet you compete in terms of obviously mindshare and market share; while Kafka is open source, Redpanda to my knowledge is not open source… So I’m just curious about all of that, between the obsession on that weekend, I’m sure, or maybe several weekends, to create the storage engine… Take us into that journey even further to Redpanda.

Yeah. So when I was operationalizing a bunch of the Kafka clusters - and I had been a user basically since like ZooKeeper 3.2, and the first Kafka releases, 0.7 or 0.8, or something like that, a long, long time ago… And so I’ve always been a fan of the programming abstraction that it gave developers, and the ecosystem that it developed. But it was really so hard to operationalize; really, just like “Okay, how do I make this thing stable, so it doesn’t crash on me?” And you know, probably things have gotten better. So to me, I was inspired by them.

Now, there were multiple sort of deep technical approaches that changed the way streaming works. To give credit to Pulsar, what Pulsar did - the background of Pulsar is that they came from Yahoo, and so it was all about this disaggregation of compute and storage, because that’s how Google published the papers; it’s like “This is how we get to scale.”

[27:58] And so Pulsar - and I want to relate it to Kafka in a second - they really pushed the disaggregation of compute and storage. They’re like “Hey, storage is S3.” And I was like “Ooh, that’s a really good idea.” And then Kafka, in my mind what they did is that they built this massive ecosystem. And so for Kafka, you get connected to TensorFlow, and Spark ML, and ClickHouse, and Flink etc. Basically just about every single data framework somehow, if it does streaming, it connects to the Kafka API. So it sort of became this lingua franca.

And so to go a little deeper, when I started Redpanda, I said, “Hey, how do we evolve this idea of streaming?” You know, I’d been working in streaming for a really long time… So when I went to storage, I said, “What Kafka has done well is the API.” And mostly it’s about the ecosystem that it brings with it. And what Pulsar has done well is that they adapted it for this cloud-native model, where you disaggregate compute and storage.

So with Redpanda - I’m going to build a new storage engine from scratch, and the summary of that is that sometimes you get to reinvent the wheel when the road changes. And if I were to start from scratch, what would be the three primitives? Keeping in mind that I wanted to use those two ideas of disaggregation of compute and storage, and the Kafka API.

And so the three tenets were: you had to be super-fast, because people are not going to move if you’re 1x, or 2x, or 3x. You have to be 10x, with published benchmarks, where it shows like 70x with 50% of the hardware at a gigabyte per second.

Two, it had to be the best developer experience. And the way I envisioned that as an engineer is “Okay, if I wanna deploy this system, the simplest thing I could do is put everything in a single file”, and then you put that one file on 1, 2, 3 computers, and you’re done. That’s the deployment model. And then the last one is that it shouldn’t lose data. So we use modern replication protocols like Raft to give the developer actually a sound understanding of what it means to have two out of three replicas.
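As a rough illustration of the “two out of three replicas” point, here is a minimal sketch of the majority-quorum arithmetic that Raft-style replication relies on; it’s a simplification for intuition, not Redpanda’s implementation:

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a quorum."""
    return replicas // 2 + 1

def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while writes stay durable."""
    return replicas - majority(replicas)

for n in (3, 5):
    print(f"replication factor {n}: quorum={majority(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
# replication factor 3: quorum=2, tolerates 1 failure(s)
# replication factor 5: quorum=3, tolerates 2 failure(s)
```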

So that was sort of the napkin sketch when I started Redpanda. And I really need to acknowledge the work that the entire Kafka community has done, which is the reason why we don’t have any clients of our own. I actually think that the ecosystem needs to thrive on the Kafka community. I honestly see Redpanda the company being a good competitor that helps both products get pushed in the design space, in ways that would be challenging without competition. And I could talk about specific examples where we pushed that - the container startup time for a developer, for instance. I know that the presence of our container being the fastest has definitely pushed other companies to make changes upstream, so that the developer experience is awesome, or the data safety story is better, or the clustering story is better. And those are the things where I see actually having competition being great for the developers. There might be friction, I think, between competing companies, but as far as the technology is concerned, actually it’s largely complementary. I think that together Redpanda and the Kafka community - we make streaming easier and more accessible to developers, which is ultimately what we’re both trying to do.

This is an interesting space to be in. I mean, obviously, data is the new oil, and translating it, moving it at the speed of whatever it needs to be is paramount. What are some of the clients, the users of Redpanda? What are they doing? What are some of the actual use cases in the real world? How does it translate?

Yeah, StoneX is a really good example. They are a Fortune 100 company. They basically do their hedge trading through Redpanda; they did a YouTube video on it too, so people can look it up after the podcast… But what’s cool about it is that the average latency is four milliseconds, and it’s predictable. So to talk a little more technically, one of the things that Redpanda has done really well is that we spent an inordinate amount of engineering time, money and effort in actually making sure that the predictability of the software is there all the way up to the max of the latency spectrum. Once it runs at a particular load, it’ll stay there; there are very few spikes. It’s really easy to understand. And so for a company like StoneX, that predictability is really the thing that matters, and the fact that it’s fast. So that’s one.

[32:13] Two, Lacework is an interesting one. So they do security event detection, and any sort of triggered alerts - it’s a very sophisticated company. And so they run at 14 gigabytes per second of writes on top of Redpanda. We have others that kind of push the envelope… So that’s more - I would call that maybe classical eventing architectures.

Some new things that Redpanda is enabling, that can only be done because of the latency improvements, are things like space exploration, or electric cars; we have one of the largest electric car companies in the world using us internally. We power a few space exploration companies, where people actually ship satellites to outer space, and part of the thing that is orbiting Earth is powered by Redpanda, which - I find that fascinating.

Oil and gas pipeline jitter. So if the physical pipes wiggle a little bit too much - which sounds absurd when you first hear about it - it goes boom in physical space, and you really don’t want that. And so what’s been fascinating is that the classic use cases we do as well, but with the new things that we enabled, we’re honestly just sort of starting to scratch the surface.

And it’s because of latency, really. Like, if you can get faster and more real time, or you can do predictable four-millisecond latency, like you said, and you put all the effort in the engineering to sort of make that a norm, versus not a norm, then you can probably do a lot more things.
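To make the “predictable four milliseconds” idea above concrete, here is a minimal sketch of summarizing a latency distribution; the sample values are invented for illustration:

```python
import statistics

# Hypothetical per-request latencies in milliseconds.
samples = [3.9, 4.1, 4.0, 4.2, 3.8, 4.0, 4.3, 4.1, 3.9, 4.0]

p50 = statistics.median(samples)
p99 = statistics.quantiles(samples, n=100)[98]   # 99th percentile cut point
worst = max(samples)

# A "predictable" system keeps the p99 and max close to the median,
# rather than showing occasional multi-hundred-millisecond spikes.
print(f"p50={p50:.1f} ms  p99={p99:.1f} ms  max={worst:.1f} ms")
```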

And as you’re talking about this, and I’m reading more through the notes here, I was looking at just the fact that NVMes were not a thing a decade ago. And you’d mentioned the beginning of this was like 2019, roughly, right? January, sometime in 2019, I think… So NVMes were up and coming, I would say; not so much – they’re ubiquitous now, right? If you build a new machine, you’re gonna put an NVMe as the primary store, on a personal computer, a gaming PC, for example, or even a server in your home to store a lot of data. I’ve gotta imagine at the data center level with Akamai, and whatever - like, that became the absolute norm. Was Redpanda designed in a world where you can do 7,000 megabytes per second reads and writes to a disk? Is that – Redpanda was built in that world, versus drives that spin, for example - spinning rust?

Totally. That is like a fundamental rearchitecture of how we approach the problem. As a programmer, honestly, when you start a project, you have a few – just not that many variables. Like, what’s your memory model? What’s your threading model? What’s your concurrency model, which is related to threading? Once you make those decisions, a lot - like a vast majority - of decisions are on the programming language. But there’s a huge category of decisions that are already made. And so when you look at hardware and the bottlenecks, a lot of storage engines were actually built for the clock tick of the spinning disk, which is - for those that ran Windows, and right-clicked C: and hit Defragment, you would hear these AOL dial-up noises on your hard drive, because it was actually moving pages, and the spindle head was actually moving things, and rewriting the bits… So in terms of numbers, to make this numerical, there were high double-digit millisecond latencies sometimes, when you had a little bit of contention. Now, on this computer I’m doing low single-digit microseconds. And so when you get 1,000x performance improvement - you can never eliminate essential complexity, full stop. Like, if you need to send data to two computers, you need to send data to two computers. And so complexity simply shifts around, and it moved away from saving a single page to disk into CPU coordination, which is “How do I keep, on Google, for example, these 225 cores busy? How do I do it?”

[35:49] And so we started with a fundamentally new threading model called a thread-per-core architecture, where every thread looks like an independent computer, more or less. FoundationDB actually observed similar things a long time ago; they sold the database to Apple. So this idea of a thread per core really allows us to do two things. One is remove bottlenecks on the threading, and it gives us a simpler programming model. But ultimately, it allows us to extract every ounce of performance out of the hard drive. And modern NVMes, by the way - you could drive about 1.1 gigabytes per second of sustained reads and writes to a particular NVMe. They are so, so good.
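One rough way to picture the thread-per-core idea is that each core runs a single-threaded shard that exclusively owns a slice of the partitions, so the hot path needs no locks. This is only a conceptual sketch in Python; Redpanda itself is built on a C++ thread-per-core framework and works very differently under the hood:

```python
import queue
import threading

NUM_SHARDS = 4  # stand-in for the number of cores

class Shard(threading.Thread):
    """One single-threaded worker that exclusively owns its partitions."""

    def __init__(self, shard_id: int):
        super().__init__(daemon=True)
        self.shard_id = shard_id
        self.inbox = queue.Queue()   # the only way to talk to this shard
        self.partitions = {}         # partition name -> list of records

    def run(self):
        while True:
            partition, record = self.inbox.get()
            # No locks needed: this thread is the only one touching its state.
            self.partitions.setdefault(partition, []).append(record)

shards = [Shard(i) for i in range(NUM_SHARDS)]
for s in shards:
    s.start()

def produce(partition: str, record: bytes) -> None:
    # Route every partition to a fixed shard, so ownership never moves.
    shards[hash(partition) % NUM_SHARDS].inbox.put((partition, record))

produce("topic-0/partition-1", b"hello")
produce("topic-0/partition-2", b"world")
```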

Oh, is that right? Particularly, yeah, you have access to more advanced NVMe’s than I do then… So I was thinking 7,000 megabits per second, but that’s certainly a lot more.

Yeah. And by the way, the latency is amazing… Which is why when we went to market I said “You no longer have to choose between speed and data safety.” You could get both. All you have to do is like reconstruct the software in a way that actually can take advantage of what the platform is. And so a big part of our story is like you no longer have to choose, for most use cases, right? Like, you could get both data safety and performance.

Interesting. Yeah. Wow, that’s interesting. You really look at the whole entire model differently. Since your disks are so fast, there’s almost zero latency, and your CPU is essentially probably bored if you’re not pushing it far enough, to rearchitect to really take advantage of, as you said, 200+ cores on GCP to give you… That’s interesting, the thread per core idea as well. All of that is just fascinating, because you looked at the problem differently, and you didn’t start with software, you actually began with hardware.

Yeah. And what that allows us to do, which we’re super-excited to share, is this idea of “Well, what else could you do?” In terms of even the – in plain terms, with two partitions, in the Kafka parlance… So for those that are new and listening to this podcast, Redpanda adopts the Kafka model, where topics are unordered collections; topics are broken down into partitions, and a partition is a totally ordered collection. And so I say this in that once you have two partitions, you can saturate basically the storage device, more or less.

And so now, what we could do - and most people don’t run this, right? Most people have like 10,000, 5,000, 2,000 partitions in a particular cluster. They’re like a mental model for data replication. But what this allows us to do is it gives us the freedom to explore different computational models. It’s like “How do we continue to make this easy?” WebAssembly was a super-interesting experiment that we’re looking to invest more money and engineering effort in this year, because you’re like “Hey, what if you do –” I’m not talking about Flink-level stateful aggregations. I’m just talking about simple data masking. It turns out, when you ask a practitioner “Hey, what do you do with the JSON object?”, they’re like “Oh, I remove the social security number, I add xxxx, and whatever. And then I erase it, and I give this IP address like a credit score.” And that’s a really big bulk of streaming, and so WebAssembly was an interesting experiment for us, where it’s like “Well, what if we just embedded it as part of the product? We have the CPU cycles, so what could we do with them?”
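To give a sense of what that kind of per-record masking looks like, here is a minimal sketch of the transform described above, written in plain Python rather than WebAssembly; the field names are made up for illustration:

```python
import json

def mask_record(raw: bytes) -> bytes:
    """Redact sensitive fields from a single JSON event before it moves on."""
    event = json.loads(raw)
    if "ssn" in event:
        event["ssn"] = "xxx-xx-xxxx"      # mask the social security number
    event.pop("credit_card", None)        # drop the field entirely
    return json.dumps(event).encode()

original = b'{"user": 42, "ssn": "123-45-6789", "credit_card": "4111..."}'
print(mask_record(original))
# b'{"user": 42, "ssn": "xxx-xx-xxxx"}'
```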

And other things, like pushing data into a tiered storage, in a different format, that is not just row-oriented, but columnar-oriented, with things like Apache Iceberg, and so on. So having this rethinking allows us to just play around with more ideas to continue to lower the barrier for people to try streaming.

Yeah. How does cloud play into this? This is one of the misses you said that you kind of had, and I think – where are you at today with cloud, and how did you miss?

Yeah, so I wish I had launched the cloud earlier. It’s like a big lesson learned for me. I wish I had partnered with a talent organization earlier, to kind of recruit for cloud faster. When we went to market, the context for that is we went from zero to the largest known Kafka workloads literally in the first year. And so we were just hitting the scalability bugs in the first year, and we were like “How did we not hit this in our testing?”

[39:57] We had a social media company in Europe, where they were pushing a gig per second on the Kafka cluster, and they moved to us and they’re like “Oh, hey, can we push 10x?” And so they went to 10 gigabytes per second. I was like “Ooh…” Literally, it’s like the first year of selling the product.

And so we made the decision like “Well, we are a data store, and we can never lose data, and so if I had to make a decision, I’m going to make a decision to do right, and I will fix all the bugs. You paid us money, and I feel like part of my integrity is on the line, so I will do right and fix it.” And so when push came to shove, I prioritized stability, and basically being loyal to the existing customer base.

An alternative, by the way, which - this is like the debate, and I actually do think it’s a miss in terms of adoption… I could have just said, “Hey, maybe this is all just cloud-only to get started, and once we hit this particular scale, then we offer it in a different self-hosted market.” So that was amazing. I think I could have done A and B if I had actually partnered with a talent organization earlier. I brought – Jae is now my executive who runs my talent organization, and if I had her join earlier in the life of the company, I think we could have delivered both A and B sooner.

You know, we can always say woulda, coulda, shoulda, right? I mean, we look back, and I can appreciate you saying that’s a miss, and even being vulnerable and sharing that… The right answer sometimes when you have A or B is “Why not both?” But realistically – you can say that in retrospect, with rose-colored glasses, but realistically, in the moment, you probably needed focus to give yourself some credit. You probably needed to batten down the hatches and make sure stability was there, so that you can honor your promise etc. And I just wonder if that’s really true, if you could have done both. Could you really have done both? I mean, what would have enabled both, really?

Yeah, but I think you have to pick your battles, and I guess I struggle with that a little bit. In my head I was like “Well, why not both?”

But you seem like you hold yourself to a high standard, so that makes sense.

Yeah. You know, strategically, one of the best things that cloud has allowed us to do and explore is this idea of data sovereignty, which is this totally new idea for cloud products, that in my experience only started to become popular really in the last six months. And with cloud – okay, so part of the reason why this was late, just for a full context for the audience listening, is that we did launch a cloud, and we actually onboarded public companies in the cloud, so it actually did work, but it wasn’t the architecture that I had in mind. I was like “This is not the future of cloud.” And so we threw it away, and then we wrote a new cloud. [laughter] So maybe the context is that we did end up writing it, but we just wrote two clouds before we called it GA.

And so on the second version that we just launched, what it allows us to do is this idea of “Bring your own cloud.” Still fully managed. But the hard drives, the data that Redpanda writes - those hard drives live in the user’s VPC, but the control plane is in our VPC. Now, from a user’s perspective, it’s still a fully-managed SaaS, just like Mongo Atlas, or something; you go in, you click-click, deploy a new cluster, in your region, in your cloud etc. That all happens exactly the same. But the user owns that data. And what’s interesting in that is that at least the computational model that we define is unified for both dedicated, and this idea of BYOC.

I think that privacy is easier to achieve than data sovereignty. For some industries like insurance, or marketing, if they add a new vendor, they have to send an email to like 10,000 customers. They’re like “Hey, we’re adding a new vendor. These are the risks” etc. And so we de-risk all of that, because they get to run their code, the Redpanda data plane, in their VPC, and we simply do all of the management. It’s all automated. And so that cloud was really what took us two years to build. It’s a new version that gives us the flexibility to either run the data in your cloud, or in our cloud. I think that’s probably why it took so long for us.

[44:01] Yeah. Does Redpanda live in GCP? How do you obtain, I guess, the lack of latency between clouds if the control plane is wherever you host it, and I’ve got my NVMe’s in a separate space. How does that play out?

Great question. So Redpanda, remember, is a single C++ binary. So that C++ binary lives inside your VPC. But in addition to Redpanda, we launched a small proxy agent, and the proxy agent is the thing that gets commands from the control plane. So you can actually think of it as a command database. I call these control plane databases, but the idea is that on the UI - let’s say you say “Hey, deploy a cluster”, or “Expand the node.” So there’s a command that gets shipped to that agent, and then the agent, asynchronously, on its own time, will expand the cluster. You know, this happens in like a milliseconds timeframe, for context. But your data plane, where Redpanda lives - it’s local in that VPC. So when you connect your applications, BYOC gives you this super-low latency, because it is exactly as though you were managing it yourself, but you’re paying for a fully managed SaaS solution.
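A rough mental model of that agent is a small outbound-only loop inside the customer’s VPC that picks up commands from the control plane and applies them locally. The command names below are invented for illustration and the control-plane call is stubbed out; this is not Redpanda’s actual API:

```python
import time

def fetch_commands() -> list[dict]:
    # Stand-in for an outbound HTTPS poll to a hypothetical control-plane
    # endpoint; stubbed so the sketch runs on its own.
    return [{"kind": "expand_cluster", "nodes": 1}]

def apply(command: dict) -> None:
    # In reality this would drive Kubernetes or the local cluster APIs
    # inside the customer's VPC.
    if command["kind"] == "expand_cluster":
        print(f"adding {command['nodes']} node(s) to the local data plane")
    elif command["kind"] == "deploy_cluster":
        print(f"deploying cluster {command['name']} in this VPC")

def agent_loop(iterations: int = 3) -> None:
    # Outbound-only polling: the control plane never needs a way into the VPC.
    for _ in range(iterations):
        for cmd in fetch_commands():
            apply(cmd)        # applied asynchronously, on the agent's schedule
        time.sleep(1)

agent_loop()
```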

I want to highlight a couple of things though… I think this would have been much more difficult even five years ago. The reason this is possible today is Amazon APIs are great. Same thing with Google Cloud APIs. So there are new APIs that were invented. Kubernetes gives us a single execution plane, so that it looks the same to us on all clouds… And new technologies like WebPack Federation, which is effectively distributed hierarchical user experiences - so basically, JavaScript that gets delivered between multiple TCP endpoints is kind of the gist to describe it - those are kind of the pillar fundamental technologies, and I think five years ago, or even ten years ago, it would have been extremely difficult to do this BYOC model. And so anyways, we’re kind of the child of recent technology improvements.

Why is this sovereignty important? I mean, there’s an obvious answer there, but what’s the big picture of why sovereignty is important?

There’s a recent trend, for governments especially, and things like that. An example is in Germany - GCP talked about this in public. Amazon talks about this in public. Nation states are telling all of the hyperscale clouds that they want the services, but the clouds have to stay local, in that particular country – like, the government needs to understand where the physical building, with the physical address, is located. It is fundamentally important to a nation state’s security. The reason that’s important is the kill switch is to simply turn off a routing table, and now Google can’t access it, right? So let’s say there’s sort of a nation state – now, I highlight that as an extreme use case, but it’s a very similar need, where people need to control their end user data. They need to control who has access to what, especially for sensitive data, any personally identifiable information, which at some point ends up flowing through an event architecture, in any case… And so sovereignty is more difficult to achieve than privacy, because privacy is - they’re policies: mask, rule this, do this, obfuscate, delete this data… Sovereignty says, “You, Adam, you know the entire lifecycle of the hard drive. You understand when the hardware gets destroyed, and when data gets deleted.” That is really powerful for people that need to prove to their customers… Healthcare is a really good example. All sorts of contractors for sensitive information. Satellite is another really good example. And so there’s like a ton of industries; oil and gas… There are entire industries where sovereignty is basically paramount for them.

When it comes to, I guess, the hardware of the cloud that you built, did you build on top of GCP, or AWS? Or did you go bare metal? What are some of the infrastructure choices you had to make to make this state-of-the-art future possible?

Yeah, so our control plane runs on both AWS and GCP. So if you launch a GCP thing, then it basically will work - like, the data plane will live in GCP, and then it’ll make a TCP connection back to the control plane. The things where we innovated a lot were in this idea of WebPack Federation, which is how do we ship multiple UIs, so that when you log into a dashboard, like a cloud dashboard, it doesn’t feel like this janky iFrame from the 1970s, or like the ‘90s, or whatever it was when I was first writing HTML. It feels like a unified, modern product; you shouldn’t see any difference. The technology is really meant to be kind of this magic behind the scenes. So that was one.

But on the data plane side, that’s where we invested a bunch of effort. An example is for Amazon we ship an ARM-optimized build for their particular processors. So you get 30% better bang for the buck. It’s typically related to latency and throughput, and more predictable latency. So ARM - we did that, too. We built NVMe profiles. So we spent about $50,000 profiling the best VMs. And so when you deploy – all of what it was, it was literally just benchmarking the NVMes associated with those instances, and we just launched a ton of experiments across all categories. And we embed that database in the control plane, so when you say, “Hey, scale from one megabyte to 10 megabytes”, we’ll pick the instance for you, we’ll size it, we’ll tune the kernel, we’ll do a bunch of those things based on the empirical evidence, not what Amazon or Google lists on their website, because that’s never the truth… But actually, what did we – as measured on the hardware, what did we actually get? And then that knowledge is sort of shipped to the data plane. And so those are more the kinds of innovations that we did on the data plane, to continue to get better latency and performance.
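Conceptually, that instance-picking step reduces to a lookup against measured numbers rather than the provider’s published specs. A toy sketch, with entirely made-up figures:

```python
# Hypothetical, empirically measured sustained NVMe throughput per instance
# type (MB/s) -- stand-ins for illustration, not real benchmark results.
measured_profiles = {
    "small-arm":  {"throughput_mb_s": 250,  "hourly_cost": 0.20},
    "medium-arm": {"throughput_mb_s": 600,  "hourly_cost": 0.45},
    "large-x86":  {"throughput_mb_s": 1100, "hourly_cost": 0.90},
}

def pick_instance(target_mb_s: float) -> str:
    """Cheapest instance whose *measured* throughput meets the target."""
    candidates = [
        (profile["hourly_cost"], name)
        for name, profile in measured_profiles.items()
        if profile["throughput_mb_s"] >= target_mb_s
    ]
    return min(candidates)[1]

print(pick_instance(10))    # small scale target -> the cheapest instance
print(pick_instance(1000))  # ~1 GB/s target -> the big NVMe-heavy box
```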

It’s a lot of work. That’s a lot of investment to – I guess it’s necessary though, right? Whatever is necessary to get the job done, that’s what you’ve got to do.

Somebody has to own the complexity. It’s gonna be either us, or it’s going to be you. And I feel like we’re [unintelligible 00:53:17.05] so we should probably onboard that complexity.

Yeah, for sure. I guess where to next? What’s the state of cloud now? So is it available, is it GA? How long has it been out?

Yeah, so it was launched in November last year. It’s SOC2-compliant, it has VPC peering, managed connectors… The things that people would expect. For me in particular, where I’m trying to push the company direction is into this idea – so for context, we built Redpanda with this technology called shadow indexing, which in plain terms is a suite of technologies that enables tiered storage to be seamless. And so one of the challenges with immutability - and this is what a streaming service does - is that since you cannot mutate the data, you can only either append or truncate; those are the only two operations.

[54:09] And so with streaming - you always are appending. It means you are always writing to the end – you can think of streaming as an insert-only table. And so at some point - well, you have to remove the tail of the log, or the head of the log, to make space for new data, because you’re going to run out of physical disk space. Instead of that, we upload that data to S3. And what that gives developers is super-cheap data storage, because you’re just leveraging the cost efficiencies of S3, which is a couple of cents per gigabyte. And so I see that as sort of the keystone of where I actually think streaming is heading.
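In code terms, the append/truncate-only model plus tiered storage looks roughly like this: the log only grows at the head, and when local disk fills up the oldest segment is shipped to object storage instead of being thrown away. A simplified sketch, not Redpanda’s shadow indexing implementation:

```python
class TieredLog:
    """Append-only log that offloads its oldest segments to object storage."""

    def __init__(self, max_local_segments: int = 3):
        self.local_segments: list[list[bytes]] = [[]]
        self.max_local_segments = max_local_segments

    def append(self, record: bytes) -> None:
        self.local_segments[-1].append(record)    # only ever write at the end
        if len(self.local_segments[-1]) >= 1000:  # roll over to a new segment
            self.local_segments.append([])
        if len(self.local_segments) > self.max_local_segments:
            self._offload(self.local_segments.pop(0))

    def _offload(self, segment: list[bytes]) -> None:
        # Stand-in for an S3 upload: cheap, durable, and still readable later.
        print(f"uploading segment with {len(segment)} records to object storage")

log = TieredLog()
for i in range(5000):
    log.append(f"event-{i}".encode())
```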

And so for us, it’s leaning into more open formats. What if the data that lands on S3 - what if all of your event data is somehow type-checked, so that when you insert into this particular topic, the data format is correct? Like, it’s either Protobuf, or it’s Avro, or it has the right columns, or whatever it is. And then we take that information and do a columnar projection on the tiered storage, so that you can connect to Snowflake, or you can potentially connect to Databricks cloud, or you can use ClickHouse, or you could use any MPP-style database. And so that gives people freedom… As opposed to the way it’s done today, where people kind of connect one stream, and then they send that one stream to Snowflake, and then the next stream to this other database, and the other stream – I was like “What if the data was just in an open format that all of these vendors are willing to support down the line?” That sort of changes the way you think about streaming and architectures.

Are you pioneering this open format then?

So we’re going to leverage what all of the data lakes, data warehousing category of companies are using, which is going to be Apache Iceberg. So we’re going to leverage that, but as far as I’m concerned, we are the only company in the streaming world where the uniform format is going to be a columnar projection, because it fundamentally changes how you think about streaming. Streaming at the byte level is row-oriented, because you’re reading one record and writing one record. But for analytics, you care about column projection, so you can actually skip data, and do fast queries. And so as far as we know, we’re the only company that is working on that kind of open format initiative.
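The row-versus-column distinction he’s describing can be shown in a few lines: a stream hands you one record at a time, while analytics wants whole columns so it can skip everything else. A minimal sketch; Apache Iceberg and real columnar formats are far richer than this:

```python
# Row-oriented: how events arrive on a stream, one record at a time.
rows = [
    {"user": 1, "country": "CO", "amount": 10.0},
    {"user": 2, "country": "US", "amount": 25.5},
    {"user": 3, "country": "CO", "amount": 7.25},
]

# Columnar projection: regroup by field so a query reads only what it needs.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An analytical query now touches just two columns and skips the rest.
total_co = sum(
    amount
    for country, amount in zip(columns["country"], columns["amount"])
    if country == "CO"
)
print(total_co)  # 17.25
```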

I mentioned earlier in the call this tension between you and Kafka… And Kafka is obviously open source. I’m just curious why you chose the BSL license. How did this play a role in the economics and the business model of Redpanda? Considering your inspiration from it, and probably - I’m sure you love open source, right? I mean, who doesn’t love open source, right? You have to, in this world; you have to embrace it. Because it has won.

It’s how we grow up, basically.

Right. How did you explore, I suppose, the commercial viability of Redpanda juxtaposed against the possibility of open source? Why BSL, why eventually open source?

So for those new to licensing, BSL turns into Apache 2 on a rolling four-year window. It means the code today, four years from today, is going to be Apache 2. And so that’s one. The second clause that is interesting is that for us, frankly, we didn’t want the hyperscale clouds and the most dominant cloud in the world to take our work, and then make money. We are a company, right? I have to pay my developers. It takes money to run a company. And you know, this is a totally personal decision. I don’t think there’s a right or wrong. I think the context matters. That context for us is that we wanted to build the company, we wanted to continue to innovate in the streaming space, and to me, it felt like the right balance. I was like “People use us without paying us” - and I’m talking about probably the largest company in the United States by market size. I’m talking about some of the largest space exploration companies - some of them use us and they don’t pay us a penny… And I know, because some of my engineer friends work there. I know that we’re saving them like $10 million a year. I was like “Hey, you should probably pay us 100k.”

[58:05] Anyways, so there’s still a lot of sense with that, and that’s okay. What I want to do is I want to build a cloud business. And so the restriction is no one – or you can, but you would have to pay us a license. No one can build a Redpanda cloud without paying Redpanda licenses. That’s fundamentally the restriction. Other than that, people can just use it for whatever it is.

And so that was, to me, for this project, for the context, for kind of like who I am as a person and the project I was trying to build, that seemed like the right balance. And so now the sort of following effect of that decision is that because we already have that protection, it allows us to make the best possible product. We don’t have to build peripheral systems that we need to monetize. We can simply build the best possible product that we know, as opposed to the way a lot of the open core companies are doing it, which is like there’s zero monetary value in the core, and then you have to build services around that, whether it’s a new database layer, or whether it’s like a schema registry layer, or something like that, that you can monetize so that you can actually thrive as a company.

So I actually empathize quite deeply with the struggle of companies that are based on that, in part because open source gives people the adoption. Like, it’s basically infinitely cheap; you have all of these developers, and they just call you whenever they’re in production and they need help. But the challenge with that is that then you have to build services around this that make the core value offering better. Because if you embed that code in the core, then it would have zero monetary value.

And so anyways, with that, I just wanted to say that for me, the balance of expiring in four years to become Apache 2, with the clause that we are the only company allowed to build the Redpanda cloud, seemed like the right balance. And it’s worked out great. No real issues. In fact, I think people empathize with the way – and look, open source has evolved a lot since it originally started, and there were the hyperscale clouds trying to eat your lunch… Now it’s a very different world today, and that’s where we are.

Yeah. Well, you don’t have to defend your decision, but I do love that you described the decision. It’s important, whenever you look at the success you’ve had and the heart it took to get where you’re at, to understand the decision. Because other founders - future founders - will listen to this show generally, and maybe this episode in particular, hear your story, and say, “Okay, that’s why Alex chose the BSL. That’s how he rationalized it, and that’s the balance he mentioned.” And then future them will weigh that against their scenario, their context, and obviously the way open source has changed by then, and make their own decision. But it’s important to look at why you chose BSL and eventually open source, versus open source defended with ancillary products, as you mentioned, versus the singular best product and defending against the clouds.

And it’s likely - you know, for those that are going to listen to this in the future, in like five years… Remember, we started this in 2019, so four or five years ago. A decade later, it’s very likely that a different set of trade-offs may make sense for that particular time and context. So it’s all extremely contextual. And I’ve actually spoken with all of the CEOs, in particular the people that have to make the decision on licensing, about how that affects the sellers, and how the sellers’ salaries affect their families. You have to design your people’s quota attainment, and that quota attainment defines the salary that they bring home. That’s where the impact of licensing takes a big mental toll. I was like “Man, I really can’t screw this up. This is the one thing I need to get right to take care of my people.”

And so I talked to a bunch of them, and at the time that we chose that license, it felt like the right balance. And in 10 years it may be totally different. So if there’s one thing people take away, it’s that context is king; we evaluated it at the time, and the world may be a different place later.

[01:02:02.28] If you had to make that decision today to go BSL, given today’s context, would you still make the same decision? Or would it be different?

If there wasn’t a Redpanda available - which I think there would be; I know that we have changed the roadmap of other startups - then probably, yes. If there was already a Redpanda, then I would only build a cloud service.

Cool. Okay, so let’s move on to goals for the future. What’s the horizon for you? Where are you trying to go? What’s the next big thing for you all?

I guess in terms of success for us, we hope to be in the driver’s seat to take this company public. And part of that is because we have a different view on what the world should look like. I see Redpanda, the product, as a keystone that is going to help the market transition from batch to real time. I think we have an opportunity to change that, and to build the best developer experience for it. So as I look toward the future, that’s really the next milestone.

In terms of technical developments, I mentioned things like leaning into open formats, leaning into things like WebAssembly… Looking ahead, it’s going to look a lot more cloud than it has in the past. We had to start somewhere, and we had to pay the bills, so we started with self-hosted, and that’s kind of where we are today.

So yeah, I’m just excited to try and - look, the way I think about this financially, and actually also from a developer perspective, is that we don’t have to make all the dollars in the world. If we manage to help developers think differently, we will be successful. We just have to capture a small amount of the value that we create. That’s how I fundamentally think about the product: if we manage to help people think differently about how to build applications, this company will be really successful. And part of that is continuing to work on our YouTube videos, and our tutorials, and free tools that people can use.

An example is we’ve built the most successful Kafka-to-Prometheus metrics converter. It’s called KMinion. It’s Apache 2, by the way; it’s open source, and it’s maintained by us. What it does is take metrics from Kafka - open source Apache Kafka - and expose them as Prometheus-queryable metrics, and it’s the most used thing in the world for that. So anyways, we’ll continue to make streaming easy. At some point we’re going to capture some of the value of that… Anyways, that’s just how I think about the future of the company and product.
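For readers who want to picture what that looks like in practice, here is a minimal sketch of a Prometheus scrape configuration pointing at a KMinion instance. The hostname, port, and metrics path are assumptions based on common exporter defaults, not values taken from the conversation - check KMinion’s own documentation for the real ones.

```yaml
# prometheus.yml (sketch) - scrape a KMinion exporter that runs alongside a
# Kafka or Redpanda cluster and re-exposes its metrics in Prometheus format.
scrape_configs:
  - job_name: "kminion"
    # Assumption: KMinion serves metrics on port 8080 at /metrics;
    # adjust the target host, port, and path to match your deployment.
    metrics_path: /metrics
    static_configs:
      - targets: ["kminion.internal:8080"]
```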

What about the inception of this scholarship, Hack the Planet? What’s the backstory there? Is it for the future Alexes, or the Alexes that - I mean, this wasn’t in place when you were young. What would you have done if you’d had a Hack the Planet scholarship available to you? I guess maybe you took advantage of something like it to some degree throughout your career, but what is this Hack the Planet scholarship?

[01:04:47.29] Thanks for asking. It is a totally non-scalable program. By that I mean we help one person per year, for four months. We give them money and mentorship, we release all of the intellectual property, and we expect zero results. This is not an internship. It is really just designed to help people dream bigger. Some people look like me, and they don’t have opportunities to work on the things that I’ve had a chance to work on. I didn’t have a Hack the Planet scholarship, but I think that people, especially from backgrounds underrepresented in tech, could really benefit from this.

To date, we’ve helped someone in the UK, who is a woman. We helped someone in the Middle East. We are now helping someone in South Africa. So it’s actually a global program. As long as we can wire the money… I’m not doing MoneyGrams anymore. That was the first one, and I was like “Okay, this is too crazy.” Walking in and wiring money doesn’t seem right.

But anyways, as long as my finance team can wire money into a bank somewhere in the world, we will send you money and we’ll help you. The superficial value is we’ll help you understand, we’ll help you program, and we’ll help you think differently; we’ll give you a different mental model. It’s one hour a week with the most senior people, who have built the largest systems, and then they also get an hour with me per month, to help them think differently about goals. It’s like “Hey, why don’t you just try something harder?” And they get a little bit of money. It’s not a lot of money - it’s like 1,500 bucks. But here’s the gotcha: you could sit on your couch and not do anything, and that would be on us. I expect zero from this. There are no financial returns; there’s nothing in this for us. We just want to give, and try to make the world better. It’s not a scalable thing, but if it’s influencing even one person per year at a time, it’s something that I feel proud of, and I think it’ll change some people’s lives at some point.

So one person… How do you – you must get just tons of applicants, or at least 50; let’s just say 100, maybe 200. I mean, how do you get down to the one? What’s the process involved?

It boils down to - with this team today, we ask applicants: “Go on LinkedIn, look at our engineers. How can we help you the most?” And the person where we have the largest leverage, who still fits the base criteria - that’s the person we’ll pick. So you’re picking an ambitious goal… Typically it’s very sweet; a lot of these people are like “I want to build a database in three months”, and I’m like “Okay, but this person is just new to programming.” But it’s great. That is the kind of dream we want to empower people with, just so that they uncover the truth of saving bytes to the file system.

Anyway, so first they look at LinkedIn and they say, “This engineer, this engineer, this engineer will really help me - this is the impact that you’ll have.” And then we pick the best project out of that. We actually get thousands of applicants, so it’s very hard. It usually takes her like a month just to go through them all.

Thousands, huh? Wow. I mean, one person out of thousands… I guess that is the needle in the haystack, so to speak… And what a burden to make the choice. And then I guess the lack of scale, which is by design…

You know, but if other companies start doing that… I know we’ve inspired other companies, like DoorDash. I haven’t followed up with them, but I know they wanted to launch a similar thing, and what I told them was, “This isn’t a scalable system. If you choose to do it, don’t scale it to hundreds. Scale it to two, or one.” Really, the most senior engineers - there aren’t that many of them. I can tell you that at Akamai it was like 10, 15 people at most, and that’s a company with 10,000. Those are the people that will make a difference for people from underrepresented backgrounds.

Very cool.

Yup. So that’s how it goes.

I love it. Love it. This has been a fun conversation, Alex. I’ve enjoyed hearing your journey firsthand, and I’m happy to share it here on Founders Talk. Just stoked to see your ambition, your obsession, and out the other end, magical things have happened… So keep doing what you’re doing, for sure.

Thanks so much for having us. I really appreciate it.

It was awesome. Thanks, Alex.


Our transcripts are open source on GitHub. Improvements are welcome. 💚
