Practical AI – Episode #208

GPU dev environments that just work

with Nader Khalil, co-founder and CEO of brev.dev

Creating and sharing reproducible development environments for AI experiments and production systems is a huge pain. You have all sorts of weird dependencies, and then you have to deal with GPUs and NVIDIA drivers on top of all that! brev.dev is attempting to mitigate this pain and create delightful GPU dev environments. Now that sounds practical!

Sponsors

Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

Fly.io – The home of Changelog.com. Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Notes & Links

brev.dev

DISCOUNT for our listeners 🔥:

  • Use coupon code “practical-ai-2023” for 5 hours of free GPU compute!
  • brev.dev doesn’t offer credits often, so the credit redemption button is hidden by default. Go to this link to expose the button.

Chapters

1 00:00 Intro
2 00:45 Welcome to Practical AI
3 01:20 Nader Khalil
4 07:56 Dependency issues
5 10:22 Stepping away from the classical approach
6 13:55 Getting excited about Brev.dev
7 18:14 Multi-cloud considerations
8 24:57 Brev.dev Templates
9 31:16 Collaboration with Brev
10 32:59 Brev in private environments
11 35:30 Nader's future
12 39:05 Outro

Transcript

Play the audio to listen along while you enjoy the transcript. 🎧

Welcome to another episode of Practical AI. This is Daniel Whitenack. I’m a data scientist at SIL International, and I’m joined as always by my co-host, Chris Benson, who is a tech strategist at Lockheed Martin. How are you doing, Chris?

Doing good, having a good 2023, and this is gonna be the best year for artificial intelligence, ever.

Yeah. Well, I mean, it must be. Yeah, we finally did our ChatGPT episode, and that was really cool, because I don’t know if you saw, Chris, it was the first episode where we had I think like over 10,000 downloads in the first week… So thank you to our listeners; it’s awesome to see that. We’re glad that it was useful, and we’re gonna keep the good content rolling right along, because this week we’ve got something super-practical, which I think everyone deals with what we’ll talk about today… But we’re privileged today to have with us Nader Khalil, who’s the co-founder and CEO at brev.dev. Welcome.

Hey, thank you. Thanks for having me.

Yeah. So I alluded to a problem that we all face, which is environment management… Developing in this environment, I need to have these dependencies, or I use this environment, now I need a GPU… Or Chris is on my team, and he needs to replicate my environment - all of these sorts of things, whatever category you put those in… So how, I guess, in terms of - you’re digging into this problem now, but how did you get there? What started you along this path of really thinking deeply about dev environments?

Man, we’ve had quite a twist and turn of a journey to get here. Yeah, the ultimate goal is just removing the monotonous machine problems getting in the way of creative development. It’s funny – I went to UC Santa Barbara, I studied electrical engineering and computer science… And when I moved to SF to work, I was actually building cloud dev environments at Workday. And I did that for two years, and in December 2018 – actually, just before that… I was getting a beer with a bar owner, and he was telling me how he had 1,000 clicks on his Google ads, but his bar was empty other than me. And he shows me the metrics on his Google ads, and he goes, “Make it make sense.” And I realized he had a really good point. Digital ads work really well for digital businesses, because if someone clicks on an Amazon ad, you’ve entered Amazon’s storefront; there’s nothing like that for physical businesses like his. And so he was just using a really bad medium.

So my co-founder and I – the same co-founder as with Brev – pretty much realized there was a way for us to backdoor the Uber app, and so we put tablets in Ubers and Lyfts, and we let local businesses advertise on them. And if you tapped our screen, we would reroute your Uber to that location.

Yeah, that’s legit. Yeah.

Yeah. You go out with friends for drinks, you see “Buy one, get one free margaritas”, you tap the screen and we take you there. You get a free drink, the bar owner knows his ad worked, the driver got a tip… Everyone won.

So that was really exciting. That’s what I quit my job to go do. We did that for like two years, completely bootstrapped. We ran out of money, I poured my 401k into it, we got into YC for that, we got to like a quarter mil ARR… And essentially, Demo Day was March 2020, which was right when the shelter in place happened in SF. So we got to see our fleet of four hundred cars go to seven overnight, actually the very week of Demo Day. So we didn’t raise a dime, obviously, but –

Oh, I feel bad for laughing, but I can’t help it…

Yeah… Have you seen that GIF on the internet, of the raccoon with the cotton candy, and it’s just “Where did it go?” [laughter] That was very much March 8th, 2020 for us. But it was funny, because with a physical business, you have a physical fleet, right? We had physical operations. You’d imagine physical hurdles being the hardest part of that. And in January 2020, we were starting YC, we got to like 15k MRR, things were working, and we needed to just 3x, 4x the fleet. And that was really hard for us.

We found out from one of our drivers that Uber and Lyft have these parking lots half a mile from SFO airport, where drivers go wait for these really valuable airport rides. So I go to the parking lot, and Uber security kicks me out right away. They’re like, “You’re not a driver.” So I’m like, “Okay, well, I’m gonna [unintelligible 00:05:28.09].” So I went to a gas station and I bought cigarettes. I light one up and just walk back on the lot, because now I look like a driver taking a smoke break. And I got right past Uber security, I’m on the lot till like 4am, talking to every driver; we 4x-ed our fleet that night.

So there was never a physical hurdle that got in our way. But once we got those drivers live, everything else went to s**t. Our advertiser dashboards were really slow… all these random problems. One of them was that the ads, when they flipped on our tablets, would just disappear and flash white. And if that happened at night, it’s jarring, and so riders would turn off the screen and you’d lose revenue for the night.

[06:04] And so it was really funny having really weird physical problems – but we could sneak past Uber security and solve those. It’s when we had to sit at our computers and fix something that our dev environment slowed us down. And so it was almost instant – when the pandemic killed that business, my co-founder and I looked at each other and thought back to those 20 days of January when we were trying to deal with our dev environment issues and couldn’t replicate them locally… So many weird, bizarre issues; we were just shooting in the dark. That was the only time with that business I had that pit feeling in my stomach, like we’d forgotten an assignment, or something… And so we immediately asked, “How do we solve our previous problems?” And so we spent like a year and a half in pivot land, with a good North Star. We built a very heavy abstraction, I guess… It was kind of like what Replit is now. At the time, Replit didn’t have databases, so you couldn’t really build applications in it. So we essentially said, “Hey, if we force our dev environment opinions on you, you can’t have problems we didn’t already know about, because we forced your decisions. So you wouldn’t have problems; it’d be a really smooth experience, as long as you did everything that we supported.” And so you’d get cron jobs out of the box, and tooling was already hooked up, and the database was already there, but you had to use our version of Python for your APIs, things like that.

So it was an interesting experience in the broader – like, everything outside of a dev environment, when you want to run tests and you need more tooling, and those things aren’t supported. So it was a great way to like plunge into the space, but ultimately, we learned that a good abstraction is only good if it pairs well with the problem it’s solving. And if you’re good at solving problems, you’re gonna have new ones to solve, which means you’ll need new abstractions, or a flexible abstraction. And so that’s when we kind of pivoted away from that and built like the current version of brev.

And a lot of what you’ve described – I mean, I’ve never tried to sneak tablets on an Uber, or something like that, which sounds like a really fun thing to try to do, and I love that story. Probably a less fun thing for me in my life is like the general arena of the very kind of specialized and weird dependency issues, specifically related to like machine learning and AI sorts of environments, and the differences that people have between trying to prototype something locally, and then trying to scale it out in a reasonable way. Did that factor into your thinking when you were building this in terms of these data science people out here, this explosion of AI tooling, and all of that? Or was that something that came along the way as you were kind of going in this journey and thinking about “These abstractions we’re thinking about, what kind of problems do we solve?”

Yeah, so it’s definitely something that we learned along the way. We initially started by trying to solve our own problem. We at Brev exclusively use Brev for all of our own development. To your point, you’re not dealing with environment issues. We have a blog post about when we upgraded from Golang version 1.17 to 1.18, and it caused a memory leak. But my co-founder fixed it in his environment, and so when I wanted to update my environment, I just reset, and I was on the latest. And so being able to just move your environment that way – it makes everything a lot easier.

What we’ve learned is that some of our power users were AI developers, because AI dev environments are really complicated. And they specifically asked us to support GPUs. And when we started to support GPU instance types, it just kind of opened our eyes to how many raw DevOps problems there are within the MLOps space. GPUs are really expensive, and a lot of the time the GPU is sitting idle… If you need to do some sort of development, you might spin up a GPU just because there’s the off chance you do some GPU development right now, but a CPU would have sufficed… So the way Brev works is – the idea is you can move your dev environment between different instances. So if you’re not using the GPU, deallocate it, and just go to a really cheap, pennies-per-hour CPU instance, and only turn the GPU on when you need it. We also have auto-stop. I learned this at Workday – they were burning a lot of money every month because developers forgot to shut these instances off… This also happens with individual developers… So if you don’t use your Brev instance, we’ll automatically power it down. You can start it again from the CLI and it’s back up and running.

So Brev is a CLI that makes it really easy to spin up these dev environments, and we connect your local tools to that remote instance. So the CLI wraps SSH, so all you have to do is run brev start and start coding, and not really have to worry about the actual environment issue.
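(For reference, a minimal sketch of that terminal flow, using only the two commands named in this episode – brev start and brev open; the environment name and details in the comments are illustrative assumptions, not taken from Brev’s docs.)

```bash
# Minimal sketch of the flow described above; `brev start` and `brev open`
# are the commands named in the episode, and "my-project" is a hypothetical
# placeholder for an environment name.

brev start my-project   # spin up (or resume) the remote dev environment
brev open my-project    # open your local VS Code attached to the remote
                        # instance; the CLI wraps SSH under the hood

# If the instance sits idle, auto-stop powers it down; running
# `brev start my-project` again brings it back up.
```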

[10:22] It sounds really cool. Let me ask you kind of a baseline question, that as I’m learning about how you’ve done this - but I’m starting kind of from where I’m coming from, and probably where more than a few of our listeners have… I’m used to using Docker, and getting in a container, and it has access to an NVIDIA GPU, kind of the way a lot of folks are doing it… Can you kind of tell us a little bit about what the difference is between that kind of classical approach that a lot of people use, and in what ways are you differentiating and stepping up from that into brev.dev?

Yeah. And can you explain to me maybe where are you running this container? Are you running this on your machine? It has the NVIDIA GPUs…?

You have to have a set of images that you have set up, there’s a bunch of configuration ahead of time, which I know I don’t have to do on yours… But essentially, I’m having to say, okay, I have a GPU available in some place on the network, or maybe in the cloud, and I’m going to do those configurations, and then maybe I’m on my laptop, maybe I’m on a server, but a lot of people are logging into containers to do the work, and then trying to move the container around, and be able to access those resources from different locations. I’m starting from that because it has some good things, but it also has some real pain in the butt aspects to it, in terms of having to make it all work… It sounds like what you’re describing up front is a really good user experience, and so I’m trying to get a sense of what the differences are in the two.

Yeah. So I think at a minimum, if you want to just run a Brev environment, with or without a container, whether or not you have that setup, the way we kind of handle this is with a simple Bash script. Every Brev environment is running the same version of Ubuntu – we have the specific version listed in our docs – so a Bash script works: Bash is ubiquitous, it’s available, and you can install anything with it. So you can start with just a Bash script if you don’t want to run a container, if you just want to try something and have it run.

So we leverage this a lot for some of our templates… If you have a Bash script committed to your repo that has setup instructions, Brev can automatically run it when you spin up an instance. So you create a new environment, you give it the Git repo and the path to the script that you want it to run, and that script will get run immediately for you when the instance is created.

The user experience is creating the new environment, whether in the CLI or through the UI, setting the path to that setup script – or you can also just start with one of our templates. And then from your terminal, you run brev open and it’ll open up VS Code, connected to the remote instance, or a Brev shell if you prefer – we support Vim, Emacs, JetBrains, whatever IDE or code editor it is that you want to use.

And then if you do have a containerized workflow – anything that you were going to run in your terminal, whether you’re running Docker Compose commands, running Cog, or using Replicate… Anything that it is you’re trying to run, you can just put in the Bash script and know that it’s going to reliably run for you or someone else that you’re sharing this with. But I think the big thing here is there are a lot of optimizations around the GPU spend. The way the volume is backed up – we’re doing intelligent backups, I guess, where we back up just the amount of volume that’s actually being used, so you’re not paying for unused volume, even when your instance is off. There’s auto-stop, making sure that your instances aren’t costing you a lot when you’re not using them.

You can use Brev scale, which lets you deallocate the GPU, or get a more powerful instance if you need it. So flexible compute needs, without having to re-setup or install anything. And there’s the obvious benefit of not running a container locally, which on a Mac kind of casually eats up like 20 gigs of RAM.
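(To make the setup-script idea concrete, here’s a hypothetical script you might commit to a repo and point Brev at when creating the environment; the path, packages, and commands are illustrative assumptions, not from the episode or Brev’s docs.)

```bash
#!/bin/bash
# Hypothetical setup script, e.g. committed at .brev/setup.sh; Brev runs it
# automatically when the instance is created. Every package and path below
# is illustrative.
set -euo pipefail

# System dependencies – every Brev environment runs the same Ubuntu
# version, so plain apt-get works:
sudo apt-get update
sudo apt-get install -y ffmpeg

# Project dependencies:
pip install -r requirements.txt

# A containerized workflow fits in the same script, e.g.:
# docker compose up -d
```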

[13:54] I actually – so I haven’t used it a lot, I have to be honest, but I did spin up a couple of environments in brev.dev leading up to this conversation, because I wanted to understand a little bit more about it, and it was really fun. Like Chris was saying, I think it’s true that the sort of dev and onboarding experience is really nice. I was using the UI configuration, and the experience I had was – and I’m kind of curious what you’ve heard from other users, I guess is my question… Because my experience was like – okay, I created the dev environment with the UI; it’s a little bit different UI than I’m used to, but there’s familiarity with certain of the things, right? I’m pointing it to a Git repo, I’m maybe defining, like you’re saying, a startup script, or something like that. I’m naming it, okay, it’s creating this thing, I add a GPU, whatever… There are similarities between that and what I would create in an instance in the cloud. But then I have this dev environment, and I think the point where something switched in my brain was – I was local in my terminal, and I actually even forget the command now, but it was brev open, the environment name… brev open, and it just popped up VS Code. And then I realized – I had my VS Code open, and I could open a terminal in VS Code, but that was running in the environment that I created remotely. So that’s where things switched in my brain: “Oh, I’m now using that environment that I set up, and I could share that environment with someone else, and then they could pop open their code editor and see this.”

I’m curious, other people that you’ve talked to, people that have started using it, where are those light bulbs going off for them, and what are the things that they’re really getting excited about, I guess?

I mean, I just did a slew of user interviews, and the first question I always ask is “What does Brev do?” It’s always really exciting to hear that from someone before I have an opportunity to kind of accidentally influence the conversation… And the biggest thing we hear is that Brev is the most delightful or easiest experience to run anything on a GPU in the cloud. So that’s been a lot of our focus. Dev environments are kind of the thing that gets in the way of what you’re trying to do, and so that’s been our focus from the beginning… But there are a lot more complicated workflows, especially within AI, and just the dramatic cost.

We have one user whose Google Cloud bill was about $280, just from running their GPU instance. But with something like Brev scale, they brought it down to about 25 bucks – I think their exact number was $27, or something. So that’s like a 10x cost reduction, just because that GPU was sitting idle while they were actively coding and building things.

So I think our goal is just to have something that is a much more delightful and really simple experience, but also saving a lot of money. A lot of what we’re focused on right now is integrating with other clouds. We, to get this far, have just been built on AWS, but we’re partnering with like Lambda Labs right now to support their GPU instances, because they’re a third of the cost. We’re leaning deeper into actually a container strategy which will let us provide kind of start and stop across clouds, which I think will be really exciting. So this is something that we’re getting ready to release over the next two weeks, and I’m going to start kind of talking a bit more about…

Actually – I was gonna say, and you can go ahead if you want to dive a little bit into that right now… Because you’ve really piqued my interest with that. So if you don’t tell me now, I’m gonna pester you later.

[laughs] Yeah. Well, the way that we’re approaching it I think is we’re going to – and it’s a bit experimental still right now, but we’ll have something out within two weeks. Our team has pretty quick velocity. We’re a small, but potent and passionate team. We really want to be able to support start and stop across any – anywhere that there’s a GPU available for us in the cloud. And it might not be at a large data center, it might be at a small one, and that’s okay. If it’s a cheap GPU that’s in a region that’s not going to introduce a lot of latency for you, you should be able to leverage it while we have access to it, and if it’s robbed from us, if you stop your instance, you should be able to start it again, and it might not be on the same instance, in the same data center, but that’s okay. We’re really just trying to optimize on – a GPU itself, a GPU is a commodity; you just want the cheapest one, and you want to be able to run your code on it easily. So yeah, in like two weeks, I think we’ll have a pretty exciting launch on that.

[18:09] That sounds pretty cool. So there’s another aspect of that that’s got me thinking… With you looking at multi-cloud, and you kind of said it could be a small data center, it could be – I’m getting the impression there can be a lot of diversity, potentially, in what you’re targeting for getting your GPU. What are some of the kind of considerations somebody might have for if they’re using brev.dev, like how might they decide, and is there any strategy yet other than just kind of whimsical, on saying, “Hey, I want to go with this one, or that one?” Is it just a cost thing, or could there be other considerations that you guys have thought about in terms of being able to provide? You know, like going to a small data center here, at this company, rather than the big AWS one in Northern Virginia over here. Any thinking around that?

Yeah, so just to clarify on the kind of whimsical approach - are you talking for us as we have tried to find GPUs that we can –

No, for the user perspective. Am I correct in thinking they can kind of choose where to target on that? Or is it something you’re doing behind the scenes?

Our goal is to make it really easy, but expose as many options to a user as they want. So for example, we’ll default right now to a region that makes sense, but you can always open up the region and pick one that you would like. Again, right now we’re only working with AWS, but that’ll change really quickly, like in these next two weeks.

So we always want to make it an option for a user to see transparently where their instance is coming from. I don’t think there’s reason for us to hide that. However, we do have an option for you right now to connect your AWS account. And what I’ve noticed is only like two individual users (not teams) have used that. And I think what that means to me is the specific location of the GPU doesn’t really matter. It’s just like “Hey, I want to run this on an A100. Go run this on an A100.”

Gotcha.

So one of the things that is sort of a question running through my mind is – I thought it was really powerful; when I opened up the environment, I had the environment, I could run Stable Diffusion or whatever, because I had a GPU in the background, I had enough memory… All of those things – it’s really nice. And I could see how that would allow me to sort of understand the environment that I’m eventually building towards, in terms of what I want to release in production, and I could share that environment with other people. What would be, from your perspective as both the founder and creator, but also a user of Brev – what is the workflow that you’ve seen work, in terms of going from that local dev, and sharing dev environments with other team members, towards something you would run in the same type of environment in production? Like, okay, I’ve now used a Brev environment to figure out how to run this FastAPI code that serves my model, or something like that… And now I want to run the same type of environment, but I want to deploy it in my AWS, or something. How does that work, and how does Brev factor into that, I guess?

Yeah. So it’s really funny, right? Think about how many times you have to do the same redundant work, and none of it is the thing you’re actually trying to build. You go and install everything so you can work on the dev environment; then you go and install everything so you can run your tests, if you have a pipeline. Then you go and install everything so that you can deploy everything in production. And theoretically, we’ve already done this.

I’m good friends with the team at banana.dev; we love working together, and I think our products are very synergistic. And something that we’re working on is: if somebody has a Brev environment, they should be able to click a button and it deploys on Banana. It’s a serverless GPU for production.

A helpful way to look at this is there are two types of compute: interactive compute and non-interactive compute. If you’re deployed in production, that’s non-interactive compute, right? Your API is up and running, you don’t need an active shell into it – in fact, that might even be an anti-pattern. If you have interactive compute, you’re actively developing, you’re opening the terminal, you’re running things, seeing it live, and making iterations to it. And so if you look at Brev and Banana, for example, as interactive and non-interactive compute that work really well together – you can take your interactive dev environment on Brev, get things running, and once you’re done, press a button and move it to Banana, so that it’s non-interactive, it’s not costing you as much, and it’s just on a serverless model.

[22:19] Then if you have a server error, if you get some sort of Sentry log on your Banana server, then you should be able to click a button and open it up in Brev, in interactive compute, so you can figure out what’s wrong, fix it, and send it back. And if you’re able to have that kind of workflow, you’re taking away a lot of this like DevOps overhead. Because at the end of the day, we’re just trying to build. I think that’s where I see the future kind of heading, is how smooth can we kind of move between the states that the user essentially wants.

Yeah, I think that’s a really insightful sort of direction, because I see this efficiency gain with Brev, and sharing environments for that interactive compute; that’s really important. But then if you can make that connection to the sort of production deploy - that’s huge, because now… Still, so much of the time there’s this friction that you talked about, where even if I’m developing against like a cloud instance, there’s some sort of like non-negligible labor costs of like me going through the headache of going and deploying something to production… And it’s still not the same, or there’s some issue, like you were talking about, when things go wrong, and there’s debugging.

So if you can replicate that environment both in an interactive and a non-interactive way, I personally think that’s really, really powerful and interesting. I think, actually - just a note, I think we’re scheduled to have an interview with Banana upcoming. So listeners, watch out for that one. I’m excited about that.

Really exciting product, and I think a really, really exciting space – just the ML/AIOps dev tooling coming out right now. Yeah, definitely really excited. And to kind of take what you said even a step further – you might be reading a research paper and you see a Google Colab notebook that has a model, and you want to go take it, fine-tune it for whatever you want to do with it, and then go ahead and deploy it… I mean, Brev is kind of in the center of interactive compute. If we have an import tool for Colab notebooks, where you can kind of import one into Brev, change the compute that you want, get something more powerful, fine-tune it the way you’d like, maybe even use a template for API frameworks, so you get Flask APIs set up, ready for you… You can kind of continue to modify from there, and then hit the production button and go to Banana. That’s kind of the dream workflow I see, where behind the scenes we’re always finding the cheapest GPU for you to do that, you’re able to get as powerful a compute as you need… It’s really simple to go from Colab to something scaffolded, with APIs that are ready for you to deploy to production… And again, we just get to focus on the fun part.

Alright, Nader, so I’m looking through the templates that you have at brev.dev, just to give people a sense of some of the things you can spin up an environment for quickly, and do it right away… I see a couple different ones – Stable Diffusion, Stable Diffusion version two, DreamBooth, TensorFlow, Whisper, CLIP image captioning… All sorts of different things. But then there are environments that you have templated out for things like Go, and Rust, and other environments that people might be interested in.

You already alluded to the fact that you’re a quickly moving small team, and I’m wondering, out of all the sort of areas that you could focus on, it’s probably – one of the things I would guess is it’s maybe difficult to position this for a certain group of people that really need it, because it’s kind of a common need across all dev environments.

[26:00] It seems like you’ve kind of brought some focus to the area of GPUs, and data science, AI type of workflows specifically… Do you think that’s mostly been driven by this sort of GPU element and the complexity of those environments? Or how do you think about like where to head from here in terms of the verticals and the industries and the specific dev workflows that you’re thinking about and you’re focusing on? How is that working, and what are you hearing from users in that respect?

Yeah, so it’s kind of funny… Before we leaned into the AI/ML workflows pretty heavily - you’re right, dev environments… Who is your target audience? People who code, right? And that’s kind of a very naive answer, for a very early stage of the product.

I think what we learned is you really want to be able to solve someone’s problem as quickly and acutely as possible, and then get out of the way. And I think that’s been a big change in direction for us. Even if you look at the way that the product onboards – you need to have a CLI so you can run brev open. So we used to say, “Oh, well, when you make an account, we’ll tell you to install the CLI right there.” But the user doesn’t know yet why they want to install the CLI. They haven’t yet expressed a desire to open their dev environment. So the way that we changed it is it’s just focused on getting your environment created. When your environment is created, then you see an Open tab. When you click the Open tab, it tells you “Install the CLI”, because you haven’t yet. So the user says “I want a thing”, and then we can kind of show, and not really impose.

And so when we were thinking about like broadly dev environments, when we initially started building brev, it felt like we – you know, someone says, “Hey, my local environment is not working.” And so we’d say, “Great, we can make one for you in the cloud.” But now we’re not just introducing Brev as a tool to solve their environment issues, we’re also introducing the cloud. It’s a separate thing. And so in terms of like acutely solving the problem, we’re not doing that; we’re introducing the element of the cloud, which they have not yet expressed a desire for.

So what’s great about the GPU use cases is we’re meeting people where they are, which is in the cloud, right? They’re saying “I am trying to access an A100 that does not exist on my MacBook Pro, and I want to get this running”, right? So the cloud intention is coming from them, not us. We’re not kind of sneakily trying to introduce something else so that we can get them to use Brev. It’s just meeting the user where they are. And the issue with using a GPU in the cloud is that they’re really expensive, and they’re really painful to get set up, and then, of course, all the dev environment issues.
So that’s been a really great focus for us, and we’re leaning in as hard as possible to the MLOps tooling; the dev environment issues are much more severe here, it makes a lot more sense for – there’s a lot more room for us to delight users by making a much better experience.

And going back to that container strategy – if we can move between different clouds, we can also move to one more local cloud, which is your actual computer. So I think the way that we want to approach broader dev environments is: you should be able to run something on your computer and then say, “I now have a need for the cloud. I want double the RAM. I want a GPU. I want something.” So you can start local, and then move it to a cloud. And that’s the way that I think we can ultimately broaden from ML dev environments. But this is a huge focus for us right now, and what I really want to do is, rather than think about so many of those other use cases – how do we get really tight integration with Banana? How do we get a really easy way to go from a Colab notebook to something that you’re now fine-tuning on a much more powerful GPU? How do we find an interface with other clouds? That’s where we’re focused right now, and there’s a lot of work to do here.

Yeah. Clearly, you have such a focus on accessibility, in terms of the user experience… And you have a bunch of different ways of connecting in… I use VS Code, so I went and looked at that. And you have the guides that address different common models that we would be interested in, that are really popular right now, like Stable Diffusion… And you talk about the different clouds… Could you pick one, whatever one you want, and just kind of walk us verbally through it? I know it’s audio only, but if you can walk us verbally through what the workflow looks like, and what people might expect, just to give a sense – it looks really good, but in my head I’m trying to put it all together, end to end… And I bet you’ve done this before, so I’m hoping you can just give us a little narrative that’s easy to follow on that.

[30:06] Yeah. So let’s say you want to run DreamBooth and make a bunch of cool photos of you and your friends. Any environment, you can actually make a URL to easily share it, so we’ve made a URL template for running DreamBooth – all you have to do is click a link. So you click the link from our blog post, or from the guide in our docs, and it will take you to the dev environment page with everything filled out. It has the GPU that you’ll need selected, it has the volume – the amount of hard drive that you need – the repos that you need, the setup scripts… You don’t have to worry about anything; just pretty much hit the Create button. When you do that, what we’re essentially doing behind the scenes is spinning up the GPU that you need and installing all the dependencies that are needed for it. When you’re done with that, with the Brev CLI, run brev open and the name of your environment, and it’ll open up VS Code to that environment, and the readme will say “Upload 10 photos of yourself in this folder”, and we kind of show you how to have it train. And that’s it.

So the idea is, in like four minutes you have a GPU running everything, and all you have to do really is focus on the fine-tuning that you kind of want to focus on.
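(A hedged sketch of what that session might look like once the templated environment exists; the environment name, folder, and training command are hypothetical placeholders, not from the template itself.)

```bash
# Hypothetical DreamBooth-template session, following the steps above.
brev open dreambooth    # opens VS Code attached to the GPU instance

# Then, inside the remote environment, per the template's readme:
#   1. upload ~10 photos of yourself into the designated folder, e.g.
#      cp ~/photos/*.jpg ./instance_images/
#   2. kick off fine-tuning with whatever script the template provides, e.g.
#      python train_dreambooth.py --instance_data_dir ./instance_images
```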

That sounds great.

Yeah. Could you share a little bit also about – because part of this, I think, is like I’m doing a specific thing in my environment, that I’ve created, which is special to me… But now, somehow, I need to share that with Chris. Right? How would that work out in this type of scenario?

Yeah. So there are a few things that Brev does behind the scenes. There are things that we intend for you to share, but also – every environment that I have, I have my own Git aliases. When I type c, that’s a function for git commit. s is git status… There’s a bunch of things that I just expect, that I have set up in my .zshrc. So you can set up your own developer preferences, and every time you create a dev environment, we take whatever was shared in the template, and then we add all of your settings on top of it.
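(The kind of personal dotfile being described might look like this hypothetical ~/.zshrc fragment; the exact function and alias are illustrative.)

```bash
# Hypothetical ~/.zshrc fragment along the lines Nader describes; Brev
# layers personal settings like these on top of whatever the shared
# template sets up.
c() { git commit -m "$*"; }   # typing `c fixed the bug` commits with that message
alias s='git status'          # `s` shows git status
```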

There’s also - HashiCorp Vault is hooked up by default into every instance, so you have like an encrypted secrets manager. So I have my AWS credentials encrypted, and it stays in my AWS account. And every time I create a dev environment, if my co-founder shares one with me, or someone on the team gives me their environment, I reliably know that my terminal settings are all gonna be loaded in, my AWS credentials will be loaded in… But also, there’s scopes to the encrypted secrets manager. So you can say, like, if someone shares this environment, make sure that these secrets are added into the environment. Like an environment-scoped setting.

So it’s up to you to decide what you want to be shared. We’re not going to share secrets that you don’t want, you’re not going to share your AWS credentials if you don’t want it. You’re never sharing the machine with somebody, we’re just setting up one for them, and setting it up kind of identically.

Yeah, yeah, which I guess gets to that sort of idea of templates, right? You’re creating a template which you intend another person to use, but maybe in a slightly different way than you used it, right?

Yeah, exactly.

So I’m gonna throw out kind of a random question… And it’s okay if you haven’t gone here; I just want to ask, have you ever thought about having one that is essentially – you know, we see these services that companies will run, and then they’ll end up deploying it kind of on a private server or something, so that they can go into a secure environment, that kind of thing, as a standalone, instead of being web accessible… Any thought toward doing something like that, where you could use it in a non-public environment?

Yeah, so that’s how larger teams will use Brev. So at a minimum, the instances themselves stay in your AWS account, but we can also deploy the entire control plane behind your VPC, so nothing’s really exposed out. But that’s kind of more on the enterprise route. Individual developers don’t really have this need, I don’t think.

Totally. And it’s the enterprise route that I was kind of asking about; you’ll have large organizations that have their own GPUs and stuff like that, but they’re still just GPUs, and so I was wondering whether – you know, moving into that, if the control plane can say, “Okay, I’m gonna look up what you have in your datacenter. Here’s your workforce”, and you kind of have your own environment. So that’s something you clearly have been thinking about doing.

[34:06] Yeah, and it’s something that we actively support, and we have teams that we’re talking with that are going this route. You get all the same benefits where you can still scale down your instances, scale up your instances… Obviously, you might not benefit from some of the cheaper GPUs in the other clouds, because they’re not behind the VPC, but if you’re on like AWS, or GCP, we can absolutely do that.

We know from an individual user’s perspective, if you’re going to pay for an extra eight hours by accident – because we always forget to shut our instances off… If you’re a team of 180 engineers, that cost is just amplified. And I saw that at Workday when I worked there as well. So definitely, we’ve had some of those learnings brought into the product as well. So yeah.

Yeah, that’s awesome. I’m just thinking, looking back at my own sort of progression, and trying to run some of these things myself… I mean, the tooling has improved, right? But the environments were still difficult, right? So either I had the consumer GPU card that’s in my workstation here, or I’m trying to use one in the cloud, and the GitHub repos there, and the tooling… Like, I can understand what’s happening in the code, right? That has gotten much easier. I can deploy Stable Diffusion in a very small number of lines, right? But the environment is still quite difficult. So I think this is really exciting and encouraging. I’m wondering, what encourages you, and what are you thinking about looking towards the future? What excites you about this space?

Oh, man, what excites me about this space… I don’t even know how to say it… There’s just so much to focus on in every realm, right? Within interactive and non-interactive compute - like, I’ve talked to Eric at Banana about just how both of our teams are about the same size, and we’re both 100% focused in our space, and it just feels like there’s an infinite amount, just looking down. So looking up, there’s even more.

You guys mentioned your last episode was on ChatGPT, and I think AI is really exciting – not so much in that it’s gonna replace us all, but in that it kind of lets us be more creative directors of our own lives. Think about any creative process as having some generative aspect, and then some malleable aspect… If someone’s working with clay, they throw a bunch of clay down – that’s the generative part. And then they form it nicely into the bowl or cup that they want – that’s the morphing part. So there are always those two aspects. And when AI is able to help us really push on that generative side, and we’re still in control of the output, we’re still the ones morphing the final product – I view it as an extremely empowering thing. So it’s been really exciting, with all the development in the space… I’m really bought into the idea that if you make things a little bit easier, you can just dramatically increase the affordance for things to happen. So as much as possible, how do we get rid of machine problems, so that people who want to build really exciting things – the next new affordances, the new models – are essentially able to do that with as little friction as possible? And that’s not just within the fine-tuning and resource constraints that they might have, but also in terms of moving it and shipping it and delivering it.

So on one hand, the things that are being built are very exciting. But on the other, the energy in the space is huge, right? I think everyone has been so inspired by what’s been done recently with ChatGPT and the recent AI models that are out that it’s just galvanizing a lot of people to build a lot of really cool things. Everyone I know, especially here in San Francisco, founder or not – it’s funny seeing founders who have nothing to do with AI thinking about AI side projects, right? That’s galvanizing people. And so everyone is really excited about building this stuff right now, and I just hope we don’t lose that energy, and that we make things as frictionless as possible as we do it. Even I’m guilty – I have a little Saturday project I’m throwing together with some generative AI stuff. There’s a lot of really cool stuff happening.

That’s great. Yeah. Well, thank you, Nader, and your team, for helping us reduce some of that friction and get people’s ideas out there. This is super-exciting. And speaking of friction – one of the things that you mentioned prior to our conversation is that you’ll spin up a coupon code for our listeners, for some compute on brev.dev, removing even some of those barriers for our listeners as they’re getting started. So we’ll make sure to include that in our show notes, so please take a look at that. Get on brev.dev. I did it. It only takes a couple minutes. It’s awesome.

So yeah, thanks, Nader, for coming on the show and telling us about what you’re doing.

Absolutely. Thank you guys so much for having me. I really love the conversation. And by the way, Chris, you mentioned Lockheed Martin earlier… My mom was a nuclear engineer and worked at Lockheed Martin as well, so that’s awesome.

Oh, thanks for telling me that. I’m definitely in good company, then… Awesome. Cool.

Alright. Thanks. See you guys!

Our transcripts are open source on GitHub. Improvements are welcome. 💚
