Practical AI – Episode #249

The state of open source AI

with Casper da Costa-Luis, a contributor to the State of Open Source AI Book

The new open source AI book from PremAI starts with “As a data scientist/ML engineer/developer with a 9 to 5 job, it’s difficult to keep track of all the innovations.” We couldn’t agree more, and we are so happy that this week’s guest, Casper, along with other contributors, has created this resource for practitioners.

During the episode, we cover the key categories to think about as you try to navigate the open source AI ecosystem, and Casper gives his thoughts on fine-tuning, vector DBs & more.

Sponsors

Changelog News – A podcast+newsletter combo that’s brief, entertaining & always on-point. Subscribe today.

Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

Fly.io – The home of Changelog.com. Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Chapters

1 00:00 Welcome to Practical AI
2 00:43 Advent of Gen AI
3 02:03 PremAI open source AI
4 05:33 How to categorize open source AI
5 08:44 Different types of open
6 12:13 Aligned vs unaligned models
7 13:47 The AI ecosystem
8 16:06 Value of fine-tuning
9 18:58 Large vision models
10 20:05 The major model families
11 22:51 Community around the book
12 25:00 Where to find the book
13 26:36 Sponsor: Changelog News
14 28:20 How to use the book
15 31:08 Desktop apps in open source AI
16 35:12 Reflecting on the book
17 36:21 Hot takes
18 37:25 Encouraging open source contribution
19 40:18 Closing thoughts
20 41:41 Outro

Transcript

Welcome to another episode of Practical AI. This is Daniel Whitenack. I am the founder and CEO of Prediction Guard, and I’m joined as always by my co-host, Chris Benson. How are you doing, Chris?

Doing great. How’s it going today?

It is going awesome. I don’t know if you’ve heard, but it is the advent of Gen AI… That is, I’m participating in this advent of Gen AI hackathon with Intel, so people are getting hands-on with a bunch of open source models, and different hardware… So I’ve been in Slack all day, answering questions and seeing cool prompts, and seeing cool outputs, so it’s just been a ton of fun.

What’s the most interesting thing in terms of what you’ve seen so far? …I’m just curious, before we go on.

The first challenge - so we’re only a day in; the first challenge was to generate a series of images that go in a sequence, like a comic strip, that tell a narrative… But there were some really amazing ones; one showed a child growing up and then having a son of his own, and the images were really compelling, and the narrative was really interesting… Yeah, just very, very creative output is something that I’ve noticed. And today is all about chat, so we’re gonna see some chatbots popping up in the hackathon, and really looking forward to that.

Sounds like fun.

Yeah. And the hackathon is all centered around these openly-accessible, or open source, or permissively-licensed generative AI models… I think it’s really fitting, because we have with us Casper, who is a longtime open source enthusiast, but also one of the contributors to the recently published State of Open Source AI book from Prem. So welcome, Casper. It’s great to have you with us.

Hello. Yeah, great to be here.

Yeah. Well, I mentioned you’re a longtime open source enthusiast… How did you get enthused about open source AI specifically? So what was your own journey into open source AI, maybe leading up to this book and what it’s become?

I mean, that’s a good question. I’ve been around for long enough that AI didn’t really exist as a thing back when I got into open source… And it was honestly just purely a hobby. I never even considered it as a career. This was - it must have been 15 years ago or something, and I… In fact, I felt ashamed and embarrassed every time I was working in open source, because it felt like I should have been spending that time working on an actual career. It felt like it was just a toy.

I had a very long commute to get between my home and workplace on a train, and I was just coding away on my phone… I actually installed Debian, side-loaded on my Android, and… Yeah, that got me hooked on open source purely as a hobby. And - I mean, if you contribute enough, and you’re happy making mistakes in public, eventually you build something that loads of people start using, and it spirals out of control. Before you know it, it suddenly turns into a career.

So I probably entered into this whole space in an unconventional way. I didn’t intend to make things that would become famous, but they just wound up becoming famous… Which is quite pleasant. I mean, there’s pros and cons, because also things that become successful aren’t necessarily things that you expect to become successful. You can put a lot of effort into something and the world determines it’s not really of much value, and so they don’t use it. And something you barely put much effort into could explode. So that was my sort of background.

I’ve got kind of an academic slant as well, so I did a lot of machine vision type things in university. I didn’t really want to shoehorn myself into any particular area, though… And also, I didn’t want to do pure Academia. I’d much prefer industry, and having stakeholders, and actual products that you build at the end of the day. I mean, there are pros and cons to both, definitely, but… Yeah, so that’s obviously how I wound up, like the rest of the industrial world, seemingly moving towards AI, because that’s a buzzword, and that’s what everyone wants you to work on, effectively.

So yeah, what started off as initially being machine vision, pre-machine learning, became machine learning type machine vision stuff. And now of course LLMs are all the rage… So that’s why we thought of doing a bit of extra research and trying to consolidate all of the noise out there – the various different blog posts, people effectively shouting into the ether – and we thought we might as well write a book… Release some of our research into the wild, get some feedback on that before we actually started building more things.

Yeah, that’s awesome. And you even allude to this in the sort of intro to the book, this sort of fast-paced nature of the field, and a lot of people feeling sort of FOMO… How do I even categorize all of the things that are happening in open source AI? So maybe one general question about the structure of this - Chris and I have worked through some of these categories in various episodes on the podcast, but sometimes it is hard to sort of think about how do you categorize all the things that are happening in open source AI… Because they do go beyond just models, but they include models, and a lot of things are sort of interconnected… So how did you kind of – was it organic in how the structure of this book came together? Or how did you come up with the major categories in your mind for what’s going on in open source AI?

[06:10] And that’s what I was really wondering as well. You literally said, Daniel, exactly what was in my head just now.

Yeah, we’re in tune.

Yeah, I mean, it is a big ask, because - I mean, my philosophy in general is that the Universe exists as a cohesive whole, and we split it up into different subjects like physics and chemistry and maths just as a way for humans to actually parse everything that exists in small little bite-sized chunks… But they’re not really independent subjects. And the same goes with AI. I mean, there’s so many different categories of AI. The nice thing about working in the open source space is that there’s lots of different people you can have conversations with, get some feedback… Everyone kind of chipped in their own ideas about how to, let’s say, break down a book into different chapters.

Ultimately, I think what made the most sense is that it doesn’t matter too much what those chapter titles are. It’s more about the content within them being, let’s say, not too repetitive, and actually distilling the ideas that people are talking about. And if you can do that really well, it maybe almost doesn’t matter quite how you subcategorize things. But I would say Filippo Pedrazzini is probably the one who came up with the actual final let’s say 10 chapters, but then past that, in terms of actually writing those chapters… Probably about a dozen people have actually worked on them… Which is, again, really nice that you can do this in the open source space. No single person is really the author of this book.

It seemed fairly obvious to me based on my own particular passion and research that licensing should definitely be a chapter, and that’s something that developers often neglect, because it’s just sort of outside their field of interest and expertise, and it’s just a bit of red tape that maybe they have to be aware of in the back of their mind… I basically wrote a chapter on licenses, which I think everyone else was happy about, but nobody else wanted to do it.

But sure, I mean, it was just effectively topics that we felt are big, major things that there’s a lot of confusion over; maybe we ourselves were confused about it as well… So like evaluation and datasets - what’s the best way to evaluate a model anyway? So that seemed like a big topic, let’s make that a chapter.

So it seemed fairly organic, coming up with these titles. And of course, as we were writing this - again, it was all fully open source, even the whole writing process - we thought maybe we should split up the chapters. So we split off models into two chapters, let’s say; one for specifically unaligned models, versus aligned models… So yeah, it was an iterative process.

Yeah. On that front, I definitely hear the passion coming through for that sort of licensing element of that, and I see that upfront in the book… I’m also very, very much – like, we’ve mentioned on the podcast multiple times that people need to be reviewing these things, especially as they see whatever 400,000 models on Hugging Face, and parse through these things… But could you give us maybe the pitch for engineering teams or tech teams that are considering open models, but might not be aware of the various flavors of openness that are occurring within “open source AI”? Could you just give us a little bit of the sense of maybe why people should care about that, and maybe just at a high level what are some of these major flavors that you see going on in terms of openness and access?

Right. I mean, I suppose first I should have a disclaimer, which is the quiet part that nobody usually says, which is almost a counter argument - it might not matter, because in practice nobody’s going to sue you if you do something illegal, unless you’re fairly big and famous. That’s just the harsh truth, and it’s very frustrating that laws and enforcement tend to be two separate things. And there is a precedent in law that you’re not meant to create a law unless you know definitely you can enforce it. So to a large extent, a lot of these licenses out there are [unintelligible 00:10:04.06] nimble in that regard.

[10:08] The other thing is a lot of these licenses are not actually, let’s say, tested in court. They’re not actually formally approved by any government or legal process… So just writing something in a license doesn’t necessarily make it legally binding. You should probably be aware of recent developments. In the EU, for example, they’ve proposed two new laws, the CRA and the PLD – two new acts, I should say – that are effectively saying the no warranty clause in all of these open source licenses might be illegal if you are in any way benefiting, let’s say, monetarily, even if it’s indirectly… So even if you’re releasing open source things purely for advertising purposes, and not directly gaining any money from them, the no warranty clause could still be ignored.

So yeah, there’s interesting stuff in that space… But I would say as a developer, the things that you should be aware of when it comes to model openness is that there’s a difference between weights, training data, and output. Those are the three main categories, really. So licenses usually make a distinction – it’s not licenses, this is more about the source. So are the model weights available? That’s often the only thing that developers care about in the first instance, because that means they can download things and just play with it. But if you actually care about explainability, or in any way alignment in order to figure out how you might be able to make a model aligned or unaligned, or whatever you want to do with it, you probably do need to know a bit about the training data… So is the training data at least described, if not available? And when I say described, as in more than just a couple of sentences saying how the data was obtained, but actual full references, and things… So a lot of models are not actually open when it comes to the training data.

And then, of course, the final thing is the licensing around the outputs of the model. Do you really own it? Are you allowed to use it for commercial purposes? And even if you are, it’s highly dependent on the training data itself, because if the training data is not permissively-licensed, then technically, you shouldn’t really have much permission to use the output either. So I think even developers are kind of confused about the ethics around the permissions… So certainly, legally, we’re super-confused as well.

I have two questions for you as follow-up. They’re unrelated, but I’m gonna go ahead and throw both of them out. Number one, the quick one, I think, is could you define what an aligned model versus an unaligned model is, just to compare those two for those who haven’t heard those phrases? And then I’ll go ahead, just as you finish that, and say what’s the reason that – I noticed that licensing is addressed at the very top of the book. Is that framing the way you would look at the rest of the book, or is that more just happenstance that it came there? I was just wondering how that fits into the larger story you’re telling.

Yeah, so for those who don’t know, unaligned models - it’s effectively if you train a model on a bunch of data, it is by default considered unaligned. But in the interest of safety, what most of the famous models that you’ve heard of do - like ChatGPT, for example - is add safeguards to ensure that the model doesn’t really output sensitive topics, issues, anything illegal… It’s still probably capable of outputting something quite bad, but there are safeguards. And the process of adding safeguards to a model is called aligning a model… As in aligning with good ethics; I suppose that’s the implicit…

Gotcha. Thank you very much. And then I was just wondering - like I said, the positioning of licensing at the front… Is that relevant, or is that just happenstance?

We did sort of think of an order of chapters, let’s say… And licensing just seemed like a good introduction, let’s say, because it’s before you get into the meat and the details of actual implementations, and where you can download things, and where the research is going, let’s say.

[13:47] Well, Casper, you were just describing the framing of the book, and also some of these concerns around licensing… I’m wondering if we could take a little bit of a step back as well and think about what are some of the main components of the open source AI ecosystem. The book details all of these, but what are some of the big major components of the AI ecosystem, maybe beyond models? Because people obviously have maybe thought about or heard of generative AI models, or LLMs, or text-to-image models… But there’s a lot sort of around the periphery of those models that make AI applications work, or be able to run in a company, or in your application, or whatever you’re building… So could you describe maybe a few of these things that are either orbiting around the models, if you view it that way, or part of this ecosystem of open source AI?

Sure. I mean, there are huge issues, I would say, regarding, let’s say, performance per watt – effectively, electrical watt. There’s a lot of development in the hardware space, and we have the new Mac M1 and M2 chips, which might actually mean you can fairly easily do some fine-tuning, or at least inference, on a humble laptop, without ever needing CUDA. It seems like there are a lot of shifts and paradigm changes when it comes to the actual engineering implementations. WebGPU is a big upcoming thing, which - I mean, it has technically been going on for a decade or more, but it might actually have reached the point where possibly we can just write code once and it just works on all operating systems, on your phone; you can get an LLM just working, wherever.
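To make that concrete, here’s a minimal sketch of CUDA-free inference, assuming PyTorch with the Apple Silicon (MPS) backend and Hugging Face’s transformers library; the gpt2 model name is just a small placeholder, not a recommendation from the episode:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prefer Apple Silicon's MPS backend when available, then CUDA, then plain CPU.
if torch.backends.mps.is_available():
    device = "mps"
elif torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

model_name = "gpt2"  # placeholder: any small open model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

inputs = tokenizer("Open source AI is", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```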

But yes, I mean, there’s effectively a lot of MLOps style problems. It’s one thing to have a theory of how to actually create an LLM, but quite another thing to actually train a thing, fine-tune it, or deploy it in a real-world application. So there are a lot of competing, let’s say, software development toolkits, desktop applications… And I don’t think anyone’s really settled on one that’s conclusively better than anything else. And really, based on your individual use cases, you have to do an awful lot of market research just to find something that’s suited to your use case.

I asked this because we’ve had a number of discussions on the show about sort of training, fine-tuning, and then this sort of prompt or retrieval based methodologies. So from your perspective as someone that’s kind of taken survey of the open source AI ecosystem, and is operating within it, and building things, what is your vision for where things are headed in terms of more sort of fine-tunes getting easier, and fine-tunes being everywhere, or pre-trained models getting better and people just sort of implementing fancy prompting or retrieval-based methods on top of those? Do you have any opinion on that sort of development? I know it’s something that’s on people’s mind, because they’re maybe thinking about “Oh, this is harder to fine-tune… But is it worth it? Because I’m getting maybe not ideal results with my prompting.”

Yeah, no, it makes sense. I would say, basically, if you’re not doing some form of fine-tuning, you’re not producing anything of commercial value. Effectively, it’s very much like hiring an intelligent human being to work for you, without them having any particular expertise, and not even knowing what your company does. That’s what a pre-trained model is, effectively. So you do need to fine-tune these things, or add something equivalent to fine-tuning, let’s say.

In terms of things that actually predate LLMs, I think there’s a lot of stuff that is very useful, and maybe even far more explainable, that people seem to be discounting just because it’s easy to get some result out of an LLM just by prompting it. So people view it as good enough, and they start using it, even though it’s maybe not safe. So one thing I would really recommend people look at is embeddings. Just by doing a simple vector comparison on your embeddings, you can find related documents. You don’t really need an LLM to drive that, because instead of you explicitly making an embedding of your query – converting your query into a vector, and then comparing it to other vectors in your database that correspond to, let’s say, documents or paragraphs that you’re trying to search through – your LLM is automatically doing that entire process… And it might make mistakes while it does that. It’s going to paraphrase things, which it might get wrong, because it can’t even do simple, basic mathematics, and it doesn’t understand logic.
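Here’s a minimal sketch of that embeddings idea, assuming the sentence-transformers package; the model name and the toy documents are placeholders:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# A small, commonly used embedding model; any embedding model would do.
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "How to fine-tune a language model on your own data",
    "A recipe for sourdough bread",
    "Comparing licenses for open source model weights",
]
# Normalized vectors make cosine similarity a plain dot product.
doc_vecs = model.encode(documents, normalize_embeddings=True)

query_vec = model.encode("which license should I pick?", normalize_embeddings=True)
scores = doc_vecs @ query_vec

best = int(np.argmax(scores))
print(f"{scores[best]:.2f}  {documents[best]}")
```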

[18:27] So yeah, and whenever it comes to things like let’s say medical imaging, where there’s a lot of interest in how can we use AI to improve this, people tend to get frustrated with how slow the uptake of AI is. But there’s a reason for that, which is explainability is important, right?

So the way I see things going is - yes, far more fine-tuning, more retrieval-augmented generation type stuff, so RAG stuff, and then also probably push into explainability… I don’t really think there’s much explainability in LLMs right now, in general.

Everyone’s been so focused on LLMs, but large vision models are one of the newer things on the rise… What is your take on large vision models in the future, and how they start integrating in? [unintelligible 00:19:08.26] talking about some of them now… I would love your take on it.

Sure. I mean, we didn’t quite get to covering this in the book. I mean, that’s how fast-paced things are… So multimodal things are super-interesting. To me, my feeling is that it’s effectively gluing together existing models into pipelines. And it hasn’t historically been something that I was that interested in, because that’s more an application, and it’s not so much something you need to research per se. It’s very similar to how the OpenAI people were very surprised that ChatGPT exploded in popularity, even though technically the technology was quite old. You know, you lower the entry barrier a little bit, and then everyone actually starts using it because they can. To me, the multimodal type stuff is similar. It could result in really innovative new companies popping up, and new solutions that are actually usable by the general public, but in terms of the underlying technology, it doesn’t seem particularly novel to me.

As you looked at the landscape of models itself, and the licensing of those models, the support for those models, the underlying MLOps sort of infrastructure, the support for underlying model optimization toolkits and that sort of thing… Some people out there might hear all of these words, like “Oh, there’s these LLaMA 2 models, and there’s now Mistral, and then there’s now [unintelligible 00:20:30.06] and all of these…” As you were going through and researching the book, and also doing that as an open source community, can you orient people at all in terms of the major model families? You already distinguished between aligned and unaligned models… Are there any categories within the models that you looked at that you think it would be good for people to have in their mind, in terms of “Hey, I have this application, or I have this idea for working on this. I’m listening to Casper, I want to maybe fine-tune a model. I’ve got some cool data that I can work with”? Where might be a sort of well-supported or reasonable place for people to start, in terms of open LLMs, or open text-to-image models, if you also want to mention those?

Sure. Yeah, I mean, there’s just a new model basically being proposed every day – and often it’s a small incremental improvement over a previous model. So in terms of actually trying to compare them at a theoretical level, without looking at their results, there isn’t really much to talk about in terms of large model families. There might be an extra type of layer that has been added to a model in order to give it a new name, let’s say… Nothing particularly stands out there. I mean, we do have a chapter on models where we try and address some of the more popular models over time, the proprietary ones, and then the open source ones. But I would say nothing particularly stood out to me out there.

[22:02] I suppose the more interesting thing, in terms of actually implementing something for your own particular use case, is starting with a base model that has pretty good performance on, presumably, other people’s data that looks as close as possible to the data that you actually personally care about, so you don’t have to wait too long when then fine-tuning it on your own data. For that, I think the most important thing is to take a look at the most up-to-date leaderboards. And there are quite a few different leaderboards out there. We do also have a chapter on that. And that was, interestingly, also a nightmare to keep up to date, because the leaderboards themselves are also changing regularly, and new leaderboards are being proposed for different things… Take a look at the leaderboard, pick the best-performing model there, and then start doing some fine-tuning - that would be my MO.
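As a rough illustration of that workflow – pick a strong base model, then fine-tune it on your own data – here’s a hedged sketch using the transformers, datasets, and peft libraries. The base model name and the one-example dataset are placeholders, and a real run would need quantization, more data, and tuned hyperparameters:

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"  # placeholder: whatever tops the leaderboard you trust
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)  # in practice, add quantization to fit in memory

# LoRA trains a small set of adapter weights instead of the full model,
# which is what makes fine-tuning feasible on modest hardware.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Stand-in for your own domain-specific data.
train = Dataset.from_dict({"text": ["An example in the style/domain you care about."]})
train = train.map(lambda ex: tokenizer(ex["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # labels = inputs
)
trainer.train()
```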

This kind of gets to one of the natural questions that might come up with a book on this topic, which is things are evolving so quickly… And you mentioned the strategy with this book being to have the book be open source, have multiple contributors… And I’m assuming part of that is also with a goal for it to be updated over time, and be an active resource. How have you seen that start to work out in practice? And what is your hope for that sort of community around the book, or contributors around the book to look like going into the future?

Sure, yeah. I mean, for the evaluation dataset thing, we already have more than a dozen leaderboards. Just the names of the leaderboards, and links to them, and then what benchmarks they actually implicitly include… And yeah, we have comments at the bottom of each chapter, which are driven by GitHub, effectively, powered by [unintelligible 00:23:45.13] which is this integration tool helper… So you don’t need to maintain a separate comments platform, let’s say. And it also encourages people to open issues, open pull requests. If we’ve made any mistake, or something is out of date in the book, we definitely encourage people to fix things or complain about things… Which I suppose is also good from the perspective that nobody can sue you for writing something wrong, because in the first instance, what they really should do is just correct it, right? You can’t really open a court case. And for that reason, I think it’s also lowering the entry barrier for people to contribute in the first place. They don’t have to worry about what they write, and whether or not people will disagree… Because if they disagree, they can fix it. They can start a discussion. Nobody’s going to immediately file a lawsuit.

And yeah, so we’ve had quite a lot of interesting discussions already on the individual chapters. The other thing that we highlight is that as soon as you make a contribution to anything, your name is automatically displayed at the bottom of the individual chapter, as well as the list of contributors in the front.

So yeah, it’s a good way to get your name as a co-author, in a way, of a book. I mean, it’s a 21st century book as well, so it lives fully online. Everything that is committed to the repository is automatically built and published immediately.

And before we get too much further, some people in the audience might be wondering, like – I mentioned the name of the book, and of course, you can find it by googling it, I’m sure… But what is the best place to find the book? And then also, as a contributor, you mentioned the links at the bottom of the pages, but I’m assuming there’s a GitHub associated with the book. Do you just want to mention a couple ways for people to find it?

Sure. I mean, the easiest is probably to go to book.premai.io. Yeah, apologies that there’s an AI and an IO. It seems to be [unintelligible 00:25:31.23] But yeah, so book.premai.io. Or - I mean, you can also just probably google PremAI and you can find our GitHub, which is also github.com/premai-io. That’s a thing.

All the AIs and all the IOs.

Exactly. We have quite a few repositories that – I mean, some of them are just archived right now, because we’re constantly running different experiments, changing the entire architecture of the things that we’re building… So effectively, our strategy was to first do a lot of research. We didn’t mind publishing this for the general public to have a look at, so we released it in a book… And now we’re working on actually reading our own book, and maybe taking some of its advice and building things… And we have this very much fast-paced startup style, “Let’s build lots of different things, try lots of different experiments, it’s fine if we throw things away.”

So Casper, I want to actually do a quick follow-up on something you were just saying as we were going into the break… And that was talking about how now you’re going to start going through the book yourselves, and taking the advice… And that brings up a business-oriented question I wanted to ask about it. Say you go out today, you’ve listened to the podcast, downloaded the book, and there’s so much great information in all of these chapters. And the comparisons, and the different options that each chapter addresses – what’s good, what’s bad, and things like that… If someone’s just getting going, or maybe they’re starting a new project and they’re using your book as a primary source to help them make their initial evaluations, how best to use the book? …because there’s a lot of material in here; all these different categories. They need to come up with their pipelines, and go back to the leaderboards, and select the models and architectures they’re interested in, and all that… If you were looking at this initially with a new set of eyes, but also having the insight of having been one of the authors and editors of this, how would you recommend somebody best be productive as quickly as possible, and get all their questions sorted? How would they go about that process?

Right. I mean, that’s not really a question I was thinking of addressing in writing the book… So I suppose what you’re referring to is a case where someone has a particular problem that they want to solve?

An actual, let’s say, business model or target audience. So if there’s actually something that you’re trying to solve, the book hasn’t really been written from that perspective. It’s more for a student who wants to learn about everything. Or a practitioner who just hasn’t kept up to date with the latest advancements in the last year. So the intention is that you can skim through the entire book, really. You’re not necessarily meant to know in advance which specific chapters might help you, or spur an innovation or an idea that you can actually implement.

In terms of that, what probably might be more useful is looking through a couple of blog posts that actually take you from zero to “Here’s an example application that, for example, will download a YouTube video automatically, do some speech-to-text recognition type things, and then give you a prompt, and you can type in a question and it will answer it based on that video.” We do, in fact, have a few blogs giving you these kinds of examples, and I think that would probably be more useful if you’re actually trying to build a product, to find existing write-ups of people who have built similar things and just follow that as a tutorial. The book is more just to get an overview of what’s happened in the last year in terms of the recent cutting-edge state of the art.
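A hedged sketch of the kind of pipeline described there, assuming the yt-dlp and openai-whisper packages; the URL, output filename, and the my_llm call are placeholders:

```python
import yt_dlp   # assumed dependency for downloading the audio
import whisper  # assumed dependency (openai-whisper) for speech-to-text

url = "https://www.youtube.com/watch?v=..."  # placeholder video

# 1. Download just the audio track.
opts = {"format": "bestaudio", "outtmpl": "talk.%(ext)s"}
with yt_dlp.YoutubeDL(opts) as ydl:
    info = ydl.extract_info(url, download=True)
    audio_path = ydl.prepare_filename(info)

# 2. Transcribe it with a small Whisper model.
transcript = whisper.load_model("base").transcribe(audio_path)["text"]

# 3. Ask a question grounded in the transcript, using whatever LLM you run.
question = "What is the main point of the video?"
prompt = f"Answer using only this transcript:\n{transcript}\n\nQ: {question}\nA:"
# answer = my_llm(prompt)  # hypothetical: call your local or hosted model here
```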

Yeah, I think that’s a good call-out. And I think one of the ways I’m viewing this is I am having a lot of those conversations as a practitioner with our clients about “How are we going to solve this problem?” And something might come up, like - oh, now we’re talking about a vector database. How does that fit into the whole ecosystem of what we’re talking about here? And why did we start talking about this? I think that the way that you formatted things here and laid them out actually really helps put some of these things in context for people within the whole of what is open source AI, which is really helpful.

So I just mentioned vector databases, which we have talked about quite a bit on this show, and is something that of course is an important piece of a lot of workflows… But there’s one thing on the list of chapters here that maybe we haven’t talked about as much on the show, and that’s desktop apps. We’ve talked a lot about whether it be like that orchestration, or software development toolkit layer, like talking about Langchain, and LLaMA Index, and other things, or the models, or the MLOps, or the vector database… But I don’t think we’ve talked that much about sort of “desktop apps” associated with this ecosystem of open source AI. Could you give us a little bit of framing of that topic? Like what is meant by desktop app here, and maybe highlighting a couple of those things that people could have in their mind as part of the ecosystem.

[32:35] Sure. I mean, I should probably quickly say about vector databases - I don’t quite understand why there’s so much hype over it. To me, embeddings are actually the important thing. The database that you happen to store your embeddings in is almost like a minor implementation detail… Unless you’re really dealing with huge amounts of data, it shouldn’t really matter which database you pick.

Sure. Valid point.

I don’t know if you have a different opinion, though…

[laughs] No, I think it’s not necessarily a one or the other… But in my opinion, there’s use cases for both. But not everyone should assume that they fit in one of those use cases until they figure out what’s relevant for their own problem.

But yeah, in the desktop space, I think maybe – there aren’t that many developers who talk about it, because it’s almost frontend type applications, as opposed to getting stuck into the details of implementing fine-tuning. All that stuff tends to be more backend, let’s say, in inverted commas. So I think that might be one of the reasons why there aren’t that many desktop applications being produced, because you kind of need both, both frontend and backend… And that maybe naturally lends itself to more the sort of resources that only a closed source company might be willing to dedicate… So maybe that just might be why there’s not so much in the open source space. It just takes a lot of development effort.

But yeah, there are a few that we do mention in the book. There’s LM Studio, GPT4All, Kobold… All of them are still very new, because - I mean, the thing that they’re effectively giving you a user interface for is itself very new. So yeah, I mean, there are some common design principles that are maybe being settled on. You do expect a prompt if you’re dealing with language models; you do expect a certain amount of configuration if you’re dealing with images, like what the dimensions are, and some basic pre-processing that has nothing to do with artificial intelligence, but you might still expect to see this sort of thing in one place, rather than having to switch between a separate image editor and your pipeline.

Something that I’m interested in is improving the usability, or the end user pleasure, let’s say, of using these desktop apps far more. So can you sort of graphically connect these pipelines together, like some sort of a node editor, so you can drag and drop models around, and connect the inputs and outputs to each other, so that you can have a nice visual representation of your entire pipeline? But yeah, I’m excited to see what happens in that space. To some extent, I think Prem itself is probably interested in developing a desktop app.

As you’ve gone through the process of putting the book together - and I think one of the things in any project that folks do is deciding when to go ahead and put it out there; there’s a point where you have to put a pin in it and say “That’s this one right now.” But our brains never stop working, obviously, on these problems. To that effect, you get the book out there… and you have conversations like this one that we’re having right now, where we’re talking about it and you’re like “Well, it wasn’t meant for that, but it was meant for this…” Is there anything in your head where you’re starting to think “Well, maybe that should have been a topic, or something we should have put in the book. Maybe next time…”? With this landscape evolving so fast, where has your post-publishing brain been at on this collection of topics?

[35:55] We definitely have yet another 10 more chapters planned… So there’s definitely going to be a second edition of this book. Or maybe I should say a second volume; it’s not even a second edition. It’s not a corrections kind of thing. It’s ten whole new chapters. Yes, literally v2. That’s going to include a lot of interesting stuff about things that happened in the latter half of 2023, and hopefully what will be developed in 2024 as well.

Among the things that people are talking about – I mean, we already talked about vector databases a little bit, and maybe you don’t see the hype there… What are some things in the ecosystem that you’re really, really excited about? And then some things that maybe – is there anything else that you’re like “Ah, people are talking about this a lot, but I don’t really see it going anywhere”? Any hot takes?

I mean, I probably already covered some of these things… What I’m super-interested in is fine-tuning, and lowering entry barriers further. What I’m not all that convinced by is pretending that AI is AGI. They’re not the same, I’m sorry, and I don’t see it… And I don’t trust these models to be more intelligent right now than, at best, a well-trained secretary. They’re considerably faster, so there are applications where being able to churn through a lot of text really quickly is actually of value, in which case - yes, great; apply one of these things. But apart from that, I don’t really buy the hype.

Yeah, that’s fair, I think. And as we get closer to an end here, I’m wondering, maybe there’s some in our listener base that don’t have the kind of history in open source that you do… And of course, there’s contributions to this book that would be relevant, but there’s also contributions within this whole ecosystem of open source AI, whether it’s in toolkits, or it’s in the desktop apps, or it’s in the actual models or datasets or evaluation techniques themselves… For those out there that maybe are newer to open source, do you have any recommendations or suggestions in terms of more people getting involved in open source AI? Obviously, the book is a piece of that, because it’s open source, and people could contribute to that… But maybe more broadly, do you have any encouragement for people out there in terms of ways to get started in contributing to open source AI, rather than just consuming?

Sure. Yeah, I would say that basically every time you consume, you are 90% of the way there to contributing back as well. So you have probably cloned a repository somewhere in order to run some code, right? You’ve probably encountered some issues. And a lot of those issues probably aren’t genuine bugs, because these are fast-moving things; people just write some code without necessarily doing full, proper, robust testing. We don’t have time to do robust testing. A lot of the time they’re just throw-away experiment type things. So we’re in make and break mode. Yeah, so if you find an issue, rather than quietly fixing it yourself, feel free to open a pull request. And maybe you’re kind of new to this, and you’re scared of opening a pull request, you’re scared that it’s not perfect code that you’ve written as well… Well, I mean, bear in mind that the code you fixed was even less perfect, right? And I can say, as an open source maintainer, I’m always super-happy when people contribute anything, whether it’s an issue, a pull request…

And I think, generally, people are far more happy and helpful and kind than you might expect, I would say. When it comes to actually writing code, people aren’t necessarily the same trolls that you might find on Twitter, or social media in general. These are people who have a mindset that they’re thinking about what’s been written, and they care about the actual project, and they don’t care about fighting you on a political front, let’s say.

So if you are trying to be helpful, that counts a lot more than “Are you actually helpful in your own opinion or anyone else’s opinion?” And even if your pull request doesn’t get accepted or merged in, you will definitely have some useful feedback. It might help you in your own expertise, your own growth as a student or contributor… And I would say there are definitely times where you might rub somebody up the wrong way, and you’re not happy with an interaction… But it’s such a small percentage of the time that it’s definitely worth it.

Yeah. Well, I think that’s a really great encouragement to end this conversation with. And of course, Chris and I as well would encourage you to get involved; even if it’s something small initially, get plugged into a community, start interacting, and contribute to the ecosystem… Because I would agree with you, Casper, it can be both useful for the project, but also very rewarding and beneficial for the contributors in terms of the community, and the things you learn, and the connections that you make, and all of that. So yes, I very much encourage people to get involved. I also encourage people to check out the State of Open Source AI book, which we’ll link in our show notes… So make sure you go down and click and take a look; it’s very easy to navigate to, and you’ll see all the categories that we’ve been talking about throughout the episode. So dig in, and if you see things to add, definitely contribute them.

We appreciate you joining, Casper, and thanks for sharing the link. You just shared it with me: book.premai.io/state-of-open-source-ai. We’ll link it in the show notes as well, so people can click easily. But yeah, thank you so much for joining, Casper, and also thank you for your contributions to the book. We’re really thankful that you’ve done this.

Sure, yeah. Thanks for having me on.
