Practical AI – Episode #174
Quick, beautiful web UIs for ML apps
featuring Abubakar Abid, Gradio team lead at Hugging Face
Abubakar Abid joins Daniel and Chris for a tour of Gradio and tells them about the project joining Hugging Face. What’s Gradio? The fastest way to demo your machine learning model with a friendly web interface, allowing non-technical users to access, use, and give feedback on models.
Sponsors
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com
Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!
Transcript
Play the audio to listen along while you enjoy the transcript. 🎧
Welcome to another episode of Practical AI. This is Daniel Whitenack, I am a data scientist with SIL International. I’m joined as always by my co-host, Chris Benson, who is a tech strategist at Lockheed Martin. How are you doing, Chris?
Doing very well, except I’m drowning in yellow pine pollen.
Yeah, it’s allergy season.
Yeah, it’s everywhere. Other than that, doing great.
Yeah, yeah. Allergy season - I’ve been rotating between sneezing and then debugging some NVIDIA issue today. So it’s like NVIDIA issue, sneeze, come back, see if it’s fixed, sneeze, blow my nose…
Is the NVIDIA issue contributing to the sneezing?
I don’t know, I don’t know.
Is there a correlation between the two?
Let’s say no–
Okay.
…on the record.
You’ve just inspired the next line of NVIDIA processors. They’re going to be like “The sneeze” or something, you know?
Yeah, yeah.
They need good marketing and you’re providing that for a free service.
Yeah, I guess so. Yeah. Speaking of practical things, I’m pretty excited about our guests today, because as you know, I’m always wanting to get into the practicalities of what we’re talking about, and this really fits into that. I’m sure a lot of our listeners have heard about Gradio, and today we have with us Abubakar Abid, who is the Gradio team lead at Hugging Face. Welcome!
Thank you so much, guys. It’s awesome to be here.
Yeah, yeah. As we get started, maybe just give us a little bit of a sense of your background and how you eventually found your way into thinking about Gradio-like things, or interfaces or apps for machine learning.
Yeah, absolutely. So I’ve been doing machine learning for about a decade now. I did my Ph.D. at Stanford, and I worked a lot on building machine learning models for medical imaging and medical videos, that kind of thing. And during the course of my Ph.D., I would often work with collaborators who were not machine learning scientists, not machine learning engineers. They were doctors, clinicians, biologists, that kind of thing.
[04:05] And one of the things that we realized was - well, first of all, not everyone knows how to use machine learning models directly, not everyone can code, software engineering and all of that good stuff… And we wanted to make it easy for other people to try out the model so that they could get feedback. Because you know, sometimes you tell someone, “Hey, I trained a model for you. It’s 95% accurate”, they’re not going to believe you. They’re not going to take your word necessarily, so they want to try it out themselves and test it.
And so we started off with a very simple library that was designed to make it easy to test computer vision models. That ultimately grew into Gradio, which is an open source Python library to build GUIs for machine learning models very generally, to build demo web apps for machine learning models… And then I actually finished my Ph.D. at Stanford last year, and soon after that, Gradio got acquired by Hugging Face. And so now we’re here at Hugging Face, which is a fantastic place to be, and really, we have this amazing mandate, which is “Hey, let’s make Gradio the default tool for machine learning practitioners to build demos”, to get their work out there, so that other people can access their work, so that machine learning researchers can make their work more reproducible, machine learning teams can easily collaborate and kind of see what machine learning models are actually doing. And we’re hoping to grow this a lot more in the next few years.
Just to pull it out, so for those in our audience who are not familiar with Gradio, could you talk specifically about what that is, since we’ve all been talking about it for a few minutes? I want to make sure that they have a clear image of what you’re talking about.
Yeah, absolutely. So Gradio is an open source Python library. So you can pip install it… And what it does is it takes a machine learning model that you’ve built - or it could also be an API, or any sort of function, but it takes something that you’ve built and wraps it with a GUI. So you have this web-based GUI… So imagine you have an image classification model - you can easily wrap it with a web-based GUI that allows you to drag and drop your own images and see the model’s predictions. But the cool thing is that you can build this GUI entirely from Python, right?
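As a concrete sketch of that image-classification case (the classify function and labels below are invented placeholders, not from the episode):

```python
import gradio as gr

def classify(image):
    # Placeholder: a real demo would run the image through your trained model.
    # By default, Gradio hands the uploaded image over as a numpy array.
    return {"cat": 0.7, "dog": 0.3}  # label -> confidence

# "image" in, "label" out: Gradio renders a drag-and-drop image box
# and a confidence bar chart, all from Python.
gr.Interface(fn=classify, inputs="image", outputs="label").launch()
```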
So in the past, if you wanted to build a web application around your model, you needed to know Flask, but then also maybe Docker to containerize it, and then maybe some Bash scripts, and then you needed to figure out web hosting solutions, and then you needed to know maybe a little bit of SQL to build a database to collect samples, and then maybe some front end web development - HTML, CSS, JavaScript - to build a little UI around it. Gradio replaces all of that with one framework, in Python, which is already something that you know if you’re doing machine learning.
Yeah. So before the show, I was just running a few things in Gradio - and I have in the past, but I kind of revisited a few things that I was playing around with. Right now I’ve got an application running to do just like question-answering, which is actually something that we do at SIL, and we work very closely with those sorts of models. Just to give people a sense - I created this application and I’m looking at it right now; it’s 10 lines of code, and three of those are blank lines. And in those 10 lines of code, I imported a question-answering model and, like you say, wrapped it in this function, this question-answer function. And then with that, when I run python thisapp.py, it spins up a web server and I can go into my browser and interact with the model. So yeah, it’s pretty cool. I think it’s pretty interesting that– it seems like you get superpowers to do this web stuff without having to actually know that stuff.
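Daniel’s exact app isn’t shown in the episode, but a question-answering demo in that spirit might look roughly like this (the model name is just one public example, not necessarily the one he used):

```python
import gradio as gr
from transformers import pipeline

# Any extractive QA model from the Hub would work; this is a common public choice.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def answer(context, question):
    return qa(question=question, context=context)["answer"]

gr.Interface(
    fn=answer,
    inputs=["text", "text"],  # one text box for the context, one for the question
    outputs="text",
).launch()  # run with: python thisapp.py
```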
I’m curious, from your perspective now doing this for quite a while, seeing a lot of people use Gradio, in the context of industry and companies, where do you think the value add comes from this type of– I don’t know if you’d consider it like a prototype or an app or a tool… Where does the value add come within the workflow of a typical data scientist, or something like that?
[08:07] Yeah. I think this is a really important question. So the way that I’ve seen machine learning done, both in research kind of settings, but also in industry teams, is that there’s almost like two cycles of machine learning. There’s the prototyping/exploration/research cycle, and then there’s kind of the deployment cycle. And the kind of the workflows and the tools that are used in both of the cycles are very different.
So in the first one, in the research exploration, you’re mostly working in Jupyter Notebooks, doing a lot of trial and error, you’re building lots and lots of models, oftentimes, and you’re getting feedback - and this is really important - you’re getting feedback from stakeholders. Because it’s impractical and usually not the case that the machine learning developer knows everything about how the model is going to be used. It’s oftentimes end users, sometimes quality testers, sometimes customers, sometimes business teams who are going to be the ultimate consumers of these models. So there’s the producers of the models and there’s the consumers of the models. And traditionally, it’s very hard for those two types of folks to talk to each other. And so what Gradio does - it kind of creates the bridge between these two teams.
So let’s say you’re a machine learning researcher or a scientist, developer, whatever your role is - you can easily take a model that you’ve built and then expose it so that it can be consumed by a variety of different people. And we’ve seen that be sometimes quality testing teams. So one of the things you’ll notice in a Gradio demo by default, there’s this flag button as well, which helps you catch incorrect predictions and so on, and store them in a local database on your computer and stuff like that. So that’s very useful, and it’s really designed for quality-testing teams, or even end-users to try out your model and get back to you.
The whole idea behind Gradio is it’s so intuitive. Very few people know how to build machine learning models, kind of in the grand scheme of things. More people know how to maybe load a model, if given a model, and maybe run some code against it, but way more people know how to use a browser, right? Billions of people can literally drag and drop images into a browser and try it out. And what that does is it kind of lets you– it reminds you that your audience, when you’re building these machine learning models, is very broad, and it puts the control in their hands, so they can try it out and give you feedback.
Is it kind of conceptualizing it a little bit, if you’re kind of making an analogy to web development a little bit and you have these frameworks out there to make it easier, like React and Angular and stuff like that - is it a little bit like that, or is it more focused on just the model? When you talk about creating that user experience, so that you can share it with people, what does a typical experience look like?
Yeah, that’s a really good question. So even when we started Gradio– and recently, we’ve been having this conversation about how general-purpose do we want Gradio to be? Do we want it to be the way you create any sort of web application directly from Python? That would be really cool. And I think we’re working towards that goal, but we’ve intentionally started with these higher-level abstractions that are designed for machine learning models. Really, if you have a machine learning model and you want to wrap it with a demo with a GUI, you can do it in less than three lines of code, as Daniel was just talking about. That’s because we’ve created these high-level abstractions that make it super easy to plug in a model, plug in what kind of input you want, plug in what kind of output, and it creates that GUI for you.
So really, the developer experience right now, as is, is designed really for machine learning engineers and researchers. But - and this is one of the things we’re working on right now - we’re exposing a low-level API that allows you to build more complicated demos, and potentially a much wider array of web applications, whether they be for other use cases in machine learning - for example, labeling, annotation, or dataset exploration - or for other things altogether.
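That lower-level API is what became Gradio Blocks; a rough sketch of the layout-and-events style it enables (the example itself is invented):

```python
import gradio as gr

def reverse(text):
    return text[::-1]

# Blocks gives explicit control over layout and event wiring,
# instead of Interface's fixed inputs -> outputs template.
with gr.Blocks() as demo:
    with gr.Row():
        inp = gr.Textbox(label="Input")
        out = gr.Textbox(label="Reversed")
    gr.Button("Run").click(fn=reverse, inputs=inp, outputs=out)

demo.launch()
```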
[11:48] And partly because of great communities like Hugging Face and other communities, we are seeing this rapid proliferation of all sorts of models, and they’re used to do all sorts of interesting things… So could you describe a little bit, with Gradio, maybe those main use cases where it’s super-easy - like, I did the question-answering thing - and then maybe other cases? How does the customization work? Maybe someone out there is thinking, “Oh, well, I’m not quite doing the same type of object recognition, or text summarization, or something like that.” How does that work?
Yeah, absolutely. So Gradio actually started off designed for images, for kind of image-related tasks. But at this point, it covers pretty much, I would say, 95% of machine learning use cases. And the way it does that– and this kind of goes back to your previous question, Chris, about who it’s designed for. The way it does that - it comes shipped with prebuilt components. So the idea is, “Hey, I’m working with videos. I want a video input”, and maybe my output is a heat map, it’s an image, or it’s a plot, or whatever it may be. So Gradio comes shipped with all of these different components, making it super-easy for– basically, your use case could be anything from, maybe, video activity detection to more traditional data science; maybe I’m working with data frames and other kinds of graphs and time series data. Gradio has components for all of that. And we’re always adding new things. For example, we just added real-time speech recognition, so you can speak, and as you’re speaking, you can get a transcription rendered in real time. We have a lot more things in the pipeline as well, for things like 3D images and 3D models and objects. That’s going to be released pretty soon as well.
So I’m just looking at the documentation, and I was also playing around with a couple other things before this conversation, and it covers a lot of components - input components like sliders, text boxes, video, and audio, and output components like data frames, files, labels, and text boxes… I also work on dialogue systems with SIL, and there’s even a chatbot output. I actually didn’t know that before I went into your docs again after not having used it in a while, and I was like, “That’s cool.” So I clicked on that and just opened a notebook and had a chatbot interface to plug into my new model, which was really cool.
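A chatbot demo along those lines might look roughly like this in the Gradio 3.x-era API (the echo logic is a placeholder for a real dialogue model):

```python
import gradio as gr

def respond(message, history):
    history = history or []
    # Placeholder bot: a real demo would generate the reply with a dialogue model.
    history.append((message, "You said: " + message))
    # Return the history twice: once to render, once to carry as session state.
    return history, history

gr.Interface(
    fn=respond,
    inputs=["text", "state"],        # "state" persists the history between turns
    outputs=[gr.Chatbot(), "state"],
).launch()
```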
You can stitch all of these together as well. One of the things that kind of astounds me is sometimes people will build super-complex demos – a model will take like eight different inputs, and they will write all of the Gradio code that’s needed to create this interface that takes eight different things, and the GUI is so complicated. And I’m like, “Wow, they write all this code…” Which isn’t too much code, but still, you have to read the documentation, understand the parameters, and all of that. And people do all of that because the alternative is kind of terrible. The alternative is having to write all of this front end stuff and–
No one wants to do that.
No one wants to do that. So having this ability, this superpower to do it in Python, I think it’s quite nice.
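To make the multi-input point concrete, a hypothetical interface stitching several components together might look like this (the model and fields are invented for illustration):

```python
import gradio as gr

def predict(image, threshold, notes):
    # Hypothetical model taking several heterogeneous inputs at once.
    return f"image shape: {image.shape}, threshold: {threshold}, notes: {notes!r}"

gr.Interface(
    fn=predict,
    inputs=[
        gr.Image(label="Scan"),
        gr.Slider(0.0, 1.0, value=0.5, label="Confidence threshold"),
        gr.Textbox(label="Notes"),
    ],
    outputs="text",
).launch()
```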
So you already mentioned the acquisition by Hugging Face, and now you’re lead of the Gradio team at Hugging Face, which is super-exciting. Our listeners can’t see video, but I made sure and I wore my new Hugging Face hat today, for the interview. I don’t know if you see that. Obviously, I have a little bit of bias on my side for Hugging Face, as our listeners will know.
I’m jealous, because I don’t have a Hugging Face hat.
How did you get one? I don’t have one either.
[16:10] Well, I don’t know… I saw someone on Twitter post like, “Hey, beta of the Hugging Face store”, and I don’t know if it’s public yet. Maybe I’m not supposed to be sharing this, but I saw it on Twitter like, “Beta of the Hugging Face store” and I was like, “Oh, yeah!” So I clicked there, and then I ordered swag for my whole team. So I splurged a little bit. They won’t be surprised, I guess, if they listen to this podcast. I’ll try to get it to them before then, but yeah.
Anyway, with Hugging Face– so you’ve described Gradio. We’ve talked about Hugging Face on the show many times, but maybe you could just give us, from your perspective, what is Hugging Face and why does it make sense to have Gradio plus Hugging Face?
Yeah, absolutely. Hugging Face does a lot of things, so it’s hard to describe concisely. But basically, I think the overarching goal of Hugging Face is to make machine learning more accessible, right? So previously, if you wanted to use state-of-the-art machine learning models, you had to read papers, you had to wrangle with a lot of, I think, malformed GitHub repos…
I don’t miss those days at all.
Exactly. It’s a lot of work. I remember that from my Ph.D. as well. It’s a lot of work. The way I see it is that Hugging Face offers various levels of access, right? So for example, there’s Hugging Face datasets, which is designed for machine learning practitioners, really, to train their own models.
Then there’s Hugging Face models, which is designed for machine learning practitioners, but also software engineers who don’t want to have to think about what’s under the hood; they just want to use a really good image classification or really good question-answering model and not really worry about any of the algorithmic implementation details. And so that opens it up to a lot more people.
And then with Gradio, or Spaces, which is what Hugging Face calls demos, that level of access opens up even more. So now pretty much anyone who can use a browser - like we said, billions of people who have a browser and are connected to the internet - can now use state-of-the-art machine learning models, and they can do interesting things with that. And I’ll just give maybe two quick examples. One is we had this demo that someone built with Gradio called AnimeGAN. And we’ve seen many such examples, but AnimeGAN was this demo where you could put in your own image, or any image, and it would render it into this cartoonized, almost like – I think it’s called rotoscoped, or drawn version of that image. And someone built a demo, hosted with Gradio, hosted on Hugging Face Spaces, which is a place you can host your Gradio demos for free, and then released it on Twitter, where it went viral.
I remember it. Yeah.
Yeah. And we had thousands of people use the model - or I’d say maybe tens of thousands - who tried it on their own profile pictures, pictures of their pets, everyday objects… And they were interacting with actually state-of-the-art machine learning, which is something that had never really been done before. And that is really cool, because now machine learning developers are thinking, “Hey, the audience who is using my model is a lot bigger. Let me make sure that my models are robust, they can handle diverse images, diverse inputs.” And that leads to a level of, I think, concern for the end user that wasn’t really there before. I think that’s really important, because then issues of bias and accessibility are addressed as well.
I want to give one other example, because I think this is interesting as well, because demos have a big purpose for education, too. And I remember in the early days of Gradio, actually, we had built this demo for an MNIST model, and I actually shared this… Anyway, with this MNIST model, you can draw handwritten digits, and you can see what the prediction of the model is… And I shared it with my younger sister, who has no background in machine learning whatsoever. She tried using the model and she drew a six, and it predicted six, and all that worked. And then she just drew a little dot in the center, and I think it predicted a seven. And she was like, “Why did it predict a seven? That makes no sense. I just drew a dot.” And then I told her, “Well, it kind of has to predict something, you know, and maybe the sevens were just the most common thing in the dataset.” And she was like, “Hmm, that doesn’t seem right”, and I was like, “Well, what should it do?” and she said, “Well, it should just avoid making a prediction. It should kind of abstain from making a prediction.”
[20:21] And so she had stumbled upon this idea of abstention, which is now a really important topic, and a lot of people think about that. But if you’ve never really interacted with a machine learning model in this way, you might not even realize the importance of it. And this is someone who has no background in machine learning. So I think demos can go a really long way in both accessibility and also education.
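For anyone curious, a hand-drawn-digit demo like that can be wired up with Gradio’s sketchpad input - a sketch with a fake uniform model standing in for real MNIST weights:

```python
import gradio as gr
import numpy as np

def predict_digit(sketch):
    # Placeholder: a real demo would run the drawing through a trained MNIST model.
    probs = np.full(10, 0.1)  # fake uniform confidences over digits 0-9
    return {str(i): float(p) for i, p in enumerate(probs)}

# "sketchpad" renders a small canvas you can draw digits on with the mouse.
gr.Interface(fn=predict_digit, inputs="sketchpad", outputs="label").launch()
```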
So I love that. I actually would like to dig in, having gone through those two examples, and dig in just a little bit. It really begs the question of what typical workflows look like, because you’ve kind of shown us two examples of going out there and doing that. And that second one, in particular - there was a big insight, because we’re all in this industry, and I think there’re some things we take for granted because we’ve been doing it for a while, but you had someone who wasn’t, someone important in your life who was not in the industry, make you realize something, and I think we all have moments like that. What are some of the typical ways that you and your team and other people that you work with are using Gradio on a day-to-day basis that has directly changed the way the workflow is? Where do you fit this in? If I’m adding Gradio into my machine learning, I’m already maybe using Hugging Face, but I now want to use Gradio as part of that - what’s changed and how do I think about my workflow now?
Yeah, absolutely. Let me give you one more example that might illustrate that, and then we can discuss it. So this was actually one of the early examples of Gradio, and where I realized it could actually have a big impact, especially in interdisciplinary teams. I was building a machine learning model to classify ultrasound images of the heart - echocardiograms, if you’re familiar with them. So we had built this model and it was getting really good accuracy - like 95% AUC, and all this good stuff - to predict various things about the heart; for example, does this patient have a pacemaker or not.
And so we had built this model and we shared it with a cardiologist, and the cardiologist was a little skeptical about how well the model was working. And so we built a Gradio demo around it and we let him play around with it. And one of the things you can do with a Gradio demo is you can interactively edit the input. So the cardiologist can upload his own ultrasound image, in this case, and they can edit it. And so they could, for example, remove the pacemaker from the image by kind of whiting it out. And so he did that. And so there was a pacemaker in an ultrasound, and the model was predicting this patient has a pacemaker; then the cardiologist whited it out, removed it, and the predictions changed in real time. And when the cardiologist saw that and did this with a few different images, he was like, “Wow, this actually works.”
And even us, all of the machine learning people in the room, we all breathed a sigh of relief, because it’s one thing to see aggregate metrics, but it’s another thing to have someone adversarially test your model, and then still it’s kind of robust to that.
Basically, I think there’s two broad ways that Gradio can help, and one is, if your model is good, it can help build trust in the model, especially for important stakeholders, because they can test it. But if your model is bad - and this may be more important - it can help expose those issues of bias and other things that are really important as well.
I had a little follow up, and you’re starting to address it there at the end already, but I’m curious - and it’s a little bit of a side issue; it’s not a direct Gradio issue, but Gradio is obviously part of the solution to this, and that is that several times in our conversation, we’ve mentioned the idea of people being skeptical of model output, and like, “I don’t know”, and all that. And obviously, you’ve seen that and you’re addressing it, and you’ve produced a really good tool for letting people get that. But I’m curious, as someone who’s observed that repeatedly, what is it that’s causing the skepticism in trusting the model? And obviously, they get to a point where the tangibility of using Gradio allows them to get past that, but what do you think is causing the challenge the non-machine learning consumer of that model is facing upfront?
[24:17] Well, I think you only have to look at some of these very famous models - you look at something like GPT-3, for example. OpenAI released this model. It’s supposed to be able to understand language and answer your questions. But people try it and they discover all sorts of problems, right? So you can ask it questions and it starts giving very nonsensical responses, or even suggests very dangerous things. I remember there was a study that showed that if you ask it medical questions, it could suggest things that could lead to self-harm, and all these terrible things. And I myself did a study where we were looking at religious bias, and found that GPT-3 has all these issues associating Muslims with violence, and all of these kinds of problematic things. And that’s just one example.
So OpenAI did something interesting, which is that they did release it as a demo, so that people could try it out. But that doesn’t happen most of the time with research, right? So you see all of these really nice-sounding numbers, and state-of-the-art this and that, and nice models published in Nature that claim to have solved this problem… I remember, actually, there was one person I talked to when I was a Ph.D. student at Stanford, and she had made this really nice model to look at videos of people in the ICU, and from those videos, you could tell if a patient had a particular disease based on how they were moving around. And she’d published it in Nature, so I was like, “Oh, this is really cool. We should try to deploy this in the clinic.” And then she looked at me and said, “Are you crazy? I would never trust this in the clinic.” Because there’s a big gap between, I think, what is publishable - and sometimes what people can get away with - versus what is actually usable in the real world, for all of the reasons that we’ve talked about. Usually, models are trained and tested on these really nice, sanitized datasets. We don’t really get to test them on real-world settings that might be out of distribution.
And so I think one of the cool things about Gradio - and we’re seeing this more and more - is researchers, for example at CVPR 2022, published papers and released accompanying Gradio demos, and then the research community was testing them. And there were some models that did great - people were testing them in all sorts of difficult ways and they were doing great - and in other models people found holes relatively quickly. And so I think part of it is that the machine learning community is so accustomed to training and testing on these very fixed benchmarks, and that really, really stresses the need for something like Gradio to open up that box and let other people try it with their own data.
Well, I definitely want to follow up on some things related to Hugging Face plus Gradio, but maybe from a different perspective. I’m wondering, as you were sort of running Gradio pre-Hugging Face, and you had the open source project, and in certain ways, you might not have known all the ways that people were using Gradio, but had some sense of how it was useful, and now you’ve got this Hugging Face scale of people using it, and an avenue for people to share things, and they’re sharing a lot of things… What sorts of maybe challenges have you faced as you’ve tried to integrate Gradio at Hugging Face scale, and tried to scale that up, make sure it runs well for people as they create their own spaces, with all sorts of different models, some of which might be really big, and some of which might be pretty easy to run? I just imagine that that’s kind of, well, a really hard thing to do, but it seems like you’ve done it very well. So yeah, any thoughts there?
[28:11] Yeah, that’s a good question. There’s a lot here. So one of the things that I’m amazed by is just the diversity in the types of models that people are building. This is shocking. I think sometimes you can be in kind of an echo chamber, where you think “Okay, most people use machine learning for these types of use cases”, but then you see users that just completely blow your mind. For example, people using GANs to generate Pokémon, or people using speech recognition in so many other languages, and you realize you have to support Unicode, this and that… It’s just a lot. It’s just a lot of different use cases that show up, and you have to address that.
One of the good things about Gradio - I talked earlier about how there are two different cycles of machine learning. There’s the exploratory/research stage, and then there’s actual production-level type stuff. At Gradio, we tend to focus more on the exploratory/research side. So even when you share a model - let’s say on Spaces, or you share it temporarily… I don’t know if you’ve seen this, but if you create a Gradio demo, you can pass in one extra parameter in the launch function, share equals true, and that creates this temporary public link that allows anyone to access your model, which is super, super-handy for prototyping.
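For reference, a minimal sketch of that sharing flow (the uppercase function is just a stand-in for a model):

```python
import gradio as gr

demo = gr.Interface(fn=lambda text: text.upper(), inputs="text", outputs="text")

# share=True creates a temporary public URL that tunnels to the app
# running on your own machine - handy for quick feedback, not production.
demo.launch(share=True)
```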
And so what we’ve entirely focused on - we’ve said we’re not trying to optimize for production-level traffic or anything like that. We want to just focus on, “Hey, let a few people try out your model, get feedback. Let a few other people try out your model.”
And so because we’ve kind of laser-focused on that use case, even when a lot of other people are using Gradio demos, we have always stayed with that expectation. So for example, people asked us, “Oh, my space is up on Hugging Face. I want to use it as an API.” Well, we added support for that, but we made it clear, “Hey, this is not meant to be a production-level API. You can use it for testing, and so on.” So I think we’ve mitigated a lot of traffic-related issues just by focusing on this stage of the use case.
And then the other thing that we’ve tried to do is leverage Hugging Face’s existing infrastructure as much as possible. So for example, Hugging Face already has something called the Inference API. So any sort of model that you can find on the Hugging Face hub - which at this point is, I think, more than 30,000 different models - comes with its own inference API that you can just call. Gradio also supports using any of the models on the Hugging Face hub pretty much off the shelf. So in one line of code, you can build a demo for one of these models. And if you do it, it leverages Hugging Face’s existing Inference API, rather than trying to create something ourselves.
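In code, that one-liner has looked roughly like this (the loader’s exact name has varied across Gradio versions; newer releases spell it gr.load("models/...")):

```python
import gradio as gr

# Builds a full demo straight from a model on the Hugging Face Hub;
# inference goes through Hugging Face's hosted Inference API rather
# than loading the model locally.
gr.Interface.load("huggingface/distilbert-base-uncased-finetuned-sst-2-english").launch()
```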
So by doing that, I think we’ve mitigated a lot of those load issues, but what we definitely find is a lot of people using Gradio in these different use cases that we wouldn’t even imagine. We see a lot of issues being raised, and then we’re doing our best to kind of support that. But a lot of cool things – like, people ask for new types of components as well. And so we’re working on supporting that. A lot of people ask for like, “We’re using this Space to test our model. We’re seeing some kind of weird behavior. How do we retrain the model based on what kind of issues have been observed?” So we’re thinking about how to fit Gradio into this larger loop of training models, again, and making them better as well. So yeah, a lot of cool things there.
I’m synced with you pretty well, I think, because you’re already going where I was going to ask you actually… And that is, as you talk about these new components that you’re building, and you’ve talked about focusing more on the exploratory side of that, but you’ve already acknowledged that there’s the production deployment side… Is that where you want to go? It seems like it might be a natural workflow that if I’m already doing Gradio and getting my feedback with my demos and everything, that I might ultimately just want to deploy that in a variety of areas. And so as you build more components out, is that with that in mind, of eventually you are robust enough in how you’re making that model available, so that it’s Gradio all the way and forevermore?
[32:04] Yeah, it’s interesting… We’re thinking about that. And right now, we’re actually leaning against that a little bit. And the reason for that is because that space is, first of all, very crowded. There are a lot of tools that are designed to help you deploy your model. And the issue is it’s one of those things - I would say kind of like the Heroku-type problem - which is that if people get big enough, they don’t want to use your solution to deploy, they want to do it themselves. So there are a lot of people who are just prototyping - great, we’ve got that use case covered. When people are medium-sized, maybe they’ll use an off-the-shelf product to deploy their model. When they’re very big, they’re going to write something themselves, or use something that’s very tightly coupled with one of the big cloud vendors.
And so we’re actually thinking that rather than focusing on the production use case, which is kind of crowded, what we should do instead is make it easier to build more kinds of web applications from Python itself. And so I think we’re going to be leaning more in that direction. You’ll see things like potentially maybe lightweight labeling solutions built out of Gradio, or maybe dataset exploration tools built out of Gradio… We’re trying to cover more of those use cases.
I love the focus there.
And as you look towards that, I also note that – I believe, if I’m not mistaken, the main core of Gradio is open source. I don’t know if there are certain things, maybe integration things with Hugging Face and other things, that aren’t. But with the main bit being open source, I’m wondering how that community has grown over time - the code base and the open source community - and what you’re seeing in terms of activity and interest there.
Yeah, that’s been amazing, and we obviously owe a lot to Hugging Face for that. So for example, we have a Discord server. If we had just launched a Discord server for Gradio alone, I don’t know how many people would’ve joined. But we’re part of the Hugging Face Discord server, which helps a lot. And our community is honestly amazing. There are some folks– and this astounds me. There are some folks that use Gradio every day, and they’re raising new issues. Like, I don’t use Gradio every day. [laughter] So it’s been really nice; people catch issues like that. Any time we break anything, people let us know. But I think one of the very interesting developments that we’re seeing - and we’re trying to understand this piece better - is that people used to just have demos as standalone things, but now people are taking demos and integrating them into their websites, as part of their portfolios, but even just as standalone websites.
So one of our users, for example, built this NFT search engine using– the backend of it is a Gradio app. He hosted it on Spaces, and just embedded that Space on the homepage of his website. And so his whole website - I mean, it has some surrounding information, but it is basically a Gradio demo, which is very interesting. And this is not the only one. We’ve seen this in a few different examples. And this kind of raises the question… So now it seems like a lot of people are building very data-centric or machine learning-centric applications. That’s the focus of it. If there’s enough use cases like this, maybe we want to focus on building this one-stop shop for how to build a complete data-centric application. So we’re thinking about that as well.
I work with a lot of students from Purdue University - I have a close collaboration there, and with other universities as well - and they’re always asking me, “As I’m going from grad school into industry, what can I do to set myself apart?” I think for those of you out there listening to this - this is a really cool idea that can set you apart. If you’re able to go past saying, “Hey, I ran this cool example in my Jupyter Notebook, and here’s the GitHub repo with the Jupyter Notebook. Okay, it renders and I can see some images”, it’s a whole other level when you can point someone to Spaces or to a Gradio app, or embed that in a blog post or your website or whatever, and have someone actually interact with it. I think that goes a long way. We get a lot of requests from people listening to the podcast, and in our own lives, about getting into AI and data science, and I think that’s just a general free tip out there. I think this episode has shown that.
[36:18] It really differentiates that person from all the competition. And I know that in my own organization we see that as well as people are coming in. If you walk in and you’re a Gradio expert, compared to people who are not able to show that, it’s a huge differentiator for someone. So yeah, it’s a great point you’re making there.
Yeah, yeah. That’s a lot of fun.
It is a lot of fun.
Yeah, just to be able to interact with your model; it’s just so much more real. As we’ve talked about, you start noticing these things that you otherwise would just not have even paid attention to. A lot of things are just buried under these nice aggregate metrics that we like to take a look at; but you just get way more insight when you can actually play with your model.
And as you look to the future of Gradio and Hugging Face and maybe other things that are happening with those two things, what are some of the things that are exciting to you about maybe the AI space more generally, and then maybe more specifically in the Hugging Face and Gradio world? What are you thinking about that you’re super-excited about the developments of, and looking forward to the future?
Yeah, absolutely. So I think one thing that really excites me is that we are moving away from what we talked about - this benchmarking, these static datasets, and that’s it. Now a lot more people are interested in out-of-distribution robustness, right? We’ve trained a great model on our dataset - what kind of guarantees can we give about how well it’s going to perform in the real world?
Obviously, it’s a very difficult topic, but there’s a lot more interest both on the academic side and also on the industry side, with practical tooling. And so this is where, I think, we’re going to see companies, potentially including Hugging Face, invest more resources. Like, you have some models out there; how can you effectively flag issues that the model is having, so that other people who use it are aware of these limitations and can contribute to the robustness of the model? Because I do think at the end of the day it’s very hard to formulate this problem in a clean way, such that we can tackle it from a theoretical or academic point of view. I think what we need are better tools to identify issues with models, and to let people– just the same way GitHub, for example, has issues and PRs, what does the equivalent of that for machine learning look like, so that people are aware of the issues and can make things better? So that’s something I’m personally very excited by. I think Gradio plays a role in that, in helping discover these issues, but I think it’s a much bigger problem than Gradio alone can solve.
Yeah. And you mentioned even– we didn’t go into it in detail, but you do have this flagging feature within the app… Could you maybe tie in how you see that fitting into what you’re talking about with this out-of-distribution input, and that sort of thing?
Yeah, absolutely. So this is one of the core fundamental things that we added to Gradio early on, and we see people using it very actively to this day… Which is that Gradio lets you try out your own data in potentially someone else’s model or your own model; you can drag and drop your own image, you can type your own text, you can edit it, you can play around with it. And let’s say you find something that the model isn’t working well on. Let’s say – I mean, just for an example, if you take a state-of-the-art image classifier and you put in a picture of a bride who’s wearing Western attire, like American, let’s say, typical– it predicts bride, or ceremony; pretty reasonable labels. But if you put in a picture of a bride from Pakistan, where I’m from, or India, it usually predicts things like costume, or performance, or something that’s wrong. Maybe it could be borderline offensive as well. You find issues like this all the time, with all sorts of machine learning models that we’ve talked about. They’re very fragile. And so what this flagging button lets you do is it saves that sample and it stores it to a local database in like a CSV file, basically.
And so the workflow, what this looks like in practice - and we see this often - is people will create a Gradio demo, and they’ll share it, because it’s so easy to share that demo. It’s just like a Google Drive link; you share it with a bunch of people who try out your model, they identify issues, and that helps you… And then they can just click the flag button. That’s all they have to do. And then you have this nice CSV of everything that’s been flagged. And you can say, “Ah, okay, maybe I should retrain my model on these samples, or maybe I’ll have some better understanding of what the failure points are.”
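Configuring that flagging behavior comes down to a couple of keyword arguments on the interface - a sketch using Gradio 3.x-era parameter names, which may differ in newer releases:

```python
import gradio as gr

def classify(image):
    return {"bride": 0.6, "costume": 0.4}  # placeholder predictions

gr.Interface(
    fn=classify,
    inputs="image",
    outputs="label",
    allow_flagging="manual",  # show the Flag button; users click it on bad outputs
    flagging_dir="flagged",   # flagged inputs/outputs land in flagged/log.csv
).launch()
```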
We have users who used to keep spreadsheets that they would send back and forth and read; this is kind of a replacement for that, and maybe a better way to do it.
Yeah. I think this is super important, and yeah, I’m just really excited about the future with these sorts of tools and what you’re doing. Keep up the good work. I really appreciate you taking time out of your busy Gradio/Hugging Face life to let us know about these things. It was a pleasure talking. I hope to talk again soon.
Yeah, it was great. Thank you so much for having me here, and all the great questions.
Right. Bye-bye.
Our transcripts are open source on GitHub. Improvements are welcome. 💚