Daniel and Chris explore three potentially confusing topics - generative adversarial networks (GANs), deep reinforcement learning (DRL), and transfer learning. Are these types of neural network architectures? Are they something different? How are they used? Well, if you have ever wondered how AI can be creative, wished you understood how robots get their smarts, or were impressed at how some AI practitioners conquer big challenges quickly, then this is your episode!
- RL (Reinforcement learning)
- GANs (Generative Adversarial Networks)
- Transfer learning
Welcome to another Fully Connected episode, where Daniel and I will keep you fully connected with everything that’s happening in the AI community. We take some time to discuss the latest AI news, and we’ll dig into learning resources to help you level up your machine learning game.
My name is Chris Benson, I am the chief strategist for artificial intelligence, high-performance computing and AI ethics at Lockheed Martin, and with me is my co-host, Daniel Whitenack, who is a data scientist at SIL International. How’s it going today, Daniel?
It’s going pretty good. It’s been a good week. How about with you, Chris?
It’s been good, just the usual busy stuff. I am excited about today’s episode. We were talking about it prior to getting on air, and you had some great ideas. Do you wanna go ahead and talk about why we’re doing what we’re doing?
Yeah, sure. Some of the listeners might know that I do industry trainings in AI and other things for companies, and sometimes at conferences, and that sort of thing… And one of the frequent questions that comes up during those trainings, and just in conversations about AI in general, is about the difference between an AI model - so you might think about having a convolutional neural net model, or an image detection model, something like that, a model - is that the same or different from things like reinforcement learning, GANs, transfer learning? Are GANs and reinforcement learning and transfer learning - are those types of models? Do you have a transfer learning model? Or are they different sorts of things than models, with specific architectures that might be associated with specific neural units, like recurrent neural networks or convolutional neural networks?
So my thought today is that maybe we could just go through a few of these methodologies or approaches that aren’t maybe models themselves, but are connected to the AI world in some way. I’m suggesting that maybe we talk through what reinforcement learning is, what GANs are, and what transfer learning is. How does that sound?
[04:02] That’s a great idea as far as I’m concerned, and it’s the kind of thing that we hear in feedback a lot, because as people come into the podcast - some people are already experts in the field, a lot of people coming in are brand new and they’re trying to understand what the field is about, and there’s so much to learn these days, and it’s evolving so rapidly, that I thought this was a great idea, to just kind of go through each of these and just define what each one is, how it works, and allow people to get up to speed on those a little bit faster.
Yeah, and as we go through these, we can maybe just give a little bit of a sense of what they are, but also give some places that they’ve shown up in the news recently - in the AI news or news in general - and then some learning resources for each one if you wanna get started in reinforcement learning or one of these other things. Some links that we’ll for sure put in our show notes, so that you can follow up on those things and start learning them practically. Remember, we’re all about practicality here at this podcast, so I wanna make sure and include those links, as well.
Which one of these do you wanna get started with, Chris?
Do you wanna dive into reinforcement learning up front?
Sure. Again, we’re thinking about approaches or methodologies that involve AI models, but might be slightly different than a single, one end-to-end model. With reinforcement learning - first of all, where have you seen reinforcement learning being applied, Chris?
I think the thing that made reinforcement learning big was its application to simulation and to robotics. Reinforcement learning has been a core technique for doing simulation robotics for a long time, and then in recent years deep reinforcement learning - which we’ll dive into in a few minutes here - has really come in and revolutionized that process itself. But in real life, at a previous employer, we were working on a large multi-skilled team, with different people specializing, and we had reinforcement learning specialists on the team, that were focused on doing robotics. It’s definitely real-world stuff, it’s not just academic, and it works, and that’s why they’re doing that.
Cool, yeah. You mentioned that there’s this thing, reinforcement learning, and then there’s deep reinforcement learning; that’s where part of this AI or neural network stuff gets plugged in. But in reinforcement learning one of the main pieces of reinforcement learning is this thing called an agent, and this agent takes action. In your case, Chris, with the robots, do you remember what this agent or these actions were, that the agent was taking?
Sure. So you’ll have different software components within the robot, and they may be integrated with different types of models… And they each have a particular job. For simplicity’s sake, let’s just say it’s about moving the robot around its environment. Initially, you have to have an algorithm that the agent is going to use to make decisions based on what’s happening to it in the environment that it’s operating in. The way you do that is every time the agent actually takes an action, that changes the relationship it has within the environment, which is called state. That could have been a good action, that’s going toward what you’re trying to train it for, or maybe not such a good action. The way that is determined by the person that’s training the model is by offering a reward for the appropriate actions being taken.
[08:04] You could kind of think of it as - since we’re always talking about pets, and stuff like that - treating the dog for doing the right thing, with positive reinforcement training, for those of us who have pets… Same kind of ideas - you wanna let the agent know “Hey, that was good. You get bonus points for doing the right thing, or we’re gonna pull back something if you don’t.” That’s the basic idea.
You go through that iteration many times, to try to get your robot or your simulation - it could be a video game, it could be whatever - to start behaving in the way that it has been most rewarded along the way.
Alright, cool. I’m trying to parse through in my mind some of the things that you said, which was really good… So there’s this first thing that’s called an agent, and that agent can take action. Let’s say in a very simple scenario, with a robot, maybe the robot can only do two things - it can move left or it can move right. That agent has to determine if based on some external factors (the environment) and its current state, or maybe where it’s at, or which way it’s facing, if it’s to move left or if it’s to move right. So it takes in some inputs from that environment and it’s supposed to determine if it moves right or left.
Now, I think what this – so the agent employs what’s called a policy to determine that next action. Let’s say that the robot is in this place, with these coordinates, and maybe there’s other external factors, or something… So it’s got a current state, it’s somewhere in its environment, and that policy is to determine that next action based on the current state, whether it’s maybe move left, move right, do this, do that…
So far I’m trying to count up the things that reinforcement learning involves in my mind, and we’ve got the agent, we’ve got the policy, we’ve got the state, and then we’ve got the environment. Now, you mentioned the reward. The reward also is kind of how the model gets feedback - is that right?
Yeah, it’s the feedback loop, and the purpose of the reward is to shape the policy. Your policy is being evolved, so that at the end of your training the policy hopefully always does the right thing that you’re training it towards. You’re essentially giving it little bumps with the reward to get it there. You’re trying to shape the policy, which is the strategy that the robot is using to move around. There are many different ways of doing that. There are lots of different algorithms that have been used over the years, and one of those (which we’ll talk about) has moved into what’s called deep reinforcement learning.
Yeah, so in my mind, if I’m thinking about this, I kind of see this loop where the agent takes actions, and then at some point in the feedback loop the environment infuses a reward or feedback into the agent.
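To make the loop described above concrete, here is a minimal sketch of an agent, environment, state, reward, and policy in plain Python. It assumes a toy setup that is not from the episode: a robot in a five-position corridor that can only move left or right, with a reward for reaching the rightmost position. The learning rule used here is tabular Q-learning, chosen only because it is short; the names `step`, `train`, and `greedy_policy` are illustrative.

```python
import random

N_POSITIONS = 5            # the robot can stand at positions 0..4
GOAL = N_POSITIONS - 1     # the rightmost position is the goal
ACTIONS = (-1, +1)         # index 0 = move left, index 1 = move right


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01   # a "treat" at the goal, a small cost per move
    return next_state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn a table of action values from rewards (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_POSITIONS)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap the episode length
            if rng.random() < epsilon:            # occasionally explore
                a = rng.randrange(2)
            else:                                 # otherwise follow the current policy
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # The reward nudges the value of the action that was just taken.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy: for each state, the action the agent currently prefers."""
    return ["left" if left > right else "right" for left, right in q]
```

Calling `greedy_policy(train())` returns the preferred action for each position. The policy is never written by hand; it falls out of the rewards the environment hands back, which is the feedback loop discussed above.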
Now, one of the things that people ask me when they’re trying to figure out this reinforcement learning thing, they kind of get the idea of - okay, you can give a dog treats, and help train it; so this idea of training makes sense. But they have a hard time picturing where the neural network fits in this scenario, or where the model fits in this scenario. One example of that might be if our robot has a camera, and it’s looking at its environment, or maybe it’s looking at a simulation, one thing it could do is image recognition, and then based on that image recognition it could determine whether to move left or move right, or something like that.
[12:02] So instead of the image coming in and then the output just being “This object is in this image or not”, then in this scenario the image would come into the model and the output of the model would be the action - left or right, or something like that. So there’s still this neural network model there, but it’s tied into this feedback where the output is the action. Am I representing that correctly, Chris?
I think that’s a very good explanation. A lot of times what the reinforcement learning is acting on may be a camera; the camera image is coming in - and as a little side note, the type of neural network that is most often used for that is called a convolutional neural network (CNN), and we’ve had several episodes where we’ve talked about that, including one that was a deep dive on the technology, early on.
Typically, when we’ve talked about those, we’ll talk about the convolutional neural network basically classifying what it sees. Essentially, putting a label on it, with a percentage of – you know, “I’m looking at something. Is that a horse, is it a cow, is it a dog?” and there’s some level of percentage of confidence that is being assigned to those traditionally with CNNs. The difference when you put it in with this particular approach, with reinforcement learning, is you’re talking about influencing the policy. So what you really need is the output of that convolutional neural network. It is “What action should I take for my next action?” and that way it feeds into how the reinforcement learning algorithm is trying to do that reward, to change the policy over time on how the model is acting on the environment.
That’s a great point. You mentioned a convolutional neural network, but people could see that this reinforcement learning algorithm or approach is really just that - it’s an algorithm or approach where within that approach you could apply a convolutional neural network in your agent to learn a certain policy to take in images and output actions. But people use reinforcement learning for all sorts of other things, and that approach is independent of the specific kind of model that comes in.
You could perfectly well use other architectures of neural networks - recurrent, and other things - within your agent, but this reinforcement learning loop or approach would still be there. That would still be kind of an RL approach to maybe a different sort of problem.
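As a rough illustration of the point that the model inside the agent is swappable, the sketch below assumes a two-value observation (brightness on the left and right halves of a camera image) and two interchangeable policies. `LinearPolicy` is a stand-in for a neural network whose output is an action rather than a label; its weights are set by hand here, not learned. All class and function names are hypothetical.

```python
import math

ACTIONS = ["left", "right"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearPolicy:
    """Stand-in for a neural network: one row of weights per action."""

    def __init__(self, weights):
        self.weights = weights   # shape: [number of actions][number of inputs]

    def action_probabilities(self, observation):
        scores = [sum(w * x for w, x in zip(row, observation))
                  for row in self.weights]
        return softmax(scores)

    def act(self, observation):
        probs = self.action_probabilities(observation)
        return ACTIONS[probs.index(max(probs))]


class RuleBasedPolicy:
    """A hand-written policy with the same interface: head for the brighter side."""

    def act(self, observation):
        left_brightness, right_brightness = observation
        return "left" if left_brightness > right_brightness else "right"


def run_agent(policy, observations):
    """The agent loop only needs something with an act() method."""
    return [policy.act(obs) for obs in observations]
```

Either policy can be passed to `run_agent`; a convolutional or recurrent network could take the same slot as long as it maps an observation to an action.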
Yeah, and you raise a really great point there, and we’ve kind of alluded to it several times - and just to kind of back out of the specific reinforcement learning conversation… We’re talking about different approaches that have different algorithms or architectures, and when you set aside all these buzzwords, each one is trying to solve a particular class of problem… Whether we’re talking about CNNs looking at images and trying to solve that, and reinforcement learning being able to have an agent take actions that are rewarded to get to the right policy to act in your environment…
We’ve talked about several others, and the point is you can use a lot of these together; so to avoid confusion, if you’re working on a particular problem, and you might be in reinforcement learning and say “Well, it’s images that I need to act on in this case”, you would stick a CNN there. And that is just one possibility of how you would combine different types of architectures or algorithms in deep learning to get where you wanna go. It’s not always the case that one architecture, one algorithm gets you where you wanna go. A little bit like LEGOs - you may plug some of these together. I just wanted to clarify that, in case there was any confusion.
[16:10] Yeah, I appreciate that. Before we move on to the next thing, let’s maybe just think about - okay, where is reinforcement learning showing up in AI news, and what are some learning resources that people can look into if they’re trying to learn reinforcement learning, if this has piqued their interest.
One of the things that I’ve seen in the news recently - and people are probably seeing related things on Twitter, or wherever it is - DeepMind released a reinforcement learning approach that resulted in human-level performance in the video game Quake 3. This was pretty cool, where a lot of these reinforcement learning techniques have been applied to fun things like video games, and things like that.
If you’re more interested in reinforcement learning, we’ve actually had a couple episodes… Episode #14 and episode #40 of Practical AI, that talk about certain applications of reinforcement learning, with a little bit more explanation. And as well, one of the learning resources that I’ve found that looked really good on this front - there’s actually an official PyTorch tutorial on reinforcement learning. We’ll make sure and link that in our show notes if you want to go ahead and dig a little bit deeper into reinforcement learning and actually try some things out on your own.
Daniel, now that we’ve covered reinforcement learning, what do you say we dive into generative adversarial networks (GANs)?
Yeah, that sounds good. This sounds kind of scary, “adversarial” things, Chris…
Are we gonna talk about terminators now? Are we about to all die?
Yeah, I don’t know… How adversarial are these networks, Chris?
Well, I’d say that they’re adversarial with each other, which is the whole point, which is why they’re calling it that. It is a really, really interesting innovation that came about in 2014, where one of the very famous figures in this space, whose name is Ian Goodfellow, put out a research paper with several of his colleagues about generative adversarial networks. What that is is basically you have two different types of neural network architectures designed to work together, or more specifically against each other, to try to get where you wanna go. It’s a way of often creating outputs that are creative from this type of technology. Pretty interesting stuff. I’ve seen it used for lots of different use cases, too.
[20:07] Yeah, you mentioned the creative element of this… One of the places I think this has received a lot of attention for good and bad, in some ways, is in generating images. There’s examples of creative uses, like generating specific artwork, or generating things in the style of certain other things. There’s also examples of generating pictures of fake people, and all of these things… So this all involves this kind of generative element of generative adversarial networks.
You mentioned there’s two elements of this methodology, Chris… There’s obviously some sort of generative element of this, which is what people call the generator of this approach… What’s the other thing that’s involved here?
So you have the generator, which is one part of this combined architecture, and then the other side, the other algorithm is the discriminator. Essentially, the generative architecture/model in this case that’s being trained is creating outputs that are inputs for the discriminator. The discriminator side is essentially trying to classify which ones are real and which ones are fake, and it has those mixed in with the ground truth dataset, so that if you’re trying to create images - and this might be something that’s completely new - the discriminator has access to a dataset that has a bunch of real images that are the ground truth that you’re training against. It is that baseline dataset. And the generator is also looking at those, but it’s creating images that are meant to look like whatever it is that the dataset is representing.
I’m just making this up - it might be cats, since we like to talk about cats on the internet. So you might have a bunch of images in the actual dataset of cats, and then the generator is trying to create new images of cats and slide that in with the ground truth datasets, and it’s up to the discriminator to determine which ones are real and which ones are not, and put a percentage on that.
So there is this feedback loop between the two to where the discriminator is making its choices and giving that feedback to the generator, and in turn, the generator is learning from what the discriminator is able to do right or wrong, and produce more and better images. So it’s a neat thing - the adversarial side is that these two models are literally trying to beat each other. One analogy could be a policeman against a counterfeiter, with the generator being the counterfeiter and the discriminator being the policeman, and they’re each trying to do their thing, and get better and better at it, and by doing that they both get better.
Yeah, I’ve also heard the analogy being the generator is the artist, and the discriminator is the art critic, trying to examine the output of the generator. In some ways similar to reinforcement learning there’s this overall scaffolding in which in this case two models are interacting. So there’s more going on here than just one end-to-end model; there’s a couple things happening here, and there’s this loop between the generator and the discriminator. Now, each of these pieces - the generator itself and the discriminator - could be a single neural network. The generator might be a neural network that takes in, for example, some random inputs and generates an image on the output, like an art image, or something like that. So its input might be some kind of random input like that, and the output might be what you’re trying to generate.
[24:22] The discriminator, on the other hand - it’s taking in a whole bunch of images and it’s kind of like a classifier, so it may just be another type of neural network that is trained to be a classifier, to classify as human-generated or computer-generated, or good art or bad art, or something like that. It’s a classifier that classifies that set of images. So you’ve kind of got two “models” here, and that’s where the neural networks are fitting in here.
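The two roles described above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not how image GANs are built: the "images" are single numbers, the generator is a one-line function of random noise, and the discriminator is a logistic classifier. It shows one discriminator update and one generator update, and makes no claim that repeating them converges. All names are illustrative.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class Generator:
    """Turns random noise z into a fake sample: x = a * z + b."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0

    def generate(self, z):
        return self.a * z + self.b


class Discriminator:
    """Scores a sample with the probability that it is real."""

    def __init__(self):
        self.w, self.c = 0.0, 0.0

    def prob_real(self, x):
        return sigmoid(self.w * x + self.c)


def discriminator_loss(disc, real, fake):
    """Low when real samples score near 1 and fake samples score near 0."""
    real_term = sum(-math.log(disc.prob_real(x)) for x in real) / len(real)
    fake_term = sum(-math.log(1.0 - disc.prob_real(x)) for x in fake) / len(fake)
    return real_term + fake_term


def discriminator_step(disc, real, fake, lr=0.05):
    """One gradient-descent step: get better at telling real from fake."""
    grad_w = grad_c = 0.0
    for x in real:                        # push prob_real toward 1
        err = disc.prob_real(x) - 1.0
        grad_w += err * x / len(real)
        grad_c += err / len(real)
    for x in fake:                        # push prob_real toward 0
        err = disc.prob_real(x)
        grad_w += err * x / len(fake)
        grad_c += err / len(fake)
    disc.w -= lr * grad_w
    disc.c -= lr * grad_c


def generator_step(gen, disc, noise, lr=0.05):
    """One gradient-ascent step: make the fakes look more real to the critic."""
    grad_a = grad_b = 0.0
    for z in noise:
        # d/dx log D(x) = (1 - D(x)) * w, where x = a * z + b
        slope = (1.0 - disc.prob_real(gen.generate(z))) * disc.w
        grad_a += slope * z / len(noise)
        grad_b += slope / len(noise)
    gen.a += lr * grad_a
    gen.b += lr * grad_b
```

The generator never sees the real data directly in this sketch; the only signal it gets is how the discriminator scores its output, which is the feedback loop between the two models.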
Of course, there’s specific types of generative models that work particularly well in this framework. For the image case, DCGAN is fairly popular. There’s an OpenAI article that we’ll for sure link in our show notes that kind of describes some of the generative models that are used in GANs. But maybe as we look a little bit more at GANs, maybe we can talk about where they’ve been showing up in the news. Where have you been seeing GANs show up recently in AI news, or news in general, Chris?
One of the things that we have talked about on a couple of previous episodes was that there was a portrait that was created by a GAN that Christie’s auction house sold at auction for $432,000. And nobody, including the people selling it, was expecting that. That was for a unique and original piece of artwork that a GAN created… And that suddenly really shook that industry, because it was one of those instances that nobody saw coming. But we’re also seeing it in all sorts of other places - creating original music (we’ve talked about that in the past), I know that Ian Goodfellow uses it in the security industry, which is completely different.
So there are so many different use cases where you want some sense of originality or creativity to play into it, and using GANs to actually generate this stuff, regardless of what the medium is, is becoming a better and better option for doing that.
Yeah, I know one thing that I’ve seen… Even people like my brother-in-law, who isn’t involved in the AI industry at all – I mean, he’s kind of interested in tech things, but not really a programmer or anything like that… He even showed me this one website - people are probably familiar with this - ThisPersonDoesNotExist.com. Have you seen this, Chris?
I have, and it’s gotten better and better over time.
Yeah, it has gotten better over time… And this website, if you’re not familiar with it - you can go there and basically all it shows you is a picture of a person. It looks exactly like a real person; it takes you off-guard when you realize that this person does not exist. In other words, this picture of this person, which looks real in every way, is a picture of someone that is completely generated; everything about that picture is generated using this type of methodology.
[27:55] Of course, that’s really interesting, and kind of amusing in certain ways, but also it’s kind of concerning in other ways. Of course, everyone is concerned with fake news, and fake content on the internet now, so there’s definitely a concern with these around if what you’re looking at is actually real or not. I remember talking on one episode - I forget which one - about there’s actually people out there that will create a fake persona, a fake picture for you for Instagram to be your company’s influencer on the internet. So there’s a question here of like how real are the things that we’re interacting with.
It’s interesting, one of the responsibilities I’ve taken on at Lockheed Martin is contributing to developing AI ethics, and figuring out not just what we do, but how we react to what’s happening in the world, and there are obviously bad actors out there. So one of the things – GANs are so powerful, and to quote Facebook’s AI research director, Yann LeCun, who is very well known in the industry, he famously referred to GANs as “the most interesting idea in the last ten years in machine learning.” Obviously, Ian Goodfellow and his partners that were working on this are among the brightest minds in the field… So there’s so much potential for the use of GANs, both wonderful, interesting, and some bad use cases as well.
The advent of GANs has changed the conversation in terms of AI safety, and AI ethics, and how these technologies can and should be used.
Yeah. If people are interested in diving a little bit more into GANs, there’s definitely some good resources out there. For reinforcement learning we mentioned there’s a PyTorch tutorial - and there’s a bunch of other tutorials out there for that, but there is a really great TensorFlow tutorial for GANs. We’ll make sure and link that in the show notes. Actually, if you go to that tutorial, they have some nice pictures as well, talking about the generator and the discriminator, and cat images, and all of those good things… But then they walk you through all of the code with Keras and TensorFlow to actually create this GAN, and they have a link to pop that up in a Google Colab notebook, so that you can go ahead and get started with GANs.
Okay, lastly – so we talked about reinforcement learning, we’ve talked about GANs… Let’s go ahead and jump into this last thing that I hear people asking about, which is transfer learning. We’ve certainly touched on this in previous episodes, but we haven’t kind of put it in context like we’re putting in context these other things.
Transfer learning is another one of these methodologies or approaches that’s used in AI, by AI practitioners, to do a bunch of different things. But transfer learning isn’t kind of a model in and of itself; it’s another one of these approaches… And I would say in comparison to GANs and reinforcement learning, it’s actually one that I leverage pretty heavily in my own work. Reinforcement learning and GANs haven’t touched my life as much as transfer learning has, and I think transfer learning is something that pretty much all AI practitioners should be familiar with and utilize heavily. What do you think, Chris?
I would say that pretty much all AI practitioners have utilized it, whether they realized it or not.
Yeah, that’s probably true.
If not before, certainly when they were learning how to do this, and they were initially going through and creating their first models; they were almost certainly using transfer learning, even if they didn’t realize it. It’s kind of the secret weapon of getting yourself going, and it’s probably almost always used in certain types of use cases, such as computer vision. As we get into defining what it is, it will become apparent why.
Yeah, and it’s definitely impacted the natural language processing (NLP) community very heavily, and there’s been a lot of efforts in that direction recently. I know on one of our very first episodes we had the guys from Machine Box on…
Yeah, that was episode two.
Yeah, the first one with guests.
It was the first one with guests, that’s correct.
Yeah, so Machine Box has this really great service that you can spin up, that will do facial recognition. And really, all you have to give it is one or two images of a person’s face, and it automatically updates the model and does really great facial recognition. Of course, it’s not just utilizing one or two images and training a whole neural network on two images. That just wouldn’t work. So there’s something else being leveraged under the hood, and as Chris mentioned, in that computer vision context or NLP, a lot of times that thing is transfer learning.
At a high-level, how do you think about transfer learning, Chris?
The way I think about it is when you’re creating a model, you don’t just go and do it and it’s done. It is an iterative process. Going back to the basics of what deep learning is - what a deep neural network is is you have a series of layers, and each of those layers is responsible for generalizing something, understanding something, and they tend to build on themselves.
In the context - to make it real - of computer vision, you may have a deep neural network, and the early layers are there to recognize just simple things like lines, or corners, and things like that, and you tend to build those features up to where now it recognizes (after it combines some of those together) what lips look like, or what an eye looks like. And then you go up a little bit and it starts to recognize how you put those different features together and make it a face, and then a full head… So each one builds upon the other.
[37:59] The really cool thing about this is… Let’s say that you need to go recognize something, and maybe some of those baseline features like recognizing (at the lowest level) lines, and curves and such - obviously, in every image you’re gonna do that. So if you have a model that’s really good at doing that already, and if it’s getting close to human recognition, or maybe animals, or certain common objects, you can move higher up the stack. And then at whatever point the purpose of that pre-existing model might diverge from yours, you can take those layers that were consistent with what you’re trying to achieve and build upon those. And since they were built with a general dataset that is different from the data that you’re about to train it on, your new model is more likely to generalize better as well, since you have a more diverse dataset by definition, since you have pulled in a partially-trained model from somebody else’s dataset.
So it’s kind of like we’re all standing on the shoulders of giants - you build upon what other people have built; you can take that pre-existing model that might work really well up to a point, and then you take your specific data, with your specific dataset, a set of images about something that you care about, and then fill that out. You end up being able to get a very robust model that does something very useful with much less training, and it’s a lot less brittle since it has a broader dataset to base it on.
It’s kind of like most programmers don’t write every single line of code from scratch when they’re creating a new application. There’s a lot of copy and paste that goes on, because they’ve done this before, they’ve done that before, they’ve created this service and it just needs to be slightly different this time around… So they never start from scratch; they kind of copy a bunch of things over. It’s very similar here, in the sense that yes, you can in many cases take a model that has not been trained on any data yet, and train it to do a certain task.
Let’s say that we want to translate text from English to Hindi. What we do is we get a parallel corpus, so we’ve got a bunch of examples of English phrases, and then a bunch of examples of the corresponding Hindi translation of that. And we train a model on English to Hindi, so that when we put in an English phrase, what we get out is the corresponding Hindi translation. That’s kind of like training from scratch. That would be, in my analogy, creating every line of code from scratch; we’re initializing all of the weights and the biases of our model, all the parameters of our model from scratch, from some random seed, or all starting at zero, or whatever that initialization is… But we’re using that English to Hindi corpus to train all of those parameters of the model from scratch…
Whereas now let’s say after we’ve done that, later on in our work we don’t want a model that is trained to translate English to Hindi, but we want English to Urdu, which is a related language. This means that we could do one of two things - we could either get another huge corpus of English to Urdu data and train from scratch again, or we could leverage the knowledge that we already created in that English to Hindi model. So we could take that model and all of the weights and parameters that we trained for English to Hindi, and then we could just slightly modify it or fine-tune it by re-training those on the new dataset, maybe a smaller amount of English to Urdu language.
[41:50] This has been widely used in NLP because in a lot of cases maybe you want to take a pre-trained model that’s very general, so it’s applied to maybe do translation for all domains, and you want to really fine-tune that for a specific domain of text, or of some content… So what you’ll do is you’ll fine-tune or slightly modify that on this new dataset.
So there’s kind of this initial pre-trained model, and then there’s the fine-tuning of that pre-trained model on a new dataset. So it could be on a new dataset, or you might fine-tune it by adding additional layers to it, as well.
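The difference between training from scratch and fine-tuning a pre-trained model can be sketched with a deliberately tiny stand-in. The assumptions here are loud ones: instead of translation models, the "model" is a line with two parameters, the large source dataset plays the role of the English-to-Hindi corpus, and the five-example target dataset plays the role of the smaller English-to-Urdu corpus. The only thing being illustrated is where the parameters start.

```python
def predict(params, x):
    w, b = params
    return w * x + b


def mse(params, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def train(params, data, steps, lr=0.1):
    """Full-batch gradient descent, starting from whatever params are passed in."""
    w, b = params
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


# Source task: plenty of data (stand-in for the large parallel corpus).
source_data = [(k / 100, 3.0 * (k / 100) + 1.0) for k in range(-100, 101)]

# Target task: related but not identical, with only a handful of examples.
target_train = [(x, 3.0 * x + 1.5) for x in (-0.8, -0.3, 0.1, 0.5, 0.9)]
target_test = [(k / 10, 3.0 * (k / 10) + 1.5) for k in range(-10, 11)]

# Pre-training: long training on the big dataset, starting from zeros.
pretrained = train((0.0, 0.0), source_data, steps=500)

# Fine-tuning: a short run on the small dataset, starting from the pre-trained weights.
fine_tuned = train(pretrained, target_train, steps=20)

# Baseline: the same short run on the same small dataset, starting from zeros.
from_scratch = train((0.0, 0.0), target_train, steps=20)
```

Comparing `mse(fine_tuned, target_test)` with `mse(from_scratch, target_test)` shows the effect: both models had the same small budget on the new task, but the fine-tuned one started from parameters that were already close.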
To bring this back full-circle on that, if any of our listeners have taken classes from maybe NVIDIA’s Deep Learning Institute, or maybe Coursera, on specific things like NLP or computer vision etc, chances are in that class one of the things you did when you started creating the models for your class was they would have you go in and select an architecture to base that on. That itself is transfer learning. You’re gonna find libraries of these models that are pre-trained, that you can build upon, in all the common frameworks out there. TensorFlow has them, PyTorch has them… It is truly the most common way, certainly to get started or to build upon.
In my own experience, I have seen people use transfer learning in their work more often than I’ve seen them start from scratch and try to build things completely from the ground up. You would have to do that if there was not the right type of model that you can build upon, but this is normal stuff. This is what we do. And I liked your analogy, Daniel, about using libraries - if you’re a programmer, you’re truly using lots and lots of code that other people have built. Maybe a lot of that is open source, maybe some of it is proprietary, but you’re still using those APIs to build whatever thing you’re building, whatever application you’re building… That’s a fantastic analogy you gave, on matching it up to transfer learning in ML.
And another thing is in a lot of cases you may just not have access to the data that you need. For example, you may not have access to the huge number of face images that someone else has trained a facial recognition model on… So they might have 200 GB of images that they’ve trained their model on, and you only have a handful. But that doesn’t mean that you’re totally out of luck, because a lot of people have released these pre-trained models for facial recognition and other things.
Like we were talking earlier with Machine Box, you might just be able to utilize that pre-trained model and update it with a couple new images, or a handful of new images. That kind of removes the burden on you to gather all of these large sets of data, maintain them, update them over time, run really long jobs to train these models using GPUs, which is really expensive… So it can also be kind of an operational and cost-saving strategy as well.
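One way "updating a pre-trained model with a handful of images" can work is to leave the pre-trained network frozen, use it only to turn each image into an embedding vector, and store a few embeddings per new person. The sketch below assumes that setup; it is not a description of how Machine Box or any particular product works. The `embed` function is a hypothetical stand-in (a fixed random projection) for a real pre-trained face model, and the "photos" are synthetic vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
IMAGE_DIM, EMBED_DIM = 64, 32

# Hypothetical stand-in for the weights of a frozen, pre-trained face model.
_frozen_weights = rng.normal(size=(IMAGE_DIM, EMBED_DIM)) / np.sqrt(IMAGE_DIM)

def embed(image):
    """Map an 'image' (a flat vector here) to a unit-length embedding."""
    v = np.tanh(image @ _frozen_weights)
    return v / np.linalg.norm(v)

class FaceGallery:
    """Stores a mean embedding per person; no model weights are trained."""

    def __init__(self):
        self.people = {}

    def teach(self, name, images):
        """Enroll a person from a handful of example images."""
        embs = np.stack([embed(img) for img in images])
        self.people[name] = embs.mean(axis=0)

    def recognize(self, image):
        """Return the enrolled name with the highest dot-product similarity."""
        query = embed(image)
        return max(self.people, key=lambda n: float(query @ self.people[n]))

def fake_photos(prototype, n):
    """Synthetic 'photos': noisy copies of one person's prototype vector."""
    return [prototype + 0.1 * rng.normal(size=IMAGE_DIM) for _ in range(n)]

prototypes = {name: rng.normal(size=IMAGE_DIM) for name in ("ana", "bo", "cy")}

gallery = FaceGallery()
for name, proto in prototypes.items():
    gallery.teach(name, fake_photos(proto, 3))  # only three examples each

new_photo = fake_photos(prototypes["bo"], 1)[0]
print(gallery.recognize(new_photo))
```

Nothing expensive happens in `teach`: the large model was trained once by someone else, and adding a person is just a few forward passes and an average.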
I’ve seen transfer learning in the news recently in a few different places. One of the places, as I was searching around in preparation for this episode - I saw a recent article from Forbes about Google’s AutoML. It actually mentions transfer learning, which I thought was reasonably technical for Forbes, but… Yeah, they talk about how Google’s AutoML services are leveraging transfer learning to allow people to create these customized models, maybe for translation for their specific domain, like for law, or for medicine, or for customer service, or something like that. So it’s definitely being utilized in a lot of production services.
Where else have you seen transfer learning come into recent news or recent releases of things, Chris?
[46:16] Well, we’ve had several episodes that made reference… Some of the models that we’ve talked about were BERT in episode #22 and GPT-2 in episode #32, and those are models that you can build upon as well. And I think it’s really important to note that this is kind of the standard way you start thinking about a problem - you go out and look and see if there is something out there that makes sense to build upon. It’s almost the route in to machine learning today.
A lot of these great institutions are in fact building things that all of us can then take, in whichever tool we want to use, and apply.
Yeah, I think the BERT and GPT-2 and other large-scale language models are good examples. For example, as we talked about BERT or GPT-2 in other episodes, you can basically take that pre-trained model, and in a lot of cases how you would “fine-tune it” is by adding a layer that would do named entity recognition, or adding a layer that would do text classification, or something… And keeping all of that knowledge from the BERT or GPT-2 embeddings on the front-end of your model. So you’re only adding or changing it a little bit, but you’re leveraging all of that knowledge that Google or OpenAI has already built into it for you.
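The "add a layer on top and keep the pre-trained knowledge" idea can be sketched without any deep learning framework. In the toy below, `encoder_weights` is a hypothetical stand-in for a pre-trained model such as BERT (here just a fixed random projection), the inputs are synthetic vectors rather than text, and only the newly added classification layer is trained. With a real model you would load published weights through your framework of choice instead.

```python
import numpy as np

rng = np.random.default_rng(2)
INPUT_DIM, FEATURE_DIM, N = 32, 16, 200

# Stand-in for pre-trained encoder weights; these stay frozen throughout.
encoder_weights = rng.normal(size=(INPUT_DIM, FEATURE_DIM)) / np.sqrt(INPUT_DIM)
encoder_before = encoder_weights.copy()

def encode(x):
    """Frozen 'pre-trained' part of the model: inputs -> features."""
    return np.tanh(x @ encoder_weights)

# Synthetic binary classification task defined on top of the encoder features.
X = rng.normal(size=(N, INPUT_DIM))
hidden_rule = rng.normal(size=FEATURE_DIM)
y = (encode(X) @ hidden_rule > 0).astype(float)

# The new layer: a single logistic-regression head, initialized from scratch.
head_w = np.zeros(FEATURE_DIM)
head_b = 0.0

features = encode(X)  # computed once, since the encoder never changes
for _ in range(500):
    probs = 1.0 / (1.0 + np.exp(-(features @ head_w + head_b)))
    error = probs - y
    # Gradients of the cross-entropy loss with respect to the head only.
    head_w -= 0.5 * features.T @ error / N
    head_b -= 0.5 * error.mean()

predictions = (features @ head_w + head_b > 0).astype(float)
accuracy = float((predictions == y).mean())
print(f"training accuracy with a frozen encoder: {accuracy:.2f}")
```

Because the encoder is never updated, its features can be computed once and reused on every training step, which is part of why this style of fine-tuning is cheap compared with training the whole model.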
A couple things that I’ve seen even over the past couple of days - if you’re looking to get hands-on with transfer learning, there’ve been a couple resources that have been published that I think are really great. One of those is a blog post from the Hugging Face team; if you remember, on episode #35 we had Clément from Hugging Face on. He had some really interesting and fun stuff to talk about… But their team has released this tutorial on how to build a state of the art conversational AI with transfer learning, and I think that builds on some of these large-scale language models.
Then even today there was an NAACL tutorial on transfer learning. That’s the computational linguistics conference that’s happening (I think) right now up in Minnesota… And they released all of the code, Colab notebooks and slides from that tutorial. We’ll make sure and link that in the show notes as well, if you wanna get hands-on with transfer learning.
But yeah, talking through these things with you, Chris, has definitely helped categorize some of these major components of AI methodologies in my mind. I hope it has for you, as well.
It definitely has. I hope that we’ll get feedback from our listeners. I know when we were talking about doing this before recording this episode, we were hoping that there might be some of the confusion out there that we could alleviate, and we would love to hear back from people through Changelog.com/community, or on our LinkedIn group, which we invite people to join as well. You can search for “Practical AI podcast” on LinkedIn to do that.
We’d love your feedback, to know whether these were helpful, whether there are other specific questions we left unanswered, and whether there are other topics that you would like us to cover in future shows.
Awesome. Thanks for talking through these things with me, Chris, and I look forward to hearing from our listeners out there about how they’re using these techniques. If we messed anything up or misspoke, or if there are additional great resources that you know about on this front, please reach out. We will talk with you again soon.
Our transcripts are open source on GitHub. Improvements are welcome. 💚