Practical AI – Episode #106
Learning about (Deep) Learning
with NVIDIA's Will Ramey
In anticipation of the upcoming NVIDIA GPU Technology Conference (GTC), Will Ramey joins Daniel and Chris to talk about education for artificial intelligence practitioners, and specifically the role that the NVIDIA Deep Learning Institute plays in the industry. Will’s insights from long experience are shaping how we all stay on top of AI, so don’t miss this ‘must learn’ episode.
DigitalOcean – DigitalOcean’s developer cloud makes it simple to launch in the cloud and scale up as you grow. They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. Get your $100 credit at do.co/changelog.
Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.
Rollbar – We move fast and fix things because of Rollbar. Resolve errors in minutes. Deploy with confidence. Learn more at rollbar.com/changelog.
Notes & Links
- NVIDIA GPU Technology Conference, October 5-9, 2020
- 20% off for our listeners! Use code CMINFDW20 by 9/25 for additional 20% off
- NVIDIA Deep Learning Institute
- NVIDIA to Acquire Arm for $40 Billion
- NVIDIA Ampere Architecture
- GeForce RTX 30 Series Graphics Cards
- Practical AI Episode #15: Artificial intelligence at NVIDIA with Chief Scientist Bill Dally
- Practical AI Episode #36: Growing up to become a world-class AI expert with Anima Anandkumar of NVIDIA and Caltech
- Practical AI Episode #90: Fully-Connected with Chris and Daniel - Exploring NVIDIA’s Ampere & the A100 GPU
Welcome to another episode of the Practical AI podcast. We are the show that tries to make AI practical, productive and accessible to everyone. My name is Chris Benson, and I’m a principal emerging tech strategist at Lockheed Martin, and with me as always is Daniel Whitenack, who’s a data scientist with SIL International. How’s it going today, Daniel?
It’s going really well, Chris. It’s been a busy week so far, but fairly productive. How about yourself?
It’s been a fun week for me. I’ve just hit my 50th birthday, and…
Oh, happy birthday!
Thank you! It was a big one. I was pretty happy with it. I don’t get into the whole crying over your birthday thing, so I’m having fun…
I’m trying to extend it for the entire month, in every possible way, so… Good times.
Right. Yeah, that’s good. Do you have a party or any fun over the weekend?
We are scheduling all sorts of fun stuff. We’re just trying to do some good family stuff, stay safe with Covid, the way it is, and… Good times.
Yeah, awesome. It’s been a pretty eventful week as well on the AI front. I’m sure you saw, as I did, a pretty large acquisition that’s looking like it’s gonna happen between NVIDIA and Arm, so that’s pretty cool stuff.
Yeah, if it gets through, a 40-billion-dollar acquisition, which is obviously big news… But I mean, really, anything NVIDIA does is big news, as we’ve proven time and time again. We’re always talking about NVIDIA… And speaking of NVIDIA, we have a guest that I know both of us have really wanted on the show for a while now. With us today is Will Ramey, who is the global head of developer programs at NVIDIA. Welcome to the show, Will!
[04:07] Thank you for having me.
I’m wondering - I know you and I have met previously at GTC, and had conversations, and you have an interesting background; I know you run the Deep Learning Institute… I was wondering if before we dive into all the cool stuff, if you could just tell us how you got to where you are now at NVIDIA, in the particular position… You guys do some amazing work, so I’m really looking forward to hearing this.
Yeah, let’s see if I can summarize the short version of it. I got an undergraduate degree in computer science and worked as a software engineer in Silicon Valley for many years, starting with Silicon Graphics and then a number of startups, including a game studio. During that time, I transitioned from working as a software engineer to a product manager, and a producer, and a general manager in the games industry… And then some friends of mine invited me down to join this scrappy little company called NVIDIA, that was just really ferociously competing in the video game industry back then. I accepted their offer, and… That was 17 years ago.
A little bit of change happening during that time at NVIDIA, I would imagine…
It’s been a wild ride, both for me personally and for the company coming up on two decades here… When I joined the company, we were viewed as a chip company; we designed chips, that’s kind of our business. And I had the opportunity to work as the product manager for some of our developer tools, and then do some program management of large software projects we were working on with Microsoft and others, and work a little bit in our embedded business… And slowly but surely, the company started to climb up the value chain, from chips to boards and embedded systems, kind of system-level products… And then in 2009 we introduced this new technology called CUDA. CUDA, as many people know and love it today, is a parallel computing platform that gives developers access to the parallel computing power, the accelerated parallelism of NVIDIA’s GPU processors. And that was exciting.
Scientists, researchers were starting to figure out new and innovative ways to take advantage of this computing capability… And I moved into a role as the product manager for CUDA. So introducing this new technology, talking with people who were trying to make sense of it and apply it to some of their most challenging problems… And that gave us the opportunity to go and really start working with people who we’d never worked with before. Working with Fortran programmers, working on climate and weather simulations, working with people in the energy industry, trying to figure out what does the subsurface of the Earth look like. People who were working at the very smallest scales of physics, trying to figure out molecular dynamics and how things work at the quantum levels. Really big, sophisticated simulations.
Yeah, in the beginning days of CUDA there it sounds like the focus was a lot on scientific computing, materials modeling, more on the scientific computing side. Was AI even in the mix at that point?
Not at all. In fact, one of the most common questions we got was “Wait, why should I use gaming technology to do real science?” And it seems kind of like a strange question these days, where some of the most important scientific challenges of our time are being addressed using not only this gaming technology, but what has grown into a more general-purpose parallel computing platform.
[08:02] The application of this parallel computing technology to AI happened several years later, and it really came, again, from the research sector, where there were a small number of researchers who had somehow managed to maintain funding, and were exploring this area of deep neural networks, and what we have come to call deep learning. And they were building what at the time seemed to be really large, complicated neural networks that had very simple amounts of processing in each node. And this was a very different approach from what had been done previously, in other machine learning techniques… And using really large amounts of data to train and fine-tune these neural network models to perform various tasks, initially in computer vision; things like image classification, object detection, segmentation, and things like that.
What they discovered was somewhat surprising, in that these GPU parallel processors were almost ideally suited for accelerating the work of training these neural network models to perform a wide variety of tasks, and that was really the genesis of the deep learning AI revolution that we’re all witnessing today.
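To make the point about "very simple amounts of processing in each node" concrete, here is a minimal Python sketch. It is not taken from the conversation and does not use any NVIDIA or framework API; the function names, weights, and inputs are all hypothetical and chosen only for illustration. It shows that each node computes just a weighted sum and a simple nonlinearity, and that the nodes in a layer do not depend on one another, which is the property that suits parallel hardware.

```python
import math

def node_output(weights, bias, inputs):
    """One node: a weighted sum of its inputs followed by a simple nonlinearity."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(total)

def layer_output(weight_rows, biases, inputs):
    """One layer: every node sees the same inputs and is computed independently.

    No node depends on another node in the same layer, so all of them could
    be evaluated at the same time. Here they are simply computed in a loop.
    """
    return [node_output(w, b, inputs) for w, b in zip(weight_rows, biases)]

# Hypothetical toy values, for illustration only.
inputs = [0.5, -1.0, 2.0]
weight_rows = [
    [0.1, 0.2, 0.3],
    [-0.4, 0.0, 0.25],
    [1.0, 1.0, 1.0],
]
biases = [0.0, 0.1, -1.5]

activations = layer_output(weight_rows, biases, inputs)
print(activations)
```

A real network repeats this pattern across many nodes and many layers; the sketch only shows why the work divides cleanly into independent pieces.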
I’m kind of curious, because it sounds like this realization came from the research community initially, when they started thinking more about accelerating these AI workflows on the GPUs or the graphics cards… So at what point did you begin to see a shift into real-world industry applications, where industry people started really expressing this desire for GPU-accelerated machines, and that sort of thing, out of research and into the industry world?
Yeah, that’s a good question. It happened in a couple of phases, and in some ways we’re all very fortunate that those initial researchers decided to publish not only their results in research papers, but they published the software that they used to train the deep neural network models, the things that we call deep learning frameworks today. One of the early ones - there was Caffe, and Torch, now PyTorch, TensorFlow, MXNet… There are some called PaddlePaddle… And these frameworks being available in the open source community, but really just being available at all, allowed all of us to start to experiment with them.
And what we observed happening was that very early on in the process the cloud service providers who had adopted GPUs into their offerings very quickly recognized that this deep learning technology was going to be not only valuable to their own businesses, but a valuable new type of workload that allowed them to offer more compute services to their customers and grow their cloud computing business.
So you saw new types of instances, new types of servers being offered by the cloud service providers, and prepackaged virtual machine images, basically the operating system, the software that you run on these servers and the cloud service provider platforms, prepackaged and ready to go. All you had to do was click a few buttons and you had a server with GPUs and all the software you needed ready to start running and training your models on your data. So that lit a fire, really, especially in the startup community, where up until then the cost of buying your own data center and getting everything set up was pretty prohibitive to small startups. Now they could just very quickly rent all of the compute capabilities they needed, and most of the software was already set up for them to start exploring all the innovative ideas that startups come up with every time a new technology is released.
[12:26] And we also in that timeframe saw a number of (let’s say) forward-thinking enterprise organizations and government agencies begin to take advantage of these capabilities, where they saw there was a good business opportunity, either to improve their own internal business operations and practices, or to build new and enhanced products that allowed them to provide better experiences, better services to their customers.
Now what we’re seeing is really beyond – they sort of crossed the chasm, if you will, to the point where the leading/early adopters have already proven out the technology and everyone is just racing to figure out how to apply it practically to improving their products, their services, their business operations and just other aspects of their challenging problems that this technology can address.
So before we dive fully into the AI powerhouse that NVIDIA is today, I’m kind of curious, referring back to those early days when you started, and you know, we really thought of NVIDIA as that graphics company, graphics cards, and in gaming, and anytime we were thinking along those lines that was the name… And you made this remarkable transition as an organization into being one of the dominant AI companies, and on the hardware side THE dominant AI company. That’s a huge cultural shift, and I’m just curious, as someone who lived through that very successful shift that you’ve made, how did the company do that, when so many other organizations that do something similar tend to fall down on their face?
Well, I have to give our leadership a lot of credit here for having a very strong vision and a conviction that we were headed in a direction to be more valuable to the world… But at the same time, taking measured steps - small, measured steps each year, each quarter, that allowed us to experiment and to learn quickly, and to test what worked and what the various markets, what our customers were ready for. And through that process of continuous experimentation and innovation, and proving out our hypotheses as we went, we were able to find our way into providing not just the technology, but the business models, the training, the support and resources that developers, researchers, data scientists, this emerging class of DevOps engineers who are proving to be so crucial in connecting the work that happens in the engineering and development teams with the IT deployment and operational teams… Just learning as we go, and really investing in this ecosystem of developers and people who can use the technologies that we’re creating to solve their problems is what my experience was kind of along the way.
And maybe to set the stage for where we’re at now - so you mentioned a lot of different stakeholders there and involved parties, from developers, to DevOps and all of that… And of course, people might think of NVIDIA as like the actual GPU cards and servers and these other things, but could you just kind of give us a very – I know it’s hard to not leave anything out, but just kind of a brief sketch of the different types of things that NVIDIA is offering the AI community?
[16:15] So there’s the hardware side of things, but then there’s other things as well… One of the things I use fairly frequently is the NGC containers that are ready to use/optimized… So maybe you could just give us a sketch of some of these other things that people might not be as aware of, that NVIDIA is offering the community.
Sure, sure. It’s a really long list, but maybe I can just pick a couple of the highlights.
Yeah, just highlight a few. There’s a bunch, but…
So in addition to the GPU hardware itself, and now the networking hardware, with the acquisition of Mellanox and Cumulus, and bringing them into the NVIDIA family… And then the systems - the HGX system designs that we’ve designed and worked with lots of our OEM partners to build and put into the market… And of course, the DGX product line - the workstations and the servers and the pods; the kind of data center scale computing solutions that we offer.
There’s a really, really rich collection of software, some of which we contribute to as open source projects. So whether it’s PyTorch, or TensorFlow, or any of the other leading deep learning frameworks - we have teams of engineers who are working on contributing to those projects to help them take best advantage of GPU acceleration directly… And then also take advantage of things like our TensorRT. This is the deep learning model compiler and runtime environment, so that once your deep learning model is trained, in the same way that we as humans – we put a lot of energy into training ourselves and learning a new thing, but once we’ve learned it, we kind of have like the mental shortcut.
Let’s say you were learning to juggle; it takes a lot more effort to learn how to juggle than it does to actually juggle after you’ve learned how to do it. And it’s the same thing for playing tennis, or playing piano, or anything like that. Same thing for deep learning. Once you’ve trained the neural network model, it turns out that it can be optimized and run at a much faster performance in an inference deployment environment, or at a lower energy profile, depending on what your needs are… And this TensorRT runtime is able to help with that.
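As a rough illustration of the idea that a trained model can be made cheaper to run, here is a minimal Python sketch of one general kind of post-training optimization: storing weights at reduced precision. This is not the TensorRT API and is not meant to describe how TensorRT works internally; the function names and all numbers are hypothetical. It assumes the weights are already trained and only compares the full-precision result with the result from 8-bit integer weights.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical "trained" weights and one input, for illustration only.
trained_weights = [0.82, -0.31, 0.05, -1.27, 0.64]
sample_input = [1.0, 2.0, -1.0, 0.5, 3.0]

# Result using the original floating-point weights.
full_precision = dot(trained_weights, sample_input)

# Result using the compact integer weights, rescaled once at the end.
int_weights, scale = quantize_int8(trained_weights)
reduced_precision = dot(int_weights, sample_input) * scale

print(full_precision, reduced_precision)
print(abs(full_precision - reduced_precision))
```

The design trade-off is a small loss of numerical accuracy in exchange for smaller weights and cheaper arithmetic, which is one way to get either faster inference or a lower energy profile.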
But there are also higher-level tools and resources like NGC. The NGC catalog provides both containers with all kinds of accelerated computing, and deep learning, and data science software environments that are pretested, preconfigured, ready to go, can be deployed in your own servers, on your embedded Jetson-based platforms, or on cloud service provider platforms…
And it also includes things like model recipes. So maybe you don’t want the entire environment, you just want some deep learning models that are known to be good, and a recipe for how to use that, to train it with data, and things like that. So NGC provides a number of those things as well.
And then there are even higher-level services on top of that, that work with our EGX solutions, that are kind of combined cloud or on-premises, with internet of things. Say you’re a city and you wanna build a smart city type environment to be able to keep track of how many of which types of vehicles are running on your roads, so you can do predictive maintenance before potholes show up, or figure out where the safe and less safe traffic patterns are, so you can put in better crosswalks, or change the timing of your stoplights, and things like that. You might have hundreds or thousands of cameras, and you need to be able to do analytics on those video feeds that are coming into the system. This is the kind of thing that these EGX scalable systems are able to handle.
So given that we are just about to head into GTC, the GPU Technology Conference, I know there’s all sorts of announcements and stuff, and you guys are gonna release new stuff coming up… Obviously, the thing on everyone’s mind is Arm. I was wondering if you could tell us a little bit about what GTC is, for those who may not have had the opportunity to attend already… And also, I imagine it’s a little bit early, but if there is anything that you can address with the Arm acquisition, we’d love to hear it.
Sure. Let’s take those one at a time, and we can start with GTC. The GPU Technology Conference (GTC) is something that’s very near and dear to my heart. We organized the first one a little over ten years ago, and it has grown and evolved with our ecosystem of developers and researchers and data scientists and everyone else who’s learning and exploring how to apply the new technology. This year, with the Covid-19 virus, we transformed GTC into an online experience, and over 60,000 people joined us earlier this year for the very first one. In just a couple of weeks we’re gonna be doing what we call GTC Fall, the next installment of GTC, and it’ll have all of the usual elements that people have become familiar with in the GTC conference. We’ll have a keynote by Jensen Huang, our president and co-founder. He’ll be kicking things off on Monday morning, October 5th… But we’ll also have just hundreds and hundreds - I think it’s something like over 600 live and pre-recorded sessions available this year, talking about all of the amazing work that is happening out in the ecosystem.
Some of the talks are by NVIDIA engineers and experts and product managers to help people learn how to use these technologies, but a lot of them are by people who are using it to solve their own problems, and then sharing back with the community the types of solutions that they’re building in graphics, in ray tracing, in artificial intelligence with deep learning… They’ll be talking about hybrid cloud computing, they’ll be talking about the work that they’re doing in healthcare, in public sector and government applications… Just a really, really amazingly broad range of topics.
One of the most valuable things about GTC is it’s one of the very few places where there’s an opportunity for people from different disciplines, people working on completely different application domains to kind of cross-pollinate their best ideas, and inspire innovation in a large number of fields… Unlike most other events, where people who are working on the same kind of thing get together and there can be a little bit of an echo chamber sometimes.
[24:19] So some of the things that we’re doing to help facilitate that kind of innovative cross-pollination, even in a virtual environment, is hosting networking events for the attendees. We have this thing that we call “Dinner with strangers.” There’s basically a topic, that could be an industry-related topic, or a technology-related topic, and people who are interested in that technology and wanna talk about it with people maybe they haven’t met before, are invited to get together in small groups. We’ll have someone from NVIDIA, with some knowledge on the topic, host the gathering… But it’s not a presentation, it’s just a way for people with common interests to get together, maybe bring some dinner to their desk, and eat and have some social interaction, talking with other people who have common interests.
Yeah, this is so great… And before we go too much further, I just wanna mention as well that this is coming up, it’s October 5th through the 9th. As you can already tell, we haven’t even gotten into everything, but it’s so much interesting stuff here… And NVIDIA has provided us with a 20% off discount for our listeners for GTC this year. And additionally, there’s an early bird price that lasts until September 25th. I know I’m registering… The code we’re gonna list in our show notes; the code is “CMINFDW20”. It might be hard to remember; I’ll definitely put it in our show notes… But this is a great deal; it’s not very expensive, and lots of great material, sounds like, to come…
I’m curious, as you sort of tried this out earlier in the year, and now it’s this fall edition, what were some of the maybe surprising elements that you noticed from having it completely online for the first time? Of course, it sounds like there’s huge attendance for one, but what were some of the maybe surprising things that came out of the fact that you were able to have it online that maybe were unexpected to you?
Well, I mentioned earlier how the evolution of NVIDIA over time has been supported by lots of small experiments… And one of the experiments that we did at GTC in the spring of this year was to offer our hands-on Deep Learning Institute training in a virtual environment. We had never done that before, and we had certainly never done it at the scale of trying to train thousands of people at the same time as part of a GTC, in these virtual classrooms.
I’m just incredibly proud of the work that our Deep Learning Institute team did to pull that together, to work out all the kinks, to cross-train all of the instructors and teaching assistants, so that thousands of people who attended those trainings in March could have a great hands-on learning experience… And I’m really excited that this upcoming GTC is a place where we’re gonna offer 16 of these workshops, timed so that several of them are available in the North-American timezone, several are available in the European timezone, and several are gonna be available in the Asian timezones. We’ll be offering our brand new fundamentals of deep learning course for everybody who just kind of wants to get started and understand what the basics are, get their hands dirty, training their first neural networks… We have a completely all-new, updated natural language processing course that’s based on the state of the art transformer-based approaches to natural language processing, and we also have a brand new recommender systems course that will help people learn how to build recommendation systems and integrate them into their applications.
[28:08] And as always, the other courses, including the multi-GPU course have been updated, there’s a new course on CUDA… The Deep Learning Institute program has helped us to train over 250,000 people worldwide in the last several years, and the demand for access to this kind of hands-on learning by doing style of training is just continuing to increase, and we’re really excited about being able to offer that in an even larger way at GTC this October.
Those are great classes, having taken several of them myself… I love taking them, and I definitely recommend them. I wanted to just follow up for a second - I imagine that your CEO, Jensen Huang, is probably going to address the acquisition of Arm… Is there anything you can tell us, or do you really have to wait until after he’s talked at the keynote?
That news is hot off the presses just maybe a day and a half ago; we’re still counting it in hours… And Jensen is really gonna be our spokesperson on that. But what I can say is we’re all just really excited about it. There’s so many opportunities for ways that we can serve the ecosystem of developers and data scientists and researchers that the companies and the organizations that are – they’re trying to solve really, really hard problems, and we’re just delighted that we have the opportunity to help them with the platforms and the technologies and the training that they need to make progress on these things.
I totally get it. I totally understand that. I knew it was so new, but I felt like I had to ask.
It’s exciting, and you know…
Good timing for this episode, for sure… Tune in here in a couple of weeks to find out more, for sure. It’s gonna be exciting. Will, you mentioned one thing as you were talking about GTC, and Chris mentioned that he’d taken some of these classes, and you talked a little bit about the Deep Learning Institute, some of its activities at GTC… I was wondering if you could just give us – kind of step back a bit and tell us a bit about the origins of the Deep Learning Institute and its current state and what it is now.
Oh, sure. Yeah, actually that’s a pretty fun story. I think I mentioned at the beginning that for a period of time I was the product manager for CUDA… And when we were first starting to bridge from working in the academic research segment into commercial enterprises, there were a lot of customers who said “Yeah, that looks promising, and we love the demos, but can you come show us how to do it? We have smart people here, they can learn, but we need someone to come show us how…” So I dusted off my frequent flier mileage card and started traveling the world with literally a pallet of laptops. We’ve got some gamer laptops that had GPUs in them, and a colleague and I started just flying all over the world, shipping pallets of laptops.
And we had this challenge that, you know, stuff breaks. It gets stuck in customs. You have to reimage the darn things after every training to make sure they’re gonna work for the next people… And it was a big hassle. And after doing that for a couple of years, Amazon AWS announced that they were gonna have thousands of GPUs available in their cloud, and we immediately said “That is what we need.”
So we updated our training materials, and figured out how to get all of that hosted in the cloud, and now all we had to do was show up… And all of the people who wanted to learn how to - at the time - develop applications using CUDA could just connect to these cloud servers, preconfigured with all the software they would need, from their own laptops. In fact, we had a few people just do it from their iPads… And it was fantastic.
[32:12] And then we realized “You know what - we could probably develop some self-paced content, so that anybody in the world could just log in and learn how, without having to wait for me to show up on their doorstep.” So that started taking off too, and after about a year, maybe two years of that, a few of the people that we had been working with internally at NVIDIA, our subject matter experts who were constantly going out and engaging with customers and helping them solve their challenging problems - they came to us and said “Hey, there’s this thing called deep learning. It’s kind of interesting… And we’ve built these little Jupyter Notebooks that kind of show people how to do it. Could we just maybe host that on your training platform and see if people wanna learn how to do this thing?”
Oh, how far we’ve come… Right? [laughter]
Yeah… Within a very short amount of time the number of people who signed up for an account, or logged in and experienced this deep learning training completely eclipsed anything else we had done before… And we realized that we had a tiger by the tail.
So we’d been experimenting, we’d been trying things… Something worked, and that’s when we decided to double down. We gave this initiative a name, we decided to call it the Deep Learning Institute, we hired a team and began building out a rich catalog of both self-paced and instructor-led content… We started an instructor certification program, so now there are over 600 instructors all over the world who are certified to deliver our training to their audiences.
Many of them are well-qualified professors and academic researchers working in higher-ed. Many of them are independent training service providers who have now been able to make this part of their business practices… Some of them are even companies who have decided that they need so many of their own employees to be trained that they’re getting their internal employee instructors certified to deliver this training, and scaling it up within their organizations as part of the workforce transformation projects to become AI companies. We’re happy to work with all of them.
That’s kind of how we’ve come to where we are today, and as I mentioned earlier, the latest new thing in the evolution of the Deep Learning Institute is this introduction of our online virtual classroom format, that allows us to deliver training anywhere in the world, and even to aggregate demand across many different customers or many different sites within the same customer… Because it doesn’t really make sense to send an instructor to train two or three people; that’s a lot of effort and expense for everyone. But if you can get 15, 20, 25 people together from multiple different sites, whether they’re from the same company or from several different companies or organizations, then you can put together a really good learning experience for them to quickly learn how to use these new technologies through hands-on exercises and assessments, and end up with a certification at the end of the day, proving that they have actually developed competence in applying these technologies to solve worthwhile problems.
Will, I’ve got a question about – you know, you guys have all these great classes, and they’re very well-run and they’re very well-integrated, but we’re in this really fast-moving world of deep learning and AI in general… How do you choose what should be part of your curriculum? What classes? There are so many topics out there, and they’re evolving so fast… That has to be a bit of a challenge, to figure out what to provide for the people that you’re serving out there.
[36:21] It really is, but it’s one of those problems that it’s a good problem to have. It’s like having an embarrassment of riches. There’s so many opportunities to develop training curriculum around these new and emerging and quickly-evolving technologies that we do have to be a little bit careful which ones we choose to invest in, and get a small army of instructors trained and certified to deliver.
And so what we’ve tried to do is keep up with what is the state of the art, and keep tabs on all the new research that’s coming out, and as that research evolves and new practices begin to start getting adopted, that’s really the sweet spot for developing training. Training is really a way to teach people how to use and apply things that have already been proven.
So if you were to ask me at this time last year, I would have said “You know, there’s a lot of work in recommender systems”, but the state of the art is still evolving so rapidly that if we were to go take a couple of months to distill that down into best practices and build a training course around it, by the time we actually got that out into the world, the state of the art would have moved on so quickly that people would be maybe a little frustrated or disappointed in the training… So at this point, the best thing we should be doing is connecting them with the research papers and the open source projects and things like that, so that people who are more comfortable living on the bleeding edge can adopt that technology while it is still in its infancy.
[38:06] But this year, things have evolved to the point where, for many of the recommender system development and design approaches, those best practices are established. In fact, they’re really well-established. So we built a training course around that, and we’re starting to offer it to everyone who needs to build these kinds of recommender systems. And it can kind of happen in waves. So this happened a few years ago for natural language processing, and we built a course around it that was really incredibly popular. It was one of our more difficult courses, but it was really popular. And then last year, or over the last year, a completely new approach to natural language processing has emerged that is based on transformers and models like BERT and GPT and so forth. So it was very clearly time for us to go back and completely update and replace that older course with something that is teaching people how to use the latest state-of-the-art techniques now that they’ve been boiled down to practice.
On the [unintelligible 00:39:10.27] it’s kind of like painting the Golden Gate Bridge. By the time you get to the end – you paint it from South to North; by the time you get to the North side, it’s time to go back and start painting it all over again.
Yeah. So you mentioned the interactions with open source and the research community… Of course, it was really interesting when you were talking about the story of the Deep Learning Institute, and how Jupyter entered the mix, and these different projects… From your perspective, it seemed – like, part of the reason why Chris and I started this podcast was that very focus on the practical side of AI, when it seemed to us that there were a lot of… You know, it was kind of like the Wild West, there were a lot of things out there that just weren’t extremely practical… But of course, that has definitely changed and evolved over time.
Maybe in the Deep Learning Institute, or maybe in your work more broadly at NVIDIA, how do you think about engaging with different open source projects and contributing to those? I know NVIDIA is involved with a lot of those, but you mentioned, for example, NLP and transformers… Of course, we’ve had Hugging Face on the show, and I’ve seen pictures on their Twitter of them getting boxes of Titan RTX GPUs in the mail for training their models… So how do you keep your pulse on not only the new techniques, but how the tooling and the open source side of things is evolving? Is there a lot of freedom within NVIDIA to contribute to open source, or an encouragement around that? And how do you choose where to put your effort, and all of that sort of thing on the open source side?
Wow, that’s a really good question. If I could even distill it down to a simple list or a recipe, there’s a lot of people within the company who would really appreciate that. I think the best way to really explain it is that we look at each opportunity and each situation based on its own merits, and we recognize that there are so very many different types of open source projects, some of which - many of which, hundreds of which, in fact - NVIDIA engineers have created and continue to maintain out in the world, and invite others to participate in.
Others, which we actually have adopted and integrated into some of our products, and continue to support and make contributions and so forth to - there isn’t really a single one-size-fits-all recipe. I think it really depends on “What do our customers need? What do our developers need?” and “What are the best ways that we can find to support them?”, and then at the same time, what are the best implementations, what are the best projects out there that we might be able to use and to contribute to, to just be positive contributing members of the open source ecosystem? I’m not sure if that’s a great, succinct answer to your question…
[42:20] It’s a big question, yeah… It’d be hard to distill that down, but I think it’s – yeah, definitely good thoughts on that subject.
So I guess as we’re starting to get toward the end, I’m curious, being in the center of AI education and having accomplished all that you’ve done on this, where do you see the future of AI education going with the Deep Learning Institute, and even in a broader sense, at large, within the larger AI community?
Obviously, we’ve had changes lately where we’ve moved from a lot of in-person training to more and more remote training because of Covid and its impact on the world… But what do you see over the horizon that maybe the rest of us haven’t thought about? Where do you think things might go?
Wow, that’s a great question. Off the top of my head, I think there’s really kind of three really important areas. One is in the training of AI practitioners; people who are going to be building and evaluating AI-based functionality to be integrated into these applications, and helping them to understand just the critical importance of analyzing the data that is used to train the neural networks, and testing the outputs of those neural networks to ensure that they’re really functioning as expected in all of the different scenarios where we need them to work.
These are things like just making sure you have enough examples of all the different types of data that you need in order to have your neural network be effective in a real-world environment. For example, if you train a neural network to be able to distinguish between all the different types of flowers, or dogs, or cats in the world… And then later you show it a picture of a raccoon or a giraffe, it’s not gonna know what to do. So you need to train it to be able to handle all the variations, the variety of the real world that it’s actually gonna be exposed to. So that’s one thing.
The second thing is that for people who are not AI practitioners, or who are let’s say not yet AI practitioners; let’s say they’re earlier in their career or schooling - so much of our world and interaction with it are going to be influenced by these powerful AI tools that I think it’s really important that even school age kids have a basic understanding of what AI is and how it works. When I go to a website and someone is recommending that I buy this pair of sneakers, or this shirt, or whatever product, that I have a sense for “Well, what might that be based on? Why are they recommending it to me? Is it because other people like me have also purchased this product and enjoyed it? And if so, how do they know who else is like me?” Just being really thoughtful about how does that world work, and taking a peek under the covers I think is gonna be a really important thing for really everyone to be comfortable with the capabilities and the limitations of this powerful technology.
That’s a great point. I was reminded of – I bought my nieces and nephews… There’s this series of books called – I think it’s like “Calculus and other things for babies/kids” or something like that… I was like “Oh, we need some AI for babies books” that you can put in along with the other children’s books… Anyway… My mind was going there while you were talking.
What was your third point there?
[45:54] Oh, the third area - and these are just off the top of my head here - is the impact that AI is going to have on education itself. As we transition to more online learning, and you have things like Khan Academy and other services that have really packaged up some very high-quality lectures and learning materials on all of the core subjects, the ability of computers using AI and other data science techniques to observe how students are learning and to understand what is the learning style… Because we all have different types and different extents of different flavors of learning styles; some people like to learn more visually, some people learn more by listening… I happen to be someone who learns - maybe you can tell - by talking, and kind of repeating back what I have understood in order to get confirmation that I’ve understood it correctly… These different learning styles are things that can be observed. And then the students can be offered different forms or different formats of the material that helps them learn more effectively.
So that can take the shape – think of it as almost like a choose-your-own-adventure type of storybook, if you remember those, where you’re offered different kinds of learning experiences to learn the same materials, based on how you have learned best in the past. And there’s always gonna be some experimentation and we’re gonna have to make some smart decisions around - you know, maybe some students are better auditory learners; they learn by listening, but it’s really important that they develop their visual, analytical and comprehension skills as well… So we’ll need to balance that out in the types and formats of learning materials that are offered to them. This is an area that our Deep Learning Institute team is really interested in. They wanna get all meta and apply deep learning to deep learning education.
Yeah, that’s so cool. I just get the sense from how you describe that and in your passion around it - you seem fairly optimistic in terms of AI’s benefit to certain areas, like in terms of AI for good in education and other areas. It’s something definitely Chris and I are very passionate about. As you look to the future, is that true? Or are you sort of generally optimistic about AI’s ability to impact our world for the better in different areas, like education or health, and that sort of thing? Maybe what’s one other area that you’re particularly – as we kind of close out, what’s another area where you’re particularly excited about the potential impacts for AI over the next couple of years?
[48:54] You know, in general - yes, I’m an optimist. I’m optimistic that we as humans will learn how to use AI technologies… Because it’s just a tool. It doesn’t actually think for itself. It’s still a computer that does what we design it to do. Now, we may be designing it at a little further level of abstraction than we have with previous technologies, but it doesn’t have its own agency to decide what to do. So I am optimistic that we as humans will continue to learn and apply AI in really positive ways, and also learn how to mitigate some of the potentially less positive ways in which this tool, this technology could be applied.
One of the areas that I’m most excited about really is in healthcare. If you look at the work that is being done even just around Covid-19 research, and the speed at which the computational simulations are allowing scientists and researchers to very rapidly explore the problem space to test out their hypotheses in effectively zero-risk scenarios before they go start testing these drug molecules and treatments in a lab environment… It’s just really incredible, and I think that what we learned through this incredibly focused period of time on Covid-19 is going to turbo-charge the entire healthcare drug discovery area as a result. So I have a lot of hope, a lot of optimism around the effective application of this deep learning/AI approach in healthcare and a number of other fields.
Fantastic. Very inspiring. As we finish up, I actually wanna note to listeners that we have had several episodes devoted entirely to NVIDIA, and obviously NVIDIA comes up in many episodes as part of our casual conversation. Episode 15 was “Artificial intelligence at NVIDIA with Bill Dally”, NVIDIA’s chief scientist. Episode 36 we had Anima Anandkumar, who is the director of ML Research at NVIDIA… And then more recently, on episode 90, Daniel and I were just alone, but we talked all about exploring NVIDIA’s Ampere architecture and the A100 GPU. I just wanted to let listeners know that. If you’re interested in this conversation and haven’t heard one of those, you probably wanna go back and listen to get a lot more about NVIDIA.
I wanted to remind listeners that we have a code if you go to our show notes, which you can get to on whatever device you happen to be listening to. There is a discount code for the GPU Technology Conference… And with that, Will, thank you so much for coming on the show; this was a great conversation, and we really enjoyed it.
Well, thank you both for having me. This has been a lot of fun, and I look forward to seeing you again in person someday, or at least virtually like this, sometime soon.
Absolutely, will do. Thank you.