Weights & Biases is building some awesome developer tools for AI practitioners! In this episode, Lukas Biewald describes how these tools grew directly out of pain points he uncovered while working as an AI intern at OpenAI. He also shares his vision for the future of machine learning tooling and where he would like to see people level up tool-wise.
Hamish from Sajari blows our minds with a great discussion about AI in search. In particular, he talks about Sajari’s quest for performant AI implementations and its extensive use of Reinforcement Learning (RL). We’ve been wanting to make this one happen for a while, and it was well worth the wait.
Rajiv Shah teaches Daniel and Chris about data leakage and its major impact on machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embeddings to find leakage, so that information from our test set does not find its way into our training data.
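To make the idea concrete, here is a toy sketch (not from the episode) of one common form of leakage: computing preprocessing statistics over all the data before the train/test split, so the training pipeline quietly "sees" test-set information.

```python
# Toy illustration of data leakage via preprocessing.
# Leaky version: the normalization mean is computed over train AND test
# data, so the test outlier influences the training features.

def mean(xs):
    return sum(xs) / len(xs)

data = [1.0, 2.0, 3.0, 100.0]   # the last point lands in the test split
train, test = data[:3], data[3:]

# Leaky: statistics computed on the full dataset before splitting
leaky_mu = mean(data)                       # 26.5 -- pulled up by the test outlier
leaky_train = [x - leaky_mu for x in train]

# Correct: statistics computed on the training split only
mu = mean(train)                            # 2.0
clean_train = [x - mu for x in train]

print(leaky_train)  # training features contaminated by test-set information
print(clean_train)
```

The same mistake scales up in real pipelines (scalers, encoders, feature selection fit on the full dataset), which is why leakage can silently inflate evaluation scores.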
Suju Rajan from LinkedIn joined us to talk about how they are operationalizing state-of-the-art AI at LinkedIn. She sheds light on how AI can and is being used in recruiting, and she weaves in some great explanations of how graph-structured data, personalization, and representation learning can be applied to LinkedIn’s candidate search problem. Suju is passionate about helping people deal with machine learning technical debt, and that gives this episode a good dose of practicality.
We’re partnering with the upcoming R Conference, because the R Conference is well… amazing! Tons of great AI content, and they were nice enough to connect us to Daniel Chen for this episode. He discusses data science in computational biology and his perspective on data science project organization.
In anticipation of the upcoming NVIDIA GPU Technology Conference (GTC), Will Ramey joins Daniel and Chris to talk about education for artificial intelligence practitioners, and specifically the role that the NVIDIA Deep Learning Institute plays in the industry. Will’s insights from long experience are shaping how we all stay on top of AI, so don’t miss this ‘must learn’ episode.
So, you trained a great AI model and deployed it in your app? It’s smooth sailing from there, right? Well, not in most people’s experience. Sometimes things go wrong, and you need to know how to respond to a real-life AI incident. In this episode, Andrew and Patrick from BNH.ai join us to discuss an AI incident response plan along with some general discussion of debugging models, discrimination, privacy, and security.
Many people are excited about creating usable speech technology. However, most of the audio data used by large companies isn’t available to the majority of people, and that data is often biased in terms of language, accent, and gender. Jenny, Josh, and Remy from Mozilla join us to discuss how Mozilla is building an open-source voice database that anyone can use to make innovative apps for devices and the web (Common Voice). They also discuss efforts through Mozilla’s fellowship program to develop speech tech for African languages and understand bias in datasets.
Waymo’s mission is to make it safe and easy for people and things to get where they’re going.
After describing the state of the industry, Drago Anguelov - Principal Scientist and Head of Research at Waymo - takes us on a deep dive into the world of AI-powered autonomous driving. Starting with Waymo’s approach to autonomous driving, Drago then delights Daniel and Chris with a tour of the algorithmic tools in the autonomy toolbox.
Hilary Mason is building a new way for kids and families to create stories with AI. It’s called Hidden Door, and in her first interview since founding it, Hilary reveals to Chris and Daniel what the experience will be like for kids. It’s the first Practical AI episode in which some of the questions came from Chris’s 8-year-old daughter Athena.
Hilary also shares her insights into various topics, like how to build data science communities during the COVID-19 Pandemic, reasons why data science goes wrong, and how to build great data-based products. Don’t miss this episode packed with hard-won wisdom!
Everyone working in data science and AI knows about Anaconda and has probably “conda install”-ed something. But how did Anaconda get started, and what are they working on now? Peter Wang, CEO of Anaconda and creator of PyData and popular packages like Bokeh and DataShader, joins us to discuss that and much more. Peter gives some great insights on the Python AI ecosystem and very practical advice for scaling up your data science operation.
We made it to 100 episodes of Practical AI! It has been a privilege to have had so many great guests and discussions about everything from AGI to GPUs to AI for good. In this episode, we circle back to the beginning when Jerod and Adam from The Changelog helped us kick off the podcast. We discuss how our perspectives have changed over time, what it has been like to host an AI podcast, and what the future of AI might look like. (GIVEAWAY!)
Come hang with the bad boys of natural language processing (NLP)! Jack Morris joins Daniel and Chris to talk about TextAttack, a Python framework for adversarial attacks, data augmentation, and model training in NLP. TextAttack will improve your understanding of your NLP models, so come prepared to rumble with your own adversarial attacks!
Sasha Rush, of Cornell Tech and Hugging Face, catches us up on all the things happening with Hugging Face and transformers. Last time we had Clem from Hugging Face on the show (episode 35), their transformers library wasn’t even a thing yet. Oh how things have changed! This time Sasha tells us all about Hugging Face’s open source NLP work, gives us an intro to the key components of transformers, and shares his perspective on the future of AI research conferences.
DevOps for deep learning is well… different. You need to track both data and code, and you need to run multiple different versions of your code for long periods of time on accelerated hardware. Allegro AI is helping data scientists manage these workflows with their open source MLOps solution called Trains. Nir Bar-Lev, Allegro’s CEO, joins us to discuss their approach to MLOps and how to make deep learning development more robust.
The multidisciplinary field of AI Ethics is brand new, and is currently being pioneered by a relatively small number of leading AI organizations and academic institutions around the world. AI Ethics focuses on ensuring that harmful, unintended outcomes of AI technology implementations occur as rarely as possible. Daniel and Chris discuss strategies for arriving at AI ethical principles suitable for your own organization, and what is involved in implementing those strategies in the real world. Tune in for a practical AI primer on AI Ethics!
Daniel and Chris get you Fully-Connected with open source software for artificial intelligence.
In addition to defining what open source is, they discuss where to find open source tools and data, and how you can contribute back to the open source AI community.
A lot of effort is put into the training of AI models, but, for those of us who actually want to run AI models in production, performance and scaling quickly become blockers. Nikita from MemSQL joins us to talk about how people are integrating ML/AI inference at scale into existing SQL-based workflows. He also touches on how model features and raw files can be managed and integrated with distributed databases.
This Fully-Connected episode has it all: news, updates on AI/ML tooling, discussions about AI workflows, and learning resources. Chris and Daniel break down the various roles to be played in AI development, including scoping out a solution, finding AI value, experimentation, and more technical engineering tasks. They also point out some good resources for exploring bias in your data/model and monitoring for fairness.
Daniel and Chris go beyond the current state of the art in deep learning to explore the next evolutions in artificial intelligence. From Yoshua Bengio’s NeurIPS keynote, which urges us forward towards System 2 deep learning, to DARPA’s vision of a 3rd Wave of AI, Chris and Daniel investigate the incremental steps between today’s AI and possible future manifestations of artificial general intelligence (AGI).
The CEO of Darwin AI, Sheldon Fernandez, joins Daniel to discuss generative synthesis and its connection to explainability. You might have heard of AutoML and meta-learning. Well, generative synthesis tackles similar problems from a different angle and results in compact, explainable networks. This episode is fascinating and very timely.
On the heels of NVIDIA’s latest announcements, Daniel and Chris explore how the new NVIDIA Ampere architecture evolves the high-performance computing (HPC) landscape for artificial intelligence. After investigating the new specifications of the NVIDIA A100 Tensor Core GPU, Chris and Daniel turn their attention to the data center with the NVIDIA DGX A100, and then finish their journey at “the edge” with the NVIDIA EGX A100 and the NVIDIA Jetson Xavier NX.
Chandler McCann tells Daniel and Chris about how DataRobot engaged in a project to develop sustainable water solutions with the Global Water Challenge (GWC). They analyzed over 500,000 data points to predict future water point breaks. This enabled African governments to make data-driven decisions related to budgeting, preventative maintenance, and policy in order to promote and protect people’s access to safe water for drinking and washing. From this effort sprang DataRobot’s larger AI for Good initiative.
Daniel and Chris get you Fully-Connected with AI questions from listeners and online forums:
- What do you think is the next big thing?
- What are CNNs?
- How does one start developing an AI-enabled business solution?
- What tools do you use every day?
- What will AI replace?
- And more…
Daniel and Chris have a fascinating discussion with Anna Goldie and Azalia Mirhoseini from Google Brain about the use of reinforcement learning for chip floor planning - or placement - in which many new designs are generated, and then evaluated, to find an optimal component layout. Anna and Azalia also describe the use of graph convolutional neural networks in their approach.
In the midst of the COVID-19 pandemic, Daniel and Chris have a timely conversation with Lucy Lu Wang of the Allen Institute for Artificial Intelligence about the COVID-19 Open Research Dataset (CORD-19). She relates how CORD-19 was created and organized, and how researchers around the world are currently using the data to answer important COVID-19 questions that will help the world through this ongoing crisis.
AI legend Stuart Russell, the Berkeley professor who leads the Center for Human-Compatible AI, joins Chris to share his insights into the future of artificial intelligence. Stuart is the author of Human Compatible, and the upcoming 4th edition of his perennial classic Artificial Intelligence: A Modern Approach, which is widely regarded as the standard text on AI. After exposing the shortcomings inherent in deep learning, Stuart goes on to propose a new practitioner approach to creating AI that avoids harmful unintended consequences, and offers a path forward towards a future in which humans can safely rely on provably beneficial AI.
So many AI developers are coming up with creative, useful COVID-19 applications during this time of crisis. Among those are Timo from Deepset-AI and Tony from Intel. They are working on a question answering system for pandemic-related questions called COVID-QA. In this episode, they describe the system, related annotation of the CORD-19 data set, and ways that you can contribute!
Daniel Wilson and Rob Fletcher of ESRI hang with Chris and Daniel to chat about how AI powers modern geographic information systems (GIS) and location intelligence. They illuminate the various models used for GIS, spatial analysis, remote sensing, real-time visualization, and 3D analytics. You don’t want to miss the part about their work for the DoD’s Joint AI Center in humanitarian assistance / disaster relief.
Catherine Breslin of Cobalt joins Daniel and Chris to do a deep dive on speech recognition. She also discusses how the technology is integrated into virtual assistants (like Alexa) and is used in other non-assistant contexts (like transcription and captioning). Along the way, she teaches us how to assemble a lexicon, acoustic model, and language model to bring speech recognition to life.
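As a rough intuition for how those three components fit together, here is a hypothetical toy sketch (our illustration, not Catherine’s code): a lexicon maps words to phones, an acoustic model scores phones against the audio, and a language model scores word sequences. Real systems use probabilistic decoders over far richer models; this only shows how the scores combine.

```python
# Toy decoder combining a lexicon, acoustic model, and language model.
# All numbers are made up for illustration.
import math

lexicon = {"hi": ["HH", "AY"], "high": ["HH", "AY"], "eye": ["AY"]}

# Acoustic model: log-probability of each phone given the audio.
acoustic = {"HH": math.log(0.9), "AY": math.log(0.9)}

# Bigram language model: log P(word | previous word), "<s>" = sentence start.
lm = {("<s>", "hi"): math.log(0.5),
      ("<s>", "high"): math.log(0.1),
      ("<s>", "eye"): math.log(0.4)}

def score(words):
    """Sum acoustic evidence and language-model prior for a word sequence."""
    total, prev = 0.0, "<s>"
    for w in words:
        total += sum(acoustic[p] for p in lexicon[w])  # acoustic evidence
        total += lm[(prev, w)]                         # linguistic prior
        prev = w
    return total

candidates = [["hi"], ["high"], ["eye"]]
best = max(candidates, key=score)
# "hi" and "high" are homophones, so they score identically acoustically;
# the language model breaks the tie in favor of "hi".
print(best)
```

The episode’s point is that each component can be built and improved separately, and the decoder stitches their scores together to produce a transcript.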