9 out of 10 AI projects don’t end up creating value in production. Why? At least partly because these projects utilize unstable models and drifting data. In this episode, Roey from BeyondMinds gives us some insights on how to filter garbage input, detect risky output, and generally develop more robust AI systems.
How did we get from symbolic AI to deep learning models that help you write code (e.g., GitHub and OpenAI’s new Copilot)? That’s what Chris and Daniel discuss in this episode about the history and future of deep learning (with some help from an article recently published in ACM and written by the luminaries of deep learning).
Pinecone is the first vector database for machine learning. Edo Liberty explains to Chris how vector similarity search works, and its advantages over traditional database approaches for machine learning. It enables one to search through billions of vector embeddings for similar matches, in milliseconds, and Pinecone is a managed service that puts this capability at the fingertips of machine learning practitioners.
William Falcon wants AI practitioners to spend more time on model development, and less time on engineering. PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research that lets you train on multiple GPUs, TPUs, CPUs and even in 16-bit precision without changing your code! In this episode, we dig deep into Lightning, how it works, and what it is enabling. William also discusses the Grid AI platform (built on top of PyTorch Lightning). This platform lets you seamlessly train 100s of Machine Learning models on the cloud from your laptop.
Chris and Daniel sit down to chat about some exciting new AI developments including wav2vec-u (an unsupervised speech recognition model) and meta-learning (a new book about “How To Learn Deep Learning And Thrive In The Digital World”). Along the way they discuss engineering skills for AI developers and strategies for launching AI initiatives in established companies.
Tuhin Srivastava tells Daniel and Chris why BaseTen is the application development toolkit for data scientists. BaseTen’s goal is to make it simple to serve machine learning models, write custom business logic around them, and expose those through API endpoints without configuring any infrastructure.
Today we’re sharing a special crossover episode from The Changelog podcast here on Practical AI. Recently, Daniel Whitenack joined Jerod Santo to talk with José Valim, Elixir creator, about Numerical Elixir. This is José’s newest project that’s bringing Elixir into the world of machine learning. They discuss why José chose this as his next direction, the team’s layered approach, influences and collaborators on this effort, and their awesome collaborative notebook that’s built on Phoenix LiveView.
90% of AI / ML applications never make it to market, because fine-tuning models for maximum performance across disparate ML software solutions and hardware backends requires a ton of manual labor and is cost-prohibitive. Luis Ceze and his team created Apache TVM at the University of Washington, then left and founded OctoML to bring the project to market.
This API supports multiple deep learning frameworks (TensorFlow, PyTorch, etc.), supports multiple hardware accelerators (CPU, GPU, Edge TPU), and is based on open source models. You can think of it a bit like Google’s Cloud Vision API, only open source and self-hosted.
To say that Jeff Adams is a trailblazer when it comes to speech technology is an understatement. Along with many other notable accomplishments, his team at Amazon developed the Echo, Dash, and Fire TV, changing our perception of how we could interact with devices in our home. Jeff now leads Cobalt Speech and Language, and he was kind enough to join us for a discussion about human-computer interaction, multimodal AI tasks, the history of language modeling, and AI for social good.
This week Elixir creator José Valim joins Jerod and Practical AI’s Daniel Whitenack to discuss Numerical Elixir, his new project that’s bringing Elixir into the world of machine learning. We discuss why José chose this as his next direction, the team’s layered approach, influences and collaborators on this effort, and their awesome collaborative notebook project that’s built on Phoenix LiveView.
Smart home data is complicated. There are all kinds of devices, and they are in many different combinations, geographies, configurations, etc. This complicated data situation is further exacerbated during a pandemic when time series data seems to be filled with anomalies. Evan Welbourne joins us to discuss how Amazon is synthesizing this disparate data into functionality for the next generation of smart homes. He discusses the challenges of working with smart home technology, and he describes how they developed their latest feature called “hunches.”
This article starts with a concise description of the relationship and differences of these 3 commonly used industry terms. Then it digs into the history.
Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence, but these names arose from an interesting history. In addition, there are fascinating technical characteristics that can differentiate deep learning from other types of machine learning…essential working knowledge for anyone with ML, DL, or AI in their skillset.
Ro Gupta from CARMERA teaches Daniel and Chris all about road intelligence. CARMERA maintains the maps that move the world, from HD maps for automated driving to consumer maps for human navigation.
Nhung Ho joins Daniel and Chris to discuss how data science creates insights into financial operations and economic conditions. They delve into topics ranging from predictive forecasting to aid small businesses, to learning about the economic fallout from the COVID-19 pandemic.
Dave Lacey takes Daniel and Chris on a journey that connects the user interfaces that we already know - TensorFlow and PyTorch - with the layers that connect to the underlying hardware. Along the way, we learn about Poplar Graph Framework Software. If you are the type of practitioner who values ‘under the hood’ knowledge, then this is the episode for you.
Nikola Mrkšić, CEO & Co-Founder of PolyAI, takes Daniel and Chris on a deep dive into conversational AI, describing the underlying technologies, and teaching them about the next generation of voice assistants that will be capable of handling true human-level conversations. It’s an episode you’ll be talking about for a long time!
Chris has the privilege of talking with Stanford Professor Margot Gerritsen, who co-leads the Women in Data Science (WiDS) Worldwide Initiative. This is a conversation that everyone should listen to. Professor Gerritsen’s profound insights into how we can all help the women in our lives succeed - in data science and in life - make this a ‘must listen’ episode for everyone, regardless of gender.
David Sweet, author of “Tuning Up: From A/B testing to Bayesian optimization”, introduces Dan and Chris to system tuning, and takes them from A/B testing to response surface methodology, contextual bandits, and finally Bayesian optimization. Along the way, we get fascinating insights into recommender systems and high-frequency trading!
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.
The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.
Currently in beta.
Our Slack community wanted to hear about AI-driven drug discovery, and we listened. Abraham Heifets from Atomwise joins us for a fascinating deep dive into the intersection of deep learning models and molecule binding. He describes how these methods work and how they are beginning to help create drugs for “undruggable” diseases!
Empirical analysis from Roy Schwartz (Hebrew University of Jerusalem) and Jesse Dodge (AI2) suggests the AI research community has paid relatively little attention to computational efficiency. A focus on accuracy rather than efficiency increases the carbon footprint of AI research and increases research inequality. In this episode, Jesse and Roy advocate for increased research activity in Green AI (AI research that is more environmentally friendly and inclusive). They highlight success stories and help us understand the practicalities of making our workflows more efficient.
The README has a bunch of examples of things you might search for and the results you’d get back. (“The Transamerica Pyramid”, anyone?)
The author also has another related project where you can search Unsplash in like manner.