It’s one thing to gather some labels for your data. It’s another thing to integrate data labeling into your workflows and infrastructure in a scalable, secure, and useful way. Mark from Xelex joins us to talk through some of what he has learned after helping companies scale their data annotation efforts. We get into workflow management, labeling instructions, team dynamics, and quality assessment. This is a super practical episode!
WeightWatcher, created by Charles Martin, is an open source diagnostic tool for analyzing Neural Networks without training or even test data! Charles joins us in this episode to discuss the tool and how it fills certain gaps in current model evaluation workflows. Along the way, we discuss statistical methods from physics and a variety of practical ways to modify your training runs.
The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2).
(Image from stability.ai)
AI is increasingly being applied in creative and artistic ways, especially with recent tools integrating models like Stable Diffusion. This is making some artists mad. How should we be thinking about these trends more generally, and how can we as practitioners release and license models anticipating human impacts? We explore this along with other topics (like AI models detecting swimming pools 😊) in this fully connected episode.
In this Fully-Connected episode, Daniel and Chris discuss concerns of privacy in the face of ever-improving AI / ML technologies. Evaluating AI’s impact on privacy from various angles, they note that ethical AI practitioners and data scientists have an enormous burden, given that much of the general population may not understand the implications of the data privacy decisions of everyday life.
This intentionally thought-provoking conversation advocates consideration and action from each listener when it comes to evaluating how their own activities either protect or violate the privacy of those whom they impact.
Differentiating between what is real versus what is fake on the internet can be challenging. Historically, AI deepfakes have only added to the confusion and chaos, but when labeled and intended for good, deepfakes can be extremely helpful. But with all of the misinformation surrounding deepfakes, it can be hard to see the benefits they bring. Lior Hakim, CTO at Hour One, joins Chris and Daniel to shed some light on the practical uses of deepfakes. He addresses the AI technology behind deepfakes, how to make positive use of deep fakes such as breaking down communications barriers, and shares how Hour One specializes in the development of virtual humans for use in professional video communications.
Daniel and Chris cover the AI news of the day in this wide-ranging discussion. They start with Truss from Baseten while addressing how to categorize AI infrastructure and tools. Then they move on to transformers (again!), and somehow arrive at an AI pilot model from CMU that can navigate crowded airspace (much to Chris’s delight).
AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.
Every year Mozilla releases an Internet Health Report that combines research and stories exploring what it means for the internet to be healthy. This year’s report is focused on AI. In this episode, Solana and Bridget from Mozilla join us to discuss the power dynamics of AI and the current state of AI worldwide. They highlight concerning trends in the application of this transformational technology along with positive signs of change.
In this Fully-Connected episode, Chris and Daniel explore the geopolitics, economics, and power-brokering of artificial intelligence. What does control of AI mean for nations, corporations, and universities? What does control or access to AI mean for conflict and autonomy? The world is changing rapidly, and the rate of change is accelerating. Daniel and Chris look behind the curtain in the halls of power.
In this Fully-Connected episode, Daniel and Chris explore DALL-E 2, the amazing new model from Open AI that generates incredibly detailed novel images from text captions for a wide range of concepts expressible in natural language. Along the way, they acknowledge that some folks in the larger AI community are suggesting that sophisticated models may be approaching sentience, but together they pour cold water on that notion. But they can’t seem to get away from DALL-E’s images of raccoons in space, and of course, who would want to?
Coqui is a speech technology startup that making huge waves in terms of their contributions to open source speech technology, open access models and data, and compelling voice cloning functionality. Josh Meyer from Coqui joins us in this episode to discuss cloning voices that have emotion, fostering open source, and how creators are using AI tech.
Drausin Wulsin, Director of ML at Immunai, joins Daniel & Chris to talk about the role of AI in immunotherapy, and why it is proving to be the foremost approach in fighting cancer, autoimmune disease, and infectious diseases.
The large amount of high dimensional biological data that is available today, combined with advanced machine learning techniques, creates unique opportunities to push the boundaries of what is possible in biology.
To that end, Immunai has built the largest immune database called AMICA that contains tens of millions of cells. The company uses cutting-edge transfer learning techniques to transfer knowledge across different cell types, studies, and even species.
While scaling up machine learning at Instacart, Montana Low and Lev Kokotov discovered just how much you can do with the Postgres database. They are building on that work with PostgresML, an extension to the database that lets you train and deploy models to make online predictions using only SQL. This is super practical discussion that you don’t want to miss!
Could we create a digital human that processes data in a variety of modalities and detects emotions? Well, that’s exactly what NTT DATA Services is trying to do, and, in this episode, Theresa Kushner joins us to talk about their motivations, use cases, current systems, progress, and related ethical issues.
In this “fully connected” episode of the podcast, we catch up on some recent developments in the AI world, including a new model from DeepMind called Gato. This generalist model can play video games, caption images, respond to chat messages, control robot arms, and much more. We also discuss the use of AI in the entertainment industry (e.g., in new Top Gun movie).
Hugging Face is increasingly becomes the “hub” of AI innovation. In this episode, Merve Noyan joins us to dive into this hub in more detail. We discuss automation around model cards, reproducibility, and the new community features. If you are wanting to engage with the wider AI community, this is the show for you!
Don’t all AI methods need a bunch of data to work? How could AI help document and revitalize endangered languages with “human-in-the-loop” or “active learning” methods? Sarah Moeller from the University of Florida joins us to discuss those and other related questions. She also shares many of her personal experiences working with languages in low resource settings.
AI is discovering new drugs. Sound like science fiction? Not at Absci! Sean and Joshua join us to discuss their AI-driven pipeline for drug discovery. We discuss the tech along with how it might change how we think about healthcare at the most fundamental level.
We all hear a lot about MLOps these days, but where does MLOps end and DevOps begin? Our friend Luis from OctoML joins us in this episode to discuss treating AI/ML models as regular software components (once they are trained and ready for deployment). We get into topics including optimization on various kinds of hardware and deployment of models at the edge.