Natural Language Processing

Natural language processing (NLP) is the study of how computers process, understand, and generate human language.

Practical AI #205

NLP research by & for local communities

While at EMNLP 2022, Daniel got a chance to sit down with an amazing group of researchers creating NLP technology that actually works for their local language communities. Just Zwennicker (Universiteit van Amsterdam) discusses his work on a machine translation system for Sranan Tongo, a creole language that is spoken in Suriname. Andiswa Bukula (SADiLaR), Rooweither Mabuya (SADiLaR), and Bonaventure Dossou (Lanfrica, Mila) discuss their work with Masakhane to strengthen and spur NLP research in African languages, for Africans, by Africans.

The group emphasized the need for more linguistically diverse NLP systems that work in scenarios of data scarcity, non-Latin scripts, rich morphology, etc. You don’t want to miss this one!

Practical AI #196

What's up, DocQuery?

Chris sits down with Ankur Goyal to talk about DocQuery, Impira’s new open source ML model. DocQuery lets you ask questions about semi-structured data (like invoices) and unstructured documents (like contracts) using Large Language Models (LLMs). Ankur illustrates many of the ways DocQuery can help people tame documents, and references Chris’s real-life tasks as a non-profit director to demonstrate that DocQuery is indeed practical AI.

AI (Artificial Intelligence) github.com

Kern AI's refinery is a data-centric IDE for NLP

Like the data-centric sibling of your favorite programming environment. It provides an easy-to-use interface for weak supervision as well as extensive data management, neural search and monitoring to ensure that the quality of your training data is as good as possible.

This won’t rid you of the need to manually label, but it’ll save you time in the process!

Practical AI #185

DALL-E is one giant leap for raccoons! 🔭

In this Fully-Connected episode, Daniel and Chris explore DALL-E 2, the amazing new model from OpenAI that generates incredibly detailed novel images from text captions for a wide range of concepts expressible in natural language. Along the way, they acknowledge that some folks in the larger AI community are suggesting that sophisticated models may be approaching sentience, but together they pour cold water on that notion. But they can’t seem to get away from DALL-E’s images of raccoons in space, and of course, who would want to?

Practical AI #178

Active learning & endangered languages

Don’t all AI methods need a bunch of data to work? How could AI help document and revitalize endangered languages with “human-in-the-loop” or “active learning” methods? Sarah Moeller from the University of Florida joins us to discuss those and other related questions. She also shares many of her personal experiences working with languages in low-resource settings.

Practical AI #158

Zero-shot multitask learning

In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then it's on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. They dive into T0, a series of natural language processing (NLP) models trained specifically for research on zero-shot multitask learning. Daniel provides a brief tour of what's possible with the T0 family. They finish up with a couple of new learning resources.

Python github.com

An open source, online reverse dictionary

This is the first time I’ve heard of a reverse dictionary, but now that I have… so cool!

Opposite to a regular (forward) dictionary that provides definitions for query words, a reverse dictionary returns words semantically matching the query descriptions.

Ever had a word on the tip of your tongue and you Just. Can’t. Think of it?! Reverse dictionary!
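To make the idea concrete, here is a minimal sketch of a reverse lookup. It is not how the linked project works: it swaps in simple word overlap (Jaccard similarity) between the query and each definition, and the three-entry `TOY_GLOSSARY` and the `reverse_lookup` helper are hypothetical names invented for illustration.

```python
import re

# Hypothetical toy glossary (an assumption for illustration only);
# a real reverse dictionary would index a full lexicon.
TOY_GLOSSARY = {
    "umbrella": "a folding canopy that protects a person from rain",
    "telescope": "an instrument for viewing distant objects in the sky",
    "dictionary": "a book that lists words and gives their definitions",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def reverse_lookup(description, glossary=TOY_GLOSSARY, top_k=1):
    """Return the top_k words whose definitions best overlap the description.

    Scoring is Jaccard similarity over word sets, a crude stand-in
    for the semantic matching a real system would use.
    """
    query = tokenize(description)
    scored = []
    for word, definition in glossary.items():
        tokens = tokenize(definition)
        union = query | tokens
        score = len(query & tokens) / len(union) if union else 0.0
        scored.append((score, word))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_k]]


if __name__ == "__main__":
    print(reverse_lookup("something that protects you from rain"))
```

Word overlap fails as soon as the description uses different vocabulary than the definition, which is exactly the gap that semantic matching is meant to close.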

Practical AI #146

Exploring a new AI lexicon

We’re back with another Fully Connected episode – Daniel and Chris dive into a series of articles called ‘A New AI Lexicon’ that collectively explore alternate narratives, positionalities, and understandings to the better known and widely circulated ways of talking about AI. The fun begins early as they discuss and debate ‘An Electric Brain’ with strong opinions, and consider viewpoints that aren’t always popular.

Practical AI #145

NLP to help pregnant mothers in Kenya

In Kenya, 33% of maternal deaths are caused by delays in seeking care, and 55% of maternal deaths are caused by delays in action or inadequate care by providers. Jacaranda Health is employing NLP and dialogue system techniques to help mothers experience childbirth safely and with respect and to help newborns get a safe start in life. Jay and Sathy from Jacaranda join us in this episode to discuss how they are using AI to prioritize incoming SMS messages from mothers and help them get the care they need.

Mozilla

Mozilla Common Voice adds 16 new languages and 4,600 new hours of speech

That’s a big addition. Here’s what Hillary Juma (Common Voice’s community manager) had to say about it:

Internet access is increasingly mediated through speech: Voice assistants and smart speakers give us directions, search for information, connect us to friends, used in assistive technology and much more. Yet this technology doesn’t work for millions of people. For example, neither Amazon’s Alexa, Apple’s Siri, nor Google Home support a single native African language.

By giving individuals the ability to share their speech, we can help ensure all communities have access to voice technology and the opportunity it unlocks.

What a great initiative! (I first heard about Common Voice on Practical AI.)

Practical AI #133

25 years of speech technology innovation

To say that Jeff Adams is a trailblazer when it comes to speech technology is an understatement. Along with many other notable accomplishments, his team at Amazon developed the Echo, Dash, and Fire TV, changing our perception of how we could interact with devices in our home. Jeff now leads Cobalt Speech and Language, and he was kind enough to join us for a discussion about human-computer interaction, multimodal AI tasks, the history of language modeling, and AI for social good.

Practical AI #129

Going full bore with Graphcore!

Dave Lacey takes Daniel and Chris on a journey that connects the user interfaces that we already know - TensorFlow and PyTorch - with the layers that connect to the underlying hardware. Along the way, we learn about Poplar Graph Framework Software. If you are the type of practitioner who values ‘under the hood’ knowledge, then this is the episode for you.

Tooling github.com

Search inside YouTube videos using natural language

Use OpenAI’s CLIP neural network to search inside YouTube videos. You can try it by running the notebook on Google Colab.

The README has a bunch of examples of things you might search for and the results you’d get back. (“The Transamerica Pyramid”, anyone?)

The author also has a related project that lets you search Unsplash the same way.

AI (Artificial Intelligence) github.com

Introducing spaCy 3.0

You may recall spaCy from this episode of Practical AI with its creators. If not, now’s a great time to introduce yourself to the project. Version 3.0 looks like a fantastic new release of the wildly popular NLP library. The list of new and improved things is too long for me to reproduce here, so go check it out for yourself.

There are also three YouTube videos accompanying the release. That’s evidence of just how much effort and polish went into this.

Practical AI #115

From research to product at Azure AI

Bharat Sandhu, Director of Azure AI and Mixed Reality at Microsoft, joins Chris and Daniel to talk about how Microsoft is making AI accessible and productive for users, and how AI solutions can address real world challenges that customers face. He also shares Microsoft’s research-to-product process, along with the advances they have made in computer vision, image captioning, and how researchers were able to make AI that can describe images as well as people do.
