Natural Language Processing

Natural language processing (NLP) is the study of how computers process, analyze, and generate human language.

TechCrunch

Hugging Face raises $15 million to build their open source NLP library 🤗

Congrats to Clément and the Hugging Face team on this milestone!

The company first built a mobile app that let you chat with an artificial BFF, a sort of chatbot for bored teenagers. More recently, the startup released an open-source library for natural language processing applications. And that library has been massively successful.

The library is called Transformers, billed as ‘state-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.’

If any of this rings a bell, it may be because Practical AI co-host Daniel Whitenack has been a huge supporter of Hugging Face for a long time and mentions them often on the show. We even had Clément on the show back in March of this year.

Google github.com

Using Google's speech recognition to beat Google's ReCaptcha

A little ingenuity paired with changes to ReCaptcha’s audio challenge allowed this hacker to create a Python ‘robot’ that defeats the ‘not a robot’ test with 90% accuracy. The approach is brilliant:

  1. Navigate to Google’s ReCaptcha Demo site
  2. Navigate to audio challenge for ReCaptcha
  3. Download audio challenge
  4. Submit audio challenge to Speech To Text
  5. Parse response and type answer
  6. Press submit and check if successful

The code is small enough to grok in 5-10 minutes. Love it!

TensorFlow cvcompiler.com

An NLP tool for improving dev resumes

CV Compiler is an online resume analysis tool designed exclusively for software engineers.

The review technology scans the resume for programming keywords and checks how they are used against industry best practices.
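To make the idea concrete, here is a rough sketch of what keyword scanning on a resume could look like. This is not CV Compiler's code: the keyword list, the "weak verb" list, and the function names are all hypothetical, and it uses only the Python standard library rather than the NLP stack the tool is actually built on.

```python
import re
from collections import Counter

# Hypothetical keyword list; the real tool's vocabulary is not described in the post.
TECH_KEYWORDS = {"python", "tensorflow", "docker", "sql", "kubernetes"}

# Hypothetical vague verbs a reviewer might want to flag.
WEAK_VERBS = {"used", "worked", "helped"}

# Word-like tokens; + and # are kept so names like C++ and C# survive.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9+#]*")


def tokenize(text):
    """Return lowercase word-like tokens from a piece of text."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def scan_resume(text):
    """Count known tech keywords and flag lines that pair them with weak verbs."""
    counts = Counter()
    flagged = []
    for line in text.splitlines():
        tokens = tokenize(line)
        hits = [t for t in tokens if t in TECH_KEYWORDS]
        counts.update(hits)
        if hits and WEAK_VERBS.intersection(tokens):
            flagged.append(line.strip())
    return counts, flagged


resume = """Used Python to write scripts.
Built a TensorFlow pipeline that cut training time by 30%.
Deployed services with Docker and Kubernetes."""

counts, flagged = scan_resume(resume)
print(counts)
print(flagged)
```

On this sample, only the first line is flagged: it mentions a keyword but describes the work with a vague verb and no outcome. A real tool would need far richer rules than a pair of word lists.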

CV Compiler was built using Python with libraries NLTK and spaCy for tokenization, lemmatization, and POS-tagging.

The internal analysis engine for large datasets (resumes, job descriptions) was built upon a Seq2Seq model in TensorFlow.
