A formalization and continuation of this old Quora question about the most important research papers which all NLP students “should definitely read”.
Unsplash has split the dataset into two bundles:
- Lite, which you can download w/ a click, but is limited to 25K images
- Full, which you have to request access to and is limited to non-commercial use
This is interesting for a couple of reasons. First, it’s a great resource for anyone training models for image classification, etc. Second, it’s a nice business model for Unsplash as a startup.
I predict that, unlike its two predecessors (GPT and GPT-2), OpenAI GPT-3 will eventually be widely used to pretend the author of a text is a person of interest, with unpredictable and amusing effects on various communities.
If you’re going to read this post, make sure you stick around until the end.
For years now I’ve been asking AI/ML experts when these powerful-yet-complicated tools will become available to average developers like you and me. It’s happening! Just look at how high-level this text generation code sample is:
```python
import openai

prompt = """snipped for brevity's sake"""

response = openai.Completion.create(
    model="davinci",
    prompt=prompt,
    stop="\n",
    temperature=0.9,
    max_tokens=100,
)
```
They’re offering all kinds of language tasks: semantic search, summarization, sentiment analysis, content generation, translation, and more. The API is still in beta and there’s a waitlist, but this is exciting news, nonetheless.
Neuropod is a library that provides a uniform interface to run deep learning models from multiple frameworks in C++ and Python. Neuropod makes it easy for researchers to build models in a framework of their choosing while also simplifying productionization of these models.
This looks nice because you can make your inference code framework agnostic and easily switch between frameworks if necessary. Currently supports TensorFlow, PyTorch, TorchScript, and Keras.
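The appeal is the single calling convention across frameworks. Here’s a toy sketch of that idea in plain Python (illustrative classes, not Neuropod’s actual API): every backend adapts its framework to one `infer()` signature, so the calling code never touches framework-specific objects.

```python
# Toy framework-agnostic inference interface (not Neuropod's real API):
# each backend wraps one framework behind the same infer() signature.

class InferenceBackend:
    def infer(self, inputs: dict) -> dict:
        raise NotImplementedError

class DoublerBackend(InferenceBackend):
    """Stand-in for a real TensorFlow or PyTorch backend."""
    def infer(self, inputs):
        return {"y": [2 * v for v in inputs["x"]]}

def run_model(backend: InferenceBackend, inputs: dict) -> dict:
    # Calling code is identical no matter which framework backs the model.
    return backend.infer(inputs)

print(run_model(DoublerBackend(), {"x": [1, 2, 3]}))  # {'y': [2, 4, 6]}
```

Swapping frameworks then means swapping the backend object, not rewriting the inference code.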
Acme is a library of reinforcement learning (RL) agents and agent building blocks. Acme strives to expose simple, efficient, and readable agents, that serve both as reference implementations of popular algorithms and as strong baselines, while still providing enough flexibility to do novel research. The design of Acme also attempts to provide multiple points of entry to the RL problem at differing levels of complexity.
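For readers new to RL, the loop Acme structures for you can be sketched with toy stand-ins (illustrative classes, not Acme’s actual API): the agent acts, observes a reward, and updates its action values, step after step.

```python
# Toy agent/environment loop (not Acme's API).

class CoinEnv:
    """Reward 1 for action 1, reward 0 for action 0; fixed episode length."""
    def __init__(self, steps=100):
        self.steps = steps

    def run_episode(self, agent):
        total = 0
        for _ in range(self.steps):
            action = agent.act()
            reward = 1 if action == 1 else 0
            agent.learn(action, reward)
            total += reward
        return total

class GreedyAgent:
    def __init__(self, actions=(0, 1)):
        self.values = {a: 0.0 for a in actions}
        self.untried = list(actions)    # try everything once, then exploit

    def act(self):
        if self.untried:
            return self.untried.pop(0)
        return max(self.values, key=self.values.get)

    def learn(self, action, reward):
        # Move the action's value estimate halfway toward the reward.
        self.values[action] += 0.5 * (reward - self.values[action])

env, agent = CoinEnv(steps=100), GreedyAgent()
total = env.run_episode(agent)
print(total)  # 99: one step wasted exploring the bad action
```

Acme’s value is in providing well-tested versions of the agent side of this loop at varying levels of sophistication.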
Learn how a CNN model transforms different images into class predictions with all of the intermediate steps along the way. It’s interactive, so you can select individual neurons and inspect the details.
A fun little project that uses a neural network to map your facial movements onto an avatar of your choice. You have to watch the demo to get the full effect.
If you say… “Hey, computer, play me some music” and then it starts playing you some music, there’s a number of things that have to have happened for that to come true.
Meta-Blocks is a modular toolbox for research, experimentation, and reproducible benchmarking of learning-to-learn algorithms. The toolbox provides flexible APIs for working with MetaDatasets, TaskDistributions, and MetaLearners (see the figure below). The APIs make it easy to implement a variety of meta-learning algorithms, run them on well-established and emerging benchmarks, and add your own meta-learning problems to the suite and benchmark algorithms on them.
This repo is still under “heavy construction” (a.k.a. unstable) so downloader beware, but it’s worth a star/bookmark for later use.
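To make the three abstractions concrete, here’s a toy sketch with hypothetical stand-ins (not Meta-Blocks’ real classes): a task distribution yields small learning problems, and a Reptile-style meta-learner adapts a shared initialization that makes each new task faster to solve.

```python
# Toy meta-learning sketch (hypothetical classes, not Meta-Blocks' API).

class TaskDistribution:
    """Each task: fit the slope a in y = a * x, for a in a fixed pool."""
    def __init__(self, slopes):
        self.slopes = slopes

    def sample_tasks(self):
        return [{"x": [1.0, 2.0], "y": [a * 1.0, a * 2.0]} for a in self.slopes]

def solve_task(task, init, steps=50, lr=0.1):
    """Gradient descent on mean squared error for the slope, from `init`."""
    a = init
    for _ in range(steps):
        grad = sum(2 * (a * x - y) * x for x, y in zip(task["x"], task["y"]))
        a -= lr * grad / len(task["x"])
    return a

def meta_learn(dist, meta_steps=20):
    """Reptile-style: nudge the shared init toward each task's solution."""
    init = 0.0
    for _ in range(meta_steps):
        for task in dist.sample_tasks():
            init += 0.3 * (solve_task(task, init) - init)
    return init

dist = TaskDistribution(slopes=[2.0, 4.0])
print(round(meta_learn(dist), 2))  # lands between the two task optima
```

The toolbox’s pitch is exactly this separation: swap in a different `TaskDistribution` or `MetaLearner` without touching the rest.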
Here is my Python source code for training an agent to play Tetris. It can serve as a very basic example of applying reinforcement learning.
Demo on YouTube.
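If you want the gist before diving into the repo, here’s a minimal tabular Q-learning sketch of the same recipe (not the author’s Tetris code) on a 5-state corridor: move left or right, with a reward for reaching the right end. Tetris just swaps in a much bigger state and action space.

```python
import random

# Minimal tabular Q-learning on a 5-state corridor (toy, not Tetris).
N, GAMMA, ALPHA = 5, 0.9, 0.5
Q = {(s, a): 0.0 for s in range(N) for a in (-1, 1)}

random.seed(0)
for _ in range(300):                      # episodes
    s = 0
    for _ in range(20):                   # step limit per episode
        a = random.choice((-1, 1))        # behavior policy: pure exploration
        s2 = max(0, min(N - 1, s + a))    # walls at both ends
        r = 1.0 if s2 == N - 1 else 0.0
        target = r + GAMMA * max(Q[(s2, -1)], Q[(s2, 1)])
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])   # Bellman backup
        s = s2
        if r:                             # reached the goal; end episode
            break

greedy = [max((-1, 1), key=lambda a: Q[(s, a)]) for s in range(N - 1)]
print(greedy)   # the learned policy steps right in every state
```

Q-learning is off-policy, so even this fully random behavior policy is enough to learn the greedy “always go right” policy from the recorded transitions.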
Have you ever posted an image on the public internet and thought, “What if someone used this for something?” Thomas Smith did, and what he discovered about Clearview AI is disturbing…
Someone really has been monitoring nearly everything you post to the public internet. And they genuinely are doing “something” with it.
The someone is Clearview AI. And the something is this: building a detailed profile about you from the photos you post online, making it searchable using only your face, and then selling it to government agencies and police departments who use it to help track you, identify your face in a crowd, and investigate you — even if you’ve been accused of no crime.
I realize that this sounds like a bunch of conspiracy theory baloney. But it’s not. Clearview AI’s tech is very real, and it’s already in use.
How do I know? Because Clearview has a profile on me. And today I got my hands on it.
I used multilingual unsupervised methods (MUSE) to train cross-lingual word embeddings for over 500 languages. I then used these embeddings to extract components of the phrase “wash your hands” from existing target language documents. This resulted in translations of “wash your hands” in 510 languages not currently supported in any public translation platform.
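Once embeddings for two languages live in a shared space, the retrieval step is simple: a word’s translation is its nearest neighbor by cosine similarity. Here’s a toy sketch of that step (not MUSE itself, and with made-up vectors):

```python
import math

# Toy nearest-neighbor translation in a shared embedding space.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical shared-space embeddings for English and Spanish words.
en = {"wash": [0.9, 0.1], "hands": [0.1, 0.9]}
es = {"lavar": [0.88, 0.15], "manos": [0.05, 0.92], "gato": [0.5, 0.5]}

def translate(word):
    # Pick the target-language word whose vector is closest to the source's.
    return max(es, key=lambda w: cosine(en[word], es[w]))

print(translate("wash"), translate("hands"))  # lavar manos
```

The hard part, which MUSE handles, is learning the shared space without parallel data in the first place.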
What exactly is ‘music source separation’?
If you have ever stumbled across those online videos of Freddie Mercury singing what sounds like an a cappella rendition of “Another One Bites the Dust” or a version of Alanis Morissette’s “You Oughta Know” featuring only Flea’s distinctive slapped bass, then you’re already familiar with the concept of music source separation.
Facebook’s research team has figured out a way to do that “with an uncanny level of accuracy”. The technique is called “Demucs” (a portmanteau of “deep extractor for music sources”), and it outperforms other methods (spectrogram-based analysis being the primary one) by quite a bit. Code here.
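Demucs itself works on raw waveforms, but the spectrogram-masking family it’s compared against is easy to sketch: estimate a per-bin mask and multiply it into the mixture spectrogram to isolate one source. The magnitudes below are made up, and a neural separator would predict the mask rather than compute the ideal one.

```python
# Toy spectrogram-masking separation (the baseline family, not Demucs).

vocals = [[0.9, 0.1], [0.8, 0.0]]   # toy magnitude spectrograms (time x freq)
bass   = [[0.1, 0.7], [0.0, 0.9]]
mix    = [[v + b for v, b in zip(vr, br)] for vr, br in zip(vocals, bass)]

# Soft mask: each source's share of each time-frequency bin. A trained
# separator predicts this mask; here we compute the ideal one directly.
mask = [[v / (v + b) for v, b in zip(vr, br)] for vr, br in zip(vocals, bass)]

est_vocals = [[m * x for m, x in zip(mr, xr)] for mr, xr in zip(mask, mix)]
print(est_vocals)   # recovers the vocals spectrogram in this toy case
```

Masking can only reweight energy that’s already in each bin, which is one reason waveform-domain models like Demucs can do better on overlapping sources.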
I love projects like these that push the boundary of what we consider art.
PyTorch3d is designed to integrate smoothly with deep learning methods for predicting and manipulating 3D data. For this reason, all operators in PyTorch3d:
- Are implemented using PyTorch tensors
- Can handle minibatches of heterogeneous data
- Can be differentiated
- Can utilize GPUs for acceleration
Get started with tutorials on deforming a sphere mesh into a dolphin, rendering textured meshes, camera position optimization, and more.
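The “minibatches of heterogeneous data” point is worth unpacking: meshes have different vertex counts, so batching them means padding to a common size and tracking which entries are real. Here’s a plain-Python illustration of that idea (not PyTorch3d’s actual `Meshes` class):

```python
# Toy batching of variable-size meshes via padding plus a validity mask.

meshes = [  # toy vertex lists of different lengths
    [(0, 0, 0), (1, 0, 0), (0, 1, 0)],
    [(0, 0, 0), (1, 1, 1)],
]

max_len = max(len(m) for m in meshes)
PAD = (0.0, 0.0, 0.0)

batch = [m + [PAD] * (max_len - len(m)) for m in meshes]
mask  = [[1] * len(m) + [0] * (max_len - len(m)) for m in meshes]

print(batch[1])  # [(0, 0, 0), (1, 1, 1), (0.0, 0.0, 0.0)]
print(mask)      # [[1, 1, 1], [1, 1, 0]]
```

PyTorch3d handles this bookkeeping with tensors so that downstream ops stay batched, differentiable, and GPU-friendly.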
Here’s a new acronym for you: Generative Teaching Networks (GTN)
GTNs are deep neural networks that generate data and/or training environments on which a learner (e.g., a freshly initialized neural network) trains before being tested on a target task (e.g., recognizing objects in images). One advantage of this approach is that GTNs can produce synthetic data that enables other neural networks to learn faster than when training on real data. That allowed us to search for new neural network architectures nine times faster than when using real data.
Fake data, real results? Sounds pretty slick.
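Here’s a toy illustration of why synthetic data can beat real data for speed (not Uber’s GTN, which *learns* its generator; this hand-crafted “teacher” just makes the core point that a few maximally informative synthetic examples can match a pile of real ones):

```python
import random

# A 1-NN learner trained on two "ideal" synthetic points matches the
# true decision boundary (is x right of 0.5?) on real test data.

def real_data(n, rng):
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, int(x > 0.5)) for x in xs]

def synthetic_data():
    return [(0.25, 0), (0.75, 1)]   # one perfect prototype per class

def train_1nn(data):
    def predict(x):
        return min(data, key=lambda p: abs(p[0] - x))[1]
    return predict

def accuracy(model, test):
    return sum(model(x) == y for x, y in test) / len(test)

rng = random.Random(0)
test = real_data(200, rng)
teacher_model = train_1nn(synthetic_data())
print(accuracy(teacher_model, test))   # 1.0 from just two synthetic points
```

GTNs automate the interesting part: learning, by gradient descent, what those maximally useful training examples should be.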
Show us humans a picture of someone in uniform on a mound of dirt throwing a ball and we will quickly tell you we’re looking at baseball. But how do you make a computer come to the same conclusion?
In this post, we’ll explore basic methods for performing VQA and build our own simple implementation in Python.
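The basic recipe can be caricatured in a few lines (a toy sketch with invented names and hand-coded “features”, not the post’s implementation): extract visual concepts from the image, encode the question, and score candidate answers against the combination.

```python
# Toy VQA pipeline: image features + question encoding -> answer score.

def image_features(image_id):
    # Stand-in for a CNN: returns detected visual concepts for the image.
    return {"pitcher-photo": {"uniform", "mound", "ball"}}[image_id]

def encode_question(question):
    return set(question.lower().replace("?", "").split())

ANSWER_RULES = {  # answer -> visual concepts that support it
    "baseball": {"uniform", "mound", "ball"},
    "soccer": {"uniform", "goal", "ball"},
}

def answer(image_id, question):
    feats = image_features(image_id)
    _ = encode_question(question)  # a real model conditions on this, too;
                                   # this toy scores on image evidence only
    return max(ANSWER_RULES, key=lambda a: len(ANSWER_RULES[a] & feats))

print(answer("pitcher-photo", "What sport is this?"))  # baseball
```

Real VQA replaces every lookup here with a learned model, but the shape of the pipeline is the same.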
Congrats to Clément and the Hugging Face team on this milestone!
The company first built a mobile app that let you chat with an artificial BFF, a sort of chatbot for bored teenagers. More recently, the startup released an open-source library for natural language processing applications. And that library has been massively successful.
The library in question is Transformers, dubbed ‘state-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.’
If any of this rings a bell, it may be because Practical AI co-host Daniel Whitenack has been a huge supporter of Hugging Face for a long time and mentions them often on the show. We even had Clément on the show back in March of this year.
The style-based GAN architecture produces impressive image generation results, but it’s not without limitations. NVIDIA’s research team has been hard at work fixing some of StyleGAN’s problems (characteristic artifacts among them).
In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably detect if an image is generated by a particular network.
I love everything about this: the creativity, the engineering, the relentless desire to be as lazy as humanly possible. Chris automated 100% of this process, from content creation to social interactions to the sales pitch. A must-read.
Imagine an infinitely generated world that you could explore endlessly, continually finding entirely new content and adventures. What if you could also choose any action you can think of instead of being limited by the imagination of the developers who created the game?
WIRED’s business unit interviewed Jerome Pesenti, VP of artificial intelligence at Facebook. The major takeaway:
[he] is encouraged by progress in artificial intelligence, but sees the limits of the current approach to deep learning.
Could this be the beginning of the end for this particular AI hype cycle?
This booklet covers four main steps of designing a machine learning system:
- Project setup
- Data pipeline
- Modeling: selecting, training, and debugging
- Serving: testing, deploying, and maintaining
It comes with links to practical resources that explain each aspect in more detail. It also points to case studies written by machine learning engineers at major tech companies who have deployed machine learning systems to solve real-world problems.