Machine Learning

Machine Learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
283 Stories

Apple github.com

Transformer architecture optimized for Apple Silicon

Use ane_transformers as a reference PyTorch implementation if you are considering deploying your Transformer models on Apple devices with an A14 or newer and M1 or newer chip to achieve up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations.

We were just discussing Apple’s next AI move on yesterday’s JS Party live (ships to the feed next Friday). They’ve been the quietest tech giant since the GenAI movement kicked into high gear. My guess: they’ll have a LOT to say at this June’s WWDC…

AI (Artificial Intelligence) github.com

Dreamfusion! Text-to-3D model powered by Stable Diffusion

This working implementation of text-to-3D (powered by Stable Diffusion) didn’t take six months, like Simon predicted it would. I will concede that it’s not a 3D environment that can then go into a game engine, but I’m sure that’s just a few more weeks away at this point.

From the readme:

This project is a work-in-progress, and contains lots of differences from the paper. Also, many features are still not implemented now. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!

OpenAI

OpenAI introduces Whisper (open source speech recognition)

They’re really putting the Open in OpenAI with this one…

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.

We might need to give this a spin on our transcripts. Who knows, maybe our next big innovation could be The Changelog in German, French, Spanish, etc!

Terminal github.com

Hacking GitHub Copilot into the terminal

So you got tired of AI just suggesting code edits, and now you want it to help you run code, too. Silly human, you have come to the right place. This will take five steps.

This gets an A+ for creativity. Fire up your shell, then launch Neovim. Then shell out with :VimShell to get back to where you started, but with Copilot suggestions.

My guess is the ergonomics of this are… bad. But a cool hack, regardless!

AI (Artificial Intelligence) matthewbilyeu.com

Responding to recruiter emails with GPT-3

Like many software engineers, Matt Bilyeu receives multiple emails from recruiters weekly. And, because he’s polite (and for other reasons) he tries to respond (politely) to all of them. But…

It would be ideal if I could automate sending these responses. Assuming I get four such emails per week and that it takes two minutes to read and respond to each one, automating this would save me about seven hours of administrative work per year.
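
That estimate is easy to sanity-check. Here’s a minimal sketch of the arithmetic in Python. The email count and minutes per email come straight from the quote; the 52 working weeks per year and the function name are our own assumptions, not anything from Matt’s post.

```python
# Back-of-the-envelope check of the time-savings estimate quoted above.
# Inputs are taken from the quote; WEEKS_PER_YEAR = 52 is our assumption.
EMAILS_PER_WEEK = 4
MINUTES_PER_EMAIL = 2
WEEKS_PER_YEAR = 52


def hours_saved_per_year(emails_per_week, minutes_per_email,
                         weeks_per_year=WEEKS_PER_YEAR):
    """Return the yearly time spent on recruiter emails, in hours."""
    total_minutes = emails_per_week * minutes_per_email * weeks_per_year
    return total_minutes / 60


hours = hours_saved_per_year(EMAILS_PER_WEEK, MINUTES_PER_EMAIL)
# 4 * 2 * 52 = 416 minutes, which is a little under 7 hours.
print(f"{hours:.1f} hours per year")
```

So “about seven hours” checks out under those assumptions.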

Enter the GPT-3 API and some code destined for a cron job (now that he’s tested it on a handful of emails). Matt auto-responds to all the emails and continues to be polite, while also saving (his) time. It’s AI Matt responding the way real Matt would.

AI (Artificial Intelligence) simonwillison.net

Stable Diffusion is a really big deal

Simon Willison explains what it is:

Stable Diffusion is a new “text-to-image diffusion model” that was released to the public by Stability.ai six days ago, on August 22nd.

It’s similar to models like Open AI’s DALL-E, but with one crucial difference: they released the whole thing.

And why it’s a really big deal:

In just a few days, there has been an explosion of innovation around it. The things people are building are absolutely astonishing.

He then details some of the innovation and it is staggering, to say the least. Open FTW!

Chip Huyen huyenchip.com

Introduction to streaming for data scientists

Chip Huyen:

As machine learning moves towards real-time, streaming technology is becoming increasingly important for data scientists. Like many people coming from a machine learning background, I used to dread streaming. In our recent survey, almost half of the data scientists we asked said they would like to move from batch prediction to online prediction but can’t because streaming is hard, both technically and operationally…

Over the last year, working with a co-founder who’s super deep into streaming, I’ve learned that streaming can be quite intuitive. This post is an attempt to rephrase what I’ve learned.

Machine Learning wasp-lang.dev

ML code generation vs coding by hand

Matija Sosic (Co-founder & CEO at Wasp) shares what he thinks programming is going to look like in the near future.

When thinking about how ML code generation affects the overall development process, there is one thing to consider that often doesn’t immediately spring to mind when looking at the impressive Copilot examples. The question is - what happens with the code once it is generated? Who is responsible for it and who will maintain and refactor it in the future?

Although ML code generation helps with getting the initial code written, it cannot do much beyond that - if that code is to be maintained and changed in the future … the developer still needs to fully own and understand it.

Generated code accepted blindly is creating tech debt!

In other words, it means Copilot and similar solutions do not reduce the code complexity nor the amount of knowledge required to build features, they just help write the initial code faster, and bring the knowledge/examples closer to the code (which is really helpful). If a developer accepts the generated code blindly, they are just creating tech debt and pushing it forward.

Machine Learning github.com

A collection of resources to learn about MLOps

While still in its infancy, MLOps has attracted machine learning engineers and software engineers in general. With every new paradigm comes new challenges and opportunities to learn. In this primer, we highlight a few available resources to upskill and inform yourself on the latest in the world of MLOps.

Good resources, regardless of whether you think MLOps is its own thing or should be rolled into DevOps.

AI (Artificial Intelligence) github.com

A human-in-the-loop workflow for creating HD images from text

DALL-E can generate some amazing results, but we’re still in a phase of AI’s progress where having humans involved in the process is just better. Here’s how the authors of this workflow explain it:

Generative art is a creative process. While recent advances of DALL·E unleash people’s creativity, having a single-prompt-single-output UX/UI locks the imagination to a single possibility, which is bad no matter how fine this single result is. DALL·E Flow is an alternative to the one-liner, by formalizing the generative art as an iterative procedure.

Security github.com

The Deepfake Offensive Toolkit

dot (aka Deepfake Offensive Toolkit) makes real-time, controllable deepfakes ready for virtual cameras injection. dot is created for performing penetration testing against e.g. identity verification and video conferencing systems, for the use by security analysts, Red Team members, and biometrics researchers.

What’s crazy is dot deepfakes don’t require any additional training. 🤯

Python github.com

Imagen (Google's text-to-image neural net) implemented in PyTorch

Last week I logged the very impressive Imagen project, which smarter people than me have said is the SOTA for text-to-image synthesis. Now a WIP implementation is just a pip install imagen-pytorch away.

Architecturally, it is actually much simpler than DALL-E2. It consists of a cascading DDPM conditioned on text embeddings from a large pretrained T5 model (attention network). It also contains dynamic clipping for improved classifier free guidance, noise level conditioning, and a memory efficient unet design.

Google

A text-to-image diffusion model with an unprecedented degree of photorealism

Google researchers are giving DALL-E a run for its money:

Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model.

Clément Delangue huggingface.co

Hugging Face raised $100 million for open/collaborative machine learning

Big news from our friends at Hugging Face:

Hugging Face is now the fastest growing community & most used platform for machine learning! With 100,000 pre-trained models & 10,000 datasets hosted on the platform for NLP, computer vision, speech, time-series, biology, reinforcement learning, chemistry and more, the Hugging Face Hub has become the Home of Machine Learning to create, collaborate, and deploy state-of-the-art models.

What will they spend the money on? Good stuff:

Thanks to the new funding, we’ll be doubling down on research, open-source, products and responsible democratization of AI.

Career evjang.com

The machine learning job market in 2022

Eric Jang was recently on the job market (finally landing at [Halodi Robotics](https://halodi.com/)) and in this post he shares his process and view of the job market today. He also has some insights on where it’s headed. In brief:

In the future, every successful tech company will use their data moats to build some variant of an Artificial General Intelligence.
