AI (Artificial Intelligence)

Machines simulating human characteristics and intelligence.
330 Stories


ImaginAIry imagines & edits images from text inputs

This is a Pythonic wrapper around Stable Diffusion with image editing by InstructPix2Pix. The four images featured below (top) are generated by the following command:

>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"

Then they are edited (bottom) with the following commands:

>> aimg edit scenic_landscape.jpg "make it winter" --prompt-strength 20
>> aimg edit dog.jpg "make the dog red" --prompt-strength 5
>> aimg edit bowl_of_fruit.jpg "replace the fruit with strawberries"
>> aimg edit freckled_woman.jpg "make her a cyborg" --prompt-strength 13
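
For scripting the same edits, here is a minimal Python sketch that only builds the argument lists for the `aimg edit` commands shown above. The helper name `build_edit_command` and the `edits` list are illustrative assumptions, not part of ImaginAIry; the sketch prints the commands and does not run them.

```python
import shlex


def build_edit_command(image_path, instruction, prompt_strength=None):
    """Return the argv list for one `aimg edit` call (hypothetical helper)."""
    argv = ["aimg", "edit", image_path, instruction]
    # --prompt-strength is optional, as in the fruit-bowl example above
    if prompt_strength is not None:
        argv += ["--prompt-strength", str(prompt_strength)]
    return argv


# (image, instruction, prompt strength) taken from the commands above
edits = [
    ("scenic_landscape.jpg", "make it winter", 20),
    ("dog.jpg", "make the dog red", 5),
    ("bowl_of_fruit.jpg", "replace the fruit with strawberries", None),
    ("freckled_woman.jpg", "make her a cyborg", 13),
]

commands = [build_edit_command(*edit) for edit in edits]

for argv in commands:
    # shlex.join quotes the instruction so the line is safe to paste into a shell
    print(shlex.join(argv))
```

If ImaginAIry is installed, each list could be passed to `subprocess.run(argv, check=True)` instead of being printed.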


A library for building apps with LLMs through composability

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

This library is aimed at assisting in the development of those types of applications.

LangChain is designed to help with prompts, chains (sequences of calls), data augmented generation, agents, memory & evaluation tasks.
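
To make "chains (sequences of calls)" concrete, here is a minimal plain-Python sketch of the idea. It deliberately does not use LangChain's API: `make_chain`, `build_prompt`, `fake_llm`, and `tidy` are hypothetical names, and `fake_llm` is a stand-in for a real model call.

```python
from typing import Callable, List

Step = Callable[[str], str]


def make_chain(steps: List[Step]) -> Step:
    """Compose steps so each one receives the previous step's output."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run


def build_prompt(question: str) -> str:
    # Prompt templating step
    return f"Answer briefly: {question}"


def fake_llm(prompt: str) -> str:
    # Assumption: placeholder for a call to a real language model
    return f"[model reply to: {prompt}]"


def tidy(reply: str) -> str:
    # Post-processing step: drop the surrounding brackets
    return reply.strip("[]")


chain = make_chain([build_prompt, fake_llm, tidy])
result = chain("What is RAID 0?")
print(result)
```

The point of the composition is that any step can be swapped out (a different prompt, a search lookup, a real model) without touching the others.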


Microsoft wants to acquire a 49% stake in ChatGPT

This escalated quickly. I don’t know about you, but I’m a daily user of ChatGPT. Just yesterday, I asked “What options does Linux offer for fast RAID 0 software RAID?” and I had an entire conversation that settled on Btrfs as a good option and I learned how to create and configure the array, mount it, and most importantly scrub it for errors. I’ll still use ZFS, of course. But, I’ve never had that experience using Google (nor can you).

…according to a report by Semafor, Microsoft Corp is discussing the possibility of acquiring OpenAI, the parent company of ChatGPT. The tech-industry giant is ready to pay upwards of $10 billion for the acquisition.

Clearly, Microsoft sees the bigger picture here for Bing, Microsoft 365, GitHub Copilot, and more. This also speaks to the conversation we had with Swyx about AI’s future being tied to capitalism and eventually being controlled by the FAANGs.

Ars Technica

Stability AI plans to let artists opt out of Stable Diffusion 3 image training

On Wednesday, Stability AI announced it would allow artists to remove their work from the training dataset for an upcoming Stable Diffusion 3.0 release. The move comes as an artist advocacy group called Spawning tweeted that Stability AI would honor opt-out requests collected on its Have I Been Trained website. The details of how the plan will be implemented remain incomplete and unclear, however.

This seems like a step in the right direction, but it appears that artists will have to proactively register and manually flag matched images in the database. Ain’t nobody got time for that!


Historical analogies for large language models

How will large language models (LLMs) change the world?

No one knows. With such uncertainty, a good exercise is to look for historical analogies—to think about other technologies and ask what would happen if LLMs played out the same way.

I like to keep things concrete, so I’ll discuss the impact of LLMs on writing. But most of this would also apply to the impact of LLMs on other fields, as well as other AI technologies like AI art/music/video/code.

What follows are 13 examples of technological innovations that changed the world and descriptions of how they affected the way people work. Here’s an example analogy of Feet and Segways:

First, there was walking. Then the Segway came to CHANGE THE NATURE OF HUMAN TRANSPORT. Twenty years later, there is still walking, plus occasionally low-key alternatives like electric scooters.

In this analogy, LLMs work fine but just aren’t worth the trouble in most cases and society doesn’t evolve to integrate them. Domain-specific LLMs are used for some applications, but we start to associate “general” LLMs with tourists and mall cops. George W. Bush falls off an LLM on vacation and everyone loses their minds.


OpenAI's Whisper model ported to C/C++

OpenAI recently released a model for automatic speech recognition called Whisper. I decided to reimplement the inference of the model from scratch using C/C++. To achieve this I implemented a minimalistic tensor library in C and ported the high-level architecture of the model in C++. The entire code is less than 8000 lines of code and is contained in just 2 source files without any third-party dependencies.

State of the art voice recognition without any PyTorch baggage and it’s optimized to run on Apple Silicon!


Learning Rust with ChatGPT, Copilot and Advent of Code

Simon Willison is using this year’s Advent of Code as an opportunity to learn Rust.

He’s using Copilot to help him with syntax/snippets via comment-driven prompting. He’s using ChatGPT as a study partner by asking it questions about how to do things in Rust. Is it working?

So far I think this is working really well.

I feel like I’m beginning to get a good mental model of how Rust works, and a lot of the basic syntax is beginning to embed itself into my muscle memory.

The real test is going to be if I can first make it to day 25 (with no prior Advent of Code experience I don’t know how much the increasing difficulty level will interfere with my learning) and then if I can actually write a useful Rust program after that without any assistance from these AI models.

And honestly, the other big benefit here is that this is simply a lot of fun. I’m finding interacting with AIs in this way—as an actual exercise, not just to try them out—is deeply satisfying and intellectually stimulating.

This might be an early glimpse into the future of AI-assisted learning…


GitHub Copilot isn't worth the risk

Elaine Atwell says all CTOs urgently need to answer the question: should I allow Copilot at my company?

If you haven’t already figured it out from the title, Elaine’s answer to that question is No. But that might not be the right answer for everyone. In this article, she goes over the case for and against Copilot, and how you can detect whether it’s already in use at your organization.

Vladimir Prelovac

The age of PageRank is over

Google search quality has been deteriorating for a while. In this manifesto (of sorts), Kagi CEO Vladimir Prelovac describes what he thinks needs to replace it:

In the future, instead of everyone sharing the same search engine, you’ll have your completely individual, personalized Mike or Julia or Jarvis - the AI. Instead of being scared to share information with it, you will volunteer your data, knowing its incentives align with yours. The more you tell your assistant, the better it can help you, so when you ask it to recommend a good restaurant nearby, it’ll provide options based on what you like to eat and how far you want to drive. Ask it for a good coffee maker, and it’ll recommend choices within your budget from your favorite brands with only your best interests in mind. The search will be personal and contextual and excitingly so!

Matthew Butterick

We've filed a lawsuit challenging GitHub Copilot

A couple weeks back, Adam logged some news that linked to Matthew Butterick’s GitHub Copilot investigation. Well, there’s a new website now:

Matthew Butterick:

By training their AI systems on public GitHub repositories (though based on their public statements, possibly much more) we contend that the defendants have violated the legal rights of a vast number of creators who posted code or other work under certain open-source licenses on GitHub. Which licenses? A set of 11 popular open-source licenses that all require attribution of the author’s name and copyright, including the MIT license, the GPL, and the Apache license.

Matthew Butterick

GitHub Copilot Investigation

Is GitHub Copilot an AI parasite trained in the realms of fair use on public code anywhere on the internet? Or, is it a much needed automation layer to all the reasons we open source in the first place?

When I first wrote about Copilot, I said “I’m not worried about its effects on open source.” In the short term, I’m still not worried. But as I reflected on my own journey through open source (nearly 25 years), I realized that I was missing the bigger picture. After all, open source isn’t a fixed group of people. It’s an ever-growing, ever-changing collective intelligence, continually being renewed by fresh minds. We set new standards and challenges for each other, and thereby raise our expectations for what we can accomplish.

Amidst this grand alchemy, Copilot interlopes. Its goal is to arrogate the energy of open-source to itself. We needn’t delve into Microsoft’s very checkered history with open source to see Copilot for what it is: a parasite.

The legality of Copilot must be tested before the damage to open source becomes irreparable. That’s why I’m suiting up.

What are your thoughts on this investigation and “potential lawsuit” against GitHub Copilot?


Dreamfusion! Text-to-3D model powered by Stable Diffusion

This working implementation of text-to-3D (powered by Stable Diffusion) didn’t take six months, like Simon predicted it would. I will concede that it’s not a 3D environment that can then go into a game engine, but I’m sure that’s just a few more weeks away at this point.

From the readme:

This project is a work-in-progress, and contains lots of differences from the paper. Also, many features are still not implemented now. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!

OpenAI

OpenAI introduces Whisper (open source speech recognition)

They’re really putting the Open in OpenAI with this one…

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.

We might need to give this a spin on our transcripts. Who knows, maybe our next big innovation could be The Changelog in German, French, Spanish, etc!


Responding to recruiter emails with GPT-3

Like many software engineers, Matt Bilyeu receives multiple emails from recruiters weekly. And, because he’s polite (and for other reasons) he tries to respond (politely) to all of them. But…

It would be ideal if I could automate sending these responses. Assuming I get four such emails per week and that it takes two minutes to read and respond to each one, automating this would save me about seven hours of administrative work per year.
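
As a quick check of that estimate, here is the arithmetic as a short sketch. The figure of 52 weeks per year is an assumption; the other two numbers come from the quote.

```python
EMAILS_PER_WEEK = 4      # from the quote
MINUTES_PER_EMAIL = 2    # from the quote
WEEKS_PER_YEAR = 52      # assumption: no weeks off

minutes_per_year = EMAILS_PER_WEEK * MINUTES_PER_EMAIL * WEEKS_PER_YEAR
hours_per_year = minutes_per_year / 60

print(f"{minutes_per_year} minutes, about {hours_per_year:.1f} hours per year")
```

That works out to 416 minutes, or a little under seven hours, which matches the "about seven hours" in the quote.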

Enter the GPT-3 API and some code that will eventually be run by a cron job (now that he’s tested this on a handful of emails). Matt auto-responds to all the emails and continues to be polite, while also saving (his) time. It’s AI Matt responding the way real Matt would.


Stable Diffusion is a really big deal

Simon Willison explains what it is:

Stable Diffusion is a new “text-to-image diffusion model” that was released to the public by Stability.ai six days ago, on August 22nd.

It’s similar to models like Open AI’s DALL-E, but with one crucial difference: they released the whole thing.

And why it’s a really big deal:

In just a few days, there has been an explosion of innovation around it. The things people are building are absolutely astonishing.

He then details some of the innovation and it is staggering, to say the least. Open FTW!


The AI art apocalypse

Alexander Wales:

This image was created by an AI, MidJourney. All I had to do was type in a prompt (“wildfire”) and aspect ratio. This AI is pretty good, but nowhere near the state of the art, and AI like it are, over the next few years, going to make art like this available within seconds at a cost of pennies. This applies not just to “art” like the above, which is going to accompany my prose and worldbuilding projects, but to almost every area of life where you see pictures of any kind. I think it’s hard to understate how big of a deal this will end up being, and this blog post is largely my attempt to collate a lot of the arguments under one roof, in part because some of the arguments aren’t actually arguments at all.

Microsoft News

Microsoft's new AI for Beginners course

A 12-week, 24-lesson curriculum covering:

  • Different approaches to Artificial Intelligence, including the “good old” symbolic approach with Knowledge Representation and reasoning (GOFAI).
  • Neural Networks and Deep Learning, which are at the core of modern AI. We will illustrate the concepts behind these important topics using code in two of the most popular frameworks - TensorFlow and PyTorch.
  • Neural Architectures for working with images and text. We will cover recent models but may lack a little bit on the state-of-the-art.
  • Less popular AI approaches, such as Genetic Algorithms and Multi-Agent Systems.


Kern AI's refinery is a data-centric IDE for NLP

Like the data-centric sibling of your favorite programming environment. It provides an easy-to-use interface for weak supervision as well as extensive data management, neural search and monitoring to ensure that the quality of your training data is as good as possible.

This won’t rid you of the need to manually label, but it’ll save you time in the process!



A human-in-the-loop workflow for creating HD images from text

DALL-E can generate some amazing results, but we’re still in a phase of AI’s progress where having humans involved in the process is just better. Here’s how the authors of this workflow explain it:

Generative art is a creative process. While recent advances of DALL·E unleash people’s creativity, having a single-prompt-single-output UX/UI locks the imagination to a single possibility, which is bad no matter how fine this single result is. DALL·E Flow is an alternative to the one-liner, by formalizing the generative art as an iterative procedure.



The Deepfake Offensive Toolkit

dot (aka Deepfake Offensive Toolkit) makes real-time, controllable deepfakes ready for virtual cameras injection. dot is created for performing penetration testing against e.g. identity verification and video conferencing systems, for the use by security analysts, Red Team members, and biometrics researchers.

What’s crazy is dot deepfakes don’t require any additional training. 🤯
