AI (Artificial Intelligence)

Machines simulating human characteristics and intelligence.
330 Stories

Matthew Butterick githubcopilotlitigation.com

We've filed a lawsuit challenging GitHub Copilot

A couple of weeks back, Adam logged some news that linked to githubcopilotinvestigation.com. Well, there’s a new website now: githubcopilotlitigation.com

Matthew Butterick:

By training their AI systems on public GitHub repositories (though based on their public statements, possibly much more) we contend that the defendants have violated the legal rights of a vast number of creators who posted code or other work under certain open-source licenses on GitHub. Which licenses? A set of 11 popular open-source licenses that all require attribution of the author’s name and copyright, including the MIT license, the GPL, and the Apache license.

Matthew Butterick githubcopilotinvestigation.com

GitHub Copilot Investigation

Is GitHub Copilot an AI parasite trained in the realms of fair use on public code anywhere on the internet? Or, is it a much needed automation layer to all the reasons we open source in the first place?

When I first wrote about Copilot, I said “I’m not worried about its effects on open source.” In the short term, I’m still not worried. But as I reflected on my own journey through open source—nearly 25 years—I realized that I was missing the bigger picture. After all, open source isn’t a fixed group of people. It’s an ever-growing, ever-changing collective intelligence, continually being renewed by fresh minds. We set new standards and challenges for each other, and thereby raise our expectations for what we can accomplish.

Amidst this grand alchemy, Copilot interlopes. Its goal is to arrogate the energy of open-source to itself. We needn’t delve into Microsoft’s very checkered history with open source to see Copilot for what it is: a parasite.

The legality of Copilot must be tested before the damage to open source becomes irreparable. That’s why I’m suiting up.

What are your thoughts on this investigation and “potential lawsuit” against GitHub Copilot?

Practical AI Practical AI #196

What's up, DocQuery?

Chris sits down with Ankur Goyal to talk about DocQuery, Impira’s new open source ML model. DocQuery lets you ask questions about semi-structured data (like invoices) and unstructured documents (like contracts) using Large Language Models (LLMs). Ankur illustrates many of the ways DocQuery can help people tame documents, and references Chris’s real life tasks as a non-profit director to demonstrate that DocQuery is indeed practical AI.

AI (Artificial Intelligence) github.com

Dreamfusion! Text-to-3D model powered by Stable Diffusion

This working implementation of text-to-3D (powered by Stable Diffusion) didn’t take six months, like Simon predicted it would. I will concede that it’s not a 3D environment that can then go into a game engine, but I’m sure that’s just a few more weeks away at this point.

From the readme:

This project is a work-in-progress, and contains lots of differences from the paper. Also, many features are still not implemented now. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!

Practical AI Practical AI #195

Production data labeling workflows

It’s one thing to gather some labels for your data. It’s another thing to integrate data labeling into your workflows and infrastructure in a scalable, secure, and useful way. Mark from Xelex joins us to talk through some of what he has learned after helping companies scale their data annotation efforts. We get into workflow management, labeling instructions, team dynamics, and quality assessment. This is a super practical episode!

OpenAI

OpenAI introduces Whisper (open source speech recognition)

They’re really putting the Open in OpenAI with this one…

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.

We might need to give this a spin on our transcripts. Who knows, maybe our next big innovation could be The Changelog in German, French, Spanish, etc!

Practical AI Practical AI #194

Evaluating models without test data

WeightWatcher, created by Charles Martin, is an open source diagnostic tool for analyzing Neural Networks without training or even test data! Charles joins us in this episode to discuss the tool and how it fills certain gaps in current model evaluation workflows. Along the way, we discuss statistical methods from physics and a variety of practical ways to modify your training runs.

The Changelog The Changelog #506

Stable Diffusion breaks the internet

This week on The Changelog we’re talking about Stable Diffusion, DALL-E, and the impact of AI generated art. We invited our good friend Simon Willison on the show today because he wrote a very thorough blog post titled, “Stable Diffusion is a really big deal.”

You may know Simon from his extensive contributions to open source software. Simon is a co-creator of the Django Web framework (which we don’t talk about at all on this show), he’s the creator of Datasette, a multi-tool for exploring and publishing data (which we do talk about on this show)…most of all Simon is a very insightful thinker, which he puts on display here on this episode. We talk from all the angles of this topic, the technical, the innovation, the future and possibilities, the ethical and the moral – we get into it all. The question is, will this era be known as the initial push back to the machine?

Practical AI Practical AI #193

Stable Diffusion

The new Stable Diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on Twitter, Reddit, Discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things Stable Diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2).

Practical AI Practical AI #192

Licensing & automating creativity

AI is increasingly being applied in creative and artistic ways, especially with recent tools integrating models like Stable Diffusion. This is making some artists mad. How should we be thinking about these trends more generally, and how can we as practitioners release and license models anticipating human impacts? We explore this along with other topics (like AI models detecting swimming pools 😊) in this fully connected episode.

AI (Artificial Intelligence) matthewbilyeu.com

Responding to recruiter emails with GPT-3

Like many software engineers, Matt Bilyeu receives multiple emails from recruiters weekly. And, because he’s polite (and for other reasons) he tries to respond (politely) to all of them. But…

It would be ideal if I could automate sending these responses. Assuming I get four such emails per week and that it takes two minutes to read and respond to each one, automating this would save me about seven hours of administrative work per year.

Enter the GPT-3 API and some code that gets run by a future cron job (now that he’s tested this on a handful of emails) and Matt auto-responds to all the emails, continuing to be polite while also saving (his) time. It’s AI Matt responding the way real Matt would.
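
For the curious, Matt's seven-hour estimate checks out. Here's a minimal sketch of the arithmetic, assuming a 52-week year (that assumption and the function name are ours, not from Matt's post):

```python
def hours_saved_per_year(emails_per_week: int,
                         minutes_per_email: float,
                         weeks_per_year: int = 52) -> float:
    """Estimate yearly hours spent replying to recruiter emails.

    weeks_per_year defaults to 52, which is our assumption; the
    original post only gives the per-week and per-email figures.
    """
    total_minutes = emails_per_week * minutes_per_email * weeks_per_year
    return total_minutes / 60


# Figures from the quote: four emails per week, two minutes each.
estimate = hours_saved_per_year(emails_per_week=4, minutes_per_email=2)
print(f"{estimate:.1f} hours per year")  # prints "6.9 hours per year"
```

That is 416 minutes a year, which rounds to the "about seven hours" Matt quotes.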

AI (Artificial Intelligence) simonwillison.net

Stable Diffusion is a really big deal

Simon Willison explains what it is:

Stable Diffusion is a new “text-to-image diffusion model” that was released to the public by Stability.ai six days ago, on August 22nd.

It’s similar to models like Open AI’s DALL-E, but with one crucial difference: they released the whole thing.

And why it’s a really big deal:

In just a few days, there has been an explosion of innovation around it. The things people are building are absolutely astonishing.

He then details some of the innovation and it is staggering, to say the least. Open FTW!

Practical AI Practical AI #191

Privacy in the age of AI

In this Fully-Connected episode, Daniel and Chris discuss concerns of privacy in the face of ever-improving AI / ML technologies. Evaluating AI’s impact on privacy from various angles, they note that ethical AI practitioners and data scientists have an enormous burden, given that much of the general population may not understand the implications of the data privacy decisions of everyday life.

This intentionally thought-provoking conversation advocates consideration and action from each listener when it comes to evaluating how their own activities either protect or violate the privacy of those whom they impact.

Practical AI Practical AI #190

Practical, positive uses for deep fakes

Differentiating between what is real versus what is fake on the internet can be challenging. Historically, AI deepfakes have only added to the confusion and chaos, but when labeled and intended for good, deepfakes can be extremely helpful. But with all of the misinformation surrounding deepfakes, it can be hard to see the benefits they bring. Lior Hakim, CTO at Hour One, joins Chris and Daniel to shed some light on the practical uses of deepfakes. He addresses the AI technology behind deepfakes, how to make positive use of deep fakes such as breaking down communications barriers, and shares how Hour One specializes in the development of virtual humans for use in professional video communications.

AI (Artificial Intelligence) alexanderwales.com

The AI art apocalypse

Alexander Wales:

This image was created by an AI, MidJourney. All I had to do was type in a prompt (“wildfire”) and aspect ratio. This AI is pretty good, but nowhere near the state of the art, and AI like it are, over the next few years, going to make art like this available within seconds at a cost of pennies. This applies not just to “art” like the above, which is going to accompany my prose and worldbuilding projects, but to almost every area of life where you see pictures of any kind. I think it’s hard to understate how big of a deal this will end up being, and this blog post is largely my attempt to collate a lot of the arguments under one roof, in part because some of the arguments aren’t actually arguments at all.

Microsoft News

Microsoft's new AI for Beginners course

A 12-week, 24-lesson curriculum covering:

  • Different approaches to Artificial Intelligence, including the “good old” symbolic approach with Knowledge Representation and reasoning (GOFAI).
  • Neural Networks and Deep Learning, which are at the core of modern AI. We will illustrate the concepts behind these important topics using code in two of the most popular frameworks - TensorFlow and PyTorch.
  • Neural Architectures for working with images and text. We will cover recent models but may lack a little bit on the state-of-the-art.
  • Less popular AI approaches, such as Genetic Algorithms and Multi-Agent Systems.

Practical AI Practical AI #188

AlphaFold is revolutionizing biology

AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

Practical AI Practical AI #187

AI IRL & Mozilla's Internet Health Report

Every year Mozilla releases an Internet Health Report that combines research and stories exploring what it means for the internet to be healthy. This year’s report is focused on AI. In this episode, Solana and Bridget from Mozilla join us to discuss the power dynamics of AI and the current state of AI worldwide. They highlight concerning trends in the application of this transformational technology along with positive signs of change.

Practical AI Practical AI #186

The geopolitics of artificial intelligence

In this Fully-Connected episode, Chris and Daniel explore the geopolitics, economics, and power-brokering of artificial intelligence. What does control of AI mean for nations, corporations, and universities? What does control or access to AI mean for conflict and autonomy? The world is changing rapidly, and the rate of change is accelerating. Daniel and Chris look behind the curtain in the halls of power.

AI (Artificial Intelligence) github.com

Kern AI's refinery is a data-centric IDE for NLP

Like the data-centric sibling of your favorite programming environment. It provides an easy-to-use interface for weak supervision as well as extensive data management, neural search and monitoring to ensure that the quality of your training data is as good as possible.

This won’t rid you of the need to manually label, but it’ll save you time in the process!


Practical AI Practical AI #185

DALL-E is one giant leap for raccoons! 🔭

In this Fully-Connected episode, Daniel and Chris explore DALL-E 2, the amazing new model from OpenAI that generates incredibly detailed novel images from text captions for a wide range of concepts expressible in natural language. Along the way, they acknowledge that some folks in the larger AI community are suggesting that sophisticated models may be approaching sentience, but together they pour cold water on that notion. Still, they can’t seem to get away from DALL-E’s images of raccoons in space, and of course, who would want to?
