Luis Villa: So if you’re trying to reimplement some competitor’s API, I probably wouldn’t use Copilot, because then the output is gonna look like you took the heart of this other person’s thing. It’s probably gonna start auto-suggesting code that looks a lot like their implementation if it’s an open source implementation. So if there’s like a GPL implementation of something and you wanna write an MIT implementation of it, I suspect Copilot – I haven’t seen anybody try this yet, but I suspect Copilot is gonna start doing things that look a lot like the original implementation… And then you’re gonna have a problem.
But because one of the tests for fair use is “How much of it did you take?”, if you end up with like a 5-line fragment out of somebody’s GPL code that’s like 100,000 lines – or, I mean, what’s the Linux kernel these days? 6-7 million lines of code? A court is just gonna laugh that out of court if somebody comes after you on a GPL claim, for you as a user of Copilot. Again, that’s different from GitHub, because GitHub did presumably copy the entirety of the Linux kernel.
So we’ve got the “What was the nature of the taking? How much did you take?” Another thing that courts are gonna look at is the commercial impact of the copying… Again, GitHub, since they’re copying the whole thing and they’re a big corporate competitor, possibly some big…
For you as a user of Copilot - that company was not trying to sell you five lines of code… They weren’t trying to license five lines of code to you, and you weren’t looking to buy five lines of code. You were just gonna write it yourself anyway.
[43:58] So a court, again, is gonna look somewhat skeptically at – and this is something we know from the Google Book Search case, that a court is gonna say “Well, you all weren’t selling snippet search of your book.” In fact, if anything, this is a key difference from Google Book Search to Copilot… The court found in that case that actually this is gonna help you sell more books, because people are gonna find books, there’s a limitation on how much gets shown, and there’s a Buy button right there… There’s no equivalent to that in – maybe GitHub will do something like “By the way, it looks like you copied this from the Linux Kernel. Click here to do GitHub Sponsors.” That might be a little tacky, but you could see that as a thing they could do in the future, perhaps.
So that’s sort of the basic analysis of how much got taken, was it really important stuff that got taken, what was the commercial impact, is it something new and bold and different, that wasn’t gonna happen anyway? And I think looking at all those, I have a really hard time seeing that a court is gonna say that this was not a fair use… Because it’s so different, the impact is so small… The emotional impact is real, and I don’t wanna downplay that. As authors – but again, the whole point of fair use is sometimes authors are pissed, and we ignore that as a policy matter.
By the way, I should say again, this is all in the U.S. The E.U. has different sets of rules about this, and I really think one of the interesting things that is under-discussed that I would love to see more of is commentary from European Union lawyers, Japanese lawyers… Because I don’t think we have as good a sense yet of what that would look like in other places, other legal regimes.