Tech lawyer Luis Villa returns to Go Time to school us once again on the intellectual property concerns of software creators in this crazy day we live in. This time around, we're focusing on the implications of Large Language Models, code generation, and crazy stuff like that.
Luis Villa: Yeah, maybe it Vuitton. But that's a trademark. That is used to identify your product to the public. And so exactly that, Kris - I think it was 386s, or 486s, where a court didn't say that it was impossible to trademark a number, but like there's a higher - trademarks, the more creative they are, the less skeptical courts are, is the short and simple version of it. And a number - obviously not very creative. It's like literally a part number. Whereas Pentium - much more creative, and in that sort of weird sense of it's one word... This is why you get all these weird startups that don't call themselves a name that has anything to do with the thing they're doing. Right? Partially because it's more memorable, but partially because the less it sounds like the thing, the easier it is to trademark.
But that is a separate body of law. Trademark is the one body of intellectual property law that's very pro the humans at the end of the process. Because the whole thing is that trademarks are supposed to not confuse you. So in fact, it's companies that bear a lot of the burden there.
[22:02] Sort of the other way around with copyright, right? Where it's supposed to prohibit you from ripping the company off. But yeah, there are layers here that are - and again, mostly I haven't touched on patents. I assume that all these AI companies are patenting things out the wazoo. But patents apply to processes, usually, not to particular things.
So you could patent - I'm simplifying a little bit here; there's some exceptions, but... You can patent how you do training. So if you created a new way to train a model more efficiently, more effectively, then you can patent that. And if you figured out a new way to do outputs from a model, the process of creating those outputs you could patent. But actually patenting the model itself, with some sort of edge cases - probably not patentable... Which gets to one of the recurring themes that we're going to have here, and this gets back to, Kris, what you were saying about numbers... A little unclear that the model actually is protectable by anything in modern intellectual property law. Like, there's actually sort of an open question as to whether or not that thing is something that copyright law can fit into one of its boxes. Because that's the thing - intellectual property law generally has boxes of things. Patents are for processes that we invented. That's a little bit of a simplification... But for our purposes, that'll do. Copyright is creative works that you created. Trademark is brands that you're using to sell a thing. If it doesn't fit into one of those boxes - and this has actually been a problem with databases.
So like databases - I mentioned phonebooks. I think maybe I mentioned this in the last episode, too... Under US copyright law you can't copyright a phonebook, because the Supreme Court said, "There's no creativity there." Right? The only creativity was you wanted to find the phone number of every person in town. Now, if you said "The 100 most popular debutantes in town", which actually was a thing in New York in the late 1800s... That list, because it involves creativity and judgment, that list you could get a copyright on.
Now, if somebody else has a list of like "My 100", and 95 of them are the same, it can be hard to protect. But at least in theory you could protect that thing, right? And so databases - the European Union has a whole separate set of laws just for databases, that they call The Database Right. And so in theory, databases are - in practice it turns out to have been not all that useful, but they invented - 20 years ago, 25 years ago, they were "You know what - we need to encourage more databases, so we're going to create a Database Right." So there it is, an EU law; it turns out to be mostly unused, though we'll see if with machine learning maybe some people will say that the models are databases. I think that's gonna be a little bit of a hard trick to pull off, but I'd be shocked if somebody doesn't at least try to protect models under European database rights... Which, by the way, quick - I think I touched on this last time around, but copyright law, global platform. And this is one of these where the analogy to programming actually works really well. Essentially, every country on Earth has signed what's called the Berne Convention, which makes - the basic concepts of copyright are more or less the same globally. A lot of implementation details, as with anytime you're creating an instance, you're implementing an API, the implementation details matter... But at the high level, copyright is the same globally.
The US has no equivalent of the EU database law. The EU is regulating a lot right now on privacy, which bears on training, right? Like, what if you have private information in the model? US federal law says nothing about that. Copyright law says nothing about that. The European Union has very strong opinions on what happens if you accidentally encode private information, especially like medical information, in a model.
[25:59] So there's a whole other field of law - like, I think one of the cool things about machine learning law, but the very frustrating thing for programmers asking us about opinions... Because programmers just wanna know "Is this stuff legal? Can I use it?" And the answer is, "It depends", because, like copyright, patent, trademark, privacy law, database rights... All these things are like - you know, step one is, I don't know, are you in the EU? Or are you in Japan? Or are you in the US? Because the answer might be different in all those places.
So yeah, we talked a little bit about these sort of big buckets of things. One thing that I put in the show notes, and that we just talked a little bit about - like, what analogies do we use? So maybe it would be helpful - I don't know where y'all wanna go - to talk about some of the analogies the courts have used for this kind of stuff in the past.