Practical AI – Episode #299
Sidekick is an AI Shopify expert
with Mike Tamir & Matt Colyer from Shopify
Today, Chris explores Shopify Magic and other AI offerings with Mike Tamir, Distinguished ML Engineer and Head of Machine Learning, and Matt Colyer, Director of Product Management for Sidekick. They talk about how Shopify uses generative AI and LLMs to enhance their products, and they take a deeper dive into Sidekick, a first-of-its-kind, AI-enabled commerce assistant that understands a merchant’s business (products, orders, customers) and has been trained to know all about Shopify.
Sponsors
Fly.io – The home of Changelog.com — Deploy your apps close to your users — global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.
Timescale – Purpose-built performance for AI. Build RAG, search, and AI agents on the cloud and with PostgreSQL and purpose-built extensions for AI: pgvector, pgvectorscale, and pgai.
Eight Sleep – Up to $600 off Pod 4 Ultra. Go to eightsleep.com/changelog and use the code CHANGELOG. You can try it for free for 30 days - but we’re confident you will not want to return it (we love ours). Once you experience AI-optimized sleep, you’ll wonder how you ever slept without it. Currently shipping to: United States, Canada, United Kingdom, Europe, and Australia.
Chapters
Chapter Number | Chapter Start Time | Chapter Title | Chapter Duration |
1 | 00:00 | Welcome to Practical AI | 00:34 |
2 | 00:35 | Sponsor: Fly | 02:29 |
3 | 03:16 | What is Shopify? | 01:54 |
4 | 05:10 | What sets Shopify apart? | 00:59 |
5 | 06:09 | Wide array of customers | 01:52 |
6 | 08:01 | The AI turning point | 01:11 |
7 | 09:12 | Shopify's mission | 01:22 |
8 | 10:33 | Approaching different models | 01:10 |
9 | 11:43 | Strengths and weaknesses of different perspectives | 03:08 |
10 | 14:51 | Growing pains | 02:19 |
11 | 17:21 | Sponsor: Timescale | 02:21 |
12 | 19:58 | Find your way out of the forest | 03:31 |
13 | 23:29 | Other products | 03:48 |
14 | 27:17 | Plugging in the ecosystem | 03:14 |
15 | 30:45 | Sponsor: Eight Sleep | 02:34 |
16 | 33:29 | What is Magic? | 00:41 |
17 | 34:10 | Adding more AI capabilities? | 06:00 |
18 | 40:10 | Finding utility in other tools | 04:07 |
19 | 44:17 | Looking to the future | 06:07 |
20 | 50:23 | Thanks for joining us! | 00:27 |
21 | 50:50 | Outro | 00:46 |
Transcript
Welcome to another edition of the Practical AI Podcast. This is Chris Benson. I am going solo today; Daniel’s not able to join me… But we have two guests today, and I would like to introduce you to Mike Tamir, who is Distinguished Machine Learning Engineer and Head of Machine Learning at Shopify, as well as his colleague, Matt Colyer, who is the Director of Product Management for Sidekick. Gentlemen, welcome to the show.
Thanks, Chris.
Yeah, thanks for having us.
Glad to have you on board. I know you guys are doing a lot of cool stuff in the AI space, and so thank you both for joining to cover the different aspects of it. For those who may be joining, who probably have heard of Shopify, but may not be users, or may not be intimately familiar, can you guys talk a little bit – before we dive into all the AI goodness, can you guys talk a little bit about what Shopify is as a company, and kind of how you see the space that you’re in? What need are you fulfilling? Just some of the general understanding of your business before we get into the AI stuff.
Yeah, I can jump in and then maybe Mike can add in, sprinkle in some bits.
Sure.
Yeah, Shopify is one of those incredible companies that you might not be aware of, but you’ve probably used it, even if you aren’t specifically aware of it… So our mission is to create the retail operating system behind your favorite brands. So you can think about it - there are the chain restaurants you go to, all around your hometown… But then there’s your favorite coffee shop, that’s run by a local operator… And there’s just something better about that coffee. And so Shopify’s goal on the internet is to enable those kinds of operators to have a successful business out there. And so we power many brands online. Some of the ones that are out there that are more famous are like Drake, Patel, Gymshark, Heinz, just to name a few… So some of the world’s biggest brands are on there, but also some of those entrepreneurs that are in your local area are, too.
Can you describe what sets Shopify apart from other processing? I mean, it has a very distinctive brand I know, but can you share a little bit about what makes it distinct from other things like credit card processors, and cart processors? I know you guys kind of have your own distinctive way of doing things.
Yeah, I mean, it’s a full soup-to-nuts kind of solution. So we do e-commerce, so we can help you build your site, we can help you create merchandising around the products that you offer… We also offer payment solutions, so you don’t have to set up a separate credit card processor, but if you have one, you can bring it, too. I think that’s actually one of the most powerful things about the Shopify platform, is that it’s got a very extensive developer ecosystem, and so many of our merchants install apps from our partners to do specific things. So if you’ve got a specific shipping provider, you can use their app for it… If you’ve got a specific email provider, you can use their app for it… For many of these categories we also have solutions, but the ecosystem is definitely the richest part.
It sounds like you also do lots of different market segments, from some of the large brands that you just talked about, down to – when I’ve come across Shopify in the past, it’s been in the context of kind of smaller business, and e-commerce, and midsize business, and things like that. So it sounds like you guys hit quite an array of different customer segments with technology solutions.
Yeah. I mean, maybe, Mike, do you want to chime in? I feel like you work on more of them than I do these days.
If you think about all the things that Matt described - establishing a website, payment processing… That’s the infrastructure, something we can provide for a large established brand all the way down to a smaller brand. More and more, especially in recent years, what Shopify has been focusing on is not just providing you with the website, but also providing you with kind of the tools for growth. And this is where the AI focus that we’ve had, on moving to machine learning and AI, has really made itself apparent.
For instance, we have our Shop app, where a new merchant who has no track record of sales or history can join that app. And when a buyer searches, if the backend machine learning does the retrieval and the ranking well, we can surface a fresh new merchant to a customer right away.
Very cool.
To piggyback on what Mike was saying, I think it’s like a great example of how we worked in the AI component of it…
But one of the core values of Shopify is that we keep merchants on the cutting edge. So we view our mission as understanding technology at a really deep level, always being out there scouring for the best, and then figuring out how we can apply it to our merchants’ businesses. So like Mike was talking about, “How can we help our merchants grow?” And so how can we apply these machine learning models or these AI techniques and really help these people grow their business?
[00:08:00.18] With you guys having that technology infrastructure that you’ve been supporting all these businesses with and everything, at what point did you start thinking about the fact that there were these AI technologies that have been on the rise in recent years? What was the turning point for the company where you started seriously looking at AI as a supporting factor in the business model? What made you say “I see an opportunity to go help our customer get done the things that we’re trying to do, that we’ve been doing for years”? What was that turning point? How did that come about, and how did you start thinking about AI in the business?
I mean, knowing the timeline, I think that the turning point was – we might be the chicken and not the egg, so to speak, in that Shopify maybe historically was not as invested in machine learning and AI… But by 2022 it had become interested in that, and has certainly massively redirected our forces, and the work that Matt and I joined and have focused on over the last several years.
And can you talk a little bit about that vision as you talk about you guys kind of coming into the company at that point, and carrying that forward? How do you think about the mission, if you will, that you’re doing? How do you contextualize it in terms of how you want to carry it forward, and how you’re going to serve your customers with that effort?
I’ll give you my product manager-y answer to this…
That sounds fine.
…because I’ve just been hearing Mike’s science answer to this. I think it’s about finding out what’s out there in the world… So to be quite honest, in that time period, if you all recall - it feels like, I don’t know, a million years ago, but ChatGPT didn’t exist like three years ago, right?
That’s right.
And we forget that that was a world, but it was. And so I think we got enamored, the folks that work with Shopify, with that technology, just as much as everybody else did. I definitely remember there was a peak ChatGPT moment where my mom was telling me about how she got it to write a poem… And so I think the whole world was just captivated by the fact that we had computers that could write us stories… And so I think that’s kind of the culture of Shopify. We’re all tinkerers, and we like to build, and we’re all there getting messages from our moms about how they can write stories that they couldn’t write before… And we’re like “Hm… Maybe there’s a way to apply this to commerce.”
I’m curious, as you talk about that – I like the mom story, because that literally holds true with me… I have a mom, she’s long since retired, but she was a technologist. And so yeah, we have those moments where she’s like “Well, I’m going to go try that thing out and do that.”
I’m kind of curious, as these new technologies were coming about and you guys were coming into the company and kind of carrying it forward, there were a lot of choices that you guys had to make, in terms of like – obviously, we talked about ChatGPT, there’s open source… There’s a whole bunch of different approaches to how you’re going to support your customers with different technologies, and different ways of addressing… What was your thinking, both on the technical and on the product side, in terms of how you might do that?
I would describe our approach as unembarrassed with how pragmatic we are on these issues, whether it’s open source, or one of the commodity providers for LLMs. We tend to gravitate to whatever works, and we do keep multiple threads of experiments with all of these different options, technological options, open for solving every problem. I think that that pans out with what we currently have in production, and active, is a nice array of not just all of the foundation models that you might think, the commodity options that are out there, but also being pretty aggressive with how we use the open source versions that are out there.
[00:11:44.05] As we’ve talked about that, and you bring up open source there, and we’ve talked about productized offerings such as ChatGPT, through API, and stuff… Going into this, and before you got to the point now where there are a number of things that I know we’re about to talk about, from a strategic standpoint, how did you parse that? How did you say “We have this challenge in that we’re a big, successful technology company, we’re moving into the brave new world of AI”, and you had to kind of like figure out what do you want to do with foundation models, are you hosting your own, are you going to go to APIs… How do you think about that? And not just from a technical standpoint, but from the perspective of serving your customers, and stuff? How do you see the strengths and the weaknesses of different perspectives there? If you could share a little bit about your insight as you had to analyze on behalf of your own business.
Echoing what Mike said, I think it’s early innings in this industry. So at the beginning of an industry, there’s a lot of change, and it happens very, very quickly. So I think it’s hard to pick any one solution. I think what people thought at the beginning of this new wave of technology is – maybe fine-tuning was the answer, right? And context windows will be small forever, and it will be expensive. And I think it’s blown everybody’s expectations out of the water at how quickly things have gotten less expensive. I don’t think anybody could have predicted that. And so I think that’s just such a world of abundance, both in like the operational costs, but also - the other thing I would say that’s unpredicted is that the number of solutions out there is just unparalleled. I don’t know, without the leaking of the LLaMA weights - I forget, back in like March of… I don’t know, was it ‘23, or something? I don’t know that we’d have the open source community that we have today, but since that Cambrian explosion of open source models, it’s been crazy, some of the innovation that’s out there.
So I would like to think that we had a master plan, where like “Oh yes, we saw exactly, we were going to use these commodity models, and then move to the open source models.” But the reality of it is always messier than the historical written version, I think. So I think the short answer to all of that is that we try a bunch of stuff, and I think the people that win in this space are the people that try the most things the fastest… And so I’d say that’s our overall strategy, is we try a lot of different things, and then like Mike said, we have a rigorous way of determining which is the best… But that frequently changes. One approach that worked three months ago… I can’t even count how many times on Sidekick we’ve replaced the core underpinnings of it, because we found a new approach that’s better.
I’m not at all surprised by that, and I appreciate that answer, because I think that’s a challenge that everyone out there, a lot of folks listening to this, are facing as well, as it is moving so fast… And what was expensive yesterday is plunging in cost as new things are on the rise, and such… And so I like your notion of experiment very fast, fail fast, I guess, in the process on your experiments, so that you know what’s working for you, at least today, until the next thing hits you tomorrow. Have there been any growing pains that come to mind, that you can share, where you plant your palm on your forehead and say “I wish I could have seen that coming”?
So this is something that I think the entire community has been working out in real time… Like Matt said, you can try different things and see which strategy works the best, or which tactic works best, or what combination of those tactics works best, and change all the time. Because we do know that new models are going to keep being released, large commodity ones, smaller open source ones, large open source ones… And we’re learning all the time in the research what sorts of tactics, whether it’s fine-tuning, or using long prompts, or combinations of domain adaptation… That’s all going to be in flux, and we should believe that that’s going to be in flux for a very long time.
And so if you have a mandate of “Hey, we’re open to anything, and we’ll use what works”, we have to have a definition of what working looks like. And that means - and Matt’s going to laugh, because this is something that we’ve worked on quite a bit… You have to have your eval system dialed in. And that’s the sort of work that we’re moving into different formats for how you might eval, especially with unstructured text generation for figuring out “This was a good answer”, “This was a bad answer”, and being able to measure it in various ways. And there’s all sorts of creative solutions, depending on the context. But making sure that we have a way of measuring that, versus saying “Two anecdotes is enough for me to think that all the swans are white”, that’s an important part of the process if we want to be pragmatic about our solutioning.
[00:16:29.07] Yeah, we have I guess like a story on the team. We talk about it being the dark forest… And I think we’re all grasping, we wish we had our GPS-enabled phones pinpointing us on the map of how you get out of the dark forest… But I think what we’ve had to settle for with eval is like a compass, if you will, like the metaphor here… It’s that everybody wishes we knew absolutely like “This is how good the system is”, but what we’ve kind of settled for instead is something more practical. It’s like “Well, we know A is better than B.” And so if you just make enough decisions in aggregate, where A is always better than B, you eventually find yourself out of the forest, right? But we wish we knew how long it would take. The question is always like “Well, when are you going to find your way out?” It’s like “Well, we don’t know.” We’re just going to keep making the best next decision we can.
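[Editor's sketch] The compass idea, choosing between two systems from many "A is better than B" judgments rather than from an absolute score, can be illustrated with a win rate and a simple sign test. This is a minimal illustration only: the helper names `win_rate` and `sign_test_p_value` and the sample judgments are hypothetical and do not describe Shopify's eval system.

```python
from math import comb

def win_rate(judgments, candidate="A"):
    """Fraction of decided comparisons won by `candidate`; ties are ignored."""
    decided = [j for j in judgments if j in ("A", "B")]
    if not decided:
        return 0.5  # no evidence either way
    return sum(j == candidate for j in decided) / len(decided)

def sign_test_p_value(wins, n):
    """One-sided chance of at least `wins` out of `n` if A and B were equally good."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# Hypothetical pairwise judgments, one per eval example.
judgments = ["A", "A", "B", "A", "tie", "A", "A", "A", "B", "A", "A"]
decided = [j for j in judgments if j != "tie"]
wins = decided.count("A")

print("win rate for A:", win_rate(judgments))
print("p-value, 8 wins of 10:", sign_test_p_value(wins, len(decided)))
print("p-value, 2 wins of 2:", sign_test_p_value(2, 2))
```

Two wins out of two would happen a quarter of the time by chance, which is why a couple of anecdotes is weak evidence, while eight wins out of ten is considerably harder to explain away.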
Break: [00:17:11.03]
So guys, I love the notion of eval and finding your way back out of the forest… With that in mind, you have a lot of different products that you work with… And as you’re bringing these new technologies in, and you’re doing these evals, and you’re trying to find your way out of the black forest through that, and managing across multiple areas there, how does that look for you? How do you unify different products so that you can effectively serve your customers with these technologies? And how do you make all those Legos come together in a usable way?
There are certain genres of problems, and it seems like one or more strategies for eval will be appropriate for each genre. So let me give an example… With search, and evaluating search quality, there’s been some good research that shows that LLMs tend to be better at rank-ordering, or labeling relevant/not relevant of a product to a query, at a certain resolution. In fact, there’s research that shows that the LLMs are better than a human at doing this. And you might ask yourself, “Why is that?” You’re a human, Chris, you searched for a white flower dress, and you’re going to click on one of those. One of those is going to be the right answer for that query for you. And then I might search for a white flower dress, and you search for one, because you wanted a dress with a white flower. And I didn’t want a dress with a white flower. I wanted a white dress with colorful flowers. And both of these are completely legitimate answers to that query. And what we’re seeing here is it’s actually just a sampling problem. If you think about it, there’s a distribution of appropriate products matching any query.
And so every time we ask a human, we’re getting a sample from that distribution. But if that distribution is very flat, then we’re going to sample across a wide variety of different answers. And so what we’ve done with LLMs in this is – you can think about this as an analogy… “In the morning, I like…” fill in the blank. And there’s a lot of good answers to that.
I like to exercise. I like coffee. I like breakfast. I like orange juice. Whatever it is. There’s a lot of ways of completing that sentence, and they’re all legitimate. It’s just that it’s a very flat distribution. And with language, what we do is we just overwhelm with sample after sample after sample after sample, so that we can fill in that whole distribution. In the query product genre, it takes too much time and too much cost to fill in that distribution en masse that way, until you get into implicit feedback. So you have to find another solution. And this is one of the reasons why, when you’re using an LLM and replacing a typical [unintelligible 00:22:51.29] from filling in those answers, you get better results.
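[Editor's sketch] The sampling argument can be made concrete with a small calculation: if each human rater picks one "right" product from a distribution over acceptable products, the chance that two raters agree is the sum of the squared probabilities, and that chance collapses when the distribution is flat. The distributions below are made-up numbers chosen only for illustration.

```python
def agreement_probability(probs):
    """Chance that two independent raters drawn from `probs` pick the same item."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * p for p in probs)

# Hypothetical query with one dominant interpretation.
peaked = [0.9, 0.05, 0.05]
# Hypothetical query like "white flower dress" with several equally valid readings.
flat = [0.25, 0.25, 0.25, 0.25]

print("peaked:", agreement_probability(peaked))
print("flat:  ", agreement_probability(flat))
```

Under these assumptions, a single human label on a flat query says little about the full set of relevant products, which is the gap that repeated sampling, or a model that scores every candidate, is meant to fill.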
Now, something important and something that Matt and I have seen in other contexts - you can’t do this ungrounded. You can’t just have the robots grade the robots, and then hope for the best. You have to have different expert supervision to ground those answers, whether it’s in a search context, a personalization context, in more of a chat context, like the Sidekick product… You have to have that grounding. And once you inject that kind of course correction, then you kind of get the best of both worlds.
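[Editor's sketch] One common way to ground an LLM grader against expert supervision is to have experts label a small subset and then measure chance-corrected agreement between the two sets of labels. The example below uses Cohen's kappa for that purpose. It is an assumption about how such a check could look, not a description of Shopify's process, and the labels are invented.

```python
def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally sized, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Agreement expected if both raters labeled at random with their own base rates.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical relevance labels for six query/product pairs.
expert = ["rel", "rel", "not", "not", "rel", "not"]
llm_judge = ["rel", "rel", "not", "rel", "rel", "not"]

print("kappa:", cohen_kappa(expert, llm_judge))
```

A low kappa on the expert-labeled subset would be a signal to revise the grading prompt or rubric before trusting the automated grades at scale.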
Could you talk a little bit about the different products? You talked about query, you talked about personalization… But are there any others there that you say are kind of very prominent in your world, that you’re thinking about applying LLMs or other AI technologies to?
[00:23:46.07] So we’ve got several products that are AI-enabled or magic-enabled at Shopify. So Sidekick is kind of the main one, which we’ll talk more about… But to give you a general idea, it’s a tool that helps merchants find their way around Shopify, but also answer questions about their business. So you can think of it as like the co-founder they wish they had, that’s available 24/7 and isn’t judgmental. So that’s kind of like the Sidekick idea. And then we’ve got a variety of other ones as well. So we have a lot of imagery… So it turns out shopping - people want to see what the thing is before they purchase it; not terribly surprising. And one of the things that merchants often want to be able to do before they’ve scaled up to a whole team that has a studio, and a photographer, and the rest of it, is they want to enhance the pictures that they do have. So at the scale that they’re at - this is where technology, again, bringing back the best from the frontier, and making it accessible to all of our merchants is exciting.
So there is technology that’s out there now, where you can essentially describe what you want to do to an image… You’re like “Hey, my background’s a little bit messy. Can you replace it with a studio background instead?” Because we all know that nice, white studio background that looks like the object’s floating in space. You can do that in real life, it’s just really hard and expensive, and you have to know what you’re doing. And there are very few people who know how to do that well. And so it turns out – we’ve created models that can do it fairly well, too. And so bringing that technology back - that’s one of the products we do offer integrated into Shopify today, is background generation.
So merchants can import an image that they already have, replace the background with something that’s more on brand… Like, say they want to set their coffee to the background of a jungle, right? They can place it on a table in front of a jungle, if that’s what they would prefer. Or they could do it into the void of white space. So lots of exciting opportunities with that.
Another area that we’ve been investing in is that we have a product called Inbox, which allows our merchants to talk with the buyers that they have on their site, and if a buyer has a question, like “Hey, what’s your return policy?” or “Where’s my order at?”, they can interact with the merchants through Inbox. And so one of the things that we’re offering today is that we look at all the merchants’ policies, and all the other things that they’ve given to us, and then can help formulate answers for those common questions. It’s like “Well, what’s your return policy?” It’s like “Well, we’re pretty sure that this is the answer.” And then we can suggest that to a merchant, who then says “Yup, that’s right.” Or if that’s not right, they can adjust it to be correct, and then send it. Merchants love that, because it saves them time for answering a lot of those repetitive questions. And then those ones that are a little bit harder, they can write them themselves just as they would before.
And then the last one that I’m thinking about is – or again, going back to that product and merchandising kind of task, is that oftentimes merchants are uploading a lot of these at the same time. So they don’t always capture all the metadata in the first go… And so this is, again, where we’ve created models that actually can help with that.
So if you upload an image of a white flower dress, we have a model that can actually understand what that picture is, and suggest that “Hey, maybe this is a white flower dress”, and this should be categorized under dresses, and under Cotton. And maybe it can suggest “Oh, it turns out –” If you upload multiple different colors, it’s like “Well, maybe you want to create product variants.” So that’s some of the other technology that we’re kind of working on today, is using the data that we have from merchants, enabling them to more expressively describe their products through our sites.
With Inbox, is Inbox part of Sidekick, or is it adjacent or parallel to Sidekick, in the way that you see it?
No, it’s a completely separate offering.
Okay.
Merchants have to choose to install – again, going back to kind of the platform thing we talked about earlier… They choose to install the Inbox app, which Shopify builds… And then within the Inbox app, you can choose to use this behavior or not.
You mentioned also the ecosystem a little while ago, and I’m curious how – as you have created these new AI-enabled products, how has the ecosystem been plugging into that? Is that something that’s possible? What kind of interactions do they have together?
So today we have a very extensive API through GraphQL, that we expose. The data we just talked about, the categorization. So whatever the merchant decides - we make a recommendation, they say, “Yes, that’s correct. This dress is actually a dress.” Once they save that change, that information is then available through that product description API.
I see.
[00:27:51.09] Matt covered a nice breadth of generative text, generative images, product understanding, all of which are kind of adjacent to… Or image generation is sort of a different algorithm under the hood. There’s also a direction where we can think about something that I’ve – it’s been a gift that keeps on giving for all of my career… The sorts of machine learning techniques that work with text often also work with commerce. And so this goes back to old-fashioned matrix factorization for recommender engines was also useful for understanding text. RNNs, useful for looking at sequences or sentences, before transformers took over. Good for text also.
Boy, you’re taking me way back. We did many whole shows on RNNs, and that seems like the stone age now.
Yeah, the stone age of the 20-teens, right?
That’s right.
That also – a lot of the techniques, you could always just peek at what you’re doing in language, and come up with a cool idea for e-commerce, and vice versa. And so this has not stopped. I mean, transformers have kind of taken over everything, but there are - not quite transformer architectures, but heavy attention method, transformer-like architectures, that can look at sequences of behaviors of merchants, of buyers, the people that are shopping with our merchants. Those are sequences too, and can be processed in an analogous way in order to understand what is the next step on the journey for a merchant, and how can we help them get to that journey, and what are the likely ways we can simulate that. And that’s been one of our frontier cutting edge areas that we’ve been applying ML.
Actually, Mike’s answer reminded me… I want to add one more product that always slips my mind here. And it also blends with the ecosystem thing that we talked about. So as I’m sure you’ve talked with other folks on the show about, one of the exciting parts about LLMs is the ability to write code. And we talked about the GraphQL API. So one of the other exciting applications that we’ve done is for our developer ecosystem, is enhancing our developer docs in the way that like we now have an integrated tool that assists developers in writing code. You describe like “Hey, I’m looking to find the product category. Can you write me the query to do it?” And it will literally write you the GraphQL query. You can copy and paste that, put it right in your application.
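[Editor's sketch] For readers who have not seen one, a product-category lookup of the kind described might resemble the request below. Treat it as an assumption-laden illustration: the field names (`product`, `category`, `fullName`), the API version string, the shop domain, and the product ID are placeholders that should be checked against the current Shopify Admin GraphQL reference, and the code only builds the request without sending it.

```python
import json

# Assumed query shape; verify field names against the current Admin API reference.
PRODUCT_CATEGORY_QUERY = """
query ProductCategory($id: ID!) {
  product(id: $id) {
    title
    category {
      fullName
    }
  }
}
"""

def build_request(shop_domain, api_version, product_gid):
    """Return the URL and JSON body for a GraphQL POST; no network call is made."""
    url = f"https://{shop_domain}/admin/api/{api_version}/graphql.json"
    body = json.dumps(
        {"query": PRODUCT_CATEGORY_QUERY, "variables": {"id": product_gid}}
    )
    return url, body

# Placeholder values for illustration only.
url, body = build_request(
    "example-shop.myshopify.com", "2024-10", "gid://shopify/Product/1234567890"
)
print(url)
print(body)
```

Sending the request would additionally need an access token header and an HTTP client, which are left out here on purpose.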
So I think we’re still in the early days of figuring out how – like, it’s such a dramatic shift for engineering to like figure out how we apply these LLMs… And it’s exciting to see these new applications to existing documentation sites, and just unlocking the power and making it more straightforward to develop apps.
Break: [00:30:33.20]
Going back to something that you had mentioned earlier in the conversation… You had talked about magic, and magic-enabled, and stuff. Could you tell me a little bit about that? I may have misunderstood. Is that a product, or a supporting technology that you guys are using?
I mean, the way we refer to things - we should have done a better job explaining it. So all the things we just talked about, we consider part of the Magic brand at Shopify. So it’s like the product taxonomy stuff, the text generation, the Sidekick, the image generation in the background. So those are all Magic features that Shopify offers.
Gotcha. So it’s kind of the AI-enabling brand that’s around all these things.
Yup.
So we’ve kind of talked a little bit about Sidekick and we’ve gone through… There was another thing I wanted to ask you about, and that was how your current array of AI-enabled capabilities that we’ve been talking about - how are you thinking about that going forward? Where are you looking at? How are you – are you going to add any more in there, that are announceable yet, or maybe at least alludable to? How are you thinking about kind of where you’re at today, versus kind of some of the things you might be doing in the fairly near future, and stuff? And we’ll get into the farther future a little bit later.
I can’t answer today, unfortunately, other announcements that are coming… But I think we can talk about generalities, of like what’s interesting…
Fair enough.
…and some of the stuff that Mike talked about, of applying old techniques in new ways around commerce-specific things. I think predictions and customization around that are interesting. I’d say me personally, I think the thing that I get excited about is the other modalities that are out there. So going back to that reference from earlier, ChatGPT was cool. I feel like I had a second bout of ChatGPT when the voice mode came out. I don’t know if you’ve played with it, Chris, but it’s absolutely incredible.
During the typical day, I have an ongoing con– this is really terrible that I would say this, but I probably talk to ChatGPT more than I talk to my wife. [laughter]
We’ll keep that between us.
Thank goodness she doesn’t want to hear me any more than she already does, so she won’t hear this on the show. But yes, I have an ongoing conversation about a plethora of topics… Which harkens back to the fact that this is moving so fast… And as you guys are having to kind of match your customer needs with products that support them, with the technologies that are driving that forward, what are some of the things that you’re thinking about now, for maybe as you go forward into the future? And more specifically, how are you thinking about handling the risks associated with changing technology right now?
[00:36:10.25] We’ve talked a little bit about constant experimentation and everything, but there’s also a point where you kind of have to make investments in different directions, and trade-offs, and stuff like that. Other than the experimentation of that to support this increasing line of capabilities that you guys are offering, how are you thinking about that risk, directions and stuff? Do you think that commercial product offerings, for instance - one topic that comes up all the time on the show… Do you think open source is going to overcome that and kind of take over? …since things are slowing a little bit on the frontier models in terms of the gains they’re making, and open source seems to be catching up faster. How are you guys thinking about problems like that as you’re dealing with these business issues in your company?
I think it will likely be hybrid. So I know we talked about how our strategy is kind of like everything, all the time… So I can’t imagine a world where the commercial offerings completely take over, and I don’t know if I could imagine a world where open source entirely takes over either. And I think that’s probably a good thing for the world. I think that’s what drives the innovation. It’s like a competition between the two. And there are some things that one is good at, and the other is not. So I can’t imagine a world where we aren’t using both at Shopify.
Can you talk a little bit about kind of how you see the strengths and weaknesses, recognizing it may change tomorrow, given how fast things are moving? But when you look at when we might go with a commercial offering like ChatGPT or one of the other several biggest competitors, versus the open source, and probably the foundation infrastructure that you guys will have… How do you guys know where to go? How do you know to go to ChatGPT as an API, versus using a foundation model you’re storing in your infrastructure?
Well, I think it goes back to Mike’s favorite point from earlier of evals. We’ve got to have our compass, because without the compass, we’re lost. So that’s how we answer which one. But I think the other part of like – you’re asking like which one is good at what at this point. And so I think in general, the way to frame it is like open source, the power is in the control. It’s like, you’re guaranteed to run this exact model, with this exact set of training data, with this exact outcome. So it’s very predictable, you have way more control over the training process, and like the post-training process… You just get a lot more knobs. But also, with great power comes great responsibility. There’s a cost to operating all those knobs and knowing what the correct values are from that. So for problems that you have pretty fully defined, and you know exactly how you want to do it, it’s like, open source is great.
The commercial models - fewer knobs, but the thing that’s great about that is the defaults out of the box usually work pretty well. So I think if you’re looking at things that are early on in prototyping, the commercial models work great. They can get you from zero to one real quick. And then when you get to that one, you realize “Oh, well, I want to get to 2.0.” And sometimes that necessitates a shift to an open source model to get that extra control out of it.
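To make the "evals as a compass" idea concrete, here is a minimal sketch of scoring two candidate models against one fixed eval set and picking the higher scorer. Everything in it is an assumption for illustration: the eval set, the exact-match metric, and the two stand-in functions `candidate_a` and `candidate_b` are hypothetical, and nothing here reflects Shopify's actual evaluation tooling.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical eval set: (prompt, expected answer) pairs.
EVAL_SET: List[Tuple[str, str]] = [
    ("2+2", "4"),
    ("capital of France", "Paris"),
    ("opposite of hot", "cold"),
]

Model = Callable[[str], str]


def accuracy(model: Model, eval_set: List[Tuple[str, str]]) -> float:
    """Fraction of prompts the model answers with an exact match."""
    correct = sum(
        1 for prompt, expected in eval_set if model(prompt).strip() == expected
    )
    return correct / len(eval_set)


def pick_model(
    candidates: Dict[str, Model], eval_set: List[Tuple[str, str]]
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same eval set and return the best name."""
    scores = {name: accuracy(fn, eval_set) for name, fn in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Stand-ins for a hosted API model and a self-hosted model (both hypothetical).
def candidate_a(prompt: str) -> str:
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "unknown")


def candidate_b(prompt: str) -> str:
    return {
        "2+2": "4",
        "capital of France": "Paris",
        "opposite of hot": "cold",
    }.get(prompt, "unknown")


best, scores = pick_model(
    {"candidate_a": candidate_a, "candidate_b": candidate_b}, EVAL_SET
)
print(best, scores)
```

The point of the sketch is only that the eval set stays fixed while the candidates change, so the same measurement can be rerun whenever a new commercial or open source model shows up.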
There’s kind of this question of “What size of a problem are you trying to solve?” If you’re trying to solve a central, do everything, “the co-founder that you wish you had” model, we’re going to need to pull out all the guns, and really put everything we can into leveraging as much power as we can. And the question is just “What is the most powerful for this task, at this time?”
There’s another side of things, where maybe we’re trying to solve problems that aren’t supersized problems, but they’re more manageable problems. Or maybe we need to do it at scale. And of course, the commercial models are getting faster and faster, and cheaper and cheaper… But when you need to do something at scale, it might be worthwhile to distill a model from some patterns, and then run it at scale. We have billions of products, if you look over our entire history. Doing that at scale for the product that Matt described, where we understand all the different attributes, and the taxonomy, and we normalize the description of those products… That’s a true engineering feat that we need to work in, and that may not be a great idea to send that to GPT-o1, right?
[00:40:08.13] Absolutely. One of the questions - I’m guessing, Mike, this is coming to you - the last two, three years we’ve been so focused on LLMs, and generative AI capabilities… And I know in general the industry is starting to also kind of pull back and look at some of the other things, things we used to talk about, other technologies in the AI space we talked about a lot… It seems that some industries, things like reinforcement learning, and CNNs, and things like that - depending on the industry, I think in my own experience, as I’ve talked to different people, some find utility in these other architectures, with other purposes, and some don’t… How are you guys? Are you really primarily focused on LLMs and generative? Or do you have use cases where some of the other technologies that we haven’t talked about as much lately, but are still very much out there in industry - are they coming into play for you guys?
Yeah, so this might be where we peel the technology onion one more layer… At a base level, any neural net is a universal approximator, right? And so if we have enough data, there is a big enough neural net that will solve – just an MLP; a fully connected neural network.
Sure.
And so the way I tend to think about it, whether it’s a CNN, or a heavy attention model, or an RNN, whatever it is - all that’s really in the business of doing is making it so that even though there is a number of neurons, it might be way too many neurons, and we might need way too much data in order to do that. And all of these are really just tactics for reducing the amount of data that we need, in order to approximate the patterns that we want.
Right now, for sure, heavy attention models, whether it’s traditional transformers, or evolutions, like you might see in – of the original 2017, [unintelligible 00:41:53.27] transformer architecture, to what you see in LLaMA, these are kind of like tweaks, and they’re still very multi-head attention-focused.
There are other techniques, like the one I described for e-commerce, that are making substantial changes, like removing the softmax out of multi-head attention, which is sort of like having the sigmoid as our activation function 10 years ago. It was just a mistake, and a sociological mistake at that. So seeing small changes like that, that maybe move us out of transformer architectures, I think that we’re definitely in an era where that makes sense.
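As a rough illustration of what "removing the softmax" from attention can mean, the sketch below computes the same scaled dot-product scores twice: once normalized with a row-wise softmax, as in the original transformer, and once passed through a pointwise nonlinearity instead. The transcript does not say what replaces the softmax in the technique Mike refers to; the pointwise SiLU used here is an assumption chosen only to show the structural difference, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def scaled_scores(q: Matrix, k: Matrix) -> Matrix:
    """Dot product of every query with every key, scaled by sqrt(d)."""
    d = len(q[0])
    return [
        [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        for qi in q
    ]


def softmax_rows(scores: Matrix) -> Matrix:
    """Standard attention: each row becomes a distribution summing to 1."""
    out = []
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def silu_pointwise(scores: Matrix) -> Matrix:
    """Assumed softmax-free variant: SiLU applied to each score on its own."""
    return [[s / (1.0 + math.exp(-s)) for s in row] for row in scores]


def apply_weights(weights: Matrix, v: Matrix) -> Matrix:
    """Weighted sum of value vectors for each query."""
    cols = len(v[0])
    return [
        [sum(w * vj[c] for w, vj in zip(row, v)) for c in range(cols)]
        for row in weights
    ]


# Toy inputs: three orthogonal queries and keys, two-dimensional values.
q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
k = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

scores = scaled_scores(q, k)
soft_w = softmax_rows(scores)
point_w = silu_pointwise(scores)
soft_out = apply_weights(soft_w, v)
point_out = apply_weights(point_w, v)
```

With softmax, every query is forced to spread exactly one unit of weight across the keys, so even unrelated keys get some share. With the pointwise version, each weight depends only on its own score, so the row sums are free to vary and a zero score contributes nothing.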
There’s also kind of combinations of things. So you mentioned reinforcement learning, there’s GNN architectures… And these are actually compatible with vision transformers for planning and reinforcement learning. Using transformers for your aggregation functions for a graph neural network. And so it’s not an either/or, it’s that now we have another tool for either, in the former case, modeling the world, so that we can do a good job at our Q-learning and in our policy learning, or do a good job in capturing the right information when we have a graph structure of how we’ve organized the different kinds of nodes, or different kinds of, in our case, merchants and products and buyers.
That being said, there’s another way of taking your question, which is “When are transformers going to be done, and are they gone already?”
That was on my mind as well, actually… Because everyone’s talking about “Okay, what does a post-transformer world look like?” So, yes.
I think I can say this, and I say this to my students pretty religiously… I’m going to make this claim with full understanding that you should never make a prediction that will be falsified in your lifetime. I’m pretty sure transformers is not the last architecture out there. It seemed for a decade that CNN was almost synonymous with vision since 2012, right? And now it’s not. And if you would have asked me in the late 20-teens if I thought it was, I’d probably say the same thing. It’s a heavy bet. CNNs seemed to be the top of the hill, and they seemed to do such a good job with image classification. It’s hard to imagine what will replace it, but probably it’s not the last chapter of the story. And I think that you could probably say the same thing for transformers.
[00:44:17.17] So a little bit ironically - and you’ve sort of kind of covered this territory a little bit with that last answer, Mike… From each of you – we usually finish the show really wanting to get perspectives from our guests on kind of what the future looks like. And with each of you addressing kind of different areas, you probably have somewhat different answers, based on your focus and stuff… And Mike, recognizing that you’ve already kind of touched a little bit on the future, but I’m actually – despite your comment about not making predictions that might prove falsified in your lifetime, I’m going to ask you both to kind of do that. If you’re looking out - and I’ll let you kind of decide on what time frame works for you… But maybe beyond the short term, waxing poetic a little possibly, and trying to say, you know, what do you think you’re going to see, what do you want to see, and how might your various jobs and how your company serves customers - how do you see this fast-moving, there are twists and turns all along the way that catch us all by surprise…
How do you see that playing out from each of you? Matt, if you could lead off, and then Mike, I’ll come back to you for that.
I think what’s most exciting about this – I mean, when I grew up, I remember when we first got the internet. And it was like the first ISP out there, and it was like a dial up modem, and there was like a BBS… It was just – that was like the first wave of technology to me, and that’s how I got into this field. And then I feel like the mobile revolution caught me by surprise. I think at the moment – I knew when the first iPhone came out, I was like “I need to have one of those.” But what I didn’t expect was how much the world would change after that. And it feels like this time around – I wasn’t a big believer in Web3. I was like “What is this Web3 business?” But I feel like this is, again, that same kind of shift… So I’m just going to ignore Web3; I think this is the real Web3, it’s AI.
And so how does this play out, right? I think what’s different this time is that for the last, I don’t know, 70 years that we’ve had computers, we as humans have had to conform to how computers work. At first we wrote literally bits. Then we wrote assembly code. Then we’re like “Well, maybe we should have languages.” And it’s like “Okay, so we’re slowly crawling there.” And then the next revolution is like “Oh, we should have point and click”, and so [unintelligible 00:46:30.01]
So now we have a world where everybody spends eight hours a day clicking on little colored boxes, and then typing into other colored boxes, characters on a keyboard. And I think what’s fascinating to me is that we’ve become – we’ve shaped who we are to conform to how computers work today… But I think this point in time – I’m sorry, I’m giving you 10 years out from now.
No, it’s fine.
[00:46:54.04] All that’s going to change. I think that whole – all these browsers that we click buttons on, to like set settings, all that is going to go away. I think it’s going to be that we interact with an agent, or some amorphous entity, and instead of listing all the steps of like “First search for this, then click on this link, then do this thing”, it’ll be like “I would like to buy a box of toothpaste”, and the agent’s like “Great. Do you want to buy one that ships tomorrow, or do you want one that’s cheaper, but ships next week?” And you’re like “The cheaper one.” And then it’s like “Done.” You didn’t fill out a credit card form, you didn’t click through 10 sites, you didn’t do any of that.
So I’m going to put my bet on that… I just think the web will change again. I think we’ve gotten so used to SaaS, and all these models… I don’t know how long it’s going to take, that’s why I’m like 10 years out. I don’t know. It might be three years out, but it might also be 20. But I don’t think we’re going to be typing in boxes in 20 years from now.
Good answer. Mike, back to you.
Yeah, I think I’m going to flank this from both directions. So I spent a little bit of time in the self-driving space years ago, and this was during the maximum hype period for self-driving, where everyone I talked to said “Oh, well, people won’t even need to drive in five years.” And this was more than five years ago. And on the one hand, just across the Bay, Waymo is giving rides to people. Not at scale yet, but it is. For sure, a lot of people said my son would never need to get a license, and he’s about to get his driver’s permit.
So just to draw an analogy, I think it’s reasonable to expect what Matt is expecting, of having kind of like a self-driving assistant that can do that. I have all the respect for Matt in the world, because he was careful about how [unintelligible 00:48:37.12] not five years, or whatever. There’s probably going to be a little bit of difficulty smoothing down the edges for that. And luckily, a crash with a self-driving assistant is far less dangerous than a self-driving car.
Indeed.
So it could be that we get imperfect models there.
I’ll take an extra five boxes of toothpaste. That seems like [unintelligible 00:48:59.28]
So that’s one side of flanking it, is there will be a self-driving moment for these assistants. And how long out is – I’m completely on the same page with Matt. We’re going to hit some bumps before we actually get there.
One thing that I feel very confident of is that we are going to change the way we organize and access and utilize information. This is going to be a forcing function that we really haven’t seen since early search days for the internet, which is also a way of just completely transforming how we organize and access information.
And a lot of people you will talk to will already say that they go to their favorite LLM first, before they go to a search experience. And there are a whole host of product and interface questions like that about what’s going to be the best way of doing this… But it is also – once again, piggybacking on something that Matt said, it is incredibly significant that we are now speaking in the same language, quite literally, when we want to access and refine the information we’re looking for. And that’s something that’s really never happened before.
Well said. Well, gentlemen, thank you very much for coming on the show. It was really interesting. I learned a lot. And thanks for sharing your perspectives going forward. I hope you guys will come back as things evolve, and you have more things that you want to share with the audience. Thanks for coming on.
I’d be happy to. Thanks, Chris.
Thanks, Chris.