Practical AI – Episode #256

Gemini vs OpenAI

get Fully-Connected with Daniel & Chris


Google has been releasing a ton of new GenAI functionality under the name “Gemini”, and they’ve officially rebranded Bard as Gemini. We take some time to talk through Gemini compared with offerings from OpenAI, Anthropic, Cohere, etc.

We also discuss the recent FCC decision to ban the use of AI voices in robocalls and what the decision might mean for government involvement in AI in 2024.


Sponsors

Neo4j – Is your code getting dragged down by JOINs and long query times? The problem might be your database…Try simplifying the complex with graphs. Stop asking relational databases to do more than they were made for. Graphs work well for use cases with lots of data connections like supply chain, fraud detection, real-time analytics, and genAI. With Neo4j, you can code in your favorite programming language and against any driver. Plus, it’s easy to integrate into your tech stack. Visit Neo4j.com/developer to get started.

Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.


Chapters

1 00:06 Welcome to Practical AI
2 00:43 Get fully-connected
3 01:20 FCC AI robocall ruling
4 08:12 Regulations to come
5 10:10 Google's Gemini tiers
6 16:17 Lack of competitors
7 19:53 New Multi-modal models & Apple's image editing model
8 21:46 Growing data analytics use cases
9 25:29 Sponsor: Neo4j
10 26:38 Under the hood of these models
11 32:15 Neuro-symbolic methods & hybrid methods
12 35:02 Innovating smaller models
13 36:51 Copilot x Windows
14 38:04 AI tools in schooling
15 41:02 Prompt engineering guide
16 42:24 Goodbye!

Transcript



Play the audio to listen along while you enjoy the transcript. 🎧

Welcome to another episode of Practical AI. This episode is a Fully Connected episode, where Chris and I keep you fully connected with everything that’s happening in the AI world, all the recent updates, and also share some learning resources to help you level up your AI and machine learning game. I’m Daniel Whitenack, I’m founder and CEO at Prediction Guard, and I’m joined as always by my co-host, Chris Benson, who’s a tech strategist at Lockheed Martin. How are you doing, Chris?

Doing pretty good, Daniel. Lots happened this past week.

A lot has happened. It seems like – I don’t know if it felt like this to you, but there was sort of a little bit of a lull around the holidays, maybe…

Too much eggnog.

Yeah, too much eggnog. But we’re fully back into the AI news and interesting things happening. One of the ones that I had seen this week, Chris, was a decision – well, I don’t know how all the government stuff works, but the FCC, which regulates communication and other things in the United States, had a ruling about AI voices in robocalls. So if people don’t know, robocalls are automated phone calls… Typically, when I worked back in the telecom industry, we’d call it sort of dialer traffic. You spin up a bunch of phone numbers, you can call a bunch of people… This is how you get phone calls from numbers that seem maybe local to where you’re at, but they’re really just automated calls… And then you pick up and realize it’s spam, or someone trying to sell you something, or something happening.

Anyway, there was an interesting one where there was an AI voice clone of President Biden, and I think they were robocalling a bunch of people and trying to sort of change views about President Biden via this recording. Well, it wasn’t a recording, it was a voice clone of him saying certain things, which hopefully would sway people’s political affiliations or sentiments leading into election season. Anyway, this was one of the things that was in the news, and maybe prompted some of these decisions, or at least highlighted some of these decisions by the FCC, to ban or fine people that were using AI voices in these robocalls. So yeah, what do you think, Chris?

First of all, I think whoever was doing that has some serious ethical issues to contend with.

Yeah. Well, I’m not sure that a lot of dialers are primarily motivated by their ethical concerns…

Yeah, I mean, I think that we’ve been seeing this coming for such a long time, and we’ve talked about it on the show… With all the generative capability and the ability to commit fraud, and the ability to misrepresent yourself in ways like this. So I’m glad the FCC got on top of it after something like that happened, and I think, unfortunately, I suspect we’ll see quite a bit more of such things. As you pointed out, not everybody follows the law as well as maybe they should. I keep waiting for them just to ban robocalls altogether, and it would just take the whole issue away from us. We’d have AI-generated voices in other contexts, of course, but…

Yeah. One interesting thing - I actually forget if this was a conversation we had on this podcast, or elsewhere; maybe someone could remember. I don’t always remember all the things we’ve talked about on this podcast, but I saw a discussion of someone on the other end of the spectrum who was using cloned voices or synthesized voices to actually spam-bait the spammers. They had a script set up where they would get a robocall, or a spam call, and they had this conversational AI that would try to keep the spammer on the line as long as possible.

I think we did talk about that. I remember that, yes.

So I don’t know if that’s illegal. I found that one kind of fun, too, because you see these people on YouTube that sort of spam-bait the spammers, and try to keep them on the line… Because if they’re talking to an AI voice, then they’re not scamming my grandma, or something like that.

That’s true.

So yeah, that was I think the goal in that, but I don’t know, maybe all of this gets into a little bit of a murky zone.

It does. But I would say the FCC, the Federal Communications Commission got it right on this one. Score one for the government.

Yeah. What I don’t know — I think this would still allow… Because obviously, when you call in to change your hotel reservation, or you call your airline or something, there’s synthesized voices, and there have been for many years; not necessarily synthesized out of a neural network, but synthesized voices. So I’m assuming that that – I haven’t read the ruling in detail; I think the main thing that they’re targeting is these robo calls, and so I don’t think that covers these assistants… But I don’t know, that’s a good question.

[06:13] I would assume it goes to intent, and the representation of the voice. And if it is clearly as in the case of the FCC ruling, it’s mimicking a person for the purpose of misrepresenting how they’re seen or whatever, or what their positions are and such, then… I think that’s a very reasonable thing. All of the types of circumstances we find ourselves in where people are trying to commit fraud, or misrepresenting themselves in some way, probably need to be addressed in this way. But obviously for every one of those there’s probably 1000 legitimate use cases as well. So I agree.

Yeah, there is probably a weird middle zone, because even - if you remember, I think it was originally Google that did their demos at one of their Google I/O conferences; one of the things that was shown on stage was clicking and calling your pizza place and ordering a pizza with an AI voice, right? Or “Make me a reservation at 5 PM at this restaurant”, but there’s no form on the website, so there was an automated way to make a call with an AI voice to make the reservation for you…

Which seems completely legit to me, because you’re representing everything appropriately. You’re not pretending, you’re not getting around, that kind of thing. You have a tool, and frankly, I could use a few of those in my life, and just take care of all the things. But I’m probably not going to call anyone and have an AI model pretend to be Joe Biden or anybody else. So…

Yeah, I think it definitely - like you’re saying, it gets extremely concerning when there’s a representation that this is this person, and they’re trying to sway your mind in one way or another, and it’s not that person.

Yeah. Pure ethical problem, right there. I mean, that’s…

Yeah. Well, I don’t know, do you think that this represents some of what we’ll see this year, in terms of a trend of government regulation of generated content?

I would not be surprised, especially given what we talked about last year - the executive order here in the US that came out… And I think that was indicative of further actions to come. I mean, they essentially laid out a strategic plan on how they’re going to address AI concerns. And the FCC was one of the agencies, I believe, that was explicitly listed in the order, if I recall. And so I’m not surprised to see them weighing in on this at this point. It’ll be interesting to see how it mixes across national boundaries, and see how various countries are addressing it, and what that means for – so much of this is transnational in terms of technology usage, and even organization-spanning, and so it will be a curious mess for all the lawyers to figure out going forward.

Yeah, when the dialer is using Twilio or Telnyx or something to spin up numbers, but they’re doing it from an international account, which is probably not even in the country where they’re operating, and there’s all of these layers… It gets into some crazy stuff. I know that’s always something that stands out to me. I always listen to the Darknet Diaries podcast. It’s one of my favorites, so shout-out to them for the great content that they produce. But yeah, that’s always a piece of it - putting enough of these layers in between to where, yeah, sure, there’s regulations, but…

We just need a blanket rule, a global blanket rule that’s just “Do the right thing. Everybody, everybody out there, just do the right thing.” But we may not have things to talk about on the podcast then.

[10:04] Yeah. Well, the messiness of the real world will continue, but… Yeah, speaking of Google, I’ve mentioned the Google demos and the stuff they’ve done over the years with synthesized voices and all that, and of course, recently they’ve been promoting Gemini, which is this latest wave of AI models from Google, which are kind of multimodal-first models…

Yeah. There’s a whole bunch of kind of related activity in there, and they took their existing chatbot Bard and they rebranded it into Gemini… And there are several – there’s Gemini Pro… Very confusingly, there is the paid service now of Gemini Advanced, which is using the model called Gemini Ultra. So I know initially there was some confusion about Advanced versus Ultra. Well, Advanced appears to be the service, Ultra is the underlying model.

So Pro represents a model size, or Ultra, or it represents a subscription tier?

Both, in different ways. So Pro is the free tier. There’s nothing less than Pro. We only start –

Oh, obviously. Yeah, we’ve talked about this with Apple products before. There’s no low-quality anything, right?

Exactly. That’s what I was about to say. There’s no such thing as low quality. You start with Pro. And that’s the free version, it’s the smaller model. We can all go – just as you could go to bard.google.com, you now go to gemini.google.com, and it’s there and available.

So Bard is no more.

So Bard is no more. Gemini Pro is roughly the equivalent of GPT-3.5, the free version on the OpenAI side, and now Gemini Advanced, which has the Gemini Ultra model, is competing against ChatGPT, which is hosting the GPT-4 model at the high end.

And there have been a billion reviews of how the two go against each other head to head.

Have you tried the various ones, or tried Gemini?

I’ve not tried Ultra yet, because I haven’t decided to pay for it, because they’re asking for 20 bucks a month… So I haven’t been able to compare it directly. I’ve watched a whole bunch of YouTube videos, more than I should have, where people were doing side-by-side comparisons. And I think it’s a really good model, but it generally has been met with some disappointment, in that people are expecting the newest thing is always going to be the greatest thing possible… And I think we saw something with GPT-4 where when OpenAI released it, it had its initial fanfare, and then they built a lot of infrastructure and services around it. The various plugins… They’ve also fixed a lot of the problems behind the scenes, while maintaining the actual underlying model. Whereas Google has not done that. They put the model out, and it’s comparable in many ways, but it feels very, very rough around the edges, and it doesn’t always give you the best output. So most of the direct head-to-head comparisons, most of the various tests I’ve seen, have had GPT-4 win out. So my expectation would be that Google will start working around the issues that it has and cleaning it up, and within a few months it’ll probably catch up a little bit closer in that way.

So our company, and actually the last few I’ve been a part of, have been big Google users in terms of G Suite and Google Workspace and email and Docs and all that stuff. So I’m kind of embedded in that ecosystem, and I’m thankfully not having to deal with Teams or something like that, as I know many are…

I am at work. It’s terrible. Gosh.

[13:59] I feel for you, and I guess I do experience that pain in a second-order way, because I have to take a lot of Teams calls. But anyway, outside of that, which is probably enough said… So I’m always trying the Google stuff that comes out, and I have tried Bard, and I think also before that just the general interface to… I don’t know if it was always branded as Bard, or… I remember PaLM, but I think PaLM was embedded in Bard. I don’t remember always what the branding was.

But yeah, now there’s Gemini… I would say my impression was similar, Chris, in that I literally just took one of their example prompts – you know how you log into any of these systems, like ChatGPT or Gemini, and they suggest prompts, like “Try this.” I think it was like “Print out how to do something in Linux”, or something like that. I think list processes, or something. I just clicked the button for the example prompt, and it wasn’t able to respond to the example prompt. These are rough edges; I’m sure the model does a lot of things really well, and that was just a fluke in many ways… But I think it does represent a lot of those rough edges that they’re dealing with. And my impression - I’ve said this a few times on the podcast… When you’re a developer working directly with one of these models, it’s kind of like taking your drone that’s flying all great, and you’re controlling it, and then you take it out of autopilot mode, and there’s all of these things to consider that you really just didn’t think about, because they’re taken care of by great products, like Cohere, or Anthropic, or OpenAI, or whatever. So I definitely feel for the developers, because there’s a lot of things and a lot of behavior to take care of. But yeah, that was not the best way to win me over, I think.

They might have done better to hold back just a little bit longer, and do a little bit more… They talked about how they had roughly 100 private beta testers… And that seems to me a very small sampling of beta testers to be working on it. You mentioned another name just now, which I wanted to throw out, that is very absent from this conversation out there. That is Anthropic. I don’t see a lot of comparisons against Claude and stuff like that, or Claude 2 at this point…

Or maybe, yeah, Anthropic and Cohere… Maybe some other ones…

Yeah, absolutely. Right now it’s been a two-horse race between these two, which made me a little bit sad. I wish it had been a little bit more expansive. And also, again, some of the open source models that are out there. Because one of the topics that you and I are often talking about is, with the proliferation of many models - some of which are private, some of which are open - it increases the challenges for the rest of us in the world to know what to use, and when, and when to switch, and things like that. Something that I know you know quite a lot about.

Yeah, it’s been intriguing to see all of these. And I would say all of them are on some type of cycle. So we’re talking about maybe GPT-4 is in the lead, and here comes Gemini, and then… We’re mostly talking here about the closed proprietary models, that sort of ecosystem. But then I’m guessing Claude had a big release at some point, and they’re probably in their cycle where - I have no inside knowledge of this, but it’s just my own perception that Anthropic and Cohere are in a different release cycle, obviously, than OpenAI and Google. So we’ll see something from them in the coming months, I’m sure, in terms of upgrades, or multi-modality, or extra functionality like assistants, or tying in more things, like RAG, and that sort of thing, as we’ve seen with OpenAI’s Assistants, and file upload, and that sort of stuff.

[18:05] You know, if we’re fair about it, when you think back to when GPT-4 came out, it didn’t have all the things it has now – the ecosystem has grown substantially since its release. And it had some of the same challenges. And I think this might be – with Gemini coming, I think everyone kind of took that for granted. They were a little bit less splashy than a big, giant new model coming out… And I think this is one of those moments where you kind of go “Wow, there’s more to this than just the model itself.” Big new model, I got that, but there’s so much to the ecosystem around a model, and the various plugins, capabilities, extensions, whatever you want to call them; Google calls them extensions at this point… But I think it really goes along the lines of something we’ve been saying for a long time, in that the software and the hardware - it’s all one big system. It’s not just about the model. And I suspect Google is very well positioned to make the improvements in the coming weeks. So it may be interesting to revisit some of these tests after a short while.

Yeah. And there are other players that are kind of playing on this boundary between open and closed, either on that sort of open and restricted line… So releasing things that are open, and not commercially licensed; or open source, but with some other usage restrictions, and that sort of thing. There’s cool stuff happening in all sorts of areas. One of the ones that we’ve been looking at is a model from Unbabel, which is a translation service provider… They have this Tower family of models, which does all sorts of translation and grammar-related tasks… But there’s also a lot of multi-modality stuff coming out…

So I noticed – you know, we talked about text-to-speech at the beginning of this episode, and I’m just looking at the most trending model right now on Hugging Face, which is the MetaVoice model, a 1-billion-parameter text-to-speech model… But if I’m just looking through other things that are trending, we’ve got text-to-speech, image-to-image, image-to-video, semantic similarity - which are, of course, kind of embedding-related models - text-to-image, automatic speech recognition, or transcription… So there’s really a lot of multi-modality stuff going on as well, and people releasing that. I know one that you highlighted was some stuff coming out of - I believe it was Apple, right?

Related to image modif– or how was it phrased? Image modification, or something like that? Image editing?

Image editing. MGIE is the acronym – I haven’t heard them say this, but I’m guessing they’re calling it Maggie, or something like that… And it is where you’ll give a source image, and they have a demo that’s on Hugging Face, and you essentially kind of talk your way through the editing process, and gradually improve it, and everything. So I think they had the bad luck of announcing this and releasing it at the same time that Google put out Gemini to go head-to-head with GPT-4. So I think it largely got lost in the news cycle. But it looks like it might be a very interesting thing, and I think – they’re competing against the likes of Adobe, doing image generation… And all of these companies have some level of image editing model capabilities. So it will be interesting to see how Apple’s model plays out and how they apply it to their products.

[21:46] What I think is a differentiating or interesting element of this is maybe not text-to-image or text-to-text completion, but the common types of things that people are wanting to do, which are somewhat model-independent, but are more workflow-related. So things like RAG pipelines, where you upload files and interact with them… GPT models, or the OpenAI ChatGPT interface, where certainly you can upload files and chat with them or analyze them… Anthropic actually was an early one where, because of their high-context-window models, you had the ability to upload files and chat with those files. I don’t think - at least I couldn’t tell - there’s something similar in Gemini, other than uploading an image and chatting or reasoning over that image, which is sort of the vision piece of it… But more than multi-modality, there are these increasing workflows that people are developing. One of those that I think is really interesting is the data analytics use cases that are coming out. So you have – actually, I’ve seen a trend of a lot of these companies popping up that are something to the effect of “New enterprise analytics driven by natural text queries.” So I’m thinking of Defog, I think it is…

These companies are a chat interface, where you type in a question… Maybe your SQL database is connected, and you get a data analytics answer, or a chart, out. And it’s interesting that there are different takes on this approach - again, I don’t know all the internals of ChatGPT… And I think there’s a lot of misunderstanding about how this actually happens under the hood… So I don’t know, have you done much where you’ve uploaded a CSV, or done that sort of thing in ChatGPT, and asked it to analyze it or something like that?

Ironically, that’s literally something I’m playing with right now. I know you didn’t know that before asking the question. But I saw a similar post about kind of analytics being used for this, and so I’m experimenting with it… But I’m still very early.

How are your results initially?

They’re not as good as I want, but I think that’s mainly my problem. I keep running into little bumps where I’m trying to get the CSV into usable shape. So I have a database that I dumped some data out of, and was trying to do that… But I literally just did this today. Today was day one, and then I stopped, and came in for us to have this conversation. So let me let you know in another week or so how that panned out. But it caught my eye because I saw a conversation online about this, and some of the personalities that I’ve always associated with being super-technically bright analytics folks were kind of saying, “We’re just hitting that moment where this kind of AI-driven conversational analytics is now going to be available to everyone”, and I was like, “That’s what I want. That’s what I need.” So I’m actually trying to do something for work right now along those lines.

Well, Chris, I was asking these questions about this data analysis stuff because I’ve done a few customer visits recently where we’ve been talking about this functionality, and I’ve noticed as I’ve gone around and talked to different people there’s some general misunderstanding about how you can analyze data with a generative AI model. One, because there’s something people think is going on that isn’t actually going on, and two, because generally, if you ask language models – just the chat model, without uploading data – math-type questions, usually they’re really terrible at that. Even adding things together, or doing basic aggregation, is something that these models are known to fail at pretty badly. And so the question is “Well, how am I getting anything relevant out of these systems to begin with?” And again, I don’t know all the internals of ChatGPT, but this is my own understanding… There’s some difference if you look at examples like Defog, or ChatGPT, or Vanna AI… These are some examples of what’s going on. ChatGPT takes the approach, in my understanding, where in their Assistants functionality, when you upload maybe a CSV and you ask a question, you wait for seemingly forever while the little thing spins, and it says it’s analyzing, I think is what it says, something like that…

My understanding of what’s happening is more of what they used to call Code Interpreter. It’s actually generating some Python code, which it then executes under the hood to analyze the data you uploaded, and then it somehow passes the results of that code execution along to you in the chat interface. So this is a very astute observation by whoever had it: yeah, these models really stink at doing math, but what doesn’t stink at doing math is code, right? So these models are pretty good at generating code - so why don’t we just sidestep the whole math thing, generate the code, then execute it and crunch the data, and we’re good to go?
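To make that generate-then-execute pattern concrete, here is a minimal sketch - not ChatGPT’s actual internals. It assumes the OpenAI Python SDK with an API key in the environment; the model name, prompt wording, and the `analyze_csv` helper are all illustrative:

```python
import re
import pandas as pd
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_csv(path: str, question: str) -> str:
    df = pd.read_csv(path)
    prompt = (
        f"A pandas DataFrame `df` has columns {list(df.columns)}. "
        f"Write Python code that answers: {question!r}. "
        "Store the final answer in a variable named `result`. "
        "Reply with only a Python code block."
    )
    resp = client.chat.completions.create(
        model="gpt-4",  # illustrative; any capable code model works
        messages=[{"role": "user", "content": prompt}],
    )
    content = resp.choices[0].message.content
    # Pull the generated code out of its markdown fence, then run it locally.
    match = re.search(r"`{3}(?:python)?\n(.*?)`{3}", content, re.S)
    code = match.group(1) if match else content
    scope = {"df": df, "pd": pd}
    exec(code, scope)  # in production you would sandbox this execution
    return str(scope.get("result"))

print(analyze_csv("sales.csv", "What is the total of the amount column?"))
```

The key point is the division of labor: the model writes the pandas code, and plain Python execution produces the numbers.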

I think the thing that often - what I’ve seen people struggling with in the Assistants API in ChatGPT is, again, they have to support all sorts of random general use cases. Because people could upload CSVs of all sorts of different types, or other file types, so there’s a lot to support… And it’s kind of generally slow and hard to massage into working.

What I’ve seen more in the enterprise use cases that we’ve been participating in is less of a focus on code generation to do the data analysis and more of a focus on SQL generation to do analytics queries. So this is more the approach of the SQLCoder family of models, Defog, Vanna AI… We’re doing very similar things in the cases where we’re implementing this, similar to the Vanna AI case, where you connect up - let’s say that you have a transactional database, like your sales or something like that, or customer information or product information, and you want to ask an analytics query. Well, SQL is really good at doing aggregations, and groupings, and joins…

[30:16] Also, large language models - especially code generation models or code assistant models - are really good at generating SQL, because, like, how much SQL has been generated over time? It’s a very well-known language to generate. So you kind of sidestep the code execution piece in that case: you’re not generating Python code, but generating a SQL query from a natural language query, to run against the database that’s connected. And you just run that SQL query in normal, good old regular programming code to get your answer, and then you send it back to the user in the chat interface.
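A sketch of that text-to-SQL flow, under the same caveats - the schema, prompt, and helper names here are hypothetical, and a dedicated model like Defog’s SQLCoder could fill the same generation slot:

```python
import sqlite3
from openai import OpenAI

client = OpenAI()

SCHEMA = """CREATE TABLE sales (
    product TEXT, region TEXT, amount REAL, sold_at DATE
);"""

def text_to_sql(question: str) -> str:
    prompt = (
        f"Given this SQLite schema:\n{SCHEMA}\n"
        f"Write one SQL query that answers: {question!r}. "
        "Reply with only the SQL, no markdown fences, no prose."
    )
    resp = client.chat.completions.create(
        model="gpt-4",  # illustrative; swap in your SQL-generation model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

def answer(question: str, db: str = "sales.db") -> list:
    sql = text_to_sql(question)
    with sqlite3.connect(db) as conn:
        # SQL, not the language model, does the aggregation and joins.
        return conn.execute(sql).fetchall()

print(answer("Total sales per region in 2023, highest first"))
```

In a real deployment you would also validate the generated SQL before executing it - for example, by running it as a read-only database user against allow-listed tables.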

So I thought that would be worth highlighting in this episode, because there does seem to be a lot of confusion about what’s actually going on under the hood - like “How can one of these models analyze my data?” Well, the answer is it kind of isn’t. It’s just generating either code or SQL that is analyzing your data.

It still gets you there, though. Since you’re not directly having the model do it, it’s sort of a workaround, in a manner of speaking… But I think if you look at something like the ecosystem built around ChatGPT, there’s a lot of tooling around it. And I think this year we’re gonna see more and more of that, whether it be the SQL use case that you’re talking about, or continued work from OpenAI… I think Google will do that well, I think Anthropic will get on that… And you’ll see these kinds of tools for doing exactly that kind of thing, where you may not have a model that does a particular task super-well, but it can produce an intermediate that can do something very, very well. We keep talking about the maturity of the field, and I think part of that is recognizing that maybe there’s a better way to do it than just having a bigger, better, latest model. So yeah, I think that’s a great way of approaching it.

Not to self-fulfill my own prophecy from our predictions last year… I think in our 2024 predictions episode one of my predictions was that we would see a lot more combination of what is generally being called neurosymbolic methods - but maybe more generally just hybrid methods - between what we’ve been doing in data science forever and a frontend that is a natural language interface driven by a generative AI model.

So in this case, what we have is good old-fashioned data analytics, just the way we’ve always done it, by running SQL queries. It’s just that we gained flexibility in doing those data analytics by generating the SQL query out of a natural language prompt, using a large language model. And I think we’ll see other things like this - tools, and LangChain is a great example of this - where you generate good old-fashioned structured input to an API, and that API is called and gives you a result… But this could be applied in all sorts of ways, right? So let’s say time-series forecasting. I don’t think right now language models - and I’ve actually even tried some of this with fraud detection, forecasting and other things with large language models - are very good at doing these tasks. But they can generate the input to what you would need for the traditional data science task. So, again imagining bringing in this SQL query stuff, if you have a user, and you want to enable that user to do forecasts on their own data, well, you could have them fill out a form in a web app, click a button and do a bunch of work, or you could just have them say “Hey, I want to forecast my sales of this product for the next six months”, or something.

[34:19] From that request, a large language model will be very good at extracting the parameters that are needed, and possibly generating a SQL query to pull the right data that’s needed as input to a forecast. But that forecast is going to be best done by just using Meta’s Prophet framework or something - a traditional statistical forecasting methodology… And you just forecast it out with that input, and then you get the result. So it’s the merging of what we’ve been doing in data science forever with this very flexible frontend interface. I think we’ll see a lot more of that.
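A sketch of that hybrid pattern, with the LLM step stubbed out so the example runs end to end; in practice `extract_params` would be a model call with a JSON-output prompt, and every name here is hypothetical. The forecast itself is plain Prophet:

```python
import json
import pandas as pd
from prophet import Prophet

def extract_params(request: str) -> str:
    # Stub: in practice this is an LLM call with a JSON-output prompt,
    # e.g. "Extract product and horizon_months from: {request}".
    return '{"product": "widgets", "horizon_months": 6}'

def forecast(request: str, history: pd.DataFrame) -> pd.DataFrame:
    params = json.loads(extract_params(request))  # the only LLM step
    df = (history[history["product"] == params["product"]]
          .rename(columns={"date": "ds", "units_sold": "y"})[["ds", "y"]])
    model = Prophet()
    model.fit(df)  # classic statistical fit; no LLM involved here
    future = model.make_future_dataframe(periods=params["horizon_months"] * 30)
    return model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]

# `history` would come from your database, e.g. via a generated SQL query:
# forecast("Forecast my widget sales for the next six months", history)
```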

I completely agree with you. And not only that, but I think there’ll be a lot more room for LLMs that are not the gigantic ones. We’ve talked a bit, and we’ve had guests on the show recently talking about the fact that there’s room not only for the largest, latest, greatest giant model, but there’s an enormous middle ground where you can have smaller ones and combine those with tools… So it’s pretty cool seeing people innovate in this way, and start to recognize that not everything has to come out of the largest possible model you have available to you. So I’m really looking forward to seeing what people do this year in their various industries, and how that spawns new thoughts.

Yeah, and especially with a lot of things being able to be run locally… I’ve seen a lot of people using local LLMs as an interface, using frameworks like Ollama and others, which is really cool, to be able to use LLMs on your laptop to automate things, or do these types of queries, or experiment locally… So yeah, I think that even adds another element into the mix.
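As a minimal sketch of that local pattern - assuming the Ollama server is running and a model has been pulled (e.g. `ollama pull llama2`) - Ollama’s official Python client makes this a few lines:

```python
# pip install ollama; assumes `ollama serve` is running on this machine
import ollama

response = ollama.chat(
    model="llama2",  # any model you've pulled locally
    messages=[{"role": "user",
               "content": "Write a SQL query counting orders per region."}],
)
print(response["message"]["content"])  # nothing left your laptop
```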

And for truly edge computing, where it’s not practical to have a cloud backing, or where the networking between the cloud, where that model would be, and where you’re trying to do it is a problem… There’s a huge amount of opportunity to use them in that area. So yeah, I’m hoping that we see a lot of innovation. Last year - and even the year before - was kind of the race to the biggest model. I’m kind of hoping now we see what other branches of innovation people can come up with to take advantage of some of that, and also recognize that the mid-sized ones have so much untapped utility.

Yeah. And maybe, before we leave the news and everything that’s going on in this kind of Copilot/assistant analysis space - I did see… Actually, my wife needed help connecting to printers. Printers are not a problem that is solved by AI yet, I guess, and will continue to be a problem forever in tech… But I was noticing in recent updates to Windows there’s the little Copilot logo there, embedded within Windows… And I don’t know, whoever watched the Super Bowl in the US - the Super Bowl was the day before we’re recording this… But there was a Copilot commercial during the Super Bowl. And that’s another interesting thing, because this is now - it’s running on people’s laptops everywhere… And of course, that’s connected to the OpenAI ecosystem, in my understanding, through Microsoft… But yeah, this kind of AI everywhere, and also the sort of AI PC stuff that Intel’s been promoting, and running locally, is going to be an interesting piece of it.

[38:01] Totally agree. As we wind up, I want to briefly switch topics here… A few episodes ago I received some feedback from a teacher who was listening… And I was so happy to have one, and maybe many, teachers out there listening to us and considering this… And as we often do - people may not realize, but Daniel and I have a topic, but we are largely unscripted. So we are kind of shooting from the hip in terms of what we’re saying. It’s a very genuine and real conversation. We’re not looking at a whole bunch of notes and a pre-planned script. And I made a comment about my daughter in school, and the fact that I really think schools should take advantage of models as part of the learning process, as part of the teaching, to integrate them in… Whereas often school systems right now are saying “You’re not allowed to use GPT, for instance, in your homework.” And I said, “Ah, that’s stupid that teachers will not do that.” And this teacher reached out and said “Well, first of all, we really want to…” - and I’m paraphrasing here. And she said, second of all, a lot of times it’s not in their power anyway; it’s school system policy and such. And so I just wanted to apologize to anyone, especially the teachers out there, that might have been offended. I’m much more cognizant now of what I’m saying on that. It was kind of shooting from the hip, but it was insensitive. And I’ve found that what that teacher pointed out was dead on; it was right on. And I just want to thank the teachers out there, especially those who are trying to take advantage of these amazing new technologies, and talk their systems into bringing them into the classroom, and not make it just the bad thing not to use for homework. So thank you to the teachers for doing that. I just wanted to call that out. It’s been a really important thing from my standpoint to say, so thank you.

I think it represents the complexity that people are dealing with.

It does.

You know, teachers want their students to thrive. I think generally we should assume that most teachers are really actually motivated and engaged, both in culture and technology, and the ecosystem, wanting their students to thrive… But sometimes, like you say, they have their own limitations in terms of what is the system that they’re working in, and privacy concerns, and other things… So yeah, that’s a good call-out, Chris. I’m glad you took time to mention it.

I want to say one last thing… To teachers out there who are trying to get these things into the classroom, so that your students have the best available tools to do things… If you ever need someone to back you up, reach out to us. We have all our social media outlets, you can find me on LinkedIn, and I will be happy to give a whole bunch of reasons to your school systems on why they might want to use the tools. I’ll be happy to work with you on that. And I thank you for fighting that fight on behalf of the students that you’re serving.

Yeah. And speaking of learning, something that we can all learn and be better at is all the different ways of prompting these models - for multimodal tasks, for data analysis… And I just wanted to highlight here at the end a learning resource for people. A while back I mentioned a lecture and a series of slides from DAIR.AI that was very helpful for me. Now they have converted that series of slides - that prompt engineering course, I think is what they call it - into a prompt engineering guide. So if you go to promptingguide.ai, they have a really nice website that walks you through all sorts of things, and also covers various models - ChatGPT, Code Llama, Gemini, Gemini Advanced… We talked about those on this show. And it talks about actually prompting these different models.

So I’d encourage you, if you’re experimenting with these different models and not immediately getting the results that you’re wanting, that may be a good resource to help you understand different strategies of prompting these models to get things done as you need to get them done.

It’s a great resource. I’m looking through it as you’re talking about it, and it’s the best I’ve seen so far.

Well, Chris, this was fun. I’m glad we got a chance to cover all the fun things going on. And we’ve complied with the FCC by still using our actual voices. We’ll see how long that lasts, but… It was fun to talk through things, Chris. We’ll see you soon.

Talk to you later.


Our transcripts are open source on GitHub. Improvements are welcome. 💚
