Practical AI – Episode #281
Gaudi processors & Intel's AI portfolio
with Benjamin Consolvo & Greg Serochi from Intel
There is an increasing desire for GPU alternatives for AI workloads, including the ability to run GenAI models on CPUs. Ben and Greg from Intel join us in this episode to help us understand Intel's strategy as it relates to AI, along with related projects, hardware, and developer communities. We dig into Intel's Gaudi processors, open source collaborations with Hugging Face, and AI on CPU/Xeon processors.
Featuring
Sponsors
Intel Innovation 2024 – Early bird registration is now open for Intel Innovation 2024 in San Jose, CA! Learn more or register
Motific – Accelerate your GenAI adoption journey. Rapidly deliver trustworthy GenAI assistants. Learn more at motific.ai
Notes & Links
- Intel’s AI & Machine Learning Ecosystem Developer Resources
- Intel® Tiber™ Developer Cloud
- Intel Gaudi AI Processors
- Optimum Habana
- Overview of AI tools from Intel
- OpenVINO
- Case Study: Prediction Guard De-Risks LLM Applications at Scale
- How Prediction Guard Delivers Trustworthy AI on Intel® Gaudi® 2 AI Accelerators
Chapters
Chapter Number | Chapter Start Time | Chapter Title | Chapter Duration |
1 | 00:00 | Welcome to Practical AI | 00:44 |
2 | 00:44 | Sponsor: Intel | 01:58 |
3 | 02:47 | Welcome Intel's Benjamin Consolvo & Greg Serochi | 01:25 |
4 | 04:11 | Intel at a high level | 03:18 |
5 | 07:30 | Competition | 01:30 |
6 | 08:59 | Transitioning from NVIDIA | 03:02 |
7 | 12:02 | Intel & open source | 04:51 |
8 | 16:52 | What is Gaudi®? | 03:44 |
9 | 20:37 | AI dedicated processors vs GPUs | 02:36 |
10 | 23:13 | Scale & possibilities | 04:31 |
11 | 27:56 | Sponsor: Motific | 01:53 |
12 | 29:56 | Getting hands on | 01:51 |
13 | 31:47 | Tiber developer cloud | 02:35 |
14 | 34:22 | Gaudi® at the edge | 01:22 |
15 | 35:44 | The future with Intel | 04:07 |
16 | 39:52 | Chip diversity | 01:20 |
17 | 41:12 | Where is this headed? | 04:00 |
18 | 45:11 | Go try it! | 00:30 |
19 | 45:41 | Outro | 00:46 |
Transcript
Play the audio to listen along while you enjoy the transcript. 🎧
Welcome to another episode of Practical AI. My name is Daniel Whitenack. I am founder and CEO at Prediction Guard, and I’m joined as always by my co-host, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?
Doing great today, Daniel. How’s it going?
It’s going well. I am really happy to bring a little bit of content to the show today, that is from a little bit of the world that I’ve been operating in, in some of the groups that I’ve been collaborating with at Intel, which I think is a really kind of cool set of stuff that maybe people are – maybe they’re less aware of it, maybe they’re aware of it, but really happy to have with us today Benjamin Consolvo, who is an AI engineering manager at Intel, and then also Greg Serochi, who is a developer ecosystem manager at Intel Gaudi. Welcome.
Yeah, thank you.
Thanks for having us.
Yeah, well, like I say, I think people are, of course, aware of Intel, and that Intel is sort of everywhere in one degree or another, in cloud providers, and PCs, and all of that… But I think some people might not be aware of the strategic moves that Intel is making in the AI space, and what they’re focusing on… Ben, I’m wondering if you could give us a little bit of a high-level sense of what Intel is doing as related to AI and how that’s kind of featuring in their strategy right now.
Yeah, no, I think a lot of people don’t know Intel in terms of our strategy with AI, and so I’m happy to talk through that a little bit. So both in terms of hardware and software, we’re really aimed at AI in terms of the hardware front, which people are perhaps more familiar with. We have our Xeon product line, which is a data center CPU that we use for inference in a lot of cases for AI workloads, and then more recently, we’ve announced the AI PC, and kind of started that whole category, which includes - you know, the machine has a CPU, a GPU, and then what’s called an NPU, a neural processing unit… So that’s exciting. That’s kind of to optimize workloads on your local machine.
And then back to the data center side, we have the Gaudi product line, which is a really good performance substitute for a lot of the modern GPUs that are out there. So that’s really exciting as well, really powerful data center hardware that we have.
So that covers some of our hardware. I guess the other big one that I want to emphasize as well is we have Falcon Shores coming out in the future, which is an all-purpose GPU, a data center GPU as well. So kind of leading into that is Gaudi, and we’ll get more into that in the episode. But on the hardware front, we have multiple products for AI. So that’s really exciting.
[00:06:05.10] And then on the software front - Intel, again, spans kind of the whole gamut of software for AI, for enabling workloads in AI. But rather than kind of going through the whole software stack that we have, I’ll just talk about a couple of things that I’m excited about. So the PyTorch 2.4 release includes support for the Intel GPU. So right now that’s the Max Series GPU, and later it will be Falcon Shores. So that’s really exciting, that the upstream, mainstream version of PyTorch now has support for that. And then coming soon, PyTorch 2.5 will have support for the Arc GPU, which forms a part of the discrete GPU product line that we have, and that’s also included in the AI PC. So those are a couple of exciting things with PyTorch that are happening.
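For a concrete sense of what that upstream support looks like, here is a minimal sketch of targeting an Intel GPU from stock PyTorch via the "xpu" device. It assumes a PyTorch 2.4+ build with Intel GPU (XPU) support; the toy model and tensor sizes are placeholders.

```python
import torch
import torch.nn as nn

# With a PyTorch build that includes Intel GPU support, the card shows up as the "xpu" device.
device = "xpu" if torch.xpu.is_available() else "cpu"

model = nn.Linear(1024, 1024).to(device)   # toy model, just to show device placement
x = torch.randn(8, 1024, device=device)

with torch.no_grad():
    y = model(x)

print(y.device)  # xpu:0 when an Intel GPU is present, otherwise cpu
```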
And then we also have the OPEA ecosystem. That’s the Open Platform for Enterprise AI. And it’s kind of this open framework that we have that multiple people can contribute to for GenAI workloads, such as chat Q&A, a code copilot, and various other GenAI examples. So yeah, those are a few of the things that I’m excited about on the software front.
I’m wondering if you could talk a little bit about how Intel is uniquely positioned in the AI landscape. You have both some very large competitors out there, there’s a bunch of smaller ones in the space… How do you guys see yourselves and how do you approach that point of differentiation?
Yeah, so I think where we can really compete well in this space is cost and performance and availability. So when you’re running into availability limits, or don’t have the ability to rent the GPU you need from the cloud, from NVIDIA, and you need something cheaper that also runs AI workloads really well, I can say with confidence there are some good options coming out of Intel that I’ve used personally. And I come from a background in AI engineering where I was only using NVIDIA GPUs, and I have gotten to work at Intel and to try out our Gaudi products for training deep learning models… And then I’ve also gotten to run inference on our Xeon products, and even run training as well on our Xeon data center products quite a bit to do more fine-tuning.
So yeah, I would position Intel in terms of those things, just being able to have good options for out-of-the-box, performant hardware that people might not be aware of.
Could you take just a moment - just a quick follow-up to what you were saying… You brought up the big competitor being NVIDIA out there, and a lot of folks out there, just as you were, had been on their hardware, using their GPUs and such… You talked about that transition. Could you talk a little bit about what is it like coming over to Intel when historically maybe you were on NVIDIA as a platform, and you guys are now coming as a real powerhouse. What does that migration and transition look like and feel like, if you decide to go for that?
Yeah, I think I can relate in terms of my background in deep learning as I was first working with TensorFlow, and then with PyTorch mainly… Now, since I’ve been at Intel, there’s been a lot on Hugging Face and Transformers, those libraries as well. What I’d say in general is that the transition is not difficult. A lot of the same tools that I’m used to using for development on the NVIDIA platforms, I can use those same tools with some slight code modifications for the Intel products.
[00:10:10.20] So in terms of a software leap, it hasn’t been too difficult. Intel actually historically has had – we have both the upstream support into those frameworks, where we have our own developers and the community developing for Intel in the mainstream frameworks, but we’ve also in the past had Intel extensions where there are gaps. For example, Intel has had an Intel extension for PyTorch, where there’s not yet support in the mainstream framework; you can install this extra package to get all the support that you need, again, with just a couple lines of code change. But we’re constantly aiming to get our changes into the mainstream framework, so that it’s just easy for developers to use. But yeah, in terms of software development I haven’t had huge obstacles for transitioning over to different hardware.
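As an illustration of the "couple lines of code change" Ben describes, here is a minimal sketch of CPU inference with the Intel Extension for PyTorch; the toy model and input are placeholders, and the extension is assumed to be installed separately (pip install intel-extension-for-pytorch).

```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex  # the Intel extension package Ben mentions

# Placeholder model standing in for whatever you already run on other hardware.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
model.eval()

# The extra line: let IPEX apply Intel-specific operator and graph optimizations.
model = ipex.optimize(model)

x = torch.randn(1, 512)
with torch.no_grad():
    out = model(x)
```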
Yeah, and you mentioned some of those important open ecosystem projects, whether that be things that Intel is maybe driving more directly, like the OPEA stuff, or more community things… Greg, I know, being kind of a developer ecosystem manager, for your part you’re focused more on the Gaudi side, but we were talking before the show even about how our team is utilizing a lot of these great packages that actually aren’t even in an Intel repository on GitHub. They’re maybe in a Hugging Face repository; Optimum has been a big one that we’ve used, but there’s other frameworks, like TGI, that I know are important for what Intel is trying to do. We’ll, of course, get into more details later, but just at that kind of open source level, for those out there that are not only wanting to utilize these great packages, but also contribute to them, how has Intel engaged in that open source community?
Right. The key thing here is wanting to maintain as much of that connection with the open source. And this really started with Gaudi when the project was introduced four years ago, where we started with TensorFlow and PyTorch, and now the industry has moved to PyTorch and so have we. And as Ben said, our goal here is to have full PyTorch support in native PyTorch, so we’re working towards that. The same thing with DeepSpeed, and with Megatron DeepSpeed for large language models. We engaged with the full ecosystem to support those things, so we can talk specifically about our support for PyTorch.
So a customer can run their PyTorch models and migrate them directly onto Gaudi. For instance, we have a tool that takes a model that maybe was running on a GPU architecture and, in real time, migrates the code that was written for GPUs and changes it over to things that Gaudi can understand.
But the key thing is if you’re running on PyTorch, you can bring your PyTorch models over to Gaudi. If you’ve been using Hugging Face, we’ve partnered very closely with Hugging Face, and have a dedicated library called Optimum Habana, or Optimum for Intel Gaudi. And that is a dedicated set of fully performant and fully documented examples of LLaMA 2, LLaMA 3, OPT, Mistral… All the important models that people are using today for both fine-tuning and inference in Hugging Face. So if you’re taking advantage of using Hugging Face, then it’s really easy to bring those models over.
And then we look at – again, for training, we have our partnership with… We’re using DeepSpeed, specifically using Megatron DeepSpeed, which is really optimized for doing those large-scale, large language model training where we’re taking advantage of the tensor parallelism, pipeline parallelism, and data parallelism that Megatron provides, to be able to really get customers to scale and start using our product very quickly and very easily.
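To give a feel for the Optimum Habana (Optimum for Intel Gaudi) integration Greg mentions, here is a hedged fine-tuning sketch using its drop-in Trainer replacements. The model, dataset, and Gaudi config names are illustrative assumptions rather than the library's documented recipes, so treat this as the shape of the API, not a tested example.

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_name = "bert-base-uncased"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tiny dataset slice, just to keep the sketch self-contained.
dataset = load_dataset("imdb", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length"),
    batched=True,
)

# Drop-in replacements for TrainingArguments/Trainer that target Gaudi (HPU) devices.
args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,       # run on Gaudi
    use_lazy_mode=True,    # Gaudi's lazy execution mode
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

trainer = GaudiTrainer(
    model=model,
    args=args,
    gaudi_config=GaudiConfig.from_pretrained("Habana/bert-base-uncased"),  # assumed config repo
    train_dataset=dataset,
)
trainer.train()
```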
[00:14:17.24] I know one of the things that was really cool for me - and I know a lot of people have been working on this and contributing a lot, but I just love the… Because the reality for us when we were building Prediction Guard is we had a bunch of transformer-based code running, and one of the things I liked about the examples when I was trying out this stuff was you sort of had the example of loading a model in with transformers, and then kind of the after, and it was just sort of like – like, you’d see a git diff in a repo… It’s like “Hey, change this line to this line”, and then you’re basically pretty good to go.
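To make the "before and after" Daniel describes concrete, here is a hedged sketch of the kind of one-line change involved when moving a Transformers model from an NVIDIA GPU to Gaudi. The model name is a placeholder, and the Habana PyTorch bridge is assumed to be installed (it ships with the Gaudi software stack).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Before (NVIDIA GPU):
# model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda")

# After (Intel Gaudi): import the Habana bridge and target the "hpu" device instead.
import habana_frameworks.torch.core as htcore  # noqa: F401 - registers the HPU device
model = AutoModelForCausalLM.from_pretrained(model_name).to("hpu")

inputs = tokenizer("Hello from Gaudi", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```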
I remember I was actually on a plane to India during like a hackathon back last June, and had got access to some of these Gaudi processors in Intel Developer Cloud, and I remember doing that and going through those examples, and by the time I had landed in Bangalore, I had the models up and running on the Gaudi processors for what we needed to have, which was pretty cool. So yeah, great work to you all and the whole team in terms of providing some of that sort of functionality, and tying in very closely with that ecosystem.
Yeah, it’s really important that – you know, because Hugging Face is so huge and so pervasive, we wanted to make sure it was really easy for people to migrate over, and just even take advantage of the models that we have already optimized. So a lot of the work we do there is really managing – at the lowest level managing some of the static shapes, and managing the bucketing, and making sure that we have the most optimized models.
And as you said, Daniel, they’re fully documented, so it’s really easy for people to go into the repository on GitHub and you will see examples of running something as simple as doing text generation with GPT on one card, or going and running a full LLaMA 3 70-billion parameter model on eight cards. Or if you have access to more nodes, up to 16 or 32 cards. And everything is fully documented.
So like you say, you get off the plane and you’re already running. And it’s a great starting point for people to begin their development; either to take their existing model and fine-tune it with Gaudi, taking advantage of that performance, or to run inference and apply that to their applications, just like you’ve done with Prediction Guard.
So we’ve kind of dived into talking a bit about Gaudi, but I’d like to pull us back – and for those of us that are out there listening and are not familiar with it, could you possibly kind of give us a what is Gaudi and kind of introduce the whole platform in the broad, and talk about kind of where it came from, how it came about, what Gaudi is, versus maybe some of the other things that Intel has that are not specifically Gaudi, and just kind of give us a context setting about what Gaudi looks like?
Yeah, Chris, that’s a great question. So let’s talk a little bit about what Ben has mentioned a moment ago about the overall Intel product roadmap. And you look at sort of three key areas. You think about the PC, and we have the AI PC, that Benjamin was talking about. Then we have the edge, where – the edge has huge latency requirements and performance requirements. And there’s great solutions there with Xeon, to handle those low-latency on-premise requirements.
[00:17:56.27] And then the final one is really that large language model training and inference in the data center, where we’re looking at fine-tuning and pre-training models, as well as running inference on large batch sizes, or dealing with an application where we have many, many users trying to take advantage of it. And the reason why we have Gaudi, and the reason why Gaudi exists, really is to give the ecosystem a choice.
One thing we’ve been hearing from customers over and over is they want an alternative to the standard mainstream GPU solutions, because of cost and because of availability. So Gaudi really is that low-cost alternative to the standard NVIDIA GPU solutions that are in the market today.
A little bit of history - Habana was an independent company, and Intel really saw the value in the product they were building, and its performance. So Intel made an initial investment in 2016, 2017, and then fully purchased the company in 2019. So it was a small company, and they really needed to scale. So Intel invested in the company and brought them inside of Intel, and brought many Intel employees - I was one of them - into the structure within the Gaudi team. So we started to build the product and started to ship. And the first real milestone there was launching the first generation of Gaudi on AWS. So today that’s available in the DL1 instance on AWS, and it’s still available today. And the next step was Gaudi 2, which launched a year and a half ago.
And one of the real key milestones with Gaudi 2 was the submission for MLPerf on training and inference. For those in the audience that may not know, MLPerf is a set of really standardized benchmarks that the larger ecosystem uses today, and the way that you run those MLPerf benchmarks makes it easy to have a direct comparison from product to product. And one of the really key benchmarks was the large language model training benchmark for MLPerf. And Gaudi was one of the only products other than NVIDIA’s that submitted an actual result for MLPerf, showcasing the performance.
So why Gaudi? What is Gaudi? Is Gaudi a GPU? No, Gaudi is not a GPU. It is a dedicated AI processor built to train and run inference on today’s largest and most complex workloads.
Could you differentiate a little bit for me as we go here, when you say a dedicated AI processor versus a GPU - could you kind of distinguish between those?
Well, in some cases, a GPU still has the ability to do some other things. It’s got additional programmability to do other types of workloads. Whereas Gaudi as a dedicated AI accelerator is built specifically for AI training and inference. So we don’t have those additional – it’s specifically built for AI. So it’s a product where if you’re wanting to go run these workloads in the cloud, on the edge, or in an on-premise solution, in your on-premise data center, Gaudi really is that low-cost solution.
As an analogy with a potential competitor that people may know, just to transition, Google has their TPUs, which sound sort of similar to that, where it doesn’t have all the extra stuff that a GPU has. And I get Gaudi is its own thing, but just for people to make that connection - is it a little bit similar to that?
[00:21:38.18] Yeah, somewhat similar. And one more thing that really is a great differentiator in Gaudi is as part of the hardware architecture, we provide two things. One is 96 gigabytes of on-board, on-card HBM memory. And for those in the audience that really know about training and inference, having that local HBM memory is critical for storing your weights or your parameters, the things that actually are stored when you’re actually running the inference or running the training. So having that large HBM memory allows you to do one of two things. You can either run a larger model on a single card, or you’re more efficient as you scale out to more cards.
And the other thing that really is a key differentiator for Gaudi is the on-board networking. So on die, Gaudi offers 24 x 100-gigabit Ethernet ports. So instead of relying on the latency of having to use a third-party network controller to scale to multiple nodes in a rack, for example, these dedicated Ethernet ports allow, in some cases, a direct all-to-all connection for eight Gaudis in a node or a single server, and then the additional Ethernet ports can go to the other nodes in a rack, and then scale out through a switch to multiple nodes, and then a full pod. So what you end up with is a virtual all-to-all connection, which significantly improves the scalability when looking at large workloads.
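On the software side, scaling across those interconnected Gaudi cards typically goes through PyTorch's usual distributed APIs with Habana's HCCL backend. Here is a minimal, hedged sketch of that setup; the Habana packages are assumed to be installed as part of the Gaudi software stack, and launch details (torchrun, mpirun, etc.) vary.

```python
import torch
import torch.distributed as dist
import habana_frameworks.torch.core as htcore        # registers the HPU device
import habana_frameworks.torch.distributed.hccl      # registers the "hccl" backend

# Each worker process (one per Gaudi card) runs this; rank and world size usually
# come from the launcher's environment variables.
dist.init_process_group(backend="hccl")

rank = dist.get_rank()
tensor = torch.ones(4, device="hpu") * rank

# All-reduce across all Gaudi cards, over the on-card Ethernet links described above.
dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
print(f"rank {rank}: {tensor.cpu()}")
```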
Ben, you were talking a little bit about your prior work in deep learning, in training models, and doing inference, and how you’ve transitioned a bit, and also done some projects with training on Gaudi and various other hardware that we’ve already talked about… Could you speak a little bit more to that practical side? Just to give people a sense of the kinds of projects that are possible on this hardware, just so they can form in their mind kind of both the scale and possibilities with what can be done.
Yeah, and my use cases might not be the full expanse of what’s possible on our hardware, but I can certainly speak to the things I’ve been able to work on and had some fun with over the last few years. So one of them is working with OpenAI’s Whisper model, which is a translation and transcription model that works very, very well out of the box. I speak both English and French, and so I tested the abilities of this model, to transcribe my own voice English to English, and then also its ability to transcribe from English to French, and then going the other way as well. And yeah, first of all, the model does a really great job. I was really impressed with the fidelity of the model that’s been trained, that OpenAI has released in the open source. So that’s great.
And then what I was able to do was to run this model on our Xeon product line, to run inference on Xeon and run that really, really quickly, and kind of put together a notebook and a workshop around that. So that’s been something that’s been fun for me to work on, on this generative AI model that is used for translation and transcription. Yeah, that’s one.
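As a rough sketch of what that kind of Whisper inference on a Xeon (CPU) host can look like with the Hugging Face pipeline API - the model size and audio file name are placeholders, and this is not necessarily the exact setup Ben used:

```python
from transformers import pipeline

# Run Whisper transcription on CPU (e.g. a Xeon server); no GPU required.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",   # placeholder; larger variants trade speed for accuracy
    device="cpu",
)

result = asr("my_recording.wav")    # hypothetical audio file
print(result["text"])
```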
Another one that has been exciting for me to work on is with my – I have a background in imaging and computer vision, kind of prior to working at Intel, where I was applying a number of different techniques like pixel segmentation and image classification, object detection to different problems in computer vision, especially around geophysical imaging, where we’re imaging the subsurface. I kind of describe it like an ultrasound for the earth. So if you know what an ultrasound is, looking at imaging, the subsurface of the earth to extract and find certain minerals, and that kind of thing. So where I’ve applied some of the techniques there is to help get some of those images more quickly, and both in fine-tuning models and also in inference. And so I’ve been able to use some of our Xeon, again, product line to fine-tune some of those models, as well as run inference. So that’s been exciting.
[00:26:12.10] I’ll add this too, just to add onto that… One area that’s really getting a lot of focus now is the use of RAG, retrieval-augmented generation. We’ve invested a lot of effort there, and you’ll see that in the OPEA project, where we have a lot of RAG-based examples. But not only for just general RAG usage, which is important, but also transitioning into multimedia. So we’re seeing a huge request for – and we see this in the market, for both video, audio and text all coming together. So whether it’s a prompt of text to get a picture, or a prompt for text to get now a video, or to use a RAG type of usage to be able to parse through not only text, but parsing also through video, and then get a response. So those are all types of things that we have available for use on Gaudi, in the products.
Yeah. And actually, that’s something I’ll pick up on too, because - yeah, I’ve been working on some multimodal models on the AI PC actually, running some smaller, 2-billion parameter multimodal models, and actually successfully ran one on the NPU, the neural processing unit, to run inference and to extract essentially some information about an image into text. So using that multimodal capability.
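Targeting the AI PC's NPU is typically done through OpenVINO (linked in the show notes). Here is a minimal, hedged sketch of compiling a model for the NPU device; the model file is a hypothetical exported OpenVINO IR, not a specific model Ben mentions.

```python
import openvino as ov

core = ov.Core()
print(core.available_devices)  # on an AI PC this typically lists CPU, GPU, and NPU

# Load a model previously exported to OpenVINO IR format (hypothetical file name).
model = core.read_model("image_to_text_model.xml")

# Compile it for the NPU; falling back to CPU keeps the sketch runnable elsewhere.
device = "NPU" if "NPU" in core.available_devices else "CPU"
compiled = core.compile_model(model, device_name=device)
```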
Break: [00:27:09.09]
So Ben and Greg, we’ve talked about a lot of interesting things, both on the kind of software and migration side, but also on the hardware and what that hardware enables… People might though be out there and be wondering “This is cool, I’d love to experiment with this stuff. How can I get hands-on with some of this hardware that I’m hearing about?” What are some of those ways that people can – they might have an Intel processor in their PC, or likely many do, but they don’t have a Gaudi or a Xeon sitting around at the moment… So if they were to want to kind of explore this ecosystem and get hands-on, try some things, how might they do that?
Yeah, I can start, and then Greg, you can fill in anything that I’m missing. But I think the best way is to get onto the Intel Tiber Developer Cloud, which is kind of our Intel cloud where developers can come and try out both our hardware and our software that’s set up right there for them. We’ll offer kind of our latest, whatever we have on that platform. So we have our Xeon product, we have our Gaudi product, and we’re even going to have a dev kit for the AI PC, a simulated environment; even though it’s not a local machine, it’s still in the cloud. So that’s, I think, the best way to get started, is the Intel Tiber Developer Cloud. And Greg, did you have anything to add there?
That is the best way. We’re going to have pretty soon some free access to Gaudi. So right now you need a developer credit. We’re working to get some nodes available for free on the developer cloud, so people will have the ability to try out our tutorials and examples and code examples, and be able to see and experience the Gaudi usage, and see how easy it is to run.
Yeah, that’s awesome. And I mentioned earlier some of the experimentation on my end even, on the plane, but a lot of that was enabled by this developer cloud. And I think there are – you all can correct me if I’m wrong, but people can sign up on the site, get access to… And there’s also some like training resources. People can spin up notebooks, try a variety of things… Maybe if they’re not as familiar, they’re learning, get access to various things.
But there’s also a kind of transition to, within that environment, utilizing these powerful products in a production sense, or in an enterprise sense, rather than just a kind of developer experimentation sense… Which is definitely the transition that Prediction Guard has taken. So we’ve been able to operate in a very price-performant way, at scale, with our LLM and GenAI platform, on top of Gaudi, and running that in the Intel Tiber Cloud.
So I don’t know if there’s anything you’d want to highlight, whether that be kind of success stories, or just commenting on that kind of transition to production, that there kind of are people running this stuff in production, not just in a kind of developer experimentation sense. Anything you’d want to highlight there?
Yeah, Daniel, that’s a great point. The Tiber Developer Cloud is really meant to do two things. One is just to give people the access to our products, so they have the ability to experience and run them and test them. But Daniel, to your point, it is also a place for people running a business to be able to have really easy access to our products as well.
And specifically with Gaudi, we’re enabling customers now, at very large scale, to do full production workloads and use the Tiber Developer Cloud as a baseline for their business. So we invite those listening to reach out through your sales contacts at Intel, and talk about how we can help those using this for business to scale it as a real product.
And also, as we look to Gaudi 3, which is our new product, that we’ve announced at previous events, and we’re going to make a very large announcement at Intel Innovation in September - that’s also going to be a place where we’ll see Gaudi 3 also begin to scale. So it’s definitely a place where it makes it easy to partner directly with Intel, or with partners, in the future.
[00:34:22.20] Yeah. I’m curious whether it is existing, or maybe a roadmap item - could you talk a little bit about Gaudi at the edge? When you’re not in the cloud, and you’re out maybe wanting to use Gaudi in devices that are out there, or platforms that are out moving about… What’s the roadmap look like on that?
Right. So today it is an OCP-compliant part. It’s on a mezzanine card, so it’s meant for data centers. So today, the Gaudi platform is in a 6U or 8U rack mount server. That’s the form factor it has today. And that OCP form factor spec is 8 Gaudis on a single baseboard. And as you may have noticed at Intel Vision a few months ago, we announced full packages that you can buy. So you can buy a Gaudi 2 baseboard, or a Gaudi 3 baseboard - the full baseboard, OCP-compliant - so you can drop that into a chassis from Supermicro, or Wiwynn, or other vendors. But in the future, we’re also going to have a standalone PCIe card that will be available. And that’s going to be exactly for those types of more on-prem, sensitive solutions - Chris, like you said, at the edge - where people can take advantage of a single Gaudi 3 and its capability on the edge. So you’ll see that PCIe card coming soon.
Yeah. And maybe that’s a good transition to talk a little bit – we’ve talked a lot about the things that are kind of the now, of what’s available, and kind of tooling or hardware-wise with Intel… But both of you have alluded to the future, to one degree or another… So maybe, Ben, I’ll start with you - I know you mentioned certain things, whether it be that Falcon Shores, or some cool things that are happening with AI PCs and that sort of thing… But what kind of strategy-wise and kind of positioning-wise is Intel really thinking about and investing in moving kind of into the next phase of what AI is becoming, and where Intel thinks the market is going, I guess?
Yeah, thanks. Probably the best place to start on this, and what I’m excited about, is Falcon Shores, like you pointed out. So Falcon Shores will be kind of the culmination of combining the Gaudi product line with the GPU, with our current Max Series GPU. And it will be a GPU, a graphics processing unit, so it will be a full GPU capable of not only the AI workloads, but also graphics and other applications that people want to use GPUs for.
So I think that’s probably the most exciting thing that we’re kind of aiming at, that we know the market needs. And then yeah, iterations on, as you mentioned, the AI PC. In the future we’ll also have a new, more powerful AI PC with the Lunar Lake chip coming out, and again, it will include the CPU, GPU, and NPU, but it’ll be a much more powerful form factor with more memory, where developers will get even more out of their local machines. So yeah, those are a couple of things.
And then the other thing that I’m excited about just as an AI software developer myself is the integrations with – like I pointed out at the beginning, the integrations with PyTorch. I think it’s huge that we’re aiming at getting all of our optimizations and everything we can into this framework, that is by far one of the most popular deep learning frameworks, and one that I use on a regular basis.
[00:38:14.29] So that’s really exciting to me as well, that just natively I’ll be able to work with PyTorch and say “Hey, I want to use the XPU”, which is for the Intel GPU. And same with the AI PC, just have that direct integration. So yeah, those are a couple of things that are exciting for me coming out soon.
Yeah, and I’ll add to that, to say - you know, the key thing here is forward compatibility. So people that are using our products today - so I’ll specifically speak to Gaudi. If you’re running workloads on Gaudi 2, you’ll be able to run those workloads directly on Gaudi 3, and that same architecture will move forward into Falcon Shores. So people that make their technological investments now, those will remain viable and relevant far into the future.
Yeah. And I guess one other piece of this, which - Greg, you mentioned kind of in passing. One of the things that people are really interested in with Gaudi, of course, is the fact that there’s some diversity in the market, and there’s another choice for hardware out there… I think one of the other interesting things that I don’t know if – I know you two are kind of only in pieces of Intel, and focused on certain things, but I’ve found it really interesting how Intel is very much investing in chip production kind of diversity as well, in various geographies around the world, as a key part of their business. And I don’t know if you have any comment on that, or thoughts on how that influences the market as a whole, and availability, or supply chain sorts of robustness that that could build in… But I know we’ve seen some interesting things over the years, both in terms of availability and supply chain issues with hardware.
Yeah, and you can talk about – let’s talk about this at a macro level and maybe at an AI level. At a macro level, you look at what our CEO, Pat Gelsinger, has talked about when we promoted the CHIPS Act, for example, that we need - Daniel, to your point, we need to be able to build that infrastructure worldwide. So you can see from an Intel perspective, our investments in our fab in Ohio, and our fabs now in Germany, as well as just our fabs worldwide; we have that worldwide capability to support significant growth and expansion as the world continues to need more and more silicon.
From an AI perspective, the growth and the need for AI compute is insatiable, and it will continue to be that way for the foreseeable future. So again, this goes back to why we’ve invested in Gaudi, and brought that as a product as part of Intel, and as part of just growing the broader AI portfolio. We really want to be able to give the ecosystem an alternative to getting access to AI compute as they need it today.
So as we start closing up, I would really love to hear kind of, from each of you - and you guys can decide who wants to go first, but… Where do you see it going? Where Gaudi is going, and where the overall ecosystem and these technologies are going. This is a moment where you can kind of take a little bit of poetic license and speculate a bit. I would love to see what you think will unfold and happen in the times ahead, and how each of you may see it a little bit differently as individuals.
Sure. Yeah, I can start. So I think - yeah, just looking at kind of AI broadly, and what’s happening, it seems like things progress with these incremental changes. And sometimes there’s a leap, but there’s often just these incremental changes, and one of the questions I often get from friends - maybe you guys do too, as you’re working in AI - is AI going to take over, and are we going to have kind of robots controlling everything we do? That seems to come up a lot as I say that I work in AI. Sometimes I regret saying I work in AI, and I should just say software engineering… [laughs] But no, it’s always an interesting conversation that I have with different friends.
[00:42:35.11] My perspective is that it’s like the internet, the coming of the internet. AI has come along and has changed the way we work, and changed the way we operate. But just like the internet, we have people behind building these things. And as these technologies evolve, we will have safeguards, and we will have things in place to help regulate the different technologies of AI that come out. And lately, with the AI agents - I think it’s one of the most recent things, where you have the agent kind of do more things for you than maybe previously, where you had to ask it to do more things. So that’s been a really interesting part of AI that’s kind of come out, and that I think is going to see a lot more adoption in the future… To answer your question about the future - I think just getting the AI tools to build more things, and to be able to kind of do more complex tasks in sequence is something that’s evolving and happening. Yeah, Greg, did you have some more to add?
I’m so excited about the personalization of AI. I see cases where now things we didn’t have when we went to college… Now you can have your phone or your AI PC open in your college classroom, and the AI will summarize and create notes for a lecture, or quiz you on a lecture, and create all that content for you automatically. I love that.
I want to see AI do better with my email, and be able to organize my email better, do searches better, make my life better. Those are things I’m really excited to see from a general perspective.
From an Intel perspective, I think you’re going to continue to see us lean in on giving customers what they need, which is being able to have more and more compute for fine-tuning, and for inference, either on-prem or on the edge supporting the world’s largest models. And as we see innovation happening on a monthly basis - it used to take a year, and now we’re on a monthly basis. We will keep up with that innovation. Just as an example, Meta just launched their LLaMA 3 400-billion parameter model in the market last week… So we’ve already been running that model, and we support that model, and we wrote a blog on it a couple of days ago… So we’re going to continue to support the most bleeding edge, latest and greatest technology that’s coming out, again, on a monthly basis.
Thank you both for taking time out of a lot of things going on in a fast-moving ecosystem to come and chat with us.
And I would definitely recommend to all the listeners to check out some of the show notes and the links, and go try some things hands-on, and have some fun, and start building. Thank you both, Greg and Ben. I appreciate you taking time.
Yeah, thank you. Thank you, Daniel and Chris.
Thank you.
Our transcripts are open source on GitHub. Improvements are welcome. 💚