Dinis Cruz drops by to chat about cybersecurity for generative AI and large language models. In addition to discussing The Cyber Boardroom, Dinis also delves into cybersecurity efforts at OWASP and that organization’s Top 10 for LLMs and Generative AI Apps.
Sponsors
Speakeasy – Production-ready, enterprise-resilient, best-in-class SDKs crafted in minutes. Speakeasy takes care of the entire SDK workflow to save you significant time, delivering SDKs to your customers in minutes with just a few clicks! Create your first SDK for free!
Fly.io – The home of Changelog.com — Deploy your apps close to your users — global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.
Shopify – Sign up for a $1/month trial period at shopify.com/practicalai
Chapters
Chapter Number | Chapter Start Time | Chapter Title | Chapter Duration |
1 | 00:00 | Intro | 00:34 |
2 | 00:35 | Sponsor: Speakeasy | 00:53 |
3 | 01:36 | Future AI in medical uses | 01:51 |
4 | 03:26 | Welcome Dinis Cruz 👀 | 00:34 |
5 | 04:01 | What is OWASP | 03:09 |
6 | 07:10 | It's nice to make the world safer | 04:43 |
7 | 11:53 | Blurring line between building and using | 04:43 |
8 | 16:49 | Sponsor: Fly | 03:06 |
9 | 20:13 | Security in and from AI | 06:55 |
10 | 27:08 | Most orgs shouldn't build models | 02:01 |
11 | 29:20 | Sponsor: Shopify | 01:32 |
12 | 31:08 | What makes good or bad? | 07:29 |
13 | 38:37 | Approaching different workflows | 03:36 |
14 | 42:12 | There be dragons | 03:54 |
15 | 46:06 | Speculating on AI | 03:56 |
16 | 50:02 | Wrapping up | 00:23 |
17 | 50:29 | Outro | 01:05 |
Transcript
Play the audio to listen along while you enjoy the transcript. 🎧
Welcome to another episode of Practical AI. This is Daniel Whitenack. I am the founder and CEO at Prediction Guard, and I am joined, as always, by my co-host, Chris Benson, who is a principal AI research engineer at Lockheed Martin. It’s good to record one together, Chris. There’ve been a couple where we’ve been apart…
That’s right.
Yeah, it’s good to have the band back together, I guess.
Absolutely. I took a brief respite. I had some lower back surgery, and since I’m always thinking about technology and AI, I was trying to imagine what the surgery might be like today versus if I had done this – if we were in a slight time warp and gone to the future.
Did they have a robot arm?
Exactly. If they did… Reinforcement learning, combined with kind of a chat interface, and the whole thing. So it was kind of a novel mind experiment there that I had, about what would it be if I was – you know, this was a little bit different time. But yup, all good for here.
Yeah, well, I would hope that in all of our futures, as we have medical procedures done increasingly by AI or AI-assisted, and other things like that, that those things are very secure, as they happen; that would be very important, I think.
Yes, absolutely.
And on that front, I’m really excited, because Chris, on this show I’ve actually referenced the OWASP Top 10 for Gen AI, that sort of risk white paper, a couple of times. It breaks down security and privacy-related risks into different categories, and helps people think about it. This is a collaborative thing, with multiple organizations involved… But today we’ve got with us Dinis Cruz, who is the founder of The Cyber Boardroom, but has also been involved in OWASP in various capacities over the years, and is aware of all this that’s going on, and contributing to it. So we’re just super-excited to have you with us, Dinis. This is one I’ve been really looking forward to.
Thanks for inviting me. This is a topic that I’m very interested in at the moment, because I think there’s a lot of potential, there’s a lot of dangers too, but I think it’s very exciting times for us as an industry, in all sorts of levels.
Yeah. Well, like I mentioned, you’ve been involved in OWASP in different capacities… Give us a little bit of background, and first off what OWASP is, and what that means for those that aren’t familiar, but also, as you kind of started seeing this gen AI stuff come about, how did it strike you from that standpoint of being involved in OWASP over the years?
Okay, so OWASP is the Open Web Application Security Project. And you kind of can say that OWASP is the people around the world that kind of cared about application security, that found that sort of intersection of the world moving to apps, versus networks, and then the security elements of it. And I think it’s definitely one of those organizations that has grown from nobody really caring about it, to some more people caring about it, and now it’s being referenced left, right, and center, and has really changed the world in very, very nice ways. And it’s always attracted, I think, the people that want to do better. I think it’s a lovely community, it’s very open, by its nature, but it’s also – I think, it’s been a great hotbed of innovation, in terms of the first OWASP Top 10, the testing guide, a lot of amazing tools have come out of it, OWASP ZAP, Dependency-Check… There’s bazillions of tools. I developed a couple, too. And it’s a great, great community.
And I think Gen AI - it’s an interesting evolution of our technology, even from a security point of view, because I think for the first time we have a technology that kind of understands intent. And I think that’s quite different. Now, I want to say right off the bat that I think what Gen AI is doing, ChatGPT and everything, is actually making most of what we always talked about in OWASP even more important. Because if you think about it, Gen AI is an API. It’s fundamentally an API that you send some data in, you get some outputs out, and it is [unintelligible 00:05:56.24] to connect internal systems. And in a way, it’s your most vulnerable and your most dangerous API at the same time.
[06:05] I would say that the thing that is quite unique with Gen AI is the fact that you’re now sending English, or Portuguese, or French, or whatever language, or Klingon, whatever language you want to talk to, that is now code. And that’s quite different.
For the ones that have been doing security for a while, we always talked about securing code and data. If you can separate code and data - we know what code is, we know what data is - you actually can secure things quite well. Now, you’re in a world where data is code. So that introduces a huge amount of interesting challenges. And I think it’s one of those technologies that in one end it’s going to create a lot of problems, but on the other end I think it has the potential to solve some of the biggest challenges of scale that we always had in security and in engineering, and in development, with that technology, too. So I think it’s going to add up to both sides of the universe there.
It’s interesting, I actually want to go back to something you said right as you were starting off a second ago, and build on it a little bit… That like you kind of alluded to, for a long time, over the years, security was always kind of the redheaded stepchild of application developers. There was a lot of folks that just didn’t care too much… And obviously, we’ve progressed along, and we’re talking about Gen AI and things like that, what we’re doing now. But going back, I’m just curious, as we’re talking about security in general in AppDev, even before we get to AI being introduced into it, what has motivated people to get into the security side, in your view, as you’ve watched this? Because I too have seen the security concern go from being kind of a backseat thing to being very front and center, and very important in a lot of organizations. And that’s been an evolution. And as you’ve been in the middle of this ecosystem, how have you seen that happen, and why? Where’s that progression come to?
Well, so I would say, from personal experience, and I think from most of my peers, there’s a thing of it’s nice to make the world safer. There’s definitely a [unintelligible 00:08:05.29] it’s nice to do something that the outcome is to make people safe. And I’ve always liked to stay on the defensive – you know, yes, a bit offensive, but making things/system secure. So I think intellectually it’s really cool, I think from a technology point it’s really cool… There is a thing of hacking and breaking into stuff, which is also quite interesting. There’s a little bit of a James Bond, hacking into a big system, and getting there, and all that jazz… Which I think is part of the evolution.
So I think it’s a really cool industry, it’s very open, it’s very welcoming… It’s also very pragmatic. In the beginning, it’s like “Can you exploit it or not? Can you do this? Can you do that?” And I have to say, we need to also be pragmatic and say that for a long time, it didn’t really matter. In a way, there was a period where you knew who was being attacked, because they were the companies who had the best security teams. Literally, there was a one-to-one correlation. That team does security very well? Cool. They’ve been attacked. They’re on the evolution. And I’ve now been a CSO, I’ve worked at a lot of companies, and there’s definitely – I tend to break it into three different worlds. There is the nation state, the high-level espionage, there is the ideology attacks, and there’s the commercial attacks. The reality is that this one is a completely different level. If you’re dealing with nation state, you know about it. You have a hell of a team. And we can do it, but it’s a very different game. The ideology - it depends on the industry you’re in. Some industries are much more toxic. They tend to attract that. Most of it has to do with the business model of the attackers.
So what has happened in the last 10, 15 years is that the more we move our society into digital, the more the business model of the attackers evolves. So in the past, you could get away with a lot of things. You could get away with crazy insecure applications out there, and the probability of you being attacked was quite low. Now you kind of can’t. So I think there’s been that evolution of, in a way, the stakes are much higher.
[10:03] Now, even the recent CrowdStrike problem - which I still think is a security problem, right? We can talk about it if you want. But that just shows how one problem in technology is going to bring down large chunks of society.
In the old days it was the Nimda, the IIS worms that brought down a chunk of technology. But at the time, the impact was limited. Now, it really makes a difference. Now a cybersecurity problem in an organization of a medium size can break the company, can cause really severe disruption. Even a ransomware incident in one of your suppliers can actually cause a lot of problems in your world.
So the application security and OWASP and everything, we have, in a way, matured with the kind of evolution of the market. Although I have to say that I still feel that most companies - it’s still a marketing exercise. It’s a controversial view, but I think we still don’t have a good way to measure the cybersecurity preparedness of companies. So there’s still a lot of companies that kind of roll the dice and go “Well, I hope it goes okay.” Although much less than before, but I still feel there’s a lot of maturity that we need in our own industry to do that.
And this is the final point… The thing I also learned a lot was that cybersecurity is not in isolation. Cybersecurity is a side effect of engineering and business practices. So there’s a moment where if you want to fix cybersecurity problems, if you want to fix application security problems, you have to fix engineering problems. And that leads to that. If you do really good engineering, if you do really good development, really good practices, half of what we talk about in application security is not needed, because you just do it. It’s just good practices. The problem is a lot of companies, a lot of teams don’t do that. So the side effect is the security problems.
I’m wondering, when you’re talking about this sort of business practices, the engineering, where these security vulnerabilities pop up, and that sort of thing - part of what comes to my mind on the Gen AI side is the fact that it brings the domain experts a lot closer to the actual technology, in the sense that often there are tools or there are interfaces that people are using to create sort of chains of reasoning just with prompts and other things, just with natural language. So they’re not – to your point, I think now this sort of natural language is kind of like the code of programming these models. How do you think that that shifts the dynamic between the engineering side and the business side? Because from my perspective, a lot of these domain experts and business people are coming much closer to the actual kind of core functionality of an application.
And I think that’s the insane opportunity. So first of all, I don’t view that the real power of ChatGPT is that it can create things. The real power of ChatGPT is that it can understand context, it can understand data, it can create mappings and relationships. And what is in practice – the way I visualize what you described is that in the past everything you wanted to do had to be coded. It had to be solidified in bits of code, which is also where vulnerabilities were created, but also where, in a way, we were locking down the business logic into code, microservices, structs etc. And then we got bogged down by the complexity of that. And then we couldn’t understand the side effects, and then things ground to a halt… And then again, security vulnerabilities appear in those gaps. It’s those gaps, it’s those misunderstanding of how APIs work, or an API that was created over here, in a secure state, that is now plugged to the internet… You know, maybe not the best move…
But I think we’re now entering a phase where you can have a layer where the business logic is now described in prompts. And what happens when you describe the prompt, you start to describe the intent of what you want to see happening. And instead of that intent being locked in code, which is really hard to change, it has a huge amount of things, side effects, you can have that in prompts.
[14:08] Now, we need to be better at understanding the prompts. I’m doing a lot of work on provenance, deterministic AI, to make sure that we actually have AI outputs that are deterministic, but it’s very powerful to be able to start to describe that. I’ll give an example. If you go to any organization – when we do threat modeling… So one of the practices in security is threat modeling. Threat modeling is just social-engineering the other side into giving us an architecture diagram that is up to date. Because once you have that, you find vulnerabilities, because you start asking questions, “What about this? What about that? What about that?” In a way, the hidden elephant is that most organizations have no idea how the app works. They have no idea how things are actually connected, the guys who developed it have gone… Now imagine a world where we start to use Gen AI to explain how it works. Imagine a code commit that goes to a Gen AI layer that I can ask the question “Did this change the attack surface? Did the attack surface of my organization change because of this?” And they say, “Oh, if it did, then we need to do a review. If it didn’t - cool. Go all the way to production.”
Now, these questions in the past were impossible to ask. Or you had ridiculous static analysis that just about didn’t work. Now we can start to codify a lot of those business logic questions, a lot of those intents of what I want to do in basically simple language, instead of code. Because imagine, if you say “I want to do a change that allows anybody from the internet to access every record from this company.” You’ll probably go “Ooh, hold on. I’m not sure that’s a good idea.” Or “I want this to allow an attacker to run and control JavaScript on my clients.” Ooh, maybe not. I want my secrets to be put in the source code. Maybe not. But at the moment, all those things get locked into code. And it’s hard to understand the side effects, because nobody arrives one day and says, “I’m going to create a security vulnerability.” Unless they’re malicious. But apart from those, a lot of times it’s just genuine mistakes.
So I think what GenAI gives us is the ability to start thinking a lot more like three-dimensional… In our questions, but also in our applications, where we start to describe the intent of what we want to do. And that can be analyzed. In fact, that can be analyzed by other Gen AIs, so we can start to get a much better sense of actually what is happening.
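To make the commit-review idea above concrete, here is a minimal sketch, assuming an OpenAI-compatible chat API; the model name, the prompt wording, and the yes/no gating are illustrative, not a prescribed OWASP workflow.

```python
# Sketch: ask a model whether a commit changes the attack surface, and gate
# the pipeline on the answer. Model name, prompt, and gating are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def changes_attack_surface(diff: str) -> bool:
    """Return True if the model thinks this diff changes the attack surface."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # any capable chat model
        temperature=0,         # keep the check as repeatable as possible
        messages=[
            {"role": "system",
             "content": "You are a security reviewer. Answer only YES or NO."},
            {"role": "user",
             "content": "Does this change alter the application's attack surface "
                        "(new endpoints, auth changes, new outbound calls, secrets, "
                        "input handling)?\n\n" + diff},
        ],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")

diff = open("commit.diff").read()  # e.g. produced by `git show`
if changes_attack_surface(diff):
    print("Flag this commit for a human security review")
else:
    print("No attack-surface change detected; continue the pipeline")
```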
Break: [16:37]
So Dinis, you said something that was pretty intriguing to me, which I think is maybe a kind of distinction that is maybe interesting to draw out and get your thoughts on, which is this idea that when you think of kind of the intersection of cybersecurity and AI, you could come at it from two perspectives. So you could come at it from “How can we use AI to help us in our cybersecurity tasks, or to create new tools for cybersecurity”, and then the other side would be “Well, how do we operate AI systems in a secure way?”
So there’s probably some interaction between these two things, but could you give us a sense, from your perspective as an expert in this field, and also seeing a lot of things so far, how do you see the kind of maturity of these two sides of that coin? Anything you’d want to highlight on either side of that in terms of how both things are progressing, at least at the state of where we are now?
So you’re saying that the difference between using AI to sort of build systems and do things, and then using one of those outputs is the cybersecurity analysis, right? …of what you have.
Yeah, I could imagine there’s ways I could use AI to fight cyber crime, for example, or to prevent malware, or like you just said, to help explain applications. So that’s using AI to help you create more secure systems. Whereas there also could be just your AI system is insecure in and of itself, and how you’ve deployed it and run it.
Yeah. And just on the second one, I think if you’re not careful, most AI deployments are ridiculously insecure. In fact, we have to take into account that we still don’t have a good understanding for how the models work. So the reality is there’s nobody today that can tell us that these models don’t have ridiculous backdoors in there. Even non-intentional. Even maybe just the way it works.
When we started this, people thought that a string copy was okay. People thought that a little catch between a memory copy in the OS was okay. And then we realized that you can drive buffer overflows, ridiculous exploits through it. So I think we’re in a nascent state at the moment, in the early days of understanding everything you can do with a model.
So my kind of view on this is that the models you want to use on that “how do we use models securely” side should be read-only, should not learn; you don’t want them to bring almost any content. You want to give them the content, you run them in complete isolation, and you assume that whatever you put on it is already exposed, and you verify the hell out of what comes out of it. And I think there’s a lot of companies who are rushing into pushing models. The problem is that they’re not taking into account that the models themselves are ridiculously powerful. And this is where you want to – imagine that somebody can put a payload that is then executed by a model, and that model sits now in the middle of your organization, in your cloud, in your environment, and probably has access to APIs or other assets. That is ridiculously dangerous. And that’s what we’re doing.
So I think in one end, I think we need to be very careful in putting models in line, in how we actually validate the inputs and the outputs, which is kind of why I view them in multi-tier sort of flows. And on the other hand, when we use them in a safe way, they’re ridiculously powerful, because going to your first form of how to use them for cybersecurity, what I really like is that I always felt that the model for cybersecurity is a model based on the attacker making a mistake. It’s not about you protecting everything, it’s about you almost – you want the attacker to make a mistake, i.e. make a call that was not supposed to happen, make a download, make a connection, access the application in ways that no user will access it, call web services that’s completely out of sequence…
[24:18] In the past, again, it was impossible to model this. We tried squeezing technologies, even people, and we created ridiculous installations [unintelligible 00:24:24.20] They really struggled at that. But I think we now have a good chance of doing that. So that means that we can now create much more, I would say, hostile environments for attackers, because we force them to follow the paths of the users… Which, by the way, they don’t know what those paths are, unless they’re already in your system. So I think we have a chance of using that, but what we need is we need models that are really, really reliable. So OWASP has an amazing top 10 for applications, and it has a really good top 10 for Gen AI models, LLMs. What I think about is most of that is trying to deal with the fact that the models can learn, and the models don’t have deterministic outputs.
And I like the idea of actually turning the tables around, and say “Hey, I don’t want my model to learn. I want the data that my model has access to be completely determined by the session and the state that request comes in”, which is normal app sec. And then ideally, I don’t even want the model to have knowledge. I want to give the model the knowledge that it’s going to use so I control hallucinations.
Chris, you talked about your operation. You don’t want the Gen AI doing the operation on your back to suddenly go off piste, right?
Definitely not.
…and start doing an operation on your leg. So in this theory, you want, for example, the Gen AI model that is facilitating your operation to only know about back stuff. Or maybe to know general things about the body, understand that, but the domain knowledge that it has should be laser-sharp-focused to the situation that you’re in. And then that’s how you control hallucinations.
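One minimal way to sketch that “the model only knows what the session gives it” idea, assuming the OpenAI Python client (the same pattern works against a local model behind an OpenAI-compatible endpoint, such as the one Ollama exposes); the model name, prompt, and refusal convention are illustrative:

```python
# Sketch: a stateless call where the model's knowledge is exactly the documents
# attached to this session/request, and it is told to refuse anything else.
from openai import OpenAI

client = OpenAI()

def answer_from_session(question: str, session_docs: list[str]) -> str:
    context = "\n\n".join(session_docs)  # knowledge injected per request, nothing retained
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; could be a local model behind an OpenAI-compatible URL
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Answer ONLY from the provided context. "
                        "If the answer is not in the context, reply 'not in context'."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(answer_from_session(
    "What is our patching SLA for critical CVEs?",
    session_docs=["Policy excerpt: critical CVEs must be patched within 72 hours."],
))
```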
So I think the fundamental problem that we actually went backwards in security is that we now don’t have a separation between code and data. And that’s – I don’t think we speak enough about this, because for me, that’s a massive problem. It’s a massive problem because we really need to be able to distinguish what is code, what is data, what’s an input, what’s a command.
So I’m doing a lot of stuff where I go from JSON to JSON, and the latest models do this better, where I almost want an API coming in, I give that API - which I can form nicely with data validation and stuff - to the model, and then the model output itself is an API that is completely strongly typed, so I understand the output, if that makes sense.
It does, it does. On the side, I’ll just say, I might actually need that operation on my leg too, but I would prefer it was a separate operation, that we also planned out. Just to note it.
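For the JSON-in/JSON-out pattern Dinis describes, here is a minimal sketch using Pydantic to enforce a strongly typed contract on the model’s output; the schema, prompt, and model name are illustrative assumptions, not his actual implementation.

```python
# Sketch: "JSON in, JSON out" with a typed contract on the model's output.
# Anything that doesn't parse into the expected shape is rejected.
from openai import OpenAI
from pydantic import BaseModel, ValidationError

class RiskAssessment(BaseModel):
    risk_level: str              # e.g. "low" | "medium" | "high"
    affected_assets: list[str]
    summary: str

client = OpenAI()

def assess(change_request: str) -> RiskAssessment:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        response_format={"type": "json_object"},  # ask the API for JSON back
        messages=[
            {"role": "system",
             "content": "Return only a JSON object with keys: "
                        "risk_level, affected_assets, summary."},
            {"role": "user", "content": change_request},
        ],
    )
    try:
        return RiskAssessment.model_validate_json(resp.choices[0].message.content)
    except ValidationError as err:
        raise ValueError(f"Model output did not match the expected schema: {err}")
```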
It’s funny, as you were taking us through that, I have a whole bunch of different pages up here relating to things we’re talking about, including on the OWASP site, that top 10 for LLMs and generative AI apps. And ironically, you were going through that and I was like “Wow, they already have this amazing list”, which kind of addresses these things you were talking about. And then you referenced it explicitly.
I was wondering, could you kind of take us through how was that generated? What’s the thinking? Because it looks – based on everything you were saying, it looks like almost a roadmap of the things that one needs to be thinking about when going through the process. Could you take us through that a little bit?
Well, I think you just nailed it. I feel that what you have there is – the team who did it… And I wasn’t very involved in it. I was a little bit on the outskirts of that project, because I thought they were doing amazing work… It’s that they had a huge consultation period. They talked to a lot of people, they basically listed a lot of the stuff that goes wrong. And what I think is interesting about that is I think that whole list has a bias for the teams that are kind of deploying their own solutions. And it kind of covers a lot of those things.
[28:14] I kind of feel that a lot of that needs to be addressed by the people that provide the models. And I think more and more – you know, it gets to the point where you don’t want to build your own cryptography, you want to use cryptography models that are very robust. It gets to a point where somebody’s [unintelligible 00:28:26.24] you shouldn’t be building your own model. It’s like, look, unless you have a hell of a team and you really know what you’re doing on that area, most organizations I don’t think should be building models. Because the big paradigm shift for me was when the prompts is where the action is. And even if you look at things like Claude, and recently now people are sharing the prompts for those things, you see how much the prompt is actually impacting this stuff.
So going back to that top 10, I think it’s a great roadmap for people who are deploying their models to go “Do I have to care about this? Does this apply to me? How do I answer this in an effective way?” Because I think that’s very important.
Break: [29:11]
Well, Dinis, I’m really fascinated by this concept that you brought up about separating the model and the data. You phrased that in various ways… It sounds like you’ve been thinking about this concept a lot. I’m wondering if you could bring that to a practical level, maybe for those out there that are kind of wondering - maybe in both cases. So I’m using a closed model provider, like OpenAI or Anthropic or something like that… There’s that scenario. There’s also people that are hosting their own model, or even running it locally on their laptop, with a local model server. From your perspective, what are the interactions between kind of model and data, or as you put it, knowledge and model, that are relevant in those scenarios to create either goodness or badness in each of those scenarios?
So I think the first very important thing that is very relevant today, that wasn’t, I would say, six months ago, is that we need to move from this idea that you have one model. What you have now is you have an ecosystem where you have multiple models. And they will go from probably some of the most commercial, if that fits your model, that you want to use, to the open source one, but also from the most powerful to the least powerful. Because what you want is this mode where you start with “I want to do X.” And with X, you want to start figuring out what is the best model, and sometimes what is the best combination of models that will give me that output. Because in a weird way, the best deterministic way to do something is code, or to have the least amount of moving parts in there. Because also remember there’s a cost issue here.
So the more you use the models, you want a situation where you’re firing these model analyses all the time. Now, if every one of those is hitting an OpenAI endpoint, that will get very expensive, very fast. But it’s not just that. Sometimes you don’t want that whole package. You don’t need all of that. If you just want a summary, or you want a validation, you want this, there’s now a lot more models, and there are models that specialize in specific things, that have a certain bias you actually want to have, in terms of that.
So I think it’s important to start thinking not just of one model, but a sequence of models, but also what is the best model that you have. And the open source models, the reason why they’re a game-changer is because suddenly you can now run models, let’s say with Ollama, on a desktop CPU, with decent speed. And decent speed might not be like the real time we now get in ChatGPT and Claude 3 etc, but maybe even how ChatGPT was a year ago, or two years ago. But what it means, it means that if that’s on your pipeline, you now have a pipeline that is run off CPU; you don’t even need GPUs now. You can, of course, if you have them, and you can afford them, and it fits your model… But it’s CPU level that can run a model that is completely isolated from the internet. And I think it’s very important. I think it’s very important that you have a design that has those workflows, because you start to introduce them as part of your workflows.
In a way, the key answer to your question is people need to pick up a use case. It doesn’t need to be ridiculously complex, but pick a use case, and then try to do that with a Gen AI workflow. And that workflow where in the past you had to code, now is a workflow that has multiple LLMs, it has multiple sequences… I have things where sometimes you send the same question to three models, then you use a fourth one to analyze it, and so on in that pipeline.
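As a rough sketch of that “ask several models, then have another model analyze the answers” pipeline, here is what it might look like against a local Ollama server via its OpenAI-compatible endpoint; the model names and prompts are illustrative, use whatever you have pulled locally.

```python
# Sketch: send the same question to three local models, then have a fourth
# consolidate the answers. Models and prompts are illustrative.
from openai import OpenAI

local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # Ollama's OpenAI-compatible API

def ask(model: str, prompt: str) -> str:
    resp = local.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

question = ("Summarize the security impact of enabling anonymous read access "
            "to this storage bucket.")
answers = [ask(m, question) for m in ("llama3.1", "mistral", "qwen2.5")]

judge_prompt = (
    "Three assistants answered the same question. Note where they agree, "
    "where they disagree, and give one consolidated answer.\n\n---\n\n"
    + "\n\n---\n\n".join(answers)
)
print(ask("llama3.1:70b", judge_prompt))  # a larger model does the analysis step
```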
And what I’m trying to do with the Cyber Boardroom is fundamentally try to address in cybersecurity how to communicate, how to translate cybersecurity knowledge to board members or executives, but also how to get those executives to ask good questions, and to translate what they care about.
A really cool use of technology is to think about translation. For example, in the past as a CSO I produced reports and briefings for a lot of people. But I didn’t customize them, because it didn’t scale. Now I have the ability to create a customized version for Daniel. And take into account your culture, your language, your context, your level of interest. Do you care? Don’t you care? What’s your focus? And I have another version for Chris.
So maybe, Daniel, you’re a lot more focused on the financial element. Chris might be more focused on the strategic element of it. So I now have the ability to translate a bit of knowledge into very specific domains of one. Because I can feed the knowledge, I can feed the background information, I can feed the audience, and then I can say ChatGPT, or Claude, or LLaMA, or Gemini, “Translate this.” And they are really good at that, and they don’t tend to hallucinate at that level, because you create the parameters. So that’s a good example of a use case, which is very laser sharp, but adds a lot of value, because - imagine… This is not just execs. Imagine you have the project manager and you have a program manager, and you have the lead developer, and you have the QA, and you have the marketing person, and you have the executive.
[36:17] You can now create briefs for every single one of them, that puts into context why they care, why this is important, why this cybersecurity stuff means that when the marketing team does a campaign, the website doesn’t block your users, because you just have 50% more traffic. This is a real story, by the way… We were attacked by our marketing department that ran a prime time TV ad. Great. But if we knew about it, we could have planned.
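A minimal sketch of that translation layer, one briefing rendered per audience; the briefing text, the audiences, and the model are all illustrative, not taken from The Cyber Boardroom itself.

```python
# Sketch: one piece of cybersecurity knowledge, rewritten per audience.
# Briefing, audiences, and model are illustrative.
from openai import OpenAI

client = OpenAI()

briefing = ("A ransomware incident at a key supplier delayed order processing "
            "by 36 hours last quarter.")

audiences = {
    "CFO": "focus on financial exposure and the cost of mitigation",
    "Marketing lead": "focus on campaign timing and customer-facing impact",
    "Lead developer": "focus on concrete engineering actions for the next sprint",
}

for role, focus in audiences.items():
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[
            {"role": "system",
             "content": f"Rewrite the briefing for a {role}; {focus}. "
                        "Do not add facts that are not in the briefing."},
            {"role": "user", "content": briefing},
        ],
    )
    print(f"--- {role} ---\n{resp.choices[0].message.content}\n")
```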
So there’s all this lack of communication that I think is really interesting to address in organizations. And in the past, this didn’t scale. So if you now take into account that you now have multiple models with different levels of capabilities, you almost want to think “What is the most cost-effective? What is the most deterministic way for me to chain this?”, where you maybe use some cheaper models to do some stuff, and then maybe use the more expensive model to actually do some kind of uber analysis and make sure that it still all makes sense. And of course, this should all be calibrated by the humans who start to calibrate the inputs and the outputs that go into the system.
But that’s why I feel that it’s very, very exciting now that we’re having this super-competitive race between the different models, because we now have a huge amount of models to choose from, and you can now start to pick which is better for each capability. And that’s why I want to see models that have no content. I want to see models that just have understanding and logic. It’s almost like we need to find a way to strip away some of the content, so that I can say “This is my policy. This is what I care about. This is my world.” Because remember, we now have big context windows. So you can now feed quite a lot of data in a prompt that goes into the model.
It’s really fascinating, with what you’re doing at the Cyber Boardroom, in terms of kind of optimizing, using the right models in the right way… And then I actually want to reach back a little bit to something you said a few minutes ago, where you were saying, from a security concern, you wouldn’t want most organizations to, for instance, build their own models from the ground up; use these foundational models that you have a strong security basis in, that you can trust, and then optimize in the way that you were just discussing. I think you’ve really hit on something, because I think a lot of organizations are really struggling with how to approach the different workflows to maximize their productivity, and that output, while staying secure and not exposing themselves.
So you seem to have a really good grasp of this workflow that maximizes the productivity and the efficiency for the different audiences, while not getting them into trouble by doing something they’re probably not well suited to do. Is that fair? Would you say that that’s a fair way of–
Yeah, it’s a nice compliment, thank you… But that’s what I’m trying to do. I kind of call it deterministic gen AI, and people go “Well, but it’s not supposed to be deterministic.” I’m going, “Yeah, but that’s a problem.” [unintelligible 00:39:18.27] there’s a cool side of it for creativity. Great. We already have that. What I think is interesting is to leverage that ability to understand context, and to write in English, or write in Portuguese, or write whatever language you want, and to do that translation layer, but to be very deterministic, which also means that we need to be much better at provenance, which is basically that path of “This creates that, creates this, creates this.” But also, it’s a way to scale. Because if you start to have provenance on good sources of information, then you don’t have to do this all the time. You kind of – you can build your knowledge base. You can build your workflow base, and you can build your confidence. And then it’s about creating these microservices that do one thing really well.
[40:04] Think about it - you don’t want a microservice that changes behavior next week when a new model comes along. Dude, it’s like, we want deterministic stuff, because we build things on top of that. So we need to start getting back to these building blocks again, components that do one thing, they do it really well, they are super-reliable… Yes, there might be a model in the middle, but that means that the thing has this size, versus that size in terms of the capabilities. And more importantly - and I think that you mentioned this before - is that this allows us to go to the business owners and let them be in the driving seat. Because think about it, in the past we had Gherkin, if you guys know what that is, where you write stuff in quasi-language… “When I do this, then I do that”, or “given this, given that…” But that was always a hack, because it was like fucking hardcoded to the backend. Now we can actually have the business describe the intent. They can describe the workflows. They can describe the experience they want for the user, for the data. Even data transformations, we can start to describe what I want to get from here to there. And then it’s about how can you lock it down in the most reliable piece of code and system that can do that… Which is why having models that run offline is very important, because that allows you to lock it in, version-control that thing, and then know that in two months, two years, five years, it will still do the same thing. And I think that’s very important.
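One way to read that “lock it down and version-control it” point in code, a minimal sketch assuming a local model behind an OpenAI-compatible endpoint; the pinned model tag, the prompt file path, and the seed are illustrative, and seed-based determinism is best-effort on most servers.

```python
# Sketch: pin the moving parts so a "microservice with a model inside" behaves
# the same next month: exact local model tag, temperature 0, fixed seed, and
# the prompt versioned in git next to the code. All names are illustrative.
from openai import OpenAI

local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

PINNED_MODEL = "llama3.1:8b-instruct-q4_K_M"    # an exact local tag, not "latest"
PROMPT_FILE = "prompts/classify-change-v3.txt"  # prompt text lives under version control

def classify_change(description: str) -> str:
    resp = local.chat.completions.create(
        model=PINNED_MODEL,
        temperature=0,
        seed=42,  # best-effort determinism; support varies by server
        messages=[
            {"role": "system", "content": open(PROMPT_FILE).read()},
            {"role": "user", "content": description},
        ],
    )
    return resp.choices[0].message.content
```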
So from a security point of view, what I’ve learned was every time you have a system that behaves like a black box, you have vulnerabilities. Literally – so we’re creating an uber-black box for this stuff. So what can go wrong? So in the past, I knew that the less the team tested, the less the team had architectural diagrams, the less they understood how a part of the application worked, the more vulnerabilities I was going to discover. Because they couldn’t test it. It’s almost like, how can the developer understand the side effects of what they’re doing if they don’t see it? So I think we have to be very careful about creating these black boxes where we don’t understand the behavior and how they work.
Something that has maybe brought into a little bit more clarity for me as you were just describing what you just described with the black boxes - we had a few episodes ago, Chris, you remember we had the episode, I chatted with Donato when I was in London, at WithSecure. And one of the things that he brought up - I’m curious of your perspective on it, Dinis - is like these very, very large models, especially the closed ones, have a huge attack surface. The vulnerabilities - they’re so general. There’s so much knowledge kind of embedded that it would be sort of impossible to think that you could kind of fully explore the space of behavior and prevent things like jailbreaks, or things like prompt injections, that sort of thing. Whereas sort of this smaller model you bring the data to is either going to perform really good if you bring the right data to the table, because it’s not embedded in the model, or it’s going to be complete trash output… Which maybe is better, because you’re operating in a regime where the data distribution isn’t what you expect. I’m curious if that tracks.
And you can ask, you can verify. Yeah, I think that’s spot on. Look, it tells you something that we still – it’s almost like… Somebody had a cool analogy - it feels like we’re back in the age of navigation; the Portuguese have a great history of discovering the world. So at school we learn how the Portuguese, and the Spanish and everybody else were like “What’s out there? Let’s send a boat.” And then “Hey, look, we discovered a country” or “We discovered something”. It feels like a lot of these models are like that, which in a weird way is ridiculously scary. Like, imagine like–
Yeah. There be dragons.
[43:51] Yeah. Imagine you run an app, you do a web service, you ship it and go “Hey, do you know that thing knows chemistry?” It’s like “Well… Well…” Do you know that thing speaks 20 languages? “Well…” Do you know what I mean? It’s the fact that we still talk about emergent properties, and the fact that people still talk about what the models do by probing it, from the outside, as a ridiculous black box - that shows you how immature we still are in that level.
Like, yeah, software is complex, but we can actually understand kind of what it does. Given the time and money, we could actually reverse-engineer even the most complex piece of software, and going “Yes, this thing is not going to know chemistry. This thing is not going to do beyond this. It might do some bugs, but it has a limited operation space.”
I think the models - in a way, it’s a good thing, in some aspects, but they have all these properties because they build these really big three-dimensional or multiple-dimensional views of it… But that’s not what you want from mission-critical systems. And that’s why I was talking about exploits. My prediction is there’s going to be a number of exploits, backdoors, and [unintelligible 00:44:57.27] instructions. Maybe crazy ASCII characters, crazy [unintelligible 00:45:02.21] characters, whatever, math numbers, whatever, that will trigger the models to go into a place where they do crazy stuff. But we don’t know that, because we don’t understand the models.
People hack a model almost in English. Oh, pretend that you’re now writing in ASCII art, and you do this. If you think about it from an exploit point of view, that is very basic. There’s going to be way more ridiculous, complex, but interesting - well, I guess, from a security point of view - type of exploits that people will have once you start to understand how it works inside. Which is why I think we also need to start measuring almost like what is the behavior of models. Like, how do they arrive at those conclusions, and then even have models that can only do those bits? Does that make sense?
Yes.
So again, I want deterministic models. I want models that I can start to vouch for how they work, and how they do, even if maybe sometimes they’re a little bit less efficient, but you earn the explainability.
So I want you to actually – as we’re starting to wind up here, I would like to ask you to even extend that a little bit in terms of kind of where you think that’s going to go. Because you’ve already touched on trying to get to more deterministic models, and some of the things you’re expecting… If you were to blow that out in a slightly longer time horizon, maybe several years, where do you think all this is going to go, and how do you think it might get there speculatively, recognizing that we’re just doing the crystal ball and asking you to tell us… Kind of when you go to bed at night and you’re thinking about this, what do you think is going to happen?
Okay, there’s multiple areas.
Sure.
[46:46] I feel on the whole fake news creation of crazy content, using AI for the attack - that’s going to grow. There’s business models around it… I actually think that’s going to force us to have determinism and provenance on news. So I think that’s a good thing. It’s going to be bad, but it’s going to force us to address that problem. There is a level here that we’re going to have to control it, just like we control nuclear weapons, and other things. I think there’s a level here that we have to be careful not to create things that go completely out of control. And we might have a couple, but that’s a bigger problem. That needs to be addressed.
Where I think we’re going to have the biggest impact is - I think Kevin Kelly had a great phrase in one of his books. He talks about AI even before ChatGPT, but he talked about “AI will become like electricity.” It’ll become embedded in all these little things. But it’s not a massive thing, it’s little bits. It’s little things that you slowly start to have, that introduce a level of intelligence that we don’t have today.
So if you think of most of our interactions, they don’t have a lot of intelligence. They don’t learn. They don’t – and I think as, again, the models become small to run on your phones, as the models become small to run in microservices, I think what will be very powerful is the creation of these lots and lots and lots of little use cases that really make a difference. And then you start to trust it. And then you compound them on each other. And my instinct is that anything that relies on a black box eventually will blow up. And when it blows up, people push back and go “Whoa, whoa, whoa, we can’t have that.” It’s okay for proof of concepts, but that cannot be doing stuff. Also because technology now, I think, has got to a point where we would drown in complexity. We would drown in all sorts of things that we don’t fully understand, and don’t connect the dots… And even application systems, they’re so complex, and companies are so complex. I think understanding them in the smallest way, that will dramatically change. That’s how you change society. You change society by literally introducing something that becomes really powerful, really useful.
And the final point I want to make here is I think there’s a ridiculous opportunity for reframing education. And to finally create a learning environment that individuals can learn in the best way for them. So I think we can change the way we learn; continuous learning is a big thing, but also how we change education from being like a memory, exam-based kind of thing, to actually being about learning. And now it’s about creating customized learning paths for the individuals. But also, I guess, just a final point is that this means that the human is literally the one that is ridiculously important here. This is a tool to help the human to be even more productive. The same way that we use the internet, the same way that we use the hammer, the same way we use electricity… It’s just a different one than we had before, because it can understand language. And we never had that before. So I think it’s an insane opportunity.
But the attack side - yes, it will go… But we didn’t need Gen AI for people to create ridiculous attacks, right? But on the defense side, I think it changed the nature of the game. So I think it’s very exciting on that. I hope that answered your question.
It was a great answer.
Awesome, yeah. Thank you so much for taking time, Dinis. This has been great. I really have been looking forward to this, and loved the conversation, loved this idea of kind of thinking about knowledge and data and model, and how those are connected or not connected… Please continue your great work, it’s a great contribution, and… Yeah, thank you so much for taking time. It’s been a pleasure.
My pleasure. Great talking to you guys.
Our transcripts are open source on GitHub. Improvements are welcome. 💚