Transcript
00:07
Here's what we're going to talk about today. We just did the demo, so here's our agenda. Live demo, so pretty straightforward there for the video. Sorry about that. Yeah, thank you, Chris. So we're talking about applications, all the different ways that you can expose different data sources to your AI LLM, like Claude, ChatGPT, Gemini. So typically,
00:33
applications, all these things, even data lakes (we'll talk about that briefly), will have some type of API endpoint, and then you have an MCP server that is kind of an abstraction of those endpoints. It's an abstraction on an abstraction, if you will. We'll talk about that. And that's where the MCP server sits in this flow. So when you make a prompt like we just did, the prompt we just ran here in Claude.
01:00
As you can see, it was getting some findings. It's still working there. It is talking to the DefectDojo MCP server, which is then getting information, findings, et cetera, from DefectDojo. Now, as I've practiced this, I've had a lot of people go, I don't understand what all of these things are. What's an API? Let me just show you real quick. This is the DefectDojo API, right? And as you look at all these things, you may recognize things like credentials, dashboard,
01:30
endpoints, findings. These are the actual endpoints. So when you think about an API, this is basically what it's doing: it's exposing functions within DefectDojo so they can be called directly, programmatically, usually from scripts or other applications and things like that. And there are a lot of them, right? That's just for findings. You can see all the different endpoints just to get something from findings.
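If you've never called one of these endpoints yourself, here's a minimal sketch of hitting the DefectDojo findings endpoint with Python's requests library. The host, token, and filter values are placeholders, and the exact query parameters can vary a bit by DefectDojo version.

```python
# A minimal sketch of calling the DefectDojo findings endpoint directly.
# DOJO_URL and API_KEY are placeholders; adjust filters to taste.
import requests

DOJO_URL = "https://defectdojo.example.com"   # hypothetical host
API_KEY = "your-api-token"                    # hypothetical token

resp = requests.get(
    f"{DOJO_URL}/api/v2/findings/",
    headers={"Authorization": f"Token {API_KEY}"},
    params={"severity": "Critical", "active": "true", "limit": 20},
    timeout=30,
)
resp.raise_for_status()
for finding in resp.json()["results"]:
    print(finding["id"], finding["severity"], finding["title"])
```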
01:58
So if I were to expand all of these, there would be a ton of API endpoints. We're not going to expose all of them; that's kind of one of the purposes of this MCP server, and we'll explain why in a moment. Let's break down this term real quick, because this is going to help you understand why this is important. We're talking about Model Context Protocol, MCP. What does that mean?
02:27
Yeah, it's a standardized way for LLMs to interact with external systems. I'll try not to read the slides to you, but we've got to break this down a little bit. So let's start with model. What are we talking about here? Well, that's referencing the large language model. What is that? Large means all the data that it's been trained on, billions of parameters;
02:53
you can kind of see some of those there, in the billions of parameters. Language means it is trained on human text so that it can understand and generate text. And the model is a mathematical framework that predicts the next word in a sentence. And this right here, predicting the next word in a sequence, is the most important thing to understand about any LLM. That is all it's doing. It is trying to predict
03:21
the next word in a sentence, a sequence, code, whatever it is, right? So if anybody asks you, do you know what AI is? This is what AI is, or at least what an LLM is: billions of parameters trained on human text that predict the next word in a sequence. That's it, right? Now, obviously it's much bigger than that, but it's also kind of like saying that when the internet came of age in the nineties, it was just connecting things.
03:51
Right? All it was doing was connecting things, but it had a much bigger impact than that. Context. So now you understand a little bit about the model, and that that's the LLM. Now we're going to talk about context and protocol. Let's actually start with protocol, because we're not going to spend much time there. Hypertext Transfer Protocol, File Transfer Protocol: these are just mechanisms for putting data in a form that can be sent over a network and interpreted and understood. So
04:20
it's no different than any other protocol that you may have heard of, like HTTP. But the context, this is the most important part of MCP. This is the working memory. This is the window. When you're entering something into ChatGPT, you are in that context window. It's kind of the short-term memory for the model that you're communicating with, right? That context includes your questions.
04:48
It includes all the previous messages in that chat session. So every time you enter a new question or request or a do-this-for-me, it's still including everything else that came before it, right? Eventually that window will fill up, and you can even run out of room and have to start a new conversation. That's because your context window is full and it just can't handle any more information. The context also includes files that you might upload,
05:16
as well as MCP resources. This is a big piece of it, because this window is a fixed size. In fact, this is the big thing: the reason there's such a thing as a Model Context Protocol is to work within this window, to not flood it or leave so little room that you can't get any information out of the LLM.
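To make that concrete, here's a tiny sketch of how a chat loop typically accumulates context: every turn re-sends the entire conversation so far, which is why the window eventually fills up. The call_llm() function is just a stand-in for whatever chat API you're actually using.

```python
# Sketch: the context window fills up because every turn carries all prior turns.
def call_llm(messages):
    # Stand-in for a real chat API call; a real model would read every message.
    return f"(model reply; the prompt now contains {len(messages)} messages)"

messages = []

def ask(question: str) -> str:
    messages.append({"role": "user", "content": question})
    reply = call_llm(messages)  # the whole history goes along for the ride
    messages.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize our critical findings."))
print(ask("Now compare them to the OWASP Top 10."))  # still carries the first Q&A
```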
05:43
It's demo time again. Wait, let's check out where we are. Oh, there's our threat intelligence brief. You can see how it was getting some findings. It got some products. It's searching through the OWASP Top 10 and getting some vulnerability statistics from 2024. And again, the original question was: analyze our vulnerability patterns against the OWASP Top 10 and create a threat intelligence brief with industry benchmarks.
06:12
I haven't seen this before, and this is also being done on demo data, so your results may vary. So, our current threat landscape: how many total findings we have, we've got a number of criticals, by comparison we're way over, et cetera. Now we've got some comparisons to kind of justify some things that we may want to change or do analysis on. So we've got a number of the OWASP Top 10 categories that we need to be kind of
06:41
worried about. And here's some more detailed information about those: a threat intelligence assessment. We've got a lot of access control failures, so that might be the highest-priority thing we want to hit. Then I have some strategic recommendations, some strategic initiatives, and success metrics that we can measure against later. And there's your threat intelligence brief.
07:10
Not too shabby, I must say. Let's do another one. I'm going to come back and do something a little bit more challenging. We're going to
07:24
create a comprehensive security transformation strategy, including current state analysis, gap analysis, a remediation roadmap, budget requirements, and some success metrics for a board presentation. Go. And off it goes.
07:40
I made some sacrifices to the demo gods this morning. I think they're pretty happy; the volcano was really hot. Okay, so let's get back to why. We're back to this context that we were talking about, that context window, right? This is why MCP exists. MCP servers solve the limitations of that context window. They don't expose all the endpoints, because that would fill up the window, right?
08:09
So you have to be selective. You also want to understand the kinds of questions you might be getting about the source data. There's some pre-processing, and the server can deliver information that stays within that context window. Then the LLM can focus on the analysis it's doing based on the results it's getting from the MCP server. It really makes the LLM a lot more effective when it's talking to any of these resources.
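As a rough illustration of that pre-processing, here's a sketch of the kind of summarization an MCP server might do before handing data to the LLM, so the model gets a compact aggregate instead of thousands of raw findings. The field names mirror DefectDojo-style findings but are only illustrative.

```python
# Sketch: condense raw findings into a compact summary before it reaches the LLM.
from collections import Counter

def summarize_findings(findings: list[dict]) -> dict:
    by_severity = Counter(f["severity"] for f in findings)
    top_cwes = Counter(f.get("cwe") for f in findings if f.get("cwe")).most_common(5)
    return {
        "total": len(findings),
        "by_severity": dict(by_severity),
        "top_cwes": top_cwes,  # five entries instead of every single finding
    }

sample = [
    {"severity": "Critical", "cwe": 89},
    {"severity": "High", "cwe": 79},
    {"severity": "High", "cwe": 79},
]
print(summarize_findings(sample))
```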
08:38
Is it 10x more effective? We shall see. Let's dig a little deeper into what an MCP server is made up of. At a very simple level, an MCP server doesn't just translate findings coming back from the API or from the application. There are extra resources; you can add data (I'll give you some examples in a moment). You can add tools, and these are usually related to the API calls themselves: I'm going to make this call to get this data.
09:06
I might pair it with some of the resources. I might even have some pre-built prompts inside the MCP server itself to help the LLM understand the kinds of things it's going to be getting back. So it's more than an API abstraction; it's providing a lot of efficiency gains when it's giving data back to the LLM, based on the prompts it's getting.
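To make tools, resources, and prompts concrete, here's a minimal sketch using the official MCP Python SDK's FastMCP helper. The DefectDojo-specific names and data are illustrative placeholders, not the actual server from the demo; a real tool would call the DefectDojo API and pre-filter the results.

```python
# Sketch of the three MCP building blocks: a tool, a resource, and a prompt.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("defectdojo-sketch")  # hypothetical server name

@mcp.tool()
def get_critical_findings(limit: int = 20) -> list[dict]:
    """Return a compact list of critical findings (placeholder data)."""
    # A real implementation would call the DefectDojo API and summarize here.
    return [{"id": 1, "title": "SQL injection", "severity": "Critical"}][:limit]

@mcp.resource("reference://owasp-top-10")
def owasp_top_10() -> str:
    """Extra context the LLM can pull in for comparisons."""
    return "A01 Broken Access Control, A02 Cryptographic Failures, ..."

@mcp.prompt()
def threat_brief(product: str) -> str:
    """A pre-built prompt that guides the kind of analysis we want back."""
    return f"Analyze the findings for {product} against the OWASP Top 10."

if __name__ == "__main__":
    mcp.run()
```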
09:32
Here's a little more detailed view, specifically of the DefectDojo MCP server. These are all examples; this is not necessarily what will be in the final MCP server, because you iterate through these things. I am not an expert in this; we are all learning as we go. This is a fairly new concept. But as examples of tools: getting findings, tracking SLAs, trend analysis.
10:02
Then the resource examples you can see there, like CWE, the OWASP Top 10, various compliance frameworks. You can add resources to help compare or contrast or add extra data, kind of like we did in the first prompt. Also, there are built-in prompts that will help guide the LLM based on the things you may ask it to do; that just helps make it more efficient as it's breaking down your prompt. So
10:31
there are several different areas where the MCP server is contributing and making this a little more efficient. Have I given Claude enough time? Apparently I have. So again, we asked it to create a comprehensive security transformation strategy, and it gave me this strategy. Now, there are some other things in the background that we might have time to show later, but apparently I'm terrible at remediating vulnerabilities. We have an
11:00
executive summary here that says our security posture is not good and is potentially going to cost us money, with an expected 18 million in avoided cost. Then you get this current state analysis with some security posture, some infrastructure overviews, some risk factors, and here's a gap assessment. A lot of this is going to be based on the accuracy of your data. If you have
11:30
bad data, you may not get results that are as accurate; you may have to review something like this very, very closely. I got an 18-month transformation roadmap; it's probably going to take longer. And then we've got some budget requirements and some resource allocations: if you invest so much, you can potentially get a three-year ROI just from preventing potential breaches, compliance issues, et cetera,
11:58
and some more success metrics. So, pretty cool stuff from a prompt, where I basically didn't have to do all of the arduous data mining to get this. Yeah, and I needed a decision quickly. So pretty good so far. Let's give it another prompt. Let's give it something a little bit more.
12:23
Let's do a security operations briefing with current threats, aging analysis, team assignments, and operational recommendations. And we'll let this one run.
12:36
Yeah, we'll go back up here to start a fresh context window. We're going to do a security operations center briefing. Let's see what it comes back with, and off it goes. All right, you do that. It starts with getting findings. All right. So while it's doing that, let's answer a couple of questions that might be in your head. Why would I need this? Because an LLM can talk
13:03
directly to APIs. It can actually talk directly to all the scanning tools, right? Yeah, we're going to get to that, promise. But why would you need this in the first place? Well, let's pretend you didn't have a thing called MCP. Without MCP, your LLM might have to connect directly to a whole bunch of tools, and it's going to struggle with this. Every one of these tools has a different data format.
13:33
Every one of these tools will have a lot of the same vulnerabilities if they're scanning the same repo or the same container or those kinds of things; they're all going to have a lot of the same findings. So that means you're going to have lots of duplicates across all of those different tools. There's not really an understanding of what each API is providing. You're just going to flood the LLM with a lot of data. That's going to quickly fill up the context window as it's trying to get data from all of these things,
14:01
and you're going to waste a lot of tokens doing this. Tokens basically represent word counts. As the LLM works through its neural network and all the training that it has, it's consuming tokens, which are the language: the actual words, or pieces of words, in the sentence. So you don't want to waste tokens, because that also fills up your context window; it's got all this stuff it's trying to traverse as it's getting you an answer.
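If you want to see what tokens actually look like, here's a quick sketch using the tiktoken library. The encoding name is the one used by many OpenAI models; other models tokenize differently, so treat the counts as illustrative.

```python
# Sketch: tokens are sub-word pieces, and counting them shows what you spend.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Critical SQL injection finding in the login endpoint."
tokens = enc.encode(text)
print(len(tokens), "tokens")   # roughly word-count-ish, but not exactly words
print(enc.decode(tokens[:4]))  # the first few tokens are fragments of the text
```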
14:30
You also don't want to be at 100%; there are some sweet spots even within the context window. So, not a great approach here. Let's say every one of these tools creates an MCP server, like DefectDojo has. Well, you're still going to hit some of the same problems. Yep, it is definitely going to be more efficient. However, none of these tools are really familiar with or aware of the other tools, and therefore the data is not normalized. A finding in this tool
14:59
probably isn't going to look like a finding in another tool. The fields are going to have different names. And the LLM is going to wrestle with all of that. It is going to have to do all of the normalization and deduplication and correlation across multiple tools. That's asking the LLM to do a lot. In fact, you're asking the LLM to do exactly what DefectDojo does. We get a lot of talk and requests and questions about data lakes, right? This was hot
15:29
some years ago, you know, when everybody was selling data lakes and you were going to put all your data together so that you could use all of it. And now a lot of people are trying to plug their LLMs into data lakes. You're going to see a lot of the same kinds of problems. I'm sure there are going to be a lot of solutions that address some of this, but essentially it's no different than if you were to go into a data warehouse and try to figure all of this out on your own. That's what the LLM is going to try to do. And so again, it's still going to be
15:59
fighting against erroneous information, you know. A lot of times you may have heard the term RAG, or RAG servers: retrieval-augmented generation. Here's another one of those terms. I won't break this one down like I did the others, but basically what this means is, if you have a RAG setup, you give a prompt to the LLM and that prompt gets processed by a server, not an AI server, but basically a sort of preprocessing layer.
16:26
It will basically use regex to parse out what you put in your prompt. If it sees something like, you know, I want to see a report of all of the sales last quarter, it'll see the words sales and quarter, and it will turn those into a way to retrieve that data or those documents. Literally, it'll go into your data lake, pull the documents, add those to the prompt, and then give that back to your LLM.
16:55
That's why it's called retrieval-augmented: it's going to go get files, go get these things, and bring them back so that they're part of the context window, so that now it can generate the report. It's not quite the way MCP works; it's more of a brute-force method of bringing back data, documents, and things like that. The same issues are going to be there as far as the LLM having to traverse a lot of data when you're entering those prompts.
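Here's a bare-bones sketch of that retrieve-then-augment flow, keyword matching only, as just described. Real RAG systems usually use embedding search rather than keyword matching, and the documents below are made up for the example.

```python
# Sketch of RAG: retrieve matching documents first, then stuff them into the prompt.
DOCS = {
    "q3 sales report": "Q3 sales were ...",
    "vulnerability sla policy": "Criticals must be fixed within 7 days ...",
}

def retrieve(prompt: str) -> list[str]:
    words = set(prompt.lower().split())
    return [body for title, body in DOCS.items() if words & set(title.split())]

def augment(prompt: str) -> str:
    context = "\n".join(retrieve(prompt))
    return f"Context:\n{context}\n\nQuestion: {prompt}"

# The augmented prompt (context plus question) is what actually goes to the LLM.
print(augment("Show me a report of all the sales last quarter"))
```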
17:26
So the big point is that DefectDojo is already solving a lot of those problems. That's exactly what DefectDojo does. It aggregates vulnerability findings from all of these different tools, from the different times the scans are run throughout an SDLC, and it normalizes that data so all the data looks alike, right? We want all the apples to look like apples. And then we can deduplicate, so that we're not seeing
17:51
duplicates, and we're not seeing the false positives that live in all the individual tools. We have one place where we can normalize and clean that data, so the LLM can do analysis on very clean, accurate data, which means you're going to get very clean, accurate responses, or at least a lot more accurate than if you're just throwing everything into the LLM. So this is a huge, huge improvement over just connecting everything directly, as I was showing before.
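As a simplified illustration of that normalize-then-deduplicate step, here's a sketch. The scanner names and field mappings are invented for the example; DefectDojo's real parsers and deduplication logic are far more involved.

```python
# Sketch: map each scanner's fields onto one shape, then drop duplicate findings.
def normalize(tool: str, raw: dict) -> dict:
    # Each scanner names the same information differently.
    if tool == "scanner_a":
        return {"title": raw["vuln_name"], "cwe": raw["cwe_id"], "file": raw["path"]}
    if tool == "scanner_b":
        return {"title": raw["issue"], "cwe": raw["cwe"], "file": raw["location"]}
    raise ValueError(f"unknown tool: {tool}")

def deduplicate(findings: list[dict]) -> list[dict]:
    seen, unique = set(), []
    for f in findings:
        key = (f["title"].lower(), f["cwe"], f["file"])
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique

raw = [
    ("scanner_a", {"vuln_name": "SQL Injection", "cwe_id": 89, "path": "app/login.py"}),
    ("scanner_b", {"issue": "SQL Injection", "cwe": 89, "location": "app/login.py"}),
]
print(deduplicate([normalize(t, r) for t, r in raw]))  # one finding, not two
```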
18:17
A little bit more detail. I'm not going to read all of this, but when you implement DefectDojo for all of your scanning tools and all of the vulnerabilities you're trying to track, and then you plug that into an LLM, you're getting all the benefits that DefectDojo already provides, which is huge when you're talking about the efficiency or the performance
18:47
of the LLM. We may be getting to 10x, but before we do that, let's go back and see how we're doing. Our SOC report says we are terrible at remediating things, so we have a very high threat level. We've got some prioritized threat analysis, things that need to be fixed right away. We've got some aging analysis, and I'm honestly surprised it's not even worse than that.
19:14
So we've got some team assignments with the workload distribution, which is interesting. Immediate action items within the next four hours, unbooked. Some priorities, key performance indicators. That's, I mean, really impressive. I love generating this kind of analysis with AI, because, I mean, my gosh, that's a lot of analysis and work and findings to pore through.
19:43
We're going to do one more of these, a big one. This one is my favorite. We are going to ask Claude (if I run out of tokens, I apologize; I've never made it this far): my CISO wants to evaluate the effectiveness of our current SAST tools. Use DefectDojo to give us an analysis report of false positives, developer team performance,
20:12
recommendations for tool configuration improvements, training gaps identified from recurring vulnerability patterns (so, things I may want to train my devs on), and a cost analysis of my current versus recommended tooling approach. Yeah, format this in HTML so I can use it to justify some budget. Go. All right, we'll see if we run out of tokens. All right, it's off and running.
20:40
So, 10x, really? You think it's really 10x? The number 10x actually came from Claude. I asked Claude to give me an analysis of how much of a performance improvement this is, so I'm going to blame Claude for this number, because 10x seems a little high. But let's break it down. Performance versus efficiency, right? These are different words. Performance means how fast you're going.
21:07
Efficiency is how much gas you use to go that fast, right? Intelligence amplifier means thinking of things that maybe you haven't thought of, or doing analysis that you don't have time for, that kind of thing. So intelligence amplifier, that's a really good thing. And an accuracy enhancer, because again, better data, clean data, better results. So is this really 10x? I asked Claude: let's break this down; prove to me that this is 10x.
21:37
And this is kind of where we went. First, the vulnerability data quality improvement: we're basically saying 5x here. Why? Well, if you have all of those different scanners and they all have their false positives and their duplicates, you've got a signal-to-noise problem. You've got a lot of data that is not helping you or the LLM at all. In fact, you're probably at between 10 and 20 percent signal, because for every tool that you add, you're adding its duplicates, its false positives, its
22:07
unnormalized data. When DefectDojo processes that, you are cleaning up that signal until you've got 85 to 95 percent clarity. It's clean: no duplicates, no false positives, or at least the false positives are handled consistently across everything. Then the data import processing and the normalization, raw tool output versus DefectDojo-normalized data: without going into the details, let's just call it a 70 percent token reduction.
22:36
Unified authentication: with every one of those different tools, the LLM may struggle to actually authenticate against all of those tools and their security layers, so let's just say 3x. Data enrichment, raw scanner versus DefectDojo-enriched: we add EPSS, and we can actually do some cross-tool correlation and asset inventory, so we're going to say maybe 2x. So, you know,
23:06
is this how the math works? Yes, it is how the math works. If each one of these didn't improve anything and just kept things level, if it was just 1x and didn't change anything, you would get one, one to one. This is where you sometimes hear people, especially in sales, talk about one plus one equals three, it's a force multiplier. No: one times one is one.
23:27
In fact, with most integrations between different pieces of software, you take this piece of software and integrate it with that piece of software, and you don't even get one from each of them. You get a percentage of what works in each of those tools, right? And 50% times 50% is not one, it's smaller: 0.25. So if you're not getting full value on both sides of this equation, you're not even going to get to 1x, right? You're going to lose things. That's actually the norm.
23:56
But here, we're getting a lot of performance improvement from DefectDojo, and this doesn't even include the MCP server. So if we look at the MCP optimization, what the MCP server is doing for this: we have those tools, we've limited the API calls, we've added a knowledge base, we've added domain prompts. That's the intelligence amplification we're talking about. Again, if I'm saying 10x,
24:25
I just need a two and a five to get to 10. Call the intelligence amplification four, because we're adding a lot of efficiencies here; performance optimization, 2x to 3x; enterprise integration, 2x, because it saves everybody a lot of time; plus a future-proof architecture, the ability to add and remove tools very easily because they're coming through DefectDojo and don't have to be integrated directly into your LLM. So here, again, basic math.
24:54
That's 24x. If you changed all of those to two, you'd have two times two times two, which is eight. So somewhere in there, right? I'm not saying those are hard numbers, but now we're at 30x, 24x, 720x. Have I gone crazy? I'm just trying to show that 10x is actually a very conservative number, because again, if DefectDojo just helped with twice the work
25:24
and your MCP server made things five times more efficient, that's 10. All we need is a little bit more help than that, and this can really depend on the MCP implementation; there are lots of different ways of doing this. So this is where we understand that our MCP server has to be made efficient enough to build on the things DefectDojo is doing, and vice versa. And that really becomes a force multiplier. You're getting everything from DefectDojo,
25:53
you're getting everything from the MCP server and the LLM, and that's how you get a force multiplier. This is why this is such a big deal for DefectDojo. How did we do? Oh, there's our SAST effectiveness analysis, you can see, and it didn't even have to be prompted to continue. So we've got that analysis. It did a lot of getting data from DefectDojo.
26:20
Here's our current state analysis, our executive summary. Remember, this is for the CISO. Verification rates, false positives, analysis by tool and by vulnerability type, CWE distribution, developer team performance comparisons. So my prod blue team is doing pretty well, but some of these other teams may want to take a look at some things. Team-specific challenges.
26:50
Tool configuration improvements, so we've got some suggested configuration changes we can implement. Tool integration improvements, so maybe some CI/CD or IDE tools. Some critical training needs for my developers,
27:08
training impact projections. And then here's a cost analysis of the current versus recommended approach that we could spend some time on. Some investment numbers, and there's an implementation roadmap. It just goes on and on. That's also probably a good reason to limit the prompts to very, very specific things, because this one has got a lot in it. So this is...
27:34
I mean, that's just amazing to me, that you can generate a report with that much information and insight based on the vulnerabilities in an environment, and that it could provide some value in a very short amount of time.
27:53
Wow. So in summary, that was fast. Force multiplication for optimal abstraction: each side maximizes the other. DefectDojo is already doing all the hard work, because normalizing and deduplicating across all of these different tools is not easy. I think people underestimate how different all these tools, and the data that they provide, can be.
28:21
You know, even if it's just the same CVE identifier, it's insane how they're all different and don't necessarily work together. And then when you add something like an MCP server to that, I think you're getting some real value, especially if you're doing internal LLM experiments. There's a lot of shadow IT happening right now, people trying to run LLMs internally, trying to do it privately.
28:49
MCP also helps you kind of keep things private. You can run this inside of an air-gapped environment and keep it there because you can keep all that data local.
29:02
That's the spiel. Questions, comments on what we have shown?