00:07
Hi, everybody. I might just fake an internet connection problem just to watch Chris's performance, because I hear he's a great singer and knows some Broadway songs. Hi, everybody. Yeah, so we're going to talk about extending open source tools using AI, or, a little bit different title, generating a DefectDojo parser using only AI. So hopefully that's of interest.
00:36
First and foremost, I thank our founder, Matt, for this particular slide. I love this slide because it clearly communicates to everybody on here that I have absolutely no idea what I'm doing. I am not claiming to be an AI prompt expert, although I've been using AI for a couple of years, you know, helping my daughter with
01:04
some reports and different things like that, but I'd never really tried to use it for actual code. I'd read a lot of bash scripts and some simplistic type things, but I saw an opportunity to try to extend some of our open source tools. So here's our agenda. We're going to alter this a little bit, because I've done this
01:32
presentation as a meetup. By the way, shout out to the Minneapolis OWASP chapter. They hosted us here in Minneapolis, Minnesota, where I'm at. We did this presentation as a meetup and it was really, really good, very interactive. So I'm gonna try to adapt this a little bit for a webinar, where it won't be quite so
01:59
interactive, but we'll try to get the same points across. And hopefully you'll come away from this with some ideas, or maybe even a little bit more confidence about trying to use an LLM to generate code or something like this. So we're going to talk a little bit about DefectDojo, how we're going to extend it through the parsers, and exactly what I was trying to do. And then we'll get into this quest. This was a two to three month kind of effort
02:29
over time. I've mainly been using Claude, so we'll talk a little bit about that, and about the advancements in the LLM as I was going, which gave me new life to try it again, because I would kind of get lost in some hallucinations and things like that. All right, so very first thing: if you have joined this and you don't know what DefectDojo is, I'll give a very, very brief summary.
02:58
The thing that I really want to point out here is down here where you see the 200 plus integrations, equals parsers. So that's what we're really talking about here: the way that DefectDojo is able to parse the output files from lots and lots of scanners, right? And we're even going to talk a little bit about that word, integrations. I personally hate integrations. It makes you kind of locked in with tools.
03:27
We'll talk a little bit about that as well. So here's a clarification on what DefectDojo does and why we're talking about these parsers. Everybody has a software development lifecycle, and it's always unique. Any environment you go into, it is unique. And in those environments, you have a variety of ways you might be testing your code and your applications for security throughout the software development lifecycle. There's all different kinds of tests.
03:55
There's hundreds of different tools you can use. Everybody wants to use different tools and that's awesome. But this is what DefectDojo does. It is able to import the output from all of these tools. And this means, and this is kind of why I even draw the logo above the SDLC, right? I'm not putting it in the middle and having a bunch of lines because I want to communicate that DefectDojo just...
04:23
gets the results from all of these different types of tools. And it's a one-way export from the tool, import into DefectDojo. So you're not even really integrating the tool unless you use our API connectors or some of these things. But in the open source, as well as the Pro, we are able to do this for all of these different tools. And this is the slide I wanna get to. So we're gonna be talking about these parsers. There's a one-to-one relationship.
04:53
and we say 200 plus supported security tools; well, some of these tools may have multiple exports or different types of exports, and so we also have parsers that can handle those outputs. We're talking lots and lots of tools here, and this is not full coverage by any stretch of the imagination, but there is at minimum a one-to-one relationship between a parser and each of these tools.
05:23
Some of these may have extra outputs or they may report on this part and then do a different report for that part. So we may have multiple parsers for these. But this is what we're talking about is these parsers right here. Now, this is the cool thing about DefectDojo is these parsers have been created by the community as well as our DefectDojo engineers. And so...
05:48
Also, these tools will change their format. And because you have so many people using DefectDojo, usually they will detect that pretty quickly, and we or the community can fix the parsers. We even now have a universal parser inside of DefectDojo to help adapt for those kinds of changes, because they do have an impact. We'll talk a little bit more about the vendors, especially the ones that are intentionally breaking
06:18
their output, or maybe obfuscating certain data fields from their output, because they don't want you to use it in other tools. Personally, I think that's a bad thing for security. So this is what we're going to talk about. Now, one reason I was trying to do this with AI is that building a parser is kind of core to understanding DefectDojo. These parsers are a big part of what has made its adoption so easy,
06:46
because again, I don't have to change my SDLC to pull things into DefectDojo. All of these tools are not aware, again, unless you're using the API connectors or whatever, that DefectDojo even exists. So I don't really have to integrate it; I just have to get the outputs into DefectDojo, right? Another reason that parsers are so important is that they can be very detailed.
07:14
Our finding objects, when we parse this data, it goes into a finding in our database, right? And so a finding object has over 50 fields in it, right? There's all these different data elements and metadata. And then the parser, depending on the tool, will take all the fields that it can from the output of the upstream scanning tool and try to map all those fields into one of those 50 plus finding fields in DefectDojo.
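To make that field mapping concrete, here's a minimal sketch of the kind of work a parser does. The column names, the sample CSV, and the mapping below are all hypothetical, and a real DefectDojo parser builds Finding model objects rather than plain dicts; this is just the shape of the idea.

```python
import csv
import io

# Hypothetical scanner CSV export; the column names and CVEs are illustrative only.
SAMPLE = """Title,Risk,CVE,Details
Outdated OpenSSL,High,CVE-2023-0001,Bundled library is past end of life
Weak TLS cipher,Medium,CVE-2023-0002,Server accepts deprecated ciphers
"""

# Map scanner columns onto a small subset of DefectDojo's Finding fields.
FIELD_MAP = {
    "Title": "title",
    "Risk": "severity",
    "CVE": "vulnerability_ids",
    "Details": "description",
}

def get_findings(csv_text):
    """Parse each CSV row into a dict shaped like a Finding."""
    findings = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        findings.append({dojo: row[col] for col, dojo in FIELD_MAP.items()})
    return findings

for finding in get_findings(SAMPLE):
    print(finding["title"], "->", finding["severity"])
```

A real parser does this for every field it can recover, which is why the level of detail varies so much from parser to parser.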
07:44
And so there is really a lot of detail in the parsers. You can do it specifically for each tool; every tool has different formats and different labels and all of this kind of thing. So it's the way to normalize and standardize, getting as much as we can from those upstream tools. Building these parsers can sometimes be a bit of an effort, both for the community and for
08:14
DefectDojo. So I wanted to save us some time, wanted to see if I could at least get us 80% there, maybe 90% there. Maybe it wouldn't be perfect, but maybe it would save us a bunch of time by using AI to automatically generate this. So I saw it as a way to use AI for something real, other than my daughter's book report, and
08:38
it would also give me an opportunity to not only contribute to the DefectDojo community, but also become a member of that community, have that experience as a community member, and hopefully share and learn some new things. So let's talk about how you would even approach trying to build a parser in DefectDojo. First and foremost, everything that I'm using is 100% open source. I developed this on a MacBook Pro using Docker, GitHub,
09:08
CLI commands, et cetera, et cetera. I have a Claude Pro personal subscription, which up until last night was $20 a month, but they just offered a little sale: if you subscribe for a year, it drops down to about $15 a month. I pay for this myself. Part of this for me was to try to do this independently, so DefectDojo is not reimbursing me for this. They've actually given me a lot of time to spend to work through this.
09:37
Thank you to my employer. I didn't really use a lot of ChatGPT or DeepSeek, mainly because Claude has some capabilities that make it a lot easier to do this. I did notice one thing when I was testing ChatGPT and DeepSeek, and this is for anybody who's using those: try entering the same prompt in both at the same time and see what kind of results you get, because I was getting
10:06
crazy similar responses from both. The exact same prompt in each, and it was just crazy how close they were. Not quite identical, but pretty close, like they were eerily trained on the same info. Just a little side note there.
10:29
Rules. Why would I need rules for this? Well, I wanted to have some boundaries on exactly what I was trying to do, to keep my focus and not get too far into the weeds. And first and foremost, if your company doesn't have any kind of AI or LLM policies, then just know you should never be uploading anything proprietary into an LLM. So no proprietary source code or anything like that.
10:57
But DefectDojo is an open source project: the code, the documentation, all of these things are on the internet, open to the community. So all of that is fair game; you can take those and put them into an LLM. I didn't get any special help, and I didn't want any special privileges; I really wanted to experience this as a community member. There is no human-generated code.
11:26
I used the prompts, I cut and pasted. Yes, I did some debugging and even some editing for some of the lint checks, extra lines or spaces and stuff like that; I can edit that kind of thing directly. It was really important that I was trying to make this repeatable. I wanted to be able to automate this and do it again and again, maybe even just being able to drop some example outputs from a scanner and get the AI to just build
11:56
the parsers from scratch. And of course, we're trying to create something of value, not just save time. Real quick on some parser specifics. It's not just one file; there are multiple files that have to be changed to introduce a new parser. As I said, there are over 200 currently,
12:21
and they're all open source, so that's very powerful. That means no matter which version of Dojo you're using, you can use those parsers. This bullet right here: varying data mapping detail. Sometimes a community member makes a parser and maybe needs to have that done quickly. So depending on the output file, let's say they run a scanner and it creates a CSV or JSON that has 40 fields in it,
12:52
but they only need 10 or 15. Those may be the only 10 or 15 that were put into the parser. So there is a variety of detail in the parsers. The potential for detail is there: you can get as much out of those output files as possible, but there is varying detail. There is also varying detail in the documentation, as you will see. And as I began to look into this even deeper, I found
13:18
varying ways of doing the parsing for the exact same kinds of fields, because if you give 10 different developers the same requirements, you're going to get 10 different applications. So there are a lot of different approaches to this parsing function. And this line right here is an opportunity for improvement using AI tools, as you'll see in a moment. Now, the reason that I have the unit test files here in green:
13:47
that is usually what is provided by a vendor or, you know, maybe a DefectDojo Pro user. They have a scanner, it produces JSON or XML or CSVs, and they'll send those to support: can you help us build a parser for this so we can get the details? That's what we do for DefectDojo Pro customers. So they will provide this. Usually it's one file that has a whole bunch of findings in it.
14:15
And then what we do is edit that file: we create a file that has zero findings, one that has one finding, and one that has many findings. Or if it's got too many, we'll shave it down to five or ten or something like that. That's where the unit tests come in. The unit tests run to double-check that the parser is parsing correctly, and they use these files to do that check. Well, that's going to be one of our inputs, so we don't have to create those.
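That zero/one/many convention is easy to automate. Here's a hedged sketch that derives the three unit-test variants from one big sample CSV; the output file names and the row-count cutoff are my own choices for illustration, not DefectDojo's exact conventions.

```python
def split_unit_test_samples(csv_text, many=5):
    """Derive zero-, one-, and many-finding test files from one sample CSV.

    Returns a dict of file name -> file contents (as strings); the header
    row is kept in every variant so each file still parses.
    """
    lines = csv_text.strip().splitlines()
    header, rows = lines[0], lines[1:]
    return {
        "zero_vuln.csv": header + "\n",
        "one_vuln.csv": "\n".join([header] + rows[:1]) + "\n",
        "many_vulns.csv": "\n".join([header] + rows[:many]) + "\n",
    }
```

You would then write each entry out into the parser's folder under the unit-test scans directory.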
14:43
I've also got some green check marks down here for the settings. The settings.dist file is really more of a cut and paste, so you don't necessarily need AI for this. It helps if you have a little bit of an understanding of how the hash stuff works, but you can almost get away with just a copy and paste on the hash settings, as well as the documentation. Now, on my first attempt, my plan was to do the documentation last.
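For context, the hash settings mentioned above live in DefectDojo's settings.dist.py and map each scan type to the fields used to compute a finding's deduplication hash. The entry below is an illustrative sketch: the scan-type string and field list for a new parser are assumptions, not the real values from the repository.

```python
# Sketch of the per-scanner dedupe configuration in settings.dist.py.
# Each scan type lists the Finding fields that are combined into the
# finding's hash_code, which DefectDojo uses to deduplicate imports.
HASHCODE_FIELDS_PER_SCANNER = {
    # ... 200+ existing entries for other scan types ...
    "RapidFire Scan": ["title", "severity", "vulnerability_ids"],  # hypothetical entry
}
```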
15:09
And then I learned: oh no, you should do the documentation first and use that as an input. So let's get into that. My first approach: obviously in an LLM like Claude, you can upload Markdown, HTML, text, JSON, Python; it can read all of those formats. So for my first prompts, I was basically creating documents or taking documents from the documentation
15:39
that related to the architecture, the build process, the finding class. I actually did a JSON API call to get a finding so I could see what it looks like. This is our target: all of these data fields I'm going to map into. And then other parsers, because we do have over 200 of them, and I thought I could use some of those as examples to provide some guidelines, right?
16:08
And many of those exist, as well as examples of the files that the parsers parse. I'll show you that in a moment. So the plan was: build the parser; build, test, refactor, repeat until I had something working; then generate the unit tests; and then generate the documentation. My first stop was that I didn't know exactly which parsers
16:33
supported which types of files. You can go into the documentation and figure that out one by one, but I needed to just make a list. So I used Claude to give me some scripts and commands to figure that out from this directory here. In our GitHub, under django-DefectDojo, there's a folder called unittests. And in that, I gotta move my little thing here.
17:01
Yeah, unittests, scans. And in there, I was about to scroll past my screenshot there, there is a folder for every single parser, and inside every one of these folders is either a JSON, CSV, or XML, and a few other kinds of formats. Go back, there we go. So you can see we have different categories.
17:26
So I just ran some scripts to figure out which ones support CSVs, because those are what I'm going to use as my examples. So I would have the parser code for each one of these that does CSV, because that was going to be my first plan of attack. So I loaded those CSV files. I loaded the parser documentation and parser examples for three or four different parsers, because I thought those would be good examples.
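The script I asked Claude for amounted to something like the sketch below: walk the unit-test scans directory and list every parser folder that contains at least one sample with a given extension. The directory layout matches the repo, but the function itself is my reconstruction, not the generated script.

```python
from pathlib import Path

def parsers_with_extension(scans_dir, ext=".csv"):
    """Return parser folder names under unittests/scans that contain
    at least one sample file with the given extension."""
    root = Path(scans_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and any(f.suffix == ext for f in child.iterdir())
    )

# Example (against a checkout of django-DefectDojo):
# parsers_with_extension("unittests/scans", ".csv")
```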
17:54
When it did build the parser, and it certainly did, it took all the documentation I had uploaded, and after a couple of changes I could successfully build it. And when I would go into DefectDojo, RapidFire was now one of the parsers. But it was missing some data fields; it wasn't quite working. So it kind of went back and forth: this isn't working, I'm getting this error, et cetera, and the AI would
18:23
feed me back changes, et cetera. Unit tests weren't working. And then, as the chat got longer and longer, and I could probably even show you this, but I'm gonna skip it to keep us on time, the LLM began repeating errors. For example, I'm developing on a Mac, and it would give me instructions for a different operating system. I'm going, wait, I told you I'm running on a Mac. We
18:49
got into some arguments, even. So it began to hallucinate a little bit. It was getting off track, the chat session was getting super long, and it just got into this grind where it was not working. The lessons I learned here: well, prompt order is super critical, but so is organizing it. One of the things that I did was load the example output first; that is,
19:16
I want to build a parser for RapidFire, so I loaded the RapidFire output first, and then I loaded a bunch of examples. And that's the wrong order. I should have loaded all my examples first and then said, now I want you to build me something new. Because later in this chat session, with all this back and forth, it started getting confused as to which parser we were trying to build versus the examples. And it just, again, got to a point where I was like, okay, this just isn't going to work.
19:45
And I kind of gave up. Yeah, the troubleshooting, the chats got too long, got too complex. It started spinning, and I could tell it wasn't going anywhere. So I took some time away from it, mainly because I'd just kind of given up. But then Claude came out with Projects. Part of the problem was that I was having to load all of this into a chat to have a conversation, and
20:15
all that documentation, if I had to start again, I had to go put all of the documents into a folder, upload them all again, and restart the chat. And then I started typing all of my prompts into a document, because I knew I wanted to make this repeatable, and to do that, I want to cut and paste from some kind of document that has all of these prompts in it, ordered, right? So by doing a project,
20:43
I think this is where we can go. I keep having to move this little Zoom window. Let's go into all of our projects. You can see here the parser generator tool. A project allows you to load all of these files over here. Some of these I've updated to prepare for this, but you can see the branching model, how to develop a DefectDojo parser. This comes straight from the DefectDojo documentation.
21:12
And what I did to adapt some of this documentation, it's Markdown, but you can see up here I added a little title block: a description, draft, weight. I added that to some of them to help Claude keep these things organized as it builds its little index of all these files, right? Otherwise, down below here, this is straight from the DefectDojo documentation page. No other edits, copy, paste,
21:42
just upload that right into there. At the very bottom, I'm gonna scroll down so you can see all these CSV files. These are the inputs. And let's see if we can, yeah. So we see here a couple of things with the project. You can set project instructions, and you can see the prompt that I had started here.
22:05
You are an expert assistant; we're developing a parser generator tool, et cetera. So I'm giving some overall project instructions that will be used every time. And then I was loading a whole bunch of files. Some of those files, hopefully you remember that list of all these CSVs; well, I created a little script to give me those. So this is the part I wanted you to see. We've got a CSV input that hits the parser.
22:34
The parser then parses that data and puts it into a finding object. And then I would use our API to pull the results of that input and how we had parsed it. For all the parsers that support CSVs, I was able to push findings using the test file. So I would actually use the parser, push that in via the API,
23:04
and then pull out the results in the JSON file that DefectDojo creates, right? So now I have three files, and I would copy and rename those three files, because in every one of these, the parser is named parser.py, so I would need to change the names of all of those. And what you're seeing over here, you can see, like, Twistlock: I've got the parser code, I've got a CSV input, that is what comes from Twistlock,
23:34
and then I would have the JSON output from DefectDojo. So I have the actual input, and I have the parser code. Claude can't run the code, but it sees the code, and by having the actual output, it can say: here's how the data was parsed, here's the data that existed in the input, and this is what the parser was supposed to be doing. So it doesn't have to run the code, but it can see exactly what it did. And you can see
24:01
I created that for every single one of those CSV parsers, kind of tried to automate that because that's what I wanted to put into right here. And so that's what you're seeing. Those are the CSV inputs. You're seeing some of the markdown. So that's my unit test. You see the finding class, the build environment, my local Mac environment, the BlackDuck finding. So here we have the BlackDuck parser.
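The API round trip just described, import a sample through the parser and then pull the parsed findings back out as JSON, uses DefectDojo's REST API. This sketch shows the pull side; the base URL, token, and test id are placeholders, and you should check your instance's API docs for the exact query parameters.

```python
import json
import urllib.request

BASE = "http://localhost:8080/api/v2"   # placeholder: your DefectDojo URL
TOKEN = "your-api-token"                # placeholder: your API token

def findings_request(test_id):
    """Build the authenticated GET for all findings attached to one test."""
    return urllib.request.Request(
        f"{BASE}/findings/?test={test_id}",
        headers={"Authorization": f"Token {TOKEN}"},
    )

# Against a running instance you would then do something like:
# with urllib.request.urlopen(findings_request(42)) as resp:
#     findings = json.load(resp)["results"]   # the JSON fed back to Claude
```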
24:30
So that's the actual parser code, and this is the JSON output from DefectDojo. And then what's this? Black Duck mapping. This was part of the breakthrough: using the AI to build things to then feed back into the AI. Let me explain. This parser mapping is a big part of this. Again, I was trying to create this mapping of the input, the parser,
24:59
and the output. And that's what you see right here for TrustWave. CSV, there's the Python parser, and there's the output. There is the prompt that I was using. So what I wanted it to do was take those three and give me a mapping, a way to say, well, here's all the fields in the input, here's all the fields that it was mapped to and any special conditions or anything like this.
25:22
And that gave me this mapping file you see right here, the Black Duck mapping MD file. It would give me the mapping: here's all the fields. Notice on this one, and I know this isn't, did I make a, let's see. Let's go to Black Duck. Here, we'll get one. Oh, here's even better, because I just did this. Data mapping, oh, there we go. It just wasn't scrolling.
25:52
So, TrustWave Finding Mapping.
25:59
Okay, this is the exact one you just saw on the screen. So I uploaded those three, gave it the prompt, and it came up with this. I also told it: don't give me Markdown, give me HTML so I can view it and see what it looks like. You'll notice with the mapping, I also had to do some things; sometimes you're talking to a three-year-old when you're talking to an LLM. For example, this line right here:
26:29
you can see this took me two edits to get to work. Here is the first prompt; see how short it is? I'm even saying here, to be clear, I want you to re-import, et cetera, identify, et cetera. Well, that one didn't quite work, so I had to update it and make it longer and more precise: show me every field, show me every detail, count the fields,
26:59
verify that you have entered all those fields, et cetera, and it gave me this mapping. So: total fields in the input, the mapping, what was mapped, what wasn't mapped, a summary, unmapped fields, additional finding field settings, right? And notice this right here, this parser line number. This came as I was realizing how well this was working out:
27:26
I can identify the exact line in the parser that is performing the parsing for a particular data field. This gives me a way to index every parser that is parsing the exact same field. So if I have severity, severity should probably be parsed the same way in all parsers. A standard way of doing it that's easy, repeatable, et cetera. Description.
27:52
IP address, all of these components; every one of these parsers has one of these. So if I have a way to compare the actual code, I could maybe improve or identify certain parsers where, say, the CSV had 30 total fields and the parser only maps 13. Well, what about those other 17 fields, right? So it also gives me a way to see how detailed these parsers are.
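That line-number indexing idea can be sketched in a few lines of Python: scan a parser's source for assignments to a given Finding field. The regex here is a crude heuristic of mine for illustration, not how the real analysis was done.

```python
import re

def field_assignment_lines(parser_source, field="severity"):
    """Return 1-based line numbers where the parser assigns `field`.

    The negative lookahead skips comparisons like `severity == 'Info'`
    so only assignments are counted.
    """
    pattern = re.compile(rf"\b{re.escape(field)}\s*=(?!=)")
    return [
        lineno
        for lineno, line in enumerate(parser_source.splitlines(), start=1)
        if pattern.search(line)
    ]
```

Run that across every parser.py and you get an index of where, and how, each parser handles severity, description, and so on, which is what makes cross-parser comparison possible.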
28:23
Why am I taking us on this little quest here? So there's the mapping, a little bit better and easier to see there. But here's what it made me realize when I went to our documentation. This was the documentation. And one of the beautiful things about having an open source tool is that anytime you have 400 contributors,
28:52
you have 400 cooks in the kitchen, and DefectDojo is the soup. When you have 400 chefs adding their own spice, well, that's why we have DefectDojo Pro, to clean up that soup a little bit. There are varying degrees of detail. Some folks will build a parser just for what they needed to do, without really thinking: maybe this needs to work for everybody, and maybe it should parse more fields than what I need. And this documentation is a great example.
29:22
The template that's provided doesn't require a lot of detail. So in this case, this TrustWave parser, this is the documentation for it. Well, since I had this, I created an improved documentation format that includes all the fields and gives us more detail: exactly which fields are mapped, which ones are not, how they're mapped, mapping details. There's all this kind of data, right? And that became my first community
29:52
pull request. And that, I believe, got approved and will be updated soon. So now I have a way to go through all of our documentation and update it to give us more of that kind of detail, to help folks troubleshoot parsers if they're not seeing things right. So that was phenomenal. And that was also what enabled me to go faster on this parser building project.
30:20
After that, I did try to do a parser for an output from Fluid Attacks, and not to call out Fluid Attacks here; we love Fluid Attacks. But they were obfuscating some of their data. Their output files don't always contain the name of the vulnerability, and that makes it really difficult to figure out what it is. So this is a case where,
30:48
in the name of security, and this goes for all vendors out there: you should be creating output files that contain all the information someone would need, and not hide certain things so that people are forced to use your tool. I know you're in it to make money, but we're in it for security, and we would like to see all vendors support exporting all of their data without hiding or obfuscating it, so that you don't have to go back to these unique IDs within
31:17
your specific tool. So I worked a whole bunch on this one, and then got to the point where I realized: oh, those fields aren't even in there. I should have checked them first. So that was that little quest. And then finally, the big boss battle: working to get this RapidFire parser working. So let's now review a little more detail in Claude. Let's go back up to our parser generator
31:47
project, and you can see here there's going to be a number of "verifying RapidFire parser" chats. I'll make this short, because I don't think you really want to see everything I was trying to do to develop this parser. This is a little bit older. And again, yes, it even gave me the git commands and things like that. So we're not going to go through all of my prompts; however,
32:15
you can kind of see, I do want to scroll up, and I should have done screenshots for this presentation, I apologize for that. But let's just talk about over here: we're at formatting the unit test. That means that it had worked up above here.
32:32
So keep scrolling, and you can see how long this chat is, right? So, the updated parser. When I would get issues: "I have confirmed," you know, that's what it said, "yes, proceed." I would continually get the parser back; again, there are like three or four of those files that it would need to update. And what I would do is copy and paste the output from that, and paste that into my, oh man,
33:02
I lost my, oh, I can't share that because I'm sharing Chrome, apologies. But I would edit this, rebuild it, test it to make sure everything worked, et cetera, until I finally got a working parser. Then, something I didn't realize: when I did a pull request for this parser, there's a whole bunch of checks that get run, lint checks and things like that. Let me take you there.
33:30
In my commits, I think this is rather funny, because I was trying to time this. I could do a pull request specifically for this webinar, but you can see here, in the last hour, I was frantically trying to fix some lint errors, lots of lint errors. I had 65 errors, fed those back into the AI, went down to 16, but that didn't solve many, and then I
33:56
jumped back up to 58, right? Third time's a charm, and now I've got even more issues. So I'm continually trying to fix all of those, and I still may not have all of them fixed at this point. But that's okay, I'll keep going. I actually think we did reach the point where
34:20
Let's go to.
34:23
a pull request on django-DefectDojo. So yeah, I've still got a few issues in the checks, but the majority of checks do appear to be passing. I've just got some things I still need to check to actually get the pull request in, et cetera. Oh no, the linting actually did work well there; I think we did actually fix that. So here is the result.
34:53
So if I go to all my engagements, this is the open source DefectDojo. I'm gonna delete this last test.
35:06
And let's do a quick import from RapidFire. I can't tell you how excited I was the first time I searched on the dropdown and saw the RapidFire scan up here. That was a really exciting moment.
35:26
And this is coming from, I don't know if you can actually see this, but I'm picking the "many vulns" file. So this is the same unit test file that we took from our example. Let's import that. Yay. That was another really exciting feeling, when it successfully processed and showed me some findings. And we see some findings down here. I'll click on this one and highlight some different things. You can see here
35:56
we're importing all of these different fields. You can see this description; this is being parsed and formatted. I struggled a lot with this area right here, this impact. These bullets: there are carriage returns within these different pieces of data, so I would get a bullet, and then the CVE piece would be below it instead of inline. So, lots of really fine-
36:25
tuning on just the formatting. Also, I was trying to make sure all of the links worked and took us to the appropriate place, all the links that are provided. So all of these security advisories, for example, making sure that these all take you to the right place, right? Because that's what we wanna do with Dojo: give you one place to get all of this information. So that is a pretty successful looking import.
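The formatting struggle described above boils down to normalizing embedded carriage returns into proper Markdown bullets. Here's a sketch of one way to handle it; this is my own illustration, not the actual RapidFire parser code.

```python
def format_impact(raw_impact):
    """Turn a field with embedded carriage returns into a Markdown list,
    so each item (e.g. a CVE reference) renders as its own bullet."""
    items = [
        part.strip()
        for part in raw_impact.replace("\r\n", "\n").split("\n")
        if part.strip()
    ]
    return "\n".join(f"* {item}" for item in items)
```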
36:52
The only thing I didn't like about the import was that I was only getting two findings. So I went back to Claude, took that CSV file, and said: hey, why don't you make me a larger file that has very realistic data, but from different CVEs. Give me a bigger file with more information in it. So,
37:17
I think I can find it. Let's import some scan results for RapidFire again. This time, this is going to be an AI-generated file.
37:30
And did I save it in a place where I can easily get it? I thought that I did. Pardon me, I know this part is fun.
37:44
rapid fire and.
37:50
Uh, here it is right here. No, that's the output. So as I was testing this, another tip and trick: every time I would do an import, if I was having issues, I would go back to that same API call and pull out the resulting findings JSON. For example, when I was having those formatting issues, I really struggled with getting Claude to understand exactly what I was talking about and exactly what I was seeing.
38:19
So by looking at the JSON, I could take the little snippet that was that impact field and share just that portion, and you could see the carriage returns and things like that. Those are the input samples. Oh my gosh, I should have put this in an easier place to remember. Let's see if I kept it in the scans directory.
38:45
There it is: AI generated RapidFire CSV. Open that one. And I know I tested this one.
38:56
10 findings. And as I scroll down here, we see some criticals. I love the fact that I was even able to exercise the tagging there, with that ransomware tag. So let's just pick one of these. Again, this is an AI-generated vulnerability. There's the date discovered, the age, all of these things, the vulnerability ID. Did that work? That looks right. As I scroll down, I even have a little
39:26
warning icon and associated CVEs. The formatting looks good. There's the NVD link, the security advisory, and the references. It really looks really, really good. So again, these were additional tests I was doing to get this to work. And that is the result of that parser working. So,
39:55
in summary: the whole point of this was to find a way to use AI to generate something real that would work within DefectDojo. And I think the results are that I can, one, improve documentation for all of our parsers, because I can now actually do some mapping and create some pretty good documentation. I hope to create a pull request for an updated format
40:23
for the parser documentation that includes a lot more detail. That will be very helpful to, I think, users and the community as well. So I can update the documentation. I can update the format for the documentation. I can also now do analysis of every parser because I can identify within the parser code the line numbers responsible for a particular element or piece of metadata. And now I can compare that between parsers.
40:51
I can also do some analysis to identify which parsers are not getting all the data from the upstream tool. So that's another way I can improve parsers going forward.