Transcript
00:07
Thank you, Chris. Happy Tuesday, everyone. Like Chris mentioned, if you're new to these sessions, my name is Greg Anderson. I'm the creator and now CEO at DefectDojo. Normally these sessions have two parts: one on the commercial side of Dojo, and one on what we're doing on the open source side of the house. Today we have a third section, on AI testing tools, which I think is...
00:35
a fairly hot topic in our space and in our community. So today we have three sections rather than two, but to get things going, this is our agenda for today. We'll talk about some new integrations that are now in Pro, as requested by customers. We'll talk about AI testing tools, how we're thinking about AI testing, and the tools we plan on supporting.
01:03
There's also an open forum there if you'd like to share what tools you're using or how you want DefectDojo to think about AI security testing tools. And then finally, we'll share a community update, which is primarily positioned around V3. You're welcome to ask questions at any time; we also save a question section for the end, depending on your preference as an audience member.
01:31
But first, starting with new integrations: I think today, security professionals have to ship data to more places than ever before. When we first created DefectDojo, we started with Jira. That integration has become extremely robust, with everything from custom templates to bi-directional syncing of issues, and it's all available
01:58
in open source for people to leverage. But as more and more tools have come into this area, the demand for others has increased. So today in Pro, you can do the same things with GitHub, GitLab, ServiceNow, and Azure DevOps. There is still a bug we discovered in the ServiceNow one, so I think that one has been temporarily turned off. But if you're a Pro customer,
02:27
you can now push findings, after they've been through DefectDojo's enrichment and distillation processes, to any of these downstream integrations. This is what it looks like in the new UI: there's a new Integrations tab under Settings, and as the video walkthrough I'll show you in a second demonstrates, it first lets you connect to the tools, then lets you define the severity mappings, and then
02:57
it lets you choose how you want the integration to handle issues. Do you just want to push new things? Do you want to update old things? It also lets you tie the integration to what is a product today in DefectDojo, or to an engagement. We look at this as the next generation of flexibility with integrations: the fact that you can attach these things to more than just a product
03:26
for integration. So with that said, let's check it out. The first step is just providing credentials that the platform will then use to make the connection happen. We also always allow for a label, so if you're using multiple instances, it's easy to distinguish which one you're working with.
03:53
Once the credentials are added, the next step is defining how you want things to map: how you map, say, Dojo severities to severities in the new issue tracker being supported. Once that's completed, the next step is to define how you want the interaction to work and what you want it to attach to,
04:21
everything from just publishing new things to continuously updating the findings as they're modified.
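To make those three steps concrete, here's a rough, purely hypothetical sketch of the pieces the wizard captures. None of these field names are Pro's actual schema; they just illustrate credentials plus a label, a severity map, and the push behavior and scope:

```python
# Purely illustrative: the field names below are hypothetical, not
# Pro's actual schema. The wizard captures three things: credentials
# plus a label, a severity mapping, and the push behavior and scope.
integration = {
    "label": "ADO - payments team",  # distinguishes multiple instances
    "credentials": {
        "url": "https://dev.azure.com/example",  # hypothetical org URL
        "token": "...",                          # PAT or API token
    },
    "severity_map": {  # Dojo severity -> issue tracker severity
        "Critical": "1 - Critical",
        "High": "2 - High",
        "Medium": "3 - Medium",
        "Low": "4 - Low",
        "Informational": "4 - Low",
    },
    "behavior": {
        "push_new_findings": True,  # publish new things
        "sync_updates": True,       # keep issues updated as findings change
    },
    "scope": {"product": "payments-api"},  # or an engagement instead
}
```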
04:33
Once that's completed, the new integration is set up. And for the findings you've pushed, based on the criteria you specified, you'll see a new little icon on the right that hyperlinks to the issue.
04:56
When you click it, it opens the relevant issue it's been linked to in Azure Boards, GitHub, GitLab, ServiceNow, et cetera.
05:09
Okay, so that is the big update on the Pro side of the house for this month. Another area we're keen on at DefectDojo is AI testing, and I want to say upfront that I'm pretty fatigued with hearing about AI. Maybe you all are as well; this is how I feel about AI at this point. But with that said, I'm not an AI skeptic.
05:38
I'm just tired of hearing about it. I think the reality for security people is that AI is here to stay in some capacity. Some people will argue about how successful the outcomes of AI are from a revenue perspective when we talk about business at large, but I think we can all agree that in some form or capacity, AI is here to stay. And thus, as security professionals, we have to start to think about:
06:07
well, how do we secure these technologies? This is a brand new area for security in terms of tooling. When we talk about testing a web app in a classical setting, those things are pretty well defined at this point: you have your set of DAST tools that you really like, you have your set of SAST tools that are an industry standard. But this is a totally new frontier that we're trying to figure out,
06:36
both for our community and for ourselves as we add optional AI features to the Dojo platform. So we have posted a request for community feedback on our GitHub, if you'd like to share and participate. The types of things we're looking for are listed there, but at a high level, we're looking to understand what tools you like, what tools are working well for you, and what their shortcomings are, to both
07:05
support these tools in the future and help people create a comprehensive view of doing AI security testing successfully. When we think about AI security testing today, we've largely broken it down into four categories. There are a million ways to slice this, in all honesty, in terms of how granular you want to get, but to start with, this is how we're thinking about it at a high level.
07:35
First, we'll talk about input manipulation: how we're defining it, and the tools we've identified that we think do it best today. And when I say the best, it's not necessarily the tool that is the most comprehensive, but the tool that provides the highest ROI: it's the easiest to get started with, has the best coverage, and has the best results. That's typically how we're defining the best. Also,
08:02
on this iteration, we're specifically looking for things that are open source or free. We're not trying to feature commercial tools right now, because we don't want there to be any barrier to entry, and the space is evolving so rapidly that we want to keep it low-investment for people who are interested. At a high level, starting with input manipulation: I think this is the category most people think of
08:32
when we talk about vulnerabilities specific to LLMs that are used directly by end users. That's things like prompt injections, or adversarial attacks designed to make the AI misclassify data; anything that requires input manipulation. Some people like to break these out further; there's another really popular diagram, I believe with eight
09:00
different categories of AI security testing, but we wanted to lump them together by vulnerability category, at least to start with, for simplicity. When we look at tools that are doing this well, there's the Adversarial Robustness Toolbox (ART), which is published by the Linux Foundation and IBM. There's Counterfit, which is published by Microsoft.
09:29
And there's TextAttack; I forget who the publisher is. ART is really, really expansive, we found when testing these tools, but a little more difficult to get started with. So the clear standout for us in this space was Counterfit from Microsoft: it's a command-line tool, and just much easier to get started with than the other tools in the space. We also looked at
09:58
Garak from NVIDIA, which is something we're using internally, but it's a little more expansive and not quite as focused. If you're looking to just check this box in security testing, we think Counterfit is probably the best tool in the space, at least today. But it's also such an expansive and growing space that it's hard to keep up, which is also why we're asking for community feedback.
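To give a flavor of what this category of testing looks like in practice, here's a minimal sketch using ART's black-box HopSkipJump evasion attack. The model and dataset are stand-ins; in a real assessment you'd point this at the model under test:

```python
# A minimal evasion sketch with the Adversarial Robustness Toolbox:
# pip install adversarial-robustness-toolbox scikit-learn
# The model and data are stand-ins; point this at the model under test.
from art.attacks.evasion import HopSkipJump
from art.estimators.classification import SklearnClassifier
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a small placeholder model.
x, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(x, y)

# Wrap it for ART and run a black-box, decision-based evasion attack.
classifier = SklearnClassifier(model=model, clip_values=(x.min(), x.max()))
attack = HopSkipJump(classifier=classifier, targeted=False,
                     max_iter=10, max_eval=100, init_eval=10)
x_adv = attack.generate(x=x[:5])

# Compare predictions on clean versus adversarial inputs.
print("clean:      ", model.predict(x[:5]))
print("adversarial:", model.predict(x_adv))
```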
10:27
The next category we're really keen to cover is data-centric vulnerabilities. This is primarily related to the data that actually gets into AI, the data used for training, reinforcement learning, et cetera. I think this is the category with the most sparse tooling: both Great Expectations and Cleanlab are not, I would say, inherently security tools. They're primarily focused on
10:57
validating the data used with LLMs. With that said, Great Expectations, at least in our cursory review, didn't get great reviews from certain members of the data community who are experts in data integrity rather than security. Cleanlab we thought was pretty easy to get started with, and we had decent success. But I would say, in terms of market evolution, tools available, et cetera,
11:26
this is probably the category we've struggled the most with internally: finding something we really like and that is easy to use, without a really heavy lift, to get something valuable for a security program or just to support this area of AI security in general.
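To give a sense of where Cleanlab fits here, this is a short sketch of its label-issue detection on a stand-in dataset; the flagged indices are candidates for mislabeled or poisoned training data. The model and dataset are placeholders:

```python
# A short data-centric sketch with Cleanlab: pip install cleanlab
# Given out-of-sample predicted probabilities for a labeled dataset,
# flag the examples whose labels look wrong (or poisoned). The model
# and dataset here are stand-ins.
from cleanlab.filter import find_label_issues
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

x, labels = load_iris(return_X_y=True)

# Cleanlab expects out-of-sample probabilities, e.g. via cross-validation.
pred_probs = cross_val_predict(
    LogisticRegression(max_iter=1000), x, labels, cv=5, method="predict_proba"
)

# Indices of the most suspicious labels, worst first.
issues = find_label_issues(
    labels=labels, pred_probs=pred_probs,
    return_indices_ranked_by="self_confidence",
)
print(f"{len(issues)} suspect labels; top 5 indices: {issues[:5]}")
```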
11:51
Next, model architecture vulnerabilities. I think this is one of my favorite categories, although it's not the most popular. Before becoming CEO of DefectDojo, I did a lot of work in security research; I published a vulnerability at DEF CON 22 on compromising CI/CD servers, which was a side-channel attack. So I kind of think of these as the side-channel attacks of AI, and they have
12:20
a place that is near and dear to my heart. This category of vulnerability is primarily about pulling data out of the AI with specialized queries, to either extract data, cause a situation where the service is unavailable, eat up the computing bandwidth, et cetera.
12:45
Similar to the last category, I think these tools are more tangential than direct answers. With that said, of the tools we've identified, Privacy Meter was relatively easy to get started with; it helped us monitor integrity and get some insights into how the data that AIs are being trained on is being handled. TensorFlow
13:14
Privacy we found really difficult to get started with. It's very, very robust, as is kind of everything you see out of TensorFlow, but you have to read an encyclopedia to get started, and I'm looking for tools where I can just read the README and get going. ART also has some of these capabilities, but we didn't particularly like it in this capacity.
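Since ART came up again, one concrete instance of this category is membership inference: can an attacker tell whether a specific record was in the training set? Below is a hedged sketch using ART's black-box membership inference attack; the model, data, and attack-model choice are all placeholders:

```python
# A hedged sketch of membership inference with ART: can an attacker
# tell whether a record was in the training set? Model and data are
# stand-ins ('rf' asks ART for a random forest attack model).
from art.attacks.inference.membership_inference import MembershipInferenceBlackBox
from art.estimators.classification import SklearnClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

x, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.5, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(x_train, y_train)
classifier = SklearnClassifier(model=model)

# Train the attack on half of each split, then infer on the held-out half.
attack = MembershipInferenceBlackBox(classifier, attack_model_type="rf")
n = len(x_train) // 2
attack.fit(x_train[:n], y_train[:n], x_test[:n], y_test[:n])
inferred = attack.infer(x_train[n:], y_train[n:])

# A fraction well above 0.5 suggests the model leaks membership info.
print("fraction of training records flagged as members:", inferred.mean())
```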
13:43
And so while this is my favorite category, I don't think it's ever going to get the limelight like input injection will, but I'm sure we'll see some really interesting stuff come out of this space and this category. And then finally, just classic supply chain, not a particularly interesting category. I don't think there is a ton of difference between AI supply chain attacks and
14:10
just classical dependency supply chain attacks. So this category isn't particularly exciting; the tools are well known and already in use, et cetera. But it's still something worth noting when we talk about comprehensive security programs designed to protect AI. So, to highlight again: if you do want to share feedback in this space with the Dojo community,
14:38
please go share in the GitHub discussion, or also on Slack, but we're trying to keep everything central to that discussion so we can figure out how best to support these tools, what the best ones are, et cetera. And that brings us to the community updates for this month. If you've attended the last couple of sessions,
15:05
most of this you'll already have heard in some flavor. If you're new to office hours, I always like to provide the context, so if this is your first time ever attending, you can understand what's going on in the community and what we're talking about. So there's a little bit recycled from last month, but the big focus when we talk about delivering something for the community from DefectDojo is
15:34
V3 of the platform. V3 is essentially about addressing some of the issues the community has highlighted to us, or places where the platform can be improved, that are more foundational than we could otherwise have tackled without where Dojo is today as a commercial platform. And the big change from last month
16:03
is that we've started to actually trot out the settings for V3. If this isn't in the release as of yesterday, it'll be out within the next two weeks. But if you look at your open source settings in Dojo, we've started to include some of the Booleans that will let you enable V3 functionality. The reason we're doing it this way is
16:31
to make dev easier for all the people working on V3, both our employees and key community contributors; it's just the easiest way to do development, flipping these Booleans on and off. But we expect that when V3 is actually released, these Booleans won't be available. They're exclusively for transitioning and testing in a way that doesn't interrupt people who are just using V2 of the platform, but also lets us develop with ease.
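For the curious, here's roughly what that pattern looks like. The flag name below is hypothetical, not one of the actual V3 settings, and this assumes the django-environ style, DD_-prefixed settings Dojo already uses:

```python
# Hypothetical illustration only: DD_ENABLE_V3_EXAMPLE is a made-up
# flag name, not an actual Dojo setting. It shows the general pattern
# of a transitional Boolean driven by an environment variable.
import environ  # django-environ

env = environ.Env(
    DD_ENABLE_V3_EXAMPLE=(bool, False),  # default off: V2 users see no change
)

DD_ENABLE_V3_EXAMPLE = env("DD_ENABLE_V3_EXAMPLE")

# Elsewhere, the code path under development branches on the flag:
if DD_ENABLE_V3_EXAMPLE:
    ...  # new V3 behavior
else:
    ...  # existing V2 behavior
```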
17:00
The reason we expect the toggles won't be controllable in actual V3 is just to give a unified experience that is easier to support: we either want you all the way over on V3 once it's released, or staying on V2. Partially, this is for community support reasons as well. DefectDojo, the organization, has
17:30
resources that we pay, experts who sit in our OWASP Slack channel, just to help people with open source Dojo. It would create a lot of complexity to have to understand every single Boolean that was enabled, or whether there's a bad combination we didn't test for. We just didn't want to get into that business, but we wanted to be clear and transparent about why these things exist in the interim, so it doesn't feel like
17:58
we're taking something away. This was just the easiest way to develop this without causing issues for people who aren't developers and are just using V2. Part of the change we're looking to achieve, and we touched on this in the last session, so if you were here for that you'll just have to hear it again, unfortunately, is to rebrand some of these assets and create
18:25
more complicated hierarchies in Pro, given how microservices have developed. The feedback we've received is that people want their data to be more granular. Some of the things you'll see shift are product type to organization, product to asset, and endpoint to location. You'll see some extra derivatives of these in Pro, but we think this will also help
18:54
open source with regard to nesting things, defining things, and having a hierarchy that works better with hyper-modern tech stacks and organizations. Most of this we've already touched on, but that's why we're changing these models and what we expect. That's not all V3 is about, though. The settings being trotted out are things we're
19:21
pretty certain we will deliver for the community. Among the other things we're debating: we know we're going to do some amount of UI polish in open source. It won't be as great as what's in the Pro UI, which is reactive, hyper-scalable, et cetera, designed to handle a million findings, although we actually test up to 22 million in a customer setting. But the other thing
19:51
we're discussing internally is whether or not to open source connectors. If you're familiar with connectors in Pro, they're a way to pull data automatically from a tool's API for those who aren't using CI/CD for integration. On one hand, commercial is what makes open source possible. On the other hand, we see a lot of benefit to potentially open sourcing this and asking for contributions and help to develop these integrations with tools that you want to bring into DefectDojo in a more seamless way.
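Pro's connector internals aren't public, so purely as an illustration of the pull-and-import pattern, here's a sketch that fetches a report from a hypothetical scanner API and pushes it through open source Dojo's standard /api/v2/import-scan/ endpoint. The scanner URL, engagement ID, and report format are all made up:

```python
# Purely illustrative of the pull-and-import pattern: the scanner URL,
# report format, and engagement id are hypothetical. The import-scan
# endpoint is open source Dojo's standard API, not Pro's connector code.
import requests

DOJO_URL = "https://dojo.example.com"        # hypothetical Dojo instance
DOJO_TOKEN = "..."                           # a DefectDojo API v2 token
SCANNER_URL = "https://scanner.example.com"  # hypothetical scanner API

# 1. Pull the latest report from the scanner (made-up endpoint).
report = requests.get(f"{SCANNER_URL}/api/reports/latest", timeout=30)
report.raise_for_status()

# 2. Import it into Dojo using a parser Dojo already supports.
resp = requests.post(
    f"{DOJO_URL}/api/v2/import-scan/",
    headers={"Authorization": f"Token {DOJO_TOKEN}"},
    data={
        "scan_type": "Trivy Scan",  # any supported parser name
        "engagement": 1,            # hypothetical engagement id
        "active": "true",
        "verified": "false",
    },
    files={"file": ("report.json", report.content)},
    timeout=60,
)
resp.raise_for_status()
print("Import response:", resp.json())
```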
20:21
And then finally, we've been thinking a lot about parity, because one of the commitments we've made at DefectDojo is to not make open source worse. Dojo is a massive platform; there are so many features we have to think about porting to V3 that small
20:51
groups in the community are potentially using, or have stopped using altogether. We're very determined to do what we've said, above all else. But we're also trying to figure out whether we really have to port everything to V3, because that could change the delivery timeline from, say, six months to a year, depending on whether we truly have to do all those things for
21:21
the sake of one hundred percent compatibility. And then I think the final thing we're kicking around internally is how we get people from V2 to V3. When we say a migration, we mean a migration in the sense of Dojo, not in the sense of tech. If you've done any sort of dev with Dojo, you'll know there's this thing called a migration that's designed to move database schema changes automatically, and that's how we're thinking about migration in this context. It's not as if you'll have to set up an entirely new server and move all the data; that's not what we're expecting.
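For anyone who hasn't touched Dojo dev: a Django migration is a small Python file that the framework applies automatically with ./manage.py migrate. Here's a minimal, hypothetical example of the kind of rename discussed above; the app label, dependency, and model names are illustrative, not the actual V3 migrations:

```python
# Hypothetical illustration of a Django migration; the app label,
# dependency, and model names are made up, not Dojo's actual V3 changes.
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ("dojo", "0200_example_previous"),  # hypothetical predecessor
    ]

    operations = [
        # e.g. the kind of rename discussed above (Product -> Asset)
        migrations.RenameModel(old_name="Product", new_name="Asset"),
    ]
```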
21:52
We are expecting, though, to likely move where we're hosting images. We've used Docker Hub for a really long time, but I think it was last year that Docker Hub started charging us a significant amount of money for hosting, and we just don't see
22:21
a large advantage to keeping our images there, other than we get to pay Docker money. So V3 seems like a good time to host those images elsewhere with minimal disruption, if we're going to do that. But these are all things that are just up for consideration. If you do have thoughts, please feel free to share them with us in the OWASP Slack or on GitHub, wherever you feel comfortable giving us that feedback.
22:50
So with regard to timeline, I think we will see
22:55
the ability to do an early cutover for open source users towards the end of the year. I think that will still likely be labeled with some sort of pre-release tag, whether it's alpha, beta, et cetera; I don't think it'll be full GA by the end of the year, and I expect we'll still be supporting and updating V2. But our goal is to have it in a complete state by the end of the year.
23:23
When we talk about these Booleans being added to get to V3 features, I expect all of those will be done at that point, and you'll get to see what a complete V3 experience looks like on the open source side of the house. The other thing, in full transparency, that is competing for our dev time and bandwidth is a big announcement at AppSec USA that is in the works, and the feature behind that announcement is our highest priority.
23:52
It's in the AI space, so that probably makes it pretty obvious, but we want to have it fully polished for AppSec USA, so it's the other thing competing with our dev resources for pushing V3 even farther along. And with that, that concludes our presentation. I truly appreciate everyone taking the time to attend. I know as security professionals, everyone's very, very busy.
24:22
Your time is very valuable, and we truly appreciate it. The one other thing I'll mention: we got really clear feedback that people want to evaluate Pro on their own terms and on their own timeline. So in addition to offering a trial for anyone interested in Pro, I think it's a three-month trial, we also now have a Pro public demo that's available on our GitHub.
24:51
If you don't want to talk to a salesperson and just want to check out Pro on your own timeline, we wanted to make that available to people as well. But with that, I'm happy to answer any questions. Thank you, Chris.