Semgrep
Semgrep is a fast, open-source static analysis tool that performs static application security testing (SAST), software composition analysis (SCA), and secrets scanning across 30+ programming languages. It identifies security vulnerabilities, code quality issues, and insecure dependencies using pattern-based rules that look like the source code they match, rather than complex regular expressions or abstract syntax tree queries. The platform integrates into IDEs, CI/CD pipelines, and developer workflows, and pairs AI-powered analysis with dataflow reachability analysis to reduce false positives by up to 98%. It also provides automated remediation guidance, customizable security rules, and enforcement of secure coding standards throughout the software development lifecycle.
Semgrep Integration with DefectDojo
If your team runs Semgrep in CI, you're already generating structured SAST output on every push or pull request. The question is what you do with it. Feeding those results into DefectDojo gives you something raw Semgrep output doesn't: a persistent, deduplicated record of findings across every repo and every scan — with the ability to track remediation status, assign findings to engineers, enforce SLA compliance, and report on code security posture over time. Without a platform like DefectDojo behind it, Semgrep findings live and die with the pipeline run that produced them.
Why Semgrep Matters
Semgrep is a fast, rule-based static analysis engine that runs across a wide range of languages and frameworks. Its value in a security program comes from a combination of low noise, high customizability, and tight CI integration.
- Semgrep's rule language is readable and auditable — security teams can write and maintain custom rules without being compiler engineers, and findings map directly to patterns in the codebase rather than abstract heuristics
- The Semgrep Registry provides a large library of community and Semgrep-maintained rules covering OWASP Top 10 patterns, framework-specific vulnerabilities, and secrets detection — giving teams immediate coverage without starting from scratch
- Findings include file path, line number, matched code snippet, and rule metadata, making triage fast: developers can locate and understand the issue without leaving their editor
- Semgrep supports output in JSON and SARIF formats, both of which DefectDojo can consume
- Semgrep Pro and Semgrep Code extend analysis to cross-file and cross-function dataflow, surfacing taint-based vulnerabilities (SQL injection, XSS via user-controlled data) that single-file pattern matching would miss
- It runs in seconds to minutes on most codebases, making it practical to run on every PR rather than as an infrequent batch scan
Advantages of This Integration
Semgrep tells you what's in the code right now. DefectDojo tells you what's been in the code, what's been fixed, and what your team has decided to accept or defer.
- Deduplication across repos and branches: The same finding type appearing in multiple repositories doesn't create unrelated noise — DefectDojo's deduplication links related findings across products and surfaces them as a pattern worth addressing at the framework or template level.
- Remediation tracking that survives pipeline runs: A finding first seen six weeks ago doesn't reset its age when the repo gets a new scan. DefectDojo retains discovery date, last seen date, and days open — giving you accurate aging data for prioritization and reporting.
- SLA enforcement by severity: Semgrep rule severity (ERROR, WARNING, INFO) maps to DefectDojo severity on import. You can configure SLA policies so that High severity SAST findings have a 30-day remediation target, with breach tracking and alerting when deadlines pass.
- False positive handling without re-opening: Findings marked as false positives in DefectDojo are retained across reimports. If Semgrep reports the same pattern on the next scan, DefectDojo recognizes the existing record and doesn't re-open it — eliminating the need to suppress findings in the scanner itself.
- Cross-repo security posture reporting: Security teams managing 20, 50, or 200 repositories get a single view of open SAST findings across the entire portfolio, filterable by severity, rule, language, and age. That view doesn't exist in Semgrep's native output or CI logs.
- Developer-facing ticket workflow: Triaged findings can be pushed directly to Jira or GitHub Issues with the matched code snippet, rule description, and remediation guidance included — giving developers everything they need to act without requiring access to a security tool.
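The aging and SLA points above come down to simple date arithmetic. The following is a minimal Python sketch of that arithmetic, not DefectDojo's implementation: only the 30-day High target comes from the text above, while the other targets and both function names are hypothetical.

```python
from datetime import date

# Hypothetical SLA targets in days per severity. Only the 30-day High
# target comes from the text above; the other values are illustrative.
SLA_DAYS = {"Critical": 7, "High": 30, "Medium": 90, "Low": 120}


def days_open(discovered: date, today: date) -> int:
    """Age of a finding, counted from first discovery rather than the latest scan."""
    return (today - discovered).days


def sla_breached(severity: str, discovered: date, today: date) -> bool:
    """True when the finding has been open longer than its severity's target."""
    target = SLA_DAYS.get(severity)
    if target is None:
        return False  # no SLA configured for this severity
    return days_open(discovered, today) > target


first_seen = date(2024, 1, 1)
print(days_open(first_seen, date(2024, 2, 12)))             # 42 (six weeks)
print(sla_breached("High", first_seen, date(2024, 2, 12)))  # True
```

Because the age is computed from the discovery date, rescanning the repository does not reset it.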
How This Integration Works
Semgrep produces JSON or SARIF output that DefectDojo's parsers consume directly. For most teams, the JSON format (--json) is the simpler path.
Step 1 — Run Semgrep and capture output
# Run with rules auto-selected from the Semgrep Registry for the detected languages and output JSON
semgrep --config auto --json --output semgrep-results.json .
# Run with a specific ruleset
semgrep --config p/owasp-top-ten --json --output semgrep-results.json .
# SARIF output (alternative format, also supported by DefectDojo)
semgrep --config auto --sarif --output semgrep-results.sarif .
In CI, capture the output file as a pipeline artifact before the import step.
Step 2 — Import into DefectDojo
Use Semgrep JSON Report as the scan type for JSON output, or SARIF for SARIF output:
curl -X POST https://<defectdojo-host>/api/v2/import-scan/ \
-H "Authorization: Token <your-api-token>" \
-F "scan_type=Semgrep JSON Report" \
-F "file=@semgrep-results.json" \
-F "engagement=<engagement-id>" \
-F "active=true" \
-F "verified=false"
Step 3 — Reimport for ongoing CI scans
For repositories scanned on every PR or merge to main, use the reimport endpoint to update existing findings rather than accumulate redundant test records:
curl -X POST https://<defectdojo-host>/api/v2/reimport-scan/ \
-H "Authorization: Token <your-api-token>" \
-F "scan_type=Semgrep JSON Report" \
-F "file=@semgrep-results.json" \
-F "test=<test-id>"
Findings no longer present in the latest scan are automatically marked resolved. New findings are created. Findings previously marked as false positives are left untouched.
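In a CI script, the choice between a first import and a reimport can be wrapped in a small helper. The sketch below only builds the URL and form fields shown in Steps 2 and 3 and sends nothing. The function name, the `base_url` argument, and the example host are hypothetical.

```python
from typing import Optional


def build_scan_request(base_url: str, engagement_id: int,
                       test_id: Optional[int] = None) -> tuple[str, dict[str, str]]:
    """Return (url, form_fields) for a first import or a reimport.

    Without a test ID the request targets import-scan/ and attaches to the
    engagement; with one it targets reimport-scan/ so that the existing
    test's findings are updated instead of duplicated.
    """
    fields = {"scan_type": "Semgrep JSON Report"}
    if test_id is None:
        fields.update({"engagement": str(engagement_id),
                       "active": "true", "verified": "false"})
        endpoint = "import-scan"
    else:
        fields["test"] = str(test_id)
        endpoint = "reimport-scan"
    return f"{base_url.rstrip('/')}/api/v2/{endpoint}/", fields


url, fields = build_scan_request("https://dojo.example.com", 12, test_id=7)
print(url)
print(fields)
```

The returned fields correspond one-to-one to the `-F` arguments of the curl commands above. The report file and the `Authorization` header still have to be added by whichever HTTP client sends the request.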
Data Granularity: What Gets Imported
| Field | Source in Semgrep Output | Notes |
|---|---|---|
| Title | check_id (rule ID) | e.g., python.django.security.injection.tainted-sql-string |
| Severity | extra.severity | ERROR → High, WARNING → Medium, INFO → Low |
| CWE ID | extra.metadata.cwe | Populated for rules that include CWE metadata |
| OWASP Category | extra.metadata.owasp | Where rule metadata includes OWASP mapping |
| Description | extra.message | Rule-specific finding message, often includes context |
| File Path | path | Relative path to the affected file |
| Line Number | start.line / end.line | Start and end line of the matched code |
| Matched Code | extra.lines | The actual code snippet that triggered the rule |
| Remediation | extra.metadata.fix | Fix guidance where included in rule metadata |
| References | extra.metadata.references | Links to CVE, CWE, OWASP, or rule documentation |
| Rule Source | extra.metadata.source | URL to the rule definition in Semgrep Registry |
| Confidence | extra.metadata.confidence | HIGH, MEDIUM, LOW where rule provides it |
| Language | Inferred from path extension | Used for filtering in DefectDojo |
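To make the mapping concrete, the sketch below flattens one entry of Semgrep's `results` array into a subset of the fields listed above. It illustrates the table and is not DefectDojo's parser: the `to_finding` name, the output key names, and the sample result are all made up for this example.

```python
# Severity mapping as listed in the table above.
SEVERITY_MAP = {"ERROR": "High", "WARNING": "Medium", "INFO": "Low"}


def to_finding(result: dict) -> dict:
    """Flatten one Semgrep result into table-style fields (subset only)."""
    extra = result.get("extra", {})
    metadata = extra.get("metadata", {})
    raw_severity = extra.get("severity", "")
    return {
        "title": result["check_id"],
        # Severities not in the table are passed through unchanged.
        "severity": SEVERITY_MAP.get(raw_severity, raw_severity),
        "description": extra.get("message", ""),
        "file_path": result["path"],
        "line": result["start"]["line"],
        "end_line": result["end"]["line"],
        "matched_code": extra.get("lines", ""),
        "cwe": metadata.get("cwe"),  # None when the rule has no CWE metadata
    }


# Made-up result for illustration, shaped like one entry of "results".
sample = {
    "check_id": "python.django.security.injection.tainted-sql-string",
    "path": "app/views.py",
    "start": {"line": 42},
    "end": {"line": 43},
    "extra": {
        "severity": "ERROR",
        "message": "Example message",
        "lines": "cursor.execute(query)",
        "metadata": {"cwe": ["CWE-89"]},
    },
}

finding = to_finding(sample)
print(finding["severity"], finding["file_path"], finding["line"])  # High app/views.py 42
```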
Use Cases
In a pull request gate: Semgrep runs on every PR. Results import into a DefectDojo engagement scoped to that repository. Security engineers review new findings in DefectDojo rather than digging through CI logs, triage within the platform, and push validated issues to the developer's PR review queue via Jira or GitHub Issues. Findings that have already been accepted or marked as false positives don't resurface as noise.
Across a monorepo with multiple services: A single Semgrep scan on a monorepo produces findings across dozens of services. DefectDojo's product structure lets you map findings to individual service owners, assign by team, and track remediation per service — without the security team manually partitioning scan output.
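One way to feed per-service products from a single monorepo scan is to bucket the results by service directory before import. The sketch below assumes a hypothetical `services/<name>/...` layout; the function name and the `_shared` catch-all bucket are illustrative, not part of Semgrep or DefectDojo.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_service(results: list[dict], root: str = "services") -> dict[str, list[dict]]:
    """Bucket Semgrep results by the directory directly under `root`.

    Results whose path is not inside a service directory go to "_shared".
    """
    groups = defaultdict(list)
    for result in results:
        parts = PurePosixPath(result["path"]).parts
        if len(parts) > 2 and parts[0] == root:
            groups[parts[1]].append(result)
        else:
            groups["_shared"].append(result)
    return dict(groups)


# Made-up paths for illustration.
results = [
    {"path": "services/billing/api/handlers.py"},
    {"path": "services/auth/login.py"},
    {"path": "scripts/deploy.py"},
]
print(sorted(group_by_service(results)))  # ['_shared', 'auth', 'billing']
```

Each bucket could then be written to its own report file and imported into the product or engagement that belongs to that service's owners.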
For a custom rule rollout: The security team writes Semgrep rules to detect insecure use of an internal cryptography library. The rules are deployed across all repos. DefectDojo aggregates the results, showing which teams have addressed the pattern and which haven't — turning a point-in-time rule deployment into a tracked remediation campaign.
During an audit or compliance review: DefectDojo's engagement history shows when SAST scanning was run, what was found, and how findings were resolved. For frameworks requiring evidence of code-level security testing (SOC 2, PCI DSS), the import history and finding lifecycle records are audit-ready without additional reporting work.
Operational Tips
- Use --config auto with care in large repos: Semgrep's auto mode pulls rules from the Registry based on detected languages. This is convenient but can introduce variability between scans if the Registry is updated. Pinning to specific rulesets (p/owasp-top-ten, p/secrets) gives you more stable, comparable results over time.
- Map one DefectDojo product per repository: A one-to-one mapping between repos and DefectDojo products keeps deduplication clean and makes ownership clear. If you have a monorepo, consider mapping products to top-level service directories instead.
- Set INFO severity findings to inactive on import: Semgrep INFO findings are frequently style or hygiene observations rather than security issues. Importing them as inactive keeps your active queue focused on actionable risk while preserving the data for reference.
- Use tags to track rule categories: DefectDojo supports tagging findings on import. Tagging by Semgrep ruleset (owasp, secrets, custom) lets you filter and report by category — useful for understanding where your codebase's risk is concentrated.
- Build a shared false positive library: When a finding is marked as a false positive in DefectDojo, document the rationale in the finding notes. Over time, this becomes a team reference for common Semgrep false positive patterns in your codebase, reducing redundant triage work across engineers.
- Coordinate suppression between Semgrep and DefectDojo: Semgrep supports inline # nosemgrep suppressions. Use these sparingly and prefer handling false positives in DefectDojo where the decision is documented and auditable, rather than silently suppressed in source code.
- Integrate reimport into your main branch pipeline: PR-level scans catch new findings at introduction. A separate reimport on every merge to main gives you an authoritative, current-state view of each repo's SAST posture in DefectDojo — useful for SLA tracking and reporting that shouldn't depend on branch-level scan history.
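The tip about importing INFO findings as inactive can be sketched as a pre-import split. This assumes that the `active` form field from Step 2 applies to every finding in an upload, so INFO results need a report file of their own. The function name and the sample report are hypothetical.

```python
def split_by_severity(report: dict, inactive=("INFO",)) -> tuple[dict, dict]:
    """Split a Semgrep JSON report into (active, inactive) reports."""
    active, parked = [], []
    for result in report.get("results", []):
        severity = result.get("extra", {}).get("severity", "")
        (parked if severity in inactive else active).append(result)
    # Copy the other top-level keys so each part is still a complete report.
    return {**report, "results": active}, {**report, "results": parked}


# Made-up report for illustration.
report = {
    "version": "x.y.z",
    "results": [
        {"check_id": "rule.a", "extra": {"severity": "ERROR"}},
        {"check_id": "rule.b", "extra": {"severity": "INFO"}},
        {"check_id": "rule.c", "extra": {"severity": "WARNING"}},
    ],
}
active_report, info_report = split_by_severity(report)
print(len(active_report["results"]), len(info_report["results"]))  # 2 1
```

The first report would be uploaded with `active=true` and the second with `active=false`, each to its own DefectDojo test so that later reimports stay consistent.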