§ Trackr.Live

Evaluate-STIG

Evaluate-STIG exists because SCAP never covered the whole catalog. The SCAP Compliance Checker (SCC, the NIWC Atlantic tool) only scans STIGs that have a published SCAP benchmark, and for years that has been a minority of the STIGs DISA actually publishes. Everything else gets hand-checked in STIG Viewer, control by control, by an engineer clicking through a checklist on a Friday. Evaluate-STIG is a PowerShell script that automates a large chunk of that manual work: point it at a host, it auto-detects which STIGs apply, runs the checks it has logic for, and writes out CKL or CKLB checklists you can import straight into STIG Manager. That gap-filling is the entire reason it spread across DoD.

It is not a vulnerability scanner. Keep that straight, because the file format fools people. Evaluate-STIG’s job is to produce the STIG checklist artifact itself, the thing an assessor opens to see whether SRG-OS-000033 is Open or Not a Finding on this box. Closer to a documentation engine that happens to do real checking than to a scanner in the RA-5 sense. More on why that matters below.

Who actually makes it

Evaluate-STIG is developed at NSWC Crane (Naval Surface Warfare Center, Crane Division) under NAVSEA. You will see it credited loosely to “the Navy” or “NSWC,” which is true but blurry. Crane is the specific shop, and it matters because Evaluate-STIG gets conflated with STIG Manager, which comes from a different Navy organization entirely (NUWCDIVNPT, the Naval Undersea Warfare Center division in Newport). Two tools, two orgs, one pipeline. Evaluate-STIG produces the checklists; STIG Manager collects and tracks them. Don’t credit one team for the other’s work in your SSP references.

The tool is a PowerShell script set, not an installed product. You unpack it and run it. No MSI, no agent, no service. Part of why it caught on: an ISSE can drop it on a jump box, scan, and pull checklists without a software-approval cycle for an installer.

Versioning runs in a 1.24xx / 1.25xx shape. The build that shows up most in public references is the 1.2410.x line from late 2024, but the exact current 2026 build lives behind CAC on the NAVSEA repo and I cannot confirm the precise number from here. Treat any version you see quoted as a floor, not gospel, and check the releases page on the repo before you cite one in an artifact. The tool’s STIG content updates on its own cadence, separate from the DISA quarterly STIG release, which is where a lot of the day-to-day friction comes from.

The “open source” trap

Here is the correction worth making loudly, because the usual “open source” framing for this tool is wrong in a way that bites people. Evaluate-STIG is government-developed and free to DoD, but it is not open source in the public-download sense. Distribution is CAC-gated:

  • NIPR: the NAVSEA GitLab at spork.navsea.navy.mil/nswc-crane-division/evaluate-stig (releases under /-/releases)
  • SIPR and the classified side: through Intelink (the NAVSEA-RMF site)
  • Commercial internet only with a CAC and the right access

You cannot git clone it from a public GitHub URL the way you pull OpenSCAP or Trivy. Contrast that with STIG Manager, which genuinely is public on GitHub under NUWCDIVNPT and open source (MIT for the codebase, GPLv3 for the ExtJS-based client); anyone can grab it. Evaluate-STIG is the opposite posture: free as in no cost to DoD, gated as in you need a CAC and a reason to be there. If you are a vendor or a contractor trying to test against it in a non-DoD lab, that gating is a real wall, and “just download it” is not the answer someone outside the fence wants to hear.

On approvals, it is well-papered for a government script. It is carried in Navy DADMS, the Army has it in eMASS with an Assess-Only ATO, and NAVSEA reporting puts it in use across NAVAIR and NMCI. There is no FedRAMP listing and there never will be (it is not a cloud service), so don’t go looking for one.

Where it sits in the RMF flow

You run Evaluate-STIG to satisfy STIG assessment requirements under the RMF and DoDI 8510.01. The control it feeds most directly is CM-6, configuration settings, because a STIG is a configuration baseline and the checklist is your evidence that the baseline is enforced. STIG compliance also leans on CM-7 (least functionality) for the disable-the-unused-service findings.

The cadence is ConMon-driven. System admins and ISSEs run it on whatever scan rhythm the system’s continuous-monitoring plan demands, commonly monthly or quarterly, and the output flows the same way every time: Evaluate-STIG writes CKLB, those import into STIG Manager, STIG Manager rolls up the assessment, and the result eventually lands in eMASS as part of the ATO package. SCC handles the benchmarked STIGs in parallel; Evaluate-STIG handles the rest. In most shops the two run side by side and the checklists get merged in STIG Manager.
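Before a batch of checklists goes into STIG Manager, it can help to see what each file actually says. The sketch below tallies rule statuses per STIG from a CKLB file. It assumes the CKLB layout is JSON with a top-level "stigs" array whose entries carry a "stig_name" and a "rules" array with a "status" field on each rule (values such as "open", "not_a_finding", "not_applicable", "not_reviewed"). Verify those field names against a real file before relying on them. The sample data and STIG names are made up.

```python
import json
from collections import Counter

# Statuses that still need a human; assumed CKLB vocabulary.
UNRESOLVED = ("open", "not_reviewed")


def summarize_cklb(cklb_text: str) -> dict[str, Counter]:
    """Tally rule statuses per STIG from CKLB JSON text (assumed layout)."""
    doc = json.loads(cklb_text)
    summary: dict[str, Counter] = {}
    for stig in doc.get("stigs", []):
        name = stig.get("stig_name", "unknown")
        # A rule with no status is treated as not reviewed.
        summary[name] = Counter(
            rule.get("status", "not_reviewed") for rule in stig.get("rules", [])
        )
    return summary


def unresolved(summary: dict[str, Counter]) -> dict[str, int]:
    """Count open plus not-reviewed rules for each STIG."""
    return {name: sum(c[s] for s in UNRESOLVED) for name, c in summary.items()}


# Hypothetical sample; real files carry many more fields per rule.
SAMPLE = json.dumps({
    "title": "example-host",
    "stigs": [
        {"stig_name": "Example STIG A",
         "rules": [{"status": "open"},
                   {"status": "not_a_finding"},
                   {"status": "not_a_finding"}]},
        {"stig_name": "Example STIG B",
         "rules": [{"status": "not_reviewed"},
                   {"status": "not_applicable"}]},
    ],
})

if __name__ == "__main__":
    report = summarize_cklb(SAMPLE)
    for name, count in unresolved(report).items():
        print(f"{name}: {count} unresolved of {sum(report[name].values())}")
```

For a real file, read it with `Path(...).read_text()` and pass the text in. Anything this reports as unresolved is still somebody's manual work in STIG Manager.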

| | Evaluate-STIG | SCC (SCAP Compliance Checker) |
|---|---|---|
| STIG coverage breadth | Wide; covers non-benchmarked STIGs (81 vs SCC’s 26 per a 3/16/2023 reference; treat as dated) | Only STIGs with a published SCAP benchmark |
| SCAP benchmark support | Not benchmark-driven; uses its own check logic | Native; SCAP is the whole point |
| Output | CKL and CKLB checklists plus summary reports | XCCDF/ARF results, CKL export |
| Install footprint | PowerShell script, no install, Bash variant for Linux | Installed application |
| Distribution / access | CAC-gated (NAVSEA GitLab, Intelink) | CAC-gated via NIWC Atlantic / cyber.mil |
| Where it feeds | STIG Manager, then eMASS | STIG Manager / eMASS |

The 81-vs-26 number is the most-cited stat for this tool and it is genuinely the whole pitch, but it is a March 2023 snapshot. Both tools have moved since. SCC is on the 5.14 line as of early 2026, with content tracking the current DISA quarter. Don’t state a hard current count for Evaluate-STIG unless you pull it from the repo; the gap is still large, but the exact figure is stale the moment you write it.

What it does well, and the gripes

The strengths are real. Auto-detection of applicable STIGs saves the worst part of the manual workflow, which is figuring out which checklists even apply to a given host. No install means no approval friction. CKLB output drops straight into STIG Manager. And the answer-key mechanism is the feature that earns its keep: you can supply a keyed file that pre-answers findings which are policy-driven or site-specific (a finding that’s mitigated by a network control, say, or accepted by the AO) so the tool stamps them automatically instead of forcing a human to re-justify the same Open finding every single scan.

That same answer-key feature is also where the audit risk lives.

Deeper: answer keys are the power and the liability.
An answer key tells Evaluate-STIG to mark a finding as Not a Finding, or as Not Applicable with a comment, without re-checking the box every run. Used well, it encodes real, AO-accepted risk decisions and stops your ISSEs from re-typing the same justification monthly. Used badly, it becomes a quiet rubber stamp: a finding gets keyed to “compliant” once, the underlying config drifts, and the checklist keeps reporting green because the key, not the system, is answering. An assessor who knows the tool will ask to see the answer keys and will spot-check a keyed finding against the live config. If you own the system, treat the answer key as a controlled artifact with its own review, because a stale key is how an ATO package accumulates findings nobody has actually validated in a year. This is also the cleanest argument that Evaluate-STIG is a documentation tool wearing scanner clothes.
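One way to treat the answer key as a controlled artifact is to keep a review date beside every keyed finding and flag the ones nobody has re-validated lately. The sketch below does not read Evaluate-STIG's answer file format. It works on a hypothetical side inventory, and the `KeyedFinding` record, its field names, the rule IDs, and the 90-day threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class KeyedFinding:
    """Hypothetical inventory row for one answer-keyed finding."""
    rule_id: str          # illustrative ID, not a real STIG rule
    keyed_status: str     # what the key stamps on the checklist
    last_validated: date  # last time a human checked the live config


def stale_keys(findings, today, max_age_days=90):
    """Return keyed findings last validated more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [f for f in findings if f.last_validated < cutoff]


INVENTORY = [
    KeyedFinding("V-000001", "not_a_finding", date(2026, 2, 1)),
    KeyedFinding("V-000002", "not_applicable", date(2025, 6, 1)),
]

if __name__ == "__main__":
    for finding in stale_keys(INVENTORY, today=date(2026, 3, 1)):
        print(f"{finding.rule_id}: keyed {finding.keyed_status}, "
              f"last validated {finding.last_validated.isoformat()}")
```

The threshold should come from your continuous-monitoring plan, not from this example. The point is only that a keyed answer carries a date, and the date gets checked.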

The complaints are equally real. It is a script you run with privilege, which puts it squarely in SR (supply chain) territory: verify provenance, pull it from the actual NAVSEA repo, check the signing, don’t grab a copy someone emailed you. Version drift is a recurring headache, because the tool’s STIG content and DISA’s quarterly STIG releases march to different drums, and a host can show clean against last quarter’s content while the release you are being assessed against has moved on. PowerShell execution policy trips up first-time runs (Set-ExecutionPolicy or a bypass on the invocation, depending on how locked-down the box is). Remote scanning of multiple hosts means WinRM on Windows targets or SSH for the Bash variant, and that plumbing is the usual setup tax. And the CAC gating, again, makes it awkward to get anywhere near a non-DoD environment.
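A simple provenance habit that complements checking the signing is to compare the SHA-256 of the archive you hold against a known-good value obtained out of band. This is a hash comparison, not signature verification. Whether the repo publishes a hash for a given release is an assumption to confirm, and the file name below is a placeholder.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True when the file's SHA-256 equals the known-good hex digest."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)


if __name__ == "__main__":
    # Stand-in for a downloaded release archive (placeholder name and bytes).
    with tempfile.TemporaryDirectory() as workdir:
        archive = Path(workdir) / "release-archive.zip"
        archive.write_bytes(b"placeholder bytes, not a real release")
        known_good = hashlib.sha256(archive.read_bytes()).hexdigest()
        print("match" if matches_published(archive, known_good) else "MISMATCH")
```

A match only proves the copy equals the one the hash was taken from, so the known-good value has to come from a channel you trust more than the copy itself.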

Operational notes

Deployment comes in a few shapes. Local single-host run is the simplest. Remote scan against a list of hosts is the common production mode, over WinRM. For Linux targets there’s Evaluate-STIG_Bash.sh, the Bash variant, which covers the RHEL/Ubuntu STIGs PowerShell wouldn’t naturally reach. A 200-host monthly scan and a large-fleet run behave differently: the small scan you can babysit, the fleet run you schedule and triage from the STIG Manager rollup, and on SIPR the whole thing moves slower because you are working through Intelink and a more locked-down host posture.

Integration is CKLB into STIG Manager, then eMASS. One pitfall worth flagging: there have been reported quirks importing multiple CKLs from Evaluate-STIG into STIG Manager, including a case where only the last checklist in a batch got processed (an issue tracked on the STIG Manager GitHub). Check that your full set imported, not just the last file. And keep matching the tool’s content version to the STIG release you’re being assessed against; the mismatch is the single most common reason a checklist looks done but isn’t.
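The "did the whole batch import" check is a set difference: what the scan produced minus what the collector says it received. The sketch below assumes you can get a list of imported checklist file names out of STIG Manager by some means (an export, a report, or notes taken during import). That step is not shown, and the helper names and file names are hypothetical.

```python
from pathlib import Path


def produced_checklists(export_dir):
    """List checklist file names (.ckl and .cklb) in a scan output folder."""
    folder = Path(export_dir)
    return sorted(
        p.name for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in (".ckl", ".cklb")
    )


def missing_imports(produced, imported):
    """Return produced checklist names that never showed up as imported."""
    return sorted(set(produced) - set(imported))


if __name__ == "__main__":
    # Hypothetical batch where only the last file was processed.
    produced = ["host1_stig_a.cklb", "host1_stig_b.cklb", "host1_stig_c.cklb"]
    imported = ["host1_stig_c.cklb"]
    for name in missing_imports(produced, imported):
        print(f"not imported: {name}")
```

Run it after every batch import. An empty result means the names line up; it says nothing about whether the content inside each checklist matches the STIG release you are being assessed against.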

Where does it land in the control catalog? CM-6 is the primary tie, with CM-7 along for the ride. SR applies because you are executing a privileged government script and provenance is your problem. RA-5 is the one to be careful with: people want to file Evaluate-STIG next to ACAS because both produce CKLs, but RA-5 is vulnerability identification and this is configuration compliance. Adjacent, not the same. I’d keep it out of the RA-5 bucket and let ACAS/Nessus own that lane. SA-11 doesn’t really apply; this isn’t a developer-testing tool.

If you take one thing from this: in a lot of DoD shops Evaluate-STIG has quietly become the default STIG tool, and SCC survives mostly for the SCAP benchmarks it still owns. That’s a defensible read of where the two tools sit in 2026, and not everyone will agree with it, but the coverage math has been pointing that direction since 2023.
