Image-Mode RHEL Breaks Your File Integrity Baseline, and Nobody Told the SOC

By AutoCypher · 06 Jun 2026

Red Hat shipped image mode for RHEL (the bootc / ostree-based variant) as a supported path in 2024, and by now it is showing up in production fleets and, more to the point, in front of authorizing officials who are used to package-mode hosts. If your environment is moving any workload onto image-mode RHEL 9 or the RHEL 10 builds, the thing nobody puts in the migration deck is this: your existing host integrity story — the auditd watches, the AIDE database, the EDR file-monitoring rules — was written against a writable /usr. Image mode makes /usr read-only and immutable, ships the OS as a container image, and shoves all the mutable state into /etc, /var, and a couple of overlay paths. The controls don’t break loudly. They just quietly stop covering the surface they used to cover, and the gaps land exactly where persistence and drift now live.

That is the whole problem in one paragraph. The rest is mechanism and instrumentation.

What actually changed under the hood

In package mode, rpm writes files all over the tree — /usr/bin, /usr/lib, /usr/sbin, config under /etc, the works — and your baseline assumes any unexpected write to /usr is suspicious. That assumption was load-bearing for a lot of SI-7 implementations.

Image mode flips it. The OS is built as an OCI container image and deployed transactionally. /usr is mounted read-only off an ostree commit. You can’t rpm -i into a running system the way you used to; you change the image, build a new one, and reboot into it. bootc upgrade stages the new version of the image the host already tracks, bootc switch retargets the host to a different image reference, and either one takes effect at the next reboot. The previous deployment stays on disk, so the host can be rolled back to it (bootc rollback) if the new one misbehaves. Composefs backs the whole thing with a content-addressed store that can be protected with fs-verity, so where that is enabled the immutability is enforced at the filesystem layer rather than by a permission bit you can flip back.

Which sounds great for integrity — and it genuinely is, for /usr. The catch is that mutable state didn’t disappear. It moved, and concentrated:

/etc is writable and persistent, seeded from the image’s defaults. Local edits persist across upgrades and are three-way merged against the new image’s /etc. This is where most config drift now lives.
/var is fully writable and persistent, untouched by image upgrades. Anything that wants to survive a reboot and stay out of the image’s way goes here.
Systemd units, timers, and the usual persistence real estate are reachable through both, plus whatever a workload bakes into a layered image.

So the integrity surface you actually need to watch shrank in one place and got sharper in two others. If your monitoring still treats /usr writes as the headline event, you’re watching a door that’s been welded shut while the windows on the other side of the building stay open.

Where the detections need to move

Start with what image mode gives you for free, because it’s the one piece that’s strictly better. The state of the deployed image is queryable. bootc status --format json returns the booted image reference, digest, and whether a staged or rolled-back deployment exists. That digest is your real baseline now — not a file hash database you rebuild nightly and pray nobody poisoned. If the booted digest doesn’t match what your build pipeline says it should be, that’s a CM-2 / CM-6 finding with no ambiguity. Pipe bootc status output into your config-management telemetry on a timer and alert on digest mismatch against the expected tag for that host’s role. Cheap, high-signal, and it maps cleanly to CM-3 change control because the only legitimate way to change that digest is through the pipeline.
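A minimal sketch of that digest check, in Python. The JSON field names used here (status.booted.image.imageDigest, status.staged, status.rollback) are my assumption about the layout bootc emits; confirm them against the output of your bootc version before relying on this. The expected digest would come from your build pipeline, and how you look it up per host role is left out.

```python
import json
import subprocess
from typing import Optional


def booted_digest(status: dict) -> Optional[str]:
    """Return the digest of the booted image, or None if it is absent."""
    booted = (status.get("status") or {}).get("booted") or {}
    image = booted.get("image") or {}
    # Assumed field name; verify against your bootc version.
    return image.get("imageDigest")


def check_host(status: dict, expected_digest: str) -> dict:
    """Compare the booted digest against the digest the pipeline expects."""
    actual = booted_digest(status)
    state = status.get("status") or {}
    return {
        "expected": expected_digest,
        "actual": actual,
        "match": actual == expected_digest,
        "staged_present": state.get("staged") is not None,
        "rollback_present": state.get("rollback") is not None,
    }


def read_bootc_status() -> dict:
    """Run bootc on the host and parse its JSON output (typically needs root)."""
    out = subprocess.run(
        ["bootc", "status", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

On a host, check_host(read_bootc_status(), expected) gives you a small record to ship to telemetry from a timer; alert when match is false, and treat an unexpected staged deployment as worth a look too.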

The writable paths are where the work is. You want auditd watches on /etc and the persistence-relevant slices of /var, and you want them scoped, because a blanket watch on /var will bury your SOC.

A /etc write watch is the obvious one and it is noisier than people expect on first deploy. Every bootc upgrade reboot does the three-way /etc merge, cloud-init rewrites network and SSH config on first boot, and anything using systemctl enable touches /etc/systemd/system. In a fleet of a few hundred image-mode hosts, expect the raw /etc watch to generate thousands of events a day before tuning, the bulk of it boot-time merge churn and config-management runs. The first tuning pass is almost entirely about excluding your own automation: the ansible/puppet service account UID, the cloud-init process lineage, and the systemd-tmpfiles activity that fires on every boot. What’s left after that — interactive edits to /etc outside a maintenance window, by a UID that isn’t your config-management identity — is the signal you actually wanted.
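The tuning logic is simple enough to write down as a predicate. This is a sketch under assumptions, not a drop-in rule: the auid, the executable paths, and the maintenance window below are all hypothetical placeholders, and the event shape is whatever your pipeline produces after the audit records have been correlated.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class EtcWrite:
    host: str
    auid: int        # login UID from the SYSCALL record
    exe: str         # executable path from the SYSCALL record
    path: str        # file path from the PATH record
    when: datetime


# Hypothetical values; substitute the ones from your environment.
AUTOMATION_AUIDS = {990}                      # config-management service account
AUTOMATION_EXES = {"/usr/bin/cloud-init",     # boot-time automation to exclude
                   "/usr/bin/systemd-tmpfiles"}
MAINTENANCE_WINDOW = (time(2, 0), time(4, 0))  # local time, start inclusive


def in_window(ts: datetime, window=MAINTENANCE_WINDOW) -> bool:
    """True if the event falls inside the daily maintenance window."""
    start, end = window
    return start <= ts.time() < end


def is_signal(ev: EtcWrite) -> bool:
    """Keep only /etc writes by non-automation identities outside the window."""
    if not ev.path.startswith("/etc/"):
        return False
    if ev.auid in AUTOMATION_AUIDS or ev.exe in AUTOMATION_EXES:
        return False
    return not in_window(ev.when)
```

One caveat on the exe exclusion: for tools implemented as scripts, the audit exe field may record the interpreter rather than the script, so check what your records actually contain before excluding on it.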

In Splunk, working off sourcetype=linux:audit, the shape is roughly the following (ansible_svc_uid is a placeholder for the numeric auid of your config-management account):

sourcetype=linux:audit (type=SYSCALL OR type=PATH)
| transaction host maxspan=2s startswith="type=SYSCALL"
| search name="/etc/*" auid!="ansible_svc_uid" exe!="/usr/bin/cloud-init"
| stats count by host auid exe name

That transaction stitch is necessary because the PATH record and the SYSCALL record with the auid and exe arrive as separate events, and the field you care about for attribution (auid, the login UID, not uid) lives on the SYSCALL record. Get this wrong and you’ll alert on the effective UID of a setuid binary instead of the human who invoked it. The Elastic equivalent works but the auditd module’s field mapping is messier — auditd.data.name versus file.path depending on which ingest pipeline version you’re on, and the correlation between syscall and path is something you end up doing in the detection rule rather than getting for free.
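Outside the SIEM, the same correlation can be done on the audit event ID, the timestamp:serial pair inside msg=audit(...), which is shared by every record of one event. A rough Python sketch, assuming raw audit.log lines as input; it does not decode hex-encoded name values, which auditd uses for paths containing special characters.

```python
import re
from collections import defaultdict

# Record type and the timestamp:serial event ID shared by related records.
HEADER = re.compile(r"type=(\w+) msg=audit\(([\d.]+:\d+)\):")
# key=value pairs; values are either double-quoted or a bare token.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def stitch(lines):
    """Group SYSCALL and PATH records by audit event ID.

    Returns {event_id: {"auid": ..., "exe": ..., "paths": [...]}} with
    auid and exe taken from the SYSCALL record and paths from PATH records.
    """
    events = defaultdict(lambda: {"auid": None, "exe": None, "paths": []})
    for line in lines:
        m = HEADER.search(line)
        if not m:
            continue
        rtype, event_id = m.groups()
        fields = {k: v.strip('"') for k, v in FIELD.findall(line[m.end():])}
        ev = events[event_id]
        if rtype == "SYSCALL":
            ev["auid"] = fields.get("auid")   # login UID, not uid/euid
            ev["exe"] = fields.get("exe")
        elif rtype == "PATH" and "name" in fields:
            ev["paths"].append(fields["name"])
    return dict(events)
```

Keying on the event ID rather than on a time window means two unrelated writes that land in the same second on the same host don’t get merged into one event.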

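One real gotcha: on image-mode hosts the auditd rules themselves usually ship in the image, typically dropped into /etc/audit/rules.d at build time. Good, because the audit policy is then version-controlled with the image. But /etc is writable and persistent, so a locally edited rules file survives reboots and upgrades and takes precedence over the image’s copy in the three-way merge. Put /etc/audit itself under watch, and consider locking the loaded rules with -e 2 so they can’t be changed until the next reboot. It also means a durable rules change goes through an image rebuild and reboot; anything you hot-patch at 0200 during an incident is local drift you have to fold back into the image afterward, and with -e 2 set you can’t hot-patch at all. Plan the rule set before you cut the image.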

The persistence question everyone gets to eventually

The interesting attacker question on these hosts is: where do you hide something that survives a bootc upgrade? Because the upgrade swaps /usr wholesale, anything dropped into the read-only tree is gone after the next image roll. That pushes persistence toward /var, /etc/systemd/system, user crontabs, and — the one that catches teams out — layered local modifications that get folded into a rebuilt image if the build process is sloppy about what it copies in.

That last one is the supply-chain-shaped risk (SR-3, SR-4, SR-11) and it’s the one I’d watch hardest, because it’s the path that turns a one-time compromise into a durable one. If your Containerfile for the layered image does a broad COPY from a build context that an attacker can influence, or pulls a base layer whose digest you don’t pin, the immutability you’re relying on is now guaranteeing the persistence of whatever got baked in. Pin base image digests, not tags. Verify the signature on the base (Red Hat signs theirs; check it in the pipeline, don’t assume). And treat the image build host as a tier-0 asset, because in image mode it is the only place root filesystem content originates. The host’s read-only /usr is exactly as trustworthy as your build pipeline and not one bit more.
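The digest-pinning rule is easy to enforce mechanically in the pipeline. Here is a small Python sketch of a lint that flags FROM lines not pinned by sha256 digest. It is deliberately naive: it does not expand ARG variables or handle line continuations, and the registry name in the usage example is hypothetical.

```python
import re

# FROM [--platform=...] <ref> [AS <stage>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE,
)
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_bases(containerfile_text):
    """Return (line_number, reference) for every FROM not pinned by digest.

    References to earlier build stages and to 'scratch' are skipped.
    """
    stages = set()
    unpinned = []
    for lineno, line in enumerate(containerfile_text.splitlines(), start=1):
        m = FROM_RE.match(line)
        if not m:
            continue
        ref, alias = m.group(1), m.group(2)
        is_stage = ref in stages or ref.lower() == "scratch"
        if not is_stage and not DIGEST_RE.search(ref):
            unpinned.append((lineno, ref))
        if alias:
            stages.add(alias)
    return unpinned
```

For example, unpinned_bases("FROM registry.example.com/rhel-bootc:latest\n") reports line 1 as unpinned. Failing the build on a non-empty result covers the pinning half; signature verification is a separate step this does not attempt.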

For the runtime side, the systemd unit creation watch is your highest-value persistence detection here. New or modified files in /etc/systemd/system and /etc/systemd/user, correlated against your change windows, plus journald events for unit installs. On the journald side, if you’re forwarding via rsyslog imjournal into Splunk, watch the time skew — imjournal stamps with the journal’s __REALTIME_TIMESTAMP but if the relay clock drifts you’ll get events that sort wrong in the index and your maxspan transactions silently miss correlations. NTP discipline on the relay matters more than it looks.
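As a complement to the auditd watch (not a replacement for it), a periodic sweep can list unit files that changed after a known reference point such as the last approved change window. A minimal Python sketch, assuming the suffix list below is adequate for your estate and that the caller supplies the cutoff:

```python
import os
from pathlib import Path

# Unit types of interest, plus .conf for drop-in overrides under *.d/
UNIT_SUFFIXES = {".service", ".timer", ".socket", ".path",
                 ".mount", ".target", ".conf"}


def changed_units(roots, since_epoch):
    """Return unit files under roots changed at or after since_epoch.

    Uses lstat so that enablement symlinks are reported themselves rather
    than their targets, and takes the later of mtime and ctime because
    mtime alone is trivially reset by whoever wrote the file.
    """
    hits = []
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                p = Path(dirpath) / name
                if p.suffix not in UNIT_SUFFIXES:
                    continue
                st = p.lstat()
                if max(st.st_mtime, st.st_ctime) >= since_epoch:
                    hits.append(str(p))
    return sorted(hits)
```

Typical roots would be /etc/systemd/system and /etc/systemd/user. The sweep only tells you that something changed; attribution still comes from the audit records.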

Where the AIDE-style approach falls down now

Plenty of shops will reflexively keep running AIDE or Tripwire against image-mode hosts because it’s in the STIG checklist and the assessor expects to see it. It mostly still works against /etc and /var. Against /usr it’s now largely redundant: where composefs is running with fs-verity enabled, the kernel verifies file content at read time, while AIDE catches tampering only when the cron job next runs. That assumes the cron job is running at all. The silent-failure-for-eight-months pattern is real, and an AIDE database that hasn’t been rebuilt since the last image roll throws so many false positives that everyone learns to ignore the report. My honest read: keep AIDE scoped to the writable paths if compliance demands a named FIM tool, lean on bootc status digest verification and fs-verity for /usr, and don’t pretend the nightly scan is buying you integrity coverage it isn’t.

The control mapping shakes out roughly like this:

Control	What carries it in image mode
CM-2, CM-6	`bootc status` digest vs. expected pipeline tag
CM-3	Image rebuild + reboot is the only legitimate change path
SI-7	composefs/fs-verity for `/usr`; auditd watches for `/etc`, `/var`
SI-4	journald unit-install events; scoped `/etc` write detections
AU-2, AU-12	audit rules shipped in-image, version-controlled
SR-3, SR-4, SR-11	base digest pinning, signature verification, build-host trust

The assessment-and-authorization angle (CA-7, continuous monitoring) is actually where image mode pays for itself: a fleet where every host can attest its exact OS digest against a known-good build is a far cleaner ConMon story than “AIDE ran and mostly agreed with itself.” Use that in the SSP. It’s the rare case where the new architecture makes the control narrative shorter and more defensible at the same time.

Don’t let the read-only root lull anyone into thinking the host is hardened by construction. The mutable surface moved; instrument the new surface, pin the build, and verify the digest. The hosts that get burned will be the ones where someone saw “immutable” in the slide deck and quietly retired the monitoring instead of repointing it.
