Ubuntu’s userns Mediation Is a Tripwire, Not a Wall
Ubuntu turned unprivileged user namespace mediation on by default in 24.04 LTS, and a lot of people quietly filed it under “containment, handled.” It isn’t containment. It’s a friction layer that raises the cost of a specific privilege-escalation pattern, and within the year, Qualys had published three ways around it. The Ubuntu Security Team’s response was, roughly, “yes, and that’s fine, because this was never a security boundary.” Both things are true at once. The feature is worth running. It is also worth understanding precisely what it does and does not stop, because the gap between those two is exactly where a SOC lead gets a false sense of coverage.
The part most teams miss is even simpler: the mediation produces a clean, greppable audit event every time it fires, and almost nobody is ingesting it.
Why unprivileged user namespaces are a privesc multiplier
User namespaces let an unprivileged process map its own UID to UID 0 inside a new namespace. That root is fake in the sense that it maps back to your real, unprivileged UID on the host. But it’s real enough to reach kernel code paths that previously demanded actual root: namespace-aware filesystem mounts, network configuration, certain ioctls, parts of the netfilter and packet-handling surface. The feature exists for good reasons. Rootless containers, bwrap-based sandboxes, Flatpak, Chromium’s sandbox all depend on it.
The problem is the math. A whole class of Linux kernel LPE bugs is only reachable once you hold administrative capabilities in some namespace. Unprivileged userns hands an ordinary local account exactly that, which converts “needs root to trigger” into “needs a shell.” That’s why the historical mitigation on Debian and Ubuntu was the blunt kernel.unprivileged_userns_clone=0, and why RHEL 7 shipped user.max_user_namespaces=0 by default for years. (Fedora went the other way and enabled unprivileged userns early; the distros never agreed on where to land.) Killing the feature outright breaks rootless Podman and anything using bwrap, so it was always a fight between the hardening team and the desktop team.
Ubuntu’s 23.10/24.04 approach is more surgical. Instead of disabling userns, AppArmor mediates what an unprivileged process can do with one. On a default 24.04 box the profile actually lets an unprivileged, unconfined process create the namespace — it just drops it into a restricted unprivileged_userns profile that denies the capabilities (CAP_SYS_ADMIN and friends) that make the namespace worth having. Creation succeeds; the privileged operations inside it don’t. A profile that explicitly grants userns is what restores the dangerous full-capability behavior. Two sysctls govern it:
kernel.apparmor_restrict_unprivileged_userns = 1
kernel.apparmor_restrict_unprivileged_unconfined = 1
The first is the headline control. The second is the one people forget, and it matters more than the docs make obvious.
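A quick way to see which half of the pair a host actually has is to read the live values. The following is a minimal sketch, assuming the standard /proc/sys layout; the helper names (read_sysctl, userns_mediation_state, half_configured) are made up for illustration and are not part of any Ubuntu tooling.

```python
from pathlib import Path

# The two sysctl names come from the text above.
USERNS_SYSCTLS = (
    "kernel.apparmor_restrict_unprivileged_userns",
    "kernel.apparmor_restrict_unprivileged_unconfined",
)


def read_sysctl(name, proc_sys=Path("/proc/sys")):
    """Return the live value of a sysctl as a string, or None if it cannot be read."""
    path = Path(proc_sys) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


def userns_mediation_state(proc_sys=Path("/proc/sys")):
    """Map each sysctl to its live value; None means the kernel does not expose it."""
    return {name: read_sysctl(name, proc_sys) for name in USERNS_SYSCTLS}


def half_configured(state):
    """True when the headline knob is on but the unconfined knob is not."""
    first, second = (state[name] for name in USERNS_SYSCTLS)
    return first == "1" and second != "1"
```

Calling half_configured(userns_mediation_state()) on a host flags the "first knob only" case. It reads the running values only, so it says nothing about whether the setting persists across a reboot or whether enforcement is actually active.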
The bypasses, and why they aren’t bugs
Qualys disclosed three bypasses to the Ubuntu Security Team in January 2025 (advisory and Ubuntu’s writeup are in the sources). The mechanisms differ but the shape is identical: each one borrows a permissive AppArmor profile that is allowed to create user namespaces, and gets unprivileged code to run under it. One route abuses aa-exec to transition into a friendlier profile. One uses busybox, which on 24.04-era systems shipped with a broad profile that allowed userns (Ubuntu 25.04 later removed the BusyBox and Nautilus profiles that directly permitted it and added a tighter bwrap-userns-restrict profile, so verify the local profile set on newer releases). One uses LD_PRELOAD to inject into a binary that already has the permission — note that this only works against a non-SUID target, and it’s hijacking a process that already cleared the restriction rather than defeating the mediation logic itself.
The first sysctl alone doesn’t stop any of these, because the permissive default profiles Ubuntu ships (to keep bwrap, Flatpak, and friends working) are the exact thing being borrowed. That’s what kernel.apparmor_restrict_unprivileged_unconfined=1 is for: it stops an unconfined process from aa-exec-ing itself into a more favorable profile in the first place. If you enabled the first knob and not the second, you enabled the half that looks good in a config audit and skipped the half that closes the obvious door.
Ubuntu’s position is that none of this is a vulnerability, and they’re technically right. The root you obtain in the namespace still maps to your real UID; the bypass doesn’t grant access you didn’t already have at the namespace boundary. The risk it reintroduces is indirect: it puts the kernel attack surface back within reach. Whether that bothers you depends entirely on how much you trust your kernel patch cadence. On a fleet that’s three CVEs behind on the HWE kernel because the reboot window keeps slipping, it should bother you a lot.
So treat the mediation as what it is. A layer that turns trivial userns abuse into something that needs a known bypass technique and leaves a trail. Not a wall.
What to detect, and where it actually lands
Here’s the useful part. When the restriction fires, AppArmor emits a kernel audit record — but the shape depends on whether the unprivileged_userns profile is loaded, and getting that wrong is how you write a detection that quietly misses the common case. On a default 24.04 box, where the profile is present, an unprivileged unshare -Ur produces a two-record chain:
# 1) the transition into the restricted profile — logged as AUDIT, not a denial
apparmor="AUDIT" operation="userns_create" class="namespace" \
info="Userns create - transitioning profile" profile="unconfined" \
pid=... comm="..." target="unprivileged_userns" requested="userns_create"
# 2) the record that actually breaks unshare — a capability denial inside that profile
apparmor="DENIED" operation="capable" class="cap" \
profile="unprivileged_userns" pid=... comm="..." capability=21 capname="sys_admin"
The thing that stops the process is the second record. unshare -Ur needs CAP_SYS_ADMIN (capability 21) to build the namespace, and the unprivileged_userns profile it just transitioned into denies that capability. The userns_create line above it is an AUDIT transition, not a block — so a search that greps only for apparmor="DENIED" operation="userns_create" will miss this sequence entirely on a stock 24.04 fleet.
That direct denial does exist, but in a different situation: when the unprivileged_userns profile is missing or fails to load, you get a single record instead, and it says so in the info field:
apparmor="DENIED" operation="userns_create" class="namespace" \
info="Userns create restricted - failed to find unprivileged_userns profile" \
error=-13 profile="unconfined" pid=... comm="..."
Both shapes are real; which one a host emits is a property of its profile state, so capture a sample on your own image first (run unshare -Ur true and read the record) before you commit a search. Detect the chain, not a single string: the AUDIT operation="userns_create" transition with target="unprivileged_userns", the DENIED operation="capable" under profile="unprivileged_userns" that immediately follows it, and the direct DENIED operation="userns_create" that covers the missing-profile case.
One caveat before you assume completeness: AppArmor audit output can be rate-limited, so “an event every time it fires” holds only when DENIED auditing isn’t being throttled. Check /sys/module/apparmor/parameters/audit and your auditd backlog settings before you trust the count.
Where the record lands depends on the host. If auditd is running, it routes through /var/log/audit/audit.log as a type=1400 event; if it isn’t, the same line lands in the kernel ring buffer. On freshly installed cloud images where journald is primary and rsyslog isn’t pulled in, you’ll find it in the journal (journalctl -k or journalctl _TRANSPORT=audit) rather than in /var/log/kern.log, while server ISO installs more often still have rsyslog present. Which path you get changes your ingest story, so check before you write the search.
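The three record shapes in this section can be told apart with a small field parser. This is a sketch, assuming raw lines shaped like the samples above; the function names and the idea of pairing records by pid are illustrative choices, not something AppArmor or auditd provides.

```python
import re

# Matches key=value and key="quoted value" pairs in an AppArmor audit line.
_FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_fields(line):
    """Extract key/value pairs from one raw AppArmor audit line."""
    return {m.group(1): m.group(2) if m.group(2) is not None else m.group(3)
            for m in _FIELD_RE.finditer(line)}


def classify(line):
    """Label a line as one of the three record shapes, else None."""
    f = parse_fields(line)
    verdict, op = f.get("apparmor"), f.get("operation")
    if op == "userns_create" and verdict == "AUDIT" and f.get("target") == "unprivileged_userns":
        return "transition"
    if op == "capable" and verdict == "DENIED" and f.get("profile") == "unprivileged_userns":
        return "capability_denied"
    if op == "userns_create" and verdict == "DENIED":
        return "direct_denied"
    return None


def find_hits(lines):
    """Yield (kind, pid, comm) for completed chains and for direct denials."""
    pending = {}  # pid -> comm of a transition still waiting for its capability denial
    for line in lines:
        kind = classify(line)
        if kind is None:
            continue
        f = parse_fields(line)
        pid, comm = f.get("pid"), f.get("comm")
        if kind == "transition":
            pending[pid] = comm
        elif kind == "capability_denied" and pid in pending:
            pending.pop(pid)
            yield ("chain", pid, comm)
        elif kind == "direct_denied":
            yield ("direct_denied", pid, comm)
```

Pairing on pid is a simplification: pids get reused, and a transition with no following denial is silently dropped here, so a production rule would also bound the pairing by time or audit serial.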
On Splunk with the Linux audit TA and a Universal Forwarder watching audit.log, the starting query is roughly:
index=linux sourcetype=linux:audit (
    operation="userns_create"
    OR (profile="unprivileged_userns" apparmor="DENIED" operation="capable")
)
| stats count values(apparmor) values(operation) values(profile) values(capname) by host, comm
Adjust the sourcetype and field names to match your TA; many deployments don’t parse AppArmor denials cleanly and you’ll be matching raw text. The Elastic equivalent works too, but mind the moving target: Auditbeat’s auditd module is being phased out in favor of the Elastic Agent system integration, and either way AppArmor records don’t decompose as cleanly as SELinux AVCs, so you’ll be matching on the raw message more than on parsed fields until you write the ingest pipeline grok yourself.
Volume is the thing to think about before you turn on an alert. On a server fleet that doesn’t run rootless containers, this event is rare, and a single hit on an interactive host is genuinely interesting. On a desktop or developer fleet it’s the opposite. Anything that uses userns but lacks a matching profile will trip the denial repeatedly: a hand-built Chromium, a dev’s unshare one-liner, a Steam install, a Flatpak app that didn’t get the bwrap profile pulled in. Expect the first week on a mixed engineering fleet to be loud, dropping to near-nothing once you carve out the legitimate comm values. The noise isn’t the detection failing; it’s the detection telling you which apps are exercising the feature.
First round of tuning is an allowlist by comm and profile, not a threshold. Rate-based alerting is the wrong instinct here because the bad case is low-and-slow: one denial from bash, python3, or some dropped binary in /tmp is more meaningful than two hundred from chrome. So suppress the known userns users, then alert on a denial whose comm isn’t on the list, especially when the process path sits in a world-writable directory; that path condition is where the signal concentrates. And don’t trust comm on its own. It’s just the process name, and an attacker can rename a payload to chrome or bwrap for free, so pin the rule to the resolved exe path and treat a known-good comm running from /tmp, /dev/shm, or a home Downloads directory as the alert, not the suppression. Even the path can be played: a symlink or a binary swapped on disk after exec means a clean-looking exe isn’t proof of integrity, so where your EDR exposes process inode/device or a hash, prefer that over the path string.
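The allowlist logic above can be sketched as a small triage function. Everything specific here is an assumption: the allowlist entries and install paths are placeholders, the exe value is taken to come from your EDR or process telemetry (the AppArmor record itself carries only comm), and PurePosixPath.is_relative_to needs Python 3.9 or later.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist: comm -> exe paths you have verified on your own images.
KNOWN_USERNS_USERS = {
    "chrome": {"/opt/google/chrome/chrome"},
    "bwrap": {"/usr/bin/bwrap"},
}

# Directory prefixes treated as attacker-writable; extend for your environment.
SUSPECT_PREFIXES = ("/tmp", "/dev/shm", "/var/tmp")


def in_suspect_dir(exe):
    """True when exe sits under a suspect prefix or a home Downloads directory."""
    path = PurePosixPath(exe)
    if any(path.is_relative_to(prefix) for prefix in SUSPECT_PREFIXES):
        return True
    parts = path.parts  # e.g. ('/', 'home', 'alice', 'Downloads', 'chrome')
    return len(parts) > 3 and parts[1] == "home" and parts[3] == "Downloads"


def triage(comm, exe):
    """Return 'alert' or 'suppress' for one denial, keyed on exe path rather than comm."""
    if in_suspect_dir(exe):
        return "alert"  # a known-good comm running from /tmp is the alert
    if exe in KNOWN_USERNS_USERS.get(comm, set()):
        return "suppress"
    return "alert"
```

The suspect-directory check runs before the allowlist on purpose, so a payload renamed to chrome cannot suppress itself. It still trusts the path string, which is why an inode or hash comparison is the better key where you have one.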
Two caveats that will bite you. First, host time skew: AppArmor records carry the kernel audit timestamp, and if your NTP discipline is loose across the fleet, correlating a userns denial with the process tree from your EDR gets annoying fast (lean on pid/ppid and the audit serial rather than the clock when it does). Second, agent coverage. If you’re keying off auditd but half the fleet runs the host’s default config where AppArmor logs only go to the journal, you’ll have a blind spot you don’t know about until an incident review. Confirm the path per image, not per distro.
The environment assumptions that change the answer
This is an AppArmor feature, so it’s an Ubuntu (and Debian-derivative) story. On RHEL 8/9/10, Fedora, or anything running SELinux, operation="userns_create" does not exist. The userns concern is identical but the control and the telemetry are completely different: you’re looking at user.max_user_namespaces, auditd syscall rules on clone and unshare that filter for the CLONE_NEWUSER flag (clone3 passes its flags inside a struct, so an argument filter can’t see them and you can only audit the call itself), or an eBPF tool like Falco watching namespace creation. Don’t copy the Ubuntu detection into a mixed estate and assume parity.
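On the auditd side, the flag test comes down to one bitmask. The sketch below assumes x86_64 and audit SYSCALL records whose a0 field is hex without a 0x prefix; the function name is made up, and the syscall name is assumed to have been resolved from the record's syscall number already.

```python
CLONE_NEWUSER = 0x10000000  # flag value from <linux/sched.h>


def requests_new_userns(syscall_name, a0_hex):
    """Report whether a syscall's first argument carries CLONE_NEWUSER.

    Returns True or False for unshare(2) and clone(2), where flags are the
    first argument on x86_64. Returns None for anything else, including
    clone3(2), whose flags live in a struct the audit record does not show.
    """
    if syscall_name not in ("unshare", "clone"):
        return None
    return bool(int(a0_hex, 16) & CLONE_NEWUSER)
```

A None result is worth keeping distinct from False: it marks calls where the record cannot answer the question, which is the gap an eBPF-based sensor is better placed to cover.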
Version matters too. 24.04 LTS ships the mediation on by default. 22.04 does not, even on the HWE kernel, so a fleet mid-migration has two security postures and you should know which hosts are which before you trust any aggregate. A dashboard showing zero deny events on the 22.04 estate isn’t reassurance; it’s the absence of the enforcement mechanism. Rootless container platforms change the calculus again. Kubernetes user-namespace support for pods, rootless Podman, rootless Docker/containerd, CI runners doing rootless builds: these need unprivileged userns to work. Enforce the restriction hard on those nodes without an allowlist and you’ll break workloads, then someone will disable the sysctl globally to “fix” it, and now you’ve got neither the feature nor the hardening. Scope it per node role.
And check for LXD. Ubuntu’s own hardening guidance is blunt about it: if LXD is installed, it completely disables the user namespace restriction feature while it’s running, which makes both sysctls irrelevant. So on an LXD host a green config audit — both knobs reading 1 — is not evidence the mediation is active. Validate enforcement directly with a test denial and a log sample (unshare -Ur true and confirm the audit record lands), not a sysctl -a snapshot.
Control mapping
| Control | How this maps |
|---|---|
| CM-7 | Least functionality. Mediating or disabling an unneeded kernel capability is the textbook case. |
| CM-6 | Both sysctls belong in your hardening baseline as enforced settings, not one-time sysctl -w runs. |
| SI-4 | The userns_create AUDIT transitions, the capability DENIED records that follow them, and the direct userns_create DENIED records are the monitoring signal; ingest them or you have CM-7 with no verification. |
| AC-6 / SC-39 | Least privilege and process isolation. Userns is the isolation primitive; the mediation governs who gets to wield it. |
| SI-7 / AU-12 | Drift detection on the sysctl values and the drop-in file content, and audit-record generation for the denials themselves. |
The CM-6 line is where teams cut the corner. Setting both knobs with sysctl -w survives until the next reboot, then quietly reverts, and your config scanner reads green because it checked the running value an hour after boot when it happened to still be set. Cloud images make this worse: cloud-init can regenerate sysctl state on boot or during autoscaling and silently overwrite a baseline you thought was enforced. Put them in a managed /etc/sysctl.d/ drop-in, render it through your config management, make sure no cloud-init snippet conflicts, and have SI-7 drift detection assert the file content and the live sysctl output match. Checking only one of those is how you end up reporting a control as effective while it’s been off on a third of the fleet since the last kernel update.
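The two-sided check described above (file content and live value both matching the baseline) can be sketched as a pure comparison. The function names are illustrative, and how you collect the drop-in text and the live values is left to your own tooling.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.d-style text into {key: value}, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # A leading "-" on a key tells sysctl to ignore failures; drop it for comparison.
        settings[key.strip().lstrip("-")] = value.strip()
    return settings


def drift(expected, file_text, live):
    """Return findings where the drop-in or the live value departs from expected."""
    on_disk = parse_sysctl_conf(file_text)
    findings = []
    for key, want in expected.items():
        if on_disk.get(key) != want:
            findings.append(f"{key}: drop-in has {on_disk.get(key)!r}, expected {want!r}")
        if live.get(key) != want:
            findings.append(f"{key}: live value is {live.get(key)!r}, expected {want!r}")
    return findings
```

An empty list means both sides agree with the baseline. A finding on only one side is the interesting case: a correct file with a wrong live value points at something overriding it after boot, and the reverse points at a manual sysctl -w that will not survive a restart.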
Run the mediation. Set both sysctls, not one. Ingest the userns_create records and the capability denials that follow them, and allowlist by process, not by rate. And write down, somewhere your incident responders will find it at 0300, that a userns denial is a hardening tripwire and not proof of containment. The person paged is going to want to know whether the direct path was blocked and the attacker stopped there, or whether they pivoted to a profile-borrowing technique that the denial won’t catch.
Sources
- Understanding AppArmor user namespace restriction (Ubuntu Community Hub)
- Restricted unprivileged user namespaces are coming to Ubuntu 23.10 (Ubuntu)
- Three bypasses of Ubuntu’s unprivileged user namespace restrictions (Qualys / oss-sec)
- [spec] Unprivileged user namespace restrictions via AppArmor in Ubuntu 23.10 (Ubuntu Community Hub)
- Ubuntu 24.04 LTS release notes (Ubuntu)
- Ubuntu 25.04 release notes — BusyBox/Nautilus profile removal, bwrap-userns-restrict (Ubuntu)
- The journey of bypassing Ubuntu’s unprivileged namespace restriction (DEVCORE)