CVE-2026-23111: One exclamation mark, and an unprivileged user gets root

Picture a regular user on your server. No sudo, no special permissions, can’t read /root, can’t touch system files. A few seconds later — root. Not because they cracked anything sophisticated. Because a developer accidentally placed an exclamation mark in the kernel where it had no business being.

CVE-2026-23111 is a use-after-free vulnerability in the Linux kernel’s nf_tables subsystem, rated CVSS 7.8 (High) by Ubuntu. It lets an unprivileged user escalate to root on Debian and Ubuntu. Working public exploits are already out: FuzzingLabs has had a PoC out since April, and Oliver Sieber of Exodus Intelligence published an exploit with >99% reliability on an idle system. No in-the-wild exploitation has been confirmed at the time of writing, but the patch shipped in February, so anyone who hasn’t updated has been sitting next to a public exploit for months.

WHAT NF_TABLES IS

Nftables is the Linux kernel subsystem that replaced iptables. It handles everything your modern Linux firewall does: packet filtering, NAT, traffic marking. Whether you’re using nft directly, firewalld, or nftables through systemd — it’s the same engine underneath. On most modern servers, nf_tables is running by default even if you’ve never explicitly configured a firewall rule.
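As a quick check on a given host, the sketch below looks for nf_tables in the kernel’s list of loaded modules. The function name nft_module_loaded is made up for this example, and the check assumes nf_tables is built as a loadable module (a kernel with it compiled in would not list it).

```shell
# Succeeds if nf_tables appears in a /proc/modules-style listing.
# $1 is optional and exists only so the check can be pointed at a sample file.
nft_module_loaded() {
    grep -q '^nf_tables ' "${1:-/proc/modules}" 2>/dev/null
}

if nft_module_loaded; then
    echo "nf_tables: loaded"
else
    echo "nf_tables: not loaded right now"
fi
```

A “not loaded” result is not proof of safety: the kernel may still be able to load the module on demand the first time something asks for nf_tables.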

Inside nf_tables, objects are arranged in a hierarchy: tables contain chains, chains contain rules, rules are built from expressions. The key structure for understanding this bug is the verdict map, a lookup table that matches an incoming packet to an action: accept, drop, or jump to another chain. Think of it as a routing table, except instead of routing addresses it routes decisions about what to do with a packet. A verdict map can hold regular elements with specific keys, and a special catchall element: a wildcard that fires when a packet doesn’t match anything else, essentially an “everything else: do this” rule. That’s where the bug lived.

To ensure ruleset changes take effect atomically — without races against live traffic — the kernel uses a generation mask mechanism. Changes accumulate in the “next generation” and flip active all at once. If a batch operation fails, the kernel invokes an abort phase that must fully roll back everything. The inverted logic was hiding in that rollback.

HOW THE BUG WORKS

The bug is a single character: an exclamation mark in the condition inside nft_map_catchall_activate(). This function runs during the abort phase to reactivate catchall elements in a verdict map that were deactivated as part of the transaction being rolled back.

The correct logic: skip already-active elements, process inactive ones. That’s exactly how the equivalent function for regular elements — nft_mapelem_activate() — is written. In the catchall version, the condition is inverted: the function skips inactive elements and processes active ones — precisely the opposite of what’s needed. One ! character flipped the entire logic upside down.

The consequences are concrete. When a verdict map with an NFT_GOTO catchall element referencing a chain is deleted, the chain’s reference counter (chain->use) is decremented — that’s normal, the reference is being removed. During abort, the counter should be restored, because the deletion is being undone. Because of the inverted condition, that restoration never happens — each abort cycle permanently decrements chain->use. Once the counter reaches zero, the kernel assumes nothing references the chain anymore, DELCHAIN succeeds and frees the memory — while other objects still hold pointers to it. That’s use-after-free: accessing memory that’s already been freed.

HOW IT IS EXPLOITED

No privileges required — just access to user namespaces and nf_tables, both enabled by default on Debian and Ubuntu. Ubuntu 24.04 has additional namespace restrictions, but a known bypass exists via aa-exec -p trinity -- unshare -Urmin /bin/sh — this command spawns a shell in a new namespace while bypassing the AppArmor profile that normally restricts namespace creation.

The attack uses four batches. Batch 1: delete a pipapo-backed verdict map with a catchall element, then deliberately trigger an error in the same batch — this fires the abort phase and decrements the chain reference counter without restoring it. Batch 2: send any successful operation to flip the generation cursor — without this step the next batch won’t work correctly. Batch 3: delete the verdict map again — now the catchall element is active relative to the new generation, and the chain reference counter hits zero. Batch 4: delete the chain — this succeeds because the counter is zero, even though the base chain still has a rule pointing to it. Use-after-free achieved.
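The first three batches can be replayed against a toy model to see where the reference goes missing. This is a sketch under heavy assumptions, not kernel code: every name in it (cursor, elem_mask, chain_use, delete_map, abort_activate, commit, run_attack) is invented for the example, and the two-bit generation mask is a simplification of the kernel’s mechanism. The chain starts with two references, one from the rule in the base chain and one from the catchall element.

```shell
# Toy model, not kernel code. Assumption: each element carries one bit per
# generation, and a set bit means "inactive in that generation".

cursor=0       # current generation, 0 or 1
elem_mask=0    # catchall element: active in both generations
chain_use=2    # one reference from the base-chain rule, one from the catchall

# Bit that represents the next (not yet committed) generation.
next_bit() { echo $(( 1 << (1 - cursor) )); }

# True if the catchall element is active in the next generation.
is_active_next() { [ $(( elem_mask & $(next_bit) )) -eq 0 ]; }

# Deleting the map deactivates the catchall and drops its chain reference.
delete_map() {
    if is_active_next; then
        elem_mask=$(( elem_mask | $(next_bit) ))
        chain_use=$(( chain_use - 1 ))
    fi
}

# Abort-path reactivation; $1 selects the "buggy" or "fixed" condition.
abort_activate() {
    if [ "$1" = buggy ]; then
        if ! is_active_next; then return 0; fi   # inverted: skips inactive
    else
        if is_active_next; then return 0; fi     # correct: skips active
    fi
    elem_mask=$(( elem_mask & ~$(next_bit) ))
    chain_use=$(( chain_use + 1 ))
}

# A successful batch flips the generation cursor.
commit() { cursor=$(( 1 - cursor )); }

# Replays batches 1 to 3 and prints the resulting reference count.
run_attack() {
    cursor=0; elem_mask=0; chain_use=2
    delete_map; abort_activate "$1"   # batch 1: delete, then forced abort
    commit                            # batch 2: any successful operation
    delete_map; commit                # batch 3: delete the map again
    echo "$chain_use"
}

echo "buggy: chain_use=$(run_attack buggy)"   # prints "buggy: chain_use=0"
echo "fixed: chain_use=$(run_attack fixed)"   # prints "fixed: chain_use=1"
```

With the counter at 0, the model’s chain looks unreferenced even though the base-chain rule still points at it, which is the state batch 4 relies on. With the fixed condition one reference remains, so a delete of the chain would be refused.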

From there it’s kernel heap manipulation. An NFT_MSG_GETRULE request against a rule referencing the freed chain triggers nft_verdict_dump(), which reads the chain’s name as a string from already-freed memory. By placing a seq_operations struct at that address via open("/proc/self/stat", 0), the attacker leaks a pointer into kernel code and computes the kernel base address, defeating KASLR. Heap addresses leak next, then by manipulating blob_gen_0 of the freed chain the attacker hijacks control flow and executes a ROP chain. The result: commit_creds(&init_cred) grants the process root credentials, switch_task_namespaces() on PID 1 tears down namespace isolation, and the attacker is root on the host.

WHAT HAPPENS NEXT — DEPENDS ON CONFIGURATION

The exploit works on Debian Bookworm and Trixie, Ubuntu 22.04 LTS and Ubuntu 24.04 LTS — with minor ROP gadget differences between kernel builds, since function offsets and data structure layouts vary across versions. Exploit stability is >99% on an idle system and around 80% under Apache benchmark load. Two independent research teams found different paths to root from the same bug, which means blocking one path doesn’t automatically close the other.

Additional hardening helps but isn’t a complete defence. If SELinux is in enforcing mode, the FuzzingLabs variant requires an extra step: explicitly zeroing selinux_state.enforcing through the ROP chain. KASLR doesn’t protect you either: it’s defeated by the kernel base address leak that happens before any control flow hijack is attempted. Real protection comes from exactly two things: patching the kernel or fully disabling unprivileged user namespaces.

REAL-WORLD ATTACK CHAIN

The bug isn’t remotely exploitable — you need a local shell to start. That’s precisely what makes it dangerous as a second-stage attack: an RCE vulnerability in a web application drops an attacker into a shell as www-data, and CVE-2026-23111 turns that shell into root in seconds. At that point the attacker owns the host completely — reading /etc/shadow, pulling SSH keys, intercepting traffic, patching system service binaries. Container isolation collapses via switch_task_namespaces() on PID 1, which breaks the attacker out of namespace isolation and onto the host.

The highest-risk environments are multi-user servers, VPS instances with shared kernels, CI/CD runners, and shared hosting — any setup where unprivileged users or workloads can create namespaces. On shared hosting, one compromised site becomes the entry point for compromising the entire server and every neighbouring site. On a cloud server with a multi-tenant kernel, it becomes the entry point for a container escape.

TIMELINE

Exodus Intelligence found the vulnerability in early 2025 during research into the nf_tables subsystem. On February 5, 2026, the patch landed in the Linux kernel upstream (commit f41c5d151078c5348271ffaf8e7410d96f2d82f8) — a single line removed, the inverted condition gone, CVE-2026-23111 assigned the same day. On April 16, 2026, FuzzingLabs (Alexis and Lyes), while preparing for Pwn2Own Berlin 2026, independently reproduced the vulnerability and published a full PoC with technical breakdown. On June 8, 2026, Exodus Intelligence (Oliver Sieber) released a detailed writeup with a working exploit confirmed against Debian Bookworm, Trixie, Ubuntu 22.04 LTS, and 24.04 LTS.

The gap between the patch (February 5) and the first public working exploit (April 16) was just 70 days. From April 16 onward, every unpatched system has had a working exploit available in the open. By June 8, when the second, more detailed writeup appeared, the patch was more than four months old, and administrators still running unpatched kernels had been exposed to a public exploit for nearly two of those months.
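The day counts are easy to reproduce from the dates in the timeline. A minimal sketch, assuming GNU date (the -d option is not portable to every date implementation); the helper name days_between is invented for this example.

```shell
# Whole days between two ISO dates. -u keeps both timestamps in UTC so
# daylight-saving changes cannot skew the result.
days_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))
}

days_between 2026-02-05 2026-04-16   # patch to FuzzingLabs PoC: prints 70
days_between 2026-02-05 2026-06-08   # patch to Exodus writeup: prints 123
```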

WHY IT MATTERS

Linux kernel LPE through nf_tables isn’t a new story. CVE-2022-1015, CVE-2022-1016, CVE-2022-32250, CVE-2023-32233 — this subsystem has a rich CVE history, and the pattern is consistent: complex transactional mechanism, rare execution path, corner case that nobody tested. CVE-2026-23111 fits right in. A team that chose nf_tables for Pwn2Own Berlin 2026 specifically because they didn’t know the subsystem well immediately found an exploitable bug. That’s not bad luck — that’s accumulated technical debt in security-critical code.

CVE-2026-23111 landed in the middle of a notable surge in Linux LPE disclosures. Recent months brought Copy Fail (CVE-2026-31431) — a flaw in copy-on-write handling, Dirty Frag and its variant Fragnesia (CVE-2026-46300) — heap fragmentation in the XFRM ESP-in-TCP subsystem, DirtyDecrypt, and a nine-year-old ptrace vulnerability (CVE-2026-46333). Different subsystems, different techniques — but one consistent pattern: unprivileged foothold in, root on the host out. The common denominator across most of these is user namespaces, which grant unprivileged users access to kernel interfaces. Organisations without a direct operational need for unprivileged user namespaces should be seriously considering disabling them by default.

ADDITIONAL HARDENING LAYER

Alongside patching the kernel, there’s another approach worth considering: reducing the attack surface by blacklisting unused kernel modules. Distribution kernels ship thousands of loadable modules, yet a typical server only ever needs a fraction of them. Every dormant module that can still be loaded on demand is a potential vector for the next LPE disclosure. modulejail addresses this simply: it scans the list of currently loaded modules and creates a modprobe.d blacklist for everything else. If a module isn’t loaded when modulejail runs, it won’t load later either.

One important constraint: run modulejail only after the system has reached a steady state — all services started, all filesystems mounted, all required drivers loaded. Running it too early risks blacklisting modules that will be needed later. Rolling back is manual: delete /etc/modprobe.d/modulejail-blacklist.conf and reboot.
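The underlying idea can be sketched in a few lines of shell. This is an illustration only, not modulejail’s actual code: the function names dormant_modules and to_modprobe_rules are invented here, and the install <module> /bin/false rule format is an assumption (it is a standard modprobe.d way to make a load request fail, but the article does not show what modulejail writes).

```shell
# Illustration of the idea only; NOT modulejail's code or output format.
# $1: file listing loaded module names, one per line
# $2: file listing every installed module name, one per line
dormant_modules() {
    # -v: invert, -x: match whole lines, -F: fixed strings, -f: patterns from file
    grep -vxFf "$1" "$2" | sort -u
}

# Turn module names on stdin into modprobe.d rules that refuse to load them.
to_modprobe_rules() {
    while IFS= read -r mod; do
        if [ -n "$mod" ]; then
            printf 'install %s /bin/false\n' "$mod"
        fi
    done
}

# Possible wiring on a live host (prints rules, writes nothing):
#   cut -d' ' -f1 /proc/modules > loaded.txt
#   find "/lib/modules/$(uname -r)" -name '*.ko*' \
#       | sed 's|.*/||; s|\.ko.*$||' | tr '-' '_' > installed.txt
#   dormant_modules loaded.txt installed.txt | to_modprobe_rules
```

Printing the rules instead of writing them makes it possible to review the list before anything is blocked, which matters for the steady-state caveat above.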

Install on Debian/Ubuntu:

sudo apt install modulejail

On RHEL/Fedora/Alma/Rocky:

sudo dnf install modulejail

Run after the system reaches steady state — the default conservative profile is intended for servers:

sudo modulejail

modulejail doesn’t patch vulnerabilities or check CVE databases — it’s an attack surface reduction tool, not a replacement for kernel updates. Use it as an additional layer after the kernel is already patched.

UPDATE

The patch has been in upstream since February 5, 2026, and distributions have shipped updated kernel packages. On Debian and Ubuntu, updating is straightforward — apt update syncs the package lists from the repositories, apt upgrade installs all available updates including the new kernel:

sudo apt update && sudo apt upgrade

A reboot is required after installing the new kernel — the updated package is sitting on disk but the system keeps running the old kernel in memory until a restart. The reboot command cleanly shuts down all processes and boots into the new kernel:

sudo reboot

After rebooting, verify the system is actually running the updated kernel. uname -r prints the release of the running kernel, and uname -v prints its build string, which on Debian and Ubuntu includes the build date. Security patches are often backported without changing the main version number, so the release may look almost the same as before while the build date is newer; a patched kernel will have been built after the fix landed in February 2026. apt-cache policy linux-image-$(uname -r) then shows whether the package for the running kernel is current: the Installed version should match the Candidate version.

uname -r
uname -v
apt-cache policy linux-image-$(uname -r)
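A common failure mode is installing the new kernel and never rebooting into it. The sketch below compares the running release with the newest kernel image found under /boot. It assumes images are named vmlinuz-<release>, as on Debian and Ubuntu, and relies on sort -V from GNU coreutils; the function name newest_installed_kernel is invented for this example.

```shell
# Newest kernel release that has an image in the given directory (default /boot).
newest_installed_kernel() {
    for img in "${1:-/boot}"/vmlinuz-*; do
        if [ -e "$img" ]; then
            echo "${img##*/vmlinuz-}"
        fi
    done | sort -V | tail -n 1
}

running=$(uname -r)
newest=$(newest_installed_kernel)
if [ -z "$newest" ]; then
    echo "no kernel images found under /boot"
elif [ "$running" = "$newest" ]; then
    echo "running the newest installed kernel: $running"
else
    echo "reboot pending: running $running, newest installed is $newest"
fi
```

This only shows that a newer image is on disk. It does not prove the newest image contains the fix, so the package check above is still needed.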

If updating right now isn’t possible, the temporary mitigation is disabling unprivileged user namespaces. This blocks the exploitation vector because an attacker can no longer create an isolated namespace to interact with nf_tables without privileges. The side effect: rootless containers (rootless Docker, Podman, LXC) break, as do some browser sandboxes and build tools. On Debian and Ubuntu both parameters are needed — kernel.unprivileged_userns_clone disables user namespace creation at the kernel level, user.max_user_namespaces zeros the limit:

sudo sysctl -w kernel.unprivileged_userns_clone=0
sudo sysctl -w user.max_user_namespaces=0

These settings don’t survive a reboot. To make them permanent, put the following lines in a .conf file under /etc/sysctl.d/; they are applied automatically at boot, and sudo sysctl --system applies them immediately:

kernel.unprivileged_userns_clone=0
user.max_user_namespaces=0

On RHEL-compatible systems, kernel.unprivileged_userns_clone doesn’t exist — user.max_user_namespaces=0 is sufficient there. On Debian and Ubuntu you need both: setting only one leaves the vector partially open.
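To confirm the mitigation is actually in effect, for example after a reboot, the values can be read back from /proc/sys. A minimal sketch: the function name check_userns_mitigation and its optional directory argument are invented for this example, and it follows the guidance above by reporting success only when every knob that exists on the system is set to 0.

```shell
# Read the two sysctls back from a sysctl root. $1 defaults to /proc/sys and
# exists only so the check can be tried against a scratch directory.
# Returns 0 only if at least one knob exists and every existing knob is 0.
check_userns_mitigation() {
    root=${1:-/proc/sys}
    open=0
    seen=0
    for knob in kernel/unprivileged_userns_clone user/max_user_namespaces; do
        if [ ! -r "$root/$knob" ]; then
            echo "$knob: not present"
            continue
        fi
        seen=$((seen + 1))
        val=$(cat "$root/$knob")
        if [ "$val" = "0" ]; then
            echo "$knob = 0 (restricted)"
        else
            echo "$knob = $val (not restricted)"
            open=1
        fi
    done
    [ "$seen" -gt 0 ] && [ "$open" -eq 0 ]
}

if check_userns_mitigation; then
    echo "unprivileged user namespaces: disabled"
else
    echo "unprivileged user namespaces: still available"
fi
```

A missing kernel.unprivileged_userns_clone is reported but not treated as a failure, which matches the RHEL-compatible case described above.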

CONCLUSIONS

One character in an abort phase condition, and an unprivileged user gets root. Not because the attack is clever, but because a corner case in nf_tables transactional logic went untested for years, and user namespaces opened the door to it without any privileges required. A working public exploit with >99% reliability on an idle system exists. No in-the-wild exploitation has been confirmed yet, but that window is closing.

For sysadmins: update the kernel and reboot the server. That’s the only reliable fix. If you can’t do it immediately, disable unprivileged user namespaces as a stopgap, knowing it will affect containerised workloads. Shared hosting servers, container hosts, and CI/CD runners are the top priority — that’s where an unprivileged shell becomes root fastest. For providers running multi-tenant kernels in cloud environments, this is the most urgent item on the list right now.

Comments (2)

Jasper Nuyens

On all servers where nf_tables isn’t used, modulejail prevents the loading of the module by a regular user, preventing this privilege-escalation path. Already in debian repo and AUR. https://www.modulejail.com

June 11, 2026 at 7:04 pm

sysadmin (reply)

Thank you, Jasper! Great addition: modulejail is a smart approach to reducing the kernel module attack surface, especially given the current wave of LPE disclosures. Added to the article.

June 11, 2026 at 8:05 pm
