
Copy Fail (CVE-2026-31431): The 9-Year-Old Linux Kernel Bug That Triages as Informational

732 bytes of Python. No race window. Root on every major Linux distribution shipped since 2017. Copy Fail (CVE-2026-31431) is the nine-year-old algif_aead optimization that an unprivileged local user can leverage into deterministic root. CISA added it to the Known Exploited Vulnerabilities catalogue with a 15 May 2026 federal deadline. The triage trap on this class of bug is a single phrase: "requires local access, P3 informational." That phrase is the difference between a patched fleet by the morning of 1 May and a Velociraptor-flagged container escape on 8 May.

On 29 April 2026, Help Net Security, Sysdig, and Microsoft's Defender Research blog published the same finding from the same coordinated disclosure window: a logic bug in algif_aead, the AEAD interface of the Linux kernel's AF_ALG socket family, lets any process that can splice() pages from a privileged binary's page cache obtain a deterministic 4-byte write into that page cache, which translates straightforwardly into root. The 732-byte public PoC works on Ubuntu 22.04 and 24.04, Amazon Linux 2 and 2023, RHEL 8 and 9, SUSE 15, and Debian 12. The bug shipped in 2017 and has been exploitable for nine years. CISA added it to the Known Exploited Vulnerabilities catalogue the same day, with a 15 May 2026 federal patch deadline.

The bug is technically a "high-severity local privilege escalation," CVSS 7.8. The triage trap is reading those words and concluding that local privilege escalation is somebody else's problem.

What happened

The affected component is algif_aead, the kernel module that exposes the kernel's authenticated-encryption (AEAD) cryptographic primitives to userspace through the AF_ALG socket family. AF_ALG is the socket interface through which userspace programs such as cryptsetup reach the kernel crypto API; algif_aead specifically handles the AEAD ciphers (AES-GCM, ChaCha20-Poly1305, and the rest).
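Whether a given kernel was built with this interface at all can be read from its build configuration. The sketch below is an illustration, not part of the Capsule. It assumes the distribution installs the kernel config at /boot/config-$(uname -r), which is common but not universal, and the function name aead_interface_state is hypothetical. CONFIG_CRYPTO_USER_API_AEAD is the Kconfig option that builds algif_aead.

```shell
#!/usr/bin/env bash
# aead_interface_state [CONFIG_FILE]
# Prints builtin / module / disabled / unknown for the AF_ALG AEAD interface.
# Hypothetical helper; the default config path is an assumption about the distro.
aead_interface_state() {
  local config="${1:-/boot/config-$(uname -r)}"

  # No readable config means we cannot tell; do not guess.
  if [ ! -r "$config" ]; then
    echo "unknown"
    return 0
  fi

  # A disabled option appears as "# CONFIG_... is not set", so it never matches.
  case "$(grep -E '^CONFIG_CRYPTO_USER_API_AEAD=' "$config" || true)" in
    *=y) echo "builtin" ;;   # compiled in: a module blacklist cannot remove it
    *=m) echo "module" ;;    # loadable: may be loaded on demand
    *)   echo "disabled" ;;
  esac
}
```

A "module" result means the interface may still be loaded on demand even when it is not currently listed among loaded modules, and a "builtin" result means a module blacklist has nothing to act on.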

In 2017 the maintainers shipped an optimisation that performed AEAD operations in-place by setting the request's source scatterlist equal to the destination scatterlist (req->src = req->dst) and using a single chained tag-page list. That avoided allocating and copying a separate destination buffer for the common case. The optimisation worked correctly for the userspace-buffer code path. It did not account for the path where the source pages were brought in via splice() from a file the user could read but not write — for example, a setuid binary's read-only page cache.

When the kernel performed the in-place AEAD operation in this case, the destination scatterlist pointed at the page cache pages of the readable file. The AEAD operation wrote the authentication tag into those pages, mutating the in-memory copy of a privileged binary. The next time any process executed that binary, it executed the mutated bytes — which the attacker had chosen by selecting which file to splice from and what plaintext to encrypt. Xint's writeup walks the page-cache poisoning chain end to end. The 4-byte write at attacker-controlled offset is the primitive; selecting /usr/bin/sudo as the splice source and timing a setuid execution against the poisoned cache is the path to root.

Three things to underline:

Why this gets mis-triaged as "local access required, P3 informational"

The single most common rejection pattern we see for kernel LPE findings is variations on the same sentence: "this requires the attacker to already have code execution as a local user, so the realistic risk to our environment is low." That sentence is wrong in three structural ways.

Modern attack chains assume local code execution as the second step, not the precondition. Initial access in 2026 looks like a phished SSO token, a leaked CI secret, a stolen developer laptop, an exposed Jupyter notebook, an SSRF to an internal admin panel, an unauthenticated webhook handler, or a thousand other channels that drop the attacker on a non-root shell. The point of an LPE bug is that it converts that non-root shell into the keys to the kingdom in seconds. A finding that says "I have shell as www-data" combined with "this kernel is unpatched" is a finding that says "I have root."

Container deployments make the LPE…not local. A pod-level compromise from an SSRF or a pickle-deserialisation bug in a Python service is a network-reachable attack. The kernel LPE that works inside the container is the second link in a chain whose first link is over the wire. Your threat model has to count the chain, not the individual links. Our triage rubric writeup covers the "chain prerequisites missing" pattern at length; this is the most common form of it on kernel findings.

Page-cache poisoning can persist across reboots. The Copy Fail primitive writes to the in-memory page cache, so on its own the change disappears once the page is evicted or the host reboots. If the poisoned page is written back to disk, however, the change persists. The exploit author who wants persistence picks a binary whose cached pages memory reclaim is unlikely to evict, times the write to land before a sync, and now has a foothold that survives the reboot the security team triggered to "clear" the suspected compromise.
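Because ordinary file reads are served from the same page cache that executions use, hashing a setuid binary shows the bytes the kernel would actually run. A minimal sketch of a baseline check follows, assuming GNU find, sort, xargs, and sha256sum. The function name setuid_manifest is hypothetical, and the baseline is only meaningful if it was produced on a known-good image.

```shell
#!/usr/bin/env bash
# setuid_manifest [DIR]
# Prints a sha256sum-compatible manifest of every setuid file under DIR.
# Hypothetical helper; assumes GNU find, sort, xargs, and sha256sum.
setuid_manifest() {
  local dir="${1:-/usr/bin}"

  # -xdev stays on one filesystem; -perm -4000 selects files with the setuid bit.
  # NUL separators keep unusual file names intact; -r skips sha256sum on no input.
  find "$dir" -xdev -type f -perm -4000 -print0 \
    | sort -z \
    | xargs -0 -r sha256sum
}
```

One possible use: run `setuid_manifest /usr/bin > baseline.sha256` on a freshly installed, patched host, then run `sha256sum --check --quiet baseline.sha256` on the host under review. Any line reported as FAILED is a binary whose current contents differ from the baseline.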

Here's what a Proof Capsule for this looks like

The Capsule for a kernel LPE has a different shape from a network-RCE Capsule. The customer engineer cannot run the replay against production without taking the host down. What we ship instead is a paired structure: a kernel-version probe that runs against every host and reports PATCHED / VULNERABLE_KERNEL / INDETERMINATE / NOT_LINUX, and a fresh-VM reproducer that runs the actual 732-byte PoC inside a disposable VM the customer's engineer can spin up on their own infrastructure for end-to-end verification.

# capsule.yaml — CVE-2026-31431 schematic
finding_id: CELVEX-2026-31431-COPYFAIL-LPE
vulnerability:
  cve: CVE-2026-31431
  cvss: 7.8
  class: kernel-logic-bug-page-cache-poisoning
  affected: linux kernel 4.10 through patched releases (per distro)
target:
  asset: prod-worker-04.example-customer.internal
  kernel_detected: "5.15.0-119-generic"
  distribution: "Ubuntu 22.04.4 LTS"
preconditions:
  - any local code execution path on the host or in any container
  - algif_aead module loaded (default on most distributions)
  - AF_ALG socket creation not blocked by seccomp
artifacts:
  - poc/version-probe.sh            # safe, read-only, runs on production
  - poc/fresh-vm-replay.sh          # full PoC, runs ONLY in a disposable VM
  - poc/expected-output.txt         # captured "uid=0(root)" with watermark
  - poc/cleanup.sh                  # destroys the disposable VM
remediation:
  patches:
    ubuntu:    "linux-image >= 5.15.0-130 / 6.8.0-58"
    rhel:      "kernel >= 5.14.0-503.21.1.el9 / 4.18.0-553.30.1.el8"
    amazon:    "kernel >= 5.10.234-225 / 6.1.119-129"
    suse:      "kernel-default >= 5.14.21-150500.55.83"
    debian:    "linux-image >= 6.1.119-1"
  workaround_until_patched: |
    echo 'blacklist algif_aead' | sudo tee /etc/modprobe.d/disable-algif-aead.conf
    sudo modprobe -r algif_aead
    # plus a containerd/k8s seccomp profile that denies socket(AF_ALG, ...)
disclosure:
  reported: 2026-02-11
  patched_upstream: 2026-04-22
  public: 2026-04-29
  cisa_kev_added: 2026-04-29
  federal_deadline: 2026-05-15

The two scripts are deliberately separated. version-probe.sh is what runs on every production host the customer wants checked. fresh-vm-replay.sh is what runs in the disposable VM the customer's engineer creates to confirm the exploit actually does what the writeup says.

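#!/usr/bin/env bash
# version-probe.sh - CVE-2026-31431 (read-only, runs on production)
# Outputs PATCHED / VULNERABLE_KERNEL / INDETERMINATE / NOT_LINUX
# Exit status: 2 for VULNERABLE_KERNEL, 0 for every other result.
set -euo pipefail

if [ "$(uname -s)" != "Linux" ]; then echo "NOT_LINUX" ; exit 0 ; fi

KVER="$(uname -r)"
DIST="$(. /etc/os-release && echo "$ID:$VERSION_ID")"

# Map the os-release ID:VERSION_ID pair to the first patched kernel,
# using the versions listed in capsule.yaml.
case "$DIST" in
  ubuntu:22.04|ubuntu:24.04)
    # Ubuntu 22.04 can also run the 6.8 HWE kernel, so choose the target
    # by the running kernel series rather than by release.
    case "$KVER" in
      5.15.*) REQUIRED="5.15.0-130" ;;
      6.8.*)  REQUIRED="6.8.0-58"   ;;
      *)      echo "INDETERMINATE: $DIST $KVER" ; exit 0 ;;
    esac ;;
  rhel:9*|rocky:9*|almalinux:9*)  REQUIRED="5.14.0-503.21.1.el9"   ;;
  rhel:8*|rocky:8*|almalinux:8*)  REQUIRED="4.18.0-553.30.1.el8"   ;;
  amzn:2)                         REQUIRED="5.10.234-225"          ;;
  amzn:2023)                      REQUIRED="6.1.119-129"           ;;
  sles:15*)                       REQUIRED="5.14.21-150500.55.83"  ;;
  *)                              echo "INDETERMINATE: $DIST $KVER" ; exit 0 ;;
esac

if ./bin/version-gte "$KVER" "$REQUIRED"; then
  echo "PATCHED: $DIST kernel $KVER"
else
  echo "VULNERABLE_KERNEL: $DIST kernel $KVER (need $REQUIRED)"
  # Read /proc/modules directly: "lsmod | grep -q" can report failure under
  # pipefail when grep exits early and lsmod receives SIGPIPE.
  if grep -q '^algif_aead ' /proc/modules; then
    echo "  algif_aead loaded - AF_ALG attack path open"
  else
    echo "  algif_aead NOT loaded - blacklist persistently"
  fi
  exit 2
fi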

#!/usr/bin/env bash
# fresh-vm-replay.sh - CVE-2026-31431 (DESTRUCTIVE; disposable VM only)
# Builds a fresh Ubuntu 22.04 VM on an unpatched kernel, runs the public
# PoC, captures "uid=0(root)" output, and destroys the VM.
# Exit status: 2 when Copy Fail is confirmed, 0 when the replay finds no root.
set -euo pipefail

[ "${CELVEX_DISPOSABLE_VM:-no}" = "yes" ] || {
  echo "REFUSING: set CELVEX_DISPOSABLE_VM=yes only inside a throw-away VM"
  exit 1
}

VM_IMAGE=/tmp/copyfail-vm.qcow2

# Always destroy the VM on exit. The trap also covers the exit-2 path and
# any step that fails under "set -e", which a final cleanup line would miss.
trap './bin/destroy-disposable-vm "$VM_IMAGE" || true' EXIT

# 1. provision a single-shot Ubuntu 22.04 VM on QEMU (image cached locally)
./bin/spin-disposable-vm ubuntu-22.04-5.15.0-119 "$VM_IMAGE"

# 2. push the 732-byte public PoC into the VM via the QEMU serial console
./bin/qemu-shell "$VM_IMAGE" \
  --copy-in poc/copy-fail-732bytes.py /tmp/poc.py \
  --run "python3 /tmp/poc.py /usr/bin/sudo && id > /tmp/proof.txt"

# 3. retrieve the captured stdout (should contain "uid=0(root)")
./bin/qemu-shell "$VM_IMAGE" \
  --copy-out /tmp/proof.txt /tmp/copyfail-proof.txt

# 4. report the result; the EXIT trap destroys the VM in both cases
cat /tmp/copyfail-proof.txt
if grep -q "uid=0" /tmp/copyfail-proof.txt; then
  echo "VULNERABLE: Copy Fail confirmed"
  exit 2
fi
echo "OK: replay complete"

The watermark in expected-output.txt ties the captured "uid=0(root)" string back to the customer engagement and the date of the run. The customer engineer reading the Capsule does not have to take our word for the exploit working — they have a reproducer that runs in their own KVM environment and produces the same output.
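The probe calls ./bin/version-gte, which the schematic does not show. The sketch below is one guess at what such a helper could do, not the actual Celvex helper. It compares the numeric fields of the running kernel release against the required version, field by field, and ignores anything beyond the fields the required version specifies (for example -generic or .x86_64). The function name version_gte is hypothetical, and distribution-specific ordering rules are not handled.

```shell
#!/usr/bin/env bash
# version_gte CURRENT REQUIRED
# Succeeds (exit 0) when CURRENT is at least REQUIRED, comparing numeric
# fields left to right. Hypothetical stand-in for ./bin/version-gte.
version_gte() {
  local -a cur req
  local i c r

  # Replace every non-digit with a space, then split into numeric fields:
  # "5.15.0-119-generic" becomes (5 15 0 119).
  read -r -a cur <<< "${1//[!0-9]/ }"
  read -r -a req <<< "${2//[!0-9]/ }"

  # Only the fields named by REQUIRED take part in the comparison.
  for i in "${!req[@]}"; do
    c="${cur[i]:-0}"
    r="${req[i]}"
    # 10# forces base 10 so a field with a leading zero is not read as octal.
    if (( 10#$c > 10#$r )); then return 0; fi
    if (( 10#$c < 10#$r )); then return 1; fi
  done
  return 0   # every compared field was equal
}

if version_gte "5.15.0-119-generic" "5.15.0-130"; then
  echo "PATCHED"
else
  echo "VULNERABLE_KERNEL"
fi
```

Comparing numeric fields only keeps the helper short and predictable. The trade-off is that it cannot distinguish releases that differ only in a non-numeric suffix.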

What Celvex would have caught and how customers would have verified

Honest scope. The Linux kernel mainline tree is not in our patch-mining corpus today; the original reporters (independent researchers credited in the Microsoft Defender blog) found this bug, and Microsoft, Sysdig, and the upstream kernel security team coordinated the disclosure. What our pipeline ships within hours of public disclosure is a tagged test in the scanner corpus — ENDPOINT-LINUX-KERNEL-COPYFAIL-31431 — that runs the version-probe.sh against every Linux host the customer has flagged via our agent or via SSH credentials provided in the engagement.

The check ran for the first time during the 29 April 2026 nightly chain. By the morning of 30 April, every customer with at least one unpatched Linux kernel asset had a Proof Capsule in their dashboard with a per-host inventory: hostname, kernel version, distribution, AF_ALG-module state, recommended patched-version target, and the structured remediation script for whichever package manager applied. The on-call engineer who took our finding into their incident review at 09:00 UTC on 30 April had a list, not a hypothesis.
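A per-host inventory row of the kind described above could be produced along these lines. This is an illustrative sketch only: the function name inventory_record and the tab-separated layout are assumptions, not the format the Celvex dashboard uses. The two file arguments exist so the function can be pointed at fixtures instead of the live system.

```shell
#!/usr/bin/env bash
# inventory_record [OS_RELEASE_FILE] [MODULES_FILE]
# Prints one tab-separated row: hostname, kernel, distribution, module state.
# Hypothetical helper; the row layout is an assumption for illustration.
inventory_record() {
  local os_release="${1:-/etc/os-release}"
  local modules="${2:-/proc/modules}"
  local dist module_state="not-loaded"

  # Source os-release in a subshell so its variables do not leak to the caller.
  dist="$(. "$os_release" && printf '%s:%s' "${ID:-unknown}" "${VERSION_ID:-unknown}")"

  # /proc/modules lists one loaded module per line, module name first.
  if grep -q '^algif_aead ' "$modules"; then
    module_state="loaded"
  fi

  printf '%s\t%s\t%s\t%s\n' "$(uname -n)" "$(uname -r)" "$dist" "$module_state"
}
```

Collecting one such row per host gives a list that can be sorted by kernel version or filtered on module state.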

This is L1.5 today. We do not auto-mutate kernel exploits in production, we do not weaponise the page-cache poisoning primitive against a customer host without a separate scoped engagement, and we do not claim a discovery we did not make. What we ship is the Capsule, the per-host probe, the disposable-VM reproducer for end-to-end verification, and the discipline of running the check the night the patch lands. The platform documentation is precise about the boundary.

Mitigation guidance

  1. Patch the kernel on every Linux host this week. The distro tables in the Capsule schematic above are accurate as of disclosure; check Ubuntu, AlmaLinux, and the corresponding RHEL / SUSE / Amazon advisories for the version that matters in your environment. The CISA KEV federal deadline is 15 May; for everyone else, Friday 8 May 2026 is the right internal target.
  2. Until the patch lands, blacklist algif_aead and block AF_ALG via seccomp. The mitigation is benign on every workload we have audited: cryptsetup falls back to in-kernel software AES, kTLS workloads degrade to userspace TLS, and almost nothing else uses AF_ALG. Document the exception, deploy through your config-management tool, roll back when the kernel patch is in.
  3. Inventory containerised workloads as a first-class target. The kernel is the kernel; the host runs the same kernel as every container. The patched-host check is the only check that matters. A vendor "we run hardened images" claim is irrelevant if the host kernel is 5.15.0-119.
  4. Treat any Linux host that has run an unpatched kernel and exposed a network service since 11 February 2026 as potentially compromised. The reporters notified upstream on 11 February. The embargo held, but assume the worst case: review process listings, persistent setuid binaries, and recently modified files in /usr/bin on any host you cannot conclusively prove was kernel-isolated.
  5. Add the version-probe to your scheduled compliance checks. Copy Fail will not be the last LPE primitive shipped in 2026. The discipline of running a kernel-version probe weekly across the fleet is a cheap, durable defence against the next one.
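For the seccomp half of the interim mitigation in step 2, a deny rule for AF_ALG sockets can be sketched as below. The JSON follows the seccomp profile format used by Docker and by Kubernetes Localhost profiles, and 38 is the value of AF_ALG in the Linux socket headers. The function name write_af_alg_deny_profile and the output file name are hypothetical. The profile allows everything else, so treat it as a fragment to merge into the profile you already run, not as a complete profile.

```shell
#!/usr/bin/env bash
# write_af_alg_deny_profile OUTPUT_FILE
# Writes a seccomp profile fragment that makes socket(AF_ALG, ...) fail.
# Hypothetical helper; merge the rule into your existing profile before use.
write_af_alg_deny_profile() {
  local out="$1"

  # args[0] is the socket domain; 38 is AF_ALG on Linux.
  cat > "$out" <<'JSON'
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "args": [
        { "index": 0, "value": 38, "op": "SCMP_CMP_EQ" }
      ]
    }
  ]
}
JSON
}
```

After `write_af_alg_deny_profile deny-af-alg.json`, the file can be passed to Docker with `--security-opt seccomp=deny-af-alg.json` or referenced from a pod's securityContext.seccompProfile with type Localhost. Supplying a custom profile replaces the runtime's default profile, which is why merging this rule into the default is the safer route. The rule only covers the socket syscall itself.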

Bottom line

"Local privilege escalation requires local access, P3 informational" is the FP-rejection sentence that hides container escapes, post-exploitation root, and reboot-persistent backdoors behind a CVSS that looks comfortable. Copy Fail is not comfortable. It is a 732-byte deterministic exploit against a code path that has been shipping since 2017, on every major distribution, with a clean public PoC and a CISA KEV entry. Patch the kernel this week. Run the probe across your fleet to confirm. Run the disposable-VM reproducer if any engineer in the room thinks "but does it really work?" The reproduction is the conversation. The conversation is what gets the patch deployed before 15 May.

Pen-testers hand you a PDF once a year; CELVEX Group runs every attack they would, every week, and proves the ones that still work — with a fix attached.
