1. The FIM stack that became a checkbox
Every compliance framework written in the last fifteen years requires file integrity monitoring. PCI-DSS 11.5.2 calls it out by name. NIST 800-53 SI-7 demands it. HIPAA Security Rule 164.312(c)(1) leans on it. ISO 27001 Annex A.12.4.1 expects it. Every auditor in the world has a checkbox that says "FIM deployed", and every Linux SOC engineer who has been around longer than two budget cycles has a war story about installing AIDE on an afternoon in 2019, generating the baseline once, scheduling a nightly cron, and never touching the configuration again.
That afternoon was the last time most production AIDE deployments were tuned. The world has changed since 2019. The Linux kernel has changed even more. memfd_create(2) went mainstream as the post-exploit primitive of choice. eBPF graduated from "interesting tracing technology" to "the new attacker filesystem" with /sys/fs/bpf as a persistence anchor. Container runtimes moved tens of thousands of binaries out of the host filesystem entirely and into overlay merges that no host FIM tool walks. systemd-run learned to spawn transient timer units that never touch disk. io_uring learned to read files without ever calling read(2). BlackTech, GhostEmperor, and a handful of state-aligned teams have demonstrated initramfs-level persistence in the wild. The 2019-era AIDE config did not anticipate any of this and, almost certainly, was never updated to handle it.
What we find when we audit customer environments is consistent enough to be depressing. The FIM stack is present. It runs on a schedule. It produces logs that someone, in theory, reviews. The auditor sees the green checkbox and signs the report. The attacker also sees the green checkbox, and by the time their second-stage tooling lands, none of it touches the parts of the filesystem the FIM stack actually watches. AIDE excludes /tmp, /var/tmp, /run, /proc, /sys. Samhain runs on a 24-hour cadence with UseInotify=no. OSSEC <syscheck> skips /sys. Wazuh-FIM is realtime on /etc only. The union of those four tools, as deployed by default, leaves enough room to drive a kernel module through.
This post is the field-grade walkthrough. Eight composite-bypass tests, each named after a specific evasion primitive, each paired with the configuration audit that catches it. If your FIM stack has not been re-tuned since the kernel hit 5.0, you should be reading this with aide --check open in another window.
2. Eight ways to slip past the checkers
The CelvexGroup test catalog tracks these as ENDPOINT-FIM-STACK-001 through ENDPOINT-FIM-STACK-008. Each one is implemented as a configuration-and-evidence audit: the customer uploads their AIDE config, Samhain RC, OSSEC ossec.conf, and Wazuh agent FIM block, and the scanner computes the union coverage and the residual blind spots. The scanner never executes anything on the host. Everything below is reproducible by hand from the customer's own configs and a handful of read-only commands.
001 -- memfd_exec reachable: the unhashed in-memory binary
The Linux memfd_create(2) syscall (kernel 3.17, documented at man7.org/linux/man-pages/man2/memfd_create.2.html) returns a file descriptor backed by anonymous memory, with no on-disk path. Pass that descriptor to execveat(2) with AT_EMPTY_PATH and the kernel will execute whatever bytes you wrote into it, with no inode the FIM tool can hash and no path it can walk. The process's /proc/<pid>/exe symlink points at /memfd:<name> (deleted), which AIDE, Samhain, OSSEC <syscheck>, and Wazuh-FIM all uniformly skip.
The technique was popularised by writeups like magisterquis.github.io/2018/03/31/in-memory-only-elf-execution.html and is now standard tradecraft. Cobalt Strike, Sliver, Mythic, Metasploit, and most modern Linux post-ex frameworks ship a memfd loader. The FIM stack as deployed by default has zero coverage for it.
The audit. Grep all four configs for an explicit memfd: or in-memory-binary watch. There almost never is one. Then forward the two auditd rules below into Wazuh and run a periodic /proc/*/exe walker as a wodle command:
-a always,exit -F arch=b64 -S memfd_create
-a always,exit -F arch=b64 -S execveat
The complementary collector is one line:
find /proc/*/exe -lname '*memfd:*' -o -lname '*(deleted)*' 2>/dev/null
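Track the count over time. Non-zero results outside a known-good allow-list (systemd-cryptsetup, runc) are the signal. Expect benign (deleted) hits after package upgrades, when a long-running process is still executing a binary that has since been replaced on disk.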
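A minimal sketch of the /proc walker with an allow-list, assuming a tab-separated record layout and an operator-supplied regex of known-good process names. Both are illustrative choices, not anything a FIM tool defines. The filter reads records on stdin so it can be exercised against captured data as well as a live host.

```shell
# Sketch only. The record layout and the allow-list regex are illustrative
# assumptions, not part of any FIM tool.

# Print "pid<TAB>comm<TAB>exe-target" for each process whose exe link is readable.
list_exe_links() {
    for dir in /proc/[0-9]*; do
        target=$(readlink "$dir/exe" 2>/dev/null) || continue
        comm=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s\t%s\t%s\n' "${dir#/proc/}" "$comm" "$target"
    done
}

# Filter those records (stdin): keep memfd-backed or deleted executables whose
# comm does not match the allow-list regex passed as $1 (empty = allow nothing).
flag_memfd_exes() {
    awk -F '\t' -v allow="$1" '
        ($3 ~ /memfd:/ || $3 ~ /\(deleted\)$/) && (allow == "" || $2 !~ allow)
    '
}
```

On a host this would be run as list_exe_links | flag_memfd_exes '^(runc|systemd-cryptsetup)$', with the line count forwarded as the tracked metric.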
002 -- exclude-coverage union audit: 40% of disk excluded by default
Take a typical Ubuntu 22.04 production server with stock AIDE, Samhain, OSSEC and Wazuh-FIM configs. Compute the union of paths the four tools actually watch. Subtract from total disk bytes. The number is uncomfortable.
In our customer audits the median union-coverage is 58%. The 25th percentile is 42%. Standard exclusions across all four tools include /tmp, /var/tmp, /var/cache, /var/log (log churn), /home (user-file churn), /opt (nobody updated the include list when the application moved there), /run, /proc, /sys, and most container-engine state directories. The scanner test computes coverage as 1 - excluded_bytes / total_bytes and FAILs when that coverage falls below a configurable threshold (default 70%, raised to 80% for production gates).
T1564.001 (Hidden Files and Directories) sits in the top quartile of techniques observed in 2024-2026 incident reports, and the staging location is almost always inside the FIM exclude set. /tmp is a perennial favourite. /var/tmp survives reboots. /dev/shm is tmpfs and ignored by every FIM tool. The remediation is unglamorous: document the exclude list as a single source of truth, decide per-path whether to re-include or fill the gap with auditd, and audit the percentage quarterly. Customers can act on a number. They cannot act on a 200-line rule diff.
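The coverage arithmetic is small enough to sketch. The function names below are hypothetical, and the byte counts are assumed to have been gathered separately (for example by summing disk usage over the documented exclude list). The thresholds follow the 70% default described above.

```shell
# coverage_pct TOTAL_BYTES EXCLUDED_BYTES
# Prints union coverage as a percentage with one decimal place.
coverage_pct() {
    [ "$1" -gt 0 ] 2>/dev/null || { echo "total must be a positive integer" >&2; return 2; }
    awk -v total="$1" -v excluded="$2" 'BEGIN { printf "%.1f\n", 100 * (total - excluded) / total }'
}

# coverage_gate TOTAL_BYTES EXCLUDED_BYTES [THRESHOLD_PCT]
# Returns 0 (PASS) when coverage meets the threshold, 1 (FAIL) otherwise.
# The threshold defaults to 70, the default gate named in the text.
coverage_gate() {
    pct=$(coverage_pct "$1" "$2") || return 2
    awk -v pct="$pct" -v min="${3:-70}" 'BEGIN { exit !(pct >= min) }'
}
```

With 1000 bytes total and 420 excluded, coverage_pct prints 58.0, which is the median figure quoted above and fails the default gate.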
003 -- bpffs pinned objects: /sys/fs/bpf is the new attacker filesystem
The eBPF subsystem ships a dedicated filesystem at /sys/fs/bpf, documented in Documentation/bpf/bpf_devel_QA.rst and the bpf(2) man page. Programs, maps, and links can be pinned there with BPF_OBJ_PIN, so they outlive the loader process that created them. An attacker with a malicious eBPF program (LSM hook, kprobe, tracepoint) can pin it at /sys/fs/bpf/<name> and re-attach it on every reload without ever touching a userspace path. The persistence is rootkit-grade: the program runs inside the kernel's tracing infrastructure, can hook getdents, kill, exec, and a hundred other syscalls, and lives on a filesystem that no major FIM tool monitors by default.
AIDE's default config excludes /sys outright. Samhain's stock policy never recurses into /sys. OSSEC <syscheck> has /sys in the default ignore list. Wazuh-FIM does not realtime-watch bpffs. The four-tool union has zero coverage for the entire bpffs surface.
The audit. Run bpftool prog show -j and bpftool map show -j and ls -laR /sys/fs/bpf on the host. Capture the program types, the program tags, the load times, and the pinned paths. Then audit each FIM config for an explicit /sys/fs/bpf include. If none of the four tools watches it (they do not), and the bpftool snapshot shows BPF_PROG_TYPE_KPROBE, BPF_PROG_TYPE_TRACEPOINT, or BPF_PROG_TYPE_LSM programs pinned outside an allow-list of known tracers (Falco, Cilium, BCC), the audit FAILs.
The remediation. Wazuh syscheck supports realtime on bpffs in 4.4+. AIDE handles it via a manual line:
/sys/fs/bpf p+u+g+md5+sha256
Forward bpftool prog list output to the SIEM on a five-minute cadence, alert on new program IDs, and treat any unsanctioned tracepoint or LSM program as an incident. Symbiote (Intezer's 2022 writeup) and BPFDoor (PwC's 2022 analysis) both abused BPF in the wild, chiefly as packet filters rather than pinned objects. TripleCross is the academic reference implementation of a full eBPF rootkit. The threat model is well-documented and the detection costs almost nothing.
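A sketch of the allow-list check on a bpftool snapshot. It assumes the plain-text layout that bpftool prog show prints, with one header line per program of the form "<id>: <type>  name <name>  tag <tag> ..." followed by indented detail lines. The function names and the allow-list regex are hypothetical.

```shell
# bpf_prog_index: reduce plain `bpftool prog show` output (stdin) to
# "id type name" lines. Assumes header lines shaped like
# "<id>: <type>  name <name>  tag <tag> ..."; unnamed programs get "-".
bpf_prog_index() {
    awk '/^[0-9]+: / {
        id = $1; sub(/:$/, "", id)
        name = "-"
        for (i = 3; i < NF; i++) if ($i == "name") { name = $(i + 1); break }
        print id, $2, name
    }'
}

# bpf_unsanctioned ALLOW_RE: from index lines on stdin, print kprobe,
# tracepoint, and lsm programs whose name does not match the allow-list regex.
bpf_unsanctioned() {
    awk -v allow="$1" '
        ($2 == "kprobe" || $2 == "tracepoint" || $2 == "lsm") && (allow == "" || $3 !~ allow)
    '
}
```

A wodle could run bpftool prog show | bpf_prog_index | bpf_unsanctioned '^(falco|cil_)' and alert on any output. The prefixes here are placeholders for whatever names your sanctioned tracers actually load.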
004 -- baseline provenance: your AIDE database was tampered before you snapshotted
The most elegant FIM evasion is also the laziest. The attacker lands on the host, drops their persistence, and then runs aide --init (or aide --update, or the Samhain equivalent, or whatever the local rebaseline command happens to be). The new baseline now contains the attacker's modifications as part of the "clean" state. Every subsequent scan reports zero changes. Detection silently fails forever, until somebody notices the database mtime is newer than they remember.
The audit. Capture per-tool baseline DB metadata: aide.db.gz mtime, samhain.db mtime, syscheck.db mtime, Wazuh fim.db mtime. Compare against host install metadata: /etc/machine-id ctime, /var/log/installer/ timestamps, and the btime boot epoch in /proc/stat. The two should be coherent. A baseline DB older than the host install date means the database survived something it should not have (image template carryover). A baseline DB newer than the host install date by more than ~90 days without a documented rebaseline event means somebody, possibly the attacker, regenerated the database after initial deployment.
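The coherence rule reduces to a comparison of two epoch timestamps. A minimal sketch, assuming both values are gathered beforehand (for instance with GNU stat -c %Y on the baseline DB and stat -c %Z on /etc/machine-id). The function name and the verdict strings are hypothetical.

```shell
# baseline_verdict BASELINE_EPOCH INSTALL_EPOCH [MAX_DAYS]
# Classifies a baseline DB timestamp against the host install timestamp.
# MAX_DAYS defaults to 90, the tolerance used in the text.
baseline_verdict() {
    baseline=$1 install=$2 max_days=${3:-90}
    if [ "$baseline" -lt "$install" ]; then
        # DB is older than the host: likely carried over from an image template.
        echo "predates-install"
    elif [ $(( (baseline - install) / 86400 )) -gt "$max_days" ]; then
        # DB was regenerated well after install: needs a documented rebaseline.
        echo "late-rebaseline"
    else
        echo "coherent"
    fi
}
```

A late-rebaseline verdict is not proof of tampering. It is the prompt to go and find the change ticket that explains it.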
The remediation is structural. Sign the FIM baseline at install time with a build-system key. Verify the signature on every scan. Forbid in-place re-baselining; force every regeneration through change-management with dual approval. Store every baseline DB hash and signing time in a tamper-resistant evidence store: Vault transit engine, AWS KMS audit log, or off-host write-only syslog. The Samhain documentation at la-samhna.de/samhain/manual/index.html has covered baseline integrity since the early 2000s; the AIDE manual covers it under the --init and --update sections. The kit has been there. Customers just do not configure it.
005 -- container runtime drift: overlay store gaps
Run AIDE on a host that runs Docker or containerd or CRI-O. Generate the baseline. Now exec into a long-running container, modify a binary inside it, and run AIDE again. AIDE reports zero changes. The reason is that AIDE hashes paths on the host, and the container's filesystem lives in an overlay merge under /var/lib/docker/overlay2/<id>/merged/ (or /var/lib/containers/storage/overlay/<id>/merged/). The merged directory is materialised only when the container is running, contains a synthetic union of the image layers and the container's read-write layer, and is essentially never on the host FIM include list.
The result is a real, not hypothetical, customer pattern: AIDE reports "no changes since Tuesday" while a long-lived production container is being actively tampered with at runtime. T1525 (Implant Internal Image) covers the variant where the image itself is modified pre-deployment. The runtime variant is even easier and more common.
The audit. For each running container, capture (a) the image digest, (b) the image-layer file-hash manifest from the registry, (c) a sampled in-container file-hash set computed via nsenter or docker exec. Divergence between (b) and (c) without a corresponding image change is runtime drift. The simplest fix is in-container FIM (a Wazuh agent inside the container, or an admission-controlled sidecar diffing / against the image manifest on a five-minute cadence). The structural fix is enforced container immutability: readOnlyRootFilesystem: true, short-lived containers, image-pull-on-deploy, and allowPrivilegeEscalation: false. Both Kubernetes Pod Security Standards and the Docker overlayfs-driver docs cover the operational pattern.
006 -- realtime / inotify config audit: most FIMs run periodic, not realtime
Wazuh-FIM supports realtime via inotify on Linux and via the FileSystemWatcher API on Windows. OSSEC <syscheck> supports realtime="yes" on its <directories> entries. Samhain supports UseInotify=yes. AIDE has no native realtime mode and is fundamentally a periodic scanner. The four-tool union, when correctly configured, can hit realtime coverage on every critical path. As deployed by default, it usually does not.
The audit operates in two modes. Static-config audit on the four config bundles, looking for the realtime tunables. Network-observable audit against the Wazuh manager API at /manager/configuration/syscheck (which leaks the <realtime> posture if the manager API is on the public side of the network, a common misconfiguration that the same scanner family catches as a separate finding). The fail gate fires when fewer than two of the four stacks declare realtime monitoring on at least one critical path (/etc, /usr/sbin, /usr/bin, /root).
The cost of the gap is straightforward. Without realtime, FIM is a daily snapshot. The attacker lands at 09:00, installs persistence, covers their tracks, and the next scheduled scan at 02:00 either catches it 17 hours later or never catches it because the cleanup ran in between. The remediation is one Wazuh stanza:
<directories realtime="yes" check_all="yes" report_changes="yes">/etc,/usr/bin,/usr/sbin,/root</directories>
Plus realtime="yes" on the <directories> entries in OSSEC and UseInotify=yes in Samhain. The Wazuh and OSSEC FIM documentation at documentation.wazuh.com/current/user-manual/capabilities/file-integrity/configuring-fim.html and ossec.net/docs/manual/syscheck/syscheck-real-time.html and Samhain's la-samhna.de/samhain/manual/inotify.html all walk through this. It is a four-line config change. It is also one of the highest-leverage tunables in any FIM deployment, and we still find it disabled in the majority of production environments.
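A sketch of the static-config half of this audit for a Wazuh or OSSEC ossec.conf. It assumes each <directories> element sits on a single line, as in the stanza above, and it matches critical paths exactly, so a realtime watch on a parent directory is not credited. The function names are hypothetical.

```shell
# realtime_dirs CONF
# Prints every path listed in a <directories ... realtime="yes" ...> element,
# one per line. Assumes each element fits on one line.
realtime_dirs() {
    awk '
        /<directories[^>]*realtime="yes"[^>]*>/ {
            line = $0
            sub(/.*<directories[^>]*>/, "", line)
            sub(/<\/directories>.*/, "", line)
            n = split(line, parts, ",")
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", parts[i])
                if (parts[i] != "") print parts[i]
            }
        }
    ' "$1"
}

# realtime_gap CONF PATH...
# Prints each critical path that has no exact-match realtime watch.
realtime_gap() {
    conf=$1; shift
    watched=$(realtime_dirs "$conf")
    for p in "$@"; do
        printf '%s\n' "$watched" | grep -Fxq -- "$p" || printf '%s\n' "$p"
    done
}
```

Running realtime_gap /var/ossec/etc/ossec.conf /etc /usr/sbin /usr/bin /root against the "realtime on /etc only" deployment would list the other three paths.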
007 -- transient systemd-run unit history: the timer that vanishes before scan
systemd-run --on-active=, systemd-run --on-calendar=, and the related transient-unit modes documented at freedesktop.org/software/systemd/man/systemd-run.html create units that live in /run/systemd/transient/. That path is tmpfs. It is never persisted to disk. Every FIM tool's default config skips /run because it is tmpfs and skipping it avoids alert noise from the boot-time churn. The result is that an attacker can run systemd-run --on-calendar=daily /tmp/.beacon and get a recurring scheduled callback that lives entirely in memory, disappears on reboot, and never trips any FIM scan.
T1053.006 (Scheduled Task/Job: Systemd Timers) is in active use by multiple state-aligned teams. The transient variant is the harder-to-detect version because the unit file does not live on the disk the FIM tool watches.
The audit. Snapshot systemctl list-units --type=timer,service --all and ls -la /run/systemd/transient/ on a 15-minute cadence. Do not filter with --state=running: an armed timer sits in the waiting sub-state and would be dropped. Diff against a known baseline. Alert on new run-*.timer and run-*.service units whose ExecStart falls outside an allow-list. Forward systemd audit events to Wazuh through <localfile><log_format>journald</log_format><location>journald</location></localfile> and add a decoder for systemd-run invocations. The unit history lives in journald, not on disk, so the journald shipper is the canonical collection path.
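The snapshot-and-diff step can be sketched as two small filters. The input is assumed to be systemctl list-units --plain --no-legend --all output, and the baseline is assumed to be a file with one approved unit name per line. Both function names are hypothetical.

```shell
# transient_units: from `systemctl list-units --plain --no-legend --all`
# output on stdin, print the names of run-*.timer and run-*.service units.
# Checks the first two columns in case a status bullet precedes the name.
transient_units() {
    awk '{
        for (i = 1; i <= 2; i++)
            if ($i ~ /^run-.*\.(timer|service)$/) { print $i; break }
    }' | sort -u
}

# new_transient_units BASELINE_FILE: unit names on stdin that are not listed
# in the baseline file (one approved unit name per line).
new_transient_units() {
    awk 'FILENAME == ARGV[1] { known[$0] = 1; next } !($0 in known)' "$1" -
}
```

Because transient unit names are generated per invocation, a name-based baseline only helps for units that are still alive between snapshots. The ExecStart allow-list mentioned above remains the stronger check.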
008 -- initramfs diff vs distro digest: the BlackTech / persistence vector
The initramfs is the early-userspace ramdisk that runs before the kernel hands off to systemd. On Debian-family systems it lives at /boot/initrd.img-<version>. On RHEL-family systems it is /boot/initramfs-<version>.img. AIDE, Samhain, OSSEC, and Wazuh-FIM all hash the file as a single blob. None of them decompress and walk the contents. A tampered initramfs is therefore the canonical Linux kernel-mode persistence primitive: an LD_PRELOAD shim, a custom init script, a malicious early-userspace module, or a UEFI-stage payload that runs before any monitoring is in scope.
BlackTech, as documented across the Trend Micro and Mandiant 2023-2024 writeups, used initramfs-level persistence as part of the broader UEFI-bootkit toolchain. The pattern recurs in Kaspersky's GhostEmperor analysis and in the various Lazarus-attributed Linux-server intrusions reported by AhnLab and Symantec. The threat model is real and has been real for years. The detection is an unmkinitramfs away.
The audit. Run unmkinitramfs /boot/initrd.img-$(uname -r) /tmp/initrd-extracted/ (or the dracut equivalent: from inside /tmp/initrd-extracted/, run lsinitrd --unpack /boot/initramfs-$(uname -r).img, which extracts into the current directory). Hash every file in the extracted tree. Compare against the distro-package-reported digest: debsums initramfs-tools on Debian, rpm -V dracut on RHEL. Those commands verify the generator package rather than the generated image, so also diff the extracted manifest against one taken from a known-good image of the same kernel version. Any divergence outside the expected per-host customisation (machine-id, microcode-ucode blob, locally-configured crypttab) is a kernel-mode-compromise incident. Where Secure Boot is enabled, verify the signed-shim chain.
The reference documentation is at manpages.debian.org/testing/initramfs-tools-core/unmkinitramfs.8.en.html and the dracut sources at github.com/dracutdevs/dracut. The MITRE technique is T1542.003 (Bootkit). The remediation is a Wazuh <command> wodle calling a small shell helper that pipes the unmkinitramfs output to sha256sum and forwards the manifest to the manager on a weekly cadence. Treat any divergence as critical.
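A sketch of the kind of helper that wodle could call. It hashes an already-extracted tree with relative paths, so that manifests from different extraction directories stay comparable, and it drops paths matching an ignore regex for the expected per-host customisation. The function name and the example regex are hypothetical.

```shell
# initrd_manifest DIR [IGNORE_RE]
# Prints a path-sorted "sha256  ./relative/path" manifest for every regular
# file under DIR, skipping paths that match IGNORE_RE (empty = skip nothing).
initrd_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) |
        awk -v ignore="$2" '
            # Path starts after the hash plus the two-character separator.
            { path = substr($0, length($1) + 3) }
            ignore == "" || path !~ ignore
        ' |
        LC_ALL=C sort -k 2
}
```

For example, initrd_manifest /tmp/initrd-extracted 'etc/(machine-id|crypttab)$|/microcode/' would produce the manifest to forward. The exact paths worth ignoring depend on how your distribution lays out the image.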
3. The compound-bypass problem
The eight tests above are individually significant. They are catastrophic in compound. The reason is that the four tools' exclusion lists are nearly identical where it matters: the parts of the system the attacker actually wants to use are uncovered by every tool simultaneously.
Take a representative deployment: AIDE on the nightly cron, Samhain on a 24-hour cycle without inotify, OSSEC <syscheck> with the default exclude list, Wazuh-FIM realtime on /etc only. The compound coverage map looks like this.
/tmp is excluded by AIDE (default ignore), excluded by Samhain (default policy), excluded by OSSEC (default ignore list), and not in any Wazuh-FIM include. Union coverage on /tmp: zero. The attacker stages tooling in /tmp/.cache/.X11-unix/.beacon and the four-tool stack never sees it.
/sys/fs/bpf is excluded by AIDE (the /sys exclusion), excluded by Samhain (skips /sys), excluded by OSSEC (default ignore list contains /sys), and not in any Wazuh-FIM include. Union coverage: zero. The attacker pins their eBPF rootkit and the four-tool stack never sees it.
/run/systemd/transient/ is in /run which is tmpfs, which is in every tool's default ignore set. Union coverage: zero. The attacker schedules transient timers and the four-tool stack never sees them.
In-memory binaries via memfd_create have no on-disk path at all. Union coverage: zero. The attacker runs unhashed binaries and the four-tool stack never sees them.
Container overlay merges live under /var/lib/docker/overlay2/, which is in no tool's default include set (because the FIM rules were written before containers were a deployment pattern). Union coverage: zero. The attacker tampers a long-lived container and the four-tool stack never sees it.
Initramfs blobs are hashed but not decomposed. The single-blob hash matches because the blob is the same blob; the decompressed contents have changed. Effective coverage on the contents: zero. The attacker plants their bootkit and the four-tool stack does not notice.
That is six independent zero-coverage surfaces, each one usable on its own, each one combinable with the others. The compound mathematics is not "AIDE catches some things, Samhain catches others, the union covers most of the system." The compound mathematics is "all four tools were tuned with the same default exclusions written from the same threat model from a different decade, and the modern attacker primitives all happen to land inside that shared exclusion set." This is not a redundant defence. It is a single defence, deployed four times, with the same blind spots.
Adding a fifth FIM tool with the same default config does nothing. The fix is to enumerate the modern primitive surfaces (memfd, bpffs, transient units, overlays, initramfs contents) and put each one inside at least one tool's include set, with a complementary auditd or SIEM rule for anything that cannot live inside a path-based watch. The first step is measuring the current exclude union. Test 002 exists for exactly this reason.
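The union check itself is a prefix match across every tool's include list. A minimal sketch, assuming each tool's effective includes have already been flattened into a file with one absolute path per line. That flattening step, and the function name, are hypothetical.

```shell
# union_covers PATH INCLUDE_FILE...
# Returns 0 if PATH equals, or sits beneath, any include prefix listed in any
# of the files (one absolute path per line); returns 1 otherwise.
union_covers() {
    target=$1; shift
    [ "$#" -gt 0 ] || return 1
    awk -v target="$target" '
        $0 == "/" { found = 1; exit }          # a root include covers everything
        { prefix = $0; sub(/\/+$/, "", prefix) }
        # Match whole path components only, so /usr/bin does not cover /usr/binary.
        prefix != "" && (target == prefix || index(target, prefix "/") == 1) { found = 1; exit }
        END { exit !found }
    ' "$@"
}
```

Looping this over /tmp, /sys/fs/bpf, /run/systemd/transient, and the overlay store, with all four include files as arguments, reproduces the zero-coverage map above for the path-based surfaces. The check sees includes only, so a path inside an include that is later excluded by a negative rule needs separate handling.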
4. What to check today
This section is the unglamorous version: the commands you run on your own production host, this afternoon, to find out if any of the eight findings above actually apply to you. None of them are dangerous. All of them are read-only, apart from the scratch files the initramfs check writes under /tmp.
AIDE database freshness and contents:
sudo aide --check 2>&1 | head -30
sudo stat /var/lib/aide/aide.db.gz
sudo grep -E '^\s*(!|[A-Za-z/])' /etc/aide/aide.conf | grep -E '!|=' | sort -u
The third line dumps every exclude (!) rule and every line containing = (group and variable definitions). Read it. Look for /tmp, /var/tmp, /run, /sys, /proc, /home, /var/log. Decide whether each one is intentional.
Samhain configuration and inotify state:
sudo samhain -t check -p info 2>&1 | head -20
sudo grep -E '^(UseInotify|SetMailTime|SetLooptime)' /etc/samhain/samhainrc
sudo stat /var/lib/samhain/samhain_file
UseInotify=no (or absent) is a finding. SetLooptime greater than 3600 seconds on a production system is a finding.
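Those two rules can be sketched as a small parser over samhainrc. It assumes simple "Key = value" lines and uses the option names exactly as they appear in this post, matched case-insensitively. The function name and the finding strings are hypothetical.

```shell
# samhain_findings RCFILE
# Prints one line per finding. Assumes "Key = value" lines and the option
# names used in this post (UseInotify, SetLooptime), matched case-insensitively.
samhain_findings() {
    awk -F '=' '
        /^[ \t]*#/ { next }                      # skip comment lines
        { key = tolower($1); gsub(/[ \t]/, "", key)
          val = $2;          gsub(/[ \t]/, "", val) }
        key == "useinotify"  { inotify = tolower(val) }
        key == "setlooptime" { loop = val + 0 }
        END {
            if (inotify != "yes") print "finding: UseInotify is not set to yes"
            if (loop > 3600) print "finding: SetLooptime " loop " exceeds 3600 seconds"
        }
    ' "$1"
}
```

An absent UseInotify line is reported the same way as an explicit no, matching the rule above.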
OSSEC / Wazuh syscheck configuration:
sudo grep -A 20 '<syscheck>' /var/ossec/etc/ossec.conf
sudo grep -E '<directories|<ignore' /var/ossec/etc/ossec.conf
Look for <frequency> (anything over 7200 seconds is too long), realtime="yes" (must be set on the <directories> entry for at least one critical path), and the <ignore> list (compare against the default; what was added?).
eBPF persistence surface:
sudo bpftool prog show
sudo bpftool map show
sudo ls -laR /sys/fs/bpf 2>/dev/null
Any program of type kprobe, tracepoint, lsm, or cgroup_skb that you cannot attribute to a known tracer (Falco, Cilium, BCC, the kernel's own bpf-prog-show entries) is a finding. Pinned objects in /sys/fs/bpf outside /sys/fs/bpf/cilium, /sys/fs/bpf/falco, or your known toolchain are findings.
memfd_exec scan:
find /proc/*/exe -lname '*memfd:*' 2>/dev/null
find /proc/*/exe -lname '*(deleted)*' 2>/dev/null
Any non-zero output outside systemd-cryptsetup, runc, and known-good interpreters is a finding.
Initramfs integrity:
sudo unmkinitramfs /boot/initrd.img-$(uname -r) /tmp/initrd-extracted/
sudo find /tmp/initrd-extracted/ -type f -exec sha256sum {} \; | sort > /tmp/initrd.sha256
sudo debsums initramfs-tools 2>&1 | grep -v OK
Anything in the third command's output means a file shipped by the initramfs-tools package has been modified, which is a critical finding.
Transient systemd units:
systemctl list-units --type=timer,service --all --plain --no-legend | grep -E '^\s*run-'
ls -la /run/systemd/transient/
Each run-*.timer or run-*.service is a transient unit. Verify that every one has a known operator-initiated provenance.
5. Closing
The FIM stack is the oldest detection layer in the Linux SOC and, in most production environments, the most stale. The compliance frameworks that drove its adoption never updated their threat model when memfd_create shipped, when bpffs became attacker-reachable, when containers ate the workload, or when transient systemd units became the persistence pattern. The auditor's checkbox still says "FIM deployed". The attacker's playbook has moved on.
The fix is not to rip out AIDE, Samhain, OSSEC, or Wazuh-FIM. They still do useful work where they were tuned. The fix is to measure the union exclude surface, enumerate the modern primitives that fall outside it, and close each gap with the right tool: a Wazuh realtime watch, an auditd rule, a bpftool snapshot wodle, an unmkinitramfs weekly diff, an in-container sidecar. The eight tests above are the field-grade checklist. Run them. Read your own configs. Then run them again next quarter.