How attackers chain SSRF and cloud metadata for credential theft

The attacker is uploading an avatar. Or pasting a URL into a markdown previewer. Or letting your serverless function fetch a webhook target. The function fetches http://169.254.169.254/latest/meta-data/iam/security-credentials/ instead, and forty seconds later the IAM session token is in someone else's hands. The class is fifteen years old. Capital One in 2019 was a textbook case. The reason it's the dominant cloud-side initial-access pattern of 2026 is not that nobody knows how it works. It's that the controls that stop it cold are still optional, still buried in a checkbox, and still missing on roughly nine in ten production environments we audit.

This is the first piece in our attack-research series. The thesis we built CelvexGroup on is that the security industry has been selling defenders a slideshow of what attackers could do, and what defenders need is a tape of what attackers are doing, with the exact reproduction so engineering can fix it on the same call. Verifiable security. No screenshots. No "could not reproduce." Today's piece is the cloud-side initial-access class that has shown up in more incident-response retainers than any other in the past eighteen months.

If you run anything substantial on AWS, Azure, GCP, DigitalOcean, Hetzner, or any of their second-tier siblings, the attack we walk through below is on a list someone is working through this week. The question is whether the version of your stack they hit is the one with seven layers of defense or the one with two.

The attack pattern in one paragraph

Every cloud provider exposes a metadata service to its virtual machines on a non-routable link-local address: 169.254.169.254 on AWS and DigitalOcean; the same address, also reachable as metadata.google.internal and gated on a Metadata-Flavor: Google request header, on GCP; and the same address gated on a Metadata: true request header on Azure. The metadata service ships back the VM's identity, its user-data, the names of the IAM roles attached to it, and, the prize, short-lived session credentials that the IAM role can use to call the cloud control plane. If an attacker can convince a process running on that VM to make an HTTP request on her behalf to that address, the response goes back through whatever side channel she has, whether reflected, blind, or stored, and she now holds an IAM session that, at minimum, can read everything that role can read, and at worst can pivot to keys-of-the-kingdom.
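The per-provider differences matter because a header requirement is what separates "any GET-only SSRF wins" from "the SSRF also has to control request headers". A minimal sketch of that comparison follows; the MetadataEndpoint type and the reachable_by_plain_get helper are illustrative names of ours, and the Azure api-version value is just one commonly used version, not a requirement of the argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEndpoint:
    name: str
    url: str
    required_headers: tuple  # header names the service insists on


ENDPOINTS = (
    MetadataEndpoint("aws-imdsv1", "http://169.254.169.254/latest/meta-data/", ()),
    MetadataEndpoint("aws-imdsv2", "http://169.254.169.254/latest/meta-data/",
                     ("X-aws-ec2-metadata-token",)),
    MetadataEndpoint("gcp", "http://metadata.google.internal/computeMetadata/v1/",
                     ("Metadata-Flavor",)),
    MetadataEndpoint("azure",
                     "http://169.254.169.254/metadata/instance?api-version=2021-02-01",
                     ("Metadata",)),
    MetadataEndpoint("digitalocean", "http://169.254.169.254/metadata/v1/", ()),
)


def reachable_by_plain_get(endpoints):
    """Names of endpoints that answer a GET carrying no special headers."""
    return [e.name for e in endpoints if not e.required_headers]


print(reachable_by_plain_get(ENDPOINTS))  # -> ['aws-imdsv1', 'digitalocean']
```

A header requirement is not a complete defense. It only raises the bar to SSRF primitives that let the attacker set headers, which some do.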

The "convince a process to make a request on her behalf" part is the SSRF. In 2026, the SSRF entry points that matter are not the textbook ?url=http://... proxy parameter. They are: image-and-document processors that follow remote URLs (Pandoc, ImageMagick, Chatwoot's upload pipeline), serverless functions that fetch user-supplied webhook targets, OEM "import from URL" features in CMS platforms, PDF generators that resolve external resources, and AI orchestration libraries that cheerfully fetch whatever the LLM was told was a useful tool. Ostorlab's CVE-2026-5205 writeup on Chatwoot is the canonical recent example: an authenticated agent uploads a "file," the upload handler fetches an attacker-supplied external_url, and the response, including a complete metadata bundle from the DigitalOcean droplet, comes back to the attacker.

Why this is still working in 2026

Three reasons, in order of how often we see them:

IMDSv1 is still on by default in too many places. AWS shipped IMDSv2 in 2019 and made it default for new launches in late 2023. The keyword is new. Existing instances, instances launched from older AMIs, instances created by Terraform modules pinned to old provider versions, instances created by automation that explicitly set HttpTokens=optional, all still expose v1. Industry estimates suggest more than nine in ten instances did not enforce IMDSv2 as of 2022, and the trend has improved but not flipped. v1 lets you query the metadata service with a single GET. v2 requires a PUT to obtain a session token first, and SSRFs that only allow GET cannot complete that handshake.
The IAM roles attached to compute have more permissions than they need. The role on your image-processing worker has S3 write to the public-assets bucket, plus S3 read on the data warehouse exports bucket because someone added it during a one-off backfill and never removed it. The session token an SSRF steals is whatever role the VM had attached. Least-privilege at the role level limits the blast radius even when v1 is on.
The application code that fetches user-controlled URLs has no allow-list. The function that follows a URL trusts the response. It does not check whether the resolved IP falls in the link-local range 169.254.0.0/16, which contains 169.254.169.254. It does not pin DNS or refuse redirects. It does not run with an egress firewall that blocks the metadata IP. F5 Labs documented an active March 2025 campaign that ran exactly this fetch-and-grab pattern across exposed EC2 instances for ten days before defenders got noisy enough to push the operators off.
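The v1-versus-v2 difference in the first reason can be made concrete as request shapes. The sketch below only builds urllib request objects and never sends them. The primitive_can_send function is a hypothetical model of an SSRF primitive (which methods it allows, whether it lets the caller set headers), not a description of any specific vulnerability.

```python
import urllib.request

IMDS = "http://169.254.169.254"
CREDS_PATH = "/latest/meta-data/iam/security-credentials/"


def v1_credentials_request(role=""):
    # IMDSv1: one unauthenticated GET is the whole protocol.
    return urllib.request.Request(f"{IMDS}{CREDS_PATH}{role}", method="GET")


def v2_token_request(ttl_seconds=21600):
    # IMDSv2 step 1: a PUT with a TTL header returns a session token.
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def v2_credentials_request(token, role=""):
    # IMDSv2 step 2: the GET must carry the token in a header.
    return urllib.request.Request(
        f"{IMDS}{CREDS_PATH}{role}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def primitive_can_send(req, allowed_methods, can_set_headers):
    """Hypothetical model: can this SSRF primitive emit this request?"""
    if req.get_method() not in allowed_methods:
        return False
    return can_set_headers or not req.header_items()


get_only = {"allowed_methods": {"GET"}, "can_set_headers": False}
print(primitive_can_send(v1_credentials_request(), **get_only))
print(primitive_can_send(v2_token_request(), **get_only))
```

Under this model a GET-only, no-headers primitive can send the v1 request but neither of the v2 requests, which is the whole argument for enforcing v2 rather than leaving it optional.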

The attacker decision tree

ATTACKER DECISION TREE
SSRF -> Cloud Metadata

┌──────────────────────────────────────┐
│ 1. Find an HTTP-fetching surface     │
│    - image upload that supports URL  │
│    - markdown link previewer         │
│    - webhook target validator        │
│    - PDF / report generator          │
│    - AI-tool URL fetcher             │
└──────────────┬───────────────────────┘
               │
               ▼
┌──────────────────────────────────────┐
│ 2. Try direct metadata IP            │
│    http://169.254.169.254/...        │
│     │                                │
│     ├──→ blocked: try dns rebind     │
│     ├──→ blocked: try IPv6           │
│     │    [::ffff:169.254.169.254]    │
│     ├──→ blocked: try alt forms      │
│     │    2852039166, 0xa9fea9fe      │
│     └──→ blocked: try redirect chain │
│          via attacker-controlled URL │
└──────────────┬───────────────────────┘
               │
               ▼
┌──────────────────────────────────────┐
│ 3. Identify the cloud / IMDS ver.    │
│    v1 → single GET, response = win   │
│    v2 → need PUT for token; usually  │
│         fail unless SSRF supports    │
│         arbitrary methods/headers    │
└──────────────┬───────────────────────┘
               │
               ▼
┌──────────────────────────────────────┐
│ 4. Steal short-lived credentials     │
│    /iam/security-credentials/        │
│    → AccessKeyId/SecretKey/Token     │
└──────────────┬───────────────────────┘
               │
               ▼
┌──────────────────────────────────────┐
│ 5. Pivot from the session token      │
│    - sts:GetCallerIdentity           │
│    - enum buckets, dbs, kms          │
│    - look for cross-account roles    │
│    - exfil within the token's TTL    │
└──────────────────────────────────────┘

The five-step pattern an SSRF-into-metadata operator runs against any cloud-hosted target.

Steps 2 and 3 are where the half-measures break down. A defender who blocks 169.254.169.254 as a literal string in their HTTP client deny-list misses 2852039166 (decimal IP), 0xa9fea9fe (hex), 0251.0376.0251.0376 (octal), and [::ffff:169.254.169.254] (IPv4-mapped IPv6). A defender who blocks all four IP encodings misses DNS rebinding from a name the attacker controls. A defender who blocks all of those by configuring an egress firewall at the network layer rather than the application layer catches the lot, which is why network-level egress controls plus IMDSv2 plus minimal IAM roles is the only layer cake that works against a creative operator.
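To see why a literal-string check fails, it helps to canonicalise each encoding and compare addresses instead of text. The sketch below is a minimal illustration: canonical_ip and is_metadata_target are names of ours, the numeric parsing follows the classic inet_aton conventions (hex, octal, and short forms), and hostnames are deliberately left unresolved because DNS is a separate problem.

```python
import ipaddress
import re

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
_NUM = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")


def _parse_part(part):
    """Parse one dotted component the way inet_aton does (hex, octal, decimal)."""
    if not _NUM.fullmatch(part):
        raise ValueError(part)
    if part[:2].lower() == "0x":
        return int(part, 16)
    if len(part) > 1 and part[0] == "0":
        return int(part, 8)
    return int(part, 10)


def _legacy_ipv4(host):
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    try:
        *head, last = [_parse_part(p) for p in parts]
    except ValueError:
        return None  # not a numeric host, e.g. a DNS name
    # Leading parts are one byte each; the last part fills the remaining bytes.
    if any(n > 0xFF for n in head) or last >= 256 ** (4 - len(head)):
        return None
    value = last
    for i, n in enumerate(head):
        value |= n << (8 * (3 - i))
    return ipaddress.IPv4Address(value)


def canonical_ip(host):
    """Return the address a numeric host denotes, or None for a hostname."""
    host = host.strip("[]")
    if ":" in host:
        try:
            ip6 = ipaddress.IPv6Address(host)
        except ValueError:
            return None
        return ip6.ipv4_mapped if ip6.ipv4_mapped is not None else ip6
    return _legacy_ipv4(host)


def is_metadata_target(host):
    ip = canonical_ip(host)
    return ip is not None and ip.version == 4 and ip in LINK_LOCAL


for h in ("169.254.169.254", "2852039166", "0xa9fea9fe",
          "0251.0376.0251.0376", "[::ffff:169.254.169.254]"):
    print(h, is_metadata_target(h))
```

All five spellings canonicalise to the same address. A hostname returns None here, which means "resolve it, then check the result", not "safe".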

A composite real-world scenario

The following is composite. The pieces are real; we have walked variations of this attack chain in three customer post-mortems in the past nine months. We have anonymised, but the chain itself is verbatim.

The target is a mid-market SaaS company: about 200 employees, $40M ARR, runs on AWS in two regions, has the kind of security maturity where someone owns it part-time but there is no full-time CISO. The product lets customers upload company logos, screenshots of their CRM dashboards, and PDF brochures into the customer-success portal so the AE can include them in QBR decks. The upload endpoint accepts either a multipart file or a JSON body with an image_url field. The latter exists because three large customers asked for it during onboarding so they could automate uploads from their internal asset management system.

An attacker who paid $80 for a recycled enterprise email address (the kind sold by initial-access brokers; we wrote about the Q1 2026 pricing in our threat-intel blog) signs up for a free-trial customer-success portal account, finds the upload endpoint within 15 minutes of poking around, and lobs the first probe. image_url=http://169.254.169.254/latest/meta-data/. The endpoint fetches it, transcodes the response (which is plain text starting with ami-id) as if it were an image, and the transcode fails with a parse error. The error response includes the first 200 bytes of the fetched content. That is the side channel.

From there it is mechanical. Two more requests confirm IMDSv1 is on. Six more enumerate the IAM role list. The seventh fetches /iam/security-credentials/<role-name> and the response includes AccessKeyId, SecretAccessKey, and Token. The attacker copies the credentials into a fresh AWS CLI session and starts enumerating. aws sts get-caller-identity tells her which account she is in. aws s3 ls tells her which buckets the role can touch. One of those buckets contains nightly database snapshots in parquet format because the analytics team needed them out of RDS and into Athena and the role on the upload worker is the one that handles "asset-related transient files." She pulls the snapshots, walks away, and forty-eight hours later the customer is reading her ransom note.

The whole chain takes thirty-five minutes from the first probe. The customer's WAF logged the requests but did not flag them because 169.254.169.254 in a query parameter is not a default rule. The egress firewall did not exist because the upload worker was in a private subnet behind a NAT gateway and the SRE team had not gotten around to writing a network-egress policy for the worker fleet. The IAM role had the bucket read because of a Terraform module someone forked in 2023 and never reviewed.

None of those gaps was unknown to the security team. They were all on the backlog. The attacker just got there first.

What we observe in customer environments

This section is written conservatively because we are honest about what we can and cannot measure. CelvexGroup runs continuous, signed, replayable validation against assets the customer flags into our scope. We are not a SIEM and we do not see post-incident telemetry from production environments unless the customer connects it. What we do see, on the external attack surface side, is a representative sample of how often the controls that stop SSRF-into-metadata are present.

Across the customer engagements we have run in the past six months, the rough breakdown of cloud-hosted assets we have verified against the metadata-SSRF chain is:

Roughly half have IMDSv2 enforced on the assets we tested. The other half are running v1, v2-optional, or we could not determine without internal access.
Roughly a third have an application-layer URL allow-list on at least one URL-fetching endpoint. The other two thirds rely on transit-layer or network-layer controls, which are inconsistent across environments.
Roughly one in ten have an egress firewall that explicitly blocks 169.254.0.0/16 from compute subnets. This is the single highest-leverage control in the stack and the least-frequently-deployed.
Less than one in twenty have IAM roles on URL-fetching workloads that have been right-sized to least-privilege within the past twelve months. Most have visible role bloat from features that shipped, were rolled back, or were superseded but never had the role pruned.

The honest read is that a competent attacker walking into one of these environments with this chain in mind will succeed roughly nine times out of ten. That is not a market-positioning claim. It is what the public surface of mid-market SaaS looks like when you look at it cold.
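The IMDSv2 figure is the one a team can most easily measure for its own fleet. As a rough sketch, the function below walks a response shaped like the EC2 DescribeInstances output and flags instances that still allow IMDSv1 or have a hop limit above 1. The imds_findings name and the sample data are illustrative; in practice the input would come from the AWS SDK or CLI, page by page.

```python
def imds_findings(describe_instances_response):
    """Return (instance_id, [problems]) for instances with weak IMDS settings."""
    findings = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            opts = instance.get("MetadataOptions", {})
            if opts.get("HttpEndpoint") == "disabled":
                continue  # no metadata service to reach at all
            problems = []
            if opts.get("HttpTokens") != "required":
                problems.append("IMDSv1 allowed")
            hop_limit = opts.get("HttpPutResponseHopLimit", 1)
            if hop_limit > 1:
                problems.append(f"hop limit {hop_limit}")
            if problems:
                findings.append((instance["InstanceId"], problems))
    return findings


# Illustrative data only; the instance IDs are made up.
SAMPLE = {
    "Reservations": [{
        "Instances": [
            {"InstanceId": "i-0example1",
             "MetadataOptions": {"HttpTokens": "optional",
                                 "HttpPutResponseHopLimit": 2,
                                 "HttpEndpoint": "enabled"}},
            {"InstanceId": "i-0example2",
             "MetadataOptions": {"HttpTokens": "required",
                                 "HttpPutResponseHopLimit": 1,
                                 "HttpEndpoint": "enabled"}},
        ]
    }]
}

for instance_id, problems in imds_findings(SAMPLE):
    print(instance_id, ", ".join(problems))
```

This only covers the first bullet. The allow-list, egress, and IAM figures need application and network review that no single API response will give you.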

What to do about it: the seven-item, one-sprint checklist

The reason we built this piece around a checklist is that the controls are not new and the techniques to deploy them are not exotic. The reason they are missing is that nobody owned them and nothing forced the issue. Here is the list. If your team can hand-wave a "yes" on all seven, the SSRF-into-metadata class is not your highest-priority cloud risk this quarter. If you cannot hand-wave more than three, the rest of this section is the most valuable hour you spend this week.

Seven-control hardening list: cloud metadata SSRF

Enforce IMDSv2 on every existing instance. The per-instance change is aws ec2 modify-instance-metadata-options --instance-id <instance-id> --http-tokens required --http-endpoint enabled --http-put-response-hop-limit 1. Script it as a fleet-wide change with a rollback plan. Block instance launches that try to set HttpTokens=optional at the SCP level.
Set http-put-response-hop-limit=1. This keeps the IMDSv2 token response from crossing an extra network hop, so a container on a bridge network, or anything else forwarding through the host, cannot complete the token handshake. It does not stop a process running in the host network namespace. Default is 1 on new instances; many older instances are 2 or 3.
Egress block on the metadata IP for every URL-fetching workload. Add an explicit deny rule for 169.254.169.254/32 outbound from the process or container that fetches URLs. On AWS this has to be a host-level rule (iptables or nftables matched on the application's user, or a container network policy), because security groups and network ACLs do not filter traffic to the instance metadata service. The same holds on GCP, where VPC firewall rules never block a VM's traffic to its own metadata server. This is the single highest-leverage control because it stops every encoding trick below the application layer.
Application-layer URL allow-lists or pin-to-IP for every endpoint that fetches user-controlled URLs. The library you want is one that resolves the URL once, validates the resolved IP against a deny-list (RFC 1918, link-local, multicast, IPv4-mapped IPv6 link-local), and then connects to the validated IP rather than re-resolving the hostname mid-request. This kills DNS rebinding cold.
IAM role audit on every workload that fetches URLs. If the role on an image-processing worker can read your customer database, your finance bucket, or any cross-account artefact, that is the wrong role. Use iamlive or AWS Access Analyzer to generate a least-privilege policy from observed traffic, then deploy.
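Egress logging on URL-fetching workloads, queryable for thirty days. If you cannot query your logs for connections to 169.254.169.254, you cannot detect this attack post-hoc. VPC flow logs will not do it on AWS: they do not record instance-metadata traffic. The record has to come from the host, for example a logging firewall rule or the application's own outbound-request log. Most teams have flow logs. Most teams have not built the host-side source or the query.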
SIEM rule that fires on any successful connection to 169.254.169.254 from a workload that has no business there. Allow-list the CSP agent. Alert on everything else. The false-positive rate on this rule is near zero in our experience: legitimate metadata traffic comes from the CSP agent, kubelet, and a handful of well-known userspace tools. Anything else is investigation-worthy.
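The resolve-once, validate, then pin pattern from the application-layer item can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a drop-in library: resolve_and_validate and is_forbidden are names of ours, the resolver argument exists so the logic can be exercised without real DNS, and the caller is assumed to connect to the returned IP (sending the original hostname in the Host header) with redirects disabled. Pinning for HTTPS also needs SNI and certificate handling, which is out of scope here.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_forbidden(ip):
    """True for destinations a URL fetcher should never reach."""
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # judge IPv4-mapped IPv6 by the embedded IPv4
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def resolve_and_validate(url, resolver=socket.getaddrinfo):
    """Resolve the URL's host once and return (ip, port) to connect to.

    Raises ValueError if the scheme is unsupported or ANY resolved
    address is forbidden. The caller must connect to the returned IP
    rather than re-resolving the hostname, which is what defeats
    DNS rebinding.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("unsupported URL")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    # Drop any IPv6 scope id ("fe80::1%eth0") before parsing.
    addrs = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    if not addrs or any(is_forbidden(a) for a in addrs):
        raise ValueError("destination not allowed")
    return str(addrs[0]), port
```

Rejecting the request when any resolved address is forbidden, rather than picking the first acceptable one, is a deliberately strict choice: a name that resolves to both a public and a link-local address is almost never legitimate.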


How Celvex catches this

Our approach to the SSRF-into-metadata class follows the same four-step framework we apply to every attack pattern in our scanner corpus.

Find. Prove. Fix. Verify.

Find

Continuous external scan of customer-flagged assets identifies URL-fetching endpoints, fingerprints the cloud provider, and probes for IMDSv1 vs v2 indicators against the metadata IP via every encoding listed above.

Prove

For confirmed exposures, the scanner generates a Proof Capsule: the exact request body that triggered the metadata fetch, the truncated response evidence, and a self-contained replay.sh the customer's engineer runs against their own asset.

Fix

The Capsule's remediation block points at the seven-item checklist above, scoped to the affected workload: which IMDS setting to flip, which egress rule to add, which role to audit. Engineering reads it like a ticket.

Verify

After the fix lands, the customer re-runs replay.sh. The same probes that returned metadata before now return network-blocked. The finding closes automatically. The dashboard records a verified-fix event for the auditor.

The Proof Capsule for a metadata-SSRF finding looks like this in skeleton form. The point is that the customer's engineer can re-run the exploit themselves, in their own environment, without trusting anything we wrote in the writeup.

# capsule.yaml — SSRF -> cloud metadata schematic
finding_id: CELVEX-2026-SSRF-METADATA-{customer}-{asset}
attack_class: ssrf-cloud-metadata-credential-theft
severity: critical
target:
  asset: uploads.example-customer.com
  endpoint: "POST /api/v1/uploads {image_url: ...}"
  cloud: aws
  region: us-east-1
preconditions:
  - asset accepts user-supplied URL in image_url field
  - asset has IAM role attached
  - asset network egress allows 169.254.169.254
artifacts:
  - poc/replay.sh                # one-shot reproducer
  - poc/expected-output.txt      # truncated metadata, watermarked
  - poc/screen-recording.mp4     # 28-second walkthrough
  - poc/cleanup.sh               # rotates IAM session, removes test upload
proof:
  imds_version_detected: v1
  metadata_paths_reachable:
    - /latest/meta-data/instance-id
    - /latest/meta-data/iam/security-credentials/
  credentials_observed: <hash-only, never raw, watermarked>
remediation:
  primary: enforce-imds-v2-fleet-wide
  secondary: egress-deny-169.254.169.254-from-uploads-subnet
  tertiary: audit-iam-role-on-uploads-worker
  references:
    - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-IMDS-existing-instances.html
disclosure:
  reported_to_customer: 2026-05-04T09:14:00Z
  customer_acknowledged: 2026-05-04T09:31:00Z

We are honest about where this fits on the autonomy maturity curve. CelvexGroup ships at L1.5 today: the scanner has a tagged corpus for SSRF-into-metadata across the major cloud providers, the Capsule format is signed end-to-end, and the replay primitive runs unattended on the customer side. Our L2 milestone in 90 days is full automated chain validation: the scanner not only confirms the exposure, but follows the chain into a mock-IAM environment to confirm what data the stolen session token would have reached. L3 by twelve months is autonomous mutation of the SSRF probe against unfamiliar URL-fetching surfaces: the scanner discovers a new endpoint shape and synthesises probes specific to it without human-in-the-loop. We are not there yet. We will say so on the day we are.

The point of the framework is that customers do not have to trust our autonomy claims. The Capsule is the proof. Run it. Watch the metadata come back. Apply the seven controls. Re-run. Watch the request fail. Find. Prove. Fix. Verify.

Bottom line

SSRF into the cloud metadata service is the most common cloud-side initial-access pattern of 2025-2026 because the controls that stop it are well-known, individually inexpensive, and almost never deployed in full. The fix is a one-sprint hardening exercise plus a continuous validation loop that confirms the fix held this week, last week, and the week the auditor asks. Pen-testers hand you a PDF once a year. We run the same attacks every week, every asset, with a fix attached to every finding.

Verifiable security. That is what we ship.
