The story everyone remembers about T-Mobile's 2023 breach is that 37 million customer records walked out the door through an API. The story everyone remembers about AT&T's 2024 breach is that 73 million records sat on a Snowflake instance with stolen contractor credentials. The story everyone forgot — because the news cycle moves on — is that both incidents are instances of the same pattern that has been showing up in incident reports since at least 2014. The vulnerability is not the credential, the WAF, or the cloud provider. The vulnerability is that one authenticated request returned more data than any human workflow ever needed.
If your API can return one million records on a single call, you have a bulk-extraction primitive whether you intended one or not. The question is not whether attackers can find it; the question is whether the rest of your stack treats "one valid request" and "one valid request that returns half your customer database" as different events. In most stacks we audit, it does not.
What happened
Three reference cases, in chronological order:
T-Mobile, January 2023. 37 million customer records exfiltrated through a misconfigured API endpoint. The endpoint had been deprecated years earlier — meaning it was no longer documented, no longer in the developer portal, no longer mentioned in any current customer-facing flow — but it was still routable from the internet, still wired to a backing data store, and still returned customer records on a request shaped roughly like a legacy account-lookup call. The endpoint required authentication, but the authentication scheme was a token issued under the original integration's permission model, which had since been superseded. Attackers reached the endpoint over a period of weeks and pulled data on a per-customer-id basis at sustained high rates. T-Mobile reported the attacker iterated through customer identifiers programmatically.
AT&T, April 2024. 73 million customer records exposed through a Snowflake tenant accessed using credentials stolen from a third-party contractor. The credentials reached the attacker through unrelated infostealer malware on the contractor's machine. Once authenticated to Snowflake, the attacker ran SQL queries against the customer-records tables. There was no row-level access control limiting what queries could return. There was no per-query rate limit. There was no anomaly alert on a query that returned tens of millions of rows in a single result set. The query, by Snowflake's standards, was syntactically valid and fully authorized; the issue was that the data plane's notion of "valid" did not include "proportional to a legitimate human workload."
Snowflake (cross-customer), May–June 2024. Roughly 165 customer tenants were reported as potentially exposed through the same route. The list was publicized progressively as customers acknowledged exposure, and included Ticketmaster, Santander, AT&T, Advance Auto Parts, and others. The vector was credential-based (typically infostealer-harvested credentials reused against Snowflake tenants that did not enforce multi-factor authentication), and the impact was identical to the AT&T case: authenticated SQL access to a tenant with no limit on records returned per unit of time produced bulk extraction at the speed of the network.
Three reference cases, one pattern: privileged data accessible via a single authenticated request, with no volume control beyond a default WAF configuration. The credential, the auth scheme, and the data store change. The architecture does not.
Why it kept working
The defensive mental model most teams use is binary: either a request is authenticated and authorized, or it is not. If it is, the request returns. If it is not, the request is rejected. That model was sufficient when the typical request returned one record, or one page of records, and the underlying database was sized to a single business unit. It is not sufficient when the same authenticated request can return millions of rows because the data plane is shared across the entire customer base.
The right model is proportionality: auth proportional to volume. A request that returns 1 record gets the auth requirements appropriate for retrieving 1 record. A request that returns 1 million records gets the auth requirements appropriate for retrieving 1 million records. Those requirements are different. They include things like step-up authentication, named-purpose justification, manager approval, time-window restrictions, and explicit rate limits expressed in records-per-minute rather than requests-per-minute. None of the three breach cases above had any of these controls in place.
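A minimal sketch of what "auth proportional to volume" could look like in code. The tier thresholds and control names below are illustrative assumptions, not values taken from any of the three incidents; the point is only that the required controls are a function of the estimated record count rather than a single yes/no check.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_records: int            # inclusive upper bound covered by this tier
    controls: tuple[str, ...]   # controls the caller must have satisfied


# Hypothetical thresholds and control names, chosen only for illustration.
TIERS = (
    Tier(1, ("session_auth",)),
    Tier(1_000, ("session_auth", "records_per_minute_limit")),
    Tier(100_000, ("session_auth", "records_per_minute_limit",
                   "step_up_auth", "named_purpose")),
)

# Anything above the largest tier is treated as a bulk export.
BULK_CONTROLS = ("session_auth", "records_per_minute_limit", "step_up_auth",
                 "named_purpose", "manager_approval", "time_window")


def required_controls(estimated_records: int) -> tuple[str, ...]:
    """Return the controls required for a request of this size."""
    if estimated_records < 0:
        raise ValueError("estimated_records must be non-negative")
    for tier in TIERS:
        if estimated_records <= tier.max_records:
            return tier.controls
    return BULK_CONTROLS


def is_allowed(estimated_records: int, satisfied: set[str]) -> bool:
    """Allow the request only if every required control has been satisfied."""
    return set(required_controls(estimated_records)) <= satisfied
```

Under these assumed tiers, a caller holding only a valid session can read one record but cannot pull a million: `is_allowed(1_000_000, {"session_auth"})` is false until the step-up, purpose, approval, and time-window controls are also satisfied.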
The deprecated-endpoint case (T-Mobile) and the active-endpoint case (Snowflake) share a deeper property too: deprecation in modern stacks is rarely deletion. An endpoint stops being documented; it does not stop being routable. A query stops being part of a current workflow; it does not stop being executable. The list of "paths through which an authenticated caller can reach the customer-records table" grows monotonically, because removing a path is a coordinated engineering effort and adding a path is a single pull request. Over five to ten years of platform evolution, the bulk-extraction surface grows by accretion. Nobody decides to expose it. Nobody decides not to.
The role of the WAF in all three cases deserves mention. WAFs are tuned to block individual malicious requests — SQL injection patterns, traversal attempts, recognized exploit signatures. They are not tuned to block legitimate requests that happen to return enormous result sets. A SQL query for SELECT * FROM customers against a Snowflake tenant by a user with read access does not look malicious to any signature-based control. It looks like a query. The WAF, doing exactly what it was configured to do, lets it through.
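The control a signature-based WAF lacks is one that counts records rather than inspecting request shape. As a rough sketch, here is a token bucket whose tokens are records instead of requests. The class name, the injectable clock, and the budget figures are our own assumptions for illustration; a production limiter would also need per-caller state and shared storage.

```python
import time
from typing import Callable


class RecordBudget:
    """Token bucket measured in records returned, not requests made."""

    def __init__(self, records_per_minute: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(records_per_minute)
        self.refill_per_second = records_per_minute / 60.0
        self.tokens = self.capacity   # start with a full budget
        self.clock = clock            # injectable so the limiter can be tested
        self.last = clock()

    def _refill(self) -> None:
        """Add back the budget earned since the last call, up to capacity."""
        now = self.clock()
        earned = (now - self.last) * self.refill_per_second
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def allow(self, record_count: int) -> bool:
        """Spend budget for a response of record_count rows, if available."""
        self._refill()
        if record_count <= self.tokens:
            self.tokens -= record_count
            return True
        return False
```

One design consequence worth noting: a single response larger than the whole per-minute budget can never pass, which is the intended behavior for an endpoint with no legitimate bulk workflow.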
What to check today
Five tests, ordered by how cheaply you can run them:
- List every endpoint that accepts a customer identifier and returns customer data, including endpoints not in your current API documentation. Pull from your API gateway logs over the last 90 days, not from the developer portal. Endpoints that respond but are not documented are the T-Mobile category. Anything in that list that does not have an active business owner who can explain what it is for should be turned off this week.
- Identify your top 10 endpoints by 99th-percentile response size. Sort by bytes returned, not by request count. Any endpoint that can return more than 10MB on a single response is a bulk-extraction candidate. Document the legitimate workflow that justifies that response size. If no workflow justifies it, the response should be paginated and capped.
- Test rate limits in records-per-minute, not requests-per-minute. A 60-requests-per-minute rate limit on an endpoint that returns 100,000 records per call permits 6 million records per minute of extraction. That is not a rate limit; it is an extraction-speed governor. Re-express your rate limits in records-per-minute and pick a number proportional to your legitimate users' actual workload.
- Audit your data warehouse for per-query result-size limits. Snowflake, BigQuery, Redshift, Databricks: each offers some mechanism for constraining what a single query may scan, run for, or return, though the mechanism and its granularity differ by platform. Most customers we audit leave these at the platform default, which is effectively unlimited. Pick a number that matches your largest legitimate analytical workload and enforce it on the customer-records tables specifically.
- Wire result-size anomalies to a paged channel. If a query against your customer-records table returns more rows in five minutes than the historical 99th-percentile of legitimate queries against that table, page someone. The detection logic is one query against your audit log. Most teams already have the data and have never written the detection.
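The last check on the list, paging on result-size anomalies, can be sketched in a few lines. This version assumes the audit log can be reduced to `(epoch_seconds, rows_returned)` pairs for the customer-records table, and it interprets the baseline as the 99th percentile of historical five-minute totals; both are assumptions, and your audit log schema will differ.

```python
import math
from collections import defaultdict

WINDOW_SECONDS = 300  # five-minute windows, as in the check above


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def rows_per_window(events):
    """Sum rows returned per five-minute window, keyed by window start."""
    totals = defaultdict(int)
    for timestamp, rows in events:
        window_start = int(timestamp // WINDOW_SECONDS) * WINDOW_SECONDS
        totals[window_start] += rows
    return dict(totals)


def anomalous_windows(history, recent, pct=99.0):
    """Return (window_start, rows) for recent windows above the baseline."""
    baseline = percentile(list(rows_per_window(history).values()), pct)
    return sorted(
        (start, total)
        for start, total in rows_per_window(recent).items()
        if total > baseline
    )
```

Anything `anomalous_windows` returns is a candidate for the paged channel; the history you feed it should contain only traffic you have reason to believe was legitimate.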
Two of these tests cost nothing — they are queries against logs you already have. Three of them require a configuration change but no new tooling. None of them require a vendor procurement cycle. The reason most environments fail all five is not budget; it is that nobody has owned "bulk extraction" as a category of risk distinct from "authentication" and "authorization."
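The two log-only checks (undocumented endpoints and top endpoints by response size) might look like this against parsed gateway logs. The log entry shape (`path`, `status`, `bytes`) and the endpoint paths in the usage note are hypothetical; adapt the field names to whatever your gateway actually emits.

```python
import math
from collections import defaultdict

BULK_THRESHOLD_BYTES = 10 * 1024 * 1024  # the 10MB line from the second check


def undocumented_endpoints(log_entries, documented_paths):
    """Paths that answered successfully but are missing from the docs."""
    responding = {
        entry["path"] for entry in log_entries
        if 200 <= entry["status"] < 300
    }
    return sorted(responding - set(documented_paths))


def top_by_p99_bytes(log_entries, limit=10):
    """Rank paths by nearest-rank 99th-percentile response size in bytes."""
    sizes = defaultdict(list)
    for entry in log_entries:
        sizes[entry["path"]].append(entry["bytes"])
    ranked = []
    for path, values in sizes.items():
        values.sort()
        rank = max(1, math.ceil(99 * len(values) / 100))
        ranked.append((path, values[rank - 1]))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:limit]


def bulk_candidates(log_entries):
    """Paths whose p99 response size exceeds the bulk threshold."""
    return [
        path for path, p99 in top_by_p99_bytes(log_entries)
        if p99 > BULK_THRESHOLD_BYTES
    ]
```

Fed 90 days of entries, a path such as a hypothetical `/v1/account-lookup` that still returns 2xx responses but is absent from `documented_paths` lands in the first list, and any path with a p99 above 10MB lands in the third.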
How CELVEX Group tests for this
Test API-BULK-EXTRACTION-PATTERN-001, defined in core/test_catalog/_supplement_pattern_analysis_2026-03.py, takes the architectural pattern from the three reference breaches and runs the equivalent probes against a customer's API surface and data plane. The test enumerates every reachable endpoint that accepts a customer identifier, measures 99th-percentile response sizes, and computes the maximum records-per-minute that an authenticated caller could pull through each path before hitting a rate limit. It also tests for deprecated-but-routable endpoints by replaying request shapes from older API versions against current hostnames and grading which still respond with data.
For data warehouses, the test runs authenticated queries representative of the breach pattern — large bulk reads of customer-records tables — and reports whether per-query result-size limits, anomaly alerts, and step-up auth are wired to detect or constrain the workload. The output is a table mapping each privileged data store to the worst-case extraction throughput an authenticated insider or compromised credential could achieve, and the specific control change that would cap the throughput at a defensible number.
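The worst-case throughput figure is simple arithmetic, and a reader can reproduce the idea without the test itself. The sketch below is not the implementation of API-BULK-EXTRACTION-PATTERN-001; the `ExtractionPath` type and the example numbers are assumptions used only to show the calculation (requests per minute multiplied by maximum records per response).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionPath:
    name: str
    requests_per_minute: int        # the enforced request rate limit
    max_records_per_response: int   # largest response the path can return


def worst_case_records_per_minute(path: ExtractionPath) -> int:
    """Ceiling on records an authenticated caller can pull per minute."""
    return path.requests_per_minute * path.max_records_per_response


def minutes_to_extract(path: ExtractionPath, total_records: int) -> float:
    """Minutes needed to pull total_records at the worst-case rate."""
    rate = worst_case_records_per_minute(path)
    return math.inf if rate == 0 else total_records / rate


def rank_paths(paths):
    """Order paths from highest to lowest worst-case throughput."""
    return sorted(paths, key=worst_case_records_per_minute, reverse=True)
```

With the figures used earlier in this piece, a path limited to 60 requests per minute that returns up to 100,000 records per call yields 6,000,000 records per minute, so a 37-million-record table would take a little over six minutes to drain.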
What we don't do is grade on whether authentication is required. All three cases (T-Mobile, AT&T, and the broader Snowflake set) required authentication. The bar that matters is whether the authentication scheme is proportional to the request's potential blast radius, and that bar is what the test measures.
Bottom line
Three reference cases. Two calendar years. One pattern. The bulk-extraction breach is not a 2024 phenomenon and it is not specific to telecoms. It is the predictable consequence of a defensive model that asks "is this caller authorized?" without also asking "is this caller's workload proportional to legitimate business activity?" The fix is not a new product category. It is treating record-volume as a first-class security signal alongside authentication and authorization, and writing the rate limits, result-size caps, and anomaly alerts that operationalize it.
If your environment cannot answer the question "what is the largest single response a customer-records endpoint or query can return today, and is that proportional to any legitimate workflow?" — then your stack is in the same architectural position that T-Mobile, AT&T, and 165 Snowflake tenants were in before the news cycle visited each of them. Run the test. Pull the number. Decide whether you can defend it. Then decide what you will change.