The exposed LLM builder: when the AI control plane becomes the breach

A team stands up a low-code LLM app builder for a prototype, leaves the UI and API reachable, and forgets it. That instance is not a toy. It holds the model-provider keys, the database credentials, and the tools the agent can call. CVE-2026-46442 (CVSS 9.9) turns one such builder, Flowise, into authenticated remote code execution through the custom-function node. Here is the decision tree from a discovered builder to host compromise, and the boundary that ends it.

An engineer wires up a drag-and-drop LLM builder to prototype a support bot. They paste in an OpenAI key, point a node at the production read-replica, add a couple of tool integrations, and bind the UI to 0.0.0.0:3000 "so the PM can click through it." The prototype ships, the demo lands, and the instance keeps running, unauthenticated, on a routable interface, for months. That instance is now the highest-value box on the network and nobody is watching it.

It is the AI control plane: it stores the credentials to every model provider and connected system, it holds the prompts and the data flows, and it can execute code and reach internal services on the agent's behalf. CVE-2026-46442 (CVSS 9.9) is the sharp edge of this class: in Flowise before 3.1.2, POST /api/v1/node-custom-function lacks route-level authorization, so any authenticated user or API key can submit arbitrary JavaScript to the Custom JS Function node, escape the NodeVM sandbox, and reach child_process for command execution on the host.

This piece walks the decision tree from a discovered LLM builder to credential theft, SSRF, and code execution, and it is the direct sequel to our piece on MCP unauthenticated tool invocation: same thesis, one layer up. The AI control plane is now production infrastructure, and it is being run like a scratch pad.

Low-code LLM builders, Flowise being the canonical example, exist to collapse the distance between an idea and a running agent. You drag a chat node, a retriever node, a tool node, and a model node onto a canvas, fill in the keys, and you have an application. That is exactly why these tools spread through engineering orgs faster than any security review can follow. The problem is not the abstraction; it is what the abstraction concentrates. To wire a working agent the builder must store the secrets that make it work: the model-provider API keys, the database connection strings, the OAuth tokens for connected SaaS, and the definitions of every tool the agent can invoke. A builder instance is a credential vault, an SSRF launch point, and a code-execution surface, fused into one web app that someone stood up in an afternoon and never threat-modeled.

This is part of our attack-research series. Where the MCP piece walked a single missing auth boundary on a tool transport, this one walks the broader pattern: the entire AI control plane left reachable, with the builder UI and API as the front door to everything behind it. By the end you should be able to fingerprint your own builder exposure, walk the four-stage decision tree, and ship the boundary that takes the control plane off the open internet. Verifiable security.

The attack pattern in one paragraph

A low-code LLM builder exposes a web UI and a REST API that together let a caller read and edit flows, read and write credentials, and execute the nodes a flow is built from. The instance was stood up for a prototype, so it runs with default or absent authentication, bound to a routable interface or fronted by a proxy with no access control. An attacker who reaches it crosses one boundary at a time. First, discovery: the builder has a recognizable HTTP fingerprint, predictable ports (3000 is the Flowise default), and an API surface that answers without credentials or with only weak basic auth in front of it. Second, credential read: the builder's own endpoints return stored secrets, and in real instances the filtering is inconsistent. CVE-2026-46443 is exactly such a case: a credential fetch with a filter parameter fails to strip the encryptedData field that the unfiltered path correctly omits. Third, tool and SSRF abuse: nodes that fetch URLs become server-side request forgery primitives aimed at cloud metadata; nodes that query databases read every row the connection can see. Fourth, code execution: a custom-function node runs attacker-supplied JavaScript, and a sandbox escape (CVE-2026-46442) turns that into command execution on the host. Each stage is a self-contained compromise; together they are total. The single observation underneath: the builder treats its own API as a trusted internal tool, but it is sitting on a network as an untrusted front door.

The unifying thesis with our MCP work: the AI control plane, the layer that holds the keys and runs the tools, is now production infrastructure, and a large fraction of it is deployed as if it were a developer toy. The attacker lives in that gap.

Why this still ships in 2026

These are good tools written by competent teams, and the recent Flowise advisories were responsibly disclosed and patched in 3.1.2. So why is the class alive and well? The reasons are structural to how prototyping tools get adopted.

  1. The default posture is "trusted single user." A builder is designed to run on a developer's machine for one person. Authentication is frequently opt-in or a single shared basic-auth credential, and the prototype that becomes production never gets a real identity layer. CVE-2026-46440 was precisely this: a checkBasicAuth path that compared plaintext credentials directly, with no rate limiting, the shape of auth a tool ships when "real auth comes later."
  2. The whole point is to execute code and call tools. A custom-function node that runs JavaScript is a feature, not a bug; the same is true of a node that fetches a URL or queries a database. The builder's value is code and tool execution, which means the security model has to be route-level authorization on every one of those surfaces, and a missing check on a single endpoint, CVE-2026-46442's node-custom-function route, is authenticated RCE.
  3. Secrets are concentrated by design. To make an agent work the builder must hold the model keys and the connection strings, so any read primitive that leaks them, an inconsistent filter (CVE-2026-46443), a mass-assignment that reassigns a resource across a tenant boundary (CVE-2026-46441, CVE-2026-42861/2/3), or a missing permission check on a CRUD route (CVE-2026-46444), pays out in live credentials, not just data.
  4. It slips past the tooling that should catch it. Appsec scanners look at the apps your team writes; the builder is a third-party app nobody registered as production. Cloud-posture tooling flags a public load balancer or an open security group, but a builder on an internal subnet, or one a developer port-forwarded "just for the demo," is invisible to both. The control plane falls in the seam between the two programs.

None of these is a single vendor's failing. They are the predictable result of a tool category whose adoption curve outran its deployment discipline, in exactly the way MCP servers did. The Flowise CVEs are the first well-numbered cluster of the class, and Flowise will not be the last builder to ship one.

The attacker decision tree

ATTACKER DECISION TREE
Exposed LLM Builder -> Control-Plane Compromise

┌──────────────────────────────────────────┐
│  1. Discover an exposed builder          │
│    - HTTP fingerprint / title / favicon  │
│    - default port 3000, /api/v1/* routes │
│    - reachable directly OR via a proxy   │
│      with no real access control         │
└────────────────┬─────────────────────────┘
                 │
                 ▼
┌──────────────────────────────────────────┐
│  2. Read config + stored secrets         │
│    - credentials endpoint leaks          │
│      encryptedData (CVE-2026-46443)      │
│    - model keys, DB strings, OAuth tokens│
│    - flow definitions = the data map     │
└────────────────┬─────────────────────────┘
                 │
                 ▼
┌──────────────────────────────────────────┐
│  3. Abuse tools + SSRF                   │
│    - URL-fetch node -> cloud metadata    │
│      (169.254.169.254) = role creds      │
│    - DB node -> read every reachable row │
│    - cross-workspace reassignment        │
│      (mass assignment) breaks isolation  │
└────────────────┬─────────────────────────┘
                 │
                 ▼
┌──────────────────────────────────────────┐
│  4. Execute code on the host             │
│    - custom-function node runs JS        │
│    - NodeVM sandbox escape -> child_proc │
│      (CVE-2026-46442, CVSS 9.9 RCE)      │
│    - pivot from the control plane        │
│      into everything it could reach      │
└──────────────────────────────────────────┘

Four stages from a reachable builder to host compromise. Each stage is its own breach; the credentials in stage 2 alone are usually game over.

The decisive realization is that an attacker does not need stage 4 to win. Stage 2 hands them the model-provider keys (billing fraud and data access at the provider) and the database connection strings (direct read of production data). Stage 3 turns a benign-looking URL-fetch node into the cloud-metadata SSRF chain we have walked before, now reachable with no exploit at all, just the builder's intended feature pointed at 169.254.169.254. Stage 4 is the worst case, authenticated RCE on the host, and CVE-2026-46442 shows how short the path is once a single route skips its authorization check. The whole tree is scriptable, and our own probe stops at the first confirmed read: one credential-endpoint response or one config read is the proof. We never run the custom-function exploit against anything but our own disposable lab build.
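The stage-3 boundary is an egress decision: which targets a URL-fetch node is allowed to reach. A minimal sketch of that decision for literal IPv4 targets follows. The helper name is_denied_egress_target is hypothetical, and denying the whole 169.254.0.0/16 link-local range (rather than only 169.254.169.254) is our assumption. Real enforcement belongs in the network layer, not in a script; this only illustrates the rule a policy should encode.

```shell
# Hypothetical helper: succeeds (exit 0) when a literal IPv4 target should be
# denied to builder nodes. Hostnames must be resolved before calling this.
is_denied_egress_target() {
  case "$1" in
    169.254.*)                              return 0 ;;  # link-local, includes cloud metadata
    10.*)                                   return 0 ;;  # RFC 1918: 10.0.0.0/8
    192.168.*)                              return 0 ;;  # RFC 1918: 192.168.0.0/16
    172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;  # RFC 1918: 172.16.0.0/12
    *)                                      return 1 ;;  # anything else: not denied by this rule
  esac
}

# Example use:
#   is_denied_egress_target 169.254.169.254 && echo "deny: metadata endpoint"
```

A string match like this is only as good as its input. It does not follow redirects or re-check an address after DNS resolution, which is why the contract later in this piece puts the control in the egress policy itself.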

A composite real-world scenario

The setting is a mid-size SaaS company whose data team adopted a low-code LLM builder to prototype an internal "ask the docs" assistant. To make it useful they connected three things: an OpenAI credential for the model, a Postgres credential pointed at an analytics read-replica, and an HTTP-request node for pulling internal API docs. They bound the UI to 0.0.0.0:3000 behind the office reverse proxy so teammates could try it, protected it with a single shared basic-auth string from the README example, and moved on. The prototype became the thing three teams use daily. Nobody added real authentication, nobody put it in the asset inventory, and the version was pinned the day it was installed.

An attacker with any internal foothold (a phished laptop, a guest-WiFi pivot, or a malicious dependency running in CI) finds the builder by its fingerprint and predictable API surface. The shared basic-auth string is in a wiki page; even without it, the missing rate limit and direct comparison in the basic-auth path of the era (CVE-2026-46440) make guessing it cheap. Once past the front door, the attacker does not reach for a flashy exploit first. They read credentials:

# Stage 2: read stored credentials via the builder's own API.
# The unfiltered path strips encryptedData; the filtered path (CVE-2026-46443) does not.
$ curl -s 'http://builder.internal:3000/api/v1/credentials?credentialName=openAIApi' \
    -H 'Authorization: Basic <shared>'
[{"id":"...","name":"prod-openai","credentialName":"openAIApi",
  "encryptedData":"...leaked ciphertext the unfiltered route would have omitted..."}]

That single response is the boundary crossing: the control plane handed back stored secret material it should never return. From here the attacker has options that need no further bug. The Postgres node's connection reads every row of the analytics replica. The HTTP-request node is a server-side request forgery primitive: point it at the cloud metadata endpoint and the response is the instance role's temporary credentials, the same SSRF-to-credential-theft chain we have documented, now reachable as a feature rather than a flaw.
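Defenders can run the same read against their own instance and check only for the presence of the field, never its contents. A minimal sketch, assuming the response shape shown above: the helper name leaks_encrypted_data is hypothetical, and the text match is a heuristic that a JSON-aware parser would tighten.

```shell
# Hypothetical helper: reads a credentials-API response on stdin and succeeds
# (exit 0) if any "encryptedData" key appears in it. It prints nothing, so the
# ciphertext never lands in a log.
leaks_encrypted_data() {
  grep -q '"encryptedData"[[:space:]]*:'
}

# Example use against your own lab build (same request as above):
#   curl -s 'http://builder.internal:3000/api/v1/credentials?credentialName=openAIApi' \
#       -H 'Authorization: Basic <shared>' | leaks_encrypted_data \
#     && echo "filtered path returns encryptedData"
```

Checking for the key rather than capturing the value keeps the probe non-destructive: one read, one boolean.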

And then the worst case. The builder exposes a custom-function node so flow authors can write a little glue JavaScript. In vulnerable versions (before 3.1.2), POST /api/v1/node-custom-function never checks that the caller is authorized for code execution. When no external sandbox key (E2B_APIKEY) is set, which the advisory calls the common deployment case, the code runs in a NodeVM sandbox that can be escaped to reach the host process object and child_process. That is CVE-2026-46442, CVSS 9.9: authenticated remote code execution on the builder host, from inside a tool whose entire job is to run user code. Our probe would stop long before this, at the stage-2 read. A real attacker would not stop, and the box they land on is the one holding every key.

The reason it works is not exotic. The builder treated its own API as a trusted internal surface, when in deployment it was an untrusted front door on a routable network, holding the most valuable secrets in the environment.

Why this slips past both appsec and cloud-posture tooling

This class is dangerous precisely because it falls in a coverage seam. Application security programs test the code your team writes and the apps your team registers. A low-code builder is a third-party app that a single engineer installed; it never entered the SDLC, never got a threat model, and is not in the scope a pentest was sold against. Cloud-posture tooling (CSPM) reliably flags a public S3 bucket, an open security group, or an internet-facing load balancer, but it reasons about cloud-provider configuration, not about a container a developer runs on an internal subnet, a service they port-forwarded for a demo, or a builder reachable through an existing proxy. The instance has no "public = true" attribute for CSPM to alarm on, and no source repository for SAST to scan. It is production infrastructure that neither program knows exists. That invisibility, not any single CVE, is why exposed builders persist: the tooling that would catch a public database never gets pointed at the control plane.
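One check that neither program runs is a plain look at what a host is listening on. A rough sketch, assuming a Linux host with ss available and the Flowise default port from earlier in this piece: the helper name non_loopback_listeners is hypothetical, and it only reports sockets that are bound beyond loopback, not whether anything upstream actually routes to them.

```shell
# Hypothetical helper: reads `ss -ltnH` style lines on stdin and prints the
# local address of every listener on the given port that is not loopback-only.
non_loopback_listeners() {
  awk -v port="$1" '
    {
      addr = $4                                   # 4th column: Local Address:Port
      n = split(addr, parts, ":")
      if (parts[n] != port) next                  # port is the text after the last colon
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host ~ /^127\./ || host == "[::1]") next  # loopback-only bind, skip
      print addr
    }'
}

# Example use on the host itself:
#   ss -ltnH | non_loopback_listeners 3000
```

Any line this prints is a builder-shaped listener that is reachable from off the host unless something else blocks it.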

What we observe in customer environments

We are honest about scope and about the limits of testing. We only ever lab-test our own builds of open-source LLM builders, in disposable containers with no egress after fetch, and any probe we ship against a customer's own flagged assets is non-destructive: a fingerprint, a read of a config or credential-listing endpoint, an auth-posture check, never the code-execution exploit, never a destructive call. Within those limits, surveying open-source builders and customer-flagged internal deployments over recent months, the rough shape:

The honest read, matching our research disposition: there is no novel CVE in this post, and we do not claim one. The Flowise cluster is real, responsibly disclosed, and patched in 3.1.2; our contribution is the class framing and the non-destructive detection of exposed control-plane builders before someone walks this tree against them.

What to do about it: the control-plane boundary contract

The fix is to treat the LLM builder as the production infrastructure it became: put authentication in front of it, control its egress, isolate its secrets, and keep its control plane off the open network.

LLM-builder hardening contract: controls that end the class

The LLM builder holds the keys, runs the tools, and reaches the data. Deploy it like the production infrastructure it is, not like the scratch pad it started as.

The audit, concretely, is a discovery-plus-posture sweep against every builder you can find:

# 1) reachable from a non-loopback origin?  (should be: no, unless behind real auth)
$ nc -z -w2 builder.internal 3000 && echo "REACHABLE off-host"

# 2) API responds without (or with only shared) auth?  (should be: rejected)
$ curl -s -o /dev/null -w '%{http_code}\n' http://builder.internal:3000/api/v1/credentials

# 3) version pinned below the patched line?  (should be: at or above 3.1.2 for Flowise)
$ curl -s http://builder.internal:3000/api/v1/version    # then compare to the advisory

# 4) can a node reach cloud metadata?  (should be: egress-denied)
#    confirm the egress policy blocks 169.254.169.254 and RFC-1918 from builder nodes

Any builder that answers its API without real auth, runs below the patched line, or can reach metadata from a node has no boundary. The sweep is finishable in an afternoon for any org that knows where its builders run, and the first task is usually discovering that they run in more places than anyone documented.
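Steps 2 and 3 of the sweep each end in a judgment call that is easy to script. A minimal sketch of both follows. The helper names classify_unauth_status and version_at_least are hypothetical, the verdict labels are ours, and extracting the version string from the endpoint's response is left to the caller because we do not assume its exact format here.

```shell
# Hypothetical helper: map the HTTP status of an unauthenticated credentials
# request (step 2) to a verdict.
classify_unauth_status() {
  case "$1" in
    401|403) echo "rejected" ;;      # an auth boundary answered
    2??)     echo "EXPOSED" ;;       # the API served the request without credentials
    *)       echo "inconclusive" ;;  # redirects, 404s, curl's 000 on failure: check by hand
  esac
}

# Hypothetical helper: succeed (exit 0) if dotted version $1 is at or above $2.
# Compares major, minor, patch numerically; missing fields count as 0.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Example use:
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://builder.internal:3000/api/v1/credentials)
#   classify_unauth_status "$code"
#   version_at_least "$reported_version" 3.1.2 || echo "below the patched line"
```

The comparison is numeric per field, so 3.10.0 sorts above 3.9.9, which a plain string comparison would get wrong.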

How Celvex catches this

Find. Prove. Fix. Verify.

Find

The scanner fingerprints exposed LLM builders (HTTP signature, default ports, /api/v1/* routes) and runs a non-destructive posture family covering reachability, weak or absent auth, vulnerable version lines, and metadata-reachable nodes, grounded in the real Flowise advisory cluster.

Prove

For a confirmed exposure we ship a signed Proof Capsule with the exact unauthenticated request and a single read-only response (a config or credential-listing read) against the customer's own lab build, Ed25519-signed for air-gapped verification. One read is the proof; we never run the code-execution exploit.

Fix

The Capsule's remediation block points at the boundary that failed: put auth in front, upgrade to 3.1.2 or later, deny metadata and RFC-1918 egress, isolate the secret, or disable the code node, with the exact setting and line to change.

Verify

After the fix lands, the unauthenticated call is rejected, the version is at or above the patched line, and the node can no longer reach metadata. The finding closes automatically and the verified-fix event is recorded for the audit trail.

Where we sit on the autonomy curve: at L1.5 today, the exposed-builder posture probes are grounded in the real Flowise advisory cluster and our existing web-application testing and SSRF families, not hypothesis, and ship as a non-destructive test set extending our AI-control-plane discovery work. At L2 within 90 days, the corpus adds builder-specific credential-surface and node-risk scoring (flagging URL-fetch, database, and code-execution nodes behind a weak boundary) across more builder products. At L3 within twelve months, the scanner synthesises product-specific posture probes for unfamiliar builders it fingerprints in customer environments, with a strict refuse-non-RFC1918-target and no-code-execution guard. We do not claim L3 today, and we do not claim a novel CVE: the Flowise cluster is real and patched. We claim our L1.5 catches the exposed-control-plane and credential-surface exposures reliably and ships a reproducible Capsule for each.

Bottom line

The headlines will keep treating each exposed LLM builder as its own incident. The class underneath is the thesis we have been building toward: the AI control plane, the layer that holds the model keys, the database credentials, and the tools the agent can call, is now production infrastructure, and a large fraction of it is deployed like a developer toy. CVE-2026-46442 is the sharp edge, authenticated RCE through a custom-function node in Flowise before 3.1.2, but the credentials are usually game over a stage earlier, and a URL-fetch node is an SSRF primitive without any exploit at all. It slips past appsec because it is a third-party app nobody registered, and past cloud-posture tooling because it is not a public-bucket-shaped problem. The fix is a contract: never expose the control plane, patch past the cluster, control egress, isolate secrets, and disable or sandbox the code nodes you do not need. Until that boundary exists, every exposed builder is one read away from handing an attacker the keys to everything the agent could reach. It is the MCP story one layer up, and the same conclusion holds.

Verifiable security. Find it. Prove it. Fix it. Verify the fix held. That is what we ship.
