OpenClaw AI Agents Leaking Sensitive Data in Indirect Prompt Injection Attacks

OpenClaw AI Agents leaking private information through indirect prompt injection Attackers can use weak default settings and prompt injection flaws to turn normal agent behavior into a way to steal data without anyone knowing. The main problem isn't just that the AI model is confused; it's that the agent can be controlled to steal sensitive information without any user interaction. The most scary example comes from the security company PromptArmor.

They showed how an attacker can use a method called indirect prompt injection along with messaging app features to make an OpenClaw agent leak data. Chain of 0-Click Attacks An attacker puts harmful instructions in content that the AI agent is supposed to read. The agent reads the instructions and makes a URL that the attacker can control.

The agent adds private conversations or API keys to the URL's query parameters, which are sensitive data. The agent sends the bad link back to the user through messaging apps like Discord or Telegram. Before the user even clicks on the link, the messaging app automatically makes a link preview by getting the URL and sending the sensitive information directly to the attacker.

The messaging app's auto-preview feature sends an outbound HTTP request without the user having to click anything, which makes it a dangerous "no-click" attack. The agent's response is what causes the exfiltration event. CNCERT says that OpenClaw's default security settings put businesses at a lot of risk because they let agents browse, do tasks, or interact with local files.

They put threats into four main groups: indirect prompt injection through external data, accidental destructive actions, malicious activities by third parties, and taking advantage of known product vulnerabilities. OpenClaw is very useful because it can do real work, but this freedom makes compromises much worse. Messaging integrations with auto-preview features that make it easy for hackers to steal data.

Access to hosts and containers that lets you quickly change things so that they happen in the real world. A skills ecosystem where unverified or harmful extensions can make the attack surface much bigger. Being close to stored secrets, since agents usually work near operational credentials and tokens. OpenAI recently pointed out that developers need to assume that untrusted content will try to change the system once an agent can get information from outside sources and act on its own.

Invaders says that security teams should see this problem as an architectural flaw instead of just a bug in the AI. To keep deployments safe, companies should do the following: Turn off the auto-preview features in Telegram, Discord, Slack, and other channels where AI agents make links. Put OpenClaw runtimes in tightly controlled containers and keep the default management ports off the public internet.

Limit access to the file system when it's not needed, and don't put passwords in plain text configuration files. Only install agent skills from sources you trust, and check third-party code by hand before turning it on. Set up network monitoring to warn you when an agent makes a link to a domain you don't know about or does a DNS lookup that you didn't expect.

The most important question for security teams is no longer whether an AI model can be hacked, but what a hacked agent can do next without anyone knowing. Follow us on LinkedIn and X for daily cybersecurity updates. Get in touch with us to have your stories featured.