OpenAI ChatGPT Atlas web browser susceptible to prompt injection attack. Attack disguises malicious instructions to look like a URL, but that Atlas treats as high-trust 'user intent' text. Prompt injections are a main concern with AI assistant browsers.

SquareX Labs demonstrated that threat actors can spoof sidebars for AI assistants inside browser interfaces using malicious extensions to steal data or trick users into downloading and running malware. "Because omnibox prompts are treated as trusted user input, they may receive fewer checks than content sourced from webpages," security researcher Martí Jordà said, in a report published Friday. The attack kicks in when the user enters a prompt into the spoofed sidebar, causing the extension to hook into its AI engine and return malicious instructions when certain "trigger prompts" are detected. The disclosure comes as browsers like Perplexity Comet and Opera Neon have been found susceptible to the attack.

Prompt injection is where attackers hide malicious instructions in websites, emails, or other sources. OpenAI's Chief Information Security Officer, Dane Stuckey, acknowledged the security challenge. The company has performed extensive red-teaming, implemented model training techniques to reward the model for ignoring malicious instructions, and enforced additional guardrails and safety measures to detect and block such attacks.

Malicious prompt injections are a "frontier security problem that the entire industry is grappling with," according to Perplexity, which has adopted a multi-layered strategy to shield users from potential threats like goal hijacking, image-based injections, hidden HTML/CSS instructions, and content confusion attacks. In a post on X, the company stated, "We're entering an era where the democratization of AI capabilities means everyone needs protection from increasingly sophisticated attacks."