Poison AI buttons and links may betray your trust • The Register
Amid its ongoing promotion of AI’s wonders, Microsoft has warned customers it has found many instances of a technique that manipulates the technology to produce biased advice.
The software giant says its security researchers have detected a surge in attacks designed to poison the “memory” of AI models with manipulative data, a technique it calls “AI Recommendation Poisoning.” It’s similar to SEO Poisoning, a technique used by miscreants to make malicious websites rank higher in search results, but focused on AI models rather than search engines.
The Windows biz says it has spotted companies adding hidden instructions to “Summarize with AI” buttons and links placed on websites.
It’s not complicated to do this because URLs that point to AI chatbots can include a query parameter with a manipulative prompt text.
For example, The Register entered a link with URL-encoded text into Firefox’s omnibox that told Perplexity AI to summarize a CNBC article as if it were written by a pirate.
The AI service returned a pirate-speak summary, citing the article and other sources.
A less frivolous instruction, or one calling for an AI to produce output with a particular bent, would likely see any AI produce content that reflects the hidden instructions.
“We identified over 50 unique prompts from 31 companies across 14 industries, with freely available tooling making this technique trivially easy to deploy,” the Microsoft Defender Security Team said in a blog post. “This matters because compromised AI assistants can provide subtly biased recommendations on critical topics including health, finance, and security without users knowing their AI has been manipulated.”
We found that the technique worked with Google Search, too.
Microsoft’s researchers note that various code libraries and web resources can be used to create AI share buttons for recommendation injection. The effectiveness of these techniques, they concede, can vary over time as platforms alter website behavior and implement protections.
But assuming the poisoning has been triggered automatically or unwittingly by someone, not only would the model’s output reflect that prompt text, but subsequent responses would also consider the prompt text as historic context or “memory.”
“AI Memory Poisoning occurs when an external actor injects unauthorized instructions or ‘facts’ into an AI assistant’s memory,” the Defender team explained. “Once poisoned, the AI treats these injected instructions as legitimate user preferences, influencing future responses.”
The risk, Microsoft’s researchers argue, is that AI Recommendation Poisoning erodes people’s trust in AI services – at least among those who haven’t already written AI models off as unreliable.
Users may not take the time to verify AI recommendations, the security researchers say, and confident-sounding assertions by AI models make that more likely.
“This makes memory poisoning particularly insidious – users may not realize their AI has been compromised, and even if they suspected something was wrong, they wouldn’t know how to check or fix it,” the Defender team said. “The manipulation is invisible and persistent.”
Redmond’s researchers urge customers to be cautious with AI-related links and to check where they lead – sound advice for any web link. They also advise customers to review the stored memories of AI assistants, to delete unfamiliar entries, to clear memory periodically, and to question dubious recommendations.
Microsoft’s Defenders also recommend that corporate security teams scan for AI Recommendation Poisoning attempts in tenant email and messaging applications. ®
First Appeared on
Source link