Sheejith's Personal Site

Microsoft’s AI Can Be Turned Into an Automated Phishing Machine

Microsoft raced to put generative AI at the heart of its systems. Ask a question about an upcoming meeting and the company’s Copilot AI system can pull answers from your emails, Teams chats, and files—a potential productivity boon. But these exact processes can also be abused by hackers.

Today at the Black Hat security conference in Las Vegas, researcher Michael Bargury is demonstrating five proof-of-concept ways that Copilot, which runs on its Microsoft 365 apps, such as Word, can be manipulated by malicious attackers, including using it to provide false references to files, exfiltrate some private data, and dodge Microsoft’s security protections.

One of the most alarming displays, arguably, is Bargury’s ability to turn the AI into an automatic spear-phishing machine. Dubbed LOLCopilot, the red-teaming code Bargury created can—crucially, once a hacker has access to someone’s work email—use Copilot to see who you email regularly, draft a message mimicking your writing style (including emoji use), and send a personalized blast that can include a malicious link or attached malware.

“I can do this with everyone you have ever spoken to, and I can send hundreds of emails on your behalf,” says Bargury, the cofounder and CTO of security company Zenity, who published his findings alongside videos showing how Copilot could be abused. “A hacker would spend days crafting the right email to get you to click on it, but they can generate hundreds of these emails in a few minutes.”

That demonstration, as with other attacks created by Bargury, broadly works by using the large language model (LLM) as designed: typing written questions to access data the AI can retrieve. However, it can produce malicious results by including additional data or instructions to perform certain actions. The research highlights some of the challenges of connecting AI systems to corporate data and what can happen when “untrusted” outside data is thrown into the mix—particularly when the AI answers with what could look like legitimate results.

Among the other attacks created by Bargury is a demonstration of how a hacker—who, again, must already have hijacked an email account—can gain access to sensitive information, such as people’s salaries, without triggering Microsoft’s protections for sensitive files. When asking for the data, Bargury’s prompt demands the system does not provide references to the files data is taken from. “A bit of bullying does help,” Bargury says.

In other instances, he shows how an attacker—who doesn’t have access to email accounts but poisons the AI’s database by sending it a malicious email—can manipulate answers about banking information to provide their own bank details. “Every time you give AI access to data, that is a way for an attacker to get in,” Bargury says.

Another demo shows how an external hacker could get some limited information about whether an upcoming company earnings call will be good or bad, while the final instance, Bargury says, turns Copilot into a “malicious insider” by providing users with links to phishing websites.

Phillip Misner, head of AI incident detection and response at Microsoft, says the company appreciates Bargury identifying the vulnerability and says it has been working with him to assess the findings. “The risks of post-compromise abuse of AI are similar to other post-compromise techniques,” Misner says. “Security prevention and monitoring across environments and identities help mitigate or stop such behaviors.”

As generative AI systems, such as OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini, have developed in the past two years, they’ve moved onto a trajectory where they may eventually be completing tasks for people, like booking meetings or online shopping. However, security researchers have consistently highlighted that allowing external data into AI systems, such as through emails or accessing content from websites, creates security risks through indirect prompt injection and poisoning attacks.

“I think it’s not that well understood how much more effective an attacker can actually become now,” says Johann Rehberger, a security researcher and red team director, who has extensively demonstrated security weaknesses in AI systems. “What we have to be worried [about] now is actually what is the LLM producing and sending out to the user.”

Bargury says Microsoft has put a lot of effort into protecting its Copilot system from prompt injection attacks, but he says he found ways to exploit it by unraveling how the system is built. This included extracting the internal system prompt, he says, and working out how it can access enterprise resources and the techniques it uses to do so. “You talk to Copilot and it’s a limited conversation, because Microsoft has put a lot of controls,” he says. “But once you use a few magic words, it opens up and you can do whatever you want.”

Posted on: 8/10/2024 4:12:57 AM


Talkbacks

You must be logged in to enter talkback comments.