Upgrade to Pro — share decks privately, control downloads, hide ads and more …

πŸ’€ AI Red Teaming Case Study: When RAG gets RAGg...

Avatar for Kennedy Torkura Kennedy Torkura
May 25, 2025
23

πŸ’€ AI Red Teaming Case Study: When RAG gets RAGged UpΒ πŸ’€

πŸ’€ AI Red Teaming Case Study: When RAG gets RAGged Up πŸ’€

Hey cloudy defenders, what happens when your precious RAG architecture gets RAGged up ? πŸ’₯

Well, RAG becomes the very sword adversaries wield against you to cause havoc ! This could be stealthy -> completely under the radar, remaining undetected for a long time.

How can this happen ? πŸ€”

πŸ‘‰ Let's quickly review a case study published by MITRE ATLAS, courtesy of the folks at Zenity πŸ™Œ

πŸ‘€ This case study, aptly titled "Financial Transaction Hijacking with M365 Copilot as an Insider", demonstrates how attackers could carry out multi-step attacks against RAG-powered AI infrastructure. The immediate impact is financial loss: funds get transferred to the attacker's bank account πŸ™€

🀺 While the target was Microsoft M365 Co-Pilot, AI apps leveraging RAG, e.g. RAG RAG-powered AI Agents, could be vulnerable to these attacks. The multi-step attack consists of several MITRE ATLAS techniques, including: LLM Plugin Compromise, Gather RAG-Indexed Targets, and several discovery techniques.

πŸŽ‰ Good news -> there are countermeasures for these attack techniques, I included some of them in the document (slides 11 & 12).

πŸŒ€ However, given the nature of unique app requirements per organization, threat modeling and red teaming exercises are super imperative for detecting, understanding, and mitigating threats.

πŸ”— More details about the case study are here -> https://atlas.mitre.org/studies/AML.CS0026

⚑ So key take-aways -> Don't assume, conduct threat modeling and red teaming against you AI applications and stay ahead of attackers.

πŸš€ See how Mitigant helps -> https://www.mitigant.io/en/platform/security-for-genai

Avatar for Kennedy Torkura

Kennedy Torkura

May 25, 2025
Tweet

Transcript

  1. @run2obtain 1 AI Red Teaming: Case Study Financial Transaction Hijacking

    With M365 Co-Pilot as an Insider Source: https://atlas.mitre.org/studies/AML.CS0026
  2. @run2obtain Overview (1) 2 Researchers from Zenity conducted a red

    teaming exercise in August 2024 that successfully manipulated Microsoft 365 Copilot. The attack abused the fact that Copilot ingests received emails into a RAG database. The attack was initiated by sending an email that contained info designed to be retrieved by a user query as well as a prompt injection to manipulate Copilot’s behaviour.
  3. @run2obtain Overview (2) 3 Essentially, the info in the email

    targeted users searching for banking info necessary for completing bank transfers. Funny enough, the info had the attacker's bank details instead. The prompt injection overrode Copilot's search functionality to treat the attacker’s info as a retrieved document and also manipulated the document reference in it’s response. This tricked the user into believing that Copilot's result is trustworthy and makes it more likely they will follow through with the wire transfer -> to the benefit of the attacker.
  4. @run2obtain Attack Techniques & Mitigations 11 S/No Attack Technique Mitigation

    1. Gather RAG-Indexed Targets (AML.T0064) Generative AI Guardrails 2. AI-Enabled Product or Service (AML.T0047) AI Telemetry Logging 3. Discover LLM System Information: Special Character Sets (AML.T0069.000) Generative AI Guardrails 4. Discover LLM System Information: System Instruction Keywords (AML.T0069.002) Generative AI Guardrails 5. Retrieval Content Crafting (AML.T0066) Generative AI Guardrails 6. LLM Prompt Crafting (AML.T0065) Generative AI Guardrails 7. Exploit Public-Facing Application (AML.T0049) Application Isolation and Sandboxing, Exploit Protection, Privileged Account Management
  5. @run2obtain Attack Techniques & Mitigations 12 S/No Attack Technique Mitigations

    8. LLM Prompt Obfuscation (AML.T0068) AI Telemetry Logging, Generative AI Guardrails 9. RAG Poisoning (AML.T0070) Generative AI Guardrails 10. False RAG Entry Injection ( AML.T0071) Generative AI Guardrails 11. LLM Prompt Injection: Indirect (AML.T0051.001) AI Telemetry Logging 12. LLM Plugin Compromise (AML.T0053) Generative AI Guardrails, Generative AI Guidelines, Generative AI Model Alignment 13. LLM Trusted Output Components Manipulation: Citations ( AML.T0067.000) 14. External Harms: Financial Harm (AML.T0048.000)
  6. @run2obtain A few take-aways ….. 13 1. Securing GenAI infrastructure

    is of utmost importance, while traditional security approaches might help they are mostly insufficient. 2. Leverage AI Red Teaming to discover security gaps in you AI applications. 3. RAG architectures provide huge benefits, however they also introduce several security concerns that might be challenging to identify hence the need for security measures like threat modelling and AI red teaming.
  7. @run2obtain References 16 1. We got an ~RCE on M365

    Copilot by sending an email., Twitter 2. Living off Microsoft Copilot at BHUSA24: Financial transaction hijacking with Copilot as an insider, YouTube 3. Article from The Register with response from Microsoft 4. https://www.mitigant.io/en/blog/bedrock-or-bedsand-attacking-amazon- bedrocks-achilles-heel