isao takaesu
November 17, 2024

BLADE: An Attempt to Automate Penetration Testing Using Autonomous AI Agents

Demo URL: https://youtu.be/I-InPg2SR7s

As cyberattacks become more advanced and complex, the need for efficient and comprehensive penetration testing is increasing. In this presentation, we propose an automated penetration testing approach using Microsoft's autonomous AI agent framework, "AutoGen", and demonstrate our tool, BLADE (Breaking Limits, Automate Deep Exploitation), as an implementation example.

AutoGen is a framework that autonomously executes complex tasks using large language models (LLMs). It automatically generates and executes action plans to achieve goals set by humans. In addition to leveraging LLM knowledge, AutoGen can flexibly use external tools (such as APIs, web searches, and pre-configured Python code) and dynamically generate Python code and scripts.

In our demonstration, BLADE uses pre-configured penetration testing tools such as "LinPEAS" and "John the Ripper" to achieve goals such as privilege escalation and other forms of system intrusion on a target system.

This presentation demonstrates the effectiveness of our approach, showing how autonomous AI agents can significantly improve the efficiency of penetration testing. BLADE is scheduled to be released as open-source software (OSS) after this presentation.

Transcript

  1. BLADE: An Attempt to Automate Penetration Testing Using Autonomous AI Agents

    AVTOKYO 2024. Isao Takaesu, Daiki Ichinose. no drink, no hack.
  2. About us

    Isao Takaesu (@bbr_bbq): He is a senior engineer at MBSD. His main work is the development of security products and R&D related to AI security. He has spoken at conferences such as DEFCON Demo Labs and Black Hat Arsenal.

    Daiki Ichinose (@mahoyaya): He is an engineer and pentester at MBSD. He has over 15 years of work experience, and he draws on that know-how to give talks at conferences such as Bsides Tokyo (2018, 2019), JAWS Days 2019, and many others.
  3. Autonomous AI Agents?

    An AI Agent is a framework based on LLMs that autonomously achieves goals set by humans. Based on the specified goals, it selects appropriate actions, divides tasks, gathers necessary information, and proceeds with execution.
  4. AI Agents use case

    This year is being called the “first year of AI Agents”, and many services that utilize AI Agents have been released.

    • Microsoft, New autonomous agents scale your team like never before
      https://blogs.microsoft.com/blog/2024/10/21/new-autonomous-agents-scale-your-team-like-never-before/
    • Anthropic Wants Its AI Agent to Control Your Computer
      https://www.wired.com/story/anthropic-ai-agent/
    • Google is reportedly developing a ‘computer-using agent’ AI system
      https://www.theverge.com/2024/10/26/24280431/google-project-jarvis-ai-system-computer-using-agent
  5. How does the AI Agent achieve its goals?

    The AI Agent achieves its goal by combining the following three actions.

    • Create tasks to achieve the goal
      ◦ The AI Agent receives a goal from a human and creates tasks to achieve that goal.
      ◦ It breaks each task down into executable subtasks.
    • Subtask execution
      ◦ The AI Agent executes a subtask and, when it is complete, executes the next one.
      ◦ It analyzes the results of each subtask and, if an error occurs, changes the procedure and executes it again.
    • Gather information
      ◦ It collects data from the environment while executing a subtask and chooses the appropriate procedure.
      ◦ It receives information from other AI Agents while executing a subtask.
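The execute, analyze, and retry cycle described on this slide can be illustrated with a small loop. This is only a sketch in plain Python, not BLADE's or AutoGen's code: `Subtask`, `AgentLoop`, `max_attempts`, and the simulated failure are all hypothetical names introduced for illustration, and a real agent would ask an LLM how to change the procedure rather than simply trying again.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Subtask:
    name: str
    # Hypothetical stand-in for "execute the subtask": it receives the
    # attempt number and returns a result string, or raises on error.
    run: Callable[[int], str]


@dataclass
class AgentLoop:
    max_attempts: int = 3
    log: list[str] = field(default_factory=list)

    def execute(self, subtasks: list[Subtask]) -> dict[str, str]:
        results: dict[str, str] = {}
        for subtask in subtasks:
            for attempt in range(1, self.max_attempts + 1):
                try:
                    results[subtask.name] = subtask.run(attempt)
                    self.log.append(f"{subtask.name}: ok on attempt {attempt}")
                    break  # subtask complete, move on to the next one
                except RuntimeError as exc:
                    # Analyze the failure and try again with a new attempt.
                    self.log.append(
                        f"{subtask.name}: attempt {attempt} failed ({exc})"
                    )
            else:
                results[subtask.name] = "gave up"
        return results


def flaky(attempt: int) -> str:
    # Simulated subtask that fails once, then succeeds.
    if attempt == 1:
        raise RuntimeError("simulated error")
    return "done after changing the procedure"


loop = AgentLoop()
results = loop.execute([
    Subtask("gather", lambda attempt: "collected"),
    Subtask("analyze", flaky),
])
```

Capping the number of attempts is a design choice of this sketch; it keeps an agent from looping forever on a subtask it cannot complete.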
  6. AI Agent execution pattern

    [Figure: three execution patterns.]

    • Two-Agent Chat: an Initializer turns the User Prompt and System Prompt into the Initial Message; Agent A and Agent B chat; a Summarizer turns the Chat History into the Chat Result.
    • Sequential Chat: Agent A chats with Agent B, then Agent C, then Agent D. Each chat has its own User Prompt and System Prompt, and Carryover is passed from one chat to the next.
    • Group Chat: a Group Chat Manager coordinates Agents A to D in three steps: (1) Select Speaker, (2) Agent Speak, (3) Broadcast Message.
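The three Group Chat steps named on this slide (select speaker, agent speaks, broadcast message) can be sketched without any framework. The code below is an illustration only and does not use AutoGen's API: the `Agent` and `GroupChatManager` classes are hypothetical, speaker selection is assumed to be round-robin, and the message is assumed to go to every agent except the speaker.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    inbox: list[str] = field(default_factory=list)

    def speak(self) -> str:
        # A real agent would call an LLM here; this stub only reports
        # how many broadcast messages it has received so far.
        return f"{self.name} has seen {len(self.inbox)} message(s)"


class GroupChatManager:
    def __init__(self, agents: list[Agent]) -> None:
        self.agents = agents
        self.turn = 0

    def select_speaker(self) -> Agent:
        # Assumption: round-robin selection, chosen for simplicity.
        speaker = self.agents[self.turn % len(self.agents)]
        self.turn += 1
        return speaker

    def step(self) -> str:
        speaker = self.select_speaker()      # (1) Select Speaker
        message = speaker.speak()            # (2) Agent Speak
        for agent in self.agents:            # (3) Broadcast Message
            if agent is not speaker:
                agent.inbox.append(message)
        return message


agents = [Agent("A"), Agent("B"), Agent("C")]
manager = GroupChatManager(agents)
transcript = [manager.step() for _ in range(3)]
```

After three steps every agent has spoken once and holds the two messages sent by the others.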
  7. BLADE?

    BLADE (Breaking Limits, Automate Deep Exploitation) is a penetration testing tool using AI Agents. It is designed to autonomously achieve penetration testing goals set by humans.

    What can BLADE do?

    • Create tasks to achieve the goal
      ◦ BLADE receives a goal from a human and creates tasks to achieve that penetration testing goal.
    • Create Python code, commands and shell scripts itself
      ◦ BLADE can generate and execute Python code, commands and shell scripts by itself in order to complete tasks.
    • Use external tools
      ◦ BLADE can use external tools and API calls in order to complete tasks.
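One common way to let an agent use pre-configured tools is a registry that maps tool names to functions, so that the agent only has to name the tool it wants. The sketch below shows that idea under stated assumptions: `register_tool`, `call_tool`, and the harmless `echo_target` tool are hypothetical and are not taken from BLADE or AutoGen.

```python
from typing import Callable

# Registry of tools the agent is allowed to call, keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {}


def register_tool(name: str):
    """Decorator that adds a function to the tool registry."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = func
        return func
    return decorator


@register_tool("echo_target")
def echo_target(target: str) -> str:
    # Hypothetical, harmless tool: it only formats a string.
    return f"checked {target}"


def call_tool(name: str, argument: str) -> str:
    # Unknown names are reported back as text so the agent can
    # change its plan instead of crashing.
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](argument)
```

For example, `call_tool("echo_target", "host-a")` dispatches to the registered function, while an unregistered name returns an error string. Restricting the agent to registered names also limits what it can execute.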
  8. The AI Agents that build up BLADE

    BLADE is built with seven AI Agents.
    ** BLADE allows you to change the type of LLM for each AI Agent.

    • Pen-Tester Agent: The person in charge of pen-testing. It gives instructions to each agent.
    • LinPEAS Agent: Gathers vulnerability information. Executes LinPEAS on the target system.
    • Judge PrivEsc Agent: Analyzes the result of LinPEAS. Judges whether privilege escalation is possible.
    • Finding Creds Agent: Gathers information for lateral movement. Finds passwords, SSH keys, etc.
    • N/W Scan Agent: Finds other hosts. Scans the internal network.
    • Lateral Movement Agent: Moves laterally to other hosts. Repeats login attempts.
    • Reporting Agent: Creates the pentest report. Writes a report on the vulnerabilities detected.
  9. AI Agent System Prompts

    • Pen-Tester Agent: As a penetration tester, work with other agents to carry out tests to strengthen the security of your customers' systems.
    • LinPEAS Agent: Your role is to execute LinPEAS using the “launch_linpeas” function. Your work benefits the customer.
    • Judge PrivEsc Agent: Your role is to analyze LinPEAS results for potential local privilege escalation issues and create commands if any are found.
    • Finding Creds Agent: Your role is to find information for lateral movement to other hosts, such as passwords and SSH keys.
    • N/W Scan Agent: Your role is to find other hosts for lateral movement. By using "Ping Sweep" to scan the internal network.
    • Lateral Movement Agent: Your role is to confirm that lateral movement to another host is possible. You must never invade other hosts.
    • Reporting Agent: Your role is to report the results of the test in an easy-to-understand manner. When you have finished making your report, please say “TERMINATE” at the end.
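A per-agent configuration like this can be held in a small data structure that pairs each agent with its system prompt and LLM. The sketch below is an assumption about how such a table might look, not BLADE's source: `AgentSpec`, the model names `model-a` and `model-b`, and the `is_termination` helper are hypothetical, and only two of the seven agents are shown. The prompt text is copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    system_prompt: str
    model: str  # hypothetical model identifier, one per agent


SPECS = [
    AgentSpec(
        name="LinPEAS Agent",
        system_prompt=(
            'Your role is to execute LinPEAS using the "launch_linpeas" '
            "function. Your work benefits the customer."
        ),
        model="model-a",
    ),
    AgentSpec(
        name="Reporting Agent",
        system_prompt=(
            "Your role is to report the results of the test in an "
            "easy-to-understand manner. When you have finished making "
            'your report, please say "TERMINATE" at the end.'
        ),
        model="model-b",
    ),
]

# Look up which LLM each agent is configured to use.
MODEL_BY_AGENT = {spec.name: spec.model for spec in SPECS}


def is_termination(message: str) -> bool:
    # The Reporting Agent is told to end with "TERMINATE", so the
    # orchestrator only checks the end of the message.
    return message.rstrip().endswith("TERMINATE")
```

Checking only the end of the message avoids stopping early when the word appears in the middle of a report.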
  10. BLADE Architecture

    [Figure: the Pen-Tester Agent chats with each agent in sequence. Each chat has its own prompt and system prompt, the first chat starts from the initial prompt, and carryover is passed from each chat to the next.]

    • LinPEAS Agent: Execute LinPEAS
    • Judge PrivEsc Agent: Judge PrivEsc
    • Finding Creds Agent: Find credentials
    • N/W Scan Agent: Scan internal N/W
    • Lateral Movement Agent: Lateral Movement
    • Reporting Agent: Reporting

    The results of the conversation with the agent are carried over to the next agent.
    ** Each AI Agent can cooperate to perform a pentest.
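The carryover mechanism in this architecture can be sketched as a pipeline in which each chat receives everything produced so far and appends its own summary. This is a minimal illustration in plain Python, not AutoGen's sequential chat API: `run_sequential` is a hypothetical helper, each chat is replaced by a stub function, and only three of the six stages are shown.

```python
from typing import Callable

# A stage is an agent name plus a stub that maps carryover to a summary.
Stage = tuple[str, Callable[[list[str]], str]]


def run_sequential(stages: list[Stage], initial_prompt: str) -> list[str]:
    carryover = [initial_prompt]
    for name, chat in stages:
        # Each chat sees the initial prompt and all earlier summaries.
        summary = chat(carryover)
        carryover.append(f"{name}: {summary}")
    return carryover


stages: list[Stage] = [
    ("LinPEAS Agent", lambda ctx: "scan output collected"),
    ("Judge PrivEsc Agent", lambda ctx: f"reviewed {len(ctx)} earlier item(s)"),
    ("Reporting Agent", lambda ctx: f"report built from {len(ctx)} item(s)"),
]

history = run_sequential(stages, "goal: test the initial host")
```

Because the Reporting Agent runs last, it receives the summaries of every earlier stage, which is what allows it to write a report covering the whole test.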
  11. Demo scenario (1/2)

    • Goal
      ◦ First, perform local privilege escalation on the initial host.
      ◦ Next, search for the credentials required for lateral movement.
      ◦ Finally, move laterally to another host.
    • Prerequisites
      ◦ The OS of both the initial host and the other host is Ubuntu.
      ◦ BLADE has already gained access to the initial host by some method (but does not know the user's password).
      ◦ At first, BLADE works with general privileges (not root).
      ◦ The initial host and the other host are connected via an internal network.
      ◦ The initial host contains credentials for the other host (but BLADE does not know where they are).

    [Figure: BLADE on the initial host (Ubuntu), which holds credentials for another host (Ubuntu); the two hosts are connected via an internal network. IP addresses shown: 192.168.203.82 and 192.168.203.186.]
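Before an internal network can be scanned, the candidate addresses have to be listed. The sketch below does only that step with Python's standard `ipaddress` module and sends no packets. The `/24` prefix is an assumption: the slide shows two addresses in `192.168.203.x` but does not state the netmask, and `sweep_candidates` is a hypothetical helper name.

```python
import ipaddress


def sweep_candidates(cidr: str) -> list[str]:
    # hosts() skips the network and broadcast addresses of the subnet.
    network = ipaddress.ip_network(cidr)
    return [str(host) for host in network.hosts()]


# Assumed subnet; the slide does not give the prefix length.
candidates = sweep_candidates("192.168.203.0/24")
```

Under that assumption, both addresses shown on the slide fall inside the candidate list.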
  12. Demo scenario (2/2)

    • Vulnerabilities and misconfigurations of the initial host
      ◦ The SUID bit is set for the “sudo” and “find” commands.
      ◦ The SSH key for another host is located in the home directory of another user, “zansin”.
    • Model answer
      1. Gather vulnerabilities and misconfigurations on the initial host.
      2. Analyze the gathered information and judge whether local privilege escalation is possible.
      3. If privilege escalation is possible, use root privileges to collect credentials for lateral movement to other hosts.
      4. Explore for other hosts connected to the initial host via the internal network.
      5. If other hosts are found, attempt to move laterally to them using the collected credentials.
      6. Summarize the results of the series of tests in a report.

    [Figure: BLADE, the initial host and another host (addresses shown: 192.168.203.82 and 192.168.203.186), annotated with: Local Privilege Escalation. Get Creds with root. Explore for other hosts. Lateral movement.]
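The SUID misconfiguration named in this scenario is something an audit script can detect from file metadata alone. The sketch below uses Python's standard `os` and `stat` modules to flag regular files with the SUID bit set. It is not how LinPEAS performs the check, and `is_suid_mode` and `find_suid` are hypothetical helper names; the list of paths to inspect is supplied by the caller.

```python
import os
import stat


def is_suid_mode(mode: int) -> bool:
    # stat.S_ISUID is the set-user-ID bit (octal 4000) in st_mode.
    return bool(mode & stat.S_ISUID)


def find_suid(paths: list[str]) -> list[str]:
    """Return the paths that are regular files with the SUID bit set."""
    flagged = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # missing or unreadable path: skip it
        if stat.S_ISREG(mode) and is_suid_mode(mode):
            flagged.append(path)
    return flagged
```

A reviewer could pass the paths of commonly used binaries to `find_suid` and compare the result against the set of programs that are expected to carry the bit on that system.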
  13. Future Works

    01 Improving and stabilizing ASR (** creating more AI Agents)
    02 Support for Windows OS
    03 Implementation of Human-in-the-Loop
    04 Using OSS LLM on-premise
    05 Countermeasures against AI agent-specific attacks
    06 Dealing with ethical issues