Autonomous Discovery of Critical Zero-Days

Since July 2024, ZeroPath’s tool has uncovered critical zero-day vulnerabilities—including RCE, authentication bypasses, and IDORs—in popular AI platforms and open-source projects. Our approach has identified security flaws in projects owned by Netflix, Salesforce, and Hulu.

AI-driven 0-day detection is here. AI-assisted security research has been quietly advancing since early 2023, when AIxCC researchers demonstrated the first practical applications of LLM-powered vulnerability detection in AI systems. Modern LLMs have been used to improve the accuracy of detections of existing classes of web issues (XSS, SQLi, CSRF) and find business logic and authentication problems that were previously undetectable by SAST.

So what does this mean for AI’s ability to do unsupervised security research? Since July 2024, ZeroPath has taken a novel approach that combines deep program analysis with adversarial AI agents for validation. Our methodology has uncovered numerous critical vulnerabilities in production systems, including several that traditional Static Application Security Testing (SAST) tools were ill-equipped to find. This post provides a technical deep-dive into our research methodology and a living summary of the bugs found in popular open-source tools.


Summary of Vulnerabilities Discovered

Note: This list represents only a portion of our findings. Many vulnerabilities remain undisclosed due to ongoing remediation efforts or pending responsible disclosure processes. We will update this list as new issues are disclosed over the next few months.

| Date | Project | Vulnerability | Technical Impact | Root Cause | CVE/Reference |
|---|---|---|---|---|---|
| Jul 21, 2024 | Fonoster Voice Server | Local File Inclusion | Access to system files via voice file paths | Incomplete path validation | CVE-2024-43035 |
| Jul 22, 2024 | Uptrain | Remote Code Execution | Arbitrary code execution via eval | RCE during project creation | Pending Assignment |
| Aug 22, 2024 | LibrePhotos | File Upload + Path Traversal | Arbitrary file write via photo upload | Insufficient path sanitization | Pending Assignment |
| Aug 22, 2024 | Clone-Voice | Command Injection | System command execution via voice file metadata | Unescaped input in ffmpeg command | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Conversation Deletion | Complete deletion of other users’ chat history | Missing object-level authorization checks | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Canvas Deletion | Deletion of other users’ visualization canvases | Insufficient IDOR protection on API endpoint | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Knowledge Base Access | Read access to other users’ private knowledge bases | Missing tenant isolation in KB queries | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized File Movement | Moving/deleting other users’ uploaded files | Missing ACL checks in file operations | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Conversation Access | Reading other users’ private conversations | Broken access control in chat retrieval | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized API Key Removal | Removal of other users’ API keys | IDOR in key management endpoint | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Knowledge Base Enumeration | Enumeration of all private knowledge bases | Missing authentication in list endpoint | Pending Assignment |
| Sep 2, 2024 | RAGFlow | Unauthorized Dialog Deletion | Mass deletion of other users’ dialogs | Race condition in deletion endpoint | Pending Assignment |
| Sep 3, 2024 | E2nest (Netflix) | Local File Inclusion | Arbitrary file read via path traversal in model loading | Insufficient path normalization in config loading | CVE-2024-9301 |
| Sep 5, 2024 | LibrePhotos | Unauthorized Access to User Jobs | Access to other users’ processing jobs | Missing authorization in job queue | Pending Assignment |
| Sep 5, 2024 | LibrePhotos | Token Refresh Authentication Bypass | Complete authentication bypass | Improper token validation | Pending Assignment |
| Sep 20, 2024 | Monaco (Hulu) | Remote Code Execution | Code execution via deserialization | Unsanitized data being passed into pickle.loads | CVE-2024-48946 |
| Sep 20, 2024 | Monaco (Hulu) | Unauthorized Redis Access | Access to all Redis clusters administered by Monaco | Missing authentication in app_redis_api endpoint | Pending Assignment |
| Oct 1, 2024 | LogAI (Salesforce) | Directory Traversal | Access to sensitive files via log paths | Broken path traversal protection | Pending Assignment |
| Oct 24, 2024 | DB-GPT | Directory Traversal | Access to database files via backup paths | Missing path normalization | Pending Assignment |

For comprehensive technical analyses of some of these vulnerabilities, please refer to our write-ups:

  • Uptrain RCE: Project Creation to RCE
  • Clone-Voice Command Injection: From Voice Processing to Shell Access
  • Fonoster Voice Server LFI: Breaking Audio File Boundaries
  • LibrePhotos Arbitrary File Upload: From Upload to RCE

Vulnerability Distribution

Vulnerability Types:

  • Authorization Flaws (53%)
  • Directory Traversal/LFI (21%)
  • Remote Code Execution (11%)
  • File Upload Issues (5%)
  • Authentication Bypass (5%)
  • Command Injection (5%)

1. Authorization Flaws

  • Prevalence: 53% of the vulnerabilities (10 instances)
  • Common Issues:
    • Missing object-level access controls
    • Insufficient tenant isolation
    • Broken access control in API endpoints
    • IDOR vulnerabilities in resource management
    • Unauthorized Redis access and configuration exposure
  • Impact: Unauthorized access, data leakage, and resource manipulation across tenant boundaries.
  • Examples:
    • RAGFlow’s multiple IDOR vulnerabilities allow manipulation of conversations, canvases, knowledge bases, and API keys belonging to other users
    • Unauthorized access to Redis instances due to missing access controls
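
The missing object-level checks described above tend to follow a common shape. The sketch below is hypothetical: the `Conversation` class, the in-memory store, and both function names are illustrative and are not taken from RAGFlow or any other project in the table. It contrasts a handler that trusts a client-supplied ID with one that also verifies ownership.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    id: int
    owner_id: int
    text: str


# Hypothetical in-memory store standing in for a database table.
CONVERSATIONS = {
    1: Conversation(id=1, owner_id=100, text="alice's chat"),
    2: Conversation(id=2, owner_id=200, text="bob's chat"),
}


def delete_conversation_vulnerable(current_user_id: int, conversation_id: int) -> bool:
    """IDOR: any authenticated user can delete any conversation by ID."""
    return CONVERSATIONS.pop(conversation_id, None) is not None


def delete_conversation_checked(current_user_id: int, conversation_id: int) -> bool:
    """Object-level authorization: only the owner may delete."""
    conversation = CONVERSATIONS.get(conversation_id)
    if conversation is None or conversation.owner_id != current_user_id:
        return False
    del CONVERSATIONS[conversation_id]
    return True
```

The vulnerable variant never reads `current_user_id`, which is usually the signal a reviewer looks for: the caller is authenticated, but the object being acted on is never tied back to that caller.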

2. File Operation Issues

  • Prevalence: 26% of the vulnerabilities (5 instances)
  • Common Issues:
    • Directory traversal in configuration loading
    • Local file inclusion via path manipulation
    • Unsafe file handling in upload features
    • Insufficient path validation and normalization
  • Impact: Unauthorized file access, sensitive data exposure, and potential system compromise.
  • Examples:
    • E2nest’s LFI via model path traversal (CVE-2024-9301)
    • DB-GPT’s directory traversal in backup paths
    • LogAI’s broken path traversal protection
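
As an illustration of the path validation and normalization these findings lacked, here is a minimal sketch of a containment check. The function name `resolve_inside` is an assumption for this example and does not come from any of the affected projects; it requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below
    # also covers inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

The important detail is that the comparison happens after normalization. Checking the raw string for `..` before resolving is the kind of incomplete validation that traversal payloads tend to slip past.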

3. Code Execution Vulnerabilities

  • Prevalence: 16% of the vulnerabilities (3 instances)
  • Common Issues:
    • Unsafe pickle deserialization
    • Command injection in file processing
    • Unsanitized input in system commands
  • Impact: Remote code execution, system command execution, and potential full system compromise.
  • Examples:
    • Monaco’s RCE via pickle deserialization (CVE-2024-48946)
    • Clone-Voice’s command injection via ffmpeg metadata
    • Uptrain’s RCE via project creation
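
To make the command injection pattern concrete, the sketch below builds an ffmpeg command three ways. The function names and the payload are hypothetical and are not the code from Clone-Voice; nothing is executed, the commands are only constructed so the difference is visible.

```python
import shlex


def build_command_unsafe(input_path: str) -> str:
    """Vulnerable pattern: user input interpolated into a shell string."""
    return f"ffmpeg -i {input_path} output.wav"


def build_command_argv(input_path: str) -> list[str]:
    """Safer pattern: an argument vector for subprocess.run(argv) without shell=True."""
    return ["ffmpeg", "-i", input_path, "output.wav"]


def build_command_quoted(input_path: str) -> str:
    """If a shell string is unavoidable, quote the untrusted value."""
    return f"ffmpeg -i {shlex.quote(input_path)} output.wav"


payload = "voice.wav; touch /tmp/pwned"
print(build_command_unsafe(payload))  # a shell would treat ';' as a command separator
print(build_command_argv(payload))    # the payload stays a single argument
print(build_command_quoted(payload))  # the payload is wrapped in single quotes
```

Passing an argument list avoids the shell entirely, which is generally preferable to quoting because there is no parser left to confuse.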

Our Technical Methodology

TL;DR: most of these bugs are simple and could have been found through a code review by a security researcher or, in some cases, by scanners. Historically, however, automating the discovery of these bugs has been hard because traditional SAST tools rely on pattern matching and predefined rules, which miss vulnerabilities that do not fit known patterns (e.g. business logic and broken authentication flaws, or non-traditional sinks such as those in dependencies). They also tend to generate a high rate of false positives, overwhelming security teams and reducing efficiency.

The beauty of LLMs is that they can reduce ambiguity in most of the situations that previously made scanners either unusable or unproductive when mass-scanning open-source repositories like these. For instance, you can avoid:

  • Alerting on test code
  • Alerting on CLI only administrators would have access to
  • Alerting on injection bugs whose “sinks” (i.e. the dangerous calls that receive the input) can’t be reached with attacker-controlled data in practice
  • Alerting on injection bugs that include controls that make an attack impossible (for instance, limiting the input to valid UUIDs)
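
The UUID control mentioned in the last bullet is a good example of why context matters. A minimal sketch, assuming the application validates identifiers with Python's standard `uuid` module (the helper name `canonical_uuid` is hypothetical):

```python
import uuid
from typing import Optional


def canonical_uuid(value: str) -> Optional[str]:
    """Return the canonical form of value if it parses as a UUID, else None."""
    try:
        return str(uuid.UUID(value))
    except ValueError:
        return None


print(canonical_uuid("12345678-1234-5678-1234-567812345678"))
print(canonical_uuid("1' OR '1'='1"))
```

If only the canonical string reaches the query or command, the value can contain nothing but hexadecimal digits and hyphens, so a scanner that flags the downstream sink without noticing this check is reporting a false positive.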

To do this well, we combine deep program analysis with adversarial agents that test the plausibility of vulnerabilities at each step. The solution ends up mirroring the traditional phases of a pentest: recon, analysis, and exploitation (plus remediation, which is not covered in this post).

Note: Most sections have been adapted from our how it works post; for more information about patching and remediation, please refer to it.

Stage 1: Application Identification

ZeroPath starts by using AI agents to investigate what applications are inside a repository and gather some basic data about how they work. This step is crucial when dealing with mono-repositories or repositories containing multiple services, as often happens with microservice architectures. Specifically, we:

  1. Identify directory boundaries for each application
  2. Generate application descriptions and metadata, noting details like the auth procedure and tech stack
  3. Collect additional contextual information helpful for subsequent analysis stages

This process helps ensure that ZeroPath has enough information about the apps to discriminate between relevant and irrelevant security issues.
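
ZeroPath performs this step with AI agents, and the details are not described here. Purely as a rough intuition for what "identifying directory boundaries" means, the sketch below uses a much simpler heuristic: treating any directory that contains a dependency manifest as an application root. The manifest list, the skipped directories, and the function name are all assumptions for illustration.

```python
import os

# Hypothetical marker files; a real repository may need a longer list.
MANIFESTS = {"package.json", "requirements.txt", "pyproject.toml", "go.mod", "pom.xml", "Cargo.toml"}
SKIPPED_DIRS = {".git", "node_modules", ".venv"}


def find_app_roots(repo_root: str) -> dict[str, list[str]]:
    """Map each directory containing a manifest to the manifests found in it."""
    roots = {}
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune in place so os.walk does not descend into dependency or VCS directories.
        dirnames[:] = [d for d in dirnames if d not in SKIPPED_DIRS]
        found = sorted(MANIFESTS.intersection(filenames))
        if found:
            roots[os.path.relpath(dirpath, repo_root)] = found
    return roots
```

A heuristic like this only finds candidate boundaries. Describing each application's auth procedure and tech stack, as in step 2 above, needs the kind of code understanding that the agents provide.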

Stage 2: AST Generation and Indexing

To illustrate the following steps, we will be using a basic Django application that provides fundamental functionality for:

  1. User management (creating and listing users)
  2. Content management (creating and listing posts)
  3. User authentication (login and logout capabilities)

Below is an example of the method to retrieve users from the application:

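from django.contrib.auth.models import User  # assumption: Django's built-in User model
from django.shortcuts import render
from django.views import View

class UserViewSet(View):
    def get(self, request):
        users = User.objects.all()  # fetch every user row
        return render(request, 'user_list.html', {'users': users})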

That’s how it’s represented as plain text, but as with most languages, source code is parsed into an intermediate representation before it is compiled or analyzed. Using tree-sitter, we can convert the method definition into an AST that has standard names for things like “function_definition”, “body”, and so on:

(function_definition
  name: (identifier)  ; get
  parameters: (parameters
    (identifier)  ; self
    (identifier))  ; request
  body: (block
    (expression_statement
      (assignment
        left: (identifier)  ; users
        right: (call
          function: (attribute
            object: (attribute
              object: (identifier)  ; User
              attribute: (identifier))  ; objects
            attribute: (identifier))  ; all
          arguments: (argument_list))))
    (return_statement
      (call
        function: (identifier)  ; render
        arguments: (argument_list
          (identifier)  ; request
          (string)  ; 'user_list.html'
          (dictionary
            (pair
              key: (string)  ; 'users'
              value: (identifier))))))))  ; users

This AST representation breaks down the get method, showing its structure, parameters, and the operations it performs. Each node in the tree is represented by parentheses, with the node type followed by its children. Leaf nodes (like identifiers and strings) are represented directly. Comments after semicolons provide additional information or clarification about the nodes.
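
To explore a tree like this without installing tree-sitter and its grammars, Python's built-in `ast` module can serve as a stand-in. This is a swapped-in technique rather than what ZeroPath uses: the node names differ (for example `FunctionDef` rather than `function_definition`), but the hierarchy is comparable. `ast.dump(..., indent=2)` requires Python 3.9+.

```python
import ast

SOURCE = '''
class UserViewSet(View):
    def get(self, request):
        users = User.objects.all()
        return render(request, 'user_list.html', {'users': users})
'''

tree = ast.parse(SOURCE)

# Locate the method definition and inspect its parts.
method = next(node for node in ast.walk(tree) if isinstance(node, ast.FunctionDef))
print(method.name)                                     # get
print([arg.arg for arg in method.args.args])           # ['self', 'request']
print([type(stmt).__name__ for stmt in method.body])   # ['Assign', 'Return']
print(ast.dump(method.body[0], indent=2))              # full subtree for the assignment
```

Parsing does not require `View`, `User`, or `render` to be importable, because the code is only analyzed and never executed.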

This format allows for a detailed, hierarchical view of the code structure, making it easier to analyze. In particular, from the AST we create a call graph, which is a map of the program’s function invocations. The call graph facilitates navigation through the codebase during vulnerability analysis, and also provides a comprehensive summary of the application’s structure and behavior. This holistic understanding is key to our tool’s ability to detect complex, context-dependent vulnerabilities, and it looks something like this:

[Figure: call graph of the example application. Nodes shown: WSGI Handler (process_request); URL Dispatcher (resolve); the dispatch methods of UserViewSet, PostViewSet, LoginView, and LogoutView; the view methods UserViewSet list/create, PostViewSet list/create, LoginView post, and LogoutView post; and the calls to User.objects.all/create, Post.objects.all/create, authenticate, login, logout, render, context_processor, and HttpResponse.]
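
A call graph of this kind can be approximated in a few lines. The sketch below again uses Python's built-in `ast` module as a stand-in for tree-sitter, and the sample source and function names (`list_users`, `fetch_users`) are hypothetical. It maps each function to the names it calls, without resolving imports, methods on objects, or dynamic dispatch.

```python
import ast
from collections import defaultdict

SOURCE = '''
def list_users(request):
    users = fetch_users()
    return render(request, "user_list.html", {"users": users})

def fetch_users():
    return User.objects.all()
'''


def call_name(func: ast.expr) -> str:
    """Render the callee of a Call node as a dotted name, e.g. User.objects.all."""
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return f"{call_name(func.value)}.{func.attr}"
    return "<dynamic>"


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each function defined in source to the names it calls."""
    graph = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if isinstance(child, ast.Call):
                    graph[node.name].add(call_name(child.func))
    return dict(graph)


print(build_call_graph(SOURCE))
```

Functions that make no calls do not appear as keys, and calls inside nested functions are also attributed to the enclosing function. A production call graph would have to handle both, along with cross-file resolution.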

Stage 3: Graph Enrichment

After generating the AST and call graph, we enrich the graph with contextual information by identifying features like endpoints (exposed functions or URLs that can be accessed externally) and assigning attributes to each node. These attributes can be details such as request paths, HTTP methods, and authentication and authorization mechanisms. For example, a node representing a login function might be enriched with attributes indicating that it accepts POST requests and implements rate limiting. A key aspect of this enrichment is recognizing how middleware and other security controls are implemented across the application.

This process transforms the basic AST into a more comprehensive representation of the application’s structure and behavior. While the initial AST shows the structure of individual functions, the enriched call graph shows how these functions interact and what security measures are in place throughout the application flow.

[Figure: enriched call graph of the example application. Nodes shown: Django WSGI Handler; Security Middleware (HTTPS redirect), Session Middleware, Authentication Middleware, and CSRF Protection Middleware; URL Dispatcher with routes /api/users/ (GET, POST), /api/posts/ (GET, POST), /auth/login/ (POST), and /auth/logout/ (POST); UserViewSet (List Users, Create User), PostViewSet (List Posts, Create Post), LoginView (Authenticate User), and LogoutView (End Session); User Model, Post Model, and Database (ORM queries and updates); List Template, Detail Template, and Context Processors; Static File Handler and Static Files; Cache Middleware; and Response Object, with Create Session on login success, Error Response on failure, and Delete Session on logout.]
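
One way to picture an enriched node is as a record that carries request and security metadata alongside the handler name. The sketch below is an assumption about shape only: the `EndpointNode` fields, the sample annotations, and the `unauthenticated_endpoints` query are hypothetical and do not describe ZeroPath's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointNode:
    """A call-graph node annotated with request and security metadata."""
    handler: str
    path: str
    methods: list[str]
    middleware: list[str] = field(default_factory=list)
    requires_auth: bool = False


# Hypothetical annotations for the example application.
NODES = [
    EndpointNode("UserViewSet.get", "/api/users/", ["GET"],
                 ["SessionMiddleware", "AuthenticationMiddleware"]),
    EndpointNode("PostViewSet.post", "/api/posts/", ["POST"],
                 ["SessionMiddleware", "AuthenticationMiddleware", "CsrfViewMiddleware"],
                 requires_auth=True),
    EndpointNode("LoginView.post", "/auth/login/", ["POST"],
                 ["SessionMiddleware", "CsrfViewMiddleware"]),
]


def unauthenticated_endpoints(nodes, public_paths=("/auth/login/",)):
    """Return handlers reachable without authentication that are not meant to be public."""
    return [n.handler for n in nodes if not n.requires_auth and n.path not in public_paths]


print(unauthenticated_endpoints(NODES))  # ['UserViewSet.get']
```

The first node shows why the enrichment has to look past the middleware list: Django's `AuthenticationMiddleware` only attaches `request.user` and does not by itself reject anonymous requests, so a view can sit behind it and still be open to everyone.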

Stage 4: Vulnerability Discovery and Validation

Finally, we get to the most important part: using the call graph to discover vulnerabilities. In our application security analysis, vulnerabilities are bucketed into three main types:

  1. Technical Vulnerabilities: These encompass implementation-specific security flaws such as SQL Injection (SQLi), XML External Entity (XXE) injection, Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), leaked secrets, and Server-Side Request Forgery (SSRF).
  2. Business Logic Flaws: These vulnerabilities arise from flaws in the application’s logic. Examples include:
    • Price manipulation in e-commerce systems
    • Exploitation of coupon systems leading to incorrect pricing
    • Bypassing intended workflow sequences
    • Lack of rate limiting especially when interacting with external APIs (leading to excessive charges from providers)
  3. Authentication/Authorization Issues: These stem from improper implementation of user authentication or access control mechanisms. Common subtypes include:
    • Insecure Direct Object Reference (IDOR)
    • Missing Function Level Access Control
    • Broken Session Management

Each category requires distinct analysis techniques. Technical vulnerabilities often involve pattern matching and taint analysis, business logic flaws require understanding of intended application behavior, and authentication/authorization issues necessitate comprehensive flow analysis of user sessions and permissions. Some ways we find these bugs:

  • Static Rules: ZeroPath has a large set of static rules that describe vulnerable code patterns, and it semantically searches a given codebase for code that matches them. This lets us detect many existing classes of issues.
  • Threat Modeling: ZeroPath comes up with attack scenarios and verifies them by rigorously investigating the code.
  • Software Composition Analysis (SCA): ZeroPath actively monitors the application’s dependencies for known vulnerabilities and checks whether those vulnerabilities are exploitable from the outside.
  • Secret Scanning and Validation: ZeroPath scans for secrets and checks whether each one is still valid. It also provides information about each discovered secret to help teams rotate it quickly and enforce best practices.
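
As a small illustration of the scanning half of the last item, the sketch below searches text for candidate AWS access key IDs using a widely used pattern (the `AKIA` prefix followed by 16 uppercase letters or digits). The function name is hypothetical, the sample key is the placeholder used in AWS documentation, and the validation step, which would have to query the provider, is not shown.

```python
import re

# Candidate AWS access key IDs: "AKIA" plus 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line_number, key_id) pairs for every candidate key ID in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


SAMPLE = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_aws_key_ids(SAMPLE))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

Pattern matching alone produces candidates, not findings: documentation placeholders like the one above match too, which is why a validation step matters.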

Our methodology for investigating business logic flaws and broken authentication vulnerabilities combines two AI techniques: Tree-of-Thoughts (ToT) and an adaptation of the ReAct framework.

ToT enables multi-path reasoning, intermediate step evaluation, and outcome ranking. This improves our ability to explore complex vulnerability scenarios. The ReAct-inspired component enforces structured tool usage with explicit action justification, enhancing the rigor of our investigative process.

By integrating these techniques, we’ve developed a framework that allows for comprehensive vulnerability assessment. ToT facilitates thorough scenario exploration, while the ReAct adaptation ensures methodical tool application. This approach has proven particularly effective in addressing the nuanced challenges presented by business logic and authentication vulnerabilities.

To further strengthen our validation process and confirm that identified vulnerabilities are exploitable, ZeroPath employs a Monte Carlo Tree Self-Refine (MCTSr) algorithm. This approach, inspired by recent advancements in AI problem-solving, allows us to efficiently explore and verify complex technical attack vectors.

Monte Carlo Tree Self-Refine (MCTSr)

Our MCTSr implementation builds upon research from Shanghai Artificial Intelligence Laboratory, Fudan University, and collaborating institutions. Their work on solving International Mathematical Olympiad problems using Monte Carlo Tree Search and self-refinement techniques provided a foundation that we’ve adapted for cybersecurity applications. We’ve modified this approach to navigate the decision trees involved in verifying security vulnerabilities, allowing both for more efficient exploration of potential attack vectors and fewer false positives.

The most important part of using MCTSr is defining a “win function”. Because ZeroPath is a static analysis tool, our win function is implemented by an LLM that estimates the likelihood that a hypothesized attack would work, along with the severity of the problem.

The verification agent we use differs for problems like SSTI, SQLi, XSS, and business logic issues. Generally, an agent is designed to pull in relevant information from previous stages and consider all of the controls that could make an attack impractical. If the LLM investigator determines that a given attack exceeds a practicality threshold, it is sent to the next stage: patch generation and tweaking.
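
To give a feel for how tree search and a threshold fit together, here is a generic sketch of upper-confidence (UCT) selection over attack hypotheses. It is not ZeroPath's implementation and does not reproduce the MCTSr paper's exact scoring: the `Hypothesis` class, the exploration constant `c`, and the 0.7 threshold are assumptions, and the rewards are plain floats standing in for scores that an LLM-based win function would return.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """A node in the search tree: one candidate attack path and its scores."""
    description: str
    rewards: list[float] = field(default_factory=list)

    @property
    def visits(self) -> int:
        return len(self.rewards)

    @property
    def q(self) -> float:
        # Mean reward so far; unvisited nodes get 0.0.
        return sum(self.rewards) / self.visits if self.rewards else 0.0


def uct(node: Hypothesis, parent_visits: int, c: float = 1.4, eps: float = 1e-6) -> float:
    """Upper-confidence score: exploit high-Q nodes, explore rarely visited ones."""
    return node.q + c * math.sqrt(math.log(parent_visits + 1) / (node.visits + eps))


def select(children: list[Hypothesis]) -> Hypothesis:
    """Pick the child with the highest UCT score."""
    parent_visits = sum(child.visits for child in children)
    return max(children, key=lambda child: uct(child, parent_visits))


def passes_threshold(node: Hypothesis, threshold: float = 0.7) -> bool:
    """Gate: only hypotheses scored as practical enough move on."""
    return node.visits > 0 and node.q >= threshold
```

For example, given one hypothesis already scored twice at 0.9 and 0.8 and another that has never been evaluated, `select` picks the unevaluated one because its exploration term dominates, while only the first would currently pass the gate.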

Conclusion

AI-driven vulnerability detection is moving fast. While some are just now jumping into this field, it’s been developing for a while. Since July 2024, we’ve been exploring how deep program analysis combined with adversarial AI agents can uncover critical bugs that might be overlooked by traditional tools.

What’s intriguing is that many of these vulnerabilities are pretty straightforward—they could’ve been spotted with a solid code review or standard scanning tools. But conventional methods often miss them because they don’t fit neatly into known patterns. That’s where AI comes in, helping us catch issues that might slip through the cracks.
