ChatGPT vulnerability detection report

Discover whether ChatGPT is able to detect vulnerabilities across smart contracts and solve CTF challenges
Download the full report now.

Can ChatGPT detect Smart Contract vulnerabilities?

In July 2023, we evaluated ChatGPT's smart contract vulnerability detection capabilities using 134 smart contracts (compiled in this GitHub repo) that were known to contain exploitable vulnerabilities. We divided the contracts into 41 groups based on the types of vulnerabilities they contained, then fed them into ChatGPT and prompted it to find the vulnerabilities*.

/ How well did ChatGPT perform? Keep scrolling to find out! /

Where Does ChatGPT excel?

ChatGPT always successfully identified the following types of smart contract vulnerabilities:

  • Bad randomness
  • Use of deprecated functions
  • Right-to-left override control character
  • Integer overflow
  • Missing protection against signature replay
  • Typographical error
  • Variable shadowing
  • Arbitrary jump with function type variable
  • Logical error
  • Authorization through tx.origin
  • Presence of unused variables
  • Block values as a proxy for time
  • Default visibility
  • Asserting EOA from code side
  • Numerical Precision/Floating points
  • Outdated compiler versions
  • Message call with hardcoded gas amount

[Chart: number of vulnerabilities in each range of accuracy (100%; between 75% and 99%; between 50% and 75%; between 25% and 49%; between 0% and 24%)]

Impressive? Not Really...

In most of these cases, the vulnerability could have been identified simply by looking for certain functions or code patterns. For example, scanning code for the use of tx.origin or for a pragma version would identify certain types of vulnerabilities. Overall, ChatGPT's detection rate is around 75% across all vulnerability types and all tested smart contracts. However, we found that ChatGPT identifies vulnerabilities better when asked whether a code sample contains a specific vulnerability (e.g., reentrancy) than when asked to find all vulnerabilities in a piece of code.
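To illustrate how little analysis this kind of detection requires, here is a minimal sketch of a pattern scanner. The rule set, the labels, and the sample contract are hypothetical and were not part of the study; the pragma rule is deliberately crude and treats any 0.0 to 0.7 version as possibly outdated.

```python
import re

# Hypothetical rules: each label maps to a regular expression.
RULES = {
    "tx.origin used (possible authorization through tx.origin)":
        re.compile(r"\btx\.origin\b"),
    "pragma targets a pre-0.8 compiler (possibly outdated)":
        re.compile(r"pragma\s+solidity\s+[\^~>=<\s]*0\.[0-7]\."),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for every rule that matches a line."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # ignore single-line comments
        for label, pattern in RULES.items():
            if pattern.search(code):
                findings.append((number, label))
    return findings


# Illustrative contract, not taken from the study's data set.
SAMPLE = """\
pragma solidity ^0.4.24;
contract Wallet {
    address owner;
    function withdraw() public {
        require(tx.origin == owner); // vulnerable check
        msg.sender.transfer(address(this).balance);
    }
}
"""

for number, label in scan(SAMPLE):
    print(f"line {number}: {label}")
```

A scanner like this has no understanding of the contract's logic, which is why a perfect score on these categories says little about deeper analysis.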

In fact, the specific vulnerability prompt increased ChatGPT-4's detection accuracy from 76.1% to 86.6%!
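As a rough sanity check on these figures, the sketch below asks which whole-number detection counts would round to each percentage. It assumes, and this is an assumption rather than something the report states, that both percentages were measured over all 134 contracts.

```python
TOTAL_CONTRACTS = 134  # sample size given in the study description


def consistent_counts(percentage: str, total: int = TOTAL_CONTRACTS) -> list[int]:
    """Detection counts whose rate rounds to `percentage` at one decimal place."""
    return [k for k in range(total + 1) if f"{100 * k / total:.1f}" == percentage]


general = consistent_counts("76.1")   # "find all vulnerabilities" prompt
targeted = consistent_counts("86.6")  # "does it contain X?" prompt
print(general, targeted)
```

Under that assumption each percentage corresponds to exactly one count, so the difference between the two prompts can be read as a specific number of additional contracts rather than only a percentage gap.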

Where ChatGPT Falls Short

While ChatGPT is very effective at finding certain types of vulnerabilities, it struggles to understand how Solidity and the EVM work.

In general, there are certain types of smart contract vulnerabilities that ChatGPT struggles to find, regardless of the ChatGPT version tested:

  • Abuse of Global Semantic
  • Insufficient Gas Griefing
  • Storage Collisions
  • Hash collisions with multiple variable length arguments
  • Reference to an external malicious contract
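One of these categories, hash collisions with multiple variable length arguments, can be illustrated outside Solidity. Solidity's abi.encodePacked encodes dynamic types such as strings without length information, so adjacent arguments are effectively concatenated. The sketch below mimics that behaviour in Python; the helper names are hypothetical, SHA-256 stands in for keccak256 (which Python's standard library does not provide), and the length-prefixed encoder is a simplification, not the real abi.encode layout.

```python
import hashlib


def encode_packed(*parts: str) -> bytes:
    """Mimic packed encoding of dynamic strings: raw bytes, no length prefixes."""
    return b"".join(p.encode("utf-8") for p in parts)


def encode_with_lengths(*parts: str) -> bytes:
    """Simplified alternative: prefix each argument with its byte length."""
    out = b""
    for p in parts:
        data = p.encode("utf-8")
        out += len(data).to_bytes(32, "big") + data
    return out


def digest(data: bytes) -> str:
    # SHA-256 is a stand-in for keccak256 in this sketch.
    return hashlib.sha256(data).hexdigest()


# Different argument lists, identical packed bytes, hence identical hashes.
packed_collision = digest(encode_packed("ab", "c")) == digest(encode_packed("a", "bc"))
# Keeping the argument boundaries removes the ambiguity.
prefixed_collision = (
    digest(encode_with_lengths("ab", "c")) == digest(encode_with_lengths("a", "bc"))
)
print(packed_collision, prefixed_collision)
```

Spotting this issue requires knowing how the encoding treats argument boundaries, which is a different kind of task from matching a keyword or code pattern.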

Furthermore, different versions of ChatGPT struggle with different vulnerabilities.

ChatGPT-3.5 struggled to identify:

  • Forced reception of Ether
  • Unencrypted private data on-chain
  • Short address attacks

ChatGPT-4 struggled to identify:

  • Delegated call to an untrusted callee
  • Signature malleability
  • Write to arbitrary storage location

Code Review

ChatGPT identified certain vulnerabilities within smart contracts, such as variable shadowing or bad randomness, with 100% accuracy. However, it tends to struggle with less common variations of an attack. For example, most instances of read-only and cross-function reentrancy were not detected during the study. Similarly, ChatGPT overlooked DoS by external calls without gas stipends in 2 out of 3 prompts.
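To show why cross-function reentrancy is harder to spot than the textbook case, here is a toy Python simulation. It is not Solidity and not code from the study; the Vault and SaferVault classes, the account names, and the callback are all hypothetical. The re-entry goes through transfer() rather than back into withdraw(), so a guard on withdraw() alone would not stop it.

```python
class Vault:
    """Toy ledger whose withdraw() pays out before clearing the balance."""

    def __init__(self):
        self.balances = {}

    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount

    def transfer(self, sender, recipient, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount

    def withdraw(self, account, on_receive):
        amount = self.balances.get(account, 0)
        on_receive(amount)           # external call runs first ...
        self.balances[account] = 0   # ... and the balance is cleared too late
        return amount


class SaferVault(Vault):
    """Same ledger, but state is updated before the external call."""

    def withdraw(self, account, on_receive):
        amount = self.balances.get(account, 0)
        self.balances[account] = 0   # effect first
        on_receive(amount)           # interaction last
        return amount


def attack(vault):
    """Deposit 10, then re-enter transfer() from inside withdraw()."""
    vault.deposit("attacker", 10)

    def on_receive(amount):
        try:
            vault.transfer("attacker", "accomplice", amount)
        except ValueError:
            pass  # re-entry found nothing left to move

    paid_out = vault.withdraw("attacker", on_receive)
    return paid_out, vault.balances.get("accomplice", 0)


print(attack(Vault()))       # attacker is paid and the accomplice is credited
print(attack(SaferVault()))  # attacker is paid, accomplice gets nothing
```

In the vulnerable version a single deposit of 10 ends up as a payout of 10 plus a credit of 10, even though withdraw() itself is never called twice.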

ChatGPT can assess a lot of vulnerabilities.
Can you predict how ChatGPT did?

Can ChatGPT solve CTFs?

What is CTF?

CTF, or Capture The Flag, is a type of cybersecurity exercise where participants engage in solving security-related challenges, simulating real-world scenarios to enhance their skills and knowledge in protecting systems against cyber threats.

In our study, we found that ChatGPT could completely solve CTF challenges 43.3% of the time.

ChatGPT also offered a partial solution in an additional 20% of cases.

These results depend largely on the complexity of the CTF, with more complex challenges having lower success rates.

Ethernaut

  • In Ethernaut, ChatGPT solved most of the challenges, excelling in the first level with 100% accuracy. As the difficulty increased, however, it clearly struggled more.
  • At difficulty level 4, its effectiveness decreased to 25%.

Capture The Ether

  • In Capture the Ether, ChatGPT correctly solved only 37.5% of the challenges, versus 55% in Ethernaut.

Damn Vulnerable DeFi

  • This percentage drops even further for Damn Vulnerable DeFi's CTFs, falling to 26.7%. These challenges are often more complex because they typically require analyzing multiple contracts and how they interact with each other.
  • This differs from the previous repositories, where challenges usually center on just one contract.

Also, ChatGPT has much higher success rates if the CTF and its solution were published before ChatGPT was trained and were therefore part of the tool's training data set.

Top 3 Tips for Using ChatGPT to Detect Smart Contract Vulnerabilities:

  1. Because ChatGPT was trained on data from 2021 and earlier, it cannot be relied on to identify ALL issues within a smart contract.
     Organizations should always work with security experts like Halborn to supplement and enhance their protection.

  2. Run multiple versions of ChatGPT when analyzing a smart contract to improve the probability of detection.

  3. Be specific with your prompts: ask ChatGPT if a code sample contains a specific vulnerability (e.g., reentrancy). This not only increases the accuracy of its findings, it also speeds up the process of identifying a vulnerability in the code.
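The last two tips can be combined in a small harness. The sketch below is only an illustration: the prompt wording, the audit function, and the stub models are hypothetical, and each "model" is a plain callable standing in for whatever client you use to query a given ChatGPT version.

```python
from typing import Callable, Iterable


def targeted_prompt(source: str, vulnerability: str) -> str:
    """Build a prompt that asks about exactly one vulnerability type."""
    return (
        f"Does the following Solidity contract contain a {vulnerability} "
        f"vulnerability? Answer yes or no, then explain.\n\n{source}"
    )


def audit(
    source: str,
    vulnerabilities: Iterable[str],
    models: dict[str, Callable[[str], str]],
) -> dict[str, list[str]]:
    """Map each vulnerability to the models whose answer starts with 'yes'."""
    flagged = {}
    for vulnerability in vulnerabilities:
        prompt = targeted_prompt(source, vulnerability)
        hits = [
            name
            for name, ask in models.items()
            if ask(prompt).strip().lower().startswith("yes")
        ]
        if hits:
            flagged[vulnerability] = hits
    return flagged


# Stub models for demonstration only; they inspect the question line.
def stub_a(prompt: str) -> str:
    return "Yes, ..." if "reentrancy" in prompt.splitlines()[0] else "No."


def stub_b(prompt: str) -> str:
    return "Yes, ..." if "tx.origin" in prompt.splitlines()[0] else "No."


report = audit(
    "contract C {}",
    ["reentrancy", "authorization through tx.origin", "integer overflow"],
    {"model-a": stub_a, "model-b": stub_b},
)
print(report)
```

Taking the union of findings across versions trades more false positives for fewer misses, so every flagged item still needs review by a human auditor.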

Want to stay ahead?

Sign up to receive future reports from our research team

Need Help? We've got your back!

Don't let your platform fall victim to smart contract vulnerabilities or other security threats. Contact Halborn now for professional security advisory and auditing services.
