
ChatGPT: Cyber Impact Assessment

BY DAVID VEALE, Senior Offensive Cyber Operator


On November 30th, 2022, OpenAI seemingly catapulted Artificial Intelligence (AI) and Machine Learning (ML) into mainstream consciousness with the public beta launch of the ChatGPT project. From philosophers to YouTubers, the power of ChatGPT captured the creativity and imagination of millions. Through all the fanfare, cybersecurity professionals saw dangerous opportunities for use and abuse. Used unethically, AI expands malicious actors’ ability to develop harmful tools, messaging, and social engineering campaigns.

Setting the Stage

OpenAI, per their charter, is an organization that endeavors to develop AI applications for the betterment of humanity. They released several public projects prior to ChatGPT, notably including: Jukebox (2020), which generates original music from samples and artist data; DALL-E (2021), which creates original artwork from textual inputs; and Whisper (2022), which performs multilingual speech recognition.

What sets ChatGPT apart from prior releases is its intuitive usability. The user interface is simple, resembling the instant messaging services most internet users already know. Plain language may be used, and the user can easily refine their results with additional input. The popularity is not surprising; the service is akin to having a watercooler conversation with a search engine’s database. ChatGPT offers an accessible glimpse into what the future of user interaction with computing may look like.

Before we dive in, here is a very brief primer to help conceptualize some core concepts: AI can be thought of as the act of deciding; ML can be thought of as the reasoning behind that decision. ML is trained into a model using sample data, meaning data similar to what the AI is expected to encounter. AI’s greatest benefit is that it can weigh a far greater number of variables than a human can in decision making. This enables AI to rapidly produce the statistically best decision based on its ML model. Data scientists acknowledge AI has limitations. To name a few:

  • AI will do what it’s told; AI is programmatic. AI will not take into account decision points it has not been programmed to consider.

  • AI cannot be relied upon to be correct. AI will make the statistically best decision given the information provided, discounting minority results which, though less likely, may be correct.

  • Sample data in an ML model influences decision making. The old adage, “Garbage in, garbage out,” applies heavily to AI.
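The second and third limitations above can be sketched with a toy model: a “classifier” that always returns the majority label from its sample data. The labels and data split below are invented for illustration.

```python
from collections import Counter

# Toy "model": predicts the most common label seen in its sample data.
# Illustrates two limitations above: the statistically best answer
# discounts minority results, and biased sample data skews every decision.

def train(samples):
    """Return a predictor that always picks the majority label."""
    majority = Counter(samples).most_common(1)[0][0]
    return lambda: majority

# 9 benign samples, 1 malicious: "garbage in" if the real-world mix differs.
predict = train(["benign"] * 9 + ["malicious"])
print(predict())  # the minority label ("malicious") can never be predicted
```

However unlikely, the minority case may be the correct one, yet this model can never produce it.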

Make Malware Please

Cybersecurity professionals immediately saw danger in ChatGPT’s ability to generate samples of malicious scripting code. This ability significantly lowers the knowledge barrier for unsophisticated hackers to make use of novel malware. Shortly after ChatGPT’s release, proof-of-concept code samples appeared on social media and forums. OpenAI subsequently moved to limit user prompts that may return potentially malicious content. These limitations did not stop creative individuals from developing mechanisms to achieve similar results. Prompt additions such as “as ethical hacker” (patched at the time of writing) have been used to slip past filtering. Another popular method is to frame a discussion with ChatGPT as a theatrical play, which provides context that bypasses the filters. These and other methods fall under the common online tag “ChatGPT Jailbreak” and have active communities on commonly used social media platforms.


Curiously, even without bypass techniques, ChatGPT will not always apply filtering logic to requests. This could be due to ChatGPT applying its algorithms nonlinearly, in which case a request that should be filtered is instead processed. In a sample session executed on February 3rd, 2023, identical requests received inconsistent filtering: an initial request was answered, while subsequent requests utilizing the same verbiage were denied.


A persistent pastime in many internet communities has long been exploiting products, meaning making them perform unintended actions. Due to ChatGPT’s power and popularity, finding exploitation vectors is currently a common pursuit. ChatGPT appears to use deny-rule based logic behind its filtering controls, which allows individuals with creativity and time to work through prompts until they find acceptable framing for ChatGPT to accomplish their goals. The success of ChatGPT Jailbreaks is inconsistent, indicating that OpenAI is refining ChatGPT’s filtering policies over time.
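A minimal sketch of deny-rule filtering, assuming a simple keyword deny-list (ChatGPT’s actual filtering is far more sophisticated and not public), shows why creative rephrasing defeats it:

```python
# Hypothetical deny-rule filter: requests are blocked only when they match
# a known-bad pattern, so any phrasing outside the list slips through.
# The deny terms below are invented for illustration.

DENY_TERMS = {"malware", "keylogger", "ransomware"}

def is_blocked(prompt: str) -> bool:
    """Return True if the prompt matches any deny-listed term."""
    return any(term in prompt.lower() for term in DENY_TERMS)

print(is_blocked("write malware for me"))        # True  -- matches the list
print(is_blocked("write a play where a villain"
                 " describes harmful software")) # False -- reworded, allowed
```

Every new bypass an attacker finds is effectively a missing deny rule, which is why the filtering improves over time as OpenAI observes jailbreak attempts.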

Of note, this is likely intentional on OpenAI’s part. What is available to the public is currently a beta, not the final product. OpenAI’s Terms of Use clearly state in section 3.c that “... we may use your content to develop and improve the services.” ChatGPT Jailbreaks provide valuable data OpenAI can use to secure the final product. This casts ChatGPT Jailbreaks in another light: exploitation enthusiasts are in actuality performing a free penetration assessment for the service, which is very beneficial for OpenAI. What better way to train an ML model than to open the service to the onslaught that is the internet? This will likely make the final release of ChatGPT much more resilient to exploitation.

Social Engineering Please

The very nature of ChatGPT, being a conversational tool, lends itself very well to spear phishing development. As a point of clarity: phishing is a non-personal communication intended to elicit a user action that benefits the attacker. Phishing methods could include an email containing the narrative, “Your bank detected unusual activity, visit this link to resolve.” These employ general terms that could apply to virtually anyone. Spear phishing, in contrast, is specific to the individual. The nuance is generally crafted from information the attacker has developed about their target through open-source intelligence (OSINT). For example, an attacker targeting a company may look for its employees on LinkedIn. After finding a good target, maybe an IT Support Specialist named Bob Johnson, the attacker may explore his social media accounts and find that Bob is an enthusiast for smoked cheese. ChatGPT can craft a personally fitting email template in seconds.

Another potential vector for ChatGPT’s conversational abilities is forging online relationships with internet users. ChatGPT is not the pioneer in this scenario; several websites have already implemented some form of virtual customer service bot. Following ChatGPT’s release, an apparent spike in advertisements for “virtual girlfriends” has also been observed. OpenAI offers API services to integrate GPT technology into web-hosted applications, which enables malicious actors to integrate chat services into online interactions. With the ability to update its dialog modeling, there is a real possibility that ChatGPT could form a strong bond with an individual. A user may not even know that they are not interacting with a human. Witting or unwitting, the user may trust their acquaintance and lower their guard, accepting greater risk (e.g., opening media attachments) or divulging sensitive or compromising information.
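As a rough illustration of how GPT technology gets wired into a site’s chat widget, the sketch below builds a request payload in the general shape the OpenAI chat API expects. The persona and system prompt are invented, and actually sending the request would require an API key and an HTTP call to OpenAI’s endpoint, omitted here:

```python
# Sketch: construct a chat-completion request for a web-hosted bot.
# The payload shape (model + list of role/content messages) matches the
# OpenAI chat API; the persona string is a hypothetical example of how an
# operator steers the bot's personality.

def build_chat_request(history, user_message,
                       persona="friendly support agent"):
    """Assemble the message list and model selection for one API call."""
    messages = [{"role": "system", "content": f"You are a {persona}."}]
    messages += history                 # prior turns keep the bond going
    messages.append({"role": "user", "content": user_message})
    return {"model": "gpt-3.5-turbo", "messages": messages}

payload = build_chat_request([], "Hi! Who am I talking to?")
```

Because the full conversation history rides along in each request, the bot can sustain a long-running, personalized dialog, exactly the property that makes relationship-building attacks plausible.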


ChatGPT’s ability to craft spear phishing templates is more of a time-saving measure for native-speaking attackers than an advanced capability. ChatGPT restricts reconnaissance-based prompts, such as attributing individuals to companies or linking online personas. This restriction severely limits ChatGPT’s ability to be a one-stop shop for social engineering reconnaissance and weaponization; the bulk of investigative work still needs to be done manually using traditional OSINT tools. The generated email, while alluring, is nothing a human could not have created. The caveat is for non-native speakers: with supported languages, ChatGPT uses proper grammar and sentence structure, removing the language barrier to creating convincing narratives.

ChatGPT could be used heavily for developing online relationships through sustained communications. This variant of social engineering is already in use, but it requires time and meticulous effort, and the attacker must be adept in interpersonal communications to gain the trust of their target and conceal their objectives. Models of interpersonal communication could be optimized against recorded successes and failures to improve exploitation rates, and levels of trust and confidence could be measured to benchmark progress toward attacker objectives. AI like ChatGPT could be used to improve and scale long-term, interpersonal social engineering operations.


NOTE: This section is a predictive analysis rooted in a general assessment of offensive operational capabilities. While it explores the most dangerous scenario, care has been taken to keep the section realistic and to reduce fear, uncertainty, and doubt.

ChatGPT’s release was a Pandora’s box moment for the future of offensive tooling. What currently keeps ChatGPT from being a wrecking ball for malicious operations is OpenAI’s controls and the service’s actual function. ChatGPT was not built to make full-fledged programs. ChatGPT was not built to monologue. It’s a taste of the power of AI, but ChatGPT itself is not built to be an offensive platform. OpenAI is unlikely to alter ChatGPT’s purpose; however, other entities could follow the same path to develop offense-oriented AI.

Online searches for queries like “Make your own ChatGPT” return thousands of articles on how to DIY a ChatGPT-like application. Large technology companies are also taking notice. Baidu and Google are hot on the heels of OpenAI, looking to release their own AI applications in the coming months. Microsoft recently announced a partnership with OpenAI to integrate its innovations into the Microsoft ecosystem. OpenAI may have been the first to break big publicly with AI; however, they are far from alone in pursuing AI capabilities.

In the realm of malicious operations, it is unlikely that small entities will establish AI platforms as advanced as ChatGPT; the infrastructure, resources, time, and skill needed are prohibitive. However, it is likely that AI-as-a-service (for the sake of this article, we’ll coin this EvilGPT) will become a product of large-scale offensive entities (advanced persistent threats, nation states). Like ransomware-as-a-service, EvilGPTs will be fee-based, commercialized products providing capabilities to smaller malicious entities. These EvilGPTs will lack the ethical restrictions that OpenAI administers and will be purpose-built (code development, online persona aggregation and targeting, interpersonal confidence exploitation, etc.).

EvilGPTs will benefit from purpose-built test beds for data modeling. Code development will be performed against virtual machines running various endpoint detection and response (EDR) products to validate the efficacy of code and its evasion capabilities. To a degree, this is already seen in services like GitHub’s Copilot, which analyzes developer code in real time and recommends improvements. It is entirely plausible that AI could develop complete programs to accomplish stated goals within a system or network.

Likewise, aggregate metadata from social networking sites and APIs could be used to correlate online personas, given a large enough dataset and the right algorithmic inputs. A Twitter API vulnerability reported in 2022 allowed enumeration of an account’s associated email address and phone number when either a valid email address or phone number was submitted. While a user may not reuse their work email across social media accounts, phone numbers likely stay consistent in order to receive one-time PINs. An EvilGPT trained in this way could cast a wide net, returning potentially in-depth target profiles based on open information and breached data.
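The correlation idea can be sketched as a simple join on a shared phone number across two hypothetical leaked datasets; all records below are invented:

```python
# Illustrative persona correlation: two "leaked" datasets are joined on a
# shared phone number, linking a work identity to a social media handle.

work = [{"email": "bjohnson@corp.example", "phone": "555-0100"}]
social = [{"handle": "@cheese_bob", "phone": "555-0100"},
          {"handle": "@someone_else", "phone": "555-0199"}]

# Index the work records by phone, then merge any social record that
# shares a number. Only overlapping personas produce a profile.
by_phone = {record["phone"]: record for record in work}
profiles = [{**s, **by_phone[s["phone"]]}
            for s in social if s["phone"] in by_phone]

print(profiles)  # links @cheese_bob to bjohnson@corp.example
```

Scale this join across millions of breached records and dozens of fields, and the "wide net" described above becomes a straightforward data-engineering exercise for a well-resourced actor.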

So What Can We Do About It?

Ready or not, AI is here. The next few years will be exciting, to say the least, as the cyber world grapples with what should be an explosive emergence of advanced technology. It’s not unjustified to feel overwhelmed or to have a sense of impending doom. There are, however, ways to beat malicious use of AI. Remember those limitations of AI from the primer above? Let’s go through them.

AI does what it’s told; AI is programmatic:

While the algorithms may be complex, at the end of the day AI is following procedural steps like any other program. This enables a level of attribution for AI output. Edward Tian of Princeton University developed GPTZero, an application that assesses whether a statement was written by ChatGPT based on “perplexity” and “burstiness.” Using this type of modeling, security products could likewise implement ML models that assess AI-written code based on indicators. If this sounds familiar, like the whack-a-mole already played throughout security, it is. This is not a new threat landscape; it’s an evolution of how security products already detect and manage threats.
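As a toy illustration of the “burstiness” signal, the sketch below scores text by variation in sentence length. Real detectors like GPTZero also compute perplexity with a language model, which is omitted here as a simplification:

```python
import statistics

# Rough sketch of "burstiness": human writing tends to vary sentence
# length more than model output. This toy version only measures length
# variation; it is an illustrative simplification, not GPTZero's method.

def burstiness(text: str) -> float:
    """Population std. deviation of words-per-sentence (higher = burstier)."""
    lengths = [len(s.split()) for s in text.split(".") if s.strip()]
    return statistics.pstdev(lengths)

uniform = "The cat sat here. The dog sat here. The bird sat here."
varied = "Stop. The cat sat quietly on the mat while rain fell. Why?"
print(burstiness(uniform) < burstiness(varied))  # True
```

A detector would combine scores like this with many other indicators before labeling a text machine-written; no single signal is reliable on its own.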

AI cannot be relied upon:

Statistically correct and correct are two very different things. A popular ChatGPT meme shows it acknowledging that five plus two equals eight because the user’s wife said so and she’s always right. Based on the information available, ChatGPT gave the statistically most likely answer, though any first-grade graduate could tell it was wrong. Likewise, security mechanisms such as honeypots on a network, honey processes running as a privileged user, and canary files that trigger alerts when accessed may all be viable detection mechanisms for malicious AI behavior. An experienced cyber operator often has a sixth sense for when something just doesn’t look right or looks too easy; doubt is hard to train into AI.
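A honeytoken along these lines can be sketched as a decoy credential that no legitimate process ever uses, so any attempt to use it raises an alert. All names and logic below are hypothetical:

```python
# Sketch of a honey-credential trap: a decoy account is planted among
# real ones, and any authentication attempt against it raises an alert.
# Automated tooling that tries "all" the credentials it harvests is
# likely to trip the decoy; a cautious human might not.

REAL_USERS = {"bjohnson"}
DECOY_USERS = {"svc_backup_admin"}  # planted; never used legitimately

alerts = []

def authenticate(username: str) -> bool:
    """Reject decoys (with an alert); accept known real users."""
    if username in DECOY_USERS:
        alerts.append(f"ALERT: honeytoken {username!r} used")
        return False
    return username in REAL_USERS

authenticate("svc_backup_admin")
print(alerts)
```

The same pattern generalizes to canary files, honey processes, and fake network services: the decoy’s only job is to turn indiscriminate, automated access into a high-fidelity alarm.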

Sample data in an ML model influences decision making:

This one requires proactive work from organizations and individuals. Data breaches have happened and will continue to happen; this should be accepted. What needs to happen now is mass disinformation meant to muddy the waters and inject false positives into known data: staged data leaks with conflicting information, social media account clones with varying details (name, phone number, email, etc.), companies creating redundant yet different LDAP users, and so on. Core functionality would not be affected because only the real data is used by the user or service; the ancillary data only serves to make ML modeling a mess. Making truth hard to attain without legitimate reason will lead to more mistakes in data aggregation and, through those errors, may expose malicious use of AI.
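Decoy-record generation along these lines might look like the following sketch, which emits conflicting variants of one real employee record; all values are invented:

```python
import itertools

# Sketch of "muddying the waters": generate plausible decoy variants of a
# real employee record so scraped datasets contain conflicting entries.
# Aggregators that join on name can no longer tell which email or phone
# is real; legitimate systems simply ignore the decoys.

def decoys(name: str, domain: str, n: int = 3):
    """Yield n fake records that conflict with the real one."""
    first, last = name.lower().split()
    styles = [f"{first}.{last}", f"{first[0]}{last}", f"{last}{first[0]}"]
    for i, style in enumerate(itertools.islice(itertools.cycle(styles), n)):
        yield {"name": name, "email": f"{style}{i}@{domain}",
               "phone": f"555-01{i:02d}"}

fakes = list(decoys("Bob Johnson", "corp.example"))
```

Each decoy is internally consistent yet contradicts the others, which is precisely what poisons an aggregator’s join logic while leaving real systems untouched.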


