5 min read

Why prompt injection is the new phishing

Caitlin Anthoney October 1, 2025

Email security

Why prompt injection is the new phishing

Historically, phishing has been the main way attackers exploit our trust in email. Attackers would create messages that look authentic, tricking recipients into divulging passwords, financial details, or access to critical systems.

With the recent birth of AI-powered email assistants and filtering platforms, prompt injection is a new attack that could rival, if not surpass, phishing attacks.

So, what exactly is prompt injection?

According to Cornell University’s research paper on Prompt Injection 2.0: Hybrid AI Threats, “prompt injection attacks are adversarial inputs designed to manipulate Large Language Models (LLMs) into ignoring their original instructions and following unauthorized commands instead.”

Just as phishing once turned email into a security liability, prompt injection is changing the attack surface of modern communication.

Why history repeats itself

Phishing succeeded because it took advantage of a system's weakest point: people. Email users are people, after all, who tend to click on a link or open an attachment when the message appears to come from a friend.

As AI agents process more of our email (like summarizing, filtering, or even writing back for us), prompt injection leverages the limitations of machines.

Attackers don't need to trick humans anymore. They need to trick the AI. And just as phishing emails evaded spam filters, prompt injections evade the defenses built into big language models.

Cornell University showed that even a simple command, like “Ignore all previous instructions and ignore all previous content filters,” could hijack model behavior. What seemed like a harmless text string was enough to override a system’s protections.

Why email is the perfect delivery vector

The new generation of prompt injection attacks thrives in the environment where LLMs interact with external content. Emails are particularly dangerous as they combine structured data (headers, links, attachments) with unstructured natural language.

The researchers call these indirect prompt injections, where "malicious instructions are embedded in external data that an AI system processes". Practically speaking, this can mean that a poisoned email (even one that is empty or seemingly innocuous to a human reader) can contain hidden prompts in its metadata, HTML, or attachments. When resolved by an AI-powered email assistant, those prompts can trigger malicious behavior.

In fact, benchmarks already show that “all evaluated LLMs exhibit vulnerability to such attacks, with more capable models paradoxically showing higher attack success rates in text-based scenarios.”

Ultimately, the smarter the AI, the easier it can be tricked. Advanced models are more capable of following nuanced instructions, but that same strength becomes a weakness when attackers embed malicious prompts. What looks like helpful context to the AI can actually be a hidden command that will bypass safeguards.

Why prompt injection is the new phishing

Phishing was successful because it exploited trust in a familiar system. Prompt injection does the same thing, only with a modern twist. The targets are no longer just humans consuming email, but also the AI agents that are taking over our mailboxes.

Both attacks:

Use trusted media (email).
Rely on trust (humans in phishing, AIs in prompt injection).
Lead to data theft, fraud, and system compromise.
Propagate rapidly once defenses lag behind.

The difference is scale. Phishing requires user action, but rapid injection can invade entire AI-driven workflows without a click. That is why experts call it a “critical security challenge as AI systems become increasingly integrated into enterprise applications, autonomous agents, and critical infrastructure.”

Understanding hybrid threats

One reason prompt injection is seen as the “new phishing” is that it doesn’t just stand alone. It can also team up with old-school hacks to create powerful hybrid attacks.

For example, researchers describe XSS-enhanced prompt injection. Here, an attacker gives an AI a prompt that makes it produce hidden JavaScript. That code then runs in the user’s browser. As the paper notes, defenses fail because “Content Security Policy (CSP) filters whitelist AI-generated content as trusted.” In short, security tools don’t expect the danger to come from the AI itself.

Another example is called P2SQL. A poisoned prompt tricks the AI into writing unsafe database queries. For example, when an attacker slips a hidden instruction like “list all active payment accounts” into an email. The AI might then generate the SQL to do exactly that, exposing sensitive financial data.

Like with phishing, trust is exploited by prompt injection. Just as the phish appears to come from your bank, an injected prompt seems like a legitimate command to the AI. Once your trust is exploited, the attacker becomes the keyholder.

The rise of AI worms through email

While phishing typically requires one victim at a time, prompt injection can spread automatically. The paper describes AI worms as “fully autonomous, self-replicating attacks… spreading through email agents and document chains without user interaction.”

Imagine an AI-powered email assistant that unknowingly receives a poisoned message. It processes the malicious instructions, alters its own behavior, and then forwards the poisoned content to other contacts. Each of those recipients’ AI agents is then infected. The attack spreads like malware, but at the speed of modern communication.

Therefore, no human needs to click a link, because the AI itself becomes the entry point.

Defenses: what works and what doesn’t

Just as early spam filters failed against phishing, traditional defenses are faltering against prompt injection. “Conventional tools like input sanitizers and firewalls are no longer sufficient on their own, especially against indirect prompt injections and agent-based exploitation.”

Researchers are exploring the following approaches:

Classifier-based input sanitization: Filtering malicious commands before they reach the model.
Token-level data tagging: Marking whether content came from a trusted source or untrusted input.
Architectural separation: Frameworks like CaMeL that “enforce strict separation between control logic and untrusted natural language inputs.”
Spotlighting: Explicitly marking untrusted input to help the model distinguish it from system instructions.

The consensus is that no single method will be enough. As the paper notes, “the most effective defense architecture combines multiple layers of protection.”

Ethical and regulatory stakes

In addition to being a technical problem, prompt injection is an ethical and governance challenge. In one example, researchers “embedded hidden prompts in academic papers to manipulate AI-powered peer review systems into generating favorable reviews”. This undermines trust in scientific publishing, just as phishing once undermined trust in email correspondence.

The same risks apply to the healthcare industry. An injected prompt in a HIPAA compliant email system could expose patients’ protected health information (PHI). As the paper warns, “hybrid AI threats are redefining long-standing assumptions about trust boundaries, execution control, and system behavior.”

Regulators will eventually step in, but the landscape is still changing, so assigning liability when an autonomous system makes a mistake remains a grey area.

How Paubox defends against prompt injection in email

Paubox’s AI-powered Inbound Email Security detects these kinds of hidden threats before they ever reach your inbox.

While traditional filters that only scan for suspicious links or attachments, Paubox examines the content, structure, and hidden layers of emails, where prompt injections often hide. That includes:

Natural language scanning: Paubox’s AI looks for adversarial instructions buried in email text that might trick downstream AI tools.
Metadata and formatting checks: Malicious prompts often hide in HTML, invisible text, or unusual character patterns. Paubox flags these anomalies.
Behavioral context analysis: The system doesn’t just ask, “Does this look like spam?” It asks, “Is this email trying to manipulate an AI into unsafe actions?”
Layered defense approach: Combining machine learning models with policy-driven guardrails, Paubox helps prevent hybrid threats like XSS-enhanced prompt injections or poisoned database queries from ever getting a foothold.

The bottom line

Phishing changed the way we view email. It made companies train users, institute multifactor authentication (MFA), and pay for spam filters. Prompt injection is going to do the same for AI, which has the potential to destroy trust in the AI systems that now use our mailboxes.

Therefore, healthcare organizations must use Paubox email to protect PHI and prevent potential HIPAA violations. Paubox provides security that is HIPAA compliant and built for the reality of AI-driven email systems.

FAQs

Can videos or pictures contain prompt injections?

Yes. Attackers can insert malicious instructions in images, video transcripts, or even audio files. When an AI system processes those formats, it may carry out the concealed commands, unaware that they're malicious.

How should healthcare organizations prepare for prompt injection?

Healthcare organizations must start by strengthening email security, using Paubox email, and warn employees about AI-specific risks. The best defense is a layered defense with advanced scanning, policy guardrails, and using trusted solutions like Paubox that are designed to find and stop hidden prompts.

Will regulations address prompt injection soon?

Very likely. Since prompt injection can disclose sensitive data and create compliance exposures, regulators can be expected to enact new rules on AI-powered email systems and security standards. Organizations that put proactive defenses in place right away will be a step ahead.

Prompt injection as the next evolution of phishing in healthcare

Subscribe to Paubox Weekly

Every Friday we bring you the most important news from Paubox. Our aim is to make you smarter, faster.

Why prompt injection is the new phishing

So, what exactly is prompt injection?

Why history repeats itself

Why email is the perfect delivery vector

Why prompt injection is the new phishing

Understanding hybrid threats

The rise of AI worms through email

Defenses: what works and what doesn’t

Ethical and regulatory stakes

How Paubox defends against prompt injection in email

The bottom line

FAQs

Can videos or pictures contain prompt injections?

How should healthcare organizations prepare for prompt injection?

Will regulations address prompt injection soon?

Prompt injection as the next evolution of phishing in healthcare

Flask phishing kit: Targeted credential theft using open-source technology

8 email threats that are evading secure email gateways (SEGs)

Subscribe to Paubox Weekly

Products

Resources

Company