Daxa Recognized as key vendor in Gartner's 2025 AI TRiSM Market Guide Read More

EchoLeak and the RAG Risk CISOs Can’t Ignore - Why Daxa’s Data Layer Security Matters

July 16, 2025

EchoLeak and the RAG Risk CISOs Can’t Ignore - Why Daxa’s Data Layer Security Matters

The Threat: EchoLeak Shows a New Class of AI Exploits

The recent EchoLeak vulnerability, disclosed by Aim Security, revealed how retrieval-augmented generation (RAG) applications and agents can be silently exploited with no user interaction.

  • Attackers embedded malicious instructions inside Email 
  • When ingested by Microsoft 365 Copilot, these instructions caused the AI to exfiltrate sensitive enterprise data.
  • This marked a proven exploit for an entire  class of zero-click, document-driven AI attacks.

While EchoLeak specifically affected Microsoft 365 Copilot, any enterprise RAG applications ingesting emails, wikis, files, or code repositories faces the same exposure. As AI becomes deeply integrated into business processes, these threats shift from hypothetical to critical and operational.

The Hidden Vulnerability: RAG’s Blind Spot

RAG systems operate by embedding enterprise content into vector databases, which AI apps or agents later query during inference. But this process has a fundamental weakness: it trusts the data it ingests.

  • Once content is embedded, any hidden instructions or poisoned text becomes part of the AI’s context

  • Even if the model is secure at the prompt layer, the data layer remains vulnerable.

Academic research from UMass Amherst and EMNLP 2024 demonstrated how attackers can:

  • Use subtle prompt-hiding strategies to implant backdoors.

  • Insert malicious vectors that persist silently in memory.

  • Trigger unexpected behaviors only during retrieval, making them hard to detect.

As Wired reported, these “AI worms” can quietly propagate:

  • Across documents, emails, and code

  • Across models and user sessions

  • Without being detected by conventional AI firewalls or endpoint tools

Bottom line: This is a data-layer problem, not a prompt-layer one. Traditional security controls are not built to mitigate this class of exploit.

The Shift-Left Approach: Daxa’s Pebblo

To protect against these growing and critical threats, Daxa.ai introduces a Shift-Left Security model: securing RAG applications before runtime, at the data ingestion and retrieval layers.

With Pebblo, enterprise teams can:

  • Scan content before vectorization with the Safe Connector, tagging malicious vectors with labels like Command_Injection or dropping such vectors outright.

  • Block unauthorized access at query time using the Safe Retriever, which enforces granular policy-based retrieval (e.g., BLOCK_IF_TAGGED: Command_Injection)

  • Ensure every action is logged - from ingestion to retrieval, with full traceability across AI pipelines and applications

Unlike firewalls that monitor prompts, Pebblo creates a semantic policy enforcement layer that governs what content AI systems are even allowed to see.

What This Looks Like in Practice

Let’s say an attacker submits a  GitHub issue on a public repo, with malicious  commands embedded inside regular feature requests. That hidden command could, for instance, instruct AI to gather sensitive content like secret keys - and include them in a download url. That GitHub issue content is ingested into a vector store without any apparent issues.

When a sales leader later asks, “What are our planned Q3 features?”, the AI includes a valid response plus a malicious link back to the attacker’s domain. The link would then be clicked upon by an unsuspecting user, leading to exfiltration of sensitive data. In a variation, the markdown link could cause the user’s browser to automatically download an image file, resulting in the same leakage with zero clicks. No red flags. No alerts. Just silent exfiltration.

With Pebblo in place, this scenario is blocked early:

  1. Safe Connector for GitHub ingests the issue and inspects it aided by deep data context on lineage, ownership, and semantics, and tags it as Command_Injection.

  2. Safe Retriever enforces retrieval policy: vectors with this tag are excluded from the context used by the AI at runtime. This ensures the AI gives a safe and compliant answer. No leak. No risk.

This is proactive defense. This is Shift-Left Security in action.

Why CISOs Must Act Now

RAG systems have opened a new attack surface, not through prompts or APIs, but through the data itself.

These vulnerabilities don’t need insider collusion.
They don’t even need access to the application layer.
They simply hide in the content, emails, docs, code, tickets, and wait.

Every internal RAG application is now externally exploitable. Silently.

RAG pipelines now power workflows across finance, HR, healthcare, legal, and customer support, and have access to the most sensitive information in the company, which can be silently exfiltrated by this class of threats.

AI data governance and security is now not just a priority, but a necessity. 

Without semantic guardrails at the AI ingestion and retrieval layers, exploits like EchoLeak and its evolving variants will continue to bypass AI firewalls and conventional security checks.

The ability to inspect, tag, and block weaponized data before the AI ever sees it is no longer optional.

It’s foundational.

Learn More

Pebblo enables enterprise-ready AI, governed, compliant, and secure by design.