Home
LLM: From ally to trap, managing the risks of adversarial attacks in AI runtime

LLM: From ally to trap, managing the risks of adversarial attacks in AI runtime

November 24, 2025

The integration of large language models (LLMs) is rapidly transforming the landscape of business applications, promising massive gains in productivity and performance. However, this rapid adoption also introduces a new generation of critical vulnerabilities that operational and strategic teams must address head-on.

The Squad Group and its integration and Managed Security Service Provider (MSSP) subsidiary, Squad Cybersolutions (formerly Newlode), are committed to shedding light on these emerging challenges. Drawing on this expertise, we organized our "Techlab" demonstration sessions, as we have done for the past eight years at the Cybersecurity Conference in Monaco.

The Lab "AI: From Ally to Trap – When Its Use Backfires," presented by Sébastien BOULET (Principal Engineer and DevSecops Team Lead at Squad Cybersolutions), was a huge success in 2025. It was requested by CESIN to be rebroadcast at the Cyber Campus in La Défense on November 21, 2025. This session highlighted the real dangers when AI, supposed to be an ally, is misused at the very heart of its execution environment, the AI runtime.

The Age of LLMs and the Proliferation of Risks

The Transformer revolution, which began around 2017, led to the emergence of Large Language Models (LLMs), which work by predicting the next token. These models, although powerful, are not based on true understanding, but on statistical patterns.

Adoption is growing exponentially: 47% of companies reported building AI-oriented applications in 2025, andPalo Alto Networks'predicts 100,000 AI applications for businesses by 2026. At the same time, the proliferation of available models, with more than 2.1 million on Hugging Face alone (early October 2025), is multiplying the vectors of threat.

LLMs and the systems that integrate them are inherently vulnerable to adversarial attacks. These vulnerabilities are mapped by dedicated frameworks such as MITRE ATLAS.

LLM misuse: injection and poisoning

Attacks against LLMs exploit their inability to reliably distinguish legitimate instructions from malicious content, even when models have been trained for safety.

1. Prompt Injection

Prompt injection is ranked as the number one risk (LLM01:2025) byOWASP for AI applications. It allows an attacker to use intentionally crafted inputs to alter the behavior or expected output of the model.

It manifests itself in two main forms:

Direct prompt injection: The attacker inserts malicious instructions directly into the user interface (chat, search field).
Indirect prompt injection (IPI): This is the most insidious and security-threatening form. Malicious instructions are embedded in external sources (documents, emails, web pages, API responses) that the LLM ingests and processes as context. IPI bypasses traditional input validations and executes commands with system privileges. These instructions can be hidden in HTML comments, invisible text (white on white), or metadata, making them undetectable to the human eye. IPI can turn an LLM into an intrusion gateway, enabling data leakage or unauthorized actions to be performed via the APIs to which it has access.

2. Data Poisoning

Poisoning is an attack where the malicious actor intentionally manipulates the data used to train or inform AI systems. Even slight alterations to the data can skew the model's behavior.

Poisoning can occur during the data movement process (data poisoning) or directly in the model (model poisoning), particularly in federated learning environments.
Types of attacks include targeted attacks (causing the model to ignore a specific type of malware) and untargeted attacks (degrading overall performance), as well as backdoor attacks that insert hidden triggers into the system.

Consequences and case studies

The risks associated with adversarial attacks are not theoretical; they result in security breaches, financial losses, and erosion of trust.

Data contamination: It has been shown that 0.001% of corrupted tokens are enough to invalidate a model, potentially spreading misinformation. This contamination persists because the results of LLMs are then used to train future models.
Loss of control and data leakage: An unsecured AI infrastructure is a current vulnerability. Attacks can lead to the leakage of sensitive information (privacy violations), the amplification of misinformation, or operational disruption.
AI agent hijacking (persistent memory): AI agents with long-term memory (used to retain context between sessions) are a new attack surface. An indirect prompt injection can silently poison the agent's memory by inserting malicious instructions that persist. These instructions are then injected into the orchestration prompts (system instructions) of future sessions, amplifying the potential impact and enabling the silent exfiltration of conversation history.
Example of financial impact: In December 2023, a dealership's Chevrolet chatbot was manipulated by a simple prompt injection to agree to sell a high-value vehicle ($76,000) for only $1.

Our response: Securing the AI Runtime

Faced with this new frontier of risks, the Squad Group positions itself as an expert capable of ensuring robust and proactive cybersecurity. Our mission, Securing Together, emphasizes collective intelligence and the ability to integrate cutting-edge solutions.

The Techlab "AI: From Ally to Trap" demonstrated in concrete terms how these attacks operate directly in the AI runtime, where trust takes precedence over vigilance.

Squad Cybersolutions, as an expert integrator and MSSP, offers a defense-in-depth approach to these threats, based on essential security practices, including securing AI runtime.

To strengthen organizations' cyber posture, mitigation strategies must be multi-layered:

Ensuring model protection (AI Runtime): The focus is on securing the AI execution environment. Partner solutions, such as those from Palo Alto Networks highlighted during Techlab, strengthen each link in the AI runtime:
- Audit of dependencies and configurations.
- Execution controls and real-time instrumentation.
- Behavioral monitoring to counter threats.
- Using solutions such as Prisma AIRS to detect and block prompt attacks and prevent data leaks in real time.
Rigorous filtering and validation: It is imperative to sanitize content before it is fed into the LLM, removing hidden HTML tags, metadata, and off-screen text, which attackers use to conceal instructions.
Context delimitation: Prompts must be designed with clear boundaries to separate trusted system instructions from untrusted external content, in order to reduce the risk of the model obeying hostile text.
Principle of least privilege: Restrict the operational capabilities of the LLM, assign it API keys with minimal permissions, and require human validation for sensitive actions.

In conclusion, modern cybersecurity requires recognizing that LLMs, while powerful allies, represent unprecedented attack vectors. The Squad Group and Squad Cybersolutions are committed to supporting security teams in mastering these new environments, transforming isolated tools into a coordinated defense. Through our expertise and our Techlabs, we are working to ensure that innovation in AI is synonymous with security, not vulnerability.