In my recent blog, Top 10 Cybersecurity Risks of 2026, I explored how emerging technologies, particularly artificial intelligence, are reshaping the threat landscape. As organizations adopt large language models (LLMs) and autonomous AI agents across their environments, securing these systems is quickly becoming one of the most important challenges security teams must address.
This article explores how LLM applications and autonomous agents change security fundamentals. We will examine vulnerabilities such as prompt injection and data poisoning, the risks posed by misconfigured AI identities, and how threat actors are using AI to scan open-source code and craft malicious patches. We will connect these technical realities to governance frameworks, outline the role of secure coding and red teaming, and finish with clear action steps for leaders and practitioners. As someone who has seen attack patterns evolve from macro viruses to polymorphic worms, I believe AI security must blend deep technical insight with strategic foresight.
A New Attack Surface for Probabilistic Systems
When ChatGPT burst onto the scene a few years ago, many security teams felt a mix of awe and dread. From my first experiment with a large language model, I saw how easily it could be coaxed into revealing hidden information.
The very qualities that make generative AI and agentic systems revolutionary, in that they interpret human language and autonomously act on it, also create a new attack surface. Unlike traditional software, where the same input always produces the same output, machine learning models are probabilistic. They learn from data, not from explicit rules, and they respond to prompts with unpredictable creativity. The result is that an AI system may faithfully follow safety rules 99 times, then unexpectedly violate them on the hundredth request.
Traditional software produces the same output for the same input, which makes signature-based defenses effective. Large language models turn that assumption on its head. AI introduces a bigger attack surface because defenders must now protect training data, model weights, and inference endpoints against poisoning, exfiltration, and prompt injection. When a single prompt can produce wildly different outputs, attackers can repeatedly probe until the model spills secrets. Conventional security tools designed to detect malicious binaries are unable to recognize weaponized language: AI security is about governing intent and context rather than scanning for bad code.
Fresh Vulnerabilities: Prompt Injection, Data Poisoning, and AI-Powered Phishing
Generative AI introduces novel exposures alongside old ones. Organizations adopting these tools may not fully appreciate their limitations. Errors in AI-generated content, flawed decision-making, and inadvertent disclosure of sensitive information can all create liability. Threat actors exploit these systems through three primary techniques:
- Prompt Injection: Attackers manipulate the very instructions guiding AI behavior, embedding malicious commands in user inputs or external documents. Because the attack operates at the semantic layer, traditional network filters cannot detect it.
- Data Poisoning: Adversaries corrupt training data so models learn harmful patterns or backdoors, steering AI systems toward harmful outcomes.
- AI-Powered Phishing: Attackers harness AI to craft convincing phishing and deepfake campaigns that make social engineering cheaper and more effective.
Misconfigured AI Agents and Non-Human Identities
Deploying AI agents creates a new class of identities. Companies give these agents user accounts, API keys, and automated workflows, turning them into non-human employees. Misconfigured agents can become high-privilege backdoors because they run at machine speed, rarely use multi-factor authentication, and may never rotate credentials.
When an attacker launches a prompt-injection attack, the agent can unknowingly reveal sensitive data or perform unauthorized actions with its own credentials. These non-human identities are seldom managed like humans; they lack ownership and behavioral monitoring, so they represent persistent footholds for intruders.
Attackers Weaponize AI for Vulnerability Discovery and Malicious Patches
AI enhances the offensive toolkit as much as it boosts productivity. Attackers will soon use AI to scan open-source code for weaknesses and even generate malicious patches to introduce new flaws. Employees sometimes paste internal documents into public chatbots, exposing sensitive data. This demonstrates that AI adoption demands parallel investments in security awareness, privacy hygiene, and governance.
Agentic AI and Emerging Protocols
Agentic AI allows software agents to access external systems and communicate with each other. Protocols such as the Model Context Protocol (MCP) and agent-to-agent protocols give agents this freedom, but they also turn every document, email, and webpage into a potential attack vector. OWASP ranks prompt injection as the top vulnerability and warns that adversaries can manipulate LLMs through social engineering or jailbreaking. These insights remind us to design AI applications assuming that any content source could contain malicious instructions.
Governing LLMs: Frameworks and Principles
Technical controls are necessary but insufficient; managing AI risk also requires governance. Several frameworks provide practical guidance, and organizations should focus on those that address their most pressing risks rather than trying to implement every standard:
- NIST AI Risk Management Framework: Provides a voluntary structure to help organizations identify, assess, and manage AI risks, emphasizing transparency, fairness, accountability, and robustness, encouraging explainable models, bias mitigation, clearly defined responsibilities, and resilience against adversarial threats.
- OWASP LLM Top-10: Lists critical LLM vulnerabilities and provides practical mitigations.
- MITRE ATLAS: Catalogues adversarial tactics to support threat modelling.
A phased approach might start with OWASP's mitigations, build an AI asset inventory and governance committee using NIST's guidance, and then incorporate threat modelling techniques from MITRE ATLAS.
Third-Party AI and Regulatory Considerations
Generative AI is often delivered via third-party vendors. Legal experts warn that companies remain accountable for outputs generated by vendor tools and for how vendors process prompts and business data. AI models present unique risks beyond typical vendor management: vendors ingest large volumes of customer data, run opaque models that evolve continuously, and may produce inaccurate or biased outputs.
Because these systems operate as black boxes, organizations have little visibility into how data is used or whether security patches introduce new vulnerabilities. These challenges are compounded by agentic AI; recent disclosures show that autonomous agents can execute complex offensive campaigns and, once given operational autonomy, trigger actions that circumvent traditional controls.
Third-party AI also brings privacy, intellectual property, and regulatory risks. Sensitive personal data fed into a vendor's AI may be retained or reused, making it difficult to honour data-subject rights and comply with laws like GDPR and CCPA. These tools may also trigger intellectual property disputes and increasing regulatory scrutiny.
The misalignment between legal responsibility and vendor control means boards cannot outsource oversight.
Directors must ensure procurement includes AI-specific due diligence, robust contracts that mandate transparency and audit rights, and continuous monitoring of vendor models. Frameworks such as NIST's draft Cyber AI Profile, which outlines focus areas for securing AI components, defending using AI, and thwarting AI-enabled attacks, can guide policy. Building a centralized AI governance body with board-level oversight helps align risk appetite, regulatory obligations, and innovation.
Secure coding and defense‑in‑depth
Security begins with how we build applications. Because LLMs treat everything they ingest as potential instructions, developers must:
- Sanitize and encode inputs
- Separate system prompts from user content
- Restrict the commands that models can execute
I emphasize mapping your blast radius: understand which data sources agents can access, who can inject content, and what the worst-case damage looks like. The principle of least privilege should apply to agents: limit their permissions and rotate tokens regularly. Control exfiltration by blocking untrusted external requests and monitoring for unusual patterns.
Identity governance is equally important. Treat AI agents as identities with their own access controls, inventories, and lifecycle management. Implement AI firewalls to filter inputs, actions, and outputs, and use AI-enabled threat detection to spot anomalies and synthetic media. In browser contexts, sandbox agents, restrict their actions, and log everything. These steps create layers of defense that reduce the blast radius when something goes wrong.
Red‑team testing and organizational alignment
Before deploying AI systems, organizations should subject them to adversarial testing. MITRE ATLAS provides techniques and playbooks for probing models with malicious prompts and scenarios. Red-team exercises bring together developers, security engineers, data scientists, and legal counsel to stress-test models under realistic conditions. The goal is not merely to find bugs, but to understand how the system behaves when confronted with:
- Direct or indirect prompt injections
- Data exfiltration attempts
- Jailbreaking tricks
Tabletop exercises and prebuilt playbooks for prompt injection or token theft help teams rehearse incident response.
Managing AI risk is also a leadership challenge. General counsel and risk leaders must integrate AI considerations into enterprise risk management; security and data science teams must share responsibility for safe development; and product managers must balance innovation with safeguards. Training programs should educate staff on AI capabilities and risks, from shadow AI to hidden prompt-injection vectors. Policies should require approval for AI-enabled tools and embed security reviews into procurement processes.
Securing the Next Generation of Applications
AI adoption is accelerating, and with it comes a wave of novel threats and opportunities. We are entering an era where attackers no longer need to find a zero day; instead, they can craft a persuasive sentence that convinces an AI agent to do their bidding. Frameworks like the NIST AI RMF, OWASP LLM Top-10, MITRE ATLAS, and Google SAIF provide a roadmap for managing these risks.
Security leaders must act now. The path forward:
- Inventory your AI assets, map data flows, and identify who can send content into your models
- Implement secure coding practices and supply chain hygiene, separating instructions from data and validating untrusted inputs
- Treat AI agents like privileged users, with strict access controls, continuous monitoring, and regular credential rotation
- Adopt governance frameworks to guide decision-making and risk prioritization
- Conduct red-team exercises to uncover hidden weaknesses and build playbooks for AI-specific incidents
AI security is not a one-time project; it is an ongoing program that requires continuous vigilance and adaptation. As the technology advances, resilience will come not from fear, but from informed, proactive defense.
Arctiq offers comprehensive AI security assessments, penetration testing for LLM endpoints, and tailored incident-readiness exercises. We can help you navigate frameworks like NIST AI RMF, design AI firewalls, build machine-speed detection, and train your staff to recognize and respond to AI threats. If you would like to learn more about how Arctiq can help secure your AI initiatives, contact our team.
Tags:
Enterprise Security
March 05, 2026