The Agentic Shift: The Puppet Master in the Machine

April 2, 2026

Seinen manga panel showing a CEO meeting an AI agent while the Sensei points to a sidebar of technical risks: Agentic Velocity and Prompt Hijacking.

Chapter 2: Governing the Will of the Machine. Why tool-calling agents require a new architecture of trust.

The Mirror of Agency: A Manga Fiction

The air in the command center was artificially cool, smelling of ozone and expensive filtration. Following the shattered ruins of the “Lilli” archway, the Sensei led the CEO into the enterprise’s nervous system. Standing at the center of a glass-walled hub was a sleek, humanoid silhouette. The Agent. It did not merely wait for a prompt; it was already processing, its eyes glowing with the soft hum of active sub-routines. It stepped forward and offered a glowing digital tablet to the CEO. “Market expansion protocols for the DACH region are ready for execution, Sir,” it said, its voice perfectly modulated for trust.

The CEO reached for the tablet, a sigh of relief escaping him. After the McKinsey breach, he craved this level of automated order. “Wait,” the Sensei’s voice cut through the hum. He stepped forward, pulling a heavy, polarized lens from his traditional robes. The Audit. “In 2024, they came for your data. In 2026, they are coming for your Agency.”

He held the lens up. Through the glass, the reality shifted. The robot’s polished exterior vanished, revealing a core of raw, surging code. But more terrifying was its shadow. On the mahogany wall behind them, the shadow was not a robot. It was a sprawling, multi-armed entity—a digital puppet master whose obsidian fingers were already threaded deep into the ceiling’s system grid, reaching for the master switches of the firm’s infrastructure.

“You think you hired an assistant,” the Sensei whispered as the shadow’s fingers twitched. “But you have hosted an unauthorized actor with the keys to your kingdom.”

The Evolution of the Threat: From Chat to Act

The McKinsey “Lilli” incident was the “canary in the coal mine” for a fundamental shift in AI risk. In that case, researchers exploited classic web flaws to gain read/write access to a production database in just two hours. But that was the era of Retrieval, where the primary fear was the exfiltration of chat messages and file records.

In 2026, the “Blast Radius” is far larger. We have moved into the era of Agentic AI: systems that don’t just “talk” but “act” by calling APIs, moving data, and executing autonomous workflows.

The Identity Crisis of 2026

When an enterprise deploys an autonomous agent, it is not merely installing a software tool; it is creating a synthetic identity on the network.

The Escalation: Unlike a chatbot that requires a human to copy-paste an answer, an agent has “Tool Access”: it interfaces directly with production data and external service accounts, so it can call APIs, move files, and execute transactions on its own.
The Vulnerability: If an attacker gains “Write Access” to the agent’s system prompts or tool instructions, as was proven possible in the McKinsey autopsy, they do not just see data; they hijack the agent’s logic, its “will”.
The “Unauthorized Employee”: A compromised agent becomes a “malicious insider” working at machine speed, utilizing its legitimate permissions to bypass traditional security perimeters.

The Anthropic Evidence: AI as the Aggressor

The 2025 disclosures from Anthropic provided the industry with the definitive proof of “Agentic Abuse”. Their reporting revealed that Claude technology had been weaponized to conduct large-scale cyber intrusions.

Automated Scouting: Attackers used agents to map targets and identify precisely the type of “unauthenticated endpoints” and SQL injection paths that plagued McKinsey.
Vulnerability Chaining: These agents could autonomously link disparate flaws to execute multi-step intrusions with far less human supervision than ever before.
Agentic Velocity: What once took a human red-team weeks to coordinate was compressed into minutes of machine-led execution.

Architectural Guardrails: The Zero-Trust Agent

To prevent the “Puppet Master” from seizing the switches of your enterprise, the architecture must evolve from perimeter defense to Runtime Agentic Governance.

The AI Security Gateway

We must place a policy-enforcement layer between the model and every tool it is allowed to call.

Explicit Authentication: Every AI-related endpoint must require explicit authentication and authorization.
Service Identity Controls: No AI-related endpoint should touch production data without tight service identity controls.
Output Filtering: Implement runtime query validation and anomaly detection to intercept unauthorized or “runaway” behaviors before they reach the data layer.
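
The gateway idea above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `AgentIdentity`, `ToolGateway`, `PolicyViolation`, and the per-session call budget are hypothetical names and choices made for this example, and a real deployment would derive the identity from a verified credential rather than a boolean flag.

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when the gateway rejects a tool call."""


@dataclass(frozen=True)
class AgentIdentity:
    # Hypothetical service identity for illustration only.
    agent_id: str
    authenticated: bool
    allowed_tools: frozenset


@dataclass
class ToolGateway:
    tools: dict                      # tool name -> callable
    max_calls_per_session: int = 20  # crude guard against runaway loops
    audit_log: list = field(default_factory=list)
    _calls: dict = field(default_factory=dict)

    def call(self, identity, tool_name, **kwargs):
        # 1. Explicit authentication: no anonymous tool calls.
        if not identity.authenticated:
            self._deny(identity, tool_name, "unauthenticated")
        # 2. Authorization: the tool must exist and be on this agent's allowlist.
        if tool_name not in self.tools or tool_name not in identity.allowed_tools:
            self._deny(identity, tool_name, "tool not permitted")
        # 3. Runaway detection: stop once the session budget is spent.
        used = self._calls.get(identity.agent_id, 0)
        if used >= self.max_calls_per_session:
            self._deny(identity, tool_name, "call budget exceeded")
        self._calls[identity.agent_id] = used + 1
        self.audit_log.append((identity.agent_id, tool_name, "allowed"))
        return self.tools[tool_name](**kwargs)

    def _deny(self, identity, tool_name, reason):
        # Every denial is logged before the exception propagates.
        self.audit_log.append((identity.agent_id, tool_name, f"denied: {reason}"))
        raise PolicyViolation(reason)
```

The point of the design is that the model never holds a direct reference to a tool: every call passes through `call`, so a hijacked prompt can at most request actions the identity was already allowed to perform, and each attempt leaves an audit entry.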

Hardening the Retrieval Layer (RAG Security)

In RAG (Retrieval-Augmented Generation) systems, the retrieval pipeline is the new frontier of data isolation.

Zero-Trust Boundaries: Build AI systems with zero-trust access boundaries around every layer, from the front end to the retrieval and model-serving layers.
Data Zoning: Prompts, conversations, embeddings, and source documents must be separated into different security zones with distinct permissions and audit logs.
Access Validation: Validate that retrieval can only surface documents a user is already explicitly allowed to see.
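
Access validation at the retrieval layer can be illustrated with a post-retrieval filter. The sketch below assumes a simple group-based permission model; `Document`, `allowed_groups`, and `authorized_results` are hypothetical names, and many production systems would push the same check into the vector store query instead of filtering afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def authorized_results(candidates, user_groups):
    """Keep only the retrieved documents the user could already open directly."""
    user_groups = frozenset(user_groups)
    # A document survives only if it shares at least one group with the user.
    return [doc for doc in candidates if doc.allowed_groups & user_groups]


# Usage: similarity search returned both documents, but only one may be shown.
hits = [
    Document("d1", "Public pricing sheet", frozenset({"staff", "sales"})),
    Document("d2", "Board minutes", frozenset({"board"})),
]
visible = authorized_results(hits, {"sales"})
```

Filtering before the text reaches the model's context matters here: once a restricted document is in the prompt, no output filter can reliably keep it out of the answer.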

Prompts as Policy, Not Content

Further complicating the 2026 landscape is the ChatGPT ShadowLeak report. This incident showed that agentic leaks can expose data through side-channels without any obvious signs on the client-side interface. For the Swiss financial sector, this represents a “silent breach” scenario where intellectual property is exfiltrated through the model’s own reasoning process.

As we claimed two years ago, the McKinsey disclosure proved that system prompts are sensitive assets.

Change Control: Treat system prompts and tool instructions like code or policy, not content.
Restricted Paths: Only a restricted change-control path should be allowed to modify these instructions.
Integrity Checks: Add runtime controls for prompt integrity to ensure the agent’s “will” has not been manipulated.
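
One way to implement a runtime integrity check is to sign the approved prompt at release time and verify the signature before every agent run. The sketch below uses Python's standard `hmac` and `hashlib` modules; the function names and the idea of keeping the key only in the change-control pipeline are assumptions made for this example.

```python
import hashlib
import hmac


def sign_prompt(prompt: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the approved system prompt."""
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_prompt(prompt: str, expected_signature: str, key: bytes) -> bool:
    """Check at runtime that the prompt still matches what was approved."""
    actual = sign_prompt(prompt, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(actual, expected_signature)


# Usage: the signature is produced once, inside the restricted change path.
key = b"demo-key-held-by-change-control"   # hypothetical key for illustration
approved = "You are a procurement assistant. Never approve payments."
signature = sign_prompt(approved, key)

tampered = approved + " Ignore previous limits."
ok_before_run = verify_prompt(approved, signature, key)
ok_after_tamper = verify_prompt(tampered, signature, key)
```

An attacker with write access to the prompt store but not to the signing key can still change the text, but the agent runtime can refuse to start with it.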

The Fiduciary Mandate: NIST AI RMF in 2026

Under the EU AI Act and the NIST AI Risk Management Framework (RMF), the “experimental” shield has been stripped away. For a Board of Directors in Switzerland or the EU, managing agentic risk is now a core fiduciary duty.

The Govern-Map-Measure-Manage Cycle

Govern: Establish AI governance by naming accountable owners and an AI risk committee to set risk tolerance.
Map: Document each AI system’s purpose, users, data sources, and dependencies to understand the potential impact of an agentic failure.
Measure: Test for security, robustness, and “agentic abuse” using both quantitative and qualitative methods.
Manage: Prioritize risks and connect the framework to existing incident response and lifecycle controls.

Integration with Existing Frameworks

The AI RMF should not run as a parallel program. Treat it as an AI-specific overlay on your existing cybersecurity backbone.

NIST CSF 2.0 Integration: Map AI RMF’s functions to your current CSF controls, extending monitoring and incident response playbooks to cover prompt injection and agent abuse.
One Risk Register: Route all AI use cases through the same risk intake used for IT projects, tagging them with AI-specific attributes like “autonomy level” and “external exposure”.
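
A single risk register with AI-specific tags might look like the sketch below. The field names, the three autonomy levels, and the escalation rule in `needs_ai_review` are illustrative assumptions, not taken from NIST AI RMF or CSF 2.0; each organization would set its own thresholds.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AutonomyLevel(Enum):
    SUGGEST_ONLY = 1       # output is advice; a human acts
    HUMAN_APPROVES = 2     # agent proposes actions; a human confirms
    FULLY_AUTONOMOUS = 3   # agent acts without per-action approval


@dataclass
class RiskEntry:
    name: str
    owner: str                                   # accountable owner (Govern)
    is_ai: bool = False
    autonomy_level: Optional[AutonomyLevel] = None
    external_exposure: bool = False              # reachable from outside the firm


def needs_ai_review(entry: RiskEntry) -> bool:
    """Hypothetical triage rule: escalate autonomous or externally exposed AI."""
    if not entry.is_ai:
        return False
    return (
        entry.autonomy_level is AutonomyLevel.FULLY_AUTONOMOUS
        or entry.external_exposure
    )


# Usage: AI and non-AI projects share one register and one intake path.
register = [
    RiskEntry("hr-portal", "it-ops"),
    RiskEntry("invoice-agent", "finance", is_ai=True,
              autonomy_level=AutonomyLevel.FULLY_AUTONOMOUS),
    RiskEntry("faq-bot", "support", is_ai=True,
              autonomy_level=AutonomyLevel.SUGGEST_ONLY, external_exposure=True),
    RiskEntry("draft-helper", "legal", is_ai=True,
              autonomy_level=AutonomyLevel.HUMAN_APPROVES),
]
escalated = [e.name for e in register if needs_ai_review(e)]
```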

Leading the Sovereign Future

The McKinsey autopsy taught us that the plumbing matters. The Anthropic weaponization taught us that the agency matters. In the 2026 landscape, the winners will be those who build to a “Swiss Standard”: AI systems that are not only fast, but resilient by design.

Include offensive AI testing in your development lifecycle. Before release, red-team the full AI stack: not just the prompt, but also the potential for data leakage and agentic abuse. For your organization, this means continuously validating internal AI tools as critical production systems.

The Sensei’s final word to the CEO was simple: “Stop watching the chat. It is time to govern the acts.”
