Issue #007.

Six issues in, the pattern of what Sentinel Base tracks is becoming specific: not the theoretical risks of AI, but the gaps between what your tools report and what is actually happening.

This week, two of the four signals are about the same gap in a different domain — network security. Your intrusion detection system sees network events. Adversarial campaigns do not unfold as events. They unfold as phases. Your encrypted traffic classifier sees byte patterns. Adversarial evasion tools do not change the behavior — they change the bytes. In both cases, the detection tool is measuring the wrong thing, and the attacker knows it.

The other two signals complete the picture: a robot navigation architecture that gives agents the ability to recognize when their strategy is failing, and a benchmark probe showing that a core dimension of conversational AI competence, interaction awareness, goes systematically unmeasured by every evaluation suite in production use.

Two industry developments: OpenAI's capital deployment begins, and the open-weight model baseline just shifted in ways that change the edge deployment calculus for Physical AI.

🔴 SIGNAL #1 — AI AGENTS · CRITICAL

Your Encrypted Traffic Classifier Can Be Defeated by Byte Padding. The Physics Cannot

arXiv:2604.02149 · AEGIS · Thermodynamic State Space Models · 2026-04-02 · cs.CR · cs.AI

TLS 1.3 broke traditional deep packet inspection. The security community responded by deploying ML-based traffic classifiers that infer behavior from encrypted packet patterns — byte distributions, timing, flow characteristics — without decrypting the payload. This was the right move. The deployment has a critical vulnerability: the classifiers are Euclidean. The attacks are not.

Pre-padding attacks append crafted byte sequences that shift the payload distribution the classifier sees while preserving the underlying traffic behavior. The malicious C2 channel continues operating. The classifier sees a different distribution and fails. Against ET-BERT — currently the leading Transformer-based encrypted traffic classifier — pre-padding attacks reduce accuracy from production-grade to 25.68%. A system that was 98%+ accurate becomes marginally better than random.

AEGIS replaces byte pattern analysis with flow physics. Six-dimensional thermodynamic metrics — entropy variance, inter-packet timing, burst thermodynamics — embedded in a Poincaré manifold. The geometric structure captures anomaly signatures that byte morphing cannot reach: a C2 tunnel's behavioral signature manifests as thermodynamic variance that persists regardless of what the attacker does to the payload. 99.5% true positive rate. 262 microsecond inference latency. eBPF-based zero-copy packet harvesting enables line-rate deployment.
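
A toy sketch makes the asymmetry concrete. The features below (payload entropy statistics and inter-packet timing) are illustrative stand-ins for AEGIS's actual six-dimensional thermodynamic metrics, not the paper's implementation, but they show why padding moves the byte-level signal while leaving the flow's timing physics untouched:

```python
import math
from collections import Counter

def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of a payload's byte distribution, in bits/byte."""
    counts = Counter(payload)
    n = len(payload)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def flow_features(packets):
    """Illustrative flow features: mean payload entropy, entropy variance,
    and mean inter-packet gap. Stand-ins for AEGIS's actual metrics."""
    entropies = [byte_entropy(p) for p, _ in packets]
    mean_e = sum(entropies) / len(entropies)
    entropy_var = sum((e - mean_e) ** 2 for e in entropies) / len(entropies)
    gaps = [t2 - t1 for (_, t1), (_, t2) in zip(packets, packets[1:])]
    return mean_e, entropy_var, sum(gaps) / len(gaps)

# Toy C2 beacon: uniform 64-byte payloads fired every 5 seconds.
beacon = [(bytes([i % 7] * 64), 5.0 * i) for i in range(6)]
# Pre-padding morphs the byte distribution but not when the beacon fires.
padded = [(p + bytes(range(256)) * 2, t) for p, t in beacon]

print(flow_features(beacon))   # timing feature: 5.0
print(flow_features(padded))   # byte features shift; timing still 5.0
```

Padding raises the payload entropy but cannot alter the 5-second beacon cadence: any feature derived from when packets move, rather than what they contain, survives the morphing.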

The adversarial asymmetry matters: an attacker who knows you are using payload byte classification can bypass you with a padding library. An attacker facing thermodynamic invariants cannot change the physics of how their C2 traffic actually behaves.

For Physical AI infrastructure: cloud-connected robots, medical wearables, and autonomous systems generate distinct traffic flow signatures. A compromised Physical AI device communicating with a C2 server produces thermodynamic anomalies that AEGIS detects even when the payload is fully encrypted and adversarially morphed. The detection is at the behavioral physics layer — which the attacker cannot spoof without also changing the behavior they are trying to hide.

What to implement:

  • Audit current encrypted traffic analysis: if it classifies payload byte distributions, pre-padding attacks are a live threat against it right now;

  • Test current classifiers against pre-padding adversarial samples — a 73-point accuracy drop under attack is not an edge case, it is a known vulnerability class;

  • Evaluate AEGIS-style physics-based flow analysis as a defense layer: thermodynamic invariants resist the attack classes that defeat Euclidean classifiers;

  • For Physical AI networks: add thermodynamic variance detection for zero-day C2 channel identification alongside signature-based IDS

🟠 SIGNAL #2 — AI AGENTS · HIGH

Your IDS Sees Events. The Attacker Is Running a Campaign. PARD-SSM Sees the Campaign

arXiv:2604.02299 · PARD-SSM · Probabilistic Regime Detection · 2026-04-02 · cs.CR · cs.AI

Modern adversarial campaigns unfold as structured sequences: Reconnaissance, Lateral Movement, Intrusion, Exfiltration. Each phase is often individually indistinguishable from legitimate network traffic. Signature-based IDS catches known attack signatures within a phase — but misses zero-day variants. Deep learning anomaly detectors produce scores without phase attribution — an alert without stage context cannot distinguish a false positive from the lateral movement phase of an advanced persistent threat. Neither architecture sees the campaign as a campaign.

PARD-SSM treats attack phases as hidden regimes in a switching state-space model. The framework detects transitions between phases probabilistically, attributes network activity to specific attack stages, and — most significantly — provides predictive alerts 8 minutes before the intrusion phase begins. 98.2% F1 score. Sub-millisecond inference latency. Real-time deployable.
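
The regime-detection mechanics can be sketched as a forward filter over hidden phases. Every number here is invented for illustration (PARD-SSM's switching state-space model is considerably richer than a four-state chain, and its prediction horizon is in wall-clock minutes, not filter steps), but the shape of the computation is the same: maintain a belief over phases, and raise a predictive alert when the projected intrusion mass crosses a threshold:

```python
import numpy as np

PHASES = ["recon", "lateral", "intrusion", "exfil"]

# Hypothetical left-to-right phase dynamics: a campaign mostly stays
# in its current phase and occasionally advances to the next one.
T = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.00, 0.85, 0.15, 0.00],
    [0.00, 0.00, 0.90, 0.10],
    [0.00, 0.00, 0.00, 1.00],
])

def filter_step(belief, likelihood):
    """One forward-filter update: predict through T, weight by the
    per-phase observation likelihoods, renormalise."""
    posterior = (belief @ T) * likelihood
    return posterior / posterior.sum()

def intrusion_alert(belief, horizon=8, threshold=0.5):
    """Predictive alert: mass on 'intrusion' after `horizon` further
    steps of phase dynamics with no new evidence."""
    future = belief @ np.linalg.matrix_power(T, horizon)
    return future[PHASES.index("intrusion")] >= threshold

belief = np.array([1.0, 0.0, 0.0, 0.0])    # campaign starts in recon
lateral_like = np.array([0.20, 0.70, 0.05, 0.05])
for _ in range(5):                         # evidence favouring lateral movement
    belief = filter_step(belief, lateral_like)

print(dict(zip(PHASES, belief.round(3))))
print(intrusion_alert(belief, threshold=0.35))
```

The alert fires on the projected belief, before any intrusion-phase traffic is observed. That projection is what turns detection into a prediction window.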

An 8-minute prediction window before intrusion phase onset is not a performance metric. It is an operational window. In 8 minutes, a security team can isolate a compromised network segment, revoke credentials observed in reconnaissance activity, and verify the integrity of adjacent systems. Without phase-aware detection, those 8 minutes pass unobserved, and the intrusion phase begins against an unprepared network.

For Physical AI deployments: Physical AI systems are high-value APT targets — a compromised robot or medical device is simultaneously a data exfiltration surface and a physical actuator. Multi-phase attack campaigns against Physical AI infrastructure follow the RLIE pattern. PARD-SSM's phase attribution enables specific responses appropriate to each stage: blocking C2 communication during reconnaissance is categorically different from isolating a device during active intrusion.

What to evaluate:

  • Audit current IDS for phase-level attribution: if your detection produces anomaly scores without stage identification, you cannot implement stage-appropriate responses;

  • Evaluate PARD-SSM-style regime detection for Physical AI network monitoring — the 8-minute prediction horizon enables pre-emptive isolation before the intrusion phase completes;

  • Build pre-emptive isolation capability into Physical AI network architecture: a prediction horizon is only valuable if the isolation can be executed automatically within that window
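
That last bullet can be sketched as a thin glue layer between detector and network fabric. The `isolate` and `notify` hooks below are hypothetical deployment-specific callbacks, not part of PARD-SSM; the point is only that the response runs automatically and reports how much of the predicted window it consumed:

```python
import time

def on_phase_alert(phase, eta_seconds, isolate, notify):
    """Wire a predictive phase alert to automatic containment.
    `isolate` and `notify` are deployment-specific hooks (illustrative)."""
    if phase == "intrusion":
        start = time.monotonic()
        isolate()                          # act first, inside the window
        elapsed = time.monotonic() - start
        notify(f"pre-emptive isolation done in {elapsed:.2f}s; "
               f"~{eta_seconds - elapsed:.0f}s of window remaining")
        return True
    notify(f"predicted transition to {phase} in ~{eta_seconds:.0f}s")
    return False

# Toy wiring: record actions instead of touching real infrastructure.
actions = []
handled = on_phase_alert("intrusion", 480.0,
                         isolate=lambda: actions.append("segment isolated"),
                         notify=actions.append)
print(handled, actions)
```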

🟠 SIGNAL #3 — PHYSICAL AI · HIGH

Robots That Know When They Are Lost Navigate Better Than Robots That Do Not

arXiv:2604.02318 · MetaNav · Vision-Language Navigation · 2026-04-02 · cs.RO · cs.AI

Training-free Vision-Language Navigation agents powered by foundation models can follow instructions and explore 3D environments. They also oscillate locally, revisit areas redundantly, and cannot detect when their strategy is failing. The failure mode is structural: they lack metacognition — the ability to monitor their own exploration progress, diagnose strategy failures, and generate corrective actions.

MetaNav addresses this with three modules. Spatial memory maintains a persistent 3D semantic map — the agent tracks what it has explored and what it found there. History-aware planning applies a revisiting penalty — the agent is explicitly discouraged from re-exploring areas that did not advance the goal. Reflective correction detects stagnation — when the agent identifies that its current strategy is not working, it generates LLM-based corrective rules and adapts.
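
A minimal grid-world sketch of the same three mechanisms, assuming nothing about MetaNav's internals beyond the module descriptions above (class and parameter names here are invented):

```python
from collections import deque

class MetacognitiveNavigator:
    """Toy sketch of the three modules: spatial memory, a revisit
    penalty, and a stagnation trigger for reflective correction."""

    def __init__(self, revisit_penalty=1.0, stall_window=5):
        self.memory = {}                       # spatial memory: cell -> observation
        self.recent_new = deque(maxlen=stall_window)
        self.revisit_penalty = revisit_penalty

    def score(self, cell, base_value):
        # History-aware planning: discourage re-exploring known cells.
        return base_value - (self.revisit_penalty if cell in self.memory else 0.0)

    def observe(self, cell, observation):
        self.recent_new.append(cell not in self.memory)
        self.memory[cell] = observation

    def stagnating(self):
        # Reflective-correction trigger: a full recent window with no
        # newly explored cells means the current strategy has stalled.
        window = self.recent_new
        return len(window) == window.maxlen and not any(window)

nav = MetacognitiveNavigator()
for cell in [(0, 0), (0, 1), (0, 0), (0, 1), (0, 0), (0, 1), (0, 0)]:
    nav.observe(cell, "wall")
print(nav.stagnating())   # oscillation between two known cells -> True
```

The revisit penalty biases planning away from known cells, and the stagnation check is what converts "oscillating in a corridor" from an invisible failure into a detectable one that a correction module can act on.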

The measured result: 20.7% reduction in VLM queries — directly lowering compute cost — with state-of-the-art navigation success rates. The efficiency gain is secondary. The primary gain is robots that fail gracefully: when a strategy stops working, the agent detects it and recovers, rather than continuing to execute a failing strategy until external intervention.

The production safety implication is specific. A hospital delivery robot that oscillates in a corridor blocks patient traffic. A care robot that re-enters a room repeatedly because it cannot detect its search has failed disturbs the resident unnecessarily. A warehouse robot revisiting empty locations misses delivery SLAs. In each case, the failure is not the robot getting stuck — it is the robot not knowing it is stuck. MetaNav closes this gap.

What to evaluate:

  • Audit deployed VLN agents for stagnation detection capability: if the agent has no mechanism to identify that its strategy is failing, it has no recovery mechanism;

  • Implement spatial memory for persistent 3D semantic mapping in any robot operating in environments where redundant exploration creates safety or operational risks;

  • Add reflective correction as a safety overlay on existing navigation policies — the metacognitive layer does not require retraining the base navigation model, only adding the monitoring and correction modules above it

🔵 SIGNAL #4 — AI AGENTS · HIGH

Your Benchmark Measures Task Accuracy. Conversational Safety Requires Something Else

arXiv:2604.02315 · User Turn Generation · Interaction Awareness · 2026-04-02 · cs.AI · cs.HC

Standard LLM evaluation follows a fixed pattern: the model generates an assistant response, a verifier scores correctness, the evaluation ends. This measures whether the model produces accurate outputs. It does not measure whether the model has any awareness of what the user would say next — whether the response invites continuation, closes off the conversation, or creates conditions for the harmful feedback loops documented in Issue #006 Signal #1.

User-turn generation as a probe fills this gap. Given a conversation context — user query plus assistant response — the model is asked to generate what the user would say next. A model with interaction awareness generates follow-up turns that are grounded in the prior exchange, coherent with the conversation history, and representative of how a real user would continue. Under deterministic generation, follow-up rates are low across the models tested. Task accuracy on GSM8K and similar benchmarks does not predict follow-up rate. The two capabilities are empirically decoupled.
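
As a sketch, the probe is a few lines of harness code. The `model` and `is_followup` callables below are stand-ins for a real LLM call and a real judge of whether a generated turn is a grounded follow-up; the paper's exact prompt and judging criteria are not reproduced here:

```python
def user_turn_probe(model, dialogues, is_followup):
    """Illustrative user-turn generation probe: for each (query, response)
    pair, ask the model to write the USER's next turn under deterministic
    decoding, and measure how often the result is a genuine follow-up."""
    prompt = ("Conversation so far:\nUser: {q}\nAssistant: {a}\n"
              "Write the user's next message.")
    followups = 0
    for q, a in dialogues:
        next_turn = model(prompt.format(q=q, a=a), temperature=0.0)
        if is_followup(next_turn, context=(q, a)):
            followups += 1
    return followups / len(dialogues)

# Toy stand-ins so the probe runs end-to-end.
stub_model = lambda prompt, temperature: "Thanks!"   # only closes the exchange
judge = lambda turn, context: turn.rstrip("!?.").lower() not in {"thanks", "ok"}

rate = user_turn_probe(stub_model, [("What is 2+2?", "4.")], judge)
print(rate)  # 0.0
```

A model that can only produce conversation-closers scores zero here regardless of how accurate its answers are, which is exactly the decoupling the probe is designed to surface.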

The connection to Issue #006 Signal #1 is direct: a model without interaction awareness cannot detect that a conversation is beginning to spiral. Sycophancy-driven delusional spiraling operates through a multi-turn feedback loop that requires the model to track and respond to the trajectory of a conversation, not just the current turn. A model whose evaluation regime never measured multi-turn trajectory awareness was never trained to develop it.

For Physical AI in care and companionship contexts: the distinction between task accuracy and interaction awareness is the distinction between a robot that answers questions correctly and a robot that is safe to talk to over an extended period. These are not the same capability. The first is measured by current benchmarks. The second is not.

What to implement:

  • Add user-turn generation probes to evaluation suites for any conversational AI deployed in extended-interaction contexts — care robots, companionship AI, educational AI, therapy support;

  • Track follow-up rates under deterministic generation as a separate metric from task accuracy: a model with high task accuracy and low follow-up rates is a candidate for the sycophancy risk profile;

  • Implement collaboration-oriented post-training to improve interaction awareness — the probe provides a measurable signal the training objective can optimize against

📡 INDUSTRY MOVES — April 2, 2026

OpenAI Capital Deployment Begins: TBPN Acquisition and Codex Pay-As-You-Go

The first visible deployment of the $122 billion round: OpenAI acquires TBPN and launches Codex with pay-as-you-go pricing for teams.

The pricing change is the more structurally significant move. Per-seat subscription pricing for AI coding tools created a barrier for smaller teams and project-based usage. Pay-as-you-go removes that barrier. For Physical AI development teams — which frequently operate on project timelines rather than continuous development cycles — Codex pricing that scales with actual usage rather than seat count changes the adoption calculus directly.

The security implication connects to Issue #005's industry section on the Claude Code source leak. AI coding tools generate code that carries AI-generated security assumptions. Expanded adoption of Codex in Physical AI development pipelines means more Physical AI code written with AI assistance — and more Physical AI code whose security properties were partly shaped by an AI system that has its own known failure modes. The interaction is not hypothetical: teams that adopt Codex for Physical AI control code development should treat AI-generated code as requiring the same security review as human-written code, not less.

Google Gemma 4: The Open-Weight Baseline Has Shifted

DeepMind releases Gemma 4, described as "byte for byte, the most capable open models." This matters for Physical AI for a specific reason.

Physical AI deployments have operated under a persistent tradeoff: capable models require cloud connectivity, but cloud connectivity introduces latency, availability dependency, and data privacy exposure. The tradeoff has been structural — the models capable enough to handle complex Physical AI reasoning tasks have been proprietary and cloud-accessed.

Gemma 4-class open-weight capability changes the tradeoff. A companion robot or medical wearable running Gemma 4-class inference locally eliminates the cloud dependency risk profile: no network outage disables the device, no cloud API compromise reaches the local model, no round-trip latency degrades real-time response.

The new risk profile is different. Open-weight models deployed on edge hardware make model weights accessible to any attacker with physical or filesystem access to the device. Fine-tuning for malicious purposes requires only the weights and available compute. The TEE and Control Flow Attestation approaches documented in Issue #006 Signals #5 and #6 become first-class security requirements for any Physical AI edge deployment adopting Gemma 4-class local inference — protecting weights at rest and verifying runtime execution integrity are now deployment-critical, not optional security enhancements.

⚠️ REGULATORY WATCH — EU AI ACT

Article 9 Enforcement Deadline: August 2, 2026

122 days remaining.

Signal #4 this week adds a specific dimension to Article 9 documentation requirements. The finding that task accuracy and interaction awareness are empirically decoupled means that Article 9 conformity assessments for conversational AI systems — particularly those deployed with vulnerable populations — cannot rely on task accuracy benchmarks alone to establish that the system behaves safely in deployment. A system that scores well on correctness benchmarks may have systematically low interaction awareness. Article 9 requires risk management documentation that covers actual deployment behavior. The user-turn generation probe provides a concrete methodology for measuring the interaction awareness dimension that task benchmarks miss.

On Signals #1 and #2: Article 9 for high-risk AI systems in critical infrastructure contexts requires documentation of cybersecurity measures. Infrastructure-connected Physical AI operating under encrypted traffic monitoring that has not been tested against adversarial evasion is operating with undocumented cybersecurity gaps. Both AEGIS and PARD-SSM provide specific technical frameworks that can be referenced in risk documentation to describe what the gap is and what the mitigation requires.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

That is Issue #007.

Seven issues in, the visibility gap is the consistent finding. Your IDS does not see the campaign. Your traffic classifier does not see the evasion. Your navigation agent does not see that its strategy has failed. Your benchmark does not see whether your conversational AI is safe to talk to over time.

Each gap is measurable. Each has a documented mitigation. The work is not identifying that something is wrong — that part keeps getting easier. The work is closing the gap before the deployment gets there first.

We will keep watching.

No financial relationship with any AI company, hardware manufacturer, or standards body. We don't certify. We don't consult. We watch.

Credentialed press at HumanX 2026.

