Context Windows Don't Remember. Agents Need Real Memory.

If you're running AI-assisted CTI investigations and relying on context windows for continuity, you have a memory problem you probably haven't diagnosed yet.

A context window is volatile. When a session ends, everything in it disappears — preferences, constraints, evidence from three turns ago, the analyst's name. This isn't a limitation you can engineer around with a bigger model. It's architectural. A context window is RAM, not storage. The faster you accept this, the faster you build CTI systems that actually hold intelligence across investigations, analysts, and tool boundaries.

This distinction matters because CTI work has a specific kind of memory problem: you need durable, evidence-backed recall that survives across weeks, analyst handoffs, and heterogeneous tooling — not just a longer conversation window. Context windows can't solve this. Durable agent memory can.

The Forgetting Problem in CTI

CTI teams routinely lose critical intelligence at session boundaries. Three scenarios recur in mature security organizations:

Analyst rotation. Your most experienced APT29 tracker goes on leave. Her replacement inherits a chat thread with a model that has no memory of the investigation. Every assumption she made — which malware builds map to which infrastructure clusters, which false positives to ignore — is gone. The replacement spends two weeks re-deriving what the first analyst already knew.

Rediscovered TTP. A Volt Typhoon technique resurfaces in a current incident. A working paper buried in a closed 2024 case documents the same behavior, with network captures, associated infrastructure, and MITRE ATT&CK technique mapping. It lives in a OneNote notebook that nobody remembers, so the analyst documents the same behavior from scratch.

The Slack-to-case gap. During an active investigation, an analyst spots a key correlation in Slack: a peer organization reported a related indicator in an ISAC channel. The analyst copies it into a note. The session ends. When the next session starts, the model has no record of the Slack conversation. The correlation has to be rediscovered manually, and sometimes it never is.

None of these failures are about model capability. They're about architecture.

Why Context Windows Fail CTI Teams

Token economics. Every fact in the context competes with every other for the model's attention. As the window fills, retrieval degrades unevenly: Liu et al. (2023) documented the "lost in the middle" effect, in which models retrieve information most reliably from the beginning and end of the context and markedly less reliably from the middle.

Recency bias. Constraints established early in a session can drop out of the model's effective attention a few turns later: a rule set in turn 3 may no longer be honored by turn 7. Gamage et al. (2026) document this in "Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents." For CTI, provenance and confidence metadata, the facts that matter most, are among the most likely to be lost.

No provenance. A context window contains text. It has no mechanism for recording who asserted a fact, when, at what confidence, under which TLP label, and from which source.
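To make the gap concrete, here is a minimal sketch of what a single remembered fact could look like once provenance is stored as structure instead of prose. The class name, field names, and sample values are illustrative assumptions, not Threat Engram's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class MemoryFact:
    """One remembered claim plus the provenance a raw prompt cannot carry."""
    claim: str              # the statement itself
    asserted_by: str        # who said it (analyst ID or feed name)
    asserted_at: datetime   # when it was recorded (timezone-aware)
    confidence: Confidence  # confidence tier at the time of recording
    tlp: str                # TLP 2.0 label, e.g. "TLP:AMBER"
    source_ref: str         # report title, MISP event ID, or note ID


# All values below are hypothetical placeholders.
fact = MemoryFact(
    claim="Cluster A and cluster B share TLS certificates",
    asserted_by="analyst-01",
    asserted_at=datetime(2025, 3, 1, 14, 0, tzinfo=timezone.utc),
    confidence=Confidence.MEDIUM,
    tlp="TLP:AMBER",
    source_ref="misp-event-placeholder",
)
print(f"{fact.claim} [{fact.tlp}, {fact.confidence.value}, via {fact.source_ref}]")
```

The record is frozen on purpose: in this sketch a correction would create a new record rather than edit the old one, which keeps the original assertion intact.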

No TLP enforcement. TLP:AMBER+STRICT and TLP:CLEAR look identical in a prompt. Nothing prevents restricted intelligence from being included in an export to recipients who are not authorized to see it.
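Once the label is a field on the record rather than a word in a prompt, an export gate becomes a few lines of code. The sketch below assumes a simple rank comparison over the five TLP 2.0 labels; that is a simplification of the standard, which defines audiences rather than a strict hierarchy, and the function and record names are hypothetical.

```python
# TLP 2.0 labels ordered from least to most restrictive (assumed linear ranking).
TLP_ORDER = ["TLP:CLEAR", "TLP:GREEN", "TLP:AMBER", "TLP:AMBER+STRICT", "TLP:RED"]


def tlp_rank(label: str) -> int:
    """Return the restrictiveness rank of a label; unknown labels raise ValueError."""
    return TLP_ORDER.index(label)


def filter_for_export(records: list[dict], export_tlp: str) -> list[dict]:
    """Keep only records whose label is no more restrictive than the export's."""
    ceiling = tlp_rank(export_tlp)
    return [r for r in records if tlp_rank(r["tlp"]) <= ceiling]


# Hypothetical records for illustration.
records = [
    {"id": "r1", "tlp": "TLP:CLEAR", "text": "Public advisory summary"},
    {"id": "r2", "tlp": "TLP:AMBER+STRICT", "text": "Partner-reported indicator"},
    {"id": "r3", "tlp": "TLP:GREEN", "text": "Community feed observation"},
]

exportable = filter_for_export(records, "TLP:CLEAR")
print([r["id"] for r in exportable])  # ['r1']
```

Failing closed on an unknown label (the ValueError) is a deliberate choice here: a mistyped label should block an export, not slip through it.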

No audit trail. When did the model learn about this actor? What did it overwrite? A context window has no answer. This is a direct gap against the NIST SP 800-53 AU (Audit and Accountability) control family.
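One common way to answer "when did we learn this, and what changed?" is an append-only log in which each entry commits to the hash of the one before it. The sketch below is a generic hash chain using only the Python standard library; the class, its fields, and the sample entries are assumptions for illustration, not a description of any product's audit format.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the hash of the previous one."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, actor: str, action: str, fact_id: str, timestamp: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "actor": actor, "action": action, "fact_id": fact_id,
            "timestamp": timestamp, "prev_hash": prev_hash,
        }
        entry = {**body, "hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    @staticmethod
    def _digest(body: dict) -> str:
        # Canonical JSON (sorted keys) so the same body always hashes the same.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def verify(self) -> bool:
        """Recompute every hash; an edited or reordered entry breaks the chain."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append("analyst-01", "create", "fact-17", "2025-03-01T14:00:00Z")
log.append("analyst-02", "supersede", "fact-17", "2025-09-02T09:30:00Z")
print(log.verify())  # True

log.entries[0]["actor"] = "someone-else"  # simulate tampering
print(log.verify())  # False
```

A hash chain only detects tampering. It does not prevent it, and it does not by itself prove who wrote an entry.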

What Durable Agent Memory Must Do

Capability	What it delivers
Provenance per fact	Every entity carries its source — report title, MISP event ID, analyst note timestamp
Confidence scoring	Facts carry confidence tiers (high/medium/low) that affect downstream reasoning
TLP/CUI enforcement	Classification follows the data across session boundaries
Signed audit export	Exports include a cryptographic signature and provenance chain (NIST SP 800-53 AU-10)
Correction & merge	When new evidence supersedes a claim, history is preserved and confidence re-computed
Evidence linkage	TTPs are linked to the specific report, IOC, and analyst note that established them
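
The "Confidence scoring" and "Correction & merge" rows can be sketched together. The example below assumes a deliberately simple rule (two or more independent sources means high, one means medium) and hypothetical source IDs; a real scoring model would weigh source reliability, not just count sources.

```python
from dataclasses import dataclass, field


def confidence_tier(sources: set[str]) -> str:
    """Assumed rule: two or more independent sources -> high, one -> medium."""
    if len(sources) >= 2:
        return "high"
    return "medium" if len(sources) == 1 else "low"


@dataclass
class Claim:
    text: str
    sources: set[str]
    history: list[dict] = field(default_factory=list)  # superseded versions

    @property
    def confidence(self) -> str:
        # Recomputed on every read, so it always reflects the current sources.
        return confidence_tier(self.sources)

    def corroborate(self, source: str) -> None:
        """Merge an additional independent source into the current claim."""
        self.sources.add(source)

    def supersede(self, new_text: str, new_sources: set[str]) -> None:
        """Replace the claim but keep the prior version for audit."""
        self.history.append({
            "text": self.text,
            "sources": sorted(self.sources),
            "confidence": self.confidence,
        })
        self.text = new_text
        self.sources = set(new_sources)


claim = Claim("Cluster A is attributed to actor X", {"analyst-note-12"})
print(claim.confidence)  # medium

claim.corroborate("vendor-report-7")
print(claim.confidence)  # high

claim.supersede("Cluster A is shared hosting, not actor-controlled", {"pcap-review-3"})
print(claim.confidence, len(claim.history))  # medium 1
```

The superseded version stays in history with the confidence it had at the time, so a later reviewer can see both what was believed and why it changed.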

A Worked Example: APT29 Recall Across Sessions

Live query in Threat Engram

"What do we know about APT29's outbound C2 infrastructure patterns?"

In February 2025, an analyst logged findings from CISA advisory AA25-045A (TLP:CLEAR) and a closed MISP event (TLP:AMBER). She linked two IP clusters to a malware variant mapped to MITRE ATT&CK T1014 (Rootkit) and T1562 (Impair Defenses).

Six months later, a different analyst on a different shift asks the same question. Threat Engram retrieves:

Both infrastructure clusters, with first-seen / last-seen timestamps
Confidence tier: high (corroborated across two independent sources)
TLP:AMBER source metadata: the answer is available, but the sharing restrictions on the MISP-derived evidence stay enforced
Evidence panel: CISA advisory, MISP event, analyst note
Export controls: the answer can be included in a signed TLP:CLEAR report; the TLP:AMBER provenance record itself is not exportable
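
The export step in this example can be sketched as follows. This is an illustration under stated assumptions, not Threat Engram's export format: the function names and signing key are hypothetical, and HMAC stands in for the signature only because it ships with the Python standard library. A non-repudiation control such as AU-10 would normally call for an asymmetric signature, since an HMAC key is shared between signer and verifier.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical; use a managed key in practice

# The two sources whose TLP labels are stated in the example above.
evidence = [
    {"source": "CISA advisory AA25-045A", "tlp": "TLP:CLEAR"},
    {"source": "Closed MISP event", "tlp": "TLP:AMBER"},
]


def build_clear_export(answer: str, evidence: list[dict]) -> dict:
    """Assemble a TLP:CLEAR export that cites only TLP:CLEAR sources, then sign it."""
    cited = [e["source"] for e in evidence if e["tlp"] == "TLP:CLEAR"]
    body = {"tlp": "TLP:CLEAR", "answer": answer, "cited_sources": cited}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_export(export: dict) -> bool:
    """Recompute the signature over the body and compare in constant time."""
    payload = json.dumps(export["body"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, export["signature"])


export = build_clear_export(
    "Two infrastructure clusters are linked to the tracked variant (confidence: high).",
    evidence,
)
print(export["body"]["cited_sources"])  # ['CISA advisory AA25-045A']
print(verify_export(export))            # True

export["body"]["answer"] = "edited after signing"
print(verify_export(export))            # False
```

Whether the answer text itself is safe to release at TLP:CLEAR is a policy decision this sketch does not make; it only shows the restricted source being left out of the export and the result being tamper-evident.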

Try the live query →

Where This Fits in the Stack

Agent memory isn't a replacement for your existing CTI infrastructure. It's the layer that makes that infrastructure legible to AI agents between human analyst sessions.

Tool	Role	How Agent Memory Complements It
OpenCTI (STIX 2.1)	Ingest + structure + visualize threat entities	Agent memory holds working-context recall — analyst annotations, confidence adjustments, session provenance — that OpenCTI's entity graph doesn't track
MISP (sharing)	IoC distribution, event sharing, community feeds	Agent memory surfaces MISP-derived findings in agentic workflows without requiring analysts to go look them up
SIEM (detection)	Alert triage, log correlation	Agent memory maintains attribution continuity between SIEM alerts and CTI entities
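
As a rough illustration of the SIEM row, attribution continuity can be as simple as looking up remembered conclusions by indicator when an alert arrives. Everything in this sketch is hypothetical: the alert shape, the function names, and the in-memory index standing in for a durable store. The IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict

# Remembered conclusions keyed by indicator value (stand-in for a durable store).
memory_index: dict[str, list[dict]] = defaultdict(list)


def remember(indicator: str, conclusion: str, source: str) -> None:
    """Record what an analyst concluded about an indicator, with its source."""
    memory_index[indicator].append({"conclusion": conclusion, "source": source})


def enrich_alert(alert: dict) -> dict:
    """Return a copy of the alert with any remembered conclusions attached."""
    matches = [
        fact
        for indicator in alert.get("indicators", [])
        for fact in memory_index.get(indicator, [])
    ]
    return {**alert, "remembered_context": matches}


remember("198.51.100.7", "Part of infrastructure cluster A", "analyst note")

alert = {"rule": "Outbound beacon", "indicators": ["198.51.100.7", "203.0.113.9"]}
enriched = enrich_alert(alert)
print(len(enriched["remembered_context"]))  # 1
```

The SIEM still does the detection. The memory layer only adds what an analyst previously concluded about the indicators in the alert.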

No tool on this list remembers what your analyst concluded six months ago. They remember what was ingested. There's a difference.

Build This Into Your CTI Practice

If you're running AI-assisted investigations without durable memory, you're building intelligence on a foundation that resets every session. The gaps are invisible until they matter most.

Start a 90-day pilot →
See how we compare to RAG-only approaches →
Read our security model →

References

Liu, N. F., et al. (2023). "Lost in the Middle: How Language Models Use Long Contexts." arXiv:2307.03172.
Gamage, K. et al. (2026). "Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents." arXiv, April 2026.
NIST SP 800-53 Rev 5. "AU-10 — Non-repudiation."
MITRE ATT&CK. T1014 — Rootkit, T1562 — Impair Defenses.
CISA. "Enhancing Resilience Against PRC State-Sponsored Actors." Advisory AA25-045A, February 2025.
FIRST. "Traffic Light Protocol v2.0 (TLP 2.0)." August 2022.

Threat Engram maintains persistent, evidence-backed memory for CTI investigations. TLP classification is enforced at the record level. Signed audit exports available on all plans.