# AI Activity Retention for Regulated Firms

AI tools in regulated workflows create books-and-records obligations standard archives miss. How AI activity retention captures the full execution record.

Source: https://commacompliance.com/ai-platform-compliance
Last updated: 2026-06-09

---
## What is AI activity retention?

AI activity retention is the practice of capturing and reconstructing every AI execution event a regulated firm produces — completely, retrievably, and in tamper-evident form — so it can be produced on examiner request.

The core question regulators will ask is not whether a firm used AI. It is whether the firm can produce the records of what its employees and agents actually *did* with AI - the prompts, the responses, the tool calls, the actions an agent took on the firm's behalf, and the configuration that determined what was possible.

That question has no clean answer yet across the vendor landscape. The absence of explicit AI-specific guidance does not create a compliance safe harbor. The existing framework appears to apply: communications related to client activity, investment recommendations, or firm business have been treated as business records under existing rules regardless of channel. The form factor is not the determining factor. The content is.

## What an AI activity record actually contains

A captured iMessage is a message. A captured AI session is a different kind of object entirely. It splits into two layers that both have to be preserved for an examiner to reconstruct what happened.

### Activity records: what the AI did

- **Prompt.** The full input a user or system sent to the model.
- **Response.** The full output the model returned, including text, structured data, and generated files.
- **Tool call.** Every invocation the AI made against a tool, with arguments, results, timing, status, and the user identity that authorized it. A modern AI session can contain dozens of tool calls per user prompt.
- **Agent action.** Each autonomous step an agent took as part of completing a task. May contain multiple tool calls.
- **Retrieval.** Content the AI pulled from a document store, database, or knowledge base and included in its context.
- **Policy decision.** The middleware verdict made before a call ran: allowed, denied, redacted, rate-limited, alerted. Recorded even when the call never reached the model or the tool.

### Execution context: what the AI was permitted to do

This is the layer most archive vendors are not capturing today.

- **Execution harness.** The software wrapping the AI call - Cursor, a custom agent runtime, ChatGPT's web app, a LangChain orchestrator. Captures what software was driving the AI at the time.
- **Model identity.** The exact model and version handling each call (for example, `gpt-4o-2025-04-18`).
- **System prompt.** The standing instructions configured into the harness or the call. Often the single most important piece of execution context.
- **Tool surface.** The set of tools the AI was permitted to call at the moment of execution, including explicit denies.
- **Lifecycle hooks.** Custom logic configured to fire at defined points - before a prompt, before a tool call, after a response, on error. Hooks can block, mutate, or annotate calls. Their firing is itself a record.

An examiner asking "what could this agent have done?" needs the second layer. A firm can claim "the AI couldn't have wired funds," but without the captured tool surface and lifecycle hooks, the claim cannot be proved.

**Reconstruction context** is activity records and execution context treated as one record. It is what a regulated firm has to be able to produce on demand.

## Why existing compliance infrastructure misses most of it

Compliance platforms were designed around communication channels: email, messaging apps, social media. The mental model is: a person sends a message; the message is captured; the archive stores it.

AI tools do not work that way. An employee does not send a message to ChatGPT. They prompt it. The model generates a response. The model also calls tools: queries databases, sends emails, retrieves documents. The model takes autonomous actions. The configuration around all of that determines what was possible. At each step, the question of "what was captured and when" becomes more complex.

The gaps in current infrastructure fall into four categories:

**No interception point for everything that isn't chat.** Email compliance works since it flows through a single point your compliance tools already monitor. AI tool activity doesn't. It flows through browser sessions, API calls, MCP servers, and embedded interfaces that existing compliance middleware was not designed to intercept.

**No consistent record format.** A captured iMessage is a message. A captured AI session may include a prompt, a response, eight tool calls with arguments and return values, three retrievals from internal systems, a lifecycle hook that fired and redacted a credit card number, and metadata identifying the model version and tool surface in effect. None of that fits what traditional compliance archives were built to store.

**Chat-focused vendor coverage.** As of public materials reviewed 2026-05-15, several large compliance vendors have shipped or are in progress on ChatGPT Enterprise capture via the OpenAI Compliance Platform, and Microsoft Copilot capture via Purview and partner integrations. That is real progress on the chat-transcript layer. The tool calls inside those sessions, the actions an agent took, and the full execution context that determined what the agent was capable of are not consistently documented in vendor public materials. Whether each item lands in an examiner-ready export depends on the specific vendor integration.

**No public coverage we have found for self-hosted AI.** When a firm runs Ollama, vLLM, LiteLLM, or Open WebUI behind its own firewall, there is no vendor compliance API to call. We have not seen public archive-vendor materials documenting capture of execution context for these self-hosted deployments as of 2026-05-15.

## What regulators have said about AI

The [SEC's 2026 examination priorities](https://www.sec.gov/files/2026-exam-priorities.pdf) explicitly name AI governance as a focus area. Examiners have been directed to assess whether firms have policies governing employee use of AI tools, whether those policies are enforced, and whether records of AI-assisted activity are being retained.

FINRA has not issued AI-specific recordkeeping guidance as of this writing, but the [2026 Annual Regulatory Oversight Report](https://www.finra.org/rules-guidance/guidance/reports/2026-finra-annual-regulatory-oversight-report) and [Notice 24-09](https://www.finra.org/rules-guidance/notices/24-09) signal that existing rules apply to AI-generated communications in the same way they apply to any other business communication.

Outside the United States, global banking supervisors have been extending [BCBS 239](https://www.bis.org/publ/bcbs239.htm) - the Basel Committee's principles for effective risk data aggregation and risk reporting - as the de facto data-governance benchmark for AI and agentic systems at G-SIBs and D-SIBs. The European Central Bank's Single Supervisory Mechanism in particular has signaled that AI activity feeding risk-relevant decisions falls within the scope of BCBS 239's data lineage, accuracy, completeness, timeliness, and adaptability obligations. The framework was written for risk reporting pipelines, but supervisors are treating it as the standard for any data pipeline a regulated bank relies on - including the ones that capture what an agent did on the firm's behalf.

The [EU AI Act](https://artificialintelligenceact.eu/) is the largest AI-specific regulatory framework in force globally. [Article 12](https://artificialintelligenceact.eu/article/12/) requires high-risk AI systems to keep automatically generated logs traceable to events relevant for risk identification, post-market monitoring, and supervisory inspection - precisely the record shape that AI activity retention produces. [Article 14](https://artificialintelligenceact.eu/article/14/) requires effective human oversight, including the ability to interpret an AI system's output and override it. [Annex III](https://artificialintelligenceact.eu/annex/3/) enumerates the high-risk use cases that bring the full obligations into scope - creditworthiness assessment, life and health insurance pricing, employment decisions, and access to essential financial services among them. General-purpose AI obligations took effect in August 2025; the high-risk-system obligations are scheduled for 2 December 2027. Firms deploying agents into Annex III workflows are already on the clock for production-grade logging and oversight.

The EU's [Digital Operational Resilience Act (DORA)](https://www.eiopa.europa.eu/digital-operational-resilience-act-dora_en) entered into force on 17 January 2025 and applies to every financial entity operating in the EU - banks, insurers, investment firms, crypto-asset service providers - and the ICT third-party providers that serve them. DORA's logging, incident reporting, and resilience-testing requirements explicitly cover AI systems used in regulated workflows. Communications-capture and AI-activity-archive vendors fall inside DORA's scope of operational resilience evidence; firms operating in the EU should expect to produce captured AI activity as part of incident reconstruction, resilience testing, and supervisory examination under DORA.

The practical implication: firms that cannot produce records of AI-assisted client activity during a routine exam may face scrutiny similar to firms that could not produce WhatsApp records in 2022. Whether AI activity follows the same enforcement trajectory is not yet established. The framework that would support that path is in place.

The window to get ahead of this is now. As of public materials reviewed 2026-05-15, no vendor we have found offers full pipeline-execution-archive coverage across enterprise AI, developer/agent AI, self-hosted AI, and direct API surfaces, including execution context. Firms that build AI activity retention this year will be in a structurally different position than those that wait for explicit AI-specific rulemaking.

## How Comma is approaching AI platform compliance

Comma is building Arc: a compliance platform designed from the ground up for both human communication channels and AI agent activity.

The premise is that the division between "communications compliance" and "AI governance" is artificial. A registered rep sending a WhatsApp message and a registered rep prompting Copilot for a client recommendation are engaged in the same category of regulated activity. The compliance infrastructure should treat them consistently.

Arc is designed as a set of composable components that together form a compliance fabric:

---

### Arc Relay - Available now

Arc Relay is the first production component of the Arc platform. It is an open-source, MIT-licensed MCP control plane that sits between AI clients and the tools they call. It captures every tool call with full context (arguments, results, timing, user identity), applies a configurable middleware pipeline (PII redaction, size limits, alerting, archive), and enforces per-user and per-tool access control.

Arc Relay is self-hosted, open source, and available for download today.

---

### Arc Bridge - Demo available

Arc Bridge enables capture from self-hosted AI infrastructure: Ollama, vLLM, LiteLLM, Open WebUI, and similar deployments. For firms running models behind their own firewall, Arc Bridge integrates at the infrastructure layer without requiring changes to how the team works. Demo available today; production deployment is configured during the customer engagement.

---

### Compliance MCP Servers - Demo available

Comma's Compliance MCP Servers expose the full compliance workflow as MCP-native tools: retention policy, legal hold, discovery, review, and export. An authorized AI agent can run the full compliance lifecycle through the same control plane that captured the records in the first place. Compliance becomes a service the agent can call, not a process the agent gets in the way of. Demo available today; remaining work is hardening, not concept.

---

### Arc Gate - Demo available

Not every AI workflow runs through MCP. Some agents send raw API calls directly to OpenAI, Google, or self-hosted endpoints. Arc Gate sits in front of those calls and applies the same compliance pipeline at the API level. Demo available today; production deployment is configured during the customer engagement.

---

One platform. Human channels and agent channels. Whatever path the data takes.

## How records are stored versus how they are reviewed

A turn is a single exchange in an AI conversation: a prompt from the user and a response from the AI. A multi-turn AI conversation has a structural problem for compliance review. Every new exchange includes the full prior history of the conversation. If a reviewer is shown every captured turn verbatim, turn 20 of a 20-turn session contains the same content as turns 1 through 19 plus the new exchange. A CCO doing supervisory review drowns in redundant text.

Arc captures full state for legal-hold and export integrity. The review interface defaults to **turn-by-turn review** - showing only what is new at each step. The new prompt content, the new response, the new tool calls, the new retrievals, and any execution-context changes that occurred on that turn. Reviewers can expand to full state when they need to. The captured archive is unchanged; the presentation is the part that has been redesigned for supervisor and examiner use.

## What firms should do now

Regulatory clarity on AI is coming, but it hasn't arrived. In the meantime, there are practical steps every regulated firm should be taking, regardless of what any vendor, including Comma, supports today.

**1. Start with an honest inventory.** Which AI tools are actually in use across your front office, compliance team, and operations? Don't limit this to IT-approved deployments. Shadow AI adoption is real, and the tools your team adopted informally are just as subject to regulatory scrutiny as the ones procurement signed off on.

**2. Assess what those tools are producing.** Prompts, responses, summaries, client recommendations: if the output touches client activity or firm business, your existing retention obligations may apply. The question is not whether the tool is on an approved list. It is whether the output would be treated as a business record under the firm's existing recordkeeping framework.

**3. Find out what your tools can actually retain.** Most AI platforms have limited native export functionality, and some have none. Know what each tool can and cannot produce before a regulator asks you to produce it.

**4. Ask your compliance vendor the hard question.** Not whether AI compliance is on their roadmap, but whether they've shipped something. A roadmap slide is not a compliance solution. Most vendors haven't shipped one. You should know whether yours has.

**5. Document where you stand.** A written AI governance policy covering which tools are permitted, how they should be used, and what records management looks like won't close every gap, but it demonstrates good-faith effort. In an exam, that matters.
