# Scan Model Output

Check generated model output before users, workflows, or agents act on it.

Source URL: https://trymighty.ai/docs/integrate/model-output

import {
  CodeBlockTabs,
  CodeBlockTabsList,
  CodeBlockTabsTrigger,
  CodeBlockTab,
} from "fumadocs-ui/components/codeblock";

## Goal

Scan AI output before it reaches the user or a downstream tool.

Use this for chat responses, generated summaries, extraction results, agent tool output, draft emails, risk decisions, and claim recommendations.

## Architecture

1. Scan the user input with `scan_phase=input`.
2. Run the model only if input routing allows it.
3. Scan the model output with `scan_phase=output`.
4. Reuse the input scan's `scan_group_id`.
5. Show, redact, review, or block the output.
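
The steps above can be sketched as one guard function. This is a minimal sketch, not the SDK: `scanInput`, `runModel`, and `scanOutput` are assumed to be your own wrappers around the scan endpoint and model call, and the decision shape is illustrative.

```ts
// Minimal orchestration sketch. The three injected functions are
// assumptions: wrap your own input scan, model call, and output scan.
type ScanResult = { action: "ALLOW" | "WARN" | "BLOCK"; scan_group_id: string };

export async function guardedCompletion(
  deps: {
    scanInput: (prompt: string) => Promise<ScanResult>;
    runModel: (prompt: string) => Promise<string>;
    scanOutput: (output: string, scanGroupId: string) => Promise<ScanResult>;
  },
  prompt: string,
) {
  // Steps 1-2: scan the input; run the model only if routing allows it.
  const inputScan = await deps.scanInput(prompt);
  if (inputScan.action === "BLOCK") return { type: "blocked_input" as const };

  const output = await deps.runModel(prompt);

  // Steps 3-4: scan the output, reusing the input scan's scan_group_id.
  const outputScan = await deps.scanOutput(output, inputScan.scan_group_id);

  // Step 5: show on ALLOW; everything else goes to redaction, review, or block.
  if (outputScan.action === "ALLOW") return { type: "show" as const, output };
  return { type: "review_or_block" as const };
}
```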

## Request and Response

<CodeBlockTabs defaultValue="request">
  <CodeBlockTabsList>
    <CodeBlockTabsTrigger value="request">Request</CodeBlockTabsTrigger>
    <CodeBlockTabsTrigger value="response">Response</CodeBlockTabsTrigger>
  </CodeBlockTabsList>
  <CodeBlockTab value="request">

```ts
export async function scanModelOutput({
  output,
  originalPrompt,
  scanGroupId,
  sessionId,
  dataSensitivity = "strict",
}: {
  output: string;
  originalPrompt: string;
  scanGroupId: string;
  sessionId: string;
  dataSensitivity?: "standard" | "tolerant" | "strict";
}) {
  const response = await fetch("https://gateway.trymighty.ai/v1/scan", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.MIGHTY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      content: output,
      content_type: "text",
      scan_phase: "output",
      mode: "secure",
      focus: "both",
      profile: "ai_safety",
      data_sensitivity: dataSensitivity,
      scan_group_id: scanGroupId,
      session_id: sessionId,
      original_prompt: originalPrompt,
    }),
  });

  if (!response.ok) {
    throw new Error(`Mighty output scan failed with ${response.status}`);
  }

  return response.json();
}
```

  </CodeBlockTab>
  <CodeBlockTab value="response">

```json
{
  "action": "BLOCK",
  "risk_score": 92,
  "risk_level": "CRITICAL",
  "threats": [
    {
      "category": "secrets_exposure",
      "confidence": 0.96,
      "evidence": "KEY=sk_live_8f1c9d4e2ab3",
      "reason": "Output contains a live API key pattern."
    },
    {
      "category": "system_prompt_leak",
      "confidence": 0.88,
      "evidence": "SYSTEM=You are an underwriter.",
      "reason": "Output reveals the configured system prompt."
    }
  ],
  "redacted_output": "I cannot share that sensitive value.",
  "scan_phase": "output",
  "scan_group_id": "9b3e4f8d-96c9-4f42-8338-8cf9571c1c70",
  "scan_id": "8f713f53-8e73-4878-a7dc-7a538bb420c2",
  "scan_status": "complete"
}
```

  </CodeBlockTab>
</CodeBlockTabs>

Use `data_sensitivity=strict` for public user-visible output. Use `data_sensitivity=tolerant` only for internal summaries, claim notes, or reviewer-only output where business PII is expected.

`threats` is an array of objects with `category`, `confidence`, an optional `evidence` excerpt (the substring that triggered the rule), and a human-readable `reason`. When `redacted_output` is present and policy allows substitution, prefer it over the raw model text.
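
For logging or review queues, it can help to split threats by confidence. A small illustrative helper; the `0.9` threshold is an assumption, not a documented default:

```ts
type Threat = {
  category: string;
  confidence: number;
  evidence?: string;
  reason: string;
};

// Split threats into high-confidence (act automatically) and
// lower-confidence (route to human review) buckets.
export function splitThreats(threats: Threat[], threshold = 0.9) {
  return {
    actionable: threats.filter((t) => t.confidence >= threshold),
    forReview: threats.filter((t) => t.confidence < threshold),
  };
}
```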

## Routing Logic

```ts
export function routeModelOutput(scan: {
  action: string;
  redacted_output?: string;
}) {
  if (scan.action === "ALLOW") {
    return { type: "show_original" as const };
  }

  if (scan.action === "WARN") {
    return { type: "show_safe_fallback_and_review" as const };
  }

  if (scan.redacted_output) {
    return { type: "show_redacted" as const, content: scan.redacted_output };
  }

  return { type: "block" as const };
}
```
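
On the consumer side, one way to turn a routing decision into user-facing text. `renderDecision` and its fallback copy are illustrative, not part of the API:

```ts
type Decision =
  | { type: "show_original" }
  | { type: "show_safe_fallback_and_review" }
  | { type: "show_redacted"; content: string }
  | { type: "block" };

export function renderDecision(decision: Decision, original: string): string {
  switch (decision.type) {
    case "show_original":
      return original; // ALLOW: the raw model text is safe to show
    case "show_redacted":
      return decision.content; // BLOCK, but redacted_output is available
    case "show_safe_fallback_and_review":
    case "block":
      // WARN or hard BLOCK: never show the raw text
      return "I can't share that response.";
  }
}
```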

## Output Tolerance

Generated output needs a tolerance setting because the same text can be acceptable in an internal review note and unacceptable in a public assistant answer.

| Output surface | Suggested settings | Routing |
| --- | --- | --- |
| Public assistant answer | `profile=ai_safety`, `data_sensitivity=strict` | Show `ALLOW`. Show `redacted_output` when returned. Block otherwise. |
| Internal claim summary | `profile=balanced`, `data_sensitivity=tolerant` | Show `ALLOW`. Send `WARN` to review. Stop automation on `BLOCK`. |
| OCR or IDP summary | `focus=both`, `data_sensitivity=tolerant` | Keep file scan, OCR scan, and summary scan connected by `scan_group_id`. |
| Agent tool output | `profile=ai_safety` or `code_assistant`, `data_sensitivity=standard` | Keep `WARN` and `BLOCK` out of model context unless reviewed. |
| High-stakes recommendation | `profile=strict`, `data_sensitivity=strict` | Route `WARN`, `BLOCK`, and `indeterminate` to review. |

`tolerant` does not mean unsafe output is allowed. It means expected business PII should not block by itself.
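
The table can be encoded as a small lookup so callers do not hand-pick settings per request. The surface names and the helper are illustrative; the mapping follows the table above (the OCR row omits `profile` because the table does not suggest one):

```ts
type OutputSurface =
  | "public_answer"
  | "internal_summary"
  | "ocr_summary"
  | "agent_tool"
  | "high_stakes";

// Suggested scan settings per output surface, mirroring the table above.
export function scanSettingsFor(surface: OutputSurface): {
  profile?: string;
  focus?: string;
  data_sensitivity: "standard" | "tolerant" | "strict";
} {
  switch (surface) {
    case "public_answer":
      return { profile: "ai_safety", data_sensitivity: "strict" };
    case "internal_summary":
      return { profile: "balanced", data_sensitivity: "tolerant" };
    case "ocr_summary":
      return { focus: "both", data_sensitivity: "tolerant" };
    case "agent_tool":
      return { profile: "ai_safety", data_sensitivity: "standard" };
    case "high_stakes":
      return { profile: "strict", data_sensitivity: "strict" };
  }
}
```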

## Common Mistakes

- Scanning only the prompt. Model output can leak secrets, repeat unsafe instructions, or turn suspicious OCR into trusted wording.
- Creating a new `scan_group_id` for output. Reuse the input scan group.
- Showing output while the output scan is still pending.
- Using `data_sensitivity=tolerant` for public AI output.
- Failing open on scan errors in high-risk flows.
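
The last point deserves a sketch: fail closed by mapping scan errors to a block decision instead of showing unscanned output. The wrapper shape here is illustrative and works with any scan function:

```ts
// Fail-closed wrapper: if the scan call throws (network error, 5xx),
// treat the output as blocked rather than showing unscanned text.
export async function scanOrBlock<T extends { action: string }>(
  scan: () => Promise<T>,
): Promise<T | { action: "BLOCK"; scan_failed: true }> {
  try {
    return await scan();
  } catch {
    return { action: "BLOCK", scan_failed: true };
  }
}
```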

## Production Checklist

- Always keep input and output scans connected with `scan_group_id`.
- Store `original_prompt` when useful for audit.
- Use `profile=ai_safety` for public AI surfaces.
- Use `data_sensitivity=strict` for public output.
- Use `data_sensitivity=tolerant` only for internal PII-heavy output.
- Use a safe fallback if the output scan fails.
- Do not show `BLOCK` output to users.
- Add tests for redacted output and unsafe output paths.
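
A minimal shape for the last checklist item. `route` here is an inline stand-in for your actual routing function so the snippet stands alone:

```ts
// Stand-in for the routing function under test (see Routing Logic above).
function route(scan: { action: string; redacted_output?: string }): string {
  if (scan.action === "ALLOW") return "show_original";
  if (scan.action === "WARN") return "show_safe_fallback_and_review";
  return scan.redacted_output ? "show_redacted" : "block";
}

// Redacted path: BLOCK with a redaction must not show the original text.
if (route({ action: "BLOCK", redacted_output: "safe" }) !== "show_redacted") {
  throw new Error("redacted path broken");
}

// Unsafe path: BLOCK without a redaction must block outright.
if (route({ action: "BLOCK" }) !== "block") {
  throw new Error("block path broken");
}
```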

## AI-Agent Prompt

### Add model output scanning

```text
Add Mighty output scanning around model responses.

Requirements:
- The app already scans user input or must add that first.
- Reuse the input scan_group_id when scanning output.
- Send output to POST https://gateway.trymighty.ai/v1/scan.
- Use content_type=text, scan_phase=output, mode=secure, focus=both, profile=ai_safety.
- Use data_sensitivity=strict for public output.
- Use data_sensitivity=tolerant only for internal output where business PII is expected.
- Include original_prompt when available.
- Route ALLOW to show output.
- Route WARN to safe fallback plus review.
- Route BLOCK to redacted_output if present, otherwise block.

Acceptance criteria:
- Output never reaches the user before scan routing.
- Tests cover ALLOW, WARN, BLOCK, redacted_output, and scan failure.
- Logs connect input and output scans by scan_group_id.
```
