Scan Model Output
Check generated model output before users, workflows, or agents act on it.
Goal
Scan AI output before it reaches the user or a downstream tool.
Use this for chat responses, generated summaries, extraction results, agent tool output, draft emails, risk decisions, and claim recommendations.
Architecture
- Scan the user input with `scan_phase=input`.
- Run the model only if input routing allows it.
- Scan the model output with `scan_phase=output`.
- Reuse the input scan's `scan_group_id`.
- Show, redact, review, or block the output.
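A minimal sketch of that sequence, assuming a hypothetical `scanUserInput` helper (shaped like `scanModelOutput` below) and a hypothetical `callModel` LLM client; `scanModelOutput` and `routeModelOutput` are defined later on this page:

```ts
// Sketch only: scanUserInput and callModel stand in for your own
// input-scan helper and LLM client. The field names on inputScan
// follow the scan response described below.
export async function answerSafely(prompt: string, sessionId: string) {
  const inputScan = await scanUserInput({ input: prompt, sessionId });
  if (inputScan.action === "BLOCK") {
    return { type: "blocked_input" as const };
  }

  const output = await callModel(prompt);

  // Reuse the input scan's group id so both scans stay connected.
  const outputScan = await scanModelOutput({
    output,
    originalPrompt: prompt,
    scanGroupId: inputScan.scan_group_id,
    sessionId,
  });

  return routeModelOutput(outputScan);
}
```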
Request And Response
```ts
export async function scanModelOutput({
output,
originalPrompt,
scanGroupId,
sessionId,
dataSensitivity = "strict",
}: {
output: string;
originalPrompt: string;
scanGroupId: string;
sessionId: string;
dataSensitivity?: "standard" | "tolerant" | "strict";
}) {
const response = await fetch("https://gateway.trymighty.ai/v1/scan", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.MIGHTY_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
content: output,
content_type: "text",
scan_phase: "output",
mode: "secure",
focus: "both",
profile: "ai_safety",
data_sensitivity: dataSensitivity,
scan_group_id: scanGroupId,
session_id: sessionId,
original_prompt: originalPrompt,
}),
});
if (!response.ok) {
throw new Error(`Mighty output scan failed with ${response.status}`);
}
return response.json();
}
```

Use `data_sensitivity=strict` for public user-visible output. Use `data_sensitivity=tolerant` only for internal summaries, claim notes, or reviewer-only output where business PII is expected.
`threats` is an array of objects with `category`, `confidence`, an optional `evidence` excerpt (the substring that triggered the rule), and a human-readable `reason`. When `redacted_output` is present and policy allows substitution, prefer it over the raw model text.
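A sketch of consuming that shape, assuming the field names above (confirm them against the actual scan response before relying on this interface):

```ts
// Sketch only: Threat mirrors the description above, not a published type.
interface Threat {
  category: string;
  confidence: number;
  evidence?: string; // substring that triggered the rule
  reason: string;
}

function pickDisplayText(
  scan: { threats?: Threat[]; redacted_output?: string },
  rawOutput: string
): string {
  // Log each detected threat for audit before choosing what to show.
  for (const threat of scan.threats ?? []) {
    console.warn(`scan threat: ${threat.category} (${threat.reason})`);
  }
  // Prefer the redacted text over the raw model output when available.
  return scan.redacted_output ?? rawOutput;
}
```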
Routing Logic
```ts
export function routeModelOutput(scan: {
action: string;
redacted_output?: string;
}) {
if (scan.action === "ALLOW") {
return { type: "show_original" as const };
}
if (scan.action === "WARN") {
return { type: "show_safe_fallback_and_review" as const };
}
if (scan.redacted_output) {
return { type: "show_redacted" as const, content: scan.redacted_output };
}
return { type: "block" as const };
}
```

Output Tolerance
Generated output needs a tolerance setting because the same text can be acceptable in an internal review note and unacceptable in a public assistant answer.
| Output surface | Suggested settings | Routing |
|---|---|---|
| Public assistant answer | `profile=ai_safety`, `data_sensitivity=strict` | Show on ALLOW. Show `redacted_output` when returned. Block otherwise. |
| Internal claim summary | `profile=balanced`, `data_sensitivity=tolerant` | Show on ALLOW. Send WARN to review. Stop automation on BLOCK. |
| OCR or IDP summary | `focus=both`, `data_sensitivity=tolerant` | Keep file scan, OCR scan, and summary scan connected by `scan_group_id`. |
| Agent tool output | `profile=ai_safety` or `code_assistant`, `data_sensitivity=standard` | Keep WARN and BLOCK output out of model context unless reviewed. |
| High-stakes recommendation | `profile=strict`, `data_sensitivity=strict` | Route WARN, BLOCK, and indeterminate results to review. |
`tolerant` does not mean unsafe output is allowed. It means that expected business PII should not, on its own, trigger a block.
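One way to encode the table above as a helper. The `OutputSurface` names are made up for this sketch and are not part of the Mighty API:

```ts
// Sketch only: maps the output surfaces from the table above to the
// suggested scan settings. Adjust the surface names to your own domain.
type OutputSurface =
  | "public_answer"
  | "internal_summary"
  | "ocr_summary"
  | "agent_tool_output"
  | "high_stakes_recommendation";

function scanSettingsFor(surface: OutputSurface) {
  switch (surface) {
    case "public_answer":
      return { profile: "ai_safety", data_sensitivity: "strict" };
    case "internal_summary":
      return { profile: "balanced", data_sensitivity: "tolerant" };
    case "ocr_summary":
      return { focus: "both", data_sensitivity: "tolerant" };
    case "agent_tool_output":
      return { profile: "ai_safety", data_sensitivity: "standard" };
    case "high_stakes_recommendation":
      return { profile: "strict", data_sensitivity: "strict" };
  }
}
```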
Common Mistakes
- Scanning only the prompt. Model output can leak secrets, repeat unsafe instructions, or turn suspicious OCR into trusted wording.
- Creating a new `scan_group_id` for output. Reuse the input scan group.
- Showing output while the output scan is still pending.
- Using `data_sensitivity=tolerant` for public AI output.
- Failing open on scan errors in high-risk flows (see the fail-closed sketch after this list).
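A minimal fail-closed sketch for that last point, wrapping the `scanModelOutput` and `routeModelOutput` helpers above:

```ts
// Sketch only: if the output scan itself fails, fall back to a safe
// canned response instead of showing unscanned model text.
async function routeOrFallback(args: Parameters<typeof scanModelOutput>[0]) {
  try {
    const scan = await scanModelOutput(args);
    return routeModelOutput(scan);
  } catch (error) {
    console.error("output scan failed, failing closed", error);
    return { type: "show_safe_fallback_and_review" as const };
  }
}
```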
Production Checklist
- Always keep input and output scans connected with `scan_group_id`.
- Store `original_prompt` when useful for audit.
- Use `profile=ai_safety` for public AI surfaces.
- Use `data_sensitivity=strict` for public output.
- Use `data_sensitivity=tolerant` only for internal PII-heavy output.
- Use a safe fallback if the output scan fails.
- Do not show BLOCK output to users.
- Add tests for redacted output and unsafe output paths (see the sketch below).
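A test sketch for those paths, assuming Vitest (any test runner works) and the `routeModelOutput` helper above:

```ts
import { describe, expect, it } from "vitest";
import { routeModelOutput } from "./routeModelOutput"; // hypothetical path

// Sketch only: covers the routing paths the checklist calls out.
describe("routeModelOutput", () => {
  it("shows the original text on ALLOW", () => {
    expect(routeModelOutput({ action: "ALLOW" })).toEqual({
      type: "show_original",
    });
  });

  it("falls back and flags review on WARN", () => {
    expect(routeModelOutput({ action: "WARN" })).toEqual({
      type: "show_safe_fallback_and_review",
    });
  });

  it("prefers redacted output on BLOCK when present", () => {
    expect(
      routeModelOutput({ action: "BLOCK", redacted_output: "[REDACTED]" })
    ).toEqual({ type: "show_redacted", content: "[REDACTED]" });
  });

  it("blocks when nothing safe is available", () => {
    expect(routeModelOutput({ action: "BLOCK" })).toEqual({ type: "block" });
  });
});
```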
Ready to scan real traffic?
Create an API key, keep it on your server, then wire Mighty into the workflow that handles untrusted material.
AI-Agent Prompt
Paste this into Cursor, Codex, Claude Code, or Windsurf.
```text
Add Mighty output scanning around model responses.
Requirements:
- The app already scans user input or must add that first.
- Reuse the input scan_group_id when scanning output.
- Send output to POST https://gateway.trymighty.ai/v1/scan.
- Use content_type=text, scan_phase=output, mode=secure, focus=both, profile=ai_safety.
- Use data_sensitivity=strict for public output.
- Use data_sensitivity=tolerant only for internal output where business PII is expected.
- Include original_prompt when available.
- Route ALLOW to show output.
- Route WARN to safe fallback plus review.
- Route BLOCK to redacted_output if present, otherwise block.
Acceptance criteria:
- Output never reaches the user before scan routing.
- Tests cover ALLOW, WARN, BLOCK, redacted_output, and scan failure.
- Logs connect input and output scans by scan_group_id.
```