
Scan Model Output

Check generated model output before users, workflows, or agents act on it.

Goal

Scan AI output before it reaches the user or a downstream tool.

Use this for chat responses, generated summaries, extraction results, agent tool output, draft emails, risk decisions, and claim recommendations.

Architecture

  1. Scan the user input with scan_phase=input.
  2. Run the model only if input routing allows it.
  3. Scan the model output with scan_phase=output.
  4. Reuse the input scan's scan_group_id.
  5. Show, redact, review, or block the output.
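The steps above can be sketched as one pipeline. This is a minimal sketch, not the full implementation: the scanner and model are injected as functions so the control flow is visible, `guardedGenerate` is a name made up for this sketch, and the field names (`action`, `scan_group_id`) match the request and response shapes shown later on this page.

```typescript
// Sketch of the scan-then-generate-then-scan flow. The scanner and model
// are injected so the control flow stands on its own.
type ScanResult = { action: "ALLOW" | "WARN" | "BLOCK"; scan_group_id: string };

async function guardedGenerate(
  prompt: string,
  scan: (content: string, phase: "input" | "output", groupId?: string) => Promise<ScanResult>,
  runModel: (prompt: string) => Promise<string>,
): Promise<{ status: "blocked_input" | "scanned"; output?: string; outputScan?: ScanResult }> {
  // Steps 1-2: scan the user input; run the model only if routing allows it.
  const inputScan = await scan(prompt, "input");
  if (inputScan.action === "BLOCK") return { status: "blocked_input" };

  const output = await runModel(prompt);

  // Steps 3-4: scan the output, reusing the input scan's scan_group_id.
  const outputScan = await scan(output, "output", inputScan.scan_group_id);
  return { status: "scanned", output, outputScan };
}
```

Step 5, routing on the output scan result, is covered by the routing logic below.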

Request And Response

export async function scanModelOutput({
  output,
  originalPrompt,
  scanGroupId,
  sessionId,
  dataSensitivity = "strict",
}: {
  output: string;
  originalPrompt: string;
  scanGroupId: string;
  sessionId: string;
  dataSensitivity?: "standard" | "tolerant" | "strict";
}) {
  const response = await fetch("https://gateway.trymighty.ai/v1/scan", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.MIGHTY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      content: output,
      content_type: "text",
      scan_phase: "output",
      mode: "secure",
      focus: "both",
      profile: "ai_safety",
      data_sensitivity: dataSensitivity,
      // Reuse the input scan's group id so both scans stay linked.
      scan_group_id: scanGroupId,
      session_id: sessionId,
      // Lets the scanner evaluate the output in the context of the prompt.
      original_prompt: originalPrompt,
    }),
  });

  if (!response.ok) {
    throw new Error(`Mighty output scan failed with ${response.status}`);
  }

  return response.json();
}

Use data_sensitivity=strict for public user-visible output. Use data_sensitivity=tolerant only for internal summaries, claim notes, or reviewer-only output where business PII is expected.

threats is an array of objects with category, confidence, an optional evidence excerpt (the substring that triggered the rule), and a human-readable reason. When redacted_output is present and policy allows substitution, prefer it over the raw model text.
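The fields described above can be captured in a small type. This is an assumption based only on the fields this page mentions, not the full response schema:

```typescript
// Assumed response shape, based on the fields described on this page.
type Threat = {
  category: string;
  confidence: number;
  evidence?: string; // the substring that triggered the rule
  reason: string;
};

type ScanResponse = {
  action: "ALLOW" | "WARN" | "BLOCK";
  threats: Threat[];
  redacted_output?: string;
  scan_group_id: string;
};

// Prefer redacted_output over the raw model text when it is present.
function displayText(scan: ScanResponse, rawOutput: string): string {
  return scan.redacted_output ?? rawOutput;
}
```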

Routing Logic

export function routeModelOutput(scan: {
  action: string;
  redacted_output?: string;
}) {
  if (scan.action === "ALLOW") {
    return { type: "show_original" as const };
  }

  if (scan.action === "WARN") {
    return { type: "show_safe_fallback_and_review" as const };
  }

  if (scan.redacted_output) {
    return { type: "show_redacted" as const, content: scan.redacted_output };
  }

  return { type: "block" as const };
}
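One way to consume the route returned above is an exhaustive switch that maps each route type to what the user sees. `renderForUser` and `safeFallbackMessage` are names made up for this sketch; the route union mirrors the values `routeModelOutput` returns.

```typescript
// Assumed app-level fallback shown while a WARN result is under review.
const safeFallbackMessage = "This response is being reviewed.";

type OutputRoute =
  | { type: "show_original" }
  | { type: "show_safe_fallback_and_review" }
  | { type: "show_redacted"; content: string }
  | { type: "block" };

// Returns the text to display, or null when nothing may reach the user.
function renderForUser(route: OutputRoute, original: string): string | null {
  switch (route.type) {
    case "show_original":
      return original;
    case "show_safe_fallback_and_review":
      return safeFallbackMessage;
    case "show_redacted":
      return route.content;
    case "block":
      return null;
  }
}
```

Returning `null` for the block case keeps the "do not show BLOCK output to users" rule in one place.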

Output Tolerance

Generated output needs a tolerance setting because the same text can be acceptable in an internal review note and unacceptable in a public assistant answer.

Output surface, suggested settings, and routing:

  • Public assistant answer: profile=ai_safety, data_sensitivity=strict. Show the output on ALLOW. Show redacted_output when returned. Block otherwise.
  • Internal claim summary: profile=balanced, data_sensitivity=tolerant. Show the output on ALLOW. Send WARN to review. Stop automation on BLOCK.
  • OCR or IDP summary: focus=both, data_sensitivity=tolerant. Keep the file scan, OCR scan, and summary scan connected by scan_group_id.
  • Agent tool output: profile=ai_safety or code_assistant, data_sensitivity=standard. Keep WARN and BLOCK output out of the model context unless reviewed.
  • High-stakes recommendation: profile=strict, data_sensitivity=strict. Route WARN, BLOCK, and indeterminate results to review.

tolerant does not mean unsafe output is allowed. It means expected business PII should not trigger a block on its own.
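The per-surface suggestions above can be encoded as a simple lookup so the settings live in one place. The surface names here are illustrative; use whatever identifiers your application already has.

```typescript
// Illustrative mapping from output surface to scan settings, following
// the suggestions in the tolerance table. Surface names are made up
// for this sketch.
type ScanSettings = {
  profile: string;
  data_sensitivity: "standard" | "tolerant" | "strict";
};

const surfaceSettings: Record<string, ScanSettings> = {
  public_assistant_answer: { profile: "ai_safety", data_sensitivity: "strict" },
  internal_claim_summary: { profile: "balanced", data_sensitivity: "tolerant" },
  agent_tool_output: { profile: "ai_safety", data_sensitivity: "standard" },
  high_stakes_recommendation: { profile: "strict", data_sensitivity: "strict" },
};
```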

Common Mistakes

  • Scanning only the prompt. Model output can leak secrets, repeat unsafe instructions, or turn suspicious OCR into trusted wording.
  • Creating a new scan_group_id for output. Reuse the input scan group.
  • Showing output while the output scan is still pending.
  • Using data_sensitivity=tolerant for public AI output.
  • Failing open on scan errors in high-risk flows.
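The last mistake, failing open, can be avoided with a small wrapper that treats a scan failure as a block. This is a sketch with the scan call injected so it stands alone; `scanOrBlock` and `scan_error` are names made up here.

```typescript
// Fail closed: if the output scan itself throws, treat the output as
// blocked rather than showing unscanned text.
async function scanOrBlock<T extends { action: string }>(
  scan: () => Promise<T>,
): Promise<T | { action: "BLOCK"; scan_error: true }> {
  try {
    return await scan();
  } catch {
    // High-risk flows must not show output when scanning is unavailable.
    return { action: "BLOCK", scan_error: true };
  }
}
```

Wrap the call to your output-scan function in `scanOrBlock` before routing, and alert on `scan_error` so outages are visible.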

Production Checklist

  • Always keep input and output scans connected with scan_group_id.
  • Store original_prompt when useful for audit.
  • Use profile=ai_safety for public AI surfaces.
  • Use data_sensitivity=strict for public output.
  • Use data_sensitivity=tolerant only for internal PII-heavy output.
  • Use a safe fallback if the output scan fails.
  • Do not show BLOCK output to users.
  • Add tests for redacted output and unsafe output paths.

Next step

Ready to scan real traffic?

Create an API key, keep it on your server, then wire Mighty into the workflow that handles untrusted material.

AI-Agent Prompt

Add model output scanning

Paste this into Cursor, Codex, Claude Code, or Windsurf.

Add Mighty output scanning around model responses.

Requirements:
- The app must already scan user input; add input scanning first if it does not.
- Reuse the input scan_group_id when scanning output.
- Send output to POST https://gateway.trymighty.ai/v1/scan.
- Use content_type=text, scan_phase=output, mode=secure, focus=both, profile=ai_safety.
- Use data_sensitivity=strict for public output.
- Use data_sensitivity=tolerant only for internal output where business PII is expected.
- Include original_prompt when available.
- Route ALLOW to show output.
- Route WARN to safe fallback plus review.
- Route BLOCK to redacted_output if present, otherwise block.

Acceptance criteria:
- Output never reaches the user before scan routing.
- Tests cover ALLOW, WARN, BLOCK, redacted_output, and scan failure.
- Logs connect input and output scans by scan_group_id.