Config Decisions

Choose mode, focus, scan phase, profile, data sensitivity, request IDs, and scan groups.

Mighty config should explain intent. Start with safe defaults, then tighten only when the workflow needs it.

Recommended default:

{
  "content_type": "auto",
  "scan_phase": "input",
  "mode": "secure",
  "focus": "both",
  "profile": "balanced",
  "data_sensitivity": "standard"
}
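One way to keep intent visible in code is to hold the recommended defaults in one place and apply only the overrides a workflow needs. The sketch below assumes a hypothetical helper named build_scan_request; it is not part of any Mighty SDK, and it only builds the request body without sending it.

```python
# Recommended defaults, copied from the JSON block above.
RECOMMENDED_DEFAULTS = {
    "content_type": "auto",
    "scan_phase": "input",
    "mode": "secure",
    "focus": "both",
    "profile": "balanced",
    "data_sensitivity": "standard",
}


def build_scan_request(content: str, **overrides: str) -> dict:
    """Hypothetical helper: merge per-workflow overrides onto the defaults."""
    unknown = set(overrides) - set(RECOMMENDED_DEFAULTS)
    if unknown:
        # Fail early on typos such as "modee" instead of sending them along.
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return {"content": content, **RECOMMENDED_DEFAULTS, **overrides}


# Tighten one setting for a regulated workflow; everything else stays default.
request_body = build_scan_request("Claim note text", profile="strict")
```

Keeping overrides explicit at the call site makes it easier to see which workflows deviate from the defaults and why.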

content_type

What it does: tells Mighty which modality you are sending.

When to use it:

| Value | Use when |
| --- | --- |
| auto | Your server does not know the type yet, or you want Mighty to detect it. |
| text | Chat text, OCR text, extracted fields, model output, tool output, or notes. |
| image | Damage photos, identity images, receipt photos, screenshots, or visual evidence. |
| pdf | PDF claim packets, invoices, estimates, forms, or statements. |
| document | Office documents or uploaded business documents. |

Default value: auto.

Example request:

{
  "content": "Extracted OCR text",
  "content_type": "text",
  "scan_phase": "input",
  "mode": "secure"
}

Common mistake: setting content_type to text for an uploaded image or PDF that has not been extracted yet. Scan the original file when possible, then scan the extracted text with the same scan_group_id.
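A minimal sketch of that pairing, under stated assumptions: build_paired_scans is a hypothetical helper, the file is sent as base64 in content, the app generates the scan_group_id itself as a UUID, and the extracted text is scanned with scan_phase input as in the example request above. Check these against your deployment before relying on them.

```python
import base64
import uuid


def build_paired_scans(pdf_bytes: bytes, extracted_text: str) -> tuple[dict, dict]:
    """Hypothetical helper: one scan for the original file, one for its text."""
    # Assumption: the app supplies the group ID; both scans must share it.
    group_id = str(uuid.uuid4())
    original_scan = {
        # Assumption: binary files travel as base64 in the content field.
        "content": base64.b64encode(pdf_bytes).decode("ascii"),
        "content_type": "pdf",
        "scan_phase": "input",
        "scan_group_id": group_id,
    }
    extracted_scan = {
        "content": extracted_text,
        "content_type": "text",
        "scan_phase": "input",
        "scan_group_id": group_id,
    }
    return original_scan, extracted_scan
```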

Supported Uploads And Limits

Use content_type for the material you send to Mighty. If a PDF contains images, still send it as pdf. Mighty scans page content and accounts for unique embedded images separately.

Limits can differ by plan and deployment. These are the product defaults developers should design around.

| Material | Common inputs | Use content_type | Limits and billing notes |
| --- | --- | --- | --- |
| Text | JSON strings, chat messages, OCR text, extracted fields, model output, tool output, .txt, SVG text | text | Text bills as 1 SCU per 1,000 tokens, rounded up. Base64 decoded content shares the 50 MB decoded payload limit. |
| Images | .jpg, .jpeg, .png, .webp, .gif, .bmp, .tif, .tiff, .heic, .heif, .ico, .cur | image or auto | Standalone images bill as 4 SCU per image. Default upload limit is 50 MB. Default image cap is 100,000,000 pixels. Default GIF cap is 200 frames. |
| PDFs | .pdf | pdf or auto | PDFs bill as 2 SCU per page plus 4 SCU per unique embedded image. Pro allows up to 1,000 pages and 100 unique embedded images per PDF. Free preview allows 4 pages and 1 unique embedded image. |
| Documents | .docx, .xlsx, .pptx, .odt, .ods, .odp, .rtf, .html, .htm, .csv, .tsv, .ipynb, .eml, mail-like .msg | document or auto | Default upload limit is 50 MB. Default unzipped document safety cap is 50 MB. Macro-enabled, encrypted, legacy Office, add-in, and template files can be rejected. |
| Audio | Transcript text today. Audio file scanning is closed beta. | text for transcripts | Do not send audio files unless your account is beta-enabled. Scan transcripts as text and set metadata like source=audio_transcript. |

When the type is unknown, use auto. When your server already knows the type, set the explicit value. Explicit values produce clearer failures and make routing, billing, and logs easier to understand.
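The billing notes in the table above reduce to simple arithmetic. The sketch below applies only the stated default rates; the function names are hypothetical, document billing is not stated in the table and is left out, and actual charges can differ by plan and deployment.

```python
def text_scu(tokens: int) -> int:
    """Text: 1 SCU per 1,000 tokens, rounded up (integer ceiling division)."""
    return (tokens + 999) // 1000


def image_scu(images: int = 1) -> int:
    """Standalone images: 4 SCU per image."""
    return 4 * images


def pdf_scu(pages: int, unique_embedded_images: int = 0) -> int:
    """PDFs: 2 SCU per page plus 4 SCU per unique embedded image."""
    return 2 * pages + 4 * unique_embedded_images


# A 12-page PDF with 3 unique embedded images: 2 * 12 + 4 * 3 = 36 SCU.
estimate = pdf_scu(pages=12, unique_embedded_images=3)
```

Under these rates, 1,001 tokens of text round up to 2 SCU, which is why trimming text just below a 1,000-token boundary can matter at volume.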

Common rejections:

| Status | Code | What it means |
| --- | --- | --- |
| 400 | invalid_pdf, invalid_document, invalid image format, or unsupported enum value | The file does not match the declared type, the parser cannot safely process it, or a config value is invalid. |
| 402 | tier_cap_exceeded, tier_pdf_pages_exceeded, tier_pdf_embedded_images_exceeded | The scan is valid, but the current plan does not allow that request size or billing state. |
| 413 | payload_too_large, image_pixel_limit, gif_frame_limit, pdf_page_limit, document_unzip_limit | The file is too large or too complex for the configured safety limits. |
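Going by the descriptions above, none of these rejections is likely to succeed if the same request is simply retried unchanged, so it can help to route each status to a different next step. This is a sketch only: the labels and the function name are hypothetical app-side choices, not values Mighty returns.

```python
# Hypothetical app-side labels for the three rejection statuses listed above.
NEXT_STEP_BY_STATUS = {
    400: "fix_request",     # wrong declared type, unparseable file, or bad enum
    402: "check_plan",      # valid scan, but the plan does not allow it
    413: "reduce_payload",  # file too large or too complex for safety limits
}


def next_step_for_rejection(status: int) -> str:
    """Return an app-side handling label; unknown statuses go to a human."""
    return NEXT_STEP_BY_STATUS.get(status, "escalate")
```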

Common mistake: converting a PDF to plain text to reduce cost, then treating the result as equivalent. That can miss embedded images, hidden text, suspicious layout signals, and document-level attack surfaces. If cost matters, scan the original file for high-risk workflows and scan extracted text for lower-risk enrichment.

mode

What it does: chooses the scan depth and latency target.

When to use it:

| Value | Use when | Tradeoff |
| --- | --- | --- |
| fast | Inline chat or low-risk text needs a quick decision. | Lowest latency, less depth. |
| secure | Production default for most apps. | Balanced latency and coverage. |
| comprehensive | Deep image or PDF review is worth more latency. | More depth, higher cost, required for async. |

Default value: secure.

Example request:

{
  "content": "Claim note text",
  "content_type": "text",
  "scan_phase": "input",
  "mode": "secure"
}

Common mistake: using comprehensive for every chat message. Start with secure and reserve deep review for images, PDFs, high-value cases, or suspicious workflows.

Mode is not tolerance. mode controls scan depth. profile, data_sensitivity, and your routing policy control how strict the product is after Mighty returns a result. See Modes And Tolerance before you tune production routing.
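The mode guidance above can be expressed as a small routing function. This is a sketch under assumptions: choose_mode and its two flags are hypothetical, and how your app decides that a case is low-risk or worth deep review is your own policy.

```python
def choose_mode(
    content_type: str,
    *,
    low_risk_inline: bool = False,
    needs_deep_review: bool = False,
) -> str:
    """Hypothetical routing: pick a scan depth, defaulting to secure."""
    if needs_deep_review:
        # Reserved for images, PDFs, high-value cases, or suspicious workflows.
        return "comprehensive"
    if low_risk_inline and content_type == "text":
        # Quick decision for inline chat or other low-risk text.
        return "fast"
    return "secure"
```

Note that the function returns only a scan depth. Strictness still comes from profile, data_sensitivity, and your routing policy.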

focus

What it does: chooses which family of checks gets priority.

When to use it:

| Value | Use when |
| --- | --- |
| standard | You mainly need threat and safety checks. |
| ai | You mainly need AI authenticity or AI fraud signals. |
| both | You need threat checks and AI signals together. |

Default value: standard.

Example request:

{
  "content": "Base64 image or extracted text",
  "content_type": "image",
  "scan_phase": "input",
  "mode": "secure",
  "focus": "both"
}

Common mistake: using focus=ai as a fraud verdict. Mighty flags suspicious evidence. It does not prove fraud by itself.

AI Involvement Metadata

What it does: preserves workflow context that is useful for review, logs, and AI coding agents.

When to use it:

| Metadata key | Use when |
| --- | --- |
| ai_involved | The material will be used by a model, agent, OCR automation, or AI review step. |
| submitted_as_ai_generated | Your app asks the submitter whether the material was AI-generated or edited. |
| workflow | You need to distinguish chat, claims, OCR, image review, invoices, or agent tools. |

Default value: none.

Example request:

{
  "content": "Uploaded image or extracted text",
  "content_type": "auto",
  "scan_phase": "input",
  "focus": "both",
  "metadata": {
    "workflow": "damage_photo_review",
    "ai_involved": "true",
    "submitted_as_ai_generated": "unknown"
  }
}

Common mistake: treating app metadata as a detection result. Metadata is your app's context. Mighty response fields like authenticity, forensics, threats, and risk_score are scan evidence.

scan_phase

What it does: tells Mighty where the material sits in your workflow.

When to use it:

| Value | Use when |
| --- | --- |
| input | A user, customer, vendor, claimant, partner, or upstream system submitted the material. |
| output | A model, OCR pipeline, extraction pipeline, agent, or automation generated the material. |

Default value: none. This field is required.

Example request:

{
  "content": "Generated answer shown to a user",
  "content_type": "text",
  "scan_phase": "output",
  "scan_group_id": "9b3e4f8d-96c9-4f42-8338-8cf9571c1c70"
}

Common mistake: scanning output without scan_group_id. Output scans need the group returned by the input scan.
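A client-side check can catch both rules before a request is sent. The sketch below assumes a hypothetical validate_scan_request helper; it covers only the two constraints stated in this section and does not replace server-side validation.

```python
VALID_PHASES = {"input", "output"}


def validate_scan_request(request: dict) -> list[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    phase = request.get("scan_phase")
    if phase not in VALID_PHASES:
        problems.append("scan_phase is required and must be 'input' or 'output'")
    if phase == "output" and not request.get("scan_group_id"):
        problems.append("output scans need the scan_group_id from the input scan")
    return problems
```

Returning a list instead of raising on the first problem lets a caller log every issue with a request in one pass.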

profile

What it does: chooses the risk posture.

When to use it:

| Value | Use when |
| --- | --- |
| balanced | Most production apps. |
| strict | Regulated, financial, insurance, legal, healthcare, or high-value workflows. |
| permissive | Low-risk internal workflows where false positives are more costly. |
| code_assistant | Developer tools and agent code workflows. |
| ai_safety | AI output, public assistants, or agentic systems. |

Default value: balanced.

Example request:

{
  "content": "Agent tool output",
  "content_type": "text",
  "scan_phase": "output",
  "mode": "secure",
  "profile": "ai_safety",
  "scan_group_id": "9b3e4f8d-96c9-4f42-8338-8cf9571c1c70"
}

Common mistake: setting permissive because a workflow is noisy. If the noise comes from expected PII, keep the profile and set data_sensitivity to tolerant instead.

data_sensitivity

What it does: controls how expected PII affects blocking.

When to use it:

| Value | Use when |
| --- | --- |
| standard | Default. PII can block unless context allows it. |
| tolerant | Business workflows expect contact details, addresses, claim numbers, or invoices. |
| strict | PII and credentials should block aggressively. |

Default value: standard.

Example request:

{
  "content": "Customer: Jane Doe, phone: 555-0100",
  "content_type": "text",
  "scan_phase": "input",
  "mode": "secure",
  "data_sensitivity": "tolerant"
}

Common mistake: using tolerant to bypass credential detection. Credentials and secrets should still be treated as high risk.

Sensitive Data And Redaction

What it does: separates expected business PII from unsafe disclosure paths.

When to use it:

| Need | Setting or response field |
| --- | --- |
| Claims, invoices, healthcare, identity, or support workflows contain normal PII | data_sensitivity=tolerant |
| Public AI output should not reveal secrets or sensitive text | data_sensitivity=strict and scan_phase=output |
| Mighty returns safer replacement text | Use redacted_output instead of the original output. |
| Mighty blocks output and no redaction is available | Do not show the original output. |

Default value: data_sensitivity=standard. redacted_output appears only when available.

Example response:

{
  "action": "BLOCK",
  "risk_score": 91,
  "risk_level": "CRITICAL",
  "threats": [
    {
      "category": "secrets_exposure",
      "confidence": 0.96,
      "evidence": "sk_live_8f1c9d4e2ab3",
      "reason": "Output contains a live API key pattern."
    }
  ],
  "redacted_output": "I cannot share that sensitive value.",
  "scan_phase": "output"
}

Common mistake: assuming every block can be safely rewritten. Use redacted_output only when Mighty returns it and your product policy allows it.
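The display decision can be sketched as a single function. Assumptions to note: text_to_display and allow_redaction are hypothetical, BLOCK is the only action value shown on this page, and the choice to withhold output when a redaction exists but product policy forbids using it is a conservative guess on the sketch's part.

```python
from typing import Optional


def text_to_display(
    original_output: str,
    scan_result: dict,
    allow_redaction: bool = True,
) -> Optional[str]:
    """Hypothetical helper: return the text to show, or None to show nothing."""
    redacted = scan_result.get("redacted_output")
    if redacted is not None:
        # Prefer Mighty's replacement text; never fall back to the original here.
        return redacted if allow_redaction else None
    if scan_result.get("action") == "BLOCK":
        # Blocked with no redaction available: do not show the original output.
        return None
    # Any other action is treated as showable in this sketch; tighten as needed.
    return original_output
```

With the example response above, this returns "I cannot share that sensitive value." and the original output containing the key is never shown.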

IDs And Idempotency

What they do: connect scans to logs, sessions, retries, and downstream review.

| Field | Use it for |
| --- | --- |
| request_id | One unique ID per request. Use it for idempotency and logs. |
| scan_id | The scan result ID returned by Mighty. Use it for audit and polling. |
| scan_group_id | Connect input scans, output scans, and derived evidence. |
| session_id | Keep a chat, claim, case, or workflow together over time. |

Default value: Mighty can generate missing request_id, scan_group_id, and session_id for input scans.

Example request:

{
  "content": "Uploaded invoice text",
  "content_type": "text",
  "scan_phase": "input",
  "request_id": "ab82f4ad-8d64-4bb4-b4ed-77df63291198",
  "scan_group_id": "9b3e4f8d-96c9-4f42-8338-8cf9571c1c70",
  "session_id": "claim_18422"
}

Common mistake: generating a new scan_group_id for model output. Reuse the input scan group so the evidence stays connected.
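A sketch of how an output scan can inherit IDs from the input scan that preceded it. It assumes a hypothetical build_output_scan helper and that your app supplied scan_group_id and session_id on the input request, as in the example above. If Mighty generated them instead, read them from the input scan's response; the response field names are not shown on this page.

```python
import uuid


def build_output_scan(input_request: dict, generated_text: str) -> dict:
    """Hypothetical helper: scan model output in the same group and session."""
    return {
        "content": generated_text,
        "content_type": "text",
        "scan_phase": "output",
        # One unique ID per request, so the output scan gets a fresh one.
        "request_id": str(uuid.uuid4()),
        # Reused so input, output, and derived evidence stay connected.
        "scan_group_id": input_request["scan_group_id"],
        "session_id": input_request["session_id"],
    }
```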

Next step

Ready to scan real traffic?

Create an API key, keep it on your server, then wire Mighty into the workflow that handles untrusted material.