📄 Metadata Upload Guide

You can improve the accuracy and relevance of automated processing by uploading a small JSON file with additional metadata about your documents. The fields below are optional; include only what you know.

Tip: If you’re unsure about a field, you can leave it out. The system will use defaults automatically.
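
As a rough sketch, a metadata file that supplies only a few of the fields described below might look like this. The values are taken from the per-field examples in this guide; placing the fields as top-level keys of a single JSON object is an assumption, since the guide does not spell out the overall file structure.

```json
{
  "non_textual_elements": ["illustrations", "handwritten_notes"],
  "image_quality_notes": ["blurry", "includes_fingers"],
  "description": "These are scanned forms from a late 20th-century survey project on urban housing."
}
```

Every field left out of the file falls back to the system defaults.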


๐Ÿ“ writing_style

Describes the visual form of the text in the document.

This helps the system choose between transcription strategies (e.g., handwriting recognition vs. printed text).


๐ŸŒ language

The main language(s) used in the document.


📅 time_period

Gives a rough estimate of when the document was created. This can influence how the system interprets spelling, abbreviations, and writing style.


โš™๏ธ transcription_preferences

Specify how you’d like the system to handle certain aspects of transcription. Each setting is optional; include only the ones you want to customize.

Defaults:

{
  "expand_abbreviations": false,
  "preserve_line_breaks": true,
  "retain_punctuation_and_spelling": true,
  "normalize_to_modern_language": false,
  "ignore_marginalia": false
}
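
As an illustration, the fragment below changes two settings and omits the rest, which then keep the defaults listed above. It assumes the settings are nested under a transcription_preferences key that mirrors the field name; the guide does not state the nesting explicitly.

```json
{
  "transcription_preferences": {
    "expand_abbreviations": true,
    "ignore_marginalia": true
  }
}
```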

๐Ÿ“ layout_structure

Describes how the text is arranged on the page. This helps guide interpretation of complex layouts.


🎨 non_textual_elements

List any non-text content that appears in the document, so the system does not misinterpret it as text to be transcribed.

Example:

"non_textual_elements": ["illustrations", "handwritten_notes"]

🌈 color_format

Indicates the color style of the scanned image. This helps the system calibrate processing.


โ†”๏ธ orientation

Specifies how the page is oriented. This helps avoid mistakes when the image isn’t upright.


🧾 image_quality_notes

List any known image quality issues so the system can adapt or flag the content.

Example:

"image_quality_notes": ["blurry", "includes_fingers"]

โœ๏ธ description

A free-text description of the document set. This can provide general context to improve results for both transcription and metadata generation.

Example:

"description": "These are scanned forms from a late 20th-century survey project on urban housing."