Import formats, size limits, and fidelity

A practical guide to what imports can carry, what gets normalized, and where limits apply.

Scope

This article covers the import formats Valo Reader currently accepts, the size limits that apply, and what gets preserved after import. For step-by-step instructions on specific paths, see Markdown and plain text imports and YouTube imports.

Import paths

The Create Story page supports these content types:

  • Manual Text — write or paste story text directly.
  • Text + Audio — upload a script or transcript, then attach an audio file.
  • Subtitles Only — upload timed subtitles and use the subtitle text as the story.
  • Media + Subtitles — upload audio or video with a timed subtitle file.
  • YouTube — import from a YouTube watch page using the Valo Reader browser extension.

The browser extension can also send webpage imports into the story creation flow. Web imports depend on what readable text the extension can extract from the page.

Supported files

Text-like files. .txt, .text, .md, and .markdown are read as text. Line breaks are normalized, and you can review and edit the result before creating the story.

PDF. Text-based PDFs are supported. Encrypted or password-protected PDFs cannot be imported, and scanned image PDFs are not supported because Valo Reader cannot extract text from them. If extraction fails on a text-based PDF, try a different copy of the file or paste the text directly.

ePub. ePub imports extract chapter text from the book spine and strip most styling. Navigation pages are skipped, but other front matter (copyright pages, table of contents, acknowledgements) is included, so you may want to trim it after import. Ruby text (furigana) is removed, which is expected for reading practice. DRM-protected ePubs cannot be imported. Image-heavy ePubs with minimal text may produce near-empty results. Review the extracted text before creating the story.

Subtitle files. .srt, .vtt, .ass, and .ssa are supported for Subtitles Only imports. Media + Subtitles currently accepts only SRT or VTT in the subtitle slot. Subtitle styling is removed, and timing cues are kept when they can be parsed.
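
As a rough illustration of what "styling removed, timing kept" means for an SRT file, the sketch below parses cues and strips inline tags. It is not Valo Reader's implementation: the `Cue` type, the `parse_srt` function, and both regular expressions are assumptions made for this example.

```python
import re
from dataclasses import dataclass


# Hypothetical cue type for this example; not Valo Reader's data model.
@dataclass
class Cue:
    start: str
    end: str
    text: str


# An SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})"
)
# HTML-like tags (<i>, <font ...>) and ASS-style override blocks ({\an8}).
STYLING = re.compile(r"<[^>]+>|\{\\[^}]*\}")


def parse_srt(raw: str) -> list[Cue]:
    """Return cues with timing kept and styling tags removed."""
    cues = []
    # Normalize line endings, then split into cue blocks on blank lines.
    normalized = raw.replace("\r\n", "\n").replace("\r", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is cue text.
                text = " ".join(STYLING.sub("", l).strip() for l in lines[i + 1:])
                cues.append(Cue(match.group(1), match.group(2), text.strip()))
                break
    return cues
```

Blocks without a parseable timing line are skipped in this sketch, which loosely mirrors the "when they can be parsed" caveat above.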

Audio and video files. Audio and video files can be uploaded for Text + Audio or Media + Subtitles. The file picker accepts common audio/video types, including MP3, M4A, WAV, FLAC, OGG/Opus, MP4, MOV, MKV, AVI, WebM, and similar browser-recognized media files.
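
To check a folder of files before uploading, the supported types above can be expressed as a small lookup. This is a sketch only: the category names and the `classify_import` and `fits_media_subtitle_slot` helpers are hypothetical, and the media extensions are an assumed mapping from the format names listed above (the file picker may accept others).

```python
from pathlib import Path

# Extension groups taken from the article; category names are illustrative.
TEXT_EXTS = {".txt", ".text", ".md", ".markdown"}
DOCUMENT_EXTS = {".pdf", ".epub"}
SUBTITLE_EXTS = {".srt", ".vtt", ".ass", ".ssa"}
# Assumed extensions for the media formats named in the article.
MEDIA_EXTS = {
    ".mp3", ".m4a", ".wav", ".flac", ".ogg", ".opus",
    ".mp4", ".mov", ".mkv", ".avi", ".webm",
}


def classify_import(filename: str) -> str:
    """Return 'text', 'document', 'subtitle', 'media', or 'unknown'."""
    ext = Path(filename).suffix.lower()
    if ext in TEXT_EXTS:
        return "text"
    if ext in DOCUMENT_EXTS:
        return "document"
    if ext in SUBTITLE_EXTS:
        return "subtitle"
    if ext in MEDIA_EXTS:
        return "media"
    return "unknown"


def fits_media_subtitle_slot(filename: str) -> bool:
    """Media + Subtitles currently takes only SRT or VTT subtitles."""
    return Path(filename).suffix.lower() in {".srt", ".vtt"}
```

An extension match says nothing about whether the file will import cleanly; an encrypted PDF or a DRM-protected ePub would still be classified as a document here.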

What gets preserved

Valo Reader preserves the pieces needed for reading and listening:

  • Story text
  • Paragraph breaks and simple line structure
  • Subtitle timing cues when a subtitle file or YouTube captions provide them
  • Uploaded audio after processing
  • A YouTube source link when the story comes from YouTube

Valo Reader does not try to preserve document layout. Expect these to be simplified or removed:

  • Fonts, colors, page layout, margins, and page numbers
  • Images and embedded media inside documents
  • Markdown styling such as bold, italics, links, tables, and code formatting
  • ePub styling and navigation pages (other front matter, such as copyright pages, is kept; see Supported files)
  • Subtitle styling and positioning

After import, the story text is tokenized into individual words for study. See Choosing your study language to make sure your language is set so definitions and vocabulary tracking work correctly.

Size limits

Current upload limits are:

  • Document and subtitle files: 10 MB.
  • Audio and video files: 1 GB.
  • Story text: 5,000,000 characters for one story.
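
If you prepare files with a script, the limits above can be checked before uploading. The following is a minimal sketch under stated assumptions: the article does not say whether MB and GB are decimal or binary units (binary is assumed here), nor how characters are counted (Python's `len` counts Unicode code points). The function names are hypothetical.

```python
# Limits quoted in the article. Binary units are an assumption.
DOC_SUBTITLE_MAX_BYTES = 10 * 1024 * 1024
MEDIA_MAX_BYTES = 1024 * 1024 * 1024
STORY_MAX_CHARS = 5_000_000

UPLOAD_LIMITS = {
    "document": DOC_SUBTITLE_MAX_BYTES,
    "subtitle": DOC_SUBTITLE_MAX_BYTES,
    "media": MEDIA_MAX_BYTES,
}


def check_upload(kind: str, size_bytes: int) -> bool:
    """Return True if a file of the given kind fits its upload limit."""
    if kind not in UPLOAD_LIMITS:
        raise ValueError(f"unknown upload kind: {kind!r}")
    return size_bytes <= UPLOAD_LIMITS[kind]


def check_story_text(text: str) -> bool:
    """Return True if the text fits the per-story character limit."""
    return len(text) <= STORY_MAX_CHARS
```

For a file on disk, `os.path.getsize(path)` gives the byte count to pass to `check_upload`. This sketch does not cover the free-plan word limit, which depends on your account.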

Free-plan accounts also have a per-story word limit for imports and full text replacements. The Create Story page shows your current free-plan limit when it applies. See Free plan limits for lifetime story caps and per-story word limits.

If you hit a limit, split the content into smaller stories, use a smaller file, or paste a shorter excerpt.
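
One way to split long text into smaller stories is to group whole paragraphs until a size budget is reached. The sketch below is an illustration rather than a built-in feature; `split_into_parts` and its `max_chars` parameter are hypothetical, and it counts characters, so a word-based budget (as on the free plan) would need a word count instead.

```python
def split_into_parts(text: str, max_chars: int) -> list[str]:
    """Group paragraphs into parts of at most max_chars characters each.

    Paragraphs are separated by blank lines. A single paragraph longer
    than max_chars is kept whole in its own part rather than cut.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    parts: list[str] = []
    current: list[str] = []
    current_len = 0
    for para in paragraphs:
        # The extra 2 accounts for the blank line that joins paragraphs.
        added = len(para) + (2 if current else 0)
        if current and current_len + added > max_chars:
            # Budget exceeded: close the current part and start a new one.
            parts.append("\n\n".join(current))
            current, current_len = [], 0
            added = len(para)
        current.append(para)
        current_len += added
    if current:
        parts.append("\n\n".join(current))
    return parts
```

Splitting on paragraph boundaries keeps each story readable on its own; each returned part can then be pasted into Manual Text as a separate story.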

Text cleanup during import

During import, Valo Reader applies these cleanup steps:

  • Repeated blank lines are collapsed.
  • Extra whitespace and artifacts from PDF extraction are removed where possible.
  • Subtitle formatting tags and styling commands are stripped from subtitle text.
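
The whitespace part of this cleanup can be approximated before import if you want to preview the result. This is a sketch of the general technique, not the app's actual rules: `clean_text` is a hypothetical helper, and it does not attempt to remove PDF extraction artifacts.

```python
import re


def clean_text(raw: str) -> str:
    """Normalize line endings and collapse extra whitespace."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Remove trailing spaces and tabs at the end of each line.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse runs of spaces or tabs within a line to one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse repeated blank lines to a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Single line breaks are left alone, so paragraph breaks and simple line structure survive the cleanup.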

The app does not verify that imported text matches your study language before creating the story. If the language is wrong, definitions and vocabulary tracking may not work correctly. See Choosing your study language to set the correct language before importing.

YouTube and web imports

Use the Valo Reader browser extension to import YouTube videos. The extension pulls the caption text and timing data from the video and opens a new story for you to review. Videos with creator-uploaded subtitles produce the best results. YouTube's auto-generated captions are less reliable and may contain transcription errors. See YouTube imports for more.

Webpage imports also come through the extension. They are best for pages with clear article text. Pages that rely heavily on scripts, paywalls, custom layouts, or embedded media may produce incomplete text.

Practical advice

  • For books, import one chapter or section at a time.
  • For PDFs, use text-based files; scanned image PDFs are not supported.
  • For YouTube, prefer videos with creator-uploaded subtitles over auto-generated captions, which may contain transcription errors.
  • For subtitles, check that the extracted transcript reads naturally before creating the story.
  • For media files, use shorter clips when possible so processing and review are easier.
  • For Markdown, treat the file as a convenient text source, not a rich-format document.