
Supported Audio & Video Formats

Audio and video processing is available on Professional plans and above. Community-tier accounts can upload documents and images only.


Audio formats

| Format | Container | Notes |
| --- | --- | --- |
| MP3 | MP3 | Most common; widely compatible |
| WAV | WAV | Uncompressed; preferred for preservation masters |
| M4A | MP4 | MPEG-4 audio container; holds lossy (AAC) or lossless (ALAC) audio |
| FLAC | FLAC | Lossless compression; recommended over MP3 for archival |
| OGG | OGG | Open container; less common in archives |
| AAC | AAC / M4A | Lossy; common from mobile recorders |

File size limits (Professional / Team): 500 MB per file. Enterprise: unlimited.


Video formats

| Format | Container | Notes |
| --- | --- | --- |
| MP4 | MP4 (H.264 / H.265) | Most common; recommended for upload |
| MOV | QuickTime | Common from broadcast and consumer cameras |
| WebM | WebM | Open web video format |
| AVI | AVI | Older container; supported for historical files |
| MKV | Matroska | Modern open container |

File size limits (Professional / Team): 2 GB per file. Enterprise: unlimited.
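
The limits above can be checked locally before an upload is attempted. The following is a minimal sketch, not part of the Archiver itself: the function names, the extension-to-format mapping, and the choice of decimal megabytes (the page does not say whether MB means 10^6 or 2^20 bytes) are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: decimal units. The page does not state MB vs MiB.
MB = 1000 ** 2
GB = 1000 ** 3

# Assumption: typical file extensions for the formats listed in the tables.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}
VIDEO_EXTS = {".mp4", ".mov", ".webm", ".avi", ".mkv"}

# Per-file limits as stated on this page; None means unlimited.
# Plans without audio/video processing (e.g. Community) are simply absent.
LIMITS = {
    "professional": {"audio": 500 * MB, "video": 2 * GB},
    "team": {"audio": 500 * MB, "video": 2 * GB},
    "enterprise": {"audio": None, "video": None},
}


def media_kind(filename):
    """Return 'audio', 'video', or None based on the file extension."""
    ext = Path(filename).suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    return None


def check_upload(filename, size_bytes, plan):
    """Return (ok, reason) for a prospective audio or video upload."""
    kind = media_kind(filename)
    if kind is None:
        return False, "not a listed audio or video format"
    limits = LIMITS.get(plan.lower())
    if limits is None:
        return False, f"no audio or video processing on plan '{plan}'"
    limit = limits[kind]
    if limit is not None and size_bytes > limit:
        return False, f"{kind} file exceeds the {limit // MB} MB per-file limit"
    return True, "ok"
```

For example, `check_upload("oral-history.wav", 650 * MB, "Team")` reports a failure because the file is over the 500 MB audio limit, while the same call with `"Enterprise"` passes.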

A video item counts as 3 items against your monthly quota, because three processes run on it in parallel: transcription, frame extraction, and content analysis.


Preservation guidance

The Library of Congress maintains recommended formats for archival use. In summary:

| Use case | Preferred | Acceptable | At-risk |
| --- | --- | --- | --- |
| Audio master | WAV (uncompressed, 24-bit / 96 kHz) | FLAC | MP3 |
| Audio access copy | FLAC | MP3 320 kbps | M4A AAC |
| Video master | MP4 H.264 (high bitrate) or MOV ProRes | MKV | AVI, WMV |
| Video access copy | MP4 H.264 | WebM | Older codecs |

The Archiver doesn't transcode your uploads — what you upload is what's stored and exported in BagIt / preservation packages. Upload the highest-quality version your plan allows.
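
Because uploads are stored unchanged, a checksum taken before upload should match the one recorded for the same file in an exported bag. The sketch below shows one way to check that. It relies on the BagIt convention that a payload manifest lists one `checksum filepath` pair per line; that the Archiver's bags carry a SHA-256 manifest is an assumption, and both function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_digest(manifest_text, payload_path):
    """Look up a payload path in BagIt manifest text.

    Each manifest line has the form '<checksum> <filepath>'.
    Returns the checksum in lowercase, or None if the path is not listed.
    """
    for line in manifest_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip() == payload_path:
            return parts[0].lower()
    return None
```

Comparing `sha256_of("interview.wav")` with `manifest_digest(text, "data/interview.wav")` then confirms that the exported file is byte-for-byte the one that was uploaded. The `data/` prefix is the standard BagIt payload directory; the actual path inside an exported bag may differ.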


Container vs codec

The formats listed above are containers; the codec is the encoding of the audio or video streams inside the container. The Archiver supports any audio or video codec that ffmpeg can decode, which covers nearly every codec in common use. You shouldn't hit a "codec not supported" error unless the file is corrupt.
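
To see the container and the codecs in a particular file, you can inspect it locally with ffprobe, which ships with ffmpeg. The following is a minimal sketch that assumes `ffprobe` is installed and on your PATH; the function names are illustrative, and this is not a description of how the Archiver itself inspects uploads.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its JSON report as a dict."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=format_name:stream=codec_type,codec_name",
            "-of", "json",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Split an ffprobe report into (container, [(stream type, codec), ...])."""
    container = report.get("format", {}).get("format_name", "unknown")
    codecs = [
        (s.get("codec_type", "unknown"), s.get("codec_name", "unknown"))
        for s in report.get("streams", [])
    ]
    return container, codecs
```

Calling `summarize(probe("lecture.mp4"))` on a typical MP4 returns a container name such as `mov,mp4,m4a,3gp,3g2,mj2` (ffprobe reports the whole demuxer family) together with stream entries such as `("video", "h264")` and `("audio", "aac")`. If ffprobe exits with an error, `probe` raises `subprocess.CalledProcessError`, which is a reasonable hint that the file may be corrupt.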


What about subtitles?

If you have an existing transcript or subtitle file (SRT, VTT), upload it alongside the media and the platform will use it instead of running its own transcription. The item still counts at its usual cost (1 for audio, 3 for video): transcription is skipped, but vision and content analysis still run.
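
When preparing a batch, it can help to estimate the quota cost and see which media files already have a subtitle file next to them. The sketch below uses the item costs stated on this page. The pairing rule (a subtitle file with the same base name as the media file) is an assumption for illustration; the page does not say how the platform matches the two.

```python
from pathlib import Path

# Item costs as stated on this page; supplying subtitles does not reduce them.
ITEM_COST = {"audio": 1, "video": 3}

SUBTITLE_EXTS = (".srt", ".vtt")


def find_sidecar(media_path):
    """Return a same-named .srt or .vtt file next to the media file, if any.

    Assumption: matching by base name is a hypothetical convention.
    """
    media = Path(media_path)
    for ext in SUBTITLE_EXTS:
        candidate = media.with_suffix(ext)
        if candidate.exists():
            return candidate
    return None


def quota_cost(kinds):
    """Sum the quota cost for an iterable of 'audio' / 'video' items."""
    return sum(ITEM_COST[kind] for kind in kinds)
```

A batch of one audio file and two videos costs `quota_cost(["audio", "video", "video"])`, which is 7 items, whether or not any of them come with subtitles.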

