Digital Repositories at Chapman University

Digital Commons and Figshare are open access repositories for sharing and preserving the research outputs of Chapman scholars and researchers.

Choosing file formats for sharing and preservation

The format your data is stored in may make it harder for others to use or less stable for long-term preservation.

The Library of Congress provides an extensive reference manual for choosing appropriate formats for preservation if your data type is not listed below.

The "How to Package Your Data" page of the Libraries' Data Management guide has additional resources on metadata and data standards that may be helpful during this process.

Recommended File Formats

Files that use open or ubiquitous formats are easier to share, preserve, and reuse. The table below provides file format recommendations for common file types. Formats in bold indicate the best choice if multiple options are available. If you plan to use Chapman Figshare, please prepare your data files based on these recommendations when possible.

File Type | Recommended Formats | Acceptable Formats
Audio | Wave (.wav), FLAC (.flac), AIFF (.aiff, .aif) | MPEG (.mp3), FLAC, Ogg Vorbis (.ogg)
Database | eXistdb database files (.xml files in a hierarchy of directories) |
Image: Raster | Uncompressed Tagged Image File Format (.tif or .tiff) | JPEG (low/no compression), BMP (.bmp), PNG (.png)
Image: Vector | Scalable Vector Graphics (.svg) |
Tabular data | Comma Separated Values (.csv) or Tab Separated Values (.tsv), UTF-8 encoding preferred | MS Excel (.xlsx, 2007 or later), Open Office (.ods), structured plain text files (.txt)
Text: Formatted | PDF/A (.pdf), Rich Text (.rtf) | Other PDF formats (.pdf), Microsoft Word (.docx), Open Office (.odt)
Text: Plain | Plain Text (.txt) with UTF-8 or ASCII encoding |
Structural: Markup | SGML with valid DTD (.sgm, .sgml), XML with valid DTD (.xml), KML (.kml), JSON (.json), Markdown (.md) | HTML (.html, .htm)
Video | MPEG-4 (.mp4) | AVI (.avi), QuickTime (.mov)
Virtual Reality | X3D (.x3d) |
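
As a rough illustration of the tabular data recommendation, the sketch below rewrites a delimited text file as a comma-separated, UTF-8 encoded .csv file using only Python's standard library. The function name, the assumed source encoding (cp1252), and the assumed source delimiter (tab) are hypothetical choices for this example, not part of the recommendations; adjust them to match your own files.

```python
import csv


def convert_to_utf8_csv(src_path, dst_path, src_encoding="cp1252", src_delimiter="\t"):
    """Rewrite a delimited text file as UTF-8 encoded CSV.

    Returns the number of rows written. The default source encoding and
    delimiter are assumptions for this example only.
    """
    rows_written = 0
    # newline="" lets the csv module manage line endings itself.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        reader = csv.reader(src, delimiter=src_delimiter)
        writer = csv.writer(dst)  # comma-separated, minimal quoting by default
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

For example, `convert_to_utf8_csv("survey.tsv", "survey.csv")` would read a tab-separated file named survey.tsv (a made-up file name) and write survey.csv next to it. Open the result in a text editor afterward to confirm that accented characters and column boundaries survived the conversion.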

 

File Formats to Avoid

Avoid the file formats in the table below because they are proprietary, discontinued, or difficult to preserve or reuse.

File Types | File Formats | Reasoning
Audio | Real Audio (.ra, .rm, .ram), Windows Media Audio (.wma) | Proprietary and discontinued
Statistics Packages | .JMP files, .sav files | Proprietary
Adobe Creative Suite | .psd, .eps, .indd, etc. | Proprietary
Older Microsoft Office Suite | .doc, .xls | Proprietary and defunct
Images | JPEG 2000 (.jp2) | Defunct
Video | Windows Media Video (.wmv) | Proprietary and defunct
Generic | .dat | DAT files usually contain structured text, so it is better to save the data in a format that makes the structure explicit (such as .csv or .tsv).
Databases | Microsoft Access (.accdb), FileMaker Pro (.fp5, .fp7, .fmp12, etc.) | Proprietary
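
Before depositing a dataset, it can help to check file names against the formats listed above. The following is a minimal sketch, not an official Chapman or Figshare tool: the dictionary covers only a subset of the extensions in the table, and the names `FORMATS_TO_AVOID` and `find_formats_to_avoid` are hypothetical.

```python
from pathlib import Path

# A subset of the extensions from the "File Formats to Avoid" table,
# in lowercase, mapped to the reasoning given there.
FORMATS_TO_AVOID = {
    ".ra": "Proprietary and discontinued",
    ".ram": "Proprietary and discontinued",
    ".rm": "Proprietary and discontinued",
    ".wma": "Proprietary and discontinued",
    ".jmp": "Proprietary",
    ".sav": "Proprietary",
    ".psd": "Proprietary",
    ".eps": "Proprietary",
    ".indd": "Proprietary",
    ".doc": "Proprietary and defunct",
    ".xls": "Proprietary and defunct",
    ".jp2": "Defunct",
    ".wmv": "Proprietary and defunct",
    ".dat": "Structure not explicit; consider .csv or .tsv",
    ".accdb": "Proprietary",
}


def find_formats_to_avoid(filenames):
    """Return (filename, reason) pairs for names whose extension is listed above."""
    flagged = []
    for name in filenames:
        # Compare case-insensitively so "Report.DOC" matches ".doc".
        extension = Path(name).suffix.lower()
        if extension in FORMATS_TO_AVOID:
            flagged.append((name, FORMATS_TO_AVOID[extension]))
    return flagged
```

To scan a whole folder, one option is to pass in `[p.name for p in Path("my_dataset").rglob("*") if p.is_file()]`, where my_dataset is a placeholder folder name. The check looks only at file extensions, so it cannot tell whether a file's contents actually match its extension.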

 

Examples on this page were kindly provided by Iowa State University, reused with permission.