FAIR Data: Interoperable
Data and metadata should conform to widely used file formats, and disciplinary standards for data collection should be used where possible, so that your data can be combined and re-used with other data.
- Your data will be provided in suitable file formats for long term access and re-use.
- The metadata will be provided following relevant disciplinary standards.
- Controlled vocabularies, keywords, thesauri or ontologies will be used where possible.
File Formats & Standards
When choosing file formats for research data it's important to consider whether the format is:
- Open & non-proprietary
- Uncompressed or lossless
File formats that are open or non-proprietary tend to have a good chance of remaining accessible, even if the software that created them is no longer available. Specialised proprietary formats used only by a niche set of users may present problems for future use. Formats which are ubiquitous or have become the default standard within a discipline, whether proprietary or not, are also more likely to be maintained into the future. This is important whether you plan on sharing and archiving your data at the end of your research project or simply want the data to remain accessible to yourself and other researchers in your department.
- Proprietary format: Photoshop .psd file
- Open format: .tiff image file
Formats that use 'lossy' compression are often smaller in file size, but part of the data is discarded during the encoding process and cannot be recovered afterwards.
- Lossy formats: .mp3 audio file, .jpeg image file
- Lossless formats: .wav audio file, .tiff image file
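The difference between lossless and lossy storage can be illustrated with a minimal Python sketch. It is not how .mp3 or .jpeg encoding works internally: the readings are hypothetical, `zlib` stands in for any lossless compressor, and rounding stands in for any lossy step that discards detail.

```python
import zlib

# Hypothetical sensor readings, used purely for illustration.
readings = [20.137, 20.142, 20.151, 20.149, 20.160]
original = ",".join(f"{r:.3f}" for r in readings).encode("utf-8")

# Lossless: compressing and then decompressing returns identical bytes.
restored = zlib.decompress(zlib.compress(original))
lossless_ok = restored == original

# Lossy stand-in: rounding to one decimal place discards detail permanently.
rounded = [round(r, 1) for r in readings]
distinct_before = len(set(readings))
distinct_after = len(set(rounded))

print("Lossless round trip identical:", lossless_ok)
print("Distinct values before and after rounding:", distinct_before, distinct_after)
```

Once the readings have been rounded, several of them become indistinguishable, and no later processing can restore the original values.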
Things to consider when choosing a file format:
- How you plan to analyse your data
- Which software and file formats you and your colleagues have used in the past
- Any discipline specific norms or technical standards
- Whether file formats are at risk of obsolescence because of their dependence on a particular technology
- Which formats are best to use for the long-term preservation of data
- Whether important information might be lost by converting between different formats
- The possibility of embedding metadata that describes content within the file itself, e.g. creator information, variable names and labels
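As an illustration of the last point, the sketch below stores descriptive metadata alongside the data in a single JSON document, using only the Python standard library. The structure, field names and values are all hypothetical; a disciplinary metadata standard would normally dictate the actual fields.

```python
import json

# All field names and values below are hypothetical examples.
dataset = {
    "metadata": {
        "creator": "A. Researcher",
        "created": "2012-09-27",
        "variables": {
            "site": "Sampling site identifier",
            "temp_c": "Air temperature in degrees Celsius",
        },
    },
    "data": [
        {"site": "S1", "temp_c": 14.2},
        {"site": "S2", "temp_c": 15.0},
    ],
}

# Serialise to text (as it would be written to a file) and read it back.
text = json.dumps(dataset, indent=2)
loaded = json.loads(text)

# The variable labels travel with the data, so a later reader can interpret it.
labels = loaded["metadata"]["variables"]
for name in loaded["data"][0]:
    print(f"{name}: {labels[name]}")
```

Because the labels are stored in the same file as the values, the data remains interpretable even if separate documentation is lost.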
Sometimes it is useful to store your data in one format for data collection and analysis, and also in a more open or accessible format for sharing or archiving once your project is complete. If you intend to share your data, your chosen archive or repository will likely have recommended file formats based on best practice within the disciplines it supports.
Find a Metadata Standard for your Discipline
- FAIRsharing: FAIRsharing is a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies.
- DCC list of discipline-specific metadata standards: A detailed list of discipline-specific metadata standards has been compiled by the Digital Curation Centre (DCC).
- Research Data Alliance Metadata Standards Directory: The RDA Metadata Standards Directory contains widely used metadata standards in the Arts & Humanities, Engineering, Life Sciences, Physical Sciences & Mathematics, Social & Behavioral Sciences and General Research Data.
Controlled Vocabularies & Ontologies
- Getty Thesaurus of Geographic Names (TGN): The Getty Thesaurus of Geographic Names (TGN) includes names and associated information about places.
- Library of Congress Subject Headings (LCSH): Library of Congress Subject Headings (LCSH) comprise a thesaurus or controlled vocabulary of subject headings, maintained by the United States Library of Congress.
- W3C Note on Date and Time Formats (W3CDTF): This document defines a profile of ISO 8601, the International Standard for the representation of dates and times. ISO 8601 describes a large number of date/time formats. To reduce the scope for error and the complexity of software, it is useful to restrict the supported formats to a small number. This profile defines a few date/time formats, likely to satisfy most requirements.
- ISO 8601 Date and Time Format: An internationally accepted way to represent dates and times using numbers, in the form YYYY-MM-DD. For example, September 27, 2012 is represented as 2012-09-27.
- DCMI Type Vocabulary: The DCMI Type Vocabulary provides a general, cross-domain list of approved terms that may be used as values for the Resource Type element to identify the genre of a resource.
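The ISO 8601 date format listed above can be produced and parsed with Python's standard `datetime` module. This is a minimal sketch that reuses the example date from this guide; the time of day in the timestamp is an assumption added for illustration.

```python
from datetime import date, datetime, timezone

# Format a calendar date as ISO 8601 (YYYY-MM-DD).
d = date(2012, 9, 27)
iso_text = d.isoformat()
print(iso_text)  # 2012-09-27

# Parse an ISO 8601 date string back into a date object.
parsed = date.fromisoformat("2012-09-27")
print(parsed.year, parsed.month, parsed.day)  # 2012 9 27

# A full timestamp with an explicit UTC offset (the 14:30 time is hypothetical).
stamp = datetime(2012, 9, 27, 14, 30, tzinfo=timezone.utc).isoformat()
print(stamp)  # 2012-09-27T14:30:00+00:00
```

Writing dates this way avoids the ambiguity between day-first and month-first conventions, and the resulting strings sort chronologically when sorted as plain text.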
- Last Updated: Nov 21, 2022 11:23 AM
- URL: https://libguides.ucd.ie/FAIR