Research Data Management: Metadata
At a Glance
Metadata is another form of documentation and is simply ‘data about data’.
- It is related to the broader contextual information that describes your data, but is usually more structured in that it conforms to set standards and is machine readable.
- Metadata standards provide specific data fields or elements to be used in describing data for a particular use.
- Some research fields have predefined metadata standards.
- Dublin Core is used here as an example of a metadata schema, both to illustrate the types of metadata you might need to capture and to show that metadata can be captured in a standardised way using controlled vocabularies.
Find a Metadata Standard for your Discipline
Controlled Vocabularies & Ontologies
Rich metadata (elements which describe the data) enhance the findability, interoperability and reusability of your data. To comply with the FAIR Principles, metadata should be accessible wherever possible, even if the data aren’t.
The quality of the descriptive information (metadata and documentation) about the data has a profound impact on their reusability, so the more documentation and metadata you can provide, the better.
Your chosen Data Repository or Archive may have a metadata template you can complete or a required standard you must use. If not, you should follow relevant disciplinary standards.
National Archives of Australia - Meta... What? Metadata!
Dublin Core comprises 15 “core” metadata elements. It is one of the simplest and most widely used metadata schemas. The name "Dublin" comes from its origin at a 1995 invitational workshop in Dublin, Ohio; it has nothing to do with Dublin, Ireland, unfortunately. Originally developed to describe web resources, Dublin Core has since been used to describe a variety of physical and digital resources.
Built into the Dublin Core standard are definitions of each metadata element that state what kinds of information should be recorded where and how. Associated with many of the data elements are suggested controlled vocabularies.
All elements are optional and repeatable.
|Dublin Core Element||Definition||Suggested Controlled Vocabulary|
|Title||A name given to the resource. Typically, a Title will be a name by which the resource is formally known.|
|Creator||An entity primarily responsible for making the resource. Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.|
|Date||A point or period of time associated with an event in the lifecycle of the resource. Date may be used to express temporal information at any level of granularity.||W3C Note on Date and Time Formats (W3CDTF)|
|Description||An account of the resource. Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource.|
|Rights||Information about rights held in and over the resource. Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.|
|Type||The nature or genre of the resource. To describe the file format, physical medium, or dimensions of the resource, use the Format element.||DCMI Type Vocabulary|
|Language||A language of the resource.||
|Contributor||An entity responsible for making contributions to the resource. Examples of a Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity.|
|Relation||A related resource. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.|
|Source||A related resource from which the described resource is derived. The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.|
|Coverage||The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant. Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges.||Thesaurus of Geographic Names (TGN)|
|Subject||The topic of the resource. Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary.||Library of Congress Subject Headings (LCSH)|
|Identifier||An unambiguous reference to the resource within a given context. Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.|
|Format||The file format, physical medium, or dimensions of the resource. Examples of dimensions include size and duration.||Internet Media Types (MIME)|
|Publisher||An entity responsible for making the resource available. Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.|
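To make the table more concrete, the sketch below shows one possible way to hold a Dublin Core record in Python and write it out as XML. It is an illustration only, not a required format: the `build_record` function, the `record` wrapper element and all of the field values are invented for this example. Each element maps to a list of values because all elements are optional and repeatable, and the date, type and format values follow the suggested vocabularies from the table (W3CDTF, DCMI Type Vocabulary, Internet Media Types).

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 core Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_record(fields):
    """Return an XML element holding one Dublin Core record.

    `fields` maps an element name to a list of values: an element may be
    left out entirely (optional) or carry several values (repeatable).
    """
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


# All values below are invented for illustration.
example = {
    "title": ["Interview 01 transcript"],
    "creator": ["Doe, Jane", "Roe, Richard"],  # a repeated element
    "date": ["2021-03-15"],                    # W3CDTF: YYYY-MM-DD
    "type": ["Text"],                          # DCMI Type Vocabulary term
    "format": ["text/plain"],                  # Internet Media Type
}
xml_text = ET.tostring(build_record(example), encoding="unicode")
print(xml_text)
```

Checking element names against a fixed set is a simple way to catch misspellings such as "tile" for "title" before the record reaches a repository. A repository or archive may expect a different wrapper or serialisation, so treat the XML layout here as an assumption.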
Metadata Example: Book
The example below uses basic Dublin Core metadata elements to describe a book using Title, Creator, Date, Description and Type.
Notice how the author's name is formatted. This is a standard way of formatting names within libraries and facilitates alphabetical sorting by author. Additional metadata elements that could be added to increase the richness of the metadata include the ISBN, information about the publisher, the location of this specific book within the library, some subject terms, the language of the book, the physical size of the book (helpful to know when shelving), and the number of pages.
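As a small illustration of why name formatting matters, the sketch below assumes the inverted "Family name, Given names" form that is common in library catalogues. The `inverted_name` helper and the names themselves are invented for this example.

```python
def inverted_name(given_names, family_name):
    """Format a personal name as 'Family name, Given names'."""
    return f"{family_name}, {given_names}"


# Placeholder names, not real authors.
creators = [
    inverted_name("Richard", "Roe"),
    inverted_name("Jane", "Doe"),
    inverted_name("Aroha", "Ngata"),
]

# With the family name first, a plain alphabetical sort orders by author.
by_author = sorted(creators, key=str.casefold)
print(by_author)
```

Because the family name leads each value, no special parsing is needed to sort the Creator field; an ordinary case-insensitive sort is enough for this simple case.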
Metadata Example: Digital Image
The example below uses basic Dublin Core metadata elements to describe a photograph using Title, Creator, Date, Description and Type.
Additional metadata that could be captured about the photo includes the location where it was taken, the copyright owner, licence information, information about the camera used to take the photo, and technical specifications such as image resolution.
Metadata Example: Interview
The example below uses basic Dublin Core metadata elements to describe a piece of research data, in this case an interview, using Title, Creator, Date, Description and Type.
For a single interview, multiple 'digital objects' are likely to exist, for example the audio recording, a text file with the transcript and perhaps an NVivo analysis file. These can all be linked using metadata. Other metadata elements that might be useful to record include the topics covered in the interview and some demographic information about the interviewee.
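One possible way to link these digital objects is to give each one an Identifier and to point derived files back to the original recording with the Relation element. The sketch below is a minimal illustration under that assumption; the identifiers, titles and the `linked_to` helper are all hypothetical.

```python
# Three digital objects from the same (invented) interview.
records = [
    {
        "identifier": "interview-01-audio",
        "title": "Interview 01 (audio recording)",
        "type": "Sound",            # DCMI Type Vocabulary term
        "format": "audio/mpeg",     # Internet Media Type
        "relation": [],
    },
    {
        "identifier": "interview-01-transcript",
        "title": "Interview 01 (transcript)",
        "type": "Text",
        "format": "text/plain",
        "relation": ["interview-01-audio"],
    },
    {
        "identifier": "interview-01-analysis",
        "title": "Interview 01 (NVivo analysis file)",
        "type": "Dataset",
        "relation": ["interview-01-audio"],
    },
]


def linked_to(records, identifier):
    """Return identifiers of records whose Relation names `identifier`."""
    return [
        record["identifier"]
        for record in records
        if identifier in record.get("relation", [])
    ]


derived = linked_to(records, "interview-01-audio")
print(derived)
```

In practice, a formal identification system such as a DOI or a repository handle would usually take the place of these made-up strings, in line with the recommended best practice for the Identifier and Relation elements.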