Research Data Management: Documentation
At a Glance
Good-quality documentation allows others to understand and use your data. Documentation can include:
- Interview protocol
- Questionnaires & interviewer instructions
- Codebook or data dictionary
- Information sheets, consent forms and ethical approval
- Database schemas
- Methodology reports
- Provenance information about sources of derived or digitised data
CESSDA's Tips for Creating Documentation
- Do not panic. Much documentation is simply good research practice, so you are probably already doing much of it.
- Start early! Careful planning of your documentation at the beginning of your project helps you save time and effort. Do not leave the documentation for the very end of your project. Remember to include procedures for documentation in your data management planning.
- Think about the information that is needed in order to understand the data. What will other researchers and re-users need in order to understand your data?
- Create a separate documentation file for the data that includes the basic information about the data. You can also create similar files for each data set. Remember to organise your files so that there is a connection between the documentation file and the data sets.
- Plan where to deposit the data after the completion of the project. The repository probably follows a specific metadata standard that you can adopt.
- Document consistently throughout the project.
Data documentation gives contextual information about your dataset(s). It specifies the aims and objectives of the original project and contains explanatory material, including the data source, the data collection methodology and process, the dataset structure and technical information. Rich and structured information helps you to identify a dataset and make choices about its content and usability.
Project-level documentation explains the aims of the study, the research questions or hypotheses, the methodologies used, the instruments and measures used, and so on. Your project-level documentation should answer the following questions:
- For what purpose was the data created?
- What does the dataset contain?
- How was the data collected?
- Who collected the data and when?
- How was the data processed?
- What possible manipulations were done to the data?
- What were the quality assurance procedures?
- How can the data be accessed?
Collecting this information in one document will help when new members join a research team, when you write up a paper, or if you plan to share your data at the end of the project.
Data-level or object-level documentation provides information at the level of individual objects, such as pictures, interview transcripts or variables in a database. You can embed data-level information in the data files themselves. For interviews, for example, it is best to write the contextual and descriptive information about each interview at the beginning of its file. For quantitative data, variable and value names can be embedded within the data file itself.
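As a minimal sketch of embedding contextual information at the beginning of an interview file, the example below writes a plain "Key: value" header above the transcript text and reads it back. The header layout, the field names and the sample values are all assumptions made for illustration; no particular standard or repository format is implied.

```python
def format_transcript(header: dict, body: str) -> str:
    """Prefix the transcript body with 'Key: value' lines and a blank line."""
    header_lines = [f"{key}: {value}" for key, value in header.items()]
    return "\n".join(header_lines) + "\n\n" + body


def read_header(text: str) -> dict:
    """Recover the header fields from the top of a formatted transcript."""
    # Everything before the first blank line is treated as the header.
    header_block, _, _ = text.partition("\n\n")
    fields = {}
    for line in header_block.splitlines():
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields


# Hypothetical interview details, invented for this example.
header = {
    "Interview ID": "INT001",
    "Date": "2024-03-12",
    "Interviewer": "AB",
    "Setting": "Participant's workplace",
}
transcript = format_transcript(header, "Interviewer: Thank you for joining us.\n")
print(transcript)
```

Keeping the header in plain text means it stays readable without special software, and a small reader function like `read_header` lets the same information be collected across many files.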
For quantitative data, document the following information:
- Information about the data file
  - Data type, file type and format, size, and data processing scripts
- Information about the variables in the file
  - The names, labels and descriptions of variables, their values, a description of derived variables and, if applicable, frequencies, basic contingencies, etc. The exact original wording of the question should also be available.
  - Variable labels should:
    - Be brief, with a maximum of 80 characters
    - Indicate the unit of measurement, where applicable
    - Reference the question number of a survey or questionnaire, where applicable
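The variable label rules above can be checked automatically. The following is a rough sketch, assuming a codebook held as a list of simple records; the `Variable` class, its field names and the sample entries are hypothetical, and the substring checks are only a crude approximation of "indicates the unit" and "references the question number".

```python
from dataclasses import dataclass
from typing import Optional

MAX_LABEL_LENGTH = 80  # limit taken from the guidance above


@dataclass
class Variable:
    """One codebook entry; the field names are illustrative assumptions."""
    name: str
    label: str
    unit: Optional[str] = None          # e.g. "years"; None if not applicable
    question_ref: Optional[str] = None  # e.g. "Q3"; None if not applicable


def check_label(var: Variable) -> list[str]:
    """Return a list of human-readable problems found in a variable label."""
    problems = []
    if len(var.label) > MAX_LABEL_LENGTH:
        problems.append(f"{var.name}: label exceeds {MAX_LABEL_LENGTH} characters")
    if var.unit and var.unit.lower() not in var.label.lower():
        problems.append(f"{var.name}: label does not mention unit '{var.unit}'")
    if var.question_ref and var.question_ref.lower() not in var.label.lower():
        problems.append(f"{var.name}: label does not reference {var.question_ref}")
    return problems


# Hypothetical codebook entries, invented for this example.
codebook = [
    Variable("age", "Q3: Age of respondent (years)", unit="years", question_ref="Q3"),
    Variable("income", "Household income", unit="EUR", question_ref="Q7"),
]

for variable in codebook:
    for problem in check_label(variable):
        print(problem)
```

Running a check like this whenever the codebook changes is one way to keep labels consistent throughout the project rather than fixing them at the end.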
For qualitative data, document the following information:
- For textual data, such as interviews, include key information about participants, such as age, gender, occupation and location, along with relevant contextual information.
- For qualitative data collections (for example, image or interview collections), you may wish to provide a data list that helps users identify and locate relevant items within the collection:
  - The list contains key biographical characteristics and thematic features of participants, such as age, gender, occupation or location, and identifying details of the data items.
  - For image collections, the list holds key features for each item.
  - The list is created from an initial list of interviews, field notes or other materials provided by the data depositor.
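A data list of this kind can be kept as a simple table with one row per item. The sketch below, using Python's standard `csv` module, renders a data list as CSV text and looks up items by their characteristics. The column names, identifiers and participant details are invented for illustration and are not a prescribed layout.

```python
import csv
import io

# Hypothetical items; identifiers, fields and values are made up for illustration.
interviews = [
    {"item_id": "INT001", "file": "INT001_transcript.txt", "age": 34,
     "gender": "female", "occupation": "nurse", "location": "Leeds"},
    {"item_id": "INT002", "file": "INT002_transcript.txt", "age": 52,
     "gender": "male", "occupation": "teacher", "location": "York"},
]

FIELDS = ["item_id", "file", "age", "gender", "occupation", "location"]


def build_data_list(items):
    """Render the items as CSV text, one row per data item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


def find_items(items, **criteria):
    """Return the items whose fields match all the given criteria."""
    return [item for item in items
            if all(item.get(key) == value for key, value in criteria.items())]


data_list_csv = build_data_list(interviews)
print(data_list_csv)
print(find_items(interviews, location="York"))
```

Including the file name in each row keeps the connection between the list and the data items explicit, so a re-user can go from a matching row straight to the relevant file.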