Research Data Management: File Names

Bringing together University resources and services to facilitate researchers in the production of high quality data

File Names

Research data files and folders need to be labelled and organised in a systematic and consistent way so that they are easy to find, both for you and for others in your research team. There is no single recommended way to name your files and folders, but you should be consistent. If you work as part of a research group, you should agree on a file and folder naming system with your colleagues. Below are suggested file and folder naming conventions:

  • It’s generally useful to aim for file names which are concise, but informative – it makes life easier if you can tell what’s in a file without having to open it. File names should include enough information to uniquely identify the data file, for example: project acronym, study title, instrument, location, date, data type, version number, file type
  • Similarly, being consistent in your file naming practices will make it easier to locate the file you want. Within a research group, you may want to agree on file naming conventions early on in the project
    • Prefer lower-case file names as, in general, these are less software and platform dependent
    • Avoid using spaces and special characters in file names, directory paths and field names, as automated processing, URLs and other systems often use spaces and special characters to parse text strings. Instead, use underscores ( _ ) or dashes ( - ) to separate meaningful parts of file names. Avoid & , * % # ; ( ) ! @ $ ^ ~ ' { } [ ] ? < > etc. Also avoid using diacritics, e.g. fadas and acute accents.
    • Date order should facilitate sorting chronologically, so use YYYYMMDD, not MMDDYYYY
    • Time should similarly be ordered HHMMSS
    • When using sequential numbering, use leading zeros to allow for multi-digit numbers, e.g. a sequence of 1-10 should be numbered 01-10; a sequence of 1-100 should be numbered 001-100.
    • When including a personal name give the family name first, followed by the initials
  • Operating systems usually default to sorting files alphabetically, so it can be helpful to think about what comes at the start of the file name – is it more useful to order the files by date, by author, or by subject, for example?
  • If you have multiple versions or drafts of a file, it can also be useful to include a version number in the file name – this makes it straightforward to see which copy is the most recent one.
  • Review what you have – don’t keep pointless multiple copies of data; consider what you need to retain, for how long, and what can and can’t be deleted. Do this at intervals and at the end of the project. You can also move old, unused items to a folder called “Archive” (or something similar) so they don’t clutter up your screen.
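
Several of the conventions above (lower case, no spaces or special characters, no diacritics, YYYYMMDD dates, zero-padded numbers) can be applied mechanically. The following is a minimal Python sketch, not a recommended standard: the function names clean_part and build_file_name, the order of the parts and the "v" version prefix are all assumptions made for illustration.

```python
import re
import unicodedata
from datetime import date


def clean_part(text):
    """Lower-case one part of a name and keep only a-z, 0-9 and dashes."""
    # Decompose accented letters (e.g. "í" becomes "i" plus a combining
    # accent), then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of spaces or special characters with a single dash.
    dashed = re.sub(r"[^A-Za-z0-9]+", "-", ascii_only)
    return dashed.strip("-").lower()


def build_file_name(project, content, day, version, extension):
    """Join the parts with underscores: project_content_YYYYMMDD_vNN.ext"""
    parts = [
        clean_part(project),
        clean_part(content),
        day.strftime("%Y%m%d"),  # YYYYMMDD sorts chronologically
        f"v{version:02d}",       # leading zero covers versions 01 to 99
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


print(build_file_name("Síle Survey", "Raw data (field)", date(2015, 3, 9), 4, "CSV"))
# sile-survey_raw-data-field_20150309_v04.csv
```

Here dashes separate words inside a part and underscores separate the parts themselves, so the name can later be split back into its components.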

Examples

A file or folder name can provide context about the content of the file or folder, e.g.:

Sevilleta_LTER_NM_2001_NPP.csv


Sevilleta_LTER is the project name
NM is the US state abbreviation
2001 is the calendar year
NPP represents Net Primary Productivity data
csv is the file type (ASCII comma-separated values)
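
Because the parts of this name are separated consistently, a script can recover them again. The short Python sketch below illustrates this; the helper split_file_name is hypothetical, and the assumption that the first two parts together form the project name comes only from the breakdown above.

```python
from pathlib import PurePath


def split_file_name(file_name):
    """Return the underscore-separated parts of the name and the file type."""
    path = PurePath(file_name)
    return path.stem.split("_"), path.suffix.lstrip(".")


parts, file_type = split_file_name("Sevilleta_LTER_NM_2001_NPP.csv")

# The project name itself contains an underscore, so the first two parts
# are rejoined here; this relies on knowing the convention in advance.
project = "_".join(parts[:2])
state, year, data_type = parts[2:]

print(project, state, year, data_type, file_type)
# Sevilleta_LTER NM 2001 NPP csv
```

The need to rejoin the project name shows why it helps to use a different separator (such as a dash) inside a part than between parts.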

 

Further examples:

  • FG1_CONS_2010-02-12.rtf (interview transcript of the first focus group with consumers, that took place on 12 February 2010)
  • Int024_AP_2008-06-05.doc (interview with participant 024, interviewed by Anne Parsons on 5 June 2008)
  • BDHSurveyProcedures_00_04.pdf (version 4 of the survey procedures for the British Dental Health Survey)

 

Different ordering sequences can be achieved in the following ways:

Order by date:


2012-12-15_analysis_JARID1A.xlsx
2012-12-15_raw-data_JARID1A.txt
2013-04-12_analysis_ASPH.xlsx
2013-04-12_raw-data_ASPH.txt


Order by subject:


ASPH_analysis_2012-12-15.xlsx
ASPH_raw-data_2012-12-15.txt
JARID1A_analysis_2013-04-12.xlsx
JARID1A_raw-data_2013-04-12.txt

Order by type:


Analysis_ASPH_2012-12-15.xlsx
Analysis_JARID1A_2013-04-12.xlsx
Raw-data_ASPH_2012-12-15.txt
Raw-data_JARID1A_2013-04-12.txt


Forced order with numbering:


01_JARID1A_raw-data_2013-04-12.txt
02_JARID1A_analysis_2013-04-12.xlsx
03_ASPH_raw-data_2012-12-15.txt
04_ASPH_analysis_2012-12-15.xlsx
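
The effect of these choices on sorting can be checked with a few lines of Python. This is only a sketch: Python's sorted() compares names character by character and is case-sensitive, whereas a file manager may ignore case or treat numbers specially, and the helper date_first is a hypothetical name that assumes the subject_type_date pattern used above.

```python
names = [
    "JARID1A_raw-data_2013-04-12.txt",
    "ASPH_raw-data_2012-12-15.txt",
    "JARID1A_analysis_2013-04-12.xlsx",
    "ASPH_analysis_2012-12-15.xlsx",
]


def date_first(name):
    """Rewrite subject_type_date.ext as date_type_subject.ext."""
    stem, _, extension = name.rpartition(".")
    subject, kind, day = stem.split("_")
    return f"{day}_{kind}_{subject}.{extension}"


# Whatever comes first in the name decides the order of the listing.
by_subject = sorted(names)
by_date = sorted(date_first(name) for name in names)
print(by_subject[0])  # ASPH_analysis_2012-12-15.xlsx
print(by_date[0])     # 2012-12-15_analysis_ASPH.xlsx

# Without leading zeros, "10" sorts before "2".
print(sorted(["file_10.txt", "file_2.txt", "file_1.txt"]))
# ['file_1.txt', 'file_10.txt', 'file_2.txt']
print(sorted(["file_10.txt", "file_02.txt", "file_01.txt"]))
# ['file_01.txt', 'file_02.txt', 'file_10.txt']
```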

File Re-Naming Tools

Although all operating systems have built-in tools for managing files, there are also software tools that can organise research data files and folders in a consistent and automated way through batch renaming (also known as mass file renaming or bulk renaming). Batch renaming software exists for most operating systems.

There are many situations where batch renaming may be useful, such as:

  • where images from digital cameras are automatically assigned base filenames consisting of sequential numbers
  • where proprietary software or instrumentation generates crude, default or multiple filenames
  • where files are transferred from a system that supports spaces and/or non-English characters in filenames to one that doesn't (or vice versa). Batch renaming software can be used to substitute such characters with acceptable ones.

Examples of bulk renaming tools include:

Windows
Ant Renamer http://www.antp.be/software/renamer 
RenameIT http://sourceforge.net/projects/renameit/   
Bulk Rename Utility http://www.bulkrenameutility.co.uk/

Mac
Renamer4Mac http://renamer4mac.com/

Linux
GNOME Commander http://gcmd.github.io/ 
GPRename http://gprename.sourceforge.net/
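
For small jobs, the same kind of character substitution can also be scripted. The following Python sketch is an illustration only and is not how any of the tools listed above work: the function names plan_renames and apply_renames are hypothetical, and it assumes that every run of characters other than unaccented letters, digits, dots, underscores and dashes should become a single underscore (so accented letters are replaced too). It previews the changes first and never overwrites an existing file.

```python
import re
import tempfile
from pathlib import Path


def plan_renames(folder):
    """Return (old, new) path pairs for files whose names need cleaning."""
    plan = []
    for old in sorted(folder.iterdir()):
        if not old.is_file():
            continue
        # Replace each run of disallowed characters with one underscore.
        new_name = re.sub(r"[^A-Za-z0-9._-]+", "_", old.name)
        if new_name != old.name:
            plan.append((old, old.with_name(new_name)))
    return plan


def apply_renames(plan, dry_run=True):
    """Print the planned renames; carry them out only if dry_run is False."""
    for old, new in plan:
        if new.exists():
            print(f"skipped (target exists): {old.name}")
        elif dry_run:
            print(f"would rename: {old.name} -> {new.name}")
        else:
            old.rename(new)


# Demonstration in a temporary folder, so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["IMG 0001.jpg", "field notes #2.txt", "notes_2015.txt"]:
        (folder / name).touch()
    plan = plan_renames(folder)
    apply_renames(plan, dry_run=True)   # preview only
    apply_renames(plan, dry_run=False)  # perform the renames
    print(sorted(path.name for path in folder.iterdir()))
```

Running with dry_run=True first makes it possible to check the proposed names before anything is changed; it is also sensible to work on a copy of the data.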

Further Resources

Jeff Haywood talking about the importance of good file management in research.

For a detailed description of file naming standards and procedures, consult JISC Digital Media – Choosing a File Name

Digital Science has recently developed Projects, an application that lets researchers safely manage and organise their research data on the desktop. It provides a visual timeline to make finding files easy, backup functionality to help recover previous versions of files, annotation features and a structured hierarchy to encourage users to organise their files.

© 2015 University College Dublin Library T + 353 (0)1 7167583 | E library@ucd.ie Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Ireland License