Have you ever heard a song you couldn’t name, forgotten the title of a book, or seen a scene from a film that looked vaguely familiar?
If it’s something classic (and you’re lucky), it might be as simple as quoting a line or recalling a single detail to someone who ‘gets’ the reference and can tell you where it comes from. However, it’s not always possible to work out what something is with only a single piece of information to go on.
In most cases if you’re trying to find something - a musical track, a book, a film - you need to know a little more about it in order to search for it. You might know the title, approximately when it was made or the name of the writer, composer and/or producer. Each bit of information will help you to track it down.
In the digital environment, it’s no different. We often need to know the same things in order to search Google. And while there are smartphone apps that help us identify what we’re listening to or looking at, or where to find something (even whales in NSW), a question like “what song is this?” can still generate pained expressions from our friends as they rack their brains.
This is why the information we save as part of our digital files can be so important.
What things can digital files tell us? And what are the important bits?
- Date and time file was created
Don’t be fooled: this isn’t necessarily the date the file was first made (like the music track I started back in 1997 and never finished). It’s the date this copy of the file was created. If the file has been copied or moved around, this date can change. For example, if I’ve sent my music track to someone via a file sharing service and they download it today, then this date would be today’s date (not some time back in 1997).
- Date file was last modified
This is a useful date because it’s a record of when the contents of the file were last changed. Let’s say the last time I bothered to work on my music track was 2006 (actually, it probably was...), then this ‘last modified date’ would be sometime in 2006.
- File name
It’s important to call your file something that is clear and memorable. Knowing what you named your file can help you search your computer when you can’t remember exactly where you’ve stored it. (A date in the file name and versioning information wouldn’t go astray either.) For example: mysoundtrack_v2_draft_20060801.wav
There are other bits of information that are less obvious, but may also be important. For example, the name of the software I was using to work on my file. In the case of digital photographs this information might include whether the flash fired or not, or even the serial number of the camera. This information is recorded inside the file itself, so you can’t see it just by looking at the file’s name or icon. All this information can help you build a profile of the file, and this becomes important if you or someone else wants to find it now or in the future. It will also help narrow your search down from the gazillions of other files out there.
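If you’re curious what your computer has recorded about a file, here is a minimal Python sketch that reads the file name, size and dates discussed above. The function name `describe_file` is my own invention for illustration, and whether a ‘created’ date is available depends on your operating system, so treat this as a starting point rather than the definitive way to do it. It only reads file-system information, not the details embedded inside the file (those usually need a format-specific tool).

```python
from datetime import datetime
from pathlib import Path


def describe_file(path):
    """Return basic file-system metadata for one file as a dictionary."""
    p = Path(path)
    info = p.stat()

    # A true creation ("birth") time is only exposed on some systems,
    # for example macOS. Where it is missing we record None rather
    # than guess.
    birth = getattr(info, "st_birthtime", None)

    return {
        "name": p.name,
        "size_bytes": info.st_size,
        # When the contents of the file were last changed.
        "last_modified": datetime.fromtimestamp(info.st_mtime),
        # When this copy of the file was created, if the system knows.
        "created": datetime.fromtimestamp(birth) if birth is not None else None,
    }
```

Calling `describe_file("mysoundtrack_v2_draft_20060801.wav")` on a file in the current folder would return its name, its size in bytes and its dates as Python `datetime` values.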
What about managing your own files?
Some of the information mentioned is ‘embedded’ in your files, whereas other parts (such as the dates) are really easy to change, accidentally or otherwise. That makes managing and maintaining your files an important thing to do. Just as important is to always make sure you have a backup!
So what do digital archivists do?
If you know what information about a file is at risk of being easily changed, then you can use tools to help you retain (aka preserve) this information about each of your files. There are tools to help you move files from one location to another and ‘keep’ the original information about the file unchanged, including TeraCopy, Exactly or the command line (that’s another blog post altogether). There are also tools, such as DROID, that help you identify what types of files you have (documents, eBooks, photographs, sound etc.).
In the field of digital curation, retaining information about each file is important because we know people in the future will need that ‘original’ information. (It’s not particularly useful if my music file only has today’s date associated with it, given I first created it in 1997 and then worked on it again in 2006.) In this respect, files are no different to books or other paper-based documents. Archivists talk about this in terms of authenticity, integrity and provenance. In the digital environment, preserving as much of this original information as possible means a better chance of assuring future users of the authenticity, integrity and provenance of a file.
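To give a feel for what ‘keeping’ the original information means in practice, here is a small Python sketch. It is not how TeraCopy or Exactly work internally (I’m not describing those tools here); it simply compares two functions from Python’s standard library: `shutil.copy`, which copies the contents, and `shutil.copy2`, which also tries to carry across the last-modified date. The file names and the helper `compare_copies` are made up for the example, and the audio data is just placeholder bytes.

```python
import os
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def modified_year(path):
    """Return the year in which the file's contents were last changed."""
    return datetime.fromtimestamp(Path(path).stat().st_mtime).year


def compare_copies():
    """Copy one file two ways and report each copy's last-modified year."""
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "mysoundtrack_v2_draft_20060801.wav"
        original.write_bytes(b"placeholder audio data")

        # Pretend the file was last worked on at midday, 1 August 2006.
        august_2006 = datetime(2006, 8, 1, 12, 0).timestamp()
        os.utime(original, (august_2006, august_2006))

        # A plain copy: the new file's last-modified date is the moment
        # the copy was made.
        plain = shutil.copy(original, Path(folder) / "plain_copy.wav")
        # copy2 copies the contents, then attempts to copy the file's
        # metadata too, including the last-modified date.
        careful = shutil.copy2(original, Path(folder) / "careful_copy.wav")

        return {
            "original": modified_year(original),
            "plain_copy": modified_year(plain),
            "careful_copy": modified_year(careful),
        }
```

Running `compare_copies()` should show the original and the careful copy both dated 2006, while the plain copy carries the current year. Note that even `copy2` cannot preserve everything on every system, which is one reason dedicated tools exist.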
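One common way to check integrity (my addition here, not something covered above) is to record a checksum: a short ‘fingerprint’ calculated from the contents of a file. If the fingerprint calculated later matches the one you recorded, the contents haven’t changed. Below is a minimal sketch using SHA-256 from Python’s standard library; the function name `sha256_of_file` and the chunk size are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read the file in chunks so large sound or video files
        # do not need to fit in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A checksum only covers the contents of the file. It says nothing about the file name or dates, so it complements the other information rather than replacing it.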
So let’s recap. What do we know about our files?
You hopefully know a little bit more about how files work.
If I've taken care of my file, information about the format, when it was created and when I last worked on it should still be available.
If not, then some of this information may have been lost. However, it’s likely other parts of this information will still be ‘embedded’ in the file.
If there’s not enough of this information still available, then I'm probably going to have to listen to the sound file to work out what it is.
So what have I really been talking about? - Metadata!
All these different bits of information are metadata. Metadata can be very useful for people trying to track things down, and especially for collecting institutions that need to know what it is they are collecting. You can tell a lot from metadata alone. For instance, you now know a lot about an unfinished track of music and you’ve not heard any of it yet.
Written by Somaya Langley, State Library of NSW