Metadata is a foreign concept? Whaaaat?!? (Part One) – A guest post by Greg Bak

[This is a guest post by Greg Bak, Archival Studies, Department of History, University of Manitoba.]

Thanks to Kate for agreeing to publish my recent SAA presentation on her blog. In Part One of this guest post I discuss some of the Twitter reaction to my talk; in Part Two I include the slides and speaking notes from my talk.

Okay, so I’m paraphrasing here, but the title of this post summarizes reactions on Twitter to my presentation at SAA 2013 session 701. In the course of my talk I suggested that “Metadata is a natural concept for librarians and a foreign concept for archives.” Here are a few tweets that followed:

 Brad Houston:
Hmm. Metadata a foreign concept to archivists? Don’t think I agree with that at all. Used all the time, even if the word isn’t #Saa13 #s701 

Kind of getting annoyed by the assumptions made in this preso. Metadata is implicit in most description we do as archivists #Saa13 #s701 

Geof Huth:
How could say this and use the word “folksonomies” in the same presentation?

 Couldn’t figure out how he came to this conclusion. I mean, finding aids (of any kind) are metadata.

Things didn’t get much better when I went on to suggest that metadata, as a concept, is foreign to social media, too:

 Krystal Thomas:
hmm, also not sure I am buying the idea that metadata is foreign to social media though something to think about #s701 #saa13

 Andrew Berger:
Metadata is foreign to social media? #saa13

 Brad Houston:
“Metadata is foreign to social media?” Um, I’ve got a spreadsheet of #Saa13 tweets on Google Drive which says otherwise #s701 

Thankfully, a couple of folks picked up the nuances and saved me from myself:

 Mark Matienzo:
From the looks of Twitter my colleagues are seriously misunderstanding Greg Bak’s presentation #saa13

 Sami Norling:
Metadata is a natural concept for librarians and a foreign concept for archivists (at least at its introduction) #saa13

 Seth Shaw:
“Metadata is foreign to social media”? I don’t buy the argument though I accept the implication: it is all ‘just’ data. #s701 #saa13

Sami Norling perceptively noted the emphasis I put in my oral remarks on archivists’ initial reluctance, in the 1980s and 1990s, to embrace metadata as a concept, while Seth Shaw evaluated my statement in light of the definition of metadata that I used in my paper. Mark Matienzo urged that people not react to my (poor) choice of wording, but take into account the ideas behind the words.

Not that I was using an obscure or idiosyncratic definition of metadata: I defined it as “data about data.” My point was that when defined in this way, the very concept of metadata requires that there be primary data (for example, a digital object or an analog document) and secondary data (data that is outside of, above or apart from the primary data).

My contention is that when the term began to gain currency among archivists in the 1990s there was an instinctive reaction against it, followed by an attempt to re-frame it in archival terms. Adrian Cunningham, writing in Archival Science in 2001, scoffed that “When most of us first encountered the term metadata, we were probably repelled by yet another debasement of the English language by a bunch of barbarian techno-boffins.” Cunningham presses on, discussing various definitions of the term before suggesting that “metadata is simply a new term for information that has been around for a very long time, but which now looks a bit different due to the advent of computer technology.” He rounds off his brief discussion with the claim that “archivists are metadata experts – it is just that we tend not to think in those terms,” and lists some examples of what he would consider archival metadata: finding aids, index cards, file covers, file registers and so on.

In my paper I sought to return to archivists’ initial wariness toward the concept and to re-evaluate that reluctance. What if archival anxiety around “metadata” was triggered not by fear of “debasement of the English language”, but rather by concern about the debasement of archival theory?

This is the real issue: in archival theory, the kind of data typically identified as “metadata” is an integral part of the record. It is evidence of relationships among records and records users. It is not “meta” data; it is simply data. It is data that must be acquired and managed as a necessary part of the record. It is the data that makes the difference between a bunch of discrete, solitary items and a fully interrelated set of archival records.

This, moreover, is also how such data is managed within social media applications. Data that describes the use of information resources is not “meta” data, it is simply data: data that enables the weighting of search results, creating tangible differences in rankings, visibility and usefulness.

I am presently writing up my SAA presentation for peer-reviewed publication. If you would like to see how I presented these ideas at SAA, my presentation slides and speaking notes will be included in “Part Two” of this post. I welcome any and all feedback, either in this blog’s comments or by email.


Cunningham A (2001). Six degrees of separation: Australian metadata initiatives and their relationships with international standards. Archival Science 1.3:271-283.



6 thoughts on “Metadata is a foreign concept? Whaaaat?!? (Part One) – A guest post by Greg Bak”

  1. Great post, and I look forward to Part 2, not to mention Greg’s forthcoming publication. This was one of the most interesting presentations I saw at SAA this year and I am glad the conversation is continuing here.

  2. Right! As the loudmouth quoted no fewer than three times above, let me attempt to defend myself 😉

    First, thanks for posting this. (And thanks to Kate for providing the forum for doing so.) By way of preamble, let me note that my tweets this year were my *primary* notes from the conference (which is why I was so careful to have a way to collect them in case I couldn’t get to them before they scrolled off search), and furthermore that this year I was taking notes primarily on my tablet. Because I am too cheap to buy a Bluetooth keyboard, I was doing this by way of the on-screen keyboard, which is *slightly* easier than doing the same thing with a smart phone, but not by much. (Note to self: buy netbook or bluetooth keyboard for next conference.) All of which is by way of saying that in my furious effort to keep up with the information being presented (and this was easily one of the more info-dense presentations I saw at the conference, which is a good thing!), I missed the bit about “in the 1990s.” Oops. So mea culpa on that one.

    Having said that. To the casual, layman, or distracted (hi!) user, the formulation of “metadata is foreign to X” *sounds* reductive. Perhaps using the “metadata qua metadata” construction could have saved some of the e-ink spillage over this, since, as noted above, archivists do create and use metadata all the time. Granted, a lot of us don’t really *think* about archival description in terms of metadata (I didn’t really come face-to-face with my own metadata creation and usage until I started doing data management and electronic records), but it’s sort of like an autonomic nervous function– it happens whether or not you are willing it to happen. The point about the relationships between fonds *themselves* being metadata is a really solid one, and having more time to really look at your slides and notes is giving me a better sense of the distinction you were trying to make. There’s a lot more that goes on in the archival world re: the metadata/data relationship than most of us consider, and I think this is a really important presentation for people to see, if only to at least be thinking about it.

    To touch on the third quoted tweet of mine briefly: My concern was that you were talking about metadata in social media as *merely* data, vs. a separate category of data that still needs to be preserved along with the “content”. I think that the OAIS model definitely supports the latter interpretation, since metadata is included within all of the information packages, but I think to say that there is *no* distinction in either direction is incorrect, even in a social media context. The more I read over your notes the more convinced I am that you agree that this is the case – you speak in particular about this in your Google Maps example – but again, this was unclear to me in the context of the talk itself. (Say this about the PRISM business: it is getting a lot more people to think about captured metadata as data in and of itself.)

    Don’t be fooled by my tweeted irritation and the length of this comment! I really did like this presentation, and I was considerably less grumpy about it by the end, especially during your discussion of archives as social networks and the need to restructure our data management accordingly. I hope this comment is taken in a spirit of constructive criticism re: places where additional clarity might be useful. It’s an important argument and I hope the presentation sparks a lot of discussion around same.

  3. Hi Courtney – yes, I was concerned about the Twitter stream, too. That was why I decided to post the slides and notes. I’m glad that you found the posting here.

    Hi Meg – thanks for the boost! It was thrilling to come to NOLA and air out my complicated love-hate relationship with metadata.

    Hi Brad – it was really helpful to have your, and others’, reactions to the talk on Twitter. No need to apologize for your oh-so slightly cranky tweets – that was a good sign to me that I was losing a core audience for my paper. Clearly, any archivist willing to defend our profession as metadata experts is my kind of archivist!

    You are correct to note that aspects of my talk are reductive – this is the inevitable outcome of squeezing a complex argument into 20 minutes. And as you point out, one glaring oversimplification is the slide in which I suggest that everything is “just” data. There are substantial and important differences between, say, digital objects and structured data, differences that pose major challenges for digital management and preservation. But, on a conceptual level, we should be careful to value equally the object and the structured data associated with it. My concern is that we often fail to do this, particularly when it comes to the majority of user-generated data (“beautiful photo!”) and analytics data.

    Kate – thanks again for providing the digital salon!
