“The Hidden Curse of Automation” & archives

A friend on Facebook posted a link to this Los Angeles Review of Books article by Clive Thompson about Nicholas Carr’s book The Glass Cage: Automation and Us. The review raises many issues, but as usual I was reading it with archives in mind. Specifically, this discussion made me think about the possible problem of historians and scholars relying too heavily on keyword searching of digitized archival sources rather than pursuing more old-fashioned (and time-consuming) practices. I say “possible problem” because I do not know, of course, that this is what’s being done, but I have certainly heard chatter that leads me to think it’s worth considering.

This also brought to mind a long-ago tweet from Patrick Murray-John, who asked “Would archivists accept topic modelling on OCRed items as a collection level description?” As I recall my response was something like, “No. But it would be a very useful resource or accompaniment to such a description.” Just as Carr (according to Thompson) is not opposed to technology, neither am I. But I think both authors raise points that are worth injecting into our discussions with all of our users about the extent to which they use–and rely on–the time-saving features that technology supports, and what information they may be missing if they are relying on it exclusively.

2 thoughts on ““The Hidden Curse of Automation” & archives”

  1. Very interesting. Two thoughts off the cuff:

    *I think for much (I suspect, the majority) of digitized archival material, OCR is not an option, nor will it ever be, for reasons of both technology and source quality. The former may improve, as Google-powered tools filter down to archives, but manuscript, poor-quality type/print material, and non-Latin alphabets will remain difficult to OCR. Researchers likely will have to continue to rely on archivist-generated arrangement and description, and manually review digitized materials that are described at the folder- or box-level. This is similar to “traditional” access to archives, except one can do the reviewing anywhere with an internet connection.

    *As far as OCR replacing human-generated description, Larisa Miller’s 2013 AA article “All Text Considered: A Perspective on Mass Digitizing and Archival Processing” made that case for certain suitable materials (pretty well, I think). Trevor Owens unpacked it a little more in a blog post titled “Mass Digitization, Archives, and a Multiplicity of Orders & Arrangements,” but I’m a little surprised that there hasn’t been more reaction to Miller’s provocative but user- and access-focused argument.

    The key insight that we can probably all agree on is that everyone, user and archivist alike, should understand what is lost and gained when using any particular technology to access archival materials. Of course, that is far easier said than done…

  2. It’s interesting that the review didn’t note that commercial airline travel has become significantly safer over the last decade. This improvement has a number of causes, but one is precisely the automation that is being questioned. Automation has reduced the number of accidents caused by inadequate flying skills, but has raised the new risk of deskilling pilots. This has resulted in a (smaller) number of accidents caused by flight conditions that exceed the skills of the pilots. That’s a win by my standards, but any student of safety will recognise the merry-go-round.

    What is the relevance to archives? Precisely the lack of recognition that all change has positives and negatives, and an excessive focus on (fear of?) the *possible* negatives.

    To take the two examples raised.

    Yes, it might mean that researchers rely more on keyword searching. This might actually improve the research they do – finding precursors and consequences that they otherwise would not have found. The reduction in time required to search might free up time to produce more research, or research more broadly across the corpus. Who knows what will happen?

    As for topic maps, finding aids should document the context of the records. This consists of a lot more than a description of the content. So, no, I don’t think topic maps should be accepted as a replacement for a finding aid. But, topic maps could be a better description than text generated by an archivist. An algorithm may produce a description that reflects keywords in the corpus rather than what the collection is supposed to be about, or what a single person thinks it is about. It has the advantage that it can be done again as algorithms are improved (how often are finding aids revised?). And it is relatively cheap, so collections that are currently not findable due to resource constraints can be found. The result may be better research – or perhaps worse. We’ll only know when we try it.

    In my view, getting our knickers in a twist over the possible negative outcomes is not productive. We’d do better evaluating outcomes, understanding what went wrong (or right!), and improving. Just like the safety investigators do.
