“The Hidden Curse of Automation” & archives

A friend on Facebook posted a link to this Los Angeles Review of Books article by Clive Thompson about Nicholas Carr’s book The Glass Cage: Automation and Us. The review raises many issues, but as usual I was reading it with archives in mind. Specifically, this discussion made me think about the possible problem of historians and scholars relying too heavily on keyword searching of digitized archival sources rather than pursuing more old-fashioned (and time-consuming) practices. I say “possible problem” because I do not know, of course, that this is what’s being done, but I have certainly heard chatter that leads me to think it’s worth considering.

This also brought to mind a long-ago tweet from Patrick Murray-John, who asked “Would archivists accept topic modelling on OCRed items as a collection level description?” As I recall my response was something like, “No. But it would be a very useful resource or accompaniment to such a description.” Just as Carr (according to Thompson) is not opposed to technology, neither am I. But I think both authors raise points that are worth injecting into our discussions with all of our users about the extent to which they use–and rely on–the time-saving features that technology supports, and what information they may be missing if they are relying on it exclusively.

For non-archivists: What aspect of archives do you wish you knew more about? What’s a mystery to you?

As noted in the previous post, I’ve got another new project in the works, scheduled for an early 2015 launch. It will be about archives (of course) and targeted at the general public. I’m working on finalizing the scope and project goals at the moment, and I want to make sure I’m aiming for the right goals and including the right content, so last week I posted on Twitter:

Unlike the previous question (aimed at archivists) I didn’t get a lot of responses to this one, so I’m throwing it out there again. Historians, scholars, family history researchers, and all “civilians”! What do you want to know about archives? What should I make sure I cover in my new project?

 

My talk from #AHA14: A Distinction worth Exploring: “Archives” and “Digital Historical Representations”

A few weeks ago I was part of the panel, “Digital Historiography and the Archives” at the 2014 meeting of the American Historical Association. [UPDATE: All papers from this session are now available online here.] As with my previous foray into a historical conference, it was an interesting experience and an informative one. Viewing how historians describe or refer to our resources, practices, and profession when talking to audiences of their peers is fascinating. The AHA is in New York in 2015, and I highly recommend that more archivists try to get on panels and attend.

When presented with the topic of this session, I was uncertain what response to take in the limited time (about 15-20 minutes, I think), so I was fortunate, as I mention in my remarks, that I procrastinated and waited until I saw the other speakers’ notes before I decided on my approach. I’m still not sure it was the most effective one possible, but it seemed to fill what I perceived as a genuine need to ensure that the necessity to “unpack” and question digital resources was explored. And since my preference is always for practical rather than theoretical discussions, I took a practical approach. The full text of the other speakers’ talks will hopefully be up soon on the AHA site, and when it is I’ll link to it so you can see the full context of my remarks. It was an interesting panel and I look forward to more discussion, both here and hopefully on the AHA site about how historians and archivists can work together to best support “digital historiography.”

I spoke without slides, and this is an only slightly modified version of the text from which I spoke. I’ve added links to the sites I reference and other sources that might be useful.

In approaching this session and this topic, I had trepidations, as I often do, about how the other speakers and the audience would be framing their conception of “archives.” In preparing my talk I read an article Josh [Sternfeld] had written for an archival journal in 2011 [“Archival Theory and Digital Historiography: Selection, Search, and Metadata as Archival Processes for Assessing Historical Contextualization,” American Archivist Fall/Winter 2011] and was pleased to see his careful usage of the phrase “digital historical representations” as an umbrella term covering some of the products created by archives, as well as a range of products created by other sources.

In approaching the subject of archives with historians and other humanities scholars, I often feel somewhat pedantic in my continual emphasis on the meaning of words. But after all, words represent concepts and perceptions of reality, and if those words aren’t clearly communicating what we intend, then it’s hard to achieve meaningful progress. What I’d like to do in the time I have, and hopefully as part of the discussion, is illustrate the points Josh and Katja have made about the importance of questioning, understanding, and articulating the context of creation of digital historical representations by discussing the differences between the types of digital information sources created and used by historians—many if not most of which are often all referred to as “archives.” In all of these cases the context of the creation of the information source is critical to understanding the problems that may be inherent in that source and which the researcher should take into consideration. I am not a historian, but I would think that understanding why and how an information resource was created—that is to say, its context—is more valid than ever in digital historiography.

Everyone here is familiar with what for lack of a better term I’ll call “traditional” archives—that is, primarily paper-based (or non-digital) largely unique materials, brought together in repositories in aggregations either created by the originating organization or person, or by a third party, such as a scholar, manuscript dealer, or the repository itself (as in special collections). Appraisal and selection of such materials is a multi-dimensional process, as you might imagine, with many factors involved, including sometimes political influence, censorship on the part of the creator/collector, resource limitations on the part of the repository, random chance and “acts of God.” How and why the materials on our shelves end up there is not always a straightforward story and one that is usually not captured in detail in the public description of the materials. How the materials were aggregated and for what purpose is usually described at some level in the finding aid, but documentation in this area is sometimes sporadic. I would guess most archivists believe—rightly or wrongly—that fields like “Custodial History,” “Appraisal, Destruction and Scheduling Information,” and “Administrative/Biographical History” (which applies to creators of aggregates) are not valued by most users. To be honest, I’m not sure how often this information is even of interest to historians, or at least how often they ask the archivist for more information if the finding aid is skimpy in this regard. Anecdotal evidence from my colleagues and user studies indicates that it is not widely valued or used by users.

Again, that’s “traditional” physical archival materials, represented digitally by descriptions in online finding aids, catalog records, etc. For these materials, what has changed for historians in the modern digital age, I think is the increased expectation—and reality—that more descriptive information about materials will be made available online, and also the ability to easily create their own digital copies with digital cameras and smart phones.

Next we have collections of digitized analog historical materials—sometimes called “digital archives.” These may be topically based—assembled from holdings of many repositories, like the William Blake Archive or the Wilson Center Digital Archive, which is focused on documents related to international relations. Or they may be all from one repository—as in the recently launched FRANKLIN site, which provides online access to digitized collections from the Franklin D. Roosevelt Presidential Library and Museum. These collections may be created by archivists, librarians, historians, passionate amateurs, nonprofit organizations or for-profit companies.  Because these digital historical representations, to use Josh’s term, are created by such a wide range of sources, it’s critical to know about the context of these collections—including who assembled them, what their purpose was, and what criteria they used.

Often when historians are talking about archives, when I probe to see what they mean, it is these kinds of collections they are referring to. Katja’s point that it’s important to know where the individual original materials are located and where they fit in their archival context is a valid one, but it’s also important to understand where they fit in the context of the new digital collection. On what basis were items added to this collection? Why were some items excluded? To what extent is what’s being presented a subset of what’s available? Where does the metadata come from? How was it created and reviewed?  As with online finding aids for physical collections, what you’re accessing in this kind of digital collection is a surrogate—a description of that object or aggregate created by a person to represent it. Even the scan is a surrogate—although hopefully an accurate one.  Descriptions and metadata can be subjective and also subject to errors.

It seems to me as if these kinds of collections, or “digital archives” as they’re commonly called, would raise a host of questions in terms of digital historiography—some similar to those presented by online information for “traditional” archives, but many others that are different.

Yet a different kind of aggregate, also sometimes called “digital archives,” are groups of born-digital materials as opposed to the digital surrogates of analog originals I just talked about. These types of aggregates, kept together because they come from a single source or creator, reside primarily within archives and special collections repositories, and consist of records created or received by an organization in the course of business, maintained by them and transferred to their associated archival repository. For example, the electronic records created by the Census Bureau and transferred to the National Archives. You can also have the equivalent of the “papers” of a person or family, such as the Salman Rushdie collection at Emory, which contains the contents of his personal computers. For these kinds of aggregates archives have most of the same kinds of issues with selection, appraisal, and custodial history as they do with non-digital materials, but with additional issues raised by their digital format, as Katja noted, related to reliability and authenticity as well as how to provide access.

And last but not least, you can have assembled collections of born-digital materials—yet another category of what are termed “digital archives.” The September 11 Digital Archive created by the Center for History and New Media is a good example of this type of collection. In this case—and also with the Internet Archive—the collection serves a critical function: acquiring born-digital materials that might not otherwise survive. Many born-digital materials are more fragile than their analog counterparts for various reasons, and so some of these collections are similar in function to special collections libraries, which pull together valuable individual items for preservation. It’s also worth noting that in digital collections, copies of materials can reside in more than one collection. For example, in the September 11 collection there are copies of documents created by the New York City Fire Department (Incident Action Plans). Presumably there are also copies of these born-digital records being transferred to the official repository for the municipal records of New York City.  These kinds of “digital archives” combine the issues related to assembled collections—that is, the necessity of exploring who is creating them, for what purpose and using what methods— and those concerns related to born-digital materials as far as preservation and authenticity.

Coming back to Josh’s use of the term “digital historical representations,” I’m happy to see this broader term being used in discussions about “archives” and digital historiography. For me, many products that come under this term—like databases and sources like Google Books—would be removed one step (or more than one step) too far to be categorized as “archives.” I would consider these as separate intellectual products created from archival sources.  And, indeed, in a way, so are any of the collections in which copies of archival materials are removed from their original context and “re-mixed” to be part of a new creation—a new “digital archives” like Valley of the Shadow, to use a classic example. In fact, in a pre-digital era analogous versions of the scholarly products I’ve talked about here (other than databases) would still have existed, I think, and been called something other than “archives”—they would have taken the form of exhibits, edited volumes of letters or printed collections of documents, assembled and edited by historians or other sources. The question of why the word “archives” has been adopted to refer to collections of materials is one for a different discussion, but I do think it’s worth noting that this co-opting of the word does seem to be a rather recent development.

I hope the efforts being discussed today encouraging more rigorous assessment of digital historical representations will result in a greater understanding and appreciation of what makes archives distinct from these other kinds of products. I often fear that this appreciation and understanding is being lost as fewer historians work with “old-fashioned” physical archival collections, and do most of their work online, where it is easy to think that all digital collections are the same. The value of the collections of materials preserved in archives often lies in the relationship of the records to each other—what’s called the archival bond—which means that the whole is greater than the sum of the parts. As a whole, the materials provide evidence about the activities of the creator.

In considering the topic of this session, I’d like all of us to consider this as a two-way street. It’s heartening to see archival concepts such as appraisal and provenance being discussed at an AHA session, and so to see information flow from the archival literature to this audience, and hopefully this will continue. On a related topic, I’m always interested in hearing how much historians actually know about either archival theory or practice. Anecdotal evidence provided by many of my archivist colleagues suggests that such knowledge is, shall we say, uneven. So that’s another topic that might be worth discussing—how much do historians know about archives and what more would be helpful or necessary to assist in their work.

But I also want to see information flow the other way, and I hope we can get into this a bit in the discussion that follows. That is, I’m interested in learning what digital historiography (that is, the study of the interaction of digital technology with historical practice) and you as historians can tell the archival profession, and me specifically. How has the way you do your work changed? And how can archives and archivists do things differently to assist in that?

Today’s conversation is about how digital technology has changed the way you do your work as historians, and certainly it has affected the way archivists do our work as well. Among the most significant of those ways is the increased workload to create descriptions and digital copies to post online, find ways to collect and preserve digital materials, and of course, actively connect with the public via the ever widening world of digital tools and social media. Digital technology has increased the user base for archival resources, meaning that the connection between our historian users and archivists is more diluted than it was in the past. In prioritizing our work and establishing our practices, archivists are trying to meet the needs of the broadest range of users. In so doing, it’s possible that the more specialized needs of historians—if indeed they are different from other users—are not being met. We need to keep an ongoing dialog between our two professions to ensure that we’re all working together as effectively as possible to support the historical enterprise.

I look forward to discussing both archival theory and practice, and hopefully historical practice as well, in the discussion that follows, and in many subsequent conversations.

Again, update: All papers are now available on Michael J. Kramer’s blog.

Sessions of possible interest for archivists at American Historical Association Annual Meeting, Jan. 2-5 in DC

I did a roundup yesterday on Twitter, but here collected in one place for your convenience is my attempt to list the sessions that seem to have a bearing on archives or special collections from the program of the annual meeting of the American Historical Association this January (2-5) in Washington, D.C. As I said on Twitter, the most kickass session will be Digital Historiography and the Archives (ahem, yes, that’s the one I’m part of). However, there are many other good sessions in that same timeslot as well as throughout the conference. The hotel rate is a quite reasonable $130, and I know quite a few of you would find a meeting in DC easy to attend, so I hope to see many other archives people there. I’ve attended this meeting once before and found everyone to be quite friendly, so don’t be intimidated by the fancy academics. Let me know if there are any sessions that should be added to this list.

Please share your “Summer Tips for Visiting Archives” on AHA blog

I was asked to share my thoughts for today’s post on the AHA blog, “Summer Tips for Visiting Archives.” As you might expect, my thoughts were voluminous, but most of my recommendations made it into the post. I hope you’ll add anything that’s missing by posting a comment over on the AHA site. I’m glad to see them sharing this kind of content as well as recognizing that “Archivists are highly trained professionals, not just goody-retrieval machines, and should be seen and treated as partners in your research.”

A Twitter colleague observed that he thought I might get into trouble with the academics for suggesting that an archivist will be more skilled at locating information in his or her own collections than an outside researcher. Time will tell if I get castigated in the comments, but I am willing to defend the professional skills and knowledge of (most of) my colleagues. If a researcher clearly explains what s/he is looking for and the archivist is experienced and knowledgeable about the collections, I’ll give the archivist the advantage.

But please take a look at the post over on the AHA blog and share your advice for researchers there.

Help pick the next book for the archivists’ book group

If you’re looking for an excuse to avoid the beautiful spring weather, why not read a book? Even better, why not read a book about archives and then discuss it with archivists? (While flexing your toned biceps, of course.)

If you’re intrigued, head over to the Archivists Reading Together blog and vote in the poll to choose the next book. And if you like, you can go back and see what we’ve said about the previous two books, Dust and History’s Babel.

A question for researchers with experience in the pre-Internet era

I’d like to confirm what I think is a pretty logical assumption about the driver for changes in archival practice. To do this I would like the input of people who conducted research in archives before the glorious age of the Internet. (I am thinking primarily of people conducting scholarly or subject-oriented research rather than people interested in family history and genealogy.)

  • Do you think it’s accurate to say that before the widespread use of the Internet historians and other researchers did not have an expectation that descriptions of all an archives’ holdings would be accessible via the available research tools?
  • Was there an accepted expectation that discovering collections with relevant materials might involve several stages of discovery? If so, what were those stages? Looking in printed sources (like NUCMC), asking colleagues, following references in footnotes, contacting archivists?

As is probably clear from the questions, my hypothesis is that it is the easy and seemingly all-encompassing nature of information available on the web that has driven archivists to seek to provide online access to some level of information about all the holdings in their collections. My assumption is that prior to the Internet there was no assumption that such access would be possible, and that it was expected that there would be what we now call “hidden collections” which would have to be “discovered.”  (As opposed to today when archivists believe that our users expect that some level of intellectual access will be provided online for all materials, and that our users have an expectation that one easy search tool that reveals to them all the relevant materials across archives should be possible.)

Are my assumptions about research practices in the pre-Internet age accurate? Many thanks.

NOTE: There is a different question for archivists with experience in the pre-Internet era posed in the next post.

Guest post: An archivist at THATCamp New Orleans

Thanks, Eira Tansey for this guest post about THATCamp:

One of the perks of living in New Orleans (besides, of course, all the outlets for laissez les bons temps rouler) is the number of conferences coming through town. This brings many opportunities for attending workshops, sessions, and events from outside of the archivist-niche that I normally wouldn’t have the travel funds to access. When the American Historical Association came into town, with a THATCamp during the first day of the conference, I was excited to attend an event that I’d been intrigued by for a long time.

For those who don’t know, THATCamp exists somewhere between a workshop, meetup, roundtable and conference. It is often described as an “unconference.” THAT stands for “The Humanities and Technology.” Going to THATCamp is different than the typical workshop or conference experience, because the schedule is created that day (brief proposals are submitted by participants ahead of time through the specific THATCamp website, e.g. AHA’s THATCamp site). To determine the schedule, each proposer gives brief remarks to the assembled group about their proposal. Following all the proposals, a show of hands is taken to determine interest in scheduling proposals. THATCamp is explicitly non-hierarchical – no one is accorded more or less respect or floor-time based on their professional status.

As with any meeting with multiple sessions, inevitably there are slots with overlapping interesting sessions. THATCamp organizers encouraged people to move between sessions if one wasn’t holding their attention, and reminded the proposers not to take such actions personally. Moving between sessions has always been my MO at traditional conferences, but it was a relief to hear it so openly embraced in this setting.

The first slot of the day included a discussion on the recently released Ithaka report. Kate has discussed this report before, and I was curious to see what historians had to say about it, given the response generated within the librarian/archivist communities. The turnout for this session was small, but I’d estimate the makeup of the attendees split in half, between librarians/archivists, and historians. As a result, a lot of the discussion centered around library and archival practices, without as much insight into how historians reacted to the report. One of the initial criticisms that came up was the unrealistic expectation that libraries could manage to have more librarians specializing in particular subfields (p. 43). Besides the obvious issue of funding, are librarians and archivists truly obligated to be experts in every possible subfield?

One of the historians noted her frustration with the lack of a centralized location for finding archival sources. The librarians and archivists in the group asked if she had heard of or used ArchiveGrid, and this was new to her. Of course, ArchiveGrid is a fantastic resource but it is only as good as a) archives that can make finding aids available online and b) archives that contribute those finding aids to ArchiveGrid.

A point I brought up was the problematic phrase “research archivist” (p. 42), based on recommendation #4 to archives:

Historians deeply value the expertise of the research archivist, and archives should ensure that they are devoting adequate resources to engaging actively as interpreters of the collection and important connectors within their subfield. Archivists can play a patron services role in working with historians, and they should be afforded the time and other resources needed to serve researchers in this role. Archives are uniquely positioned to facilitate connections within the community of researchers who use their materials, and should make efforts to support engagement between researchers.

The inevitable question of “When will we have the all-digital archive” came up. In retrospect I have to believe that this wasn’t a serious question, but some of the librarians/archivists in the room pointed out that even if archives were funded at the levels that could even make this conceivable, the massive IP/copyright barriers to “digitizing everything” make it unlikely any time soon.

The proposer of the session raised a point which I think deserves significantly more exploration than we could do justice to in this session: At what point are archivists and librarians collaborators with historians, and at what point are they supporters? In what ways are archivists accorded similar respect and recognition as scholars, and in what ways are they viewed as something akin to helpmates? A few related turns in the discussion included someone asking (paraphrasing) “Where is the incentive for faculty to gain skills that enable them to work more productively with archivists and librarians?” This probably relates back to similar problems within digital humanities (e.g., how can digital humanists use DH projects as evidence for tenure/promotion). Another question was raised regarding whether the Ithaka report would help librarians and archivists get leverage for activities they’re already doing. The librarians and archivists present noted that the distinction that “archivists give you the originals, librarians give you secondary sources” was very artificial.

This was an interesting exploratory discussion, but I have to imagine that the historians who showed up were already interested in the relationships between librarians, archivists and historians. What about the historians who don’t care about those relationships or linkages? (And by extension, how much should that concern archivists?)

I should note that a staff member from (if I recall correctly) the National Endowment for the Humanities was present at this session – NEH helped fund this particular Ithaka report; however, more reports will be forthcoming on the changing research practices of other scholars. (I don’t believe the NEH is funding the subsequent reports, but I could be wrong.) There was also a session during AHA itself about the report. Unfortunately I was unable to attend that session, but there was a recap and remarks from one of the panel’s speakers. The points raised in these recaps probably deserve their own more developed responses (e.g., if archivists are “decreasingly well positioned to facilitate access to archival materials”, my own gut reaction is that’s due to our funding sources remaining absurdly reduced or stagnant, not because the profession does not want to meet new challenges).

The other sessions I attended during THATCamp included a session on envisioning the teaching spaces of the future, much of which covered the idea that learning is no longer closely aligned with the classroom as setting (clearly, the experience of libraries retooling their spaces as learning commons, workshops, and other active environments has a lot to offer to this discussion), a session on collaborative mapping tools, and a session on programming for historians (in which one of the participants showed off a script he made to identify the box and folder numbers of images he took during archival research).

Attending THATCamp AHA was a great experience – I think it’s critically important for the voices of archivists to be present at conferences such as AHA. Likewise, I think THATCamp is insightful for archivists, since so many digital humanities projects incorporate archival materials. THATCamp is a welcoming atmosphere – regardless of your experience level. I encourage all archivists and librarians to attend a THATCamp. Given how widespread it’s become, there’s probably one coming near you.

**And it probably goes without saying, but the demarcation between “front of house” and “back of house” archivists often and necessarily overlaps. I have a job which ostensibly is that of a primary processing position, but I serve on the reference desk several hours a week, as do all my colleagues. We also have a public services librarian. So, in many archives, often people perform both duties and the line between the two sets of skills can be fuzzy. This points back  to previous points Kate has made that historians would benefit from knowing more about the workflows, hierarchies, and institutional structures of archives-land.

What books about archives should historians read?

In thinking about book groups yesterday, I thought it would be interesting to have historians read books from our discipline to help them learn about archives. So I’ll pose here the question I posted on Twitter, what one book do you think would give historians (and other scholars) the best understanding of what archives are and how they function? And what archivists really do, too, I suppose. Here are some suggestions from Twitter:

I also think that Archives Power might be a good choice.  And although I had issues with some of its conclusions, Fran Blouin and William Rosenberg’s Processing the Past does do a good job of providing an overview of the field. I suppose the answer is that there is no one book, so perhaps putting together a reading list might be a more effective approach. Thoughts on the suggestions so far? More to add?

Archives book group, anyone? Several options & call for suggestions

I’ve been considering doing another book group, with a somewhat different format than the Reading Archives Power one we did a couple of years ago. (Wow. That was three years ago now. Time flies.) I was thinking of trying to give myself a goal of reading one archives-related book a month. If I turn this into a book club thing it will give me more of a commitment to really do it, and I’m sure that would be true for others. Anyone interested?

I’ve been interested in reading books about archives and the use of archives written by historians (and other non-archivists), along the lines of:

Other possibilities include books about the digital world, such as:

Or something like:

I think reading books outside our own discipline might be interesting, although of course, there are plenty of wonderful books written by archivists that I have not gotten around to reading too.

In a follow-up post I’ll ask about a different reading list–what books about archives would you most like historians to read? But for now, if you’re interested in participating in a book club for archivists, and if you have ideas about what you’d like to read, please leave a comment.