How should NARA support user contributions to enhance description of collections?

Note, that’s not “Should NARA support user contributions,” it’s “How should NARA support user contributions.” The time has come for our National Archives to start drawing on the collective wisdom and energy of the Web to enhance its online descriptions. The question is, how should that best take place?

In considering this question, I was reminded of a previous post about how the “space” in which interaction takes place affects the quality/quantity of interaction. Building on that discussion, I can see several possible ways to proceed (although I’m sure there are others).

The first option is to allow users to add tags, comments, and additional information to the catalog records in ARC or to other descriptive information on the NARA site. Questions immediately arise about the level of moderation this would entail, both to filter out information with no value and to keep out potentially offensive information. Does the question of moderation arise if only tags are permitted? I think it would, although it certainly might involve less time. I would be surprised if NARA would allow users to post information on their site (even if the information were clearly differentiated from NARA-provided data) if it did not go through a moderation process, wouldn’t you? This also requires that users add their information within the current descriptive structure (record groups, series, file units, etc., as well as the records for people and organizations). So this option is essentially allowing users to annotate and supplement NARA’s information within NARA’s current descriptive products.

A second option would be creating a separate space, still controlled and moderated by NARA, dedicated to collecting user-provided information along the lines of The National Archives (UK)’s Your Archives wiki. The advantage of this option is that it clearly separates user-provided information from “official” information, and also allows the user community more freedom in how it structures the information it provides (at least in the wiki model, users can add pages, etc.). In such a model there might be a greater reliance on the kind of community policing one sees in Wikipedia, where inaccurate information is identified and deleted by the community of interest for the topic. Clearly this kind of site would also have to be monitored or moderated. And, of course, it wouldn’t have to take the form Your Archives does, of one large resource that is sub-divided. Smaller topical “spaces” could be established, perhaps around areas that have an active community of interest (or for which information is particularly needed).

Another option would be to directly solicit the participation of researchers in the description of materials. If a researcher is working with a given group of records, there’s a good chance he or she may know more about the materials than the description reveals. Why not provide researchers with a template for descriptive information (and guidance about what information to provide) and let them take a crack at adding more to the description provided? Yes, of course, all of it would have to be reviewed and some of it might be worthless, but there are many highly skilled researchers who might be able to provide either relatively complete descriptions or at least valuable supplementary material. NARA may even have developed its own online tutorials for its staff about how to write descriptions, which could be easily reworked as a tool to train researcher volunteers.

I don’t know about the viability of this idea, but I’ll throw it out there anyway. As a way of possibly mitigating “frivolous” tags, comments, and notes in Option #1, provide a “space” that’s dedicated to adding personal or creative content to collection descriptions. A place where people can essentially have tools to remix or annotate NARA’s content any way they want. (Yes, again, within the terms NARA would have to establish to ensure people weren’t creating offensive products.) But think about the potential for that one: galleries, exhibits, videos, performances? If it actually took off, it could even be the kind of thing where notable examples were highlighted on a regular basis. And, while we’re at it, why not actually make this area a larger playing field and have it also draw from the collections of the Smithsonian and the Library of Congress? That’s an idea, isn’t it?

But I wandered away from the issue of description. Still, providing an area for “play” might help keep the “serious” area more serious. Just a thought. Similarly, providing designated “discussion spaces” (in either of the first two options) might provide a channel for debate or information exchange other than the comments on the descriptive information.

In the comments on the earlier post I referred to above, people also discussed the need to consider collaborative sites that the archives doesn’t control, created by communities of interest, whether scholarly or not. “Partnering with a community, in a neutral space, as power equals,” as one wise commenter put it. I feel as though I’m once again wandering away from the topic of user contributions to descriptions, but not entirely. Communities might be more inclined to share their knowledge in a space where they are “power equals.”

So, I’ve provided you with a range of options, from small steps that have already been implemented elsewhere to possibilities that might not yet exist anywhere. Can you add any other possibilities for harnessing the wisdom of NARA’s users? Which of the ideas above do you think has the most promise?

5 thoughts on “How should NARA support user contributions to enhance description of collections?”

  1. Two things immediately spring to mind –

    1. Support machine tags. Machine tags are smart tags – tags which have semantic relations embedded in them. So instead of just tagging a record with a name – ‘John Smith’ – you can include a url which uniquely identifies a particular John Smith (through WorldCat or VIAF or People Australia etc). Similarly you might add geocodes to identify a location. Perhaps even more interestingly, users could use machine tags to indicate relationships *between* records.

    Machine tags give users a lot more power than normal tags. They can develop their own descriptive frameworks. Groups of researchers might organise around certain vocabs – as astronomers have done using machine tags on Flickr. But machine tags are also pretty foreign looking – so tools need to be developed to help in their construction and use (like my identity browser http://wraggelabs.com/people/ ).

    2. Facilitate descriptive activity *not* on the NARA site. More and more description is going to be happening elsewhere; this needs to be encouraged and then harvested back into existing finding aids. The obvious example is Flickr of course. If Flickr photos are (machine) tagged with an appropriate identifier then it’s an easy matter to harvest any user comments or tags and display them in the collection db. In fact I used NARA in a demo of how this could be done with a little Greasemonkey script – http://discontents.com.au/shoebox/archives-shoebox/harvesting-context-1 .

    But what about beyond Flickr? I’m not sure if there are Zotero translators for NARA dbs, but I think there are. If so, users can tag and annotate record details in their own research dbs and then share them with the world through a Zotero public group. As I’ve suggested a number of times in relation to the NAA, this means people have the power to create what is effectively an alternative, user-generated, finding-aid.

    There is already a basic API to access records from the Zotero server. As this develops it should be possible for NARA, and anybody else to harvest tags and annotations back into the NARA site as with the Flickr comments.

    What about other potential sources of descriptive value – blog posts? If blog posts could be tagged with a unique identifier that relates them back to a particular record, these could also be harvested. Citations? As the archival literature states, footnotes are still one of the main methods of discovery used by researchers. Why not mine footnotes of online articles to build dynamic ‘if you enjoyed this record you might also enjoy…’ type lists?

    There’s some basic infrastructure involved of course – primarily good quality persistent identifiers, that can be easily exposed and used.

    I think we have to think beyond what people can do on the NARA (or any other archives) site to how archival finding aids can take advantage of all the descriptive value being added out in the rest of the web.

  2. I have a number of ideas, so I am going to just throw them in the mix in no particular order:

    1) One example of an interesting collaboration model that respects the original content yet encourages exploration of the feedback of others is eComma: http://ecomma.cwrl.utexas.edu/0.2.0/ (currently in use for collaboration around the Collaborative Rubáiyát: http://scholar.hrc.utexas.edu/rubaiyat/ ).

    2) I like the idea of a separate space for personal additions – but I would want a clear method via which information can be earmarked for review by NARA staff for promotion into the ‘official’ description. This review could be triggered via request or by the meeting of some threshold such as a certain amount of descriptive content being added.

    3) I agree that going where the people are (such as the Flickr Commons model) is useful and likely to generate broader participation than assuming people will come to the NARA site to tag and comment on content.

    4) Designate an official Web Annotation tool (see here for a good list of example tools: http://en.wikipedia.org/wiki/Web_annotation ) for use on the NARA site with clear communication that the content will be reviewed for promotion to the official description.

    5) How about working with Footnote.com to pull descriptive content added about NARA records via their already existing platform? This page has a list of annotations on the Amistad Federal Court Records: http://www.footnote.com/documents/6268041/amistad_federal_court_records/ . I am not aware of any existing plans to pull annotation content back into NARA’s systems – anyone know better than I? I also wonder to what degree Footnote supports the type of machine tags that wragge mentions above. They do seem to have the concept of known people – such as this page for George Washington: http://www.footnote.com/page/110513005_george_washington/

    I am excited to hear what others think of ALL the ideas suggested.

  3. A great post! I’d like to second wragge as I think supporting “descriptive activity *not* on the NARA site” is probably a more important step.
    You could expose the data in the catalogue via an api, or an RSS feed, or even a *big* download.
    We’ve seen examples of this outside the archives world. For example, LibraryThing takes advantage of public apis to the LOC catalogue to build its bibliographic database (and online community). Sydney’s Powerhouse Museum has a “Download the catalogue” button on its search page (http://www.powerhousemuseum.com/collection/database/download.php ). And of course there is DigitalNZ.
    By releasing archival descriptive data for use by third party sites you’d create the potential for not just web 2.0 services, but also web 3 semantic stuff maybe, and perhaps even finally realise web 1.0 pipedreams like archival portals and federated search.
    Of course to get the most out of archival data you’d want greater standardisation, of descriptive practices as well as persistent IDs (EAD/EAC??).

  4. Thanks to everyone for your great ideas!

    The many suggestions about harvesting or otherwise gathering information about the records created outside the archives-controlled spaces would essentially develop a new role for the archivist as a kind of curator of descriptive information, wouldn’t it? I mean, not all of the information harvested would be appropriate for addition to the formal catalog. Someone would have to filter and judge what information becomes part of the official description. However, as you point out, the official description is perhaps only one option out of many to learn about the materials.

    Jeanne, I thought of harvesting back the Footnote (and Ancestry) user-added data too, but I think there’s a clause in the agreement with Footnote (and Ancestry) that specifically states that Footnote retains ownership of the user-provided content. If I’m right, then this isn’t an option.

    Richard, the catalog data is now available for download from data.gov – but are you thinking of something that’s easier to deal with?

  5. Kate, I had completely missed the catalogue on data.gov, that’s fantastic!

    It is that sort of initiative that I think should be the focus: build great websites for sure but do it with a showcasing mindset and plan for the really creative use of your material to happen elsewhere. Rather than being the ultimate service provider, focus on providing authentic and accessible content.

    I looked at the catalogue data on data.gov and it is great but, as Berners-Lee suggests (http://www.w3.org/DesignIssues/GovData.html), “just doing it” and sticking your raw material online is really the first step. NARA could go further here. For example, the file format might be improved (it is a series of NARA-specific XML files); Berners-Lee would probably want it in RDF, but what about standard archival descriptive formats like EAD? And how usable is it? Is an annually produced, super-large download going to satisfy the needs of most users? Perhaps a query-able API that returns results in a standard format like OpenSearch would be easier to adopt for users with bandwidth or storage constraints.

    And, as you and Jeanne note with reference to footnote.com, having released your data what you then want to do is make sure you get some value back, that it isn’t simply a one-way street. Today’s O’Reilly Radar blog has an interesting post on kind of this issue: it suggests taking an “open source” mentality to “open data” and aiming to “collaboratively” build datasets – http://radar.oreilly.com/2010/03/truly-open-data.html
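
As a rough illustration of the machine tags proposed in the first comment above, here is a minimal Python sketch that separates machine tags from ordinary tags, assuming the `namespace:predicate=value` form used on Flickr. The function names and the `person` namespace, along with the example.org URL, are hypothetical and chosen only for illustration; nothing here reflects how NARA's catalog or any harvesting tool actually works.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed syntax: namespace:predicate=value, where namespace and predicate
# start with a letter and the value is everything after the first "=".
_MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(raw: str) -> Optional[MachineTag]:
    """Return a MachineTag, or None if the tag is an ordinary free-text tag."""
    match = _MACHINE_TAG.match(raw.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


def group_by_namespace(raw_tags: Iterable[str]) -> Dict[str, List[MachineTag]]:
    """Collect the machine tags found in raw_tags, keyed by namespace."""
    grouped: Dict[str, List[MachineTag]] = {}
    for raw in raw_tags:
        tag = parse_machine_tag(raw)
        if tag is not None:
            grouped.setdefault(tag.namespace, []).append(tag)
    return grouped


# Hypothetical tags attached to one record: a plain name, a geocode,
# and a link to an identifier for a particular person.
tags = ["John Smith", "geo:lat=38.8929", "person:uri=http://example.org/id/1"]
grouped = group_by_namespace(tags)
```

The plain tag "John Smith" is skipped because it has no namespace or predicate, while the other two tags carry enough structure for a harvester to act on them, for example by plotting the geocode or by linking records that share the same person identifier.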
