Winners: Best Use of Crowdsourcing for Description

Next, the spotlight turns to the winners of the Best Archives on the Web awards in the category Best Use of Crowdsourcing for Description. This is the definition of the category:

Whether through Flickr, wikis, blogs or allowing users to comment on descriptions in their online catalogs, many archives are starting to harness the power of their regular researchers as well as experts around the world to help augment or create descriptions for their collections. This award will recognize crowdsourcing efforts that have resulted in a significant exchange of information for the institution.

The judges selected one winner and singled out one nominee to receive an Honorable Mention. And they are . . .

Winner: Nederlands Instituut voor Beeld en Geluid (Netherlands Institute for Sound and Vision)

The description of the project from the nomination statement:

“To explore the impact and success criteria of social tagging in the audiovisual heritage domain, a large-scale video labeling pilot, Waisda?, was launched in March 2009. The goal of Waisda? (which translates to ‘What’s That?’) is to collect user tags that can help bridge the semantic gap, to collect time-related metadata, and to offer people a new way of interacting with television programs, thus creating a connection with the television archive. Waisda? is the world’s first operational video labeling game in the cultural heritage field.

Waisda? invites players to tag what they see and hear. They receive points for a tag if it matches one their opponent has entered within a time frame of ten seconds. The underlying assumption, based on the ‘Games with a Purpose’ by Luis von Ahn, is that tags are most probably valid if there’s mutual agreement. Waisda? introduced three innovations: Using gaming as method to annotate television heritage, actively seeking collaboration with communities connected to the content, and using curated vocabularies as a means to integrate tags with professional annotations.

From the launch in March 2009 to November 2009 (period of the evaluation, the website is still operational, see the WebScience paper by Oomen et al. for more information), over 340,000 tags were added, of which 40.3% consists of matching tags (added by different players within the ten second time frame). In total, 42,068 unique tags have been added.

Waisda? was executed by the Netherlands Institute for Sound and Vision and KRO Broadcasting (Dutch public broadcasting organization). The Business Web & Media Group of VU University Amsterdam performed additional research on topics such as game play and tag quality. (They carry out research in light of their involvement in the PrestoPRIME European research project.) The software company Q42 built the application.”
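
The matching rule described in the nomination statement (a tag earns points when an opponent enters the same tag within ten seconds) can be illustrated with a small Python sketch. This is not the Waisda? code, which I haven't seen. The names `TagEntry`, `normalize`, and `matched_entries` are hypothetical, and comparing tags case-insensitively is my own assumption.

```python
from dataclasses import dataclass

# Ten-second window taken from the nomination statement.
MATCH_WINDOW_SECONDS = 10.0


@dataclass(frozen=True)
class TagEntry:
    """One tag typed by one player at a given moment in the video (hypothetical model)."""
    player: str
    tag: str
    time: float  # seconds from the start of the video


def normalize(tag: str) -> str:
    # Assumption: tags are compared ignoring case and surrounding whitespace.
    return tag.strip().lower()


def matched_entries(entries, window=MATCH_WINDOW_SECONDS):
    """Return the entries that agree with a different player's entry within the window."""
    matched = []
    for entry in entries:
        for other in entries:
            if (
                other.player != entry.player
                and normalize(other.tag) == normalize(entry.tag)
                and abs(other.time - entry.time) <= window
            ):
                matched.append(entry)
                break  # one agreeing opponent is enough
    return matched


# Made-up sample data: "koe" is entered by both players 6.5 seconds apart,
# "boer" is entered 15 seconds apart.
entries = [
    TagEntry("anna", "Koe", 12.0),
    TagEntry("bram", "koe", 18.5),
    TagEntry("anna", "boer", 40.0),
    TagEntry("bram", "boer", 55.0),
]

for entry in matched_entries(entries):
    print(entry.player, entry.tag, entry.time)
```

With this sample data only the two "koe" entries count as matches. The two "boer" entries are too far apart, which is the kind of mutual-agreement filter the ‘Games with a Purpose’ assumption relies on.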

Waisda? received a lot of love from the panel of judges: “Looking through the site I just wished that I knew Dutch, so that I could play. In some ways it reminded me of the Google Image Labeler game, but its application to video content was novel. Based on the nomination form and the accompanying papers, it appears that the data gathered through the game has in some cases been very useful to enhance the description of the videos. I also appreciated the work that the project team had gone through to market the site to their desired audience, including their use of social tools such as Twitter.” The rigor of the evaluation and documentation, as well as the sheer fun of the project, were key in helping snag the win for Waisda?. Also, it’s not every nomination that gets this response from a judge: “I also very much enjoyed watching the Dutch reality show about the farmer.”

Resources in English:
– Background on the game and an English summary of the evaluation can be found on the Images for the Future blog.
– Also, two papers on Waisda? were presented at the WebScience conference this year in Raleigh, N.C.

Honorable Mention: PhotosNormandie on Flickr

Longtime readers may remember that I wrote about the PhotosNormandie Flickr group back in April 2009. Then, as now, the group exists because of the volunteer efforts of two people with the talent and the interest to make it possible–Patrick Peccatte and Michel Le Querrec–and because of the flexible and popular platform that Flickr provides. The purpose of PhotosNormandie is simple–to make archival images of the Allied invasion of Normandy more easily discoverable by more users and to attempt to correct and supplement their existing metadata. The fact that this takes place entirely outside the archival context makes it both more interesting and perhaps more threatening. Patrick and Michel represent no archives, but rather the kind of passionate amateurs who choose to devote their time to advancing knowledge about archival materials. The lack of a connection back to the original archival collections troubled the judges, but they noted that “this project does a lot of things right– in particular harnesses an existing community and tech infrastructure rather than trying to reinvent the wheel or try to get people to a website where they wouldn’t regularly go.”

And so, congratulations to our two notable European examples of using crowdsourcing for description!

“The Semantic Web: What It Is and Why It Matters” and “Linked Data & Archival Description”

Via the great Mashable site, a video explaining the Semantic Web (sometimes referred to as “Web 3.0”).

Web 3.0 from Kate Ray on Vimeo.

Putting this in an archival context (although without any snazzy music), see also from the great Mark Matienzo:

If you find yourself wanting more, Mark has some other presentations available on Slideshare about the possibilities of linked data for archives.

How should NARA support user contributions to enhance description of collections?

Note, that’s not “Should NARA support user contributions,” it’s “How should NARA support user contributions.” The time has come for our National Archives to start drawing on the collective wisdom and energy of the Web to enhance its online descriptions. The question is, how should that best take place?

In considering this question, I was reminded of a previous post about how the “space” in which interaction takes place affects the quality/quantity of interaction. Building on that discussion, I can see several possible ways to proceed (although I’m sure there are others).

First, allowing users to add tags, comments, and additional information to the catalog records in ARC or to other descriptive information on the NARA site. Questions immediately arise about the level of moderation this would entail, to avoid both information with no value and potentially offensive information. Does the question of moderation arise if only tags are permitted? I think it would, although it certainly might involve less time. I would be surprised if NARA would allow users to post information on their site (even if the information were clearly differentiated from NARA-provided data) if it did not go through a moderation process, wouldn’t you? This also requires that users add their information within the current descriptive structure (record groups, series, file units, etc., as well as the records for people and organizations). So this option is essentially allowing users to annotate and supplement NARA’s information within NARA’s current descriptive products.

A second option would be creating a separate space, still controlled and moderated by NARA, dedicated to collecting user-provided information along the lines of The National Archives (UK)’s Your Archives wiki. The advantage of this option is that it clearly separates user-provided information from “official” information, and also allows the user community more freedom in how it structures the information it provides (at least in the wiki model, users can add pages, etc.). In such a model there might be a greater reliance on the kind of community policing one sees in Wikipedia, where inaccurate information is identified and deleted by the community of interest for the topic. Clearly this kind of site would also have to be monitored or moderated. And, of course, it wouldn’t have to take the form Your Archives does, of one large resource that is sub-divided. Smaller topical “spaces” could be established, perhaps around areas that have an active community of interest (or for which information is particularly needed).

Another option would be to directly solicit the participation of researchers in the description of materials. If a researcher is working with a given group of records, there’s a good chance he or she may know more about the materials than the description reveals. Why not provide them with a template for providing descriptive information (and guidance about what information to provide) and let them take a crack at adding more to the description provided? Yes, of course, all of it would have to be reviewed and some of it might be worthless, but there are many highly skilled researchers who might be able to provide either relatively complete descriptions or at least valuable supplementary material. NARA may even have developed its own online tutorials for its staff about how to write descriptions, which could be easily reworked as a tool to train researcher volunteers.

I don’t know about the viability of this idea, but I’ll throw it out there anyway. As a way of possibly mitigating “frivolous” tags, comments, and notes in Option #1, provide a “space” that’s dedicated to adding personal or creative content to collection descriptions. A place where people can essentially have tools to remix or annotate NARA’s content any way they want. (Yes, again, within the terms NARA would have to establish to ensure people weren’t creating offensive products.) But think about the potential for that one–galleries, exhibits, videos, performances? If it actually took off it could even be the kind of thing where notable examples were highlighted on a regular basis. And, while we’re at it, why not actually make this area a larger playing field and have it also draw from the collections of the Smithsonian and the Library of Congress? That’s an idea, isn’t it?

But I wandered away from the issue of description. Still, providing an area for “play” might help keep the “serious” area more serious. Just a thought. Similarly, providing designated “discussion spaces” (in either of the first two options) might provide a channel for debate or information exchange other than the comments on the descriptive information.

In the comments on the earlier post I referred to above, people also discussed the need to consider collaborative sites that the archives doesn’t control, created by communities of interest–either scholarly or not. “Partnering with a community, in a neutral space, as power equals,” as one wise commenter put it. I feel as though I’m once again wandering away from the topic of user contributions to descriptions, but not entirely. Communities might be more inclined to share their knowledge in a space where they are “power equals.”

So, I’ve provided you with a range of options, from small steps that have already been implemented elsewhere to possibilities that might not yet exist anywhere. Can you add any other possibilities for harnessing the wisdom of NARA’s users? Which of the ideas above do you think has the most promise?

You can now buy a copy of “Web 2.0 Tools and Strategies for Archives and Local History Collections”

Yes, my book is now available—there’s a Neal-Schuman version (published in the US) and a Facet version (published in the UK). You can find them both on Amazon. Let me make this clear: this is not a scholarly book. I wrote the book that contained everything I thought anyone needed to know who was thinking about implementing social media in their archives, special collection, historical society, or local history collection. I wrote it to be practical. (You want a scholarly book? I’m working on that one for SAA. It’s going to be, if I do say so myself, really good. But that’s a whole different blog post.) You can see for yourself by looking at the table of contents on the Neal-Schuman site.

As I say in my Acknowledgments:

This book would not have been possible without my own social network of friends and colleagues on Facebook and Twitter, and the wonderful community of people who have engaged in discussion of these issues with me on my blog, ArchivesNext. A friend joked that this would be a crowdsourced book, and in some ways, it is. The world of Web 2.0 is too large for anyone to keep up to date on everything that’s happening, and so I am happy to be part of a community of archivists working toward integrating Web 2.0 technology and thinking into our archival institutions.

Among the things I’m most pleased with are the interviews with so many archivists who have successfully implemented Web 2.0 tools. These interviews are usually a couple of pages long and focus on their own experiences and lessons learned. My thanks to these lovely people who contributed interviews (in order of appearance):

Sara Piasecki, Oregon Health & Science University
Stephen Fletcher, University of North Carolina at Chapel Hill
Gavin Freeguard, The Orwell Prize
Emma Allen and Joshua Shindler, The National Archives (UK)
Heather McClenahan, Los Alamos County Historical Society
Lin Fredericksen, Kansas State Historical Society
Julie Kerssen, Seattle Municipal Archives
Amy Schindler, The College of William and Mary
Katrina Harkness and Joshua Youngblood, State Library & Archives of Florida
Mark E. Harvey, Archives of Michigan
Ann Cameron, Gill Hamilton and James Toon, National Library of Scotland
David Hovde, Purdue University
Matt Raymond, The Library of Congress
Lauren Oostveen, Nova Scotia Archives
Molly Kruckenberg, Montana Historical Society
David Smith, Archives New Zealand
Tracey Baker, Minnesota Historical Society
Michele Christian, Iowa State University
Colleen McFarland, University of Wisconsin—Eau Claire
Tim Sherratt, National Archives of Australia
Matthew Davies, National Film & Sound Archive (Australia)

When I was writing the book I wanted to include as many real-world examples as possible to illustrate the different things archives and historical organizations are doing on the web. It was only when I was compiling the index that I realized just how many places I referenced. Here, for my amusement, and I hope yours, is a list of all the archives sites and organizations mentioned:

Help build a library of current archives resources

This post is long overdue, so I apologize to my Scottish kinfolk for not getting it up sooner.

The brilliant people at the Centre for Archive and Information Studies (CAIS) at the University of Dundee are trying something new–developing a “link library” of online materials relevant to CAIS students. You can read a full description in their own blog post here and you can see the work in progress at Delicious here. But, in short:

All we’re asking is that whenever you save a bookmark on a record keeping or related subject with your own delicious account (which is free) you tag it ‘for:CAIS_Archives’ (without the inverted commas). That sends the bookmark to our inbox and we can then save it for inclusion in the main list. The reason we’ve used this approach is so that we can keep a modicum of control over the vocabulary we use for tagging. As the list of links grows the tags will become crucial for discovery. However, we will take into consideration any tags you have already attached to the link.

It’s a project that has potential value far beyond CAIS’s own needs, so I encourage you to start sending links to the CAIS team via Delicious. (Don’t have an account yet? Here’s their getting started info.)

Contest: Looking for parties in Flickr

As you might have seen, I’ve been having a great time making Galleries in Flickr, drawing on the collections of archives, special collections, museums, and related organizations. It all started with dogs, but then moved on to things like elephants, goats, typewriters, umbrellas, and, inevitably, more dogs.

I’ve been wanting to do a gallery of images showing parties or celebrations, but I’m afraid that a lot of you won’t have tagged those images with those kinds of terms. (When I start a gallery, I turn first to the ArchivesOnFlickr group and search there for my terms. Then I search through the images on the Commons, and my last resort is a general search of all of Flickr.)

So, I’m asking for all of you to send in links to your images on Flickr from your collections that show parties, celebrations, or just people having a good time. I’ll take the ones I like best and create a new Flickr gallery (or maybe two, if we get enough good submissions). I’ll leave this open until the day after Thanksgiving, so you’ve got plenty of time to come up with some suggestions. Note that these must be images from your archival collections. (I guess I could start a second gallery for current images of parties in the archives. Do you think we could find 18 of those? 😉 )

So, the contest has officially begun! Post links to your images that best show people having fun, and you may win a place of honor in one of my galleries!

Four “places” for archives to interact with users

So, building on the post from a few days ago with Clay Shirky’s observations about how to create a situation that will lead to fruitful collaboration with online users, I’m going to talk about four different “places” where archives invite user participation, and the kinds of implicit and explicit social contracts created by them. This is the first time I’m presenting these ideas, which need a lot more work and thinking through, but in the spirit of collaboration, I’m sharing them here in their raw form.

It’s not really ready, but here’s the Archives 2.0 Wiki . . .

Ok, yes, I’ve been dragging my feet on this forever and have decided to walk the walk and not just talk the talk–so, even though it has lots of problems and it’s not perfect and I haven’t added all the content I want to …. here’s the Archives 2.0 wiki.

This is a replacement for the list of archives doing 2.0 stuff that I’ve had on this blog, but which was getting too long to be easily navigated. And the number of archives using these tools has exploded and I can’t keep up. Which is great news, and makes the wiki a natural step.

Like many archivists, I’m a bit of a control freak so I’m not super comfortable with opening the site up to editing. Because there’s a structure that I want to maintain and I’m not sure people will follow it properly. But, ok, let’s give it a try. Although the site has disclaimers that say you can’t edit it, right now, you can (if you register), so if you have something to add, please do so. If I find this isn’t working out, I’ll change the permissions and accept submissions via the email address: wiki [at] — which you can also use if you’re not comfortable editing and you want me to add your link. But be aware that it may take me a while to get around to it.

So, here it is, add your stuff and let me know if you think there should be new pages or sections. Remember it’s a work in progress. Hope you find it useful!

U.S. National Archives finally joins the crowd on Flickr

Apparently in the last few days the National Archives and Records Administration (NARA) has joined the ranks of archives–big and small–who are sharing their images via Flickr. NARA is currently sharing 195 images, representing “Favorites of the U.S. National Archives,” “Women’s Bureau Photographs” (only 12), and two sets from the DOCUMERICA collection–“DOCUMERICA Favorites” (consisting, oddly, of only 3 images), and a set focusing on the DOCUMERICA images of one photographer–Michael Philip Manheim.

NARA has made some good choices–allowing users to add tags and notes to the images. They have supplied only basic tags for some of the images (although others are more extensively tagged), but have included subject headings in their descriptive metadata. The images are classified as having “no known copyright restrictions,” which is the same classification that images in the Flickr Commons also bear. These NARA images are not in the Commons, although it would seem they clearly belong there (along with so many of the ArchivesOnFlickr). It would be interesting to get a public statement from NARA about why they’re not joining the Smithsonian Institution and the Library of Congress in the Commons–or at least what the hold-up is. It would also be interesting to hear what their overall population plan is for Flickr–why were these images posted first and what other kinds of images can we expect to be added?

I’ve heard that changes are underway at the Commons that will make it easier for organizations to join, but haven’t seen any confirmation of that, so perhaps soon NARA and the rest of the archives on Flickr will be able to join in on the fun in the Commons, but in the meantime, enjoy the images, share your thoughts about this new addition (and any other Flickr issues) in the comments here, and spread the word–and yes, Nixon and Elvis are there . . .

Future of archives? “Passionate amateurs” doing “detailed curating”?

Today, via the miracle of the Intertubes, I am getting to observe what is happening at a meeting about “Smithsonian 2.0.” Here is the official meeting site, here’s what I believe is the project’s authorized blog, but most importantly, you can read what attendees are writing about the meeting on Twitter (#SI20).

Just a few minutes ago, Dan Cohen was describing the talk being given by Chris Anderson (of “Long Tail” fame) and, relaying a thought of Anderson’s, tweeted this:

SI could even do live crowdsourcing. Passionate amateurs would prob travel to look at archives and do detailed curating

I re-tweeted this, since I’ve got some archivists “following” me now, saying that I was skeptical and asking for others’ thoughts. Well, what do you think, blog readers? This has been talked about before, but as I said, I’m skeptical. For records that have genealogical value, you might be able to find people to do this, but I think that might be the extent of it. Am I too pessimistic?

Or, let’s break that question up. First, do you think there is a large enough pool of “passionate” people (forget the amateur part) who would be willing to travel to your archives and do detailed cataloging and then share that with the world? (Not even considering how you would be integrating that “curating” into your existing structure for description.) Second, what if the materials were online, and no travel were necessary? There are some precedents for this–users adding value to digitized materials–with Flickr and How realistic do you think expectations like this are–particularly with regard to people actually coming in to the archives to do this kind of work?

P.S. This is a nice example of people using 2.0 tools like blogs and Twitter to expand the discussion taking place at a meeting with limited attendance to a much wider audience. Again, think about the usefulness of such tools for Austin this summer.