reading notes: what exactly are we trying to capture?

Davis, Corey. “Archiving the Web: A Case Study from the University of Victoria.” Code4Lib Journal, no. 26 (2014). Accessed November 4, 2014.

 ‘It’s not your grandfather’s web anymore.’

-Negulescu and Rosenthal qtd. in Davis

I found this article to be a great starting point for my exploration of web archiving. Davis provides excellent background on web archiving as well as the areas of interest in developing a web archiving program. In particular, I appreciated Davis’ explanation of why dynamic websites are so difficult to capture well. In light of this difficulty, Davis wonders if we might be able to encourage website creators to build sites that are “optimized for web archiving.” Overall, that kind of task seems daunting. However, it might be possible to work with local website creators (at one’s institution) regarding needs for web archiving.

Davis also briefly discusses the nature of web documents and websites as objects to collect – are they archival objects with original order or discrete objects? This is a key question to grapple with when collaborating with colleagues (from library land and archive land) on development of web archiving initiatives.

Things for future consideration:

Description and arrangement: What kind of metadata fully captures the context of the site over time? How would web archives of a university domain fit into existing institutional records? Would they need to? How should web archives be represented in archival description? Are most web archives at this point just stand-alone topical collections?

Use cases for web archives: Do you need an expressed need for web archives before investing in the effort? If you build it, will they come? I think it’s important to look at existing collecting strengths and policy – and archive the web accordingly. Maybe the really big question is – how should frequency of crawls be determined?

The big question raised: Davis asks: “what exactly are we trying to capture? … This database—which represents the majority of the project’s human effort—arguably has more value than the website itself.”

I’m just going to keep thinking about that for now…

Today’s coffee: New England Coffee

The reading notes posts found on this blog are intentionally question-filled and casual. Each notes post serves as a sort of open journal record of my professional development reading as the MIT Libraries Fellow for Digital Archives. See the introduction post for more on this series. I welcome suggestions for future readings, current or archival!


digital distinctions – NE NDSA meeting

On Thursday I attended the New England regional National Digital Stewardship Alliance meeting at UMass Amherst. The meeting consisted of morning presentations, afternoon lightning talks, and open discussion. The organized talks included information on the Archivematica/DuraCloud pilot project, Dataverse services, taxonomy, and collaboration in digital preservation. We also heard from the current Boston NDSR residents.

During the discussion time, my group talked about handling preservation for digitized vs. born-digital content. In our allotted thirty minutes, we covered a lot of ideas and personal experiences that fit into this general topic. We discussed differences in digitization for print materials (books) and analog AV material—noting that the value of the digitized product varies greatly across content type. We wondered about how to prioritize content for various levels of long-term preservation action. We considered differences between licensed born-digital library materials, research data, and various born-digital content found in archives. In the end we didn’t reach specific conclusions, but we posed three questions to the overall group.

  1. How do we prioritize preservation actions (selection, reformatting, processing, long-term storage, reappraisal/deselection) for digital content (born-digital, born-digital legacy media, digitized analog AV media, digitized print/photographs)?
  2. If it’s born digital, is it more valuable?
  3. How do we highlight the importance of long-term digital preservation at the outset of research, object creation, or digitization, rather than pushing quickly for access and leaving digital preservation as an afterthought?

I look forward to the next NE NDSA regional meeting!

Today’s coffee: New England Coffee


Hope you have a spooky Halloween!

Introduction: Deliberate Reading

On a chilly May day earlier this year, I graduated from the University of Michigan’s School of Information ready to enter the exciting world of libraries and archives and prepared to wrangle information in digital formats for preservation and access.

As I searched for a full-time position, I continued working as a digital processing assistant at a U-M archive. Over the summer, I developed valuable skills with digital curation tools and policy development. I spent a lot of time skimming blog posts and skipping around listserv emails with special attention to all things digital curation related. This was very useful to my work, but without course syllabi and assigned readings I often felt a lack of connection to larger conversations about the role of archives and archivists in society.

I decided I needed to step up my professional reading game. Less skimming. More structure. And coffee. Thus, I decided to create a weekly “coffee hour” reading schedule. During this hour, I would commit to reading one article, report, or blog post in its entirety, followed by a quick recap or thoughtful response in a journal, as appropriate.

I happily put this plan on hold when I discovered I would be joining the MIT Institute Archives and Special Collections team as the Fellow for Digital Archives this fall. Now that I’m somewhat settled into life in New England and my new professional role, I want to take up the deliberate reading charge once more.

This blog will host my coffee hour reading responses as well as occasional posts about my experience as a new professional and an MIT Libraries Fellow.


Today’s Coffee: Peet’s brew from Bosworth’s Cafe