web archiving resources for NDSA NE crew (and anyone else reading this!)

This list of resources is shared as a complement to a presentation I gave at the NDSA New England meeting on September 25, 2015. The presentation discussed the MIT Institute Archives’ efforts to acquire websites without a hosted service. I talked about how technology is important, but policy development and planning are key activities that can be accomplished even if new technology isn’t possible right away. The presentation also highlighted the tools we’re finding useful that are easy for an archivist with limited programming skills to use (Web Recorder, wget and Web Archive Player). I’ve previously talked about some of these activities on ArchiveHour; see that post here.

P.S. Every time I think I’ve got a handle on the essential web archiving resources, I find out about something new. I also realize that a lot of work went into web archiving development long before I first learned about it in 2013. With this in mind, it’s quite possible that a lot of good stuff is missing from the following list. Please add resources you love in the comments or alert me to my ignorance via the contact page. =) Thank you!

Get Started

  • International Internet Preservation Consortium (IIPC) website – What is web archiving?
  • IIPC blog post (2015), Ian Milligan – “So You Want to Get Started In Web Archiving?” Provides an excellent list of blogs to follow.
  • Archive-It Web Archiving Life Cycle – the examples are specific to the Archive-It service and its partners, but the life cycle breakdown and concepts are helpful for thinking about the range of activities and policies that go into a web archiving program.
  • DPC Technology Watch 13-01 (2013), Maureen Pennock – “Web-Archiving”
  • NDSA 2013 Web Archiving in the United States survey report

Learn More 

  • Columbia hosted a web archiving meeting in June 2014 and recorded some sessions – available on YouTube
  • The Future of Web Archiving panel at the Digital Preservation 2014 meeting is available on YouTube.
  • “Capture all the URLs” (2014) paper by Alexis Antracoli, Steven Duckworth, Judith Silva, Kristen Yarmey.
  • Archiving the Web: A Case Study from the University of Victoria (2014) Code4Lib paper by Corey Davis
  • Development of University of Michigan web archives, 2011 SAA paper by Mike Shallcross. Find it here.

Stay Tuned

Tool Information & Guides

Web Recorder & other tools created by Ilya Kreymer

wget – if you’re like me, this tool might require some googling to figure it out! You need version 1.14 or newer for the option that writes WARC file output.
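For anyone who wants a starting point, here is a minimal Python sketch of the wget step. It checks a `wget --version` string against the 1.14 cutoff and builds a command that uses wget's `--warc-file` option. The URL, the output name, and the extra crawl flags (`--mirror`, `--page-requisites`, `--no-parent`, `--wait`, `--warc-cdx`) are my own illustrative assumptions, not the settings used in the presentation, so adjust them for your own capture.

```python
import re

# WARC output is only available in wget 1.14 or newer.
MIN_WARC_VERSION = (1, 14)


def parse_wget_version(version_output):
    """Return (major, minor) parsed from `wget --version` text, or None."""
    match = re.search(r"GNU Wget (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def supports_warc(version_output):
    """True if the reported wget version is new enough to write WARC files."""
    version = parse_wget_version(version_output)
    return version is not None and version >= MIN_WARC_VERSION


def build_warc_command(url, warc_name):
    """Build a wget argument list that crawls `url` and writes <warc_name>.warc.gz.

    The crawl flags are illustrative choices, not a recommended policy.
    """
    return [
        "wget",
        "--mirror",                   # recursive download of the site
        "--page-requisites",          # also fetch images, CSS and scripts
        "--no-parent",                # stay below the starting directory
        "--wait=1",                   # pause one second between requests
        f"--warc-file={warc_name}",   # write the capture to a WARC file
        "--warc-cdx",                 # also write a CDX index file
        url,
    ]


if __name__ == "__main__":
    # Placeholder values: swap in the output of your own `wget --version`,
    # your own site and your own file name.
    sample_version = "GNU Wget 1.14 built on linux-gnu."
    if supports_warc(sample_version):
        print(" ".join(build_warc_command("http://example.edu/", "example-capture")))
    else:
        print("This wget is too old to write WARC files.")
```

The script only prints the command rather than running it, so you can read it over (and check the site's size and your collecting policy) before pasting it into a terminal.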

Other tools/information:

Online Documentation 

The following institutions provide collection policies, frequently asked questions, and other program information via their websites. Thank you, folks!

Know of others? Add them in the comments, please!

Researcher Perspectives

  • Web Science and Digital Libraries research from Old Dominion University – blog
  • University of Waterloo historian and researcher – Ian Milligan blog
  • Web Archives for Historians blog – maintained by Ian Milligan and Peter Webster

Upcoming Events

Web Archives 2015: Curate, Capture, Analyze, hosted by the University of Michigan Library in November.

Thanks to UMass Dartmouth and Brown for hosting the NE NDSA meeting this year.

Updates – resources found after this post went live:

  • ALA Connect webinar in April 2015 by Lisa Snider.
