The Archive-It Blog is moving!

To our loyal followers via and RSS,

The Archive-It blog is getting ready to move to its new home, on! The new URL will be: Please update your bookmarks. At this time the new blog will not be RSS compatible, but we are working on it.

Follow us on Twitter at to learn about new posts and keep in touch!


How to archive like a 5th grader

“Doing a project based on archiving is not as easy as it looks. It takes a lot more skills to finish a job like this. My Web Archiving experience was amazing and will always be archived in my mind.”   – 5th grade student, NYC Public School 174, Queens

5th Grade students from NYC Public School 174, Queens

The K12 Web Archiving Program is wrapping up its 7th year of collaboration with 5th through 12th grade classrooms from around the country. We’re very excited to show off the collections that the students have created on such topics as Recreation for Teens in Northern Virginia and Food Central. You can browse the collections on the K12 Web Archiving Program website:

This program provides a new perspective on saving history and culture, allowing students to actively participate and make decisions about what content will be saved.  But don’t take our word for it! The 5th graders from NYC Public School 174 in Queens, New York had a lot to say about their involvement in the program.

“Being part of the Web Archiving project was a dream come true. I got a second chance because we participated in third grade! I loved looking up and archiving websites to add to my collection. I learned a lot of things from this, such as how to crawl websites and what the hard parts of archiving were.”

“Participating in the Web Archiving project helped us learn more about our modern day technology and what we really wanted to save.”

“The Web Archiving Project was a fun while difficult experience. I enjoyed searching for all of the websites in the many different categories we had. The day we finished we were very happy. We all jumped up and high-fived. I think the project improved my research skills a great deal. Trying to find all of the different websites and writing descriptions for all of them was a great way to improve my thinking. Overall, I think the project was a great experience and helped me very much.”

2014 Election archiving collaboration takes off

By Sylvie Rollason-Cass, Partner Specialist

This year, the Archive-It team is partnering with researchers Michael Dougal and Ryan Hübert at the University of California, Berkeley, Karen Jusko, Allison Anoll, and Mackenzie Israel-Trummel at Stanford University, and Mike Parkin at Oberlin College to archive the campaign websites of the 2014 congressional candidates. These websites contain valuable historical information that is added, changed, and taken down frequently during a campaign cycle, as Nicholas Taylor writes in the Stanford library blog:

“Congressional campaign websites are valuable primary source material for historians, social scientists, and the public to better understand the evolution of political communication in the Web era. Campaign websites also afford unique opportunities for the mass collection of materials that would have been previously difficult to acquire outside of the candidate’s district. While it is a truism that the Web is constantly changing and broken links are an inevitable outcome, campaign websites are predictably ephemeral given their time-limited purpose.”

Currently, our 2014 Primaries collection contains over 1,800 websites archived on a weekly basis since January 2014. We will be updating the collections weekly until mid-November 2014. You can take a look at this year’s archived sites and watch the evolution of candidates’ websites over the course of the primaries and onward on the 2014 Congressional Election Cycle Archive-It page.

The Internet Archive is no stranger to collecting election content and has been working with the Library of Congress on collections of election websites for over 10 years. Those collections are housed at the Library of Congress; you can view some of them on their Web Archives page. You can also read the rest of Nicholas Taylor’s blog post here.


5 Questions with Lead Engineer Noah Levitt



We hope you’ve enjoyed getting to know our new team members on the blog, and we thought you might also like to know a little bit more about the engineer behind Archive-It for the past 6 years.

Noah Levitt, Archive-It’s Lead Engineer, has been crucial in improving Heritrix, developing new capture mechanisms like Umbra, and providing important technical support for Archive-It Partners who archive complicated and dynamic content from the web. Thanks, Noah!

1. Before working at Archive-It, what project are you most proud of?

Back in 2003-2004 I was briefly deeply involved with the open source GNOME desktop, mostly working on internationalization. I wrote the Unicode character map that is still distributed with GNOME today, and I also contributed to the Pango text rendering library and related libraries.

2. You have been working with web archiving crawl technology for more than 6 years now! What interests you most about archiving the web?

I strongly believe in the mission of the Internet Archive, “Universal access to all knowledge”. Information is power and it should belong to everyone. Milan Kundera wrote, “The struggle of man against power is the struggle of memory against forgetting.” Along the same lines George Orwell wrote in his novel 1984, “He who controls the past controls the future.” Today the web is the medium where probably more information is published and consumed than anywhere else. We can’t let all of that fall down the memory hole.

3. What content from the web are you most passionate about archiving?

I think we should save everything we can. But if I had to choose one area, I would say news as it happens, primary sources, things like that. Information that will become part of the historical narrative.

4. In addition to being a talented engineer, you are also a musician. What bands and music projects are you currently a part of?

I play in two Balkan brass bands, Inspector Gadje and Fanfare Zambaleta. We play mostly music of the Romani people of Serbia, Macedonia, Romania, Bulgaria, Turkey, etc.

5. What are you most excited about for Archive-It 5.0?

Everything! Better user experience, fresh clean new platform, new tools for reporting and analytics, more intuitive yet powerful crawl scoping, to name a few things.

Our Marathon: One year after the Boston Marathon Bombing

In the year since the bombing at the Boston Marathon, the digital archive project Our Marathon of Northeastern University in Boston has collaborated with WBUR, the Boston City Archives, and others to bring together stories, videos, photos, social media, and memorial messages in an effort to help the public understand the tragic events that took place during the week of April 15th, 2013 and to promote the healing process. As a community project, they have been hosting events in the Boston area where attendees are able to share their stories. Find out where and when these events are being held on their events page.

The Archive-It team is proud to contribute to the Our Marathon project with web content from our 2013 Boston Marathon Bombing collection. This collection is made up of contributions from individuals, including news articles, blogs, social media sites, and organizational websites related to the bombing. The full Archive-It collection can be viewed here.

Stories, Maps, and Punk Rock Archives: An NEA Conference Recap

By Sylvie Rollason-Cass, Partner Specialist


Last week I traded sandals for snow boots to attend the New England Archivists (NEA) Spring 2014 meeting in Portsmouth, NH. As a Midwest transplant living on the West Coast, meeting with New England archivists was a brand new experience for me. I was excited to learn from professionals working in an area with such a rich archival history. Since we on the Archive-It team are on the heels of our 4.9 release and working on 5.0, it was especially exciting to be able to meet with a number of our Archive-It partners in person, as well as folks interested in learning more about Archive-It. We had a lot to talk about!

But back to the conference…

I was lucky enough to be able to attend a number of the sessions, and chose to focus primarily on discussions of digital and web archiving. Here is a little more info about a few that stood out:

A conversation with Ian MacKaye:

Ian MacKaye, a punk rock icon with archivist leanings, discussed his Fugazi Live Series archive. He and his team spent countless hours and much of their own money to digitize over a decade’s worth of cassette tapes of Fugazi live shows and compiled them into a website that allows users to download a copy for as little as $1 (or as much as you’d like to contribute).

Nostalgia, Art & the Archive:

A discussion of creative and unusual ways archival materials have been used and issues surrounding their re-use. Specific topics that were covered included the WhatWasThere project, hauntological music, and colorizing historical photographs.

Sharing Stories: The NEA/StoryCorps Project World Café:

Archivists involved in the NEA 2013 StoryCorps project with the Worcester, MA community discussed how the project came to be, the role of oral histories in libraries and archives, and some personal experiences interviewing and being interviewed for the project.

Our Marathon: The Boston Bombing Digital Archives Roundtable & discussion:

Our Marathon is a “memorial and long term preservation project” to collect and preserve content related to the 2013 Boston Marathon bombing. So far their collection contains photos, videos, oral histories, and web content. The Archive-It team is partnering with the Our Marathon project to incorporate content from our Boston Marathon Bombing collection into the Our Marathon collection. (Keep an eye out for more info about that.)

I also had the opportunity to sit in on some of the NEA Jeopardy Tournament. It was a lot of fun, but unfortunately my knowledge of NEA history and New England archival repositories is pretty limited, so I was much more comfortable on the sidelines!

Other session topics included open source tools, corporate archival collections, social media, and moving image and sound archives (just to name a few). There just weren’t enough hours in the day to hear more, but I’ll be looking forward to next year!

Descriptions of the sessions are available on the NEA Spring Meeting page, and you can read the tweets on the NEA Spring 2014 Storify.

5 Questions with Reed Tech’s David Wilson

In November we announced Archive-It’s partnership with Reed Tech, including our collaboration to market and expand the Archive-It partner community. We are excited to share that David Wilson of Reed Tech will be meeting with some of our prospective partners to help bring awareness to web archiving and showcase the Archive-It service. Learn more about David below! If you are interested in learning more about Archive-It, please consider attending a live informational webinar by contacting us here.

1.  Before working with the Archive-It Team, what project are you most proud of?

Before working with Archive-It, I assisted many organizations in the legal, financial and healthcare industries as well as different school districts to archive their websites and social media. I am proud of having worked with each organization and providing them with a more efficient manner to archive websites and social media to achieve their various goals.

2.  What kinds of web content would you choose to preserve from the web?

Any web content that is sports-related: league websites, team websites, blogs, and social media. With the rapid evolution of digital communication and social media, how can we know Twitter will exist 40 years from now? It would be really neat to be able to see what the NFL website from 2014 looked like in 2050 or what hashtags were trending during the Masters. Statistics and sporting events are preserved and archived; Hall of Fame museums archive memorabilia and milestones; therefore, archiving digitally-born information to ensure it is available forever will only enhance these other collections for future generations to come.

3.  What is your favorite thing about working with Archive-It Partners?

Archive-It Partners understand the importance of preserving the cultural heritage of an organization. I come from a commercial background where the organizations were typically archiving websites or social media to comply with regulations or record retention policies. In many cases it was a “set it and forget it” scenario, as opposed to an organizational collaboration to create and manage an archive of websites and social media with the intention of having these collections made publicly available forever.

4.  During these cold winter months, what do you do to keep yourself busy when you aren’t working?

Shovel snow!  I’ve become a bit of an expert at this, a connoisseur if you will. I’ve even mastered the two shovel technique! I am very thankful that winter is almost over.

5.  You have some Spanish roots, as well as the experience of studying abroad in Spain. What is your favorite Spanish delicacy?

I could spend hours listing my favorite Spanish cuisine. Over the years I have traveled extensively throughout Spain, both while visiting my mother’s family and while studying abroad. There are many different foods that I grew up eating regularly: paella, cochinillo asado, tortilla española, and gazpacho. There is one seasonal delicacy that stands out in my mind: Torrijas, a traditional treat during the Easter celebration. Torrijas are very similar to French toast, but sweeter and soaked in wine, and it is socially acceptable to eat them at any time of the day.