
How do you collect a website?

The NSW Government produces many important records that document the social and political history of our state. The Library has a long history of collecting physical information about our government and of providing links to NSW websites to the Pandora project. In mid-2014, the Library began harvesting NSW government websites on a regular basis.

The State Library uses a platform and tools provided by Archive-It, based on the same software used by the Internet Archive.

The initial harvest of NSW Government websites was part of a pilot project to explore the potential for domain harvesting as a tool for capturing government information.

The harvest successfully captured information from government departments and ministries, including annual reports and the NSW Budget Papers (1988-2016). In total, 2.6TB of data was collected, covering over 61 million documents.

You can explore the full collection of NSW government websites that have been archived here.


Preserving NSW Digital Content

Archive-It NSW web harvest collection data

The Library's continuing implementation of a new collection management system will enable the ingestion of digital content, including the web harvest, social media, digital photographs, oral history, literary manuscripts and a wealth of other born-digital and turned-digital material. This work supports the Library's Digital Collecting Strategy.

