Amplify audio transcription tool

Last updated:  22 March 2024

Amplify is a crowdsourcing platform where libraries can temporarily place digital audio materials from their collections paired with machine-generated transcripts to engage people in the community to help correct any errors in the transcripts.  

Once completed, your audio materials and newly corrected transcripts can be moved off Amplify to a permanent location of your choice, such as your library management system or library website. 

Introduction

The aim of Amplify is to promote access to rich cultural collections from across New South Wales, as well as to provide an engaging and user-friendly tool to create and correct transcripts at low cost. Amplify not only provides online access to previously inaccessible material, but also creates an opportunity for libraries to engage with communities of interest in correcting any errors or gaps in the machine-generated material. It is a great way for researchers, volunteers and the general public to get ‘hands-on’ with cultural heritage and listen in depth to the stories of NSW and Australian history.

Why is audio transcription important?

If you have audio material in your library's collection, it is very likely that there are users of your library service who want to access it - ideally online! As well as digitising your audio collections for the purpose of preservation and access, providing text-based transcripts is an important way to enrich the value and usefulness of your local studies material.

Improving the accessibility of your collections

Providing access to transcripts for your audio material ensures that your collections can be accessed by all members of your community. All government institutions should be committed to making any content they publish online accessible to all users, including members of your community who may access material in alternative ways, such as with screen readers or other assistive technologies. The Web Content Accessibility Guidelines (WCAG) set out recommendations and criteria that Australian government institutions, including libraries, are required by legislation to follow under the Disability Discrimination Act 1992 (Cth). The guidelines make clear that all multimedia content, including audio and video material, should be published online with an accompanying 'text-based alternative' in order to meet baseline accessibility requirements.

Improving your research experience

As well as meeting accessibility requirements, providing text-based transcripts of your audio collections will ensure an easier and more productive research experience for users of your collections. Many researchers prefer to browse transcripts rather than listen to hours of audio material. Transcripts make it much easier to browse key themes and topics within a recording, as well as search for and identify named places and people. Transcripts can also make your audio collections more discoverable by being indexed by search engines such as your library catalogue or Google, making it easier for users to find relevant material.

How can my library get involved?

By joining the Amplify platform, your library will be set up with its own presence to host your digitised oral history collections, making them available for listening and transcription to the public. As well as complimentary training, you will also receive membership to the Amplify Community of Practice, an online hub providing support, platform updates, resources, best practice advice and ongoing peer-to-peer communication in the use of Amplify.  Learn more about how your library can access Amplify for your own oral history collections.

New subscriptions to Amplify will be opened to eligible public libraries in NSW via staggered application rounds each year. 

The number of successful applicants in each round will depend on how many submissions we receive, as well as other factors such as the ability to group libraries in the same geographic area into one round to help streamline and schedule training. We aim to include up to 5 new public libraries in each round, with a minimum of 2 rounds per year.

How can I make Amplify successful for my library?

Amplify provides a collaborative and cost-effective way to create transcripts for your audio collections. Using a third-party machine-learning service, Amplify creates computer-generated transcripts for every audio file you upload. These computer-generated transcripts often include errors, so Amplify acts as a crowdsourcing platform where you can engage your local community to assist in the review and correction of your transcripts.

While much of the platform is technology driven, Amplify is not a platform where you can 'set and forget' your audio  collections. Rather, Amplify is a temporary tool for transcription and community engagement. There are several factors to consider to ensure Amplify is a success for your library.  

Digitised collections

You must have digital audio files, either digitised or born digital, prepared before you start using Amplify. Ideally, you will also have access to digital image collections to accompany your audio material.

Staff resourcing

Like any other digital content platform, Amplify requires ongoing resourcing. Tasks will include preparing, managing and publishing content, community engagement, transcription moderation and participation in the Amplify Community of Practice. Achieving corrected and completed transcripts for your audio collections will not be possible without dedicated and sustained resourcing from your team.

Community engagement

Engaging with your stakeholders is a critical component to making Amplify successful for your library. Your local community is the most likely to be invested in your audio material, and therefore the most likely to participate in the review and correction of transcripts. Sustaining this community interest will be imperative to reaching your goal of completed transcripts within a reasonable timeframe. See the Engaging Community Participation section for more details about potential community engagement strategies. 

Managing collections

What audio collections are best suited to Amplify?

Amplify is designed to cater to oral history recordings with any number of people speaking clearly. The better the quality of the recording (minimal background noise, good quality microphones, and clear and audible enunciation by speakers), the better the machine-generated transcript is likely to be. A poorer quality recording from your collection can still be used on Amplify, but be aware that it will produce a poorer quality machine-generated transcript, which will require more correction and therefore take longer to complete.

The best results generally come from recordings of interviews and conversations with only two people. Group discussions can also be used, although it is often hard to clearly identify individual speakers within a recording, and multiple voices often result in a less accurate machine-generated transcript.

Amplify cannot be used for any songs or other musical recordings. Poetry readings are acceptable. 

You can learn more about the types of collections and content that are most suitable to Amplify by reading our case studies.

Languages other than English

Amplify has limited capacity to transcribe recordings in languages other than English, due to limitations of the third-party machine transcription service used.

Recordings in the following languages can be uploaded: French, German, Italian, Dutch, Spanish and Portuguese. Please note that transcripts of any recordings in these languages will not be translated into English, so the transcripts on Amplify must be in the same language as the recording itself. Audio collections of this nature could provide a great engagement opportunity with groups in your community who speak these languages.

For recordings in mixed languages, for example English and German spoken within the one interview, the transcript will only be in English, meaning that the machine service will have attempted to transcribe anything spoken in German as English. These types of transcripts will contain errors that need careful attention to correct.

File requirements

Audio files

Audio file types acceptable for uploading to Amplify are mp3, mp4 or m4a formats. If your digital audio files are in WAV format for the purpose of preservation, you will need to create an access derivative from these master files in one of the three acceptable formats noted above. 

The bitrate of your audio files is important because this will impact the quality of your machine-generated transcript. The recommended bitrate for all audio files is 96kbps (kilobits per second). Anything lower than this may decrease the accuracy of your transcript, and anything higher than 96kbps will have minimal impact on improving your transcript quality, but will create much larger file sizes which will use up more of the allocated storage space unnecessarily.
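As a rough illustration of preparing an access derivative, the sketch below builds a command for the ffmpeg command-line tool that converts a WAV master to an MP3 at the recommended 96kbps. It assumes ffmpeg is installed on your computer. The function name and file names are hypothetical, and this is only one possible approach, not a workflow prescribed by Amplify.

```python
from pathlib import Path

# Formats listed above as acceptable for upload to Amplify.
ACCEPTED_AUDIO_EXTENSIONS = {".mp3", ".mp4", ".m4a"}
RECOMMENDED_BITRATE_KBPS = 96


def build_ffmpeg_command(master, derivative, bitrate_kbps=RECOMMENDED_BITRATE_KBPS):
    """Return an ffmpeg argument list that encodes `master` to `derivative`.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    if Path(derivative).suffix.lower() not in ACCEPTED_AUDIO_EXTENSIONS:
        raise ValueError(f"{derivative} is not an mp3, mp4 or m4a file")
    # -i names the input file; -b:a sets the audio bitrate (for example "96k").
    return ["ffmpeg", "-i", str(master), "-b:a", f"{bitrate_kbps}k", str(derivative)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_ffmpeg_command("interview_master.wav", "interview_access.mp3")))
```

The returned list can be passed to subprocess.run(command, check=True) to perform the conversion. Your WAV master file is left untouched.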

Image files

Image files paired with your audio collections can be uploaded in JPEG or PNG format. Images should be at a minimum resolution of 72dpi (dots per inch) and have a minimum width of 1000 pixels, ensuring that images will be high enough quality at any screen size.
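The image requirements above can be expressed as a simple pre-upload check. The sketch below is illustrative only: the function name is hypothetical, it assumes JPEG files use a .jpg or .jpeg extension, and it expects you to supply the width and resolution yourself (for example, from your image editing software) rather than reading them from the file.

```python
from pathlib import Path

# Assumption: JPEG files are named .jpg or .jpeg, PNG files .png.
ACCEPTED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MIN_WIDTH_PX = 1000
MIN_RESOLUTION_DPI = 72


def image_problems(filename, width_px, dpi):
    """Return a list of reasons the image falls short of the stated minimums."""
    problems = []
    if Path(filename).suffix.lower() not in ACCEPTED_IMAGE_EXTENSIONS:
        problems.append("file is not JPEG or PNG")
    if width_px < MIN_WIDTH_PX:
        problems.append(f"width {width_px}px is below {MIN_WIDTH_PX}px")
    if dpi < MIN_RESOLUTION_DPI:
        problems.append(f"resolution {dpi}dpi is below {MIN_RESOLUTION_DPI}dpi")
    return problems


if __name__ == "__main__":
    # Hypothetical file, for illustration only.
    print(image_problems("town_hall_1954.tiff", 800, 72))
```

An empty list means the image meets all three requirements.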

Copyright, privacy and censorship

Copyright and privacy of collection materials can be complex issues. It is your responsibility to ensure that all collections you publish online are either out of copyright or that permission has been sought from the copyright holder to make them available online. You are also responsible for ensuring that any audio files or images you upload do not breach any privacy restrictions that may apply to your collections.

The State Library of NSW recommends taking a risk management approach to managing copyright, especially when the copyright status of a particular collection is not known, or there are privacy issues and any concerns about sensitive subject material.  

To determine the level of risk, a risk management matrix may assess factors like collection age, subject matter, historical context, cultural sensitivities, and how the material was acquired. When a collection is considered low to medium risk, it is still suitable for publishing on Amplify, supported by a takedown policy. The State Library of NSW has a standard takedown policy that applies to all material published on Amplify. 

An example of a low-risk collection might be a series of interviews commissioned by your library where you are the copyright holder and each speaker signed a rights agreement form. An example of a potentially high-risk collection might be a recently recorded series of interviews with local residents where the interviewees state their home address or other personal information.

It is also important to consider whether any collections you wish to upload to Amplify may contain any sensitive or potentially controversial subject matter. Culturally sensitive or defamatory content should be carefully considered before publishing online.  

Engaging community participation

It is imperative for you to have strategies in place for engaging community participation in listening to and transcribing your audio collections. Your collections most likely focus on your local area or a specialised topic area related to your community, so your community is best placed to help transcribe these collections. There are a number of strategies that your library could use to promote engagement:

Promote collections to your local community through regular communication channels such as social media, library and council websites and eNewsletters.

Harness the passion and subject matter expertise within your local area by promoting specific collections to known community groups. For example, if you have a collection of interviews with volunteer firefighters, target the promotion of the collection to your local fire brigade Facebook group.

Partner with local organisations such as historical societies and schools to recruit volunteer transcribers. School students may like to work together on transcribing a collection for a group project.

Recruit regular onsite volunteers who have an interest in transcribing and who would like to support your library.

Create a dedicated Amplify space in your library to encourage your visitors to listen to and transcribe your collections. All you need is a computer and a set of headphones.  

Participate in a local pop-up event promoting your collections on Amplify - take a couple of laptops and headphones with you. You could run a 'transcribe-a-thon' event with a specific goal, like completing transcripts of 10 interviews by the end of the day.

Contact interviewees and their families directly to let them know the collections are available. You are likely to have more participation in transcription from the community when they have a personal connection to the interviewee or subject matter. 

Community of practice

When signing up to use the Amplify platform, you will also join an online Community of Practice for all participating organisations. The Community of Practice is a space for all organisational users of Amplify to share resources, ask questions and troubleshoot issues, share success stories and advice, provide feedback, help prioritise feature development and provide peer-to-peer support. It is intended to ensure that all Amplify users get the most out of the platform for their audio collections, learn from the experiences and strategies of others, and identify opportunities to collaborate whenever possible.

The Community of Practice online space includes features such as:

  • Knowledge base
  • Resources and documentation
  • Monthly usage reports
  • News and alerts
  • Instant group chat
  • Feature requests and development opportunities
  • Shared calendar 

Exit strategy and LMS best practice

Amplify is a tool for creating transcripts for your audio collections. It is not a permanent 'home' for the hosting of these collections. When you sign up to use Amplify, you are committing to actively using the platform to engage users to review and correct your transcripts in a timely manner. Once your transcripts have been completed, your collections should be removed from Amplify to make way for new collections to be added and transcribed. 

There are multiple strategies you could use to encourage engagement with your collections. For example, some organisations might decide to upload an entire collection of multiple files at one time. This is a good way to promote a collection to your community and gives users the opportunity to gain a better understanding of the collection as a whole. This can be effective if you have a collection that is made up of a series of consecutive or related recordings rather than standalone files. The consequence of uploading a collection with many files is that it will take longer for all transcripts to be reviewed and completed.

As an alternative example, some organisations may choose to upload only one or two recordings at any one time. This means focusing on the promotion of a single interview or recording with a singular topic area or subject matter. This may work well if you have a smaller number of files in your collection, or if you're hoping to get a particular transcript completed more quickly. 

Whatever strategy you use to promote your collections and engage your community, the aim is to have the audio material transcribed as quickly as possible so the completed files can be removed from Amplify to make way for new collections.

As a result, you will need to consider what solutions are in place to provide permanent access to your audio collections and transcripts. Will permanent access be provided through the library catalogue and/or a dedicated project website? Where will the final transcript files be stored? It is important to consider a workflow for ongoing access to your audio collections prior to using Amplify, to ensure Amplify is part of the overarching strategy for the preservation and access of your collections.

Frequently asked questions

What is Amplify? 

Amplify is a crowdsourcing platform where libraries can publish digital audio materials from their collections, paired with machine-generated transcripts. These audio files can then be accessed and listened to by members of the public, who can also help to correct any errors they find in the machine-generated transcripts as they listen along.

The aim of Amplify is to promote access to rich cultural collections from across New South Wales, as well as to provide an engaging and user-friendly tool to create and correct transcripts at low cost. 

What kind of collections does my library need in order to participate? 

In order to use the Amplify platform, your library must have digital audio materials - either digitised or born digital audio files. It will also be beneficial if you have access to digital images that can be paired with your audio collections, but this is not essential.

Will using Amplify cost my library any money? 

The State Library of NSW will subsidise costs for all eligible public libraries in NSW in their entry-level use of Amplify. We will do this using a tiered subscription approach in which each public library has access to the “basic” subscription tier at no cost. The basic tier includes 5 gigabytes of file storage for audio and images. If a library wants to exceed that storage limit, it will be upgraded to the next tier, which incurs an annual subscription cost. This upgrade will be at the discretion of the participating public library and will only occur after consultation with the State Library of NSW.
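To get a feel for how far the basic tier's storage might stretch, the sketch below estimates file size from bitrate. It is a back-of-envelope calculation under stated assumptions: 1 kilobit is taken as 1,000 bits and 1 gigabyte as 1,000 megabytes, all audio is encoded at the recommended 96kbps, and no space is used by images. How Amplify actually measures storage may differ, and the function names are hypothetical.

```python
def megabytes_per_hour(bitrate_kbps=96):
    """Approximate size in megabytes of one hour of audio at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8  # kilobits -> bits -> bytes
    return bytes_per_second * 3600 / 1_000_000  # one hour, converted to megabytes


def hours_that_fit(storage_gb=5, bitrate_kbps=96):
    """Approximate hours of audio that fit in the given storage (audio only)."""
    return storage_gb * 1000 / megabytes_per_hour(bitrate_kbps)


if __name__ == "__main__":
    print(f"{megabytes_per_hour():.1f} MB per hour of audio")
    print(f"about {hours_that_fit():.0f} hours in 5 GB")
```

Under these assumptions, an hour of audio at 96kbps is about 43 MB, so 5 gigabytes holds a little over 115 hours of audio before any images are added.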

What kind of resourcing from my library will be required? 

Using Amplify is an ongoing commitment from your library. The State Library can offer you access to the technology and technical support to use Amplify as a vehicle for transcription and community engagement, but your success in using the platform will be dependent on the ongoing resources you commit to the project. 

Some of the tasks you will need to resource include:  

  • Content creation and management including some moderation
  • Ongoing promotion of your collections 
  • Recruiting local volunteers and community engagement 
  • Ongoing technical support and troubleshooting for your volunteers 
  • Ongoing management of copyright of your collections 
  • Ongoing management of any enquiries related to your audio collections

How many libraries will be accepted to the platform? 

This depends on how many applications we receive in each expression of interest round, but there is no specific limit and we aim to accept as many eligible public libraries as possible each round.

What criteria do you assess applications against? 

There are no set criteria against which each application is assessed, but some factors that may influence our decision include:

  • Public libraries' access to digitised audio and image files
  • Variety and scope of content areas
  • Geographic location of applying libraries 
  • Guarantee of resourcing from participating libraries

Will you provide ongoing training and support for my library in the use of Amplify? 

Yes. As part of our rollout to public libraries, we will be providing face-to-face training for all new participants. Your library will also receive membership to the Amplify Community of Practice, an online hub providing support, platform updates, resources, best practice advice and ongoing peer-to-peer communication in the use of Amplify. This online hub will be the first port of call for ongoing support and training.

My library doesn’t yet have digital audio material. Can we participate in Amplify? 

To add collections to Amplify your library must have digitised collections. Your complete archive of oral histories and audio material does not need to be digitised, but at least a portion of your total audio collections must be in digital format to be eligible for use on Amplify. 

How long is the commitment for using Amplify? 

There is no set period that you will be locked into using the platform. However, Amplify should be seen as a tool for the creation and correction of transcripts, not as a long-term host for your audio collections. If you are accepted into the Amplify program, you will be required to have transcriptions for your collections completed within a pre-agreed timeframe. Once your audio collections have been transcribed in full, you will be able to extract your complete transcripts and then take your audio materials down from the platform.  

If my library participates in the Amplify platform will the State Library acquire our collections? 

No. The State Library will provide you with the technology and infrastructure to temporarily host your audio collections for the purpose of transcription, but your collections remain yours. The State Library will not keep any copy of your materials other than automated technical backups of the Amplify platform as a whole, which form part of our disaster recovery procedures.

Further information

Please contact us if you want to find out more.