Transcription: Audio Recording to Words on a Page

Following on from a recent post about recording interviews, some questions were raised about transcription. Transcription is a topic that on the surface seems quite practically focussed, in terms of how to record what was said in an individual or group interview. But the choices you make really depend on the kind of research you are doing. An exhaustive consideration of transcription is beyond the scope of this blog, so I’ll mainly focus on my own experiences and the decisions I made along the way. If you have experiences with transcription and/or tips you’d like to share, please do so in the comments below.

Something to bear in mind is that however you do it, transcription is likely to take a long time. Of course, if you are short on time you can engage the services of a professional transcriptionist, though you need to decide this ahead of time so that you can incorporate it into your ethics approval application and your information for participants. Like a lot of research methods writers, I would advocate doing your own transcription if you can, as it gives you a level of familiarity with your data that would be hard to get otherwise.

Don’t believe instructional videos that claim to have a quick way of transcribing. I got sucked in by one that advocated the use of voice-recognition software and made out that it was possible to listen and dictate in real time without stopping, which was unrealistic (unless you are already a pro with this kind of software and have a program that is trained to recognise your voice really well). I was not a pro and the software I used was not as well trained as it could have been, so that didn’t work out for me.

My next port of call was a free piece of software called SoundScriber, which is essentially a looping tool: rather than using a pedal to move your recording forwards and backwards as needed, SoundScriber allows you to choose a certain amount of audio to cycle through (e.g. 5 seconds at a time) and the number of times it should be repeated before moving on to the next 5-second segment. This gives you time to pick up all the details you need from each segment of the audio before you move on. There are other settings that can be altered too, including the playback speed. This worked for me because it meant I could focus on the transcription with less need to move the recording backwards and forwards manually (although there was still a little bit of this), but it doesn’t necessarily make for the quickest transcription process, especially if you’re going through twice to check for errors (which I definitely recommend). There are other approaches to transcription, including using a foot pedal rather than a looping tool, and programs that incorporate audio navigation and typing in one package.
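
If it helps to picture how a looping tool steps through a recording, here is a minimal Python sketch. It is not SoundScriber's actual code; the function names (`loop_schedule`, `total_listening_time`) and the default values are my own assumptions for illustration. It simply lists which stretch of audio would be played on each pass, and adds up how long you would spend listening.

```python
from typing import Iterator, Tuple


def loop_schedule(duration: float, segment: float = 5.0,
                  repeats: int = 3) -> Iterator[Tuple[float, float, int]]:
    """Yield (start, end, pass_number) for each playback pass.

    Hypothetical helper: each `segment`-second chunk of the recording is
    played `repeats` times before moving on to the next chunk.
    """
    if segment <= 0 or repeats < 1:
        raise ValueError("segment must be positive and repeats at least 1")
    start = 0.0
    while start < duration:
        # The final chunk may be shorter than the chosen segment length.
        end = min(start + segment, duration)
        for pass_number in range(1, repeats + 1):
            yield (start, end, pass_number)
        start = end


def total_listening_time(duration: float, segment: float = 5.0,
                         repeats: int = 3, speed: float = 1.0) -> float:
    """Seconds spent listening, given a playback speed (1.0 = normal)."""
    played = sum(end - start
                 for start, end, _ in loop_schedule(duration, segment, repeats))
    return played / speed


if __name__ == "__main__":
    # A 12-second clip, 5-second segments, each played twice.
    for start, end, pass_number in loop_schedule(12, segment=5, repeats=2):
        print(f"{start:>5.1f}s to {end:>5.1f}s (pass {pass_number})")
```

Under these assumptions, a 60-minute interview looped three times per segment at normal speed means three hours of listening before any manual rewinding or checking, which is one way of seeing why looping is comfortable but not necessarily quick.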

The other big decision with transcription is how much detail to include. This will to a certain extent be guided by your research approach and focus. If you are a linguistics researcher, you may consider a phonetic transcription system such as the International Phonetic Alphabet. If you want to capture a lot of contextual information in your transcripts as well as the words themselves, you may consider Jeffersonian transcription, which has annotations for a broad range of verbal and non-verbal aspects of conversation (see Hepburn and Bolden, 2017, for a comprehensive guide). You can also create your own system of what to include and how to annotate it, which was my approach. In my research I had no plans to analyse aspects such as pauses, so I did not include these unless excluding them changed the meaning of a sentence. I included all disfluencies, false starts and repetitions in the initial transcripts, but I removed them from any quotes that I included in the thesis, so as not to detract from the meaning of what was being said. Going into that level of detail really helped get participants’ words in my head and meant it was easier to make connections when it came to analysing the data.

To finish this post, I want to share a key piece of advice that my supervisors gave me: stay on top of your transcription. When you’re in the midst of data collection and the excitement of going out and talking to people, it can be tempting to let things slip and get behind with transcription. But it’s such a mammoth task, it doesn’t take too long before you can feel completely overwhelmed by the amount of work to do. So if you can get down a first draft within a few days of the interview itself, that will really help you to stay on top of things. You may still need to go back and check it through a second time, but this feels much more manageable than starting with a blank screen. Whatever you do, transcription is likely to feel like a bit of a slog: be patient with yourself and make sure to do something to celebrate when you finally make it to the end. Kia kaha!


Hepburn, A. & Bolden, G.B. (2017). Transcribing for Social Research. London: SAGE.

About Kathryn Oxborrow

Dr Kathryn Oxborrow is the temporary Researcher Development Coordinator at AUT's Graduate Research School. In her PhD research she investigated how non-Māori librarians in Aotearoa learn about and engage with Māori knowledge in their lives and work. Kathryn is originally from the UK and moved to New Zealand in 2010.
