In a typical free recall experiment, once the session is over the
experimenter (or a team of experience-hungry undergraduates) manually
transcribes each subject's verbal responses by listening to the audio
files and coding each word. This process can take hours and is not
exciting, to say the least. To help with this problem, we created a
decode_speech function, which wraps the Google Speech API and a software
package called ffmpeg to transcribe the responses automatically. It also
allows the experimenter to transcribe in (almost) real time, which makes
adaptive free recall experiments a possibility. To use this feature
(assuming that you are using a Mac or Linux machine), you must first set
up ffmpeg and the Google Speech API:
ffmpeg
ffmpeg is a native application that processes audio and video files.
We will use it to convert .wav files to the .flac format, which will
allow us to send the files to Google Speech. To set it up with Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install ffmpeg
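To check that the install works, you can try the .wav to .flac conversion by hand. The sketch below is only an illustration and is not taken from quail's source: the helper names build_flac_command and wav_to_flac are hypothetical, and it assumes the ffmpeg executable is on your PATH. decode_speech is described above as wrapping ffmpeg, so you should not need to run this step yourself in normal use.

```python
import shutil
import subprocess
from pathlib import Path


def build_flac_command(wav_path, flac_path=None):
    """Return the ffmpeg argument list that converts a .wav file to .flac.

    Hypothetical helper for illustration; not part of quail.
    """
    wav_path = Path(wav_path)
    # Default output: same folder and file name, with a .flac extension.
    flac_path = Path(flac_path) if flac_path else wav_path.with_suffix(".flac")
    # -y overwrites an existing output file; -i names the input file.
    # ffmpeg infers the FLAC encoder from the output file extension.
    return ["ffmpeg", "-y", "-i", str(wav_path), str(flac_path)]


def wav_to_flac(wav_path, flac_path=None):
    """Run the conversion and return the path of the .flac file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH; install it first")
    cmd = build_flac_command(wav_path, flac_path)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


# Show the command that would be run, without running it.
print(build_flac_command("../data/sample.wav"))
```

Calling wav_to_flac('../data/sample.wav') would then write the .flac file next to the original, provided the .wav file exists.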
Under the hood, quail uses the Google Speech API to transcribe audio
responses. Note: the API is not free, but it's quite reasonably priced.
Up to 60 minutes per month is free, and after that it costs $0.006 per
15 seconds. For a typical study (16 study/test blocks) allowing for a
minute of recall after each, that is 16 minutes of audio, or 64
15-second increments, so the price comes out to ~$0.38 per subject
(before the free minutes are applied). To set it up, follow these steps:
If you followed these steps, a JSON-formatted API keyfile will have been downloaded to your local computer. This file is your ticket to speech decoding, so keep it safe. Everything should now be set up! Below is a basic example of how to use it:
# import quail
import quail
# decode speech
recall_data = quail.decode_speech('../data/sample.wav', keypath='path/to/keyfile.JSON')
# print results
print(recall_data)
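Since the API is billed by audio duration, it can be useful to estimate the cost of a study before collecting data. The sketch below reproduces the per-subject figure quoted earlier from the stated price ($0.006 per 15 seconds, 60 free minutes per month). The function name transcription_cost is hypothetical, and rounding the total duration up to whole 15-second increments is an assumption; the actual billing rules may round each request separately.

```python
import math

PRICE_PER_INCREMENT = 0.006  # USD per 15 seconds, as quoted above
INCREMENT_SECONDS = 15
FREE_SECONDS_PER_MONTH = 60 * 60  # 60 free minutes per month


def transcription_cost(n_blocks, seconds_per_block, free_seconds=0):
    """Estimate the transcription cost in USD (hypothetical helper).

    Assumes the total duration is rounded up to whole 15-second
    increments after subtracting any remaining free seconds.
    """
    total_seconds = n_blocks * seconds_per_block
    billable_seconds = max(0, total_seconds - free_seconds)
    increments = math.ceil(billable_seconds / INCREMENT_SECONDS)
    return increments * PRICE_PER_INCREMENT


# One subject: 16 blocks with one minute of recall each, no free minutes.
print(f"${transcription_cost(16, 60):.2f} per subject")  # prints "$0.38 per subject"

# Ten subjects in one month, with the free hour applied once.
print(f"${transcription_cost(16 * 10, 60, FREE_SECONDS_PER_MONTH):.2f} for ten subjects")
```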
The credentials can also be set up as an environment variable. To do this, edit your .bash_profile, adding the line:
export GOOGLE_APPLICATION_CREDENTIALS='/path/to/keyfile.JSON'
You’ll need to launch a fresh terminal instance, and then the
decode_speech function should work without the explicit keypath:
# decode speech
recall_data = quail.decode_speech('../data/sample.wav')
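With several recordings per subject, it may help to check the credentials once and then loop over the files. The following is a minimal sketch under stated assumptions, not part of quail: resolve_keyfile and transcribe_folder are hypothetical helpers, and the only quail call used is decode_speech with the path and keypath arguments shown above. The decode argument exists so the loop can be tried with a stand-in function before spending API credit.

```python
import os
from pathlib import Path


def resolve_keyfile(keypath=None):
    """Return the keyfile path from the argument or the environment.

    Hypothetical helper: prefers an explicit keypath, then falls back to
    the GOOGLE_APPLICATION_CREDENTIALS environment variable.
    """
    path = keypath or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError(
            "No keyfile given: pass keypath or set GOOGLE_APPLICATION_CREDENTIALS"
        )
    if not Path(path).is_file():
        raise FileNotFoundError(f"Keyfile not found: {path}")
    return path


def transcribe_folder(folder, decode=None, keypath=None):
    """Transcribe every .wav file in a folder (hypothetical helper).

    Returns a dict mapping each file name to whatever decode returns.
    """
    if decode is None:
        # Imported here so the helper can be tested without quail installed.
        import quail
        decode = quail.decode_speech
    keyfile = resolve_keyfile(keypath)
    results = {}
    # Sort so the files are processed in a predictable order.
    for wav in sorted(Path(folder).glob("*.wav")):
        results[wav.name] = decode(str(wav), keypath=keyfile)
    return results
```

For example, transcribe_folder('../data') would transcribe each .wav file in that folder using the keyfile named in the environment variable, and raise an error up front if the variable is unset or points to a missing file.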