quail.decode_speech

quail.decode_speech(path, keypath=None, save=False, speech_context=None, sample_rate=44100, max_alternatives=1, language_code='en-US', enable_word_time_offsets=True, return_raw=False)

Decode speech for a file or folder and return results

This function wraps the Google Speech API and ffmpeg to decode speech for free recall experiments. Note: for this to work, you must have a Google Speech account, a Google Speech credentials file referenced in your .bash_profile, and ffmpeg installed on your computer. See our Read the Docs page for setup instructions: http://cdl-quail.readthedocs.io/en/latest/.

Parameters:

path : str: Path to a wav file, or a folder of wav files.
keypath : str: Path to the Google Cloud Speech API key file. This is a JSON file containing the credentials generated when creating a service account key. If None, assumes a local key is set through an environment variable. See the speech decoding tutorial for details.
save : boolean: False by default. If set to True, saves a pickle with the results object from Google Speech and a text file with the decoded words.
speech_context : list of str: This allows you to give some context to the speech decoding algorithm. For example, this could be the words studied on a given list, or all words in an experiment.
sample_rate : float: The sample rate of your audio files (default is 44100).
max_alternatives : int: Maximum number of alternative guesses the speech decoding should return. These are saved in the results object (default is 1).
language_code : str: Decoding language code. Default is en-US. See here for more details: https://cloud.google.com/speech/docs/languages
enable_word_time_offsets : bool: Returns timing information (onsets/offsets) for each word (default is True).
return_raw : boolean: Instead of returning the parsed results object (i.e. the words), return the raw response object. This has more details about the decoding, such as confidence.
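
A minimal sketch of how the parameters above might be passed together. The helper names `build_decode_kwargs` and `decode_recall_audio` are hypothetical and not part of quail; only `quail.decode_speech` and its keyword arguments come from the signature documented here. `quail` is imported inside the wrapper so the argument-building step can be inspected without credentials or ffmpeg.

```python
def build_decode_kwargs(studied_words, keypath=None, sample_rate=44100):
    """Collect keyword arguments for quail.decode_speech (hypothetical helper)."""
    return {
        "keypath": keypath,                     # None: rely on the environment variable
        "save": False,                          # do not write pickle/text files
        "speech_context": list(studied_words),  # e.g. the words studied on a list
        "sample_rate": sample_rate,             # must match the wav files
        "max_alternatives": 1,
        "language_code": "en-US",
        "enable_word_time_offsets": True,
        "return_raw": False,                    # return parsed words, not the raw object
    }


def decode_recall_audio(path, studied_words, **overrides):
    """Decode a wav file or a folder of wav files (hypothetical wrapper)."""
    # Lazy import: the call below needs a Google Speech account,
    # a credentials file, and ffmpeg, as described above.
    import quail

    kwargs = build_decode_kwargs(studied_words)
    kwargs.update(overrides)
    return quail.decode_speech(path, **kwargs)
```

Under these assumptions, a call such as `decode_recall_audio('recordings/', ['cat', 'dog'], return_raw=True)` would request the raw response objects for every wav file in a (hypothetical) `recordings/` folder.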

Returns:

words : list of str, or list of lists of str: The results of the speech decoding. This will be a list if only one file is input, or a list of lists if more than one file is decoded.
raw : Google Speech object, or list of objects: Returned instead of the parsed results when the return_raw flag is set to True.
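
Because the shape of `words` depends on whether one file or several were decoded, downstream code may want a single shape. The sketch below is one way to do that; `as_list_of_lists` is a hypothetical helper, and it assumes the parsed results are exactly as documented above (str entries for a single file, lists for several files).

```python
def as_list_of_lists(words):
    """Return decoded words as one inner list per file (hypothetical helper).

    Assumes the documented shapes: a flat list of str for a single file,
    or a list of lists of str when several files were decoded.
    """
    # A flat list of strings means a single file was decoded.
    # An empty input is treated as one file with no decoded words.
    if all(isinstance(w, str) for w in words):
        return [list(words)]
    # Otherwise each element already holds the words for one file.
    return [list(file_words) for file_words in words]
```

With this, both `['cat', 'dog']` and `[['cat'], ['dog', 'bird']]` can be iterated file by file in the same loop.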