Speech-to-Text
Speech-to-text is the process of transcribing audio files into text.
This capability is provided by the “Speech to Text” plugin, which you need to install. Please see Installing plugins.
Dataiku provides several speech-to-text capabilities.
Native speech-to-text
The native speech-to-text capability of Dataiku transcribes English speech. It runs offline, meaning that it does not rely on a third-party API.
Warning
The underlying DeepSpeech library requires the following system libraries:
libstdc++6 >= 4.8.5
glibc >= 2.19
The required version of libstdc++6 is not installed by default on several Linux distributions. If that is the case, you will need sudo access to the server hosting your Dataiku instance in order to upgrade libstdc++6.
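As a quick way to check the glibc requirement before installing, the following minimal sketch compares the detected glibc version against the 2.19 minimum listed above. It relies only on the Python standard library; the helper names `parse_version` and `glibc_meets_minimum` are illustrative and are not part of Dataiku or DeepSpeech. It does not check libstdc++6.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.19' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_meets_minimum(found, minimum="2.19"):
    """Return True when the found glibc version is at least the minimum."""
    return parse_version(found) >= parse_version(minimum)


if __name__ == "__main__":
    # platform.libc_ver() returns a (library, version) pair, or empty
    # strings when the C library cannot be identified.
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("Could not detect glibc on this system")
    else:
        status = "OK" if glibc_meets_minimum(version) else "too old"
        print(f"glibc {version}: {status}")
```

Run this with the Python interpreter on the server hosting your Dataiku instance, since the result reflects the machine where the script executes.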
Download DeepSpeech model macro
This macro downloads the weights of the DeepSpeech pre-trained model into a folder in your project. Note that this model has been trained on American English speech data.
Speech to Text recipe
This recipe takes two inputs: the folder containing the DeepSpeech weights downloaded by the macro, and a folder containing audio files in .WAV format. The output is a dataset with two columns: the audio file path and the associated transcription.
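To illustrate the shape of the recipe's output, here is a minimal sketch that walks a folder of .WAV files and builds one row per file. It is not the plugin's implementation: the `transcribe` argument is a hypothetical caller-supplied function standing in for the speech model, and the column names `path` and `transcription` are assumptions chosen for the example.

```python
import wave
from pathlib import Path


def transcribe_folder(folder, transcribe):
    """Return one row per .wav file: its path and its transcription.

    `transcribe` is a hypothetical caller-supplied function that receives
    the raw PCM bytes and the sample rate, and returns the text.
    """
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() != ".wav":
            continue  # skip anything that is not a WAV file
        with wave.open(str(path), "rb") as audio:
            frames = audio.readframes(audio.getnframes())
            rate = audio.getframerate()
        rows.append({"path": str(path), "transcription": transcribe(frames, rate)})
    return rows
```

Passing the model as a function keeps the sketch independent of any particular speech library, so the folder traversal and row layout can be tried out with a stub before a real model is plugged in.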
AWS Transcribe
The AWS Transcribe integration provides speech-to-text extraction in 40 languages.
Please see NLP using AWS APIs for more details.