Try streaming speech recognition from microphone input with Google Cloud Speech API.
Previously I tried recognizing recorded files with the REST version of the API, so this time I will try streaming recognition with the gRPC version.
Follow the procedure in the README of the official Google sample.
This time I will try the streaming recognition script, `transcribe_streaming.py`.
The procedure is the same as for the REST version up to the point of obtaining the Service Account JSON file, which is referenced through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
Microphone input is handled in `record_audio`, which uses pyaudio.
$ python transcribe_streaming.py
Run the script and speak into the microphone.
Once started, recognition continues as long as `service.StreamingRecognize` returns a value in `listen_print_loop`. (It ends with a timeout once `DEADLINE_SECS` seconds have elapsed.)
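To get a feel for how a stream of audio chunks can end either on request or on a deadline, here is a rough sketch that does not use pyaudio or gRPC at all. It assumes audio chunks arrive on a `queue.Queue` and that `None` marks the end of input. The names `audio_chunks`, `buffer`, and `poll_secs` are made up for illustration, and the deadline is enforced locally, so it only imitates the effect of `DEADLINE_SECS`. It is not the sample's actual code.

```python
import queue
import time


def audio_chunks(buffer, deadline_secs, poll_secs=0.1):
    """Yield chunks from `buffer` until a None sentinel arrives or the deadline passes.

    `buffer` is assumed to be a queue.Queue filled by some audio producer
    (hypothetical here; in a real program that could be a pyaudio callback).
    """
    stop_at = time.monotonic() + deadline_secs
    while True:
        remaining = stop_at - time.monotonic()
        if remaining <= 0:
            return  # local deadline reached, stop streaming
        try:
            # Never wait longer than the time left before the deadline.
            chunk = buffer.get(timeout=min(poll_secs, remaining))
        except queue.Empty:
            continue  # nothing arrived yet, check the deadline again
        if chunk is None:
            return  # sentinel: the producer has finished
        yield chunk
```

A generator like this could be handed to whatever builds the streaming requests, so the request stream closes cleanly instead of hanging when the microphone goes quiet.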
This sample stops processing when the utterance contains the word `exit` or `quit` (see the latter half of `listen_print_loop`), so you can change these to other words such as `stop` or `end`. With a similar change, the same works in Japanese.
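A minimal sketch of such a keyword check, assuming it is done with a regular expression on the transcript text. The function name `should_stop` and the Japanese words chosen here are only examples, not part of the sample. One point to watch: the word-boundary marker `\b` works for English, but Japanese text has no spaces between words, so the Japanese pattern is matched without it.

```python
import re

# Example stop words only; the sample itself checks for "exit" and "quit".
ENGLISH_STOP = re.compile(r"\b(stop|end)\b", re.IGNORECASE)
# No \b here: Japanese words are not separated by spaces,
# so a word boundary would almost never match inside a sentence.
JAPANESE_STOP = re.compile(r"(終了|ストップ)")


def should_stop(transcript):
    """Return True if the transcript contains one of the stop words."""
    return bool(ENGLISH_STOP.search(transcript) or JAPANESE_STOP.search(transcript))
```

Because of `\b`, an English word such as "weekend" does not trigger the stop, while the Japanese pattern matches anywhere in the sentence.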
- Until there is silence for a certain period of time, speech is recognized as one continuous utterance, even if it contains short pauses.
- Once an utterance is recognized, `is_final = True` and `confidence` are returned along with the resulting text.
- If you specify `interim_results = True` in `streaming_config`, you can get recognition results while the utterance is still in progress.
Recognition in the middle of an utterance seems to be done at the word level, and it is surprisingly fast for something that goes over the network. However, interim results may be wrong, so if you are not in a hurry, it is better to wait for the final result.
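The difference between interim and final results can be sketched as follows. The `Alternative`, `Result`, and `Response` classes are simple stand-ins I defined for illustration, using only the field names mentioned above (`is_final`, `confidence`) plus `transcript`, `alternatives`, and `results`. They are not the API's real message classes, and `final_transcripts` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:  # stand-in for one recognition hypothesis
    transcript: str
    confidence: float = 0.0


@dataclass
class Result:  # stand-in for one streaming result
    alternatives: List[Alternative]
    is_final: bool = False


@dataclass
class Response:  # stand-in for one streaming response
    results: List[Result] = field(default_factory=list)


def final_transcripts(responses):
    """Print interim results as they arrive and return only the final ones."""
    finals = []
    for response in responses:
        for result in response.results:
            if not result.alternatives:
                continue  # nothing recognized in this result
            best = result.alternatives[0]
            if result.is_final:
                finals.append((best.transcript, best.confidence))
            else:
                # Interim text may still change, so it is only displayed.
                print("interim:", best.transcript)
    return finals
```

With this split, interim text can be shown for responsiveness while any further processing only uses the results marked as final.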
See the gRPC API Manual (https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#google.cloud.speech.v1beta1.Speech.StreamingRecognize) for other options.
The code on GitHub is updated quite often, so it is worth checking regularly.
I tried it on both a Mac and Linux, with a laptop's built-in microphone and with an external USB microphone, but after about 3-10 utterances or 15-30 seconds, recognition stops without any error. This needs investigation.
Since it is v1beta1, it seems to still be in the testing stage. It seems difficult to use correctly unless you are used to gRPC (and to handling it from Python).