The Google API interface appears to have changed, so the steps in this article no longer work as written. If you are starting out with streaming speech recognition, please refer to the following article by @delete instead. https://qiita.com/delete/items/395776c6843d67fd65fd
The earlier article is below.
[I want to perform stream speech recognition using the Google Cloud Speech gRPC API! (With a simple VAD)] http://qiita.com/sayonari/items/a70118a468483967ad34
The steps in that article did not work well when I set up Google speech recognition in a new environment, so I am leaving notes from installing everything from scratch. I have not been able to re-verify them properly, so if you have any comments or corrections, please feel free to share them.
- Machine: MacBook Pro (Retina, 13-inch, Early 2015)
Google Cloud API dashboard: https://console.cloud.google.com/?hl=ja
Enabling the Speech API in the Google Cloud API dashboard is rather tedious, but there is plenty of information available, so please search for the procedure yourself.
Create a project with any name you like.
I made a project called GoogleCloudAPI-ASRtest.
Once the Speech API is available, an ID is displayed for the API, so make a note of it.
pip install google-cloud-speech
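To confirm that the package is visible to the interpreter you actually plan to run, a quick check such as the following may help. This is a minimal sketch using only the standard library; the helper name `is_importable` is made up for illustration and is not part of any Google library.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example, no `google` package at all)
        # raises instead of returning None, so treat it as "not importable".
        return False


if __name__ == "__main__":
    print("google.cloud.speech importable:", is_importable("google.cloud.speech"))
```

If this prints False even though pip reported success, pip and python probably belong to different environments (for example, different pyenv versions).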
https://cloud.google.com/sdk/docs/quickstart-mac-os-x?hl=ja
Run `install.sh`
gcloud init
You will be asked "You must log in to continue. Would you like to log in (Y/n)?", so enter Y.
The browser will open, so log in with the Google account registered with the API.
Under "Pick cloud project to use:", the API project names are listed with numbers, so select the project in which the Speech API is enabled.
When asked "Do you want to configure Google Compute Engine (https://cloud.google.com/compute) settings (Y/n)?", enter Y and select a server. I chose "[2] asia-east1-b".
pip install gcloud
Add the directory where the package was installed to PYTHONPATH.
In my case, it was as follows.
export PYTHONPATH="/Users/nishimura/.pyenv/versions/3.6.1/lib/python3.6/site-packages:$PYTHONPATH"
If you add this to ~/.bash_profile, it will be executed automatically every time, which is convenient.
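The site-packages path depends on the Python version and on how it was installed, so rather than copying the path above you may prefer to ask the interpreter for it. The following is a minimal sketch using the standard `sysconfig` module; the function names `site_packages_dir` and `export_line` are hypothetical helpers introduced here.

```python
import sysconfig


def site_packages_dir():
    """Return the directory where pip installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


def export_line(directory):
    """Build the shell line to append to ~/.bash_profile."""
    return 'export PYTHONPATH="{}:$PYTHONPATH"'.format(directory)


if __name__ == "__main__":
    # Run this with the same interpreter that pip installed into.
    print(export_line(site_packages_dir()))
```

Running the script prints an export line for the current interpreter, which can then be pasted into ~/.bash_profile.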
gcloud auth application-default login
The browser will open, so log in with your Google account and approve the application's access request.
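If the sample later complains about missing credentials, it can be useful to check whether the login step actually left a credentials file behind. The sketch below assumes the usual macOS/Linux layout, where `gcloud auth application-default login` writes `~/.config/gcloud/application_default_credentials.json`, and it also looks at the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The helper names are made up for illustration, and a customized gcloud configuration directory is not handled.

```python
import os


def adc_candidates(environ=None, home=None):
    """List files where Application Default Credentials are usually looked up.

    Assumes the macOS/Linux layout; Windows uses a different directory.
    """
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    candidates = []
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        # An explicitly configured key file is checked first.
        candidates.append(explicit)
    candidates.append(
        os.path.join(home, ".config", "gcloud",
                     "application_default_credentials.json")
    )
    return candidates


def find_adc(environ=None, home=None):
    """Return the first candidate file that exists, or None."""
    for path in adc_candidates(environ, home):
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print("Credentials file:", find_adc())
```

A result of None suggests that the login step did not complete, or that the credentials were stored somewhere this sketch does not look.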
https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/speech/cloud-client
If you run transcribe_streaming_mic.py, it recognizes speech in English.
To switch the recognition language, set it directly in the program, at the place in the main function where config is defined.
language_code='ja-JP'
To supply phrase hints, you can try rewriting the config in the main function like this:
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code='ja-JP',
    # Phrase hints go in the `phrases` field of SpeechContext.
    speech_contexts=[speech.types.SpeechContext(
        phrases=["Kita has come", "Really"]
    )]
)
However, since a "reading" (pronunciation) cannot be given, hints written in kanji are not recognized well. Sorry. If anyone knows how to supply a reading, please let me know m(_ _)m
Official manual https://media.readthedocs.org/pdf/google-cloud-python/latest/google-cloud-python.pdf