This is my first article.
Julius is a speech recognition system developed at Kyoto University. Its accuracy is not great, but it is free and runs offline.
The existing articles I found on using Julius from Python all relied on socket communication, which I found annoying. So I wrote a class that runs the Julius command directly from Python and parses the result.
The source code is on GitHub: https://github.com/rirma/JuliusVoiceAnalyzer.git
Please note that some libraries you may not need are installed as well. If you don't need them, delete the pip3 install lines at the bottom.
① In the downloaded folder, run the following command to build and start the Docker container.
$ docker-compose up -d --build
② Enter the created container.
$ docker-compose exec python bash
③ Move to the folder where VoiceAnalyzer.py is located and execute the program.
# cd opt/public/src
# python3 VoiceAnalyzer.py
④ Say something. If you get output like the following, it worked.
start record
Saved.
20201005220733.wav
enter filename->...........................................................................................................................................................................................................enter filename->1 files processed
Good morning.
VoiceAnalyzer.py defines a class that makes speech recognition with Julius easier.
def __init__(self, chunk = 1024, format = pyaudio.paInt16, channels = 1, rate = 44100, record_seconds = 2, threshold = 0.1)
Parameters to change as needed:
chunk: chunk size for the audio data
channels: number of audio channels
rate: sampling rate (Hz)
record_seconds: recording length (seconds)
threshold: loudness (0 to 1) at which recording starts; this prevents background noise from triggering a recording
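The article does not show how threshold is applied, so the following is only a minimal sketch of one way a 0 to 1 loudness check could work, assuming 16-bit PCM chunks (as with pyaudio.paInt16) and peak amplitude as the loudness measure. The function names peak_level and should_start_recording are hypothetical and are not taken from VoiceAnalyzer.py.

```python
import array


def peak_level(chunk_bytes: bytes) -> float:
    """Return the peak amplitude of 16-bit PCM data, scaled to 0.0-1.0."""
    samples = array.array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(chunk_bytes)
    if not samples:
        return 0.0
    # 32768 is the largest magnitude a signed 16-bit sample can have.
    return max(abs(s) for s in samples) / 32768.0


def should_start_recording(chunk_bytes: bytes, threshold: float = 0.1) -> bool:
    """Assumed rule: start recording once a chunk's peak exceeds the threshold."""
    return peak_level(chunk_bytes) > threshold
```

With a rule like this, each chunk read from the microphone would be checked until one is loud enough, and only then would the record_seconds recording begin.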
def start_record(self, dir_name = '../sound/')
dir_name: directory where the audio file is saved
Return value: name of the saved file (without the directory name)
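The sample output above shows a file name like 20201005220733.wav, which looks like a timestamp. As a rough sketch under that assumption, recorded frames could be saved as shown below with the standard wave module. The helper save_wav and its now parameter are hypothetical, and the real start_record may work differently.

```python
import os
import wave
from datetime import datetime


def save_wav(frames, dir_name="../sound/", channels=1, sample_width=2,
             rate=44100, now=None):
    """Write raw PCM frames to <dir_name>/<YYYYmmddHHMMSS>.wav and return the file name."""
    now = now or datetime.now()
    file_name = now.strftime("%Y%m%d%H%M%S") + ".wav"
    os.makedirs(dir_name, exist_ok=True)
    with wave.open(os.path.join(dir_name, file_name), "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))
    return file_name  # directory name omitted, as in the description above
```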
def analyze_voice(self, file_path)
file_path: file to analyze (e.g. '../sound/20201005220733.wav')
Return value: speech recognition result for the file, as a string
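The body of analyze_voice is not shown here, so the following is only a sketch of the general idea of running Julius as a command instead of talking to it over a socket. It assumes julius is on the PATH, that a jconf file exists at the hypothetical path main.jconf, and that results appear on lines starting with "sentence1:". The helper names extract_sentences and analyze_wav are also hypothetical.

```python
import subprocess


def extract_sentences(julius_output: str) -> list:
    """Collect the text of every 'sentence1:' line in Julius's output."""
    results = []
    for line in julius_output.splitlines():
        line = line.strip()
        if line.startswith("sentence1:"):
            text = line[len("sentence1:"):]
            # Some configurations print sentence boundary markers; drop them.
            for marker in ("<s>", "</s>"):
                text = text.replace(marker, "")
            results.append(text.strip())
    return results


def analyze_wav(file_path: str, jconf_path: str = "main.jconf") -> str:
    """Run Julius on one file and return the recognized text (assumed setup)."""
    proc = subprocess.run(
        ["julius", "-C", jconf_path, "-input", "rawfile"],
        input=file_path + "\n",  # answers the "enter filename->" prompt
        capture_output=True,
        text=True,  # decodes with the locale encoding; adjust if Julius uses another
    )
    return "".join(extract_sentences(proc.stdout))
```

Keeping the parsing in its own function means it can be checked against saved Julius output without running the recognizer.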
Julius isn't very accurate, but I think it's useful when you want to create your own wake word, such as "Hey Siri". I spoke the wake word I wanted Julius to recognize many times and listed every variant it returned, which absorbs the recognition errors. I hope you find it useful for interactive apps.
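The idea of listing recognition variants can be sketched roughly as follows. The phrases in WAKE_PHRASES are made-up placeholders, not the author's actual list, and normalize and is_wake_phrase are hypothetical helper names.

```python
# Hypothetical variants that the recognizer might return for one wake word.
WAKE_PHRASES = {"hey julius", "hey julie us", "a julius"}


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and strip surrounding punctuation."""
    return " ".join(text.lower().split()).strip(" .,!?。、")


def is_wake_phrase(recognized: str, phrases=WAKE_PHRASES) -> bool:
    """Return True if the recognized text matches any listed variant."""
    return normalize(recognized) in {normalize(p) for p in phrases}
```

In practice the list would be filled in by saying the wake word repeatedly and copying whatever strings Julius actually returns.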