On Windows you can easily have text read aloud with tools such as SofTalk, but on Linux those tools will not run unless you wrestle with Wine. The alternative is to call AquesTalk, the engine that SofTalk uses, directly.
If you just want to synthesize speech, it is easier to call Google Translate's text-to-speech API with gTTS. When I tried to install open-jtalk, it conflicted with packages in my environment; resolving that looked troublesome, so I have not tried it.
This article uses the AquesTalk2 library from C and Python code to synthesize speech.
Download the evaluation version from the "Download | Aquest Co., Ltd." page. This article uses AquesTalk2. I have not confirmed whether the same steps work with AquesTalk or AquesTalk10.
Note that in the evaluation version, every sound in the na-row and ma-row (な行, ま行) is pronounced as "nu".
Install the library according to the manual included in the downloaded archive. Adjust the library version number and the choice between lib and lib64 to match your environment.
$ cd aqtk2-lnx-eva/lib64
$ sudo cp libAquesTalk2Eva.so.2.3 /usr/lib
$ sudo ln -sf /usr/lib/libAquesTalk2Eva.so.2.3 /usr/lib/libAquesTalk2Eva.so.2
$ sudo ln -sf /usr/lib/libAquesTalk2Eva.so.2 /usr/lib/libAquesTalk2Eva.so
$ sudo /sbin/ldconfig -n /usr/lib
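Before moving on, it can be worth checking that the dynamic loader can actually open the installed library. The following is a minimal sketch using Python's standard `ctypes` module; the helper name `can_load` is hypothetical, and only the file name `libAquesTalk2Eva.so` comes from the steps above.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    name = "libAquesTalk2Eva.so"
    print(name, "loads OK" if can_load(name) else "could not be loaded")
```

If this reports a failure, rechecking the copy destination and the symbolic links is a reasonable first step.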
Compile and run `aqtk2-lnx-eva/samples/SampleTalk.c`, following the attached manual.
The library provides a separate speech synthesis function for each character encoding, so modify SampleTalk.c to match the encoding of your terminal or input text. UTF-8 is used here. If the function called does not match the encoding of the input, synthesis will not work properly.
// unsigned char *wav = AquesTalk2_Synthe_Euc(str, 100, &size, NULL);
unsigned char *wav = AquesTalk2_Synthe_Utf8(str, 100, &size, NULL);
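To see why the function has to match the encoding, note that the same Japanese text becomes a different byte sequence in EUC-JP and in UTF-8, so bytes produced for one cannot be interpreted as the other. A small Python illustration (the sample string is arbitrary and chosen only for this sketch):

```python
text = "ゆっくり"

utf8_bytes = text.encode("utf-8")
euc_bytes = text.encode("euc_jp")

# The two encodings produce byte sequences of different lengths for the same text.
print(len(utf8_bytes), len(euc_bytes))

# Bytes written in one encoding do not decode back to the same text in the other.
try:
    decoded = utf8_bytes.decode("euc_jp")
except UnicodeDecodeError:
    decoded = None
print(decoded == text)
```

The C library receives only the raw bytes, so it has no way to detect such a mismatch on its own.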
Compile with the library and header path specified.
$ g++ -o SampleTalk samples/SampleTalk.c -lAquesTalk2Eva -Ilib64
The sample reads text from standard input and writes WAV data to standard output. With the evaluation version, the input below (ゆっくりしていってね, "yukkuri shite itte ne") is read out as "yukkuri shite itte **nu**".
$ echo "ゆっくりしていってね" | ./SampleTalk > sample.wav
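A quick way to confirm that the output is a valid WAV file is to read its header. The sketch below uses Python's standard `wave` module; `describe_wav` is a hypothetical helper name introduced only for this illustration.

```python
import io
import wave


def describe_wav(data):
    """Return (channels, sample_width_bytes, frame_rate_hz, duration_seconds)
    for WAV data given as bytes."""
    with wave.open(io.BytesIO(data), "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return w.getnchannels(), w.getsampwidth(), rate, frames / rate
```

For the file written above, this could be called as `describe_wav(open("sample.wav", "rb").read())`. If the data is not valid WAV, `wave.open` raises `wave.Error`.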
Load the library with ctypes and call it from Python. You can change the voice quality by specifying one of the phont files bundled with the evaluation version. If none is specified, the default voice is used, which is fairly easy to understand.
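from ctypes import POINTER, byref, c_char_p, c_int, c_ubyte, cdll, string_at


def synthe_utf8(text, speed=100, file_phont=None):
    # Load the optional phont (voice) file; None selects the default voice.
    if file_phont is not None:
        with open(file_phont, 'rb') as f:
            phont = f.read()
    else:
        phont = None

    aqtk = cdll.LoadLibrary("libAquesTalk2Eva.so")
    # Declare the C signatures so that pointers are not truncated to int.
    aqtk.AquesTalk2_Synthe_Utf8.argtypes = [c_char_p, c_int, POINTER(c_int), c_char_p]
    aqtk.AquesTalk2_Synthe_Utf8.restype = POINTER(c_ubyte)
    aqtk.AquesTalk2_FreeWave.argtypes = [POINTER(c_ubyte)]
    aqtk.AquesTalk2_FreeWave.restype = None

    size = c_int(0)
    wav_p = aqtk.AquesTalk2_Synthe_Utf8(text.encode('utf-8'), speed, byref(size), phont)
    if not wav_p:
        # On failure the pointer is NULL and size holds the error code.
        print("ERR:", size.value)
        return None

    # Copy the WAV data into Python-owned bytes, then free the library's buffer.
    wav = string_at(wav_p, size.value)
    aqtk.AquesTalk2_FreeWave(wav_p)
    return wav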

if __name__ == '__main__':
    # Default voice
    wav = synthe_utf8("ゆっくりしていってね", speed=100)
    if wav is not None:
        with open('./default.wav', 'wb') as f:
            f.write(wav)

    # Voice changed with a phont file bundled with the evaluation version
    wav = synthe_utf8("ゆっくりしていってね", speed=100,
                      file_phont='aqtk2-lnx-eva/phont/aq_yukkuri.phont')
    if wav is not None:
        with open('./yukkuri.wav', 'wb') as f:
            f.write(wav)
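To compare the bundled voices, one option is to loop over every phont file in the package. This is only a rough sketch: `list_phont_files` is a hypothetical helper, and the `.phont` extension and the `aqtk2-lnx-eva/phont` directory are taken from the path used above.

```python
from pathlib import Path


def list_phont_files(directory):
    """Return the .phont files in `directory`, sorted by file name."""
    return sorted(Path(directory).glob("*.phont"))


if __name__ == "__main__":
    for phont_path in list_phont_files("aqtk2-lnx-eva/phont"):
        # synthe_utf8() from the script above could be called here with
        # file_phont=str(phont_path), writing one WAV file per voice.
        print(phont_path.stem)
```

Using the file stem as the output name (for example `aq_yukkuri.wav`) would keep each result easy to match to its voice.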
In the evaluation version the na-row and ma-row sounds are read as "nu"; to have them read correctly you need to obtain a license. For personal use, a development license costs less than 2,000 yen (as of June 9, 2018). See the following page for details such as distribution rules: Personal license | Aquest Co., Ltd.