This is my first post. I'm Denki Sheep from Golden Bridge. My main work is Chinese translation, and programming is my hobby.
I've dabbled in various languages, but I'm still a Sunday programmer. I'd welcome plenty of blunt feedback from professionals!
Recently, because of the coronavirus, there has been special demand for online interpreting. Since a live connection can become unstable, the content is often recorded as video in advance. As a result, I've been handling not only translation but also **subtitles** more and more.
Until now the volume was small, so I did it by hand, but I took this as an opportunity to generate SRT files automatically and save labor.
SRT is a de facto standard format for attaching subtitles to videos. Its contents are very simple.
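1
00:00:01,050 --> 00:00:05,778
First subtitle

2
00:00:07,850 --> 00:00:11,123
Next subtitle

3
00:01:02,566 --> 00:01:12,456
Third subtitle

As shown, each block consists of an index, a time range (start --> end), and the subtitle text, and blocks are separated by a blank line.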
With this file, you can import subtitles into a video file at once.
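To make the structure concrete, here is a minimal sketch that reads SRT text back into records. The names `Cue`, `to_ms`, and `parse_srt` are my own for illustration, and the sketch assumes well-formed input like the sample above (an index line, a time line, then text, with blank lines between blocks).

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start_ms: int
    end_ms: int
    text: str


TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an 'HH:MM:SS,mmm' timestamp to milliseconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    h, m, s, ms = (int(g) for g in match.groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # an index line and a time line are required
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0]), to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues


SAMPLE = """1
00:00:01,050 --> 00:00:05,778
First subtitle

2
00:00:07,850 --> 00:00:11,123
Next subtitle
"""

cues = parse_srt(SAMPLE)
for cue in cues:
    print(cue)
```

A parser like this is also handy for checking that a generated file still has the index, time range, and text in the expected places.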
We use AmiVoice Cloud Platform for Japanese transcription drafts.
Because it is made in Japan, I find it more accurate for Japanese than Google and similar services.
Writing everything from scratch is hard, so I just tweak the [sample program](https://acp.amivoice.com/main/manual/%e3%82%b5%e3%83%b3%e3%83%97%e3%83%ab%e3%83%97%e3%83%ad%e3%82%b0%e3%83%a9%e3%83%a0/).
Running this sample program returns the speech recognition result as JSON. For details, see
[Returned JSON (AmiVoice)](https://acp.amivoice.com/main/manual/if%E4%BB%95%E6%A7%98-http%E9%9F%B3%E5%A3%B0%E8%AA%8D%E8%AD%98api%E8%A9%B3%E7%B4%B0/#response)
| Key (level 1) | Key (level 2) | Key (level 3) | Description |
|---|---|---|---|
| results | | | Array of "recognition results for each utterance section" |
| | confidence | | Confidence (a value between 0 and 1; 0: low confidence, 1: high confidence) |
| | starttime | | Utterance start time (the beginning of the audio data is 0) |
| | endtime | | Utterance end time (the beginning of the audio data is 0) |
| | tags | | Unused (empty array) |
| | rulename | | Unused (empty string) |
| | text | | Recognition result text |
| | tokens | | Array of morphemes of the recognition result text |
| | | written | Written form of the morpheme (word) |
| | | confidence | Morpheme confidence (likelihood of the recognition result) |
| | | starttime | Morpheme start time (the beginning of the audio data is 0) |
| | | endtime | Morpheme end time (the beginning of the audio data is 0) |
| | | spoken | Reading of the morpheme |
| utteranceid | | | Recognition result information ID |
| text | | | Overall recognition result text that combines all of the "recognition results for each utterance section" |
| code | | | One-letter code representing the result (see the list of code and message values contained in the JSON) |
| message | | | Character string describing the error (see the list of code and message values contained in the JSON) |
From this JSON, I use the starttime, endtime, and written fields of each entry in tokens and arrange them in SRT format.
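As a rough illustration of which fields matter, the sketch below pulls those three values out of a heavily trimmed response. The sample JSON is made up for this example (a real response carries many more keys), and the times are assumed to be milliseconds, which is how the conversion script treats them.

```python
import json

# Hypothetical, heavily trimmed response for illustration only.
SAMPLE_JSON = """
{
  "results": [
    {
      "text": "こんにちは。",
      "tokens": [
        {"written": "こんにちは", "starttime": 500, "endtime": 1200},
        {"written": "。", "starttime": 1200, "endtime": 1300}
      ]
    }
  ],
  "text": "こんにちは。"
}
"""


def extract_tokens(raw):
    """Return (written, starttime, endtime) for each token of the first result."""
    data = json.loads(raw)
    tokens = data["results"][0]["tokens"]
    return [(t["written"], t["starttime"], t["endtime"]) for t in tokens]


rows = extract_tokens(SAMPLE_JSON)
for written, start, end in rows:
    print(f"{start:>6} {end:>6} {written}")
```

Everything else in the response (confidence, spoken, and so on) is ignored for subtitle purposes.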
Once you have the JSON, the conversion can start right away. As conditions for splitting subtitle blocks, I use things like:

- Punctuation (、 and 。)
- Time (milliseconds)

The script also caps the number of characters per subtitle. Subtitles normally don't contain punctuation, so those marks are skipped.
All of these settings are exposed as command-line arguments to keep things flexible.
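The splitting rule on its own can be sketched as a small pure function. This is only an illustration under my own assumptions: the function name `split_tokens`, its parameter names, and the sample tokens are made up, and the sketch appends the current token before testing the limits so that a token that triggers a break is not lost.

```python
def split_tokens(tokens, delimiters=("。", "、"), skip=("。", "、"), max_ms=5000, max_chars=25):
    """Group tokens into (start_ms, end_ms, text) subtitles."""
    subtitles = []
    text, start = "", 0
    for token in tokens:
        written = token["written"]
        is_delimiter = written in delimiters
        if not text:
            if is_delimiter or written in skip:
                continue  # punctuation never opens a subtitle
            start = token["starttime"]
        if not is_delimiter and written not in skip:
            text += written
        too_long = len(text) > max_chars or token["endtime"] - start > max_ms
        if is_delimiter or too_long:
            subtitles.append((start, token["endtime"], text))
            text = ""
    if text:  # flush whatever is left after the last token
        subtitles.append((start, tokens[-1]["endtime"], text))
    return subtitles


# Made-up tokens for illustration only
tokens = [
    {"written": "今日", "starttime": 0, "endtime": 400},
    {"written": "は", "starttime": 400, "endtime": 500},
    {"written": "、", "starttime": 500, "endtime": 600},
    {"written": "晴れ", "starttime": 600, "endtime": 1000},
    {"written": "です", "starttime": 1000, "endtime": 1300},
    {"written": "。", "starttime": 1300, "endtime": 1400},
]
subtitles = split_tokens(tokens)
for start, end, text in subtitles:
    print(start, end, text)
```

With the default limits, only the punctuation causes breaks here; lowering `max_chars` or `max_ms` makes the other two conditions kick in.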
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument("file", help="JSON file to read")
parser.add_argument("-d", "--delimiters", help="Comma-separated tokens that end a subtitle. Default: 。,、", default="。,、")
parser.add_argument("-s", "--skip", help="Comma-separated tokens to leave out of subtitles. Default: 。,、", default="。,、")
parser.add_argument("-t", "--time", help="Maximum duration of a single subtitle in milliseconds. Default: 5000", default=5000, type=int)
parser.add_argument("-c", "--charas", help="Maximum number of characters in a single subtitle. Default: 25", default=25, type=int)


class SRTFormat():
    def __init__(self, args):
        self.text = ""
        self.blocks = []
        self.delimiters = args.delimiters.split(",")
        self.skipWords = args.skip.split(",")
        self.time = args.time
        self.charas = args.charas

    def readFile(self, file):
        with open(file, "r", encoding="utf-8") as f:
            contents = f.read()
        # Only the first entry of "results" is used
        data = json.loads(contents)["results"][0]
        self.text = data["text"]
        self.readTokens(data["tokens"])

    def readTokens(self, tokens):
        sub = ""
        startTime = 0
        index = 1
        for token in tokens:
            written = token["written"]
            isDelimiter = written in self.delimiters
            # Set startTime when a new subtitle begins
            if sub == "":
                # Punctuation and skip words never start a subtitle
                if isDelimiter or written in self.skipWords:
                    continue
                startTime = token["starttime"]
            # Append the token first so that a token that triggers a break is not lost
            if not isDelimiter and written not in self.skipWords:
                sub += written
            # Close the block on a delimiter, or when the length or duration limit is exceeded
            if isDelimiter or len(sub) > self.charas or token["endtime"] - startTime > self.time:
                self.blocks.append(self.createSRTBlock(index, startTime, token["endtime"], sub))
                sub = ""
                startTime = 0
                index += 1
        # Store the last block if any text is left after the loop
        if sub:
            self.blocks.append(self.createSRTBlock(index, startTime, tokens[-1]["endtime"], sub))

    def createSRTBlock(self, index, startTime, endTime, sub):
        stime = self.timeFormat(startTime)
        etime = self.timeFormat(endTime)
        return f"{index}\n{stime} --> {etime}\n{sub}\n"

    def timeFormat(self, time):
        # Convert milliseconds to the SRT timestamp HH:MM:SS,mmm
        sec_, ms_ = divmod(int(time), 1000)
        mn_, sec_ = divmod(sec_, 60)
        hr_, mn_ = divmod(mn_, 60)
        return f"{hr_:02d}:{mn_:02d}:{sec_:02d},{ms_:03d}"

    def exportSRTText(self):
        # Each block ends with a newline, so joining with "\n" leaves a blank line between blocks
        return "\n".join(self.blocks)


if __name__ == "__main__":
    args = parser.parse_args()
    if not args.file.endswith(".json"):
        print("Please specify a JSON file")
    else:
        srt = SRTFormat(args)
        srt.readFile(args.file)
        text = srt.exportSRTText()
        # Swap only the trailing ".json" for ".srt"
        srtName = args.file[:-len(".json")] + ".srt"
        with open(srtName, "w", encoding="utf-8") as f:
            f.write(text)
        print("done")
With that, the JSON has been converted to SRT format.
Finally, all that is left is to adjust or translate the generated SRT file and import it into your video editing software. In DaVinci Resolve, I could simply drop it from the media pool onto the video track.
Compared with the manual work I have been doing, this should be a considerable efficiency gain!
- I want to hook this up to machine translation!
- Japanese proofreading in Word etc. → I'm looking for a way to re-import the result.
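One possible direction for both items, sketched under my own assumptions: export the subtitle text as one line per block, edit or translate those lines elsewhere, then merge them back onto the original timings. The helper names `split_blocks`, `export_text`, and `import_text` are hypothetical, and the sketch assumes every block has text and that the editor keeps exactly one line per subtitle.

```python
import re


def split_blocks(srt_text):
    """Return a list of [index_line, time_line, text] per SRT block."""
    blocks = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3:
            blocks.append([lines[0], lines[1], " ".join(lines[2:])])
    return blocks


def export_text(srt_text):
    """One subtitle per line, ready to paste into Word or a translation tool."""
    return "\n".join(text for _, _, text in split_blocks(srt_text))


def import_text(srt_text, edited_text):
    """Rebuild the SRT with the edited lines, keeping the original timings."""
    blocks = split_blocks(srt_text)
    edited = edited_text.splitlines()
    if len(edited) != len(blocks):
        raise ValueError("line count changed; cannot match lines to timings")
    return "\n".join(f"{i}\n{t}\n{new}\n" for (i, t, _), new in zip(blocks, edited))


srt = (
    "1\n00:00:01,050 --> 00:00:05,778\nFirst subtitle\n\n"
    "2\n00:00:07,850 --> 00:00:11,123\nNext subtitle\n"
)
plain = export_text(srt)
rebuilt = import_text(srt, "最初の字幕\n次の字幕")
print(plain)
print(rebuilt)
```

The line-count check is the important part: if proofreading merges or splits lines, the timings can no longer be matched automatically and need a manual look.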
Good luck with creating subtitles for the New Normal era!