Have you ever wanted to use **back translation** (also called reverse translation) in Python for **data augmentation**, for example in an NLP competition?
For example, the 1st place solution to Kaggle's Toxic Comment Classification Challenge used this technique. https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/discussion/52557
In this article, I will explain how to easily perform back translation using **machine translation** in Python.
Example of back translation using machine translation. Source: https://amitness.com/2020/05/data-augmentation-for-nlp/
With **googletrans**, you can back translate easily without an API key.
This article assumes Python 3.
$ pip install googletrans
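from googletrans import Translator

def BackTranslation(text, original_lang, via_lang):
    translator = Translator()
    # Translate into the intermediate language first, then back into the original language.
    via_text = translator.translate(text, dest=via_lang).text
    return translator.translate(via_text, dest=original_lang).text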
Pass the text to translate as text, the language of that text as original_lang, and the intermediate language to translate through as via_lang.
For the language codes that can be passed as original_lang and via_lang, refer to the googletrans documentation. https://py-googletrans.readthedocs.io/en/latest/
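As a small sketch of how the two language arguments might be checked before calling the translator: googletrans exposes a `LANGUAGES` dictionary that maps language codes to language names. The helper `check_lang_codes` and the `SAMPLE_LANGUAGES` table below are hypothetical names introduced only for illustration; in practice the table would be replaced by `googletrans.LANGUAGES`.

```python
# Illustrative subset of language codes; an assumption made for this sketch.
# With googletrans installed, the full table is available as:
#     from googletrans import LANGUAGES
SAMPLE_LANGUAGES = {"en": "english", "ja": "japanese", "fr": "french", "de": "german"}


def check_lang_codes(original_lang, via_lang, supported=SAMPLE_LANGUAGES):
    """Raise ValueError if either code is unknown or both codes are the same."""
    for code in (original_lang, via_lang):
        if code.lower() not in supported:
            raise ValueError(f"unsupported language code: {code!r}")
    if original_lang.lower() == via_lang.lower():
        raise ValueError("via_lang must differ from original_lang")


check_lang_codes("en", "ja")  # valid pair, so nothing is raised
```

Failing early on a mistyped code avoids spending two translation requests on input that cannot produce a useful result.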
Let's back translate the English sentence "The destiny of man is in his own soul." via Japanese.
text = "The destiny of man is in his own soul."
BackTranslation(text, "en", "ja")
The return value (the result of back translation) is as follows.
Result of back translation
'The fate of man lies in his own soul.'
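For data augmentation, the same call can be repeated over several intermediate languages to collect more than one paraphrase per sentence. The following is a minimal sketch of that idea, not part of the original recipe: `augment_by_back_translation` and `fake_back_translate` are hypothetical names, and the stand-in translator returns canned strings so the example runs without network access. In real use, the `BackTranslation` function defined earlier would be passed instead.

```python
def augment_by_back_translation(text, original_lang, via_langs, back_translate):
    """Return distinct back-translated variants of text, one attempt per language.

    back_translate is any callable (text, original_lang, via_lang) -> str,
    for example the BackTranslation function from this article.
    """
    variants = []
    for via_lang in via_langs:
        candidate = back_translate(text, original_lang, via_lang)
        # Keep only paraphrases that differ from the input and from each other.
        if candidate != text and candidate not in variants:
            variants.append(candidate)
    return variants


# Stand-in translator so the sketch runs offline (an assumption for illustration).
def fake_back_translate(text, original_lang, via_lang):
    canned = {"ja": "The fate of man lies in his own soul."}
    return canned.get(via_lang, text)


variants = augment_by_back_translation(
    "The destiny of man is in his own soul.", "en", ["ja", "fr"], fake_back_translate
)
print(variants)
```

Round trips that return the input unchanged add no new training signal, which is why they are filtered out here.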
Printing the intermediate Japanese translation gives the following (rendered in English here).
Intermediate language
Human destiny lies in his own soul.
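To inspect the intermediate text as well as the final result, the two translation steps can be kept separate and both values returned. The sketch below assumes a hypothetical helper named `back_translate_with_pivot` that takes the translation step as a callable; with googletrans, and assuming the synchronous `Translator.translate` call used in this article, that callable could be `lambda t, dest: Translator().translate(t, dest=dest).text`. A tagging stand-in is used here so the example runs offline.

```python
def back_translate_with_pivot(text, original_lang, via_lang, translate):
    """Return (intermediate_text, round_trip_text).

    translate is any callable (text, dest) -> str that translates text
    into the language identified by dest.
    """
    intermediate = translate(text, via_lang)              # original -> intermediate
    round_trip = translate(intermediate, original_lang)   # intermediate -> original
    return intermediate, round_trip


# Stand-in for a real translator: it only tags the text with the target code.
def tagging_translate(text, dest):
    return f"[{dest}] {text}"


intermediate, round_trip = back_translate_with_pivot(
    "hello", "en", "ja", tagging_translate
)
print(intermediate)
print(round_trip)
```

Returning both values makes it easier to spot cases where the intermediate translation is already poor, since those tend to produce noisy augmented samples.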
A Visual Survey of Data Augmentation in NLP https://amitness.com/2020/05/data-augmentation-for-nlp/
Googletrans: Free and Unlimited Google translate API for Python https://py-googletrans.readthedocs.io/en/latest/
Is reverse translation an alchemist of machine translation? http://deeplearning.hatenablog.com/entry/back_translation