Note that following the official tutorial as-is did not work for me. This article explains how to launch a Cloud TPU and a GCE VM instance and build an English-Japanese translation model with Transformer, one of the NMT (neural machine translation) models.
- You have already created a project on Google Cloud Platform
- Billing is enabled for the created project
Cloud Console
Enter the following in the Cloud Console to launch a new Cloud TPU and GCE VM instance.
cloud_console
# Set the project ID
gcloud config set project <project_id>
# Start ctpu (the TPU name is "transformer")
# This also launches a GCE VM instance
ctpu up --name=transformer --tf-version=1.14
The official tutorial starts the TPU with a plain ctpu up, but the Cloud TPU's TensorFlow version then does not match the default TensorFlow installed on the GCE VM instance, so an error occurs if you follow the tutorial as-is. To make it work, you need to align the Cloud TPU TensorFlow version with that of the GCE VM instance, which is why --tf-version=1.14 is passed above.
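As a sanity check, you can print the TensorFlow version installed on the VM and confirm that it matches the --tf-version passed to ctpu up (a minimal sketch; run it in a Python shell on the instance):
python
# Print the TensorFlow version installed on the GCE VM instance.
# It should match the --tf-version passed to ctpu up (1.14 here).
import tensorflow as tf

print(tf.__version__)  # e.g. "1.14.0"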
GCE (Google Compute Engine)
We will explain the procedure for training and running inference with the Transformer model on your own dataset (English-Japanese translation) stored in GCS (Google Cloud Storage).
Below, we work in an SSH session on the GCE VM instance created by ctpu up, with the following directory layout:
.
├── src
│   ├── __init__.py
│   └── myproblem.py
└── tmp
    └── t2t_tmp
        └── sample.pickle
Copy your dataset from GCS into the working directory:
gsutil cp gs://<bucket_name>/sample.pickle ./tmp/t2t_tmp/sample.pickle
Here, sample.pickle is a pickled two-column data frame whose columns are english (English) and japanese (Japanese).
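For reference, the following is a minimal sketch of how such a file could be prepared (pandas, the sample sentences, and the column names are assumptions; the Problem below only relies on the first column being English and the second Japanese):
python
# Sketch: build a two-column data frame (English, Japanese) and pickle it.
# pandas and the column names are assumptions; adapt this to your own data.
import pandas as pd

df = pd.DataFrame(
    [["I have a pen.", "私はペンを持っています。"],
     ["Good morning.", "おはようございます。"]],
    columns=["english", "japanese"])
df.to_pickle("sample.pickle")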
If you want to use your own dataset, you need to implement and register your own Problem.
Reference: https://tensorflow.github.io/tensor2tensor/new_problem.html
Here, create the following two Python scripts.
python:./src/__init__.py
from . import myproblem
python:./src/myproblem.py
import pickle

import numpy as np
from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class Translate_JPEN(text_problems.Text2TextProblem):

    @property
    def approx_vocab_size(self):
        # Roughly 8k subword vocabulary
        return 2**13

    @property
    def is_generate_per_split(self):
        # Generate all data at once and let t2t split it into train/eval
        return False

    @property
    def dataset_splits(self):
        return [{
            "split": problem.DatasetSplit.TRAIN,
            "shards": 9,
        }, {
            "split": problem.DatasetSplit.EVAL,
            "shards": 1,
        }]

    def generate_samples(self, data_dir, tmp_dir, dataset_split):
        # Each row of the pickled data frame is (English, Japanese)
        with open('./tmp/t2t_tmp/sample.pickle', 'rb') as fin:
            sentences = pickle.load(fin)
        for row in np.array(sentences):
            yield {'inputs': row[0], 'targets': row[1]}
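The registry derives the problem name by converting the class name to snake_case, so Translate_JPEN is registered as translate_jpen, which is the value used for $PROBLEM below. As a quick check that the problem is registered (a sketch, assuming it is run from the directory containing src/):
python
# Importing the module registers the problem with tensor2tensor's registry;
# look it up under its snake_case name to confirm it matches $PROBLEM.
from tensor2tensor.utils import registry

from src import myproblem  # noqa: F401

print(registry.problem("translate_jpen"))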
# Set environment variables
export STORAGE_BUCKET=gs://<bucket_name>
export DATA_DIR=$STORAGE_BUCKET/transformer
export TMP_DIR=/tmp/t2t_tmp
export PATH=.local/bin:$PATH
export PROBLEM=translate_jpen
export TRAIN_DIR=$STORAGE_BUCKET/training/transformer_ende
export MODEL=transformer
export HPARAMS=transformer_tpu
# Directory of your own problem script
export USR_DIR=./src
After preprocessing the data with your own ./src/myproblem.py, run training.
Here, cloud_tpu_name directly specifies the name given to ctpu up. (Specifying it via $TPU_NAME instead causes an error.)
Reference: https://stackoverflow.com/questions/59089613/tpu-core-error-on-google-cloud-platform-cannot-find-any-tpu-cores-in-the-system
Training time depends on the amount of data; with a dataset of about 60,000 sentence pairs it took about 3 hours.
# Preprocessing
t2t-datagen \
--problem=$PROBLEM \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--t2t_usr_dir=$USR_DIR
# Training
t2t-trainer \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--train_steps=40000 \
--eval_steps=3 \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR \
--t2t_usr_dir=$USR_DIR \
--use_tpu=True \
--cloud_tpu_name=transformer
After training, run inference. You can translate interactively in a shell by setting the decode_interactive parameter to True.
If you want to run inference locally from the model trained on Cloud TPU, see the following article.
https://qiita.com/yolo_kiyoshi/items/209750f27f582ed48257
# Inference
t2t-decoder \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR \
--t2t_usr_dir=$USR_DIR \
--decode_hparams="beam_size=4,alpha=0.6" \
--decode_interactive=true
- English-Japanese translation with Transformer: https://qiita.com/yoh_okuno/items/35dcb8b2de7cd245119b
- Japanese-English translation with tensor2tensor: https://datanerd.hateblo.jp/entry/2019/07/25/101436
- Try seq2seq with your own data using Tensor2Tensor: https://www.madopro.net/entry/t2t_seq2seq