faker is a library that generates dummy data (test data). Libraries with the same name also exist for PHP and Ruby, and it has the feel of a de facto standard. https://github.com/joke2k/faker
In this post I introduce it, now that it can generate address data in Japanese.
What kind of data can faker generate? Let's start with a simple example.
sample.py
from faker import Factory
f = Factory.create()  # no argument: the default, English-language locale
print(f.name())
print(f.address())
print(f.phone_number())
print(f.date())
Execution result
Jennie Homenick
Petramouth, WI 21918-9349
177.513.9541
1998-12-21
It generates plausible data, but by default everything is in English-speaking notation.
Data in other languages can also be generated by passing a locale as the argument to Factory.create.
As for Japanese support, thanks to a commit by @ta2xeo about a month ago, names and phone numbers can now be generated in Japanese.
And this time, I made it possible to generate addresses as well. Let's look at them together.
sample_ja_JP.py
from faker import Factory
f = Factory.create('ja_JP')  # Japanese locale
print(f.name())
print(f.phone_number())
print(f.date())
# A full address, generated twice to show the variation
print(f.address())
print(f.address())
# The individual address components
print(f.zipcode())
print(f.prefecture())
print(f.city())
print(f.town())
print(f.chome())
print(f.ban())
print(f.gou())
print(f.building_name())
Execution result
Akiko Matsumoto
070-1472-1794
2011-03-04
11-4-20 Hanakawado, Tsurumi-ku, Yokohama-shi, Fukushima Corp Minowa 553
31-24-20 Ujiie Shinden, Sammu City, Toyama Prefecture
121-0122
Akita
Koganei City
Taitung
11th Street
No. 8
No. 13
Palace
As you can see, for better or worse, almost none of these are real addresses. The generated data may not be internally consistent, and it may not cover the various address formats used in Japan, but for the time being it is better than English notation.
~~It seems that the Japanese version has not been released to PyPI yet. If you want to use it, please install it from the GitHub repository.~~
It was released in v0.5.1, so the steps in this section are no longer necessary.
You can generate test data with a library such as faker, but there are cases where purely dummy data is not enough. In such cases I usually want to take data from the production environment and mask part of it, so I created a tool for that. Of course, it uses faker.
The tool is called Hermes, and it masks only specific columns in a CSV file. It is still rudimentary, but I plan to keep improving it. https://github.com/ohbarye/Hermes
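The idea of masking only specific CSV columns can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not Hermes's actual implementation: the function name mask_columns, the column names, and the sample rows are all hypothetical. The replacement values come from plain stand-in functions so that the sketch runs without faker installed; in practice they could be faker methods such as f.name or f.phone_number.

```python
import csv
import io


def mask_columns(src, dst, maskers):
    """Copy CSV data from src to dst, replacing the columns named in maskers.

    maskers maps a column name to a zero-argument callable that returns
    the replacement value (hypothetical helper, for illustration only).
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for column, make_value in maskers.items():
            if column in row:
                row[column] = make_value()  # overwrite only the masked columns
        writer.writerow(row)


# Made-up input data standing in for a production export.
source = io.StringIO(
    "id,name,phone\n"
    "1,Real Name A,000-0000-0001\n"
    "2,Real Name B,000-0000-0002\n"
)
masked = io.StringIO()

# Stand-in maskers; with faker this could be
# {"name": f.name, "phone": f.phone_number}.
mask_columns(
    source,
    masked,
    {"name": lambda: "MASKED", "phone": lambda: "000-0000-0000"},
)
result = masked.getvalue()
print(result)
```

Passing the maskers as callables keeps the CSV handling independent of how the replacement values are produced, so the id column passes through untouched while name and phone are regenerated for every row.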