The other day, @TakesxiSximada posted an article, "Code reading of Safe, a library that checks password strength in Python," so I tried some code reading myself.
So I picked up a library called faker that I had been interested in using for a while.
faker is a library that conveniently generates dummy test data. It is the Python version of a tool you often see in other languages.
https://pypi.python.org/pypi/fake-factory/0.5.3 https://github.com/joke2k/faker
Install
You can install it with pip.
$ pip install fake-factory
>>> from faker import Factory
Then all you need to do is create a generator that produces test data.
>>> fake = Factory.create()
After that, the test data will be returned like this.
>>> fake.name()
'Anfernee Reichel'
>>> fake.address()
'084 Tiney Fork Suite 757\nPort Earl, MI 20240-1776'
>>> fake.text()
'Facilis non eligendi qui deleniti ullam est. Ab minus est non et occaecati laborum sequi. Vero consectetur repellendus dicta velit. Quisquam omnis alias error sed totam.'
It also supports multiple locales: just pass a locale as an argument to `Factory.create()`.
>>> fake = Factory.create('ja_JP')
>>> fake.name()
'Yumiko Tsuda'
>>> fake.address()
'32-22-3 Shiba Park, Chuo-ku, Gunma Prefecture Kamihiroya Heights 400'
>>> fake.text()
'Non ut in unde ipsa fugiat excepturi voluptate. Enim molestias voluptatem aperiam. Est fuga distinctio sit officia qui velit numquam sint.'
No Japanese data is provided for text, so the default `en_US` data is returned.
By the way, `fake` is an instance of `faker.generator.Generator`.
>>> type(fake)
<class 'faker.generator.Generator'>
Now I would like to read the code. Before that, though, the code is easier to follow once you understand what a Provider is in faker, so I will explain Providers first.
The Providers are stored under faker/faker/providers.
├── providers
│ ├── __init__.py
│ ├── __pycache__
│ ├── address
│ ├── barcode
│ ├── color
│ ├── company
│ ├── credit_card
│ ├── currency
│ ├── date_time
│ ├── file
│ ├── internet
│ ├── job
│ ├── lorem
│ ├── misc
│ ├── person
│ ├── phone_number
│ ├── profile
│ ├── python
│ ├── ssn
│ └── user_agent
Each category, such as `address` and `barcode`, contains a directory for each locale plus the Provider that serves as the base for that category.
Here, we will focus on `person` and follow the source.
The contents of `person` look like this.
├── providers
│ ├── __init__.py
│ ├── person
│ │ ├── __init__.py
│ │ ├── bg_BG
│ │ ├── cs_CZ
│ │ ├── de_AT
│ │ ├── de_DE
│ │ ├── dk_DK
│ │ ├── el_GR
│ │ ├── en
│ │ ├── en_US
│ │ ├── es_ES
│ │ ├── es_MX
│ │ ├── fa_IR
│ │ ├── fi_FI
│ │ ├── fr_FR
│ │ ├── hi_IN
│ │ ├── hr_HR
│ │ ├── it_IT
│ │ ├── ja_JP
│ │ ├── ko_KR
│ │ ├── lt_LT
│ │ ├── lv_LV
│ │ ├── ne_NP
│ │ ├── nl_NL
│ │ ├── no_NO
│ │ ├── pl_PL
│ │ ├── pt_BR
│ │ ├── pt_PT
│ │ ├── ru_RU
│ │ ├── sl_SI
│ │ ├── sv_SE
│ │ ├── tr_TR
│ │ ├── uk_UA
│ │ ├── zh_CN
│ │ └── zh_TW
Next, let's look at `__init__.py` directly under faker/providers/person.
from .. import BaseProvider


class Provider(BaseProvider):
    formats = ['{{first_name}} {{last_name}}', ]

    first_names = ['John', 'Jane']

    last_names = ['Doe', ]

    def name(self):
        """
        :example 'John Doe'
        """
        pattern = self.random_element(self.formats)
        return self.generator.parse(pattern)

    @classmethod
    def first_name(cls):
        return cls.random_element(cls.first_names)

    @classmethod
    def last_name(cls):
        return cls.random_element(cls.last_names)

    # Omitted below
This is the Provider that serves as the base for each language's Person Provider.
You can see that it inherits from BaseProvider, which implements classmethods such as `random_element()` that pick data at random.
Each language's Provider then inherits from this Provider and either adds new properties and methods or overrides existing ones. For the Japanese Person Provider, see the following. https://github.com/joke2k/faker/blob/master/faker/providers/person/ja_JP/__init__.py
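To make the inheritance pattern concrete, here is a minimal sketch that does not depend on faker at all. The class names `PersonProvider` and `JapanesePersonProvider` and the romanized sample names are hypothetical and chosen only for illustration. The `name()` here also formats the string directly instead of going through `generator.parse()` as the real one does.

```python
import random


class BaseProvider:
    """Hypothetical stand-in for faker's BaseProvider."""

    @classmethod
    def random_element(cls, elements):
        # Pick one item at random from the given sequence.
        return random.choice(elements)


class PersonProvider(BaseProvider):
    """Base person provider with placeholder data."""

    first_names = ['John', 'Jane']
    last_names = ['Doe']

    @classmethod
    def first_name(cls):
        return cls.random_element(cls.first_names)

    @classmethod
    def last_name(cls):
        return cls.random_element(cls.last_names)

    @classmethod
    def name(cls):
        return '{} {}'.format(cls.first_name(), cls.last_name())


class JapanesePersonProvider(PersonProvider):
    """Locale-specific provider: overrides the data and the name order."""

    first_names = ['Akira', 'Yumiko']
    last_names = ['Tsuda', 'Sato']

    @classmethod
    def name(cls):
        # Family name first, as an illustration of overriding a method.
        return '{} {}'.format(cls.last_name(), cls.first_name())


print(PersonProvider.name())
print(JapanesePersonProvider.name())
```

Because `first_name()` and `last_name()` read the lists through `cls`, the subclass only has to replace the class attributes to change the data they return.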
https://github.com/joke2k/faker/blob/master/faker/factory.py#L14-L44
`Factory.create()` creates an instance of `faker.generator.Generator` and returns it.
In the following loop, each Provider is registered on `faker`, which is an instance of `faker.generator.Generator`, based on the `locale` passed as an argument to `Factory.create()`.
(If there is no Provider for the specified `locale`, the one for DEFAULT_LOCALE, `en_US`, is used.)
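for prov_name in providers:
    # The base package itself is not a concrete provider, so skip it.
    if prov_name == 'faker.providers':
        continue

    # Resolve the provider class for this locale (with fallback).
    prov_cls, lang_found = cls._get_provider_class(prov_name, locale)
    provider = prov_cls(faker)
    provider.__provider__ = prov_name
    provider.__lang__ = lang_found
    faker.add_provider(provider)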
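The locale fallback described above can be sketched as follows. This is not faker's actual `_get_provider_class()`: the dictionary `REGISTRY` and the function `get_provider_class()` are hypothetical. The sketch also assumes a lookup order of requested locale, then the default locale, then a locale-independent base.

```python
DEFAULT_LOCALE = 'en_US'

# Hypothetical registry: (provider name, locale) -> provider class name.
# None as the locale stands for the locale-independent base provider.
REGISTRY = {
    ('person', 'ja_JP'): 'JapanesePersonProvider',
    ('person', 'en_US'): 'EnglishPersonProvider',
    ('person', None): 'BasePersonProvider',
    ('lorem', 'en_US'): 'EnglishLoremProvider',
    ('lorem', None): 'BaseLoremProvider',
}


def get_provider_class(prov_name, locale):
    """Return (provider, locale actually found), trying fallbacks in order."""
    for candidate in (locale, DEFAULT_LOCALE, None):
        provider = REGISTRY.get((prov_name, candidate))
        if provider is not None:
            return provider, candidate
    raise ValueError(
        'Unable to find provider {!r} for locale {!r}'.format(prov_name, locale)
    )


print(get_provider_class('person', 'ja_JP'))  # ('JapanesePersonProvider', 'ja_JP')
print(get_provider_class('lorem', 'ja_JP'))   # ('EnglishLoremProvider', 'en_US')
```

The second call mirrors what happened with `fake.text()` earlier: there is no Japanese entry, so the `en_US` one is returned along with the locale that was actually found.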
Next, let's take a look at the `add_provider(provider)` call that appeared in the loop above.
https://github.com/joke2k/faker/blob/master/faker/generator.py#L22-L39
Every public method defined by the Provider passed as an argument (e.g. `faker.providers.person.ja_JP.Provider`) is added to the generator as a formatter.
https://github.com/joke2k/faker/blob/master/faker/generator.py#L70-L75
The word "formatter" appears out of nowhere in `Generator.add_provider()`, but all it does is call `setattr()` on the Generator instance.
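The idea of copying a provider's public methods onto the generator with `setattr()` can be sketched like this. It is a simplified illustration, not faker's actual code: `Generator` here is a stripped-down stand-in, and `GreetingProvider` is a hypothetical provider.

```python
class Generator:
    """Hypothetical, stripped-down generator."""

    def __init__(self):
        self.providers = []

    def add_provider(self, provider):
        self.providers.insert(0, provider)
        for method_name in dir(provider):
            # Skip private and special attributes.
            if method_name.startswith('_'):
                continue
            attribute = getattr(provider, method_name)
            if callable(attribute):
                # Expose the bound method directly on the generator.
                setattr(self, method_name, attribute)


class GreetingProvider:
    """Hypothetical provider with one public and one private method."""

    def greeting(self):
        return 'hello'

    def _helper(self):
        return 'hidden'


fake = Generator()
fake.add_provider(GreetingProvider())
print(fake.greeting())           # hello
print(hasattr(fake, '_helper'))  # False
```

Since later calls to `setattr()` overwrite earlier ones, a provider added later would replace a same-named method from a provider added earlier in this sketch.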
As we have seen so far, calling `Factory.create()` gives you an instance of `faker.generator.Generator` that has every public method defined in the locale's Providers set as an attribute.
Thanks to this, simply calling `fake.method_name()` as shown below runs the `method_name()` implemented in that locale's Provider and returns random test data.
>>> fake.name()
'Anfernee Reichel'
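For a method such as `name()`, the call goes one step further: the person Provider shown earlier passes a pattern like `'{{first_name}} {{last_name}}'` to `self.generator.parse()`. Below is a rough sketch of how such a template could be resolved. It assumes a regular-expression substitution, and `MiniGenerator` is a hypothetical class, so treat it as an illustration rather than a copy of faker's implementation.

```python
import random
import re

# Matches placeholders such as {{first_name}}.
TOKEN = re.compile(r'\{\{\s*(\w+)\s*\}\}')


class MiniGenerator:
    """Hypothetical generator that resolves {{name}} placeholders."""

    def first_name(self):
        return random.choice(['John', 'Jane'])

    def last_name(self):
        return 'Doe'

    def parse(self, pattern):
        # Replace each placeholder with the result of the same-named method.
        return TOKEN.sub(lambda match: getattr(self, match.group(1))(), pattern)


fake = MiniGenerator()
print(fake.parse('{{first_name}} {{last_name}}'))
```

With this approach, a locale only needs to supply different format strings (for example, family name first) to change how the pieces are combined.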
I ran out of steam and only followed the `Factory.create()` part, but once you understand how the Generator is built, the other ways of using this library become easy to follow.
I recommend code reading with a small library like this one: it was easy to get into and fun!
While writing this article, I ran into a problem: because the PersonProvider for ja_JP (https://github.com/joke2k/faker/blob/master/faker/providers/person/ja_JP/__init__.py) keeps the formats for `name()` in Japanese, `user_name()` and `domain_word()` are not rendered properly.
https://github.com/joke2k/faker/blob/master/faker/providers/internet/__init__.py#L27-L32
https://github.com/joke2k/faker/blob/master/faker/providers/internet/__init__.py#L90-L95
I submitted a PR to address the above problem, and it was merged without incident. https://github.com/joke2k/faker/pull/300