Intended audience
--Those who want to learn more about the Python development environment and tools
--Those who want a more modern Python environment
Not the intended audience
--Those who are new to Python itself
--Advanced Python users
What this article explains
--A rough description of each tool
--Why you would use the tool and what you gain from it
--Reference documents / URLs
What this article does not explain
--Specific commands
--Detailed syntax
Modern Python
Since I started doing research in graduate school, I've been writing Python for quite some time. It has many libraries that are easy to use for research, and it suits fast-iterating projects such as research very well.
However, precisely because you can try out code quickly and change its behavior freely, Python is hard to operate stably.
For example, C++ has to be compiled, so you need to think carefully about your design before you implement it.
Python, on the other hand, is a scripting language, so even a sloppy, ad hoc design will usually get you something that works.
However, such thoughtless code becomes a large debt later on.
In fact, code I wrote in a hurry for a paper with a looming deadline now stands in my way, and I have to redesign and reimplement it from scratch.
Python, for better or for worse, has the following characteristics:
--No need to declare types
  --In exchange, you have to put more thought into variable names
--Many modules
--Version drift
--Abundant package managers
  --It's hard to know which one to use
--Lax design is tolerated
  --The code becomes hard to read
This time, I'll summarize the tools that may solve these problems. However, since the actual usage and details are omitted, please refer to other articles.
I want to get used to Python and become an intermediate person.
There are many Python package managers. I previously explained how to build a Python environment in the article "I can't seem to put an end to the python environment construction war with illustrations". Since then my understanding has progressed to some extent, and I've arrived at a setup I'm comfortable treating as my current Python environment, so I'll summarize it here.
pyenv can **manage multiple Python versions**.
Specifically, you can install Python 3.7, Python 3.8, and Python 3.9 side by side and switch between them as you like.
Switching the Python version this way lets you support multiple projects.
For example, you may find that the code for an old project only works with Python 2.7.
In such cases it helps to be able to switch versions with pyenv.
However, although pyenv can switch versions, it cannot create virtual environments. In this article, a virtual environment means an isolated environment used by a single project.
Imagine what happens if you can't isolate environments. Suppose you've developed a machine learning project and are then assigned to another project that uses Django. The pytorch library from the machine learning project isn't needed in the Django project. As the number of installed libraries grows, things get slower and the versions required by different projects start to conflict. Therefore, we need virtual environments in which libraries can be installed per project.
To solve this problem, pipenv can create multiple virtual environments for a given version of Python.
pipenv is recommended because it wraps venv and adds many useful features.
It also makes library version management smarter, though I won't go into details here.
In summary, pyenv manages the version of Python itself, and pipenv manages the virtual environments for a particular Python version.
On Mac, you can install both pipenv and pyenv with brew.
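To make the idea of a virtual environment a little more concrete, here is a minimal sketch using only the standard library's venv module, which pipenv wraps. It is not how pipenv itself is invoked; the directory name demo-env and the helper function names are assumptions made up for this illustration.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


def create_demo_environment(parent: Path) -> Path:
    """Create a throwaway virtual environment under `parent` and return its path."""
    env_dir = parent / "demo-env"  # hypothetical directory name
    # with_pip=False keeps the sketch fast; a real project usually wants pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_demo_environment(Path(tmp))
    # Each venv carries a pyvenv.cfg file that records its base interpreter.
    has_config = (env_dir / "pyvenv.cfg").exists()
    print("environment created:", has_config)
    print("running inside a venv:", in_virtual_environment())
```

Each project gets its own directory like this, so libraries installed for one project never leak into another.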
pyenv came up earlier. Next is a newer tool, poetry.
poetry appears to be a package manager that handles libraries in a way similar to Rust. I haven't used it yet, so I'm not familiar with it, but it seems that a single toml file manages everything.
It looks like a fairly new manager, so I'd love to try it next time.
Python is a scripting language that works without type declarations. As a result, you can develop quite quickly in the early stages, but later on you spend more time thinking about variables and often suffer from run-time errors.
You can therefore code more safely by using typing and related features, introduced in Python 3.5.
However, note that **no error is raised when the program runs, even if a variable holds a value of a different type**.
Even so, type hints make it easier to benefit from IDE support and help third parties understand what the code means, so write them actively.
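A small sketch of what "no error at run time" means in practice. The function name describe and its argument are made up for this illustration.

```python
def describe(age: int) -> str:
    """Return a short description; the annotation says `age` should be an int."""
    return f"age={age}"


# Passing a str violates the annotation, yet Python runs the call without complaint.
result = describe("ten")
print(result)  # age=ten

# The hints are only stored as metadata on the function object.
print(describe.__annotations__)
```

The interpreter ignores the mismatch; only tools that read the annotations, such as an IDE or a static checker, can point it out.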
typing
def greeting(name: str) -> str:
    return 'Hello ' + name
The typing module was added in Python 3.5 (variable annotations followed in Python 3.6), so you can specify types for variables and functions as shown above. Specifying types makes it easier to benefit from IDE support and, in the long run, leads to more efficient coding.
There is also typing.Final (available since Python 3.8), which lets you declare constants and the like more safely.
Please see the reference URL for details.
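As a rough sketch of how Final reads in code (the constant name MAX_RETRIES and the function are hypothetical, and Python 3.8 or later is assumed):

```python
from typing import Final

# Hypothetical constant, chosen only for illustration.
MAX_RETRIES: Final = 3


def retry_attempts() -> range:
    """Return the attempt numbers allowed by MAX_RETRIES."""
    return range(MAX_RETRIES)


print(list(retry_attempts()))  # [0, 1, 2]

# A later line such as `MAX_RETRIES = 5` would still run, because Final is not
# enforced at run time; a static checker such as mypy is what reports it.
```

Like other type hints, Final is a promise that tools check for you, not something the interpreter enforces.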
data classes
The dataclasses module provides a decorator and related helpers that make it easy to create classes that store data.
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
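To show what the decorator generates, here is a short usage sketch. It repeats the class so that it runs on its own; the sample values are made up.

```python
from dataclasses import asdict, dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# __init__, __repr__ and __eq__ are generated from the annotated fields.
item = InventoryItem("widget", 2.5, 4)
same = InventoryItem("widget", 2.5, 4)

print(item)               # field-by-field representation
print(item == same)       # True: compared field by field
print(item.total_cost())  # 10.0
print(asdict(item))       # plain dict with the same fields
```

Without the decorator you would have to write the constructor, comparison, and representation methods by hand.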
namedtuple
You can declare a named tuple. Because its fields cannot be reassigned, it is convenient for managing values that do not change.
Where possible, it is easier to declare the tuple as a class that inherits from typing.NamedTuple.
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int
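A short sketch of the "cannot be rewritten" property. The employee data is made up for the example.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    name: str
    id: int


alice = Employee(name="Alice", id=1)

# A NamedTuple still behaves like a plain tuple, so it can be unpacked.
name, employee_id = alice

# Assigning to a field is rejected at run time.
try:
    alice.id = 2
except AttributeError as error:
    rejected = True
    print("cannot modify:", error)
else:
    rejected = False

# To "change" a value, build a new tuple instead; the original is untouched.
promoted = alice._replace(id=2)
print(alice, promoted)
```

Unlike type hints, this immutability is enforced by the interpreter itself.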
pydantic
This library is used by FastAPI. It checks data against the declared types at runtime.
from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel

class User(BaseModel):
    id: int
    # Annotated explicitly so the field is accepted by newer pydantic versions too
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []

external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
# Values such as '123' are converted to the declared types
user = User(**external_data)
If the data does not match the declared types and cannot be converted, pydantic raises an exception **at runtime**. (By contrast, typing alone does not raise run-time errors, so pydantic makes it easier to see which type is wrong and is more robust.)
docstring
A docstring is a string that documents a function, class, or similar object.
It describes what arguments or attributes the function or class has and how it behaves.
def add(x: int, y: int) -> int:
    """add function.

    Calculate the sum of x and y.

    Parameters
    ----------
    x: int
        Number to be added
    y: int
        Number to add

    Returns
    -------
    int

    Notes
    -----
    I usually don't write this much for a function like this (its behavior is obvious)
    """
    return x + y
Writing a docstring has the following advantages.
--You can convey the behavior of functions and classes to other team members
--The documentation stays close to the code
--IDEs and similar tools show more detailed types and supplementary information
--As described later, docstrings can be used with Sphinx, a tool that generates documentation automatically
--Above all, your future self will thank you
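Part of the reason tools can show this information is that a docstring is an ordinary attribute of the object. A minimal sketch using the standard inspect module, with a shortened version of the add function:

```python
import inspect


def add(x: int, y: int) -> int:
    """Return the sum of x and y."""
    return x + y


# The docstring and the annotated signature can both be read at run time,
# which is how tools such as help() and documentation generators find them.
summary = inspect.getdoc(add)
signature = str(inspect.signature(add))

print(summary)    # Return the sum of x and y.
print(signature)  # (x: int, y: int) -> int
```

Calling help(add) in an interactive session prints the same information.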
There are several styles for writing docstrings. The three main ones are reStructuredText style, Google style, and NumPy style.
Each has its own conventions. Pick the style you like and follow it. If your team has already agreed on a docstring style, follow that one.
There are few references on how to write docstrings, so it helps to look at how an existing library does it.
--NumPy style Python Docstrings example
When multiple people write code, each person has different coding habits.
These differ in many ways, such as whether to use " or ' for strings, the number of characters per line, and how variables are named.
Code checkers, linters, and formatters exist to unify these differing styles. You can use them to write more consistent Python code.
In this section I only touch on the ones I use. There are other linters besides these, so please look into them as well.
pep
PEP stands for Python Enhancement Proposal and covers documents such as coding conventions.
PEPs are proposals for improving Python; the most famous is PEP 8.
PEP 8 is the coding standard followed by the standard library, among others, and most Python code is based on it.
flake8
flake8 is a tool, installable with pip, that checks the format of your code.
It checks whether you follow the coding standards.
flake8 is a wrapper around three libraries: pyflakes, pycodestyle, and mccabe.
With flake8, you can configure detailed rules such as the maximum number of characters per line.
Also, if you install flake8 plugins with pip, they run automatically whenever you execute the flake8 command.
black
black is a code formatter.
flake8 only tells you where a convention is violated, whereas black actually reformats the code.
black is relatively new and offers very few settings that can be changed. As a result, projects that use black end up with much the same enforced format.
black is very easy to use, so I definitely recommend it.
mypy
mypy statically analyzes the type annotations in your code and tells you where the types are wrong.
Thanks to mypy, all you have to do is fix the types it reports.
However, it may report errors for a library you are using. In that case you need to generate a stub yourself or install an already-distributed stub with pip.
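A sketch of the kind of mistake a static checker can catch before the code runs. The price table and function names are hypothetical, and the comment describes what mypy would be expected to report rather than quoting its output.

```python
from typing import Dict, Optional

# Hypothetical data for the example.
PRICES: Dict[str, float] = {"apple": 1.2}


def find_price(name: str) -> Optional[float]:
    """Return the price of `name`, or None when it is unknown."""
    return PRICES.get(name)


def price_with_tax(name: str, rate: float = 0.1) -> float:
    """Return the price including tax, raising KeyError for unknown items."""
    price = find_price(name)
    # Writing `return price * (1 + rate)` without the check below is the kind of
    # line mypy is expected to flag, because `price` may be None at this point.
    if price is None:
        raise KeyError(name)
    return price * (1 + rate)


print(price_with_tax("apple"))
```

At run time the unchecked version would only fail for unknown items, which is exactly the sort of bug that is easy to miss in manual testing.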
isort
isort fixes the order of Python imports.
flake8 has an isort plugin, so a good workflow is to run isort whenever you are warned that imports are out of order.
When you create a Python project, you will encounter several configuration files. This section describes them.
setup.py
setup.py is a file used to distribute the project to third parties. It uses a module called setuptools to create a package that lets others install the project files with pip.
It describes the package information, installation method, URL, and so on.
# https://packaging.python.org/tutorials/packaging-projects/?highlight=setup.py#creating-setup-py
import setuptools

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="example-pkg-YOUR-USERNAME-HERE",  # Replace with your own username
    version="0.0.1",
    author="Example Author",
    author_email="author@example.com",
    description="A small example package",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/pypa/sampleproject",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.6',
)
The command pip install numpy, which we usually type without a second thought, fetches a package created with setup.py from PyPI and puts it in a directory called site-packages.
If you want to publish your project online as a package, write a setup.py.
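If you are curious where that directory is for your interpreter, the standard sysconfig module can report it. This is a small sketch; the exact path depends on your installation, and on some Linux distributions the directory has a different name.

```python
import sysconfig

# "purelib" is the install location for pure-Python packages, which is
# typically the site-packages directory of the current interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

Inside a virtual environment this points into the environment's own directory, which is what keeps each project's libraries separate.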
MANIFEST.in
When creating a package from a project with setup.py, there are times when you want to include files other than Python files in the package.
For example, an image file or an audio file.
In that case, creating a file called MANIFEST.in lets you include such files in the package when you build it.
setup.cfg
setup.py is required to publish and install a package.
However, if you hard-code the author or the files to include directly in it, they become difficult to change later.
By creating an additional configuration file, setup.cfg, you can manage the information used in the package separately.
When setup.py runs and finds a setup.cfg, it reads the information from it, overrides the corresponding settings, and then creates the package.
[metadata]
name = my_package
version = attr: src.VERSION
description = My package description
long_description = file: README.rst, CHANGELOG.rst, LICENSE.rst
keywords = one, two
license = BSD 3-Clause License
classifiers =
    Framework :: Django
    License :: OSI Approved :: BSD License
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.5

[options]
zip_safe = False
include_package_data = True
packages = find:
scripts =
    bin/first.py
    bin/second.py
install_requires =
    requests
    importlib; python_version == "2.6"
requirements.txt
A file that lists the packages installed with pip.
It doesn't have to be named requirements.txt, but that is the customary name.
By creating this file and including it in your repository on GitHub or elsewhere, a third party can easily install the packages with pip install -r requirements.txt.
However, requirements.txt has problems: it is not well suited to dependency resolution, and it makes updating library versions difficult. For that reason, pipenv uses a different package management file called Pipfile, and poetry uses pyproject.toml.
Pipfile/Pipfile.lock
Pipfile and Pipfile.lock are the pipenv management files that solve the problems of requirements.txt.
The Pipfile contains the libraries you **depend on directly**.
For example, if you want to create a project that fetches a URL using requests, you would create the following Pipfile.
[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
name = "pypi"
[packages]
requests = "*"
Here, requests itself depends on other libraries.
For example, it internally uses a library called chardet, which detects the character encoding of a file.
The version of chardet is not written in the Pipfile.
Pipfile.lock, on the other hand, lists the versions of all libraries.
Because the top-level libraries you use directly are managed in the Pipfile, it is easy to think about upgrading them.
And because the versions of all dependent libraries are managed in Pipfile.lock, a third party can easily reproduce the same execution environment.
Separating the files in this way solves the dependency problem.
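To illustrate the split, the sketch below reads pinned versions out of a lock file. Pipfile.lock is a JSON file, but the text here is a heavily trimmed, hand-written stand-in: the version numbers are only illustrative, and the function name is hypothetical.

```python
import json

# Trimmed stand-in for a Pipfile.lock. Real files also carry hashes and
# metadata, and the version numbers here are illustrative only.
LOCK_TEXT = """
{
  "default": {
    "requests": {"version": "==2.25.1"},
    "chardet": {"version": "==4.0.0"}
  },
  "develop": {}
}
"""


def pinned_versions(lock_text: str) -> dict:
    """Map each locked package name to its pinned version specifier."""
    lock = json.loads(lock_text)
    return {name: info["version"] for name, info in lock["default"].items()}


for package, version in pinned_versions(LOCK_TEXT).items():
    print(package, version)
```

Note that chardet appears in the lock data even though only requests would be written in the Pipfile.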
pyproject.toml
pyproject.toml is a file defined in PEP 518 that manages package settings.
Recently a package manager called poetry has been using this file, but it does not appear to be a configuration file limited to poetry.
Any package manager that supports it can use pyproject.toml.
Previously, publishing a package required many files, such as requirements.txt, setup.py, setup.cfg, and MANIFEST.in.
pyproject.toml alone can cover all of these.
I'm currently using pipenv, but I'm interested in poetry and pyproject.toml, so I'll try them.
tox.ini
tox is a library that automates Python testing.
Running the tox command automatically executes everything defined in tox.ini.
If all you need is a single pytest run, you can simply type that command each time.
However, depending on the versions you distribute for, you may want to test against several versions, such as python2.7, python3.8, and python3.9.
You may also want to check compliance with coding standards using a tool like flake8.
With tox and its configuration file tox.ini, all of these tests can be run automatically with a single command. What's more, tox creates the environments for the other Python versions inside a special directory that it manages. As a result, each test gets its own virtual environment, and there are no dependencies between tests.
[tox]
# Specify the environments to run
# flake8-py38 matches a section name, so [testenv:flake8-py38] is run
# py38 has no [testenv:py38] section, so [testenv] is run
envlist =
    py38
    flake8-py38
    mypy-py38

[testenv]
deps = pipenv
# Commands to run in the test
# This only runs pipenv install, so of course it passes
commands =
    pipenv install

[testenv:flake8-py38]
basepython = python3.8
description = 'check flake8-style is ok?'
commands =
    pipenv install
    pipenv run flake8 gym_md

# flake8 settings (flake8 reads them from a section named [flake8])
# https://flake8.pycqa.org/en/latest/user/configuration.html#configuration-locations
[flake8]
max-line-length = 88

[testenv:mypy-py38]
basepython = python3.8
description = 'check my-py is ok?'
commands =
    pipenv install
    pipenv run mypy gym_md
PyPI
PyPI is a site where you can upload Python libraries.
When you run pip install, the package is downloaded from here.
There are multiple testing tools in python.
unittest
The standard test library is unittest.
It is included in the standard library, so you can write tests without installing anything.
Subclass TestCase and define methods whose names start with test.
import unittest

class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()
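The naming rule matters because the test loader collects methods by their prefix. The following sketch, with a made-up test class, loads and runs a suite programmatically to show that only the method starting with test is picked up.

```python
import io
import unittest


class TestArithmetic(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def helper_not_collected(self):
        # The default loader skips this method because its name does not start with "test".
        raise AssertionError("never called by the test loader")


suite = unittest.TestLoader().loadTestsFromTestCase(TestArithmetic)
collected = suite.countTestCases()

# Send the runner's report to a buffer so the example stays quiet.
report = io.StringIO()
result = unittest.TextTestRunner(stream=report, verbosity=0).run(suite)

print("collected:", collected)            # collected: 1
print("passed:", result.wasSuccessful())  # passed: True
```

In everyday use you rarely build the suite yourself; unittest.main() and python -m unittest do the same collection for you.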
pytest
pytest is a third-party test library.
pytest lets you write tests as plain functions and reports errors in more detail than unittest.
For example, when a value is wrong, as in the test below, pytest prints the failing location and the actual value. I use pytest because it is easy to use and its output is easy to understand.
# content of test_sample.py
def inc(x):
    return x + 1


def test_answer():
    assert inc(3) == 5
$ pytest
=========================== test session starts ============================
platform linux -- Python 3.x.y, pytest-6.x.y, py-1.x.y, pluggy-0.x.y
cachedir: $PYTHON_PREFIX/.pytest_cache
rootdir: $REGENDOC_TMPDIR
collected 1 item
test_sample.py F [100%]
================================= FAILURES =================================
_______________________________ test_answer ________________________________
    def test_answer():
>       assert inc(3) == 5
E       assert 4 == 5
E        +  where 4 = inc(3)
test_sample.py:6: AssertionError
========================= short test summary info ==========================
FAILED test_sample.py::test_answer - assert 4 == 5
============================ 1 failed in 0.12s =============================
doctest
doctest lets you run tests written in the docstrings that came up earlier.
If you write usage examples prefixed with >>> in a docstring, doctest runs them and checks that they produce the output shown.
You can't write tests as complex as you can with pytest.
However, because the examples live in the docstring, they serve both as tests and as usage examples for third parties.
This makes it easier to understand how your code behaves and easier to modify it.
def square(x):
    """Return the square of x.

    >>> square(2)
    4
    >>> square(-2)
    4
    """
    return x * x

if __name__ == '__main__':
    import doctest
    doctest.testmod()
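To see a little of what happens underneath, the sketch below parses the examples out of a single docstring and runs them with the doctest module's own parser and runner. This is only an illustration of the mechanism; doctest.testmod() remains the usual entry point.

```python
import doctest


def square(x):
    """Return the square of x.

    >>> square(2)
    4
    >>> square(-2)
    4
    """
    return x * x


# Extract the >>> examples from the docstring and bundle them as one test.
parser = doctest.DocTestParser()
test = parser.get_doctest(square.__doc__, {"square": square}, "square", None, 0)

# Run each example and compare its output with the text written below it.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.attempted, results.failed)  # 2 0
```

If an expected value in the docstring were wrong, the runner would print a report for that example and count it as failed.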
tox
As mentioned earlier, you can automate multiple test commands by writing a tox.ini.
Sphinx is a tool that makes it easy to create beautiful documentation.
Many Python library references are written with Sphinx.
Sphinx uses a markup language called reStructuredText to create documents.
It also ships with sphinx-apidoc, a tool that generates documentation automatically; if you write proper docstrings in your Python code, you can create the documentation with a single command.
Therefore, if you write docstrings, you leave type and other information for your future self, and that same text becomes the reference.
This makes it less likely that the documentation falls behind the code and becomes an empty shell.
However, the rst files generated by sphinx-apidoc are only defaults, so you will have to edit them yourself to polish the result.
cookiecutter
cookiecutter is a tool that makes it easy to create well-structured Python projects.
Available templates (https://github.com/topics/cookiecutter-template) are published on GitHub, which makes it easy to generate a well-designed project locally.
Think of it as a tool that generates a project locally from boilerplate while filling in your own information.
For example, specifying cookiecutter-pypackage gives you a project with the following already set up.
--Sophisticated project layout
--Test automation with Travis CI
--Documentation generation using Sphinx
--Testing in multiple environments using tox
--Automatic release to PyPI
--CLI interface (click)
I'm glad that it prepares all of the items explained so far in one go.
I've summarized the Python development tools. While writing this summary, I realized that I'm still not very familiar with them and have not mastered them.
I want to practice every day so that I can write pythonic code.