Understand Python packages and modules

Introduction

When you work with Python, it can be hard to tell the difference between a module and a package. PyCharm in particular behaves in ways that are confusing at first.

So I am writing down my struggles with packages and modules here so that I can look back on them later.

Working with modules

First, let's work with modules.

What is a module?

A module corresponds to a file in Python. In other words, if you create a file called add.py, it becomes a module called add.

By splitting functionality across modules, you can avoid bloating a single file.

Directory

For now, we use the following directory structure.

The directory called Python_Tutorial contains the following:

.
├── __pycache__
├── calc
│   ├── add.py
│   ├── calc_main.py
│   ├── mul.py
│   └── sub.py
└── main.py

Write the modules in the calc directory, then try calling the other calculation modules from calc_main.py. After that, try using them from main.py directly under Python_Tutorial.

Preparation

Prepare add.py, sub.py, and mul.py.

add.py

print('add.py')
print(__name__)


def add(x, y):
    return x + y


if __name__ == '__main__':
    print(add(10, 20))

sub.py

print('sub.py')
print(__name__)


def sub(x, y):
    return x - y


if __name__ == '__main__':
    print(sub(10, 20))

mul.py

print('mul.py')
print(__name__)


def mul(x, y):
    return x * y


if __name__ == '__main__':
    print(mul(10, 20))

What is __name__?

add.py and the other files end with

if __name__ == '__main__':
    print(add(10, 20))

but what is this?

__name__ holds the name under which the file was loaded.

For example, try running add.py directly. Then

add.py
__main__
30

is output.

This is because execution starts from add.py, and __name__ is set to '__main__' in the file that execution starts from.

Be aware that __name__ in the entry-point file is always '__main__'.
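The difference between the two values of __name__ can be checked in a single script. The following is only a minimal sketch: the module name name_demo_module is made up for the illustration, and runpy.run_path with run_name="__main__" is used as a stand-in for running the file with the python command.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, chosen only for this sketch.
MODULE_NAME = "name_demo_module"

with tempfile.TemporaryDirectory() as tmp:
    # A tiny module that records the value of __name__ it sees.
    path = Path(tmp) / f"{MODULE_NAME}.py"
    path.write_text("seen_name = __name__\n")

    # Case 1: load it with import -> __name__ is the file name without .py.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        imported = importlib.import_module(MODULE_NAME)
        name_when_imported = imported.seen_name
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

    # Case 2: run it as the entry point, similar to `python name_demo_module.py`.
    globals_after_run = runpy.run_path(str(path), run_name="__main__")
    name_when_run = globals_after_run["seen_name"]

print(name_when_imported)  # name_demo_module
print(name_when_run)       # __main__
```

This is why the if __name__ == '__main__': block runs only when the file is the starting point of execution.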

Try calling the modules

Try importing add, sub, and mul, which are in the same directory, from calc_main.py.

calc_main.py



import add

'''
add.py
add
30
'''
print(add.add(10, 20))

from sub import sub

'''
sub.py
sub
-10
'''
print(sub(20, 30))


from mul import *

'''
mul.py
mul
1500
'''
print(mul(30, 50))

Apparently, writing

import <file name in the same directory>

lets you load other files as modules.

Loading add

import add

When you do this, the two print calls in add.py are executed. So top-level code such as print is executed when the module is loaded. Also, __name__ contains add. In other words, when a file is loaded as a module rather than run as the entry point, __name__ is the file name without the .py extension.

Because the file is loaded with import <file name>, the function is used as add.add, inside the namespace named after the file.

This notation seems safe because the namespaces stay separate even if another file also defines an add function!

Loading sub

from sub import sub

When you do this, the sub function of sub.py is imported directly, so you can pull a single name out of a module.

However, it does not load only the sub function. As with add.py,

'''
sub.py
sub
'''

is output, so the entire sub file is executed at the time of from sub.
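To make this concrete, here is a small sketch showing that a from-import runs the whole file even though only one name ends up in the importing namespace. The module name from_import_demo and its contents are invented for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents, only for this sketch.
source = (
    "executed = True  # top-level code, runs on any kind of import\n"
    "def sub(x, y):\n"
    "    return x - y\n"
    "def unused(x):\n"
    "    return x\n"
)

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "from_import_demo.py").write_text(source)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        from from_import_demo import sub  # binds only the name `sub` here
        loaded = sys.modules["from_import_demo"]  # the full module object
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("from_import_demo", None)

# Only `sub` was bound by the import statement...
only_sub_is_bound = "unused" not in globals()
# ...but the whole file was executed and the full module was created.
whole_module_ran = loaded.executed and hasattr(loaded, "unused")

print(sub(10, 20))        # -10
print(only_sub_is_bound)  # True
print(whole_module_ran)   # True
```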

Loading mul

from mul import *

This pulls all the names defined in the mul file (those not starting with an underscore) into the namespace of calc_main.py at once.

Basically, it seems better not to use this form. The reason is that it causes unexpected behavior, such as a name being overwritten when another module uses the same name.
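The overwriting problem can be reproduced with two throwaway modules that both define mul. This is a sketch under assumptions: the module names star_demo_first and star_demo_second are hypothetical, and exec with a fresh dictionary stands in for the namespace of a file such as calc_main.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names and contents, only for this sketch.
sources = {
    "star_demo_first": "def mul(x, y):\n    return x * y\n",
    "star_demo_second": "def mul(x, y):\n    return 'overwritten'\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, text in sources.items():
        (Path(tmp) / f"{name}.py").write_text(text)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        # Star imports are only legal at module level, so run them in a
        # fresh namespace that plays the role of the importing file.
        namespace = {}
        exec("from star_demo_first import *", namespace)
        result_before = namespace["mul"](30, 50)
        exec("from star_demo_second import *", namespace)
        result_after = namespace["mul"](30, 50)
    finally:
        sys.path.remove(tmp)
        for name in sources:
            sys.modules.pop(name, None)

print(result_before)  # 1500
print(result_after)   # overwritten
```

The second star import silently replaces mul, and nothing in the importing file hints at where the name came from.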

Summary of loading from the same directory

import <file name>

may be the most convenient form.

Try loading a file multiple times

Next, try multiple loading. I'm curious how Python behaves when the same file is imported many times.

Directory structure

To try multiple loading, a simple layout with files like a, b, and c seems easiest.

Try the following.

.
├── __pycache__
├── abc
│   ├── a.py
│   ├── ab.py
│   ├── abc_main.py
│   ├── b.py
│   ├── bc.py
│   ├── c.py
│   └── ca.py
├── calc
│   ├── __pycache__
│   │   ├── add.cpython-37.pyc
│   │   ├── mul.cpython-37.pyc
│   │   └── sub.cpython-37.pyc
│   ├── add.py
│   ├── calc_main.py
│   ├── mul.py
│   └── sub.py
└── main.py

I added an abc directory. ab.py loads the a and b files as modules, and the other files follow the same pattern.

Preparation

a.py

print('a file')

aa = 1

b.py

print('b file')

bb = 2

c.py

print('c file')

cc = 3

The experiment starts with a.py, b.py, and c.py as shown above.

Preparation 2

Load the corresponding files from ab.py and the others.

ab.py


import a
import b


print('ab file')

bc.py


import b
import c

print('bc file')

ca.py


import c
import a

print('ca file')

abc_main.py


import ab
import bc
import ca

print('abc main file')

Running abc_main.py

When I run abc_main.py, the output looks like this:

a file
b file
ab file
c file
bc file
ca file
abc main file

The a and b files are loaded first. Next, bc.py imports b and c, so you might expect b file to be printed again, but it is skipped and only c file is printed.

Apparently a module is not loaded again once it has been loaded.

When ca.py is loaded, c and a have already been loaded, so neither import raises an error or prints anything. a file and the others are printed only once.

In short, a file is basically executed only once. Later imports reuse the module that is already loaded.
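The reuse happens through sys.modules, the dictionary in which Python keeps every module it has already loaded. A minimal sketch follows; the module name import_once_demo is made up, and the printed output is captured so that it can be counted.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "import_once_demo"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / f"{MODULE_NAME}.py").write_text("print('a file')\naa = 1\n")
    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        importlib.invalidate_caches()
        with contextlib.redirect_stdout(captured):
            first = importlib.import_module(MODULE_NAME)   # executes the file
            second = importlib.import_module(MODULE_NAME)  # served from sys.modules
        was_cached = MODULE_NAME in sys.modules
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

printed_lines = captured.getvalue().splitlines()
print(printed_lines)    # ['a file']
print(first is second)  # True
print(was_cached)       # True
```

Both imports return the very same module object, which is why b file and a file appear only once in the output of abc_main.py.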

Try using a module one level down

So far, abc_main and calc_main used modules in the same directory.

Then, is it possible to use a module in a subdirectory, as main.py would?

main.py

main.py


from calc import add

'''
add.py
calc.add
5
'''
print(add.add(2, 3))

As shown above, the add module is imported from the calc directory. This code actually works.

In other words, since Python 3.3 you can import modules from a subdirectory even without __init__.py. Also, if you look at __name__,

calc.add

is output. So it is loaded as the add module inside the calc namespace. This is the namespace package feature introduced in Python 3.3 (PEP 420).

However, it does not work well if you do the following.

main.py



'''
    from calc import add.add
                        ^
SyntaxError: invalid syntax
'''
from calc import add.add
print(add(2, 3))



'''
AttributeError: module 'calc' has no attribute 'add'
'''
import calc
print(calc.add.add(2, 3))

Apparently, you cannot drill down further with "." in the part written after import.

Also, importing only the directory does not load the modules inside it, so calc.add is not available as an attribute.
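For comparison, the forms that do work put the dots before import rather than after it. The sketch below builds a throwaway directory named calc_demo (a hypothetical name, used instead of calc to avoid clashing with anything installed) that has no __init__.py, like the calc directory at this point.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical directory without __init__.py, containing only add.py.
    package_dir = Path(tmp) / "calc_demo"
    package_dir.mkdir()
    (package_dir / "add.py").write_text("def add(x, y):\n    return x + y\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()

        import calc_demo  # directory only: the modules inside are not loaded
        has_add_before = hasattr(calc_demo, "add")

        import calc_demo.add  # dotted form: loads the submodule as well
        has_add_after = hasattr(calc_demo, "add")

        from calc_demo.add import add  # the dots go before `import`
        result = add(2, 3)
    finally:
        sys.path.remove(tmp)
        for name in ("calc_demo", "calc_demo.add"):
            sys.modules.pop(name, None)

print(has_add_before)  # False
print(has_add_after)   # True
print(result)          # 5
```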

Packages

What is a package?

A package is a "module that manages other modules". For example, I created the calc directory earlier.

.
├── calc
│   ├── __pycache__
│   │   ├── add.cpython-37.pyc
│   │   ├── mul.cpython-37.pyc
│   │   └── sub.cpython-37.pyc
│   ├── add.py
│   ├── calc_main.py
│   ├── mul.py
│   └── sub.py

By putting __init__.py in this calc directory, you can treat the calc directory itself as one large module.

Without __init__.py, the calc directory is basically just a "directory". By adding __init__.py, the calc directory itself is regarded as a module.

By using packages, you can handle files in larger units and give them structure.

What is __init__.py?

__init__.py is the first thing executed when the package is loaded as a module.

__init__.py


print(__name__)

If you write this, the directory name is output when the package is imported.

This is because __init__.py makes the directory be treated as a "module".
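The order of execution can be confirmed with a throwaway package. In this sketch the package name init_demo_pkg is hypothetical, and both __init__.py and a submodule simply print their __name__.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

PACKAGE_NAME = "init_demo_pkg"  # hypothetical package name for this sketch

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / PACKAGE_NAME
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("print(__name__)\n")
    (package_dir / "add.py").write_text("print(__name__)\n")

    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        importlib.invalidate_caches()
        with contextlib.redirect_stdout(captured):
            # Importing a submodule first runs the package's __init__.py.
            importlib.import_module(f"{PACKAGE_NAME}.add")
    finally:
        sys.path.remove(tmp)
        for name in (PACKAGE_NAME, f"{PACKAGE_NAME}.add"):
            sys.modules.pop(name, None)

printed = captured.getvalue().splitlines()
print(printed)  # ['init_demo_pkg', 'init_demo_pkg.add']
```

The package name is printed before the submodule name, so __init__.py runs before anything else in the package.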

Try rewriting calc

Rewrite the calc directory so that it works as a package.

calc_main.py


print(__name__)

from . import add
from . import sub
from . import mul

print(add.add(2, 3))

Let's try it.

With this change, running the file in the usual way does not work.

python calc/calc_main.py
python calc_main.py

Both of these fail with

  File "calc_main.py", line 3, in <module>
    from . import add
ImportError: cannot import name 'add'

If you run

python -m calc.calc_main

instead, calc_main.py runs with calc as a package. Note that the module name given to -m has no .py extension.

As you can see, inside a package, imports are basically written as relative paths.

To see why

python calc/calc_main.py
python calc_main.py

do not work, rewrite calc_main.py as follows.

print(__name__)
print(__package__)

from . import add
from . import sub
from . import mul

print(add.add(2, 3))

python -m calc.calc_main

calc
__main__
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5

python calc/calc_main.py

__main__
None
Traceback (most recent call last):
  File "calc/calc_main.py", line 4, in <module>
    from . import add
ImportError: cannot import name 'add'

These are the results.

The difference between the two is __package__, which indicates which package (parent module) the module belongs to. In the run that fails, __package__ is None, so Python does not know which package to search and raises an error. In other words, even when you write ".", the question is "relative to which package?".

For that reason, if you use the calc package from main.py in the top directory,

main.py



from calc import calc_main

print(10)

the output is

calc
calc.calc_main
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5
10

and it is executed properly.

Therefore, relative imports work as long as the __package__ attribute is set properly.
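Both ways of running the file can be compared from one script by launching child Python processes. This is a sketch under assumptions: the package calc_demo and its files are created in a temporary directory just for the experiment, and only the return code, the printed __package__, and the exception type are inspected, because the exact error message differs between Python versions.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, mirroring calc/ from the article.
files = {
    "calc_demo/__init__.py": "",
    "calc_demo/add.py": "def add(x, y):\n    return x + y\n",
    "calc_demo/calc_main.py": (
        "print(repr(__package__))\n"
        "from . import add\n"
        "print(add.add(2, 3))\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for relative, text in files.items():
        target = Path(tmp) / relative
        target.parent.mkdir(exist_ok=True)
        target.write_text(text)

    def run(*args):
        # Run the current interpreter with the temporary directory as cwd.
        return subprocess.run(
            [sys.executable, *args], cwd=tmp, capture_output=True, text=True
        )

    as_script = run(str(Path("calc_demo") / "calc_main.py"))  # python calc_demo/calc_main.py
    as_module = run("-m", "calc_demo.calc_main")              # python -m calc_demo.calc_main

print(as_script.returncode != 0)           # True, the relative import fails
print(as_script.stdout.split())            # ['None']
print("ImportError" in as_script.stderr)   # True
print(as_module.returncode)                # 0
print(as_module.stdout.split())            # ["'calc_demo'", '5']
```

Run as a script, the file sees __package__ as None and the relative import fails. Run with -m, __package__ is the package name and the same import succeeds.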

Reference sites

[[Python] Import stumbling block](https://qiita.com/ysk24ok/items/2711295d83218c699276#package cannot use implicit-relative-import)
I've summarized the Python modules
Python Ideas: Differences Between Python Packages and Modules

From now on, I will run modules inside a package properly with the -m option.

In PyCharm?

What should you do when using PyCharm? Apparently PyCharm behaves differently from plain Python. The IDE seems to make helpful inferences of its own, so the behavior differs.

1. Use relative paths when loading modules

Use relative paths when loading modules within the same package.

calc_main.py


print(__name__)
print(__package__)

from . import add

print(add.add(2, 3))

2. When using the package, import as usual

When you use the package from the directory above it, nothing special is needed. Import it like a normal module.

main.py


from calc import calc_main

print(10)

3. A file inside a package cannot be run directly, so run it from the command line

PyCharm has run configurations, but when you try to execute a file inside a package, such as

calc_main.py


print(__name__)
print(__package__)

from . import add

print(add.add(2, 3))

__package__ is printed as None, and an error occurs because the package cannot be determined. In such a case, run

python -m calc.calc_main

from the command line.

Running it through a PyCharm run configuration doesn't work. It's annoying that testing a single file is difficult ... Is there a good way to do it ...

https://pleiades.io/help/pycharm/content-root.html
