When you work with Python, it is hard to tell the difference between a module and a package. This is especially true in the PyCharm IDE, which sometimes behaves in unexpected ways.
So I want to summarize my struggle with packages and modules here, so that I can look back on it later.
First, let's look at modules.
In Python, a module corresponds to a file.
In other words, if you create a file called `add.py`, it becomes a module called `add`.
By splitting functions across modules, you can avoid cramming everything into a single, bloated file.
For now, we use the following directory structure.
The directory Python_Tutorial contains the following:
.
├── __pycache__
├── calc
│ ├── add.py
│ ├── calc_main.py
│ ├── mul.py
│ └── sub.py
└── main.py
Write the modules in the calc directory, then try calling the other calculation modules from calc_main.py. After that, try using them from main.py, directly under Python_Tutorial.
First, prepare add.py, sub.py, and mul.py.
print('add.py')
print(__name__)
def add(x, y):
    return x + y
if __name__ == '__main__':
    print(add(10, 20))

print('sub.py')
print(__name__)
def sub(x, y):
    return x - y
if __name__ == '__main__':
    print(sub(10, 20))

print('mul.py')
print(__name__)
def mul(x, y):
    return x * y
if __name__ == '__main__':
    print(mul(10, 20))
add.py and the other files each end with a block like this:
if __name__ == '__main__':
    print(mul(10, 20))
What is this for?
`__name__` holds the name under which the file was loaded when it ran.
For example, try running add.py directly:
add.py
__main__
30
This is the output.
This is because execution started from add.py, and `__name__` is set to `__main__` in the file where execution starts.
Keep in mind that `__name__` in the starting file is always `__main__`.
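The behavior of `__name__` can be checked with a small throwaway script. The following is a minimal sketch, not part of the original tutorial files: the module name `name_demo` is hypothetical, and it assumes Python 3.7 or later (for `capture_output`). It writes a one-line module to a temporary directory, then runs it once as a script and once through an import.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module used only for this demonstration.
SOURCE = "print(__name__)\n"

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "name_demo.py").write_text(SOURCE)

    # Case 1: the file is the starting point of execution.
    as_script = subprocess.run(
        [sys.executable, "name_demo.py"],
        cwd=tmp, capture_output=True, text=True, check=True,
    ).stdout.strip()

    # Case 2: the file is imported by other code.
    as_module = subprocess.run(
        [sys.executable, "-c", "import name_demo"],
        cwd=tmp, capture_output=True, text=True, check=True,
    ).stdout.strip()

print(as_script)  # __main__
print(as_module)  # name_demo
```

The same file reports two different names depending on how it was started, which is exactly what the `if __name__ == '__main__':` guard relies on.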
Try importing add, sub, and mul, which sit in the same directory, from calc_main.py.
calc_main.py
import add
'''
add.py
add
30
'''
print(add.add(10, 20))
from sub import sub
'''
sub.py
sub
-10
'''
print(sub(20, 30))
from mul import *
'''
mul.py
mul
1500
'''
print(mul(30, 50))
Apparently, writing
import <name of a file in the same directory>
lets you load another file as a module.
import add
When you do this, both print calls in add.py are executed. So top-level code such as print runs when the module is loaded. Also, `__name__` contains add. In other words, when a file is loaded as a module rather than run as the starting point, its `__name__` is simply the file name without .py.
Because the file is loaded with `import <file name>`, its function is used as add.add, inside the namespace named after the file.
This notation is safe because the namespaces stay separate even if another file also defines an `add` function!
from sub import sub
When you do this, the sub function of sub.py is imported directly. So it is possible to pull a single name out of a module.
However, Python does not load only the sub function.
Just as with add.py,
'''
sub.py
sub
'''
is output, so the entire sub file is executed at the time of `from sub import sub`.
from mul import *
This pulls every public name of the mul file into the namespace of calc_main.py at once.
Basically, **it seems better not to use this**. The reason is that it causes unexpected behavior, such as a name being silently overwritten when another module uses the same name.
import <file name>
is probably the most convenient form.
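To make the overwriting problem concrete, here is a small sketch using two standard-library modules instead of the tutorial's files, since `math` and `cmath` both define `sqrt`. Each `exec` call stands in for a star import written at the top of a file, and the dictionary `namespace` (a name chosen for this example) plays the role of that file's global namespace.

```python
import cmath
import math

# `namespace` stands in for the global namespace of the importing file.
namespace = {}

# First star import: sqrt now refers to math.sqrt.
exec("from math import *", namespace)
sqrt_after_math = namespace["sqrt"]

# Second star import: the same name is silently replaced by cmath.sqrt.
exec("from cmath import *", namespace)
sqrt_after_cmath = namespace["sqrt"]

print(sqrt_after_math(4))   # 2.0
print(sqrt_after_cmath(4))  # (2+0j)
```

With `import math` and `import cmath`, both functions stay reachable as `math.sqrt` and `cmath.sqrt`, so nothing is lost.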
Next, try loading the same module multiple times. I'm curious how Python behaves when the same file is imported many times.
A simple layout with files a, b, and c makes this easy to test.
Try the following.
.
├── __pycache__
├── abc
│ ├── a.py
│ ├── ab.py
│ ├── abc_main.py
│ ├── b.py
│ ├── bc.py
│ ├── c.py
│ └── ca.py
├── calc
│ ├── __pycache__
│ │ ├── add.cpython-37.pyc
│ │ ├── mul.cpython-37.pyc
│ │ └── sub.cpython-37.pyc
│ ├── add.py
│ ├── calc_main.py
│ ├── mul.py
│ └── sub.py
└── main.py
I added an abc directory. For example, ab.py loads the a and b files as modules.
print('a file')
aa = 1
print('b file')
bb = 2
print('c file')
cc = 3
The experiment starts from a.py, b.py, and c.py as shown above.
Next, import the corresponding files from ab.py, bc.py, and ca.py.
ab.py
import a
import b
print('ab file')
bc.py
import b
import c
print('bc file')
ca.py
import c
import a
print('ca file')
abc_main.py
import ab
import bc
import ca
print('abc main file')
When I run abc_main.py, the output looks like this:
a file
b file
ab file
c file
bc file
ca file
abc main file
The a and b files are loaded first. Next, bc.py imports b and c, so you might expect `b file` to be printed again, but it is skipped and only `c file` is printed.
Apparently a module is not loaded again once it has been loaded.
Importing ca does not raise an error even though c and a have already been loaded. Each line such as `a file` is printed only once, no matter how many times the module is imported.
In short, a file is basically loaded only **once**.
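The load-once behavior comes from the module cache `sys.modules`. The sketch below is an illustration under assumptions: the module name `once_demo_mod` is made up, and the file is written to a temporary directory that is briefly added to `sys.path`. It imports the same module twice and counts how many times its top-level print ran.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module whose top-level code prints one line.
    Path(tmp, "once_demo_mod.py").write_text("print('once_demo_mod loaded')\n")

    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            first = importlib.import_module("once_demo_mod")
            second = importlib.import_module("once_demo_mod")
    finally:
        sys.path.remove(tmp)

load_count = captured.getvalue().count("once_demo_mod loaded")
same_object = first is second
is_cached = "once_demo_mod" in sys.modules

print(load_count)   # 1
print(same_object)  # True
print(is_cached)    # True
```

The second import finds the entry in `sys.modules` and returns the existing module object without running the file again.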
So far, I have only used modules in the same directory, as in abc_main and calc_main.
Then, is it possible to use a module in a subdirectory, as main.py would need to?
main.py
from calc import add
'''
add.py
calc.add
5
'''
print(add.add(2, 3))
As shown above, the add module is imported from the calc directory. This code actually works.
In other words, since Python 3.3, you can import modules from a directory even without `__init__.py`. Also, if you look at `__name__`,
calc.add
is output.
So the module is loaded as the add module inside the calc namespace.
This is a feature called namespace packages (PEP 420), available since Python 3.3.
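The following sketch reproduces this with a throwaway directory. The names `nsdemo_calc` and `add` are chosen only for illustration, and the directory deliberately contains no `__init__.py`. It checks the module's `__name__` and shows that the directory-level package has no file of its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "nsdemo_calc")
    pkg_dir.mkdir()
    # Note: no __init__.py is created, so this is a namespace package.
    Path(pkg_dir, "add.py").write_text("def add(x, y):\n    return x + y\n")

    sys.path.insert(0, tmp)
    try:
        add = importlib.import_module("nsdemo_calc.add")
        package = sys.modules["nsdemo_calc"]
        module_name = add.__name__
        # A namespace package has no __init__.py, hence no file of its own.
        package_file = getattr(package, "__file__", None)
        result = add.add(2, 3)
    finally:
        sys.path.remove(tmp)

print(module_name)   # nsdemo_calc.add
print(package_file)  # None
print(result)        # 5
```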
However, it does not work well if you do the following.
main.py
'''
from calc import add.add
^
SyntaxError: invalid syntax
'''
from calc import add.add
print(add(2, 3))
'''
AttributeError: module 'calc' has no attribute 'add'
'''
import calc
print(calc.add.add(2, 3))
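Apparently, you are not allowed to **narrow the target down further** with "." in the names written after import.
Also, **importing only the directory does not load the modules inside it**: `import calc` itself succeeds, but calc.add does not exist until the add module is imported explicitly.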
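The AttributeError above can be reproduced in isolation. This is a sketch with a hypothetical package name, `attrdemo_calc`, created in a temporary directory. It checks whether the package object has an `add` attribute before and after the submodule is imported explicitly.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "attrdemo_calc")
    pkg_dir.mkdir()
    Path(pkg_dir, "add.py").write_text("def add(x, y):\n    return x + y\n")

    sys.path.insert(0, tmp)
    try:
        # Equivalent to `import attrdemo_calc`: the directory itself is imported.
        package = importlib.import_module("attrdemo_calc")
        has_add_before = hasattr(package, "add")

        # Equivalent to `import attrdemo_calc.add`: the submodule is loaded
        # and bound as an attribute of the package.
        importlib.import_module("attrdemo_calc.add")
        has_add_after = hasattr(package, "add")
    finally:
        sys.path.remove(tmp)

print(has_add_before)  # False
print(has_add_after)   # True
```

So writing `import calc.add` (or `from calc import add`) first is what makes `calc.add.add(2, 3)` usable.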
A package is a "**module** that bundles and manages other modules". For example, I created the calc directory earlier.
.
├── calc
│ ├── __pycache__
│ │ ├── add.cpython-37.pyc
│ │ ├── mul.cpython-37.pyc
│ │ └── sub.cpython-37.pyc
│ ├── add.py
│ ├── calc_main.py
│ ├── mul.py
│ └── sub.py
By putting **`__init__.py`** in this calc directory, you can treat the calc directory itself as one **large module**.
Without `__init__.py`, the calc directory is just a "directory".
So, by adding `__init__.py`, the calc directory itself is treated as a **module**.
By using packages, you can handle files in larger units and give them structure.
What is `__init__.py`?
`__init__.py` is the first thing executed when the package is imported.
__init__.py
print(__name__)
If you write this in `__init__.py`, the directory name is printed.
This is because `__init__.py` makes the directory itself be treated as a "module".
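The execution order can be confirmed with a throwaway package. In this sketch the package name `initdemo_calc` and the printed labels are assumptions made for the example. Importing a submodule runs the package's `__init__.py` first, and only then the submodule itself.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "initdemo_calc")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("print('init:', __name__)\n")
    Path(pkg_dir, "add.py").write_text("print('module:', __name__)\n")

    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            # Equivalent to `from initdemo_calc import add`.
            importlib.import_module("initdemo_calc.add")
    finally:
        sys.path.remove(tmp)

lines = captured.getvalue().splitlines()
print(lines)  # ['init: initdemo_calc', 'module: initdemo_calc.add']
```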
Try rewriting the calc directory as a package.
calc_main.py
print(__name__)
from . import add
from . import sub
from . import mul
print(add.add(2, 3))
Let's try running it.
At this point, running it with plain python does not work.
python calc/calc_main.py
python calc_main.py
Either way, you get an error like this:
  File "calc_main.py", line 3, in <module>
    from . import add
ImportError: cannot import name 'add'
python -m calc.calc_main
With this command, you can run calc_main.py with calc as a package. Note that -m takes a module name, so there is no .py at the end.
As you can see, inside a package, modules are basically specified by relative path.
To see why
python calc/calc_main.py
python calc_main.py
do not work, rewrite calc_main.py as follows.
print(__name__)
print(__package__)
from . import add
from . import sub
from . import mul
print(add.add(2, 3))
python -m calc.calc_main
calc
__main__
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5
python calc/calc_main.py
__main__
None
Traceback (most recent call last):
  File "calc/calc_main.py", line 4, in <module>
    from . import add
ImportError: cannot import name 'add'
is the result.
The difference between the two runs is `__package__`, which indicates which package (parent module) the module belongs to. In the run that fails, `__package__` is None, so Python cannot tell which package to search, and the import fails. In other words, "." always means "relative to which package?", and that question needs an answer.
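The two ways of launching the file can be compared side by side. The following is a minimal sketch under assumptions: the package name `reldemo_calc` is hypothetical, the files mirror calc_main.py and add.py from this article, and Python 3.7 or later is assumed (for `capture_output`). The exact wording of the error message differs between Python versions, so only the exception type is checked.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

MAIN_SOURCE = (
    "print(__name__)\n"
    "print(__package__)\n"
    "from . import add\n"
    "print(add.add(2, 3))\n"
)
ADD_SOURCE = "def add(x, y):\n    return x + y\n"

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "reldemo_calc")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "add.py").write_text(ADD_SOURCE)
    Path(pkg_dir, "calc_main.py").write_text(MAIN_SOURCE)

    # Launched as a module: Python knows the parent package.
    as_module = subprocess.run(
        [sys.executable, "-m", "reldemo_calc.calc_main"],
        cwd=tmp, capture_output=True, text=True,
    )
    # Launched as a plain script: there is no parent package.
    as_script = subprocess.run(
        [sys.executable, str(Path("reldemo_calc", "calc_main.py"))],
        cwd=tmp, capture_output=True, text=True,
    )

print(as_module.returncode)               # 0
print(as_module.stdout.split())           # ['__main__', 'reldemo_calc', '5']
print(as_script.returncode != 0)          # True
print("ImportError" in as_script.stderr)  # True
```

In both runs `__name__` is `__main__`. Only the `-m` run has a package name in `__package__`, which is what lets the relative import succeed.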
For the same reason, it works if you use the calc package from main.py in the top-level directory:
main.py
from calc import calc_main
print(10)
calc
calc.calc_main
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5
10
This runs properly.
So relative imports work as long as the `__package__` attribute is set properly.
Reference sites:
[[Python] Import stumbling block](https://qiita.com/ysk24ok/items/2711295d83218c699276#package cannot use implicit-relative-import)
I've summarized the Python modules
Python Ideas: Differences Between Python Packages and Modules
Conclusion: run a module inside a package with the -m option so that its package is specified properly.
What should you do when using PyCharm? Apparently PyCharm behaves differently from plain Python. The IDE seems to infer things on its own and behave differently as a result.
Within a package, use relative paths when loading other modules of the same package.
calc_main.py
print(__name__)
print(__package__)
from . import add
print(add.add(2, 3))
When you use the package from outside (from the directory above it), nothing special is needed. Import it like a normal module.
main.py
from calc import calc_main
print(10)
PyCharm has run configurations. When you use one to execute a file inside a package, such as
calc_main.py
print(__name__)
print(__package__)
from . import add
print(add.add(2, 3))
`__package__` is printed as None, and an error occurs because the package cannot be determined. In such a case, run it from the command line instead:
python -m calc.calc_main
Running it through a PyCharm run configuration does not work. It is annoying that testing a single file is so difficult... Is there a good way to do this?
https://pleiades.io/help/pycharm/content-root.html