Introduction

When you work with Python, it is hard to tell the difference between a module and a package. This is especially true in the PyCharm IDE, where imports can behave in puzzling ways.

So I would like to summarize my struggles with packages and modules here, so that I can look back on them later.

Working with modules

First, let's work with modules.

What is a module?

A module corresponds to a file in Python. In other words, if you create a file called `add.py`, it becomes a module called `add`.

By splitting functions across modules, you can avoid piling everything into a single bloated file.

Preparation

Prepare add.py, sub.py, and mul.py as follows, in that order.

print('add.py')
print(__name__)


def add(x, y):
    return x + y


if __name__ == '__main__':
    print(add(10, 20))

print('sub.py')
print(__name__)


def sub(x, y):
    return x - y


if __name__ == '__main__':
    print(sub(10, 20))

print('mul.py')
print(__name__)


def mul(x, y):
    return x * y


if __name__ == '__main__':
    print(mul(10, 20))

What is `__name__`?

add.py and the other files end with a block like this:

if __name__ == '__main__':
    print(mul(10, 20))

What is it for?

`__name__` holds the name the file was given when it was run.

For example, try running add.py directly. The output is:

add.py
__main__
30

This is because execution starts from add.py, and `__name__` is set to `__main__` in the file that is executed directly.

Keep in mind that `__name__` in the file where execution starts is always `__main__`.
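The two cases can be compared side by side. The following is a minimal sketch, not part of the original experiment: name_demo.py is a hypothetical stand-in for add.py, written to a temporary directory, then run once as a script and imported once as a module.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

# name_demo.py is a hypothetical stand-in for add.py.
# It remembers its own __name__ and prints it only when run directly.
SOURCE = (
    "seen_name = __name__\n"
    "if __name__ == '__main__':\n"
    "    print(seen_name)\n"
)


def name_when_run_and_imported():
    """Return (__name__ when run as a script, __name__ when imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp, "name_demo.py")
        script.write_text(SOURCE)

        # Case 1: the file is the execution starting point.
        run = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, check=True,
        )

        # Case 2: the file is imported as a module by another program.
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("name_demo")
            imported_name = module.seen_name
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("name_demo", None)
        return run.stdout.strip(), imported_name


as_script, as_module = name_when_run_and_imported()
print(as_script)  # __main__
print(as_module)  # name_demo
```

This is why the `if __name__ == '__main__':` block at the bottom of add.py only runs when add.py itself is the starting point.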

Try calling the module

Try importing add, sub, and mul, which sit in the same directory, from calc_main.py.

`calc_main.py`



import add

'''
add.py
add
30
'''
print(add.add(10, 20))

from sub import sub

'''
sub.py
sub
-10
'''
print(sub(20, 30))


from mul import *

'''
mul.py
mul
1500
'''
print(mul(30, 50))

As this shows, writing import followed by the name of a file in the same directory (without the .py extension) loads that file as a module.

Loading add

import add

When this runs, both print calls in add.py are executed. So top-level code such as print runs when the module is loaded. Also, `__name__` contains add: when a file is loaded as a module rather than being the execution starting point, its `__name__` is the file name without the extension.

Because the file was loaded with import followed by the file name, the function is used as add.add, through the namespace named after the file.

This notation is safe, because each module keeps its own namespace even if another file also defines an `add` function.
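The same protection can be seen with two standard library modules that happen to share a function name. This is only an illustration; math and cmath are not part of the calc example.

```python
import cmath
import math

# Both modules define a function called sqrt, but each one lives in
# its own module namespace, so neither hides the other.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

print(real_root)     # 2.0
print(complex_root)  # 2j
```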

Loading sub

from sub import sub

This imports the sub function from sub.py directly, so a single name can be pulled out of a module.

However, it does not load only the sub function. As with add.py,

'''
sub.py
sub
'''

is output, so the entire sub.py file is executed at the time of from sub.

Loading mul

from mul import *

This pulls every public name in the mul file into the namespace of calc_main.py at once.

As a rule, **it is better not to use this form**. The reason is that it causes unexpected behavior, such as a name being silently overwritten when another module uses the same name.
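The overwriting problem can be reproduced with the standard library. In this sketch the two star imports are run with exec inside a scratch dictionary (an assumption made only so the result can be inspected); in a real file they would simply be two lines at the top of the module.

```python
import cmath
import math

# Run the two star imports inside a scratch namespace (a plain dict) so the
# effect can be inspected without touching this file's own names.
scratch = {}
exec("from math import *", scratch)
first_sqrt = scratch["sqrt"]

exec("from cmath import *", scratch)
second_sqrt = scratch["sqrt"]

print(first_sqrt is math.sqrt)    # True
print(second_sqrt is cmath.sqrt)  # True: math's sqrt was silently replaced
print(second_sqrt(4))             # (2+0j)
```

No error or warning is raised when the second import replaces sqrt, which is what makes this form risky.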

Summary of loading within the same directory

import followed by the file name is probably the most convenient form.

Try loading a module multiple times

Next, try loading the same file more than once. I am curious how Python behaves when the same file is imported many times.

Directory structure

A simple layout with files a, b, and c seems the easiest way to try this.

Try the following.

.
├── __pycache__
├── abc
│   ├── a.py
│   ├── ab.py
│   ├── abc_main.py
│   ├── b.py
│   ├── bc.py
│   ├── c.py
│   └── ca.py
├── calc
│   ├── __pycache__
│   │   ├── add.cpython-37.pyc
│   │   ├── mul.cpython-37.pyc
│   │   └── sub.cpython-37.pyc
│   ├── add.py
│   ├── calc_main.py
│   ├── mul.py
│   └── sub.py
└── main.py

The abc directory has been added. ab.py loads the a and b files as modules, and bc.py and ca.py do the same for their own pairs.

Preparation

print('a file')

aa = 1

print('b file')

bb = 2

print('c file')

cc = 3

The experiment starts with a.py, b.py, and c.py as shown above, in that order.

Preparation 2

Next, load the corresponding files from ab.py and the others.

`ab.py`


import a
import b


print('ab file')

`bc.py`


import b
import c

print('bc file')

`ca.py`


import c
import a

print('ca file')

`abc_main.py`


import ab
import bc
import ca

print('abc main file')

abc_main.py

When I run abc_main.py, it looks like this:

a file
b file
ab file
c file
bc file
ca file
abc main file

The a and b files are loaded first. Next, bc.py imports b and c, so you might expect b file to be output again, but it is skipped and only c file is output.

Apparently a module is not loaded again once it has been loaded.

Importing ca prints nothing for c or a, because both have already been loaded. In other words, once `a file` and the like have been loaded, they are not loaded again, so each message is output only once.

So a file is basically loaded only **once**.
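The load-once behavior comes from Python keeping loaded modules in sys.modules. The following is a small sketch under that assumption: once_demo.py is a hypothetical stand-in for a.py, created in a temporary directory and imported twice while its output is captured.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def count_top_level_runs():
    """Import the same module twice and count how often its body runs."""
    with tempfile.TemporaryDirectory() as tmp:
        # once_demo.py is a hypothetical stand-in for a.py.
        Path(tmp, "once_demo.py").write_text("print('once_demo file')\n")
        sys.path.insert(0, tmp)
        captured = io.StringIO()
        try:
            importlib.invalidate_caches()
            with contextlib.redirect_stdout(captured):
                importlib.import_module("once_demo")  # runs the module body
                cached = "once_demo" in sys.modules
                importlib.import_module("once_demo")  # served from sys.modules
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("once_demo", None)
    return captured.getvalue().count("once_demo file"), cached


runs, was_cached = count_top_level_runs()
print(runs)        # 1
print(was_cached)  # True
```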

Try using the module one level below

So far, I have been using modules in the same directory, as in abc_main and calc_main.

Then, is it possible to use a module one directory down, as main.py would need to?

main.py

`main.py`


from calc import add

'''
add.py
calc.add
5
'''
print(add.add(2, 3))

As shown above, the add module is imported from the calc directory. This code actually works.

In other words, since Python 3.3 you can use modules in a subdirectory even without an `__init__.py`. Also, if you look at `__name__`,

calc.add

is output. So the file is loaded as the add module inside the calc namespace. This is the feature called namespace packages, available since Python 3.3 (PEP 420).

However, it does not work well if you do the following.

`main.py`



'''
    from calc import add.add
                        ^
SyntaxError: invalid syntax
'''
from calc import add.add
print(add(2, 3))



'''
AttributeError: module 'calc' has no attribute 'add'
'''
import calc
print(calc.add.add(2, 3))

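Apparently, in a from ... import statement, the name written after import cannot be **narrowed down further** with a "."; a dotted path is only allowed in the from part.

Also, **importing only the directory** does not make its modules available: import calc itself succeeds here, but calc.add is not loaded until it is imported explicitly, which is why the AttributeError appears.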
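The second failure can be checked in isolation. This is a sketch under stated assumptions: nsdemo is a hypothetical directory standing in for calc (with no `__init__.py`), built in a temporary location, and importlib.import_module is used in place of plain import statements so the steps can be wrapped in a function.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def inspect_directory_import():
    """Show that importing a directory does not load the modules inside it."""
    with tempfile.TemporaryDirectory() as tmp:
        # nsdemo/ is a hypothetical stand-in for calc/ (no __init__.py).
        pkg_dir = Path(tmp, "nsdemo")
        pkg_dir.mkdir()
        Path(pkg_dir, "add.py").write_text("def add(x, y):\n    return x + y\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            package = importlib.import_module("nsdemo")     # like: import nsdemo
            before = hasattr(package, "add")
            module = importlib.import_module("nsdemo.add")  # like: import nsdemo.add
            after = hasattr(package, "add")
            return before, after, module.__name__, module.add(2, 3)
        finally:
            sys.path.remove(tmp)
            for name in ("nsdemo.add", "nsdemo"):
                sys.modules.pop(name, None)


before, after, module_name, total = inspect_directory_import()
print(before)       # False: the directory was imported, but add was not loaded
print(after)        # True: add is attached once it is imported explicitly
print(module_name)  # nsdemo.add
print(total)        # 5
```

So either from calc import add or import calc.add is needed before calc.add.add can be called.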

Package

What is a package?

A package is a **module that bundles other modules**. For example, I created the calc directory earlier.

.
├── calc
│   ├── __pycache__
│   │   ├── add.cpython-37.pyc
│   │   ├── mul.cpython-37.pyc
│   │   └── sub.cpython-37.pyc
│   ├── add.py
│   ├── calc_main.py
│   ├── mul.py
│   └── sub.py

By putting **`__init__.py`** in this calc directory, you can treat the calc directory itself as one **large module**.

Without `__init__.py`, the calc directory is only a plain directory (at most an implicit namespace package, as seen above). Putting `__init__.py` in it makes the calc directory itself a regular package, which Python treats as a **module**.

By using packages, you can handle files in larger units and give them structure.

What about `__init__.py`?

`__init__.py` is the first thing executed when the package is imported.

`__init__.py`


print(__name__)

With this in place, the directory name is output when the package is imported.

This is because `__init__.py` makes the directory be treated as a module.
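The execution order can be confirmed with a throwaway package. In this sketch, initdemo is a hypothetical package standing in for calc; both its `__init__.py` and its add.py print `__name__`, and the output is captured while the submodule is imported.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def import_order():
    """Return the names printed while importing a module from a package."""
    with tempfile.TemporaryDirectory() as tmp:
        # initdemo/ is a hypothetical stand-in for the calc package.
        pkg_dir = Path(tmp, "initdemo")
        pkg_dir.mkdir()
        Path(pkg_dir, "__init__.py").write_text("print(__name__)\n")
        Path(pkg_dir, "add.py").write_text("print(__name__)\n")
        sys.path.insert(0, tmp)
        captured = io.StringIO()
        try:
            importlib.invalidate_caches()
            with contextlib.redirect_stdout(captured):
                # Equivalent to: from initdemo import add
                importlib.import_module("initdemo.add")
        finally:
            sys.path.remove(tmp)
            for name in ("initdemo.add", "initdemo"):
                sys.modules.pop(name, None)
    return captured.getvalue().split()


order = import_order()
print(order)  # ['initdemo', 'initdemo.add']
```

The package name comes first, which matches the extra calc line at the top of the outputs shown in the following sections.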

Try rewriting calc

Rewrite the files in the calc directory so that calc works as a package.

`calc_main.py`


print(__name__)

from . import add
from . import sub
from . import mul

print(add.add(2, 3))

Let's try it.

At this point, running the file with python in the usual way does not work.

python calc/calc_main.py
python calc_main.py

Both of these fail with

  File "calc_main.py", line 3, in <module>
    from . import add
ImportError: cannot import name 'add'

Running it as

python -m calc.calc_main

from the directory that contains calc runs calc_main.py with calc as a package.

As you can see, modules inside a package are basically specified by relative path.

To see why

python calc/calc_main.py
python calc_main.py

do not work, rewrite calc_main.py as follows.

print(__name__)
print(__package__)

from . import add
from . import sub
from . import mul

print(add.add(2, 3))

python -m calc.calc_main

calc
__main__
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5

python calc/calc_main.py

__main__
None
Traceback (most recent call last):
  File "calc/calc_main.py", line 4, in <module>
    from . import add
ImportError: cannot import name 'add'

This is the result.

The difference between the two is `__package__`, which indicates which package (parent module) the module belongs to. In the run that fails, `__package__` is None, so Python does not know which package to search and raises an error. In other words, even when you write ".", the question is "relative to which package?".

For that reason, using the calc package from main.py in the top-level directory looks like this:

`main.py`



from calc import calc_main

print(10)

calc
calc.calc_main
calc
add.py
calc.add
sub.py
calc.sub
mul.py
calc.mul
5
10

This runs properly.

So relative imports work as long as the `__package__` attribute is set correctly.
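The two ways of launching the file can be compared automatically. This is a sketch under stated assumptions: pkgdemo is a hypothetical package standing in for calc, created in a temporary directory, and each launch is done in a child process with subprocess. The exact wording of the ImportError differs between Python versions, so only the exit status and the printed `__package__` are examined.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Same shape as calc_main.py: print __package__, then do a relative import.
MAIN_SOURCE = (
    "print(__package__)\n"
    "from . import add\n"
    "print(add.add(2, 3))\n"
)


def run_both_ways():
    """Run the module as a plain script and with -m; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        # pkgdemo/ is a hypothetical stand-in for the calc package.
        pkg_dir = Path(tmp, "pkgdemo")
        pkg_dir.mkdir()
        Path(pkg_dir, "__init__.py").write_text("")
        Path(pkg_dir, "add.py").write_text("def add(x, y):\n    return x + y\n")
        Path(pkg_dir, "calc_main.py").write_text(MAIN_SOURCE)

        def run(*args):
            # cwd=tmp plays the role of the project root.
            return subprocess.run(
                [sys.executable, *args],
                cwd=tmp, capture_output=True, text=True,
            )

        script_result = run(str(Path("pkgdemo", "calc_main.py")))
        module_result = run("-m", "pkgdemo.calc_main")
    return script_result, module_result


script_result, module_result = run_both_ways()
print(script_result.stdout.split())   # ['None']
print(script_result.returncode != 0)  # True: the relative import failed
print(module_result.stdout.split())   # ['pkgdemo', '5']
```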

Reference sites

[[Python] Import stumbling block](https://qiita.com/ysk24ok/items/2711295d83218c699276#package cannot use implicit-relative-import)
I've summarized the Python modules
Python Ideas: Differences Between Python Packages and Modules

The takeaway: when running a module inside a package, specify it properly with the -m option.

In PyCharm?

What should you do when using PyCharm? PyCharm seems to behave differently from plain Python: the IDE makes its own inferences about the project, so imports do not always resolve the same way as on the command line.

1. Use relative paths when loading modules

Within a package, load sibling modules with relative paths.

`calc_main.py`


print(__name__)
print(__package__)

from . import add

print(add.add(2, 3))

2. When using the package, write as usual

When you use the package from the level above, nothing special is needed. Import it like a normal module.

`main.py`


from calc import calc_main

print(10)

3. A file inside the package cannot be run directly, so run it from the command line

PyCharm can launch a file through a run configuration. However, when you try to run a file inside a package this way, such as

`calc_main.py`


print(__name__)
print(__package__)

from . import add

print(add.add(2, 3))

`__package__` is output as None, and an error occurs because the package cannot be determined. In that case, run

python -m calc.calc_main

from the command line.

Launching the file through a PyCharm run configuration does not work. It is annoying that it is hard to test a single file on its own ... Is there a good way to do this ...?

https://pleiades.io/help/pycharm/content-root.html
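One possible workaround, offered only as a sketch and not verified inside PyCharm, is a small launcher script at the project root that runs the module the way -m does, using the standard library runpy module. The helper run_package_module and the launchdemo package below are hypothetical names invented for this illustration; launchdemo stands in for calc.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path


def run_package_module(project_root, module_name):
    """Run module_name as __main__, similar to `python -m module_name`."""
    sys.path.insert(0, str(project_root))
    try:
        # run_module returns the globals of the executed module.
        return runpy.run_module(module_name, run_name="__main__")
    finally:
        sys.path.remove(str(project_root))


with tempfile.TemporaryDirectory() as tmp:
    # launchdemo/ is a hypothetical stand-in for the calc package.
    pkg_dir = Path(tmp, "launchdemo")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "add.py").write_text("def add(x, y):\n    return x + y\n")
    Path(pkg_dir, "calc_main.py").write_text(
        "from . import add\nresult = add.add(2, 3)\n"
    )
    importlib.invalidate_caches()
    module_globals = run_package_module(tmp, "launchdemo.calc_main")
    for name in ("launchdemo.add", "launchdemo"):
        sys.modules.pop(name, None)

print(module_globals["__name__"])     # __main__
print(module_globals["__package__"])  # launchdemo
print(module_globals["result"])       # 5
```

In a real project, the launcher would be an ordinary file next to the calc directory (for example a hypothetical run_calc_main.py) that calls runpy.run_module("calc.calc_main", run_name="__main__"), so that an IDE can start it like any other script while `__package__` is still set.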

Understand Python packages and modules
