[Python3 / MongoDB] Lightly summarize pymongo processing calls

Overview

As the title says.

I recently started using MongoDB, but the information I needed was scattered across various sites, so this article aims to collect the calls I use most often in one place (`・ω・´).

Versions

$ python -V
Python 3.7.1

$ pip list | grep pymongo
pymongo                  3.9.0

Official tutorial

https://api.mongodb.com/python/current/tutorial.html You don't even need my article if you read this ... but it is in English.

Module import

from pymongo import MongoClient, DESCENDING, ASCENDING

The rest of this article assumes that the import statement above has already been executed.

Instance creation

# This instance is used throughout the rest of the article
>>> m_client = MongoClient()

Listing the available attributes and methods dumps out a lot at once:

>>> dir(m_client)
['HOST', 'PORT', '_BaseObject__codec_options', '_BaseObject__read_concern', '_BaseObject__read_preference', '_BaseObject__write_concern', '_MongoClient__all_credentials', '_MongoClient__cursor_manager', '_MongoClient__default_database_name', '_MongoClient__index_cache', '_MongoClient__index_cache_lock', '_MongoClient__kill_cursors_queue', '_MongoClient__lock', '_MongoClient__options', '_MongoClient__start_session', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_cache_credentials', '_cache_index', '_cached', '_close_cursor', '_close_cursor_now', '_constructor_args', '_database_default_options', '_encrypter', '_end_sessions', '_ensure_session', '_event_listeners', '_get_server_session', '_get_socket', '_get_topology', '_is_writable', '_kill_cursors', '_kill_cursors_executor', '_process_periodic_tasks', '_process_response', '_purge_credentials', '_purge_index', '_read_preference_for', '_repr_helper', '_reset_server', '_reset_server_and_request_check', '_retry_with_session', '_retryable_read', '_retryable_write', '_return_server_session', '_run_operation_with_response', '_select_server', '_send_cluster_time', '_server_property', '_slaveok_for_server', '_socket_for_reads', '_socket_for_writes', '_tmp_session', '_topology', '_topology_settings', '_write_concern_for', 'address', 'arbiters', 'close', 'close_cursor', 'codec_options', 'database_names', 'drop_database', 'event_listeners', 'fsync', 'get_database', 'get_default_database', 'is_locked', 'is_mongos', 'is_primary', 'kill_cursors', 'list_database_names', 'list_databases', 'local_threshold_ms', 'max_bson_size', 'max_idle_time_ms', 
'max_message_size', 'max_pool_size', 'max_write_batch_size', 'min_pool_size', 'next', 'nodes', 'primary', 'read_concern', 'read_preference', 'retry_reads', 'retry_writes', 'secondaries', 'server_info', 'server_selection_timeout', 'set_cursor_manager', 'start_session', 'unlock', 'watch', 'write_concern']

1. Minimal basic processing

Enumerate DB names

>>> m_client.list_database_names()
['admin', 'candles', 'config', 'local'] # the DBs that were in my MongoDB

# database_names() returns the same result (it is deprecated in favor of list_database_names())
>>> m_client.database_names()
['admin', 'candles', 'config', 'local']

Create DB

Surprisingly, this is **unnecessary**! Even if the specified DB does not exist, it is created automatically as soon as you insert data into it. Convenient, but a little scary ...?

Create Collection

This is also **unnecessary**! In MongoDB, a DB contains containers called Collections, and records are inserted into them. If the Collection does not exist when a record is inserted, it is created automatically.

  • Strictly speaking, what you insert into MongoDB is called a **document**, not a record. In this article I use "record" because it is the familiar name for a unit that is inserted one at a time.

This is also convenient but a little scary. (Even if you mistype the Collection name, you may not get an error.)
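Because a typo silently creates a new DB or Collection, a small guard can help. The following is only a rough sketch: `require_collection` is a helper name I made up, and it assumes the database object offers `list_collection_names()` (as recent pymongo 3.x releases do).

```python
def require_collection(client, db_name, collection_name):
    """Return client[db_name][collection_name], but only if both already exist.

    `client` is expected to be a pymongo MongoClient. The helper name is
    hypothetical and not part of pymongo.
    """
    if db_name not in client.list_database_names():
        raise LookupError("no such DB: {}".format(db_name))
    db = client[db_name]
    if collection_name not in db.list_collection_names():
        raise LookupError(
            "no such Collection: {}.{}".format(db_name, collection_name)
        )
    return db[collection_name]
```

With the client from this article it would be called as `require_collection(m_client, 'db_name', 'collection_name')`, and a mistyped name raises `LookupError` instead of quietly creating something new.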

Insert record

The call is written as instance.db_name.collection_name.insert_one({'hoge': 'hoge'}).


>>> m_client.db_name.collection_name.insert_one({'hoge': 'Hoge 1'})
<pymongo.results.InsertOneResult object at 0x7fb567329448>

# There is also an insert method, but it raises a deprecation warning
>>> m_client.db_name.collection_name.insert({'hoge': 'Hoge 2'})
__main__:1: DeprecationWarning: insert is deprecated. Use insert_one or insert_many instead.
ObjectId('5dec7ab6f8f8434dbcab979a')

You have now inserted two records.
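The `InsertOneResult` object printed above is not very informative by itself. As a small sketch, its `inserted_id` attribute gives you the `_id` of the new record; `insert_and_get_id` is just an illustrative helper name.

```python
def insert_and_get_id(collection, document):
    """Insert one document and return the _id that was stored for it."""
    result = collection.insert_one(document)
    # InsertOneResult exposes the new document's _id as `inserted_id`.
    return result.inserted_id
```

For example, `insert_and_get_id(m_client.db_name.collection_name, {'hoge': 'Hoge 1'})` would hand back the generated `ObjectId`.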

Select *

>>> m_client.db_name.collection_name.find()
<pymongo.cursor.Cursor object at 0x7fb56730ddd8> # An opaque object again...

# Wrap the cursor in list() to see the contents.
>>> list(m_client.db_name.collection_name.find())
[
    {'_id': ObjectId('5dec7ab6f8f8434dbcab979a'), 'hoge': 'Hoge 1'},
    {'_id': ObjectId('5dec7ae3f8f8434dbcab979b'), 'hoge': 'Hoge 2'}
]

These are the two hoge records I just inserted.
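Wrapping the cursor in `list()` is fine for a handful of records, but a cursor can also be looped over directly. A minimal sketch, with `hoge_values` as a made-up helper name:

```python
def hoge_values(collection):
    """Yield the 'hoge' value of every document, one at a time."""
    # find() returns a cursor; a for loop pulls documents from it lazily,
    # so the whole result never has to sit in one list.
    for document in collection.find():
        # .get() returns None for documents that have no 'hoge' field.
        yield document.get('hoge')
```

Usage would look like `for value in hoge_values(m_client.db_name.collection_name): print(value)`.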

Select (one record only)

>>> m_client.db_name.collection_name.find_one()
{'_id': ObjectId('5dec7ab6f8f8434dbcab979a'), 'hoge': 'Hoge 1'}

Select by specifying conditions (select * where ~)

>>> list(m_client.db_name.collection_name.find({'hoge': 'Hoge 2'}))
[
    {'_id': ObjectId('5dec7ab6f8f8434dbcab979a'), 'hoge': 'Hoge 2'}
]
# If other records also have 'Hoge 2' in the 'hoge' column, all of them are returned


# With find_one, only one record is returned
>>> m_client.db_name.collection_name.find_one({'hoge': 'Hoge'})
{'_id': ObjectId('5dec7ae3f8f8434dbcab979b'), 'hoge': 'Hoge'}

Delete

# Only the first record that matches the condition is deleted
>>> m_client.db_name.collection_name.delete_one({'hoge': 'Hoge 2'})
<pymongo.results.DeleteResult object at 0x7fb567344208>
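The `DeleteResult` above does not show how many records were actually removed. As a sketch, its `deleted_count` attribute tells you, and `delete_many` removes every match instead of only the first. `delete_matching` is a hypothetical helper name.

```python
def delete_matching(collection, query, first_only=True):
    """Delete documents matching `query` and return how many were removed."""
    if first_only:
        result = collection.delete_one(query)
    else:
        # delete_many removes every matching document, not just the first.
        result = collection.delete_many(query)
    # DeleteResult reports the number of removed documents as `deleted_count`.
    return result.deleted_count
```

For instance, `delete_matching(m_client.db_name.collection_name, {'hoge': 'Hoge 2'}, first_only=False)` would clear out all matching records and return the count.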

2. Going a little further

Enumerate DB?

# This alone does not show the contents...
>>> m_client.list_databases()
<pymongo.command_cursor.CommandCursor object at 0x7fb56730dcf8>

# So convert it to a list
>>> list(m_client.list_databases())
[
    {'name': 'admin', 'sizeOnDisk': 102400.0, 'empty': False}, 
    {'name': 'candles', 'sizeOnDisk': 1875968.0, 'empty': False},
    {'name': 'config', 'sizeOnDisk': 98304.0, 'empty': False},
    {'name': 'local', 'sizeOnDisk': 73728.0, 'empty': False}
]

I'm not sure what I would use this for ... (´・ω・`)
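One possible use, offered only as a sketch: since each entry carries `name` and `sizeOnDisk`, you can check which DB takes up the most disk space. Both function names below are made up for illustration.

```python
def database_sizes(client):
    """Map each DB name to its size on disk in bytes."""
    return {info['name']: info['sizeOnDisk'] for info in client.list_databases()}


def largest_database(client):
    """Return the name of the DB that takes the most disk space."""
    sizes = database_sizes(client)
    return max(sizes, key=sizes.get)
```

With the output shown above, `largest_database(m_client)` would pick `'candles'`.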

Insert by specifying '_id'

MongoDB automatically assigns a unique '_id' to each record when you insert it.

But if you want to specify the '_id' yourself, do this.

>>> m_client.db_name.collection_name.insert_one({
    '_id': 'id',
    'hoge': 'Hoge 3'
})

# Then it looks like this
>>> list(m_client.db_name.collection_name.find())
[
    {'_id': ObjectId('5dec7ab6f8f8434dbcab979a'), 'hoge': 'Hoge 1'},
    {'_id': ObjectId('5dec7ae3f8f8434dbcab979b'), 'hoge': 'Hoge'},
    {'_id': 'id', 'hoge': 'Hoge 3'}
]

# I figure a date and time also works well as an '_id'
>>> import datetime
>>> m_client.db_name.collection_name.insert_one({'_id': datetime.datetime.now(), 'hoge': 'Hoge 4'})
>>> list(m_client.db_name.collection_name.find())
[
    {'_id': ObjectId('5dec7ab6f8f8434dbcab979a'), 'hoge': 'Hoge 1'},
    {'_id': ObjectId('5dec7ae3f8f8434dbcab979b'), 'hoge': 'Hoge'},
    {'_id': 'id', 'hoge': 'Hoge 3'},
    {'_id': datetime.datetime(2019, 12, 8, 14, 58, 31, 579000), 'hoge': 'Hoge 4'}
]
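Notice that the stored `_id` above ends in `579000`: MongoDB dates keep millisecond precision, so the extra microseconds from `datetime.now()` are dropped on the way in. If you want to look the record up later with exactly the value you inserted, one option is to truncate it yourself first. This is only a sketch, and `truncate_to_millis` is a hypothetical helper name.

```python
import datetime


def truncate_to_millis(moment):
    """Drop the sub-millisecond part of a datetime."""
    return moment.replace(microsecond=(moment.microsecond // 1000) * 1000)


# The value to use as '_id', already in the precision MongoDB will keep.
record_id = truncate_to_millis(datetime.datetime.now())
```

You could then insert with `{'_id': record_id, 'hoge': 'Hoge 4'}` and later call `find_one({'_id': record_id})` with the same object.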

Insert multiple records (bulk_insert)

Pass the data as a list to the `insert_many` method.

# It is run like this (I didn't actually run it this time)
m_client.db_name.collection_name.insert_many([
    {'hoge': 'Hexagon'},
    {'hoge': 'Hexagon'}
])
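Since I did not run the call above, here is a small sketch of how the result could be used. `insert_many` returns an `InsertManyResult` whose `inserted_ids` lists the new `_id` values, and as far as I know it rejects an empty list, so the sketch skips the call in that case. `insert_all` is a made-up helper name.

```python
def insert_all(collection, documents):
    """Insert a list of documents and return their _id values in order."""
    documents = list(documents)
    if not documents:
        # insert_many rejects an empty list, so skip the call entirely.
        return []
    result = collection.insert_many(documents)
    # InsertManyResult lists the new _id values as `inserted_ids`.
    return result.inserted_ids
```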

Delete Collection / DB

# Drop the Collection
>>> m_client.db_name.collection_name.drop()
>>> list(m_client.db_name.collection_name.find())
[]

# ↑ Nothing is found any more

# Once its last Collection is gone, the DB disappears automatically!
>>> list(m_client.list_database_names())
['admin', 'candles', 'config', 'local']
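If you would rather remove a whole DB explicitly instead of waiting for it to vanish, `drop_database` (it appears in the `dir()` output earlier) does that. A minimal sketch, with `drop_database_if_present` as a name I made up:

```python
def drop_database_if_present(client, db_name):
    """Drop `db_name` if it exists; return True when something was dropped."""
    if db_name not in client.list_database_names():
        return False
    client.drop_database(db_name)
    return True
```

Calling `drop_database_if_present(m_client, 'db_name')` twice in a row would return `True` and then `False`.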

Sort selection results (order_by)

# First, insert several records (run this 5 or 6 times)
>>> m_client.db_name.collection_name.insert_one({'_id': datetime.datetime.now(), 'hoge': 'Hoge'})

# Descending sort (newest date and time first)
>>> list(m_client.db_name.collection_name.find().sort('_id', DESCENDING))
[
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 56, 625000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 984000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 121000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 54, 696000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 9, 36, 233000), 'hoge': 'Hoge'}
]

# Ascending sort (oldest date and time first)
>>> list(m_client.db_name.collection_name.find().sort('_id', ASCENDING))
[
    {'_id': datetime.datetime(2019, 12, 8, 15, 9, 36, 233000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 54, 696000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 121000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 984000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 56, 625000), 'hoge': 'Hoge'}
]

# This form also works (I prefer this one)
>>> list(m_client.db_name.collection_name.find(sort=[('_id', DESCENDING)]))
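The keyword form combines nicely with `limit`, which `find()` also accepts. Below is a sketch for fetching only the newest few records. `latest_documents` is a hypothetical helper, and `DESCENDING` is redefined only so the snippet stands alone; in this article's session you would use the one imported from pymongo, which has the same value.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def latest_documents(collection, count):
    """Return the `count` documents with the largest _id, newest first."""
    cursor = collection.find(sort=[('_id', DESCENDING)], limit=count)
    return list(cursor)
```

With the date-and-time `_id` values used here, `latest_documents(m_client.db_name.collection_name, 3)` would return the three most recently inserted records.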

Select by specifying the range for a specific column

These are the operators used to specify a range:

Meaning    Greater than or equal   Greater than   Less than or equal   Less than
Operator   $gte                    $gt            $lte                 $lt

# Greater than or equal to a specific value (that date and time or later)
>>> list(m_client.db_name.collection_name.find(filter={
    '_id':{'$gte': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000)}
}))
[
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 984000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 56, 625000), 'hoge': 'Hoge'}
]

# Greater than a specific value (later than that date and time, excluding it)
>>> list(m_client.db_name.collection_name.find(filter={
    '_id':{'$gt': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000)}
}))
[
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 984000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 56, 625000), 'hoge': 'Hoge'}
]

# From one specific value up to another (both ends included)
>>> list(m_client.db_name.collection_name.find(filter={
    '_id':{
        '$gte': datetime.datetime(2019, 12, 8, 15, 10, 55, 121000),
        '$lte':  datetime.datetime(2019, 12, 8, 15, 10, 55, 984000)
    }
}))
[
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 121000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 528000), 'hoge': 'Hoge'},
    {'_id': datetime.datetime(2019, 12, 8, 15, 10, 55, 984000), 'hoge': 'Hoge'}
]

$lt and $lte are the opposites of $gt and $gte.
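The range filters above all have the same shape, so they can be built by a small function. This is just a sketch, and `between` is a made-up helper name rather than anything pymongo provides.

```python
import datetime


def between(field, start, end, inclusive=True):
    """Build a find() filter for start <= field <= end (or < when not inclusive)."""
    lower, upper = ('$gte', '$lte') if inclusive else ('$gt', '$lt')
    return {field: {lower: start, upper: end}}


# Same range as the last example above.
range_filter = between(
    '_id',
    datetime.datetime(2019, 12, 8, 15, 10, 55, 121000),
    datetime.datetime(2019, 12, 8, 15, 10, 55, 984000),
)
```

It would be passed along as `list(m_client.db_name.collection_name.find(filter=range_filter))`.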

Select by combining other conditions

This article covers it in detail: "Various search conditions using pymongo (AND / OR / partial match / range search)"

(So I'll skip it here ...)
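Just so the idea is not left completely blank, here is a rough sketch of what such filters could look like, using MongoDB's `$or`, `$and`, and `$regex` operators. The three helper names are made up for illustration and are not taken from that article.

```python
import re


def any_of(*conditions):
    """Match documents that satisfy at least one of the given filters (OR)."""
    return {'$or': list(conditions)}


def all_of(*conditions):
    """Match documents that satisfy every one of the given filters (AND)."""
    return {'$and': list(conditions)}


def starts_with(field, prefix):
    """Match documents whose string field begins with `prefix` (partial match)."""
    # re.escape keeps characters such as '.' from being read as regex syntax.
    return {field: {'$regex': '^' + re.escape(prefix)}}
```

For example, `find(filter=any_of({'hoge': 'Hoge 1'}, starts_with('hoge', 'Hoge 3')))` would return records matching either condition.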
