I usually access the Datastore from a GAE/Python environment, but sometimes I want to access it from GCE or a local environment.

I think there are two main ways:

- GAE's Remote API (remote_api_shell.py)
- Google Cloud Datastore API (gcloud-python)

There also seem to be other approaches, such as:

- Google Cloud Datastore RPC API wrapper library
- GAE Task Queue
Accessing App Engine with Remote API
The Remote API is a standard feature. If you have installed the App Engine Python SDK (it can be installed as part of the gcloud SDK), remote_api_shell.py should already be on your PATH.
Then, in GAE's app.yaml, enable remote_api under builtins:

app.yaml

```yaml
application: PROJECT_ID
version: 1
runtime: python27
api_version: 1
# Add here
builtins:
- remote_api: on
```
Once the app is deployed in that state, running

```
$ remote_api_shell.py PROJECT_ID
```

gives you a shell where you can directly manipulate the remote Datastore and other services from your local Python environment.
If you can import google.appengine.ext.ndb in your local environment, you can also import your ndb models and access them in the same way as on App Engine, which is convenient.

```python
from google.appengine.ext import ndb

class MyModel(ndb.Model):
    number = ndb.IntegerProperty(default=42)
    text = ndb.StringProperty()

# You can fetch remote data just as you would on GAE.
models = MyModel.query().fetch()
```
However, note that it is quite slow. Looking at the GAE logs for the cause, there is an HTTP request to /_ah/remote_api for every query; with so many HTTP round trips, the overhead adds up and processing becomes heavy.
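To get a feel for the cost, here is a rough back-of-envelope sketch. The latency numbers are assumptions for illustration, not measurements:

```python
# Rough estimate of Remote API overhead: each query is a separate
# HTTP round trip to /_ah/remote_api, so network latency dominates.
def total_seconds(num_requests, rtt_ms, server_ms=5.0):
    """Total wall-clock time for num_requests sequential round trips."""
    return num_requests * (rtt_ms + server_ms) / 1000.0

# 500 small queries at an assumed ~100 ms round trip each:
print(total_seconds(500, 100))  # 52.5 seconds, almost a minute
```

Even with a modest per-request latency, issuing one HTTP request per query quickly adds up.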
The Google Cloud Datastore API released this year should resolve these performance issues: Google Cloud Datastore API new beta significantly improves performance
It is easiest to follow along with this article: Getting started with the Google Cloud Datastore API
In an environment where the gcloud command is available,

```
$ pip install gcloud
```

installs gcloud-python.
If gcloud auth has been set up, you can import and use the Datastore libraries from Python scripts or a console.

```python
from gcloud import datastore

client = datastore.Client(PROJECT_ID)
# query() builds a query; fetch() returns an iterator over the results.
tasks = client.query(kind='Task').fetch()
```
Check the tutorial and the following GitHub page for details on how to use it: gcloud-python

First of all, the performance is better than the Remote API. Also, the iterator appears to fetch lazily: it talks to the Datastore each time the cursor reaches the next batch, presumably because the connection to the Datastore is kept open. So even when fetching a large amount of data, it is less likely to bog down than the Remote API.
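The lazy-fetch behavior can be sketched in plain Python. This is a toy generator, not the real gcloud iterator; fetch_page and the page size are made up for illustration:

```python
# Toy model of a lazily fetching result iterator: rows are pulled
# from the "server" one page at a time, only as the caller advances.
DATA = list(range(10))  # stand-in for entities stored remotely

def fetch_page(cursor, page_size=4):
    """Pretend server call: return one page of results and the next cursor."""
    page = DATA[cursor:cursor + page_size]
    return page, cursor + len(page)

def lazy_results():
    cursor = 0
    while True:
        page, cursor = fetch_page(cursor)
        if not page:
            return
        for item in page:
            yield item  # the next page is fetched only when needed

print(list(lazy_results()))  # all ten items, fetched in three pages
```

Because each page is requested only when the consumer reaches it, memory use stays bounded even for very large result sets.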
Note that it is still in beta, so the specification may change. The environment must also be authenticated with gcloud. Furthermore, since there is no O/R mapper like ndb, editing entities is a little more work, and adopting it may affect your GAE code. Still, it is convenient when accessing from a GCE environment, so this API is the one to use going forward. Something like ndb may become available in the official release.
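On the "no O/R mapper" point: gcloud's datastore.Entity is dict-like, so edits are plain key assignments rather than typed model attributes. A minimal plain-Python stand-in (no real Datastore calls; the keys are illustrative):

```python
# datastore.Entity behaves like a dict: no schema, defaults, or
# property validation the way an ndb.Model provides.
entity = {'number': 42, 'text': None}  # stand-in for a fetched Entity
entity['number'] = 7                   # editing is plain key assignment
entity['extra'] = 'anything'           # nothing stops an unplanned property
# With ndb you would instead write: model.number = 7; model.put()
print(entity['number'])  # 7
```

This flexibility is convenient for scripts, but it means the type checks and defaults your ndb models rely on are not enforced here.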
Datastore RPC API wrapper library
A Python wrapper library for the Datastore RPC API is officially available: GoogleCloudPlatform/google-cloud-datastore (googledatastore)
For the time being, it seems to be maintained in step with updates to the RPC API. However, I can't really recommend it, because it requires authenticated credentials and the interface feels a little dated.
Task Queue
If you make good use of Push Queues and the like, you can modify Datastore data via an external trigger. However, since tasks have to be prepared in advance and there is a time limit on task execution, I wouldn't really call this "accessing" the Datastore.