This article describes how to build a data analysis environment with Docker + VS Code + Remote Container.
I wrote it for people who say: "Jupyter Lab? Or a cloud service such as Google Colaboratory? ... I don't know which Python data analysis environment to use! I want an environment that is easy to set up and comfortable to analyze in!"
The sample repository is pushed to GitHub, so if you clone it, launch the container, and connect remotely according to the README procedure, you can build the environment in about 10 minutes.
https://github.com/hatahata7757/sample-analytical-env
I hope this article helps readers choose a data analysis environment.
Recently, I had the opportunity to analyze data with Python, and I needed to build an environment where Jupyter works.
You can install Jupyter Lab directly on your local machine, but I want to avoid polluting the local environment as much as possible. Also, unlike VS Code, Jupyter Lab has weak support for features such as input completion, which makes it a little hard to use.
After a lot of research, I found that as of the VS Code October 2019 release, the official Microsoft Python extension supports .ipynb (the IPython notebook file format). So for the coding environment I chose VS Code, which has IntelliSense and so on, instead of Jupyter Lab.
As for the virtual environment, I learned that a Docker image of Jupyter Lab is published on Docker Hub, so it seemed I could easily build the environment by pulling it.
So:
- **Virtual environment**: use the Docker image of Jupyter Lab
- Use Remote Container to **remotely connect to the launched container**
- In the workspace connected through Remote Container, **code with input completion and a linter**

I selected the above as my data analysis environment. The image looks like this.
The following is a somewhat more detailed explanation for those who want to build the environment themselves.
In the repository listed at the top of this article, the required extensions and various settings are described in .vscode/settings.json and .devcontainer/devcontainer.json, and the minimum procedure is described in the README. So you don't have to read the steps below.
Docker
I used another author's article, "[Docker] Create a jupyterLab (python) environment in 3 minutes!", as a reference ... or rather, I copied it wholesale. The settings required to build the Jupyter Lab environment go in docker-compose.yml.
docker-compose.yml

    version: '3'
    services:
      notebook:
        image: jupyter/datascience-notebook
        ports:
          - '8888:8888'
        environment:
          - JUPYTER_ENABLE_LAB=yes
        volumes:
          - ./work:/home/jovyan/work
        command: start-notebook.sh --NotebookApp.token=''
In volumes, the mount target is a directory under /home named jovyan. The name jovyan seems to refer to everyone who uses Jupyter Notebook.
Reference:
- The story of making Jupyter Notebook started with Docker on the server accessible from other PCs
Now you have an environment where Jupyter Lab works.
Remote Container
If you start the container with docker-compose up -d as it is, you can access http://localhost:8888 and code in Jupyter Lab.
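As a quick sanity check that the container is actually serving on that port, a sketch like the following could be run on the host. It assumes the 8888:8888 port mapping from docker-compose.yml; the helper name `jupyter_is_up` is hypothetical, and only the Python standard library is used.

```python
import urllib.error
import urllib.request


def jupyter_is_up(url: str = "http://localhost:8888", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # A successful (possibly redirected) response means the server is listening.
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 403 or 404.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, name resolution failure, and so on.
        return False


if __name__ == "__main__":
    print("Jupyter reachable:", jupyter_is_up())
```

This only tells you that something answers on the port, not that it is Jupyter Lab specifically, so opening the URL in a browser is still the real test.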
That's fine, but I want to benefit from powerful VS Code features such as input completion, so I will **remotely connect to the launched container with Remote Container and code in VS Code**.
If you search for Remote Container, you will find that others have explained it in detail, so please refer to those articles.
Reference: VS Code Remote Container is good
In VS Code, go to [Settings] → [Extensions] and install [Remote Container].
After installation, a "Reopen in Container" pop-up will be displayed, so select it.
This automatically builds and launches a container based on your project's docker-compose.yml settings, and opens the VS Code workspace inside the container.
(From the second time onward, you can connect by clicking the "><" icon at the bottom left of VS Code → Reopen in Container.)
In the workspace, the terminal is also connected to the container's shell, so you don't need to prefix commands with docker-compose ~ (or an alias) every time you run them.
When you open the workspace, a pop-up saying "No Python interpreter is selected ..." appears. This is asking which Python execution environment to use, so select the conda environment **"Python 3.xx 64-bit ('conda': virtualenv)"**, which is the execution environment of Jupyter Lab.
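To confirm that the selected interpreter really is the one inside the container, a small sketch like this could be run in a notebook cell or in the workspace terminal. The helper name `interpreter_info` is hypothetical; what path counts as "the conda environment" depends on the image, so treat the printed values as something to inspect rather than a fixed expected output.

```python
import platform
import sys


def interpreter_info() -> dict:
    """Collect basic facts about the running Python interpreter."""
    return {
        "executable": sys.executable,          # path of the interpreter binary
        "version": platform.python_version(),  # version string of this interpreter
        "prefix": sys.prefix,                  # root directory of the environment
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
```

If the executable path points to a Python on your host machine rather than inside the container, the workspace is probably not attached to the container.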
VSCode
I have now connected remotely to the launched container and can open VS Code inside it.
However, if you open an .ipynb file as it is, it will not be displayed as an IPython notebook, so add an extension to the workspace.
In the workspace, find [Extensions] → [python (ms-python.python)] and select Dev Container: [Install to project name]. The file is now displayed as an IPython notebook.
In addition, install extensions such as a linter and IntelliSense in your workspace as appropriate. If you list an extension under extensions in .devcontainer/devcontainer.json, it is loaded automatically when the workspace starts, which is convenient when distributing a project.
(* The repository listed in the introduction is set up to load python (ms-python.python) and pylance (ms-python.vscode-pylance), which supports auto-import and type checking.)
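For illustration, the sketch below generates a minimal devcontainer.json of this kind. It is not a copy of the file in the sample repository: the `name` value, the relative path to docker-compose.yml, and the `workspaceFolder` are my assumptions based on the docker-compose.yml shown earlier, and the function names are hypothetical. Newer versions of the extension may expect the extension list under `customizations.vscode.extensions` instead of a top-level `extensions` key, so check the current documentation.

```python
import json
from pathlib import Path


def build_devcontainer_config() -> dict:
    """Return an assumed minimal devcontainer.json for the compose setup."""
    return {
        "name": "sample-analytical-env",               # display name (assumption)
        "dockerComposeFile": "../docker-compose.yml",  # path relative to .devcontainer/ (assumption)
        "service": "notebook",                         # service name from docker-compose.yml
        "workspaceFolder": "/home/jovyan/work",        # mount target from `volumes` (assumption)
        "extensions": [                                # loaded when the workspace starts
            "ms-python.python",
            "ms-python.vscode-pylance",
        ],
    }


def write_devcontainer(root: Path) -> Path:
    """Write the config to <root>/.devcontainer/devcontainer.json and return the path."""
    target = root / ".devcontainer" / "devcontainer.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(build_devcontainer_config(), indent=2) + "\n"
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Only print here; call write_devcontainer(Path(".")) to actually create the file.
    print(json.dumps(build_devcontainer_config(), indent=2))
```

In practice you would usually write this JSON by hand; generating it from Python is only a way to show the structure and keep the extension list in one place.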
Finally.
Every time I open a new .ipynb file, it is not trusted and cannot be edited. You will probably see a pop-up that says "A notebook could execute harmful code when opened." Select "Trust".
(This part is a bit annoying. Even if you select ALL Trust, the setting is reset once you restart the container. Please tell me if you know a solution.)
Now, with Docker + VS Code + Remote Container, you have an environment where you can analyze data while using **powerful tools such as input completion!**
Keep in mind that the container keeps running even after you close the workspace, so be sure to stop it each time.
I created a Jupyter Lab (Python) analysis environment with Docker + VS Code + Remote Container. Input completion works well and I find it comfortable, but I have only just started doing analysis in Python, so I don't know whether this is the right answer.
There are also convenient cloud services such as Google Colaboratory and Azure Notebook, but they sometimes have time limits and it was sometimes faster to analyze locally, so this time I focused on how to build a local environment.
If there are any mistakes in the content, or if there is a better way, please **point them out and let me know.**
Let's have a comfortable coding environment!
- [Docker] Create a jupyterLab (python) environment in 3 minutes!
- VSCode Remote Container is good
- The story of making Jupyter Notebook started with Docker on the server accessible from other PCs