A recommendation for building portable Python environments with conda

Overview

Anaconda has become a popular way to quickly set up a Python environment for data analysis.

Anaconda and its minimal counterpart, Miniconda, ship with the package manager conda, which can manage the entire environment, including Python itself. In conventional Python development, you had to combine separate tools, each with its own purpose:

pip: Managing packages
virtualenv / venv: Managing isolated environments, each with its own set of packages
pyenv: Managing versions of Python itself

With Anaconda / Miniconda, conda alone covers all of these.

Create an environment including Python itself with conda create
Add packages with conda install

In team development, or when running deliverables on a production server, it is useful to be able to quickly rebuild an existing environment on another machine. With pip and pyenv, the library versions and the Python version were tracked in separate files, requirements.txt and .python-version respectively. With conda, you can export the whole environment definition to a single YAML file and easily rebuild the environment from it.

The detailed method is explained below.

Miniconda installation

Anaconda is an all-in-one platform that bundles many major packages, but if you want to build only the minimum environment each project needs, as we do here, Miniconda is the better choice.

To install Miniconda, run the downloaded installer as described in the official installation instructions.

$ bash Miniconda2-latest-MacOSX-x86_64.sh

On macOS, you can also install it with Homebrew Cask.

$ brew install Caskroom/cask/miniconda

The rest of this article assumes that Miniconda is already installed and that conda and its related commands are on the PATH.

Building the environment

Let's create a Python 3.5 environment named myenv. Because conda treats Python itself and ordinary packages alike as components of the environment, you can install packages in the same command.

$ conda create --name myenv python=3.5 numpy=1.11.1

Activate the environment you created. (With conda 4.4 or later, conda activate myenv is the recommended form.)

$ source activate myenv

You can also install additional packages.

$ conda install scipy

Not every package registered on PyPI is available through conda. Packages that conda cannot install can still be installed with pip, which is included from the start in environments created by conda.

$ pip install peewee

Export and reuse environment settings

You can export your environment settings in YAML format by running conda env export with the environment activated.

$ conda env export > myenv.yaml

The exported file looks like this. Packages installed with pip are exported as well.

`myenv.yaml`


name: myenv
dependencies:
- mkl=11.3.3=0
- numpy=1.11.1=py35_0
- openssl=1.0.2h=1
- pip=8.1.2=py35_0
- python=3.5.1=5
- readline=6.2=2
- scipy=0.17.1=np111py35_1
- setuptools=23.0.0=py35_0
- sqlite=3.13.0=0
- tk=8.5.18=0
- wheel=0.29.0=py35_0
- xz=5.2.2=0
- zlib=1.2.8=3
- pip:
  - peewee==2.8.1
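
Each conda entry in the exported file has the form name=version=build, while the entries under pip use pip's own name==version form. As a minimal sketch of that structure, the conda entries can be split as follows. The names CondaSpec and parse_conda_spec are hypothetical helpers made up for this illustration and are not part of conda.

```python
from typing import NamedTuple, Optional


class CondaSpec(NamedTuple):
    """One conda entry from an exported environment file."""
    name: str
    version: Optional[str]
    build: Optional[str]


def parse_conda_spec(spec: str) -> CondaSpec:
    """Split a 'name=version=build' entry into its parts.

    Hypothetical helper for illustration only. The version and build
    parts may be missing (as in a hand-written file), in which case
    they are returned as None. pip-style 'name==version' entries are
    not handled here.
    """
    parts = spec.strip().split("=")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return CondaSpec(name, version, build)


print(parse_conda_spec("numpy=1.11.1=py35_0"))
print(parse_conda_spec("python=3.5.1"))
```

The build string (py35_0 for numpy above) pins the exact binary that was installed, which is why the exported file can reproduce the environment so precisely.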

With this file, you can easily rebuild the same environment on another machine.

$ conda env create --file myenv.yaml

Write the configuration file yourself

The file produced by conda env export also lists packages that were installed only as dependencies of the packages you installed deliberately. If you want the file to describe just the libraries your project uses directly, you need to write it yourself.

`myenv.yaml`


name: myenv
dependencies:
- python=3.5.1
- numpy=1.11.1
- scipy=0.17.1
- pip:
  - peewee==2.8.1
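
To see which packages were pulled in only as dependencies, you can compare the hand-written file with the exported one. The following is a rough sketch under stated assumptions: read_dependency_names is a hypothetical helper, and it is a line-based reader that only understands the simple layout shown in this article. A real YAML parser would be more robust.

```python
def read_dependency_names(yaml_text: str) -> set:
    """Collect package names from the simple 'dependencies:' layout above.

    Assumption: one '- name=version' or '- name==version' entry per line.
    This is a line-based sketch, not a general YAML parser.
    """
    names = set()
    in_dependencies = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        # A top-level key such as 'dependencies:' or 'channels:' starts a section.
        if not line.startswith((" ", "-")) and stripped.endswith(":"):
            in_dependencies = stripped == "dependencies:"
            continue
        if not in_dependencies or not stripped.startswith("- "):
            continue
        entry = stripped[2:].strip()
        if entry.endswith(":"):  # '- pip:' only opens the nested pip list
            continue
        names.add(entry.split("=")[0])
    return names


direct = """name: myenv
dependencies:
- python=3.5.1
- numpy=1.11.1
- scipy=0.17.1
- pip:
  - peewee==2.8.1
"""

exported = """name: myenv
dependencies:
- mkl=11.3.3=0
- numpy=1.11.1=py35_0
- openssl=1.0.2h=1
- pip=8.1.2=py35_0
- python=3.5.1=5
- readline=6.2=2
- scipy=0.17.1=np111py35_1
- setuptools=23.0.0=py35_0
- sqlite=3.13.0=0
- tk=8.5.18=0
- wheel=0.29.0=py35_0
- xz=5.2.2=0
- zlib=1.2.8=3
- pip:
  - peewee==2.8.1
"""

# Packages present in the export but not named in the hand-written file.
pulled_in = sorted(read_dependency_names(exported) - read_dependency_names(direct))
print(pulled_in)
```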

My recommendation is to follow the Gemfile / Gemfile.lock pattern from Ruby's Bundler: list only the libraries you use directly in myenv.yaml, create the environment from that file, then export the resulting environment and save it under a name like myenv.frozen.yaml. That way, you can look at myenv.yaml to see which packages your project uses directly, and use myenv.frozen.yaml to reconstruct exactly the same environment, including dependent packages.
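
The two-file workflow could be scripted roughly as follows. This is only a sketch: freeze_commands and freeze are hypothetical names, and it assumes that conda is on the PATH and that the environment name matches the name: field in myenv.yaml. It relies on the --name option of conda env export to select the environment without activating it.

```python
import subprocess
from pathlib import Path


def freeze_commands(direct_file: str, env_name: str) -> list:
    """Return the two conda invocations of the workflow, in order."""
    return [
        # 1. Build the environment from the hand-written file.
        ["conda", "env", "create", "--file", direct_file],
        # 2. Export the resolved environment, dependencies included.
        ["conda", "env", "export", "--name", env_name],
    ]


def freeze(direct_file: str = "myenv.yaml",
           frozen_file: str = "myenv.frozen.yaml",
           env_name: str = "myenv") -> None:
    """Create the environment from direct_file and save its export.

    Assumes conda is on the PATH and env_name matches the 'name:' field
    of direct_file. Raises CalledProcessError if either command fails.
    """
    create_cmd, export_cmd = freeze_commands(direct_file, env_name)
    subprocess.run(create_cmd, check=True)
    # Redirect the export to the frozen file, like '> myenv.frozen.yaml'.
    with Path(frozen_file).open("w") as out:
        subprocess.run(export_cmd, check=True, stdout=out)
```

Calling freeze() once after editing myenv.yaml would regenerate myenv.frozen.yaml, which other machines can then pass to conda env create --file.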