A story about building a PyPI cache server (with Docker) and getting a little happier again

In a previous article, "I was happy to build APT's package repository cache server (apt-cacher-ng) locally (in Docker)", I built an APT cache server.

I'm a Python user, so I wanted to cut the download time not only of apt-get install but also of pip install.

devpi-server

It seems that you can easily build a PyPI cache server by installing a wonderful package called devpi-server.

I decided to write this in a Dockerfile as well.

Note that pip install has a local cache mechanism (https://pip.pypa.io/en/latest/reference/pip_install.html?highlight=download#caching), so this method doesn't seem to make much sense if you only install packages on a single server.

It is especially useful in the following cases:

- When multiple server instances each retrieve packages individually
- When the available network bandwidth is small or limited

Dockerfile

Here is the Dockerfile itself.

Dockerfile


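#
# Build:
#     docker build -t devpi-server .
#
# Run:
#     docker run -d -p 3141:3141 --name devpi-server-run devpi-server
#
# Note: recent devpi-server releases may require initializing the
# server directory with devpi-init before the first start.
#
FROM ubuntu

# devpi-server listens on port 3141; cached packages go in the volume.
EXPOSE 3141
VOLUME ["/var/cache/devpi"]

# Keep apt-get from prompting during the image build.
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install --no-install-recommends -y python3-pip && \
    pip3 install -q -U pip devpi-server

# Make the volume writable at startup, then listen on all interfaces
# so that other hosts can reach the cache.
CMD chmod 777 /var/cache/devpi && \
    devpi-server \
        --serverdir /var/cache/devpi \
        --host 0.0.0.0 --port 3141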
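
Because the Dockerfile declares /var/cache/devpi as a volume, the cached packages can probably be kept across container re-creation by mounting a named volume there. The following is only a minimal sketch and not part of the original setup: the function name run_devpi and the volume name devpi-cache are assumptions chosen for illustration, while the image and container names follow the Build/Run comments in the Dockerfile.

```shell
# Sketch: start the cache server with a named volume so that the
# package cache is not tied to one container's anonymous volume.
# `run_devpi` and `devpi-cache` are hypothetical names; adjust them
# to your environment.
run_devpi() {
    docker run -d \
        -p 3141:3141 \
        -v devpi-cache:/var/cache/devpi \
        --name devpi-server-run \
        devpi-server
}
```

Calling run_devpi after docker build -t devpi-server . would start the container; nothing is started just by defining the function.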

How to use

Specify the host name of the cache server with --trusted-host, and the URL of the target cache server with --index-url.

$ pip install --trusted-host XXX.XXX.XXX.XXX --index-url http://XXX.XXX.XXX.XXX:3141/root/pypi/ package-name

As described at https://pip.pypa.io/en/latest/user_guide.html#environment-variables, the pip install options can also be set through environment variables. The variable name is the option name in uppercase, with dashes replaced by underscores and prefixed with PIP_. For the options above, that means PIP_TRUSTED_HOST and PIP_INDEX_URL.
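
As an illustration, the two command-line options used above could be moved into the environment roughly like this. It is a minimal sketch: DEVPI_HOST is a hypothetical helper variable introduced here, and XXX.XXX.XXX.XXX stands for the address of your own cache server, as in the pip install command above.

```shell
# Hypothetical helper variable: address of the cache server.
DEVPI_HOST="XXX.XXX.XXX.XXX"

# Counterparts of --trusted-host and --index-url on the command line.
export PIP_TRUSTED_HOST="${DEVPI_HOST}"
export PIP_INDEX_URL="http://${DEVPI_HOST}:3141/root/pypi/"

# Show the values that pip will read from the environment.
echo "PIP_TRUSTED_HOST=${PIP_TRUSTED_HOST}"
echo "PIP_INDEX_URL=${PIP_INDEX_URL}"
```

With these exported, a plain pip install package-name in the same shell should go through the cache server without any extra options.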

Comparison

I measured with the time command. I only measured once, so this is not a strict comparison, but I felt it was enough. --no-cache-dir is specified to disable pip's local cache.

$ time pip install --no-cache-dir numpy scipy matplotlib pandas ipython

No cache

real	3m29.583s

With cache

real	0m24.950s

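There is a big difference: about 3.5 minutes without the cache versus about 25 seconds with it, roughly 8 times faster in this single measurement. Without the cache, it felt like a lot of the time went into downloading scipy and matplotlib in particular.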
