Lately I have seen several articles that use AMeDAS data published by the Japan Meteorological Agency as sample data for machine learning. The agency also publishes the output of numerical weather prediction simulations such as GSM and MSM, but I have never seen an article that makes use of that data. In this article I introduce pygrib, which lets Python read the GRIB2 format, one of the formats used for numerical weather prediction data such as GSM.
Numerical weather prediction plays a major role in modern weather prediction technology. What is Numerical Weather Prediction? According to the Japan Meteorological Agency…
Numerical weather prediction is a method of predicting future atmospheric conditions by calculating time changes such as wind and temperature with a computer using physics equations.
In short, it is a technology that predicts the weather by simulating the state of the earth on a computer. In practice, I believe forecasts are usually produced either by applying post-processing to this numerical weather prediction output or by selecting the most plausible result from several numerical weather prediction runs.
There are several simulation methods, but most of their results are provided as GPV (grid point value) data. GPV divides the earth into a grid at regular intervals and stores values such as temperature and wind speed for each grid cell. [^ Post process] Weather forecasting often works with GPV data in which many such grids are stacked along the time axis. One storage format for GPV is GRIB2, and most data from the Japan Meteorological Agency is distributed in this format.
[^ Post process]: Because each stored value is either the value at a grid point or an average, it is rarely used as-is as the forecast for an arbitrary location inside a grid cell. In most cases post-processing is applied, as described above.
If you want to look at actual GPV data, you can view it on GPV Weather Forecast. The underlying data is per grid cell, but visualizations usually apply some form of interpolation.
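To make the idea of interpolating between grid points concrete, here is a minimal sketch in plain Python. It assumes a regular latitude/longitude grid with ascending coordinates and uses bilinear interpolation, which is only one simple choice and not necessarily what any particular site or forecast product does. The function name `bilinear` and the toy 2x2 grid are hypothetical and are not taken from real GPV data.

```python
from bisect import bisect_right


def bilinear(lats, lons, grid, lat, lon):
    """Interpolate a value at (lat, lon) from a regular grid.

    Assumptions (illustrative only): lats and lons are ascending 1-D lists,
    and grid[i][j] holds the value at (lats[i], lons[j]).
    """
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        raise ValueError("point lies outside the grid")

    # Index of the grid cell's south-west corner, clamped so that
    # points on the northern or eastern edge still use the last cell.
    i = min(bisect_right(lats, lat) - 1, len(lats) - 2)
    j = min(bisect_right(lons, lon) - 1, len(lons) - 2)

    # Fractional position of the point inside the cell (0.0 to 1.0).
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])

    # Interpolate along longitude on both edges, then along latitude.
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Toy example: a single grid cell with made-up values.
toy_lats = [35.0, 36.0]
toy_lons = [139.0, 140.0]
toy_grid = [[0.0, 2.0],
            [4.0, 6.0]]
print(bilinear(toy_lats, toy_lons, toy_grid, 35.5, 139.5))  # 3.0
```

Real GRIB2 grids are often stored north to south, so the latitude axis may need to be reversed before a function like this can be applied.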
There appear to be several options for handling GRIB2 data from Python; this time I use pygrib, which supports Python 3. [^ python3] With a conventional install, you may get stuck installing grib_api [^ eccodes], published by ECMWF (European Centre for Medium-Range Weather Forecasts). Here I install it through Anaconda instead and, while I am at it, build the environment on Docker. This should make installation failures caused by environment differences much less likely.
[^ python3]: When I tried them earlier (early 2016), the other modules supported only Python 2, but they may support Python 3 by now.
[^ eccodes]: Recently, it seems that the use of ecCodes is recommended.
The Dockerfile is as follows.
Dockerfile
FROM ubuntu:latest
LABEL maintainer="hangyo"
#Package installation and update
RUN apt-get update && apt-get -y upgrade
RUN apt-get -y install build-essential
RUN apt-get -y install git vim curl wget
RUN apt-get -y install zlib1g-dev \
libssl-dev \
libreadline-dev \
libyaml-dev \
libxml2-dev \
libxslt-dev \
libncurses5-dev \
libncursesw5-dev
#Install pyenv
RUN git clone https://github.com/yyuu/pyenv.git /root/.pyenv
RUN git clone https://github.com/yyuu/pyenv-pip-rehash.git /root/.pyenv/plugins/pyenv-pip-rehash
ENV PYENV_ROOT /root/.pyenv
ENV PATH $PYENV_ROOT/bin:$PATH
RUN echo 'eval "$(pyenv init -)"' >> /root/.bashrc
#Installation of anaconda
ENV ANACONDA_VER 4.1.1
ENV LD_LIBRARY_PATH=/lib/x86_64-linux-gnu:$PYENV_ROOT/versions/anaconda3-$ANACONDA_VER/lib
RUN pyenv install anaconda3-$ANACONDA_VER
RUN pyenv global anaconda3-$ANACONDA_VER
ENV PATH $PYENV_ROOT/versions/anaconda3-$ANACONDA_VER/bin:$PATH
#Library update
RUN conda update -y conda
RUN pip install --upgrade pip
RUN conda install -y -c conda-forge pygrib=2.0.2
RUN conda install -y -c conda-forge jpeg
RUN mkdir /temp
Please adjust the Anaconda version as needed. Up to the Anaconda installation, I followed the repository below (it is almost a verbatim copy). https://github.com/iriya-ufo/ml-anaconda
Before building and starting the container, let's prepare a sample GRIB2 file. This time I use data from the Japan Meteorological Agency data page hosted by the Research Institute for Sustainable Humanosphere (RISH), Kyoto University.
wget -P somepath/ http://database.rish.kyoto-u.ac.jp/arch/jmadata/data/gpv/original/2017/01/02/Z__C_RJTD_20170102000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin
This file contains the Japan Meteorological Agency MSM surface forecast with an initial time of 00:00 (UTC) on January 2, 2017, covering FT (forecast time: elapsed time since the initial time) 0h to 15h. For casual experiments with numerical weather prediction output, I think the surface data is sufficient.
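The initial time and forecast range described above are also visible in the file name itself. As a rough sketch, the following parses them out with the standard library. The pattern is inferred from this single example file name only, so the field meanings and the helper name `parse_gpv_filename` are assumptions, and other products may be named differently.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern inferred from one example file name (an assumption, not a spec).
FILENAME_RE = re.compile(
    r"Z__C_RJTD_(?P<init>\d{14})_(?P<model>[A-Z]+)_GPV_"
    r"(?P<region>R[^_]+)_(?P<level>L[^_]+)_"
    r"FH(?P<start>\d+)-(?P<end>\d+)_grib2\.bin$"
)


def parse_gpv_filename(name):
    """Extract the initial time, model, and forecast range from a file name."""
    m = FILENAME_RE.search(name)
    if m is None:
        raise ValueError(f"unrecognized file name: {name}")

    # The 14-digit timestamp is the initial time in UTC.
    init = datetime.strptime(m["init"], "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    start, end = int(m["start"]), int(m["end"])
    return {
        "initial_time": init,
        "model": m["model"],
        "region": m["region"],
        "level": m["level"],
        # First and last valid time covered by the file (FT start/end hours).
        "valid_from": init + timedelta(hours=start),
        "valid_to": init + timedelta(hours=end),
    }


info = parse_gpv_filename(
    "Z__C_RJTD_20170102000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin"
)
print(info["model"], info["initial_time"].isoformat())
# MSM 2017-01-02T00:00:00+00:00
```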
One small note: the data is saved under somepath, and this should be a different location from the directory containing the Dockerfile. This is due to how Docker works: at docker build time, everything under the Dockerfile's directory is sent to the Docker daemon, which then performs the build. If the GRIB2 file sits under that directory, the transfer takes longer than necessary. See http://kimh.github.io/blog/jp/docker/gothas-in-writing-dockerfile-jp/ for details.
First, build and run to get inside the container.
cd path_of_Dockerfile
docker build -t pygrib_ubuntu .
docker run -it -v /somepath:/temp pygrib_ubuntu
You should now be inside the docker container.
If you run ls /temp and see Z__C_RJTD_20170102000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin, the mount succeeded.
Finally, test pygrib.
python
import pygrib
grbs = pygrib.open('/temp/Z__C_RJTD_20170102000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin')
for grb in grbs:
    print(grb)
The output should look like the following.
1:Pressure reduced to MSL:Pa (instant):regular_ll:meanSea:level 0:fcst time 0 hrs:from 201701020000
2:Surface pressure:Pa (instant):regular_ll:surface:level 0:fcst time 0 hrs:from 201701020000
3:10 metre U wind component:m s**-1 (instant):regular_ll:heightAboveGround:level 10 m:fcst time 0 hrs:from 201701020000
4:10 metre V wind component:m s**-1 (instant):regular_ll:heightAboveGround:level 10 m:fcst time 0 hrs:from 201701020000
...
172:Medium cloud cover:% (instant):regular_ll:surface:level 0:fcst time 15 hrs:from 201701020000
173:High cloud cover:% (instant):regular_ll:surface:level 0:fcst time 15 hrs:from 201701020000
174:Total cloud cover:% (instant):regular_ll:surface:level 0:fcst time 15 hrs:from 201701020000
175:Total precipitation:kg m-2 (accum):regular_ll:surface:level 0:fcst time 14-15 hrs (accum):from 201701020000
Output like the above means pygrib is working correctly.
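If you would rather run this check as a script than type it into the interpreter, a small wrapper along the following lines might be convenient. This is only a sketch: the helper name `summarize_grib` and the `limit` parameter are hypothetical, and it relies only on `pygrib.open`, iteration over the opened file, and `close()`. pygrib is imported inside the function so that a missing file is reported before the import is attempted.

```python
import os


def summarize_grib(path, limit=5):
    """Return the summary lines of the first `limit` messages in a GRIB2 file."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)

    import pygrib  # imported lazily; requires the environment built above

    grbs = pygrib.open(path)
    try:
        # str(grb) is the same one-line summary that print(grb) shows.
        return [str(grb) for _, grb in zip(range(limit), grbs)]
    finally:
        grbs.close()


if __name__ == "__main__":
    sample = "/temp/Z__C_RJTD_20170102000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin"
    if os.path.isfile(sample):
        for line in summarize_grib(sample):
            print(line)
    else:
        print("sample file not found:", sample)
```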
I plan to cover the specific usage of pygrib in a separate article.