Python is a convenient language for analyzing numerical data, but the first step of any analysis is loading the data. This post therefore summarizes how to read numerical data in several formats into a numpy array.
In every example below, the contents of the file end up in the variable data.
filename.csv
year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
2001,-0.4,-0.3,-0.2,-0.1,0,0,-0.1,-0.2,-0.3,-0.4,-0.5,-0.4
2002,-0.3,-0.1,0,0.3,0.4,0.5,0.6,0.7,0.8,1,1,1
2003,0.8,0.5,0,-0.2,-0.2,-0.2,-0.1,0.1,0.3,0.4,0.4,0.4
2004,0.4,0.3,0.1,0,-0.1,0,0,0.2,0.3,0.4,0.4,0.3
2005,0.2,0.1,0.1,0.1,0.2,0.3,0.2,0.1,-0.2,-0.5,-0.7,-0.8
2006,-0.8,-0.7,-0.5,-0.3,-0.2,0.1,0.3,0.4,0.6,0.8,0.9,0.8
2007,0.5,0.2,-0.2,-0.5,-0.6,-0.7,-0.9,-1.1,-1.3,-1.4,-1.5,-1.5
2008,-1.4,-1.1,-0.8,-0.5,-0.1,0.1,0.2,0.2,0.2,0,-0.2,-0.4
2009,-0.5,-0.5,-0.3,0,0.3,0.5,0.7,0.8,0.9,1,1,1.1
2010,1.1,0.9,0.7,0.3,0,-0.4,-0.8,-1.1,-1.3,-1.4,-1.5,-1.4
2011,-1.2,-0.9,-0.7,-0.4,-0.2,-0.2,-0.2,-0.4,-0.6,-0.8,-0.8,-0.7
2012,-0.6,-0.3,-0.1,0.1,0.3,0.4,0.5,0.5,0.4,0.2,0,-0.2
2013,-0.2,-0.3,-0.4,-0.4,-0.5,-0.6,-0.6,-0.5,-0.3,-0.2,-0.1,-0.2
2014,-0.2,-0.1,0,0.2,0.4,0.5,0.5,0.5,0.6,0.7,0.7,0.6
2015,0.5,0.5,0.6,0.8,1.2,99.9,99.9,99.9,99.9,99.9,99.9,99.9
To read text data that looks like the above when opened in Notepad:
import numpy as np
data = np.loadtxt('filename.csv', comments='year', delimiter=',', dtype='float')
-- comments: the string that marks the start of a comment. Everything from that string to the end of the line is ignored, so specifying 'year' here skips the header line.
-- delimiter: the separator between values. If the values are separated by whitespace, delimiter=... can be omitted.
-- dtype: the type to read the data as. The default is float (floating-point number); use int to read integers.
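As a variation on the call above, the header can also be skipped by position with skiprows instead of by its text. The sketch below is only an illustration: the three-column sample file is a made-up miniature of filename.csv, and treating 99.9 as a missing-value placeholder is an assumption based on how the 2015 row looks, not something the data file states.

```python
import os
import tempfile

import numpy as np

# Hypothetical miniature of filename.csv: a header row plus two data rows.
csv_text = (
    "year,Jan,Feb,Mar\n"
    "2014,-0.2,-0.1,0\n"
    "2015,0.5,99.9,99.9\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.csv")
    with open(path, "w") as f:
        f.write(csv_text)

    # skiprows=1 skips the header line by position rather than by its text.
    data = np.loadtxt(path, delimiter=",", skiprows=1)

years = data[:, 0].astype(int)  # first column holds the year
values = data[:, 1:]            # remaining columns hold the monthly values

# Assumption: 99.9 stands for "no data", so mask it before computing anything.
masked = np.ma.masked_values(values, 99.9)

print(years)
print(masked.mean(axis=1))  # per-year mean over the months that have data
```

Splitting off the year column keeps it from being mixed into statistics over the monthly values.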
import netCDF4
nc = netCDF4.Dataset('filename.nc', 'r')
data = nc.variables['varname'][:]
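-- The data comes back as a numpy array even though numpy is not imported explicitly. (Depending on the netCDF4 version and on whether the variable defines missing values, it may be a numpy masked array.)
-- Replace varname with the name of the variable to read.
-- Whatever the number of dimensions of the variable, the trailing [:] on the third line reads all of it.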
write_binary_2D.f90
program main
  implicit none
  integer,parameter::N=10,M=20
  integer::i,j
  real,dimension(1:N,1:M)::x
  open(10,file='filename.out',form='unformatted',access='direct',recl=N*4)
  do i = 1,N
    do j = 1,M
      x(i,j) = i+j*2
    end do
  end do
  do j = 1,M
    write(10,rec=j)(x(i,j),i=1,N)
  end do
  close(10)
end program main
Let's read filename.out as created by the program above: a headerless, little-endian binary file of 4-byte floating-point values, commonly called GrADS format.
import numpy as np
N = 10  # Number of values stored in each record.
M = 20  # Total number of records.
f = open('filename.out', 'rb')  # Open in binary mode.
dty = np.dtype([('data', '<' + str(N) + 'f')])
chunk = np.fromfile(f, dtype=dty, count=M)
data = np.array([chunk[j]['data'] for j in range(M)])
-- The last line is a one-line version of the following loop:
data = []
for j in range(M):
    data.append(chunk[j]['data'])
data = np.array(data)
-- chunk[k-1] corresponds to record number k in Fortran. For example, to retrieve only record number 6, replace the last line with:
data = chunk[5]['data']
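Because the file is just M records of N little-endian 4-byte floats with no header, the same result can also be obtained with a plain '<f4' dtype and a reshape. The sketch below is one possible alternative, not the method used above. It assumes a stand-in file written with numpy (sample.out, mimicking x(i,j) = i+j*2) so that it runs without the Fortran program, and it assumes numpy 1.17 or later for the offset argument of np.fromfile.

```python
import os
import tempfile

import numpy as np

N = 10  # values per record
M = 20  # number of records

# Stand-in for the Fortran output: x(i,j) = i + 2*j, one record per j.
i = np.arange(1, N + 1)
j = np.arange(1, M + 1)
expected = (i[np.newaxis, :] + 2 * j[:, np.newaxis]).astype("<f4")  # shape (M, N)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.out")
    expected.tofile(path)  # raw little-endian float32, no header

    # Read everything at once, then give it the (record, value) shape.
    data = np.fromfile(path, dtype="<f4").reshape(M, N)

    # Read only record number k by skipping the first k-1 records (in bytes).
    k = 6
    record6 = np.fromfile(path, dtype="<f4", count=N, offset=(k - 1) * N * 4)

print(data.shape)
print(record6)
```

Here data[k-1] plays the same role as chunk[k-1]['data'], and the offset form avoids loading the whole file when only one record is needed.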