This article is for people who do numerical calculations in Fortran and make figures and do analysis in Python. It assumes that the numerical calculation writes its output to a binary file and that numpy is used for the analysis on the Python side. The article first describes the Fortran binary output formats and then shows how to read each of them in Python.
Fortran has three kinds of binary output: sequential (sequential access), direct (direct access), and stream. Let's look at each. Note that binary output here means form = "unformatted", not form = "binary". (I don't know much about form = "binary".)
The sequential format writes records in order from the beginning of the file. Every write adds a 4-byte marker (possibly 8 bytes with old compilers) before and after the output, and each marker holds the number of bytes in that record. The file has to be read from the beginning (although it is not impossible to read it as a stream if you account for the number of bytes yourself ...).
real(4) :: a=1,b=2
open(10,file="test.seq", form="unformatted", action="write",access="sequential")
write(10) a
write(10) b
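To make the marker layout concrete, here is a minimal Python sketch that builds the bytes one would expect in test.seq. It assumes 4-byte markers and little-endian byte order, which is typical but not guaranteed for every compiler and machine; the helper name `sequential_record` is made up for this illustration.

```python
import struct

# Assumptions: 4-byte record markers and little-endian byte order.
def sequential_record(payload: bytes, endian: str = "<") -> bytes:
    """Wrap a payload the way one unformatted sequential write would."""
    marker = struct.pack(endian + "i", len(payload))  # byte count of the record
    return marker + payload + marker

a = struct.pack("<f", 1.0)  # real(4) :: a = 1
b = struct.pack("<f", 2.0)  # real(4) :: b = 2
image = sequential_record(a) + sequential_record(b)

# Two records, each 4 (marker) + 4 (data) + 4 (marker) bytes
print(len(image))
print(image.hex())
```

Comparing this with a hex dump of a real file is a quick way to check the marker size and byte order your compiler actually uses.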
The direct format writes fixed-length records, with the record length given by recl. Each write specifies a record number with rec, which determines the output position. So it is not always necessary to write from the beginning (although usually you do), and when reading you can conveniently jump straight to the record you want. With Intel's compiler (ifort), the default unit of recl is 4 bytes rather than 1 byte (for example, recl = 4 produces 16-byte records). It is safest to add the option -assume byterecl so that recl is given in bytes.
real(4) :: a=1,b=2
open(10,file="test.dir", form="unformatted", action="write",access="direct",recl=4)
write(10,rec=1) a
write(10,rec=2) b
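The position of a record follows directly from rec and recl. The sketch below is only an illustration of that arithmetic; the function `record_offset` and its `unit_bytes` parameter are hypothetical names, with `unit_bytes=4` standing in for the ifort default described above.

```python
def record_offset(rec: int, recl: int, unit_bytes: int = 1) -> int:
    """Byte offset of the 1-based record `rec` in a direct-access file.

    unit_bytes=1 models recl given in bytes (e.g. with -assume byterecl);
    unit_bytes=4 models recl given in 4-byte units.
    """
    if rec < 1:
        raise ValueError("Fortran record numbers start at 1")
    return (rec - 1) * recl * unit_bytes

print(record_offset(1, 4))                # 0
print(record_offset(2, 4))                # 4
print(record_offset(2, 4, unit_bytes=4))  # 16
```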
stream
Stream I/O was added in Fortran 2003. It is similar to sequential, except that no markers are added at the beginning and end of each write.
open(10,file="test.stm", form="unformatted", action="write",access="stream")
write(10) a
write(10) b ! pos advances automatically
The output position can also be specified in bytes with pos (the first byte of the file is pos=1). If you specify pos when reading, it is not necessary to read from the beginning.
Byte order can be big endian or little endian. If nothing is specified, the machine default is used. Either one is fine, but it is safest to pick one and use it consistently so that you always know which one a file uses. It can be specified at compile time as follows.
$ gfortran -fconvert=big-endian test.f90
$ ifort -convert big_endian -assume byterecl test.f90
It can also be specified in the open statement with convert (not part of the standard, but probably available as an extension in most compilers).
open(10, file="test.dat", form="unformatted", action="write", access="stream" , &
& convert="big_endian" )
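What the byte order changes is simply the order of the bytes of each value on disk. A small Python sketch using numpy, with the value 1.0 chosen arbitrarily for illustration:

```python
import numpy as np

value = 1.0
big = np.array([value], dtype=">f4").tobytes()     # big-endian real(4)
little = np.array([value], dtype="<f4").tobytes()  # little-endian real(4)

print(big.hex())     # 3f800000
print(little.hex())  # 0000803f

# Reading with the wrong byte order raises no error; it silently
# produces a different number.
wrong = np.frombuffer(big, dtype="<f4")[0]
print(wrong)
```

Because a mismatch is silent, it helps to record the byte order somewhere next to the data.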
A sequential file can be read with the standard Python library: read the raw bytes and convert them with np.frombuffer. With a class like the following, you can handle the file in a Fortran-like way. Since the result is a one-dimensional array, convert it with reshape if necessary. Only typical dtype codes are explained here. For more information, see [https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html#].
symbol | meaning |
---|---|
> | Big endian |
< | Little endian |
i4 | 4-byte integer |
f4 | 4-byte floating point |
f8 | 8-byte floating point |
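As a sketch of how these dtype codes and reshape fit together, the example below decodes a hypothetical Fortran array `real(4) :: x(3,2)` from raw big-endian bytes. The data is made up for illustration. Fortran stores arrays column by column, so `order="F"` is used to get the same index order back.

```python
import numpy as np

# Hypothetical data: real(4) :: x(3,2) holding 1..6 in memory order,
# stored as big-endian bytes.
raw = np.arange(1, 7, dtype=">f4").tobytes()

flat = np.frombuffer(raw, dtype=">f4")  # 1-D array of 6 elements
x = flat.reshape((3, 2), order="F")     # x[i, j] corresponds to Fortran x(i+1, j+1)

print(flat.dtype.itemsize)  # 4
print(x[:, 0])              # first column: 1, 2, 3
```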
An example follows. It reads one record holding 200 4-byte real numbers.
import numpy as np
import struct

class seq_read:
    def __init__(self, filename, endian=">"):
        self.f = open(filename, "rb")
        self.endian = endian  # byte order of the record markers

    def read(self, dtype):
        # Leading marker: number of bytes in this record
        num, = struct.unpack(self.endian + "i", self.f.read(4))
        data = np.frombuffer(self.f.read(num), dtype)
        # Trailing marker: same byte count; read it to move past the record
        num, = struct.unpack(self.endian + "i", self.f.read(4))
        return data

    def rewind(self):
        self.f.seek(0)

    def __del__(self):
        self.f.close()

### example ###
f = seq_read("test.seq", endian=">")
data = f.read(">f4")  # big endian 4-byte real
f.rewind()  # To the top of the file
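If you want every record of a sequential file at once, a loop over the markers works. The following is a minimal sketch, not a complete reader: it assumes 4-byte markers and does not handle very large records that a compiler may split up. The function name `read_all_records` and the file demo.seq are made up for this example.

```python
import os
import struct
import tempfile

import numpy as np

def read_all_records(path, dtype, endian="<"):
    """Return a list with one numpy array per sequential record."""
    records = []
    with open(path, "rb") as f:
        while True:
            head = f.read(4)
            if not head:  # clean end of file
                break
            if len(head) < 4:
                raise ValueError("truncated record marker")
            (nbytes,) = struct.unpack(endian + "i", head)
            payload = f.read(nbytes) if nbytes >= 0 else b""
            tail = f.read(4)
            # Leading and trailing markers must agree with the payload size
            if nbytes < 0 or len(payload) != nbytes or tail != head:
                raise ValueError("bad record markers (wrong endian?)")
            records.append(np.frombuffer(payload, dtype=dtype))
    return records

# Self-made test file: two records of big-endian real(4) values.
path = os.path.join(tempfile.mkdtemp(), "demo.seq")
with open(path, "wb") as f:
    for values in ([1.0, 2.0], [3.0]):
        body = np.array(values, dtype=">f4").tobytes()
        mark = struct.pack(">i", len(body))
        f.write(mark + body + mark)

recs = read_all_records(path, ">f4", endian=">")
print([r.tolist() for r in recs])  # [[1.0, 2.0], [3.0]]
```

Checking that the two markers agree is a cheap way to catch a wrong endian setting early.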
For direct access the data type and the record length are the same for every record, so set them when creating the instance.
import numpy as np
import struct

class dir_read:
    def __init__(self, filename, recl, dtype):
        self.f = open(filename, "rb")
        self.recl = recl  # record length in bytes
        self.dtype = dtype

    def read(self, rec):  # rec starts from 1 (Fortran-like)
        self.f.seek((rec - 1) * self.recl)
        data = np.frombuffer(self.f.read(self.recl), self.dtype)
        return data

    def __del__(self):
        self.f.close()

### example ###
f2 = dir_read("test.dir", 4 * 200, ">f4")  # 200 big endian 4-byte reals per record
print(f2.read(2))
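You can also read it using numpy.fromfile. Specify the byte position at which to start reading with offset (available in NumPy 1.17 and later), and specify the number of elements per record with dtype.
import numpy as np
recl = 200  # number of 4-byte elements per record (not bytes)
# offset=4*recl skips the first record, so this reads record 2
data = np.fromfile("test.dir",dtype=">"+str(recl)+"f4",count=1,offset=4*recl)[0]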
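Another option, not used in this article's classes, is np.memmap, which maps the file as an array so that each row is one record. The sketch below assumes every record is exactly 200 big-endian 4-byte reals with no padding, and it creates its own stand-in file demo.dir rather than relying on test.dir.

```python
import os
import tempfile

import numpy as np

n = 200  # elements per record (assumed, matching the example above)

# Self-made stand-in for a direct-access file: 3 records of n values.
path = os.path.join(tempfile.mkdtemp(), "demo.dir")
np.arange(3 * n, dtype=">f4").tofile(path)

# Map the file without loading it all; each row is one record.
records = np.memmap(path, dtype=">f4", mode="r").reshape(-1, n)

rec = 2                            # Fortran-style record number (1-based)
data = np.array(records[rec - 1])  # copy that record into memory
print(records.shape)  # (3, 200)
print(data[0])        # 200.0
```

This can be convenient for large files, since only the records you index are actually read.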
stream
You can read it with seek and np.frombuffer used above. seek plays the same role as pos, except that seek counts bytes from 0 while pos starts at 1, so anyone who can use stream output in Fortran should be able to do it right away.
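A minimal sketch of that idea follows. The helper `read_stream` is a made-up name, and an in-memory buffer stands in for test.stm, assuming a=1 and b=2 were written as big-endian 4-byte reals.

```python
import io

import numpy as np

def read_stream(f, pos, count, dtype):
    """Read `count` items starting at Fortran stream position `pos` (1-based)."""
    f.seek(pos - 1)  # Fortran pos=1 is byte offset 0
    nbytes = count * np.dtype(dtype).itemsize
    return np.frombuffer(f.read(nbytes), dtype=dtype)

# In-memory stand-in for test.stm: a=1, b=2 as big-endian real(4).
f = io.BytesIO(np.array([1.0, 2.0], dtype=">f4").tobytes())

print(read_stream(f, pos=5, count=1, dtype=">f4"))  # b starts at pos=5
print(read_stream(f, pos=1, count=2, dtype=">f4"))  # both values
```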