I recently had to process binary data in Python. It was my first time working with binary data and I had to look up a lot of things along the way, so I am writing them down here as a memo!

This time I worked with files that have the **.sl2** extension, so I will use .sl2 data as the example!

About binary data

The structure of binary data is defined by each file format. This time I was dealing with .sl2, a file type I had never seen before, and in a case like that the first step is to find out the structure of the .sl2 file one way or another. If you don't know the structure, you can't work with the data!

In my case, I referred to the following page, so the explanation below is based on it. (**As it turned out, this page was wrong ...**) https://wiki.openstreetmap.org/wiki/SL2

Header

Most binary data seems to have a header. This is a fixed-size region in the first few bytes that describes the data format, for example version information.

In this case, the "**Basic Structure**" section of the reference page gives a table together with the explanation quoted below. There seem to be several variants of the .sl2 file, but for the time being the header appears to be 10 bytes.

The files show up with a 10 byte header. First 2 bytes describe the format version (01=SLG, 02=SL2). Bytes 5,4 provide the block size. It varies depending on the sensor: 0x07b2 for HDI=Primary/Secondary+DSI and 0x0c80 for Sidescan). Seen values are
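The quoted description can be tried out on a few bytes. The following is only a minimal sketch: the header is synthetic, the filler bytes are an assumption, and "Bytes 5,4" is read here as a little-endian 16-bit value starting at offset 4.

```python
import struct

# Synthetic 10-byte header built only for illustration (not taken from a real file):
# format version 2 (SL2), two filler bytes, block size 0x0c80, four filler bytes.
header = struct.pack('<HHHHH', 2, 0, 0x0C80, 0, 0)

# '<' means little endian, 'H' means unsigned 16-bit integer.
(format_version,) = struct.unpack_from('<H', header, 0)
(block_size,) = struct.unpack_from('<H', header, 4)

print(len(header), format_version, hex(block_size))
```

With a real file, the same two calls would be applied to the first 10 bytes read from disk.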

Byte order

Roughly speaking, byte order is the order in which the bytes of a multi-byte value are arranged when the value is stored in memory or written to a file.

For details, please refer to one of the many explainer sites, for example "Byte order: What is Endian?".

As far as I can tell, the byte order is almost always either "big endian" or "little endian", so you need to find out which one the format uses.

In this case, I could tell it is little endian from the following description in the "**Basic Structure**" section of the reference page.

The file is a binary file with little endian format. Float values are stored as IEEE 754 floating-point "single format".
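To make the difference concrete, here is a small sketch that interprets the same two bytes in both byte orders. The byte values are picked for illustration (they correspond to the 0x07b2 block size mentioned in the header description), and the float check only confirms that a single-precision value occupies 4 bytes.

```python
import struct

raw = bytes([0xB2, 0x07])  # two bytes as they would appear in a file

little = int.from_bytes(raw, byteorder='little')  # low byte first
big = int.from_bytes(raw, byteorder='big')        # high byte first

print(hex(little), hex(big))

# IEEE 754 single precision: 4 bytes, here written in little-endian order.
float_bytes = struct.pack('<f', 1.5)
print(len(float_bytes), float_bytes.hex())
```

Reading with the wrong byte order does not raise an error, it just produces a different number, which is why this has to be checked up front.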

Byte block

Finally, we read the data that follows the header. On the reference page, the data type and length of each item in a block are defined in a table.

First, look at the description column (the rightmost column of that table) and pick the data you want to extract. Once you have decided what to extract, check the data type (variable type) and the offset of each item.

The offset is a position relative to a reference point, in other words the address of the data. Each set of data here starts with a 144-byte frame, so the offset tells you at which byte within that frame the value is written.

This should close the 144 byte frame.
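The idea of an offset relative to the start of a frame can be sketched as follows. The frame is synthetic and filled with zeros; offsets 30 and 62 are the ones used for the channel and the water depth later in this article, and the values written there are made up.

```python
import struct

HEADER_SIZE = 10   # file header size stated above
FRAME_SIZE = 144   # frame size stated above

# Build one synthetic frame and place two made-up values at known offsets.
frame = bytearray(FRAME_SIZE)
struct.pack_into('<H', frame, 30, 2)      # channel (unsigned short)
struct.pack_into('<f', frame, 62, 12.5)   # water depth in feet (float)

data = bytes(HEADER_SIZE) + bytes(frame)

# The absolute position is the start of the frame plus the relative offset.
block_offset = HEADER_SIZE
(channel,) = struct.unpack_from('<H', data, block_offset + 30)
(depth_ft,) = struct.unpack_from('<f', data, block_offset + 62)

print(channel, depth_ft)
```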

Python struct module

So far I have organized what I learned about binary data. To handle this binary data in Python, we use a module called **struct**. See the official documentation for details.

About byte order

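The official documentation has the following table, which defines the characters that represent the byte order. The character is written as the first character of the format string.

| Character | Byte order | Size | Alignment |
|---|---|---|---|
| @ | native | native | native |
| = | native | standard | none |
| < | little-endian | standard | none |
| > | big-endian | standard | none |
| ! | network (= big-endian) | standard | none |

Since .sl2 is little endian, the character to use here is <.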
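As a quick check of how the byte-order character changes the result, the sketch below packs the same value with different prefixes. The value 0x0C80 is just an example.

```python
import struct

value = 0x0C80

little = struct.pack('<H', value)   # low byte first
big = struct.pack('>H', value)      # high byte first
network = struct.pack('!H', value)  # network order is big endian

print(little.hex(), big.hex(), network.hex())

# Unpacking big-endian bytes as little endian silently gives another number.
(misread,) = struct.unpack('<H', big)
print(hex(misread))
```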

About data format

As confirmed in the "Byte block" section, each piece of data has its own data type (variable type). The processing has to change depending on the data type, but with struct you only need to pass the type information as a format character, and it takes care of the conversion for you.

However, the type names used by the file format's documentation may differ from those in the official Python documentation, so you need to translate them. In this case, the mapping is as follows.

short int → unsigned short(H)
int       → unsigned long(L)
byte(int) → unsigned char(B)
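One way to double-check such a mapping is to ask struct how many bytes each format character occupies. This sketch assumes the little-endian prefix <, which selects the standard sizes; without a prefix the size of L depends on the platform.

```python
import struct

# Standard sizes when a byte-order prefix such as '<' is given.
sizes = {fmt: struct.calcsize(fmt) for fmt in ('<B', '<H', '<L', '<f')}
print(sizes)

# Without a prefix, native sizes are used and 'L' may be 4 or 8 bytes.
native_long = struct.calcsize('L')
print(native_long)
```

If a size does not match the length given in the format description, the chosen format character is probably wrong.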

Reading binary data

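A file can be read as binary data by passing rb (read binary) as the mode when opening it. The order of the two letters does not matter, so the 'br' used in the full script further down is equivalent.

with open(file_name, 'rb') as f:
    data = f.read()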
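For trying this out without a real .sl2 file, the following sketch writes a few made-up bytes to a temporary file and reads them back in binary mode. The payload and the temporary file are assumptions for the demo only.

```python
import os
import tempfile

payload = bytes([0x02, 0x00, 0x00, 0x00, 0x80, 0x0C])  # made-up content

# Write the payload to a temporary file so there is something to read.
with tempfile.NamedTemporaryFile(suffix='.sl2', delete=False) as tmp:
    tmp.write(payload)
    file_name = tmp.name

try:
    with open(file_name, 'rb') as f:
        data = f.read()   # a bytes object, not a str
finally:
    os.remove(file_name)

# Indexing a bytes object yields an int between 0 and 255.
print(type(data).__name__, len(data), data[4])
```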

Interpretation of binary data

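You can convert the binary data you have read with the struct.unpack_from() function. The basic form is struct.unpack_from(format, buffer, offset), where the format string combines the byte order and the data type. The return value is always a tuple, even when the format describes a single value. The data types and offsets are already known, so all that is left is to specify them!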
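A small sketch of the return value, using a synthetic buffer: offsets 30 and 32 are the channel and packet size offsets used in this article, and the values stored there are made up.

```python
import struct

data = bytearray(40)                        # synthetic buffer of zeros
struct.pack_into('<HH', data, 30, 1, 1920)  # made-up channel and packet size

# A single-value format still returns a one-element tuple.
result = struct.unpack_from('<H', data, 30)
channel = result[0]

# Adjacent fields can be read in one call with a longer format string.
channel_again, packet_size = struct.unpack_from('<HH', data, 30)

print(result, channel, packet_size)
```

This is why the full script indexes each result with [0] before using it.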

Below is the whole script. It turned out longer than I expected, but basically it skips the header first and then, block by block, unpacks each item at its offset.

import sys
import struct

POLAR_EARTH_RADIUS = 6356752.3142
# PI = math.pi
MAX_UINT4 = 4294967295
FT2M = 1/3.2808399  # factor for feet to meter conversions
KN2KM = 1.852       # factor for knots to km/h conversions

args = sys.argv

# sys.argv has no second element when no file is given, so check the length
if len(args) < 2:
    print('Usage:  python sl2decoder.py your_file.sl2')
    sys.exit(1)

block_offset = 0

#Shift the header by 10 bytes
block_offset += 10   

# Datatypes:
# ===================================================================================================
# Type    Definition                                          Format character for Python's struct
# ---------------------------------------------------------------------------------------------------
# byte    UInt8                                               B
# short   UInt16LE                                            H
# int     UInt32LE                                            L
# float   FloatLE (32 bits IEEE 754 floating point number)    f
# flags   UInt16LE                                            H
# ---------------------------------------------------------------------------------------------------

#Define offset and data type for each item
block_def = {
    'blockSize'         : {'offset': 26, 'type': '<H'},
    #  'lastBlockSize': {'offset': 28, 'type': '<H'},
    'channel'           : {'offset': 30, 'type': '<H'},
    'packetSize'        : {'offset': 32, 'type': '<H'},
    'frameIndex'        : {'offset': 34, 'type': '<L'},
    'upperLimit'        : {'offset': 38, 'type': '<f'},
    'lowerLimit'        : {'offset': 42, 'type': '<f'},
    'frequency'         : {'offset': 51, 'type': '<B'},
    #  'time1': {'offset': 58, 'type': '<H'}          # unknown resolution, unknown epoch
    'waterDepthFt'      : {'offset': 62, 'type': '<f'},  # in feet
    'keelDepthFt'       : {'offset': 66, 'type': '<f'},  # in feet
    'speedGpsKnots'     : {'offset': 98, 'type': '<f'},  # in knots
    'temperature'       : {'offset': 102, 'type': '<f'}, # in °C
    'lowrance_longitude': {'offset': 106, 'type': '<L'}, # Lowrance encoding (easting)
    'lowrance_latitude' : {'offset': 110, 'type': '<L'}, # Lowrance encoding (northing)
    'speedWaterKnots'   : {'offset': 114, 'type': '<f'}, # from "water wheel sensor" if present, else GPS value(?)
    'courseOverGround'  : {'offset': 118, 'type': '<f'}, # course over ground in radians
    'altitudeFt'        : {'offset': 122, 'type': '<f'}, # in feet
    'heading'           : {'offset': 126, 'type': '<f'}, # in radians
    'flags'             : {'offset': 130, 'type': '<H'},
    #  'time': {'offset': 138, 'type': '<H', 'len': 4}          # unknown resolution, unknown epoch
}

# args[0] is the script name, so name the output after the input file (args[1])
with open('%s_output_py.csv' % args[1], 'w') as f_raw:
    title = ','.join(['Channel', 'Frequency', 'UpperLimit[ft]', 'LowerLimit[ft]', 'Depth[ft]', 'WaterTemp[C]', 'WaterSpeed[kn]',
 'PositionX', 'PositionY', 'Speed[kn]', 'Track[rad]','Altitude[ft]', 'Heading[rad]']) + '\n'
    f_raw.write(title)

    alive_counter = 0

    with open(args[1], 'br') as f:
        data = f.read()
        sl2_file_size = len(data)

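        while block_offset < sl2_file_size:
            # Stop if the remaining bytes cannot hold a full 144-byte frame
            if block_offset + 144 > sl2_file_size:
                break

            h = {}
            if alive_counter % 100 == 0:
                print('%d done...' % round(100.0*block_offset/sl2_file_size))
            alive_counter += 1

            for k, v in block_def.items():
                t_offset = block_offset + v['offset']
                # unpack_from returns a tuple, so values are read with [0] below
                h[k] = struct.unpack_from(v['type'], data, t_offset)

            print(h['blockSize'])  # debug output
            if h['blockSize'][0] == 0:
                break  # a block size of 0 would loop forever
            block_offset += h['blockSize'][0]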

            #Combine into one line of data
            csv_line = ','.join([str(h['channel'][0]), str(h['frequency'][0]), 
                                 str(h['upperLimit'][0]), str(h['lowerLimit'][0]), 
                                 str(h['waterDepthFt'][0]), str(h['temperature'][0]), 
                                 str(h['speedWaterKnots'][0]), str(h['lowrance_longitude'][0]), 
                                 str(h['lowrance_latitude'][0]), str(h['speedGpsKnots'][0]), 
                                 str(h['courseOverGround'][0]), str(h['altitudeFt'][0]), 
                                 str(h['heading'][0])]) + '\n'

            f_raw.write(csv_line)

print('Read up to block_offset %d' % block_offset)
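The block loop in the script above could also be packaged as a generator. This is only a sketch under the same assumptions as the script (10-byte file header, 144-byte frame, advancing by blockSize). The names iter_blocks and make_block are hypothetical helpers introduced here, BLOCK_DEF is a subset of block_def, and the data is synthetic.

```python
import struct

HEADER_SIZE = 10
FRAME_SIZE = 144

# Subset of block_def from the script: name -> (offset, format).
BLOCK_DEF = {
    'blockSize': (26, '<H'),
    'channel': (30, '<H'),
    'waterDepthFt': (62, '<f'),
}

def iter_blocks(data, block_def=BLOCK_DEF, start=HEADER_SIZE):
    """Yield one dict of plain values per block until the data runs out."""
    offset = start
    while offset + FRAME_SIZE <= len(data):
        block = {name: struct.unpack_from(fmt, data, offset + rel)[0]
                 for name, (rel, fmt) in block_def.items()}
        if block['blockSize'] == 0:
            break  # avoid looping forever on corrupt data
        yield block
        offset += block['blockSize']

def make_block(size, channel, depth_ft):
    """Build a synthetic block for the demo (hypothetical helper)."""
    block = bytearray(size)
    struct.pack_into('<H', block, 26, size)
    struct.pack_into('<H', block, 30, channel)
    struct.pack_into('<f', block, 62, depth_ft)
    return bytes(block)

data = bytes(HEADER_SIZE) + make_block(200, 0, 10.0) + make_block(150, 1, 12.5)
blocks = list(iter_blocks(data))
for b in blocks:
    print(b)
```

Taking [0] inside the generator means the caller gets plain numbers instead of one-element tuples, which keeps the CSV-writing code shorter.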

Try working with binary data in Python