At first glance the file looks like a PNG file, and a PNG decoder can display the image without any problems. However, if you change the extension to .zip (on an OS that determines the file type by extension), it can be handled as a ZIP file. In this article, a ZIP file disguised as a PNG file is called a "camouflage file", and creating such a file is called "impersonating a ZIP file as a PNG file".
In other words, the goal is to create a file that can be used both as a PNG file and as a ZIP file.
The intended use is to combine an existing small PNG file with an existing small ZIP file.
Seen the other way around, this can also be regarded as disguising a PNG file as a ZIP file.
There are some restrictions.
If you modify the file as a PNG or as a ZIP (applying a filter to the image as a PNG, adding or deleting entries as a ZIP, and so on), it will most likely no longer work as a camouflage file.
There may be other implementation-dependent restrictions that I am not aware of.
I won't go into the detailed specifications of the ZIP file format. ZIP64 and split ZIP archives are also out of scope.
The basic structure of a ZIP file is easiest to understand by following it from the `End of Central Directory` record (hereafter `EOCD`) near the end of the file.
`EOCD` holds the position of the `Central Directory` (hereafter `CEN`), that is, its offset from the beginning of the file. There is one `CEN` entry for each file in the archive, and each `CEN` holds the position (offset) of the corresponding `Local file header` (hereafter `LOC`).
Each `LOC` is followed by the body of the file (compressed, uncompressed, encrypted, plaintext, etc.). I have omitted many details.
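As a rough illustration of this chain (`EOCD` to `CEN` to `LOC`), here is a minimal Python sketch. The function name `read_central_directory` is made up for this example, and it assumes a simple archive: no ZIP64, no split archive, and no stray `EOCD` signature inside a comment.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # End of Central Directory record
CEN_SIG = b"PK\x01\x02"   # Central Directory file header


def read_central_directory(data):
    """Return (cen_offset, [(name, loc_offset), ...]) for a simple ZIP."""
    # Simplification: take the last occurrence of the EOCD signature.
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("EOCD not found")
    # EOCD +10: total number of entries (2 bytes)
    # EOCD +16: offset of the first CEN from the start of the file (4 bytes)
    (total,) = struct.unpack_from("<H", data, eocd + 10)
    (cen_offset,) = struct.unpack_from("<I", data, eocd + 16)

    entries = []
    pos = cen_offset
    for _ in range(total):
        if data[pos:pos + 4] != CEN_SIG:
            raise ValueError("CEN signature expected at offset %d" % pos)
        # CEN +28/+30/+32: lengths of file name, extra field, comment
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        # CEN +42: offset of the matching LOC from the start of the file
        (loc_offset,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46:pos + 46 + name_len].decode("utf-8", "replace")
        entries.append((name, loc_offset))
        pos += 46 + name_len + extra_len + comment_len
    return cen_offset, entries


# Demo on a small archive built in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "first")
    zf.writestr("b.txt", "second")
zip_bytes = buf.getvalue()

cen_offset, entries = read_central_directory(zip_bytes)
for name, loc_offset in entries:
    # Every LOC starts with the signature b"PK\x03\x04".
    print(name, loc_offset, zip_bytes[loc_offset:loc_offset + 4])
```

Searching backwards for the signature is a shortcut; a careful reader would also validate the comment length stored in the `EOCD`.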
Because of this structure, a file can still be handled as a normal ZIP file even if it contains extra data at the beginning, at the end, or between the individual records.
As a specific example, a self-extracting ZIP file (EXE file) can be treated as a normal ZIP file by most ZIP archivers. From a different point of view, the self-extracting ZIP file can be treated as both an EXE file and a ZIP file.
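A small experiment along these lines, using Python's standard `zipfile` module as the reader: the 100 bytes of padding stand in for an EXE stub and are purely an assumption of this sketch. Other readers may be stricter than `zipfile` about uncorrected offsets.

```python
import io
import zipfile

# Build a small ZIP archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello")
plain_zip = buf.getvalue()

# Put unrelated bytes in front, as a self-extracting stub would.
prefixed = b"\x00" * 100 + plain_zip

# zipfile locates the EOCD from the end of the file and still reads the entry.
with zipfile.ZipFile(io.BytesIO(prefixed)) as zf:
    names = zf.namelist()
    content = zf.read("hello.txt")
```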
Due to these characteristics, even if you store the entire ZIP file in another file, it can be treated as a normal ZIP file.
However, when embedding an existing ZIP file in another file, it is necessary to properly correct the part that has the offset from the beginning of the file.
The specific places to correct are the offset of `CEN` in `EOCD` and the offset of `LOC` in each `CEN`.
(If the file has ZIP64 extended data, there may be offsets there too, but I will ignore that this time.)
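The correction itself could look roughly like the following Python sketch. `shift_zip_offsets` is a hypothetical helper name, and the code assumes a simple archive (no ZIP64, `EOCD` signature found by a plain backwards search).

```python
import io
import struct
import zipfile


def shift_zip_offsets(zip_data, shift):
    """Return a copy of zip_data with its CEN and LOC offsets increased by shift."""
    out = bytearray(zip_data)
    eocd = zip_data.rfind(b"PK\x05\x06")
    if eocd < 0:
        raise ValueError("EOCD not found")
    (total,) = struct.unpack_from("<H", zip_data, eocd + 10)
    (cen_offset,) = struct.unpack_from("<I", zip_data, eocd + 16)

    pos = cen_offset
    for _ in range(total):
        if zip_data[pos:pos + 4] != b"PK\x01\x02":
            raise ValueError("CEN signature expected at offset %d" % pos)
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", zip_data, pos + 28)
        # Correct the LOC offset stored in this CEN (at CEN +42).
        (loc_offset,) = struct.unpack_from("<I", zip_data, pos + 42)
        struct.pack_into("<I", out, pos + 42, loc_offset + shift)
        pos += 46 + name_len + extra_len + comment_len

    # Correct the CEN offset stored in the EOCD (at EOCD +16).
    struct.pack_into("<I", out, eocd + 16, cen_offset + shift)
    return bytes(out)


# Demo: embed a ZIP behind 41 bytes of other data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "first")
plain_zip = buf.getvalue()

prefix = b"X" * 41
embedded = prefix + shift_zip_offsets(plain_zip, len(prefix))

with zipfile.ZipFile(io.BytesIO(embedded)) as zf:
    restored = zf.read("a.txt")
```

After the shift, every stored offset is again measured from the beginning of the enclosing file.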
By the way, the byte order of ZIP files is little endian.
I won't go into the detailed specifications of the PNG file format.
The basic structure of PNG is that it has an 8-byte PNG file signature at the beginning, followed by multiple chunks. Some chunks are mandatory chunks, while others are optional chunks (auxiliary chunks).
Some chunks have a fixed position (e.g., the `IHDR` chunk comes first and the `IEND` chunk comes last), but others can appear in any order.
Each chunk consists of "length (4 bytes)", "chunk type (4 bytes)", "chunk data (arbitrary number of bytes)", and "CRC (4 bytes)". Byte order is big endian.
"Length" is the length of the "chunk data" only; it does not include the length field itself, the "chunk type", or the "CRC". Its maximum value is $2^{31}-1$ bytes. "CRC" is the CRC32 computed over the "chunk type" and the "chunk data".
The chunk type consists of 4 ASCII letters and is case sensitive. (The specification says it should be treated as binary data rather than as characters, but I describe it in terms of characters to keep the explanation simple.)
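The chunk layout described above can be sketched in Python as follows. `make_chunk` and `iter_chunks` are names chosen for this example, and the 1x1 grayscale image is just a tiny test input.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, chunk_data):
    """Serialize one chunk: length, type, data, CRC (all big endian)."""
    # The CRC covers the chunk type and the chunk data, not the length.
    crc = zlib.crc32(chunk_type + chunk_data) & 0xFFFFFFFF
    return (struct.pack(">I", len(chunk_data)) + chunk_type
            + chunk_data + struct.pack(">I", crc))


def iter_chunks(png):
    """Yield (chunk_type, chunk_data) for each chunk, verifying the CRC."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        chunk_type = png[pos + 4:pos + 8]
        chunk_data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from(">I", png, pos + 8 + length)
        if crc != zlib.crc32(chunk_type + chunk_data) & 0xFFFFFFFF:
            raise ValueError("CRC mismatch in chunk %r" % chunk_type)
        yield chunk_type, chunk_data
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


# Demo: a 1x1, 8-bit grayscale PNG built by hand.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 13 bytes
idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
png = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

chunk_types = [t for t, _ in iter_chunks(png)]
```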
The case of each of the four characters also carries meaning (in the specification, this is determined by whether bit 5, value 0x20, of each byte is on or off).
| Character | Name of bit 5 | Uppercase (bit off) | Lowercase (bit on) |
|---|---|---|---|
| 1st | Auxiliary bit | Mandatory chunk | Auxiliary chunk |
| 2nd | Private bit | Public chunk | Private chunk |
| 3rd | Reserved bit (for future expansion) | Always uppercase | Must not be used |
| 4th | Copyable bit | Must not be copied when the image is changed | May be copied when the image is changed |
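The table can be turned into a small check in Python. `chunk_properties` and its dictionary keys are names invented for this sketch, following the terms used in the table.

```python
def chunk_properties(chunk_type):
    """Decode the four property bits (bit 5 of each byte) of a chunk type."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly 4 bytes")
    # Bit 5 (0x20) is set for lowercase ASCII letters.
    lower = [bool(b & 0x20) for b in chunk_type]
    return {
        "auxiliary": lower[0],  # lowercase: decoder may ignore if unknown
        "private": lower[1],    # lowercase: application-specific chunk
        "reserved": lower[2],   # must be uppercase (False) for now
        "copyable": lower[3],   # lowercase: editor may copy if unknown
    }


for chunk_type in (b"IHDR", b"IDAT", b"tEXt"):
    print(chunk_type, chunk_properties(chunk_type))
```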
Mandatory chunks are required to display the image, so a PNG decoder must treat an unknown mandatory chunk as an error. Conversely, it can ignore an unknown auxiliary chunk.
Public chunks are those registered in the specification or in the public chunk list, while private chunks are application-specific. In general, I think an application needs some safeguard against name collisions between private chunks.
Unlike the other bits, the copyable bit is aimed at PNG editors (programs that modify PNG files, for example by applying filters) rather than PNG decoders: it indicates whether an unknown chunk may be copied unchanged when the image is processed.
This time, I will define my own "chunk that stores a ZIP file" as an auxiliary chunk. Any chunk type would do, but since it is a "ZIP container chunk", I will use `ziPc`. I will place it immediately after the `IHDR` chunk. Because the entire ZIP file must fit inside one chunk, the ZIP file to be impersonated must be smaller than 2 GB.
The "ZIP container chunk" is placed immediately after the `IHDR` chunk because the `IHDR` chunk has a fixed length, so the offset of the ZIP data is likely to stay the same even after the file has been processed by a PNG editor.
Of course, this no longer works if the editor inserts some other chunk between the `IHDR` chunk and the `ziPc` chunk, or deletes the `ziPc` chunk.
Finally, organize the impersonation procedure. Suppose you want to generate a camouflage file based on an existing ZIP file and PNG file.
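1. Write the PNG file signature and the `IHDR` chunk of the existing PNG file.
2. Write the "length" and the chunk type `ziPc` of the ZIP container chunk.
3. Write the existing ZIP file up to just before its first `CEN` (the "chunk data" starts here).
4. Write each `CEN` of the existing ZIP file, correcting its `LOC` offset.
5. Write the `EOCD` of the existing ZIP file, correcting its `CEN` offset.
6. If there is any other data besides the `CEN` and `EOCD` in the existing ZIP file, copy it as is (the "chunk data" ends here).
7. Write the "CRC" of the ZIP container chunk.
8. Write the remaining chunks of the existing PNG file.

That's all.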
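Put together, the procedure can be sketched in Python roughly as follows. This is not the author's implementation; `embed_zip_in_png` and `make_chunk` are names made up here, and the code assumes a simple ZIP (no ZIP64) and a PNG whose first chunk is a standard 13-byte `IHDR`.

```python
import io
import struct
import zipfile
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
IHDR_END = 8 + 4 + 4 + 13 + 4  # signature + complete IHDR chunk = 33 bytes


def make_chunk(chunk_type, chunk_data):
    crc = zlib.crc32(chunk_type + chunk_data) & 0xFFFFFFFF
    return (struct.pack(">I", len(chunk_data)) + chunk_type
            + chunk_data + struct.pack(">I", crc))


def embed_zip_in_png(png, zip_data, chunk_type=b"ziPc"):
    """Insert zip_data as a chunk right after IHDR, fixing the ZIP offsets."""
    if png[:8] != PNG_SIGNATURE or png[12:16] != b"IHDR":
        raise ValueError("not a PNG file starting with IHDR")
    # The ZIP bytes start after the signature, the IHDR chunk, and the
    # 4-byte length and 4-byte type of the new chunk.
    shift = IHDR_END + 8
    body = bytearray(zip_data)
    eocd = zip_data.rfind(b"PK\x05\x06")
    if eocd < 0:
        raise ValueError("EOCD not found")
    (total,) = struct.unpack_from("<H", zip_data, eocd + 10)
    (cen_offset,) = struct.unpack_from("<I", zip_data, eocd + 16)
    pos = cen_offset
    for _ in range(total):
        if zip_data[pos:pos + 4] != b"PK\x01\x02":
            raise ValueError("CEN signature expected at offset %d" % pos)
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", zip_data, pos + 28)
        (loc_offset,) = struct.unpack_from("<I", zip_data, pos + 42)
        struct.pack_into("<I", body, pos + 42, loc_offset + shift)  # LOC offset
        pos += 46 + name_len + extra_len + comment_len
    struct.pack_into("<I", body, eocd + 16, cen_offset + shift)  # CEN offset
    return png[:IHDR_END] + make_chunk(chunk_type, bytes(body)) + png[IHDR_END:]


# Demo inputs: a 1x1 grayscale PNG and a one-entry ZIP, both built in memory.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
png = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
       + make_chunk(b"IEND", b""))
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello")
zip_data = buf.getvalue()

disguised = embed_zip_in_png(png, zip_data)

# The same bytes can now be opened as a ZIP archive.
with zipfile.ZipFile(io.BytesIO(disguised)) as zf:
    extracted = zf.read("hello.txt")
```

For brevity the sketch patches the offsets in a copy of the ZIP data and then wraps the whole thing in one chunk, instead of streaming the pieces out one by one; the resulting bytes are laid out the same way.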
None of the steps is particularly difficult (the hardest part is the CRC32 calculation), so I think it can be implemented in any language that can manipulate binary files.
The caveat is that ZIP is little endian and PNG is big endian.
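In Python, for example, the difference comes down to the byte-order prefix passed to `struct`; the value `0x12345678` is just an arbitrary example.

```python
import struct

value = 0x12345678

zip_style = struct.pack("<I", value)  # little endian, as in ZIP headers
png_style = struct.pack(">I", value)  # big endian, as in PNG chunks

print(zip_style.hex())  # 78563412
print(png_style.hex())  # 12345678
```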
For reference, I have put example implementations in Java / JavaScript / Python on GitHub.