I may get stuck on this again someday, so here is a memo.
I still develop in a 32-bit Linux environment, and a certain system tripped me up. The system manages files with a C program, but when I tried it on a large file of several GBytes, it did not work at all.
Because the file is huge, each reproduction takes time, and the cause I finally found after some frustrating debugging was very simple.
**The file is too big to fopen!**
On 32-bit Linux the limit seems to be about 2 GBytes. (Maybe that is only my environment. Of course there has to be an upper limit somewhere, but I did not expect to actually run into it...) Moreover, this system includes an application, written in C, for splitting files, and it cannot even open the file. Can't believe it…
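When fopen fails like this, printing errno narrows things down quickly. The following is a minimal sketch, not code from the system described here; `try_open` is a hypothetical helper name. As far as the open(2) man page says, a program built on a 32-bit platform without large-file support gets EOVERFLOW when the file exceeds 2^31 - 1 bytes, so that is the message to look for.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open `path` for reading and report why it failed.
 * Returns 0 on success, or the errno value left by fopen on failure. */
int try_open(const char *path)
{
    errno = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int err = errno; /* save it before another call can overwrite it */
        fprintf(stderr, "fopen(%s) failed: %s\n", path, strerror(err));
        return err;
    }
    fclose(fp);
    return 0;
}
```

Comparing the return value against EOVERFLOW distinguishes the "file too large" case from an ordinary missing file or permission problem.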
2018/04/23 postscript: In a comment, @angel_p_57 told me that there is a compile option called **_FILE_OFFSET_BITS**.
Following the man page and some OSS code I had on hand, I added the following to CFLAGS, and fopen now works on the large file as is!
-D_FILE_OFFSET_BITS=64
It's wonderful that glibc lets you change this behavior with just one option!
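To check that the option actually took effect, one can look at the width of off_t. This is a sketch under the assumption of glibc on Linux; `off_t_bits` and `stream_size` are hypothetical names. The `#define` at the top has the same effect as passing -D_FILE_OFFSET_BITS=64 in CFLAGS, but it must come before every `#include`. Note that fseek and ftell still use `long`, which is 32 bits on 32-bit Linux, so the sketch uses fseeko and ftello, which work in off_t.

```c
#define _FILE_OFFSET_BITS 64 /* same as -D_FILE_OFFSET_BITS=64; must precede all includes */

#include <stdio.h>
#include <sys/types.h>

/* Width of off_t in this build; 64 when large-file support is enabled. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}

/* Size of an open stream in bytes, or -1 on error.
 * The current position is restored before returning. */
long long stream_size(FILE *fp)
{
    off_t saved = ftello(fp);
    if (saved < 0 || fseeko(fp, 0, SEEK_END) != 0)
        return -1;
    off_t end = ftello(fp);
    if (fseeko(fp, saved, SEEK_SET) != 0)
        return -1;
    return (long long)end;
}
```

If `off_t_bits()` reports 32 on the target, the flag did not reach that translation unit, which is worth ruling out before writing any workaround.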
Looking into various ways to split files, I came across the **split command**. It is a Linux command for splitting files, and it was the only thing that handled the large file without trouble.
So how should we deal with it? The best approach would be to look at how split works internally, but I don't have time, so **let's make an fopen wrapper using the split command!**
Call an open interface instead of fopen. If the file is large, split it with /usr/bin/split and save the pieces in the tmp directory. Then, when fread reaches the end of a piece, open the next piece and continue reading.
//fopen wrapper
void * large_freader_open(const char *path, unsigned long maxsize) {
//....
//check the size and, if it is too large, split the file with separate_file
unsigned long fsize = get_size(path);
if(fsize<=maxsize) {
//same as fopen
handle->fp=fopen(path, "r");
handle->max_index=1;
} else {
//split the file and open the pieces in order
separate_file(path, maxsize, handle);
handle->fp = freader_fopen(handle);
}
//..
//return the internal struct instead of FILE *
return handle;
//..
}
static void separate_file(const char *path, unsigned long maxsize, struct large_freader_s *handle) {
//...
//get the tmp directory and split the file
char name[FNAME_MAX];
get_current_dirname(handle, name, FNAME_MAX);
snprintf(cmd, sizeof(cmd), "/usr/bin/split -d --suffix-length=6 -b %lu %s %s", maxsize, path, name);
//...
}
//fread wrapper
size_t large_freader_read(void * prt, size_t size,void * stream) {
//...
//first, do an ordinary fread
size_t ret = fread(prt, 1, size, handle->fp);
if(ret == size) {
//read succeeded, return normally
return ret;
}
//if fewer bytes than requested were read, move on to the next file
freader_fclose(handle);
//move to next
handle->cur_index++;
//if all files have been read, we are done
if(IS_LAST_FILE(handle)) {
//finished reading
return ret;
}
//otherwise, open the next file and read from it
handle->fp = freader_fopen(handle);
if(handle->fp) {
ret += fread(((char *)prt)+ret, 1, size-ret, handle->fp);
}
return ret;
}
//internal fopen handling
static FILE * freader_fopen(struct large_freader_s *handle) {
//get the name of the split file and fopen it; the file is deleted on close
char dname[FNAME_MAX];
get_current_fname(handle, dname, FNAME_MAX);
return fopen(dname, "r");
}
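The excerpt above moves on to the next piece at most once per call, so a request larger than one piece could come back short. As a self-contained illustration of the same idea, here is a sketch that keeps advancing until the request is satisfied or the pieces run out. It is not the library's code: `struct chunk_reader` and the `chunk_reader_*` functions are hypothetical names, and the file naming assumes pieces produced by `split -d --suffix-length=6`, that is, the prefix followed by 000000, 000001, and so on.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reader state; not the library's actual struct. */
struct chunk_reader {
    const char *prefix; /* output prefix that was passed to split */
    int index;          /* number of the piece currently open */
    FILE *fp;           /* NULL once every piece has been consumed */
};

/* Open piece number r->index, named <prefix>NNNNNN.
 * Returns NULL if the name does not fit or the file does not exist. */
static FILE *open_chunk(const struct chunk_reader *r)
{
    char name[4096];
    int n = snprintf(name, sizeof(name), "%s%06d", r->prefix, r->index);
    if (n < 0 || (size_t)n >= sizeof(name))
        return NULL;
    return fopen(name, "rb");
}

/* Start reading at piece 0. Returns 0 on success, -1 if it cannot be opened. */
int chunk_reader_open(struct chunk_reader *r, const char *prefix)
{
    r->prefix = prefix;
    r->index = 0;
    r->fp = open_chunk(r);
    return r->fp ? 0 : -1;
}

/* Read up to `size` bytes, crossing as many piece boundaries as needed.
 * Returns the number of bytes actually read. */
size_t chunk_reader_read(struct chunk_reader *r, void *buf, size_t size)
{
    size_t total = 0;
    while (total < size && r->fp != NULL) {
        total += fread((char *)buf + total, 1, size - total, r->fp);
        if (total == size)
            break;
        /* Short read: this piece is exhausted (or failed), so try the next. */
        fclose(r->fp);
        r->index++;
        r->fp = open_chunk(r);
    }
    return total;
}

/* Close whatever piece is still open. */
void chunk_reader_close(struct chunk_reader *r)
{
    if (r->fp != NULL) {
        fclose(r->fp);
        r->fp = NULL;
    }
}
```

A missing next piece is treated as end of data here, which keeps the sketch short; a real implementation would probably track the expected number of pieces, as the `max_index` field above suggests, and delete the pieces on close.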
I wrote it overnight at home and published it on GitHub. I also wanted to split files on write, so it is a read/write wrapper library. Now that I know about -D_FILE_OFFSET_BITS=64, only the write side is still useful...
https://github.com/developer-kikikaikai/read_write_wrapper
My home PC is 64-bit and I haven't been able to test it properly on 32-bit, so I will fix any problems I notice when I try it in another environment.
2018/04/23 postscript
・ If something unexpected happens, check the compile options first!
Otherwise, you end up relying on a crappy solution, with problems such as:
・ Since fopen cannot be used, it is very unpleasant to have to do popen → ls just to get the size.
・ I'm worried about guaranteeing operation in a 32-bit environment.
・ This is close to the limit of a 32-bit system, so how far should you really push it?
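On the popen → ls point: once the build has large-file support, the size can be read directly with stat. A minimal sketch, assuming glibc on Linux; `file_size` is a hypothetical helper and not part of the published library. According to the stat(2) man page, without the 64-bit offset setting stat itself fails with EOVERFLOW on a file over 2^31 - 1 bytes, so the define is what makes this work on 32-bit.

```c
#define _FILE_OFFSET_BITS 64 /* must precede all includes */

#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: size of `path` in bytes, or -1 on error. */
long long file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

The return type is `long long` rather than `unsigned long` because `unsigned long` is only 32 bits on 32-bit Linux and cannot hold a size of several GBytes.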