du is a command to display the directory capacity. If you use it without options, it will display all the directory capacities under the current directory. This time, I will follow the implementation to see how the function is realized.
As a behavior, it would be nice if there was a function to trace the directory and add up the file size in it. For that, the following functions are likely to be required.
Let's look at the implementation, focusing on how these features are realized.
This time we'll look at the du.c source code included in GNU coreutils.
The main function calls du_files ()
.
python
/* Recursively print the sizes of the directories (and, if selected, files)
named in FILES, the last entry of which is NULL.
BIT_FLAGS controls how fts works.
Return true if successful. */
static bool
du_files (char **files, int bit_flags)
{
bool ok = true;
if (*files)
{
FTS *fts = xfts_open (files, bit_flags, NULL); //(1)
while (1)
{
FTSENT *ent;
ent = fts_read (fts); //(2)
if (ent == NULL)
{
if (errno != 0)
{
error (0, errno, _("fts_read failed"));
ok = false;
}
break;
}
FTS_CROSS_CHECK (fts);
ok &= process_file (fts, ent); //(3)
}
if (fts_close (fts) != 0)
{
error (0, errno, _("fts_close failed"));
ok = false;
}
}
return ok;
}
The processing performed here is the following three.
(1). Get the FTS structure from the file name with xfts_open ()
.
(2). Get the FTSENT structure from the FTS structure with fts_read ()
.
(3). Call process_file ()
and return to (2).
xfts_open()
Xfts_open ()
in (1) is a wrapper for fts_open ()
and internally fts_open ()
is called. The argument passed to xfts_open ()
is the filename. This file name is specified by the command line argument, and if there is no argument, the current directory is passed.
The FTS structure obtained by xfts_open ()
is a handle of the file hierarchy. The field has an FTSENT structure that represents information about the file, with the current position and child FTSENT structures. Each time you call fts_read ()
in (2), they change and follow the file hierarchy (see below).
fts_read()
In (2) fts_read ()
, the FTSENT structure at the current position is obtained from the FTS structure. The FTSENT structure has the following fields (fts).
python
typedef struct _ftsent {
unsigned short fts_info; /*Flags for FTSENT structures*/
char *fts_accpath; /*Access path*/
char *fts_path; /*Root path*/
short fts_pathlen; /* fts_path length*/
char *fts_name; /*file name*/
short fts_namelen; /* fts_name length*/
short fts_level; /*depth(-1 〜 N) */
int fts_errno; /*File error number*/
long fts_number; /*Local number*/
void *fts_pointer; /*Local address number*/
struct ftsent *fts_parent; /*Parent directory*/
struct ftsent *fts_link; /*The following file structure*/
struct ftsent *fts_cycle; /*Circulating structure*/
struct stat *fts_statp; /* stat(2)Information*/
} FTSENT;
The file size is on fts_stap
(stat).
python
struct stat {
dev_t st_dev; /*ID of the device where the file is located*/
ino_t st_ino; /*inode number*/
mode_t st_mode; /*Access protection*/
nlink_t st_nlink; /*Number of hard links*/
uid_t st_uid; /*Owner's user ID*/
gid_t st_gid; /*Owner's group ID*/
dev_t st_rdev; /*Device ID(For special files) */
off_t st_size; /*Overall size(Byte unit) */
blksize_t st_blksize; /*File system I/In O
Block size*/
blkcnt_t st_blocks; /*Number of 512B blocks allocated*/
};
Therefore, you can get the file size by following the stat structure from the FTSENT structure.
Now, the FTSENT structure was obtained by fts_read ()
in (2). Since fts_read ()
automatically follows the file hierarchy, it seems that the du command can be implemented by repeatedly callingfts_read ()
and totaling the file size. And the process of totaling the file size is done by process_file ()
in (3).
Now, before going into the explanation of (3), let's see the operation of tracing the file hierarchy by fts_read ()
. Check the operation with the following code (Read FreeBSD's ls Let's make ls with fts (3) ~).
python
#include <stdio.h>
#include <stdlib.h>
#include <fts.h>
int main (int argc, char *argv[])
{
FTS *ftsp;
FTSENT *p;
static char dot[] = ".";
static char *dotav[] = {dot, NULL};
if (argc == 1)
argv = dotav;
else
argv++;
ftsp = fts_open(argv, 0, NULL);
while((p = fts_read(ftsp)) != NULL) {
printf("%s\n", p->fts_path);
}
fts_close(ftsp);
exit(EXIT_SUCCESS);
}
This code is an implementation of the ls command using the fts function. All files and directories under the directory are displayed. The fts_read ()
in the while
statement follows the file hierarchy, and the p-> fts_path
displays the file name.
The execution result is as follows.
$ ./ls
.
./dir1
./dir1/file3
./dir1/file4
./dir1/dir2
./dir1/dir2/file5
./dir1/dir2
./dir1
./ls.c
./ls
./file2
./file1
.
The display order is as follows.
You can see that it traverses depth-first, files are traversed once and directories are traversed twice.
Next time will see the process of process_file () in (3).
Recommended Posts