↓ It seems that it is popular recently to drop Python with a Segmentation Fault. Segfault python in three lines Segfault python in 2 lines Segfault python in one line Segfault Python with 33 characters Segfault Python in one line without using ctypes Segfault Rust in 5 lines Just changing the subject (Python) to C makes it boring, but it's the same.
Confirmed operation with gcc 10.1.0 (warning appears)
*a;main(){*a=0;}
> gcc segf.c
segf.c:3:4:warning:Data definition does not have a type or storage class
3 | *a;main(){*a=0;}
| ^
segf.c:3:5:warning:The type will be in the declaration of ‘a’ to the default ‘int’[-Wimplicit-int]
3 | *a;main(){*a=0;}
| ^
segf.c:3:7:warning:Make the return type the default ‘int’[-Wimplicit-int]
3 | *a;main(){*a=0;}
| ^~~~
> ./a.out
zsh: segmentation fault (core dumped) ./a.out
** Caution ** Currently, the minimum number of characters that can be segfaulted in C is 5 characters. (See comments below)
This alone does not meet the article requirements of Qiita, so I will explain it briefly. The first code to see is ↓.
int main(void) {
int *a = 0;
*a = 0; //← Fall here
return 0;
}
The C language has a concept called a pointer, which allows you to access any address in memory.
Declare a pointer named ʻa at ʻint * a = 0
and assign the memory address 0
. Assigning a value to * a
in the second line will access an address [^ 1] that doesn't actually exist, and will throw a CPU exception.
Now let's shorten the above code. First, return 0
can be omitted according to the C language convention. [^ 5] Also, the return type ʻint can be omitted for compatibility reasons. (If omitted, it will be implicitly ʻint
.) Furthermore, it works without writing the argument void
.
Based on these, it is as follows.
main() {
int *a = 0;
*a = 0; //← Fall here
}
Here, the local variable ʻamay be defined as a global variable. Furthermore, if you define it as a global variable, it will be initialized with
0` (unless it is a very special environment), so you do not need to assign it for initialization [^ 6].
int *a;
main() {
*a = 0; //← Fall here
}
In addition, the global variable type specifier ʻint` can be omitted. This is the code at the beginning.
* (0p00000000)
really exist?In a classic computer, memory is a device that can store a lot of information. However, in order to send a lot of information to or receive from memory, intuitively, you need as many wires as the number of information you want to send and receive. However, it is not possible in terms of cost to prepare a lot of thin wires that can withstand the operating speed of the CPU, so we will introduce the concept of memory address. As a result, although the number of information that can be read and written at one time (called 1Byte) [^ 2] is small, a lot of information can be stored in the memory by changing the memory address.
In this way, the concept of memory address is indispensable in the era of large memory capacity, but since it is represented by a positive integer starting with 0
, the address 0
actually exists. .. Then, the reason why the segmentation fault occurred is due to the CPU function called paging used by the OS.
Memory addresses are unique in computers. Therefore, unless otherwise specified, multiple processes will own the same memory address space in a multitasking OS. In this case, when accessing memory, care must be taken not to accidentally access memory used by other processes. Paging eliminates this inconvenience, allowing each process to own its own virtual address space. At this time, the OS manages a table that converts between the virtual address space [^ 3] and the physical (memory) address space.
Therefore, the process should be able to access all existing addresses. However, this is not the case in reality, and the OS is taking advantage of the virtual address space as well. It limits the range of memory available to the program. This kind of virtual address space partitioning is called a memory map. As stated in this article, the area around address 0 is not used (the corresponding physical memory address is not assigned), so A segmentation fault will occur. [^ 4]
It took me about 20 minutes to write it lightly. I should have written it.
[^ 1]: See below.
[^ 2]: Actually, 64bit can be written at the same time per memory module. In addition, the memory can be accessed at high speed by using technologies such as DMA.
[^ 3]: and linear address space
[^ 4]: It depends on how the OS is implemented. Also, depending on the OS, another CPU exception will occur.
[^ 5]: Many people mistakenly think that omitting return 0;
is a mistake.
[^ 6]: It's okay if you don't initialize it because it only causes a segfault.
Recommended Posts