Segfault with 16 characters in C language

↓ It seems that it is popular recently to drop Python with a Segmentation Fault. Segfault python in three lines Segfault python in 2 lines Segfault python in one line Segfault Python with 33 characters Segfault Python in one line without using ctypes Segfault Rust in 5 lines Just changing the subject (Python) to C makes it boring, but it's the same.

code

Confirmed operation with gcc 10.1.0 (warning appears)

*a;main(){*a=0;}
> gcc segf.c
segf.c:3:4:warning:Data definition does not have a type or storage class
    3 |    *a;main(){*a=0;}
      |    ^
segf.c:3:5:warning:The type will be in the declaration of ‘a’ to the default ‘int’[-Wimplicit-int]
    3 |    *a;main(){*a=0;}
      |     ^
segf.c:3:7:warning:Make the return type the default ‘int’[-Wimplicit-int]
    3 |    *a;main(){*a=0;}
      |       ^~~~
> ./a.out
zsh: segmentation fault (core dumped)  ./a.out

** Caution ** Currently, the minimum number of characters that can be segfaulted in C is 5 characters. (See comments below)

Brief commentary

This alone does not meet the article requirements of Qiita, so I will explain it briefly. The first code to see is ↓.

int main(void) {
    int *a = 0;
    *a = 0;  //← Fall here
    return 0;
}

The C language has a concept called a pointer, which allows you to access any address in memory. Declare a pointer named ʻa at ʻint * a = 0 and assign the memory address 0. Assigning a value to * a in the second line will access an address [^ 1] that doesn't actually exist, and will throw a CPU exception.

Short coding

Now let's shorten the above code. First, return 0 can be omitted according to the C language convention. [^ 5] Also, the return type ʻint can be omitted for compatibility reasons. (If omitted, it will be implicitly ʻint.) Furthermore, it works without writing the argument void. Based on these, it is as follows.

 main() {
    int *a = 0;
    *a = 0;  //← Fall here
}

Here, the local variable ʻamay be defined as a global variable. Furthermore, if you define it as a global variable, it will be initialized with0` (unless it is a very special environment), so you do not need to assign it for initialization [^ 6].

int *a;
main() {
    *a = 0;  //← Fall here
}

In addition, the global variable type specifier ʻint` can be omitted. This is the code at the beginning.

bonus. Does the address * (0p00000000) really exist?

In a classic computer, memory is a device that can store a lot of information. However, in order to send a lot of information to or receive from memory, intuitively, you need as many wires as the number of information you want to send and receive. However, it is not possible in terms of cost to prepare a lot of thin wires that can withstand the operating speed of the CPU, so we will introduce the concept of memory address. As a result, although the number of information that can be read and written at one time (called 1Byte) [^ 2] is small, a lot of information can be stored in the memory by changing the memory address.

In this way, the concept of memory address is indispensable in the era of large memory capacity, but since it is represented by a positive integer starting with 0, the address 0 actually exists. .. Then, the reason why the segmentation fault occurred is due to the CPU function called paging used by the OS.

Memory addresses are unique in computers. Therefore, unless otherwise specified, multiple processes will own the same memory address space in a multitasking OS. In this case, when accessing memory, care must be taken not to accidentally access memory used by other processes. Paging eliminates this inconvenience, allowing each process to own its own virtual address space. At this time, the OS manages a table that converts between the virtual address space [^ 3] and the physical (memory) address space.

Therefore, the process should be able to access all existing addresses. However, this is not the case in reality, and the OS is taking advantage of the virtual address space as well. It limits the range of memory available to the program. This kind of virtual address space partitioning is called a memory map. As stated in this article, the area around address 0 is not used (the corresponding physical memory address is not assigned), so A segmentation fault will occur. [^ 4]

in conclusion

It took me about 20 minutes to write it lightly. I should have written it.

[^ 1]: See below. [^ 2]: Actually, 64bit can be written at the same time per memory module. In addition, the memory can be accessed at high speed by using technologies such as DMA. [^ 3]: and linear address space [^ 4]: It depends on how the OS is implemented. Also, depending on the OS, another CPU exception will occur. [^ 5]: Many people mistakenly think that omitting return 0; is a mistake. [^ 6]: It's okay if you don't initialize it because it only causes a segfault.

Recommended Posts

Segfault with 16 characters in C language
Segfault with 0 characters with gcc
Segfault Python with 33 characters
Heapsort made in C language
Multi-instance module test in C language
Realize interface class in C language
Writing C language with Sympy (metaprogramming)
Linked list (list_head / queue) in C language
Generate C language from S-expressions in Python
Communicate with I2C devices in Linux C
Argument implementation (with code) in your own language
How to multi-process exclusive control in C language
Set up a UDP server in C language
Create an image with characters in python (Japanese)
Install python's C language dependent module in wheel format with multi stage build
Behavior when SIGEV_THREAD is set in sigev_notify of sigevent with timer_create (C language)
Handle signals in C
Try to make a Python module in C language
Container-like # 1 made with C
Debugging C / C ++ with gdb
Access MongoDB in C
Try embedding Python in a C ++ program with pybind11
Next Python in C
Container-like # 2 made with C
[C language algorithm] Endianness
C API in Python 3
Go language to see and remember Part 7 C language in GO language
Access the C structure field with the name reserved in Go.
Behavior in each language when coroutines are reused with for
I tried to illustrate the time and time in C language
Try HeloWorld in your own language (with How to & code)
[C language algorithm] Block movement
Extend python in C ++ (Boost.NumPy)
Using X11 with ubuntu18.04 (C)
100 Language Processing with Python Knock 2015
Use "$ in" operator with mongo-go-driver
ABC163 C problem with python3
Working with LibreOffice in Python
Scraping with chromedriver in python
Try Google Mock with C
Use regular expressions in C
Working with sounds in Python
Scraping with Selenium in Python
Imitated Python's Numpy in C #
Binary search in Python / C ++
Scraping with Tor in Python
Combined with permutations in Python
Hello World in GO language
[C language] readdir () vs readdir_r ()
Make python segfault in 2 lines
Format C source with pycparser
C language ALDS1_4_A Linear Search
Minimum spanning tree in C #
ABC188 C problem with python3
ABC187 C problem with python
Introduction to Socket API Learned in C Language Part 1 Server Edition
C> string operation> Implementation to put in char [] with specified digit
Dockerfile with the necessary libraries for natural language processing in python
Let's utilize the design pattern like C language with OSS design_pattern_for_c!
Write a C language linked list in an object-oriented style (with code comparison between Python and Java)