Finding the most F-word-laden commit in Linux (Git era onward)

tl;dr:

Linux in recent years is not very F-heavy. Apart from the initial Git import, the maximum "F value" was 25, and the high-scoring commits were concentrated on a single file.

Motivation

Swearing was once something of a Linux tradition, but my impression is that it has calmed down recently. Even so, you still come across it in places such as the fuck command and the mailing lists.

So which commit left the biggest mark? For this article, I define the number of F-word appearances in a commit's log message and source code diff as its **F value**, and the commit with the highest F value as the **most F-valued commit**. Then I went looking for it.

(Note that a commit that deletes an F-word also gets a high F value, because the count covers every appearance in the diff, removed lines included.)

Running git show on all commits

I have recently been using CMake instead of shell scripts, so I wrote this one in CMake as well.

Basically, git rev-list HEAD outputs the list of commits, which I split into build targets by the first three characters of the commit hash, 000 through fff (16^3 = 4096 buckets), and then let ninja run them in parallel.
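The bucketing step can also be sketched in plain shell. This is only a minimal illustration, not the CMake script used for this article: the function name `shard_commits` and the one-file-per-prefix layout are assumptions made for the example.

```sh
# Read commit hashes from stdin and append each one to
# OUTDIR/<first 3 hex chars>.txt, giving at most 16^3 = 4096 shard files.
# (shard_commits and the shard file layout are hypothetical.)
shard_commits() {
  outdir=$1
  mkdir -p "$outdir"
  while read -r hash; do
    [ -n "$hash" ] || continue          # skip empty lines
    rest=${hash#???}                    # hash without its first 3 characters
    prefix=${hash%"$rest"}              # the first 3 characters only
    printf '%s\n' "$hash" >> "$outdir/$prefix.txt"
  done
}

# Example (run inside a Git repository):
#   git rev-list HEAD | shard_commits shards
```

Each shard file can then be handed to an independent job, which is the role the ninja build targets play in the CMake version.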

In the Linux repository (https://github.com/torvalds/linux),

$ git rev-list HEAD | wc -l
897506

So there are about 900,000 commits reachable from HEAD. **Note that this repository does not contain pre-Git commits.**

Finding the F value of a commit

The F value of a commit can be computed simply with grep.

$ git show -p -w HEAD | grep -ci fuck
0

The F value of HEAD is, naturally, zero. It is 1 for the commit https://github.com/torvalds/linux/commit/4b550488f894c899aa54dc935c8fee47bca2b7df, which edits lines adjacent to an F-word in the source code, so the word shows up in the diff.

$ git show -p -w 4b550488f894c899aa54dc935c8fee47bca2b7df | grep -ci fuck
1
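One detail worth keeping in mind: `grep -c` counts matching lines, not individual matches, so a line containing two F-words still contributes 1. The sketch below shows both counting styles side by side. The helper names `f_value_lines` and `f_value_occurrences` are made up for this example, and the word to search for is a parameter.

```sh
# Count the lines on stdin that contain WORD (case-insensitive),
# which is what `grep -ci` reports.
f_value_lines() {
  word=${1:-fuck}
  grep -ci -- "$word" || true           # grep exits 1 when the count is 0
}

# Count every occurrence of WORD on stdin, even several on one line.
f_value_occurrences() {
  word=${1:-fuck}
  { grep -oi -- "$word" || true; } | wc -l | tr -d ' '
}

# Example (run inside a Git repository):
#   git show -p -w HEAD | f_value_occurrences
```

For the commits discussed here the two counts are likely to be close, but the per-occurrence variant matches the definition of the F value more literally.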

Running plain grep over every commit this way takes a long time, so it is better to search the saved diffs with a fast grep tool such as ripgrep (https://github.com/BurntSushi/ripgrep).

$ rg -ci fuck
fa/76/fa7662aad7dc033a61ff2f705c33cbb301d0552d.diff:1
f4/db/f4dbc4c20f05ccf6986b0de429f7552b21a1b362.diff:1
51/53/51533b615e605d86154ec1b4e585c8ca1b0b15b7.diff:1
29/57/2957c9e61ee9c37e7ebf2c8acab03e073fe942fd.diff:2
36/5b/365bff806e9faba000fb4956c7486fbf3a746d96.diff:1
5b/5f/5b5f9560354dc5a3a27ce57a86aec6b98531ee21.diff:1
96/39/963945bf93e46b9bf71a07bf9c78183e0f57733a.diff:1
b2/e1/b2e1b30290539b344cbaff0d9da38012e03aa347.diff:2
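Since `rg -c` prints one `path:count` line per file that has at least one match, ranking the commits is a matter of sorting on the second field. A minimal sketch follows, where `top_f_values` is a hypothetical helper name and the paths are assumed to contain no colon:

```sh
# Read "path:count" lines (the format printed by `rg -c`) from stdin and
# print the N entries with the highest count, highest first.
# Ties are ordered by path so the output is deterministic.
top_f_values() {
  n=${1:-3}
  LC_ALL=C sort -t: -k2,2nr -k1,1 | head -n "$n"
}

# Example (run in the directory holding the saved .diff files):
#   rg -ci fuck | top_f_values 10
```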

Commits with high F values

The commit with the highest F value in the Linux Git repository is the **first commit**, that is, the Linux source tree itself as it was imported into Git, with an **imposing value of 56**.

$ git show -p -w 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 | grep -ci fuck
56

As expected, this commit deserves a place in the Hall of Fame.

Setting it aside, the top three commits **all modify the same file**, `arch/mips/pci/ops-bridge.c`. It's safe to say that this file is the **F-King** in the history of Linux.

This file was split off from pci-ip27.c by a commit with an F value of 24. Looking at the original file, you can see that it was already quite F-heavy when Linux first came under Git management.

- https://github.com/torvalds/linux/blob/1da177e4c3f41524e886b7f1b8a0c1fc7321cac2/arch/mips/pci/pci-ip27.c#L90 : already contains many F-words at the time of the first Git commit

Conversely, no file with such a prominent F value has been created since the Git migration, so it can be said that Linux is not F-heavy, at least as far as the source code goes.

By the way, the file with the highest F value in the current Linux kernel source is `drivers/net/ethernet/sun/sunhme.c`, and its value is **only 2**. The current F value of `ops-bridge.c` mentioned above is zero: its F-words were wiped out by a commit with an F value of 12.

cleaned up language in arch/mips/pci/ops-bridge.c

This commit itself was made in 2019, but a patch had already been proposed in 2018 (https://lkml.org/lkml/2018/12/1/105). Somewhat puzzlingly, even the commit with an F value of 25 (https://github.com/torvalds/linux/commit/686957e71d34beffe2b29cf5fb2e05025972020b) was an attempt to improve the grammar a little.

Thoughts

It turns out that the Linux kernel as software does not have a very high F value, at least since the Git migration. There is room to investigate the period before the move to Git, but the F-heavy impression seems to come from human activity rather than from the code.

A better indicator than the F value may be needed. For example, instead of defining an F value per commit, one could consider another indicator such as F-word lifetime.

In this F value evaluation, the only file that stood out was `ops-bridge.c`. In particular, counting diff lines is vulnerable to line movement and overestimates a commit's F value. However, diff currently has no format that represents movement across files well. Apart from diff, git blame can also take line movement into account, and lines can be traced with `-M` or `-C`.
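As a rough sketch of the git blame approach, the helper below lists the lines of a file that contain a given word, together with the commit that blame credits for each line. The name `blame_word_lines` is hypothetical, and this is an illustration of the idea rather than something used in the measurements above.

```sh
# List the lines of FILE at REV that contain WORD, each prefixed with the
# commit that git blame credits for the line.
# -M follows lines moved within a file; -C also follows lines moved or
# copied from other files changed in the same commit.
# -l prints full commit hashes; -s drops the author and timestamp columns.
# (blame_word_lines is a hypothetical helper name.)
blame_word_lines() {
  rev=$1
  file=$2
  word=${3:-fuck}
  git blame -M -C -l -s "$rev" -- "$file" | { grep -i -- "$word" || true; }
}

# Example (run inside a Git repository):
#   blame_word_lines HEAD drivers/net/ethernet/sun/sunhme.c
```

Because a moved line stays attributed to the commit that introduced it, counting per originating commit this way should be less sensitive to file splits than counting diff lines.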
