tl;dr:
Linux in recent years does not use the F-word all that much. Setting aside the initial import commit, the maximum "F value" of a commit was 25, and the high values were concentrated in a single file.
At one time it was something of a Linux tradition, but my impression is that things have calmed down recently. Still, there are references in the `fuck` command and on mailing lists.
So which events left that impression on me? For this investigation, I define the **F value** of a commit as the number of appearances of the F-word in its commit log and source code diff, and I decided to look for the commit with the highest F value (the **most F-valued commit**).
(Note that a commit that deletes F-words also gets a high F value, because the count is taken over the source code diff.)
`git show` all commits

Recently I have been using CMake instead of shell scripts, so I wrote this in CMake as well.
Basically, `git rev-list HEAD` outputs the list of commits, which is split by the first 3 characters of the commit hash (`000` to `fff`, i.e. 16 ** 3 = 4096 buckets) into build targets that `ninja` executes in parallel.
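The bucketing step can be sketched without CMake. This is a minimal shell sketch, not the CMake script used in this post: `bucket_of` and `count_per_bucket` are hypothetical helper names, and the only assumption is that each input line is a full commit hash as printed by `git rev-list HEAD`.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; the post itself uses CMake + ninja.

# bucket_of HASH: print the first 3 hex characters of a commit hash,
# i.e. the bucket (000 .. fff) that the commit falls into.
bucket_of() {
    printf '%.3s\n' "$1"
}

# count_per_bucket: read commit hashes on stdin (one per line) and print
# "<bucket> <number of commits>" for every non-empty bucket.
count_per_bucket() {
    cut -c1-3 | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (needs a git repository):
#   git rev-list HEAD | count_per_bucket | head
```

Because commit hashes are effectively uniformly distributed, splitting on a hash prefix tends to give buckets of similar size, which suits parallel execution.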
In the Linux repository (https://github.com/torvalds/linux),
$ git rev-list HEAD | wc -l
897506
So there are about 900,000 commits reachable from HEAD. **Note, however, that this repository does not contain pre-Git commits.**
The F value of a commit can be calculated in the ordinary way with `grep`.
$ git show -p -w HEAD | grep -ci fuck
0
The F value of `HEAD` is naturally zero, and it is 1 for the commit https://github.com/torvalds/linux/commit/4b550488f894c899aa54dc935c8fee47bca2b7df, which only edits lines around a source line containing the F-word.
$ git show -p -w 4b550488f894c899aa54dc935c8fee47bca2b7df | grep -ci fuck
1
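One detail worth keeping in mind: `grep -c` counts matching lines, not individual occurrences, so a line that contains the word twice still counts once. The sketch below shows both ways of counting on standard input. `f_value_lines` and `f_value_words` are hypothetical names; the commands above correspond to the line-based variant, and `-o` is assumed to be available (it is in GNU and BSD grep, but is not required by POSIX).

```sh
#!/bin/sh
# Hypothetical helpers for illustration.

# Count the lines that contain the F-word, case-insensitively.
# This is what `grep -ci fuck` in the commands above does.
f_value_lines() {
    grep -ci fuck
}

# Count every occurrence, even several on one line:
# -o prints each match on its own line, and wc -l counts those lines.
f_value_words() {
    grep -oi fuck | wc -l | tr -d ' '
}

# Example (needs a git repository):
#   git show -p -w HEAD | f_value_lines
```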
Running plain grep over every commit takes a long time, so it is better to use a fast grep tool such as ripgrep (https://github.com/BurntSushi/ripgrep).
$ rg -ci fuck
fa/76/fa7662aad7dc033a61ff2f705c33cbb301d0552d.diff:1
f4/db/f4dbc4c20f05ccf6986b0de429f7552b21a1b362.diff:1
51/53/51533b615e605d86154ec1b4e585c8ca1b0b15b7.diff:1
29/57/2957c9e61ee9c37e7ebf2c8acab03e073fe942fd.diff:2
36/5b/365bff806e9faba000fb4956c7486fbf3a746d96.diff:1
5b/5f/5b5f9560354dc5a3a27ce57a86aec6b98531ee21.diff:1
96/39/963945bf93e46b9bf71a07bf9c78183e0f57733a.diff:1
b2/e1/b2e1b30290539b344cbaff0d9da38012e03aa347.diff:2
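The `<path>:<count>` lines printed by `rg -c` can then be ranked to find the most F-valued commits. A minimal sketch, assuming one `<path>:<count>` pair per line as in the output above and no `:` inside the paths; `top_f_commits` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper for illustration.

# top_f_commits [N]: read "<path>:<count>" lines on stdin and print the N
# entries with the highest count (default 3), highest first.
top_f_commits() {
    sort -t: -k2,2nr | head -n "${1:-3}"
}

# Example (run in the directory that holds the dumped .diff files):
#   rg -ci fuck | top_f_commits 3
```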
The commit with the highest F value in the Linux git repository is the **first commit**, that is, the Linux source tree itself as it was when it was migrated to git, with an **imposing value of 56**.
$ git show -p -w 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 | grep -ci fuck
56
As expected, this commit deserves to be inducted into the Hall of Fame.
Excluding that commit, the top 3 commits are **all changes to the same file**, `arch/mips/pci/ops-bridge.c`. It is safe to say that this file is the **F-King** of Linux history.
This file was created by splitting it off from `pci-ip27.c` in a commit with an F value of 24. Looking at the original file, you can see that it was already pretty F when Linux started being managed with git.
- https://github.com/torvalds/linux/blob/1da177e4c3f41524e886b7f1b8a0c1fc7321cac2/arch/mips/pci/pci-ip27.c#L90 : already contains many F-words as of the first Git commit
Conversely, no file with such a prominent F value has been created since the git migration, so at least in terms of source code, Linux cannot be called F.
By the way, the file with the highest F value in the current Linux kernel source is `drivers/net/ethernet/sun/sunhme.c`, and its value is **only 2**. The current F value of the `ops-bridge.c` mentioned above is zero; its F-words were wiped out by a commit with an F value of 12:
cleaned up language in arch/mips/pci/ops-bridge.c
This commit itself was made in 2019, but a patch had already been proposed in 2018 (https://lkml.org/lkml/2018/12/1/105). Even the commit with an F value of 25 (https://github.com/torvalds/linux/commit/686957e71d34beffe2b29cf5fb2e05025972020b) was an attempt to improve the hard-to-understand grammar a little.
It turns out that the Linux kernel as software does not have a very high F value, at least since the Git migration. The period before the transition to Git still leaves room for investigation, but the F impression seems to come from human activity rather than from the software itself.
A better indicator than the F value may be needed. For example, rather than defining an F value for each commit, another indicator such as F-word lifetime could be considered.
In this F value evaluation, only `ops-bridge.c` ended up drawing attention. In particular, extracting lines from a diff is vulnerable to line movement and overestimates the F value of a commit. However, there is currently no format that represents movement across files any better. Besides diff, `git blame` also takes line movement into account, and you can use `-M` or `-C` to detect it.