This article is the Day 14 entry of the "Road to the AI Dojo 'Kaggle' by Nikkei xTECH Business AI ① Advent Calendar 2019".
In addition to the Grandmaster / Master / Expert / Contributor / Novice tiers, which are determined by the color and number of medals you have won, Kaggle also has a ranking system. It seems to matter less than the tier, but this article explains how it works. (* Competitions only)
Kaggle notebook https://www.kaggle.com/d1348k/learn-aboout-competition-points
github https://github.com/uratatsu/kaggle_ranking
In addition to gold, silver, and bronze medals, Kaggle competitions award competition points, which participants earn according to their final rank. The points are calculated by the following formula.
\Biggl[\frac{100000}{\sqrt{N_{teammates}}}\Biggr]\Bigl[Rank^{-0.75}\Bigr]\bigl[\log_{10} (1+\log_{10} (N_{teams})) \bigr]\biggl[e^{-t/500}\biggr]
The base is 100,000 points. This is then multiplied by coefficients between 0 and 1 that depend on the number of teammates, your rank, the number of participating teams, and the number of days elapsed, which gives the final points earned.
\Bigl[Rank^{-0.75}\Bigr]
Not surprisingly, the biggest impact comes from your rank on the private leaderboard. The coefficient decays with rank as shown in the table below.
Ranking | coefficient |
---|---|
1st | 1.0 |
2nd | 0.5946 |
3rd | 0.4387 |
10th | 0.1778 |
50th | 0.05318 |
100th | 0.03162 |
The gap between 1st and 2nd place is very large: 2nd place earns only about 60% of the points of 1st place. 10th place earns only about 18%, and 100th place about 3%.
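As a quick check on the table above, the rank term can be evaluated directly. This is only a minimal sketch; the function name `rank_coefficient` is my own and is not part of any Kaggle API.

```python
def rank_coefficient(rank: int) -> float:
    """Rank term of the points formula: Rank^(-0.75)."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return float(rank) ** -0.75


# Print the coefficient for the ranks listed in the table.
for rank in (1, 2, 3, 10, 50, 100):
    print(f"{rank:>3}: {rank_coefficient(rank):.4f}")
```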
\bigl[\log_{10} (1+\log_{10} (N_{teams})) \bigr]
This term depends on the number of teams participating in the competition. The more teams participate, the larger the coefficient, but even with 10,000 teams (the highest so far is 8,802 teams) it is only about 0.7. With 1,000 teams it is about 0.6, so a tenfold increase in participants raises the points by only a factor of 1.16.
Kaggle's reasoning is described on the official blog. It seems to rest on the idea that the skill required to win a 100-team competition and a 1,000-team competition does not differ that much. The formula previously used log10(x), which gave a 1.5 times difference between 100 teams and 1,000 teams.
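To make the comparison concrete, the sketch below evaluates the current team-count term next to a plain log10(N_teams) term, which is the older form mentioned above. The function names are illustrative only.

```python
import math


def team_coefficient(n_teams: int) -> float:
    """Current term: log10(1 + log10(N_teams))."""
    return math.log10(1 + math.log10(n_teams))


def old_team_coefficient(n_teams: int) -> float:
    """Older term as described in the text: log10(N_teams)."""
    return math.log10(n_teams)


# Compare how much a tenfold increase in teams raises each term.
for small, large in ((100, 1000), (1000, 10000)):
    new_ratio = team_coefficient(large) / team_coefficient(small)
    old_ratio = old_team_coefficient(large) / old_team_coefficient(small)
    print(f"{small} -> {large} teams: current x{new_ratio:.2f}, old x{old_ratio:.2f}")
```

With the current term, going from 100 to 1,000 teams raises the coefficient by about 1.26 times rather than 1.5 times.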
\frac{1}{\sqrt{N_{teammates}}}
The points are multiplied by a factor that depends on the number of teammates, as given by the formula above.
Each member of a two-person team gets about 70%, and each member of a four-person team gets half. My impression is that the penalty for team size is smaller than I expected.
Number of people | coefficient |
---|---|
1 | 1.0 |
2 | 0.7071 |
3 | 0.5774 |
4 | 0.5 |
5 | 0.4472 |
8 | 0.3536 |
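The table above can be reproduced with a short sketch. It also prints the sum over all members, which works out to sqrt(N) times the solo points and is one way to see why the per-member penalty is milder than an even 1/N split. The function name `teammate_coefficient` is an illustrative choice.

```python
import math


def teammate_coefficient(n_teammates: int) -> float:
    """Teammate term: 1 / sqrt(N_teammates); a solo team has N_teammates = 1."""
    if n_teammates < 1:
        raise ValueError("n_teammates must be 1 or greater")
    return 1 / math.sqrt(n_teammates)


for n in (1, 2, 3, 4, 5, 8):
    per_member = teammate_coefficient(n)
    team_total = n * per_member  # algebraically equal to sqrt(n)
    print(f"{n} members: {per_member:.4f} each, {team_total:.4f} summed over the team")
```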
\biggl[e^{-t/500}\biggr]
The last term is the decay with the number of days elapsed, t.
The points halve after about 346 days (500 × ln 2), which is a little under a year.
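The half-life follows from solving e^(-t/500) = 1/2, which gives t = 500 ln 2. A minimal sketch of that calculation, with `time_decay` and `DECAY_DAYS` as names chosen here for illustration:

```python
import math

DECAY_DAYS = 500  # time constant in the e^(-t/500) term


def time_decay(days: float) -> float:
    """Decay term of the points formula: e^(-t/500)."""
    return math.exp(-days / DECAY_DAYS)


# Solve e^(-t/500) = 1/2 for t.
half_life = DECAY_DAYS * math.log(2)
print(f"half-life: {half_life:.1f} days")
print(f"remaining after 365 days: {time_decay(365):.3f}")
```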
Of these, the only things you can control are the number of teammates and your rank. Forming a team reduces the points each member earns, but in general a team merge tends to raise the rank. Where the gain from the higher rank outweighs the teammate penalty, forming a team to raise your final rank may earn you more points. The heat map shows the relationship between rank and the number of teammates.
For example, if a solo participant in 2nd place merges with one other person and the team finishes 1st, the points increase, because the combined coefficient goes from 59.5% to 70.7%.
That said, merging teams with only this in mind feels hollow, so I don't think you usually need to worry about it. It may matter for people in or aiming for the top 30 of the Kaggle ranking.
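The trade-off can be turned into a break-even rank. Requiring r_new^(-0.75) / sqrt(n_after) > r^(-0.75) / sqrt(n_before) and rearranging gives r_new < r × (n_before / n_after)^(2/3). The sketch below is my own derivation from the formula above, not something Kaggle publishes, and the function names are hypothetical.

```python
def relative_points(rank: int, n_teammates: int) -> float:
    """Controllable part of the formula: Rank^(-0.75) / sqrt(N_teammates)."""
    return rank ** -0.75 / n_teammates ** 0.5


def break_even_rank(rank: float, n_before: int, n_after: int) -> float:
    """Rank the merged team must beat for each member to gain points.

    From r_new^(-0.75) / sqrt(n_after) > rank^(-0.75) / sqrt(n_before),
    which rearranges to r_new < rank * (n_before / n_after)^(2/3).
    """
    return rank * (n_before / n_after) ** (2 / 3)


# Solo 2nd place versus 1st place in a two-person team.
print(f"solo 2nd: {relative_points(2, 1):.4f}, duo 1st: {relative_points(1, 2):.4f}")
# A solo 10th place merging into a two-person team.
print(f"break-even rank for solo 10th -> duo: {break_even_rank(10, 1, 2):.2f}")
```

Under this reading, a solo participant in 10th place gains from a two-person merge only if the merged team finishes 6th or better.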
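import numpy as np
def calculate_points(teammates, rank, teams, days):
    # base points x teammate term x rank term x team-count term x time decay (note the negative exponent)
    points = 100000 / np.sqrt(teammates) * np.power(rank, -0.75) * np.log10(1 + np.log10(teams)) * np.exp(-days / 500)
    return points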
Calculating a few cases with this formula, with zero days elapsed, gives the following points.
Ranking | Number of participating teams | Number of teammates | Medal | Earned points |
---|---|---|---|---|
1 | 1000 | 1 | Gold | 60206 |
1 | 1000 | 5 | Gold | 26925 |
5 | 1000 | 1 | Gold | 18006 |
25 | 1000 | 1 | Silver | 5385 |
75 | 1000 | 1 | Bronze | 2362 |
100 | 1000 | 1 | Bronze | 1904 |
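The point values in the table can be reproduced with a few lines of code. The sketch below assumes zero elapsed days (no time decay), which is my reading of the table because it reproduces the listed values. `competition_points` is an illustrative name, and it uses the standard `math` module instead of NumPy so that it runs without extra dependencies.

```python
import math


def competition_points(rank, n_teams, n_teammates, days=0.0):
    """Competition points from the formula in this article."""
    return (
        100000 / math.sqrt(n_teammates)
        * rank ** -0.75
        * math.log10(1 + math.log10(n_teams))
        * math.exp(-days / 500)
    )


# (rank, number of teams, number of teammates) for each row of the table.
cases = [(1, 1000, 1), (1, 1000, 5), (5, 1000, 1),
         (25, 1000, 1), (75, 1000, 1), (100, 1000, 1)]
for rank, n_teams, n_teammates in cases:
    points = competition_points(rank, n_teams, n_teammates)
    print(f"rank {rank:>3}, {n_teams} teams, {n_teammates} member(s): {points:.0f}")
```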
The impact of a solo win is tremendous: a single one is worth about as many points as the 32nd place in the Kaggle competitions ranking (* as of December 14, 2019). Incidentally, bestfitting, who is currently ranked 1st in the Kaggle competitions ranking shown in the first image, has 20 solo gold medals (!) and 3 solo wins (!!), which is unrivaled.
I knew that the points changed with the number of teams and the rank, but visualizing the decay rates turned out to be surprisingly interesting. There are probably many ways to get a feel for how points are awarded, but I hope this helps you understand the calculation correctly.