Visualizing the current state of Qiita with the Qiita API + Elasticsearch + Kibana

Since I started posting on Qiita recently, I have noticed that an article gets stocked mainly during the first 3 days to 1 week after it is posted. So I wondered: if I could grasp which tags are popular over a span of about 3 days to 1 week, could I write articles that reach more readers?

Overview

I explain how to use the Qiita API
I do not explain Elasticsearch or Kibana themselves
I visualize the information that can be obtained from the Qiita API
You can see the trending tags and the busiest posting times among the 1000 most recent posts
Finally, I summarize what can be seen from the visualized information

Environment

Item	Contents
Server OS	CentOS6.6
Elasticsearch	5.3.1
Kibana	4.3.1
Qiita API	v2
Terminal that calls the API	Mac
Mac OS	OS X Yosemite
Python	2.7.10

I called the Qiita API from my Mac to get the post data and sent it to Elasticsearch running on a server in my lab. After sending the data to Elasticsearch, I accessed http://xxx.xxx.xxx.xxx:5601 (xxx.xxx.xxx.xxx is the address of the lab server) and created graphs and other visualizations with Kibana.

How to use Qiita API

Details can be found in the Qiita API v2 documentation. I will list the items that I referred to from the document.

Endpoint

https://qiita.com/api/v2

Authentication

As stated under Usage Restrictions in the documentation, you can make about 60 requests per hour without authentication. With authentication the limit is apparently 1000 requests per hour, but since I was not sure about it, I went without authentication this time.
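For reference, here is a minimal sketch of the pieces an authenticated request would need. Qiita API v2 accepts an access token in an `Authorization: Bearer` header. The helper names `build_headers` and `remaining_quota` are hypothetical, and the `Rate-Remaining` response header name is my reading of the documentation, so please check it before relying on it.

```python
def build_headers(token=None):
    """Return request headers; add a Bearer token only when one is given."""
    headers = {'Accept': 'application/json'}
    if token:
        headers['Authorization'] = 'Bearer %s' % token
    return headers


def remaining_quota(response_headers):
    """Read the remaining request count, or None if the header is absent.

    'Rate-Remaining' is an assumed header name (see the note above).
    """
    value = response_headers.get('Rate-Remaining')
    return int(value) if value is not None else None
```

With `requests`, these would be used roughly as `r = requests.get(endpoint, params=payload, headers=build_headers(token))` followed by `remaining_quota(r.headers)`. Calling `build_headers()` without a token gives the unauthenticated request used in this article.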

How to get basic post data

$ curl -XGET 'https://qiita.com/api/v2/items?page=3&per_page=20'

Running this should return results. The command means "get the 20 posts on the 3rd page". In other words, note that you get 20 posts, not 3 * 20.

You can choose page and per_page yourself, but there are restrictions. They are summarized in the table below.

Parameter	Minimum value	Maximum value
page	1	100
per_page	20	100

If you get 100 posts from each of the 100 pages, you can get up to 10000 posts.
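The arithmetic above can be sketched as a small helper that works out which requests are needed for a given number of posts. The function name `page_params` and the constants are hypothetical; only the upper limits of 100 come from the table above.

```python
MAX_PAGE = 100      # upper limit for `page`
MAX_PER_PAGE = 100  # upper limit for `per_page`


def page_params(total_posts, per_page=MAX_PER_PAGE):
    """Return the query parameters needed to fetch `total_posts` posts."""
    if not 0 < per_page <= MAX_PER_PAGE:
        raise ValueError('per_page must be positive and at most %d' % MAX_PER_PAGE)
    pages = -(-total_posts // per_page)  # ceiling division
    if pages > MAX_PAGE:
        raise ValueError('only %d posts are reachable' % (MAX_PAGE * per_page))
    return [{'page': p, 'per_page': per_page} for p in range(1, pages + 1)]
```

For example, `page_params(1000)` yields 10 parameter sets (pages 1 to 10 with 100 posts each), which fits comfortably within the limit of about 60 unauthenticated requests per hour.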

Python code that sends the post data (JSON format) to Elasticsearch as is

forward_json.py

    # coding: utf-8
    import json
    import requests
    from elasticsearch import Elasticsearch

    # Address of the server on which Elasticsearch is installed
    server_address = "xxx.xxx.xxx.xxx"
    # With a standard installation, the port is 9200
    port = str(9200)

    # Create an instance of Elasticsearch
    es = Elasticsearch("%s:%s" % (server_address, port))

    # Endpoint
    endpoint = 'https://qiita.com/api/v2/items'

    for p in range(1, 11):  # Perform the following processing for pages 1 to 10
        payload = {'page': p, 'per_page': '100'}  # Get 100 posts per page
        r = requests.get(endpoint, params=payload).json()  # Receive the result in JSON format
        '''
        # For your reference
        print type(r)  # => <type 'list'>
        print r[0].keys()  # => [u'body', u'group', u'rendered_body', u'url', u'created_at', u'tags', u'updated_at', u'private', u'coediting', u'user', u'title', u'id']
        '''
        for it in r:  # Loop through the list of results
            # Insert all the data!!
            # This time I named the index qiita
            es.index(index='qiita', doc_type='qiita', id=it['id'], body=it)

In server_address, write the address of the server where Elasticsearch and Kibana are installed. After you run this code, Elasticsearch should hold a total of 1000 posts.
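The script sends one request to Elasticsearch per post. As an alternative that I did not use in this article, the same documents could probably be sent in one batch with the `helpers.bulk` function of the Python client. This is only a sketch: the function names `to_bulk_actions` and `bulk_forward` are hypothetical, and the `_type` field assumes an older Elasticsearch version that still uses document types.

```python
def to_bulk_actions(items, index='qiita', doc_type='qiita'):
    """Yield one bulk-index action per post, keyed by the post id."""
    for it in items:
        yield {
            '_index': index,
            '_type': doc_type,  # assumption: a version that still has document types
            '_id': it['id'],
            '_source': it,
        }


def bulk_forward(es, items):
    """Send all posts in one bulk call; returns (success count, errors)."""
    # Imported here so that to_bulk_actions can be tried without the client installed.
    from elasticsearch import helpers
    return helpers.bulk(es, to_bulk_actions(items))
```

Inside the page loop, `bulk_forward(es, r)` would take the place of the inner `for it in r:` loop. Reusing the post id as `_id` keeps the behavior of the original script, where rerunning it overwrites posts instead of duplicating them.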

Visualization with Kibana

Go to http://xxx.xxx.xxx.xxx:5601. After fetching 1000 posts from the Qiita API and storing them in Elasticsearch, I built the dashboard shown in this screenshot, with numbers and graph descriptions added in red. User names are hidden for now.

There is a time period in which the number of posts is extremely high, so I clicked on it.

Clicking the green circle in the image above changes the page.

It seems that the number of posts during that time period rose because of a flurry of posts by an enthusiastic user.
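The same kind of spike can also be checked outside Kibana. The following is a minimal sketch that counts posts per hour directly from the API response, assuming `created_at` is an ISO 8601 string such as `2016-01-01T09:15:00+09:00` (the sample value and the function name `posts_per_hour` are mine, not from the API documentation).

```python
from collections import Counter


def posts_per_hour(items):
    """Count posts per hour using the first 13 characters of created_at."""
    # '2016-01-01T09:15:00+09:00'[:13] gives '2016-01-01T09'
    return Counter(it['created_at'][:13] for it in items)
```

Calling `posts_per_hour(r).most_common(3)` on a list of posts would give the three busiest hours. Slicing the string keeps the hour in the time zone the API returned, which avoids time zone parsing altogether.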

Summary of results

You can see who posted how much among the 1000 posts

You may get various information by following people who are currently posting with enthusiasm.

The number of distinct tags came out as 1076, but Japanese tags were not handled well: a single Japanese tag was split up and counted as two separate tags. Fixing this is a future task.

The table below summarizes the 10 most popular tags among the 1000 posts.
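The splitting is presumably caused by Elasticsearch analyzing the tag field as full text, although I have not confirmed this. One way to sidestep it is to count tags in Python before (or instead of) relying on Kibana. The sketch below assumes each post has a `tags` list whose entries are dicts with a `name` key; the function name `count_tags` is hypothetical.

```python
from collections import Counter


def count_tags(items):
    """Count how many posts carry each tag, treating each tag name as one unit."""
    counts = Counter()
    for it in items:
        # Lower-casing merges 'Python' and 'python';
        # the set avoids counting a tag twice within one post.
        names = {tag['name'].lower() for tag in it['tags']}
        counts.update(names)
    return counts
```

`count_tags(r).most_common(10)` would then give a top-10 ranking in which a Japanese tag is never split, because the tag name is used as a whole string.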

Ranking	Tag name	Number of posts
1	python	62
2	ruby	50
3	aoj	49
4	javascript	49
5	c	45
6	ios	41
7	swift	38
8	php	38
9	java	33
10	linux	29

Those were the results! As expected, programming language tags dominate. What I really wanted to know was the ranking of tags by number of stocks, but I gave up because it was difficult to obtain the number of stocks from the post data without authentication. I would like to try again when I have time. This time I found that many posts carry the python tag, so I would like to keep writing articles that I can tag with python.
