Visualizing the current state of Qiita with Qiita API + Elasticsearch + Kibana

Since I started posting to Qiita recently, I have noticed that an article tends to get stocked for about three days to a week after it is posted. So I wondered: if I could grasp which tags are popular over a span of about three days to a week, could I write articles that more people would read?

Overview

Environment

Item                         Details
Server OS                    CentOS 6.6
Elasticsearch                5.3.1
Kibana                       4.3.1
Qiita API                    v2
Machine that calls the API   Mac
Mac OS                       OS X Yosemite
Python                       2.7.10

I called the Qiita API from my Mac to fetch post data and sent it to Elasticsearch running on a server in my lab. After sending the data to Elasticsearch, I accessed http://xxx.xxx.xxx.xxx:5601 (xxx.xxx.xxx.xxx is the address of the lab server) and created graphs and other visualizations with Kibana.

How to use Qiita API

See the Qiita API v2 documentation for details. Below are the items from the documentation that I referred to.

Endpoint

https://qiita.com/api/v2

Authentication

As stated under "Usage Restrictions" in the documentation, you can make about 60 requests per hour without authentication. With authentication the limit is apparently 1000 per hour, but since I was not sure how it works, I went without authentication this time.
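For reference, Qiita API v2 accepts an access token as a Bearer token in the `Authorization` header. The sketch below only builds the request headers and is not what this article's script does (it runs unauthenticated). The helper name `build_headers` and the token value are placeholders made up for illustration.

```python
def build_headers(token=None):
    """Return request headers, adding a Bearer token only when one is given."""
    headers = {'Accept': 'application/json'}
    if token:
        # Assumption: `token` is an access token issued for your own Qiita account.
        headers['Authorization'] = 'Bearer %s' % token
    return headers


anonymous = build_headers()
authenticated = build_headers('YOUR_ACCESS_TOKEN')  # placeholder, not a real token
```

The result would be passed to the request, for example `requests.get(endpoint, params=payload, headers=build_headers(token))`.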

How to get basic post data

$ curl -XGET 'https://qiita.com/api/v2/items?page=3&per_page=20'

Running this should return results. The command fetches 20 posts from page 3. In other words, note that you get 20 posts, not 3 * 20.

You can choose page and per_page yourself, but there are limits, summarized in the table below.

Parameter   Minimum value   Maximum value
page        1               100
per_page    1               100

If you fetch 100 posts from each of the 100 pages, you can get up to 10,000 posts.
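The paging arithmetic can be sketched as follows. `item_range` is a hypothetical helper written for this explanation, not part of the Qiita API; it only computes which positions in the overall listing a given page covers.

```python
MAX_PAGE = 100      # upper limit for `page`
MAX_PER_PAGE = 100  # upper limit for `per_page`


def item_range(page, per_page):
    """Return the 1-based (first, last) positions of the posts on a page."""
    if page < 1 or page > MAX_PAGE:
        raise ValueError('page must be positive and at most %d' % MAX_PAGE)
    if per_page < 1 or per_page > MAX_PER_PAGE:
        raise ValueError('per_page must be positive and at most %d' % MAX_PER_PAGE)
    first = (page - 1) * per_page + 1
    last = page * per_page
    return first, last


first, last = item_range(3, 20)
print('%d to %d' % (first, last))           # 41 to 60: 20 posts, not 60
print('%d to %d' % item_range(100, 100))    # 9901 to 10000: the last reachable posts
```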

Python code that sends post data (JSON) to Elasticsearch as is

forward_json.py


# coding: utf-8
import requests
from elasticsearch import Elasticsearch

# Address of the server on which Elasticsearch is installed
server_address = "xxx.xxx.xxx.xxx"
# With a standard installation, the port is 9200
port = 9200
# Create an Elasticsearch client instance
es = Elasticsearch("%s:%s" % (server_address, port))
# Endpoint
endpoint = 'https://qiita.com/api/v2/items'
for p in range(1, 11):  # Process pages 1 through 10 in the same way
    payload = {'page': p, 'per_page': 100}  # Get 100 posts per page
    r = requests.get(endpoint, params=payload).json()  # Parse the JSON response
    # For reference:
    #   type(r)
    #   # => <type 'list'>
    #   r[0].keys()
    #   # => [u'body', u'group', u'rendered_body', u'url', u'created_at', u'tags', u'updated_at', u'private', u'coediting', u'user', u'title', u'id']
    for it in r:  # Loop over the list of posts
        # Insert every post as is.
        # This time the index is named qiita.
        es.index(index='qiita', doc_type='qiita', id=it['id'], body=it)

Set server_address to the address of the server where Elasticsearch and Kibana are installed. Running this code should store a total of 1000 posts in Elasticsearch.
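The script above sends one request to Elasticsearch per post. As a possible alternative, the official Python client provides `helpers.bulk`, which sends many documents in a single request. The sketch below only builds the action dictionaries. `to_bulk_actions` is a hypothetical helper name, the sample post is invented, and the `_type` field is an assumption that fits an Elasticsearch version that still uses document types, as in this setup.

```python
def to_bulk_actions(items, index='qiita', doc_type='qiita'):
    """Yield one bulk action per post, keyed by the post's Qiita id."""
    for item in items:
        yield {
            '_index': index,
            '_type': doc_type,  # assumption: Elasticsearch version that still uses types
            '_id': item['id'],  # same id as in forward_json.py, so reruns overwrite
            '_source': item,
        }


# Hypothetical usage with the client `es` and page result `r` from forward_json.py:
#   from elasticsearch import helpers
#   helpers.bulk(es, to_bulk_actions(r))
sample = [{'id': 'abc123', 'title': 'example post', 'tags': [{'name': 'python'}]}]
actions = list(to_bulk_actions(sample))
print(actions[0]['_id'])  # abc123
```

Using the Qiita post id as `_id` keeps the behavior of the original script: running it again updates existing documents instead of creating duplicates.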

Visualization with Kibana

Go to http://xxx.xxx.xxx.xxx:5601. I fetched 1000 posts from the Qiita API and stored them in Elasticsearch. Below is a screenshot with numbers and graph descriptions added in red. User names are hidden for now.

kibana.png

There is a time slot where the number of posts is extremely high, so I clicked on it.

スクリーンショット 2016-07-19 22.21.25.png

Clicking the area circled in green in the image above switches to the following view.

spike.png

It seems that the number of posts in that time slot spiked because of a flood of posts from enthusiastic users.
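The same check can be done outside Kibana by bucketing posts by hour, which is roughly what a date histogram does. This is a minimal sketch, assuming each post's `created_at` is an ISO 8601 string such as `2016-07-19T21:05:00+09:00` and that `user` contains an `id` key. The function names and sample posts are made up for illustration.

```python
from collections import Counter


def posts_per_hour(items):
    """Count posts per hour bucket using the 'YYYY-MM-DDTHH' prefix of created_at."""
    return Counter(item['created_at'][:13] for item in items)


def authors_in_hour(items, hour):
    """Count posts per user id within one hour bucket."""
    return Counter(
        item['user']['id'] for item in items
        if item['created_at'].startswith(hour)
    )


sample = [
    {'created_at': '2016-07-19T21:05:00+09:00', 'user': {'id': 'alice'}},
    {'created_at': '2016-07-19T21:40:00+09:00', 'user': {'id': 'alice'}},
    {'created_at': '2016-07-19T21:55:00+09:00', 'user': {'id': 'bob'}},
    {'created_at': '2016-07-19T22:10:00+09:00', 'user': {'id': 'bob'}},
]
hour, count = posts_per_hour(sample).most_common(1)[0]
print('%s: %d posts' % (hour, count))               # 2016-07-19T21: 3 posts
print(authors_in_hour(sample, hour).most_common(1))  # [('alice', 2)]
```

Slicing the timestamp keeps the time zone offset as returned, so the buckets are in the local time of each post rather than normalized to UTC.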

Summary of results

  • You can see who posted how much within the 1000 posts.
  • You may pick up various information by following the people who are currently posting enthusiastically.
  • The number of tags came out as 1076, but Japanese tags were not captured correctly: a single Japanese tag was split and counted as two separate tags. Fixing this is a future task.
  • The table below lists the 10 most popular tags among the 1000 posts.
Rank   Tag name     Number of posts
1      python       62
2      ruby         50
3      aoj          49
4      javascript   49
5      c            45
6      ios          41
7      swift        38
8      php          38
9      java         33
10     linux        29

That was the result. As expected, programming-related tags dominate. I actually wanted a ranking of tags by stock count, but I gave up because it was difficult to get stock counts from the post data without authentication. I would like to try that when I have time. This time I found that many posts carry the python tag, so I would like to keep posting articles that I can tag with python.
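One possible way around the split Japanese tags is to count tags in Python, before the text reaches the Elasticsearch analyzer. This is a minimal sketch, assuming each post carries a `tags` list of objects with a `name` key, as in the Qiita API v2 response. Lowercasing is a choice made here so that `Python` and `python` are counted together, and the sample posts are invented.

```python
# coding: utf-8
from collections import Counter


def count_tags(items):
    """Count how many posts carry each tag, treating tag names case-insensitively."""
    counts = Counter()
    for item in items:
        # A set ensures a tag is counted at most once per post.
        names = set(tag['name'].lower() for tag in item.get('tags', []))
        counts.update(names)
    return counts


sample = [
    {'tags': [{'name': 'Python'}, {'name': 'Elasticsearch'}]},
    {'tags': [{'name': 'python'}, {'name': '機械学習'}]},
    {'tags': [{'name': 'Ruby'}]},
]
ranking = count_tags(sample).most_common(1)
print(ranking)  # [('python', 2)]
```

Because the tag name is used as a whole string, a Japanese tag such as `機械学習` stays a single entry.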
