Purpose of this article

Somehow, I wanted to analyze Qiita's article data, so I touched the API. You don't need to authenticate this time because you only need to get the article information.

I tried two big things.

1. [Get user information list](# 1 Get user information list)
[Getting a list of articles for a specific user](# 2 Getting a list of articles for a specific user)

I will explain in order.

Preparation

Load the library.

import numpy as np
import pandas as pd
import requests
import json
from pandas.io.json import json_normalize

The `` `json_normalize``` at the bottom is a convenient one that formats the json format data returned by the API into the pandas data frame format.

1. 1. Get user information list

For the sample Qiita API v2 documentation,

`GET /api/v2/users?page=1&per_page=It says 20 etc..`


 In other words, you can get information by accessing the following URL.
https://qiita.com/api/v2/users?page=1&per_page=20

 Here, `` `per_page``` is the number of users to get at one time, and `` `` page``` is the number. For example, if you want 1000 user information, `` `per_page`` You must send at least 10 requests with = 100 (upper limit) ```.


 So, the code looks like this:

```python
n = 333 #Number of users you want to get

per_page = 100
df = pd.DataFrame()

for page in range(1, int(n/per_page)+2): #Get a lot
    base_url = "https://qiita.com/api/v2/users?page={0}&per_page={1}"
    url = base_url.format(page, per_page)

    response = requests.get(url)
    res = response.json()

    tmp_df = json_normalize(res)
    df = pd.concat([df, tmp_df])

df.reset_index(drop=True, inplace=True)

df = df.iloc[:n,:] #Delete as much as you get

The result is a data frame like this:

2. Get a list of articles for a specific user

For the sample Qiita API v2 documentation,

`GET /api/v2/items?page=1&per_page=20&query=qiita+user%It says 3 Ayaotti etc..`


 In other words, you can get information by accessing the following URL.
https://qiita.com/api/v2/items?page=1&per_page=20&query=qiita+user%3Ayaotti

 Now, a new `` `query``` has appeared, which gives you the same search options as when searching in a browser, where` ``: `` `is ``` `in the URL. Note that it is encoded in the notation% 3A```.

 ![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/540956/1b6f986d-87c0-8552-0c53-fedf902b54cc.png)

 By using this, you can get the information of a specific user with the feeling of ``` query = user% 3A 〇〇〇```.
 So the code looks like this:

```python
n = 125 #Number of articles you want to get

user = "yaotti"
per_page = 100
df = pd.DataFrame()

for page in range(1, int(n/per_page)+2): #Get a lot
    base_url = "https://qiita.com/api/v2/items?page={0}&per_page={1}&query=user%3A{2}"
    url = base_url.format(page, per_page, user)

    response = requests.get(url)
    res = response.json()

    tmp_df = json_normalize(res)
    df = pd.concat([df, tmp_df])

df.reset_index(drop=True, inplace=True)

df = df.iloc[:n,:] #Delete as much as you get

The result is a data frame like this:

that's all!

reference

Qiita API v2 documentation Convert dictionary list to DataFrame with pandas json_normalize

[Python] Get user information and article information with Qiita API