[Spotify API] Looking back on 2020 with playlists, Part 1: Acquiring playlist data

Introduction

In 2020, Spotify released "#2020WRAPPED", a feature that presents your most-listened songs of the year in a story format.

The Spotify API returns analyzed musical parameters (audio features) for each song. In this article, I fetch a list of songs together with those parameters so that I can analyze the trends in the songs I listened to in 2020.

Screenshot 2020-12-28 23.42.22.png

Screenshot 2020-12-28 23.56.59.png

Operating environment

Google Colaboratory
Python 3.6

Things to prepare in advance

- Spotify developer account (Client ID, Client Secret)
  - You can create one for free (see the reference site)

Things to do

This article covers the following steps. I will write about the analysis itself in the next article.

Call the Spotify API from Python
Get audio feature information (audio_features)
Load the data returned by the API into a pandas DataFrame
Export the pandas DataFrame to CSV and download it

Preparation 1

Install the required libraries.

`python`


# Install the library for working with the Spotify API
!pip install spotipy

# Enable Japanese fonts in matplotlib
!apt-get -y install fonts-ipafont-gothic
!pip install japanize-matplotlib

Preparation 2

Import the required libraries. Also import the ones that will be used for visualization later.

`python`


# Import the libraries used in this article
from dateutil.parser import parse as parse_date
from matplotlib import pyplot as plt
import japanize_matplotlib
import numpy as np
import pandas as pd
import seaborn as sns
import spotipy
import spotipy.util as util
from spotipy.oauth2 import SpotifyClientCredentials
import sklearn 
from sklearn.decomposition import PCA

Preparation 3

Set the credentials obtained in advance and authenticate.

`python`


# Spotify credentials
client_id     = 'XXXXXXXXXXXXXXXXXX'
client_secret = 'YYYYYYYYYYYYYYYYYY'
user_id       = 'ZZZZZZZZZZZZZZZZZZ'
playlist_id   = '!!!!!!!!!!!!!!!!!!'

# Authenticate with the Client Credentials flow
client_credentials_manager = SpotifyClientCredentials(client_id, client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
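
Hard-coding the Client ID and Client Secret in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch, the credentials could be read from environment variables instead. The names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the ones spotipy looks up by default; the helper `load_spotify_credentials` is a hypothetical name introduced here, not part of spotipy.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper for illustration; not part of spotipy.
    """
    try:
        return env["SPOTIPY_CLIENT_ID"], env["SPOTIPY_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Usage (assumes both variables were set beforehand):
# client_id, client_secret = load_spotify_credentials()
```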

Get song information in playlist

Fetch the playlist and load its song information into a pandas DataFrame.

`python`


# Get the playlist
playlist = sp.user_playlist(user_id, playlist_id)

# Load the tracks into a pandas DataFrame (tracks_df), keeping only the fields we need
tracks_df = pd.DataFrame([(track['track']['id'],
                           track['track']['artists'][0]['name'],
                           track['track']['album']['name'],
                           track['track']['disc_number'],
                           track['track']['track_number'],
                           track['track']['name'],
                           parse_date(track['track']['album']['release_date']) if track['track']['album']['release_date'] else None,
                           parse_date(track['added_at']))
                          for track in playlist['tracks']['items']],
                         columns=['id', 'artist', 'album','disc','track_number','name', 'release_date', 'added_at'] )

tracks_df['target'] = 'A'
tracks_df.head(100)
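
One detail worth noting about `release_date`: Spotify returns it with year, month, or day precision ('YYYY', 'YYYY-MM', or 'YYYY-MM-DD'), and `dateutil` fills any missing parts from today's date. The sketch below is one possible alternative that fills missing parts with 1 instead. `parse_release_date` is a hypothetical helper name, and it returns a `datetime.date` rather than a `datetime`.

```python
from datetime import date


def parse_release_date(value):
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'; missing parts default to 1.

    Hypothetical helper for illustration. Returns None for empty input.
    """
    if not value:
        return None
    parts = [int(p) for p in value.split('-')]
    parts += [1] * (3 - len(parts))  # pad missing month and/or day
    return date(*parts)


print(parse_release_date('2013-12-25'))  # 2013-12-25
print(parse_release_date('2013'))        # 2013-01-01
```

It could be used in place of `parse_date` for the `release_date` column if a stable value is preferred.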

Let's take a look at the acquired data.

`python`


tracks_df \
    .groupby('artist') \
    .count()['id'] \
    .reset_index() \
    .sort_values('id', ascending=False) \
    .rename(columns={'id': 'amount'}) \
    .head(100) \
    .style.background_gradient()

Screenshot 2020-12-28 23.23.55.png

The data appears to have been retrieved without problems. There is a lot of domestic (Japanese) hip-hop.
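
When checking these counts, keep in mind that the playlist object embeds only the first page of tracks (up to 100 items), so a longer playlist would be cut off. A minimal sketch of collecting every page with spotipy's `sp.next()` might look like this. `fetch_all_playlist_items` is a hypothetical helper name, and `sp` is assumed to be the authenticated client created earlier.

```python
def fetch_all_playlist_items(sp, playlist):
    """Collect all track items of a playlist by following the paging links.

    `sp` is an authenticated spotipy.Spotify client and `playlist` is the
    dict returned by sp.user_playlist(). Hypothetical helper for illustration.
    """
    page = playlist['tracks']
    items = list(page['items'])
    while page['next']:
        page = sp.next(page)  # fetch the next page of results
        items.extend(page['items'])
    # Skip entries that carry no track data
    return [item for item in items if item.get('track')]


# Usage (requires network access and valid credentials):
# all_items = fetch_all_playlist_items(sp, playlist)
```

The returned list could then be used in place of `playlist['tracks']['items']` when building `tracks_df`.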

Add audio information to the song list

For each track ID, call the audio_features API to get its audio features, then merge the result with the song DataFrame created earlier.

`python`


# Collect the audio features for every track
features = []

# audio_features accepts multiple IDs per call, so request them in batches of 50
for n, chunk_series in tracks_df.groupby(np.arange(len(tracks_df)) // 50).id:
    features += sp.audio_features([*map(str, chunk_series)])

# Drop tracks for which no features were returned (None), then merge with the track information
features_df = pd.DataFrame([f for f in features if f])
tracks_with_features_df = tracks_df.merge(features_df, on=['id'], how='inner')

# Save to CSV
csvfile = playlist_id + 'playlist_songlist.csv'
tracks_with_features_df.to_csv(csvfile)

tracks_with_features_df['target'] = 'A'
tracks_with_features_df.head(100)
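
The `groupby(np.arange(len(tracks_df)) // 50)` expression in the loop above simply splits the track IDs into batches of 50, because the endpoint accepts only a limited number of IDs per request. The same idea can be sketched with plain list slicing. `chunked` and `fetch_audio_features` are hypothetical helper names, and `sp` is assumed to be the authenticated spotipy client.

```python
def chunked(ids, size=50):
    """Yield successive lists of at most `size` IDs."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_audio_features(sp, track_ids, size=50):
    """Request audio features batch by batch and drop missing (None) results.

    Hypothetical helper for illustration; `sp` is a spotipy.Spotify client.
    """
    features = []
    for batch in chunked(track_ids, size):
        features += sp.audio_features(batch)
    return [f for f in features if f]


# Usage (requires network access and valid credentials):
# features = fetch_audio_features(sp, tracks_df['id'])
```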

You can download the resulting DataFrame as a CSV file and check its contents.

Screenshot 2020-12-29 0.00.56.png
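
In Google Colaboratory the saved file also appears in the file browser in the sidebar, but the download can be triggered from code as well. The following is a minimal sketch using `google.colab.files.download`. The wrapper `download_in_colab` is a hypothetical name, and it simply does nothing outside Colab.

```python
def download_in_colab(path):
    """Trigger a browser download of `path` when running in Google Colaboratory.

    Returns True if the download was requested, False outside Colab.
    Hypothetical helper; google.colab is only importable inside Colab.
    """
    try:
        from google.colab import files
    except ImportError:
        return False
    files.download(path)
    return True


# Usage, with the file name created in the step above:
# download_in_colab(csvfile)
```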

The article here gives an easy-to-follow explanation of what each column's attribute means.

`csv`


,id,artist,album,disc,track_number,name,release_date,added_at,target,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,uri,track_href,analysis_url,duration_ms,time_signature
0,31Wp85Oqdtx5sYSYoXddhW,Rhythmy,mossmoss,1,2,Transparent,2013-12-25,2020-12-06 05:43:54+00:00,A,0.538,0.577,2,-9.263,1,0.0308,0.902,0.0363,0.113,0.0631,100.976,audio_features,spotify:track:31Wp85Oqdtx5sYSYoXddhW,https://api.spotify.com/v1/tracks/31Wp85Oqdtx5sYSYoXddhW,https://api.spotify.com/v1/audio-analysis/31Wp85Oqdtx5sYSYoXddhW,366000,4
1,5Vtb2MhdgCEo2ohVqtU6rh,satohyoh,"feel like, feel right",1,4,astraea,2020-03-11,2020-12-06 05:43:54+00:00,A,0.764,0.738,2,-5.665,1,0.0366,0.966,0.217,0.132,0.592,119.984,audio_features,spotify:track:5Vtb2MhdgCEo2ohVqtU6rh,https://api.spotify.com/v1/tracks/5Vtb2MhdgCEo2ohVqtU6rh,https://api.spotify.com/v1/audio-analysis/5Vtb2MhdgCEo2ohVqtU6rh,231625,3
2,7zZnRcQwc64Ty7a0KcRFDu,AYNIW TEPO,FLOWERS,1,2,Island,2016-01-06,2020-12-06 05:43:54+00:00,A,0.568,0.82,1,-7.256,1,0.0302,0.133,0.692,0.53,0.717,98.976,audio_features,spotify:track:7zZnRcQwc64Ty7a0KcRFDu,https://api.spotify.com/v1/tracks/7zZnRcQwc64Ty7a0KcRFDu,https://api.spotify.com/v1/audio-analysis/7zZnRcQwc64Ty7a0KcRFDu,257005,4
3,4aBFt5YgjiyXx5C5wCEIQ5,AYNIW TEPO,Beautiful Vibrations Live,1,9,"Beautiful Vibrations (Live at Pangea, Osaka, 2019)",2019-09-16,2020-12-06 05:43:54+00:00,A,0.245,0.56,4,-10.057,1,0.0401,0.538,0.407,0.853,0.177,113.608,audio_features,spotify:track:4aBFt5YgjiyXx5C5wCEIQ5,https://api.spotify.com/v1/tracks/4aBFt5YgjiyXx5C5wCEIQ5,https://api.spotify.com/v1/audio-analysis/4aBFt5YgjiyXx5C5wCEIQ5,297564,4
4,7dBmLvVCErz7azdEqTr2XW,ROTH BART BARON,Colorful festival,1,2,Gokusai| I G L (S),2020-10-14,2020-12-06 05:43:54+00:00,A,0.555,0.777,7,-6.054,1,0.0415,0.0403,0.1,0.601,0.414,108.981,audio_features,spotify:track:7dBmLvVCErz7azdEqTr2XW,https://api.spotify.com/v1/tracks/7dBmLvVCErz7azdEqTr2XW,https://api.spotify.com/v1/audio-analysis/7dBmLvVCErz7azdEqTr2XW,257000,4
5,3pZchFz2qKIq7mOIvk5sra,Supercar,RE:SUPERCAR 2 -redesigned by nakamura koji-,1,14,LAST SCENE,2011-06-15,2020-12-06 05:43:54+00:00,A,0.491,0.226,11,-15.287,1,0.0306,0.901,0.375,0.111,0.043,92.992,audio_features,spotify:track:3pZchFz2qKIq7mOIvk5sra,https://api.spotify.com/v1/tracks/3pZchFz2qKIq7mOIvk5sra,https://api.spotify.com/v1/audio-analysis/3pZchFz2qKIq7mOIvk5sra,295733,4
6,16XdREbnr11xS9dj75ud2g,KID FRESINO,No Sun,1,1,No Sun,2020-10-07,2020-12-06 05:43:54+00:00,A,0.627,0.906,1,-6.149,0,0.185,0.28,1.58e-06,0.129,0.786,115.989,audio_features,spotify:track:16XdREbnr11xS9dj75ud2g,https://api.spotify.com/v1/tracks/16XdREbnr11xS9dj75ud2g,https://api.spotify.com/v1/audio-analysis/16XdREbnr11xS9dj75ud2g,248560,4
7,0VkBkUHz4PvKDbNxOvyjZi,Taro Ninja,Pussy (Remix),1,1,Pussy - Remix,2019-03-27,2020-12-06 05:43:54+00:00,A,0.796,0.75,1,-6.508,1,0.0621,0.15,0.0,0.121,0.262,130.085,audio_features,spotify:track:0VkBkUHz4PvKDbNxOvyjZi,https://api.spotify.com/v1/tracks/0VkBkUHz4PvKDbNxOvyjZi,https://api.spotify.com/v1/audio-analysis/0VkBkUHz4PvKDbNxOvyjZi,252000,4
8,4Bir2My0Lim4IFgraNjTnZ,ANARCHY,The KING,1,12,Lucky 13,2019-03-13,2020-12-06 05:43:54+00:00,A,0.765,0.498,0,-7.625,0,0.22,0.459,0.0,0.123,0.501,139.938,audio_features,spotify:track:4Bir2My0Lim4IFgraNjTnZ,https://api.spotify.com/v1/tracks/4Bir2My0Lim4IFgraNjTnZ,https://api.spotify.com/v1/audio-analysis/4Bir2My0Lim4IFgraNjTnZ,240400,4
9,6X0qb3CiIHKzrmhXSfst9e,APOGEE,Higher Deeper,1,7,KESHIKI,2018-03-21,2020-12-06 05:43:54+00:00,A,0.606,0.726,4,-6.094,0,0.0285,0.0127,0.148,0.323,0.397,88.0,audio_features,spotify:track:6X0qb3CiIHKzrmhXSfst9e,https://api.spotify.com/v1/tracks/6X0qb3CiIHKzrmhXSfst9e,https://api.spotify.com/v1/audio-analysis/6X0qb3CiIHKzrmhXSfst9e,376000,4
10,6lAAEBqCLwllbEG1Pd393V,PUNPEE,MODERN TIMES,1,9,P.U.N.P. (Communication),2017-10-11,2020-12-06 05:43:54+00:00,A,0.783,0.852,11,-6.256,0,0.289,0.22,0.0,0.238,0.772,97.047,audio_features,spotify:track:6lAAEBqCLwllbEG1Pd393V,https://api.spotify.com/v1/tracks/6lAAEBqCLwllbEG1Pd393V,https://api.spotify.com/v1/audio-analysis/6lAAEBqCLwllbEG1Pd393V,227679,4
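
Some of these columns are easier to read once converted. As a small illustration, assuming the encoding described in the Spotify Web API reference (`key` in pitch class notation with 0 = C and -1 for "no key detected", `mode` 1 = major and 0 = minor), the sketch below turns `key`, `mode`, and `duration_ms` into readable labels. `describe_key` and `format_duration` are hypothetical helper names.

```python
PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']


def describe_key(key, mode):
    """Turn the numeric `key` and `mode` columns into a label such as 'D major'."""
    if not 0 <= key < len(PITCH_CLASSES):
        return 'unknown'  # -1 means no key was detected
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


def format_duration(duration_ms):
    """Convert `duration_ms` into a 'minutes:seconds' string."""
    minutes, seconds = divmod(duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


# Values taken from rows 0 and 6 of the CSV above
print(describe_key(2, 1), format_duration(366000))  # D major 6:06
print(describe_key(1, 0), format_duration(248560))  # C# minor 4:08
```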

Summary

I used the Spotify API to get the audio features of the songs in a playlist. Next time, I will visualize the similarity between songs based on those features. See you then.