[Spotify API] Looking back on 2020 with playlists - Part 1: Acquiring playlist data

Introduction

In 2020, Spotify released "#2020WRAPPED", a feature that presents your most-listened songs of the year in a story format.

The Spotify API returns analyzed musical parameters (audio features) for each song. The goal here is to get a list of songs together with that data and analyze the trends in the songs listened to in 2020.


Operating environment

Things to prepare in advance

- Spotify developer account (Client ID, Client Secret)
  - You can create one for free (see the reference site)

Things to do

This article covers the following steps. The analysis itself is left for the next article.

  1. Call the Spotify API from Python
  2. Get the audio feature information for each song
  3. Load the data returned by the API into a pandas DataFrame
  4. Export the DataFrame to CSV and download it

Preparation 1

Install the required libraries.

python


# Install the library for working with the Spotify API
!pip install spotipy

# Install Japanese fonts so that matplotlib can render Japanese text
!apt-get -y install fonts-ipafont-gothic
!pip install japanize-matplotlib

Preparation 2

Import the required libraries. Also import the ones that will be used for visualization later.

python


# Import the libraries used in this series
from dateutil.parser import parse as parse_date
from matplotlib import pyplot as plt
import japanize_matplotlib
import numpy as np
import pandas as pd
import seaborn as sns
import spotipy
import spotipy.util as util
from spotipy.oauth2 import SpotifyClientCredentials
import sklearn 
from sklearn.decomposition import PCA

Preparation 3

Set the credentials obtained in advance and authenticate.

python


# Set the Spotify credentials
client_id     = 'XXXXXXXXXXXXXXXXXX'
client_secret = 'YYYYYYYYYYYYYYYYYY'
user_id       = 'ZZZZZZZZZZZZZZZZZZ'
playlist_id   = '!!!!!!!!!!!!!!!!!!'

# Authenticate with the client credentials
client_credentials_manager = spotipy.oauth2.SpotifyClientCredentials(client_id, client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
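
Hard-coding the Client ID and Client Secret makes them easy to leak when a notebook is shared. As a minimal sketch, the credentials could be read from environment variables instead. The helper name `load_spotify_credentials` is hypothetical and not part of spotipy; `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the variable names spotipy itself looks up when no credentials are passed explicitly.

```python
import os


def load_spotify_credentials(environ=None):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper, not part of spotipy. `environ` can be any mapping,
    which makes the function easy to try out without touching os.environ.
    """
    env = os.environ if environ is None else environ
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")

    # Report every missing variable at once instead of failing on the first.
    missing = [name for name, value in
               (("SPOTIPY_CLIENT_ID", client_id),
                ("SPOTIPY_CLIENT_SECRET", client_secret)) if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return client_id, client_secret
```

The returned pair can be passed to `SpotifyClientCredentials(client_id, client_secret)` in the same way as the hard-coded values in the authentication code above.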

Get song information in playlist

Fetch the playlist and load the necessary fields of each track into a pandas DataFrame.

python


### Get playlist information
playlist = sp.user_playlist(user_id, playlist_id)

### Load the data into a pandas DataFrame (tracks_df), keeping only the necessary fields
tracks_df = pd.DataFrame([(track['track']['id'],
                           track['track']['artists'][0]['name'],
                           track['track']['album']['name'],
                           track['track']['disc_number'],
                           track['track']['track_number'],
                           track['track']['name'],
                           parse_date(track['track']['album']['release_date']) if track['track']['album']['release_date'] else None,
                           parse_date(track['added_at']))
                          for track in playlist['tracks']['items']],
                         columns=['id', 'artist', 'album','disc','track_number','name', 'release_date', 'added_at'] )

tracks_df['target'] = 'A'
tracks_df.head(100)
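
The playlist object used above holds only the first page of tracks (the Spotify Web API returns at most 100 playlist items per request), so a longer playlist would be cut off. The following is a minimal sketch of collecting every page, assuming `sp` is the authenticated client created earlier. `fetch_all_playlist_items` is a hypothetical helper name; `playlist_items` and `next` are methods of the spotipy client.

```python
def fetch_all_playlist_items(sp, playlist_id, page_size=100):
    """Collect every item of a playlist by following the paging links.

    Hypothetical helper: `sp` is assumed to be a spotipy.Spotify client.
    """
    page = sp.playlist_items(playlist_id, limit=page_size)
    items = list(page["items"])

    # Each page carries a "next" URL until the last page, where it is None.
    while page.get("next"):
        page = sp.next(page)
        items.extend(page["items"])

    # Skip entries that carry no track data, since the row-building code
    # indexes into item['track'] and would fail on them.
    return [item for item in items if item.get("track")]
```

The returned list could be used in place of `playlist['tracks']['items']` in the comprehension that builds `tracks_df`.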

Let's take a look at the acquired data.

python


tracks_df \
    .groupby('artist') \
    .count()['id'] \
    .reset_index() \
    .sort_values('id', ascending=False) \
    .rename(columns={'id': 'amount'}) \
    .head(100) \
    .style.background_gradient()


The data appears to have been collected successfully. Many of the tracks are domestic (Japanese) hip-hop.

Add audio features to the song list

For each track ID, call the audio_features API to get the audio features, then merge the result with the song information data frame created earlier.

python


# Start with an empty list to collect the audio features
features = []

# Request the audio features from the API in chunks of 50 track IDs
for n, chunk_series in tracks_df.groupby(np.arange(len(tracks_df)) // 50).id:
    features += sp.audio_features([*map(str, chunk_series)])

# Drop tracks for which no features were returned (None), then merge with the track information
features_df = pd.DataFrame.from_dict(filter(None, features))
tracks_with_features_df = tracks_df.merge(features_df, on=['id'], how='inner')

# Save to CSV
csvfile = playlist_id + 'playlist_songlist.csv'
tracks_with_features_df.to_csv(csvfile)

tracks_with_features_df['target'] = 'A'
tracks_with_features_df.head(100)
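
The `groupby` expression above is a compact way to split the track IDs into groups of 50 (the audio features endpoint accepts up to 100 IDs per request). The same batching can be sketched with plain list slicing, which may be easier to follow. `fetch_audio_features` is a hypothetical helper name, and `sp` is assumed to be the spotipy client created earlier.

```python
def fetch_audio_features(sp, track_ids, batch_size=50):
    """Fetch audio features in batches, dropping tracks that return None.

    Hypothetical helper: `sp` is assumed to be a spotipy.Spotify client and
    `track_ids` a list of track ID strings.
    """
    features = []
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        # audio_features returns one entry per requested ID; an entry is None
        # when no features are available for that track.
        result = sp.audio_features(batch) or []
        features.extend(f for f in result if f is not None)
    return features
```

With this helper, the features data frame could be built as `pd.DataFrame(fetch_audio_features(sp, tracks_df['id'].tolist()))` and merged exactly as before.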

You can download the resulting data frame as a CSV file and inspect it.

The article linked here gives an easy-to-understand explanation of what each column means.

csv


,id,artist,album,disc,track_number,name,release_date,added_at,target,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,uri,track_href,analysis_url,duration_ms,time_signature
0,31Wp85Oqdtx5sYSYoXddhW,Rhythmy,mossmoss,1,2,Transparent,2013-12-25,2020-12-06 05:43:54+00:00,A,0.538,0.577,2,-9.263,1,0.0308,0.902,0.0363,0.113,0.0631,100.976,audio_features,spotify:track:31Wp85Oqdtx5sYSYoXddhW,https://api.spotify.com/v1/tracks/31Wp85Oqdtx5sYSYoXddhW,https://api.spotify.com/v1/audio-analysis/31Wp85Oqdtx5sYSYoXddhW,366000,4
1,5Vtb2MhdgCEo2ohVqtU6rh,satohyoh,"feel like, feel right",1,4,astraea,2020-03-11,2020-12-06 05:43:54+00:00,A,0.764,0.738,2,-5.665,1,0.0366,0.966,0.217,0.132,0.592,119.984,audio_features,spotify:track:5Vtb2MhdgCEo2ohVqtU6rh,https://api.spotify.com/v1/tracks/5Vtb2MhdgCEo2ohVqtU6rh,https://api.spotify.com/v1/audio-analysis/5Vtb2MhdgCEo2ohVqtU6rh,231625,3
2,7zZnRcQwc64Ty7a0KcRFDu,AYNIW TEPO,FLOWERS,1,2,Island,2016-01-06,2020-12-06 05:43:54+00:00,A,0.568,0.82,1,-7.256,1,0.0302,0.133,0.692,0.53,0.717,98.976,audio_features,spotify:track:7zZnRcQwc64Ty7a0KcRFDu,https://api.spotify.com/v1/tracks/7zZnRcQwc64Ty7a0KcRFDu,https://api.spotify.com/v1/audio-analysis/7zZnRcQwc64Ty7a0KcRFDu,257005,4
3,4aBFt5YgjiyXx5C5wCEIQ5,AYNIW TEPO,Beautiful Vibrations Live,1,9,"Beautiful Vibrations (Live at Pangea, Osaka, 2019)",2019-09-16,2020-12-06 05:43:54+00:00,A,0.245,0.56,4,-10.057,1,0.0401,0.538,0.407,0.853,0.177,113.608,audio_features,spotify:track:4aBFt5YgjiyXx5C5wCEIQ5,https://api.spotify.com/v1/tracks/4aBFt5YgjiyXx5C5wCEIQ5,https://api.spotify.com/v1/audio-analysis/4aBFt5YgjiyXx5C5wCEIQ5,297564,4
4,7dBmLvVCErz7azdEqTr2XW,ROTH BART BARON,Colorful festival,1,2,Gokusai| I G L (S),2020-10-14,2020-12-06 05:43:54+00:00,A,0.555,0.777,7,-6.054,1,0.0415,0.0403,0.1,0.601,0.414,108.981,audio_features,spotify:track:7dBmLvVCErz7azdEqTr2XW,https://api.spotify.com/v1/tracks/7dBmLvVCErz7azdEqTr2XW,https://api.spotify.com/v1/audio-analysis/7dBmLvVCErz7azdEqTr2XW,257000,4
5,3pZchFz2qKIq7mOIvk5sra,Supercar,RE:SUPERCAR 2 -redesigned by nakamura koji-,1,14,LAST SCENE,2011-06-15,2020-12-06 05:43:54+00:00,A,0.491,0.226,11,-15.287,1,0.0306,0.901,0.375,0.111,0.043,92.992,audio_features,spotify:track:3pZchFz2qKIq7mOIvk5sra,https://api.spotify.com/v1/tracks/3pZchFz2qKIq7mOIvk5sra,https://api.spotify.com/v1/audio-analysis/3pZchFz2qKIq7mOIvk5sra,295733,4
6,16XdREbnr11xS9dj75ud2g,KID FRESINO,No Sun,1,1,No Sun,2020-10-07,2020-12-06 05:43:54+00:00,A,0.627,0.906,1,-6.149,0,0.185,0.28,1.58e-06,0.129,0.786,115.989,audio_features,spotify:track:16XdREbnr11xS9dj75ud2g,https://api.spotify.com/v1/tracks/16XdREbnr11xS9dj75ud2g,https://api.spotify.com/v1/audio-analysis/16XdREbnr11xS9dj75ud2g,248560,4
7,0VkBkUHz4PvKDbNxOvyjZi,Taro Ninja,Pussy (Remix),1,1,Pussy - Remix,2019-03-27,2020-12-06 05:43:54+00:00,A,0.796,0.75,1,-6.508,1,0.0621,0.15,0.0,0.121,0.262,130.085,audio_features,spotify:track:0VkBkUHz4PvKDbNxOvyjZi,https://api.spotify.com/v1/tracks/0VkBkUHz4PvKDbNxOvyjZi,https://api.spotify.com/v1/audio-analysis/0VkBkUHz4PvKDbNxOvyjZi,252000,4
8,4Bir2My0Lim4IFgraNjTnZ,ANARCHY,The KING,1,12,Lucky 13,2019-03-13,2020-12-06 05:43:54+00:00,A,0.765,0.498,0,-7.625,0,0.22,0.459,0.0,0.123,0.501,139.938,audio_features,spotify:track:4Bir2My0Lim4IFgraNjTnZ,https://api.spotify.com/v1/tracks/4Bir2My0Lim4IFgraNjTnZ,https://api.spotify.com/v1/audio-analysis/4Bir2My0Lim4IFgraNjTnZ,240400,4
9,6X0qb3CiIHKzrmhXSfst9e,APOGEE,Higher Deeper,1,7,KESHIKI,2018-03-21,2020-12-06 05:43:54+00:00,A,0.606,0.726,4,-6.094,0,0.0285,0.0127,0.148,0.323,0.397,88.0,audio_features,spotify:track:6X0qb3CiIHKzrmhXSfst9e,https://api.spotify.com/v1/tracks/6X0qb3CiIHKzrmhXSfst9e,https://api.spotify.com/v1/audio-analysis/6X0qb3CiIHKzrmhXSfst9e,376000,4
10,6lAAEBqCLwllbEG1Pd393V,PUNPEE,MODERN TIMES,1,9,P.U.N.P. (Communication),2017-10-11,2020-12-06 05:43:54+00:00,A,0.783,0.852,11,-6.256,0,0.289,0.22,0.0,0.238,0.772,97.047,audio_features,spotify:track:6lAAEBqCLwllbEG1Pd393V,https://api.spotify.com/v1/tracks/6lAAEBqCLwllbEG1Pd393V,https://api.spotify.com/v1/audio-analysis/6lAAEBqCLwllbEG1Pd393V,227679,4
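
Before moving on to the analysis, it can be worth sanity-checking the exported file. The sketch below uses only the standard library's `csv` module, so it can also run outside the notebook. It checks that the features the Spotify documentation describes as ranging from 0.0 to 1.0 actually fall in that range. `find_out_of_range_rows` and `UNIT_RANGE_FEATURES` are hypothetical names introduced for this illustration.

```python
import csv
import io

# Audio features documented by Spotify as values between 0.0 and 1.0.
UNIT_RANGE_FEATURES = ["danceability", "energy", "speechiness", "acousticness",
                       "instrumentalness", "liveness", "valence"]


def find_out_of_range_rows(csv_text):
    """Return (track id, column) pairs whose value falls outside 0.0-1.0.

    Hypothetical helper that takes the exported CSV as a string.
    """
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in UNIT_RANGE_FEATURES:
            value = float(row[column])
            if not 0.0 <= value <= 1.0:
                problems.append((row["id"], column))
    return problems
```

For example, reading the file with `open(csvfile, encoding="utf-8").read()` and passing the text to `find_out_of_range_rows` should return an empty list when every value is within range.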

Summary

I used the Spotify API to get the audio feature data for the songs in a playlist. Next time, I will visualize the similarity between songs based on that data. See you then.
