My beloved Thrash Metal, ** SLAYER ** is my favorite.
Although they have been active for many years, they have finally reached the final world tour while overcoming the death of the members. And on November 30, 2019, the final lecture in LA graced the endless beauty.
https://www.youtube.com/watch?v=OwsdbuxRc_s
To commemorate this, I would like to confirm what they wanted to convey.
After all blood is a death ... I have nothing to say anymore.
Thanks for the great music and message! !! !! !! !!
First, check the HTML of the target page. The lyrics are written in the "lyrics" class, but there are some extra child tags that need to be eliminated.
sample tags
<div class="lyrics">
<h3><a name="1">1. Evil Has No Boundaries</a></h3><br />
<i>[Lyrics - Hanneman, King; Music - King]</i><br />
<br />
Blasting our way through the boundaries of Hell<br />
・ ・ ・
We conquer then move on ahead<br />
<br />
<i>[Chorus:]</i><br />
Evil<br />
・ ・ ・
Your soul now his to keep<br />
<br />
<div class="thanks">Tom Araya - Bass/Vocals<br />
Kerry King - Lead/Rhythm Guitar<br />
Jeff Hanneman - Lead/Rhythm Guitar<br />
Dave Lombardo - Drums<br />
<br />
Thanks to nwdrk13 for correcting track #4 lyrics.<br />
Thanks to rath00 for correcting track #6 lyrics.</div>
<br />
<div class="note">Submits, comments, corrections are welcomed at [email protected]</div><br />
<a href="http://www.darklyrics.com/s/slayer.html">SLAYER LYRICS</a>
</div>
It turned out to be something like this.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import time
from wordcloud import WordCloud
#URL list for each album
urls = ['http://www.darklyrics.com/lyrics/slayer/shownomercy.html',
'http://www.darklyrics.com/lyrics/slayer/hauntingthechapel.html',
'http://www.darklyrics.com/lyrics/slayer/hellawaits.html',
'http://www.darklyrics.com/lyrics/slayer/reigninblood.html',
'http://www.darklyrics.com/lyrics/slayer/southofheaven.html',
'http://www.darklyrics.com/lyrics/slayer/seasonsintheabyss.html',
'http://www.darklyrics.com/lyrics/slayer/divineintervention.html',
'http://www.darklyrics.com/lyrics/slayer/undisputedattitude.html',
'http://www.darklyrics.com/lyrics/slayer/diabolusinmusica.html',
'http://www.darklyrics.com/lyrics/slayer/godhatesusall.html',
'http://www.darklyrics.com/lyrics/slayer/christillusion.html',
'http://www.darklyrics.com/lyrics/slayer/worldpaintedblood.html',
'http://www.darklyrics.com/lyrics/slayer/repentless.html']
texts = ''
for url in urls:
#Get
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
song_lyrics = soup.find('div', class_='lyrics')
#Delete unnecessary tags
for tag in song_lyrics.find_all('h3'):
tag.extract()
for tag in song_lyrics.find_all('i'):
tag.extract()
for tag in song_lyrics.find_all('div', class_='thanks'):
tag.extract()
for tag in song_lyrics.find_all('div', class_='note'):
tag.extract()
for tag in song_lyrics.find_all('a'):
tag.extract()
song_lyric = song_lyrics.text
song_lyric = song_lyric.replace('\n',' ')
#Wait for 1 second (considering server load)
time.sleep(1)
#Add the acquired lyrics
texts = texts + song_lyric + ' '
#Word removal that seems meaningless
stop_words = ['the', 'of', 'to', 'is', 'in', 'for', 'with', 'that', 'my', 'all', 'will', 'from', 'can', 'your',
'on', 'me', 'it', 'and', 'this', 'be', 'are', 'am', 'their', 'do', 'there', 'you', 'it']
wordcloud = WordCloud(background_color='black', colormap='autumn',
width=800, height=600, stopwords=set(stop_words)).generate(texts)
#The image is wordcloud.Save png in the same directory as the py file
wordcloud.to_file('./wordcloud.png')
From "Show No Mercy" in 1983, the legendary album "Reign in Blood" in 1986. And the last original album, 2015 "Repentless".
No other band has been so musical in their lifetime. They were very rare.
I will hold their sound and the steel soul in my heart for the rest of my life.
Ah ... it wasn't a blog here.
Recommended Posts