Continuing from yesterday, I will write 10 lines of code today as well.
Scraping
Today I want to finish the main scraping logic, which I couldn't complete yesterday.
Yesterday I got stuck on bundle install, but after I reinstalled Xcode it worked fine lol
$ bundle install --path .bundle
Fetching gem metadata from https://rubygems.org/.................
Resolving dependencies...
Using bundler 1.17.2
Using mini_portile2 2.4.0
Fetching nokogiri 1.10.9
Installing nokogiri 1.10.9 with native extensions
Bundle complete! 1 Gemfile dependency, 3 gems now installed.
Bundled gems are installed into `./.bundle`
Now let's write the parsing part using Nokogiri. This time I just want to see the anime titles, so I'll extract only the title.
To use Nokogiri, add the following at the top of the file:
require 'nokogiri'
Then, continuing from last time's code, add:
# Parse the received HTML
doc = Nokogiri::HTML.parse(response.body, nil, nil)
# Extract the necessary information from the parsed document
doc.css(".l-searchPageRanking_unit_title").each do |div|
  puts div.text.split("\n")[2].gsub(" ", "")
end
This parses the received HTML and prints each title.
You can run the Ruby script with the following command.
bundle exec ruby crawler.rb
Then the following is displayed:
Do you want Kaguya to tell you? ~ Love brain battle of geniuses ~ (TV anime video)
Dropkick on My Devil'(dash) (TV anime video)
BNA BNA (TV anime video)
Kakushigoto (TV anime video)
I have reincarnated as a villain daughter who has only the ruin flag of the maiden game ... (TV anime video)
Wave, Listen to Me! (TV Anime Video)
Singing Yesterday (TV Anime Video)
Fruit basket 2nd season (TV anime video)
Appare-Ran! (TV anime video)
Princess Connect! Re: Dive (TV anime video)
Major 2nd 2nd series (TV anime video)
Book lover Shimogami-I can't choose the means to become a librarian-Part 2 (TV anime video)
Diary of Our Days after School (TV Anime Video)
Kingdom 3rd series (TV anime video)
Gleipnir (TV anime video)
Troublesome grandfather (TV anime video)
After all my youth romantic comedy is wrong. Complete (TV anime video)
Shokugeki no Soma Gono Dish (TV Anime Video)
Arte (TV anime video)
Millionaire Detective Balance: UNLIMITED (TV Anime Video)
Digimon Adventure: (TV Anime Video)
Re: Life in a different world starting from zero(Second stage)(TV anime video)
LISTENERS (TV anime video)
Hakushon Daimaou 2020 (TV anime video)
Tsugumomo (TV anime video)
The list should be displayed like this. To be honest, I feel the split and gsub part could be done a bit more properly, but... I'm fine with this for now lol
However, there is one problem: when I counted the lines of code added this time, it was only 9 lines even including the comments... So I'd like to also get numerical values such as the rating.
Let's tweak the parsing code from earlier so that the rating and the number of comments are shown as well.
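As a rough idea of what "more properly" could look like, here is a minimal sketch of a title cleanup that does not rely on a fixed line index or delete every space. The helper name clean_title is hypothetical (it is not in crawler.rb), and it assumes the title is the first non-blank line of the element's text, which may not hold for the real page.

```ruby
# Hypothetical helper, not part of the original crawler.rb.
# Assumption: the title is the first non-blank line of the element's text.
def clean_title(raw_text)
  first_line = raw_text.to_s.each_line.map(&:strip).reject(&:empty?).first
  # Collapse runs of whitespace inside the title instead of deleting every space.
  first_line.to_s.gsub(/\s+/, " ")
end

# Made-up sample imitating text with leading blank lines and indentation.
sample = "\n\n      Kakushigoto (TV anime video)\n    "
puts clean_title(sample)
```

In the crawler this would be called as clean_title(div.text) in place of the split and gsub chain, and it returns an empty string instead of raising when the text is missing.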
# Extract the necessary information from the parsed document
doc.css(".l-searchPageRanking_unit").each do |div|
  puts "title:" + div.css(".l-searchPageRanking_unit_title")[0].text.split("\n")[2].gsub(" ", "")
  puts "Evaluation:" + div.css(".l-searchPageRanking_unit_mainBlock_starPoint strong")[0].text
  puts "Number of comments:" + div.css(".l-searchPageRanking_unit_mainBlock_starPoint span")[0].text + "\n\n"
end
To explain briefly: the previous code looped only over the title elements, but to get the rating as well you need to select the element that encloses each anime's information and loop over that, so I changed the selector in front of each as shown above.
Run it with the following command:
bundle exec ruby crawler.rb
Then the following is displayed:
title:Do you want Kaguya to tell you? ~ Love brain battle of geniuses ~ (TV anime video)
Evaluation:3.8
Number of comments:120
title:Dropkick on My Devil'(dash) (TV anime video)
Evaluation:3.9
Number of comments:54
title:BNA BNA (TV anime video)
Evaluation:3.7
Number of comments:88
title:Kakushigoto (TV anime video)
Evaluation:3.6
Number of comments:105
title:I have reincarnated as a villain daughter who has only the ruin flag of the maiden game ... (TV anime video)
Evaluation:3.6
Number of comments:114
title:Wave, Listen to Me! (TV Anime Video)
Evaluation:3.6
Number of comments:77
title:Singing Yesterday (TV Anime Video)
Evaluation:3.8
Number of comments:115
title:Fruit basket 2nd season (TV anime video)
Evaluation:3.5
Number of comments:23
title:Appare-Ran! (TV anime video)
Evaluation:3.4
Number of comments:46
title:Princess Connect! Re: Dive (TV anime video)
Evaluation:3.5
Number of comments:55
title:Major 2nd 2nd series (TV anime video)
Evaluation:3.5
Number of comments:15
title:Book lover Shimogami-I can't choose the means to become a librarian-Part 2 (TV anime video)
Evaluation:3.3
Number of comments:41
title:Diary of Our Days after School (TV Anime Video)
Evaluation:3.4
Number of comments:53
title:Kingdom 3rd series (TV anime video)
Evaluation:3.4
Number of comments:20
title:Gleipnir (TV anime video)
Evaluation:3.5
Number of comments:74
title:Troublesome grandfather (TV anime video)
Evaluation:3.4
Number of comments:11
title:After all my youth romantic comedy is wrong. Complete (TV anime video)
Evaluation:3.3
Number of comments:23
title:Shokugeki no Soma Gono Dish (TV Anime Video)
Evaluation:3.3
Number of comments:30
title:Arte (TV anime video)
Evaluation:3.4
Number of comments:51
title:Millionaire Detective Balance: UNLIMITED (TV Anime Video)
Evaluation:3.2
Number of comments:37
title:Digimon Adventure: (TV Anime Video)
Evaluation:3.2
Number of comments:15
title:Re: Life in a different world starting from zero(Second stage)(TV anime video)
Evaluation:3.2
Number of comments:13
title:LISTENERS (TV anime video)
Evaluation:3.2
Number of comments:57
title:Hakushon Daimaou 2020 (TV anime video)
Evaluation:3.2
Number of comments:14
title:Tsugumomo (TV anime video)
Evaluation:3.1
Number of comments:26
As shown above, you can now get the rating and the number of comments for each anime and decide which one to watch!!
PS: Personally I was rooting for Soma, but its momentum has clearly slowed recently... So I think I'll watch the highly rated "Kaguya", or "Fruit Basket", whose first season made me cry.
I'll also publish the code I wrote today on GitHub (I don't know if it's worth it): https://github.com/itayayuichiro/anikore_crawler
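If you wanted the script itself to help with that decision, one possible follow-up is to sort the entries by rating. This is only a sketch and is not in crawler.rb. The entries array is hypothetical data shaped like the output above (three values copied from it, titles abbreviated), and rank_by_rating is a made-up helper name.

```ruby
# Hypothetical data shaped like the output above (titles abbreviated).
entries = [
  { title: "Kakushigoto", rating: "3.6", comments: "105" },
  { title: "Dropkick on My Devil'(dash)", rating: "3.9", comments: "54" },
  { title: "Tsugumomo", rating: "3.1", comments: "26" }
]

# Hypothetical helper: converts the scraped strings to numbers
# and sorts by rating, highest first.
def rank_by_rating(entries)
  entries
    .map { |e| e.merge(rating: e[:rating].to_f, comments: e[:comments].to_i) }
    .sort_by { |e| -e[:rating] }
end

rank_by_rating(entries).each do |e|
  puts format("%.1f  %s (%d comments)", e[:rating], e[:title], e[:comments])
end
```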