Ranking Qiita articles by stock count for a tag with Python

【Overview】

Extract the 10 articles with the most stocks under a given tag.

【Environment】

Windows 8.1, Python 3.5

【Program】

This example ranks the articles under the Python tag. To run it: `python stock_rank.py > output.html`

`stock_rank.py`


# -*- coding: utf-8 -*-

import urllib.request
from bs4 import BeautifulSoup

# Initialize the top-10 stock counts
cont = [0] * 10

# Initialize the top-10 titles
title = [""] * 10

page_num = 1

while True:
    try:
        html = urllib.request.urlopen("https://qiita.com/tags/Python/items?page=" + str(page_num)).read()
        
        soup = BeautifulSoup(html, "html.parser")
        
        # Extract the article entries by class name
        title_all = soup.find_all(class_="publicItem_body")
        
        # Skip pages that have no publicItem_body elements;
        # advance the page number first so the same page is not fetched forever
        if len(title_all) == 0:
            page_num += 1
            continue
        
        for i in range(20):
            try:          
                # Extract the stock-count elements by class name
                cont_all = soup.find_all(class_="publicItem_stockCount")
                # Remove the icon tag so that only the number remains
                cont_sakujo = str(cont_all[i]).replace('<i class="fa fa-stock "></i>','')
                # cont_sakujo is a str and has no .string attribute,
                # so parse it again as a BeautifulSoup object
                cont_kazu = int(BeautifulSoup(cont_sakujo, "html.parser").string)
                
                for j in range(10):
                    if cont_kazu >= cont[j]:
                        # Insert the stock count at rank j and drop the lowest one
                        cont.insert(j, cont_kazu)
                        cont.pop()
                        # Insert the title at the same rank
                        title.insert(j, title_all[i])
                        title.pop()
                        break
                
            # Skip articles that nobody has stocked (no number to parse)
            # and indexes past the last entry on the page
            except (IndexError, TypeError, ValueError):
                continue
        
        page_num += 1
        
    # Stop when the page does not exist (HTTP Error 404)
    except urllib.error.HTTPError:
        break

# Print each entry as an HTML line, turning the relative link into an absolute URL
for i in range(len(title)):
    print(str(cont[i]) + "　" + str(title[i].a).replace('href="', 'href="https://qiita.com') + "<br>")
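
The insert-and-pop bookkeeping in the script can also be expressed with the standard library. The following is a minimal sketch, not the script's own method: it assumes the (stock count, title) pairs have already been collected into a list, and `top_stocked` and the sample data are hypothetical names made up for illustration.

```python
import heapq

def top_stocked(entries, n=10):
    """Return the n entries with the highest stock count, largest first.

    entries is an iterable of (stock_count, title) tuples.
    """
    # Rank by the first tuple element, the stock count
    return heapq.nlargest(n, entries, key=lambda entry: entry[0])

# Hypothetical sample data, not real Qiita results
sample = [(12, "article A"), (340, "article B"), (7, "article C"), (95, "article D")]
for count, title in top_stocked(sample, n=3):
    print(count, title)
```

One difference from the loop in the script: when two articles have the same count, the script places the later one ahead, while `heapq.nlargest` keeps tied entries in their input order.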

【Result】

When I displayed the output as UTF-8, the characters were garbled, so I switched the display encoding to Shift-JIS.
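
The garbling is probably because, on Japanese Windows, Python encodes redirected standard output with the system code page (cp932, i.e. Shift-JIS) rather than UTF-8. One way to avoid depending on the console is to write the file from the script with an explicit encoding. This is a minimal sketch under that assumption; `write_ranking` and the sample lines are hypothetical and not part of the original script.

```python
import os
import tempfile

def write_ranking(lines, path):
    """Write the ranking lines to an HTML file encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        # Declaring the charset lets the browser pick UTF-8 without guessing
        f.write('<meta charset="utf-8">\n')
        for line in lines:
            f.write(line + "<br>\n")

# Hypothetical sample lines; the real script would pass its own output
out_path = os.path.join(tempfile.gettempdir(), "output.html")
write_ranking(["340　記事B", "95　記事D"], out_path)
```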

【Problem】

The program takes a long time to run (>_<)
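
A likely cause, though not measured here, is that the script downloads every page one after another and waits for each response. Below is a minimal sketch of fetching several pages concurrently with the standard library's `concurrent.futures`. `fetch_pages` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for `urllib.request.urlopen(...).read()` so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_pages(fetch, page_numbers, max_workers=4):
    """Call fetch(page_number) for every page using a pool of threads.

    Returns the results in the same order as page_numbers.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order even if pages finish out of order
        return list(executor.map(fetch, page_numbers))

def fake_fetch(page_num):
    # Stand-in for a real download; returns a placeholder string
    return "<html>page {}</html>".format(page_num)

pages = fetch_pages(fake_fetch, range(1, 6))
print(len(pages))
```

Keeping `max_workers` small avoids flooding the site with simultaneous requests.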

【References】

Get information on the net with Python3 + urllib + BeautifulSoup

Scraping with Python and Beautiful Soup

Scraping with Beautiful Soup
