I tried scraping YouTube, but the API can do the same job, so don't scrape it.

I wrote this script a long time ago to keep track of trends on YouTube, but these days the API is enough.

I'm posting the code here as a memorial, to lay it to rest. Also, don't scrape YouTube.


from selenium import webdriver
import time
from selenium.webdriver.common.action_chains import ActionChains
import urllib.parse


def main():
    # Search words
    search_words = ['Alpha wave', 'sleep']
    # Open Chrome (chromedriver is expected one directory up)
    driver = webdriver.Chrome('../chromedriver')
    s = '+'.join(map(urllib.parse.quote, search_words))
    driver.get("https://www.youtube.com/results?search_query=" + s + '&sp=CAM%253D')
    info_list = []
    time.sleep(1)
    for i in range(10):
        driver.execute_script("scrollBy(0, 1000);")
    for i in range(35, 45):
        info = {'title': '', 'url': '', 'channel': '', 'registrant': 0, 'release': ''}
        loop_flag = 0
        selector = f'#contents > ytd-item-section-renderer:nth-child({i // 20 + 1}) > #contents > ytd-video-renderer:nth-child({20 if i % 20 == 0 else i % 20}) > #dismissable > #video-title > yt-formatted-string'
        while loop_flag <= 2:
            try:
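                # find_element_by_css_selector was removed in Selenium 4.3;
                # the generic find_element call works on Selenium 3 and 4.
                element = driver.find_element('css selector', selector)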
                actions = ActionChains(driver)
                actions.move_to_element(element)
                actions.perform()
                info['url'] = element.get_attribute('href')
                break
            except Exception as e:
                print(i, e)
                print(selector)
                loop_flag += 1
                time.sleep(1)
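        # Keep only entries where a URL was actually found.
        # (info_list += info would add the dict's keys, not the dict itself.)
        if info['url']:
            info_list.append(info)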
    print(info_list)
    print(len(info_list))
    driver.quit()


if __name__ == "__main__":
    main()
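
For comparison, here is a rough sketch of what the same search could look like through the YouTube Data API v3 (the search.list endpoint). It is not the original author's code. The function names `build_search_url`, `parse_items`, and `search` are made up for this example, the API key is a placeholder you have to supply yourself, and `order=viewCount` is an assumption about what the `sp` parameter in the scraper was meant to do. The 'registrant' field from the scraper is left out because the search endpoint does not return it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the YouTube Data API v3 search.list method
API_URL = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(search_words, api_key, max_results=10):
    """Build a search.list request URL for the given search words."""
    params = {
        "part": "snippet",
        "q": " ".join(search_words),
        "type": "video",
        # Assumption: the scraper's sp parameter sorted by view count
        "order": "viewCount",
        "maxResults": max_results,
        "key": api_key,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_items(response):
    """Convert a search.list response into dicts like the scraper's info."""
    info_list = []
    for item in response.get("items", []):
        snippet = item["snippet"]
        info_list.append({
            "title": snippet["title"],
            "url": "https://www.youtube.com/watch?v=" + item["id"]["videoId"],
            "channel": snippet["channelTitle"],
            "release": snippet["publishedAt"],
        })
    return info_list


def search(search_words, api_key, max_results=10):
    """Call the API over HTTPS and return the parsed results."""
    url = build_search_url(search_words, api_key, max_results)
    with urllib.request.urlopen(url) as res:
        return parse_items(json.load(res))
```

Calling `search(['Alpha wave', 'sleep'], api_key='YOUR_API_KEY')` should return a list of dicts without a browser, CSS selectors, or retry loops, which is the main reason to prefer the API over the scraper above.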
