There are many articles on how to extract the video ID from a YouTube URL, but none of the ones I found seemed to cover everything: shortened URLs that start with `https://youtu.be/` (generated when you press the "Share" button), and URLs that include query parameters (for example, `t=15`, which specifies the start time, or `feature=youtu.be`, which indicates a redirect from the shortened URL). So I'm writing it down here as a memo.
By the way, the YouTube URL query parameter `t`, which indicates the playback start position, can of course be specified entirely in seconds, like
https://youtu.be/r4Mkv-q4NmQ?t=5437
and
https://youtu.be/r4Mkv-q4NmQ?t=5437s
but it can also be written in the form `◯h△m□s`, like
https://youtu.be/r4Mkv-q4NmQ?t=1h30m37s
in which case the video starts playing from "◯ hours △ minutes □ seconds"! You can also omit `◯h` and use just `△m□s`.
The YouTube URLs in this article are basically those of my own videos or channel!
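The forms of `t` described above can all be normalized to a number of seconds. The following is only a minimal sketch: the helper name `t_to_seconds` and the regular expression are my own assumptions for illustration, not something YouTube documents, and it only covers the forms mentioned in this article.

```python
import re

# Hypothetical helper (not part of the original article).
# Accepts "5437", "5437s", "1h30m37s", "30m37s" and similar forms.
_T_PATTERN = re.compile(r'(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?')

def t_to_seconds(t):
    """Convert the value of the `t` query parameter to seconds."""
    match = _T_PATTERN.fullmatch(t)
    if not t or match is None:
        raise ValueError('unsupported t value: ' + repr(t))
    # Groups that did not take part in the match are None, so treat them as 0
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

print(t_to_seconds('5437'))      # 5437
print(t_to_seconds('5437s'))     # 5437
print(t_to_seconds('1h30m37s'))  # 5437
```

All three example URLs above therefore point at the same position in the video.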
This works with Python 3. Python 2 does not seem to have a `urllib.parse` module (the equivalent functions live in the `urlparse` module there).
import urllib.parse
import re

##############################################################
## Extract YouTube video IDs from a list of URLs.
## Supports normal URLs and shortened URLs.
## An error message is printed for unsupported URLs.
## Argument: list of URLs
## Return value: list of extracted video IDs
##############################################################
def pick_up_vid_list(url_list):
    vid_list = []
    # "." and "?" are regex metacharacters, so escape the prefixes before matching
    pattern_watch = re.escape('https://www.youtube.com/watch?')
    pattern_short = re.escape('https://youtu.be/')

    for i, url in enumerate(url_list):
        # Normal URL: the video ID is the value of the "v" query parameter
        if re.match(pattern_watch, url):
            yturl_qs = urllib.parse.urlparse(url).query
            vid = urllib.parse.parse_qs(yturl_qs)['v'][0]
            vid_list.append(vid)
        # Shortened URL
        elif re.match(pattern_short, url):
            # The 11 characters following "https://youtu.be/" are the video ID
            vid = url[17:28]
            vid_list.append(vid)
        else:
            print('error:\n  The URL must start with "https://www.youtube.com/watch?" or')
            print('  "https://youtu.be/".')
            print('  - Item ' + str(i+1) + ': ' + url)

    return vid_list
For regular URLs that start with `https://www.youtube.com/watch?`, the video ID is the value of the `v` query parameter, so I extract that!
For shortened URLs that start with `https://youtu.be/`, the 11 characters following `https://youtu.be/` are always the video ID, so I take those!
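To make the two cases concrete, here is a small sketch of what `urllib.parse` returns for each kind of URL. The variable names are mine, and the second URL (one of this article's videos with `feature=youtu.be` appended) is only an illustrative assumption of what a shared link can look like.

```python
import urllib.parse

# Normal URL: the query string holds the video ID under "v"
watch = urllib.parse.urlparse('https://www.youtube.com/watch?v=2k-uF-QPcEM&t=5')
watch_qs = urllib.parse.parse_qs(watch.query)
print(watch_qs)          # {'v': ['2k-uF-QPcEM'], 't': ['5']}
print(watch_qs['v'][0])  # 2k-uF-QPcEM

# Shortened URL: the video ID is in the path, the query holds only the extras
short = urllib.parse.urlparse('https://youtu.be/biaC_2Mx7Mw?t=283&feature=youtu.be')
short_qs = urllib.parse.parse_qs(short.query)
print(short.path)        # /biaC_2Mx7Mw
print(short_qs)          # {'t': ['283'], 'feature': ['youtu.be']}
```

Note that `parse_qs` maps every key to a list of values, which is why the `[0]` is needed to get the ID itself.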
I was worried that the ID might someday grow to 12 characters and thought I would have to search for it with a regular expression, but apparently it's fine.
→ [About the risk of YouTube's v value gaining a digit - Nipotan Research Institute](http://blog.livedoor.jp/nipotan/archives/50588074.html)
Also, according to this article, the video ID seems to be made up of `[0-9]`, `[a-z]`, `[A-Z]`, `-`, and `_`. According to "[Characters that can be used in URLs, characters that cannot be used](https://www.ipentec.com/document/web-url-invalid-char)", there seem to be hardly any other characters that can be used in a URL, so I assume YouTube will not add more character types and, if IDs ever run short, will increase the number of digits instead.
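Taken together, the 11-character length and the character set above suggest a simple format check for an extracted ID. This is only a sketch under those two assumptions from the linked article; the names `VID_PATTERN` and `is_valid_vid` are hypothetical, and the check says nothing about whether the video actually exists.

```python
import re

# Assumption from the article: an ID is exactly 11 characters of [0-9a-zA-Z_-]
VID_PATTERN = re.compile(r'[0-9A-Za-z_-]{11}')

def is_valid_vid(vid):
    """Return True if vid looks like a YouTube video ID (format check only)."""
    return VID_PATTERN.fullmatch(vid) is not None

print(is_valid_vid('k3nPaVj8-3w'))   # True
print(is_valid_vid('_t-i0KLiJBk'))   # True
print(is_valid_vid('k3nPaVj8-3w1'))  # False (12 characters)
```

If the ID length ever changed, only the `{11}` in the pattern would need to be adjusted.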
url_list = [
    'https://www.youtube.com/watch?v=k3nPaVj8-3w',
    'https://www.youtube.com/watch?v=2k-uF-QPcEM&t=5',
    'https://www.youtube.com/watch?v=5_Vy0ZtPo_w',
    'https://youtu.be/_t-i0KLiJBk',
    'https://youtu.be/tfIvsrRxaXg',
    'https://youtu.be/biaC_2Mx7Mw?t=283',
    'https://www.youtube.com/',
    'https://www.youtube.com/channel/UCDWM7dKT5vLXqSi_YljdlBw']

vid_list = pick_up_vid_list(url_list)
for vid in vid_list:
    print(vid)
Execution result:
error:
  The URL must start with "https://www.youtube.com/watch?" or
  "https://youtu.be/".
  - Item 7: https://www.youtube.com/
error:
  The URL must start with "https://www.youtube.com/watch?" or
  "https://youtu.be/".
  - Item 8: https://www.youtube.com/channel/UCDWM7dKT5vLXqSi_YljdlBw
k3nPaVj8-3w
2k-uF-QPcEM
5_Vy0ZtPo_w
_t-i0KLiJBk
tfIvsrRxaXg
biaC_2Mx7Mw
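One case the list above does not exercise is a watch URL that has no `v` parameter, where indexing the `parse_qs` result with `['v']` would raise a `KeyError`. As a hedged alternative, here is a sketch of a single-URL variant that returns `None` for anything it cannot handle. The function name `pick_up_vid` and the choice to compare the host and path instead of the URL prefix are my own assumptions, not the approach used above.

```python
import urllib.parse

def pick_up_vid(url):
    """Hypothetical single-URL variant: return the video ID, or None if unsupported."""
    parts = urllib.parse.urlparse(url)
    if parts.netloc == 'www.youtube.com' and parts.path == '/watch':
        # .get() returns None when "v" is missing instead of raising KeyError
        values = urllib.parse.parse_qs(parts.query).get('v')
        return values[0] if values else None
    if parts.netloc == 'youtu.be':
        # The path is "/<video id>"; query parameters such as t are not part of it
        vid = parts.path.lstrip('/')
        return vid or None
    return None

print(pick_up_vid('https://www.youtube.com/watch?v=2k-uF-QPcEM&t=5'))  # 2k-uF-QPcEM
print(pick_up_vid('https://youtu.be/biaC_2Mx7Mw?t=283'))               # biaC_2Mx7Mw
print(pick_up_vid('https://www.youtube.com/watch?t=5'))                # None
```

Returning `None` leaves it to the caller to decide whether to print an error or skip the URL.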
Python's standard library can parse query parameters out of the box, which is really convenient!
In JavaScript, I couldn't do this without using purl.js!
Well, of course you can implement it yourself, but... it's a hassle.