Introduction

I didn't need a large local copy of Wikipedia data, just a small amount of it, and that is when I came across the wikipedia API. This post is a record of what I did.

Environment

Operable OS (works on both Windows and Mac)
┗ macOS Catalina 10.15.7
┗ Windows 10

Python 3.8.3

Installation

This is all you need:

pip install wikipedia

Collect the summary part of wikipedia

When you enter a search word, the script automatically searches for articles related to that word. Run it like this:

**python3 wikipedia_data.py search word**

The result, that is, the summary of the first matching Wikipedia article, is appended to wikipedia.txt.

If your search word is ambiguous, you will get this error:

**wikipedia.exceptions.DisambiguationError: "search word" may refer to:**

The message is followed by a list of candidates, so searching again with one of those candidates will work.
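As a rough sketch of how a script could handle this error itself, the helper below catches DisambiguationError and returns the candidate titles instead of crashing. The name fetch_summary is my own, made up for illustration. The candidates are read from the exception's options attribute.

```python
import sys


def fetch_summary(title):
    """Return (summary, candidates); only one of the two is filled in."""
    # Imported here so the helper can be defined before the package is installed
    import wikipedia

    try:
        return wikipedia.summary(title), []
    except wikipedia.exceptions.DisambiguationError as e:
        # e.options holds the candidate titles listed in the error message
        return None, list(e.options)


if __name__ == "__main__" and len(sys.argv) > 1:
    summary, candidates = fetch_summary(sys.argv[1])
    if summary is None:
        print("Ambiguous search word. Try one of these:")
        for candidate in candidates:
            print("  " + candidate)
    else:
        print(summary)
```

Returning the candidates as a list lets the caller decide what to do next, for example retrying with the first candidate.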

In rare cases a long error may appear. Since this is a web API, it is probably a communication error caused by something outside the script. So if you get an error other than the one above, just try again and it will usually succeed.
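If retrying by hand gets tedious, something like the following could do it automatically. This is only a minimal sketch: retry, attempts, delay and fatal are names I made up, not part of the wikipedia library, and it assumes the failure is temporary.

```python
import time


def retry(func, attempts=3, delay=1.0, fatal=()):
    """Call func() until it succeeds, at most `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except fatal:
            # Errors listed in `fatal` will not fix themselves, so give up at once
            raise
        except Exception:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            time.sleep(delay)  # wait a little before trying again
```

It could be used as retry(lambda: wikipedia.summary(title), fatal=(wikipedia.exceptions.DisambiguationError,)), so that an ambiguous search word is reported immediately while other errors are retried.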

`wikipedia_data.py`


import sys
import wikipedia

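# Set the language to Japanese ("ja" is the language code of Japanese Wikipedia)
wikipedia.set_lang("ja")

# Take the search word from the command line
word = sys.argv[1]
# Search for article titles that match the search word
words = wikipedia.search(word)

# Open the text file in append mode (it is closed automatically)
with open('wikipedia.txt', 'a', encoding='utf-8') as f:
    if not words:
        print("No match")
    else:
        # Get the summary of the first search hit
        line = str(wikipedia.summary(words[0]))
        f.write(line.rstrip())
        print("success!")
    # Mark the end of this record
    f.write("\n" + "endline" + "\n")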
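Because each run appends a summary followed by a line containing only endline, the file can be split back into individual summaries later. Here is a minimal sketch, assuming the file was written by the script above. split_records is a hypothetical helper name of mine.

```python
from pathlib import Path


def split_records(text, marker="endline"):
    """Split the contents of wikipedia.txt into a list of summaries."""
    records, current = [], []
    for line in text.splitlines():
        if line == marker:
            record = "\n".join(current).strip()
            if record:  # runs that printed "No match" leave an empty record
                records.append(record)
            current = []
        else:
            current.append(line)
    # Text after the last marker is treated as incomplete and dropped
    return records


if __name__ == "__main__" and Path("wikipedia.txt").exists():
    text = Path("wikipedia.txt").read_text(encoding="utf-8")
    for i, record in enumerate(split_records(text), start=1):
        print(i, record[:40])
```

One caveat: a summary that happens to contain a line consisting only of endline would be split in two, so a rarer marker may be safer.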

How to use the wikipedia API

Official English tutorial ↓ https://wikipedia.readthedocs.io/en/latest/code.html

A link alone isn't very helpful, so I've briefly extracted and summarized the parts I think I'll use. (I think knowing these is enough, but I've left out a lot, so if you want to master the library, please read the tutorial yourself.)

Method	Overview
wikipedia.search("search word", results=10)	Returns a list of up to 10 search results for the search word
wikipedia.summary("search word", sentences=0)	Gets the article summary for the search word (sentences=0 returns the whole summary)
wikipedia.page("search word")	Gets the entire article for the search word as an object. Adding .content to the returned object gives the entire article as text
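To show how the three calls fit together, here is a small sketch. The function overview and the dictionary keys it returns are my own choices for illustration. The defaults (Japanese, 3 results, 2 sentences) are arbitrary too.

```python
import sys


def overview(word, lang="ja", results=3, sentences=2):
    """Search for `word` and collect a few facts about the first hit."""
    # Imported here so the helper can be defined before the package is installed
    import wikipedia

    wikipedia.set_lang(lang)
    # search() returns a list of article titles
    titles = wikipedia.search(word, results=results)
    if not titles:
        return None
    title = titles[0]
    # summary() returns text; `sentences` limits how much of it comes back
    summary = wikipedia.summary(title, sentences=sentences)
    # page() returns an object; .content is the full text, .url the address
    page = wikipedia.page(title)
    return {
        "titles": titles,
        "summary": summary,
        "url": page.url,
        "length": len(page.content),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    print(overview(sys.argv[1]))
```

Note that summary() and page() can still raise DisambiguationError for an ambiguous title, so real code would want to catch that.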

At the end

Thank you for reading this time as well. It is easy to get Wikipedia data in bulk, but if you only want a few dozen items, this method may be a good fit. If anyone knows a better way, let me know in the comments. I write these articles as topics come up, so I don't know what the next one will be, but I will write something again. Well then.

[Python] I tried collecting data using the API of wikipedia