I tried Language Processing 100 Knock 2020. Links to the other chapters can be found here, and the source code can be found here.
Get a string in which the characters of the string "stressed" are arranged in reverse (from the end to the beginning).
000.py
text = 'stressed'  # avoid shadowing the built-in str
print(text[::-1])
# -> desserts
The string is reversed with a slice. It is nice that this kind of operation can be written so easily.
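For comparison, here is a small sketch of the same reversal written two ways. The variable names are my own choice and are not part of the original code.

```python
text = "stressed"

# A slice with step -1 walks the string from the end to the beginning.
by_slice = text[::-1]

# reversed() yields the characters back to front; join() rebuilds a string.
by_reversed = "".join(reversed(text))

print(by_slice)     # desserts
print(by_reversed)  # desserts
```

The slice form is shorter, while the `reversed` form also works on iterables that do not support slicing.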
Take the 1st, 3rd, 5th, and 7th characters of the string "パタトクカシーー" and concatenate them. (The result, "パトカー", means "police car".)
001.py
text = "パタトクカシーー"
print(text[0:8:2])
# -> パトカー
Since the odd-numbered characters are taken, `step` is set to 2.
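As a rough illustration of how the `start:stop:step` form of a slice behaves, here is a sketch using an illustrative string that is not part of the exercise.

```python
letters = "abcdefgh"  # illustrative string, not from the exercise

# start=0, stop=8, step=2 picks indices 0, 2, 4, 6,
# i.e. the 1st, 3rd, 5th, and 7th characters.
odd_positions = letters[0:8:2]

# Omitting start and stop gives the same result here,
# because the defaults cover the whole string.
same_result = letters[::2]

# Starting at index 1 picks the even-numbered characters instead.
even_positions = letters[1::2]

print(odd_positions, same_result, even_positions)  # aceg aceg bdfh
```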
Obtain the string "パタトクカシーー" by alternately concatenating the characters of "パトカー" (police car) and "タクシー" (taxi) from the beginning.
002.py
str1 = "パトカー"
str2 = "タクシー"
print(''.join([s1 + s2 for s1, s2 in zip(str1, str2)]))
# -> パタトクカシーー
At first I thought about looping with an index, but it seems that multiple sequences can be processed at once with the `zip` function.
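One detail worth knowing about `zip` is that it stops at the shorter input. The sketch below uses illustrative strings of different lengths (not from the exercise) and compares `zip` with `itertools.zip_longest` from the standard library.

```python
from itertools import zip_longest

left = "abc"     # illustrative inputs, not from the exercise
right = "12345"

# zip() pairs elements position by position and stops at the shorter input.
paired = "".join(a + b for a, b in zip(left, right))

# zip_longest() keeps going and fills the missing side with fillvalue.
padded = "".join(a + b for a, b in zip_longest(left, right, fillvalue=""))

print(paired)  # a1b2c3
print(padded)  # a1b2c345
```

In the exercise both strings have the same length, so plain `zip` is enough.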
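Break down the sentence "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics." into words, and create a list of the number of (alphabetic) characters in each word, in order of appearance.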
003.py
sentence = "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics."
print([len(item) for item in sentence.replace(',', "").replace('.', "").split(' ')])
# -> [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
I used a list comprehension. It is convenient because a new list can be built in very few lines.
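As an alternative to chaining `replace` calls, the punctuation can be skipped with a regular expression. This is only a sketch; treating a "word" as a run of ASCII letters is my assumption, not something the original code states.

```python
import re

sentence = ("Now I need a drink, alcoholic of course, "
            "after the heavy lectures involving quantum mechanics.")

# Each match is a run of ASCII letters, so commas and periods are never counted.
word_lengths = [len(word) for word in re.findall(r"[A-Za-z]+", sentence)]

print(word_lengths)
# [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
```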
Split the sentence "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can." into words. Take the first character of the 1st, 5th, 6th, 7th, 8th, 9th, 15th, 16th, and 19th words and the first two characters of the other words, and create an associative array (dictionary or map type) from each extracted string to the position of its word (counted from the beginning).
004.py
sentence = "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can."
words = sentence.split()
num = [1, 5, 6, 7, 8, 9, 15, 16, 19]
result = {}  # avoid shadowing the built-ins str and dict
for i, word in enumerate(words, start=1):
    if i == 12:
        result[word[:3:2]] = i  # special case so that 'Mg' is produced
    elif i in num:
        result[word[:1]] = i
    else:
        result[word[:2]] = i
print(result)
# -> {'H': 1, 'He': 2, 'Li': 3, 'Be': 4, 'B': 5, 'C': 6, 'N': 7, 'O': 8, 'F': 9, 'Ne': 10, 'Na': 11, 'Mg': 12, 'Al': 13, 'Si': 14, 'P': 15, 'S': 16, 'Cl': 17, 'Ar': 18, 'K': 19, 'Ca': 20}
The code feels a little long. If the rules are followed literally, the `Mg` entry comes out as `Mi`, which bothered me, so I handle that word separately with an `if` statement.
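For reference, the same mapping can be sketched as a dict comprehension without the special case. Note that this version follows the rules literally, so the 12th word yields `Mi` rather than `Mg`. The name `one_char_positions` is my own.

```python
sentence = ("Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations "
            "Might Also Sign Peace Security Clause. Arthur King Can.")
one_char_positions = {1, 5, 6, 7, 8, 9, 15, 16, 19}

# enumerate(..., start=1) yields 1-based word positions directly.
symbols = {
    (word[:1] if pos in one_char_positions else word[:2]): pos
    for pos, word in enumerate(sentence.split(), start=1)
}

print(symbols["Mi"])  # 12
print(len(symbols))   # 20
```

Using a set for the positions keeps the membership test cheap, although with only nine entries a list works just as well.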
No.05 n-gram
Create a function that builds n-grams from a given sequence (string, list, etc.). Use this function to get the word bi-grams and the character bi-grams of the sentence "I am an NLPer".
005.py
def n_gram(seq, n):
    return ["".join(seq[i: i + n]) for i in range(len(seq) - n + 1)]
sentence = "I am an NLPer"
print(f"Word bi-gram: {n_gram(sentence.split(), 2)}")
print(f"Character bi-gram: {n_gram(sentence, 2)}")
# -> Word bi-gram: ['Iam', 'aman', 'anNLPer']
#    Character bi-gram: ['I ', ' a', 'am', 'm ', ' a', 'an', 'n ', ' N', 'NL', 'LP', 'Pe', 'er']
`join` is used to concatenate the elements of each slice. Since the word bi-grams and the character bi-grams need the same processing, I made it a function, and I think it turned out well.
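One side effect of joining with an empty string is that word boundaries disappear in the word bi-grams (for example `'Iam'`). A possible variation, sketched here with a hypothetical helper name `n_gram_tuples`, keeps each n-gram as a tuple instead.

```python
def n_gram_tuples(seq, n):
    """Return the n-grams of seq as tuples (hypothetical helper, not from the original)."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

sentence = "I am an NLPer"

word_bigrams = n_gram_tuples(sentence.split(), 2)
char_bigrams = n_gram_tuples(sentence, 2)

print(word_bigrams)
# [('I', 'am'), ('am', 'an'), ('an', 'NLPer')]
print(char_bigrams[:3])
# [('I', ' '), (' ', 'a'), ('a', 'm')]
```

Tuples are hashable, so they can still be placed in a set when set operations on the n-grams are needed.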
Let X and Y be the sets of character bi-grams contained in "paraparaparadise" and "paragraph", respectively, and find the union, intersection, and difference of X and Y. In addition, find out whether the bi-gram 'se' is included in X and in Y.
006.py
str1 = "paraparaparadise"
str2 = "paragraph"
def n_gram(seq, n):
    return {"".join(seq[i: i + n]) for i in range(len(seq) - n + 1)}
X = n_gram(str1, 2)
Y = n_gram(str2, 2)
print(f"Union:{X | Y}")
print(f"Intersection:{X & Y}")
print(f"Difference set:{X - Y}")
se = {"se"}
print(f"Is se included in X? :{se <= X}")
print(f"Is se included in Y? :{se <= Y}")
# ->Union:{'ph', 'di', 'ar', 'gr', 'ad', 'is', 'se', 'ap', 'pa', 'ra', 'ag'}
#Intersection:{'ra', 'ap', 'ar', 'pa'}
#Difference set:{'is', 'di', 'se', 'ad'}
#Is se included in X? : True
#Is se included in Y? : False
The methods `union()`, `intersection()`, and `difference()` can also be used to compute the union, intersection, and difference.
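A short sketch of the method forms next to the operator forms. The two sets are written out by hand here so that the block is self-contained.

```python
X = {"pa", "ar", "ra", "ap", "ad", "di", "is", "se"}  # bi-grams of "paraparaparadise"
Y = {"pa", "ar", "ra", "ag", "gr", "ap", "ph"}        # bi-grams of "paragraph"

# Each method returns the same set as the corresponding operator.
print(X.union(Y) == X | Y)          # True
print(X.intersection(Y) == X & Y)   # True
print(X.difference(Y) == X - Y)     # True

# sorted() gives a stable order, since sets themselves are unordered.
only_in_x = sorted(X.difference(Y))
print(only_in_x)  # ['ad', 'di', 'is', 'se']

# For a single element, the `in` operator is enough.
print("se" in X, "se" in Y)  # True False
```

Unlike the operators, the methods also accept any iterable as an argument, not only sets.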
Implement a function that takes arguments x, y, z and returns the string "y at x o'clock is z". Then set x = 12, y = "temperature", z = 22.4 and check the result.
007.py
def template(x, y, z):
    return f"{y} at {x} o'clock is {z}"
print(template(12, "temperature", 22.4))
# -> temperature at 12 o'clock is 22.4
Nothing special here.
Implement a function cipher that converts each character of a given string according to the following specification: a lowercase letter is replaced with the character whose code is (219 - character code), and any other character is output as is. Use this function to encrypt and decrypt an English message.
008.py
def cipher(sentence):
    return "".join([chr(219 - ord(ch)) if ch.islower() else ch for ch in sentence])
sen = "FireWork"
print(cipher(sen))
print(cipher(cipher(sen)))
# -> FrivWlip
# FireWork
This appears to be the Atbash cipher. Applying the `cipher` function twice restores the original string.
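The reason applying it twice restores the text is that 219 is the sum of the character codes of 'a' and 'z', so each lowercase letter is mirrored within the alphabet. The sketch below restricts the conversion to ASCII a-z, which is my assumption; `str.islower()` is also true for non-ASCII lowercase letters.

```python
# 219 is the sum of the character codes of 'a' (97) and 'z' (122).
print(ord("a") + ord("z"))  # 219

def cipher(text):
    # Only ASCII lowercase letters are mirrored (assumption); others pass through.
    return "".join(chr(219 - ord(ch)) if "a" <= ch <= "z" else ch for ch in text)

mirrored = cipher("abcxyz")
restored = cipher(cipher("FireWork"))

print(mirrored)  # zyxcba
print(restored)  # FireWork
```

Since 219 - (219 - c) = c, the function is its own inverse, so no separate decryption function is needed.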
No.09 Typoglycemia
Create a program that, for a string of words separated by spaces, randomly rearranges the letters of each word while leaving its first and last letters in place. However, words with a length of 4 or less are not rearranged. Give it a suitable English sentence (for example, "I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind.") and check the result.
009.py
import random
sentence = "I couldn’t believe that I could actually understand what I was reading : the phenomenal power of the human mind."
new_sent = ""
for item in sentence.split():
    if len(item) > 4:
        new_item = []
        new_item.extend(item[0])
        new_item.extend(random.sample(item[1:-1], len(item) - 2))
        new_item.extend(item[-1])
        item = new_item
    new_sent += "".join(item) + " "
print(new_sent)
# -> I could’nt blveeie that I cuold atlculay utnresnadd what I was renadig : the pamohneenl pewor of the human mdin.
Besides `random.sample`, there is also `random.shuffle` for randomly rearranging the elements of a list. `shuffle` rearranges the original list in place, so I think the code could be made a little shorter with it.
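A possible version using `random.shuffle` might look like the sketch below. The function name `typoglycemia` and the shortened example sentence are my own choices, and the output differs on every run, so no expected output is shown.

```python
import random

def typoglycemia(sentence):
    """Shuffle the inner letters of every word longer than 4 characters."""
    result = []
    for word in sentence.split():
        if len(word) > 4:
            middle = list(word[1:-1])
            random.shuffle(middle)  # shuffles the list in place and returns None
            word = word[0] + "".join(middle) + word[-1]
        result.append(word)
    return " ".join(result)

print(typoglycemia("I could actually understand what I was reading"))
```

Because `shuffle` works in place, the middle letters have to be copied into a list first; strings are immutable and cannot be shuffled directly.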
[upura/nlp100v2020: Solving Language Processing 100 Knock 2020 with Python](https://github.com/upura/nlp100v2020)
Amateur language processing 100 knock summary