A record of solving the problems in the second half of Chapter 1.
05. n-gram
Create a function that creates an n-gram from a given sequence (string, list, etc.). Use this function to get the word bi-gram and the letter bi-gram from the sentence "I am an NLPer".
# -*- coding: utf-8 -*-
__author__ = 'todoroki'


def ngram(data, n):
    res = []
    # A sequence of length len(data) contains len(data) - n + 1 n-grams.
    for i in xrange(len(data) - n + 1):
        res.append(data[i:i + n])
    return res

string = 'I am an NLPer'
print u'Word bi-gram:'
print ngram(string.split(), 2)
print u'Character bi-gram:'
print ngram(string, 2)

#=>Word bi-gram:
#=> [['I', 'am'], ['am', 'an'], ['an', 'NLPer']]
#=>Character bi-gram:
#=> ['I ', ' a', 'am', 'm ', ' a', 'an', 'n ', ' N', 'NL', 'LP', 'Pe', 'er']
In the character bi-gram, each space counts as a character like any other.
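The slicing approach is not limited to bi-grams: a sequence of length L has L - n + 1 windows of width n. The following is a minimal Python 3 sketch of that idea (the listing above targets Python 2); the name `ngram_py3` is hypothetical and not part of the original post.

```python
def ngram_py3(data, n):
    """Return every length-n window of a sequence (str, list, tuple)."""
    # range() is empty when n > len(data), so short inputs yield [].
    return [data[i:i + n] for i in range(len(data) - n + 1)]


sentence = "I am an NLPer"
word_trigrams = ngram_py3(sentence.split(), 3)
char_trigrams = ngram_py3("NLPer", 3)

print(word_trigrams)  # [['I', 'am', 'an'], ['am', 'an', 'NLPer']]
print(char_trigrams)  # ['NLP', 'LPe', 'Per']
```

Because the function only relies on `len` and slicing, the same code handles strings and word lists without any special cases.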
06. Set
Find the sets of character bi-grams contained in "paraparaparadise" and "paragraph" as X and Y, respectively, and find the union, intersection, and difference of X and Y. In addition, check whether the bi-gram 'se' is included in X and in Y.
# -*- coding: utf-8 -*-
__author__ = 'todoroki'


def ngram(data, n):
    res = []
    # A sequence of length len(data) contains len(data) - n + 1 n-grams.
    for i in xrange(len(data) - n + 1):
        res.append(data[i:i + n])
    return res

string1 = 'paraparaparadise'
string2 = 'paragraph'
X = ngram(string1, 2)
Y = ngram(string2, 2)

print u"Union:"
print list(set(X).union(set(Y)))
print u"Intersection:"
print list(set(X).intersection(set(Y)))
print u"Difference (X - Y):"
print list(set(X).difference(set(Y)))
print u"Is 'se' included in X?"
print "se" in X
print u"Is 'se' included in Y?"
print "se" in Y

#=>Union:
#=> ['ad', 'ag', 'di', 'is', 'ap', 'pa', 'ra', 'ph', 'ar', 'se', 'gr']
#=>Intersection:
#=> ['ap', 'pa', 'ar', 'ra']
#=>Difference (X - Y):
#=> ['is', 'ad', 'se', 'di']
#=>Is 'se' included in X?
#=> True
#=>Is 'se' included in Y?
#=> False
The union, intersection, and difference of the bi-gram sets are obtained with the union, intersection, and difference methods of Python's set type.
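The same set operations can also be written with operators. The sketch below is in Python 3 and builds the bi-grams directly as sets; the helper name `char_bigrams` is hypothetical, and the results are sorted only to make the printed order deterministic.

```python
def char_bigrams(text):
    """Return the set of character bi-grams in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


X = char_bigrams("paraparaparadise")
Y = char_bigrams("paragraph")

print(sorted(X | Y))  # union
print(sorted(X & Y))  # intersection: ['ap', 'ar', 'pa', 'ra']
print(sorted(X - Y))  # in X only:    ['ad', 'di', 'is', 'se']
print(sorted(Y - X))  # in Y only:    ['ag', 'gr', 'ph']
print("se" in X, "se" in Y)  # True False
```

Note that the difference is not symmetric: `X - Y` and `Y - X` give different results, so it is worth stating which one is meant.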
07. Template-based sentence generation
Implement a function that takes arguments x, y, z and returns the string "y at x o'clock is z". Then set x = 12, y = "temperature", z = 22.4, and check the execution result.
# -*- coding: utf-8 -*-
__author__ = 'todoroki'


def func(x, y, z):
    # Fill the template with %-formatting: y first, then the hour x, then z.
    return u"The %s at %s:00 is %s" % (y, x, z)

x = 12
y = u"temperature"
z = 22.4
print func(x, y, z)

#=>The temperature at 12:00 is 22.4
The template is filled in with the % string-formatting operator.
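For comparison, the same sentence can be produced with `str.format` or an f-string. This is a Python 3 sketch (f-strings need Python 3.6 or later); the function names `describe` and `describe_f` are hypothetical, and the English wording of the template is taken from the output shown above.

```python
def describe(x, y, z):
    """Fill the template with str.format (positional placeholders)."""
    return "The {} at {}:00 is {}".format(y, x, z)


def describe_f(x, y, z):
    """Fill the same template with an f-string."""
    return f"The {y} at {x}:00 is {z}"


print(describe(12, "temperature", 22.4))    # The temperature at 12:00 is 22.4
print(describe_f(12, "temperature", 22.4))  # The temperature at 12:00 is 22.4
```

Both variants convert the arguments with `str()`, just as `%s` does, so the integer and the float need no explicit conversion.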
08. Cipher text
Implement the function cipher that converts each character of the given string according to the following specification. Replace each lowercase letter with the character whose code is (219 - character code). Output all other characters as they are. Use this function to encrypt and decrypt an English message.
# -*- coding: utf-8 -*-
__author__ = 'todoroki'


def cipher(data):
    res = ""
    for s in data:
        if s.islower():
            # 219 == ord('a') + ord('z'), so this mirrors a<->z, b<->y, ...
            res += chr(219 - ord(s))
        else:
            res += s
    return res

string = "re1"
print u'Encryption:'
print cipher(string)
print u'Decryption:'
print cipher(cipher(string))

#=>Encryption:
#=> iv1
#=>Decryption:
#=> re1
Whether a character is lowercase is determined with the islower method. No separate decryption function is needed, because applying the cipher a second time restores the original text.
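The reason the cipher undoes itself is that 219 = ord('a') + ord('z'), so 219 - code mirrors the lowercase alphabet (a becomes z, b becomes y, and so on), and mirroring twice returns the original letter. The Python 3 sketch below expresses this as a translation table; the names `MIRROR_TABLE` and `cipher_py3` are hypothetical, and restricting the mapping to ASCII a-z is an assumption about the intended input.

```python
import string

# Map each ASCII lowercase letter to its mirror image: a<->z, b<->y, ...
MIRROR_TABLE = str.maketrans(string.ascii_lowercase,
                             string.ascii_lowercase[::-1])


def cipher_py3(text):
    """Mirror ASCII lowercase letters; leave every other character alone."""
    return text.translate(MIRROR_TABLE)


message = "Hello, NLP 100!"
encrypted = cipher_py3(message)
print(encrypted)              # Hvool, NLP 100!
print(cipher_py3(encrypted))  # Hello, NLP 100!
```

One design difference is worth noting: `islower` is also true for non-ASCII lowercase letters such as "é", for which 219 - code would be negative, whereas the table-based version simply leaves such characters unchanged.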
09. Typoglycemia
Create a program that, for a string of words separated by spaces, keeps the first and last letters of each word and randomly rearranges the letters in between. Words of 4 letters or fewer are not rearranged. Give it an appropriate English sentence (for example, "I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind.") and check the execution result.
# -*- coding: utf-8 -*-
__author__ = 'todoroki'

import random


def typoglycemia(data):
    res = []
    for d in data.split():
        if len(d) > 4:
            pre = d[0]
            suf = d[-1]
            word = list(d[1:-1])
            random.shuffle(word)
            res.append(pre + "".join(word) + suf)
        else:
            res.append(d)
    return " ".join(res)

sentence = "I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind ."
print typoglycemia(sentence)

#=> I cuodn'lt beevile that I cuold alluacty usdtanrend what I was reidang : the pmeenhnaol peowr of the huamn mind .
The inner letters of each word are shuffled in place with the random.shuffle function. Because the shuffle is random, the output differs on every run.
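When a repeatable result is useful, for example in a test, the shuffle can draw from a seeded generator instead of the global one. This is a minimal Python 3 sketch under that assumption; the name `typoglycemia_seeded` and its `seed` parameter are hypothetical additions, not part of the original program.

```python
import random


def typoglycemia_seeded(text, seed=None):
    """Shuffle the inner letters of words longer than 4 characters."""
    # A private generator keeps the global random state untouched.
    rng = random.Random(seed)
    words = []
    for word in text.split():
        if len(word) > 4:
            middle = list(word[1:-1])
            rng.shuffle(middle)
            word = word[0] + "".join(middle) + word[-1]
        words.append(word)
    return " ".join(words)


sentence = "I couldn't believe that I could actually understand what I was reading"
first = typoglycemia_seeded(sentence, seed=0)
second = typoglycemia_seeded(sentence, seed=0)
print(first)
print(first == second)  # True: the same seed gives the same shuffle
```

With `seed=None` the generator is seeded from a system source, so the output again varies from run to run as in the original.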