An amateur's 100 language processing knocks: 06

This is a record of my attempt at the Language processing 100 knock 2015. The environment is Ubuntu 16.04 LTS + Python 3.5.2 :: Anaconda 4.1.1 (64-bit). A list of past knocks is available here (http://qiita.com/segavvy/items/fb50ba8097d59475f760).

Chapter 1: Warm-up

06. Set

Let X and Y be the sets of character bi-grams contained in "paraparaparadise" and "paragraph", respectively. Find the union, intersection, and difference of X and Y. In addition, check whether the bi-gram 'se' is included in X and in Y.

The finished code:

main.py


# coding: utf-8


def n_gram(target, n):
	'''Create n-grams from the specified sequence.

	Arguments:
	target -- target sequence (a string or a list)
	n -- the n of n-gram (1 for uni-gram, 2 for bi-gram, ...)
	Return value:
	list of n-grams
	'''
	result = []
	for i in range(0, len(target) - n + 1):
		result.append(target[i:i + n])

	return result


# Create the sets
set_x = set(n_gram('paraparaparadise', 2))
print('X:' + str(set_x))
set_y = set(n_gram('paragraph', 2))
print('Y:' + str(set_y))

# Union
set_or = set_x | set_y
print('Union:' + str(set_or))

# Intersection
set_and = set_x & set_y
print('Intersection:' + str(set_and))

# Difference (elements of X that are not in Y)
set_sub = set_x - set_y
print('Difference set:' + str(set_sub))

# Is 'se' included?
print('se is included in X:' + str('se' in set_x))
print('se is included in Y:' + str('se' in set_y))

Execution result:

Terminal


X:{'ar', 'se', 'di', 'is', 'pa', 'ap', 'ad', 'ra'}
Y:{'ar', 'gr', 'ph', 'ra', 'pa', 'ap', 'ag'}
Union:{'ar', 'gr', 'se', 'ph', 'di', 'is', 'pa', 'ap', 'ad', 'ra', 'ag'}
Intersection:{'ar', 'ap', 'pa', 'ra'}
Difference set:{'di', 'is', 'se', 'ad'}
se is included in X:True
se is included in Y:False
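
Note that a Python set has no defined order, and string hashing is randomized per process by default, so the order of the elements printed above can change from run to run. If a stable display is wanted, one option is to sort before printing. The following is only a small sketch of that idea; the compact n_gram() here is a rewrite of the function in main.py for the sake of a self-contained example.

```python
def n_gram(target, n):
    """Return the list of n-grams of a sequence (same logic as main.py)."""
    return [target[i:i + n] for i in range(len(target) - n + 1)]


set_x = set(n_gram('paraparaparadise', 2))
set_y = set(n_gram('paragraph', 2))

# sorted() returns a list in alphabetical order,
# so the printed order no longer depends on the run
sorted_x = sorted(set_x)
sorted_y = sorted(set_y)
sorted_and = sorted(set_x & set_y)

print('X:' + str(sorted_x))
print('Y:' + str(sorted_y))
print('Intersection:' + str(sorted_and))
```

The result of sorted() is a list, not a set, so it is best used only for display and the set operations themselves should still be done on the sets.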

n_gram() is reused from the previous question.
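
Because n_gram() only slices its argument, it works on a list of words as well as on a string. One thing to watch for if word bi-grams are put into a set: each slice of a list is itself a list, which is not hashable, so it has to be converted to a tuple first. The sketch below illustrates this; the sample sentence is only an assumption for the example, and the compact n_gram() is a rewrite of the function in main.py.

```python
def n_gram(target, n):
    """Return the list of n-grams of a sequence (same logic as main.py)."""
    return [target[i:i + n] for i in range(len(target) - n + 1)]


# Sample sentence chosen only for illustration
words = 'I am an NLPer'.split()

# Word bi-grams: each element is a list of two words
word_bigrams = n_gram(words, 2)
print(word_bigrams)

# Lists cannot be set elements, so convert each bi-gram to a tuple
set_words = set(tuple(gram) for gram in word_bigrams)
print(('am', 'an') in set_words)
```

Character bi-grams do not need this step because slicing a string yields strings, which are hashable.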

In the answers of more experienced people, set.union(), set.intersection(), and [set.difference()](http://docs.python.jp/3/library/stdtypes.html#set.difference) are often used. These are said to be more readable, but since I am not good at English, |, &, and - feel more intuitive to me, so I used them here.
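
For comparison, here is a small sketch of the method forms next to the operator forms. It only checks that both give the same sets for this question; the variable names other than set_x and set_y are made up for the example, and the compact n_gram() is a rewrite of the function in main.py.

```python
def n_gram(target, n):
    """Return the list of n-grams of a sequence (same logic as main.py)."""
    return [target[i:i + n] for i in range(len(target) - n + 1)]


set_x = set(n_gram('paraparaparadise', 2))
set_y = set(n_gram('paragraph', 2))

# Each method returns the same set as the corresponding operator
same_union = set_x.union(set_y) == (set_x | set_y)
same_intersection = set_x.intersection(set_y) == (set_x & set_y)
same_difference = set_x.difference(set_y) == (set_x - set_y)
print(same_union, same_intersection, same_difference)

# The methods also accept any iterable, such as the raw bi-gram list,
# whereas the operators require both operands to be sets
from_list = set_x.intersection(n_gram('paragraph', 2))
print(from_list == (set_x & set_y))
```

So apart from readability, one practical difference is that the methods can take a list directly, while `set_x & n_gram('paragraph', 2)` would raise a TypeError.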

Even so, I did not expect paraparaparadise to show up in a place like this ^^

That's all for the 7th knock. If you find any mistakes, I would appreciate it if you could point them out.
