Sometimes I want to compare two lists of strings and retrieve the common elements as a list. I can do it well enough, but I wondered what the best way to write it myself would be, so I thought it over.
Suppose you have two lists like this, tag_list and src_list, and you want to retrieve common elements as a list.
tag_list = ['igarashi', 'kubo', 'iguchi']
src_list = ['taniguchi', 'matsushita', 'koyama', 'asama',
            'marui', 'igarashi', 'kubo', 'kondo']
tag_list has three elements. igarashi and kubo are also in src_list, but iguchi is not, so the expected value is ['igarashi', 'kubo'].
matched_list = []
for tag in tag_list:
    for src in src_list:
        if tag == src:
            matched_list.append(tag)
The first thing that came to my mind was, of course, this. It's easy to understand, but the nesting is deep, which puts me off a little.
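One way to flatten the nesting is to let the `in` operator do the inner loop's work. This is only a minimal sketch using the same sample lists as above, not the code from the repository:

```python
tag_list = ['igarashi', 'kubo', 'iguchi']
src_list = ['taniguchi', 'matsushita', 'koyama', 'asama',
            'marui', 'igarashi', 'kubo', 'kondo']

# Keep each tag that also appears in src_list; `in` replaces the inner loop.
matched_list = [tag for tag in tag_list if tag in src_list]
print(matched_list)  # ['igarashi', 'kubo']
```

Note that this differs slightly from the nested loop when src_list contains duplicates: the nested loop appends a tag once per match, while this version appends it once per occurrence in tag_list.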
matched_list = []
for tag in tag_list:
    # Use s rather than str so the built-in str is not shadowed
    matched_list += filter(lambda s: s == tag, src_list)
I wanted to use the list-processing functions filter(), map(), and reduce(), so I gave it my best shot. Perhaps this is intuitive for people today who are used to languages with plenty of array manipulation functions?
src_set = set(src_list)
tag_set = set(tag_list)
matched_list = list(src_set & tag_set)
When I googled, something like this came up right away. In a sense, it's intuitive. The point to watch is that set types have no order, so the order of the resulting list is not guaranteed. I learned that sets can be used in cases like this.
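If the order of the result matters, one option is to sort the intersection by each element's position in tag_list. The following is a sketch under that assumption, reusing the sample lists from above:

```python
tag_list = ['igarashi', 'kubo', 'iguchi']
src_list = ['taniguchi', 'matsushita', 'koyama', 'asama',
            'marui', 'igarashi', 'kubo', 'kondo']

# The intersection itself is a set, so its iteration order is not defined.
common = set(src_list) & set(tag_list)

# Sort by position in tag_list to get a stable, predictable order.
matched_list = sorted(common, key=tag_list.index)
print(matched_list)  # ['igarashi', 'kubo']
```

Since tag_list.index scans the list on every call, this is fine for short lists but may be slow for very long ones.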
Ease of understanding, readability, conciseness, how Pythonic it feels: I thought each approach has its advantages and disadvantages. Are there still more ways to write it? I was also curious whether there is a difference in processing speed. Next time, I will create a large sample data set and measure it.
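As a rough idea of how such a measurement could look, here is a sketch using the standard timeit module. The function names, the random_names helper, and the data sizes are all hypothetical choices for illustration, not the author's benchmark:

```python
import random
import string
import timeit


def by_nested_loop(tag_list, src_list):
    matched = []
    for tag in tag_list:
        for src in src_list:
            if tag == src:
                matched.append(tag)
    return matched


def by_filter(tag_list, src_list):
    matched = []
    for tag in tag_list:
        matched += filter(lambda s: s == tag, src_list)
    return matched


def by_set(tag_list, src_list):
    return list(set(src_list) & set(tag_list))


def random_names(count, length=6, seed=0):
    # Hypothetical helper: builds reproducible random lowercase strings.
    rng = random.Random(seed)
    return [''.join(rng.choices(string.ascii_lowercase, k=length))
            for _ in range(count)]


# Assumed sizes; increase them for a more meaningful measurement.
src_list = random_names(1000, seed=1)
tag_list = src_list[:50] + random_names(50, seed=2)

for func in (by_nested_loop, by_filter, by_set):
    seconds = timeit.timeit(lambda: func(tag_list, src_list), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

The actual numbers depend on the machine and the data, so no results are shown here.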
cmp_list.py https://github.com/yamao2253/qiita