4. Writing Structured Programs

4.1 Back to the Basics

Assignment

Assignment is one of the most important concepts in programming, but it has some surprising aspects.

>>> foo = 'Monty' 
>>> bar = foo   # (1)
>>> foo = 'Python'  # (2)
>>> bar
'Monty'

In the code above, bar = foo at (1) assigns the value of foo (the string 'Monty') to bar. In other words, bar is a copy of foo, so overwriting foo with the new string 'Python' at (2) does not affect the value of bar.

Assignment copies the value of an expression, but that value is not always what you expect. In particular, the value of a structured object such as a list is really just a reference to the object. In the following example, (1) assigns the reference held by foo to the variable bar. When line (2) changes the contents of the list through foo, the value seen through bar changes as well.

>>> foo = ['Monty', 'Python']
>>> bar = foo  # (1)
>>> foo[1] = 'Bodkin'  # (2)
>>> bar
['Monty', 'Bodkin']

[Figure: list-memory]

bar = foo does not copy the contents of the variable, only its **object reference**. Here foo holds a reference to an object stored at, say, position 3133 in memory. When you assign bar = foo, only the object reference 3133 is copied, so updating the list through foo changes what bar sees as well.
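To make the difference between copying a reference and copying the contents concrete, here is a minimal sketch. The names alias, shallow, and deep are my own for illustration; slicing with foo[:] and copy.deepcopy() are standard Python ways to get a new list object.

```python
import copy

foo = ['Monty', 'Python']
alias = foo                 # copies only the reference: same object, second name
shallow = foo[:]            # new list object holding the same elements
deep = copy.deepcopy(foo)   # new list; nested objects (if any) are copied too

foo[1] = 'Bodkin'           # modify the list through foo

print(alias)                # follows the change, because it is the same object
print(shallow)              # unaffected, because it is a separate list
print(alias is foo, shallow is foo, deep is foo)
```

If you actually want an independent list, one of the two copying forms is the thing to reach for; plain assignment never creates one.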

Let's create a variable empty that holds an empty list and use it three times in the next line for further experimentation.

>>> empty = []
>>> nested = [empty, empty, empty]
>>> nested
[[], [], []]
>>> nested[1].append('Python')
>>> nested
[['Python'], ['Python'], ['Python']]

Changing one of the items in the list changed all of them. Each of the three elements is actually just a reference to one and the same list in memory.

Note that if you assign a new value to one of the elements in the list, the change is not reflected in the other elements.

>>> nested = [[]] * 3
>>> nested[1].append('Python')
>>> nested[1] = ['Monty']
>>> nested
[['Python'], ['Monty'], ['Python']]

The third line changed one of the three object references in the list. However, the ['Python'] object itself hasn't changed, and it is still referenced from the other two positions. This is the difference between modifying an object through an object reference and overwriting the object reference itself.
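If the goal is three separate empty lists rather than three references to one list, a list comprehension is a common way to get them. This is a small sketch of that alternative; the names shared and independent are hypothetical.

```python
shared = [[]] * 3                       # three references to ONE list object
independent = [[] for _ in range(3)]    # three distinct list objects

shared[1].append('Python')
independent[1].append('Python')

print(shared)        # every position shows the change
print(independent)   # only position 1 shows the change
```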

Equality

Python has two ways to check whether a pair of items is the same. The == operator tests whether the values are equal, and the is operator tests object identity. The latter can be used to verify the earlier observations about objects.

>>> size = 5
>>> python = ['Python']
>>> snake_nest = [python] * size
>>> snake_nest[0] == snake_nest[1] == snake_nest[2] == snake_nest[3] == snake_nest[4]
True
>>> snake_nest[0] is snake_nest[1] is snake_nest[2] is snake_nest[3] is snake_nest[4]
True

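The code above creates a list containing multiple references to the same object and shows that the elements are not only equal according to ==, but also one and the same object. Next, replace one element at a random position with a new object that has the same value.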

>>> import random
>>> position = random.choice(range(size))
>>> snake_nest[position] = ['Python']
>>> snake_nest
[['Python'], ['Python'], ['Python'], ['Python'], ['Python']]
>>> snake_nest[0] == snake_nest[1] == snake_nest[2] == snake_nest[3] == snake_nest[4]
True
>>> snake_nest[0] is snake_nest[1] is snake_nest[2] is snake_nest[3] is snake_nest[4]
False

The id() function makes the difference easier to spot.

>>> [id(snake) for snake in snake_nest]
[513528, 533168, 513528, 513528, 513528]

This tells us that, in this run, the second item in the list has a different identifier from the others.
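The same check can be done without reading the raw id() numbers by eye. The sketch below fixes the replaced position at 2 instead of choosing it randomly so that the result is reproducible; the names distinct_objects, all_equal, and odd_positions are my own.

```python
python = ['Python']
snake_nest = [python] * 5
snake_nest[2] = ['Python']   # equal value, but a brand-new object

# Number of different objects in the list, judged by identity
distinct_objects = len({id(snake) for snake in snake_nest})

# Value equality still holds for every element
all_equal = all(snake == snake_nest[0] for snake in snake_nest)

# Positions whose object is not the one stored at position 0
odd_positions = [i for i, snake in enumerate(snake_nest)
                 if snake is not snake_nest[0]]

print(distinct_objects, all_equal, odd_positions)
```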

Conditionals

In the condition of an if statement, a non-empty string or list evaluates to true, and an empty string or list evaluates to false.

>>> mixed = ['cat', '', ['dog'], []]
>>> for element in mixed:
...     if element:
...         print(element)
...
cat
['dog']

In other words, there is no need to write if len(element) > 0: in the condition.
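As a quick check that the two conditions really select the same elements for strings and lists, here is a small sketch. The names by_truthiness, by_length, and flags are hypothetical.

```python
mixed = ['cat', '', ['dog'], []]

# Filter using the element itself as the condition
by_truthiness = [element for element in mixed if element]

# Filter using an explicit length test
by_length = [element for element in mixed if len(element) > 0]

# bool() shows the truth value of each element directly
flags = [bool(element) for element in mixed]

print(by_truthiness == by_length, flags)
```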

>>> animals = ['cat', 'dog']
>>> if 'cat' in animals:
...     print(1)
... elif 'dog' in animals:
...     print(2)
...
1

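Because the if condition is satisfied, Python never evaluates the elif clause, so 2 is never printed. If elif were replaced with if, both 1 and 2 would be output. An elif clause therefore potentially carries more information than a bare if clause: when it evaluates to true, it tells us not only that its own condition holds, but also that the condition of the main if clause did not.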
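The two variants can be compared side by side. In this sketch the functions with_elif and with_two_ifs are hypothetical helpers that collect the numbers instead of printing them.

```python
def with_elif(animals):
    """Collect results using if / elif: at most one branch runs."""
    hits = []
    if 'cat' in animals:
        hits.append(1)
    elif 'dog' in animals:
        hits.append(2)
    return hits


def with_two_ifs(animals):
    """Collect results using two independent if statements."""
    hits = []
    if 'cat' in animals:
        hits.append(1)
    if 'dog' in animals:
        hits.append(2)
    return hits


print(with_elif(['cat', 'dog']))     # only the first branch runs
print(with_two_ifs(['cat', 'dog']))  # both tests run
print(with_elif(['dog']))            # elif runs because the if test failed
```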

4.2 Sequences

The kind of sequence introduced here is called a tuple. It is formed with the comma operator and is typically enclosed in parentheses. As with strings, you can access any part of a tuple by indexing.

>>> t = 'walk', 'fem', 3
>>> t
('walk', 'fem', 3)
>>> t[0]
'walk'
>>> t[1:]
('fem', 3)
>>> len(t)
3

Here I compare strings, lists, and tuples directly by trying indexing, slicing, and length operations on each type.

>>> raw = 'I turned off the spectroroute'
>>> text = ['I', 'turned', 'off', 'the', 'spectroroute']
>>> pair = (6, 'turned')
>>> raw[2], text[3], pair[1]
('t', 'the', 'turned')
>>> raw[-3:], text[-3:], pair[-3:]
('ute', ['off', 'the', 'spectroroute'], (6, 'turned'))
>>> len(raw), len(text), len(pair)
(29, 5, 2)

Operating on Sequence Types

There are various ways to iterate over a sequence.

Use reversed(sorted(set(s))) to iterate over the unique elements of s in reverse sorted order. You can use random.shuffle(s) to randomize the contents of the list s before iterating.
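A short sketch of both idioms on a small list of my own choosing. Note that random.shuffle() works in place and returns None, so this example shuffles a copy to keep the original order available.

```python
import random

s = ['red', 'lorry', 'yellow', 'lorry', 'red']

# Unique elements, sorted, then reversed
unique_desc = list(reversed(sorted(set(s))))
print(unique_desc)

# Shuffle a copy in place; the original list s is left untouched
shuffled = list(s)
random.shuffle(shuffled)
for word in shuffled:
    print(word)
```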

A FreqDist can be converted into a sequence, for example with sorted(), and it also supports iteration.

>>> import nltk
>>> from nltk import word_tokenize
>>> raw = 'Red lorry, yellow lorry, red lorry, yellow lorry.'
>>> text = word_tokenize(raw)
>>> fdist = nltk.FreqDist(text)
>>> sorted(fdist)
[',', '.', 'Red', 'lorry', 'red', 'yellow']
>>> for key in fdist:
...     print(key + ':', fdist[key], end='; ')
...
lorry: 4; red: 1; .: 1; ,: 3; Red: 1; yellow: 2

You can rearrange the contents of a list using tuples.

>>> words = ['I', 'turned', 'off', 'the', 'spectroroute']
>>> words[2], words[3], words[4] = words[3], words[4], words[2]
>>> words
['I', 'turned', 'the', 'spectroroute', 'off']
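For comparison, here is a sketch of the same rotation done the traditional way with a temporary variable, next to the tuple version. The variable names by_tuple, by_tmp, and tmp are my own.

```python
words = ['I', 'turned', 'off', 'the', 'spectroroute']

# Tuple assignment: the right-hand side is evaluated first, then unpacked
by_tuple = list(words)
by_tuple[2], by_tuple[3], by_tuple[4] = by_tuple[3], by_tuple[4], by_tuple[2]

# Equivalent rotation with an explicit temporary variable
by_tmp = list(words)
tmp = by_tmp[2]
by_tmp[2] = by_tmp[3]
by_tmp[3] = by_tmp[4]
by_tmp[4] = tmp

print(by_tuple == by_tmp)

# The simplest case: swapping two names
a, b = 'Monty', 'Python'
a, b = b, a
print(a, b)
```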

zip() takes two or more sequences and "zips" their items together into a single sequence of tuples. In Python 3 it returns an iterator, so wrap it in list() to see the contents. Given a sequence s, enumerate(s) yields pairs consisting of an index and the item at that index.

>>> words = ['I', 'turned', 'off', 'the', 'spectroroute']
>>> tags = ['noun', 'verb', 'prep', 'det', 'noun']
>>> zip(words, tags)
<zip object at ...>
>>> list(zip(words, tags))
[('I', 'noun'), ('turned', 'verb'), ('off', 'prep'),
('the', 'det'), ('spectroroute', 'noun')]
>>> list(enumerate(words))
[(0, 'I'), (1, 'turned'), (2, 'off'), (3, 'the'), (4, 'spectroroute')]
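In practice zip() and enumerate() are usually consumed directly in a for loop rather than converted to lists. The sketch below combines them; the lines list and the short example at the end are my own additions.

```python
words = ['I', 'turned', 'off', 'the', 'spectroroute']
tags = ['noun', 'verb', 'prep', 'det', 'noun']

# Walk over position, word, and tag at the same time
lines = []
for i, (word, tag) in enumerate(zip(words, tags)):
    lines.append(f'{i}: {word}/{tag}')
print('\n'.join(lines))

# zip() stops as soon as the shortest input is exhausted
truncated = list(zip([1, 2, 3], 'ab'))
print(truncated)
```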

Combining Different Sequence Types

>>> words = 'I turned off the spectroroute'.split()  # (1)
>>> wordlens = [(len(word), word) for word in words]  # (2)
>>> wordlens.sort()  # (3)
>>> ' '.join(w for (_, w) in wordlens)  # (4)
'I off the turned spectroroute'

In (1), a string is actually an object with methods such as split() defined on it. In (2), a list comprehension builds a list of tuples. Each tuple consists of a number (the word length) and a word, e.g. (3, 'the'). In (3), the sort() method sorts the list in place. Finally, in (4), the length information is discarded and the words are joined back into a single string.
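The same result can also be sketched with sorted() and a key function, which avoids building the intermediate tuples. This is an alternative to the approach above, not a replacement for it, and the two differ on ties: the tuple version breaks ties alphabetically, while key=len keeps the original order. The names keyed, tied, decorated, and by_key are hypothetical.

```python
words = 'I turned off the spectroroute'.split()

# Sort by length directly; no (length, word) tuples needed
keyed = ' '.join(sorted(words, key=len))
print(keyed)

# Tie behaviour differs between the two approaches
tied = ['the', 'off']
decorated = [w for (_, w) in sorted((len(w), w) for w in tied)]
by_key = sorted(tied, key=len)
print(decorated, by_key)
```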

**In Python, lists are mutable and tuples are immutable. That is, you can modify a list, but not a tuple.**
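A minimal sketch of what that means in practice: item assignment works on a list and raises TypeError on a tuple. The variable names here are my own.

```python
words_list = ['I', 'turned', 'off']
words_tuple = ('I', 'turned', 'off')

words_list[0] = 'You'          # fine: lists can be modified in place

error_message = None
try:
    words_tuple[0] = 'You'     # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)

print(words_list)
print(words_tuple)             # unchanged
print(error_message)
```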

Generator Expressions

>>> text = '''"When I use a word," Humpty Dumpty said in rather a scornful tone,
... "it means just what I choose it to mean - neither more nor less."'''
>>> [w.lower() for w in word_tokenize(text)]
['``', 'when', 'i', 'use', 'a', 'word', ',', "''", 'humpty', 'dumpty', 'said', ...]

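Suppose I want to process these words further. In (1) below, the list comprehension builds the whole list in memory before max() sees it. In (2), the generator expression hands the words to max() one at a time without storing the list.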

>>> max([w.lower() for w in word_tokenize(text)])  # (1)
'word'
>>> max(w.lower() for w in word_tokenize(text))  # (2)
'word'
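To try the two forms without NLTK installed, here is a sketch that tokenizes with plain str.split(), which is my own simplification and not equivalent to word_tokenize(). It also shows one practical difference: a generator can only be consumed once.

```python
text = 'it means just what I choose it to mean'
words = text.split()   # simplification: whitespace split, not word_tokenize

from_list = max([w.lower() for w in words])   # builds a full list first
from_gen = max(w.lower() for w in words)      # streams items to max()
print(from_list, from_gen)

# A generator is exhausted after one pass
gen = (w.lower() for w in words)
first_pass = list(gen)
second_pass = list(gen)
print(len(first_pass), len(second_pass))
```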