This is the day 14 article of NIFTY Advent Calendar 2016. Yesterday's article was @ntoofu's "Manage Ansible configuration information in graph DB".
Hello. In my daily work, I build, operate, and maintain the network infrastructure of Nifty Cloud. This time, I would like to write a little about automating the operation of the physical network devices, which are not visible from the service side of Nifty Cloud.
When operating network devices, events like the following often occur:
- The device rebooted
- An interface on the device went down
- A module failed
- Something is wrong, but it is not clear what
In these cases you want to check the status of the device promptly: you need to log in to the device (while it is still accessible) and quickly run show commands to secure the information.
The commands to run at such a time are standardized to some extent. (The following lists commands of this kind without regard to a specific device or OS.)
show running-config
show interfaces
show logging
show inventory
show modules
show tech-support
...and so on. For example, on network devices / OSs such as Cisco IOS, you can check the configuration and status by entering commands like the ones above.
Especially in a highly urgent situation such as a failure, I would like routine work to involve as few human hands as possible so that time can be devoted to the investigation.
In this article, in order to get the output of the above commands quickly, I will describe how I implemented a Python script that automatically logs in to a device, enters commands, and captures the output.
In general, I think operations on network devices can be divided into three types.
- Checking settings: entering commands that check the device status and settings, such as the show commands mentioned above.
- Changing settings: entering commands that change the device status or settings, such as config mode, commit, and write.
- Status monitoring: monitoring traffic volume via SNMP, collecting Syslog, and so on.
This article focuses on checking settings.
As stated in the articles listed in the reference section, the general-purpose way to operate network devices is the CLI. Traffic volume and the like can be obtained via SNMP, but the kinds of information available are limited. An API may be implemented, but it varies from vendor to vendor.
Every device is premised on CLI operation; that is, connecting over ssh or telnet and then operating the device is the general-purpose method for network devices.
In other words, automating these operations means having a script perform the operations that a human normally performs on the CLI, as shown below.
$ telnet 192.168.0.2
Trying 192.168.0.2...
Connected to 192.168.0.2 (192.168.0.2).
Escape character is '^]'.
User Access Verification
Username: root
Password:
(hostname)#show run
...output
(hostname)#
I will now explain the contents of the script.
It uses the pexpect module in Python 3 to drive the CLI in place of a human, in the style of expect.
First, use pexpect.spawn() to launch a child process. The resulting object is named child here, and all subsequent processing is performed on child. The argument of spawn is the command line of the process to start, which here is telnet.
By assigning the file object of a log file to child.logfile_read, you can write the output of the launched child process to that file. Alternatively, you can assign standard output to this attribute to see the output on the console; because spawn() without an encoding argument handles bytes, in Python 3 this is sys.stdout.buffer rather than sys.stdout.
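import pexpect
from datetime import datetime
def login(ipaddr, passwd):
    # Start telnet as a child process controlled by pexpect.
    child = pexpect.spawn("telnet " + ipaddr)
    # "%s" (epoch seconds) is a platform-specific strftime extension.
    logname = "./log/" + "log_" + ipaddr + \
        "_" + datetime.now().strftime("%s") + ".log"
    # Binary mode, because pexpect hands bytes to the log file.
    wb = open(logname, 'wb')
    child.logfile_read = wb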
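If you want the session both in the log file and on the console, one option is a small object that forwards writes to several streams. The sketch below is a minimal illustration, assuming only that pexpect calls write() and flush() on whatever is assigned to logfile_read; the class name TeeWriter is hypothetical and not part of pexpect.

```python
import io


class TeeWriter:
    """Forward every write to several binary streams (hypothetical helper)."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, data):
        # Pass the same bytes to every underlying stream.
        for stream in self.streams:
            stream.write(data)
        return len(data)

    def flush(self):
        for stream in self.streams:
            stream.flush()


# Stand-alone demonstration with in-memory streams instead of a real session.
log_a, log_b = io.BytesIO(), io.BytesIO()
tee = TeeWriter(log_a, log_b)
tee.write(b"(hostname)#show run\r\n")
tee.flush()
```

With a real session this could be used as child.logfile_read = TeeWriter(wb, sys.stdout.buffer), where sys.stdout.buffer is the binary stream behind sys.stdout.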
Processing with expect, not only with pexpect, is a repetition of the following two steps:
1. Send a command with sendline()
2. Wait with expect() for the expected character string to appear in the output
In pexpect, the strings to wait for can be given as a list; here the list is defined as follows according to the strings expected from the devices.
expect_list = [u"#",
               u">",
               u"\nlogin: ",
               u"Username: ",
               u"Password: ",
               u"Connection closed by foreign host.",
               u"Login incorrect"]
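Because the login logic branches on the position of the matched string, the positions can be given names so that the branches do not rely on bare numbers. The following is a sketch; the Prompt enum and its member names are introduced here for illustration and are not part of the original script.

```python
from enum import IntEnum


class Prompt(IntEnum):
    """Positions in EXPECT_LIST (names are illustrative assumptions)."""
    PRIVILEGED = 0   # "#"
    USER = 1         # ">"
    LOGIN = 2        # "\nlogin: "
    USERNAME = 3     # "Username: "
    PASSWORD = 4     # "Password: "
    CLOSED = 5       # connection closed by the remote side
    INCORRECT = 6    # "Login incorrect"


EXPECT_LIST = [
    "#",
    ">",
    "\nlogin: ",
    "Username: ",
    "Password: ",
    "Connection closed by foreign host.",
    "Login incorrect",
]

# An IntEnum member compares equal to the plain int that expect() returns.
matched = 3
print(Prompt(matched).name)  # → USERNAME
```

A branch can then be written as index == Prompt.PASSWORD instead of index == 4, which keeps the list and the branches easier to keep in sync.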
When pexpect.spawn("telnet " + ipaddr) has been executed, telnet has connected to the device and the device is waiting for the next input. In other words, if this were done manually, the screen would look like this.
$ telnet 192.168.0.2
Trying 192.168.0.2...
Connected to 192.168.0.2 (192.168.0.2).
Escape character is '^]'.
User Access Verification
Username: [Cursor]
Now run expect as shown below.
index = child.expect(expect_list)
At this point, the element "Username: " (index 3 of the list, that is, the fourth element) matches the end of the output from the device. When a list is passed to expect(), the return value is the index of the matching element, so the variable index now holds 3. If no string matches, expect() keeps waiting for more output from the device. In that case the call does not return until it times out, at which point pexpect raises a TIMEOUT exception; catching that exception makes it possible to implement error handling.
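The timeout handling mentioned above can be sketched as follows. pexpect accepts pexpect.TIMEOUT as an entry in the pattern list, in which case expect() returns that entry's index instead of raising. To keep the sketch runnable without a device, the marker is passed in as a parameter and a stand-in object plays the role of child; expect_or_none and FakeChild are hypothetical names.

```python
def expect_or_none(child, patterns, timeout_marker, seconds=10):
    """Return the index of the matched pattern, or None on timeout.

    timeout_marker is expected to be pexpect.TIMEOUT when used with a real
    pexpect child (an assumption of this sketch).
    """
    candidates = list(patterns) + [timeout_marker]
    index = child.expect(candidates, timeout=seconds)
    # The marker was appended last, so its index equals len(patterns).
    return None if index == len(patterns) else index


class FakeChild:
    """Stand-in for a pexpect child that always reports a fixed index."""

    def __init__(self, index):
        self.index = index

    def expect(self, patterns, timeout=-1):
        return self.index


patterns = ["#", ">", "Username: "]
TIMEOUT = object()  # placeholder for pexpect.TIMEOUT in this demo
print(expect_or_none(FakeChild(2), patterns, TIMEOUT))  # → 2
print(expect_or_none(FakeChild(3), patterns, TIMEOUT))  # → None
```

Returning None lets the caller decide whether to retry, log the failure, or give up on the device.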
In the current implementation, the following processing is performed according to the value of index. Depending on the device and OS, you may be asked for a username, or you may be asked for a password right away. Even after logging in successfully, you may need to enter the password again if the device requires you to run enable. Because there is a limit to how far the handling can be written out device by device, I settled on the while loop below. For every device and OS you intend to log in to, you need to check how the login proceeds when done manually and design the loop so that the branching and processing proceed correctly.
    while True:
        if index == 0:  # "#" received: login succeeded.
            return child
        elif index == 1:  # ">" received: need to move to enable mode.
            child.sendline("enable")
            index = child.expect(expect_list)
        elif index == 2 or index == 3:  # username requested: send "root".
            child.sendline("root")
            index = child.expect(expect_list)
        elif index == 4:  # password requested.
            child.sendline(passwd)
            index = child.expect(expect_list)
        elif index == 5:  # connection was closed.
            print("Unmatched password, or connection is closed.")
            return -1
        elif index == 6:  # login was rejected.
            print("\nFault: incorrect password.")
            return -1
In the example so far, index holds 3, so execution enters the third branch of the if statement, and sendline() sends the string requested as the Username (root is just an example). At this point the output of the device, as it would appear when done manually, is as follows.
Username: root
Password: [Cursor]
This matches the "Password: " stored in expect_list, and the while loop proceeds to the next iteration. In the same way, once the password has been sent to the device and authentication succeeds, login is complete.
Here, login is considered complete when "#" is received. This assumes that login lands in privileged mode as shown below, but there is room for improvement: a "#" appearing anywhere else in the output (for example in a login banner) would also match, so it may malfunction depending on the OS.
Username: root
Password:
(hostname)#
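One possible improvement to the login loop is to drive it from a table and to cap the number of steps, so that a device that keeps repeating a prompt cannot keep the loop running forever. The following is a sketch under assumptions: the step limit of 10, the function name login_steps, and the FakeChild stand-in are all hypothetical, and the indices follow expect_list above.

```python
def login_steps(child, first_index, expect_list, user, passwd, max_steps=10):
    """Answer login prompts until the privileged prompt (index 0) appears.

    Returns True on success, False on failure or when max_steps is used up.
    """
    # index -> text to send; indices follow the article's expect_list.
    replies = {1: "enable", 2: user, 3: user, 4: passwd}
    index = first_index
    for _ in range(max_steps):
        if index == 0:            # "#": privileged prompt reached
            return True
        if index not in replies:  # 5 or 6: connection closed or login rejected
            return False
        child.sendline(replies[index])
        index = child.expect(expect_list)
    return index == 0


class FakeChild:
    """Stand-in that replays a scripted sequence of expect() results."""

    def __init__(self, indices):
        self.indices = list(indices)
        self.sent = []

    def sendline(self, text):
        self.sent.append(text)

    def expect(self, patterns):
        return self.indices.pop(0)


expect_list = ["#", ">", "\nlogin: ", "Username: ", "Password: ",
               "Connection closed by foreign host.", "Login incorrect"]
child = FakeChild([4, 0])  # after Username comes Password, then "#"
ok = login_steps(child, 3, expect_list, "root", "secret")
print(ok, child.sent)  # → True ['root', 'secret']
```

Keeping the replies in a dictionary means that supporting another prompt is a matter of adding an entry rather than another elif branch.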
After login, the device waits for the next command. This time, assuming confirmation (show) commands as input, I created the following function. commands is a list, and each element is assumed to be a command string such as "show interfaces". Note that no error handling is implemented here.
def exec_command(commands, child):
    # Wait for the privileged-mode prompt after every command.
    expect_list = u"#"
    for c in commands:
        child.sendline(c)
        child.expect(expect_list)
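Since exec_command has no error handling, one way to extend it is to collect the output of each command and flag commands the device rejected. The sketch below assumes that the text preceding the matched prompt is available as child.before (as it is in pexpect) and that a rejected command produces a line starting with "%", as Cisco IOS does for invalid input; the marker, the function name run_commands, and FakeChild are assumptions for illustration.

```python
def run_commands(child, commands, prompt="#", error_marker=b"%"):
    """Send each command and return {command: (output_bytes, ok)}."""
    results = {}
    for command in commands:
        child.sendline(command)
        child.expect(prompt)
        output = child.before  # bytes received before the matched prompt
        lines = output.splitlines()
        # Treat any line that starts with the marker as a rejected command.
        ok = not any(line.lstrip().startswith(error_marker) for line in lines)
        results[command] = (output, ok)
    return results


class FakeChild:
    """Stand-in for a pexpect child with canned replies per command."""

    def __init__(self, replies):
        self.replies = replies
        self.before = b""
        self._pending = b""

    def sendline(self, text):
        self._pending = self.replies.get(text, b"% Invalid input detected")

    def expect(self, pattern):
        self.before = self._pending
        return 0


child = FakeChild({"show inventory": b"show inventory\r\nNAME: chassis\r\n"})
results = run_commands(child, ["show inventory", "show foo"])
print(results["show inventory"][1], results["show foo"][1])  # → True False
```

Returning the output per command also makes it possible to save each result to its own file instead of relying only on the session log.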
When submitting a command that changes settings, it is necessary to check the settings beforehand and determine whether the device is in a state where the command can safely be entered. And after entering it, it is necessary to check whether the setting was applied as intended and whether any unexpected logs appeared. Implementing all of this by waiting for strings as described so far requires a great deal of patience.
Some fully standardized tasks have in fact been implemented this way, because their output is predictable. However, since those scripts contain settings that are quite close to the actual work, I will refrain from introducing them here.
By the way, everything output from login up to this point, including the results of the commands, has been written to the file set in child.logfile_read when the child process was started. This is only a demo implementation, but if the script is also made to supply the password by itself and is set up to be launched when a particular Syslog message is seen, it becomes possible, for example, to acquire the device settings and status immediately after a module error is reported.
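The Syslog-triggered collection described above could be wired up roughly as follows. This is only a sketch: the trigger patterns, the collect callback, and the function name handle_syslog are hypothetical, and a real deployment would receive the messages from whatever Syslog collector is in use.

```python
import re

# Hypothetical trigger patterns; real ones depend on the devices in use.
TRIGGERS = [
    re.compile(r"module .* fail", re.IGNORECASE),
    re.compile(r"reload|reboot", re.IGNORECASE),
]


def handle_syslog(source_ip, message, collect, triggers=TRIGGERS):
    """Call collect(source_ip) when message matches a trigger pattern.

    Returns True if collection was started, otherwise False.
    """
    if any(t.search(message) for t in triggers):
        collect(source_ip)
        return True
    return False


started = []
handle_syslog("192.168.0.2", "Module 3 has failed", started.append)
handle_syslog("192.168.0.3", "Interface Gi0/1 changed state to up", started.append)
print(started)  # → ['192.168.0.2']
```

In practice collect would be a function that calls login() and exec_command() for the given address.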
Some information requires maintenance data to be generated inside the device and then retrieved manually by FTP or similar. I would like to automate these as well on another occasion.
This time, I introduced the automation of CLI operation using pexpect. It is a rather unglamorous, brute-force approach, but on the other hand there are (probably) no network devices that cannot be operated over telnet, so it is also a method that can be used with any device if properly designed.
I would also like to talk, on another occasion, about vendor-specific interfaces such as REST APIs and Juniper's PyEZ.
Thank you very much.
Tomorrow's post is by @hitsumabushi.
- What you want from network equipment for operation automation
- Why network operation automation is not progressing