In the previous post, Introduction to Python's ast Module (Following the Abstract Syntax Tree), I showed how to walk an abstract syntax tree using the helper functions of the ast module.
The helper functions are one way to do it, but *ast.NodeVisitor* makes traversal easier and lets you do more. Looking at the implementation of *NodeVisitor* makes it clear that it does the same thing as the helper functions, so I will start from that implementation. *NodeVisitor* is an application of the design pattern known as the Visitor pattern.
3.4
class NodeVisitor(object):
    def visit(self, node):
        """Visit a node."""
        method = 'visit_' + node.__class__.__name__
        visitor = getattr(self, method, self.generic_visit)
        return visitor(node)

    def generic_visit(self, node):
        """Called if no explicit visitor function exists for a node."""
        for field, value in iter_fields(node):
            if isinstance(value, list):
                for item in value:
                    if isinstance(item, AST):
                        self.visit(item)
            elif isinstance(value, AST):
                self.visit(value)
If *visit_NodeClassname* is not defined, *generic_visit* is executed instead; it uses *ast.iter_fields* to walk the fields of the node. Every node class of the abstract syntax tree has *ast.AST* as its base class, so *isinstance(value, AST)* is used to decide whether a value is a node instance, and each node found is traversed recursively with *self.visit()*.
Let's actually use it. Define a class that inherits from *NodeVisitor*.
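To make the dispatch rule concrete, here is a minimal sketch that records which method handles each node. The class name `DispatchDemo` and its `log` attribute are my own illustrative names, not part of the ast module; the sketch only assumes that `visit` looks up `visit_<ClassName>` and falls back to `generic_visit`, as in the implementation above.

```python
import ast


class DispatchDemo(ast.NodeVisitor):
    """Hypothetical visitor that records how each node was dispatched."""

    def __init__(self):
        self.log = []

    def visit_FunctionDef(self, node):
        # Found by name lookup: 'visit_' + 'FunctionDef'
        self.log.append('visit_FunctionDef: ' + node.name)
        # Call the base implementation directly so the children are still walked
        super().generic_visit(node)

    def generic_visit(self, node):
        # Fallback used when no visit_<ClassName> method exists
        self.log.append('generic_visit: ' + type(node).__name__)
        super().generic_visit(node)


demo = DispatchDemo()
demo.visit(ast.parse("def hello():\n    pass\n"))
for entry in demo.log:
    print(entry)
```

Only the *FunctionDef* node goes through the dedicated method; *Module*, *arguments*, and *Pass* all fall through to *generic_visit*.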
3.4
>>> import ast
>>> source = """
... import sys
... def hello(s):
...     print('hello {}'.format(s))
... hello('world')
... """
>>> class PrintNodeVisitor(ast.NodeVisitor):
...     def visit(self, node):
...         print(node)
...         return super().visit(node)
...
>>> tree = ast.parse(source)
>>> PrintNodeVisitor().visit(tree)
<_ast.Module object at 0x10bec7b38>
<_ast.Import object at 0x10bec7b70>
<_ast.alias object at 0x10bec7ba8>
<_ast.FunctionDef object at 0x10bec7c18>
<_ast.arguments object at 0x10bec7c50>
<_ast.arg object at 0x10bec7c88>
<_ast.Expr object at 0x10bec7d30>
<_ast.Call object at 0x10bec7d68>
<_ast.Name object at 0x10bec7da0>
<_ast.Load object at 0x10bebe0f0>
<_ast.Call object at 0x10bec7e10>
<_ast.Attribute object at 0x10bec7e48>
<_ast.Str object at 0x10bec7e80>
<_ast.Load object at 0x10bebe0f0>
<_ast.Name object at 0x10bec7eb8>
<_ast.Load object at 0x10bebe0f0>
<_ast.Expr object at 0x10bec7f28>
<_ast.Call object at 0x10bec7f60>
<_ast.Name object at 0x10bec7f98>
<_ast.Load object at 0x10bebe0f0>
<_ast.Str object at 0x10bec7fd0>
By following the abstract syntax tree, I could easily print every node. To hook a particular kind of node, define a method named *visit_NodeClassname*.
3.4
>>> class PrintExprNodeVisitor(ast.NodeVisitor):
...     def visit_Expr(self, node):
...         print('Expr is visited')
...         return node
...
>>> PrintExprNodeVisitor().visit(tree)
Expr is visited
Expr is visited
Comparing this with the output of *PrintNodeVisitor*, you can see that the *Expr* node is visited twice.
Let's start with a simple piece of source code.
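One thing to keep in mind is that a *visit_NodeClassname* method such as *visit_Expr* above replaces *generic_visit* for that node, so the children of the node are not visited unless the method calls *generic_visit* itself. The following is a small sketch of the difference; the two counter classes are hypothetical names I made up for illustration.

```python
import ast

source = "print('hello {}'.format(s))"


class ShallowCallCounter(ast.NodeVisitor):
    """Counts Call nodes but never descends into them."""

    def __init__(self):
        self.count = 0

    def visit_Call(self, node):
        self.count += 1  # children of this Call are not visited


class DeepCallCounter(ast.NodeVisitor):
    """Counts Call nodes and keeps walking into their children."""

    def __init__(self):
        self.count = 0

    def visit_Call(self, node):
        self.count += 1
        self.generic_visit(node)  # continue into func, args, keywords


tree = ast.parse(source)

shallow = ShallowCallCounter()
shallow.visit(tree)

deep = DeepCallCounter()
deep.visit(tree)

print(shallow.count, deep.count)
```

The shallow counter only sees the outer call to *print*, while the deep counter also reaches the nested call to *format*.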
3.4
>>> import ast
>>> source = """
... print(s)
... """
>>> s = 'hello world'
>>> code = compile(source, '<string>', 'exec')
>>> exec(code)
hello world
Use *ast.dump* to check what abstract syntax tree this source code is parsed into.
3.4
>>> tree = ast.parse(source)
>>> ast.dump(tree)
"Module(body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Name(id='s', ctx=Load())], keywords=[], starargs=None, kwargs=None))])"
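With this in front of us, let's think of a suitable example to try.
As an example, let's reverse the string that is output. There are probably many ways to do this, but I'll try replacing the call to *print* with another function. *ast.NodeTransformer*, a subclass of *NodeVisitor*, replaces each visited node with the value returned by its visitor method.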
3.4
>>> class ReversePrintNodeTransformer(ast.NodeTransformer):
...     def visit_Name(self, node):
...         if node.id == 'print':
...             name = ast.Name(id='reverse_print', ctx=ast.Load())
...             return ast.copy_location(name, node)
...         return node
...
>>> def reverse_print(s):
...     print(''.join(reversed(s)))
...
>>> code = compile(ReversePrintNodeTransformer().visit(tree), '<string>', 'exec')
>>> exec(code)
dlrow olleh
>>> s = 'revese print'
>>> exec(code)
tnirp esever
It worked. The call to *print* has been replaced by the *reverse_print* function, which is what gets executed.
*ast.copy_location* copies *lineno* and *col_offset* from the original node to the new one. An AST object cannot be passed to *compile* without these two attributes.
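To see what *copy_location* actually does, the sketch below compares the position of the original *Name* node with that of its replacement. The transformer name `RenamePrint` and the `captured` list are hypothetical, and I pass an explicit namespace to *exec* so that the example does not depend on global state.

```python
import ast


class RenamePrint(ast.NodeTransformer):
    """Hypothetical transformer: swap the name `print` for `reverse_print`."""

    def visit_Name(self, node):
        if node.id == 'print':
            new_node = ast.Name(id='reverse_print', ctx=node.ctx)
            # Copy the position attributes from the node being replaced
            return ast.copy_location(new_node, node)
        return node


tree = ast.parse("print(s)")
old_name = tree.body[0].value.func          # the original Name node
new_tree = RenamePrint().visit(tree)
new_name = new_tree.body[0].value.func      # the replacement Name node

captured = []


def reverse_print(s):
    # Collect the reversed string instead of printing it
    captured.append(s[::-1])


namespace = {'reverse_print': reverse_print, 's': 'hello world'}
exec(compile(new_tree, '<string>', 'exec'), namespace)

print(new_name.lineno, new_name.col_offset, captured)
```

The replacement node ends up at the same line and column as the node it replaced, which is why the transformed tree can be compiled.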
Let's try an example that fails.
3.4
>>> from ast import *
>>> expression_without_attr = dump(parse('1 + 1', mode='eval'))
>>> expression_without_attr
'Expression(body=BinOp(left=Num(n=1), op=Add(), right=Num(n=1)))'
>>> code = compile(eval(expression_without_attr), '<string>', 'eval')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: required field "lineno" missing from expr
Now pass *include_attributes=True* to *ast.dump* so that the attributes are output as well.
3.4
>>> expression_with_attr = dump(parse('1 + 1', mode='eval'), include_attributes=True)
>>> expression_with_attr
'Expression(body=BinOp(left=Num(n=1, lineno=1, col_offset=0), op=Add(), right=Num(n=1, lineno=1, col_offset=4), lineno=1, col_offset=0))'
>>> code = compile(eval(expression_with_attr), '<string>', 'eval')
>>> eval(code)
2
By including *lineno* and *col_offset* in the output of *ast.dump*, I was able to rebuild an AST object from that output with *eval* and compile it.
Another solution is to use *ast.fix_missing_locations*. Let's try it with the *expression_without_attr* from earlier.
3.4
>>> code = compile(fix_missing_locations(eval(expression_without_attr)), '<string>', 'eval')
>>> eval(code)
2
Now it can be passed to *compile*. According to the *fix_missing_locations* documentation:
"This is rather tedious to fill in for generated nodes, so this helper adds these attributes recursively where not already set, by setting them to the values of the parent node."
So the attributes are filled in automatically.
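The same helper is handy when a tree is built by hand rather than parsed. Here is a minimal sketch, assuming Python 3.8 or later, where literals are represented by *ast.Constant* instead of the *Num* node shown in the 3.4 output above.

```python
import ast

# Build the expression `1 + 1` by hand; no node has lineno or col_offset yet.
# Assumption: Python 3.8+, where ast.Constant represents literal values.
expr = ast.Expression(
    body=ast.BinOp(
        left=ast.Constant(value=1),
        op=ast.Add(),
        right=ast.Constant(value=1),
    )
)

# Fill in the missing position attributes, starting from line 1, column 0
fixed = ast.fix_missing_locations(expr)

code = compile(fixed, '<string>', 'eval')
result = eval(code)
print(result, fixed.body.lineno, fixed.body.col_offset)
```

*fix_missing_locations* modifies the tree in place and returns the same node, so the call can be nested directly inside *compile* as in the transcript above.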
It is a little hard to find a problem that actually calls for manipulating the abstract syntax tree, but I have introduced some of what I looked into here.
It may be worth remembering for when you need to treat Python code as data, that is, when something would be difficult to do any other way.