Design of CPython’s Compiler
Parse Tree to AST
The AST is generated from the parse tree (see Python/ast.c) using the function PyAST_FromNode().
The function begins a tree walk of the parse tree, creating AST nodes as it goes along. It allocates every new node it needs, calls the appropriate AST node creation functions for any supporting nodes, and connects them as needed.
Note that there is no automated or symbolic connection between the grammar specification and the nodes in the parse tree. Unlike in yacc, the parse tree provides no direct help.
For instance, one must keep track of which node in the parse tree one is working with (e.g., if you are working with an ‘if’ statement you need to watch out for the ‘:’ token to find the end of the conditional).
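The bookkeeping described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the pnode struct and the find_colon() and demo_if_colon_index() helpers are hypothetical names invented for this example, not CPython's real parse-tree node type or functions. The sketch scans the children of an 'if' statement node for the ':' token, which marks where the conditional ends and the body begins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified parse-tree node (not CPython's real node struct). */
typedef struct pnode {
    const char *text;          /* token text, or the rule name for inner nodes */
    int nchildren;             /* number of entries in children */
    struct pnode **children;   /* child nodes, in source order */
} pnode;

/* Return the index of the first ':' child of n, or -1 if there is none. */
int find_colon(const pnode *n)
{
    for (int i = 0; i < n->nchildren; i++) {
        if (strcmp(n->children[i]->text, ":") == 0) {
            return i;
        }
    }
    return -1;
}

/* Build the flat child list for "if x : pass" and locate the ':' token. */
int demo_if_colon_index(void)
{
    pnode kw = {"if", 0, NULL};
    pnode test = {"x", 0, NULL};
    pnode colon = {":", 0, NULL};
    pnode body = {"pass", 0, NULL};
    pnode *kids[] = {&kw, &test, &colon, &body};
    pnode if_stmt = {"if_stmt", 4, kids};

    int colon_at = find_colon(&if_stmt);
    /* At least one child (the condition) must sit between 'if' and ':'. */
    assert(colon_at > 1);
    /* The condition spans children[1 .. colon_at - 1]; the body follows ':'. */
    return colon_at;
}
```

Nothing in the node itself says which child is the condition and which is the body, so the walker has to derive that from token positions, which is the point the paragraph above makes.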
The functions called to generate AST nodes from the parse tree all have the name ast_for_xx where xx is the grammar rule that the function handles (alias_for_import_name is the exception to this). These in turn call the constructor functions as defined by the ASDL grammar and contained in Python/Python-ast.c (which was generated by Parser/asdl_c.py) to create the nodes of the AST. This all leads to a sequence of AST nodes stored in asdl_seq structs.
Functions and macros for creating and using asdl_seq * types as found in Python/asdl.c and Include/asdl.h are as follows:
_Py_asdl_seq_new(Py_ssize_t, PyArena *) - Allocate memory for an asdl_seq for the specified length
asdl_seq_GET(asdl_seq *, int) - Get item held at a specific position in an asdl_seq
asdl_seq_SET(asdl_seq *, int, stmt_ty) - Set a specific index in an asdl_seq to the specified value
asdl_seq_LEN(asdl_seq *) - Return the length of an asdl_seq
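To make the shape of this interface concrete, here is a minimal stand-in written in plain C. It is a sketch, not the code in Python/asdl.c: the demo_seq type, demo_seq_new(), and the DEMO_SEQ_* macros are hypothetical names, and the sketch assumes calloc() in place of a PyArena so that it compiles without the CPython headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for asdl_seq: a length plus an array of untyped slots. */
typedef struct {
    size_t size;
    void **elements;
} demo_seq;

/* Stand-in for _Py_asdl_seq_new(); assumes heap allocation, not an arena. */
demo_seq *demo_seq_new(size_t size)
{
    demo_seq *s = malloc(sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->size = size;
    /* Allocate at least one slot so a zero-length sequence is still valid. */
    s->elements = calloc(size ? size : 1, sizeof(void *));
    if (s->elements == NULL) {
        free(s);
        return NULL;
    }
    return s;
}

/* Stand-ins for asdl_seq_LEN, asdl_seq_GET and asdl_seq_SET. */
#define DEMO_SEQ_LEN(s) ((s) == NULL ? 0 : (s)->size)
#define DEMO_SEQ_GET(s, i) ((s)->elements[(i)])
#define DEMO_SEQ_SET(s, i, v) \
    (assert((size_t)(i) < (s)->size), (s)->elements[(i)] = (v))

/* Fill a two-element sequence, then read it back and sum the values. */
int demo_seq_sum(void)
{
    static int first = 10, second = 20;
    demo_seq *body = demo_seq_new(2);
    if (body == NULL) {
        return -1;
    }
    DEMO_SEQ_SET(body, 0, &first);
    DEMO_SEQ_SET(body, 1, &second);

    int total = 0;
    for (size_t i = 0; i < DEMO_SEQ_LEN(body); i++) {
        total += *(int *)DEMO_SEQ_GET(body, i);
    }
    free(body->elements);
    free(body);
    return total;
}
```

The sequence is sized once at creation and then filled slot by slot, which matches the allocate, set, get, and length operations listed above.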
If you are working with statements, you must also worry about keeping track of what line number generated the statement. Currently the line number is passed as the last parameter to each stmt_ty function.
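The convention of passing the line number last can be illustrated as follows. This is a hedged sketch only: demo_stmt, demo_Return(), and demo_lineno_roundtrip() are hypothetical names made up for this example, and the real constructors in Python/Python-ast.c take other parameters and allocate from an arena rather than with malloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical statement node that records where it came from. */
typedef struct demo_stmt {
    int kind;      /* which statement this is */
    int value;     /* payload, here a plain integer for simplicity */
    int lineno;    /* source line that produced the statement */
} demo_stmt;

enum { DEMO_RETURN_KIND = 1 };

/* Constructor sketch: the line number is the last parameter. */
demo_stmt *demo_Return(int value, int lineno)
{
    demo_stmt *s = malloc(sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->kind = DEMO_RETURN_KIND;
    s->value = value;
    s->lineno = lineno;
    return s;
}

/* Create a node for a given line and report the line number it stored. */
int demo_lineno_roundtrip(int lineno)
{
    demo_stmt *s = demo_Return(0, lineno);
    if (s == NULL) {
        return -1;
    }
    int stored = s->lineno;
    assert(s->kind == DEMO_RETURN_KIND);
    free(s);
    return stored;
}
```

Because the caller supplies the line number at construction time, the code walking the parse tree is responsible for reading it from the parse-tree node before building the statement.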