This stackoverflow answer [1] shows how to add an until
statement to Python. In this post I will show how to do that in python_terp and how to debug these modifications.
This will also serve as some of the missing documentation for python_terp.
The until
statement is analogous to the while
statement. Its block is executed until the condition is met. As in the stackoverflow answer, we expect
num = 3
until num == 0:
print(num)
num -= 1
to print
3
2
1
Start the interpreter and try this.
$ ipython -i test/python_repl.py
p>> simport simple_ast
p>> num = 3
p>> until num == 0:
... print(num)
... num =- 1
...
...
Not too surprisingly, with no modifications, the parser isn't able to handle the until statement. Let's press Control-D and drop to the host interpreter to make our changes.
We will introduce a new grammar rule until_stmt
similar to while_stmt
[2].
In [1]: grammar = python_grammar.full_definition + python_grammar.extra
In [2]: grammar += r"""
...: until_stmt = "until" {test} ":" {suite}
...: """
and add it to the list of compound statements
In [3]: grammar += r"""
...: compound_stmt = if_stmt | while_true_stmt=while_true | while_stmt
...: | until_stmt
...: | simple_for_stmt | for_stmt | try_stmt | with_stmt
...: | funcdef | classdef | decorated
...: """
and replace the parser with a new parser with the modified grammar
In [4]: pyterp.parser = python.Interpreter(i3.parse("grammar", grammar))
Now add the semantics by adding a function to simple_ast.py
mimicking while_stmt
.
def until_stmt(condition, block):
while_true:
if evaluate(condition):
return None
evaluate(block)
And that's it, we're done! Save simple_ast.py
and test it out.
In [5]: pyterp.repl()
p>> simple_ast.reload_module(simple_ast)
p>> num = 3
p>> until num == 0:
... print(num)
... num -= 1
...
3
2
1
Now, we won't always be so lucky and get everything right on the first try. Let's see some examples of debugging.
To help debug the parsing step, we can add breakpoints to the grammar with !(bp())
. For example
until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())
bp()
is just shorthand for pdb.set_trace()
which is imported by default. We could write any Python expression between !(...)
.
Let's also set .source
this time so we can get stack traces in the parser.
In [7]: grammar += r"""until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())"""
In [8]: pyterp.parser = python.Interpreter(i3.parse("grammar", grammar))
In [9]: pyterp.parser.source = grammar
Now if we read input from a string or file (reading from the p>>
prompt will break too often: once after each line). Instead we'll make a string with the test and parse that.
In [10]: s = """
....: until num == 0:
....: print(num)
....: num -= 1
....: """
In [11]: pyterp.parser.parse("grammar", s)
We hit the first breakpoint. Here, we can examine the input
--Return--
> <string>(1)<module>()->None
(Pdb) self.input
['\nuntil num == 0:\n print(num)\n num -= 1\n', 0]
(0
is the index in the input we are at) and the grammar matcher's stack to make sure we're at the right place.
(Pdb) self.st()
0 In file <grammar> line 162 function None (apply)
grammar = file_input
1 In file <grammar> line 7 function file_input (and)
file_input = (EMPTY_LINE | SAME_INDENT stmt)* ENDMARKER
2 In file <grammar> line 7 function file_input (quantified)
file_input = (EMPTY_LINE | SAME_INDENT stmt)* ENDMARKER
3 In file <grammar> line 7 function file_input (or)
file_input = (EMPTY_LINE | SAME_INDENT stmt)* ENDMARKER
4 In file <grammar> line 7 function file_input (and)
file_input = (EMPTY_LINE | SAME_INDENT stmt)* ENDMARKER
5 In file <grammar> line 7 function file_input (apply)
file_input = (EMPTY_LINE | SAME_INDENT stmt)* ENDMARKER
6 In file <grammar> line 21 function stmt (or)
stmt = compound_stmt | simple_stmt
7 In file <grammar> line 21 function stmt (apply)
stmt = compound_stmt | simple_stmt
8 In file <grammar> line 170 function compound_stmt (or)
compound_stmt = if_stmt | while_true_stmt=while_true | while_stmt
| until_stmt
| simple_for_stmt | for_stmt | try_stmt | with_stmt
| funcdef | classdef | decorated
9 In file <grammar> line 171 function compound_stmt (apply)
| until_stmt
10 In file <grammar> line 169 function until_stmt (and)
until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())
11 In file <grammar> line 169 function until_stmt (action)
until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())
Everything looks alright. Let's keep going.
(Pdb) c
--Return--
> <string>(1)<module>()->None
(Pdb) self.input
['\nuntil num == 0:\n print(num)\n num -= 1\n', 20]
(Pdb) self.st()
[...]
25 In file <grammar> line 169 function until_stmt (action)
until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())
We're now at the first breakpoint in the nested suite of the until
block. Continue until we're finally back and at the second breakpoint.
(Pdb) c
--Return--
> <string>(1)<module>()->None
(Pdb) c
--Return--
> <string>(1)<module>()->None
(Pdb) self.st()
[...]
11 In file <grammar> line 169 function until_stmt (action)
until_stmt = !(bp()) "until" {test} ":" {suite} !(bp())
We can look at the parent frame's cumulated outputs to see if it matches our expectation
(Pdb) self.stack[-2].outputs
[None, 'until', __binary__['==', NAME['num'], NUMBER[0]], ':',
suite[print_stmt[NAME['num']], aug_assign[NAME['num'],
operation['-='], NUMBER[1]]]]
Let's pretty print some of that to see better
(Pdb) self.stack[-2].outputs[2].pprint()
__binary__
str '=='
NAME
str 'num'
NUMBER
int 0
(Pdb) self.stack[-2].outputs[4].pprint()
suite
print_stmt
NAME
str 'num'
aug_assign
NAME
str 'num'
operation
str '-='
NUMBER
int 1
That looks like the condition and block's body, as expected. Now continue until parsing is complete.
For debugging the newly defined until_stmt
function, we can add breakpoints with bpoint
.
def until_stmt(condition, block):
while_true:
bpoint()
if evaluate(condition):
return None
evaluate(block)
This drops us in the pdb
debugger when bpoint()
. From there we can get a stack trace and other information
(Pdb) u
-> return func(*args_value)
(Pdb) self.stack[-1].scope.keys()
['__func_name__', '__caller__', '__break__', 'block',
'__parent__', '__return__', '__func__', '__continue__',
'condition']
(Pdb) self.st()
0 In file <console-3> line 0 function __main__ (until_stmt)
until num == 0:
print(num)
num -= 1
1 In file <console-3> line 0 function __main__ (__call__)
until num == 0:
print(num)
num -= 1
2 In file lib/simple_ast.py line 151 function until_stmt (while_true)
while_true:
bpoint()
if evaluate(condition):
return None
evaluate(block)
3 In file lib/simple_ast.py line 151 function until_stmt (suite)
while_true:
bpoint()
if evaluate(condition):
return None
evaluate(block)
4 In file lib/simple_ast.py line 152 function until_stmt (__call__)
bpoint()
5 In file lib/simple_ast.py line 152 function until_stmt (NAME)
bpoint()
(Pdb) p self.stack[1].scope['num']
3
We'll continue once we're done with examining values
(Pdb) c
We could modify the AST for later iterations. For example if we no longer want a breakpoint and instead want to print the value of the condition.
Examining the stack with self.st()
, we see that the bpoint()
call is the index 0 child of the root of self.stack[3]
.
(Pdb) self.stack[3].root.pprint()
suite
__call__
NAME
str 'bpoint'
arglist
single_if
__call__
NAME
str 'evaluate'
arglist
NAME
str 'condition'
return_stmt
NAME
str 'None'
__call__
NAME
str 'evaluate'
arglist
NAME
str 'block'
EMPTY_LINE
str '\n'
Let's set it to print(evaluate(condition))
to show the truth value of the condition.
(Pdb) self.stack[3].root[0] = Node("print_stmt",
[Node("__call__", [Node("NAME", ["evaluate"]),
Node("arglist", [Node("NAME", ["condition"])])])])
That's pretty tedious to type. Instead we can use the parser
(Pdb) self.stack[3].root[0] = self.parser.parse("grammar",
"print(evaluate(condition))\n")
There's actually a shortcut made for this:
(Pdb) self.stack[3].root[0] = self.ast("print(evaluate(condition))")
Let's check we made the change at the right place and then continue execution.
(Pdb) self.stack[3].root.pprint()
suite
print_stmt
__call__
NAME
str 'evaluate'
arglist
NAME
str 'condition'
single_if
__call__
NAME
str 'evaluate'
arglist
NAME
str 'condition'
return_stmt
NAME
str 'None'
__call__
NAME
str 'evaluate'
arglist
NAME
str 'block'
EMPTY_LINE
str '\n'
(Pdb) c
2
False
1
True
p>>
The loop stopped when the condition is true, as expected.
To execute the next AST node in python_terp
(rather than the host interpreter), we can enable the debugger after hitting the first breakpoint.
(Pdb) self.debug = True
(Pdb) c
(dbg)
This debugger is a bit buggy at the moment from lack of use and maintenance. It supports some of the usual commands such as
It can store multiple commands with the b
prefix. So b n n n s s
is the same as pressing n
three times (followed by enter each time) and then s
twice. Its intended to be used to come back to an earlier point from a breakpoint quickly (in cases where its not worth setting up the appropriate breakpoint at the destination instead).
Finally, a lone !
drops us in the python_terp REPL (while keeping the current call stack).
Since that stackoverflow answer, new features in Python 3.3+ allows more ways to modify CPython by using only Python. So to add until_stmt
, we can probably use token transformations into while_stmt
(and not
) although I haven't thought this one through.
[1] That answer is a mirror of this blog post by the same author.
[2] For simplicity, we'll add until_stmt
without an optional else block. Adding an else block would not be very difficult either.
Posted on Feb 4, 2017