Category: programming | /usr/sbin/blog

Bouncing Python's Generators With A Trampoline

2012-08-12T10:06:00-06:00

In this post, I'm going to combine two of my favorite topics: recursion and Python's generators. My aim is to address a real pain point in Python: its lack of tail call optimization. Let's start with a simple countdown function that counts from start to 0 using recursion:

``` python Recursive Countdown def countdown(start):

print start
if start == 0:
    return 0
else:
    return countdown(start - 1)

```

This function needs no explanation. It simply prints start and then calls itself with start - 1. The issue here is that for large values of start, we can easily blow Python's stack:

python Countdown Demo % countdown(1000) 1000 999 [...] 3 2 Traceback (most recent call last): File "", line 1, in File "", line 6, in countdown [...] RuntimeError: maximum recursion depth exceeded

Of course, one option is to just loop:

python For Loop Countdown print "\n".join([str(i) for i in xrange(1000,-1,-1)])

But I didn't write this post just to tell you that recursive functions can be translated into loops (duh). Instead, I'm going to use a technique called trampolining to convert tail-recursive functions into tail-recursive generators, which won't blow the stack.

Returning to the countdown example, let's think about why the recursive call is a problem in the first place. The issue is that in return countdown(start - 1), the call to countdown must complete before its value can be returned, thus increasing the call stack. "Well, of course," you say. "That's how recursive functions work." But wait a minute. Is that really necessary? What if, rather than countdown calling itself, it simply returned the next recursive call in the sequence for its caller to execute. Hey, I think we might be on to something:

``` python A Tail-Recursive Generator def countdown(start):

print start
if start == 0:
    yield 0
else:
    yield countdown(start - 1)

```

So now countdown is a generator which calls itself. That's kind of wacky. Let's call it and see what happens.

python Trying It Out % countdown(5)

That's interesting. We now have a generator. Think about that for a moment. What does this generator represent? Ah, that's right. It's the next recursive call in the sequence. It just hasn't been executed yet. Let's execute it:

python Sweet % g = countdown(5) % g = g.next() 5

We're getting somewhere.

python Manual Trampoline % g = countdown(5) % g = g.next() 5 % g = g.next() 4 % g = g.next() 3 % g = g.next() 2 % g = g.next() 1 % g = g.next() 0 % g = g.next() Traceback (most recent call last): File "", line 1, in AttributeError: 'int' object has no attribute 'next'

Well, that's really neat. Each generator represents one recursive call, and a generator's return value is the next recursive call in the sequence, except for the last call. What happened there? We reached the base case, and the last generator returned 0.

We now have a general pattern. Let's write a function which executes these tail-recursive generators automatically:

``` python Trampoline Function import types

def tramp(gen, *args, **kwargs):

g = gen(*args, **kwargs)
while isinstance(g, types.GeneratorType):
    g=g.next()
return g

```

So, a generator function is passed in, which is then used to create the generator g. g.next() is repeatedly called, and the next recursive call is stored back to g. We keep doing this until g no longer returns a generator. If g is not a generator, then we've reached the base case, and we return it (if the base case is a generator, you'll have to use something like exceptions to signal that you've reached the base case).

Note: It's about time that I explained where the word "trampoline" comes from. The idea is that each recursive call is "trampolined" off the trampoline runner (the tramp function in this case). That is, each recursive call is returned to tramp and is bounced off of it. I also like to think that each recursive call bounces higher (just like on a trampoline) because the "virtual" recursion depth increases, but it's very possible that I made that up in my head.

Let's try this with countdown, and a large start value:

``` python Trampoline Test % tramp(countdown, 1000) 1000 999 [...] 2 1 0 0 <--- Since we executed in the interactive interp,

     this is the return val of `tramp`.

```

Excellent! The original function blew the stack, but the trampolined generator does just fine. Furthermore, we've written a general trampoline runner to execute these trampolined generators. But, before we call it a day, we have to write a trampolined version of the Fibonacci function, otherwise, what's even the point of recursion?

``` python Trampolined Fibonacci def fib(count, cur=0, next_=1):

if count <= 1:
    yield cur
else:
    # Notice that this is the tail-recursive version
    # of fib.
    yield fib(count-1, next_, cur + next_)

```

Let's take this for a test drive:

``` python The First 10 Fib Numbers

Print the first 10 Fibonacci numbers.

for i in xrange(1,11):

print tramp(fib, i)

Output: 0 1 1 2 3 5 8 13 21 34 ```

Well, that's terrific. Not only is the tail-recursive version just as efficient as the for-loop version (except, of course, for function calling, and generator overhead), it also won't blow the stack.

So, this is yet another example of the power of Python's generators (Python's most under appreciated feature, as David Beazley argues). What are the takeaways?

This technique only works for tail-recursive functions. The recursive call must be the last thing the function does.
The translation to a trampolined generator is easy! Just turn the return statements into yield expression.
Although this will protect your stack, creating a generator for each call is probably slow.

If this was fun for you, then definitely check out David Beazley's presentation on coroutines and concurrency. He uses trampolining, generators, and coroutines to write an OS in Python, complete with a task scheduler, sys calls, and blocking and non-block IO. This man is a wizard.

Eli Bendersky also has a couple great articles on using generators for lexical scanning and state machines. Those posts inspired me to write a command line argument parser using coroutines, which can process a stream of characters and handles all the usual suspects, like quotes and escape characters. As you can tell, I've been getting quite a bit of mileage out of these techniques.

∞ Permalink

Permutations: An Interview Question

2012-05-05T13:41:00-06:00

Here I go over a coding question I was recently asked during an interview.

You can find the source here: https://gist.github.com/2604992

Notes

In some languages, prepending to a list is faster than appending. In Python it doesn't matter.
The weird characters in the Vim session are whitespace characters. $ is a newline character.

∞ Permalink

Look, Ma! No Loops: Reversing Word Order

2012-04-07T17:08:00-06:00

Here's a classic interview question:

Write a function that takes a string and reverses its word order. So, "Reverse this string" results in "string this Reverse".

Notice that the task isn't to reverse the string, but rather to reverse the order of the string's words. So the first word becomes the last. The second word becomes the second to last, and so on.

Whenever I come across puzzles like this, I shudder when I start to think about the nested loops and pointer arithmetic. Unfortunately, that describes the classic solution pretty well. The common wisdom is that first you should reverse the entire string. Then you should go back and reverse each word. So:

"Reverse this"
"siht esreveR"
"this Reverse"

The first step isn't bad, but reversing each word will require at least one nested loop, and some careful thinking about how to deal with the spaces. Instead, let's think through a functional solution that doesn't involve any loops. It will necessarily be recursive, so the first step will be to come up with each of the cases we'll encouter as we walk the string.

Recursive Case 1: We're in the midst of a word. Add the current letter to the end of some word accumulator.
Recursive Case 2: We're at the end of a word (a space). Add the current word accumulator to the beginning of some string accumulator, then clear the word accumulator.
Base Case: The string is empty. Return the word accumulator prepended to the string accumulator.

So, there are two accumulators. One grabs the letters of each word as we walk the string and keeps the letters in order (the word accumulator). Then, once an entire word has been read into the word accumulator, it's prepended to the string accumulator. Prepending, reverses the word order. Once it's prepended, the word accumulator is cleared in preparation for the next word. Let's translate this case by case.

The base case is the easiest. Let's start there:

```python Base Case if len(s) == 0:

return word_acc + str_acc

```

Start by testing if the input string, s, is empty. If it is, then we've reached the end of the string, so we return the contents of the word_acc, prepended to the str_acc. word_acc contains the last word in the string, and we're prepending it to the rest of the string that has already been reversed.

Now let's deal with the second recursive case:

```python 2nd Recursive Case elif s[0] == " ":

return helper(s[1:], "", " " + word_acc + str_acc)

```

If we've gotten this far, the input string must have some length, so we check if the first character is a space. If it is, we've reached the end of a word, so we need to prepend str_acc with the word that's just been read in (word_acc). We then clear out the word_acc in preparation for the next word, and recursively call the function with the rest of the input string, s. The recursive call is to helper because this will be nested inside a convenience wrapper that only takes one argument.

Finally, the last case:

```python 1st Recursive Case else:

return helper(s[1:], word_acc + s[0], str_acc)

```

If we've gotten this far, we're in the midst of a word (the string isn't empty, and the current character isn't a space). In this case, we add the current character onto the word_acc and then recurse with the rest of the string. Remember, the word_acc is building up each word character by character as we walk the input string.

Finally, we can simply throw these cases into a function, and it should just work. Ahh, the magic of recursion.

```python Reversing Word Order def rev_words(s):

'''Returns the string `s` with its words in the reverse order.
   Example: rev_words("Reverse this!") #"this! Reverse"
'''
def helper(s, word_acc, str_acc):
    # Base case: Return word accumulator plus whatever words
    # you have in the string acc
    if len(s) == 0:
        return word_acc + str_acc
    # This is the end of a word. Clear `word_acc`, and start with
    # the next word.
    elif s[0] == " ":
        return helper(s[1:], "", " " + word_acc + str_acc)
    # In the middle of a word. Add the letter to `word_acc`, and continue
    else:
        return helper(s[1:], word_acc + s[0], str_acc)
return helper(s, "", "")

```

As much as I like this solution, it's not without its drawbacks. The most significant is that Python's string methods are very slow. Each concatenation causes both strings to be copied, and at least one concat is happening for each character in the string. Possible solutions include turning the string into a list of characters (list methods are much faster), or using a language that better supports the functional paradigm.¹ But really, if speed is your highest priority, the imperative solution of swapping characters in place is your best bet, warts and all.

If you'd like to play with this code yourself, I've posted a gist here, along with some tests.

Notes

If Python had a constant time prepend function ('cons' in the functional world), I suspect a better solution would be possible.

∞ Permalink

Building Data Structures from Functions

2012-04-01T09:00:00-06:00

Here's a puzzle I've adapted from exercise 2.4 of SICP.

Suppose you are programming in a language that only supports function application. That is, defining functions and applying arguments to these functions are the only things this language supports. Using only these building blocks, could you construct a linked list?

Surprisingly, the answer is yes, and the exercise linked to above provides a partial solution. Below, I've translated that solution into Python, and completed the exercise:

```python Solution in Python

Create a pair from l and r

def cons(l, r):

return lambda get: get(l, r)

Given a pair, return the head (left) value.

def head(pair):

return pair(lambda l, r: l)

Give a pair, return the tail (right) value.

def tail(pair):

return pair(lambda l, r: r)

```

First, let's examine how these functions can be used, and then I'll explain how they work. Consider the snippet below:

python Usage Example l1 = cons(1, 2) print head(l1) # Prints 1 print tail(l1) # Prints 2 l2 = cons(0, l1) print head(l2) # Prints 0 print head(tail(l2)) # Prints 1 print tail(tail(l2)) # Prints 2

As can be seen, cons() is the constructor for this pair type. Give it two values, and it will create a pair of those two values. head() and tail() are the basic operators that let us access values inside these pairs; they return the left and right element of the pair, respectively. Also notice that we can create pairs of pairs. The last half of the example creates a pair composed of 0 and (1,2). Why is this significant? Well, we've just made a linked list! Linked lists are simply pairs of pairs. The list [1,2,3,4] can, for example, be represented as cons(1,cons(2,cons(3,cons(4,None)))). What's None doing at the end of the list? You can think of it like the NULL pointer at the end of a linked list in C. If a function were traversing the list, None would signify to the function that it has reached the end. Mathematically, you can think of a linked list as an inductively defined data structure, where None is the base case. None is referred to as the empty list.

Now for the interesting part. How do these functions work? First let's look at the cons() function:

```python cons()

Create a pair from l and r

def cons(l, r):

return lambda get: get(l, r)

```

This takes in two parameters (l and r) and returns a function.¹ This returned function takes yet another function, which it applies to l and r and returns the result. So, if we call cons(1,2), we are returned a function, which takes a function and applies that function to the arguments 1 and 2. If, for example, I called cons(1,2)(lambda x, y: x+y) I'd get 3, because the addition function would be applied to 1 and 2.

Now suppose that we didn't want to add l and r. Instead, we wanted to pull either l or r out of a pair that was already constructed. In other words, given list = cons(1,2), how could we pull the 1 or 2 out of the function stored in list? Well, all we need to do is pass it a function that returns only the left or right parameter. So, cons(1,2)(lambda l, r: l) would give us back 1, and cons(1,2)(lambda l,r: r) would give us back 2. This is almost exactly what the head() and tail() functions are doing! They take in a function (presumably produced by the cons() function), and apply either lambda l,r: l or lambda l,r: r. Just to cement this all together, below I step through the evaluation for the example head(cons(1,2)).

python Evaluation Steps for head(cons(1,2,)) head(cons(1,2)) -> head(lambda get: get(1, 2)) -> (lambda get: get(1, 2))(lambda l,r: l) -> (lambda l,r: l)(1,2) -> 1

And here is a bit of code that uses these new functions to add the elements of a list:

```python Sum Function def sum_list(l):

return head(l) if tail(l) == None else head(l) + sum_list(tail(l))

sum_list(cons(1,cons(2,cons(3,None)))) # Returns 6 ```

In the end, this might strike you as nothing more than a useless programming trick. In a sense that's right. I'd never use this in my own code. What makes this technique so valuable is that it actually fits into the broader context of lambda calculus, which is a mathematical abstraction of computation. In the language of lambda calculus, you're given only a very basic set of tools. Essentially all you have are simple one parameter functions (it doesn't even have integers!). Incredibly, it turns out that that's all you need for a Turing complete language. It's possible to build integers, conditionals, loops, arithmetic operations, and (as you've seen here) data structures out of these simple building blocks. As a next step, and to see this for yourself, it might be fun to take a minute and see how the code presented here could be modified to give you binary trees, or even graphs. After that, you could check out this incredible article on writing the infamous FizzBuzz exercise in Ruby using only functions.

Notes

Technically, this is really a closure, which is a function with some of its variables already set. In the terminology of the PL world, it's a function bundled with its environment (the set of variables in its outer scope).

∞ Permalink

Implementing a Logic Evaluator Without if-expressions or Boolean Operators

2012-01-28T10:19:00-07:00

I've slowly been making my way through An Overview of the Scala Programming Language [PDF], and was struck by this interesting implementation of Booleans:

``` scala abstract class Bool {

def && (x: => Bool): Bool
def || (x: => Bool): Bool

} object False extends Bool {

def && (x: => Bool): Bool = this
def || (x: => Bool): Bool = x

} object True extends Bool {

def && (x: => Bool): Bool = x
def || (x: => Bool): Bool = this

} ```

If you're not a Scala programmer, don't let the syntax trip you up. It's pretty straightforward. True and False are singleton objects that inherit from the abstract class Bool. Each object implements the && (and) and || (or) operators. As you'd expect, they each take a Bool as an argument and return a Bool. When you execute:

scala True && False

It applies True's && method to False, and correctly returns x, which is, in this case, False.

What's so interesting about this is the absence of any if-expressions or built-in boolean operators. To illustrate, a less clever, but more obvious implementation would be as follows:

``` scala def and(x: Boolean, y: Boolean): Boolean = {

if(x)
    if(y)
        true
    else
        false
else
    false

} ```

So, if x and y are both true, return true , otherwise return false (Scala returns the value of the last executed expression). It works, but doesn't win any style points. The nested if-expressions are especially ugly.¹

So, this is a neat way of cutting down on if-expressions. Can it be extended to the other operations? Yes:

``` scala object True extends Bool { def && (x: => Bool): Bool = x def || (x: => Bool): Bool = this def unary! : Bool = False def xor (x: => Bool): Bool = !x def if (x: => Bool): Bool = this def iff(x: => Bool): Bool = x }

object False extends Bool { def && (x: => Bool): Bool = this def || (x: => Bool): Bool = x def unary! : Bool = True def xor (x: => Bool): Bool = x def if (x: => Bool): Bool = !this def iff(x: => Bool): Bool = !x } ```

One thing I like about this implementation is the pleasing symmetry between True's methods and False's methods. The unary_! (not) method is trivially opposite. True returns False and vice versa, but the symmetry continues throughout xor, if_, and iff. For example, False's xor returns its parameter, while True's xor returns the opposite of its parameter. False xor True returns True, whereas True xor True returns False. I'll leave verifying the rest of these rules as an exercise to the reader.²

It's also interesting to compare this to the State Pattern, where a state machine is simulated through the use of objects that implement a common interface. Each state is an object, and transitioning between states is as easy as switching between the different objects. Without this pattern, the alternative would be a complex set of if-expressions dictating what you're allowed to do in a certain state, and which states you can switch to given your current state. The State Pattern essentially hides all this behind polymorphism, similar to what's happening in the logic evaluator.

Notes

You can cut it down to one if-expression if you're on your toes.
Note that if_ is backwards. True if_ False is the same as False -> True, because that's closer to the English.

∞ Permalink