18.2. Quick Tour

Python's interactive programming environment helps us to explore the language by example. I'm using the interactive shell which is part of IDLE, the IDE which comes with the Python installation. Starting it shows the following output that invites us to enter Python commands at the prompt >>>.

Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>>

18.2.1. Expressions and Variables

Let's answer this friendly invitation with the unavoidable "Hello World!".

>>> print "Hello World"
Hello World

This was easy (how many lines do you need in Java?). In fact, the interactive shell prints the value of every expression we enter, so the plain string "Hello World" is enough.

>>> "Hello World"
'Hello World'

Strings use single or double quotes as delimiters, or triple double quotes if the string contains newline characters.

>>> """Hello
World"""
'Hello\nWorld'

All these forms of string literals behave the same. What about some calculations?

>>> 4*5+2**3
28

The only thing one has to know here is that ** means "to the power of". Can we save the result for later use?

>>> x = 4*5+2**3
>>> x
28

Python's variables don't need to be declared in advance. A variable is automatically created the first time a value is assigned to it. Referring to an undefined variable creates an error.

>>> y
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in ?
    y
NameError: name 'y' is not defined

All variables are local to the current file or function they are defined in unless declared otherwise (see below).

If you ask why scripting languages are so much more productive than more "traditional" languages, you are often referred to the powerful built-in string processing and collections. We can compare strings, concatenate and repeat them, and extract individual characters or substrings with straight-forward operators.

>>> "blah" == "blub"
0
>>> "blah" < "blub"
1
>>> "blah " + "blub"
'blah blub'
>>> 3 * "blah "
'blah blah blah '
>>> "Hello"[2]
'l'
>>> "Hello"[-1]
'o'
>>> "Hello"[1:4]
'ell'

Note that Python's indexes always start at zero, and negative indexes count from the end. We extract substrings with the slice operator which contains the limits, separated by a colon, in square brackets. Python ranges always use the semantics of a right half-open interval, that is, including the lower bound and excluding the upper bound. Once you get used to it, this turns out to be a useful convention, since you can always use the length of a structure as the upper bound.

>>> x = "blah"
>>> print x, x[0:len(x)], x[0:], x[:len(x)], x[:]
blah blah blah blah blah
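
The half-open convention can be checked with a few assertions. The following is only a minimal sketch, written in syntax that current Python 3 accepts (the sessions in this chapter use Python 2.2); the names s, k, and middle are illustrative and not part of the original example.

```python
s = "blahblub"

# Splitting at any position k and gluing the halves together gives s back,
# because s[:k] excludes index k and s[k:] includes it.
pieces_ok = all(s[:k] + s[k:] == s for k in range(len(s) + 1))

# The length of a slice is simply upper bound minus lower bound.
middle = s[2:6]
middle_len = len(middle)
```

Because the upper bound is excluded, adjacent slices never overlap and never leave a gap.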

Besides these operators, there is a large number of functions and methods manipulating strings.

>>> len("blah")
4
>>> "blah".upper()
'BLAH'
>>> "blahblah".replace("ah", "ub")
'blubblub'
>>> "blah".center(10)
'   blah   '
>>> "blahblah".find("ah")
2
>>> "blahblah".count("a")
2

This is the first example using method calls. The syntax is identical to method calls in most popular object based languages (C++, Java, JavaScript). The interpretation, however, is different. Calling a method is a two step process. First, we find the field whose name is the method.

>>> "blah".find
<built-in method find of str object at 0x00824C98>

This field has to be a "callable object". Next, this object is "called" by passing the arguments in parentheses. Therefore, the parentheses of the method call are significant and can't be omitted (like, e.g., in Perl and Ruby). They make the difference between retrieving the method and actually calling it. We can perform the two steps explicitly using an intermediate variable to store the callable object.

>>> f = "blah".find
>>> f("la")
1
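
The two steps can also be written out in a small script. This is a sketch in current Python 3 syntax rather than the Python 2.2 of the session above; the variable names are made up for illustration, and getattr is used here only as one way to perform the lookup step with the method name given as a string.

```python
text = "blahblah"

# Step 1: look up the attribute; nothing is called yet.
finder = text.find
is_callable = callable(finder)

# Step 2: call the object we retrieved.
direct = text.find("ah")
two_step = finder("ah")

# getattr performs step 1 with the method name given as a string.
by_name = getattr(text, "upper")()
```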

Python's API is not fully "object-oriented", although it is steadily moving in this direction. The length of any collection (including strings) is still obtained with the len function although it could be a method of the collection. Most string manipulations are now methods of the string type (they were originally only available as functions in the string module).

The examples showed only a small subset of the available string methods. If you would like to see all of them, try dir("") which shows you the list of all methods a string has to offer.

18.2.2. Control Flow

Python supports the usual conditional and loop statements of procedural languages.

>>> i = 0
>>> while i<5:
        print i
        i += 1

0
1
2
3
4

Here we see one of the most debated characteristics of Python: there are no braces, no begin or end to denote the extent of a block. Instead the white space tells the compiler which statements belong together. Consecutive lines with the same indentation comprise a block. It does not matter how the lines are indented (TABs or spaces, number of spaces), but the indentation has to be exactly the same for all lines in a block. Like it or not, it is at least a very compact way to define blocks of statements. In general, a compound Python statement such as a function definition, a class definition, or a control statement (if, while, exception handling) starts with a keyword followed by some expression, a colon, and an indented block. Here is a conditional statement.

>>> x = 50
>>> if x < 10:
        print "small"
elif x < 100:
        print "medium"
else:
        print "big"

medium

Here is another example showing a conditional statement nested in a while loop.

>>> i = 0
>>> while i<3:
	if i%2 == 0:
		print i, "is even"
	else:
		print i, "is odd"
	i += 1
0 is even
1 is odd
2 is even

Python's most common loop is the for statement. A for loop uses an iterator to walk through a sequence of values. An iterator is an object which knows how to get the next element in the sequence and when to stop. As an example, the standard string iterator walks through the characters of a string one by one.

>>> i = iter("ab")
>>> i.next()
'a'
>>> i.next()
'b'
>>> i.next()
Traceback (most recent call last):
  File "<pyshell#205>", line 1, in ?
    i.next()
StopIteration

We get the next element by calling the iterator's next method, and the iterator tells us to stop by throwing a StopIteration exception. The for loop assigns the result of the next method to the iteration variable until this exception is encountered.

>>> for i in iter("ab"):
        print i

a
b

The call to the iter function is optional. The for loop tries to find a suitable iterator automatically if it is not given one directly.

>>> for i in "ab":
        print i
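
What the for loop does behind the scenes can be spelled out by hand. The following is a rough sketch, not the interpreter's actual implementation: for_each is a hypothetical helper, and it uses the built-in next function of later Python versions where the Python 2.2 session above calls the iterator's next method directly.

```python
def for_each(sequence, action):
    """Hypothetical helper: do by hand what a for loop does."""
    iterator = iter(sequence)      # the for loop calls iter() for us
    while True:
        try:
            item = next(iterator)  # spelled iterator.next() in Python 2.2
        except StopIteration:      # the iterator signals the end
            break
        action(item)

seen = []
for_each("ab", seen.append)
```

After the call, seen holds the characters in the order the iterator produced them.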

We will see more examples in the context of collections and generators.

Python strictly distinguishes expressions from statements. Expressions compute values and statements such as print, if, or while control the program flow or cause other side effects. In Python, statements do not return any value and therefore can not be used in expressions. This is in strong contrast to the functional languages we will cover later where everything (including control statements) is an expression with a well-defined value. The newer scripting language Ruby follows the functional model as well. Programmers working with C-family languages will especially miss an equivalent of the "functional if" operator ?: in Python. [1]

There is another difference to languages of the C family (and all other languages with proper lexical scoping). Python's control statements do not introduce new scopes. A variable defined in a block of a control statement is also visible outside.

>>> if x % 2 == 0:
        result = "even"
else:
        result = "odd"
>>> result
'even'
>>> for i in "ab": pass
>>> i
'b'

18.2.3. Collections

Python has three built-in collection types: list, map (dictionary) and tuple. A list is a sequence of values with fast (constant time) random access. In other languages, this kind of collection is often called a vector. A map associates values with keys. As a generalization of pairs, triples, and so forth, a tuple is a sequence of fixed length. All collections can contain any kind of value, including other collections. Let's start with some list examples.

>>> l = [1, 2, "a", "b"]
>>> l[2]
'a'
>>> l[1:3]
[2, 'a']
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
>>> ["a", "b"] * 3
['a', 'b', 'a', 'b', 'a', 'b']
>>> len([1, "a", [2, 3]])
3

List literals are comma separated elements enclosed in square brackets. We recognize all the operators we have already used for strings. For lists, the subscript and slice operators can also be used to change the list.

>>> l = [1, 2, 3, 4]
>>> l[2] = 100
>>> l
[1, 2, 100, 4]
>>> l[1:3] = ["a", "b", "c"]
>>> l
[1, 'a', 'b', 'c', 4]
>>> del l[2]
>>> l
[1, 'a', 'c', 4]
>>> del l[2:]
>>> l
[1, 'a']
>>> del l
>>> l
Traceback (most recent call last):
  File "<pyshell#81>", line 1, in ?
    l
NameError: name 'l' is not defined

The only unusual syntax is the del operator which deletes the following object from its container. In the example we use it to delete a list element, a slice, and the list variable itself.

Lists have a number of methods manipulating the list object in-place (also called destructive methods).

>>> l = [0, 1, 2, 3, 4]
>>> l.reverse()
>>> l
[4, 3, 2, 1, 0]
>>> l.sort()
>>> l
[0, 1, 2, 3, 4]
>>> l.extend(['a', 'b'])
>>> l
[0, 1, 2, 3, 4, 'a', 'b']
>>> l.append('c')
>>> l
[0, 1, 2, 3, 4, 'a', 'b', 'c']
>>> l.pop()
'c'
>>> l.remove(4)
>>> l
[0, 1, 2, 3, 'a', 'b']

The extend method is the destructive version of the plus operator. The methods append and pop let us view a list as a LIFO stack. Note that the destructive methods which only modify the list (reverse, sort, extend, append, remove) return None; pop is the exception, since returning the removed element is its purpose. This is a Python convention. Since nothing is returned, these methods can not be used in a context which assumes that the method returns a new changed object and leaves the original one unaltered. Keep in mind that the underlying implementation is an array and not a linked list, which means that some methods might take longer for long lists than you expect, because elements have to be shifted or copied.
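
The consequence of this convention is easy to trip over, so here is a small sketch of the pitfall in current Python 3 syntax. The variable names are illustrative only.

```python
numbers = [3, 1, 2]

# Pitfall: sort() changes the list in place and returns None.
result = numbers.sort()

# The list itself has been sorted.
after_sort = numbers[:]

# The plus operator builds a new list and leaves its operands alone,
# whereas extend modifies the list it is called on.
original = [1, 2]
combined = original + [3]
original_after_plus = original[:]
original.extend([3])
```

Writing numbers = numbers.sort() would therefore throw the list away and leave None in the variable.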

Dictionaries (also called maps) are not only crucial to many scripts, but also to Python's internal implementation. A map literal is a sequence of key-value pairs enclosed in curly braces. Key and value of each pair are separated by a colon. Values can be of any type; keys must be hashable, which includes numbers, strings, and tuples of such values, but not lists.

>>> m = {"John": 55, "Joe": [1, 2, 3]}
>>> m
{'John': 55, 'Joe': [1, 2, 3]}
>>> m["Joe"]
[1, 2, 3]
>>> m["Joe"] = 66
>>> m
{'John': 55, 'Joe': 66}
>>> del m["John"]
>>> m
{'Joe': 66}
>>> m["John"]
Traceback (most recent call last):
  File "<pyshell#90>", line 1, in ?
    m["John"]
KeyError: John
>>> m.has_key("John")
0

Accessing elements in a map is like indexing a list, just with arbitrary keys instead of integers (and there are no slices).
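
Since indexing with a missing key raises a KeyError, lookups are often guarded. The following sketch, in current Python 3 syntax, shows three common ways to do so; it uses the in operator where the session above uses has_key, because has_key was removed in Python 3.

```python
m = {"Joe": 66}

# The in operator tests for a key (has_key was removed in Python 3).
has_john = "John" in m

# get returns a default instead of raising KeyError.
john = m.get("John", 0)
joe = m.get("Joe", 0)

# Plain indexing of a missing key raises KeyError, which can be caught.
try:
    m["John"]
    missing = False
except KeyError:
    missing = True
```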

The three methods keys, values and items give us the keys, values, and key-value pairs of a map as lists. Starting again with the original map:

>>> m = {"John": 55, "Joe": [1, 2, 3]}
>>> print m.keys()
['John', 'Joe']
>>> m.values()
[55, [1, 2, 3]]
>>> m.items()
[('John', 55), ('Joe', [1, 2, 3])]

The same information can also be obtained more efficiently in the form of iterators with the iterkeys, itervalues, and iteritems methods. These methods will become useful when iterating through the entries in a map.

A data structure which is less common in other languages is the tuple. Tuple literals are comma separated values in parentheses (just like the arguments of a function). A trailing comma is allowed. It is mandatory for a one-tuple, since it distinguishes the one-tuple from a simple expression in parentheses.

>>> t = (1, 2, 3)
>>> t
(1, 2, 3)
>>> t[1]
2
>>> t[1] = 55
Traceback (most recent call last):
  File "<pyshell#181>", line 1, in ?
    t[1] = 55
TypeError: object doesn't support item assignment
>>> ("blah",)
('blah',)
>>> ("blah")
'blah'

Like strings, tuples can't be changed; they are immutable objects. But you may change the objects contained in the tuple if they are mutable.

>>> t = (1, [])
>>> t[1].append("blah")
>>> t
(1, ['blah'])

Whenever it makes sense, Python automatically packs a sequence of comma separated values into a tuple and vice versa. This can, for example, be used to combine multiple assignments into one.

>>> 1, 2
(1, 2)
>>> x, y = 1, 2
>>> x, y
(1, 2)

As we will see this feature also simplifies the iteration through maps and allows for multiple return values of functions.
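
Two common uses of packing and unpacking are sketched below in current Python 3 syntax. The function min_max is a hypothetical helper invented for this illustration.

```python
def min_max(values):
    """Hypothetical helper returning two results packed into one tuple."""
    return min(values), max(values)

# Unpack the returned tuple into two variables.
low, high = min_max([3, 1, 4, 1, 5])

# Swap two variables without a temporary: the right-hand side is packed
# into a tuple first, then unpacked into the targets on the left.
a, b = 1, 2
a, b = b, a
```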

How do we iterate through a collection? The for loop walks through a list without the need to construct the loop explicitly.

>>> for i in ["a", "b", "c"]:
        print "i:", i
i: a
i: b
i: c
>>> i
'c'

Note that the loop variable is visible after the loop. Together with the automatic unpacking of pairs, we can easily iterate through a list of pairs.

>>> for a, b in [(1, 'a'), (2, 'b')]:
        print a, b
1 a
2 b

Combine this with the items method of a map and you get a convenient way to walk through the map's name-value pairs.

>>> m = {"John": 55, "Joe": 44}
>>> for key, value in m.items():
        print key, value

John 55
Joe 44

If we want to avoid the intermediate creation of the list, we can use the iterator instead.

>>> m = {"John": 55, "Joe": 44}
>>> for key, value in m.iteritems():
        print key, value

The default iterator gives us the keys of the map (and is probably the most intuitive way to walk through the map).

>>> for key in m:
        print key, m[key]

18.2.4. Functions

Up to now, we have only used built-in functions, but eventually we will need our own. Here is an exciting one.

>>> def times2(x):
        "Multiply argument by two"
        return 2*x
>>> times2(5)
10

A function is defined with the def keyword followed by the name of the function, the parameter list, a colon and the indented function body. If the body starts with a string, the string is interpreted as the documentation of the function. Here the body just consists of the documentation string and the return statement. Unless it executes a return statement with an explicit value, a function returns None, Python's equivalent of "nothing" or "undefined".

The interesting part is how this definition is handled by the Python interpreter. It creates a new function object and assigns it to the variable whose name is the name of the function. The function object is a first class object with its attributes and methods. We can, for example, ask the function for its name and documentation using the special attributes __name__ and __doc__.

>>> times2
<function times2 at 0x0097D1E0>
>>> times2.__name__
'times2'
>>> times2.__doc__
'Multiply argument by two'

We can even add new attributes dynamically, for example, to add more meta information to the function such as permissions.

>>> times2.permissions = ["everybody"]
>>> times2.permissions
['everybody']

Now that we have this function object, we can assign it to another variable, pass it to another ("higher order") function, and so forth.

>>> f = times2
>>> f(3)
6
>>> def printResult(f, x):
        print "%s(%d)=%d" % (f.__name__, x, f(x))
>>> printResult(times2, 5)
times2(5)=10

Higher order functions can often replace explicit loops. Python also supports a number of functions which are well-known for list oriented languages such as Lisp. As a first example, lists can be processed element-wise using the map function.

>>> def times2(x): return 2*x
>>> map(times2, [1, 2, 3])
[2, 4, 6]
>>> map(lambda x, y: x + y, [1, 2, 3], [2, 3, 4])
[3, 5, 7]
>>> map(lambda x, y: (x, y), [1, 2, 3], ["a", "b", "c"])
[(1, 'a'), (2, 'b'), (3, 'c')]

The map function is another example of a higher order function (also called a functional), that is, a function which takes other functions as arguments. Since functions are first class objects in Python, passing functions to other functions is not different from passing any other kind of value. The last expression combines several lists into a list of tuples and can be more easily written using the built-in zip function:

>>> zip([1, 2, 3], ["a", "b", "c", "d"])
[(1, 'a'), (2, 'b'), (3, 'c')]

It is also possible to extract a sub-list using a filter:

>>> l = range(10)
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> filter(lambda x: x % 2 == 0, l)
[0, 2, 4, 6, 8]

As another example of list processing, you can recursively compute a value from a list using reduce.

>>> reduce(lambda x, y: x+y, [1, 2, 3, 4])
10
>>> reduce(lambda x, y: x + ", " + y, ["Joe", "John", "Mary"])
'Joe, John, Mary'

Like the map function, it takes a function and a list. The function is first applied to the first two elements of the list, then to the result of this computation and the third element, and so forth. Optionally, one can provide a start value so that the recursion starts by applying the function to the start value and the first value of the list.

>>> reduce(lambda x, y: x+y, [], 3)
3
>>> reduce(lambda x, y: x+y, [1], 3)
4
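
To make the evaluation order concrete, here is a sketch of reduce written as a plain loop. It is a hypothetical re-implementation for illustration (my_reduce and _MISSING are invented names), not the built-in's source, and it is written in current Python 3 syntax, where the real reduce lives in the functools module.

```python
_MISSING = object()  # sentinel meaning "no start value was given"

def my_reduce(function, items, start=_MISSING):
    """Hypothetical re-implementation of reduce as a plain loop."""
    iterator = iter(items)
    if start is _MISSING:
        try:
            result = next(iterator)   # first element seeds the result
        except StopIteration:
            raise TypeError("my_reduce of empty sequence with no start value")
    else:
        result = start
    for item in iterator:
        result = function(result, item)
    return result

total = my_reduce(lambda x, y: x + y, [1, 2, 3, 4])
names = my_reduce(lambda x, y: x + ", " + y, ["Joe", "John", "Mary"])
seeded = my_reduce(lambda x, y: x + y, [1], 3)
empty = my_reduce(lambda x, y: x + y, [], 3)
```

The start value matters most for empty lists, where there is no first element to begin with.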

Map and filter are used less often in new programs because of a recent extension of the syntax for list literals known as list comprehensions. Remember the definition of sets from other sets in your math class? Something like "all f(x) where x in X and x satisfies some condition"? Here is the Python version of this kind of list (instead of set) definition.

>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]

The expression constructs the list of squares of all even integers between zero and ten (not including ten). List comprehensions may combine multiple lists, e.g.:

>>> firstNames = ["John", "Joe", "Mary"]
>>> lastNames = ["Miller", "Smith"]
>>> [(f, l) for f in firstNames for l in lastNames]
[('John', 'Miller'), ('John', 'Smith'), ('Joe', 'Miller'),
 ('Joe', 'Smith'), ('Mary', 'Miller'), ('Mary', 'Smith')]
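
A list comprehension can always be read as shorthand for explicit loops. The sketch below, in current Python 3 syntax, puts both comprehensions from above next to an equivalent loop; the names ending in _loop are illustrative, and the name lists are shortened.

```python
# The comprehension ...
squares = [x**2 for x in range(10) if x % 2 == 0]

# ... written as an explicit loop.
squares_loop = []
for x in range(10):
    if x % 2 == 0:
        squares_loop.append(x**2)

# Two for clauses nest from left to right, like nested loops.
pairs = [(f, l) for f in ["John", "Joe"] for l in ["Miller", "Smith"]]
pairs_loop = []
for f in ["John", "Joe"]:
    for l in ["Miller", "Smith"]:
        pairs_loop.append((f, l))
```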

Two more features a C/C++ programmer misses when moving to Java are variable argument lists and default arguments. Python takes the C/C++ functionality one step further by allowing arguments to be passed by name.

>>> def f(s="blah", n=1): print n*s
>>> f("x", 5)
xxxxx
>>> f(n=2)
blahblah
>>> f(s="x")
x

Variable argument lists are declared with an asterisk and passed as a tuple to the function body.

>>> def f(s="blah", *args): print s, args
>>> f("blub", 1, 2, 3)
blub (1, 2, 3)

Similarly, keyword arguments can be passed as a generic argument map to a function.

>>> def f(x, **kw): print x, kw
>>> f(x=1, y=2)
1 {'y': 2}

Putting it all together we can define a function with normal arguments, optional arguments, a variable argument tuple, and a keyword argument map.

>>> def f(x, y=5, *args, **kw):
        print x, y, args, kw
>>> f(1, 2, 3, a=4)
1 2 (3,) {'a': 4}

We can also do the opposite and call a function with an argument tuple and keyword argument map.

>>> def f(x, y=2, z=3): print x, y, z
>>> args = (100, 200)
>>> kw = {'z': 55}
>>> f(*args, **kw)
100 200 55

This feature comes in handy when passing arguments in a generic context from one function to another. As an application, we can now define the higher order function compose which combines two functions to a new function applying one function after the other.

>>> def compose(f, g): return lambda *args, **kw: f(g(*args, **kw))
>>> def times2(x): return 2*x
>>> def add(x, y): return x + y
>>> h = compose(times2, add)
>>> h(3, 4)
14
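
The same forwarding idiom works for any wrapper that should not care about the signature of the function it wraps. As a sketch in current Python 3 syntax, the hypothetical function traced below records every call before delegating; traced, wrapper, and the two-argument add with a default are assumptions made for this illustration.

```python
def traced(f, log):
    """Hypothetical wrapper: record each call in log, then delegate to f."""
    def wrapper(*args, **kw):
        log.append((args, kw))      # arguments arrive as a tuple and a map
        return f(*args, **kw)       # and are passed on unchanged
    return wrapper

def add(x, y=10):
    return x + y

calls = []
traced_add = traced(add, calls)
first = traced_add(1, 2)
second = traced_add(5, y=6)
```

The wrapper never needs to know how many arguments add takes or what they are called.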

18.2.5. Objects and Classes

What's the fastest way to teach object-oriented programming? I mean, after drawing some diagrams explaining objects, classes, inheritance, polymorphism and so on. Let's type a Python class and see.

>>> class Person:
	def __init__(self, name):
		self.name = name
	def sayHello(self):
		print "Hello, I'm", self.name		
>>> andy = Person("Andy")
>>> andy.sayHello()
Hello, I'm Andy

Ok, this is not very sophisticated, but we have defined a class Person, created an instance of this class, and called the method sayHello, all in seven lines of code. To do this, we had to know two things: method definitions look like functions taking the instance self as the first argument (you don't need to call it self, but everybody does), and the constructor is called __init__. This name is one of the function names with special meaning in Python, all starting and ending with two underscore characters. To continue our study of object orientation, let's see what the newly introduced variables are:

>>> type(Person)
<type 'class'>
>>> andy
<__main__.Person instance at 0x009FD0B0>
>>> andy.sayHello
<bound method Person.sayHello of <__main__.Person instance at 0x009FD0B0>>
>>> Person.sayHello
<unbound method Person.sayHello>

Here we can see precisely what we have defined. Person is a class, and andy is an instance of this class. There seem to be two kinds of methods: andy.sayHello is bound to the object andy, whereas Person.sayHello is not bound to an instance of the class Person yet. Since we can print all these objects, we can also use them in all kinds of expressions.

>>> f = andy.sayHello
>>> f()
Hello, I'm Andy
>>> F = Person.sayHello
>>> F(andy)
Hello, I'm Andy
>>> def evalThreeTimes(f):
	for i in range(3): f()	
>>> evalThreeTimes(andy.sayHello)
Hello, I'm Andy
Hello, I'm Andy
Hello, I'm Andy

The next example demonstrates polymorphism. We define a new class Employee derived from Person which adds another attribute for the employee number which the obedient employee has to mention whenever saying hello.

>>> class Employee(Person):
	def __init__(self, name, number):
		Person.__init__(self, name)
		self.number = number
	def sayHello(self):
		print "Hello, I'm", self.name, "also known as number", self.number
>>> homer = Employee("Homer", 1234)
>>> homer.sayHello()
Hello, I'm Homer also known as number 1234

It looks like all methods in Python can be polymorphic, since we don't have to do anything special to define the new behavior (similar to Smalltalk and the default semantic (non-final method) in Java). The next example tells us more about the way Python classes work.

>>> def f(self):
        print "Hi there, I'm", self.name
>>> Person.sayHello = f
>>> andy.sayHello()
Hi there, I'm Andy
>>> homer.sayHello()
Hello, I'm Homer also known as number 1234

We changed the sayHello method by assigning a new function to the class attribute Person.sayHello, and indeed, calling sayHello on the instance andy now gives a new answer, but homer (being an Employee) does not change his behaviour, because Employee defines its own sayHello. This shows that methods are just callable members of a class. When calling a method, Python looks for a member with the name of the called method and then executes the "call operator" on this object. The same happens when calling a function or a class: Python first looks for the object (like for any other variable) in the current environment and then passes the function's arguments to the "call" method of this callable object. You can turn any of your own objects into callable objects which behave like functions by implementing the special __call__ method.

>>> class A:
	def __init__(self, n): self.n = n
	def __call__(self, x): return self.n * x
>>> a = A(5)
>>> a(3)
15
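
Returning to the point that methods are just callable members of a class, the lookup can be made visible by inspecting the __dict__ attributes. The sketch below uses current Python 3 syntax and slightly simplified versions of Person and Employee that return strings instead of printing; these simplifications are assumptions for the sake of a short, checkable example.

```python
class Person:
    def __init__(self, name):
        self.name = name
    def sayHello(self):
        return "Hello, I'm " + self.name

class Employee(Person):
    def sayHello(self):
        return "Employee " + self.name

andy = Person("Andy")
homer = Employee("Homer")

# The method is stored in the class, not in the instance.
in_class = "sayHello" in Person.__dict__
in_instance = "sayHello" in andy.__dict__

# Replacing the class member changes what every Person instance finds ...
Person.sayHello = lambda self: "Hi there, I'm " + self.name
andy_says = andy.sayHello()

# ... but Employee has its own member, which is found first.
homer_says = homer.sayHello()
```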

Existing Python classes can not be extended as easily as classes in other dynamic object oriented languages (Smalltalk, Objective C, CLOS). When we define a class again, a new class is created (in Ruby, the new class members are added to the existing class).

>>> class A:
	def __init__(self, n): self.n = n
>>> A(5).n
5
>>> class A: pass
>>> A(5)
Traceback (most recent call last):
  File "<pyshell#244>", line 1, in ?
    A(5)
TypeError: this constructor takes no arguments

However, since methods are just callable members of the class, we can attach functions as methods to an existing class.

>>> class A:
	def __init__(self, name): self.name = name
	
>>> def hello(self): return "My name is " + self.name
>>> A.hello = hello
>>> A("Homer").hello()
'My name is Homer'

An interesting application of keyword arguments is a "universal constructor" for objects.

>>> class Object:
	def __init__(self, **kw):
		self.__dict__.update(kw)
>>> joe = Object(name="Joe", age=25)
>>> joe.__dict__
{'age': 25, 'name': 'Joe'}
>>> print "my name is", joe.name
my name is Joe

Notes

[1]

A functional "if" is about to be added to Python in the next release.