Difference between revisions of "Python1"
| =Introduction= | ||
| + | [[Image:Python.png|thumb|1100px|none|http://xkcd.com/353/]] | ||
| + | |||
| + | With thanks to Simon Metson and Mike Wallace for much of the following material. | ||
| + | |||
| + | =Getting Started on BlueCrystal Phase-2= | ||
| + | |||
| + | After you have logged in, type the following at the command line: | ||
| + | |||
| + | <pre> | ||
| + | module add languages/python-2.7.2.0 | ||
| + | python | ||
| + | </pre> | ||
| + | |||
| + | This should start up an interactive python session: | ||
| + | |||
| + | <pre> | ||
| + | Python 2.7.2 (default, Aug 25 2011, 10:51:03)  | ||
| + | [GCC 4.3.3] on linux2 | ||
| + | Type "help", "copyright", "credits" or "license" for more information. | ||
| + | >>> | ||
| + | </pre> | ||
| + | |||
| + | where we can type commands at the '''>>>''' prompt. | ||
| =Python as a Calculator= | ||
| + | |||
| + | To get started, let's just try a few commands out.  If you type: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> print "Hello!" | ||
| + | </source> | ||
| + | |||
| + | you'll get: | ||
| + | |||
| + | <pre> | ||
| + | Hello! | ||
| + | </pre> | ||
| + | |||
| + | If you try: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> print 5 + 9 | ||
| + | </source> | ||
| + | |||
| + | you'll get: | ||
| + | |||
| + | <pre> | ||
| + | 14 | ||
| + | </pre> | ||
| + | |||
| + | So far so simple!  Here is a copy of a session containing a few more commands where we've set the values of some variables and also defined and run our own function:  | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> five = 5 | ||
| + | >>> neuf = 9 | ||
| + | >>> print five + neuf | ||
| + | 14 | ||
| + | >>> def say_hello(): | ||
| + | ...     print "Hello, world!" | ||
| + | ... # hit return here  | ||
| + | >>> say_hello() | ||
| + | Hello, world! | ||
| + | </source> | ||
| + | |||
| + | You can exit an interactive session at any time by typing '''Ctrl-D'''. | ||
| + | |||
| + | =Getting Help= | ||
| + | |||
| + | One of the good things about Python is that it has lots of useful online documentation.  ([[A_Good_Read|There are good books on the language too]].)  For example, take a look at: http://docs.python.org/.  You can also type '''help()''' at the interpreter prompt: | ||
| + | |||
| + | <pre> | ||
| + | >>> help() | ||
| + | |||
| + | Welcome to Python 2.7!  This is the online help utility. | ||
| + | |||
| + | If this is your first time using Python, you should definitely check out | ||
| + | the tutorial on the Internet at http://docs.python.org/tutorial/. | ||
| + | |||
| + | Enter the name of any module, keyword, or topic to get help on writing | ||
| + | Python programs and using Python modules.  To quit this help utility and | ||
| + | return to the interpreter, just type "quit". | ||
| + | |||
| + | ... | ||
| + | |||
| + | help> keywords | ||
| + | |||
| + | Here is a list of the Python keywords.  Enter any keyword to get more help. | ||
| + | |||
| + | and                 elif                if                  print | ||
| + | ... | ||
| + | |||
| + | help> if | ||
| + | The ``if`` statement | ||
| + | ******************** | ||
| + | |||
| + | The ``if`` statement is used for conditional execution: | ||
| + | |||
| + |    if_stmt ::= "if" expression ":" suite | ||
| + |                ( "elif" expression ":" suite )* | ||
| + |                ["else" ":" suite] | ||
| + | |||
| + | It selects exactly one of the suites by evaluating the expressions one | ||
| + | by one until one is found to be true... | ||
| + | ... | ||
| + | |||
| + | help> quit | ||
| + | |||
| + | You are now leaving help and returning to the Python interpreter. | ||
| + | ... | ||
| + | >>>  | ||
| + | </pre> | ||
| + | |||
| + | =Making a Script= | ||
| + | |||
| + | An interactive session can be fun and useful for trying things out.  However--to save our fingers--we will typically want to execute a series of commands as a script, created using your favourite text editor.  Here are the contents of an example script: | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | |||
| + | print "Hello, from a python script!" | ||
| + | </source>  | ||
| + | |||
| + | Ensure that your script is executable: | ||
| + | |||
| + | <pre> | ||
| + | chmod u+x myscript.py | ||
| + | </pre> | ||
| + | |||
| + | and now you can run it: | ||
| + | |||
| + | <pre> | ||
| + | [ggdagw@bigblue4 ~]$ ./myscript.py  | ||
| + | Hello, from a python script! | ||
| + | </pre> | ||
| + | |||
| + | =Python and Whitespace= | ||
| + | |||
| + | Love it or hate it, Python incorporates whitespace in its syntax. (It's either that or demarcate blocks with some other syntax, such as the curly braces which enclose blocks, and the semi-colons which end statements, in C.  Pick your poison.)  Spacing is therefore key in creating a valid python script.  For example: | ||
| + | |||
| + | <source lang="python"> | ||
| + | message = "happy days!" | ||
| + | if len(message) > 10: | ||
| + |     print "longer.." | ||
| + | else: | ||
| + |     print "shorter.." | ||
| + | </source> | ||
| + | |||
| + | will work, but: | ||
| + | |||
| + | <source lang="python"> | ||
| + | message = "happy days!" | ||
| + | if len(message) > 10: | ||
| + |  print "longer.." | ||
| + | else: | ||
| + | print "shorter.." | ||
| + | </source> | ||
| + | |||
| + | will not: | ||
| + | |||
| + | <pre> | ||
| + |   File "./myscript.py", line 7 | ||
| + |     print "shorter.." | ||
| + |         ^ | ||
| + | IndentationError: expected an indented block | ||
| + | </pre> | ||
| + | |||
| + | It is therefore a great advantage, when writing a python script, to use a text editor which has a dedicated python mode--such as '''emacs'''--and which will actively help you to keep your spacing correct.  See http://wiki.python.org/moin/PythonEditors for an extensive list. | ||
| + | |||
| + | =Some Suggested Exercises= | ||
| + | |||
| + | * Calculate the volume of a sphere.  You can experiment with the following (where r needs to be set to some value): | ||
| + | ** <source lang="python">4/3 * 3.14159265359 * r ** 3</source> | ||
| + | ** <source lang="python">4.0/3.0 * 3.14159265359 * pow(r,3)</source> | ||
| + | ** <source lang="python">float(4)/float(3) * 3.14159265359 * pow(r,3)</source> | ||
| + | * Concatenate two strings | ||
| + | * Write a recursive function to compute Fibonacci numbers (Hint: F(n) = F(n-1) + F(n-2), F(0) = 0 and F(1) = 1) | ||
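As a pointer for the first exercise, here is a minimal sketch of why the three sphere formulae need not agree.  The radius value is an arbitrary assumption, and the floor-division operator '''//''' is used so that the snippet behaves the same under Python 2.7 and Python 3 (in Python 2, '''4/3''' between two integers truncates in the same way):

```python
r = 2.0             # example radius (an arbitrary choice for illustration)
pi = 3.14159265359

# Floor division throws away the fractional part, so 4 // 3 is 1.
# In Python 2, writing 4/3 with two integers does the same thing.
truncated = 4 // 3 * pi * r ** 3

# Using floating point literals keeps the fractional part of 4/3.
correct = 4.0 / 3.0 * pi * r ** 3

print("truncated: %s" % truncated)
print("correct:   %s" % correct)
```

The two printed values differ by a factor of 4/3, which is a handy clue when a numerical result looks suspiciously small.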
| + | |||
| + | =Nuts and Bolts= | ||
| + | |||
| + | ==Types== | ||
| + | |||
| + | Python has intrinsic types including integers, floats, booleans and complex numbers.  It is dynamically typed (meaning that you don't have to have a block of variable declarations at the top of your script), but it is '''not weakly''' typed, for example: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> my_complex = 2 + 0.5j | ||
| + | >>> my_complex | ||
| + | (2+0.5j) | ||
| + | >>> my_complex.real | ||
| + | 2.0 | ||
| + | >>> my_complex.imag | ||
| + | 0.5 | ||
| + | >>> name = 'fred' | ||
| + | >>> lucky = 7 | ||
| + | >>> name + lucky | ||
| + | Traceback (most recent call last): | ||
| + |   File "<stdin>", line 1, in <module> | ||
| + | TypeError: cannot concatenate 'str' and 'int' objects | ||
| + | </source> | ||
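The usual way around the error above is to convert explicitly.  The following is a small sketch (the variable names '''label''' and '''total''' are made up for illustration) which runs under both Python 2.7 and Python 3:

```python
name = 'fred'
lucky = 7

# Python refuses to guess what str + int should mean...
failed = False
try:
    name + lucky
except TypeError:
    failed = True

# ...so we say what we want explicitly.
label = name + str(lucky)    # convert the integer to a string
total = int('35') + lucky    # convert a string to an integer

print(label)
print(total)
```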
| + | |||
| + | ==Strings== | ||
| + | |||
| + | The eagle-eyed will have spotted in a previous example that we could ask for the length of a character string--straight off the bat.  No need to write a counting routine ourselves: | ||
| + | |||
| + | <source lang="python"> | ||
| + | message = "happy days!" | ||
| + | print len(message) | ||
| + | 11 | ||
| + | </source> | ||
| + | |||
| + | We can also take '''slices''' of our character string.  For example, the first five characters: | ||
| + | |||
| + | <source lang="python"> | ||
| + | print message[:5] | ||
| + | happy | ||
| + | </source> | ||
| + | |||
| + | Since a string is an '''object''' (in the object oriented programming sense of the word, but more of that another time...) we can call a number of methods that operate on a string.  A selected sample includes: | ||
| + | |||
| + | {| border="1" cellpadding="10" | ||
| + | || s.find(sub) || Finds the first occurrence of the given substring | ||
| + | |- | ||
| + | || s.islower() || Checks whether all characters are lowercase | ||
| + | |- | ||
| + | || s.upper() || Returns '''s''' converted to uppercase | ||
| + | |- | ||
| + | || s.strip() || Removes leading and trailing whitespace | ||
| + | |- | ||
| + | || s.replace(old,new) || Replaces substring '''old''' with '''new''' | ||
| + | |- | ||
| + | || s.split([sep]) || Splits '''s''' using (optional) '''sep''' as a delimiter.  Returns a list | ||
| + | |- | ||
| + | |} | ||
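The methods in the table can be tried out as in the following sketch.  The example string is an assumption chosen for illustration, and the single-argument '''print(...)''' form is used so that the snippet runs under both Python 2.7 and Python 3:

```python
s = "  Happy Days!  "

stripped = s.strip()                           # leading/trailing whitespace removed
position = stripped.find("Days")               # index of the first match, or -1 if absent
shouted = stripped.upper()                     # a new string; the original is unchanged
swapped = stripped.replace("Happy", "Sunny")   # also returns a new string
words = stripped.split()                       # no separator given: split on whitespace
fields = "a,b,c".split(",")                    # explicit separator

print(stripped)
print(position)
print(shouted)
print(swapped)
print(words)
print(fields)
print(stripped.islower())
```

Note that strings are immutable, so each of these methods returns a new value rather than changing '''s''' itself.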
| + | |||
| + | ==Lists and Tuples== | ||
| + | |||
| + | An example of a list is: | ||
| + | |||
| + | <source lang="python"> | ||
| + | shopping = ['bread', 'marmalade', 'milk', 'tea'] | ||
| + | </source> | ||
| + | |||
| + | and we can inquire about the length of that using the same function as before: | ||
| + | |||
| + | <source lang="python"> | ||
| + | len(shopping) | ||
| + | </source> | ||
| + | |||
| + | We can also take '''slices''' of a list, as we did with a string: | ||
| + | |||
| + | <source lang="python"> | ||
| + | shopping[0:2] | ||
| + | </source> | ||
| + | |||
| + | and even reset a portion of the list that way: | ||
| + | |||
| + | <source lang="python"> | ||
| + | shopping[0:2] = ['bagels', 'jam'] | ||
| + | </source> | ||
| + | |||
| + | Since a list is also an object, we have more handy methods, including: | ||
| + | |||
| + | {| border="1" cellpadding="10" | ||
| + | || s.append(x) || Appends a new element '''x''' to the end of '''s''' | ||
| + | |- | ||
| + | || s.count(x) || Returns the number of occurrences of '''x''' in '''s''' | ||
| + | |- | ||
| + | || s.reverse() || Reverses items of '''s''' in place | ||
| + | |- | ||
| + | || s.sort([compfunc]) || Sorts items of '''s''' in place.  '''compfunc''' is an optional comparison function | ||
| + | |} | ||
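Here is a short sketch of those list methods in action, re-using the shopping list from above.  The name '''reversed_copy''' is made up for illustration:

```python
shopping = ['bread', 'marmalade', 'milk', 'tea']

shopping.append('milk')               # the list grows by one element
milk_count = shopping.count('milk')   # how many times 'milk' appears

shopping.reverse()                    # works in place and returns None
reversed_copy = list(shopping)        # take a copy before we sort

shopping.sort()                       # also in place; alphabetical for strings

print(milk_count)
print(reversed_copy)
print(shopping)
```

Since '''reverse()''' and '''sort()''' change the list in place, assigning their return value (e.g. '''x = shopping.sort()''') is a common slip: '''x''' would simply be '''None'''.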
| + | |||
| + | Tuples are very similar to lists and support many of the same operations (indexing, slicing, concatenation etc.) but differ in that they are '''not mutable''' after creation: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> mytuple = ('fred', 'ginger', 7, 2.5) | ||
| + | >>> mylist = ['fred', 'ginger', 7, 2.5] | ||
| + | >>> mylist[2] = 8 | ||
| + | >>> print mylist | ||
| + | ['fred', 'ginger', 8, 2.5] | ||
| + | >>> print mytuple[2]     | ||
| + | 7 | ||
| + | >>> mytuple[2] = 8 | ||
| + | Traceback (most recent call last): | ||
| + |   File "<stdin>", line 1, in <module> | ||
| + | TypeError: 'tuple' object does not support item assignment | ||
| + | </source> | ||
| + | |||
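| + | A '''list comprehension''' builds a new list from an existing sequence, optionally keeping only those elements which pass a test: | ||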
| + | |||
| + | <source lang="python"> | ||
| + | >>> numbers = [12, 3, 90, 40, 52, 11, 10] | ||
| + | >>> small_numbers_doubled = [number * 2 for number in numbers if number < 20] | ||
| + | </source> | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> small_numbers_doubled | ||
| + | [24, 6, 22, 20] | ||
| + | </source> | ||
| + | |||
| + | ==Dictionaries== | ||
| + | |||
| + | A dictionary is an associative array or hash table, containing '''key-value''' pairs: | ||
| + | |||
| + | <source lang="python"> | ||
| + | mydict = {'thomas':'blue', 'james':'red', 'henry':'green'} | ||
| + | </source> | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> print mydict['james'] | ||
| + | red | ||
| + | </source> | ||
| + | |||
| + | We can write much more user-friendly and intuitive code using dictionaries, rather than arbitrary indexes into a list. | ||
| + | |||
| + | Some example dictionary methods are: | ||
| + | |||
| + | {| border="1" cellpadding="10" | ||
| + | || m.keys() || Returns a list of the keys in '''m''' | ||
| + | |- | ||
| + | || m.items() || Returns a list of the (key,value) pairs in '''m''' | ||
| + | |- | ||
| + | || m[k] = x || Sets m[k] to x | ||
| + | |- | ||
| + | || m.update(b) || Adds objects from dictionary '''b''' to '''m''' | ||
| + | |} | ||
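A brief sketch of these dictionary methods follows.  The extra engines and colours are invented for illustration.  Dictionaries should not be relied upon to keep their keys in any particular order in Python 2.7, so the example sorts the keys before printing:

```python
mydict = {'thomas': 'blue', 'james': 'red', 'henry': 'green'}

mydict['percy'] = 'green'                               # add a new key-value pair
mydict.update({'james': 'crimson', 'gordon': 'blue'})   # overwrite one, add another

# Loop over the keys in alphabetical order
for name in sorted(mydict.keys()):
    print("%s is %s" % (name, mydict[name]))

pairs = sorted(mydict.items())                          # a sorted list of (key, value) tuples
print(pairs[0])
```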
| + | |||
| + | ==Control Structures== | ||
| + | |||
| + | Of course, we'll need conditionals and loops etc. to go beyond the simplest of scripts.  Here is an '''if-then-else''', python style: | ||
| + | |||
| + | <source lang="python"> | ||
| + | if sky == 'blue': | ||
| + |     birds_sing() | ||
| + | elif sky == 'black': | ||
| + |     birds_sleep() | ||
| + | else: | ||
| + |     pass #do nothing | ||
| + | </source> | ||
| + | |||
| + | and a classic '''for loop''': | ||
| + | |||
| + | <source lang="python"> | ||
| + | for ii in range(1,10): | ||
| + |     print ii | ||
| + | </source> | ||
| + | |||
| + | <pre> | ||
| + | 1 | ||
| + | ... | ||
| + | 9 | ||
| + | >>> | ||
| + | </pre> | ||
| + | |||
| + | We'll also see a '''while loop''' shoehorned into the next example. | ||
| + | |||
| + | For our control statements, we can use comparison operators such as '''==''', '''!=''', '''>''', '''<''', '''<=''' and '''>=''', and logical operators such as '''and''', '''or''' and '''not'''. | ||
| + | |||
| + | ==File Input and Output== | ||
| + | |||
| + | Here's some code for printing the contents of a text file: | ||
| + | |||
| + | <source lang="python"> | ||
| + | fp = open("foo.txt","r") | ||
| + | line = fp.readline() | ||
| + | while line: | ||
| + |     line = line.strip() | ||
| + |     print line | ||
| + |     line = fp.readline() | ||
| + | fp.close() | ||
| + | </source> | ||
| + | |||
| + | We could open a file for writing with: | ||
| + | |||
| + | <source lang="python"> | ||
| + | fp = open("foo.txt","w") | ||
| + | </source> | ||
| + | |||
| + | and use: | ||
| + | |||
| + | <source lang="python"> | ||
| + | fp.write(...) | ||
| + | </source> | ||
| + | |||
| + | to write to that file. | ||
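Putting the reading and writing together, here is a minimal sketch which uses the '''with''' statement (available in Python 2.6 onwards), so that the file is closed for us even if something goes wrong part way through.  The temporary directory and the file contents are assumptions made so that the example is self-contained:

```python
import os
import tempfile

# Work in a scratch directory so that we don't clobber a real foo.txt
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# Open for writing; the file is closed automatically at the end of the block
with open(path, "w") as fp:
    fp.write("first line\n")
    fp.write("second line\n")

# Open for reading; a file object can be looped over line by line
lines = []
with open(path, "r") as fp:
    for line in fp:
        lines.append(line.strip())   # strip() removes the trailing newline

print(lines)
```

Looping directly over the file object does the same job as the '''readline()''' and '''while''' combination above, with rather less typing.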
| + | |||
| + | =Object Oriented Programming in Python= | ||
| + | |||
| + | Here is an example of using a class in python: | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | |||
| + | class Radio: | ||
| + |     "A simple radio" | ||
| + |     def __init__(self,freq=0.0,name=""): | ||
| + |         "Constructor method" | ||
| + |         self.__frequency=freq | ||
| + |         self.name=name | ||
| + |     def tune(self,freq): | ||
| + |         self.__frequency=freq | ||
| + |     def tuned_to(self): | ||
| + |         print self.name, "tuned to:", self.__frequency | ||
| + | |||
| + | if __name__ == "__main__": | ||
| + |     # declare two radio instances | ||
| + |     car = Radio(name="car") | ||
| + |     kitchen = Radio(91.5,"kitchen") | ||
| + |     # call some methods | ||
| + |     car.tuned_to() | ||
| + |     kitchen.tuned_to() | ||
| + |     car.tune(89.3) | ||
| + |     car.tuned_to() | ||
| + |     # Docstrings--double quotes at the top of the class:                         | ||
| + |     print car.__doc__ | ||
| + |     # NB members not private by default: | ||
| + |     print car.name | ||
| + |     # BUT leading double underscores will trigger | ||
| + |     # name mangling and hence the member will be hidden  | ||
| + |     print car.__frequency | ||
| + | </source> | ||
| + | |||
| + | Running the script gives us: | ||
| + | |||
| + | <pre> | ||
| + | car tuned to: 0.0 | ||
| + | kitchen tuned to: 91.5 | ||
| + | car tuned to: 89.3 | ||
| + | A simple radio | ||
| + | car | ||
| + | Traceback (most recent call last): | ||
| + |   File "./foo.py", line 27, in <module> | ||
| + |     print car.__frequency | ||
| + | AttributeError: Radio instance has no attribute '__frequency' | ||
| + | </pre> | ||
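The final error is a consequence of '''name mangling''': inside the class body, Python quietly rewrites a name with two leading underscores so that it is prefixed with the class name.  The following sketch uses a cut-down version of the Radio class (the '''object''' base class is an addition here, so that it behaves the same in Python 2.7 and Python 3) to show that the member is hidden rather than truly private:

```python
class Radio(object):
    "A simple radio"
    def __init__(self, freq=0.0, name=""):
        self.__frequency = freq   # stored on the instance as _Radio__frequency
        self.name = name

car = Radio(89.3, "car")

# Outside the class, the plain name is not found...
try:
    car.__frequency
    hidden = False
except AttributeError:
    hidden = True

# ...but the mangled name is there for anyone who insists.
mangled_value = car._Radio__frequency

print(hidden)
print(mangled_value)
```

In other words, the double underscore is a strong hint to users of the class to keep their hands off, not an enforced access control.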
| + | |||
| + | =Using Packages= | ||
| + | |||
| + | Python packages are great because they provide us with a whole lot of extra functionality--above and beyond the core language--that we didn't have to write and debug ourselves. | ||
| + | |||
| + | Let's walk through a simple example using a package.  At an interactive prompt type: | ||
| + | |||
| + | <source lang="python"> | ||
| + | from random import randint | ||
| + | </source> | ||
| + | |||
| + | This will give us access to the '''randint(x,y)''' function, which returns a randomly chosen integer from the given range [x,y]: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> randint(0,10) | ||
| + | 4 | ||
| + | >>> randint(0,10) | ||
| + | 1 | ||
| + | >>> randint(0,10) | ||
| + | 3 | ||
| + | >>> randint(0,10) | ||
| + | 0 | ||
| + | </source> | ||
| + | |||
| + | OK, so far so good.  One thing to note is that the above '''import''' statement has drawn the name ''randint'' into our current '''namespace'''.  What if we had already defined a function named ''randint''?  That could cause problems.  In order to protect ourselves from this kind of problem, there are several import variants. | ||
| + | |||
| + | With a plain '''import''', functions will be added to a namespace with the same name as the package.  In order to call the functions we will, in this case, have to prefix them with their namespace: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> import random | ||
| + | >>> random.randint(0,10) | ||
| + | 6 | ||
| + | </source> | ||
| + | |||
| + | Should we desire, we can apply a little more control and specify the namespace for the import ourselves:  | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> import random as rnd | ||
| + | >>> rnd.randint(0,10) | ||
| + | 3 | ||
| + | </source> | ||
| + | |||
| + | Another--more 'devil-may-care'--approach is to do away with the separate namespace and pull everything from a given package into the current namespace: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> from random import * | ||
| + | >>> randint(0,10) | ||
| + | 9 | ||
| + | >>> random() | ||
| + | 0.3172268098313996 | ||
| + | </source> | ||
| + | |||
| + | (The '''random()''' function returns a randomly selected floating point number in the range [0, 1)--that is, between 0 and 1, including 0.0 but always smaller than 1.0.) | ||
| + | |||
| + | ==Interrogating a Module== | ||
| + | |||
| + | To find all the functions that are in a particular module, type '''dir(<modulename>)'''. | ||
| + | |||
| + | If you have the '''pip''' package installed, you can easily see which other packages are installed using '''pip list''' on the linux command line. | ||
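For example, a short sketch of interrogating the '''random''' module which we used above.  The filtering of names which begin with an underscore is just a convenience, added here to hide the module's internals:

```python
import random

names = dir(random)   # a list of strings, one per name defined in the module

# Names starting with an underscore are internal, so leave them out
public_names = [n for n in names if not n.startswith('_')]

print(public_names[:10])          # just the first few
print('randint' in public_names)
```

Once you have spotted a likely looking function, '''help(random.randint)''' will print its documentation.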
| + | |||
| + | ==A Namespace Collision== | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> def randint(): | ||
| + | ...     print "dummy function" | ||
| + | ...  | ||
| + | >>> randint() | ||
| + | dummy function | ||
| + | >>> from random import randint | ||
| + | >>> randint() | ||
| + | Traceback (most recent call last): | ||
| + |   File "<stdin>", line 1, in <module> | ||
| + | TypeError: randint() takes exactly 3 arguments (1 given) | ||
| + | >>> randint(0,10) | ||
| + | 0 | ||
| + | </source> | ||
| + | |||
| + | =Python for Shell Scripting= | ||
| + | |||
| + | <source lang="python"> | ||
| + | from subprocess import call | ||
| + | call(["ls", "-l"]) | ||
| + | </source> | ||
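Often we will want to capture the output of a command, or test whether it succeeded.  Here is a minimal sketch using '''check_output''' (available from Python 2.7) and the return value of '''call'''.  To keep the example portable, the command being run is the Python interpreter itself, found via '''sys.executable'''; in a real script you would put your own command, such as '''["ls", "-l"]''', in its place:

```python
import subprocess
import sys

# Run a command and capture whatever it writes to standard output
output = subprocess.check_output(
    [sys.executable, "-c", "print('hello from a child process')"])
text = output.decode("utf-8").strip()   # the output arrives as bytes in Python 3

# call() returns the exit status of the command: 0 conventionally means success
status = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

print(text)
print(status)
```

Passing the command as a list of strings, rather than as one long string, means that we avoid the shell and any surprises from spaces or special characters in the arguments.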
| + | |||
| + | =Python as a Glue Language= | ||
| + | |||
| + | * Calling R from python is possible using: http://rpy.sourceforge.net/index.html. | ||
| + | * Calling Matlab from python: http://mlabwrap.sourceforge.net. | ||
| + | * With SWIG you can make many bindings, including Python to C and C++: http://www.swig.org/. | ||
| + | * Or if Fortran is more your cup-of-tea, you can use f2py: http://cens.ioc.ee/projects/f2py2e/. | ||
| + | * There are many more examples. | ||
| + | |||
| + | =Command Line Parsing= | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | |||
| + | import sys | ||
| + | |||
| + | if __name__ == "__main__": | ||
| + |     # We can test on the length of argv | ||
| + |     if len(sys.argv) < 2: | ||
| + |         print "usage: to use this script..." | ||
| + |     else: | ||
| + |         ii = 0 | ||
| + |         for arg in sys.argv: | ||
| + |             # (typically) argv[0] is bound to the script name | ||
| + |             print "arg", ii, "is:", arg | ||
| + |             ii = ii+1 | ||
| + | </source> | ||
| + | |||
| + | <pre> | ||
| + | gethin@gethin-desktop:~$ ./cmdline.py | ||
| + | usage: to use this script... | ||
| + | gethin@gethin-desktop:~$ ./cmdline.py fred ginger | ||
| + | arg 0 is: ./cmdline.py | ||
| + | arg 1 is: fred | ||
| + | arg 2 is: ginger | ||
| + | </pre> | ||
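For anything beyond a handful of arguments, the standard '''argparse''' module (included from Python 2.7) will do the parsing and write the usage message for us.  The following is a sketch only: the option names and the '''greet''' function are invented for illustration.  The argument list is passed in explicitly here so that the example is self-contained; in a script you would typically pass '''sys.argv[1:]''' instead:

```python
import argparse

def build_parser():
    # Describe the arguments that we are prepared to accept
    parser = argparse.ArgumentParser(description="Greet some people.")
    parser.add_argument("names", nargs="+", help="one or more names")
    parser.add_argument("-s", "--shout", action="store_true",
                        help="print the greetings in upper case")
    return parser

def greet(argv):
    # argv is a list of strings, e.g. sys.argv[1:]
    args = build_parser().parse_args(argv)
    greetings = ["Hello, %s!" % name for name in args.names]
    if args.shout:
        greetings = [g.upper() for g in greetings]
    return greetings

for line in greet(["fred", "ginger", "--shout"]):
    print(line)
```

A '''-h''' option, which prints a summary of the arguments, comes for free.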
| + | |||
| + | =Databases= | ||
| + | |||
| + | ==Simple Databases== | ||
| + | |||
| + | Python provides access to a number of database systems through standard packages.  The '''bsddb''' module allows you to access the highly popular '''Berkeley DB database''' from your python code. | ||
| + | |||
| + | The interface to the database provided by this module is very similar to the way in which we access a dictionary.  First, let's populate a database: | ||
| + | |||
| + | <source lang="python"> | ||
| + | import bsddb | ||
| + | d = bsddb.btopen('engines.db') | ||
| + | d['thomas'] = 'blue' | ||
| + | d['james'] = 'red' | ||
| + | d['henry'] = 'green' | ||
| + | d.close() | ||
| + | </source> | ||
| + | |||
| + | Now let's open the database again and query its contents: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> d = bsddb.btopen('engines.db') | ||
| + | >>> d.keys() | ||
| + | ['henry', 'james', 'thomas'] | ||
| + | >>> d.first() | ||
| + | ('henry', 'green') | ||
| + | >>> d.last() | ||
| + | ('thomas', 'blue') | ||
| + | >>> colour = d['james'] | ||
| + | >>> colour | ||
| + | 'red' | ||
| + | >>> del d['henry'] | ||
| + | >>> d.keys() | ||
| + | ['james', 'thomas'] | ||
| + | </source> | ||
| + | |||
| + | ==Relational Databases== | ||
| + | |||
| + | Relational databases give us more oomph.  '''SQLite''' is a useful relational database to consider as it is light, in that it requires hardly anything in terms of setup or management, yet still understands queries formulated in SQL.  As such it is useful for creating relatively simple examples of SQL access to a database in python and is a stepping stone toward more powerful database packages. | ||
| + | |||
| + | Here is a script which will create a table called '''planets''' in the file '''pytest.db''' and populate it with details of the planets in our solar system: | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | # | ||
| + | # Example python script using sqlite3 package | ||
| + | # to connect to an SQLite database. | ||
| + | # | ||
| + | |||
| + | import sqlite3 | ||
| + | |||
| + | conn = sqlite3.connect('pytest.db') # or use :memory: to put it in RAM | ||
| + | |||
| + | cursor = conn.cursor() | ||
| + | |||
| + | # create a table | ||
| + | cursor.execute("""CREATE TABLE planets | ||
| + |                   (Id INT, Name TEXT, Diameter REAL,  | ||
| + |                    Mass REAL, Orbital_Period REAL)""") | ||
| + | |||
| + | # insert a single record | ||
| + | cursor.execute("INSERT INTO planets VALUES(1,'Mercury',0.382,0.06,0.24)") | ||
| + | conn.commit() # save data to file | ||
| + | |||
| + | # insert multiple records | ||
| + | other_planets = [(2,'Venus',0.949,0.82,0.72), | ||
| + |                  (3,'Earth',1.0,1.0,1.0), | ||
| + |                  (4,'Mars',0.532,0.11,1.52), | ||
| + |                  (5,'Jupiter',11.209,317.8,5.20), | ||
| + |                  (6,'Saturn',9.449,95.2,9.54), | ||
| + |                  (7,'Uranus',4.007,14.6,19.22), | ||
| + |                  (8,'Neptune',3.883,17.2,30.06), | ||
| + |                  (9,'Pluto',0.18,0.002,248.09)] | ||
| + | cursor.executemany("INSERT INTO planets VALUES (?,?,?,?,?)", other_planets) | ||
| + | conn.commit() # save data to file | ||
| + | |||
| + | # delete a record | ||
| + | sql = """ | ||
| + | DELETE FROM planets | ||
| + | WHERE Name = 'Pluto' | ||
| + | """ | ||
| + | cursor.execute(sql)  # poor old pluto!  | ||
| + | conn.commit() | ||
| + | </source> | ||
| + | |||
| + | And here is a short example script showing a couple of ways to interrogate the database:  | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | # | ||
| + | # Example python script using sqlite3 package | ||
| + | # to connect to an SQLite database. | ||
| + | # | ||
| + | |||
| + | import sqlite3 | ||
| + | |||
| + | conn = sqlite3.connect('pytest.db') # or use :memory: to put it in RAM | ||
| + | |||
| + | cursor = conn.cursor() | ||
| + | |||
| + | print "All the records in the table, ordered by Name:\n" | ||
| + | for row in cursor.execute("SELECT rowid, * FROM planets ORDER BY Name"): | ||
| + |     print row | ||
| + | |||
| + | print "\n" | ||
| + | |||
| + | print "All the planets with a mass greater than or equal to that of Earth:\n" | ||
| + | sql = "SELECT * FROM planets WHERE Mass>=?" | ||
| + | cursor.execute(sql, (1.0,))  # parameters are given as a sequence, here a 1-tuple | ||
| + | for row in cursor.fetchall():  # or use fetchone() | ||
| + |     print row | ||
| + | </source> | ||
| + | |||
| + | Where the results of running the script are: | ||
| + | |||
| + | <pre> | ||
| + | All the records in the table, ordered by Name: | ||
| + | |||
| + | (3, 3, u'Earth', 1.0, 1.0, 1.0) | ||
| + | (5, 5, u'Jupiter', 11.209, 317.80000000000001, 5.2000000000000002) | ||
| + | (4, 4, u'Mars', 0.53200000000000003, 0.11, 1.52) | ||
| + | (1, 1, u'Mercury', 0.38200000000000001, 0.059999999999999998, 0.23999999999999999) | ||
| + | (8, 8, u'Neptune', 3.883, 17.199999999999999, 30.059999999999999) | ||
| + | (6, 6, u'Saturn', 9.4489999999999998, 95.200000000000003, 9.5399999999999991) | ||
| + | (7, 7, u'Uranus', 4.0069999999999997, 14.6, 19.219999999999999) | ||
| + | (2, 2, u'Venus', 0.94899999999999995, 0.81999999999999995, 0.71999999999999997) | ||
| + | |||
| + | All the planets with a mass greater than or equal to that of Earth: | ||
| + | |||
| + | (3, u'Earth', 1.0, 1.0, 1.0) | ||
| + | (5, u'Jupiter', 11.209, 317.80000000000001, 5.2000000000000002) | ||
| + | (6, u'Saturn', 9.4489999999999998, 95.200000000000003, 9.5399999999999991) | ||
| + | (7, u'Uranus', 4.0069999999999997, 14.6, 19.219999999999999) | ||
| + | (8, u'Neptune', 3.883, 17.199999999999999, 30.059999999999999) | ||
| + | </pre> | ||
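If you would like to experiment without leaving a file behind, the '''<nowiki>:memory:</nowiki>''' option mentioned in the comments above keeps the whole database in RAM.  Here is a cut-down sketch (a three column table and just three planets, chosen to keep the example short) which also shows the '''?''' placeholder being filled in from a tuple:

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # nothing is written to disk
cursor = conn.cursor()

cursor.execute("CREATE TABLE planets (Id INT, Name TEXT, Mass REAL)")
rows = [(3, 'Earth', 1.0),
        (4, 'Mars', 0.11),
        (5, 'Jupiter', 317.8)]
cursor.executemany("INSERT INTO planets VALUES (?,?,?)", rows)
conn.commit()

# The value for the ? placeholder is supplied separately, as a 1-tuple
cursor.execute("SELECT Name FROM planets WHERE Mass >= ? ORDER BY Mass", (1.0,))
heavy = [row[0] for row in cursor.fetchall()]
conn.close()

print(heavy)
```

Using placeholders, rather than pasting values into the SQL string yourself, leaves the quoting to the database package and is the safer habit to get into.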
| + | |||
| + | For more information on using SQLite with Python, see, e.g.: | ||
| + | * http://zetcode.com/db/sqlitepythontutorial/ | ||
| + | * http://www.blog.pythonlibrary.org/2012/07/18/python-a-simple-step-by-step-sqlite-tutorial/ | ||
| + | |||
| + | You can also connect to a MySQL database from python using, e.g. the [http://mysql-python.sourceforge.net/ python-mysqldb] package.  A snippet of python code for connecting to a database is: | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | import MySQLdb | ||
| + | |||
| + | conn = MySQLdb.connect(host="localhost",  # your host, usually localhost | ||
| + |                        user="gethin",     # your username | ||
| + |                        passwd="changeme", # your password | ||
| + |                        db="menagerie")    # name of the database | ||
| + | |||
| + | # Create a cursor object, as before with SQLite | ||
| + | cur = conn.cursor()  | ||
| + | |||
| + | # and then you can submit your SQL command: | ||
| + | cur.execute("SELECT * FROM YOUR_TABLE_NAME") | ||
| + | </source> | ||
| + | |||
| + | =Numpy= | ||
| + | |||
| + | OK, let's move on to looking at python's numerical processing capabilities.  We will start by looking at the '''numpy''' package: | ||
| + | |||
| + | <source lang="python"> | ||
| + | from numpy import * | ||
| + | </source> | ||
| + | |||
| + | Now that we have access to the functions from '''numpy''', let's create an array.  '''Note that a numpy array is an object of a different type to Python's built-in list'''.   A simple approach is to use the '''array''' function.  For example we might enter: | ||
| + | |||
| + | <source lang="python"> | ||
| + | a = array([[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]]) | ||
| + | b = array([[1,2,3],[4,5,6],[7,8,9]]) | ||
| + | </source> | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> a | ||
| + | array([[ 1.,  0.,  0.], | ||
| + |        [ 0.,  1.,  0.], | ||
| + |        [ 0.,  0.,  1.]]) | ||
| + | >>> b         | ||
| + | array([[1, 2, 3], | ||
| + |        [4, 5, 6], | ||
| + |        [7, 8, 9]]) | ||
| + | >>> transpose(b) | ||
| + | array([[1, 4, 7], | ||
| + |        [2, 5, 8], | ||
| + |        [3, 6, 9]]) | ||
| + | </source> | ||
| + | |||
| + | Given an array, we may inquire about its shape: | ||
| + | |||
| + | <source lang="python"> | ||
| + | print a.shape | ||
| + | </source> | ||
| + | |||
| + | and we are told that it is a 2-dimensional array (i.e. an array of rank 2) and that the length of both dimensions is 3: | ||
| + | |||
| + | <source lang="python"> | ||
| + | (3, 3) | ||
| + | </source>  | ||
| + | |||
| + | We can also apply operators to array objects.  For example: | ||
| + | |||
| + | <source lang="python"> | ||
| + | a = a * 9 | ||
| + | </source> | ||
| + | |||
| + | <source lang="python"> | ||
| + | array([[ 9.,  0.,  0.], | ||
| + |        [ 0.,  9.,  0.], | ||
| + |        [ 0.,  0.,  9.]]) | ||
| + | </source> | ||
| + | |||
| + | '''Note, however, that most operations on numpy arrays are done element-wise''', which is '''different to a linear algebra operation that you may have been expecting.'''  We will return to linear algebra operations when we look at the '''scipy''' package. | ||
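To make the distinction concrete, here is a small sketch comparing the element-wise product with the matrix product given by numpy's '''dot''' function.  The two small arrays are made up for illustration, and numpy is imported under the conventional short name '''np''' rather than with '''import *''':

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[0.0, 1.0],
              [1.0, 0.0]])

elementwise = a * b              # multiplies matching elements: a[i,j] * b[i,j]
matrix_product = np.dot(a, b)    # rows of a times columns of b

print(elementwise)
print(matrix_product)
```

Here '''b''' swaps columns when used in a matrix product, whereas the element-wise product simply zeroes the diagonal of '''a''', so the two results are easy to tell apart.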
| + | |||
| + | Should we so desire, we could re-shape the array.  One way to do this is to set its shape attribute directly: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> a.shape = (1,9) | ||
| + | >>> a | ||
| + | array([[ 9.,  0.,  0.,  0.,  9.,  0.,  0.,  0.,  9.]]) | ||
| + | </source> | ||
| + | |||
| + | As with the list example, it can be useful to read or change the value of an element (or sub array) individually.  Let's turn the array back to its rank-2 form and try it out: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> a.shape = (3,3) | ||
| + | >>> a[1,1] = 777.0 | ||
| + | >>> print a | ||
| + | [[   9.    0.    0.] | ||
| + |  [   0.  777.    0.] | ||
| + |  [   0.    0.    9.]] | ||
| + | >>> a[1:,1:] = [[777.0, 777.0],[777.0, 777.0]] | ||
| + | >>> print a | ||
| + | [[   9.    0.    0.] | ||
| + |  [   0.  777.  777.] | ||
| + |  [   0.  777.  777.]] | ||
| + | </source> | ||
| + | |||
| + | This is all pretty handy so far, but specifying the value of each element explicitly could become a chore.  Happily some helper functions exist to give you a head start with some building blocks.  For example, you can use: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> b = zeros((3,3)) | ||
| + | >>> print b | ||
| + | >>> b = ones((3,2)) | ||
| + | >>> print b | ||
| + | >>> b = identity(2) | ||
| + | >>> print b | ||
| + | >>> big = resize(b, (6,6)) | ||
| + | >>> print big | ||
| + | </source> | ||
| + | |||
| + | The use of '''resize''' in the last example illustrates a useful '''replicating feature'''. | ||
| + | |||
| + | A list of all the functions and operations contained within numpy is: http://scipy.org/Numpy_Example_List. | ||
| + | |||
| + | =Pylab and Matplotlib= | ||
| + | |||
| + | The above examples are quite natty, but we have deliberately kept the array sizes small so that we can print the element values easily.  In practice, you may find that your array sizes are much larger and printing the values to the screen is impractical.  Fear not!  Python has many packages which help you plot your data, so that you can explore it. | ||
| + | |||
| + | Using the pylab plotting interface we can create: | ||
| + | |||
| + | <source lang="python"> | ||
| + | import pylab | ||
| + | from numpy import arange, pi, cos, sin, add, sqrt | ||
| + | t = arange(0.0, 3.0, 0.01) | ||
| + | c = cos(2 * pi * t) | ||
| + | s = sin(2 * pi * t) | ||
| + | pylab.ylabel('some numbers') | ||
| + | pylab.xlabel('some more numbers') | ||
| + | pylab.plot(t, c, 'r', lw=2) | ||
| + | pylab.plot(t, s, 'b', lw=2) | ||
| + | pylab.plot(t, c-s, 'gs', lw=2) | ||
| + | pylab.ylim(-1.5, 1.5) | ||
| + | pylab.title('sin and cos functions') | ||
| + | pylab.savefig('curves', dpi=300) | ||
| + | </source> | ||
| + | |||
| + | Where '''curves.png''' looks like: | ||
| + | |||
| + | [[Image:Curves.png|thumb|600px|none|Some nice curves]] | ||
| + | |||
| + | You can open .png images from the linux command line (inc. bluecrystal) using, e.g.: '''display -resize 1000 curves.png'''  | ||
| + | |||
| + | We can also use Matplotlib directly for more control: | ||
| + | |||
| + | <source lang="python"> | ||
| + | import matplotlib.pyplot as plt | ||
| + | from pylab import meshgrid | ||
| + | from numpy import arange, add, sin, sqrt | ||
| + | x = arange(-5,10) | ||
| + | y = arange(-4,11) | ||
| + | z1 = sqrt(add.outer(x**2,y**2)) | ||
| + | Z = sin(z1)/z1  | ||
| + | X, Y = meshgrid(x,y) | ||
| + | plt.figure() | ||
| + | plt.contour(X,Y,Z) | ||
| + | plt.show() | ||
| + | </source> | ||
| + | |||
| + | and you should get a window similar to: | ||
| + | |||
| + | [[Image:Sinc-matplotlib-contour.png|thumb|600px|none|A contour map of the sinc function]] | ||
| + | |||
| + | Perhaps the best next step for matplotlib is to look at the gallery: http://matplotlib.org/gallery.html. | ||
| + | Just click on a figure and you will get the code used to generate it--a really great resource! | ||
| + | |||
| + | ==Input and Output== | ||
| + | |||
| + | The foregoing is all very interesting, but life would be rather dull if you had to re-enter all your data by hand whenever you set to work with Python and numpy.  Therefore we need a means to save data to a file and load it again.  Happily, we can do this rather easily using a couple of routines from the '''pylab''' package: | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> from numpy import * | ||
| + | >>> from pylab import load | ||
| + | >>> from pylab import save | ||
| + | >>> data = zeros((3,3)) | ||
| + | >>> save('myfile.txt', data) | ||
| + | >>> read_data = load("myfile.txt") | ||
| + | </source> | ||
| + | |||
| + | '''Warning: the load() function of numpy will be shadowed''' in the above example.  One way to protect yourself against this is to make use of '''namespaces''':  modify your import command to '''import pylab''' and then use '''pylab.load(..)'''. | ||
| + | |||
| + | =Scipy= | ||
| + | |||
| + | * http://www.scipy.org/ | ||
| + | * ..and good examples on http://scipy-lectures.github.com/intro/scipy.html | ||
| + | * Many useful features: | ||
| + | * Integration & Differentiation | ||
| + | * Optimisation (curve fitting, etc) | ||
| + | * Fourier transforms | ||
| + | * Signal processing | ||
| + | * Statistical algorithms | ||
| + | * Much, much more... | ||
| + | * If you know Python you can use SciPy | ||
| + | |||
| + | ==An example: Differentiation== | ||
| + | |||
| + | <source lang="python"> | ||
| + | >>> # derivative of x^2 at x=3 | ||
| + | ... | ||
| + | >>> from scipy import derivative | ||
| + | >>> derivative(lambda x: x**2, 3) | ||
| + | 6.0 | ||
| + | >>> # also works with arrays | ||
| + | ... | ||
| + | >>> from numpy import array | ||
| + | >>> my_array = array([1,2,3]) | ||
| + | >>> derivative(lambda x: x**2,my_array) | ||
| + | array([ 2., 4., 6.]) | ||
| + | </source> | ||
| + | |||
| + | Google for many more examples pertaining to your favourite numerical procedure! | ||
| + | |||
| + | =A Repository of Packages You Could Use= | ||
| + | |||
| + | Now, we've touched on a couple, but there are thousands of python packages available.  Before you start writing your own function for X, check that someone hasn't contributed code for that already at http://pypi.python.org/pypi. | ||
| + | |||
| + | '''pip''', the Python package manager, will look in PyPI by default when installing a package.  You can use the '''--user''' option to install Python packages in your own user space.  See: | ||
| + | * https://pip.readthedocs.org/en/latest/ | ||
| + | for more information on pip. | ||
| + | |||
| + | =Writing Faster Python= | ||
| + | |||
| + | As with other scripting languages, such as MATLAB and R, one of the simplest ways in which you can write faster python code is to eliminate loops by vectorising your code. | ||
| + | |||
| + | Consider the following two scripts.  First '''for-loop.py''': | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | |||
| + | import numpy as np | ||
| + | arr = np.random.rand(1000000) | ||
| + | |||
| + | def filter(arr): | ||
| + |     for i, val in enumerate(arr): | ||
| + |         if val < 0.5: | ||
| + |             arr[i] = 0 | ||
| + |     return arr | ||
| + | |||
| + | if __name__ == "__main__": | ||
| + |     filter(arr) | ||
| + | </source> | ||
| + | |||
| + | and secondly, '''vectorised.py''': | ||
| + | |||
| + | <source lang="python"> | ||
| + | #!/usr/bin/env python | ||
| + | |||
| + | import numpy as np | ||
| + | arr = np.random.rand(1000000) | ||
| + | |||
| + | def filter(arr): | ||
| + |     arr[arr < 0.5] = 0 | ||
| + |     return arr | ||
| + | |||
| + | if __name__ == "__main__": | ||
| + |     filter(arr) | ||
| + | </source> | ||
| + | |||
| + | If we now run these two scripts through the Linux command line '''time''' utility, we see that the vectorised code runs a lot faster than the for loop: | ||
| + | |||
| + | <pre> | ||
| + | gethin@gethin-desktop:~$ time ./for-loop.py  | ||
| + | |||
| + | real	0m0.963s | ||
| + | user	0m0.952s | ||
| + | sys	0m0.012s | ||
| + | gethin@gethin-desktop:~$ time ./vectorised.py  | ||
| + | |||
| + | real	0m0.116s | ||
| + | user	0m0.096s | ||
| + | sys	0m0.020s | ||
| + | </pre> | ||
| + | |||
| + | For some more tips on writing faster python code, and examples of how to use one of the python profiler modules, take a look at: | ||
| + | * https://wiki.python.org/moin/PythonSpeed/PerformanceTips | ||
| + | * http://technicaldiscovery.blogspot.co.uk/2011/06/speeding-up-python-numpy-cython-and.html | ||
| + | * http://www.huyng.com/posts/python-performance-analysis/ | ||
| + | * http://www.appneta.com/2012/05/21/profiling-python-performance-lineprof-statprof-cprofile/ | ||
| + | |||
| + | =Further Reading= | ||
| + | |||
| + | * http://docs.python.org/tutorial/ | ||
| + | * http://wiki.python.org/moin/PythonBooks | ||
Latest revision as of 15:21, 10 October 2014
Python for Scientists
Introduction
With thanks to Simon Metson and Mike Wallace for much of the following material.
Getting Started on BlueCrystal Phase-2
After you have logged in, type the following at the command line:
module add languages/python-2.7.2.0
python
This should start up an interactive python session:
Python 2.7.2 (default, Aug 25 2011, 10:51:03)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
where we can type commands at the >>> prompt.
Python as a Calculator
To get started, let's just try a few commands out. If you type:
>>> print "Hello!"
you'll get:
Hello!
If you try:
>>> print 5 + 9
you'll get:
14
So far so simple! Here is a copy of a session containing a few more commands where we've set the values of some variables and also defined and run our own function:
>>> five = 5
>>> neuf = 9
>>> print five + neuf
14
>>> def say_hello():
...     print "Hello, world!"
... # hit return here 
>>> say_hello()
Hello, world!
You can exit an interactive session at any time by typing Ctrl-D.
Getting Help
One of the good things about Python is that it has lots of useful online documentation. (There are good books on the language too.) For example, take a look at: http://docs.python.org/. You can also type help() at the interpreter prompt:
>>> help()
Welcome to Python 2.7!  This is the online help utility.
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".
...
help> keywords
Here is a list of the Python keywords.  Enter any keyword to get more help.
and                 elif                if                  print
...
help> if
The ``if`` statement
********************
The ``if`` statement is used for conditional execution:
   if_stmt ::= "if" expression ":" suite
               ( "elif" expression ":" suite )*
               ["else" ":" suite]
It selects exactly one of the suites by evaluating the expressions one
by one until one is found to be true...
...
help> quit
You are now leaving help and returning to the Python interpreter.
...
>>> 
Making a Script
An interactive session can be fun and useful for trying things out. However--to save our fingers--we will typically want to execute a series of commands as a script, created using your favourite text editor. Here are the contents of an example script:
#!/usr/bin/env python
print "Hello, from a python script!"
Ensure that your script is executable:
chmod u+x myscript.py
and now you can run it:
[ggdagw@bigblue4 ~]$ ./myscript.py
Hello, from a python script!
Python and Whitespace
Love it or hate it, Python incorporates whitespace into its syntax. (It's either that or demarcate blocks with some other syntax, such as the curly braces used in C. Pick your poison.) Spacing is therefore key in creating a valid python script. For example:
message = "happy days!"
if len(message) > 10:
    print "longer.."
else:
    print "shorter.."
will work, but:
message = "happy days!"
if len(message) > 10:
 print "longer.."
else:
print "shorter.."
will not:
  File "./myscript.py", line 7
    print "shorter.."
        ^
IndentationError: expected an indented block
It is therefore a great advantage, when writing a python script, to use a text editor which has a dedicated python mode--such as emacs--and will actively help you to keep your spacing correct. See http://wiki.python.org/moin/PythonEditors for an extensive list.
Some Suggested Exercises
- Calculate the volume of a sphere.  You can experiment with the following (where r needs to be set to some value):
- 4/3 * 3.14159265359 * r ** 3 
- 4.0/3.0 * 3.14159265359 * pow(r,3) 
- float(4)/float(3) * 3.14159265359 * pow(r,3) 
 
- Concatenate two strings
- Write a recursive function to compute fibonacci numbers (Hint: F(n) = F(n-1) +F(n-2), F(0)=0 and F(1)=1)
Nuts and Bolts
Types
Python has intrinsic types including integers, floats, booleans and complex numbers. It is dynamically typed (meaning that you don't have to have a block of variable declarations at the top of your script), but it is not weakly typed. For example:
>>> my_complex = 2 + 0.5j
>>> my_complex
(2+0.5j)
>>> my_complex.real
2.0
>>> my_complex.imag
0.5
>>> name = 'fred'
>>> lucky = 7
>>> name + lucky
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
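The error above goes away once the conversion is made explicit. A minimal sketch, reusing the names from the session above; print() is written in its function form here on the assumption that it may also be run under Python 3:

```python
name = 'fred'
lucky = 7

# Python will not silently turn the integer into a string,
# so we convert it ourselves before concatenating.
label = name + str(lucky)
print(label)

# Going the other way, int() turns a string of digits into an integer.
total = int('35') + lucky
print(total)
```

The same idea applies to float() and complex(): the conversion is always something you ask for, never something the interpreter guesses.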
Strings
The eagle-eyed will have spotted in a previous example that we could ask for the length of a character string--straight off the bat. No need to write a counting routine ourselves:
message = "happy days!"
print len(message)
11
We can also take slices of our character string. In this case:
print message[:5]
happy
Since a string is an object (in the object oriented programming sense of the word, but more of that another time...) we can call a number of methods that operate on a string. A selected sample includes:
| s.find(sub) | Finds the first occurrence of the given substring | 
| s.islower() | Checks whether all characters are lowercase | 
| s.upper() | Returns s converted to uppercase | 
| s.strip() | Removes leading and trailing whitespace | 
| s.replace(old,new) | Replaces substring old with new | 
| s.split([sep]) | Splits s using (optional) sep as a delimiter. Returns a list | 
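To see the methods in the table at work, here is a short sketch. The example string and variable names are made up for illustration:

```python
s = "  Happy Days, happy days!  "

trimmed = s.strip()                    # remove leading and trailing whitespace
position = trimmed.find("happy")       # index of the first match, or -1 if absent
all_lower = trimmed.islower()          # False here, because of the capitals
shouted = trimmed.upper()              # a new string; trimmed is unchanged
swapped = trimmed.replace("Days", "Nights")
words = trimmed.split()                # no argument: split on whitespace
fields = "a,b,c".split(",")            # split on an explicit delimiter

print(trimmed)
print(position)
print(words)
```

Note that none of these methods changes the original string: strings are immutable, so each call hands back a new object.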
Lists and Tuples
An example of a list is:
shopping = ['bread', 'marmalade', 'milk', 'tea']
and we can inquire about the length of that using the same function as before:
len(shopping)
We can also take slices of a list, as we did with a string:
shopping[0:2]
and even reset a portion of the list that way:
shopping[0:2] = ['bagels', 'jam']
Since a list is also an object, we have more handy methods, including:
| s.append(x) | Appends a new element x to the end of s | 
| s.count(x) | Returns the number of occurrences of x in s | 
| s.reverse() | Reverses items of s in place | 
| s.sort([compfunc]) | Sorts items of s in place. compfunc is an optional comparison function | 
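A quick sketch of the list methods in the table, using the shopping list from above (the extra variable names are only for illustration):

```python
shopping = ['bread', 'marmalade', 'milk', 'tea']

shopping.append('milk')            # add an element to the end
n_milk = shopping.count('milk')    # how many times 'milk' appears
shopping.sort()                    # sorts in place and returns None
in_order = list(shopping)          # take a copy before reversing
shopping.reverse()                 # reverses in place, takes no arguments

print(n_milk)
print(in_order)
print(shopping)
```

Because sort() and reverse() work in place, writing shopping = shopping.sort() would leave you with None rather than a sorted list.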
Tuples are very similar to lists and support many of the same operations (indexing, slicing, concatenation etc.) but differ in that they are not mutable after creation:
>>> mytuple = ('fred', 'ginger', 7, 2.5)
>>> mylist = ['fred', 'ginger', 7, 2.5]
>>> mylist[2] = 8
>>> print mylist
['fred', 'ginger', 8, 2.5]
>>> print mytuple[2]    
7
>>> mytuple[2] = 8
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
List comprehension:
>>> numbers = [12, 3, 90, 40, 52, 11, 10]
>>> small_numbers_doubled = [number * 2 for number in numbers if number < 20]
>>> small_numbers_doubled
[24, 6, 22, 20]
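If the list comprehension looks terse, it may help to see it unrolled. The following sketch builds the same list with an explicit loop:

```python
numbers = [12, 3, 90, 40, 52, 11, 10]

# Equivalent to: [number * 2 for number in numbers if number < 20]
small_numbers_doubled = []
for number in numbers:
    if number < 20:
        small_numbers_doubled.append(number * 2)

print(small_numbers_doubled)
```

The comprehension reads in the same order as the loop: the expression first, then the for, then the optional if.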
Dictionaries
A dictionary is an associative array or hash table, containing key-value pairs:
>>> mydict = {'thomas':'blue', 'james':'red', 'henry':'green'}
>>> print mydict['james']
red
We can write much more user-friendly and intuitive code using dictionaries, rather than arbitrary indexes into a list.
Some example dictionary methods are:
| m.keys() | Returns a list of the keys in m | 
| m.items() | Returns a list of the (key,value) pairs in m | 
| m[k] = x | Sets m[k] to x | 
| m.update(b) | Adds objects from dictionary b to m | 
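Here is a short sketch of those dictionary methods in use. The extra engines are made up for illustration, and get() is an additional dictionary method that is not listed in the table:

```python
mydict = {'thomas': 'blue', 'james': 'red', 'henry': 'green'}

mydict['percy'] = 'green'          # m[k] = x adds a new entry or overwrites one
mydict.update({'gordon': 'blue', 'james': 'scarlet'})   # merge in another dict

# sorted() is used because dictionaries make no promise about key order here.
names = sorted(mydict.keys())
for name, colour in sorted(mydict.items()):
    print(name + " is " + colour)

# get() returns a fallback instead of raising KeyError for a missing key.
colour = mydict.get('toby', 'unknown')
print(colour)
```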
Control Structures
Of course, we'll need conditionals and loops etc. to go beyond the simplest of scripts. Here is an if-then-else, python style:
if sky == 'blue':
    birds_sing()
elif sky == 'black':
    birds_sleep()
else:
    pass #do nothing
and a classic for loop:
for ii in range(1,10):
    print ii
1
...
9
>>>
We'll also see a while loop shoehorned into the next example.
For our control statements, we can use comparison operators such as ==, !=, >, <, <=, >=, and logical operators such as and, or, not.
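As a small illustration of combining these operators, here is a sketch of a while loop (the variable names and the numbers are arbitrary):

```python
count = 0
total = 0

# Keep looping while BOTH conditions hold.
while count < 10 and total <= 20:
    count = count + 1
    # Add the even numbers, and also the number 7.
    if count % 2 == 0 or count == 7:
        total = total + count

# 'not' inverts a condition.
finished_early = not (count == 10)

print(count)
print(total)
print(finished_early)
```

Tracing it by hand is a useful exercise: the loop stops as soon as the running total passes 20, before count reaches 10.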
File Input and Output
Here's some code for printing the contents of a text file:
fp = open("foo.txt","r")
line = fp.readline()
while line:
    line = line.strip()
    print line
    line = fp.readline()
fp.close()
We could open a file for writing with:
fp = open("foo.txt","w")
and use:
fp.write(...)
to write to that file.
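Putting the two halves together, the sketch below writes a file and reads it back. It uses the with statement, which closes the file for you even if an error occurs, and a temporary directory (an assumption made here so that the example does not overwrite a real foo.txt):

```python
import os
import tempfile

# A throwaway location for the example file.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# Mode "w" creates the file, or truncates it if it already exists.
with open(path, "w") as fp:
    fp.write("first line\n")
    fp.write("second line\n")

# A file object can be looped over line by line.
lines = []
with open(path, "r") as fp:
    for line in fp:
        lines.append(line.strip())

print(lines)
```

Note that write() does not add a newline for you, which is why each string ends in "\n".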
Object Oriented Programming in Python
Here is an example of using a class in python:
#!/usr/bin/env python
class Radio:
    "A simple radio"
    def __init__(self,freq=0.0,name=""):
        "Constructor method"
        self.__frequency=freq
        self.name=name
    def tune(self,freq):
        self.__frequency=freq
    def tuned_to(self):
        print self.name, "tuned to:", self.__frequency
if __name__ == "__main__":
    # declare two radio instances
    car = Radio(name="car")
    kitchen = Radio(91.5,"kitchen")
    # call some methods
    car.tuned_to()
    kitchen.tuned_to()
    car.tune(89.3)
    car.tuned_to()
    # Docstrings--double quotes at the top of the class:                        
    print car.__doc__
    # NB members not private by default:
    print car.name
    # BUT leading double underscores will trigger
    # name mangling and hence the member will be hidden 
    print car.__frequency
Running the script gives us:
car tuned to: 0.0
kitchen tuned to: 91.5
car tuned to: 89.3
A simple radio
car
Traceback (most recent call last):
  File "./foo.py", line 27, in <module>
    print car.__frequency
AttributeError: Radio instance has no attribute '__frequency'
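The last error is worth a closer look. Name mangling hides the member rather than protecting it. The cut-down sketch below (a shortened version of the Radio class, written as a new-style class on the assumption that it may be run under Python 3 as well) shows where the value actually lives:

```python
class Radio(object):
    "A simple radio"
    def __init__(self, freq=0.0, name=""):
        self.__frequency = freq    # stored as _Radio__frequency
        self.name = name

kitchen = Radio(91.5, "kitchen")

# From outside the class the plain name is not found...
hidden = not hasattr(kitchen, "__frequency")
# ...but the mangled name is still reachable if you insist.
mangled_value = kitchen._Radio__frequency

print(hidden)
print(mangled_value)
```

In other words, the double underscore is a way to avoid accidental clashes, not a privacy guarantee.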
Using Packages
Python packages are great because they provide us with a whole lot of extra functionality--above and beyond the core language--that we didn't have to write and debug ourselves.
Let's walk through a simple example using a package. At an interactive prompt type:
from random import randint
This will give us access to the randint(x,y) function, which returns a randomly chosen integer from the given range [x,y]:
>>> randint(0,10)
4
>>> randint(0,10)
1
>>> randint(0,10)
3
>>> randint(0,10)
0
OK, so far so good. One thing to note is that the above import statement has drawn the name randint into our current namespace. What if we had already defined a function named randint? That could cause problems. In order to protect ourselves from this kind of problem, there are several import variants.
By default, functions will be added to a namespace with the same name as the package. In order to call the functions we will, in this case, have to prefix them with their namespace:
>>> import random
>>> random.randint(0,10)
6
Should we desire, we can apply a little more control and specify the namespace for the import ourselves:
>>> import random as rnd
>>> rnd.randint(0,10)
3
Another--more 'devil-may-care'--approach is to do away with the separate namespace and pull everything from a given package into the current namespace:
>>> from random import *
>>> randint(0,10)
9
>>> random()
0.3172268098313996
(The random() function returns a randomly selected floating point number in the range [0, 1)--that is, between 0 and 1, including 0.0 but always smaller than 1.0.)
Interrogating a Module
To find all the functions that are in a particular module, type dir(<modulename>).
If you have the pip package installed, you can easily see which other packages are installed using pip list on the linux command line.
A Namespace Collision
>>> def randint():
...     print "dummy function"
... 
>>> randint()
dummy function
>>> from random import randint
>>> randint()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: randint() takes exactly 3 arguments (1 given)
>>> randint(0,10)
0
Python for Shell Scripting
from subprocess import call
call(["ls", "-l"])
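Often you will want the output of a command, or its exit status, rather than just having it printed. A minimal sketch using check_output and call from the same module; the child process here is simply another Python interpreter (an assumption made so that the example does not depend on ls being available):

```python
import subprocess
import sys

# Run a child process and capture what it prints.
output = subprocess.check_output([sys.executable, "-c", "print(6 * 7)"])
answer = int(output.decode().strip())

# call() returns the exit status; 0 conventionally means success.
status = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

print(answer)
print(status)
```

Passing the command as a list, as here, avoids any surprises from the shell interpreting spaces or special characters in the arguments.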
Python as a Glue Language
- Calling R from python is possible using: http://rpy.sourceforge.net/index.html.
- Calling Matlab from python: http://mlabwrap.sourceforge.net.
- With SWIG you can make many bindings, including Python to C and C++: http://www.swig.org/.
- Or if Fortran is more your cup-of-tea, you can use f2py: http://cens.ioc.ee/projects/f2py2e/.
- There are many more examples.
Command Line Parsing
#!/usr/bin/env python
import sys
if __name__ == "__main__":
    # We can test on the length of argv
    if len(sys.argv) < 2:
        print "usage: to use this script..."
    else:
        ii = 0
        for arg in sys.argv:
            # (typically) argv[0] is bound to the script name
            print "arg", ii, "is:", arg
            ii = ii+1
gethin@gethin-desktop:~$ ./cmdline.py
usage: to use this script...
gethin@gethin-desktop:~$ ./cmdline.py fred ginger
arg 0 is: ./cmdline.py
arg 1 is: fred
arg 2 is: ginger
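For anything beyond a couple of arguments, the standard argparse module (available from Python 2.7) can save some effort, as it builds the usage message for you. A small sketch; the option names and the parse_args helper are made up for illustration, and the argument list is passed in by hand so that the example runs without a real command line:

```python
import argparse

def parse_args(argv):
    "Build the parser and apply it to a list of argument strings."
    parser = argparse.ArgumentParser(description="Greet some people.")
    parser.add_argument("names", nargs="+", help="one or more names")
    parser.add_argument("--shout", action="store_true", help="use upper case")
    return parser.parse_args(argv)

# In a real script you would call parse_args(sys.argv[1:]) instead.
args = parse_args(["fred", "ginger", "--shout"])

greetings = []
for name in args.names:
    text = "Hello, " + name + "!"
    if args.shout:
        text = text.upper()
    greetings.append(text)

print(greetings)
```

Running such a script with -h prints an automatically generated help message.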
Databases
Simple Databases
Python provides access to some database packages through some standard packages. The bsddb module allows you to access the highly popular Berkeley DB database from your python code.
The interface to the database provided by this module is very similar to the way in which we access a dictionary. First, let's populate a database:
import bsddb
d = bsddb.btopen('engines.db')
d['thomas'] = 'blue'
d['james'] = 'red'
d['henry'] = 'green'
d.close()
Now let's open the database again and query its contents:
>>> d = bsddb.btopen('engines.db')
>>> d.keys()
['henry', 'james', 'thomas']
>>> d.first()
('henry', 'green')
>>> d.last()
('thomas', 'blue')
>>> colour = d['james']
>>> colour
'red'
>>> del d['henry']
>>> d.keys()
['james', 'thomas']
Relational Databases
Relational databases give us more oomph. SQLite is a useful relational database to consider as it is light, in that it requires hardly anything in terms of setup or management, yet still understands queries formulated in SQL. As such it is useful for creating relatively simple examples of SQL access to a database in python and is a stepping stone toward more powerful database packages.
Here is a script which will create a table called planets in the file pytest.db and populate it with details of the planets in our solar system:
#!/usr/bin/env python
#
# Example python script using sqlite3 package
# to connect to an SQLite database.
#
import sqlite3
 
conn = sqlite3.connect('pytest.db') # or use :memory: to put it in RAM
cursor = conn.cursor()
 
# create a table
cursor.execute("""CREATE TABLE planets
                  (Id INT, Name TEXT, Diameter REAL, 
                   Mass REAL, Orbital_Period REAL)""")
# insert a single record
cursor.execute("INSERT INTO planets VALUES(1,'Mercury',0.382,0.06,0.24)")
conn.commit() # save data to file
 
# insert multiple records
other_planets = [(2,'Venus',0.949,0.82,0.72),
                 (3,'Earth',1.0,1.0,1.0),
                 (4,'Mars',0.532,0.11,1.52),
                 (5,'Jupiter',11.209,317.8,5.20),
                 (6,'Saturn',9.449,95.2,9.54),
                 (7,'Uranus',4.007,14.6,19.22),
                 (8,'Neptune',3.883,17.2,30.06),
                 (9,'Pluto',0.18,0.002,248.09)]
cursor.executemany("INSERT INTO planets VALUES (?,?,?,?,?)", other_planets)
conn.commit() # save data to file
# delete a record
sql = """
DELETE FROM planets
WHERE Name = 'Pluto'
"""
cursor.execute(sql)  # poor old pluto! 
conn.commit()
And here is a short example script showing a couple of ways to interrogate the database:
#!/usr/bin/env python
#
# Example python script using sqlite3 package
# to connect to an SQLite database.
#
import sqlite3
 
conn = sqlite3.connect('pytest.db') # or use :memory: to put it in RAM
cursor = conn.cursor()
print "All the records in the table, ordered by Name:\n"
for row in cursor.execute("SELECT rowid, * FROM planets ORDER BY Name"):
    print row
print "\n"
print "All the planets with a mass greater than or equal to that of Earth:\n"
sql = "SELECT * FROM planets WHERE Mass>=?"
cursor.execute(sql, (1.0,))
for row in cursor.fetchall():  # or use fetchone()
    print row
Where the results of running the script are:
All the records in the table, ordered by Name:
(3, 3, u'Earth', 1.0, 1.0, 1.0)
(5, 5, u'Jupiter', 11.209, 317.80000000000001, 5.2000000000000002)
(4, 4, u'Mars', 0.53200000000000003, 0.11, 1.52)
(1, 1, u'Mercury', 0.38200000000000001, 0.059999999999999998, 0.23999999999999999)
(8, 8, u'Neptune', 3.883, 17.199999999999999, 30.059999999999999)
(6, 6, u'Saturn', 9.4489999999999998, 95.200000000000003, 9.5399999999999991)
(7, 7, u'Uranus', 4.0069999999999997, 14.6, 19.219999999999999)
(2, 2, u'Venus', 0.94899999999999995, 0.81999999999999995, 0.71999999999999997)
All the planets with a mass greater than or equal to that of Earth:
(3, u'Earth', 1.0, 1.0, 1.0),
(5, u'Jupiter', 11.209, 317.80000000000001, 5.2000000000000002),
(6, u'Saturn', 9.4489999999999998, 95.200000000000003, 9.5399999999999991),
(7, u'Uranus', 4.0069999999999997, 14.6, 19.219999999999999),
(8, u'Neptune', 3.883, 17.199999999999999, 30.059999999999999)
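If you would like to experiment without creating a file, the following sketch does a cut-down version of the same thing in an in-memory database. The three-column table is a simplification made up for this example; the ? placeholders are filled in from a tuple, which is the safe way to pass values into a query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # the database lives in RAM only
cursor = conn.cursor()
cursor.execute("CREATE TABLE planets (Id INT, Name TEXT, Mass REAL)")
cursor.executemany("INSERT INTO planets VALUES (?,?,?)",
                   [(1, 'Mercury', 0.06),
                    (3, 'Earth', 1.0),
                    (5, 'Jupiter', 317.8)])
conn.commit()

# Parameterised query: the value is supplied separately from the SQL text.
cursor.execute("SELECT Name FROM planets WHERE Mass >= ? ORDER BY Mass", (1.0,))
heavy = [row[0] for row in cursor.fetchall()]

# Aggregate functions return a single row, so fetchone() is enough.
cursor.execute("SELECT COUNT(*), MAX(Mass) FROM planets")
count, heaviest = cursor.fetchone()
conn.close()

print(heavy)
print(count)
print(heaviest)
```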
For more information on using SQLite with Python, see, e.g.:
- http://zetcode.com/db/sqlitepythontutorial/
- http://www.blog.pythonlibrary.org/2012/07/18/python-a-simple-step-by-step-sqlite-tutorial/
You can also connect to a MySQL database from python using, e.g. the python-mysqldb package. A snippet of python code for connecting to a database is:
#!/usr/bin/env python
import MySQLdb
conn = MySQLdb.connect(host="localhost",   # your host, usually localhost
                       user="gethin",      # your username
                       passwd="changeme",  # your password
                       db="menagerie")     # name of the database
# Create a cursor object, as before with SQLite
cur = conn.cursor() 
# and then you can submit your SQL command:
cur.execute("SELECT * FROM YOUR_TABLE_NAME")
Numpy
OK, let's move onto looking at python's numerical processing capabilities. We will start by looking at the numpy package:
from numpy import *
Now that we have access to the functions from numpy, let's create an array. Note that a numpy array is an object of a different type to an intrinsic array in Python. A simple approach is to use the array function. For example we might enter:
a = array([[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]])
b = array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
>>> b        
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> transpose(b)
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
Given an array, we may inquire about its shape:
print a.shape
and we are told that it is a 2-dimensional array (i.e. an array of rank 2) and that the length of both dimensions is 3:
(3, 3)
We can also apply operators to array objects. For example:
a = a * 9
array([[ 9.,  0.,  0.],
       [ 0.,  9.,  0.],
       [ 0.,  0.,  9.]])
Note, however, that most operations on numpy arrays are done element-wise, which is different to a linear algebra operation that you may have been expecting. We will return to linear algebra operations when we look at the scipy package.
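To make the distinction concrete, here is a small sketch contrasting the element-wise * operator with numpy's dot function, which performs a matrix product. The arrays are made up for illustration and numpy is imported under the np namespace:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

elementwise = a * b             # multiplies matching elements
matrix_product = np.dot(a, b)   # row-by-column matrix multiplication

print(elementwise)
print(matrix_product)
```

Multiplying by b element-wise zeroes the diagonal of a, whereas the matrix product with b swaps the columns of a.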
Should we so desire, we could re-shape the array. One way to do this is to set its shape attribute directly:
>>> a.shape = (1,9)
>>> a
array([[ 9.,  0.,  0.,  0.,  9.,  0.,  0.,  0.,  9.]])
As with the list example, it can be useful to read or change the value of an element (or sub-array) individually. Let's turn the array back to its rank-2 form and try it out:
>>> a.shape = (3,3)
>>> a[1,1] = 777.0
>>> print a
[[   9.    0.    0.]
 [   0.  777.    0.]
 [   0.    0.    9.]]
>>> a[1:,1:] = [[777.0, 777.0],[777.0, 777.0]]
>>> print a
[[   9.    0.    0.]
 [   0.  777.  777.]
 [   0.  777.  777.]]
This is all pretty handy so far, but specifying the value of each element explicitly could become a chore. Happily, some helper functions exist to give you a head start with some building blocks. For example, you can use:
>>> b = zeros((3,3))
>>> print b
>>> b = ones((3,2))
>>> print b
>>> b = identity(2)
>>> print b
>>> big = resize(b, (6,6))
>>> print big
The use of resize in the last example illustrates a useful replicating feature.
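The replication works by repeating the elements of the flattened input until the new shape is full. A sketch that inspects the results (using the np namespace rather than from numpy import *, which is a choice made for this example):

```python
import numpy as np

z = np.zeros((3, 3))       # 3x3 array of 0.0
o = np.ones((3, 2))        # 3 rows, 2 columns of 1.0
i2 = np.identity(2)        # 2x2 identity matrix

# The flattened identity is [1, 0, 0, 1]; resize repeats that sequence
# as many times as needed to fill a 6x6 array.
big = np.resize(i2, (6, 6))

print(o.shape)
print(big[0])
print(big[1])
```

Because the four-element pattern does not divide evenly into rows of six, successive rows of big start at different points in the pattern.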
A list of all the functions and operations contained within numpy is: http://scipy.org/Numpy_Example_List.
Pylab and Matplotlib
The above examples are quite natty, but we have deliberately kept the array sizes small so that we can print the element values easily. In practice, you may find that your array sizes are much larger and printing the values to the screen is impractical. Fear not! Python has many packages which help you plot your data, so that you can explore it.
Using the pylab plotting interface we can create:
import pylab
from numpy import arange, pi, cos, sin, add, sqrt
t = arange(0.0, 3.0, 0.01)
c = cos(2 * pi * t)
s = sin(2 * pi * t)
pylab.ylabel('some numbers')
pylab.xlabel('some more numbers')
pylab.plot(t, c, 'r', lw=2)
pylab.plot(t, s, 'b', lw=2)
pylab.plot(t, c-s, 'gs', lw=2)
pylab.ylim(-1.5, 1.5)
pylab.title('sin and cos functions')
pylab.savefig('curves', dpi=300)
Where curves.png looks like:
You can open .png images from the linux command line (inc. bluecrystal) using, e.g.: display -resize 1000 curves.png
We can also use Matplotlib directly for more control:
import matplotlib.pyplot as plt
from pylab import meshgrid
from numpy import arange, add, sin, sqrt
x = arange(-5,10)
y = arange(-4,11)
z1 = sqrt(add.outer(x**2,y**2))
Z = sin(z1)/z1 
X, Y = meshgrid(x,y)
plt.figure()
plt.contour(X,Y,Z)
plt.show()
and you should get a window similar to:
Perhaps the best next step for matplotlib is to look at the gallery: http://matplotlib.org/gallery.html. Just click on a figure and you will get the code used to generate it--a really great resource!
Input and Output
The foregoing is all very interesting, but life would be rather dull if you had to re-enter all your data by hand whenever you set to work with Python and numpy. Therefore we need a means to save data to a file and load it again. Happily, we can do this rather easily using a couple of routines from the pylab package:
>>> from numpy import *
>>> from pylab import load
>>> from pylab import save
>>> data = zeros((3,3))
>>> save('myfile.txt', data)
>>> read_data = load("myfile.txt")
Warning: the load() function of numpy will be shadowed in the above example. One way to protect yourself against this is to make use of namespaces: modify your import command to import pylab and then use pylab.load(..).
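Depending on the versions you have installed, the pylab load and save routines may not be available. An alternative sketch uses numpy's own savetxt and loadtxt functions, called through the np namespace so that nothing is shadowed. The temporary directory is an assumption made here so that the example does not clutter your working directory:

```python
import os
import tempfile
import numpy as np

data = np.zeros((3, 3))
data[1, 1] = 777.0

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
np.savetxt(path, data)          # plain text, one array row per line
read_data = np.loadtxt(path)    # read the text file back into an array

same = np.array_equal(data, read_data)
print(read_data)
print(same)
```

The resulting file is ordinary text, so you can inspect it with any editor or read it from another program.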
Scipy
- http://www.scipy.org/
- ..and good examples on http://scipy-lectures.github.com/intro/scipy.html
- Many useful features:
- Integration & Differentiation
- Optimisation (curve fitting, etc)
- Fourier transforms
- Signal processing
- Statistical algorithms
- Much, much more...
- If you know Python you can use SciPy
An example: Differentiation
>>> # derivative of x^2 at x=3
...
>>> from scipy import derivative
>>> derivative(lambda x: x**2, 3)
6.0
>>> # also works with arrays
...
>>> from numpy import array
>>> my_array = array([1,2,3])
>>> derivative(lambda x: x**2,my_array)
array([ 2., 4., 6.])
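To get a feel for what a numerical derivative involves, here is a hand-rolled central-difference sketch in plain Python. The function central_difference is a hypothetical helper written for this example; it is not part of scipy and is not claimed to be how scipy does it internally:

```python
import math

def central_difference(f, x, dx=1.0):
    "Estimate f'(x) from the values of f on either side of x."
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)

# For a quadratic the central difference is exact, whatever the step size.
slope = central_difference(lambda x: x**2, 3)
print(slope)

# For other functions a smaller step gives a better estimate.
slope_sin = central_difference(math.sin, 0.0, dx=1e-4)
print(slope_sin)
```

The second result should be very close to 1.0, the true derivative of sin at zero.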
Google for many more examples pertaining to your favourite numerical procedure!
A Repository of Packages You Could Use
Now, we've touched on a couple, but there are thousands of python packages available. Before you start writing your own function for X, check that someone hasn't contributed code for that already at http://pypi.python.org/pypi.
pip, the Python package manager, will look in PyPI by default when installing a package. You can use the --user option to install Python packages in your own user space. See https://pip.readthedocs.org/en/latest/ for more information on pip.
Writing Faster Python
As with other scripting languages, such as MATLAB and R, one of the simplest ways in which you can write faster python code is to eliminate loops by vectorising your code.
Consider the following two scripts. First for-loop.py:
#!/usr/bin/env python
import numpy as np
arr = np.random.rand(1000000)
def filter(arr):
    for i, val in enumerate(arr):
        if val < 0.5:
            arr[i] = 0
    return arr
if __name__ == "__main__":
    filter(arr)
and secondly, vectorised.py:
#!/usr/bin/env python
import numpy as np
arr = np.random.rand(1000000)
def filter(arr):
    arr[arr < 0.5] = 0
    return arr
if __name__ == "__main__":
    filter(arr)
If we now run these two scripts through the Linux command line time utility, we see that the vectorised code runs a lot faster than the for loop:
gethin@gethin-desktop:~$ time ./for-loop.py
real 0m0.963s
user 0m0.952s
sys 0m0.012s
gethin@gethin-desktop:~$ time ./vectorised.py
real 0m0.116s
user 0m0.096s
sys 0m0.020s
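The time utility also counts the cost of starting the interpreter and importing numpy. To time just the two functions, and to check that they agree, you could use the standard timeit module. A sketch, with the functions renamed filter_loop and filter_vectorised for this example and a smaller array so that it runs quickly:

```python
import timeit
import numpy as np

def filter_loop(arr):
    for i, val in enumerate(arr):
        if val < 0.5:
            arr[i] = 0
    return arr

def filter_vectorised(arr):
    arr[arr < 0.5] = 0
    return arr

original = np.random.rand(100000)

# Both functions modify their argument, so each one is given its own copy.
loop_result = filter_loop(original.copy())
vector_result = filter_vectorised(original.copy())
same = np.array_equal(loop_result, vector_result)

loop_time = timeit.timeit(lambda: filter_loop(original.copy()), number=3)
vector_time = timeit.timeit(lambda: filter_vectorised(original.copy()), number=3)

print(same)
print("loop: %.4f s, vectorised: %.4f s" % (loop_time, vector_time))
```

The exact timings will vary from machine to machine, so none are quoted here.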
For some more tips on writing faster python code, and examples of how to use one of the python profiler modules, take a look at:
- https://wiki.python.org/moin/PythonSpeed/PerformanceTips
- http://technicaldiscovery.blogspot.co.uk/2011/06/speeding-up-python-numpy-cython-and.html
- http://www.huyng.com/posts/python-performance-analysis/
- http://www.appneta.com/2012/05/21/profiling-python-performance-lineprof-statprof-cprofile/


