Fortran2

'Fortran2: Getting the most from Fortran'

=Following On=

Roll Call: Jonny, Lauren, Emma, SarahS, Rita

These notes follow on from the Fortran1 course. You may want to look back at those notes and examples, or if you're feeling confident, just dive in here!

To get the example programs login in to a Linux box and type:

svn co http://source.ggy.bris.ac.uk/subversion-open/fortran2/trunk fortran2

=Allocatable Arrays=

We'll begin by looking at allocatable arrys.

Previously, we declared the size of our arrays when we wrote our code. This is all well and good, and at least we knew where we stood. However, this exact specifiation of the shape of all things can become a painful limitation. Suppose you were writing a program to process a list of University employees. The number of employees will almost certainly vary from one month to the next. So, we can only know the number of records to read in at runtime. This sort of thing starts to crop up a lot when we aim to write more general-purpose, flexible & robust programs.

Fear not! however, as trusty Fortran90 is here to help. In Fortran90 we can write array declarations of the form:

,allocatable,dimension :: my1dAllocatableArray ! a 1D array of type, which I will give a size to later

Let's go to an example:

cd fortran2/examples/example1

and take a look inside the file allocatable.f90.

In this example, we have a 2D grid and corresponding arrays of longitude and latitude coordinates, the sizes of which we will read from a namelist file at runtime. We will then allocate an appropriate amount of space, fill in the oordinate values according to the number of grid-cells specified in the file and report the whole lot to standard-out. Our arrays are declared with the allocatable attribute:

real,allocatable,dimension  :: x_coords          ! x-coordinate values real,allocatable,dimension  :: y_coords          ! y-coordinate values real,allocatable,dimension :: data             ! 2D data on grid

We've covered namelists before, so reading the values of the variables nx and ny is straighforward. Once we know these values we can request memory space for our arrays. Note that it is an excellent idea to check whether this request was satisfied. If it were not for some reason (if you request a huge amount of memory, your computer may not be able to oblige for example) we probably want the program to stop, lest bad things happen!

allocate(x_coords(nx),stat=errstat) if (errstat /= 0) then write (*,*) 'ERROR: could not allocate x_coords array' stop endif

Next is a spot of arrithmetic to fill out the coordinate arrays properly. We want cell-centre coordinates, with longitude running from 0 to 360 degrees and lattitude from -90 to +90 degrees.

So we're done? Almost, but not quite! It is vital that once you're finished with it, you clear up any memory that you allocated earlier. If you don't, bad things are very likely to happen. So the last few lines look like:

if(allocated(data))      deallocate(data) if(allocated(x_coords))  deallocate(x_coords) if(allocated(y_coords))  deallocate(y_coords)

Here, we're a bit smarter than the average bear, and we've checked to see if an array has indeed been allocated before deallocating it. It would be unwise to try to deallocate something which had not been allocated in the first place.

So there we are. Sorted. Allocatable arrays. Try varying the values in input.nml first, and then procede to writing a few allocatable arrays of your own.

=User Derived Types=

Next up, let's look at another very useful feature provided by good old Fortran90---user-derived types:

cd ../example2

Way back at the start of Fortran1 we looked at the half dozen or so intrinsic types offered by Fortran90. It's not so much that this set of types is becoming a limitation (as we found with static array sizes) but more that for a large program things get a bit, well, messy! "Not the end of the world", you may say. And you're right. However, mess does lead to bugs and bugs are a huge problem, so actually messy, spaghetti-like code, is a significant problem that we want to avoid at all costs.

Now, user-derived types offer us something simple, yet potentially extremely powerful--we can group things which are related into a single composite type. That is, we can group all the things that belong together into a single entity and then pass that entity around as work is done by our program. This is a huge bonus when designing and maintaining a reasonably sized program. As we'll find out with growing experience, user-derived types are, pretty much, the best thing since sliced bread!

User-derived types represent a first step down the road towards object-oriented programming. A step in the right direction in terms of designing programs.

OK, enough of the spiel. Let's take a look at user-types.f90.

I recently bought a DAB radio and I'm as pleased as punch with it. One of the great features is that you can select a radio station by name, push a button and bingo!, you're tuned in. No more scrolling around the dial trying to get clear reception. That makes it really easy to 'station hop' and find something good that's on. Now, imagaine we were designing a program to install into a DAB radio. It's natural that we would want to keep the radio station name and tuning frequency (still needed, of course, but only by the radio itself as it now does the tuning for you) together in a single bundle. They belong together, after all, and are of different intrinsic types--a character string and a real number. In the program it looks like:

type station real                 :: frequency                   ! MHz character(len=maxstr) :: name                       ! str end type station

It wouldn't be much of a radio unless it had a number of stations. We can use an (allocatable) array of entities of our new derived type to store all the required station information:

type(station),allocatable,dimension  :: stations    ! array of stations

I've made the array allocatable for flexibility, or perhaps out of habit. I guess we could get more sophisticated and grow the array should more stations became availble. Which does indeed happen..

After allocating the array (and importantly checking to see whether we hit a problem in doing so), all I need to do is fill out the array and print it to the screen. Note the use of % to access elements of a derived-type.

We can pass derived-types around, into and out of subroutines for example, just like intrinsic types. This enables us to write very elegant, compartmentalised and hence maintainable programs. If we discover that we need some more information added to our radio station type, we just add it in one place and the rest of the program can remain unchanged--very handy!

Have a go now at writing a subroutine which takes the radio station array as an argument. Then add something new to the derived type and access it in your subroutine. Note that you can use the syntax:

type(station),dimension :: stations

to specify an array argument of unknown length. You can use the size and shape operators inside the subroutine to find out the length of the array in due course.

=Modules=

cd ../exampe3

In this example, we'll take a look at another very useful feature provided by Fortran90--modules. These provide further mileage down the road toward Object Oriented programming. They also do away with the need for common blocks, as used by older incarnations of the language.

Put simply, modules provide a way to collect together related variables and routines into a single entity. Other parts of your program may gain access to this entity through the use statement. In this way we see how the spaghetti tendency of large programs can be tamed, and how the slithery mess of pasta noodles starts to become a manageable collection of components that we plug together in a style more reminiscent of lego or mechano.

Modules also have other useful side-effects. For example, we get compiler interface (argument list) checks between your subroutine calls and definitions for free. Not to be sneezed at when a bug causes your program to suddenly die for some inexplicable reason. An argument list mismatch is a common culprit for this.

Let's take a look at our example module in mymod.f90. In this example we see the accumulation of other features of Fortran90 shown in previous examples. For example, we see allocatable arrays, user-derived types, reading data from namelists, checks for memory allocation and deallocation, paramter attributes on 'global' constants etc. etc.

On the first line of the file we see:

module mymod

This is key to creating a module and differs from the program prog we see on the first line of prog.f90. There is, of course, a corresponding end statement on the last line of the file.

We have the ubiquitious implicit none and then the declaration of any variables we wish place in the module. We have a constant named pi, which can be made available to any other part of the program which connects to this module. We could also make available non-constant variables in this way too, if we really had to!. This is how modules replace the sharing of variables via common blocks. Sharing 'global variables' in this way is worth avoiding, however, if you possibly can, as it is another common source of bugs and encourages spaghetti coding.

Next up we have a type declaration for the rather simple 'model' that we'll be using. This user-derived type encapsulates the internal model state together with a record of how many timesteps have been asked of the model. We need an instance of this type to work with and this is declared by:

type(simple_model),private         :: modelA

Note the new attribute, private. This means that this instance will not be made available to other parts of the program which connect to this module. This is a good example of defensive programming through data hiding. It simplifies the interface that the rest of the program has to our model and also guards against bugs which could creep in through the inadvertant modification of our simple model's internal state. Trust me, privacy is a good thing and well worth getting into the habit of specifying. It also prompts one to think about the design of our program a bit more, which is always a good thing.

Following on from our variable declaration, we have:

contains

This signals the end of the variables and the start of the defnitions of any routines (subroutines or functions) that we've included in our module. In this case we have three subroutines for initialising, timestepping and finalising our model, respectively. The routines contain code constructs that we have seen in previous examples. It's worth studying the error checks, use of external files, memory allocation and deallocation and the syntax for accessing elements of a derived type, just as a refresher.

OK, so much for the module, now let's look at a program which uses it---prog.f90.

This is a rather simple program, now that much of the action is hived-off into a module. Note that the very first line of the program, before even our beloved implicit none is:

use mymod, only: pi, initialise_model, timestep_model, finalise_model

Here we state that we will use the named module in this program (we could use a number of modules). Some additional defensive programming is that we only include the things we need from that module. This simplifies our life. We then procede to make use of any imported varaibles and routines as we see fit. Note, however, that we do not need direct access to the hidden internal state of our simple model.

Note also that the subroutine timestep_model_private is only called by routines inside the module. We do not need to call it from outside. We only need access to the routines (and perhaps variables) that comprise the interface to that module. All other details about how the module gets things done can be hidden from view. This greatly simplifies matters and helps us manage complexity in large programs. The exposed routines (and variables) are called the interface to that module. (So called Application Program Interfaces or APIs are used in conjunction with libraries of code. We can onsider the library as a functional 'black-box', aside from the interface routines--they are all we need to know about.)

OK, now that you are familiar with the vital anatomy of Fortran90 modules, have a play around. Start by changing the start-up state of the model by editing imput.nml. You could then proceed, for example, by adding new varaibles or subroutines to the module and making use of them in the program. When you're happy with that, you could experiment with the only and private keywords or even create a whole new module. Go on! Why not?:)

= Using separate files =

In the Fortran practicals so far, all compilations steps have been taken care of by the powerful build system program called make. However, we are reaching the time where you need to get your hands dirty and see how compilation can be done "by hand". Then we will see how the Fortran program can be split over several files.

$ cd ../example4

Invoking the compiler
The compiler that you have been using during the Fortran practicals is called g95. It is a free open source compiler. A range of compilers is available on the Linux platform. gfortran is another free software alternative. ifort and pgf90 are commercial offerings from Intel and the Portland Group respectively. In this section, the g95 compiler will be used but note that the practical would work just as well with any of these compilers.

To compile a Fortran file with g95, simply invoke g95 with the name of the file. It will then create a file called a.out which is executable on the machine you compiled it on.

$ cd hello_world $ g95 hello_world.f90 $ ls-rtl total 420 -rw-r--r-- 1 ggjpr uucp    93 Mar  3 20:39 hello_world.f90 -rwxr-xr-x 1 ggjpr uucp 411750 Mar 3 20:41 a.out

Try to execute the file a.out. Also not the size difference between the source file and the executable. Source files take up a lot less room.

The compiler could be invoked with many options to check differents aspects of the fortran syntax, to control what types of warnings to issue, to perform various optimisations of the code or to create debugging information. We will not go into details in this practical so refer to the compiler documentation if you want more information.

If you modify the source file, you need to recompile this file it before executing a.out for your changes to be taken into account.

The first compiler option we will be using is used to change the name of the executable file. This is done with the option -o followed by the name required:

$ g95 hello_world.f90 -o myprog

Now that we can invoke the compiler manually, let's see how to split a Fortran program into multiple files.

Creating objects and linking them
First of all, let's see how to put a subroutine in a separate file. In the subroutine folder you will find a basic Fortran program in the file onefile.f90. It contains a main program unit and a subroutine.

$ cd ../subroutine

You can compile it using the method above.

$ g95 onefile.f90 -o myprog $ ./myprog

The content of this small Fortran program has also been split into two separate files:
 * one for the main program unit (main_program.f90)
 * one the subroutine (sub1.f90).

If you invoke the compiler on either of these files, it will complain that something is missing. Try it to see the two types of error messages.

The reason for this is that (1) the main program needs the definition of the subroutine and (2) a Fortran program need a main program unit so neither file is enough on its own. Luckily, you can also invoke the compiler on the two files at the same time.

$ g95 main_program.f90 sub1.f90 -o myprog

It works! (and note that the order of the files does not matter). By giving two files as arguments to the compiler, it is clever enough to understand that there might be dependencies between the files and it takes care of them automatically.

The problem with this approach is that if the subroutine is modified, the main program will also have to be recompiled to generate the executable. This would be a issue for larger Fortran programs because it would be time consuming. We will also see later on that we might want to included external subroutines contained in a library and to which the source code is not available.

Instead, it is possible to ask the compiler to compile a file and "ignore" the dependencies. By doing this, the compiler creates an object file (usual extension .o</tt>) which contains all the information needed to create an executable minus the actual dependencies. This is done by using option -c</tt> for "compile only".

$ g95 main_program.f90 -c $ g95 sub1.f90 -c

Check that two new files have been created. By default, they should be named like the source file but with a .o</tt> extension. Again, this default behaviour could be changed with the option -o</tt>.

Then the two objects have to be linked together to create an executable file. This is done by invoking a the compiler on the object files:

$ g95 sub1.0 main_program.o -o myprog

This might seem like an extra step compared to the previous method but the advantage of this method is that you can modify one file independently of the other, re-create the object for the modified file only and then link the objects together. If you have many files in your project, that would save a significant amount of time.

Try to do it,

Note 1: Although you can modify the files independently, they still need to make sense in term of Fortran syntax. For instance, if you add an argument to the subroutine but don't modify the main program unit, the program might not work as expected.

Note 2: if you have many files in your project, understanding which ones need to be recompiled can be a complex task. This is why you are encouraged to use a "build system". The course contains an introduction to the make build system later on.

Dealing with modules
In this section we look at what happens when modules are used. The required files are in the directory called module</tt>.

$ cd ../module

This directory contains two files containing a main program unit and a module. Have a look at the files, compile them and create an executable.

After the module is compiled, you might have noticed that a new file has been created in the directory with the extension .mod</tt>. This file contains information about the module and when the compiler notices that a module is required, it will use this file to get the information. Therefore, this file is necessary when you compile any source file using the module. You can check it by deleting it and trying to compile the other file again.

Note: If you change the module significantly but don't recompile the file that uses the module, you might get funny problems too as the module file that was used to create the object is out of date. So the object should be recreated too.

By default, the compiler looks for the module files in the current directory. However, it is quite usual that some modules are stored elsewhere. For instance create a directory called include/</tt>, move the module file in it and then compile the program... you should get an error.

g95 main_program.f90 -c In file main_program.f90:3

use mymodule 1 Fatal Error: Can't open module file 'mymodule.mod' at (1) for reading: No such file or directory

Luckily, it is possible to tell the compiler to include another directory in its search. This is done with the option -I</tt> for include.

$ g95 main_program.f90 -I ./include -c $ g95 main_program.o mymodule.o -o myprog

All is well, the compiler can now find the relevant bits of information.

=Libraries=

The next step is to package groups of subroutines together in a library. The library can then be re-used by a different project.

$ cd ../example5

The directory contains a program and a directory called operations</tt> which we will use to build our first library.

$ cd operations $ ls addition.f90 multiplication.f90

The operations</tt> directory contains two Fortran files. They are actually modules and contains a set of functions that can be used to perform arithmetic operations. Have a look at them and then compile the two files to create the objects and module files.

The two object files can then be archived in a library with the command ar</tt> for archive:

$ ar rs liboperations.a *.o ar: creating liboperations.a

Note the "r" argument which means replace. You could call the library whatever you want. However, using the "lib" prefix will make the library easier to link to later on.

At this stage, the object files are not required anymore and could be deleted. However, you need to keep the library and the module files.

Now we will go back to the main program and create an object for it.

$ cd ..

Have a look at the main program file. It uses the two modules defined in the operations</tt> folder. Therefore, the module files will be required at the compilation stage. The library will be required at the linking stage.

To compile the file, use the option -I to let the compiler search the operations/</tt> directory for module files..

$ g95 main_program.f90 -c -i./operations

Then we need to link the newly created object to the library that was created earlier on. The simpliest method is to give the full path of the library to the compiler:

$ g95 main_program.o ./operations/liboperations.a -o myprog

However, it is common practice to put many libraries in a given location and let the compiler know where to find the libraries. This is achieved by using the option -L to specify the path to the libraries and then -l to specify which library to use. $ g95 main_program.o -L./operations -loperations -o myprog

Note that in the example above, the compiler will look for the library "liboperations.a" in the folder "./operations". If more than one library was required, you would have to specify them one after the other: "-llib1 -llib2 ..."

That's all you need to know about creating libraries. One important thing to remember is that Fortran programs will require both library and module files. Sometimes they are not kept in the some location and often the module files are kept in a directory called include/</tt> and the libraries in a directory called lib/</tt>. This is the case for the netCDF example that follows.

=A useful library: netCDF=

netCDF is fast becoming the de facto file format for model data. It is a well-designed format for storing your own data and also a convenient way to share data with your collaborators. One of the it's many benefits is that the file format is endian-independent. That means that we don't need to worry, for example, about which architecture (Linux, Unix, MS Windows, Mac OS etc.) the data was created on.

$ cd example6

This section contains a number of data-structures (derived types) and routines for reading and writing NetCDF files. They are bound up into several examples. Hopefully these will be useful to adopt in your own programs, so feel fee to copy and use them.

= To go further = That's it for the Fortran tuition. The Pragmatic Programming course continues with a practical about project management using "make".